1. Legal Document Classification
Legal document classification is the process of categorizing legal texts such as contracts, court rulings, or compliance forms into predefined classes. Labels can include confidentiality level, contract type, or jurisdiction. In Doccano, annotators assign one or more of these category labels to each document, which helps legal professionals sort and retrieve documents. This accelerates legal research, and the labeled data can be used to train classifiers that automate repetitive document analysis in law firms and regulatory bodies.
document = "This is a non-disclosure agreement (NDA) under US law."
label = "Confidential - NDA"
print("Classified as:", label)
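Before annotation, documents like the one above are usually serialized as one JSON object per line (JSONL). The sketch below assumes a `text`/`label` layout with labels stored as a list. The field names, the second sample document, and the `to_jsonl` helper are illustrative assumptions, so check the import format of your Doccano version before relying on them.

```python
import json

# Hypothetical sample documents; the first mirrors the example above.
documents = [
    ("This is a non-disclosure agreement (NDA) under US law.", ["Confidential - NDA"]),
    ("The tenant shall pay rent on the first day of each month.", ["Lease"]),
]

def to_jsonl(records):
    """Serialize (text, labels) pairs as one JSON object per line."""
    return "\n".join(
        json.dumps({"text": text, "label": labels}) for text, labels in records
    )

jsonl = to_jsonl(documents)
print(jsonl)
```

Storing labels as a list keeps the same record shape usable when a document carries more than one class, for example both a contract type and a jurisdiction.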
2. Customer Feedback Analysis
Analyzing customer feedback involves reviewing survey comments, support tickets, or reviews to understand user sentiment, product issues, or feature requests. In Doccano, such texts are annotated by sentiment, topic, or urgency. This classification aids companies in making data-driven decisions, improving product features, and identifying negative experiences early. It also enables automated triaging of user complaints.
feedback = "The app keeps crashing on startup."
sentiment = "Negative"
category = "Bug Report"
print("Feedback Type:", sentiment, "| Category:", category)
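Once feedback carries an urgency label, automated triage can be as simple as sorting on it. This is a minimal sketch: the record fields, the three urgency levels, and the `triage` helper are assumptions made for illustration, not a Doccano export schema.

```python
# Assumed urgency levels, ordered from most to least pressing.
URGENCY_ORDER = {"High": 0, "Medium": 1, "Low": 2}

# Hypothetical annotated feedback records.
annotated = [
    {"text": "Dark mode would be nice.", "category": "Feature Request", "urgency": "Low"},
    {"text": "The app keeps crashing on startup.", "category": "Bug Report", "urgency": "High"},
    {"text": "Checkout is slow on mobile.", "category": "Performance", "urgency": "Medium"},
]

def triage(items):
    """Return items sorted so the most urgent feedback comes first."""
    return sorted(items, key=lambda item: URGENCY_ORDER[item["urgency"]])

queue = triage(annotated)
for item in queue:
    print(item["urgency"], "|", item["category"], "|", item["text"])
```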
3. Sentiment Labeling
Sentiment labeling assigns an emotional tone (positive, negative, or neutral) to a text. It is common in social media monitoring and customer service. In Doccano, annotators tag sentences with sentiment labels, and the result is used to train models that detect opinions and track public sentiment trends. Such models are used in brand monitoring, election analysis, and market research to gauge people's views.
sentence = "I love this product!"
sentiment = "Positive"
print(f"Sentence: '{sentence}' labeled as {sentiment}")
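Before training on labeled sentences, it can help to check how the labels are distributed, since a heavily skewed set tends to bias a model toward the majority class. The sentences below are made up for illustration.

```python
from collections import Counter

# Hypothetical labeled sentences in (text, sentiment) form.
labeled = [
    ("I love this product!", "Positive"),
    ("Delivery took forever.", "Negative"),
    ("It arrived on Tuesday.", "Neutral"),
    ("Best purchase this year.", "Positive"),
]

# Count each label, then convert the counts to shares of the whole set.
counts = Counter(sentiment for _, sentiment in labeled)
total = sum(counts.values())
shares = {label: count / total for label, count in counts.items()}
print(shares)
```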
4. Medical Named Entity Recognition
Medical named entity recognition (NER) involves tagging medical entities such as diseases, drugs, or procedures in clinical notes. Annotators in Doccano highlight terms like "diabetes" or "aspirin" and label them with the appropriate category. This is crucial for building systems that extract structured data from unstructured medical records to aid diagnostics or research. NER models trained on this data can improve clinical decision support.
text = "Patient was prescribed ibuprofen for headache."
entities = [{"entity": "ibuprofen", "label": "Drug"}, {"entity": "headache", "label": "Symptom"}]
print("Extracted entities:", entities)
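Span annotations are typically stored as character offsets rather than bare strings, so that each highlighted term can be located in the text unambiguously. The sketch below converts the entity list above into `[start, end, label]` triples. That triple layout and the `to_spans` helper are assumptions for illustration, and `str.find` only locates the first occurrence of each term.

```python
text = "Patient was prescribed ibuprofen for headache."
entities = [
    {"entity": "ibuprofen", "label": "Drug"},
    {"entity": "headache", "label": "Symptom"},
]

def to_spans(text, entities):
    """Locate the first occurrence of each entity and return [start, end, label] spans."""
    spans = []
    for item in entities:
        start = text.find(item["entity"])
        if start == -1:
            continue  # entity string not present in the text
        spans.append([start, start + len(item["entity"]), item["label"]])
    return spans

spans = to_spans(text, entities)
print(spans)
```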
5. Social Media Classification
Posts from Twitter, Facebook, or Reddit can be classified by topic, sentiment, or user intent. This is especially useful for brands or researchers trying to monitor public discourse. In Doccano, annotators assign labels such as "complaint", "praise", or "product inquiry" to help train models that automate this task at scale.
tweet = "Why is my order delayed again?"
label = "Complaint"
print("Tweet Label:", label)
6. Chatbot Intent Tagging
Intent tagging identifies what a user wants when they send a query to a chatbot (e.g., "Book flight", "Cancel reservation"). Annotators use Doccano to tag example utterances with intents so that AI chatbots can learn to map phrases to backend actions. This labeled data forms the foundation of NLP-based conversational agents.
user_input = "I want to cancel my booking."
intent = "CancelBooking"
print("Detected intent:", intent)
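Mapping a detected intent to a backend action is often done with a dispatch table. In this sketch the handler functions, the intent names, and the fallback message are all hypothetical; a real system would call booking services instead of returning fixed strings.

```python
# Hypothetical backend handlers that stand in for real service calls.
def cancel_booking(user_input):
    return "Your booking has been cancelled."

def book_flight(user_input):
    return "Searching for available flights."

# Illustrative intent names; they would match the labels used during annotation.
HANDLERS = {"CancelBooking": cancel_booking, "BookFlight": book_flight}

def dispatch(intent, user_input):
    """Route a detected intent to its handler, with a fallback for unknown intents."""
    handler = HANDLERS.get(intent)
    if handler is None:
        return "Sorry, I did not understand that."
    return handler(user_input)

reply = dispatch("CancelBooking", "I want to cancel my booking.")
print(reply)
```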
7. Toxic Comment Detection
Detecting toxic content involves labeling offensive, abusive, or threatening language online. Annotators tag such instances in Doccano with categories like "hate speech", "harassment", or "spam". This annotated data is critical to developing content moderation tools used on platforms like YouTube, Twitter, or Reddit.
comment = "You are so dumb and useless!"
label = "Toxic - Personal Insult"
print("Flagged comment:", label)
8. Product Review Analysis
Annotators tag product reviews with aspects such as "battery life", "camera", or "shipping" to extract fine-grained sentiment. This helps e-commerce platforms understand detailed feedback, like customers liking the camera but disliking battery performance. The annotations can power aspect-based sentiment analysis systems.
review = "Camera quality is good but battery drains quickly."
tags = [{"aspect": "Camera", "sentiment": "Positive"}, {"aspect": "Battery", "sentiment": "Negative"}]
print("Review Analysis:", tags)
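Aspect tags become most useful when aggregated across many reviews. The sketch below counts sentiments per aspect; the three reviews are invented for illustration and reuse the tag structure shown above.

```python
from collections import Counter, defaultdict

# Hypothetical annotated reviews; each is a list of aspect/sentiment tags.
reviews = [
    [{"aspect": "Camera", "sentiment": "Positive"}, {"aspect": "Battery", "sentiment": "Negative"}],
    [{"aspect": "Battery", "sentiment": "Negative"}, {"aspect": "Shipping", "sentiment": "Positive"}],
    [{"aspect": "Camera", "sentiment": "Positive"}],
]

# Map each aspect to a counter of the sentiments attached to it.
summary = defaultdict(Counter)
for tags in reviews:
    for tag in tags:
        summary[tag["aspect"]][tag["sentiment"]] += 1

for aspect, counts in sorted(summary.items()):
    print(aspect, dict(counts))
```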
9. Contract Term Extraction
Reviewing legal agreements often requires extracting terms such as the expiration date, governing law, or liabilities. In Doccano, annotators mark these key clauses, and the annotations are used to train models that extract them automatically. This is valuable for automating compliance checks, contract analysis, and due diligence.
clause = "This agreement terminates on Dec 31, 2025."
term = {"label": "End Date", "value": "Dec 31, 2025"}
print("Extracted term:", term)
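As a rough baseline, and not a substitute for a model trained on annotated clauses, a regular expression can pull a date out of a clause like the one above. The pattern assumes dates written as an abbreviated English month, a day, and a four-digit year; the `span` field is an illustrative addition that records where the match sits in the clause.

```python
import re
from datetime import datetime

clause = "This agreement terminates on Dec 31, 2025."

# Assumed date shape: abbreviated month, day, four-digit year (e.g. "Dec 31, 2025").
DATE_PATTERN = re.compile(r"\b[A-Z][a-z]{2} \d{1,2}, \d{4}\b")

match = DATE_PATTERN.search(clause)
if match:
    # Normalize the matched text to an ISO date so downstream checks can compare it.
    end_date = datetime.strptime(match.group(), "%b %d, %Y").date()
    term = {"label": "End Date", "value": end_date.isoformat(), "span": list(match.span())}
    print("Extracted term:", term)
```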
10. OCR Error Correction
Optical character recognition (OCR) often introduces text errors, such as digits substituted for similar-looking letters. Annotators can use Doccano to correct these by labeling incorrect segments and writing corrections. This helps improve OCR engines and build training datasets for error correction in scanned document pipelines.
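import re
ocr_text = "Th1s 1s a t3st."
corrections = {"Th1s": "This", "1s": "is", "t3st": "test"}
# Replace each whole word that has a known correction; leave other words untouched.
corrected = re.sub(r"\w+", lambda m: corrections.get(m.group(), m.group()), ocr_text)
print("Corrected Output:", corrected)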
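When annotators write out the full corrected text, the individual error/correction pairs can be recovered by diffing the two versions. This sketch uses the standard-library `difflib.SequenceMatcher` at word level; the corrected sentence is assumed to be what an annotator would enter.

```python
from difflib import SequenceMatcher

ocr_words = "Th1s 1s a t3st.".split()
fixed_words = "This is a test.".split()

# Collect every word-level region where the OCR output differs from the correction.
matcher = SequenceMatcher(a=ocr_words, b=fixed_words, autojunk=False)
edits = [
    (" ".join(ocr_words[i1:i2]), " ".join(fixed_words[j1:j2]))
    for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    if tag != "equal"
]
print(edits)
```

Adjacent wrong words are grouped into a single pair, which keeps multi-word OCR errors together in the resulting training data.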