CLIP Model Document Classification Category Overlap Issue #477

@rushabhT3

Description

The current CLIP-based classifier does not reliably distinguish between similar document categories, particularly receipts and invoices. When processing documents such as detailed sales reports, the confidence scores for these categories overlap significantly, even though the categories should be clearly separable.

Current Categories

CLIP_CATEGORIES = [
    "receipt",
    "invoice (table with with brought items and their price)",
    "cheque",
    "logo",
    "document",
    "blank document",
    "form",
    "contract",
    "letter",
    "chart",
    "graph"
]

Problem

  1. Category Overlap:

    • Receipt vs Invoice: Model struggles to differentiate between these when processing sales documents
  2. Example Case:
    When processing a restaurant sales report containing:

    • Itemized sales data
    • Payment summaries
    • Tax calculations
    • Business information

    The model produces ambiguous confidence scores between "receipt" and "invoice" categories.
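One way to make "ambiguous confidence scores" measurable is to look at the gap between the two highest-scoring categories. The sketch below is only an illustration: the function names, the `min_margin` threshold of 0.15, and the example scores are all assumptions made up for this comment, not values taken from the actual classifier.

```python
import math

def softmax(logits):
    """Convert raw similarity scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top2_margin(labels, logits):
    """Return (best_label, runner_up_label, probability_gap) for one document."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    (best, p1), (second, p2) = ranked[0], ranked[1]
    return best, second, p1 - p2

def needs_review(labels, logits, min_margin=0.15):
    """Flag a document when the top two categories are too close to call.

    min_margin is a hypothetical threshold and would need tuning on real data.
    """
    _, _, margin = top2_margin(labels, logits)
    return margin < min_margin

# Hypothetical scores, invented for illustration only.
labels = ["receipt", "invoice", "form"]
close_scores = [2.1, 2.0, 0.3]   # receipt and invoice nearly tied
clear_scores = [5.0, 1.0, 0.0]   # receipt clearly ahead

print(top2_margin(labels, close_scores))
print(needs_review(labels, close_scores))
print(needs_review(labels, clear_scores))
```

Logging this margin per document would show how often receipt and invoice end up as the top two candidates, and documents below the threshold could be routed to manual review rather than being assigned a label silently.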

Impact

  • Unreliable classification results
  • High uncertainty in document type determination
  • Reduced accuracy in automated document processing
  • Manual intervention often needed for correct categorization

Proposed Solutions

  1. Refine Category Definitions:

    • Add more specific categories like "sales_report", "financial_statement"
    • Create subcategories for business-specific documents
    • Include composite categories for hybrid documents
  2. Training Improvements:

    • Enhance training data with more diverse document examples
    • Include more restaurant-specific financial documents
    • Add clear distinguishing features between receipts and invoices
  3. Category Refinement (proposed list):

REFINED_CATEGORIES = [
    "simple_receipt",  # Basic transaction receipts
    "detailed_sales_report",  # Comprehensive business sales data
    "commercial_invoice",  # Formal billing documents
    "financial_statement",  # Detailed financial reports
    "business_document",  # General business documentation
    "form",
    "contract",
    "letter",
    "chart",
    "graph"
]
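Since CLIP compares images against text, the refined labels would probably be passed to the text encoder as natural-language prompts rather than as underscore identifiers. The following is a minimal sketch of that step, assuming a prompt template ("a scanned image of a ...") and a helper name (`build_prompt`) that are hypothetical; the descriptions are reused from the comments in the list above.

```python
# Descriptions reused from the comments in REFINED_CATEGORIES above;
# None means the proposed list gave no description for that label.
REFINED_CATEGORY_NOTES = {
    "simple_receipt": "Basic transaction receipts",
    "detailed_sales_report": "Comprehensive business sales data",
    "commercial_invoice": "Formal billing documents",
    "financial_statement": "Detailed financial reports",
    "business_document": "General business documentation",
    "form": None,
    "contract": None,
    "letter": None,
    "chart": None,
    "graph": None,
}

def build_prompt(label, note=None, template="a scanned image of a {name}"):
    """Turn a category identifier into a text prompt.

    The template is an assumption for illustration, not a tested choice.
    """
    name = label.replace("_", " ")  # "simple_receipt" -> "simple receipt"
    prompt = template.format(name=name)
    if note:
        prompt += ", " + note.lower()
    return prompt

# Map each label to its prompt so scores can be mapped back to labels later.
prompts = {label: build_prompt(label, note)
           for label, note in REFINED_CATEGORY_NOTES.items()}

for label, prompt in prompts.items():
    print(f"{label}: {prompt}")
```

Keeping the label-to-prompt mapping in one place makes it easier to try different wordings for the receipt and invoice prompts and compare the resulting score margins on the same test documents.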

Additional Context

Test document: Restaurant daily sales report containing detailed financial breakdowns, which received split classifications between receipt and invoice categories.

Labels

  • enhancement
  • machine-learning
  • document-classification
  • CLIP-model
  • accuracy
