Automatic B/I Label Assignment for Entity Labeling #2389

Open
@Zhusl-cpu

Description

Proposed Feature: Integrate a tokenizer that automatically assigns a B-label to the first token in a selected entity and I-labels to the subsequent tokens.

Currently, Doccano requires users either to manually assign a B-label to the first token of an entity and I-labels to the subsequent tokens, or to add an extra step that post-processes the exported data. This B/I labeling step requires a tokenizer, because the tokens a model uses may not correspond directly to whitespace-separated words. If Doccano allowed users to set a tokenizer for an entity labeling task and automatically assigned B- and I-labels based on the identified tokens, it would streamline the labeling process and improve the overall user experience.
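
To illustrate the conversion I have in mind, here is a minimal sketch of turning character-level entity spans into B/I/O tags. It is not Doccano code. The names `regex_tokenize` and `spans_to_bio` are hypothetical, the `(start, end, label)` span layout with an exclusive end offset is an assumption about how annotations might be stored, and the regex tokenizer is only a stand-in for whatever model tokenizer a user would actually configure.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]   # (start_char, end_char, label), end exclusive (assumed layout)
Token = Tuple[int, int, str]  # (start_char, end_char, token_text)


def regex_tokenize(text: str) -> List[Token]:
    """Stand-in tokenizer (assumption): words and single punctuation marks, with offsets."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def spans_to_bio(
    text: str,
    spans: List[Span],
    tokenize: Callable[[str], List[Token]] = regex_tokenize,
) -> List[Tuple[str, str]]:
    """Tag each token with B-<label>, I-<label>, or O based on character-span overlap."""
    tokens = tokenize(text)
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        first = True
        for i, (tok_start, tok_end, _) in enumerate(tokens):
            # A token belongs to the entity if its character range overlaps the span.
            if tok_start < end and tok_end > start:
                tags[i] = ("B-" if first else "I-") + label
                first = False
    return [(tok_text, tag) for (_, _, tok_text), tag in zip(tokens, tags)]


example = spans_to_bio(
    "Barack Obama visited New York.",
    [(0, 12, "PER"), (21, 29, "LOC")],
)
# example == [("Barack", "B-PER"), ("Obama", "I-PER"), ("visited", "O"),
#             ("New", "B-LOC"), ("York", "I-LOC"), (".", "O")]
```

Passing a different `tokenize` callable (for example, a wrapper around a subword tokenizer that reports character offsets) would change where the B/I boundaries fall, which is why I think the tokenizer should be configurable per project.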

I apologize if this feature has already been implemented or is in progress, as I may have overlooked it during my exploration of Doccano.
