Description
Proposed Feature: Integrating a tokenizer that can automatically assign a B-label to the first token in a selected entity and I-labels to subsequent tokens.
Currently, Doccano requires users to manually assign a B-label to the first token of an entity and I-labels to the subsequent tokens, or to add an extra post-processing step on the exported data. This B/I labeling step requires a tokenizer because the tokens a model uses may not correspond directly to whitespace-separated words. If Doccano allowed users to set a tokenizer for an entity labeling task and automatically assigned B and I labels based on the identified tokens, it would streamline the labeling process and improve the overall user experience.
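To make the request concrete, here is a minimal sketch of the conversion I have in mind: character-level entity spans go in, and per-token B/I/O tags come out. It is not Doccano code. The names `regex_tokenize` and `spans_to_bio` are hypothetical, the `(start, end, label)` span layout with an exclusive end offset is an assumption, and the regex tokenizer is only a stand-in for whatever tokenizer the user would configure.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]   # (start_char, end_char, label), end exclusive (assumed layout)
Token = Tuple[str, int, int]  # (text, start_char, end_char)


def regex_tokenize(text: str) -> List[Token]:
    """Stand-in tokenizer: runs of word characters, or single punctuation marks."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\w+|[^\w\s]", text)]


def spans_to_bio(
    text: str,
    spans: List[Span],
    tokenize: Callable[[str], List[Token]] = regex_tokenize,
) -> List[Tuple[str, str]]:
    """Assign B-/I-/O tags to tokens from character-level entity spans."""
    tokens = tokenize(text)
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        first = True
        for i, (_, tok_start, tok_end) in enumerate(tokens):
            # A token belongs to the span if the two character ranges overlap.
            if tok_start < end and tok_end > start:
                tags[i] = ("B-" if first else "I-") + label
                first = False
    return [(tok, tag) for (tok, _, _), tag in zip(tokens, tags)]


if __name__ == "__main__":
    sentence = "Barack Obama visited New York."
    annotations = [(0, 12, "PER"), (21, 29, "LOC")]
    for token, tag in spans_to_bio(sentence, annotations):
        print(token, tag)
```

Any tokenizer that reports character offsets for its tokens could be passed as `tokenize`, so the same logic would apply to subword tokenizers where one word maps to several tokens.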
I apologize if this feature has already been implemented or is in progress, as I may have overlooked it during my exploration of Doccano.