Describe the problem
Given the relative scarcity of tools for building question-answering datasets, it would be great if doccano could serve that purpose. At its core, I believe that would mean each document is a passage, with forms below it (like in a Translation project) where you can enter questions. You'd then need the ability to annotate stretches of text as the answer (similar to a Sequence Labeling project). The tricky part would likely be tying the labels for the answer annotations back to the input questions.
A naive first pass could be to combine the SequenceLabeling and MachineTranslation UIs. The labels for each answer could simply be answer-to-Q1, answer-to-Q2, and so on, and each document could simply have a maximum number of questions associated with it.
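To make the naive scheme concrete, here is a rough sketch of how answer spans labeled answer-to-Q1, answer-to-Q2, etc. could be tied back to the entered questions after annotation. Everything in it is hypothetical: the `Span` class, the `link_answers` function, and the output field names are made up for illustration and are not part of doccano's data model or API.

```python
import re
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical answer annotation (not a doccano class)."""
    start: int  # inclusive character offset into the passage
    end: int    # exclusive character offset into the passage
    label: str  # e.g. "answer-to-Q1"


# Labels follow the naming convention proposed above.
LABEL_PATTERN = re.compile(r"^answer-to-Q(\d+)$")


def link_answers(passage, questions, spans):
    """Pair each question with the spans whose label points at it."""
    linked = [{"question": q, "answers": []} for q in questions]
    for span in spans:
        match = LABEL_PATTERN.match(span.label)
        if match is None:
            continue  # not an answer label, ignore it
        index = int(match.group(1)) - 1  # Q1 is the first question
        if not 0 <= index < len(questions):
            continue  # label points at a question slot that was never filled
        linked[index]["answers"].append(
            {"text": passage[span.start:span.end], "answer_start": span.start}
        )
    return linked


# Made-up example document with two questions and two labeled spans.
passage = "doccano is an open source text annotation tool."
questions = ["What is doccano?", "Is doccano open source?"]
spans = [Span(11, 46, "answer-to-Q1"), Span(14, 25, "answer-to-Q2")]

result = link_answers(passage, questions, spans)
for item in result:
    print(item)
```

With a fixed maximum number of questions per document, the label set stays fixed as well (one answer label per question slot), so the linking step only needs the question order and the label names.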
Does this sound doable?