Supersenses extra #141

Closed
simongray opened this issue Jul 4, 2024 · 2 comments
@simongray

Expanding on #138 now that I had a meeting with Bolette and Sussi about it.

  • The English WordNet uses adj rather than adjective for the adjective grouping; we should switch to match it.
  • Certain food-related words are tagged as verb.creation, whereas they should be verb.consumption as in the OEWN. Special rules for food may apply, perhaps in other cases too.
  • Once the supersenses are cleaned up and officially in DanNet, I can begin to attach supersenses to the Semdax corpus (CoNLL-U format). About 7700 senses in this corpus come from DanNet, while the rest are mostly from DDO (not DanNet) and can't be annotated using the supersenses from DanNet. These will have to be done manually or using data from DDO.
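
The first two points could be sketched roughly as below. This is only an illustration in Python, not how DanNet is actually built: the function names, the override table and the synset ID in it are all hypothetical, and the label `adjective.all` is just an example of the old prefix.

```python
def normalize_supersense(label: str) -> str:
    """Rename the 'adjective.' prefix to 'adj.'; other labels pass through."""
    prefix, sep, rest = label.partition(".")
    if sep and prefix == "adjective":
        return f"adj.{rest}"
    return label


def apply_overrides(synset_id: str, label: str, overrides: dict) -> str:
    """Prefer a manually curated label for this synset, if one exists."""
    return overrides.get(synset_id, normalize_supersense(label))


# Hypothetical override table: the synset ID is made up for illustration.
# A curated entry like this would move a food-related verb from
# verb.creation to verb.consumption.
FOOD_OVERRIDES = {"synset-0000": "verb.consumption"}
```

Keeping the food-related exceptions in a separate table means the prefix rename stays purely mechanical, while the special rules remain easy to review by hand.
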
@simongray

Current status: I have binned the verb.creation synsets into 14 separate groups based on their hypernyms. These and the remaining 9 synsets need to be checked by Sussi/Bolette.
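
For readers unfamiliar with this kind of binning, here is a minimal Python sketch of grouping synsets by hypernym. It assumes each synset comes with a list of direct hypernyms and uses the first one as the group key; the actual grouping may well have used ancestors further up the hierarchy. The function name and all IDs are hypothetical.

```python
from collections import defaultdict


def bin_by_hypernym(hypernyms: dict) -> dict:
    """Group synset IDs by their first listed hypernym.

    `hypernyms` maps a synset ID to a list of hypernym IDs. Synsets with
    no hypernym end up in a catch-all bin for manual inspection.
    """
    bins = defaultdict(list)
    for synset_id, parents in hypernyms.items():
        key = parents[0] if parents else "(no hypernym)"
        bins[key].append(synset_id)
    return dict(bins)


# Made-up IDs, for illustration only.
example = {
    "syn-a": ["hyper-1"],
    "syn-b": ["hyper-1"],
    "syn-c": ["hyper-2"],
    "syn-d": [],
}
groups = bin_by_hypernym(example)
```

Each resulting group can then be handed over for review as a unit, with the catch-all bin holding the leftovers.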

@simongray

I have remapped all of the verb.creation synsets with Bolette. They will need to be imported into the graph and added to the dataset in the next release.

The next step is to annotate the CoNLL-U file for Danish found at: https://www.clarin.si/repository/xmlui/handle/11356/1842

Most of these are mapped to DanNet senses. Perhaps the supersense can be added at the end of the lines? I also need to figure out a smart way to tag many of the remaining ~2500 senses, since they are not directly linked to DanNet. Some of them could perhaps be tagged via the lemma.
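
One way to add the supersense "at the end of the lines" is to use the MISC column, which is the tenth and last field of a CoNLL-U token line and holds `|`-separated `Key=Value` pairs (or `_` when empty). The sketch below is Python and makes several assumptions: the `Supersense` key is not a standard one, the function names are hypothetical, and the lemma fallback only fires when a lemma maps to exactly one supersense.

```python
from typing import Optional


def add_supersense(line: str, supersense: str) -> str:
    """Append Supersense=<label> to the MISC column of a CoNLL-U token line.

    Comment lines, blank lines and lines without 10 columns are returned
    unchanged.
    """
    body = line.rstrip("\n")
    newline = line[len(body):]
    if not body.strip() or body.startswith("#"):
        return line
    cols = body.split("\t")
    if len(cols) != 10:
        return line
    entry = f"Supersense={supersense}"  # assumed key name, not a standard
    cols[9] = entry if cols[9] == "_" else f"{cols[9]}|{entry}"
    return "\t".join(cols) + newline


def lookup_supersense(sense_id: Optional[str], lemma: str,
                      by_sense: dict, by_lemma: dict) -> Optional[str]:
    """Look up by sense ID first, then fall back to an unambiguous lemma."""
    if sense_id is not None and sense_id in by_sense:
        return by_sense[sense_id]
    candidates = by_lemma.get(lemma, set())
    if len(candidates) == 1:
        return next(iter(candidates))
    return None  # ambiguous or unknown: leave for manual annotation
```

Restricting the lemma fallback to unambiguous lemmas trades coverage for precision; anything it skips would still need manual tagging or data from DDO.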
