Attach Supersenses to Synsets #138

Closed
simongray opened this issue Jun 21, 2024 · 4 comments

@simongray
Member

Supersenses, as seen in the English WordNet, have already been mapped 1:1 to DanNet's ontological types derived from the EuroWordNet ontology.

I have an Excel file supplied by Bolette to use for populating DanNet with Supersenses based on this mapping.

Supersenses

Princeton documentation: https://wordnet.princeton.edu/documentation/lexnames5wn

From email correspondence:

Bolette: Supersenses were popular in a certain period of WSD investigations because they made disambiguation more manageable in NLP. They are sometimes seen as an extension of NER. One could also use an ontology like the EuroWordNet Ontology, but for some reason supersenses became more widely used for WSD purposes in a series of papers. I have not seen a lot of work on supersenses in later years, though.

(...)

We refer among others to these two papers:

Massimiliano Ciaramita and Yasemin Altun. 2006. Broad-coverage sense disambiguation and information extraction with a supersense sequence tagger. In Proceedings of EMNLP, pages 594–602, Sydney, Australia, July.

Massimiliano Ciaramita and Mark Johnson. 2003. Supersense tagging of unknown nouns in WordNet. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, pages 168–175. Association for Computational Linguistics.

We worked with them in this paper:
https://aclanthology.org/2016.gwc-1.30.pdf

Another email (usage of Supersenses):

And a link to the corpus, including the Danish part: https://www.clarin.si/repository/xmlui/handle/11356/1842

This is the part we would initially like to link to supersenses.

@simongray
Member Author

The Supersenses mapping is a 1-to-many, but the many all seem to be separated by part-of-speech, fortunately.

The query will have to take this into account.
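
A minimal sketch of what a part-of-speech-aware lookup could look like, assuming the mapping has already been read into a dict from ontotype to a list of supersense names. The function names and the `Hypothetical+Type` entry are made up for illustration; this is not the actual DanNet query.

```python
from collections import defaultdict


def index_by_pos(mapping):
    """Index supersenses by (ontotype, part-of-speech prefix)."""
    index = defaultdict(list)
    for ontotype, supersenses in mapping.items():
        for supersense in supersenses:
            # Supersense names have the form "<pos>.<category>", e.g. "noun.food".
            pos = supersense.split(".", 1)[0]
            index[(ontotype, pos)].append(supersense)
    return dict(index)


def supersense_for(index, ontotype, pos):
    """Return the single supersense for this ontotype and POS, else None."""
    candidates = index.get((ontotype, pos), [])
    if len(candidates) == 1:
        return candidates[0]
    return None  # missing, or still ambiguous within the same POS


mapping = {
    # Hypothetical entry, only here to show the one-per-POS case.
    "Hypothetical+Type": ["noun.act", "verb.motion"],
    # Taken from the mapping file: two noun supersenses for one ontotype.
    "Plant+Object+Comestible": ["noun.food", "noun.plant"],
}
index = index_by_pos(mapping)
print(supersense_for(index, "Hypothetical+Type", "verb"))
print(supersense_for(index, "Plant+Object+Comestible", "noun"))
```

Returning `None` for the ambiguous case keeps such rows visible instead of silently picking one of the candidates.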

@simongray
Member Author

Apparently, the only problematic rows are these:

Plant+Object+Comestible		136	noun.food; noun.plant
Plant+Object+Part+Comestible	324	noun.food; noun.plant

so it may just be down to selecting if edible plants are food or plants.
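
A rough sketch of how that choice could be applied, assuming the rows are tab-separated as ontotype, count, and a `;`-separated list of supersenses. The preference order below is a placeholder, not a decision; `parse_row`, `pick` and `PREFERRED` are illustrative names.

```python
def parse_row(row):
    """Split a mapping row into (ontotype, count, supersenses)."""
    # Some rows contain repeated tabs, so empty fields are dropped.
    fields = [f.strip() for f in row.split("\t") if f.strip()]
    ontotype, count, supersenses = fields
    return ontotype, int(count), [s.strip() for s in supersenses.split(";")]


def pick(candidates, preferred):
    """Pick one supersense: the only candidate, or the first preferred match."""
    if len(candidates) == 1:
        return candidates[0]
    for supersense in preferred:
        if supersense in candidates:
            return supersense
    raise ValueError(f"no preference covers {candidates}")


rows = [
    "Plant+Object+Comestible\t\t136\tnoun.food; noun.plant",
    "Plant+Object+Part+Comestible\t324\tnoun.food; noun.plant",
]

# Assumption: edible plants are treated as food. Swap the order to prefer plants.
PREFERRED = ["noun.food", "noun.plant"]

resolved = {}
for row in rows:
    ontotype, _count, supersenses = parse_row(row)
    resolved[ontotype] = pick(supersenses, PREFERRED)
print(resolved)
```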

@simongray
Member Author

Currently blocked by row 137:

noun.food	804	noun.substance

The first column should be an ontotype, but it has been replaced with a Supersense, making the ~800 synsets impossible to classify until the original authors of this mapping (e.g. Bolette) chime in.
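
Rows like this can be caught up front with a simple sanity check. A minimal sketch, assuming tab-separated rows and that a supersense name always has the form `noun.`, `verb.`, `adj.` or `adv.` followed by a category; `suspicious_rows` is a hypothetical helper, and the row numbers it reports are positions in the list passed in, not in the Excel file.

```python
import re

# Supersense (lexname) pattern, e.g. "noun.food" or "verb.motion".
LEXNAME = re.compile(r"^(noun|verb|adj|adv)\.[A-Za-z]+$")


def suspicious_rows(rows):
    """Return (position, first_column) for rows whose first column looks like a supersense."""
    bad = []
    for position, row in enumerate(rows, start=1):
        fields = [f.strip() for f in row.split("\t") if f.strip()]
        if fields and LEXNAME.match(fields[0]):
            bad.append((position, fields[0]))
    return bad


rows = [
    "Plant+Object+Comestible\t\t136\tnoun.food; noun.plant",
    "noun.food\t804\tnoun.substance",
]
print(suspicious_rows(rows))
```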

@simongray
Member Author

I went with Natural+Substance after conferring with Sussi.
