“WordNet is a lexical database of semantic relations between words in more than 200 languages” – Wikipedia. It groups words into synsets. A synset is a group of words that express the same meaning in a given context. In simpler terms, WordNet is similar to a thesaurus that groups words based on their meanings.
WordNet is available as one of the corpora in NLTK. It provides relations between words, and this knowledge can be used to build applications based on information retrieval.
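As a rough illustration of what a synset looks like in practice, the sketch below lists the senses that NLTK's WordNet reader records for one word. It assumes NLTK is installed and that the corpus has been fetched once with `nltk.download("wordnet")`; the helper name `senses` and the example word are illustrative choices, not part of NLTK.

```python
# Sketch: list the senses (synsets) WordNet records for a single word.
# Assumption: NLTK is installed and the WordNet corpus has been downloaded
# once via nltk.download("wordnet"). If either is missing, the helper
# returns an empty list instead of raising.

def senses(word):
    """Return (synset name, lemma names, definition) for each sense of `word`."""
    try:
        from nltk.corpus import wordnet as wn
        return [
            (synset.name(), synset.lemma_names(), synset.definition())
            for synset in wn.synsets(word)
        ]
    except (ImportError, LookupError):
        # NLTK or its WordNet data is not available in this environment.
        return []


for name, lemmas, definition in senses("right"):
    print(f"{name}: {', '.join(lemmas)}")
    print(f"    {definition}")
```

Each printed entry is one synset: its identifier, the words grouped under it, and a short gloss. A word with several unrelated entries is a candidate homonym.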
Homonyms are words that are spelt and pronounced the same but have different meanings based on the context.
For example: right – correct, direction
- Right – correct: It turns out that I was right
- Right – direction: Take a right from the next junction to reach a cafe.
Similarly, some other groups of homonyms:
- Pen – writing instrument, to write (verb), holding area for animals
- Arm – body part, division of a company
- Bat – an animal, cricket bat
- Fly – to fly (verb), an insect
Polysemes are words with the same spelling and distinct but closely related meanings. Polysemy is similar to homonymy, except that the meanings share a common underlying concept.
For example, consider the following meanings for the word “bank”:
- financial institution
- bank of the river
- building belonging to a financial institution
- to rely upon (verb)
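In the list above, the first, third, and fourth meanings share a common theme and are polysemes. The second meaning, the bank of a river, is unrelated to them and is better described as a homonym.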
Synonyms are words that are spelt and pronounced differently but have similar meanings.
- small – little
- big – large
- intelligent – smart
- positive – optimistic
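Because every synset groups words with a shared meaning, one plausible way to look up synonyms is to collect the lemma names of all synsets a word belongs to. The sketch below does this with NLTK's WordNet reader. It assumes the corpus has been downloaded with `nltk.download("wordnet")`, and the helper name `synonyms` is hypothetical.

```python
# Sketch: gather candidate synonyms for a word from its WordNet synsets.
# Assumption: NLTK and its WordNet corpus are available; if not, an empty
# set is returned instead of raising.

def synonyms(word):
    """Collect lemma names that share at least one synset with `word`."""
    try:
        from nltk.corpus import wordnet as wn
        names = {
            lemma.name().replace("_", " ")  # multi-word lemmas use underscores
            for synset in wn.synsets(word)
            for lemma in synset.lemmas()
        }
    except (ImportError, LookupError):
        return set()
    names.discard(word)  # a word is not its own synonym
    return names


print(sorted(synonyms("small")))
```

The result mixes synonyms from every sense of the word, so for words with many senses it is usually worth filtering by synset or part of speech first.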
Hyponyms are words whose meaning is a more specific case of a generic term (its hypernym). Hyponyms of the same term may or may not be directly related to one another, but they refer to the same context.
For example: red, yellow, black, and blue all fall under the general term color; i.e., Red is a hyponym of Color.
- Apple is a hyponym of Fruit
- Tomato is a hyponym of Vegetable
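The hyponym relation can be pictured as links pointing from a specific word up to its more general term. The sketch below models the examples above with a small hand-written table. The table, the extra "food" level, and the helper names are assumptions made for illustration and are not taken from WordNet. In NLTK, synset objects expose this relation through their `hyponyms()` and `hypernyms()` methods.

```python
# Toy hypernym table built from the examples above. The entries are
# hand-written for illustration and are not read from WordNet.
HYPERNYM_OF = {
    "red": "color",
    "yellow": "color",
    "black": "color",
    "blue": "color",
    "apple": "fruit",
    "tomato": "vegetable",
    "fruit": "food",       # assumed extra level, to show chained links
    "vegetable": "food",   # assumed extra level, to show chained links
}


def is_hyponym(word, general_term):
    """True if `general_term` is reached by following hypernym links up from `word`."""
    current = HYPERNYM_OF.get(word)
    seen = set()  # guards against accidental cycles in the table
    while current is not None and current not in seen:
        if current == general_term:
            return True
        seen.add(current)
        current = HYPERNYM_OF.get(current)
    return False


def hyponyms_of(general_term):
    """Direct hyponyms of `general_term`, sorted alphabetically."""
    return sorted(w for w, parent in HYPERNYM_OF.items() if parent == general_term)


print(is_hyponym("red", "color"))    # True
print(is_hyponym("apple", "food"))   # True (apple -> fruit -> food)
print(is_hyponym("apple", "color"))  # False
print(hyponyms_of("color"))          # ['black', 'blue', 'red', 'yellow']
```

Following the links upward is what makes the relation transitive: a word is a hyponym of every term above it, not only of its immediate parent.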