By Aileen Agricola | June 16, 2015
Mining Paradigmatic Word Associations
Written by William Lyon
Mining word associations from a body of text is often one of the first natural language processing (NLP) techniques applied when mining text data. Word associations are useful for NLP tasks such as part-of-speech tagging, parsing, and entity extraction. We will take a brief look at one type of word association, called paradigmatic association, and show how we can use the Neo4j graph database to model our text corpus as a graph and implement a simple paradigmatic relation mining algorithm.
Mining Word Associations
There are two common types of word associations in natural language processing: paradigmatic and syntagmatic.
- Paradigmatic: words A and B are paradigmatically related if they can be substituted for each other. This indicates that they belong to the same class, such as “Monday” and “Thursday” or “Cat” and “Dog”.
- Syntagmatic: words that can be combined with each other, such as “cold” and “weather”.
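
To make the two definitions concrete, here is a minimal Python sketch of one way to tell them apart on a toy corpus. It is an illustration under stated assumptions, not the Neo4j-based approach this article goes on to describe: the corpus, the function names (`contexts`, `jaccard`, `cooccurrences`), and the choice of Jaccard similarity over (left neighbor, right neighbor) pairs are all hypothetical. The idea is that paradigmatically related words tend to share contexts, while syntagmatically related words tend to appear together in the same sentence.

```python
from collections import defaultdict

def contexts(sentences):
    """Map each word to the set of (left, right) neighbor pairs it appears between."""
    ctx = defaultdict(set)
    for sentence in sentences:
        # Pad with boundary markers so first and last words also get a context.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            ctx[tokens[i]].add((tokens[i - 1], tokens[i + 1]))
    return ctx

def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cooccurrences(sentences, a, b):
    """Count the sentences that contain both word a and word b."""
    return sum(
        1 for s in sentences
        if a in s.lower().split() and b in s.lower().split()
    )

# Hypothetical toy corpus, chosen only for illustration.
corpus = [
    "my cat eats fish on monday",
    "my dog eats meat on thursday",
    "my cat sleeps on thursday",
    "my dog sleeps on monday",
]

ctx = contexts(corpus)

# Paradigmatic signal: how much do two words' contexts overlap?
cat_dog = jaccard(ctx["cat"], ctx["dog"])
cat_eats = jaccard(ctx["cat"], ctx["eats"])

# Syntagmatic signal: how often do two words appear in the same sentence?
print("context similarity cat/dog: ", cat_dog)
print("context similarity cat/eats:", cat_eats)
print("co-occurrences cat+eats:", cooccurrences(corpus, "cat", "eats"))
print("co-occurrences cat+dog: ", cooccurrences(corpus, "cat", "dog"))
```

On this toy corpus, “cat” and “dog” share all of their contexts but never appear in the same sentence, which is the paradigmatic pattern. “cat” and “eats” share no contexts but do appear together, which is the syntagmatic pattern. A real corpus would need proper tokenization and a weighting scheme that discounts very frequent context words.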