VISL CG3 is a neat tool for running constraint grammars (CGs) for tasks such as morphological disambiguation or syntactic parsing. Grammars in this formalism have been developed for a great many endangered Uralic languages, boosting their NLP. These CGs are easily available to Python programmers through UralicNLP.
Even with UralicNLP, a tool called vislcg3 needs to be installed on your machine, which can be tricky if you cannot find the correct binaries for your operating system. That is why I wrote this guide.
Omorfi is inarguably an amazing tool for processing Finnish morphology, both in analysis and generation. However, using it can be quite a challenge for users who are not too (H)FST savvy. 😅 That is one of the motivations for my UralicNLP library, the purpose of which is to provide an easy Python interface to a multitude of NLP tools for Uralic languages.
Morphology studies morphemes, which can be described as the smallest meaning-bearing units of human language. Inflected words can be divided into morphemes: for example, -ed in talked is a morpheme that adds the meaning of past tense to the verb talk, and -s in dogs pluralizes the noun. Morphemes that are attached to words in this way are known as affixes. There are different kinds of affixes, and in this post we are going to look at them more closely. 🤓
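To make the talked and dogs examples concrete, here is a minimal Python sketch that splits a word into a stem and a suffix. Everything in it is a toy assumption made for illustration: the `SUFFIXES` table, the `STEMS` set and the `split_suffix` function are hypothetical names, and real morphological analysis (for instance with the FST-based tools mentioned above) handles far more than a two-entry lookup.

```python
# Toy suffix table (hypothetical): covers only the two examples from the text.
SUFFIXES = {
    "ed": "past tense",
    "s": "plural",
}

# Toy lexicon of known stems (hypothetical).
STEMS = {"talk", "dog"}


def split_suffix(word, stems=STEMS):
    """Return (stem, suffix, meaning) if word is a known stem plus a known suffix.

    Falls back to (word, "", "") when no split is found.
    """
    for suffix, meaning in SUFFIXES.items():
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            # Only accept the split if what remains is a real stem,
            # so that "bed" is not analysed as "b" + "-ed".
            if stem in stems:
                return stem, suffix, meaning
    return word, "", ""


print(split_suffix("talked"))  # ('talk', 'ed', 'past tense')
print(split_suffix("dogs"))    # ('dog', 's', 'plural')
print(split_suffix("bed"))     # ('bed', '', '')
```

The stem check is the important design choice here: a word ending in the letters of an affix does not necessarily contain that affix.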
Languages can be grouped together in different ways. One can group them by family relation (e.g. Uralic languages, Indo-European languages) or by the area where they are spoken. But maybe the most interesting and eye-opening way of grouping them is by their morphology. As it turns out, there are only four morphological groups, and every spoken language falls into one of them.