Create Finnish sentences computationally in Python (NLG)

If you are interested in generating Finnish with a computer (NLG), you have probably already run into the complex morphology and syntax of Finnish. In addition to knowing how to inflect words, you have to take agreement into account: for example, the verb agrees with the subject’s number and person, as in minä syön, sinä syöt and so on. And there’s more: case government has to be solved too. A verb takes its direct object in a certain case; for example, you would say uneksin autosta but näen auton. Such is the problem of natural language generation. 🤷🏼‍♂️
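To make the two problems concrete, here is a minimal sketch in plain Python. It is not how syntax maker works internally: the lookup tables, the case labels (ELA, GEN) and the function names are all made up for illustration, and the word forms are hardcoded instead of being generated by a morphological tool.

```python
# Toy lookup tables, hardcoded for illustration only. A real system would
# generate these forms with a morphological tool instead of listing them.

# Subject pronouns keyed by (person, number).
PRONOUNS = {
    (1, "SG"): "minä", (2, "SG"): "sinä", (3, "SG"): "hän",
    (1, "PL"): "me", (2, "PL"): "te", (3, "PL"): "he",
}

# Present-tense forms of "syödä" (to eat), keyed by (person, number).
SYODA_PRESENT = {
    (1, "SG"): "syön", (2, "SG"): "syöt", (3, "SG"): "syö",
    (1, "PL"): "syömme", (2, "PL"): "syötte", (3, "PL"): "syövät",
}

# Case government: which case each verb demands from its direct object.
# The labels are hypothetical tags: ELA = elative (-sta), GEN = the -n form.
GOVERNMENT = {"uneksia": "ELA", "nähdä": "GEN"}

# First person singular forms of the two verbs.
FIRST_PERSON_SG = {"uneksia": "uneksin", "nähdä": "näen"}

# Inflected forms of "auto" (car), keyed by case label.
AUTO_FORMS = {"ELA": "autosta", "GEN": "auton"}


def subject_and_verb(person, number):
    """Agreement: the pronoun and the verb form share person and number."""
    key = (person, number)
    return f"{PRONOUNS[key]} {SYODA_PRESENT[key]}"


def verb_and_object(verb_lemma):
    """Government: the verb decides the case of its object."""
    case = GOVERNMENT[verb_lemma]
    return f"{FIRST_PERSON_SG[verb_lemma]} {AUTO_FORMS[case]}"


if __name__ == "__main__":
    print(subject_and_verb(1, "SG"))   # minä syön
    print(subject_and_verb(2, "SG"))   # sinä syöt
    print(verb_and_object("uneksia"))  # uneksin autosta
    print(verb_and_object("nähdä"))    # näen auton
```

Tables like these stop scaling after a handful of words, which is exactly why you want a library to handle inflection, agreement and government for you.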

You are in luck: I have solved this problem and created a kick-ass Python library called syntax maker. Just for you, my friend, free to use. Are you ready to unleash the power of NLG? 😊😊

Go ahead and run pip install syntaxmaker to start using it. You will have to install Omorfi as well, for example, by using these Omorfi binaries. For troubleshooting the installation, see my HFST post and the syntaxmaker wiki.

How to use it?

To understand how the library is supposed to be used, you have to understand the basics of phrases and heads. Take, for example, a syntactic analysis of a cat eats food: VP [ DP [ a NP [ cat ] ] eats NP [ food ] ]. Every single word (or head) sits inside a phrase of its part of speech: eats is the head of the VP (verb phrase), cat is the head of an NP (noun phrase) and so on. You can also see that the phrases are nested inside each other, with the verb phrase as the root of the tree. This is how syntax maker works too; everything has to go under a verb phrase.
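If the bracket notation feels abstract, the nesting can be sketched as a small tree in plain Python. The Node class below is a hypothetical helper written for this illustration; it is not part of syntax maker.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A phrase: a label (VP, DP, NP) plus words and nested phrases."""
    label: str
    children: List[Union["Node", str]] = field(default_factory=list)

    def words(self):
        """Flatten the tree into its words, left to right."""
        out = []
        for child in self.children:
            if isinstance(child, Node):
                out.extend(child.words())
            else:
                out.append(child)
        return out

    def bracketed(self):
        """Render the tree in the bracket notation used in the text."""
        parts = [c.bracketed() if isinstance(c, Node) else c
                 for c in self.children]
        return f"{self.label} [ {' '.join(parts)} ]"


# The verb phrase is the root; everything else hangs under it.
tree = Node("VP", [
    Node("DP", ["a", Node("NP", ["cat"])]),
    "eats",
    Node("NP", ["food"]),
])

if __name__ == "__main__":
    print(tree.bracketed())       # VP [ DP [ a NP [ cat ] ] eats NP [ food ] ]
    print(" ".join(tree.words()))  # a cat eats food
```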

In order to create a sentence kissat syövät ruokaa, you just have to create a structure that looks like the one in the figure below.

Figure: a syntactic tree for kissa syödä ruokaa

This can be done with the following piece of code:


You can even build more complex sentences, including ones with relative clauses. There are, however, a couple of things you should know about. First, when you create a phrase, you can pass a dictionary with information about morphology to the function. The possible values are:

 

Every Phrase object also has a structure consisting of components, order, head and agreement. You can learn all about the possible phrase types and their structures in grammar.json. Basically, if you want to add more than the required phrases to a phrase, you can add a new component under whichever name you like and add that name to the order list as well.
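The idea of components plus an order list can be sketched with a toy class in plain Python. ToyPhrase is a stand-in written for this illustration, not the library's Phrase class: it does no inflection and has no agreement, so the words are given in their final forms, and the component name "manner" and the adverb nopeasti are just examples.

```python
class ToyPhrase:
    """A toy phrase: a head word, named components and a linear order."""

    def __init__(self, head, order=None):
        self.head = head        # surface form, already inflected by hand
        self.components = {}    # component name -> ToyPhrase
        # Slot names in the order they are printed; "head" is the head word.
        self.order = order if order is not None else ["head"]

    def to_string(self):
        words = []
        for slot in self.order:
            if slot == "head":
                words.append(self.head)
            elif slot in self.components:
                words.append(self.components[slot].to_string())
        return " ".join(words)


# The required slots of the toy verb phrase.
vp = ToyPhrase("syövät", order=["subject", "head", "dir_object"])
vp.components["subject"] = ToyPhrase("kissat")
vp.components["dir_object"] = ToyPhrase("ruokaa")
basic = vp.to_string()

# An extra component: pick any name and add that name to the order list too.
vp.components["manner"] = ToyPhrase("nopeasti")
vp.order.append("manner")
extended = vp.to_string()

if __name__ == "__main__":
    print(basic)     # kissat syövät ruokaa
    print(extended)  # kissat syövät ruokaa nopeasti
```

A component that is missing from the order list is simply never printed, which is why both steps are needed.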

Let’s continue the example above by adding an adposition phrase to it:


You can even shuffle the order of the phrases and exploit the free word order of Finnish:

Note that this only shuffles the order of the phrases, not the words themselves, and that’s why ilman käsiä always stays together in the correct order.
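Why the words of a phrase stay together can be shown with a few lines of plain Python. This is only a sketch of the principle, assuming each phrase has already been turned into its list of words; the function name is made up and this is not the library's own shuffling code.

```python
import random

# Each inner list is one phrase that has already been turned into words.
PHRASES = [["kissat"], ["syövät"], ["ruokaa"], ["ilman", "käsiä"]]


def shuffled_sentence(phrases, rng):
    """Shuffle whole phrases; the words inside a phrase keep their order."""
    order = list(phrases)   # copy, so the original list is left untouched
    rng.shuffle(order)
    return " ".join(" ".join(words) for words in order)


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        print(shuffled_sentence(PHRASES, rng))
```

Since the shuffle only ever sees four items, the two-word phrase moves around as a single unit.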

More help with the NLG library?

You can find more instructions in the syntax maker wiki, or you can use the contact form on this site to ask me.

Conclusion

A new tutorial about the NLG library has been a long time coming. I have put a decent amount of work into the library, so I wouldn’t mind seeing people actually use it too. 😊
