Linguistics is meant here in a broad sense of the word; the category could just as well be called Philology. Posts ranging from syntax to sociolinguistics fall into it.

crossword puzzle

Morphology is the study of morphemes, the smallest meaning-bearing units of human language. Inflected words can be divided into morphemes: for example, -ed in talked is a morpheme that adds the meaning of past tense to the verb talk, -s in dogs pluralizes the noun, and so on. Morphemes that are attached to words in this way are known as affixes. There are different kinds of affixes, and in this post we are going to look at them more closely. 🤓
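To make the idea of splitting a word into a stem and an affix concrete, here is a minimal sketch in Python. The suffix table and the `split_suffix` helper are made up for illustration; they only cover the two examples mentioned above and are nowhere near a real morphological analyser.

```python
# Toy suffix table: an illustrative assumption, not a real analyser.
SUFFIXES = {
    "ed": "PAST",    # talk + ed
    "s": "PLURAL",   # dog + s
}


def split_suffix(word):
    """Return (stem, suffix, gloss) for the longest matching suffix.

    If no suffix from the toy table matches, the word is returned whole
    with an empty suffix and gloss.
    """
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        # Require a non-empty stem so that "s" alone is not split.
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)], suffix, SUFFIXES[suffix]
    return word, "", ""


print(split_suffix("talked"))
print(split_suffix("dogs"))
```

Plain string matching like this breaks down quickly (think of "red" or "bus"), which is one reason real systems rely on a lexicon and proper morphological rules.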

Korp and Python. Access corpora from your Python code!

If you have done language technology in a Nordic country, you have probably heard about Korp. And by now, you have probably developed some sort of love-hate relationship with it. My initial thought was: Korp is nice, but so what 🤷🏼‍♂️, I need to access it programmatically for it to be of any use. The fact that the API description is somewhat hidden online, and that not all Korp services are open about the URL of their API, doesn’t really help at all. 😩

Luckily, once again, yours truly has been typing in some code to make your life easier. 🤓 Behold, my very own Python library for querying Korp. 😊
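To give a rough idea of what programmatic access means here, the sketch below only builds a query URL; it does not contact any server and it is not the library from this post. The base URL is a placeholder, and the `query` endpoint with the `corpus`, `cqp`, `start` and `end` parameters is an assumption about how a Korp backend is typically queried, so check the API description of the Korp service you actually use.

```python
from urllib.parse import urlencode

# Placeholder address: replace with the API URL of a real Korp service.
EXAMPLE_BASE_URL = "https://korp.example.org/api"


def build_query_url(base_url, corpus, cqp, start=0, end=9):
    """Build a concordance query URL (parameter names are assumptions)."""
    params = {
        "corpus": corpus,  # corpus identifier, e.g. "EXAMPLE_CORPUS"
        "cqp": cqp,        # CQP expression such as [word = "kissa"]
        "start": start,    # index of the first hit to return
        "end": end,        # index of the last hit to return
    }
    return base_url.rstrip("/") + "/query?" + urlencode(params)


url = build_query_url(EXAMPLE_BASE_URL, "EXAMPLE_CORPUS", '[word = "kissa"]')
print(url)
```

The resulting URL could then be fetched with any HTTP client and the JSON response parsed, which is the kind of plumbing a wrapper library hides from you.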

a drawn lost cat sign

Oh, sarcasm, sarcasm. The thing that puzzles us so much. It takes some knowledge of a person to know whether they are being sarcastic or not, regardless of how sarcastic we are ourselves. But is there any science behind it? As it turns out, there is, and I wrote my MA thesis about it in Spanish. But if you don’t have time to read it, just read this post instead. 😅

a pen and a syntactic tree

If you are interested in generating Finnish with a computer (NLG), you have probably already run into the problem of the complex morphology and syntax of Finnish. In addition to knowing how to inflect words, you have to take agreement into account. For example, the verb agrees with the subject’s number and person: minä syön, sinä syöt and so on. And there’s more: case government has to be solved too. A verb takes its object in a certain case; for example, you would say uneksin autosta but näen auton. Such is the problem of natural language generation. 🤷🏼‍♂️

You are in luck: I have resolved this issue and created a kick-ass Python library called Syntax Maker. Just for you, my friend, free to use. Are you ready to unleash the power of NLG? 😊😊
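The two problems described above, agreement and case government, can be sketched with a few lookup tables. This is a toy illustration only and not how Syntax Maker works: every stem, ending and inflected form below is hard-coded by hand for the examples in the text, and all names (`clause`, `GOVERNMENT` and so on) are made up for this sketch.

```python
# Subject pronouns and present-tense endings for singular persons 1 and 2.
PRONOUNS = {("SG", 1): "minä", ("SG", 2): "sinä"}
PERSON_ENDINGS = {("SG", 1): "n", ("SG", 2): "t"}

# Hand-written present-tense stems, keyed by the verb's dictionary form.
STEMS = {"syödä": "syö", "uneksia": "uneksi", "nähdä": "näe"}

# Case government: which case each verb requires of its object.
GOVERNMENT = {"uneksia": "ELATIVE", "nähdä": "ACCUSATIVE"}

# Hand-written inflected forms of the noun "auto" (car).
NOUN_FORMS = {"auto": {"ELATIVE": "autosta", "ACCUSATIVE": "auton"}}


def clause(person, verb, obj=None):
    """Build a small clause where the verb agrees with the subject and
    the object appears in the case that the verb governs."""
    parts = [PRONOUNS[person], STEMS[verb] + PERSON_ENDINGS[person]]
    if obj is not None:
        parts.append(NOUN_FORMS[obj][GOVERNMENT[verb]])
    return " ".join(parts)


print(clause(("SG", 1), "syödä"))
print(clause(("SG", 1), "uneksia", "auto"))
print(clause(("SG", 2), "nähdä", "auto"))
```

Tables like these do not scale beyond a handful of words, which is exactly why a library that knows the inflection and government rules is useful.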

A globe showing a part of North America

Languages can be grouped together in different ways. One can put languages together based on their family relation (e.g. Uralic languages, Indo-European languages) or the area where they are spoken. But maybe the most interesting and eye-opening way of grouping them is by their morphology. As it turns out, there are only four morphological groups for languages, and all spoken languages fall into one of them.

A green python ready to use HFST :-D

HFST (Helsinki Finite-State Transducer Technology) is a neat tool for modelling the morphology of languages computationally. The problem is that the Python API is currently under-documented. But fear not, in this post you will learn how to load optimised lookup files in Python and use them to analyse and generate word forms. 😃

Creative thoughts on papers and a keyboard

When you are targeting an international audience and you have enough money to back your project up, the thing you have to do is localize your application. Thinking that everyone knows English is just naive. This is a general guide that shows how the process of localization works.

A shop full of books for language self-study

In this post, I have gathered all the important aspects you have to look at when you are buying a language study book. I base this list on my personal experiences as a language self-learner (or autodidact, as we are called). And trust me, when it comes to study books, I have seen the best and the worst of them. 🤓

A light bulb on a desk

Previously, I have blogged about meaning in language, and especially the dichotomy related to it. Now, I feel, it’s time to look at the issue from another perspective. Sure, our language models meaning in its own way, but how is it represented in the brain? This post will be about concepts and how they are understood. 🤓