Novaator conducted an experiment in which it asked five language models questions about the Estonian language and culture.
The models answered 20 questions, divided into two parts: language and culture. For example, the questions involved terms such as "jäääär" and "semiosfäär".
Grok provided the best answers. For example, it knew that the Võro word "pudsunudsija" means "tolmuimeja" (vacuum cleaner).
Professor Kairit Sirts said that models first learn from texts and are then taught to follow instructions.
The models struggled with culture-related questions. They acquire their knowledge about culture from English-language texts.
Professor Tanel Alumäe said that the models are good at Estonian but still make mistakes. They handle vocabulary well but have problems with grammar.
Scientists are creating an open-source Estonian language model. This means all of its materials are public. The model will use data from the Institute of the Estonian Language.
Sirts said the goal is not to compete with large models. The goal is to create a model that users can run on their own servers. This is important when data needs to be kept secure.
Alumäe said it is necessary to reduce dependence on US and Chinese servers. An open-source model helps achieve this.
Sirts said it is also important to maintain the skills needed to build such models. Estonians can then improve the model themselves and retain control over it.