Brief personal and contact information for Stig-Arne Grönroos

Ph.D. (Tech.), AI Scientist
Stig-Arne Grönroos
IRC nick: Waino (networks: IRCnet, QuakeNet, freenode)
GitHub: Waino
Google Scholar
ResearchGate
ORCID: 0000-0002-3750-6924
Email: first-part-of-firstname.lastname@gmail.com (first name includes only the part preceding the hyphen, lastname omits the dots on 'ö')
Phone: (+358) 40 739 8282

The preferred ways to contact me are by Signal, IRC (casual), or email.

Who am I?

I am currently an AI Scientist at Silo.AI, the largest private AI lab in the Nordics. I received my Ph.D. in language technology from Aalto University in 2021. My thesis was on machine translation into morphologically rich low-resource languages. My focus on scientific work has given me a strong understanding of linear algebra, probability, and algorithm design. During my academic career, I have written over 15 papers with a total of more than 400 citations, some in high-ranking journals and international conferences (see Google Scholar). In 2014, I received a Master of Science degree in technology with distinction from Aalto University.

I am best known for my contributions to the Morfessor morphological segmentation software, and for first place in shared tasks: En -> {De,Fr} multimodal translation in WMT18, and Upper Sorbian -> German low-resource translation in WMT20. I have aimed to release as much as possible of the software I develop under open source licenses (see GitHub). My portfolio of applied machine learning includes machine translation, morphological segmentation, and other NLP tasks.

I have experience with semi-supervised learning, active learning, and transfer learning. When the budget for labeling data for your specific task is small, these powerful techniques make it possible to use existing similar resources.

Many collaborations in EU-funded projects and jointly written papers, together with experience teaching a course in Statistical Natural Language Processing and supervising M.Sc. theses, have made me more confident in my communication skills. Also, you never really understand something until you teach it to someone else.

In a machine learning project, I enjoy the intellectual challenge of deeply understanding a problem. I am driven by the passion to apply machine learning to make the world a better place. I believe in honesty and open dialogue.

When I'm not glued to a computer screen, I enjoy reading sci-fi and playing board games.

Education

Tekniikan tohtori / Doctor of Science (Technology) 2021
Aalto University School of Electrical Engineering, Dept. of Signal Processing and Acoustics
  • Research field: Speech and Language Technology
Diplomi-insinööri / Master of Science (Tech.) 2014
Aalto University School of Science
  • Major: Information and Computer Science
  • The degree was completed with distinction
Bachelor of Science (Tech.) 2012
Aalto University School of Science
Matriculation Examination 2004
Gymnasiet Lärkan

Dissertation

My dissertation is available in two PDF versions. I recommend the author's definitive preprint, which includes several readability-enhancing improvements over the official version:
  1. It includes the publications, which are omitted from the official electronic version but included in the printed book.
  2. It contains a machine-readable table of contents that can be used in the PDF viewer.
  3. It contains clickable document-internal links: citations link to the list of references; mentions of publications link to the corresponding publication; section, figure, and table mentions link to the appropriate page.

2020

Stig-Arne Grönroos. Machine translation into morphologically rich low-resource languages. G5 article-based doctoral dissertation, Aalto University, 2020. Aalto University publication series DOCTORAL DISSERTATIONS; 202/2020. [ bib | http (official) | .pdf (author's definitive preprint) ]

Publications

2020

Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo. Morfessor EM+Prune: Improved subword segmentation with expectation maximization and pruning. In Proceedings of the 12th Language Resources and Evaluation Conference, Marseilles, France, May 2020. ELRA. [ bib | http ]

Umut Sulubacak, Ozan Caglayan, Stig-Arne Grönroos, Aku Rouhe, Desmond Elliott, Lucia Specia, and Jörg Tiedemann. Multimodal machine translation through visuals and speech. Machine Translation, 34(2):97–147, 2020. arXiv:1911.12798 [cs.CL]. [ bib | arXiv | http ]

Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo. Transfer learning and subword sampling for asymmetric-resource one-to-many neural translation, 2020. arXiv:2004.04002 [cs.CL], accepted for publication in Machine Translation. [ bib | arXiv | http ]

Yves Scherrer, Stig-Arne Grönroos, and Sami Virpioja. The University of Helsinki and Aalto University submissions to the WMT 2020 news and low-resource translation tasks. In Proceedings of the Fifth Conference on Machine Translation, pages 1127–1136. The Association for Computational Linguistics, 2020. [ bib ]

Abhilash Jain, Aku Rouhe, Stig-Arne Grönroos, and Mikko Kurimo. Finnish ASR with deep Transformer models. In Proceedings of Interspeech, pages 3630–3634, 2020. [ bib ]

2019

Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo. North Sámi morphological segmentation with low-resource semi-supervised sequence labeling. In Proceedings of the Fifth International Workshop on Computational Linguistics for Uralic Languages, 2019. [ bib | http ]

2018

Stig-Arne Grönroos, Benoit Huet, Mikko Kurimo, Jorma Laaksonen, Bernard Merialdo, Phu Pham, Mats Sjöberg, Umut Sulubacak, Jörg Tiedemann, Raphael Troncy, and Raúl Vázquez. The MeMAD submission to the WMT18 multimodal translation task. In Proceedings of the Third Conference on Machine Translation. Association for Computational Linguistics, October 2018. [ bib | .pdf ]

Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo. Cognate-aware morphological segmentation for multilingual neural translation. In Proceedings of the Third Conference on Machine Translation. Association for Computational Linguistics, October 2018. [ bib | .pdf ]

Umut Sulubacak, Jörg Tiedemann, Aku Rouhe, Stig-Arne Grönroos, and Mikko Kurimo. The MeMAD submission to the IWSLT 2018 speech translation task. In Proceedings of the 15th International Workshop on Spoken Language Translation (IWSLT 2018), 2018. [ bib | http ]

Franck Burlot, Yves Scherrer, Vinit Ravishankar, Ondrej Bojar, Stig-Arne Grönroos, Maarit Koponen, Tommi Nieminen, and François Yvon. The WMT'18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English. In Proceedings of the Third Conference on Machine Translation Shared Task Papers. Association for Computational Linguistics, 2018. [ bib | http ]

2017

Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo. Extending hybrid word-character neural machine translation with multi-task learning of morphological analysis. In Proceedings of the Second Conference on Machine Translation, 2017. [ bib | http ]

2016

Stig-Arne Grönroos, Kristiina Jokinen, Katri Hiovain, Mikko Kurimo, and Sami Virpioja. Low-resource active learning of morphological segmentation. Northern European Journal of Language Technology, 2016. [ bib | .pdf ]

Teemu Ruokolainen, Oskar Kohonen, Kairit Sirts, Stig-Arne Grönroos, Mikko Kurimo, and Sami Virpioja. A comparative study on minimally supervised morphological segmentation. Computational Linguistics, 2016. [ bib | .pdf ]

Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo. Hybrid morphological segmentation for phrase-based machine translation. In Proceedings of the First Conference on Machine Translation. ACL, 2016. [ bib | .pdf ]

2015

Stig-Arne Grönroos, Sami Virpioja, and Mikko Kurimo. Tuning phrase-based segmented translation for a morphologically complex target language. In Proceedings of the Tenth Workshop on Statistical Machine Translation. ACL, 2015. [ bib | .pdf ]

Sami Virpioja and Stig-Arne Grönroos. LeBLEU: N-gram-based translation evaluation score for morphologically complex languages. In Proceedings of the Tenth Workshop on Statistical Machine Translation. ACL, 2015. [ bib | .pdf ]

Stig-Arne Grönroos, Kristiina Jokinen, Katri Hiovain, Mikko Kurimo, and Sami Virpioja. Low-resource active learning of North Sámi morphological segmentation. In International Workshop on Computational Linguistics for Uralic Languages, 2015. [ bib | http ]

2014

Stig-Arne Grönroos, Sami Virpioja, Peter Smit, and Mikko Kurimo. Morfessor FlatCat: An HMM-based method for unsupervised and semi-supervised learning of morphology. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics. ACL, 2014. [ bib | .pdf ]

Peter Smit, Sami Virpioja, Stig-Arne Grönroos, and Mikko Kurimo. Morfessor 2.0: Toolkit for statistical morphological segmentation. In 14th Conference of the European Chapter of the Association for Computational Linguistics, 2014. Software Demonstration. [ bib | .pdf ]

Stig-Arne Grönroos. Semi-supervised induction of a concatenative morphology with simple morphotactics. Master's thesis, Department of Information and Computer Science, Aalto University, 2014. [ bib | http ]

2013

Sami Virpioja, Peter Smit, Stig-Arne Grönroos, and Mikko Kurimo. Morfessor 2.0: Python implementation and extensions for Morfessor Baseline. Report 25/2013 in Aalto University publication series SCIENCE + TECHNOLOGY, Department of Signal Processing and Acoustics, Aalto University, 2013. [ bib | http ]

2010

Stig-Arne Grönroos. Parallelliserad klassifikation av dokumentsamling med hjälp av MapReduce. Bachelor's thesis, Department of Information and Computer Science, Aalto University, 2010. [ bib | http ]