Complex Systems: 'How complex is it to build a Yup'ik language spellchecker?' April 11, 2014

by Michelle Saport

Friday, April 11, 11:30 a.m.-1 p.m. ConocoPhillips Integrated Science Building, Room 105A

Eric Somerville, UAA Department of Computer Science and Engineering, will present "How complex is it to build a Yup'ik language spellchecker?" This is the last Complex Systems talk for the spring semester.

Abstract: Correctly parsing agglutinative languages with rich morphophonological grammars into their constituent morphemes has proven an important element of non-English natural language processing (NLP). Yup'ik is an agglutinative language for which no spell checking application has yet been developed. We examine two methods to account for the morphophonology in Yup'ik word-forming grammar: constructing a reference table by hand, allowing for manual definition of a near-complete rule set; and developing a Markov model to generate a rule set probabilistically. The Yup'ik Eskimo Dictionary (Jacobson, 2012) is used as a corpus in both methods, allowing for a comparative analysis of accuracy and correctness between the manually defined model and the probabilistic Markov model.

"Complex Systems: 'How complex is it to build a Yup'ik language spellchecker?' April 11, 2014" is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.