Unity in Diversity: A Unified Parsing Strategy for Major Indian Languages

Tandon, Juhi; Misra Sharma, Dipti

Conference article

Juhi Tandon
Kohli Center on Intelligent Systems (KCIS), International Institute of Information Technology, Hyderabad (IIIT-H), Gachibowli, Hyderabad, India

Dipti Misra Sharma
Kohli Center on Intelligent Systems (KCIS), International Institute of Information Technology, Hyderabad (IIIT-H), Gachibowli, Hyderabad, India

Published in: Proceedings of the Fourth International Conference on Dependency Linguistics (Depling 2017), September 18-20, 2017, Università di Pisa, Italy

Linköping Electronic Conference Proceedings 139:29, p. 255-265

Published: 2017-09-13

ISBN: 978-91-7685-467-9

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

This paper presents our work on applying a non-linear neural network to parse five resource-poor Indian languages belonging to two major language families: Indo-Aryan and Dravidian. Bengali and Marathi are Indo-Aryan languages, whereas Kannada, Telugu and Malayalam belong to the Dravidian family. While a little work has previously been done on linear transition-based parsing for Bengali and Telugu, we present one of the first parsers for Marathi, Kannada and Malayalam. All of these Indian languages have free word order and range from moderately to very rich in morphology. We therefore propose using linguistically motivated morphological features (suffix and postposition) in the non-linear framework to capture the intricacies of both language families. We also capture chunk information and gender, number and person information elegantly in this model. We put forward ways to represent these features cost effectively
