Conference article

Universal Dependencies for Portuguese

Alexandre Rademaker
IBM Research and EMAp/FGV, Brazil

Fabricio Chalub
IBM Research, Brazil

Livy Real
University of São Paulo, Brazil

Cláudia Freitas
PUC-Rio, Brazil

Eckhard Bick
University of Southern Denmark, Denmark

Valeria de Paiva
Nuance Communications, USA

Published in: Proceedings of the Fourth International Conference on Dependency Linguistics (Depling 2017), September 18-20, 2017, Università di Pisa, Italy

Linköping Electronic Conference Proceedings 139:23, pp. 197-206

Published: 2017-09-13

ISBN: 978-91-7685-467-9

ISSN: 1650-3686 (print), 1650-3740 (online)


This paper describes the creation of a Portuguese corpus that follows the guidelines of the Universal Dependencies framework. Instead of starting from scratch, we converted an existing Portuguese corpus, Bosque. The conversion applied a context-sensitive set of Constraint Grammar rules to the corpus's original deep linguistic analysis, which had been produced by the parser PALAVRAS and then partly corrected by hand. Universal Dependencies offers the promise of greater parallelism between languages, a benefit for researchers in many areas. We report the challenges of dealing with Portuguese, a Romance language, in the hope that our experience will help others.


