Conference article

Finance document Extraction Using Data Augmentation and Attention

Ke Tian
OPT Inc, Tokyo, Japan

Zijun Peng
Harbin Institute of Technology (Weihai), China

Published in: Proceedings of the Second Financial Narrative Processing Workshop (FNP 2019), September 30, Turku, Finland

Linköping Electronic Conference Proceedings 165:1, pp. 1-4

NEALT Proceedings Series 40:1, pp. 1-4

Published: 2019-09-30

ISBN: 978-91-7929-997-2

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

This paper describes aiai, the system that the team submitted to the FinToc-2019 shared task. The shared task comprises two subtasks: detecting titles among non-title text in finance documents, and predicting the table of contents (TOC) of finance PDF documents. Data augmentation and attention-based LSTM and BiLSTM models are applied to the title-detection subtask. Experiments show that these methods perform well at predicting titles in finance documents, and the submission ranked 1st on the title detection leaderboard.

Keywords

FinToc-2019 shared task, data augmentation, attention-based LSTM, BiLSTM
