TANL is a suite of tools for text analytics based on the software architecture paradigm of data driven pipelines. The strategies for upgrading TANL to the use of Universal Dependencies range from a minimalistic approach consisting of introducing pre/post-processing steps into the native pipeline to revising the whole pipeline. We explore the issue in the context of the Italian Treebank, considering both the efforts involved, how to avoid losing linguistically relevant information and the loss of accuracy in the process.
Adapting the TANL tool suite to Universal Dependencies
ATTARDI, GIUSEPPE;SIMI, MARIA
2016-01-01
Abstract
TANL is a suite of tools for text analytics based on the software architecture paradigm of data driven pipelines. The strategies for upgrading TANL to the use of Universal Dependencies range from a minimalistic approach consisting of introducing pre/post-processing steps into the native pipeline to revising the whole pipeline. We explore the issue in the context of the Italian Treebank, considering both the efforts involved, how to avoid losing linguistically relevant information and the loss of accuracy in the process.File in questo prodotto:
File | Dimensione | Formato | |
---|---|---|---|
LREC16_UDpipeline.pdf
accesso aperto
Descrizione: Articolo principale
Tipologia:
Versione finale editoriale
Licenza:
Creative commons
Dimensione
424.24 kB
Formato
Adobe PDF
|
424.24 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.