CAPISCO@CONcreTEXT 2020: (Un)supervised Systems to Contextualize Concreteness with Norming Data
Alessandro Bondielli; Lucia Passaro; Alessandro Lenci
2020
Abstract
This paper describes several approaches to automatically rating the concreteness of concepts in context, developed for the EVALITA 2020 “CONcreTEXT” task. Our systems focus on the interplay between words and their surrounding context by (i) exploiting annotated resources, (ii) using BERT masking to find potential substitutes for the target in specific contexts and measuring their average similarity to concrete and abstract centroids, and (iii) automatically generating labelled datasets to fine-tune transformer models for regression. All approaches were tested on both English and Italian data. The best system for each language ranked second in the task.

File | Size | Format |
---|---|---|
Bondielli_etal_CAPISCO.pdf (open access). Description: main article. Type: final published version. License: Creative Commons | 353.67 kB | Adobe PDF |
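
Approach (ii) in the abstract, comparing substitute words with concrete and abstract centroids, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy 3-dimensional vectors stand in for real embeddings of the BERT-proposed substitutes, the seed lists stand in for words drawn from norming data, and the score (mean cosine similarity to the concrete centroid minus mean cosine similarity to the abstract centroid) is one plausible way to combine the two similarities.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def concreteness_score(substitute_vectors, concrete_centroid, abstract_centroid):
    """Average, over substitutes, of (similarity to the concrete centroid
    minus similarity to the abstract centroid). This combination is an
    assumption of this sketch, not the scoring used in the paper."""
    diffs = [
        cosine(v, concrete_centroid) - cosine(v, abstract_centroid)
        for v in substitute_vectors
    ]
    return sum(diffs) / len(diffs)


# Hypothetical toy embeddings (real ones would come from a language model).
concrete_seeds = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]
abstract_seeds = [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
c_concrete = centroid(concrete_seeds)
c_abstract = centroid(abstract_seeds)

# Hypothetical vectors for substitutes proposed for a masked target word.
substitutes = [[0.8, 0.2, 0.0], [0.7, 0.1, 0.1]]
score = concreteness_score(substitutes, c_concrete, c_abstract)
print(f"concreteness score: {score:.3f}")
```

Under these assumptions a positive score means the substitutes sit closer to the concrete centroid than to the abstract one, and a negative score means the opposite. In a real system the substitutes would be the top predictions of a masked language model for the target position in the given sentence.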
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.