Looking for preselected multiword units in an untagged corpus of written Italian: maximizing the potential of the search program DBT

COFFEY, STEPHEN JAMES
1995-01-01

Abstract

Language research carried out with the aid of computer corpora relies, above all, on the fact that modern alphabet-based written language is usually represented as a linear sequence of ‘words’ (or analogous text elements) interspersed with spaces and punctuation marks. These words, however, both in the language and in the corpus, do not necessarily coincide with units of meaning. One of the consequent problems for the researcher is how to look for phraseological units in a corpus. In the present paper, the authors describe the problems encountered while searching for specific Italian multiword units in an Italian corpus, and how a flexible search program such as DBT could be used to greatest advantage in order to overcome such problems.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/25123