The determiner A as a constituent of phraseological items: an estimate of its frequency in contemporary British English
COFFEY, STEPHEN JAMES
2004-01-01
Abstract
Many word forms in English function as lexical units in their own right but are also found as constituents of multiword units. Too often, the frequency information available covers all occurrences of a given word form, irrespective of whether it is functioning as an independent lexical unit or forming part of a larger multiword unit. For example, there are 997 tokens of ‘greenhouse’ in the British National Corpus, but in over 500 of these cases ‘greenhouse’ is a constituent of the phrasal units (the) greenhouse effect or greenhouse gas/gases. This is a very obvious example, and one which is relatively easy to investigate. At the other end of the scales of ‘obviousness’ and ‘difficulty’ are the very high frequency words, many of them primarily grammatical in function. The proposed presentation investigates one of these, the determiner A. Its precise purpose is to present an estimate of the proportion of occurrences of the determiner A in modern English which are best viewed in terms of its obligatory presence in (relatively) fixed phrasal units. The corpus used in the study was the British National Corpus, and the methodology involved downloading several large random samples of the word A. The paper presents two types of information: first, a description of the types of phrase which were included in the statistics, those which were not, and those which caused particular problems; and second, the quantitative results for both the whole corpus and the spoken corpus, including and excluding certain types of phrasal unit.
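As a rough illustration of the sampling approach described in the abstract, the following Python sketch draws a random sample of concordance lines and estimates the proportion judged to be phrasal, with a confidence interval. It is not the author's procedure: the function names, the boolean labels and the choice of a Wilson score interval are all assumptions made for this example, and no figures from the study are reproduced.

```python
import math
import random


def draw_sample(hit_ids, size, seed=0):
    """Draw a simple random sample (without replacement) of concordance hits.

    `hit_ids` is a hypothetical list of identifiers, one per corpus occurrence.
    """
    rng = random.Random(seed)
    return rng.sample(hit_ids, size)


def estimate_proportion(labels, z=1.96):
    """Estimate the share of sampled tokens judged to be part of a phrasal unit.

    `labels` is a list of booleans produced by manual annotation (hypothetical).
    Returns (proportion, lower, upper), where the bounds come from a Wilson
    score interval; z=1.96 corresponds to roughly 95% confidence.
    """
    n = len(labels)
    if n == 0:
        raise ValueError("sample is empty")
    k = sum(1 for is_phrasal in labels if is_phrasal)
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, centre - half, centre + half


if __name__ == "__main__":
    # Invented labels for demonstration only; these are not results from the paper.
    demo_labels = [True] * 50 + [False] * 50
    print(estimate_proportion(demo_labels))
```

Running the estimate separately on samples from the whole corpus and from the spoken component, with and without particular phrase types counted as phrasal, would mirror the comparisons the abstract describes.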