Analysing anaphoric ambiguity in natural language requirements

Hui, Yang; De Roeck Anne,; Gervasi, Vincenzo; Alistair, Willis; Bashar, Nuseibeh

doi:10.1007/s00766-011-0119-y

Many requirements documents are written in natural language (NL). However, with the flexibility of NL comes the risk of introducing unwanted ambiguities in the requirements and misunderstandings between stakeholders. In this paper, we describe an automated approach to identify potentially nocuous ambiguity, which occurs when text is interpreted differently by different readers. We concentrate on anaphoric ambiguity, which occurs when readers may disagree on how pronouns should be interpreted. We describe a number of heuristics, each of which captures information that may lead a reader to favor a particular interpretation of the text. We use these heuristics to build a classifier, which in turn predicts the degree to which particular interpretations are preferred. We collected multiple human judgements on the interpretation of requirements exhibiting anaphoric ambiguity and showed how the distribution of these judgements can be used to assess whether a particular instance of ambiguity is nocuous. Given a requirements document written in natural language, our approach can identify sentences that contain anaphoric ambiguity, and use the classifier to alert the requirements writer of text that runs the risk of misinterpretation. We report on a series of experiments that we conducted to evaluate the performance of the automated system we developed to support our approach. The results show that the system achieves high recall with a consistent improvement on baseline precision subject to some ambiguity tolerance levels, allowing us to explore and highlight realistic and potentially problematic ambiguities in actual requirements documents.