Lexical preferences in the Perfect construction

Research output: Contribution to conference › Poster › Academic

Abstract

Although earlier corpus studies of the Perfect exist (e.g. Nishiyama and Koenig (2010) for English, van der Klis et al. (2017) for a cross-linguistic analysis), it has not been established whether all verbs appear equally often in the Perfect. As Goldberg et al. (2004) state, high-frequency collexemes could help language learners notice correlations between the meaning of the word and the construction itself. We set out to find whether such collexemes exist for English, and whether there is any variation across genres or languages.

The English present perfect is constructed using an auxiliary Have followed by a past participle (e.g. “he has left”). However, the auxiliary and the past participle need not be directly adjacent to each other (e.g. “he has just left”). Another caveat is that the past participle “been” might start a passive (“has been read”) or progressive (“has been reading”) construction. We therefore used an extraction algorithm by van der Klis et al. (2015) to extract present perfect occurrences from the British National Corpus, including passive forms but excluding progressive forms.
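The three caveats above (non-adjacent auxiliary, passive, progressive) can be illustrated with a small sketch. This is not the extraction algorithm of van der Klis et al. (2015), whose details are not given here; it is a simplified heuristic under stated assumptions: the input is a list of (word, tag) pairs, the tag names `RB`, `VBN` and `VBG` are assumed Penn-style labels, and the function names and the `max_gap` window are hypothetical.

```python
# Simplified illustration only; NOT the algorithm of van der Klis et al. (2015).
# Assumed input: a list of (word, tag) pairs with Penn-style tags (an assumption).
HAVE_FORMS = {"have", "has", "'ve"}  # present-tense forms of the auxiliary
ADVERB_TAG = "RB"
PARTICIPLE_TAG = "VBN"
GERUND_TAG = "VBG"


def next_non_adverb(tokens, start, max_gap=2):
    """Index of the first non-adverb token from `start`, skipping at most
    `max_gap` adverbs; None if no such token is found."""
    i = start
    while i < len(tokens) and i - start <= max_gap:
        if tokens[i][1] != ADVERB_TAG:
            return i
        i += 1
    return None


def find_present_perfects(tokens):
    """Return (kind, words) pairs, where kind is 'active' or 'passive'.
    Progressive forms ("has been reading") are skipped."""
    hits = []
    for i, (word, _tag) in enumerate(tokens):
        if word.lower() not in HAVE_FORMS:
            continue
        j = next_non_adverb(tokens, i + 1)
        if j is None or tokens[j][1] != PARTICIPLE_TAG:
            continue  # "have" used as a main verb, or no participle nearby
        kind, end = "active", j
        if tokens[j][0].lower() == "been":
            k = next_non_adverb(tokens, j + 1)
            if k is not None and tokens[k][1] == GERUND_TAG:
                continue  # progressive: excluded
            if k is not None and tokens[k][1] == PARTICIPLE_TAG:
                kind, end = "passive", k  # passive: included
        hits.append((kind, [w for w, _ in tokens[i:end + 1]]))
    return hits


example = [("he", "PRP"), ("has", "VBZ"), ("just", "RB"), ("left", "VBN")]
print(find_present_perfects(example))
```

A real extractor would also have to handle inversion in questions, negation, coordination and tagging errors, none of which this sketch attempts.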

We retrieved co-occurrence frequencies for all verbs (i.e. the lemmata of the past participle) and compared these with the total frequency of these verbs across the corpus in a collostructional analysis (Stefanowitsch and Gries, 2003). We found that (atelic) state verbs (like ‘be’ and ‘have’) scored high on Attraction (the relative frequency of the lemma given the Perfect), but low on Reliance (the relative frequency of the Perfect given the lemma).

Conversely, (telic) achievement verbs like ‘finish’ and ‘lose’ scored high on Reliance, but low on Attraction. Correcting for contingency information using the Fisher-Yates exact test reveals that (telic) accomplishments (‘show’, ‘prove’) and perception verbs (‘see’, ‘hear’) are most typical for the English present perfect. Interestingly, we find some variation between genres. Newspaper items attract reporting verbs like ‘agree’ or ‘decide’, but repel cognitive state verbs like ‘think’, ‘believe’ and ‘know’. Fiction and the spoken section repel the same verbs, but attract verbs like ‘happen’ and ‘change’, verbs that denote progression.
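The three measures named above can be sketched for a single lemma as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the counts in the usage lines are invented toy values rather than BNC figures, and the exact test is computed as a one-sided hypergeometric tail (attraction only, not repulsion).

```python
from math import comb


def attraction(freq_in_cx, cx_total):
    """Relative frequency of the lemma given the Perfect."""
    return freq_in_cx / cx_total


def reliance(freq_in_cx, lemma_total):
    """Relative frequency of the Perfect given the lemma."""
    return freq_in_cx / lemma_total


def fisher_attraction_p(freq_in_cx, lemma_total, cx_total, corpus_total):
    """One-sided Fisher exact p-value: the probability of observing at least
    `freq_in_cx` co-occurrences if lemma and construction were independent.
    Computed as a hypergeometric tail over the 2x2 contingency table."""
    upper = min(lemma_total, cx_total)
    denom = comb(corpus_total, cx_total)
    tail = sum(
        comb(lemma_total, k) * comb(corpus_total - lemma_total, cx_total - k)
        for k in range(freq_in_cx, upper + 1)
    )
    return tail / denom


# Invented toy counts (NOT corpus data): 10 verb tokens in total, 5 of them
# in the Perfect; the lemma occurs 4 times, all 4 in the Perfect.
print(attraction(4, 5))
print(reliance(4, 4))
print(fisher_attraction_p(4, 4, 5, 10))
```

The exact binomial arithmetic is only practical for small counts; at corpus scale one would more likely rely on a statistics library such as SciPy's `scipy.stats.fisher_exact`. A smaller p-value indicates stronger attraction, which is how the test corrects for lemmas that score high on Attraction or Reliance merely because of their overall frequency.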

We ran a similar analysis on the Europarl parallel corpus (Tiedemann, 2012), comparing French and English. From the literature, we know the distribution of the Perfect is rather different between these languages, especially in narration, where the French passé composé is licensed, but the English present perfect is not (e.g. de Swart (2007), Schaden (2009)). In our collostructional analysis, we find that French, like English, repels cognitive state verbs, but, unlike English, strongly attracts verbs like ‘voter’ (to vote) and ‘dire’ (to say). Note that in this analysis, we used the French and English parts of Europarl as comparable monolingual corpora rather than aligned corpora. All in all, the role of the lexicon in the Perfect is not to be dismissed: not all verbs are equally willing to appear in a Perfect, and these dissimilarities could well prove to be of interest for getting closer to a semantics of the Perfect.
Original language: English
Publication status: Published - 26 Jan 2018
Event: Computational Linguistics in the Netherlands - Nijmegen, Netherlands
Duration: 26 Jan 2018 → …

Conference

Conference: Computational Linguistics in the Netherlands
Abbreviated title: CLIN2018
Country/Territory: Netherlands
City: Nijmegen
Period: 26/01/18 → …
