1. INTRODUCTION
Linguistic research of the past decades has delivered a wealth of information on the phonological structure of signed languages (Sandler 1989, Brentari 1998, van der Kooij 2002). Although the difference between the spoken and the signed modality can be witnessed at the level of the phonetic features, the structural organisation of these features in terms of dependency relations and units like syllables and prosodic words does not appear to be different from that of spoken languages (Corina & Sandler 1993, van der Hulst 1993). The principal aim of the present study is to find out which prosodic features (such as hand movement patterns and non-manual signals) are used to express the information-structural notions of information focus and contrastive focus in one particular signed language, namely Sign Language of the Netherlands (NGT, Nederlandse Gebarentaal). Spoken languages employ various syntactic, morphological and prosodic means to express focus. While there appear to be spoken languages, such as Yucatec Maya (Kügler, Skopeteas & Verhoeven 2007, Gussenhoven & Teeuw 2008) or Northern Sotho (Zerbian 2007), which do not express focus in the prosody at all, this does not seem likely for a signed language like NGT, given the rich set of prosodic features that are available in signed languages. A second aim is to investigate whether the prosodic manifestation of focus is influenced by the lexical phonological properties of the focused constituent.
We will start by summarising the current knowledge of prosodic features and structures in signed languages. Much of this knowledge has been based on the study of only two signed languages: American Sign Language (ASL) and Israeli Sign Language (ISL). Our own work on Sign Language of the Netherlands extends this data set by considering another signed language, yet it will become clear that no major structural cross-linguistic differences have been found as yet.
We present the results of an empirical study on the prosodic expression of focus in NGT, finding that both manual and non-manual cues may signal focused constituents. No clear distinction between cues for information focus versus contrastive focus was found, yet contrastive focus appears to lead to increased articulatory effort and it also appears to recruit the contrastive use of locations in signing space. Moreover, we came across the use of focus particles, some of which have not been studied before, and lexical-semantic patterns that appear to be recruited specifically to express focused information.
On the basis of our findings in NGT, we will propose a model of sign language prosody that emphasises the resemblance to the prosodic phonology of spoken language like earlier models have done, integrating the linguistic, paralinguistic, and extralinguistic influences on the phonetic appearance of the prosodic structure. At the same time, our analysis capitalises on the distinction between phonetic appearance and phonological features. We propose an abstract prosodic phonological feature to represent some of the non-manual findings that, depending on the context, may be realised by various articulators. In the manual domain we found evidence for a variable implementation of articulatory enhancement for focus, depending on the phonological make-up of the sign.
2. PROSODY IN A SIGNED LANGUAGE
Prosody is typically characterised as the collection of all phonetic or phonological phenomena that go beyond the segmental level in spoken languages (Ladd 1996, Gussenhoven 2004). For example, Rietveld & van Heuven (1997: 231) define prosody as ‘the whole of temporal and melodic properties of speech utterances that are not due to the sequencing of vowels and consonants that form the segmental content’ (our translation). Although one might typically think of intonation, rhythm and stress when speaking about prosody, languages differ in whether prosodic properties such as length and tone also play a contrastive phonological role in the lexicon. In Finnish, for example, both vowel length and consonant length are lexically distinctive, whereas in French segment length is not phonologically contrastive at all.
The study of signed language prosody only started in the late eighties of the past century. In the next section we give an overview of the prosodic properties of signed languages and the studies of their functions in various sign languages. The information status of a spoken or signed item is among the functions that may be expressed by prosodic properties, as we illustrate in the subsequent sections.
2.1 General prosodic structure and features in signed languages
Leaving aside the difference in perceptual and articulatory channel, signed languages are not fundamentally different from spoken languages, as research of the past fifty years has shown (for an overview, see Sandler & Lillo-Martin 2006, Brentari 2010). Lexical items are formed by phonological features that draw on the specifics of the visual channel and that represent bodily actions that are not those of the vocal tract but rather manual and facial gestures. The rich repertoire of possible manual articulations leads to very little sequential patterning. Indeed, most signs can be considered single segments according to some phonological models (Brentari 1998, Channon 2002, van der Kooij 2002, van der Kooij & van der Hulst 2005, van der Kooij & Crasborn 2008). Sequential patterning at the word level does occur, however: compounds can consist of two syllables (manual movement units), for example, and on the surface repeated movements within a lexical sign (just as in sequences of lexical items) will also lead to strings of syllables. In the phonological model of Brentari (1998), all properties of movement in a lexical sign are joined under a node that is labelled ‘prosodic features’ as opposed to the inherent features that represent the static properties of articulator configuration (handshape, orientation) and place of articulation. [Footnote 2] Uniform accounts of dynamic features are found in other phonological models (e.g. van der Hulst 1993, van der Kooij 2002), although it has also been proposed that there is a movement segment that alternates with location segments (Liddell & Johnson 1986, 1989; Sandler 1989; Perlmutter 1992).
The work of Sandler (1999a, b, 2006; Nespor & Sandler 1999) and others has shown that in ASL and ISL, syllables in turn are organised into groups that are identical to the prosodic constituents proposed in the prosodic hierarchy of Nespor & Vogel (1986). The types of phonetic cues that can be observed as evidence for particular domains such as the phonological word, the phonological phrase and the intonational phrase are typically (but not exclusively) non-manual in nature (see Sandler 1999b, c for ISL; Sandler & Lillo-Martin 2006 for ASL; van der Kooij & Crasborn 2008 for NGT). Indeed, the gestures of facial articulators such as the eyebrows, eyelids, lower jaw and lips, but also movements of the head and the upper body, have often been compared to intonation in spoken languages (Fischer 1975; Wilbur 1990a, 2000; Sandler 1999b, c, 2005; Sandler & Lillo-Martin 2006; Dachkovsky & Sandler 2009).
Non-manual features play a limited role in the lexicon of signed languages. Aside from the pervasive use of mouth actions across the whole lexicon (see papers in Boyes Braem & Sutton-Spence 2001 for various languages; for NGT see Crasborn et al. 2008a, van de Sande & Crasborn 2009, Bank, Crasborn & van Hout 2011), all other non-manual features only occur sporadically in the lexicon. It is clearly not the case that they function as distinctive phonological features in the lexicon of NGT. Appearances of specific facial expressions that come with lexical items typically do not show phonological patterning and are derived from non-linguistic facial expressions showing emotions or other human behaviour. For example, the contrast in eye aperture found in the NGT signs FAR-AWAY (narrowed) versus SURPRISE (wide open) is not a recurrent phonological distinction that is used to create minimal pairs. Rather, it derives from the squinting typically related to staring in the distance and from the general human facial expression that comes with surprised affect, respectively. Likewise, a forward lean of the upper body is found in lexical items that semantically relate to eagerness or involvement whereas a backward lean occurs in signs that entail rejection or disgust (see van der Kooij, Crasborn & Emmerik 2006 on NGT; Wilbur & Patschke 1998 on ASL). In none of the signed languages studied so far has evidence been found that non-manual phonetic features are used contrastively throughout the lexicon, but this could in principle be due to the strong focus on ASL and other western sign languages in the literature.
In phrasal contexts, non-manual cues like head and upper body movement and facial gestures can be considered independent phonetic channels that allow articulations independent of the manual lexical gestures. Their independent timing is seen in utterances where a shaking head accompanies a sequence of signs that together form one sentence, for example. As in spoken Dutch interaction, headshakes express negation and head nods express affirmation; there is some evidence that in sign languages, these non-manual actions are more tightly aligned with strings of lexical items than in spoken languages (Baker-Shenk 1983, Dachkovsky & Sandler 2009). From the earliest studies of ASL, head and eyebrow movements have been noticed as playing a key role in the syntax of the language, involved in the expression of negation, affirmation and different types of questions (Bellugi & Fischer 1972, Friedman 1976, Liddell 1980). In the early work on non-manuals, various non-manual cues were considered the direct expression of morphosyntactic structure or semantic features (see Sandler 2010 for discussion). Although the non-manual articulations were sometimes compared to intonation or prosody more generally, there was no interpretation of these articulations as phonetic features of a phonological prosodic structure that is separate from the semantic-morphosyntactic structure. This is perhaps not surprising given the fact that the phonetics–phonology relation for prosody has really only been made explicit since the early 1980s (Pierrehumbert 1980, Pierrehumbert & Beckman 1988, Gussenhoven 2004).
However, in recent decades too, analyses of prosody in signed languages have rarely aimed to propose a clear distinction between a phonological level of representation and its phonetic implementation (Wilbur 1990a, b, 1995; Nespor & Sandler 1999; Sandler 1999b, c). Syntactic analyses of focus tend to conflate syntactic structure with prosodic appearance as well, and do not discuss the prosodic form in great detail. For example, the analysis of ASL and LSB (Brazilian Sign Language) by Quadros and colleagues (Lillo-Martin & Quadros 2004, Nunes & Quadros 2008) conflates a syntactic analysis of focus with its phonetic appearance, and does not discuss the phonological form in any detail. [Footnote 3] Sandler proposes an analysis of sign prosody in terms of the prosodic hierarchy theory of Nespor & Vogel (1986), which makes a clear distinction between syntactic and phonological representations, but the phonetics–phonology interface remains implicit and receives little discussion (Nespor & Sandler 1999; Sandler 1999a, b, c; Sandler & Lillo-Martin 2006).
Two studies that do focus on the phonetic form in great detail are Baker-Shenk (1983) on ASL and Coerts (1992) on NGT, both looking at variation in the appearance of phonetic forms relating to questions and topic constituents, among others. In her study of non-manual marking of several sentence types in NGT, Coerts (1992) found that there is a large amount of variation in the facial expressions accompanying questions. Her conclusion was that certain features typically accompany yes–no questions on the one hand (head tilted forward, eyebrows raised) and wh-questions on the other hand (head tilted backward, eyebrows frowned). It was left to future investigations to find out what determined the variation present in her data. One clear factor in an attempt to explain variation in facial expression is the role that the face plays in the expression of affective states. Ekman (1972, 1982, 1993) showed that several basic emotions are cross-culturally expressed with similar facial expressions. These expressions include non-manual articulations that are also used for the expression of linguistic information in sign languages. Consequently, the appearance of linguistic non-manual markers is complicated by the expression of paralinguistic cues, such as expressing emotional states. While some researchers have argued that linguistic and affective signals are clearly distinct (Baker-Shenk 1983, Reilly, McIntire & Bellugi 1990, Reilly & Bellugi 1996, Dachkovsky & Sandler 2009), it has also been shown that linguistic and paralinguistic signals interact in a complex way in the appearance of eyebrow states (see Weast 2008 for ASL; de Vos, van der Kooij & Crasborn 2009 for NGT).
This is not surprising if we take into account that in signed languages, just as in the prosody of spoken languages (Gussenhoven 2003, 2004), linguistic and paralinguistic motivations for and influences on certain articulations can co-occur. Thus, it will be important in any study of the prosody of a signed language to take into account that paralinguistic signals can interact with the expression of linguistic signals. We will return to this point in the discussion in Section 4.
In one classic study on ASL, Baker-Shenk (1983) aimed to distinguish affective and linguistic eyebrow movements. She found that the temporal envelopes of brow gestures differ in two ways: while linguistic movements have a short temporal onset and offset, affective movements show more variation. Also, the peak value of the linguistic gestures was kept constant over a certain time span, while in the affective cases, there typically was a fluctuating pattern. Unfortunately these findings have never been replicated, whether on ASL or another signed language. More crucially, there has been no perceptual study that demonstrates that these specific differences in sources of a signal are actually picked up by sign language addressees. The articulatory distinctions found by Baker-Shenk demonstrated most of all that indeed there is a linguistic background to some of the non-manual articulations.
In conclusion, while the featural content of signed languages is clearly different as a direct consequence of the difference between the auditory and the visual modality, structurally, signed languages have been argued to be similar to spoken languages. As intonation in spoken languages is often used for the expression of information structural distinctions such as focused versus unfocused constituents (Gussenhoven 2004), the question now arises to what extent prosody in signed languages is also used for these purposes. This is the question we aim to answer in this paper, leaving open the question whether there are also syntactic means to express focus in terms of word order, for instance.
2.2 The information-structural notion of focus
One can take different perspectives on the phenomenon of focus; semantic, pragmatic, syntactic, discourse, and prosodic aspects have all been studied in detail for many spoken languages (e.g. Vallduví 1992, de Swart & de Hoop 1995, Ladd 1996, Steedman 2000, Gussenhoven 2004). These studies tend to share the overall conception that the notion ‘focus’ forms part of the information structure component of languages, and refers to new, important or unexpected information which the speaker assumes not to be shared between him and the interlocutor (Jackendoff 1972, Krifka 2007).
Vallduví (1992) made a three-way distinction between ‘link’, ‘focus’ and ‘tail’. The link or common ground (also called ‘topic’) relates the current sentence to the preceding discourse by selecting one of the elements in the preceding discourse. The focus is new or contrastive information that is related in some way to the topic, and the tail is all other information. The focused items (semantically or pragmatically important elements) are prominent and they stand out from the surrounding context. The prominence of focused elements in discourse can be reflected by the syntactic and phonological structure.
Gundel (1999) made a further distinction between three aspects of the term ‘focus’ that are all relevant to linguistic analysis: psychological focus, semantic focus, and contrastive focus. ‘Psychological focus’ refers to the shared centre of attention of both speech participants. The ‘semantic focus’ is the new information that is expressed about the topic of the sentence, and thus is a relational concept. It is also referred to as ‘presentational focus’ (Zubizarreta 1998) or ‘information focus’ (É. Kiss 1998); we adopt the latter term in this paper. The information focus can be considered the answer to an implicit or explicit question, as in the examples in (1) below. The two declarative sentences in (1) are instances of information focus, contributing new information to the discourse, yet they differ in the number of phrases, lexical items, and syllables that make up the focused constituent (which is in square brackets).
(1)
(a) What does Connie do?
She [reads a novel].
(b) What does Connie read?
She reads [a novel].
It is crucial for a correct analysis of prosody to distinguish the focus constituents from the different types of focus: information focus and contrastive focus. While the information focus is relevant for evaluating the truth-conditional value of an utterance, in ‘contrastive focus’ the effort of the speaker is not to introduce new information, but rather to change the focus of attention of the addressee, to shift the topic of the discourse, or to implicitly or explicitly contrast different constituents. Information and contrastive focus can well be combined, as the example in (2), taken from Gundel (1999: 296), illustrates. In this example, the small caps mark contrastive focus, the capitals mark information focus.
(2) That [coat] you're wearing [I don't think will be WARM enough].
Both the noun coat and the adjective warm carry a pitch accent in English, but they are expressions of different types of focus. The noun coat can be said to have contrastive focus in Gundel's terminology, here shifting the topic of the discourse, while the part of the sentence containing warm has information focus (Gundel used the term ‘semantic focus’), uttering new information. [Footnote 4] There is prosodic and syntactic evidence that grammars may treat information focus and contrastive focus in different ways (Pierrehumbert & Hirschberg 1990 on English; É. Kiss 1998 on Hungarian), although it has been argued by some that this is not a categorical distinction (Krahmer & Swerts 2001, Ladd 2008, Calhoun 2010). Since information focus and contrastive focus may lead to different prosodic expressions in spoken languages, we included this distinction as a heuristic in the elicitation of our data. Contrastive focus in this study is used in the sense of pragmatic correction (of what the interlocutor has said), having a limited set of alternatives.
Dik (1997) further differentiates the ‘scope’ of the focus in a sentence: the focus may be on operators like tense, aspect and polarity, on the predicate, or on the subject or object arguments. Initial observations on focus in NGT motivated us to not only study the size of the focused constituent (broad and narrow focus) but also to differentiate various types of syntactic constituents. Thus, the distinction between focus on the subject, predicate and object was also included in our study, as will be further discussed in Section 3.1 on our data collection.
2.3 Prosodic expression of focus in signed languages
In spoken languages, the prosodic expression of focus may take the form of specific intonation contours, highlighting focused constituents in different ways. The actual form of the intonation contour will vary per language, depending on other characteristics of its tonal structure and on how the mapping takes place of a specific tonal pattern to a domain (Gussenhoven 2004, 2008).
A prominent phonetic difference between the signed and the spoken modality is that in the former, static positions of articulators can be maintained over the span of many words (Crasborn & van der Kooij 2013). A raised position of the eyebrows can be maintained from the start to the end of the sentence, thus phonetically marking a certain time span independent of the activity of other prosodic features (Baker-Shenk 1983). Signed languages thus show a distinction between these domain markers (comparable to register) and punctual markers (comparable to accents). The latter are relatively brief phonetic events like head nods and eye blinks that co-occur with a single sign or may even fall between signs, overlapping with the transitional movement from one sign to the next (Pfau & Quer 2010).
In spoken languages, so-called ‘focus projection’ is needed to interpret the focused constituent from a single accented syllable; conversely, a ‘focus-to-accent’ rule is needed to assign stress to the correct syllables on the basis of the focused constituent (Gussenhoven 1983, Féry & Samek-Lodovici 2006). For example, in each of the two declarative sentences in (1a) and (1b) above, the accent falls on the first syllable of ‘novel’, while the focused constituent is interpreted to be of different sizes (the NP in (1b) and the VP in (1a)). Although the accentuation of a spoken language syllable is a rather short-term phonetic event, the interaction with boundary tones can sometimes lead to a continuously high pitch over a longer domain. In the literature on signed languages until now, it has not been clear whether signed languages optimally exploit the phonetic possibilities of domain marking to indicate the extent of a focus constituent.
There has been limited explicit investigation of focus prosody in signed languages. The general phenomenon of focus is mentioned mostly in discussions of topic–comment constructions, where the comment contains the focused element in case of information focus. There is a rich literature on the prosodic marking of topics since Liddell's early work on ASL (Liddell 1980, Aarons 1994, Neidle, Kegl, MacLaughlin, Bahan & Lee 2000, Sandler & Lillo-Martin 2006; see also Coerts 1992 on NGT). These studies argue that it is the topic rather than the focus that is prosodically marked.
Wilbur (1990b, 1999; Wilbur & Schick 1987), following Covington (1973), argued that stress in ASL can be expressed by a variety of phonetic cues, changing the manual articulation of signs, for instance by making them larger, tenser and faster, and realising them higher in signing space. However, she claimed that ASL does not use stress for marking focus per se: stress cannot move around in ASL sentences to indicate which sign is focused. Rather, stress in ASL always falls at the end of the sentence, and signs are realised in the final position when they need to be focused. A typical sentence structure found in ASL uses what appears to be a rhetorical question to announce the focused element that follows; Wilbur (1995) analyses these constructions as ‘pseudo-clefts’. An example is given in (3).
(3) FRANCIS LIKE WHAT? FIGS [Footnote 5]
‘What Francis likes is [figs].’
Furthermore, Wilbur & Patschke (1998) proposed that in ASL there is a general non-manual morpheme that marks contrasting categories at various levels. Leaning forward vs. backward with the upper body can express prosodic, lexical, semantic and pragmatic contrast. At the prosodic level, body lean is used to stress elements that are in focus. Another non-manual marker that has been argued to mark focus in ASL is raised eyebrows (Wilbur & Patschke 1999). As the discussion in Sandler & Lillo-Martin (2006) shows, this claim is hard to evaluate given the multiple functions of eyebrows (also marking certain topics and yes–no questions, for instance) and the many competing syntactic analyses of the sentences in question. Syntactic analyses of focus have tried to formally capture the prominence of the sentence-final position as a focus position. For example, the sentence-final doubling phenomena that are described by Petronio (1991, 1993) for ASL and Quadros (1999) for Brazilian Sign Language (LSB) are analysed as focus constructions. In the LSB example in (4) the doubled final element is accompanied by a head nod (hn; Quadros 1999: 212).
(4) hn
PT-1 CAN GO PARTY CAN
‘I can go to the party.’
For NGT, we found evidence that the sentence-final position is prosodically prominent (Crasborn, van der Kooij & Ros 2012). Thus, we hypothesised that a doubled element at the end of the sentence receives emphasis because the final position in the sentence itself is prosodically prominent in some way. We argued that the nature of this final prominence is a trimoraic prosodic word, and that light elements such as cliticised pointing signs may appear for the sole purpose of filling that prominent position.
Waleschkowski (2009) studied focus marking in German Sign Language. She concludes that informational focus tends not to be marked at all (i.e., it does not receive an obligatory prosodic expression), and that contrastive focus tends to be ‘grammatically marked’ by various combinations of the non-manual features head nod, forward head tilt, raised eyebrows and widened eyes. Replacing or corrective focus, a subtype of contrastive focus, is marked obligatorily by combinations of various manual and non-manual cues including head nods, head tilts forward or raised eyebrows and tensed or enlarged signing, longer holds and realisations higher in signing space.
As Sandler (2010) points out, the discussion on non-manuals in ASL and other sign languages has been obfuscated by the conflation of the syntactic and prosodic levels, seeing phonetic cues as direct expressions of specific syntactic structures or processes. For instance, in a paper that is entitled ‘Intonation and focus in ASL’, Wilbur (1990a) only considers non-manual features to the extent that they are direct expressions of syntactic constructions to express focus. However, non-manuals have been shown to have a wide range of functions in ASL and other sign languages: aside from constituting intonational markers at the prosodic level, they can function as adverbs, and they can be part of the phonological form of lexical items. We agree with Sandler's analysis, and in our study on NGT aim to investigate prosody as a phonological component of the grammar that is distinct from syntax.
2.4 Research questions
We wanted to answer three questions in studying prosodic correlates of focus in NGT, listed in (5).
(5) Research questions
(i) What prosodic cues are related to focus distinctions in NGT (if any)?
(ii) Are there different markers for information vs. contrastive focus?
(iii) How can these prosodic phenomena be analysed phonologically and what are the implications for our model of sign language grammar?
3. MARKING FOCUS IN NGT
We start by describing the methodology of the empirical study we carried out. In Sections 3.2–3.4, we present our findings for non-manual cues, manual prosodic forms, and other structural means of marking focus. Conclusions are offered in Section 3.5.
3.1 Methodology
This study is based on elicited data from eleven native and near-native NGT signers, to avoid biasing our generalisations by the idiosyncratic behaviour of one or two subjects. All signers were between 20 and 65 years old at the time of recording (2002–2005), grew up using NGT, and considered NGT to be their primary language.
Based on the focus literature summarised above, we created question–answer pairs in written Dutch that incorporated the broad distinction between information focus and contrastive focus. [Footnote 6] In addition to the distinctions in focus, we incorporated differences in the syntactic nature of constituents: subject, object, adjectival predicate, verb/verb phrase, and whole sentence were distinguished. Finally, phonological distinctions in location and movement type were included for focused objects. Examples of the elicited information focus distinctions are given in Table 1. The full list of elicitation data and the focus distinctions they incorporate is given in the appendix.
Table 1 Selected examples of focus distinctions that were elicited.
A deaf research assistant collected the data. The elicitation material consisted of a list of 59 written Dutch statement/question–answer pairs in random order. A set of cards in the same order with only the answers in written Dutch was presented to the participating signers. First the participant read the answer on the card; then the research assistant produced a signed translation of the question, and the participant responded with the NGT translation of the appropriate answer from the cue card. Both the research assistant and the participants were recorded on video. Two cameras were used to record the answer: one zoomed in on the face, and one provided a medium shot of the whole upper body and head of the participant.
There was quite some variation in the length of the answers given by the participants, both between and within signers. Sometimes signers replied by signing the focused constituent only, while others tried to incorporate all the information that was provided in the Dutch written answer on the card, sometimes adding some further context. Post-hoc judgements by the deaf research assistant on fluency and coherence of the answer and the question established that all signers put the information of the written Dutch answer into natural and grammatical NGT. Especially the production of additional material demonstrated that participants were not creating a literal translation of the answer, but adapted to the details and content of the question.
The videos were transcribed and analysed using the ELAN annotation software.Footnote 7 Transcriptions included gloss tiers for the left and the right hand and modifications of the manual signs, and independently aligned annotations for mouth actions, eyebrows, eye blinks, eye gaze, head actions, and upper body actions.
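As an illustration of how independently aligned tiers can be related to one another, the following minimal sketch checks which non-manual annotations overlap in time with a focused manual sign. It is not the procedure used in this study: the tier names, the time values, and the `Annotation` structure are hypothetical, and the sketch does not read ELAN files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One time-aligned annotation on a tier (all names are hypothetical)."""
    tier: str
    start_ms: int
    end_ms: int
    value: str


def overlap_ms(a: Annotation, b: Annotation) -> int:
    """Return the temporal overlap of two annotations in milliseconds (0 if none)."""
    return max(0, min(a.end_ms, b.end_ms) - max(a.start_ms, b.start_ms))


def cues_during(sign: Annotation, annotations: list[Annotation],
                min_overlap_ms: int = 1) -> list[Annotation]:
    """Return annotations on other tiers that overlap with the given sign."""
    return [
        a for a in annotations
        if a.tier != sign.tier and overlap_ms(sign, a) >= min_overlap_ms
    ]


# Invented example values, for illustration only.
focused_sign = Annotation("GlossR", 1200, 1900, "#ASL")
others = [
    Annotation("EyeGaze", 300, 1150, "down"),
    Annotation("EyeGaze", 1150, 1950, "addressee"),
    Annotation("Eyebrows", 1250, 1800, "raised"),
    Annotation("Mouth", 2000, 2400, "nu"),
]

hits = cues_during(focused_sign, others)
for a in hits:
    print(a.tier, a.value, overlap_ms(focused_sign, a))
```

A simple overlap test of this kind treats every tier alike; a real analysis would probably need tier-specific criteria, for instance for head nods that follow rather than accompany a sign.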
3.2 Non-manual prosodic appearance of different types of focus
In this section, we discuss several non-manual features that either consistently accompanied the focused elements (eye gaze and mouth actions) or were found in the context of specific types of focused constituents (position and movement of head and body and brow raise).
3.2.1 Eye contact
Eye gaze has been argued to have many different functions in sign languages, including playing a part in pronominal reference, signer turn regulation, and role shift (e.g. Baker & Padden Reference Baker[-Shenk], Padden and Siple1978 on ASL; Sutton-Spence & Woll Reference Sutton-Spence and Woll1999 on British Sign Language (BSL); Meurant Reference Meurant2008 on Langue des Signes Française de Belgique (LSFB)). While we recognise some of the functions reported in the literature on other languages for NGT, there are no studies on eye gaze in NGT as yet. It will therefore be difficult at present to establish its role in the language at large, although we expect to find similar behaviours as described for other languages.
In this study we found that eye contact, or rather, gaze direction of the signer towards the face of the interlocutor, is associated with the communication of important information. All of the focused constituents in our data are characterised by eye contact with the interlocutor. Eye contact is omnipresent in signed discourse and it is probably the default behaviour during sign production for the signer to look at the face of the addressee (see Siple Reference Siple1978, and see Goodwin Reference Goodwin1980 for interaction in spoken English). However, we found some crucial examples where eye gaze at the addressee is used only during the focused constituent, and not in the rest of the sentence. For example, in the sentence expressing object information focus illustrated in Figure 1, there is eye contact only during the realisation of the focused object, the fingerspelled sign #ASL, and not in the rest of the sentence. A natural explanation for this contrast is that the signer checks whether the most important information of the sentence, contained in the focused element, is received by the interlocutor.
Figure 1 (Colour online) Eye contact only during the focused object.
While there were no exceptions to looking at the addressee during the focused constituent, quite some variability could be observed in the extent to which this really sets apart the focused part of the sentence. Apparently, if there are other (linguistic or interactive) reasons to look at the addressee before and after the focused constituent, no discontinuation in the gaze direction will be observed. As we noted above, we did not attempt to fully analyse the gaze pattern in each utterance. What does stand out from the data, nonetheless, is the variability between signers in the extent to which they looked at the addressee. Four of the eleven signers kept eye contact during the entire clause in more than two-thirds of the sentences (40 or more of the 59). This provides substance to the hypothesis that eye contact is the default behaviour between signer and interlocutor. In other words, expressing focused information overlaps with the default gaze direction, and we conclude from our findings that any other functions of gaze direction such as indicating locations in space appear to be overruled by the pressure for looking at the addressee during focused information, as illustrated by the example in Figure 1.
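The two-thirds criterion can be made explicit with a small calculation. The sketch below is illustrative only: the per-signer counts and signer labels are invented placeholders, not the data of this study; only the total of 59 items and the two-thirds threshold come from the text.

```python
from fractions import Fraction

TOTAL_ITEMS = 59            # number of elicited items per signer (from the text)
THRESHOLD = Fraction(2, 3)  # "more than two-thirds of the sentences"


def exceeds_threshold(full_clause_count: int, total: int = TOTAL_ITEMS) -> bool:
    """True if eye contact spans the whole clause in more than two-thirds of items."""
    return Fraction(full_clause_count, total) > THRESHOLD


# Smallest count that exceeds two-thirds of 59 items.
min_count = next(n for n in range(TOTAL_ITEMS + 1) if exceeds_threshold(n))

# Hypothetical per-signer counts of sentences with eye contact during the entire clause.
counts = {"S01": 52, "S02": 40, "S03": 39, "S04": 12}
mostly_eye_contact = [s for s, n in counts.items() if exceeds_threshold(n)]

print(min_count)
print(mostly_eye_contact)
```

Exact fractions are used instead of floating-point division so that the boundary case (39 versus 40 of 59) is decided without rounding.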
Bahan (Reference Bahan1996) and Neidle et al. (Reference Neidle, Kegl, MacLaughlin, Bahan and Lee2000) argued that eye gaze is used for grammatical object agreement marking. These findings were not corroborated by the eye tracking studies of Thompson, Emmorey & Kluender (Reference Thompson, Emmorey and Robert Kluender2009), who showed that eye gaze is rarely directed towards objects of plain verbs in ASL, but does occur with the locative arguments of spatial verbs. It would be interesting to establish whether eye gaze has a similar function in NGT, and whether eye contact for focus marking would also overrule a syntactic use of gaze towards a location in space or whether they are simply combined in sequence. The consistency of eye contact during the focused constituent in our data would lead us to hypothesise the former.
3.2.2 Mouth actions
Various types of mouth actions are used in NGT simultaneously with manual signs. Some of these actions clearly derive from Dutch words (and often are the full, yet silent, articulation of a word, thus adding to the specific meaning of the manual sign; these are called ‘mouthings’), while a minority does not appear to be derived from a spoken language (‘mouth gestures’). Within the latter category one can distinguish lexically specified meaningless articulations from bound morphemes with an adverbial function and from mouth actions that form part of an overall affective facial expression (see Crasborn et al. Reference Crasborn, Kooij, Waters, Woll and Mesch2008a for further discussion).
While the first dictionaries of NGT included a mouthing for only 16% of the entries (Schermer Reference Schermer1990), Crasborn et al. (Reference Crasborn, Kooij, Waters, Woll and Mesch2008a) showed that in narratives in three signed languages, including NGT, 60–90% of all manual signs were accompanied by some mouth action. For the two NGT signers, mouthings accompanied 20–30% of all signs. Recent studies of the Corpus NGT (Crasborn & Zwitserlood Reference Crasborn, Zwitserlood, Crasborn, Efthimiou, Hanke, Thoutenhoofd and Zwitserlood2008, Crasborn, Zwitserlood & Ros Reference Crasborn, Zwitserlood and Ros2008b) show that mouthings are much more pervasive in dialogues than in narratives and that there are very few manual activities that are accompanied by a neutral closed state of the mouth (van de Sande Reference Sande2009, Bank et al. Reference Bank, Crasborn and Hout2011).
With this background in mind, we looked at the mouth activity during the focused manual signs. We found that all focused signs for all signers were accompanied by some mouth action.Footnote 8 Both subjects and objects had mouthings, both in the information and in the contrastive focus conditions. Focused verbs were either accompanied by a mouthing (in some of the contrastive focus sentences) or by a mouth gesture (in most of the contrastive sentences and in the information focus sentences). As with the results on eye contact in the previous paragraph, we did not attempt to systematically study mouth activity throughout each sentence, making it hard to draw the conclusion that the presence of a lexically bound mouth action ‘marks’ the focused constituent, setting it off from constituents that do not have such an articulation. In fact, given the recent corpus findings referred to above, it is unlikely that this would be a strategy in NGT. However, the distribution and articulation of the mouth actions make them stand out from the mouth activity elsewhere in the same sentence in the following way.
First of all, the pronounced articulation of mouth actions on the focused signs was found to be a striking phonetic cue corresponding to the special information status. The mouthed words were not reduced in the number of syllables, which Bank et al. (Reference Bank, Crasborn and Hout2011) by contrast found to be quite common for the mouthing of polysyllabic Dutch words. In terms of the details of the articulation, vowels often appeared to show a wider mouth opening. In some cases the duration was clearly longer, although this was not systematically annotated and may well have corresponded to a longer manual articulation of focused elements.
A second remarkable property of mouth actions on focused constituents relates to the focused predicates. Schermer (Reference Schermer1990) found that in the first NGT dictionaries, mouthings were typically used with nouns rather than with verbs. This is intuitively plausible, given that verbs can be accompanied by mouth gestures that act as adverbials; their lexical specification would then presumably be without a mouth action, the mouth gesture being added by the morphosyntax. In our data, we found that contrastively focused verbs can also be accompanied by mouthings (i.e. mouthed words) to contrast the verb's meaning even more.
As far as the whole group of focused verbal predicates is concerned, we find either fully articulated mouthings (i.e. no deletion of syllables) or mouth gestures. Interestingly, some of these mouth gestures are formally similar to aspectual articulations (e.g. pursed lips, indicating ‘at ease’), or repeated (reduced) mouthings (e.g. ‘looplooploop’, English ‘walkwalkwalk’). These repetitions typically follow the repetition of the manual movements (see Section 3.3 below), which appear to be identical to aspectual modulations of the sign's movement indicating a repeated action (see Vogt-Svendsen Reference Vogt-Svendsen, Braem and Sutton-Spence2001 for a similar pattern in Norwegian Sign Language). The rhythmic parallel is similar to what Woll (Reference Woll, Braem and Sutton-Spence2001) called ‘echo phonology’: identity of movement properties between hands and mouth. However, the mouth gestures that resemble adverbial modulations, such as ‘at ease’, do not seem to have this adverbial meaning in these focused sentences. In discussions of these cases, signers indicated that an adverbial or aspectual meaning (such as ‘at ease’ or ‘repeated or continuous activity’) was not implied for these focused constituents. We thus conclude that mouth gestures that are similar in form to the aspectual modulations are only used here to add articulatory force in order to highlight the focused constituent.
3.2.3 Head and body position
Head nodding is used for affirmation by deaf people in the Netherlands, just as in conversations between hearing people in the Netherlands. Repeated nodding is not related to a specific grammatical function in the NGT literature, unlike headshakes, which were found to express negation in Coerts (Reference Coerts1992). In an earlier analysis of a portion of the present data, forward and backward movement of the head and body were found to be related to the expression of contrast (van der Kooij et al. Reference Kooij, Onno and Emmerik2006).Footnote 9
In addition to the forward–backward dimension, head and body leans are used in the left–right dimension to contrast two signs of the same category (subjects, objects, verbs). In some cases the body leans towards a location in signing space that was previously established. When there is a sentence-internal contrast (‘I don't like pears, but I do like apples’) body leans can in fact be used to create contrasting left–right locations. Corrective focus items in particular, in which a constituent in the question is replaced by an alternative (e.g. ‘not my brother but my sister’), frequently led to this pattern: in the response the constituent from the question that is to be replaced is repeated while leaning to one side, followed by the replacing constituent, which is produced while leaning to the other side. An example of this is presented in Figure 2. More generally, employing a left–right spatial contrast in some way or other is a very common way of expressing contrastive focus. The assumption that a bilateral spatial contrast, rather than the body lean per se, underlies this expression of contrastive focus is supported by a case of dominance reversal (switching from one hand to the other as the active articulator; Frishberg Reference Frishberg, Stokoe and Volterra1985) in one signer, found in exactly the same condition as the use of left–right leaning in other signers.Footnote 10
Figure 2 (Colour online) Head and body leans towards the left and the right in combination with dominance reversal are used to correct information.
In the larger data set investigated here, head position was found to be associated with focused constituents in an interesting way, as it appears to differentiate between subjects and objects. Both for information focus and contrastive focus on the object, upward and/or backward movement of the head was frequently found during the articulation of the object. For information focus, this happened slightly more frequently than for contrastive focus, 72% vs. 67%, respectively. Examples of upward and backward head positions on contrastively focused objects are presented in Figure 3.
Figure 3 (Colour online) Head position (a) upward on contrastively focused object; (b) backward on contrastively focused object.
For contrastively focused subject NPs, upward and backward movement of the head was less frequent (18%). Subject NPs are more commonly followed by a head nod. There are many instances of head nods related to focused constituents in our data, defined as a single forward head tilt followed by a reverse tilt. We found some evidence that head nods for focus can be differentiated from other head nods. Subject NPs in general are often immediately followed by a head nod, often very shallow. An example with the small head nod in the fourth still is given in Figure 4.
Figure 4 (Colour online) Small head nod following subject NPs.
This example makes clear that the small head nod we often find following the first constituent is associated with the subject rather than with the topic. In the above example, the subject referent is expressed in full as a topic followed by a short pause, and an indexical sign referring to the subject at the start of the comment. There is no head nod following the topic in this example. The head nod follows the third indexical sign in the sign sequence in Figure 4, referring to the subject that has information focus.Footnote 11
While such subject head nods are fairly small and subtle, the head nods that accompany contrasted constituents are more pronounced and last longer. Another difference from the subject head nods is that they are articulated simultaneously with the focused constituent, rather than following it. We found pronounced head nods on nearly all types of contrasted constituents, including replacing subject NPs, verbs, and adjectival predicates. However, interestingly, we found no examples of contrastive objects with pronounced head nods. Figure 5 presents examples of a focused subject NP and a focused adjectival predicate, respectively.
Figure 5 (Colour online) Head nod on (a) a contrastively focused subject NP; (b) a contrastively focused adjectival predicate.
In summary, although we do find some variation and none of the non-manual signals appears obligatory, by and large, focused objects are accompanied by raised or backward tilted head positions, while other contrastively focused constituents are accompanied by pronounced head nods. The latter type of head nods can be distinguished from more subtle head nods that may follow subject NPs.
3.2.4 Brow raise
As discussed above, brow raises in NGT as in many signed languages have so far been established as related to the expression of yes–no questions and topic constituents (Coerts Reference Coerts1992). Brow raises can also express paralinguistic meanings, such as surprise, in combination with a linguistic expression (de Vos et al. Reference Vos, Kooij and Crasborn2009).
We found that brow raises are also used together with focused information. In particular, brow raises accompany counter-assertions, spanning the whole utterance. This happens both in response to positive utterances (95%) and negative utterances (71%). Examples are presented in Figure 6. In both cases, the non-manual activity of both head and eyebrows starts well before the first manual sign, and lasts until well after the last sign; this was a common pattern in our data. This is contrary to the findings of Baker-Shenk (Reference Baker-Shenk1983) on ASL, who found a rapid onset and offset of non-manual activity right before and right after the manual activity in sentences.
Figure 6 (Colour online) Brow raise plus (a) head shake to mark focused negative counter-assertions; (b) head nod to mark focused positive counter-assertions. The first image shows the rest state of the face, at the end of the question of the interlocutor.
Another indication that brow raises accompany focused information comes from our finding that brow raises sometimes accompany focused constituents that are much smaller than the whole phrase, sometimes not exceeding the duration of a single sign. This happened on all kinds of focused constituents, but more frequently on objects and verbs than on subjects. Two examples are presented in Figure 7. This short brow raise was in fact also found occasionally for the counter-assertion sentences, where the raise co-occurred with the affirmative or negating particle; this is illustrated in Figure 8.
Figure 7 (Colour online) Brow raise only on (a) the focused object; (b) the focused verb.
Figure 8 (Colour online) Brow raise in counter-assertion falling on the affirmative particle.
Although we do not consider the short duration of these brow raises to be hard evidence for them stemming from linguistic rather than affective sources, we do interpret it as an indication that they are in fact related to the expression of focus in these cases. Further studies are needed to determine the nature of this relation.
3.2.5 Summary: Non-manual cues
Table 2 summarises the non-manual cues that were found for focused realisations of different constituents, highlighting the differences between the syntactic domains and the fact that multiple cues can be used for the same focus type. Whereas most features do not help in distinguishing contrastive focus from information focus, a left–right body lean distinction appears to be used only for marking contrastive focus.
Table 2 Non-manual cues for focused realisations of various constituents and of the whole clause.
3.3 Manual prosodic cues
Overall, one can impressionistically say that focused signs receive a stressed articulation. The movement appears ‘enhanced’, much as Wilbur (Reference Wilbur1990b, Reference Wilbur1999) found for ASL: signs are larger, more articulate, have a longer duration or an added hold at the end of the movement, and sharper onsets and offsets. However, the focused signs in our data revealed some more specific patterns as well, relating to their phonological representation. The height of fingerspelled words and the realisation of focused verbs show that the actual nature of the enhanced articulation may depend on the phonological specification of the sign. Whereas the path movement is repeated or prolonged in time and/or space in focused items with a path movement, a different pattern occurs in the focused items that lack a path movement in their underlying representation, more specifically in fingerspelled words.
NGT uses a one-handed fingerspelling system for the articulation of words from a(n alphabetic) written language. Fingerspelling is articulated in front of the ipsilateral side of the torso, typically at shoulder height. Our data included sentences in which the fingerspelled word #ASL was put in information and contrastive focus. For all 10 signers, the height of articulation of #ASL was evaluated with respect to the other fingerspelled words by the same signer. In realisations of #ASL as a contrastively focused object, it was realised higher than in non-focused conditions or as part of a wider focus constituent. This was a consistent pattern in all but one signer, who invariably signed #ASL at chin level. Tyrone & Mauk (Reference Tyrone and Mauk2010) found that the actual height of a lexical sign in ASL is also determined by coarticulatory influences, which we did not take into account here. The clear difference in height between contrastively focused fingerspelled words and non-focused instances of fingerspellings of the same signer indicates that at least (narrow) focus has an impact on height as well. An illustration of the height of #ASL in two different constructions of the same signer is presented in Figure 9.
Figure 9 (Colour online) Neutral and focused articulation of the final segment ‘L’ of fingerspelled ‘ASL’.
As the fingerspelled items do not have a path movement, it is difficult to increase the size of the sign, but it can be made more prominent by using a longer hold at the final letter and adding a path movement, such as the forward movement on the final letter ‘L’ in some instances in our data. Waleschkowski (Reference Waleschkowski2009) found that symmetrical two-handed signs in German Sign Language (DGS) were articulated more precisely when focused, with sharper transition boundaries, while other signs, the form of which is not discussed, are articulated higher in space – just as we found for the fingerspelled sequence #ASL in NGT.Footnote 12 We propose that the way the articulation of a sign is enhanced for focus depends partly on the phonological specification and the prosodic contexts: path movements can be repeated, prolonged or enlarged. In our data, the fingerspelled items that lack a path movement are typically raised with respect to their non-focused counterparts.
In verbal focus, the manual form of the sign may change in a manner that strongly resembles the articulation of durative or continuative aspect. The manual movement is modified in different ways: it can be lengthened as in the durative aspect (e.g. WALK), or repeated in cycles (with the same start point) as in the continuative aspect (e.g. STUDY). While aspectual modulation of NGT predicates has not been studied in any detail, it can commonly be observed and is taught as part of NGT grammar to sign language interpreters, for instance. We showed these instances of enhanced movement in sentence context to a native signer, who indicated that there was no adverbial interpretation in these cases: the lengthening was solely interpreted as a kind of stress-for-focus. In this study, we did not control for position effects, although more recently we found that lengthening and repetition occur in the final position in NGT sentences (Crasborn et al. Reference Crasborn, Kooij and Ros2012; see also Nespor & Sandler Reference Nespor and Sandler1999 on ISL).
As we discussed in Section 3.2.2, mouth actions similarly resembled aspectual modulations in many cases of focused verbs. This homology in form suggests that there is a general phonological category for both the mouth actions and the two types of manual modulations that is linked to two different functions, one expressing the meaning of a bound morpheme and the other expressing an information-structural modification. As a consequence, at least for NGT and potentially also for other signed languages, one should be extra careful in interpreting movement modulation as having a direct lexical-semantic impact. What remains to be investigated is to what extent the manual and non-manual properties systematically occur together. Put differently, do the changes in manual movement and the mouth activities co-occur in different contexts by chance, or are they in some way a package, specified in the lexicon as distinct but co-occurring phonological ways of expressing something more general like ‘emphasis’?
3.4 Non-prosodic structural means of marking focus
In this study, we primarily focused on prosodic expression of focused constituents, yet as we already noted in Section 2, it can be hard to disentangle prosodic from syntactic or other means of realising focus. Because of the limited number of publications on the lexicon and grammar of NGT in general (as compared to ASL), we will briefly discuss three other types of focus realisation that we found recurrently in our data set, all relating to the lexicon: sequences of semantically related items, focus particles, and the cliticised particles PERSON and INDEX. They clearly merit further research.
3.4.1 Sequences of semantically related items
A quite common pattern that was found consists of the repetition of focused elements, which resembles the manual repetition of movement that was discussed in Section 3.3 above. The repetition of indexical signs within a sentence has already been established in earlier studies (Bos Reference Bos, Bos and Schermer1995, Crasborn et al. Reference Crasborn, Kooij, Ros and Hoop2009), as was discussed in Section 2.3 above. In our data, we also found two other types of repetitions. The whole focused lexical item itself can be repeated later in the sentence, but there are also quite a few cases of both objects and verbs in focus that are immediately followed by synonyms or lexical items that are semantically quite close to the focused sign. Examples are presented in (6).
(6) Sequences of near synonyms in focus
• STUDY WRITE
• LEARN STUDY
• LEARN TAKE-IN
• STUDY TAKE-IN
• TO-BE-DISGUSTED DISLIKE
• CINEMA-THEATER FILM
• SOLD GONE
• GO DISAPPEAR
• ASSEMBLE REPAIR
These sequences are specific to NGT. They do not occur in Dutch and could therefore not be analysed as a form of ‘signed Dutch’ or code-mixing between NGT and Dutch.Footnote 13 It is possible that the verbal examples can be analysed as serial verbs, sharing the subject and not allowing intervening constituents aside from an object (Muysken & Veenstra Reference Muysken, Veenstra, Arends, Muysken and Smith1995). Since we found at least one example of near synonymous nouns this may not be the best analysis. We leave it to future studies to further investigate these patterns.
3.4.2 Focus particles
We found several signs recurring in our data that can be analysed as focus particles; compare similar analyses of the signs THAT and SELF in ASL (Wilbur Reference Wilbur1994) and DGS (Hermann Reference Hermann2010). Some of these particles are similar to focus particles used in Dutch, like ONLY (Dutch alleen) which marks restrictive focus, ALSO (ook) which marks expanding focus, and REALLY (echt, toch, wel) which is used for counter-assertion. Illustrations of the signs are given in Figure 10.
Figure 10 (Colour online) NGT focus particles.
Some other particles are not similar to focus particles in Dutch. For instance, the lexical sign NOW, including the Dutch mouthing nu, is regularly added in our data, while it was never present in the target sentences. There are a few cases where NOW has a clear temporal reading, but in most cases this temporal interpretation appears bleached, or at least supplemented by a focus reading. In these cases, NOW is not interpreted as a temporal expression, even though it seems to implicate some temporal contrast with a non-present situation; rather, it precedes and highlights the focused constituent. The focal use of NOW was found for all but one signer, who realised very few extended sentences, typically responding in short phrases. While temporal expressions typically occur sentence- or discourse-initially in NGT (Schermer & Koolhof Reference Schermer, Koolhof, Prillwitz and Vollhaber1989, Crasborn et al. Reference Crasborn, Kooij, Ros and Hoop2009), having scope over the whole clause, NOW in some of the predicate and object focus sentences is found preceding the focused constituent, but not at the start of the sentence. For example, in Figure 1 above, NOW comes right before the focused object, rather than at the beginning of the sentence. It could well be that the location of temporal particles influences their interpretation, leading to a temporal versus a focus interpretation. Moreover, the reduced temporal reading of the focus interpretation may be a manifestation of grammaticalisation in progress (Pfau & Steinbach Reference Pfau, Steinbach, Heine and Narrog2011). A related particle is UNTIL-NOW, which likewise seems to have an additional focus reading aside from its lexical temporal meaning.
The sign SELF was already identified as a focus particle for NGT by De Clerck & van der Kooij (Reference De Clerck, Kooij, Doetjes and van de Weijer2005), basing themselves on a subset of the same data. Three of the signers in the present study used this sign in particular in the context of subject focus, typically for human subjects. Examples can be found in Figures 2 and 5(a) above. The position of the sign SELF was not consistent between signers. In one signer the sign SELF follows the focused subject NP, or, in the afterthought it replaces the subject. In another signer the sign SELF precedes the focused subject NP. Interestingly, for this signer we found one example of SELF preceding the focused object.
The sign ONLY was used for restrictive focus on verbs, often in combination with the sign THAT'S-IT following the verb. The latter sign is also used for selective focus on the verb without ONLY. This suggests a similar meaning as the particle ONLY, which typically precedes the predicate.
3.4.3 Clitics PERSON and INDEX
The signs PERSON and INDEX can both occur in cliticised form with focused lexical items, forming one prosodic word with the host (van der Kooij & Crasborn Reference Crasborn, Zwitserlood, Crasborn, Efthimiou, Hanke, Thoutenhoofd and Zwitserlood2008 on NGT; see also Sandler Reference Sandler, Hall and Kleinhenz1999a on ISL). For example, the information focus subject NPs are often followed by a clitic or a combination of clitics (PERSON+PT). As we propose in Crasborn et al. (Reference Crasborn, Kooij and Ros2012), the addition of clitics may be closely linked to prosodic weight in NGT. When in a certain context a full prosodic word is required, an indexical sign may be added to fill rhythmic requirements even though it is not needed from a syntactic point of view.
3.5 Summary of the findings
In summary, in response to the research questions in (5) above, we found that there is a wide range of prosodic cues that co-occur with focused constituents in NGT. In (7), we list the core findings on focus in NGT.
(7) Summary of findings
(i) Eye contact for focus. Eye contact with the interlocutor is the default gaze direction, but we found that it is maintained or established during focused constituents, even if other grammatical or pragmatic processes would require gaze direction towards a location or simply away from the addressee.
(ii) Use of mouthings. Mouthings were always found on signs in narrow focus. They were never reduced in the number of syllables or otherwise, and in some cases they appeared to be hyperarticulated. Mouthings consisting of a repeated Dutch word were found on predicates that in their manual part also contained a repeated movement. Mouth gestures that are homophonous with the durational aspectual marking were also found on predicates.
(iii) Use of head actions is sensitive to syntactic role. Head actions distinguish focus on the object from focus on other constituents. Upward or backward movement of the head was found in relation to focus on the object. While we found that subject NPs in general are often followed by a small head nod in NGT, corrective focus on subject NPs, verbs and adjectival predicates can be accompanied by a single large nod.
(iv) Brow raise associates with contrastive focus domains of different sizes. Counter-assertions are often accompanied by a brow raise over the whole utterance, in response to both positive and negative statements. Shorter brow raises can also accompany one or more contrastively focused signs, although this is less likely on subjects than on objects or verbs.
(v) Enhancement of manual movement. In addition to these non-manual cues, several manual modifications and lexical signs were found to be related to the expression of focus. Focused signs often appear to be articulated in a ‘stressed’ manner, having a longer duration, larger movements, and more repetitions. Fingerspelled words and other signs lacking a path movement are articulated higher in space when they are focused. Finally, verbs can be modified in a way that is similar to the form of durational aspect, having lengthened movement and more repetitions than in the standard form.
(vi) Lexical enhancement. A remarkable lexical pattern was found in some sentences where the focused item is repeated in the form of a synonym or a related lexical item, as though to increase the amount of ‘semantic’ attention to the focused item.
(vii) Focus particles. A substantial set of focus particles was found, some of which are also known in Dutch, others appearing to be indigenous to NGT.
4. DISCUSSION
We start in Section 4.1 by summarising our responses to the questions we posed in Section 2.4. We then try to integrate the findings in a model of variation of phonetic appearance in Section 4.2. Some methodological reflections are presented in Section 4.3.
4.1 Answers to the research questions
First of all, in (5i) above, we asked which prosodic cues are related to focus distinctions in NGT. We have found that there are a variety of non-manual and manual cues related to the expression of focus in NGT. The general purpose of focused information is to be prominent in some way in the discourse. In NGT, this can be achieved by means of prosody as set out in (8).
(8) Means of prosodically expressing focused constituents
(i) Invoking non-manual features to give a special intonation to what is uttered. Both domain markers and punctual markers were found, the former being most frequent.
(ii) Use of manual prominence, modifying the standard phonetic realisation of a morpheme by enhancing the movement pattern in various ways, depending on the lexical/phonological form of the sign.
(iii) For contrastive focus, a contrast can be set up in the signing space by assigning different discourse elements (referents, utterances) a location in space, and then using non-manual and manual prosodic means of referring to the different locations.
At an abstract level, the first two of these distinct strategies add phonetic weight: enhanced or added non-manual or manual articulations add a focus interpretation to a neutral utterance. In that sense, the expression of focus can be characterised as an instance of the Effort Code (Gussenhoven 2004). The Effort Code is based on the fact that more articulatory effort creates more elaborate and explicit phonetic realisations. The informational interpretation of the Effort Code is ‘emphatic’, derived from the perception that the signer regards his or her message as important and thus spends more energy on its production: more semantic prominence is expressed by more phonetic prominence. More articulatory effort was particularly prominent for mouthings and mouth actions: they were used for all signs in focused domains, and very clearly articulated.
With respect to the question in (5ii) above, whether NGT employs different markers for informational vs. contrastive focus, we have not found that specific non-manual cues are used to distinguish information focus from contrastive focus. While we had the impression that cues such as head nods or brow raise typically have larger amplitude when used for contrastive focus as opposed to information focus, this needs to be corroborated by more quantitative (experimental or corpus-based) studies. The possibility that the co-occurrence of several cues may be a distinguishing factor between information and contrastive focus is beyond the scope of this paper. For now, we interpret this finding as suggesting that the prosodic distinction between information focus and contrastive focus is at best gradient (compare the claims by Ladd 2008 and Krahmer & Swerts 2001).
A specific use of space, setting apart two similar constituents, was only found in contrastive focus in our data. This use of space is quite common in larger stretches of discourse in signed languages. While it is related to the prosodic choice between the left and the right hand (Crasborn & Sáfár 2013), we do not consider the localisation of referents in space itself to be prosodic in nature. Spatial locations are set up in the discourse for referential purposes. The direction of a head or body lean is determined by the locations that have thus been introduced. The leaning of the head and body itself can be considered an intonational feature, being realised simultaneously with (one or more) manual signs, serving the purpose of emphasis. We hypothesise that in larger stretches of discourse, body leans can also be used for information focus.
Finally, the question in (5iii) above was how these findings can be interpreted phonologically, and whether there are any consequences for our conception of the grammatical organisation of signed languages. In terms of the relation between the phonology and the phonetics of signs, the way in which the manual features location and movement are modified for focused signs depends on their phonological specification in the lexicon. Not all signs are raised in space, as far as we could establish, but specifically signs without a lexical path movement, such as fingerspelled items, are raised. Similarly, not all movement is lengthened or reduplicated, but specifically path movements in space are. This is an interesting finding in that the phonological form thus would appear to determine how prosodic strength is implemented – as if round vowels would be strengthened by being articulated with raised pitch while unrounded vowels would be lengthened. This variable implementation yields an abstract representation of prosodic strength.
4.2 Charting the influences on phonetic realisation
Just as spoken languages are characterised by sequences of phonological pitch accents and boundary tones in the modern analyses of intonation (Pierrehumbert 1980, Pierrehumbert & Beckman 1988, Hayes & Lahiri 1991, Gussenhoven 2004), signed language prosody can be characterised as the simultaneous articulation of manual lexical signs and phonological specifications leading to variations in the state of the upper body, head, and face. In the present study, we have found that focused items in NGT may be accompanied by eye contact, raised eyebrows, head up, head nods, and/or a left–right lean of the upper body. We propose that these are reflections of phonological features whose precise phonetic form is variable and is likely to depend in part on factors that are (as yet) unknown, simply because they have not been studied. In the present section we present an overview of factors that are likely to be involved.
Underlying the various influences on the phonetic form of focus in sign language are different functional forces or ‘biological codes’ that are likely to be similar in nature to the Frequency Code of Ohala (1984) and the Production and Effort Codes of Gussenhoven (2004). The Frequency Code relates pitch to body size, with smaller speech organs producing higher-pitched sounds. The Production Code states that the gradual decline of subglottal pressure during an utterance, and thus the lowering of pitch, is associated with ending. Neither the Frequency Code nor the Production Code applies to the visual modality (see footnote 14). This is different for the Effort Code, which states that extra effort on the part of the speaker, producing larger pitch excursions, can be interpreted by the listener as semantic emphasis or focus. We suggest that, similarly, the extra effort on the part of the signer in using extra phonetic cues in NGT is interpreted as focus.
The Effort Code does not necessarily directly determine the appearance of lexical manual phonological forms nor of facial, head, and body actions, as they can have affective interpretations as well as grammaticalised linguistic interpretations. Rather, these biological codes are constant pressures on linguistic structure, which in the long run may guide language change. Thus, in some cases, these pressures have developed into linguistic categories in the sense that – as with the phonological forms of lexical items – there is a phonological specification that is activated during language production and perception. This was essentially also the hypothesis of Baker-Shenk (1983), who tried for the first time to disentangle linguistic and affective facial expressions. However, Baker-Shenk did not make explicit that being ‘linguistic’ in fact implies a phonological layer of organisation. As was already stated above, we would like to propose that for many of the signals that accompany focused constituents, there is indeed a phonological category involved. Thus, the phonetic raising of the eyebrows is not a focus marker as such; it is rather the realisation of a phonological feature with a specific value. The underlying drive for this association may well be something like the Effort Code proposed for speech: extra effort (in contracting the eyebrow-raising muscles or opening of the mouth) is related to extra emphasis.
Biological codes such as the ones above are ‘paralinguistic’ in the sense that they have an informational interpretation (‘uncertainty’) and can lead to a grammaticalised function (‘question’). Crasborn (2001) further distinguishes between paralinguistic and extralinguistic motivation, the latter not stemming from communicative intent. In sign language, for example, the extralinguistic motivation for eye blinks is that they are needed to keep the eyeballs moist and to protect the eyes from dirt particles entering the eye; furthermore, eye blink frequency is influenced by physiological factors such as fatigue. These circumstances form the baseline activity on which any possible paralinguistic (e.g. affective) or linguistic use of the eyelids is superimposed. As they include the anatomy and physiology of a person, extralinguistic factors do not impact the communicative event in the same way, in that they are fairly constant across time and are less likely to receive a semantic interpretation. Further, more variable local or physical circumstances may have an impact on the phonetic shape of an utterance. For instance, holding a glass while signing will have an impact on the form of manual signs yet does not directly stem from a communicative action. Finally, signers may have a personal style or ‘voice’ that has barely been touched upon in the literature. One signer in the present study has a clear preference for the use of brow raise and a backward-sideward tilted head position as a default position (see Figure 4 above), which we did not see in other signers. This too is clearly an area where future investigations are needed.
On the basis of the data in this study and the considerations above, we propose the model of sign language prosody in Figure 11.
Figure 11 A model of sign language prosody.
This model adopts the conclusion of Sandler (1999a, b, c; Nespor & Sandler 1999) that sign languages have a level of prosodic phonology, just as spoken languages do. The central argument for this level of organisation is that rhythmic groups formed by changes in manual and non-manual features do not always appear to be isomorphic to syntactic constituents. Aspects of facial expression and other non-manuals have a phonological form, just like lexical signs do, comparable to the H and L tonal elements in the analysis of spoken language intonation (see Dachkovsky & Sandler 2009). The present study adds to our understanding of the similarity of signed languages to spoken languages by showing that, just like the abstract H and L tones in spoken languages, intonational elements in sign language are abstract prosodic features with variable phonetic realisations.
In other words, in the analysis of sign prosody, the phonetic activities of the sign language articulators do not map one-to-one onto phonological (intonational) features. For instance, cues such as brow raise, widened eyes, and head forward appeared to co-occur in many instances in our data, which leads to the hypothesis that they are in fact phonetic expressions of a single phonological feature that we could name [forward] or [open up!]. To find further evidence for such an abstract phonological feature, we need targeted production and perception studies looking at the distribution and processing of combinations of cues by signers. For the other phonetic cues observed in this study, we have no clear indication that the phonological form generalises over different articulators. Thus, head nods would be instantiations of a phonological feature [nod]. Sideward leaning of the upper body and head tend to co-occur. Should further research confirm that they can also occur in isolation, a general phonological feature [lean] would be warranted.
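As an illustration of how the distribution of cue combinations might be quantified in such a production study, the following minimal sketch computes a simple overlap score (the Jaccard index) for every pair of cues across annotated focused constituents. The cue labels, the annotation format and the example tokens are invented for illustration; they are assumptions and not data or annotation conventions from the present study.

```python
from collections import Counter
from itertools import combinations


def cue_cooccurrence(annotations):
    """For each unordered cue pair that co-occurs at least once, return the
    share of tokens showing either cue that show both (Jaccard index)."""
    single = Counter()   # number of tokens on which each cue occurs
    paired = Counter()   # number of tokens on which both cues of a pair occur
    for cues in annotations:
        unique = sorted(set(cues))
        single.update(unique)
        paired.update(combinations(unique, 2))
    return {
        pair: both / (single[pair[0]] + single[pair[1]] - both)
        for pair, both in paired.items()
    }


# Hypothetical annotations: one set of non-manual cues per focused constituent.
tokens = [
    {"brow_raise", "widened_eyes", "head_forward"},
    {"brow_raise", "widened_eyes", "head_forward"},
    {"brow_raise", "head_nod"},
    {"head_nod"},
]

scores = cue_cooccurrence(tokens)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 2))
```

Pairs that score close to 1 across signers would be candidates for a shared phonological feature such as [open up!], whereas low scores would point to independent features. A raw overlap score of this kind says nothing about perception, so it could at most complement the perception studies mentioned above.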
We propose that the phonetic appearance of a sign can be influenced by various components of the grammar, mediated by the prosodic phonology. As our primary aim was to better understand sign language prosody, the model does not include all the possible relations between the lexicon, information structure, syntax, and pragmatics; we refer to the work of Steedman (2000) for further discussion on such relations. Yet we did find some interesting interactions between other components of grammar in our data.
First of all, we found that the prosodic expression of the information-structural notion ‘focus’ is sensitive to the syntactic roles and to the phonological forms of constituents. That is, objects typically receive a different head (and/or body) posture or movement than subjects when focused. Further, the enhancement of verbs lies in a higher number of repetitions, which we never saw for nouns. Nouns, on the other hand, may be complemented with clitics like PERSON or INDEX. In addition, we found an interaction between information structure and phonological form: only certain forms (such as fingerspellings, or, more generally, signs lacking a path movement) are raised under focus. Thus, there is no general prosodic process at work that raises all signs that are articulated in space.
Secondly, the model makes explicit that there is a direct impact of syntax and of pragmatic and semantic features on the prosodic phonology. We found the influence of syntactic roles ‘subject’ and ‘non-subject’ in the position and movement of the head (Section 3.2.3 above). Pragmatic and semantic influences on the direction of body leans were described in van der Kooij et al. (2006).
Finally, both paralinguistic and extralinguistic influences impact the phonetic signal in the above model. Several researchers have pointed to the link between emotional and linguistic signals in signed languages (Baker-Shenk 1983, de Vos et al. 2009). For instance, the eyebrow raise that leads to the interpretation of ‘surprise’ in many sign languages is also associated with question intonation. This suggests that emotional facial expressions form fertile material to distil linguistic patterns from. Given the young age of all signed languages that have been studied and the continuing conflation with aspects of emotional facial expressions, it is quite plausible that many intonational patterns are still in the process of prosodic grammaticalisation even after many generations.
4.3 Some methodological remarks
Overall, the results strengthen our earlier analyses of NGT prosody in the sense that prosodic phonetic cues and the underlying prosodic phonological features can typically have a variety of functions (van der Kooij et al. 2006, de Vos et al. 2009, Ormel & Crasborn 2012). For example, a cue like repetitive head shake can simultaneously mark the time span of a prosodic constituent like the intonational phrase and communicate the semantic feature ‘negation’. Similarly, in the present study, the domain of brow raises often overlapped with the focus constituent. However, as brow raise can also have several other linguistic and non-linguistic sources, it is sometimes hard to demonstrate that a given cue is only present for the duration of the focus constituent. In particular, this is problematic for cues like eye gaze, as it is not possible to ‘switch off’ one's gaze in signed languages: signing with closed eyes for longer stretches of signing does not appear to be used for linguistic purposes. There is an obvious functional motivation for looking at the addressee: only then can the signer check with the interlocutor whether he or she still follows what is being said and whether the intended pragmatic effect of the utterance has been reached.
Earlier studies on eyebrows in NGT (de Vos et al. 2009) and on head and body leans (van der Kooij et al. 2006) have shown that the actual phonetic appearance of an intonational feature can vary depending on the interaction with emotional expressions and phonetic or semantic-pragmatic factors. This formed one of the reasons for eliciting rather artificial, translated, question–answer and statement–response pairs in the present study: we hoped that at least affective expressions could be kept to a minimum. There is no way of checking this in our materials other than by conducting a separate perception study using the same data. Above, we have only reported on dominant patterns in our data set, but even then, we did find variation across utterances and signers. The use of data from eleven signers did give us enough confidence that at least these patterns reflect part of a prosodic grammar of NGT. At the same time, it has become clear that there are no obligatory prosodic correlates of focus in NGT. Perhaps the direction of eye gaze towards the interlocutor is the single exception, and it overlaps with the default behaviour.
5. CONCLUSION
Our investigation showed the use of prosody to signal focus in Sign Language of the Netherlands to be rather complex. We might have expected non-manuals to work in a simpler way, given the large number of independent phonetic channels that are available in signed languages in comparison to the use of pitch, vowel quality and segment duration in spoken languages. One could imagine simply overlaying one specific intonational facial cue on a focused constituent, leaving the other intonational cues for other lexical and morphosyntactic functions intact. This is not what we found in NGT.
We have seen two types of complexity in the present study. First of all, we found that there is no single non-manual or manual activity that serves as a marker of focus. Moreover, different syntactic categories (subjects, objects, predicates) tend to be focused by different cues. Further, the phonological form of the sign determines in part what the modification of the sign's articulation will look like.
Secondly, the non-manual phonetic cues that co-occur with focused information all serve other linguistic functions in the language as well. Head movement is also used in relation to affirmation and negation, eye gaze is also used in referring to distinctive locations in signing space, and the eyebrows are used for the expression of questions and topics.
Sign language users are often characterised by non-signers as being ‘very expressive’. Looking at the problems second language learners face in acquiring a signed language for the first time, we hypothesise that prosody in all its complexity is one of the biggest hurdles in becoming a fluent signer, especially since it has been poorly described.
While we tend to think of the complex non-manual articulations as an orchestra where each instrument plays its own score, our study suggests that it may not be the best strategy to study instruments in isolation. An analysis in terms of abstract intonational units entails that there can be more abstract representations (such as [open up!], leading among other things to head movement forward, widened eyes and raised brows) that can be realised in multiple ways, by various articulators and in various gradations. The focus in the literature has sometimes been on isolated facial actions, to contrast such linguistic articulations with whole-face emotional expression. The above example suggests that it may be fruitful to further investigate co-occurrence patterns in non-manual articulations in sign languages.
In the preceding sections we have highlighted many areas for future research. It is difficult at present to design experimental studies on sign language prosody. Quantitative production data are hard to obtain because of a lack of accessible measurement tools for facial, head and body movements. Video recognition software may facilitate this in the near future (e.g. Piater, Hoyoux & Du 2010). Perception experiments are hard to set up given that avatar technology is not easily available either, although developments in this field are proceeding rapidly. While such technologies are typically developed in the context of more applied domains such as automatic sign recognition, they promise to facilitate our further understanding of sign language phonetics in the coming decades.
APPENDIX Elicitation materials
This appendix lists the 59 sentence pairs that were used for elicitation. Column A specifies whether the elicitation sentence was a statement (S) or a question (Q). Column B indicates whether the overall classification was information focus (I) or contrastive focus (C), while column C further specifies the nature of the corrective focus. Column D lists the focused constituent: whole sentence (S), non-verbal predicate (P), verb (V), verb phrase (VP), subject (Subj), or object (Obj). Column E specifies the different lexical movements of focused objects that were included: straight (S), L-shaped (L), circular (C), straight with handshape change (H), and waving (W).