In the present paper, I pursue a Construction Grammar (CxG) characterization of English modal auxiliaries (e.g., can-could, must, have (got) to, should, ought to, need to, will-would) that seeks to add to established lexical approaches. It is argued that Construction Grammar (e.g., Goldberg, 1995, 2006; Tomasello, 2003) can successfully account for underlying modality patterns, the understanding of which can lead to distinct gains for both linguistics and second language acquisition research. To that end, some of the tenets of CxG are invoked:
1. Linguistic knowledge is the product of the interaction of our cognitive apparatus and the input;
2. The whole inventory of human linguistic knowledge consists of a network of symbolic pairings of form and function/meaning termed constructions (e.g., words, phrases, idioms, phonemes, morphemes, syntactic patterns, etc.);
3. Syntactic constructions have meanings in themselves that are independent from lexis; and
4. Language utilizes a number of metaphorical extensions to help us conceptualize the physical world.
Approaches to modality
It is widely accepted that modal expressions allowing one ‘to talk about possible scenarios and unrealized possibilities’ (Alonso–Ovalle & Menéndez–Benito, 2015: 1) constitute one of the most complex topics in English grammar. In particular, English modal auxiliaries often include two meanings, i.e. root (physical, social, independent from the speaker) and epistemic (logical, that is, the evaluation of a speaker about an event), a fact that makes it difficult to trace clear-cut patterns of semantic systematicity. To illustrate this point, let us consider example (1).
(1) a. I play Lee Weathers, and she is a risk-management consultant who is hired to … come here to a … um, hospital, I guess you would call it? Um, to assess whether Morgan should be kept alive or terminated …
(Kate Mara, Morgan Interview, 2016: https://www.youtube.com/watch?v=usp_TiQaIEQ [00:00 – 00:04:26])
b. When I saw it I couldn't believe that I had missed it, it's so beautifully made (…)
(Julianne Moore, Julianne Moore Is Surprisingly Skilled at This Household Chore: https://www.youtube.com/watch?v=4ie3O-Ibp5Q [00:01:40 – 00:01:45])
In example (1a), would expresses the epistemic meaning of prediction, while should (in its passive form) implies obligation. On the other hand, could in (1b) means that the speaker has trouble finding reasons to explain why she had not seen a particular movie before, which is an epistemic evaluation.
In particular, the emphasis on verbal modality has conspired against a reliable identification of these constructions. As a result, modal auxiliaries have traditionally been divided into verb types conveying residual meanings (e.g., Quirk et al., 1985; Biber et al., 1999), namely, permission-possibility (e.g., can-could), obligation-necessity (e.g., must, have (got) to, need to, should, ought to), and volition-prediction (e.g., will-would, shall). A closer look at these studies reveals that these verbs are analyzed in terms of opposing and sometimes slightly overlapping categories.
Other verb-centred approaches have adopted a whole-sentence semantic strategy to explain the meaning of modal auxiliaries in a more systematic fashion (e.g., Depraetere & Reed, 2011). Although these potential links are exciting, they do not really attempt to explain modality as a clause phenomenon, but build on modal verb semantics to project modal meaning from one clausal element, for example, the referent subject, onto another, such as post-verbal content, or ‘scope’. Ambiguities are tackled through the medium of linguistic modal auxiliary paraphrases that, nonetheless, fail to account for all the possible modal meanings that can coexist within the scope of an utterance.
On the other hand, although cognitive linguistic approaches to modals have been cautious in postulating modal categories, since ‘[i]t cannot be presumed that standard terms of this sort correspond to natural, well-delimited linguistic categories waiting to be discovered’ (Langacker, 2013: 39), the tendency in this field is to view modals as ‘grammaticized constructions’ (Langacker, 2013: 14), infused with ‘force dynamic qualities’ (Talmy, 1988; Sweetser, 1990), and thereby producing ‘a finite clause with potential epistemic import’ (Langacker, 2013: 39). Some constructionists, for their part, have postulated phrase-length ‘modal verb constructions’ such as Not if I can help it¹ (Cappelle & Depraetere, 2016). These still require a somewhat cumbersome explanatory apparatus, especially when constructional inheritance links are invoked, and this apparatus adds very little to confirm the existence of stored modal constructions.
Finally, in the case of Applied Cognitive Grammar, Tyler (2012: 129) claims that ‘[t]he CL alternative, based on force dynamics and metaphoric extensions, does provide both precise definitions for the individual modals and a systematic account of the relationship between [root and epistemic] uses’. While an emphasis on embodied cognition (whereby language is said to be constructed from bodily conceptualizations of our experience with the physical world) is promising, the appeal to forces, barriers and forward momentum can only be of use if a reference to the role of other clausal elements is provided. Inevitably, two questions arise. First, what is the role of both transitive and linking verbs in the construction of modality? Second, how can both pre-verbal and post-verbal content be used to identify the meaning of modal constructions?
Embodied cognition and argument structure
Returning to the broader goal in this paper, I believe that a good way to analyze modal auxiliaries can be found in embodied cognition, whereby ‘the learner[-speaker] can rely on associations between the movements of the body and the context in which words are spoken’ (Yu & Ballard, 2010: 233). For the Goldbergian version of CxG, termed Cognitive Construction Grammar (CCxG) (Goldberg, 2006: 215), syntactic patterns have meanings that are independent from the meaning of main verbs. This has important consequences in that it leads to an assumption that the semantics of simple sentences depends to a great extent on the meaning of argument structure constructions (ASCs) (Goldberg, 1995, 2006; Torres–Martínez, 2015, 2016, 2017, 2018). In this respect, the following example is particularly instructive:
(2) I think the fact that we're in South Africa is giving … giving us amazing locations. I know that a lot of the other films were done in studios. So that already brings a new element to it.
(Ruby Rose, Resident Evil: The Final Chapter Interview, 2017: https://www.youtube.com/watch?v=gWqivP-OAD0 [00:01:40 – 00:02:12])
As shown, the speaker uses the verbs GIVE and BRING in clauses that convey the meaning of transfer. Thus ‘being in South Africa’ (Donor) gives the ‘film crew’ (Recipient) ‘amazing locations’ (Undergoer). Likewise, ‘this fact’ (Donor) brings a ‘new element’ (Undergoer) ‘to it’ (Recipient). Though the two clauses are slightly different in that the former is a double-object sentence, while the latter uses a prepositional dative, both syntactic constructions provide an abstract pattern in which several participants are profiled. In other words, the participants available to the transfer ASC become activated when the construction fuses with a verb whose meaning is compatible. As a result, the transfer constructions described above share the pattern ‘someone (X) causes someone (Y) to receive something (Z)’. Crucially, this ASC requires a ‘GIVE’ verb (e.g., pass, toss, bring, send, lend, etc.) capable of activating specific participants, such as a Donor, a Recipient, and an Undergoer.
Stated this way, ASCs reflect some sort of embodied cognitive substrate that interfaces mental processes with our physical experience of the world. In fact, modal auxiliaries, too, are analyzable against the backdrop of the ASCs in which they are used (see Torres–Martínez, 2018). As can be gathered from examples (3)–(10), regardless of the modal meaning involved, the syntactic construction contributes its specific meaning to render the clause understandable. Moreover, it is possible to postulate a set of modal ASCs in which both modal- and full-verb semantics can be generalized as a result of their combined meanings:
(3) Root Ditransitive (prediction)
Example
She would probably arrive sometime that evening. All that long way … Well, she would give her (Recipient) the paper (Undergoer), if it mattered, and tell her the things he'd said about the twins.
(Anne Rice, Queen of the Damned, 1988)
(4) Epistemic Caused-motion (possibility)
Example
What was his motivation for hiding her out on the island? He must have taken her (Undergoer) away from her home, her friends, the other members of the family (ObliquePATH). Why?
(Barbara Freethy, Falling for a stranger, 2013)
(5) Epistemic Intransitive motion (possibility)
Example
‘Just get done and hurry home,’ she said. ‘I'm worried what I might learn from Mrs. Young, and you may need to get to Laramie in a hurry.’
(C. J. Box, Stone Cold, 2014)
(6) Root Intransitive motion (past habit)
Example
(…) so we will go to my cousin's and we'd sit and watch all these delightful VHSs (…)
(Felicity Jones, Star Wars Actress Felicity Jones Will Always Love James Dean | W Magazine: https://www.youtube.com/watch?v=95z4tVrq2XU [00:00:17 – 00:00:21])
(7) Root Removal (obligation)
Example
I won't scream. But I don't want to go in the water. I can do it from the deck.
You (Causer) should take off your dress (Undergoer), at least, you'll ruin it.
(Gillian Flynn, Gone Girl, 2012)
(8) Root Transitive (ability)
Example
Star Wars made me cry, Star Wars made me cry … That poor little robot, and he couldn't find his dad (…).
(Priyanka Chopra: ‘I Don't Crush on People, They Crush on Me’ | W Magazine: https://www.youtube.com/watch?v=UdUp02RUyc4 [00:02:01 – 00:02:06])
(9) Epistemic Transitive (prediction)
Example
Working with Guillermo, I knew he (Causer) would show this woman (Undergoer) in a very fair and compassionate light.
(Jessica Chastain, Crimson Peak: Jessica Chastain ‘Lucille Sharpe’ Official Movie Interview: https://www.youtube.com/watch?v=fNjDuDM91co [00:00:57 – 00:01:07])
(10) Root Resultative
Example
(…) that's this lack of hope that we can't get this (Undergoer) actually done (End-state) anymore.
(Emma Watson, Emma Watson & Caitlin Moran – In Conversation for Our Shared Shelf, 2016, https://www.youtube.com/watch?v=CynzW9Kz7Ds [00:03:56 – 00:04:00])
Among other aspects, the examples reveal that modal usage is heavily reliant on force dynamic elements, including forces, barriers to action, and momenta.
The existence of modal ASCs points to the fact that verbal modality is constructed at a deep cognitive level through complex constructional relations. Further, modal ASCs are partially filled (compositional) constructions providing slots for the inclusion of pragmatic content. Thus, the [NP can/could V Adv] construction, meaning ‘X can/could live ZLOC’, can be modified by a ‘hedging string’ (see Torres–Martínez, 2014, 2018), which tones down the strong ability sense expressed by could, as illustrated by example (11).
(11) That's funny when I go home, people ask me where I'm from … So, I, I feel like I could kind of live everywhere now.
(Mia Wasikowska, Lynn Hirschberg's Screen Tests: https://www.youtube.com/watch?v=7X_gjhZNGaI [00:03:34 – 00:03:49])
Modal zones
The connection between modal ASCs and embodied cognition is also possible thanks to some general properties of the modal sentence. One of them is the division of labour between distinct semantic zones, namely the force/referent zone, the modal cluster, and the scope. The force/referent zone contains any force (animate or inanimate) having some type of physical or metaphorical influence upon an Undergoer (direct object), or which acts as a referent to an attribute (complement). The modal cluster combines the modal meaning of an auxiliary with the meaning of the full verb. Finally, the scope is the zone to which the combined meanings of the modal verb and the full verb are linked (in non-motional sentences the reference is made directly to the subject). In particular, the scope refers to any post-verbal element that receives the action of the full verb or that complements a referent. As shown in Figure 1, a simple schema combining form-function relations and force-modal-full verb-scope indexical relations can be translated into a cognitive scene.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190428132752890-0848:S0266078418000081:S0266078418000081_fig1g.jpeg?pub-status=live)
Figure 1. A schema displaying constructional relations in a modal clause
Embodied cognition, agency and modality
It might be tempting to object to the above arguments by claiming that modal verbs are not only used in clauses that express some type of movement. However, this objection fails on one count, namely that one of the most important conceptual tools of both Cognitive Linguistics and Construction Grammar is the idea that language knowledge is shaped by the interaction of our cognitive apparatus and the physical reality. For example, ‘both upright posture and bipedal walking condition our experiences with path and trajectory respectively’ (Torres–Martínez, 2016: 13). According to this perspective, mental processes require different levels of metaphorical extensions that mediate the perceptual constraints imposed by our bodily architecture in the perception and further organization of reality. Embodied cognition thus hinges on a network of metaphorical extensions, which include paths, such as ‘geometries of ground objects’ (Mani & Pustejovsky, 2012: 6), expressing metaphorical kinetics (the cause of motion of entities bestowed with motional attributes), as well as containment, location, direction, etc.
In the remainder of this paper, I claim that an embodied approach to modality is advantageous to reveal patterns for linguistic generalization. The reason is that the embodied mind operates at different levels of experience where agency, that is, ‘a causal capacity, say, flexibly wielding means toward ends’ (Kockelman, 2012: 1) takes center stage. The core elements of agency are summarized by Enfield (2017: 4–6) as follows:
1. A degree of flexibility in carrying out a behaviour (including control over it, composition of the way the behaviour is carried out, and subprehension, i.e. how the reaction of others is predicted by the agent).
2. A degree of accountability whereby a behaviour is subject to evaluation and the agent has a degree of entitlement and obligation to carry out the behaviour.
Agency can be either eventive, in which case arguments are specified (Borer, 2012, 2013), or referential/stative. In contrast to traditional lexical approaches, the explanatory power of embodied agency can be put to the test with example (12) reported in Depraetere and Reed (2011: 6):
(12) When the soil dries out, strain is put on the house structure and cracks can appear overnight.
According to the authors, the modal can expresses ‘a general situation possibility’ (epistemic possibility) extending over the whole utterance, in which case, so claim the authors, the scope is ‘wide’. Although this analysis seems holistic, in that the epistemic possibility conveyed by can is said to account for the utterance's meaning as a whole, it is evident that it is not. In contrast, the embodied agentive cognitive reading pursued in this paper reveals that the modality of the sentence, which takes into consideration all the constructions contributing meaning thereto, is triggered by the noun soil (eventive agent). It follows that it is the physical and chemical properties or other external forces of the soil which ultimately produce cracks overnight. To put it in simpler terms, rather than positing an ‘agent-free scenario’ whereby ‘cracks appearing overnight is a possibility’ (Cappelle & Depraetere, 2016: 12), the embodied analysis reveals that the apparent possibility is actually an empirical fact subject to evaluation. The available evidence supports the fact that ‘cracks do appear overnight’ depending on the soil's physical and chemical properties; this evaluation is based on agency (accountability), which overrides the generality of the possibility.² Indeed, this is a good example of just how ‘there is a certain kind of epistemic possibility, even though there is no matching metaphysical possibility’ (Egan & Weatherson, 2011: 2). Clearly, the conceptualization of the soil's structure has shifted from being an a priori possibility to a ‘posteriori [claim] that arise[s] from the nature of natural kinds’ (Egan & Weatherson, 2011: 2).
This process is better understood when we take a closer look at the elements of agency mentioned earlier. Since the soil's properties are not capable of taking on a life of their own, and thereby of controlling a behaviour (although they can be held ‘accountable’ for the results), flexibility is extended to the observer-reporter of the phenomenon. The observer-reporter exerts his/her agency to ‘control’ the development of the phenomenon by describing its physical properties through recourse to epistemic modality. Then s/he composes the phenomenon's behaviour through some type of measure. Finally, the outcome of the phenomenon is predicted by the observer (also called subprehension). Likewise, accountability is assumed by the observer-reporter, who evaluates the behaviour of the agent, defines the extent to which the phenomenon is the product of the soil's physical and chemical properties (also called entitlement), and finally concludes that this is an inevitable outcome of the process (also called obligation).
This case also shows that agency is not restricted to animate agents (heretofore forces). Moreover, embodied agency can disambiguate the relation between root semantics and the speaker's stance encoded by the modal in the form of epistemic senses. This can be seen in example (13).
(13) You should have trusted me to finish the job.
(Skyfall movie, Sam Mendes, 2012).
The root meaning of should in its perfective aspect implies that a moral obligation arising out of considerations of right and wrong, that is, trusting 007's field experience, was ignored by M. The pragmatic content of this speech act, which requires the reader-hearer (the one who decodes the reporter's interpretation) to enrich ‘the semantic content of the sub-sentential utterance – a process that helps the hearer make sense of the speaker's speech act’ (Elugardo, 2013: 93), is embedded in the utterance, which can be glossed cognitively as follows:
M (force 1) did not trust 007 (force 2) to do the job (Undergoer).
In this case, force 1 makes the decision (control) to order an agent to take the shot that nearly killed 007 (composition), while being fully aware of the consequences of her orders (subprehension). It is thus force 2 (007) who evaluates force 1's behavior. However, while force 1's entitlement is acknowledged, the obligatoriness of her behavior is questioned.
Stative agency
As already mentioned, agency can also be stative. However, in contrast to eventive agency, subjects in stative utterances cannot take arguments (Undergoer, Recipient), but have individual reference only. An example is provided in (14):
(14) There was a childlike quality to the man, even though Hammond must now be … what? Seventy-five? Seventy-six? Something like that.
(Michael Crichton, Jurassic Park, 1990)
In the above example, the age of Hammond is being calculated by a speaker-narrator by means of the modal must (conveying epistemic necessity). The verb is used to indicate that the speaker has drawn a conclusion from experience and observation. In this case, the degree of accountability of the speaker (as well as his/her entitlement) is assessed by the reader/hearer in terms of reference to the epistemic subject Hammond. This gives three interlocking layers of logical necessity explained as:
1. The reader/hearer interprets the speaker's statement to be epistemically plausible (which entails that the speaker has control, composition, and subprehension of his/her evaluation).
2. The speaker has both a degree of flexibility and accountability that entitles him/her to make use of a causal capacity to mobilize a means toward an end.
3. Hammond becomes a referent of someone else's age attribution.
At the root of this idea is the view that the English modal system can no longer be construed as a verb-centered, one-dimensional phenomenon, but as the result of a number of interconstructional relations taking place within a hierarchical network. In addition, embodied agency is said to unveil the richness and diversity of form-meaning associations without having to postulate an overwhelmingly complex inventory of constructional specification ‘with little or no gain through any generalization that could be achieved this way’ (Bergs, 2010: 228). Under this view, modal verbs can be defined as constructions only to the extent that they can be fused with other constructions in an utterance.
Conclusion
In this paper, I have presented a constructionist approach to modality in English. It has been claimed that the present account may well yield satisfactory understandings about how speakers conceptualize language thanks to the interplay of general cognitive mechanisms and accumulated world experiences. Crucially, modal ASCs meet two basic criteria for positing a construction. First, some aspects of their form and function are not predictable from their parts, or from other attested constructions (Goldberg, 2006: 5). Secondly, some modal ASCs make up fully predictable patterns that are frequent enough to be stored in and retrieved from the constructicon (the lexicon-syntax continuum). It is hoped that this insight may well have an impact on the conceptualization of modality both in theoretical and applied linguistic contexts.
SERGIO TORRES–MARTÍNEZ holds a PhD in Applied Linguistics. Among his main interests are Cognitive Construction Grammar, cognitive semantics, Wittgenstein's philosophy of language, Peircean semiotics, and translation semiotics. His current research aims at redefining the tenets of both construction grammar and cognitive linguistics through the introduction of a new theoretical framework termed Agentive Cognitive Construction Grammar. The main objective of this research is to provide a comprehensive picture of English constructions that can be further connected with L2 instruction.