1 Introduction
A theory of phraseology is now well established (cf. Cowie, 1998; Sinclair, 1991; Hunston & Francis, 2000) in which the lexical item is seen to have primacy (Sinclair, 2004). Corpus searches have shown that lexemes display preferred collocational and colligational patterning (the lexical and grammatical company that words keep), and also have semantic preferences and semantic prosodies, i.e., lexical items tend to co-occur with certain semantic sets and are imbued with either a negative or positive connotation. For example, CAUSE tends to occur with the semantic set of ‘diseases’ and usually has a negative semantic prosody (Stubbs, 1996). This linear, syntagmatic approach to language reveals that meaning does not reside in individual lexemes, leading Sinclair to argue for the existence of ‘extended units of meaning’. It is to be noted, however, that Cowie's approach to collocation is somewhat different from that of Sinclair in that a ‘textual’ over a ‘statistical’ identification is preferred on the grounds that individual restricted collocations may recur to only a limited extent within a given text or across several texts on the same topic. As I am working with a small, specialized corpus of one million words, the Business Letter Corpus (BLC), I take a ‘textual’ approach to identification of collocational patterns.
Moreover, Stubbs (2006: 26) has signalled a relationship between Sinclair's ‘extended units of meaning’ and speech act theory: “Although they are based on very different kinds of data, both speech acts and extended units are functional and build the speaking agent into units of language structure and use”. Speech acts can be of a non-threatening or face-threatening kind. As Brown and Levinson (1987) point out, certain speech acts are likely to damage a person's ‘face’, a concept first proposed by Goffman (1967) to signify one's reputation or good name. The hearer's positive face can be damaged by the speaker expressing disapproval of the hearer's action; the hearer's negative face has the potential to be damaged if the speaker gives an order which impinges on their freedom of action. Similarly, the speaker's own positive or negative face can be damaged if they are pushed into admitting some error or an imposition is made upon them. Syllabus designers drew on speech acts to provide the theoretical underpinning for the functional approach to language learning, which was at the heart of communicative language teaching in the 1970s (Wilkins, 1976). While speech act theory and a functional approach to language have mainly been discussed in relation to speaking, they are also of relevance for writing.
The purpose of this paper is to illustrate how a freely available online corpus has been exploited in a module on teaching business letters from a phraseological, functional perspective covering the following four speech acts (functions) commonly found in business letters: invitations, requests, complaints and refusals. It is proposed that different strategies are required for teaching more neutral, non-face-threatening (inviting and requesting) and face-threatening (complaining and refusing) speech acts.
In the first part of this article I briefly review how a functional approach to language learning has been addressed in corpus-based materials. Most of these applications, however, focus on English for Academic Purposes (EAP) or spoken material specifically directed toward English Language Teaching (ELT). I then review the ‘Noticing Hypothesis’ which underlies much corpus-based pedagogy, although only in a few accounts is this made explicit. In the second part of the paper I illustrate how corpus consultation focusing on functions with particular emphasis on their phraseologies has been addressed in the teaching of a less explored genre, i.e., that of business letters. I also illustrate how the ‘Noticing Hypothesis’ underpins much of the corpus consultation, an aspect which has been little commented on in the literature.
2 Treatment of functions in corpus-based instruction
The teaching of the lexico-grammar of functions has been addressed in a variety of ways in corpus-based pedagogy. (Here, I use the term ‘corpus-based’ in a general sense to cover both hands-on and pen-and-paper activities derived from concordance output.) One early key endeavour is that by Thurstun and Candlin (1998a & 1998b) for teaching the functions associated with general academic English, for example, stating the topic, reporting the research of others, starting from a key lexical item. Within each broad function, each key word (e.g., claim, identify) is examined using concordance output within the following chain of activities (Thurstun & Candlin, 1998b: 272):
• LOOK at concordances for the key term and words surrounding it, thinking of meaning.
• FAMILIARIZE yourself with the patterns of language surrounding the key term by referring to the concordances as you complete the tasks.
• PRACTISE key terms without referring to the concordances.
• CREATE your own piece of writing using the terms studied to fulfil a particular function of academic writing.
Corpus-based instructional material for English for general academic purposes has also been produced by Charles (2007, 2011), Thompson and Tribble (2001) and Bloch (2009, 2010). Like Thurstun and Candlin, Charles (2007) also targets key rhetorical functions, using a corpus of PhD theses written by native speakers, in this case the combinatorial function of defending your work against criticism, a two-part pattern: ‘anticipated criticism→defence and its realization using signals of apparent concession, contrast and justification’ (op. cit., 296). Another feature of Charles’ materials is that she approaches these functions by first using a top-down approach, providing students with a suite of worksheet activities to sensitize them to the extended discourse properties of this rhetorical function. She then supplements these with a more bottom-up approach by having students search the corpus of theses to identify typical lexico-grammatical patterns realizing these functions.
The hands-on activities by Thompson and Tribble (2001) and Bloch (2009, 2010) target a specific function, that of citations. Thompson and Tribble's tasks focus on having students categorize the citations identified in a dedicated corpus according to their range, purpose and forms. Bloch (2009) describes a user-friendly program for teaching the use of reporting verbs in academic writing in corpora compiled in-house to meet the needs of specific learners. The interface presents users with only a limited number of hits for each query and a limited number of criteria for querying the database, namely integral/non-integral; indicative/informative; writer/author; attitude towards claim; strength of claim, categories devised from Bloch's (2010) research on the use of reporting verbs from a rhetorical perspective.
The teaching of functions has also been the focus of corpus-based materials of spoken communication in general ELT (Ackerley & Coccetta, 2007; Coccetta, 2011). Coccetta's materials are based on multimodal approaches to corpus analysis, which take a systemic-functional orientation to determine how different semiotic resources (language, gaze, gesture, etc.) interact to create meaning (cf. Baldry & Thibault, 2006). Coccetta explains how an online multimodal concordancer incorporating a search engine allows students to find and isolate sequences in a corpus sharing the same characteristics by means of a functionally tagged corpus. For example, to see if the function of ‘declining an offer’ occurs in a subcorpus relating to requests, invitations and offers and to find the linguistic forms realizing this function, students choose a parameter from a drop-down menu to retrieve the relevant concordance lines exemplified in Table 1. Each concordance line has access to the film clip which provides non-linguistic information drawing on semiotic resources such as gesture, posture, gaze and facial expressions.
Table 1 Concordance results for the ‘declining an offer’ function (Coccetta, 2011: 131)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20151127093454275-0733:S0958344012000043_tab1.gif?pub-status=live)
The account in this article of corpus-based pedagogy aimed at writing business letters addresses the use of corpora for English for Occupational Purposes (EOP), thus contributing to the existing literature on using corpora for teaching functions in English for Academic Purposes (EAP) and ELT settings.
3 Noticing Hypothesis and corpus consultation
The ‘Noticing Hypothesis’ discussed in second language acquisition (SLA) studies underpins many corpus studies. The principle underlying this cognitive concept is that learners’ acquisition of linguistic input is more likely to increase if their attention is drawn to salient linguistic features. Schmidt (1990, 2001), one of the main proponents of this hypothesis, maintains that noticing precedes understanding and is a condition which is necessary for converting input into intake. Moreover, as Boulton (2011) points out, ‘noticing’ overlaps with other features such as focus on form, consciousness-raising, and language awareness. Swain (1998: 66) links ‘noticing’ to frequency counts of form, remarking that there are several levels of noticing, one of which is that: “Learners may simply notice a form in the target language due to the frequency or salience of the features themselves”. In spite of its detractors, most notably Truscott (1998) and Robinson (1997), whose empirical research on implicit and explicit second language learning under four conditions (implicit, incidental, rule-search, instructed) found learning to be fundamentally similar across all four conditions, the concept does, in general, hold currency in corpus-based pedagogy for several reasons. One reason is that as Key Word in Context (KWIC) concordance lines highlight recurrent phrases, scrutiny of corpus data would seem to be an ideal means of enhancing learners’ input with attention paid to frequency counts, a level of noticing identified by Swain (1998). Another reason, as signalled by Boulton (2011), is that inductive approaches, the mainstay of data-driven learning (DDL), are entirely dependent on noticing (although the concept itself is only explicitly referred to in a few corpus-based endeavours; cf. Johns, Lee & Wang, 2008; Flowerdew, 2008, in press, 2012b).
A purely inductive approach to corpus consultation implies spontaneous noticing by the learners. However, as Johansson (2009) points out, in reality, the pedagogic use of corpora combines inductive and deductive approaches, involving some kind of teacher intervention.
Is the use of corpora to be grouped with the explicit or implicit method? The term ‘data-driven’ learning suggests that it is an inductive approach and therefore comparable with the implicit method, though the emphasis is on gaining insight rather than establishing habits, and in this sense it is mentalistic. I believe that the dichotomy explicit-implicit is far too simple. In the case of corpora in language teaching, I would favour a guided inductive approach or a combination of an inductive and deductive approach where the elements of explanation and corpus use are tailored to the needs of the student.
(Johansson, 2009: 41–42)
Several studies providing a framework for corpus consultation mediate this inductive/deductive continuum. They involve some type of ‘pedagogic mediation’, a term first introduced by Johns (1997) and subsequently taken up by McCarthy (1998) and Widdowson (2000), for teacher-directed noticing activities. For example, Chujo, Anthony and Oghigian (2005: 1) propose the following four-step DDL approach to incorporate cognitive processes such as noticing and hypothesis formation: (1) hypothesis formation through inductive corpus-based exercises; (2) explicit explanations from the teacher to confirm or correct these hypotheses; (3) hypothesis testing through follow-up exercises; and (4) learner production. Meanwhile, Flowerdew (2009: 407) proposes modifying Carter and McCarthy's (1995) ‘3 Is’ strategy: Illustration (looking at data); Interaction (discussion and sharing observations and opinions); Induction (making one's own rule for a particular feature) to accommodate the concept of noticing through adding ‘Intervention’ as an optional stage between Interaction and Induction.
Kennedy and Miceli's (2010) corpus work also favours a guided-inductive approach. They note that corpus work of a purely inductive nature without any kind of pedagogic mediation or guidance, as in the ‘learner-as-researcher’ model proposed by Bernardini for her advanced translation students (Bernardini, 2002, 2004), would make high demands in terms of language proficiency, observation and inductive reasoning. As their students of Italian are intermediate-level and not advanced like Bernardini's, they propose two modes of apprenticeship training, ‘pattern-hunting’ and ‘pattern-defining’, using a 500,000-word corpus of contemporary Italian, to aid these students with personal writing on everyday topics. For example, when writing about their sense of personal space for an autobiography, students were first prompted to come up with some key words for pattern-hunting. Many students suggested the common term spazio, which not only turned up ideas and expressions, for example, rubare spazio (take space) but also triggered further searches on words encountered in the concordance lines, for example, percorso (path). Other pattern-hunting techniques included browsing through whole texts on the basis of the title and text-type, and scrutinizing frequency lists for common word combinations. The pattern-defining function was used when students did have a specific target pattern in mind to check. For example, one student wanted to establish if the pattern “so” <adjective> “that” could be rendered in Italian with così <adjective> che and if the subjunctive mood was required after che. Both types of apprenticeship training involve noticing, ‘pattern-hunting’ instigated by the teacher and ‘pattern-defining’ by the student.
The following section first provides a brief overview of the business letters module and the approach taken to the corpus consultation.
4 Business letters module: background and approach
At the tertiary institution in Hong Kong where I work, a 15-hour module on writing business letters is offered to final-year undergraduate science students to prepare them for the professional workplace. There is a comprehensive set of in-house textbook material, covering four key genre sets of business letters, namely invitations and thanks, requests and replies (refusals), complaints and adjustment letters, and sales letters. A one-million-word freely available business letters corpus comprising American and British business letters was used to supplement the existing course materials and textbook activities (see www.someya-net.com/concordancer for further details of this corpus).
This paper describes how I integrated corpus consultation into classroom activities based on the teaching of the module to six different groups of students, with approximately eighteen students in each class. My account differs from that of other initiatives in the literature in several aspects. First, there was only one computer in the classroom, which necessarily constrains the corpus-based activities. Another difference is that the students did not undertake a training session in how to formulate queries and search the corpus beforehand. Leading practitioners have emphasized the importance of incorporating strategy training into corpus consultation (cf. Chambers, 2005, 2007; Lee & Swales, 2006; O'Sullivan & Chambers, 2006; O'Sullivan, 2007; Kennedy & Miceli, 2010), the absence of which has been identified as one of the possible reasons for lack of uptake (cf. Frankenberg-Garcia, 2011). While not denying the importance of strategy training, my attempts at using a set of in-house-produced materials for corpus training with two classes proved somewhat unsuccessful, the main reason being that the materials targeted searches in corpora for academic writing. It was found that building strategy training into the corpus consultation was a more effective and efficient mode of instruction in this particular teaching scenario where time was limited. Also, the freely available business letters corpus proved to be ideal for teaching purposes on account of its restricted size and user-friendly search facilities (see Braun, 2005).
Moreover, Boulton's (2009) empirical study has provided evidence indicating that even lower-level learners can cope with corpus data with no prior training, with Boulton remarking that “We are perhaps beginning to see something of a retreat on this strong insistence on training” (op. cit.: 40).
The third way in which the pedagogic activities differ from other accounts in the literature which describe how corpus consultation has been incorporated systematically (cf. Flowerdew, 2008; Kennedy & Miceli, 2010) is that the business letters corpus was used on an ad hoc basis. By this I mean that corpus searches were primarily either used at the initial stage of the unit in hypothesis-type activities or conducted whenever students were faced with problematic lexico-grammatical aspects arising from the in-house-produced textbook materials. A description of the corpus-based activities for the four functions is provided in the following section. I relate this to the phraseological approach and noticing hypothesis described earlier in the paper.
5 Corpus consultation for speech acts/functions
5.1 Invitations
Stubbs (1987) has noted that the formal written language of business correspondence is a context which produces a large number of explicit performatives, for example, May I wish you a successful and interesting conference, I emphasise that… (op. cit.: 10). This was the starting point for the corpus task on writing letters of invitation. Before looking at the sample letter in the textbook from an undergraduate student to the executive director of a well-established engineering company in Hong Kong inviting them to be a guest speaker at the quarterly dinner of the student alumni association, I asked students to write the opening sentence of this letter. After eliciting a variety of responses, most of which began with the phrase ‘I am writing to invite you…’ or ‘We would like to invite you…’, I asked students to search the corpus for various verb forms for invitations, e.g. invite, inviting, invited, i.e., the performative verb for the speech act of ‘inviting’. The main purpose behind this task was to discourage students from adopting a ‘phrase book’ mentality and to expose them to a wide variety of exponents. This was achieved by encouraging students to read the corpus both paradigmatically (from top to bottom) and syntagmatically (from left to right); O'Keeffe and Farr (2003) and Flowerdew (2009) have indicated that students need training in ‘reading’ concordance lines in this way. For example, reading the corpus paradigmatically familiarizes students with a variety of phrases, e.g., we would like to invite you…; we are pleased to invite you…; you are cordially invited to attend…. Students then compared these openings with the ones they had written, noting that the phrase you are cordially invited to attend… from the corpus might not be quite appropriate for their context of writing.
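For readers unfamiliar with concordance output, the kind of search described above can be approximated with a minimal Key Word in Context (KWIC) sketch. This is not the BLC's own search interface: the function name `kwic`, the `width` parameter and the four sample sentences are assumptions made for illustration, the sentences being invented around phrases quoted in this section rather than taken from the corpus.

```python
import re

# Invented mini-sample built around phrases quoted in this section; NOT BLC data.
SAMPLE_LETTERS = [
    "We would like to invite you to speak at our quarterly dinner.",
    "We are pleased to invite you to our annual conference.",
    "You are cordially invited to attend the opening ceremony.",
    "I am writing to invite you to be our guest speaker.",
]

def kwic(texts, pattern, width=30):
    """Return KWIC lines: left context, matched key word, right context."""
    regex = re.compile(pattern, re.IGNORECASE)
    lines = []
    for text in texts:
        for match in regex.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Right-align the left context so the key words line up vertically,
            # which is what makes top-to-bottom (paradigmatic) reading possible.
            lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines

# \binvit\w* matches invite, invited, inviting, invitation and so on.
concordance = kwic(SAMPLE_LETTERS, r"\binvit\w*")
for line in concordance:
    print(line)
```

Reading down the bracketed column corresponds to the paradigmatic reading described above, while reading across a single line corresponds to the syntagmatic one.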
An additional task in the textbook required students to circle an appropriate adjective from a choice of three or four to collocate with a particular noun, a language point for which it would be difficult to find the answer in grammars or dictionaries. This task was exploited to introduce students to reading the corpus syntagmatically through hypothesis testing. In the sentence below from the textbook, students were first asked to circle which adjective they thought most appropriate. When checking their responses I found that most students thought either cordial or kind was best but none of them could explain why.
We extend a (friendly/sincere/cordial/kind) invitation to you to join the … Young Scientists Society and to participate in our exciting educational programmes.
To test their hypotheses, students were then asked to look up the illocutionary noun invitation, a concealed performative, in the BLC which yielded the following concordance lines:
However, students were only able to interpret the corpus data through teacher mediation to encourage ‘noticing’ beyond the collocational level. When prompted to read the lines syntagmatically in the spirit of Sinclair's ‘extended units of meaning’, i.e., to examine the subject + verb + adj. + noun, students were then able to work out that when cordial was used with invitation it was most commonly used when the sender was offering the invitation, e.g. Please accept our cordial invitation to visit… On the other hand, when invitation was preceded by kind, it tended to be used for thanking by the receiver, for example, Thank you for your kind invitation to attend… The final mini-task involved drawing students’ attention to frequency data, which elicited some surprise on the part of the students to discover that the noun form invitation was more common than the verb forms in this corpus. These examples thus serve as a means of acquainting students with the probabilistic nature of language and with the fact that language does not always consist of rule-governed behaviour.
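The syntagmatic reading described above, in which the adjective before invitation is related to who is speaking, can be sketched roughly as follows. The sample lines are invented and modelled on the phrases quoted in this section, not actual BLC output, and the rule that a line containing ‘thank’ realizes a thanking move is a deliberately crude, hypothetical heuristic standing in for the students' own reading of the co-text.

```python
import re
from collections import Counter

# Invented examples modelled on phrases quoted above; NOT actual BLC output.
SAMPLE_LINES = [
    "Please accept our cordial invitation to visit our new showroom.",
    "We extend a cordial invitation to you to join the society.",
    "Thank you for your kind invitation to attend the dinner.",
    "Many thanks for your kind invitation to join in the celebration.",
]

def left_collocates(lines, node):
    """Count the word immediately to the left of `node` in each line."""
    pattern = re.compile(rf"\b(\w+)\s+{re.escape(node)}\b", re.IGNORECASE)
    counts = Counter()
    for line in lines:
        for match in pattern.finditer(line):
            counts[match.group(1).lower()] += 1
    return counts

def collocate_by_function(lines, node):
    """Pair each left collocate with a crude guess at the speech act."""
    pattern = re.compile(rf"\b(\w+)\s+{re.escape(node)}\b", re.IGNORECASE)
    pairs = Counter()
    for line in lines:
        # Hypothetical heuristic: a line containing 'thank' is a thanking move.
        function = "thanking" if "thank" in line.lower() else "inviting"
        for match in pattern.finditer(line):
            pairs[(match.group(1).lower(), function)] += 1
    return pairs

adjective_counts = left_collocates(SAMPLE_LINES, "invitation")
function_counts = collocate_by_function(SAMPLE_LINES, "invitation")
print(adjective_counts)
print(function_counts)
```

A collocate count alone treats cordial and kind as equally good partners for invitation; only the second table, which keeps the wider co-text in view, separates the sender's use from the receiver's.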
In addition to noting form-function correlations, students also have to be made aware that business language is highly context-sensitive. Widdowson (1998) has pointed out that a corpus is transposed from its original context, which obscures the communicative intent and socio-cultural purpose. However, as Gavioli and Aston (2001: 240) have argued, it is not so much a question of whether corpora are divorced from their original setting, but rather “whether their use can create conditions that will enable learners to engage in real discourse, authenticating it on their terms” (see also Mishan, 2004). Sometimes the co-text can provide enough clues to the context, enabling students to ‘authenticate’ the corpus output for their own learning situation. For example, in Figure 1 the co-text of the collocation ‘kind invitation’ provides some help with interpretation at the pragmalinguistic level (the collocation of certain linguistic features in a certain register). A student would be able to glean from the co-text of kind invitation the pragmalinguistic knowledge that ‘I thank you for your kind invitation on the occasion of…’ belongs to a more formal register than ‘Many thanks for your kind invitation to join in…’.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160626072527-41293-mediumThumb-S0958344012000043_fig1g.jpg?pub-status=live)
Figure 1 Collocations for ‘invitation’.
However, sociopragmatic appropriacy, which is influenced by social, cultural and personal preferences and the dynamics of the unfolding interaction (Kasper, 2001), is more difficult to discern in corpus data. Here, the use or non-use of certain direct or indirect speech acts can pose problems for interpretation. In Figure 1 for ‘cordial invitation’ it could well be that the example ‘Province of <name> extends a cordial invitation to <name> to attend…’ is a polite directive disguised as an invitation. But we have no way of knowing this unless we are familiar with the situation and social roles (although admittedly, as business letters are somewhat conventionalized these may be fairly obvious in some cases). But in cases such as these, the lack of situational context can serve as a consciousness-raising activity. Students can be asked to supply what they consider to be possible scenarios for this exponent, thus sensitising them to the necessity of using corpus data judiciously and avoiding a cut-and-paste mentality.
5.2 Requests
One input task in the student textbook required students to formulate a polite request from one member in a company to another of the same status, asking them to get in contact about a fairly routine matter. This yielded student writing such as in the following two examples:
* I would appreciate if you can contact me regarding….
* I would be very appreciated if you could contact with me…
However, general unfamiliarity with various patterns containing appreciat* made it necessary to adopt Kennedy and Miceli's ‘pattern-hunting’ strategy outlined in the first part of this article. Through scrutinizing the concordance lines students were able to see the correct pattern for ‘appreciate’ and note the obligatory object ‘it’, as illustrated by the concordance lines in Figure 2. The corpus data also revealed the prevalence of modals with illocutionary lexical verbs, i.e., hedged performatives, which Stubbs (1987) has noted as the commonest surface form of verbs in his small corpus of business correspondence.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160626072531-40364-mediumThumb-S0958344012000043_fig2g.jpg?pub-status=live)
Figure 2 Sample concordance output from the BLC for ‘appreciate’.
Likewise, students were able to discern the correct lexico-grammatical patterning with phrases for ‘appreciated’, and to work out that the string ‘…appreciated if…’ requires the dummy subject ‘it’. The concordance lines also revealed that the collocation ‘very appreciated’ as found in the students’ suggestions is a non-harmonic one, with possible ones being ‘very much’, ‘greatly’ and ‘highly’, as noted by students from their paradigmatic reading of the concordance lines (see Figure 3). Again, these tasks are an attempt to apply Sinclair's concept of ‘extended units of meaning’ to pedagogy.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160626072533-86133-mediumThumb-S0958344012000043_fig3g.jpg?pub-status=live)
Figure 3 Sample concordance output from the BLC for ‘appreciated’.
However, while the BLC is very useful for revealing lexico-grammatical patterning, this is only half the story. Students have to carry out further analysis to decide which pattern would be suitable for their context of writing. Again, this is another instance in which the co-text and also frequency counts can provide some help. Students’ attention was drawn to the fact that while there were 105 hits of patterns with ‘appreciate’, only nine instances of the pattern ‘It…appreciated if…’ were recorded, thus suggesting this is a marked form. That this pattern usually occurred with a high degree of modalisation and adverbials was an indication that it was used for making requests which placed onus on the recipient, or was used as a deferential marker for making requests to someone of a rank higher than the writer. Applying the concept of ‘noticing’ of salient linguistic features and frequencies helped students to decide on appropriate patterns for their own context of writing.
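The comparison of pattern frequencies described above can be sketched as a simple count over regular expressions. The five sentences are invented for illustration and are not drawn from the BLC, so the counts printed here say nothing about the 105 and nine hits reported above; the pattern labels and the function name `pattern_frequencies` are likewise assumptions of this sketch.

```python
import re

# Invented sentences for illustration only; NOT drawn from the BLC.
SAMPLE_SENTENCES = [
    "I would appreciate it if you could contact me regarding the order.",
    "We would appreciate it if you could send us the invoice.",
    "We would appreciate your prompt reply.",
    "It would be greatly appreciated if you could confirm the date.",
    "Your help is very much appreciated.",
]

# Hypothetical pattern labels mapped to regular expressions.
PATTERNS = {
    "appreciate it if": re.compile(r"\bappreciate it if\b", re.IGNORECASE),
    "It ... appreciated if": re.compile(
        r"\bit\b[^.]*?\bappreciated if\b", re.IGNORECASE
    ),
}

def pattern_frequencies(sentences, patterns):
    """Count how many sentences contain each pattern at least once."""
    return {
        label: sum(1 for sentence in sentences if regex.search(sentence))
        for label, regex in patterns.items()
    }

frequencies = pattern_frequencies(SAMPLE_SENTENCES, PATTERNS)
for label, count in frequencies.items():
    print(f"{label}: {count}")
```

Raw counts of this kind only flag a pattern as frequent or marked; deciding whether the marked form suits a given reader and situation still depends on the co-text, as argued above.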
5.3 Complaints
In some situations, it may be more sociopragmatically appropriate to use an indirect speech act, especially for those functions which are regarded as somewhat face-threatening or make impositions on the hearer or reader. A case in point is the speech act of disagreeing which has the potential to damage the hearer's positive face. McCarthy (1998) notes that in the 5-million-word Cambridge Nottingham Corpus of Discourse in English (CANCODE) there are only eight occasions when someone says, I disagree, and, interestingly, no examples followed by with you. McCarthy further notes that all eight occurrences of I disagree are modified in some way with a mitigating device, e.g. I'd er, I'd disagree. Speech acts may not involve performative verbs, but may unfold indirectly and in negotiation “with due sensitivity to interlocutors’ personal face” (McCarthy, 1998: 19). However, such types of indirect speech act present the dilemma of how to search for these in a corpus. An indirect speech act common to business letters, namely complaining, is discussed below.
One of the input tasks on style and tone in the textbook reads thus: Maria Wong is thinking about changing the wording of parts of her letter. Circle any phrases you consider inappropriate and rewrite them:
I am writing to complain about a contamination incident which has recently occurred on campus…
Students suggested a wide range of lexico-grammatical patterns to replace the direct speech act ‘complain’, the most common of which are given below. This procedure is somewhat similar to Kennedy and Miceli's (2010) ‘pattern-defining’ technique as the students had specific target patterns in mind.
I am writing to lodge a complaint about…
I am writing to comment on…
I am writing to inform you…
I am writing to express my opinion on…
I am writing to express my concern about…
Students’ responses indicated that they had a grasp of the sociopragmatics but lacked pragmalinguistic knowledge. While the task did not specify the kind of contamination incident that had occurred, students were able to indicate that the phrase ‘I am writing to lodge a complaint about…’ might be used for a serious incident, whereas ‘I am writing to express my concern about …’ would commonly be used for something of less severity. A search in the BLC confirmed students’ intuition on the use of ‘lodge a complaint’ with the following phrase found:
The <NAME> has lodged a formal complaint with me…
Sometimes corpus data triggered other queries, one such query being when the noun vs. verb pattern was used (possibly motivated by the search queries for the function of ‘inviting’ discussed earlier). A search in the corpus for complain* showed that of a total of 118 tokens, 82 were nouns, e.g., ‘We have received your complaint regarding…’, and that these were invariably used for acknowledging a complaint. When the verb form was used, it was found to occur in two scenarios: as a follow-up to a previous complaint or in a reporting statement (similar to the function of disagree in the CANCODE corpus), e.g.:
We sent an e-mail complaining of the late shipment last week
…back to the old standard that brought about my original complaint.
Several secretaries have complained of major and frequent breakdowns…
The BLC was used to verify students’ other suggestions for making a complaint, realized by implied performatives such as ‘express my concern’. However, as noted by several students, the corpus data did not exactly correspond to the student's suggestion as no examples of ‘my concern’ were found in the one-million-word BLC. This observation alerted students to the principle expressed in Carl Sagan's well-known aphorism ‘absence of evidence is not evidence of absence’. A Google search revealed that this combination was indeed possible.
One could make a pragmatic distinction between phrases with and without a possessive prefacing ‘concern’, but whether it is worth covering such a fine distinction with intermediate-level students and in light of the debate on English as a lingua franca is open to debate (see Seidlhofer, Reference Seidlhofer2011). One student query triggered by the focus of discussion on verbs collocating with ‘concern’, was whether ‘voice concern’ would be appropriate. As a search in the BLC did not yield any patterns of this kind, another Google search was conducted. Of interest is that this pairing most often occurred as reported speech such as that associated with news reporting and writing of minutes, thus sensitizing students to the genre-specific nature of collocations. Other search engines such as WebCorp, which would provide a more ‘linguistic’ presentation of results from the web, or a large ‘principled’ corpus such as the BNC would arguably have been more useful. However, given the context of my teaching situation, a 15-hour module on business letter writing to final-year undergraduate science students, my approach tended towards expediency and efficiency.
5.4 Refusals
Refusals, like complaints, are another potentially face-threatening speech act. Similar to the previous task in the textbook, students were asked to identify the problem in the following sentence and rewrite it in an appropriate tone for the scenario provided:
I refuse your request to return the items that you ordered from our company.
Students were asked to provide patterns to replace ‘refuse’, which were then discussed in class and checked against the corpus data. One student suggested the phrase ‘May we remind you that items ordered are not returnable’, saying that it had appeared on a notice in a shop and printed on the receipt she received. While corpus data may be decontextualized as Widdowson has pointed out, having students reflect on genres and contexts when defining patterns helps to overcome this obstacle to some extent. The most common phrases proposed by students for the function of ‘refusal’ were as follows:
These patterns were subsequently searched in the BLC to verify students’ suggestions. One student follow-up query concerned whether it was necessary to use ‘that’ to introduce the reporting clause. After examination of the concordance data, students concluded that although ‘that’ was optional, it was preferred in formal writing. Of interest is that students discovered the string ‘…sorry that…’ to be multifunctional (Moon, 1997), as illustrated in Figure 4. When ‘sorry’ occurred with a verb in the past or present perfect in the reporting clause, it had the function of apologizing, for example:
I am very sorry that this happened to you.
We are very sorry that we have not replied earlier.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160626072658-40775-mediumThumb-S0958344012000043_fig4g.jpg?pub-status=live)
Figure 4 Sample concordance output for ‘sorry’.
In most cases, though, it was found to function as a polite refusal, which, it could be argued, also serves as an implicit performative for apologizing, for example:
We are sorry that we can not allow you the special discount.
The elicitation of students’ patterns for polite refusals revealed confusable speech acts. Students were therefore instructed to scrutinize the concordance output in the ‘that’ clause, again echoing Sinclair's concept of ‘extended units of meaning’. Follow-up class discussion of the data revealed that students had noticed the overlap in speech acts, i.e., while ‘I am sorry / I regret / I apologize for…’ can all be used to realize the speech act of apologizing, only ‘I am sorry / I regret that’ also function as polite refusals. It is also important to draw students’ attention to overused, stereotypical phrases such as ‘we regret to inform you that the train will be delayed’ (Henry Tye, personal communication). In other words, consideration should be given to the perlocutionary effect of the speech act, i.e., the effect of the apology on the addressee and whether they regard it as genuine or merely a standard, impersonal response.
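The kind of scrutiny of the ‘that’ clause described above can be roughly mimicked with a simple rule. The sketch below is illustrative only and is not part of the classroom procedure reported here: the function name `classify` and the list of refusal cues (‘cannot’, ‘can not’, ‘can't’, ‘unable to’, ‘not able to’) are assumptions chosen to fit the three concordance lines quoted in this section, and a fuller treatment would need to inspect verb tense as well.

```python
import re

# Capture whatever follows 'sorry that' as the reporting clause.
SORRY_THAT = re.compile(r"\bsorry that\b(.*)", re.IGNORECASE)

# Assumed cues for a polite refusal; this list is illustrative, not exhaustive.
REFUSAL_CUE = re.compile(
    r"\b(?:can\s?not|can['’]t|unable to|not able to)\b", re.IGNORECASE
)


def classify(sentence):
    """Label a 'sorry that' sentence as 'refusal' or 'apology' (None if no match)."""
    match = SORRY_THAT.search(sentence)
    if match is None:
        return None
    clause = match.group(1)
    return "refusal" if REFUSAL_CUE.search(clause) else "apology"


if __name__ == "__main__":
    for line in [
        "I am very sorry that this happened to you.",
        "We are very sorry that we have not replied earlier.",
        "We are sorry that we can not allow you the special discount.",
    ]:
        print(classify(line), "|", line)
```

A rule of this kind only flags candidate lines; deciding whether a given line is read as a genuine apology or a formulaic refusal still depends on the reader's judgement of the wider context.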
In addition to the concordancer, another search facility in the BLC which proved useful was the Bigram Plus function. It was found that students might be familiar with two or three key words in a pattern, but unsure how to ‘string’ them together. Figure 5 below shows the output for the bigram ‘sorry……attend’. However, students need to select and modify the corpus data to suit their own context of writing and be trained to carry out further searches (in this case, using Google) on more unusual patterns. For example, a Google search revealed that ‘sorry to say…’, which occurred only once in the BLC, was usually restricted to more informal contexts.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160626072736-71030-mediumThumb-S0958344012000043_fig5g.jpg?pub-status=live)
Figure 5 Using the Bigram Plus function to search for patterns.
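A search for two key words separated by a gap, such as ‘sorry……attend’, can be sketched as follows. This is a hedged approximation and makes no claim about how the Bigram Plus function is implemented: the function name `gapped_matches`, the `max_gap` parameter and the two test sentences in `EXAMPLES` are invented for illustration and are not drawn from the BLC.

```python
import re

# Invented sentences for illustration only; they are not BLC data.
EXAMPLES = [
    "I am sorry I cannot attend the meeting.",
    "We are sorry for the delay in replying.",
]


def gapped_matches(sentences, first, second, max_gap=5):
    """Find `first` followed by `second` with at most `max_gap` words in between."""
    hits = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        for i, token in enumerate(tokens):
            if token != first:
                continue
            # Look at the next max_gap + 1 tokens, so the gap itself is <= max_gap.
            window = tokens[i + 1:i + 2 + max_gap]
            if second in window:
                j = i + 1 + window.index(second)
                hits.append(" ".join(tokens[i:j + 1]))
    return hits


if __name__ == "__main__":
    print(gapped_matches(EXAMPLES, "sorry", "attend"))
```

The returned strings show the words that fill the gap, which is the information students need when they know the key words but are unsure how to ‘string’ them together.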
6 Concluding remarks
This paper has illustrated how a freely available online corpus has been exploited in a module on teaching business letters. With reference to concepts derived from speech act theory and the ‘Noticing Hypothesis’, I hope to have shown how a small, specialized corpus can profitably be used to supplement existing course materials. First, many of the student queries centred on phraseological aspects of language which are not always covered in dictionaries and grammars. Secondly, hands-on concordancing activities have advantages over using purely text-based material in which only one or two sample letters are provided, as recurrent phraseologies alert students to the ‘preferred ways of saying things’ (Sinclair, 1991).
It is hoped that the strategies discussed in this paper will contribute to the growing body of literature on teaching functions from a corpus-based perspective and to corpus-based pedagogy in general. There remain several limitations to this DDL initiative, however. Only observational data gleaned from teaching the business letters module across six classes are reported. No systematic experimental data were collected on students’ evaluation of using corpora or on the extent to which their writing profited from corpus consultation. In any case, it is only very recently that studies have been conducted on students’ performance based on DDL activities (see Flowerdew, 2012a for a review of these studies). Boulton (2011: 39) raises the issue that “the real advantages of DDL lie in longer-term benefits, cognitive/constructivist as well as purely linguistic; in addition to ‘incidental’ learning and greater learner autonomy, these include language awareness and noticing ability”. What are now needed are more studies to determine the long-term benefits of DDL. This article has merely touched on heightening language awareness through teacher-directed ‘noticing’ activities related to the phraseologies of speech act functions (Austin, 1962) – How to do things with words, with corpora.
Acknowledgements
I wish to thank the two anonymous reviewers for their feedback on a previous draft of this article. Their comments have been very helpful for revising and reworking the paper.