1 Introduction
The concept of data-driven learning (DDL) was introduced to the field of second language (L2) learning by Johns (1990). DDL is associated with “using the tools and techniques of corpus linguistics for pedagogical purposes” (Gilquin & Granger, 2010: 359). Since Johns’ pioneering work, research exploring the effectiveness of DDL has been expanding and has by now convincingly shown that this method can be beneficial for various instructional foci and is applicable in many institutional contexts. Nevertheless, recent overviews show that “the field has yet to reach full maturity” (Boulton & Pérez-Paredes, 2014: 122) and that “the direct uses of corpora in language teaching are treated rather marginally in the literature in the field” (Leńko-Szymańska & Boulton, 2015: 3). In addition to the general scarcity of DDL studies in language teaching research, the foci of available studies have been narrow with regard to target language (English to the exclusion of other languages), L2 proficiency levels (intermediate to advanced), participating instructors (DDL researchers rather than regular teachers), and linguistic targets (primarily lexical and morphologically simple grammatical features). This study aims to expand the empirical DDL research body by addressing these limitations: it targets German as an L2, very low proficiency levels, interventions conducted by regular instructors, and complex lexico-grammatical items. Furthermore, this study investigates the feasibility of brief one-time DDL interventions within non-DDL curricula as well as DDL effectiveness for teaching new versus previously learned linguistic structures, another underexplored area.
2 Literature Review
2.1 Theoretical background
DDL is fully compatible with usage-based second language acquisition (SLA) theories that conceive of language as an open-ended dynamic system that emerges by way of probabilistic bottom-up abstraction rather than a fixed system that follows categorical top-down rules (Ellis, 2014). One of the most important tenets of usage-based approaches is the inseparability of grammar and lexis, conceptualized by Ellis (2014: 399) as follows:
Language is intrinsically symbolic, constituted by a structured inventory of constructions as conventionalized form-meaning pairings used for communicative purposes. […] Adult language knowledge consists of a continuum of linguistic constructions of different levels of complexity and abstraction. Constructions can comprise concrete and particular items (as in words and idioms), more abstract classes of items (as in word classes and abstract constructions), or complex combinations of concrete and abstract pieces of language (as mixed constructions). No rigid separation exists between lexis and grammar.
The second theoretical principle behind DDL is Schmidt’s (1990) noticing hypothesis, which posits that some level of learner awareness of the target L2 construction is necessary for it to be learned. In order to attract learners’ attention to L2 targets, specific instructional methods have been proposed, notably input enrichment and input enhancement. Input enrichment, or input flood, is an increase in the frequency of the target feature in the input (Trahey & White, 1993), and input enhancement refers to making the target feature more salient, for example with typographical means such as bolding, underlining, or color marking (Sharwood Smith, 1993). DDL is inherently conducive to both teaching techniques, as corpora can supply a large number of attested language samples containing the target construction (input enrichment) and these samples can be retrieved from corpora with the help of concordance tools in the form of stacked concordance lines with the target construction highlighted and centered (input enhancement). Although input enrichment and enhancement were originally proposed as rather implicit instruction methods (i.e., those facilitating learner awareness at the level of subconscious noticing), empirical SLA research has shown that implicit instruction is often insufficient (especially for non-salient L2 targets) and that more explicit methods (i.e., those facilitating learner awareness at the level of conscious understanding) are necessary (Ellis, 2005; Sharwood Smith, 2013; Spada & Tomita, 2010). The specific method that has been adopted in most DDL interventions can be characterized as explicit inductive within the implicit–explicit/inductive–deductive taxonomy (DeKeyser, 2003: 314).
In this method, also termed “discovery learning” (Bernardini, 2002), rich and enhanced input (e.g., concordance lines) serves as material for learners’ noticing and analysis of language patterns. As a result, learners arrive at generalizations about the usage of the target constructions, which, in turn, lead to the complexification and expansion of their L2 knowledge (Flowerdew, 2015).
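The stacked, centered concordance lines described above can be illustrated with a minimal keyword-in-context sketch. The function name `kwic`, the `width` parameter, the bracket highlighting, and the second sample sentence are assumptions made for illustration only; they do not reproduce any particular concordance tool.

```python
import re

def kwic(sentences, target, width=30):
    """Return concordance lines with `target` aligned in one column.

    Each line shows up to `width` characters of left and right context,
    and the target is wrapped in brackets to make it visually salient.
    """
    pattern = re.compile(r"\b" + re.escape(target) + r"\b", re.IGNORECASE)
    lines = []
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            left = sentence[:match.start()][-width:]
            right = sentence[match.end():][:width]
            # Right-align the left context so that all targets stack vertically.
            lines.append(f"{left:>{width}}[{match.group(0)}]{right:<{width}}")
    return lines

# Illustrative sentences; the second one is invented for this sketch.
samples = [
    "Ich warte auf meinen Bruder.",
    "Wir warten schon lange auf den Bus.",
]
for line in kwic(samples, "auf"):
    print(line)
```

Because every left context is padded to the same width, the highlighted preposition lands in the same column on each line, which is the visual stacking that makes the pattern easier to notice.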
The third principle is that of learner autonomy, the development of which has been repeatedly pointed out as an important long-term benefit of DDL (e.g., Chambers & O’Sullivan, 2004). However, it has also been acknowledged that this development should be mediated through teacher and peer assistance. The adoption of the principle of the mediated nature of language development from sociocultural theory (see Flowerdew, 2015) has led to the modification of the inductive discovery method to the so-called guided induction approach. This method, originally proposed by Herron and Tomasello (1992) in the area of general language teaching, has recently been adopted by many DDL teacher-researchers (Flowerdew, 2009, 2015; Frankenberg-Garcia, 2014; Huang, 2008; Yoon & Jo, 2014). Smart (2014: 187) describes guided induction as follows:
Guided induction is a particular evolution of the inductive approach where learners are first presented with language samples in the form of an interactive task that guides them to discover the language structure they contain; the teacher has an active role in facilitating these tasks, but typically does not present explicit grammar rules. Learners are then guided to produce the language structure in meaningful communicative tasks.
It must be noted that guidance implies not only teacher but also peer scaffolding (cf. “interactive task”), when learners collaboratively work on corpus tasks and assist each other in the inductive discovery process (Flowerdew, 2015; Kennedy & Miceli, 2001).
In sum, the abovementioned theoretical underpinnings inform the main pedagogical principles behind DDL, some of which stand in stark contrast to widespread conventional teaching techniques, as summarized by Flowerdew (2015: 15–16): (1) a lexico-grammatical approach as opposed to the separation of lexis and grammar into different curricular modules; (2) reliance on naturally occurring attested language as opposed to concocted textbook examples; and (3) guided inductive methods based on rich and enhanced input as opposed to rule-based deductive methods.
2.2 Empirical DDL research
Although DDL research is still a young area of inquiry, it has accumulated enough empirical studies for several research syntheses (Boulton & Pérez-Paredes, 2014; Chambers, 2007; Leńko-Szymańska & Boulton, 2015) and the first meta-analysis (Cobb & Boulton, 2015). Having integrated the results of quantitative studies published up to 2012, Cobb and Boulton showed large effect sizes for both learning gains with DDL methods and their superiority to conventional teaching methods. The overwhelming majority of L2 foci in these studies included English collocations (frequently co-occurring patterns of words) taught with DDL methods to learners in different countries and instructional settings. Most of the earlier studies explored the so-called “hard” (Gabrielatos, 2005) version of DDL, where learners searched corpora online on their own with minimal guidance from their teachers. However, this version was soon recognized as not equally feasible for all instructional settings. Online corpus searches by learners turned out to be challenging for many students and teachers alike due to high cognitive task demands, lack of technological expertise and support, or simply absence of computer-equipped classrooms (Farr, 2008; Tian, 2005). This hurdle has led to the emergence of numerous DDL modifications that can be positioned on a “cline” from hard to soft versions (Mukherjee, 2006), depending on the medium (computer or paper) and task difficulty (open-ended or controlled, addressing variable or fixed rules). The number of studies devoted to “softer” DDL versions remains small but has recently been growing, as evidenced by the overview below.
The effectiveness of paper-based DDL interventions, in which learners work with concordances selected and printed on worksheets by their teachers, has been investigated in a number of exploratory and experimental studies (see Boulton, 2010, for an overview). All studies that employed quantitative methods showed that paper-based DDL led to significant gains in learners’ knowledge of a variety of L2 English targets (e.g., phrasal verbs, connectors, and passive voice constructions), and that DDL was either as effective as or better than conventional rule-based methods. Furthermore, quantitative studies that compared computer-based and paper-based DDL found no difference in the effectiveness of these two methods (Boulton, 2012; Vyatkina, in press). Boulton (2008, 2009, 2010, 2012), in particular, has shown that “taking the computer out of the equation” (Boulton, 2010: 234) in paper-based DDL does not compromise its benefits for language instruction. Several studies have also shown that paper-based DDL works at both higher and lower proficiency levels (Koosha & Jafarpour, 2006; Tian, 2005; Yoon & Jo, 2014). Boulton (2008, 2009, 2010, 2012) has conducted systematic paper-based DDL research with first language (L1) French university learners of English at relatively low levels of proficiency (levels A2–B1 of the Common European Framework of Reference, or CEFR; Council of Europe, 2001) despite up to eight years of instruction. Boulton showed that DDL helped such learners to acquire certain linguistic targets that were often impermeable to traditional instruction methods. A number of studies focused on verb-preposition collocations, which are the target of the present study.
These constructions are notoriously difficult for learners with any L1-L2 background because of form-meaning mapping mismatches between languages (Kennedy & Miceli, 2001; Nesselhauf, 2004). DDL studies that have targeted English verb-preposition collocations either alongside other foci (Boulton, 2010; Frankenberg-Garcia, 2014) or as a sole instructional focus (Koosha & Jafarpour, 2006) have shown positive DDL effects. It must be noted that most DDL interventions have been administered by the researchers themselves, with two notable exceptions: Yoon (2008), who worked closely with an instructor who administered computer-based DDL to his students over a longer period; and Boulton (2010), who reports on a successful short paper-based DDL intervention administered by several teachers.
Several recent studies have fleshed out specific features of DDL conducive to language learning. Frankenberg-Garcia (2014) demonstrated that learners who worked with multiple concordance lines for English collocations achieved greater gains in L2 production than learners who worked with one concordance line, worked with definitions of the target structures, or were in a control group. Smart (2014) showed that for English passive constructions, the guided inductive DDL group outperformed two deductive groups, one of which worked with corpus-informed materials and the other with constructed examples from textbooks. The only study (conducted in the computer-based DDL context) that explicitly singled out the specific knowledge effect is Chan and Liou (2005). They found that learners with lower entry-level collocational knowledge made greater gains than learners with higher entry-level knowledge. However, the students in Chan and Liou’s study had had some previous knowledge of the focal collocations, which is in line with virtually all DDL research that has focused on “‘known’ but error-prone items” (Boulton & Pérez-Paredes, 2014: 124) that have been claimed to be more amenable to DDL interventions than completely new items (Cobb, 1999; Nesselhauf, 2004). On the other hand, informal feedback from Boulton’s (2008) participants suggested that they had had no previous knowledge of the target items (two phrasal verbs), which they successfully learned with paper-based DDL. Nevertheless, empirical comparisons of how DDL fares in teaching previously learned versus new items (a task especially relevant at early stages of instruction) are yet to be conducted.
The results of the research reviewed above thus suggest that the primary DDL benefits are richness of input and the use of a guided inductive approach, that both computer-based and paper-based DDL can be equally effective provided that they follow these principles, and that paper-based DDL works even for lower proficiency learners. However, a number of research gaps still remain, some of which will be addressed in the present study.
2.3 Remaining gaps and the goals of this study
Despite the positive findings discussed above, Mauranen’s (2004: 208) call, issued a decade ago, is still relevant and urgent: “to make a serious contribution to language teaching, corpora must be adopted by ordinary teachers and learners in ordinary classrooms”. One way to bring this project to fruition is to promote paper-based DDL that is more feasible for “ordinary” settings. However, in addition to the general scarcity of studies that compare the effectiveness of paper-based DDL and non-DDL methods (see review above), there are a number of design limitations that have yet to be addressed. The first research gap concerns the target language focus. So far, studies have been almost exclusively limited to English. This, in turn, has led to primary attention being directed at lexical linguistic targets and analytical grammatical constructions (e.g., passive and subjunctive verb forms), whereas morphologically complex targets (e.g., nominal inflection), which are largely irrelevant to English, have flown under the radar. A case in point is verb-preposition collocations, which in English essentially belong to the lexical domain as they consist of a content word and a functional word. In contrast, German verb-preposition collocations represent a complex lexico-grammatical construction because either the verb or the preposition assigns grammatical case to the subcategorized noun phrase, and this case is indexed by inflectional markers on that phrase. The German nominal inflectional paradigm represents a salient case of morphological complexity (Pallotti, 2015), and verb-preposition-case collocations have been shown to be difficult for L2 German learners at different proficiency levels (Baten, 2011; Vinagre & Muñoz, 2011).
At the same time, explicit teaching has been shown to be necessary to attract learners’ attention to inflectional morphology, an abstract, low-salience grammatical feature (DeKeyser, 2003; Ellis, 2005). Therefore, it is worth exploring whether DDL as an explicit inductive method may fare better than traditional explicit deductive methods in achieving this goal.
Furthermore, the empirical DDL research field is in urgent need of replication studies, as any classroom research is by nature small scale and quasi-experimental with intact classes functioning as participant groups. Partial replication studies can enhance the generalizability of classroom research findings (Chun, 2012; Porte, 2012), but such studies are still rare in DDL (see, however, Frankenberg-Garcia, 2014). This study aims to take this avenue by partially replicating two recent DDL studies. The first study is Boulton (2010), who compared the effectiveness of brief interventions for teaching a number of English collocations to L1 French EFL (English as a foreign language) students at the A2–B1 proficiency levels. The study showed that the paper-based DDL method was marginally better than the non-DDL dictionary-based method, with more students improving their scores on more items, and that both methods were better than no instruction. The study thus argued in favor of applying paper-based DDL along with more traditional methods in teaching some lexico-grammatical targets to low-proficiency students. The second study is Vyatkina (in press), who compared the effects of computer-based and paper-based DDL for teaching German verb-preposition collocations to L1 English learners at the B1 proficiency level. That study showed that both methods were equally effective, with higher and longer lasting gains for controlled production tasks (gap filling) than for free production tasks (sentence writing). The present study is similar to Boulton (2010) in that it compares the effects of a paper-based DDL and a non-DDL method administered by regular teachers as brief, narrowly focused interventions embedded within a non-DDL curriculum and without prior DDL training of the participants.
It also replicates Vyatkina (in press) in that it focuses on learners of German and targets German verb-preposition collocations as complex lexico-grammatical constructions. The novel features of the present study are that its participants have very low L2 proficiency levels (CEFR A2 and below), that it assigns intact classes to different treatment conditions (unlike Boulton and Vyatkina, who administered different treatments to the same cohort of learners), that it separates lexical and morpho-syntactic learning, and that it compares the DDL effects for teaching completely new versus previously learned collocations. The broader purpose of this study is, similar to Boulton’s (2010: 541), to “counter a number of frequent objections to DDL and contribute to greater awareness of its potential”.
3 Design
3.1 Research questions
The study explores the following research questions:
1. Does learner lexical and morpho-syntactic knowledge improve following focused instruction (as demonstrated by written performance on a closed gap-filling task)?
2. Are the gains higher following the DDL or the non-DDL treatment?
3. Are there interactions between treatment, course level, and linguistic items (lexical or morpho-syntactic)?
3.2 Participants and institutional setting
The study was administered at a large public North American university. Participants were recruited from the third- and fourth-semester German classes in a four-semester-long program that fulfills the foreign language requirement for certain majors. The classes met three times per week for 50 minutes. This is a multi-section program in which all sections of the same course follow a uniform syllabus and use the same textbook. The instructional approach combines a communicative approach with focus-on-form activities, and all courses devote an approximately equal amount of time to speaking, writing, reading, listening, vocabulary, grammar, and cultural learning. All syllabi also have a substantial learning-with-technology component including an electronic workbook and biweekly computer lab meetings, mostly devoted to searching German websites for cultural information, but without a DDL component. All seven experimental classes in this study were taught by graduate student instructors under the researcher’s supervision. Some instructors had basic knowledge of corpora and DDL from their graduate coursework (see Vyatkina, 2013), but none of them used DDL in teaching the focal classes.
Altogether, 88 students participated in the two iterations of the experiment. All of them had American English as their L1. The average age was 21 (18–35 range). Gender was not considered a variable in this study; however, although the proportion of females and males in each class differed, the overall distribution was balanced (43 females and 45 males). The L2 German proficiency of the participants was fairly low. Students who enroll in the first semester of this program have no or almost no knowledge of German. Some of them progress through all four semesters in the program while others join it at later time points via a placement test. To obtain a more general measure of the participants’ L2 proficiency, a standardized German proficiency test was administered at the end of the experimental semester. All fourth-semester classes took an official diagnostic test which was administered by the onDaF Institute in Bochum, Germany (www.ondaf.de) and proctored at the researcher’s institution. Participants take this online cloze test over 40 minutes and are then automatically placed within CEFR bands (Eckes & Grotjahn, 2006). The results showed that approximately two thirds of all fourth-semester students reached the A2 level by the end of the semester, one third did not reach it, and only two participants reached the B1 level. Since the onDaF test only measures proficiency at or above the A2 level, it was not considered meaningful to test third-semester students. Overall, although proficiency is not considered a variable in this study, it is safe to state that both third- and fourth-semester students in the focal instructional program were generally at or below the A2 CEFR level, i.e. had roughly low-intermediate L2 proficiency.
Importantly, third- and fourth-semester participants differed not only in their course level but also in their entry-level knowledge of the target items. The third-semester students were explicitly taught them for the first time, whereas the fourth-semester students were reviewing material taught in the previous semester. Although previous incidental exposure to the target items by the third-semester students cannot be excluded, the bulk of the engagement of this student population with their L2 is limited to the classroom. As far as the fourth-semester groups are concerned, the overwhelming majority of the students progressed to their fourth-semester course immediately after the third-semester course, which means that they were first exposed to the target items at the same time and in the same fashion during the previous semester. As the intervention was administered during a spring semester, only a relatively short (one-month-long) winter break separated the semesters, which does not allow much time for knowledge attrition or additional out-of-class exposure to the L2 (e.g., traveling to German-speaking countries, which none of the participants undertook). Therefore, most of the residual knowledge of the target items in the fourth-semester groups can be confidently attributed to the instructional effects from the previous semester. The only exception is three participants who enrolled in the course via a placement test, and who therefore did not follow the same instructional sequence as other participants. However, they showed an entry-level knowledge of the target items similar to other participants and thus were included in the study.
3.3 Target items
The intervention focused on eleven verb-preposition collocations (Appendix 1) from the regular textbook used in the first three semesters in this program (Di Donato, Clyde & Vansant, 2012: 359). In ten of these collocations, prepositions are not congruent in German and English: for example, to wait for is equivalent to warten auf, although the prototypical translation of the preposition for is für. Furthermore, the German noun phrase or the pronoun following the preposition carries an obligatory gender, case, and number marker (e.g., Ich warte auf meinen Bruder/ihn [masculine, accusative, singular], ‘I am waiting for my brother/him’). However, if the pronoun refers to an inanimate object, German uses pronominal adverb-preposition contractions, the so-called da-compounds, instead (e.g., Ich warte darauf, ‘I am waiting for that’), in which no inflectional markers are present. Furthermore, German pronominal adverbs (e.g., davon, dabei) are extremely frequent, not genre-restricted, and can combine with most prepositions, unlike their very infrequent and genre- and item-restricted English counterparts (e.g., thereof, thereby). German da-compounds are considered an important part of active vocabulary for learners and are typically taught in conjunction with prepositional verbs along with prepositional phrases. Verb-preposition collocations are an important instructional focus due to their sheer frequency in language usage. Needless to say, such lexically, grammatically, and semantically complex constructions present considerable difficulties to learners, although typical syllabi and textbooks allocate only one to two lessons to introducing them.
3.4 Procedures
There were two iterations of the experiment. In the first iteration, two sections of the third-semester course and three sections of the fourth-semester course participated. The experimental sections were assigned to four conditions based on treatment (D=DDL; T=textbook) and course level (L=low, i.e. third semester; M=mid, i.e. fourth semester). Only the data of the students who agreed to participate and took both the pretest and the posttest were included in the study. The DL (n=15) and TL (n=13) sections were taught by the same instructor. Two fourth-semester sections taught by a different instructor were assigned the TM condition due to the low number of students (combined n=16). Finally, the third fourth-semester section, taught by yet another instructor, was assigned the DM condition (n=13). It must be noted that the DM section followed a syllabus different from the TM sections as it constituted a different course (introduction to German for the professions). However, since the intervention was administered on the tenth day of the semester in fourth-semester classes, no noticeable difference in instruction had yet taken place. Also, since the regular instructor of the DM section was out of town on the day of the experiment, the researcher administered the intervention. Since she was known to the students as program coordinator, and the regular instructor was still new to the students, the intervention did not cause any considerable disruption to the regular instructional flow. After the first iteration of the experiment, the researcher decided to replicate the study with another cohort of third-semester students with slight design changes (see explanations under Results below). Two sections taught by an instructor who did not participate in the first iteration of the study were assigned to the DL2 (n=16) and TL2 (n=15) conditions.
The intervention in the third-semester classes was administered at the time designated for the target items in the course syllabus (after about one third of the semester was over), and in the fourth-semester classes as part of the start-of-the-semester review. Prior to the intervention, the researcher met with all participating instructors individually for about 30 minutes, discussed the interventions, gave them packets with detailed lesson plans, test sheets, and worksheets, and answered their questions. In the T-groups, the procedure followed the regular syllabus (barring the pretest and the posttest), so the instructors did not have to change anything in their teaching. Regarding the D-groups, instructors showed much interest in trying out the new teaching method. The ten-minute-long pretest was administered at the end of the class immediately preceding the intervention class. The homework assignment in all groups was unrelated to the target items. The intervention, administered during the next class, lasted for 40 minutes and the class concluded with a ten-minute-long posttest. Both tests were announced to the students as ungraded quizzes. Additionally, all participants filled out a brief electronic personal and language learning background questionnaire during one of their regular computer lab meetings.
The test instruments were paper worksheets with fourteen sentences from the DWDS corpus, a large, freely and publicly available corpus of contemporary German (www.dwds.de). Ten sentences contained the target verbs and four sentences contained control items (see Boulton, 2010), that is, verb-preposition collocations not taught in the intervention. Only ten target constructions were tested because two out of the eleven collocations contained the same verb with different prepositions that indicated different meanings. To remove this additional difficulty, only one collocation of these two was tested. The verb was followed by a preposition and a noun phrase with a definite article in half of the sentences and by a da-compound in the other half. Only singular nouns were used and the gender of each noun was indicated in parentheses so that participants had to think only about case while working on the grammatical items. The target prepositions, articles, and da-compounds were blanked out and learners had to fill the gaps. A model with a non-target prepositional verb introduced each part of the test. The verbs that were used with prepositional phrases in the pretest were used with da-compounds in the posttest and vice versa. During scoring, one point was given for each correctly supplied lexical item (preposition or da-compound), and one point for each correctly supplied grammatical item (article) after correct prepositions.
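The scoring rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the dictionary layout, the function names `score_item` and `score_test`, and the two example items are hypothetical and are not taken from the actual test sheets.

```python
def score_item(response, key):
    """Score one gap-filling item as (lexical_point, grammatical_point).

    `key` holds the expected "lexical" form (preposition or da-compound) and
    the expected "article", which is None for da-compound items.
    A grammatical point is awarded only after a correct preposition.
    """
    given_lexical = (response.get("lexical") or "").strip().lower()
    lexical = int(given_lexical == key["lexical"].lower())
    grammatical = 0
    if key.get("article") is not None and lexical:
        given_article = (response.get("article") or "").strip().lower()
        grammatical = int(given_article == key["article"].lower())
    return lexical, grammatical

def score_test(responses, keys):
    """Sum lexical and grammatical points over all items of one test."""
    lexical_total = grammatical_total = 0
    for response, key in zip(responses, keys):
        lexical, grammatical = score_item(response, key)
        lexical_total += lexical
        grammatical_total += grammatical
    return lexical_total, grammatical_total

# Hypothetical answer key: one prepositional-phrase item, one da-compound item.
keys = [
    {"lexical": "auf", "article": "den"},
    {"lexical": "darauf", "article": None},
]
# Correct article after a wrong preposition earns no grammatical point.
answers = [
    {"lexical": "für", "article": "den"},
    {"lexical": "darauf"},
]
print(score_test(answers, keys))
```

Making the grammatical point conditional on the lexical point mirrors the rule that articles were credited only after correct prepositions, so the two scores are not fully independent.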
The difference between the 40-minute-long treatments was the following. The TL and TM groups followed the sequence of activities in the textbook (Di Donato et al., Reference Di Donato, Clyde and Vansant2012: 359–362). This lesson followed the typical deductive ‘triple P’ (Presentation-Practice-Production) model (e.g., DeKeyser, Reference DeKeyser1998). The target verb-preposition-case collocations were first presented as a list with English translations. A practice drill followed, and then da-compounds were introduced in contrast with preposition-pronoun collocations with a few examples. Next, students practiced using da-compounds in individual, pair-work, and whole-class drills. The sequence concluded with an oral pair-work exercise in which students exchanged questions and answers about everyday topics using the target constructions. In contrast, the DL and DM groups worked with DDL worksheets containing five to seven concordance lines for each collocation (Appendix 2; see also Vyatkina, Reference Vyatkina2015) copied and pasted from the DWDS corpus. The intervention followed the inductive ‘triple I’ (Illustration-Interaction-Induction) model proposed by Carter and McCarthy (Reference Carter and McCarthy1995) and extended by Flowerdew (Reference Flowerdew2009) to the guided induction model with an additional ‘Intervention’ step between Interaction and Induction. The instructor in each D-class briefly introduced the idea of a corpus as a rich repository of language usage examples and distributed the worksheets. Then, she instructed the students to find patterns in the concordance lines and modeled that with one focal verb. The students then worked in pairs discussing the verb-preposition-case patterns and wrote individual examples for each verb. Next, the instructor discussed the results with the whole class, making sure that everybody arrived at the right pattern. 
A similar procedure was followed with concordance lines for verb-preposition-pronoun collocations and da-compounds. The lesson concluded with a Q&A oral pair-work exchange similar to that in the T-groups. Thus, the lessons in all groups were designed as in Boulton’s (2010: 547) study: “The intention was thus that all students should come away with essentially the same final information; the main differences lay in the way it had been reached and the materials from which it was derived.”
4 Results
4.1 Iteration 1
First, the pretest-posttest scores for the control items that were not part of the intervention were analyzed. A total of 88% of all participants scored zero points on these items (out of the possible 6) on both tests and the maximum score was 1 for the lexical knowledge and 1 for the grammatical knowledge. Furthermore, only four participants showed a gain of 1 or 2 points. These very low frequencies show that the test effect was minimal in this study. Therefore, the scores for the control items were not included in the statistical analysis and the following report only refers to the experimental items.
Participants could earn up to 10 points for the lexical knowledge and up to 5 points for the grammatical knowledge on each test. The descriptive statistics for the pretest and posttest scores are presented in Table 1 and illustrated in Figures 1 and 2. The overall scores on the pretest were very low, with even the mid-level students getting, on average, only about a third of both the lexical and grammatical items right.
The data were analyzed with multilevel modeling methods (Cunnings, 2012). Raw test scores were not normally distributed, so zero-inflated Poisson regression models with associated z-tests were used. Gain scores were closer to a normal distribution, so a multilevel linear regression with associated t-tests was used. Although all variables were entered into the models together, the results are presented separately for lexical and grammatical gains for clarity.
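To illustrate why a zero-inflated Poisson distribution suits count scores with many zeros, the following minimal sketch implements the textbook probability mass function. It is not the model fitted in this study: the function names and the parameter values (rate 2.0, extra-zero probability 0.4) are hypothetical and chosen only for illustration.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson variable.

    lam: Poisson rate of the count process (hypothetical value in the demo).
    pi:  probability of a structural (extra) zero (hypothetical value in the demo).
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise either structurally or from the Poisson process itself.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean(lam, pi):
    """Expected value of the zero-inflated Poisson variable."""
    return (1.0 - pi) * lam


# Compare the share of zeros with and without zero inflation.
lam, pi = 2.0, 0.4
print(f"P(0), plain Poisson:         {math.exp(-lam):.3f}")
print(f"P(0), zero-inflated Poisson: {zip_pmf(0, lam, pi):.3f}")
print(f"Mean score under the ZIP:    {zip_mean(lam, pi):.2f}")
```

The extra-zero component lets the distribution place far more mass on a score of zero than a plain Poisson distribution with the same rate, which matches test data in which many participants score no points.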
Regarding the pretest scores for lexical knowledge, several differences across groups were revealed. Mid-level students scored significantly (ca. 6 times) higher than low-level students on the lexical outcome (95% CI [3.55, 10.34], z=6.61, p<.0001). Whereas this result was expected, a difference between the low-level groups was unexpected: the DL group scored higher than the TL group by 0.9 points (95% CI [0.39, 1.51], z=−2.54, p=.01). On the other hand, the mid-level groups did not differ from one another (z=−0.29, p=.77). In addition, all groups improved on the posttest, and the overall posttest scores were (on average) ca. 2 times higher than pretest scores (95% CI [1.57, 2.42], z=6.14, p<.0001). Next, lexical gains were compared across treatments, controlling for course level. Among low-level students, the D-method resulted in significantly greater lexical gains than the T-method, by 1.7 points on average (95% CI [0.28, 3.03], t(104)=−2.39, p=.02). Among mid-level students, the D-method resulted in somewhat lower lexical gains than the T-method (by 1 point on average), but this difference was not significant (95% CI [−0.27, 2.44], t(104)=1.59, p=.11).
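The "times higher" figures above are ratios, as is usual for Poisson-type models, where effects are estimated on the log scale. As a rough consistency check, the sketch below recovers a point estimate and z statistic from a reported ratio-scale interval. It assumes a symmetric Wald interval on the log scale; the source does not state how the intervals were computed, so this is an assumption rather than the authors' procedure, and the function name is hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wald_from_ci(lower, upper):
    """Recover (ratio, z) from a ratio-scale 95% CI.

    Assumes the interval is exp(b +/- Z_95 * se) for a log-scale
    coefficient b, which may not match the original analysis.
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    log_ratio = (log_lo + log_hi) / 2       # midpoint on the log scale
    se = (log_hi - log_lo) / (2 * Z_95)     # implied standard error
    return math.exp(log_ratio), log_ratio / se


# Mid-level vs. low-level pretest comparison on the lexical outcome.
ratio, z = wald_from_ci(3.55, 10.34)
print(f"ratio = {ratio:.2f}, z = {z:.2f}")
```

For the interval [3.55, 10.34], this yields a ratio of about 6.06 and a z value of about 6.61, in line with the reported "ca. 6 times" and z=6.61. Small discrepancies for other intervals would be expected from rounding of the reported bounds.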
Pretest comparisons of grammatical knowledge yielded results very similar to the lexical knowledge results. Mid-level students scored significantly (3 times) higher than low-level students (95% CI [1.57, 5.91], z=3.28, p=.001) and the DL group had significantly (0.7 points) higher pretest scores than the TL group (95% CI [0.31, 1.15], z=−2.34, p=.02), whereas the mid-level groups were not different from one another (z=0.80, p=.43). Another parallel result was that the overall posttest scores were 1.9 times higher than pretest scores (95% CI [1.34, 2.63], z=3.63, p=.0002). However, in contrast to the lexical gains, there was no difference in grammatical gains across treatments. Although the D-groups had higher gains than the T-groups, this difference was not significant either between mid-level groups (95% CI [−0.66, 2.05], t(104)=−1.20, p=.23) or low-level groups (95% CI [−1.46, 1.29], t(104)=0.12, p=.91).
In summary, all groups improved both types of knowledge following instruction. There is no evidence of an effect of teaching method on grammatical gains at either level or on lexical gains in the mid-level classes, but the DDL method led to significantly greater improvement in lexical scores in the low-level classes. Because the number of participants was low and the low-level groups already differed on the pretest, it was decided to replicate the study with another cohort of low-level students.
4.2 Iteration 2
The second iteration differed from the first in three respects. First, only low-level classes were included, so the course level variable was eliminated. Second, since control items in the first iteration of the study had already shown test effects to be negligible, only experimental items were included in the testing materials. Third, the data for several participants were eliminated prior to statistical analysis. The data for three participants who scored 5 or more points (out of a possible total of 15) on the pretest were eliminated since the study focused on participants with low entry-level knowledge of the target items, and because none of the low-level participants in the first iteration had scored more than 4 points on the pretest. Furthermore, the data for four participants were eliminated since they simply copied the preposition and the article from the model into all gaps in the posttest. On the other hand, upon consultation with a statistician, it was decided to keep the data for the participants who missed either the pretest or the posttest because the statistical methods used in this study allow for missing data points (Cunnings, 2012). This resulted in two groups, DL2 (n=16) and TL2 (n=15). The descriptive statistics are presented in Table 2 and illustrated in Figures 3 and 4.
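The exclusion criteria above can be expressed as a simple filter. The following sketch is illustrative only: the record layout, the field names, and the sample participants are hypothetical and do not reproduce the study's data.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    """One participant; all field names are hypothetical."""
    pid: str
    pretest_total: Optional[int]   # None if the pretest was missed
    posttest_total: Optional[int]  # None if the posttest was missed
    copied_model: bool = False     # filled every posttest gap with the model answer


PRETEST_CUTOFF = 5  # "5 or more points (out of a possible total of 15)"


def keep(record):
    """Apply the two exclusion rules; a missing test is not grounds for exclusion."""
    if record.pretest_total is not None and record.pretest_total >= PRETEST_CUTOFF:
        return False  # entry-level knowledge too high
    if record.copied_model:
        return False  # answers copied from the model
    return True


# Invented records for illustration only.
records = [
    Record("P01", 2, 7),
    Record("P02", 6, 9),                     # excluded: pretest at or above cutoff
    Record("P03", 1, 3, copied_model=True),  # excluded: copied the model
    Record("P04", None, 5),                  # kept despite missing pretest
    Record("P05", 3, None),                  # kept despite missing posttest
]
kept = [r for r in records if keep(r)]
print([r.pid for r in kept])  # ['P01', 'P04', 'P05']
```

Participants with one missing test stay in the data set because multilevel models can use incomplete cases, as noted above.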
The multilevel modeling results showed that in this cohort, the groups were much more uniform on the pretest: although the DL2 group scored somewhat lower than the TL2 group on both the lexical and grammatical items, no significant difference was discovered for either (95% CI [−0.57, 0.64], z=0.09, p=.93, and 95% CI [−0.45, 0.37], z=0.28, p=.78, respectively). Next, there was an overall improvement on the posttest in comparison with the pretest. Regarding the lexical outcome, overall posttest scores were (on average) 3.2 times higher than pretest scores (95% CI [1.87, 5.63], z=4.13, p<.0001). Regarding the grammatical outcome, overall posttest scores were (on average) 4.6 times higher than pretest scores (95% CI [2.16, 11.45], z=3.66, p=.0002). Finally, the DL2 group improved significantly more than the TL2 group on both the lexical outcome, by 1.7 points (95% CI [0.82, 2.65], t(42)=3.81, p=.0004), and the grammatical outcome, by 1.2 points (95% CI [0.26, 2.09], t(42)=2.58, p=.01). This result stands in contrast to the result from iteration 1, where the DL group had a significantly higher gain than the TL group only on the lexical outcome. A possible explanation of this difference is that the DL group in iteration 1 scored higher on the pretest than the DL2 group and thus cannot be compared directly to the ‘true’ beginner DL2 group.
In summary, the results of study iteration 2 confirm the result for the low-level groups from iteration 1 by showing a significant DDL treatment effect for lexical learning, and extend it to grammatical learning. The result from the latter iteration is also more reliable because there were no significant differences between the groups on the pretest. Moreover, although the DDL group scored slightly lower on the pretest on both outcomes, it scored significantly higher than the non-DDL group on the posttest on both outcomes. Finally, the distribution of gains was much more even for the DDL group: all eleven participants who took both the pretest and the posttest improved, with ten participants showing lexical gains and nine participants showing grammatical gains. In contrast, only five participants in the non-DDL group out of thirteen who took both the pretest and the posttest improved, all of them showing both lexical and grammatical gains, whereas five other participants showed no gains and two participants scored lower on the posttest.
5 Discussion, implications, and conclusion
This study explored the effectiveness of brief DDL interventions for teaching collocations to students at low L2 proficiency levels. On the theoretical level, the results lend support to usage-based approaches (Ellis, 2014) and the guided induction DDL approach (Flowerdew, 2009, 2015; Smart, 2014) in teaching lexico-grammar. The results demonstrate that such interventions are no less effective than a traditional deductive teaching method and, on some measures, more effective. More specifically, this guided induction method worked better than a deductive method during initial exposure of low-level learners to a new lexico-grammatical construction. Arguably, the combination of guided induction with input enrichment and enhancement by the teacher enhanced learners’ perception and understanding (Sharwood Smith, 2013) and, therefore, led to a higher level of awareness (Schmidt, 1990), as evidenced in their improved performance.
This study also confirms the finding from a previous study (Vyatkina, in press) that DDL can be extended beyond English as an L2 and is effective for teaching lexico-grammatical collocations in inflectional languages. The novel finding of this study is that paper-based DDL works with students at very low proficiency levels (at and below CEFR A2), which have rarely been considered in previous research, and that, moreover, it is especially effective with students at the lowest proficiency level. This result is in line with Yoon and Hirvela (2004), whose study of learner perceptions of DDL found intermediate learners to be more receptive to DDL than advanced learners. More specifically, this study showed that DDL was significantly more effective for teaching new collocations (thus contradicting Cobb, 1999, and Nesselhauf, 2004), whereas the DDL and the non-DDL methods were equally effective for improving the knowledge of previously learned items. Since the pretest scores of mid-level students were higher than the posttest scores of low-level students, we can infer that all students gradually improve their knowledge of the target collocations even with traditional instruction methods but that this learning is extremely slow, as shown by the still very low pretest scores of mid-level students. However, the low-level DDL groups showed much greater initial learning, which may have given them a better jumpstart. This study has only explored the short-term development of explicit knowledge (measured by a controlled production test); therefore, no claims can be made about long-term development and acquisition of implicit knowledge (DeKeyser, 2003; Doughty, 2003). However, if later supported by recycling of the target items (with either a DDL or non-DDL method), this initial advantage can potentially lead to better long-term learning (see Footnote 1). Especially in regard to difficult German inflectional morphology, this short explicit instructional intervention played the role of “enhancing later implicit acquisition by increasing chances of noticing” (DeKeyser, 2003: 332).
On the practical level, this study supports integrating brief paper-based DDL interventions into non-DDL syllabi (thus corroborating Boulton, 2008, 2009, 2010). Similar to Boulton (2010), this study also shows that regular teachers were able to successfully implement these lessons following brief oral and written instruction. Furthermore, all instructors in this study commented positively on their experience with this new teaching method. This finding suggests that, if more ready-made DDL materials were available, DDL could be more widely implemented in mainstream language teaching. However, as Boulton (2010: 560) notes, “DDL materials are extremely time-consuming to prepare” and “published materials are virtually nonexistent”, which is especially true for languages other than English. Therefore, we hope that positive results from more DDL studies like this one will inspire publishers to produce more DDL materials and more researchers to share them with open access (see, e.g., Vyatkina, 2015).
This study has a number of limitations that should be addressed in future research. The first limitation, typical of all classroom research, is that the results need to be interpreted with caution due to the low number of participants. More partial replications with other participants, proficiency levels, and target items are needed to increase the generalizability of the results. Second, there is a dire need for studies of long-term DDL effects (Boulton & Pérez-Paredes, 2014; Leńko-Szymańska & Boulton, 2015), although these are notoriously difficult to design. In particular, it is worth exploring the feasibility of a wider integration of various paper-based and computer-based DDL tasks into L2 syllabi across the curriculum and the effects of “more substantial training or repeated use of such materials” (Boulton, 2010: 559). Third, more studies that single out specific DDL effects (e.g., guided induction vs. deduction, input richness, input enhancement) are needed. Finally, future studies should investigate DDL effectiveness for various grammatical targets, especially those in inflected languages. One direction for expanding the present study is an item-based analysis comparing DDL effects for collocations of different levels of grammatical complexity in morphologically rich languages (e.g., verbs and prepositions governing fixed vs. variable cases).
To conclude, this study shows that guided induction learning based on the analysis of authentic language use patterns is conducive to learning not only at advanced but also at incipient proficiency levels. Hopefully, these findings will inspire greater adoption of DDL by language teachers and more studies on DDL by language researchers.
Acknowledgements
This study was supported in part by the Department of Germanic Languages and Literatures of the University of Kansas. I would also like to acknowledge all instructors who participated in this intervention and Terrence D. Jorgensen Jr. for his invaluable help with the statistical analysis.
Appendix 1
Verb-preposition-case collocations used in the study