1. INTRODUCTION
After years of relative neglect, the importance of vocabulary for developing proficiency in a second language (L2) is now generally acknowledged by researchers and theorists (e.g. Hatch and Brown, 1995; Coady and Huckin, 1997; Schmitt and McCarthy, 1997; Nation, 2001; Read, 2000; Malvern, Richards, Chipere and Durán, 2004; Bogaards and Laufer, 2004). This importance is also acknowledged by language teachers, who often feel that their students, particularly those in foreign language learning contexts, are not developing their lexicons to levels that would permit them to function adequately in contexts beyond the language classroom (Carter and McCarthy, 1988; Arnaud and Bejoint, 1992; Laufer, 2005; Nurweni and Read, 1999; Ellis, 1994). This sentiment also prevails among teachers of French as a Foreign Language (FFL) in Belgium, the larger language learning context where the present study was conducted (Housen, Janssens and Pierrard, 2002).
However, in order to be able to promote lexical development in FFL classrooms, research first needs to determine the dynamics of the French lexical learning processes in educational contexts, and the factors that impact on these processes. To this end, longitudinal research is needed. Such research, however, is scarce. Most L2 vocabulary studies to date, whether on French or on other L2s, have adopted a synchronic design, analysing lexical data from L2 learners at one point or over a brief period in time. Such studies can only provide information on aspects of lexical use, processing and representation, but not, or only indirectly, on lexical development over time. Thus, the present study was conducted to (a) explore the operationalisation and definition of lexical L2 proficiency and related constructs with a view to identifying a set of measures that can adequately capture the dynamics of lexical L2 proficiency development over time, and (b) shed more light on the development of lexical proficiency in FFL classes in the specific context of Dutch-medium schools in Brussels, Belgium.
To attain these two objectives we will first consider some basic theoretical, terminological and methodological issues in L2 vocabulary research (section 2). In section 3 the specific context in which the current study was conducted, Dutch-medium schools in Brussels, is outlined. The fourth and fifth sections present a longitudinal study of the lexical development of Dutch-speaking adolescents learning French as a Foreign Language in these Dutch-medium schools. Section 4 outlines the methodology of this study and section 5 presents analyses and results. The final section summarizes and discusses the findings and formulates implications for further research on L2 vocabulary development.
2. INVESTIGATING LEXICAL PROFICIENCY DEVELOPMENT – CONCEPTUAL, TERMINOLOGICAL AND OPERATIONAL ISSUES
A proper understanding of the results of any study of lexical proficiency requires one to be clear about central concepts, terminology and the nature of the measurements used. Unfortunately, terms such as word, lexical proficiency, lexical knowledge, lexical competence, lexical richness, are used rather loosely and interchangeably in the vocabulary literature, which hinders the interpretation and comparison of individual research findings. Therefore this section is concerned with three things: (a) construct specification (how can key constructs in L2 vocabulary research be further specified), (b) specific lexical assessment metrics used in L2 studies, and (c) the link between the first two.
Figure 1 represents an attempt to classify the different components of lexical competence that are relevant for the present study, and to illustrate how they can be related to lower-order constructs and to the concrete measures and tests commonly used in research on L2 vocabulary.

Figure 1 Analytic framework for investigating L2 lexical proficiency development.
At the theoretical level of cognitive constructs, we propose to speak of lexical ‘competence’, which consists of both a declarative component and a procedural component. The declarative component has to do with lexical ‘knowledge’ as such, which can be further subdivided into the constructs ‘size’, ‘width’ and ‘depth’ of lexical knowledge. Size of knowledge refers to the number of lexical entries in memory (with varying degrees of mastery in terms of production and reception). Width and depth of knowledge both refer to the quality and degree of elaboration of that knowledge, both in terms of intensional relations to other entries in the lexicon (width) and in terms of the relations between the various form and meaning components of a given entry (depth). The second component, procedural lexical competence, is more a matter of skill and control over knowledge and refers to how strongly linguistic information is stored in lexical memory, which in turn determines how well learners can access, retrieve and encode/decode relevant lexical information in real time.
Lexical competence and its various subcomponents are theoretical constructs which are not open to direct observation or measurement. They are inferred from lower-order constructs which are taken to be the behavioural manifestations of the underlying cognitive constructs in actual L2 performance (production and reception). For the behavioural correlative of lexical competence in actual L2 use we propose the term lexical proficiency. As shown in Figure 1, a learner's lexical proficiency can be construed in terms of the lexical diversity, sophistication, complexity, productivity and fluency of his L2 use.
In our terms, lexical diversity (or variation) is the observational correlate of knowledge size and refers to the impression of a learner's lexical proficiency produced by the number of different words (or phonologically/orthographically different word forms) which he or she uses. Lexical sophistication refers to the perception of an L2 user's lexical proficiency formed by, among other things, his or her use of semantically more specific and/or pragmatically more appropriate words from among a set of related words. In this sense, lexical sophistication implies knowledge of semantic relations (e.g. in terms of hyponymy, hypernymy, synonymy, antonymy) and, hence, of different but related lexical alternatives for referring to a referent. In these respects, lexical sophistication manifests itself at the macro-level of the lexicon as a whole, making it the nearest correlate of what we termed lexical width. In contrast, lexical complexity manifests itself at the micro-level of the individual word. It refers to the impression of someone's lexical proficiency created by, for instance, the ability to comprehend or use not only the prototypical or default semantic, collocational, grammatical or pragmatic aspects of a specific word but also a variety of other, more specific, peripheral and less frequent properties. In this sense, lexical complexity corresponds to a learner's depth of lexical knowledge. Lexical productivity is a construct at the behavioural-observational level of lexical production which reflects the sheer number of words (tokens) which a speaker uses to complete a given task. For instance, in a given context some learners will describe a given event by using 100 words while others will use 200 words. Finally, the related construct of lexical fluency refers to the speed with which the learner produces or encodes (esp. content) words, as determined by the degree to which the relevant lexical information has been proceduralised (Footnote 1). As with lexical competence, more dimensions of lexical proficiency can be distinguished than the five discussed here, but these five already suffice to show that any concrete measure of L2 lexical proficiency is bound to be only partial.
The need for reliable and valid measures for lexical proficiency has long been recognised in SLA research, particularly in the area of L2 production (e.g. Harley, 1995; Malvern et al., 2004; Daller, Milton and Treffers-Daller, 2007). Without such measures several theoretical and practical questions cannot be answered. These questions include determining the number of words ‘known’ by learners at a particular stage of development (or level of instruction), the rate at which words are acquired, the contribution of factors such as word length, frequency and saliency to the ease/difficulty with which a word is learned, and the relationship between vocabulary size and other aspects of lexical competence (e.g. depth, width, automatisation of vocabulary knowledge). As a result, the number of L2 vocabulary measures and tests has proliferated in recent years, tapping into different dimensions of lexical proficiency. The lower level boxes in Figure 1 are an attempt to classify different types of measures (with an emphasis on productive lexical proficiency), and to relate them to the various higher-order constructs discussed earlier. Measures of productive vocabulary size/diversity (how many words are known?) are generally based on analyses of oral or written L2 corpora and usually take the form of a general type-token ratio (TTR) or transformations thereof (e.g. Guiraud Index, Uber Index), or are based on an analysis of how fast TTR falls with increasing token count within the language sample (D). These measures have been widely used and discussed in the L2 literature (e.g. Malvern et al., 2004; Van Hout and Vermeer, 2007; Broeder, Extra and Van Hout, 1993; McCarthy and Jarvis, 2007).
Other elicitation measures of productive vocabulary size include translation tests and productive vocabulary level tests (Laufer and Nation, 1999). Measures of lexical width/sophistication (which words are known?) and of depth/complexity (how well are words known?) focus more narrowly on, respectively, specific or ‘advanced’ words (e.g. less frequent words) or on specific aspects of the meaning or use of words (cf. the Advanced Guiraud as discussed in Daller et al., 2003 and Tidball and Treffers-Daller, 2007 and this special issue). Each type of measure and each operationalisation has its own inherent strengths and problems, in terms of validity, reliability, discrimination and feasibility. These strengths and problems are well documented in the literature, and some of these will be taken up in the methodology section. Suffice it here to point out that there is no one-to-one match in Figure 1 between the statistical constructs and the observational-behavioural constructs which these statistical constructs are supposed to measure (nor, for that matter, between the constructs at the observational and theoretical levels). This is indicated by the multiple lines starting from the lower-level boxes in Figure 1. In fact, one could argue that all the quantitative measures listed in Figure 1 say something about all four behavioural constructs distinguished in this framework (which points to a general validity problem) though some measures clearly say more about a given construct than others. This is reflected by the weight of the interconnecting lines.
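To make the ratio measures mentioned above concrete, the following minimal Python sketch computes TTR, the Guiraud Index and the Uber Index for a single token list. It assumes the commonly cited definitions (TTR = V/N, G = V/√N, U = (log N)² / (log N − log V), with V types and N tokens); the function name and the example sentence are illustrative only and are not taken from the study. D is not covered, since it requires fitting a curve to repeated random samples.

```python
import math

def diversity_indices(tokens):
    """Return TTR, Guiraud's G and Uber's U for a list of word tokens.

    Assumed definitions (V = types, N = tokens):
      TTR = V / N,  G = V / sqrt(N),  U = (log N)^2 / (log N - log V)
    """
    n_tokens = len(tokens)
    if n_tokens == 0:
        raise ValueError("empty sample")
    n_types = len(set(tokens))
    ttr = n_types / n_tokens
    guiraud = n_types / math.sqrt(n_tokens)
    if n_types == n_tokens:
        # log N - log V is zero when no type is repeated, so U is undefined
        uber = math.inf
    else:
        log_n = math.log10(n_tokens)
        log_v = math.log10(n_types)
        uber = log_n ** 2 / (log_n - log_v)
    return {"types": n_types, "tokens": n_tokens,
            "TTR": ttr, "G": guiraud, "U": uber}

# Illustrative (invented) sample: 11 tokens, 7 different word forms
sample = "le garçon cherche la grenouille et le chien cherche la grenouille".split()
result = diversity_indices(sample)
print(result)
```

Because G and U rescale the type count by a function of sample length, they are less sensitive to text length than raw TTR, which is the usual motivation for preferring them in short learner samples.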
3. FRENCH AS A FOREIGN LANGUAGE IN BRUSSELS
The French learners whose lexical proficiency is discussed in this article learn French in a specific context, namely Dutch-medium education in Brussels, Belgium. The issue of language learning and teaching in Belgian education has been characterised as paradoxical (Baetens Beardsmore, 1993), and nowhere is this paradox clearer than in Brussels, the officially Dutch-French bilingual but predominantly francophone capital of Belgium. In Brussels, two separate monolingual school systems, one in Dutch and one in French, operate in parallel and independently from each other. The Dutch school system in Brussels was originally established in the 1960s to help maintain the Dutch-speaking minority in Brussels, but in recent years an increasing number of Francophone parents and parents from other language backgrounds have sent their children to the Dutch-language schools in the expectation that this will provide their children with the Dutch immersion experience felt necessary to become bilingual in both national languages. This trend has dramatically changed the demographics and the linguistic climate in the Dutch-medium schools in Brussels (henceforth DuB) to the extent that French-speaking pupils now begin to outnumber the Dutch-speaking pupils, who are becoming a minority in most classrooms. This trend is most salient in kindergarten and primary school, where the proportion of children from Dutch-speaking families has dropped in 25 years from over 78% to only 14% in 2007. This trend has also now gradually spread to secondary schools, where Dutch-speaking children now represent only 34% of the population (www.vgc.be). All this has drastically changed the linguistic balance in Dutch-medium education in Brussels. In most schools French has now become the dominant language in the playground and in the corridors though Dutch is still the only language of instruction (except in the French-Foreign Language lessons; cf. below), with little or no provisions for the French-speaking pupils. The general public and policy makers in Belgium believe that the complex multilingual school environment that has thus been created has a profound and overall negative effect on the linguistic and the academic development of the French-speaking and Dutch-speaking pupils alike (see Housen and Pierrard, 2004; Mettewie, 2007; Mettewie, Housen and Pierrard, 2005). The only potentially positive effect, some believe, would be on the Dutch-speaking pupils' proficiency in French which would, with time, perhaps even approximate that of their francophone classmates. Whether, and to what extent, this last assumption is correct is of particular concern for the French-Foreign Language teachers in these Dutch-medium schools. It is this issue which the present study seeks to address, focusing on the lexical proficiency in French of Dutch-speaking pupils in comparison to their francophone peers in the first three years of Dutch-medium secondary education in Brussels.
Reasons for believing that the particular multilingual context in the Dutch-medium schools in Brussels may have positive effects on the lexical development of French of the Dutch-speaking pupils are related to the different sources of exposure to French which these pupils have at their disposal. The first source of exposure for these Dutch-speaking pupils is the French-Foreign Language classroom. Much has been written about the constraints on lexical development imposed by the foreign language classroom context, and the role and efficacy of incidental versus explicit vocabulary learning (e.g. Huckin and Coady, 1999; Gass, 1999; Hunt and Beglar, 2005; Laufer, 2001, 2005; Folse, 2004). Compared to both naturalistic first and second language learners, foreign language (FL) learners often lack an adequate amount of contextualised oral or written input from which they can extract and create the relevant semantic and structural specifications about words and integrate such information in their mental lexicons; consequently, to compensate for FL learners' lack of naturalistic learning opportunities, FL teaching typically resorts to explicit instruction.
FFL lessons in Dutch-medium education in Brussels, where Dutch-speaking pupils sit together with their French-speaking peers, start in Year 3 of primary school and are typically taught by non-native and non-specialist teachers for two or three hours a week. Traditional rationalistic-analytic teaching practices prevail in the FFL lessons at primary school, partly because of the spread of one particular textbook cum method (Decoo, 2004) which is specially developed for the Dutch-speaking Flemish market and which adheres to the principles of the audio-lingual and direct methods. In this method, vocabulary and grammar receive equal emphasis and are presented explicitly and developed systematically (Housen, 2003). In the first three years of secondary school, French is taught for four hours a week by non-native specialist teachers. Here, teaching syllabi and methods take a more communicative and functional-notional turn but there is also analytic grammar work and a systematic and focused approach to vocabulary building, including the study of decontextualised lexis (e.g. rote memorisation of word lists, sentence-gap filling exercises, translation of Dutch L1 words), the provision of concise definitions of new French words, using dictionaries and consciously inferring vocabulary from context (Kemps, Housen and Pierrard, in press).
In addition to the FFL lessons, the Dutch-speaking pupils in DuB have other contact sources with French which are not normally available in other foreign language teaching contexts. One such additional source is through the informal contacts with their native French-speaking peers in the playground, in the corridors and often also in class. As mentioned earlier, in many of these Dutch-medium schools, French has become the de facto lingua franca for informal pupil interaction. This potentially provides the Dutch-speaking pupils with authentic linguistic role models and rich, varied, and accurate input for learning French. Thirdly, there is the predominantly French-speaking environment outside the Brussels schools, which provides yet another potentially rich source of authentic input and output opportunities for learning French.
In sum, the learning of French by Dutch-speaking pupils in Dutch-medium schools in Brussels can be characterised as a mixture of both instructed and naturalistic L2 learning, and their lexical development in French as involving both formal and informal learning processes.
4. METHODOLOGY
The study reported here is exploratory in nature and part of a larger research project on the development of L2 proficiency in various educational contexts (e.g. Housen, 2002; Housen, Janssens and Pierrard, 2003; Mettewie et al., 2005; Kemps et al., in press). The present study specifically focuses on the development of lexical diversity, lexical productivity and lexical sophistication of Dutch-speaking pupils learning French in Brussels and compares this development to benchmark data obtained from their French native speaker peers. The central research questions are the following:
1) How does the lexical proficiency of Dutch-speaking L2 learners of French in DuB develop over time in terms of the diversity, sophistication and productivity of their French L2 production?
2) How does the lexical proficiency in French of these Dutch-speaking L2 learners compare to that of their native French-speaking peers?
3) Which measures of lexical proficiency explain the most variance in the Dutch-speaking pupils' lexical proficiency development over time?
4.1 Participants and Data Collection
The French data analysed in this study come from the unplanned oral retellings of the wordless picture story Frog, Where Are You? (Mayer, 1969; see also Berman and Slobin, 1994) by 19 Dutch-speaking and 19 French-speaking pupils from ten different Dutch-medium secondary schools in Brussels. These pupils were selected from a larger sample on the basis of their home language (either Dutch or French with their parents) and time spent in Dutch-medium education in Brussels (at least since Year 1 of primary school). At the time of the first data collection, these pupils had all been taught French as a Foreign Language for at least four years for two to four 50-minute sessions a week. In addition, they had had varying amounts of naturalistic exposure to French (cf. section 3).
The progress in French of the Dutch-speaking pupils (the target group) was tracked over a period of two school years, starting when they were in Year 1 of secondary school (age 12) and ending when they were in Year 3 (age 14). They were administered the Frog story task three times, once every school year (Y1, Y2, Y3). The French proficiency of the French-speaking pupils was evaluated only once, in Year 2, to provide a benchmark for the Dutch-speaking pupils (Footnote 2). The total corpus for this study thus consists of 76 data sets.
4.2 Data processing
The French oral speech data which the pupils produced during the retell task were recorded and subsequently transcribed, segmented and annotated in CHAT format. Next the transcriptions were analysed with the help of CLAN software (MacWhinney, 2000). As a matter of convenience when reporting research, we use as a unit of analysis the ‘word’ in the sense of lexical entry (Jiang, 2000). The aim of the analysis was to identify the number and type of words known by the Dutch-speaking L2 learners of French, asking whether they are different from those known by the native French-speaking pupils, with a special focus on their productive lexical knowledge.
To this end, root or base forms (e.g. grenouille, sauter) and their related inflected forms (e.g. grenouilles, saute) were counted as a single unit (i.e. as different tokens of the same type). Accordingly, all content word tokens in the transcriptions were lemmatised, that is, reduced to their root form or, in the case of inflected verb tokens, to their infinitive form. Finally, all noun, verb and adjective tokens were also coded for part of speech (Footnote 3).
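The effect of lemmatisation on type counts can be illustrated with a small Python sketch. The lemma lookup table below is a hypothetical stand-in built from the examples in the text; it is not the procedure or tooling actually used for the corpus, where lemmas were assigned as part of the transcript annotation.

```python
from collections import Counter

# Hypothetical lookup table for illustration only: inflected form -> lemma
LEMMAS = {
    "grenouilles": "grenouille",
    "saute": "sauter",
    "sautent": "sauter",
}

def lemmatise(tokens, lemmas=LEMMAS):
    """Map each token to its lemma; forms not in the table are kept as they are."""
    return [lemmas.get(token, token) for token in tokens]

tokens = ["grenouille", "grenouilles", "saute", "sauter", "sautent"]
lemma_counts = Counter(lemmatise(tokens))

# Five surface forms collapse into two types, each with several tokens
print(len(set(tokens)), "word forms ->", len(lemma_counts), "lemma types")
print(lemma_counts)
```

Counting lemmas rather than surface forms keeps inflectional variety (a matter of grammar) from inflating the diversity scores.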
4.3 Analytic procedures
4.3.1 Measures
The overall research approach used in our study was to identify a number of properties of learners' oral production while carrying out the narrative task which allowed for quantitative and qualitative characterisations of the Dutch- and French-speaking pupils' lexical proficiency in French on a cross-sectional and a developmental basis. To this end, a total of 42 different quantitative measures were initially calculated, covering the various components of lexical proficiency distinguished in Figure 1. However, for reasons of space, only 22 of these measures are included in this article. These 22 measures are listed in Table 1, along with an indication of the component of lexical proficiency which they are primarily intended to gauge, viz. lexical diversity, sophistication or productivity. Each of these 22 measures was calculated for the four groups of French data sets in our corpus: the 19 transcriptions of the Dutch-speaking L2 learners in Years 1, 2 and 3, and the 19 transcriptions of the French native speaking pupils.
Table 1. Overview of lexical proficiency measures

Table 2. Standard measures – descriptive statistics

All measures are either extrinsic frequency measures (indicated in Table 1 by #) or ratio measures (U for Uber Index, G for Guiraud Index), or include the intrinsic frequency distribution of types (like D).
For ease of discussion, we further divided the 22 measures into three categories: 1) ‘standard’ measures, which simply distinguish between word types and tokens at the level of the lexicon as a whole, 2) lexical ‘class’ measures, which focus on a particular lexical class (nouns, verbs, adjectives), and 3) ‘frequency’ measures, which take into account external information about words (i.e. their relative frequency of occurrence) and distinguish between ‘basic’ versus ‘advanced’ words.
In our view, lexical proficiency has to do with the knowledge and use of linguistic form units with a specifiable, self-contained semantic-conceptual meaning or, in other words, with semantic content words rather than with grammatical function words. Knowledge of function words essentially pertains to a learner's grammatical competence. The distinction between content and function words is often blurred at the operational level of the quantitative measures that are used in many L2 vocabulary studies. In line with this view, measures of productive lexical proficiency should for the sake of validity preferably be based on an analysis of content words only, rather than on the total number of words used (i.e. content + function words). However, to allow for comparison with other studies, most of which have calculated measures on the basis of the total number of word types and tokens, several measures in this study were computed twice, once for all content and function word types (indicated by ‘All’ in Table 1) and once for content words only. Moreover, some measures, such as D, were difficult to calculate for content words only for practical reasons (e.g. some data sets did not contain the minimum number of 50 content tokens required for the calculation of D). A statistical comparison between the measures calculated for all words (e.g. G-All, U-All) and those calculated for the content words only (e.g. G-Content; U-Content) revealed that most of these measures were, in fact, highly correlated (with r-values in excess of 0.90).
The set of ‘standard’ measures includes the following seven measures: total number of different word types, total number of different content word types (nouns, adjectives and verbs), Uber's index (U) and Guiraud's index (G) (both calculated for all word types and for content word types only), and finally D.
Previous studies suggest that increases in lexical diversity and productivity may be mainly due to lexical growth in one or two content word classes only (e.g. Broeder et al., 1993), though it is not entirely clear which lexical class is most prone to such developmental fluctuations nor what the link is with factors such as task type or the L2 learner's general stage of development. To further explore this issue, we calculated separate measures for the three major lexical content classes (nouns, adjectives, verbs). These ‘class’ measures are listed in the second column of Table 1. For the ratio measures, only Guiraud's index was calculated (Footnote 4), albeit again in two versions for each class, once within the word class and once across all word classes (i.e. relative to text length), using the following formulae:
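Assuming the standard definition of Guiraud's index (types divided by the square root of tokens), the two versions described above can be written as follows. The symbols are notation introduced here for illustration: V_c is the number of types in lexical class c (nouns, verbs or adjectives), N_c the number of tokens in that class, and N the total number of tokens in the data set. Whether N counts all tokens or content tokens only is not specified in the description, so it is left open here.

```latex
\[
G_{\text{within}}(c) = \frac{V_c}{\sqrt{N_c}}
\qquad\qquad
G_{\text{across}}(c) = \frac{V_c}{\sqrt{N}}
\]
```

The within-class version relates the variety of, say, verbs to the number of verb tokens only, whereas the across-class version relates the same type count to overall text length.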
Frequency-based vocabulary measures attempt to capture qualitative aspects of learners' productive lexical proficiency, namely the less frequent and, hence, potentially semantically more specific or more abstract words. This makes these measures potential indicators of what we call lexical sophistication (hence the alternative term ‘advanced’ measures; Daller et al., 2003). Since there is much discussion about the quality of the word frequency lists on which such measures are based, we again explored different options. First, we used the Français Fondamental Premier Degré (FF1) list as discussed by Tidball and Treffers-Daller (2007). This frequency list can be labelled ‘extrinsic’ in the sense that it was compiled independently from the speech corpus under investigation in this study. The FF1 list contains 1445 ‘frequent’ French words drawn from an established corpus of spoken native French. In addition to the FF1 list, we compiled two additional frequency lists on the basis of our own corpus of French narrative speech data. A first list, called FR66, contains all the types that appear at least once in two-thirds (66%) of the 76 native and non-native speaker transcriptions in our corpus. The FR66 list contains the 43 most frequent types in the entire corpus. The second frequency list is called FR5 and contains all words that appear five times or more in the 19 transcriptions of the French native speaker pupils in our sample. This list includes 149 word types. Both the FR66 and the FR5 frequency lists are ‘intrinsic’ in the sense that they are compiled from the same speech data to whose analysis they will be applied. The advantage of ‘intrinsic’ frequency lists is that they are context and task-specific and respect the integrity of the text samples analysed.
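The two intrinsic lists can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used: the FR66-style criterion is read as "occurs in at least two-thirds of the transcriptions", the FR5-style criterion as "occurs at least five times in the pooled native-speaker transcriptions", and the function names and toy data are invented for the example.

```python
from collections import Counter
from fractions import Fraction

def document_frequency_list(transcripts, min_share=Fraction(2, 3)):
    """FR66-style list: types occurring at least once in >= min_share of transcripts."""
    doc_freq = Counter()
    for tokens in transcripts:
        doc_freq.update(set(tokens))  # count each type once per transcript
    threshold = min_share * len(transcripts)  # exact arithmetic via Fraction
    return {word for word, df in doc_freq.items() if df >= threshold}

def token_frequency_list(transcripts, min_count=5):
    """FR5-style list: types whose pooled token count is >= min_count."""
    counts = Counter(token for tokens in transcripts for token in tokens)
    return {word for word, count in counts.items() if count >= min_count}

# Invented toy corpus of three lemmatised transcripts
toy = [
    ["le", "chien", "tomber"],
    ["le", "chien", "courir"],
    ["le", "garçon", "tomber"],
]
shared = document_frequency_list(toy)            # in at least 2 of 3 transcripts
repeated = token_frequency_list(toy, min_count=3)  # at least 3 tokens overall
print(sorted(shared), sorted(repeated))
```

In the study's terms, a word counts as ‘rare’ with respect to a given list when it is not a member of that list.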
All three frequency lists were used to distinguish between ‘frequent’ and ‘rare’ words in each of the 76 transcriptions. We tallied not only the number of ‘rare’ word types but also their relative frequency of occurrence by using the following derivative of the Guiraud Index:
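A plausible form of this derivative, assuming it follows the Advanced Guiraud of Daller et al. (2003) cited earlier, divides the number of ‘rare’ types by the square root of the number of tokens. The symbols are notation introduced here: V_rare is the number of types in a transcription that do not occur on the frequency list in question (FF1, FR66 or FR5), and N is the number of tokens in that transcription.

```latex
\[
G_{\text{rare}} = \frac{V_{\text{rare}}}{\sqrt{N}}
\]
```

As with the plain Guiraud Index, the square root in the denominator is intended to dampen the effect of differences in text length between data sets.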
4.3.2 Statistical analyses
For each of the four data sets, mean scores were computed for each of the 22 measures. After checking for normality of distribution by means of the Kolmogorov-Smirnov test, a series of repeated measures ANOVAs (with ‘Year’ as within-subjects factor) was performed to investigate how the lexical proficiency in the French oral narrative production data of the Dutch-speaking pupils developed over time. Their development was assessed in more detail by calculating Bonferroni pairwise comparisons (corrected paired sample t-tests) for the different test moments (Year 1–Year 2, Year 2–Year 3 and Year 1–Year 3). Paired sample t-tests were used to compare the scores of the Dutch-speaking target group at the three different test moments with the scores of the French native-speaker benchmark group.
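The logic of the Bonferroni pairwise comparisons can be sketched with the Python standard library. The sketch below computes only the paired t statistic, its degrees of freedom and the Bonferroni-adjusted significance threshold for three comparisons; it does not compute p-values, it is not the software used in the study, and the scores are invented for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Paired-sample t statistic and degrees of freedom for two score lists
    from the same participants, given in the same order."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_value, n - 1

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_comparisons

# Invented scores for four learners at two test moments
year1 = [4.0, 5.0, 6.0, 5.0]
year2 = [5.0, 7.0, 7.0, 8.0]
t_value, df = paired_t(year1, year2)

# Three pairwise comparisons: Y1-Y2, Y2-Y3 and Y1-Y3
alpha_per_test = bonferroni_alpha(0.05, 3)
print(round(t_value, 3), df, round(alpha_per_test, 4))
```

Each pairwise p-value would then be compared against the adjusted threshold (here 0.05 / 3) rather than against 0.05, which keeps the overall risk of a false positive across the three comparisons at roughly the nominal level.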
5. RESULTS
The presentation of the results is organised per category of measure (‘standard’, ‘class’ and ‘frequency’). A first table contains the descriptive statistics (mean scores, standard deviations) for each of the measures for the four data sets (Year 1, 2 and 3 for the Dutch-speaking learners, and the native benchmark group). A second table presents the inferential statistics (RM ANOVA: significance, F-value between brackets; Y1 vs Y2, Y2 vs Y3 and Y1 vs Y3: significance of Bonferroni pairwise comparison). Statistically significant differences (at p < 0.05) are marked by three asterisks. The final part of this section presents a correlation matrix for the 22 measures.
5.1 ‘Standard’ measures
The repeated measures ANOVAs indicate that the factor ‘Year’ has a significant effect. This means that there is a significant increase in the scores of all seven ‘standard’ measures of lexical proficiency of the Dutch-speaking pupils over the three observation periods. The pairwise comparisons further indicate that although the increase on all measures is significant over the entire observation period (from Year 1 to Year 3), the increase is most pronounced in the course of the first year (Year 1 to Year 2). The increase on the two frequency measures of lexical productivity (number of different word types, number of different content word types) is significant neither over the first nor over the second year. The ratio measures of lexical diversity (U and G) and D all show a significant increase over the first year, but not over the second. Finally, the last column in Table 3 shows that the French L1 benchmark group significantly outperforms the Dutch-speaking L2 group on every measure and at each of the three testing moments.
Table 3. Standard measures – inferential statistics

5.2 ‘Class’ measures
Table 4 shows that verbs are the most frequent word class in our corpus, followed by nouns and then adjectives. Adjectives are very much a minor phenomenon, both in the learner and the native speaker benchmark corpus (on average 2 or 3 adjectives per dataset). The statistical analyses of the lexical class measures in Table 5 yield a slightly different picture from that of the standard measures. The ANOVAs indicate that the Dutch-speaking pupils significantly progress over the total observation period on all measures except for one of the two lexical diversity measures for adjectives. However, the pairwise comparisons show that the pupils' overall lexical growth is mainly due to the increase on the lexical diversity measures for the noun and verb classes in the course of the first year (Year 1 to Year 2) only, and that no further significant progress was made on any of the measures over the second year (Year 2 to Year 3). Finally, the paired sample t-tests reveal that at each of the three testing moments the French L1 benchmark group significantly outperforms the Dutch-speaking L2 group on the productivity and diversity measures for nouns and verbs. No significant differences between the learners and the French L1 speakers were revealed for the measures focusing on the adjectives.
Table 4. Class measures – descriptive statistics

Table 5. Class measures – inferential statistics

Table 6. Frequency-based measures – descriptive statistics

Table 7. Frequency-based measures – inferential statistics

5.3 ‘Frequency’ measures
The general trends which emerge from the analysis of the frequency measures, primarily intended to gauge the pupils' more ‘advanced’ vocabulary, correspond to those of the standard and class measures presented in the previous sections: the Dutch-speaking learners' scores steadily increase over time though they never attain the values of the native-speaker benchmark pupils (Table 6). The ANOVAs indicate that the Dutch-speaking pupils' progress on the measures based on the extrinsic FF1 frequency list and the intrinsic FR66 frequency list is significant. The measures based on the FR5-list (which is derived from the French-speaking pupil sub-corpus) do not yield any significant differences. The picture emerging from the pairwise comparisons is less straightforward. Both the number and the ratio measure of advanced word types based on the FF1-list significantly increase in the course of the first, second and third year. The same measures calculated on the basis of the FR66-list significantly advance only over the first year (Year 1 to Year 2) and over the entire observation period (Year 1 to Year 3). Finally, the last column of Table 7 shows that the French L1 benchmark group once more outperforms the Dutch-speaking learner group on all measures and for all observation periods.
5.4 Correlations
This final section examines the correlations between the different measures from all data sets combined. Table 8 in the appendix shows that all correlations are significant and positive and in some cases very strong. This is the case for the correlation between the number of types and the number of content types (0.98), the correlations between G and U calculated for all words and G and U calculated for content words (respectively 0.96 and 0.95) and for the correlations between the number of noun lemmas (0.92) and verb lemmas (0.95). Especially the ‘standard’ measures of lexical diversity (G, U and D for all words) are strongly intercorrelated (between 0.95 and 0.99) and they correlate well with the total number of lemmas (0.92 for G, 0.88 for U and 0.92 for D). The lowest correlation coefficients are found for the measures involving adjectives.
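As an illustration of how such a matrix is built, the sketch below computes pairwise correlation coefficients between measures. The type of coefficient is not stated here, so Pearson's r is an assumption, and the measure names and scores are hypothetical values chosen only to make the example self-contained.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(measures):
    """Map each pair of measure names to its correlation coefficient."""
    names = list(measures)
    return {a: {b: pearson(measures[a], measures[b]) for b in names} for a in names}


# Hypothetical scores of four speakers on three measures.
measures = {
    "types": [10, 20, 30, 40],
    "content_types": [5, 10, 15, 20],
    "adjective_types": [3, 1, 4, 2],
}
matrix = correlation_matrix(measures)
```

Measures that are arithmetically derived from one another, such as the number of types and the number of content types, will tend to produce coefficients close to 1 in a matrix of this kind, which is relevant to the interpretation of the strong correlations reported above.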
5.5 Summary of the results
In summary, the results of the statistical analyses reveal certain trends both in relation to the overall development of the L2 learners over time and to the lexical proficiency of the learners and the native speakers compared. A few exceptions aside (see footnote 5), the repeated measures ANOVAs (with time as within-subjects factor) show that these learners of French progressed significantly on all measures. Bonferroni pairwise comparisons for the different test moments further reveal that the learners' overall development was stronger over the first year (with significant progress on 12 measures evenly distributed over the four dimensions of lexical proficiency they were intended to gauge) than over the second year (learners only progressed for two of the frequency measures, namely the number of non-frequent word types in the FF1 list and the diversity of these ‘advanced’ word types relative to the text length). Finally, a series of paired sample t-tests shows that except for the ‘class’ measures of adjective production, the native speakers of French scored consistently and significantly higher than the learners.
6. DISCUSSION AND CONCLUSION
The present study sought to investigate the lexical proficiency in French of the Dutch-speaking pupils vis-à-vis their Francophone classmates in the first three years of Dutch-medium schools in Brussels. A first goal of this study was to explore the operationalisation and definition of lexical L2 proficiency and related constructs with a view to identifying which measures best capture the development of productive lexical proficiency in L2 French over time. In future work the model of lexical competence and proficiency proposed in section 2 will be situated within broader models of language proficiency and competence (e.g. Bachman and Palmer, Reference Bachman and Palmer1996; Hulstijn, Reference Hulstijn2007). In this article we focused on the strength of different statistical measures of lexical proficiency, particularly in terms of discriminatory power and validity. In this respect the findings of our study suggest that:
1. all selected ‘standard’, ‘class’ and ‘frequency’ measures (except for the measures based on the FR5 list) make it possible to demonstrate differences between different stages of L2 development as well as between L2 learners and L1 users.
2. most measures, but particularly the standard measures of lexical diversity, correlate strongly (at least for the admittedly restricted data sets in the present study).
These findings imply that the lexical measures calculated here all have longitudinal and cross-sectional discriminatory power. At the same time, however, the observation that all measures are strongly correlated appears to confirm our suspicion that they tap into similar aspects or dimensions of lexical proficiency, which poses problems in terms of construct validity. Particularly the measures proposed here as indicators of lexical sophistication appear to have low construct validity. As already indicated in Table 1, these frequency-based measures do not only measure lexical sophistication (or, rather, aspects thereof) but also other dimensions of lexical proficiency such as diversity and productivity. Moreover, although the frequency-based vocabulary measures can capture some qualitative aspects of learners' productive lexical proficiency (by looking at the less frequent, more advanced and potentially semantically more specific or more abstract words), lexical sophistication also refers to extensional semantic relations between these words that cannot be captured by referring to frequency lists alone. What is needed then, is greater conceptual clarity as to what lexical sophistication entails and more varied and sophisticated measures of this dimension of lexical proficiency than the ones used in this study.
A second aim of this study was to longitudinally explore the development of French productive lexical proficiency in terms of the diversity, sophistication and productivity of Dutch-speaking pupils vis-à-vis their Francophone classmates in the first three years of secondary education in Dutch-medium schools in Brussels. To this end oral French production data from 19 Dutch-speaking French learners were collected on three occasions at twelve-month intervals and compared to the oral production data from 19 native French speakers who attended the same classes. Twenty-two quantitative measures tapping into different components of lexical proficiency were computed and statistically analysed. With respect to the Dutch-speaking pupils' lexical development, the following outcomes appear to be particularly noteworthy:
1. Learners progressed significantly in terms of lexical diversity, sophistication and productivity;
2. The development of lexical proficiency, particularly in terms of lexical productivity and diversity, trails off after the first year.
Although these findings clearly indicate an improvement in selected aspects of the French lexical proficiency of the Dutch-speaking pupils over the first year of secondary education, an explanation for the relatively lower scores obtained in the third year is not immediately apparent. Perhaps this last finding can be attributed to methodological aspects of our study. It is not unlikely that after having told the frog story in Year 1 and Year 2 the learners experienced diminished task motivation the third time around and invested less effort in creating a lexically productive and diverse narrative (even though the interviewers were different on each occasion and pretended to be unfamiliar with the story). This point merits further exploration. In future analyses we will include available data on the pupils' attitudinal and motivational dispositions to examine not only individual variation in lexical development but also the extent to which the behavioural manifestations of lexical competence are influenced by non-linguistic learner variables. In addition, future work will also include qualitative analyses of the pupils' lexical productions to identify more subtle changes in and differences between the learners' and native speakers' lexical repertoires that may have been overlooked by the quantitative analyses of the present study.
Although at the end of the study the Dutch-speaking pupils had had seven years of FFL instruction and at least nine years of fairly intensive informal contact with French-speaking peers, and although the French-speaking pupils had had their basic education in a second language (Dutch), the findings of this study suggest that the two groups still differ markedly in their French lexical proficiency. At the end of the observation cycle the native French pupils still outperform the L2 learners on all dimensions investigated except on the measures for lexical productivity and diversity of adjectives. This last finding may again be attributed to methodological issues related to the task design. The fact that adjectives are a minor phenomenon in the French productions of both the Dutch-speaking target group and the French native-speaker benchmark group points to the limitations of the specific data elicitation procedure used in this study. An oral retell task on the basis of a wordless picture story such as the popular Frog story creates too few obligatory or appropriate contexts for using adjectives for any potential differences between the Dutch-speaking pupils and the French native-speaking group to emerge. More generally, this points to the crucial importance, not only of selecting reliable, valid and sensitive proficiency measures, but also of selecting data collection procedures that elicit lexically rich and varied speech.
Methodological limitations notwithstanding, the outcomes of this study unambiguously indicate that even after nine years of intensive and varied contact with French the Dutch-speaking pupils had not yet attained the same level of lexical productivity, diversity and sophistication as their native speaker peers. This observation is obviously not to be taken as a sign of failure on the part of the specific pupils, their teachers, the schools or the educational system involved, the more so as the levels of lexical proficiency attained may well be high when compared to FFL contexts elsewhere (a possibility which will be empirically investigated in a follow-up study comparing the French lexical proficiency of Dutch-speaking pupils in Brussels with that of Dutch-speaking FFL learners in monolingual Flanders). Rather, these findings demonstrate that lexical development in a second or foreign language is not always a given, not even in contexts as seemingly conducive to L2 learning as the learning of French in Brussels. More specifically, these findings raise questions about what can be realistically expected from the contribution of formal and informal L2 exposure to lexical development, and of the explicit and incidental learning processes which these different exposure sources are said to elicit. Explicit lexical instruction and learning strategies applied in the FFL lessons in Dutch-medium education in Brussels are ample (e.g. memorisation of word lists, use of dictionaries).
Many researchers agree that explicit or form-focused vocabulary teaching, and the explicit learning mechanisms which it elicits, may be necessary, or at least beneficial, for learning the core vocabulary in classroom L2 learning, particularly for the learning of basic lexical and semantic knowledge, but informal vocabulary learning should also be encouraged for learning additional vocabulary and for further lexical and semantic development of the words learned through explicit instruction (Huckin and Coady, Reference Huckin and Coady1999). Informal vocabulary learning in instructional contexts is often referred to as ‘incidental’ (cf. Wesche and Paribakht, Reference Wesche and Paribakht1999; Gass, Reference Gass1999) because it is typically a by-product, not the target, of another meaning-focused communicative activity such as reading, listening and negotiating meaning in the context of authentic communicative native-nonnative interactions. Although incidental vocabulary learning is still not fully understood, it is believed to have clear advantages over explicit instruction and learning, including the following: (a) it is contextualised, giving the learner a richer sense of a word's use and meaning than can be provided in traditional vocabulary-building exercises, (b) it is more individualised and learner-based because the vocabulary being acquired is dependent on the learner's own selection of communicative topics or reading materials and (c) it occurs through multiple exposures to a word in different contexts (Huckin and Coady, Reference Huckin and Coady1999; Gass, Reference Gass1999). Be that as it may, the findings of this study would still suggest that there are clear limits to the contributions of incidental vocabulary learning, too, and, more generally, that one has to be patient and realistic about what can be expected from L2 vocabulary development in educational contexts.
APPENDIX
Table 8. Correlation matrix
