1 Introduction
Aviation English, a key component of air traffic control (ATC) communication, can be defined as a comprehensive but specialized subset of English for specific purposes (ESP) related broadly to aviation and consisting of both plain language and standardized phraseology for radiotelephony communications (International Civil Aviation Organization [ICAO], 2004). Although aviation English between air traffic controllers and pilots is used in highly predictable circumstances following a prescribed sequence (Mell, 1992), miscommunications do occur, especially between non-native English speaking pilots and air traffic controllers (Cushing, 1994; Tajima, 2004). It has been reported that at least 11% of fatal airplane crashes worldwide in the period 1982–1991 were due to miscommunications related to second language use (Crystal, 2003). In response to growing concerns about aviation safety, several aviation English tests, such as the test of English Language Proficiency for Aeronautical Communication by EUROCONTROL (Alderson, 2009), the Test of English Language Level for Controllers and Pilots by TELLCAP (Alderson, 2006), and the RMIT English Language Test for Aviation by RMIT University (Zokić, Boras & Lazić, 2012), have been developed based on the ICAO policy of language proficiency requirements (LPRs).
Unfortunately, despite the promotion of a standardized quality of the tests, little to no confidence can be placed in the reliability and validity of several currently available aviation language tests (Alderson, 2006). Although most aviation English testing companies advertise the computer-assisted features of their assessments, they do not make full use of the online testing environment. The companies simply present still pictures and plain text prompts, play audio materials, and record test takers’ voices, instead of integrating interactive features in order to assess interactive ATC task performance. Therefore, there is a pressing need to develop an authentic and interactive aviation English test that takes advantage of innovative technology such as virtual worlds (VWs) to observe test takers’ aviation English proficiency and to capture not only their knowledge of aviation English communication, but also their demonstrated strategic competence in ATC interactions.
To address this discrepancy, the recent efforts of the US Army Research Institute for the Behavioral and Social Sciences (ARI) to develop prototype interactive training applications and assessments in maturing learning technologies, including VWs and collaborative game-based technologies, have been thought provoking and refreshingly positive (Brusso, Wisher, Paddock & Hatfield, 2014). Such endeavors are very much in line with the latest assessment trends in K–12 US classrooms seeking to enhance the feasibility of virtual performance assessments and to evaluate scientific inquiry through authentic interactive testing environments (Clarke-Midura & Dede, 2010). Although VWs were not explicitly created for language education purposes, several language educators and researchers have made use of them as platforms for teaching and learning a variety of disciplinary content, including languages (da Silva, 2012; Sadler, 2012). Instructional benefits of VWs include promoting social interaction (Chuang, Chang & Chen, 2014), creating community (Oliver & Carr, 2009), facilitating collaboration (Peterson, 2012; Vuopala, Hyvönen & Järvelä, 2016), lowering social anxiety (Melchor-Couto, 2017), and enhancing learner motivation and engagement (Verhagen, Feldberg, van den Hooff, Meents & Merikivi, 2012).
VWs excel at providing users with highly interactive arenas for dynamic feedback, learner experimentation, and real-time personalized task performance (Abdallah & Mansour, 2015; Lin, Wang, Grant, Chien & Lan, 2014; Swier, 2014; Wang, Calandra, Hibbard & Lefaiver, 2012). A VW can simulate target language use (TLU) situations by providing a set of specific language use tasks that the test taker is likely to encounter outside of the test itself (Bachman & Palmer, 1996; Douglas, 2000). Providing test takers with such authentic and interactive tasks in the VW may encourage them to fully demonstrate what they are actually capable of accomplishing in the TLU situations. It is therefore important to explore the affordances of VWs as a potential language assessment platform for observing test takers’ strategic competence, as well as linguistic knowledge, during interactions facilitated by authentic target tasks.
This paper will first examine aviation English assessment and the significance of strategic competence and then elaborate on the potential role of VWs for language assessment. What follows is an exploratory study that investigates cognitive, metacognitive, and communication strategies used during interaction in a VW and their relation to test takers’ performance.
2 Literature review
2.1 Aviation English assessment and strategic competence
Language ability is an abstract concept; therefore, it is necessary to define “ability” in explicit terms for language assessment. Such defined ability, inferred from a meaningful interpretation of observed behavior, is called a construct (Bachman & Palmer, 2010). The construct under investigation in this study is air traffic controllers’ aviation English ability based on an interactionalist definition (Chapelle, 1998) in which a test taker’s performance on a test indicates not only underlying language knowledge and the assessment context, but also strategic competence (Bachman & Palmer, 1996). This may suggest that the test taker utilizes her or his language knowledge in a given assessment context using various strategies. To provide a more articulated concept of strategic competence, the researcher operationalized cognitive, metacognitive, and communication strategies (Chamot & O’Malley, 1987; Dörnyei & Scott, 1997; Purpura, 1999) as follows:
1. Cognitive strategies, which involve the test taker’s interaction with the interlocutors and assessment context to be assessed by manipulating it mentally through the processes of identification, retention, storage, comprehension, and retrieval.
2. Metacognitive strategies, which involve executive processes in goal setting, assessment (assessing the situation, self-monitoring, and self-evaluating), and planning (formulating a plan).
3. Communication strategies, which involve the test taker’s direct, interactional, and indirect communication-enhancing devices such as paraphrasing, substitution, coining new words, switching to the first language, and asking for clarification.
Although the use of strategic competence implies a direct or indirect impact on task performance, cognitive and metacognitive strategies, unlike communication strategies, are not directly observable because they take place as mental processes, and they are largely ignored in language assessment (Lai, 2011). Thus, only communication strategies, the observable actions that the test taker selects for interacting with other interlocutors, are often included as integral in the assessment criteria under the interaction category adopted in the LPRs’ rating criteria by the ICAO (2004).
For the assessment of aviation English, the ICAO has published a set of LPRs and proficiency rating criteria, which consists of six areas of language use: pronunciation, structure, vocabulary, fluency, comprehension, and interaction. Among the six areas of language use, successful interaction is defined by the ICAO as “Responses are usually immediate, appropriate, and informative. Initiates and maintains exchanges even when dealing with an unexpected turn of events. Deals adequately with apparent misunderstandings by checking, confirming, or clarifying” (ICAO, 2004: 4-14). Interaction in aviation English is goal oriented and formulaic with the following characteristics: high responsiveness, less valued politeness, restricted turn-taking, and determined topics according to information-transfer requirements (Van Moere, Suzuki, Downey & Cheng, 2009). In this regard, it is highly important to recognize the role of strategic competence in task performance and to find ways to reflect strategy uses in language assessment.
Despite endeavors to standardize the quality of aviation English assessment, currently released aviation English tests may focus on language ability superficially in the context of aviation English but fail to measure what the test taker actually could accomplish in an authentic TLU situation (Alderson, 2006). It is not enough to merely give test takers topics relevant to the field in which they are studying or working and assess verbal interaction without taking into consideration test takers’ cognitive and metacognitive strategy uses. If we want to assess how well test takers can use language for specific purposes, a measure is required that considers language knowledge, background knowledge, and the use of strategic competence related to the TLU situation. In other words, the more authentic the testing environment, the more strategic competence the examinees may exploit in the task performance (Douglas, 2000). Accordingly, it is crucial for test developers to provide test takers with authentic and interactive test content and contexts to encourage them to fully demonstrate what they can cognitively process and what they can actually do in real-life ATC communication scenarios.
2.2 Virtual worlds for language assessment
Currently, there are hundreds of VWs that have drawn interest from educators. Table 1 summarizes some popular VW platforms and includes information about their year of release, costs, target audiences, themes, number of members, and installation method.
Table 1 Overview of popular VW platforms
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180413083803334-0229:S0958344017000362:S0958344017000362_tab1.gif?pub-status=live)
As highlighted in Sadler (2012), many VWs, including the nine platforms shown in Table 1, share a number of common characteristics: online 3D environments, avatars, real-time interaction, 24-hour accessibility, social space, large numbers of users, and longevity of the environment. Among the features of VWs, interaction is one of the most crucial processes through which language learners can develop both linguistic knowledge and communicative competence (Tseng, Tsai & Chao, 2013). It is thus imperative to provide authentic language learning and interaction opportunities to language learners so that they can practice the target tasks through synchronous or asynchronous interaction (Holliday, 2007; Khalsa, Maloney-Krichmar & Peyton, 2007).
A review of recent empirical research on the use of VWs in language learning and teaching (see Table 2) shows that the two most popular study topics are (1) language gain through negotiation of meaning and/or form in a VW, and (2) usability testing (e.g. learners’ perception and affordances of VWs). To explore such research themes, researchers of VWs frequently adopt surveys, pre-/post-tests, and interviews as research instruments. Among the diversity of platforms, Second Life has been one of the most widely explored by educators and researchers. According to a survey given to 237 Second Life users (Sadler, 2011), more than 80% of Second Life participants have experienced a great deal of second language practice in their interactions with other users in Second Life. In particular, the landmark function in Second Life, similar to an Internet bookmark, enables experienced users, educators, and researchers to link their own virtual space with the landmark and conveniently invite other users. Owing to these distinct characteristics, Second Life sustains a strong community of educators, numbering in the thousands, including teachers and researchers working with various languages (Sadler, 2012).
Table 2 Overview of recent empirical research with VWs
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180413083803334-0229:S0958344017000362:S0958344017000362_tab2.gif?pub-status=live)
In what was considered the early stage of adopting and applying VWs to language learning and teaching, it was adequate to examine users’ interactive communication in and perceptions of virtual spaces to identify empirical evidence to support the affordances, capability, and effectiveness of VWs for language learning and teaching. However, greater efforts should now be made to examine non-verbal interaction (e.g. strategic competence, situated cognition) facilitated and promoted by VWs, which this researcher believes is one of the most fundamental advantages of virtual spaces. Additionally, more diverse research areas and research methods with VWs may need to be explored to take best advantage of virtual environments in the realm of second language learning and testing.
The use of VWs has the potential to improve language assessment dramatically by validly measuring test takers’ sophisticated intellectual and psychosocial performance and compensating for the shortcomings of multiple-choice or simple online tests that contain little to no interactive features. There are two major advantages supported by the literature. First, VWs can increase face validity, or test takers’ intuitive judgments about the test (Alderson, Clapham & Wall, 1995). Although frequently dismissed by testers as being unscientific and irrelevant (Stevenson, 1985), face validity can be integral in determining the test’s acceptability and reasonableness to those who will be tested and to those who use test results (Messick, 1989). Second, VWs can enhance construct validity. With respect to the construct definition of aviation English, strategic competence serves as a mediator between test takers’ language knowledge and context by controlling the interaction between them (Douglas, 2000). By approximating authentic TLU situations in VWs, test takers’ use of strategic competence should be promoted to better predict the language learners’ future real-life performance in authentic ATC situations.
The goal of the current research project was to explore and identify the potential of a VW (Second Life) for promoting more authentic cognitive and metacognitive strategies, such as those that occur in real-world scenarios, by capturing what specific strategies the test takers actually use during their task performance in Second Life. This investigation sought to answer the following questions: (a) What types of strategic competence are used in the virtual aviation English task performance by the test takers? and (b) How can the assessed strategies be interpreted in relation to test takers’ aviation English performance? Answering these questions would shed light on the potential use of VWs both for language assessment and as windows into strategic competence.
3 Research methods
This study adopts a mixed-method research design (Creswell & Plano Clark, 2007) using quantitative data and qualitative data to answer the research questions: (1) quantitative data from test takers’ aviation English test performance scores and coded think-aloud data, and (2) qualitative data from think-aloud discourse data.
3.1 Participants
Participants in this study were five enlisted soldiers (all male) and five non-commissioned officers (NCOs; two females, three males) whose job was to conduct ATC in the military air traffic service battalion in an Asian country. Initially, 12 test takers were invited to participate in the stimulated recall data collection; however, because of poor audio recording quality and unexpected absences resulting from emergency duties, only 10 test takers’ stimulated recall data were analyzed. A brief summary of the participants’ demographic information is provided in Table 3.
Table 3 Descriptive statistics of the participants’ demographic information
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180413083803334-0229:S0958344017000362:S0958344017000362_tab3.gif?pub-status=live)
Note. The total number of the participants was 10.
3.2 Virtual interactive aviation English tasks in Second Life
The Virtual Interactive Tasks for Aviation English Assessment (VITAEA), consisting of pre-existing aviation English tasks developed by the researcher, entail listening to simulated pilots’ radiotelephony transmissions involving requesting information, instructions, and clearance, watching animated helicopters landing and departing, reading flight plans and weather information on a digital board in the ATC tower, and then orally responding to the pilots in seven prototype aviation English tasks simulated in Second Life. To explore the target aviation English tasks and task topics in the TLU situation, and to examine rating criteria, the researcher administered a task-based needs analysis survey based on a task-based performance assessment design (Long, 2005) to 87 experienced air traffic controllers from the same target context. Based on this authentic task and task-topic identification, seven prototype aviation English assessment tasks were created (see Table 4).
Table 4 Final blueprint for the Virtual Interactive Tasks for Aviation English Assessment
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180413083803334-0229:S0958344017000362:S0958344017000362_tab4.gif?pub-status=live)
Note. The total test time was 20 minutes.
The evaluation of test takers’ aviation English performance in Second Life was founded on two different rating criteria: (1) an ICAO-implemented language-centered rating rubric, and (2) a task-centered rating rubric focusing on the task accomplishment. The rationale for having two different rating criteria was that even though ICAO’s analytical assessment criteria were well defined and appropriate for assessing complex and multidimensional aviation English proficiency, such a language-centered assessment rubric would still be unable to directly determine successful accomplishment of the particular target tasks. To make better use of this innovative assessment in Second Life as a vehicle for determining task accomplishment and the elicitation of language performance, respectively, the criteria for task-based aviation English performance assessment aimed to not only inform inferences about test takers’ aviation English proficiency (ability), but also provide test users with the experience of successfully completed authentic tasks.
In the Manual on the Implementation of ICAO Language Proficiency Requirements (ICAO, 2004), six levels of operational proficiency, ranging from pre-elementary (Operational Level 1) through expert (Operational Level 6), are provided. Six dimensions of proficiency are evaluated:
∙ Pronunciation – pronunciation, stress, rhythm, and intonation;
∙ Structure – grammar, sentence patterns, global meaning errors, and local errors;
∙ Vocabulary – style, tone, lexical choices, which correspond to context and status, idiomatic expressions, articulation of subtle differences or distinction in expression, and meaning;
∙ Fluency – naturalness of speech production, absence of inappropriate hesitations, fillers, and pauses that may interfere with comprehension;
∙ Comprehension – clear and accurate information transfer that results in understanding;
∙ Interactions – sensitivity to verbal and non-verbal cues and appropriate response to them.
A primary concern of the task-centered rating rubric is test takers’ accomplishment of target tasks in TLU situations. Raters were asked to identify the level of task accomplishment as (1) excellent, (2) acceptable, or (3) unacceptable. Task-centered rating criteria in each level correspond to one another. The prototype of the task-centered rating rubric was revised according to domain experts’ consensus and feedback before piloting.
Figure 1 shows the Second Life simulated airbase environment, created based on authentic satellite images, in which test takers engaged in aviation English communication. When test takers log into the ATC tower environment in the testing situation, they can interact with simulated pilots and helicopters and access ATC information, such as weather information, airbase information, notice to airmen (NOTAM), and flight plans. Figure 2 presents a screen capture of the ATC tower in the Second Life testing environment.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180413083803334-0229:S0958344017000362:S0958344017000362_fig1g.jpeg?pub-status=live)
Figure 1 Overview of the virtual interactive aviation English test environment simulated in Second Life
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180413083803334-0229:S0958344017000362:S0958344017000362_fig2g.jpeg?pub-status=live)
Figure 2 Actual and emulated image of the front view of the ATC tower
3.3 Data collection and analysis
The VITAEA were administered to the 10 air traffic controllers one at a time in the airbase over the period of a month. Three laptop computers were used for the test administration: the first, a high-performance gaming laptop, had been set up with a screen-capturing program called Camtasia to record test takers’ performance during the test; the second was used by the researcher for the simulation test administration (e.g. operating helicopters, playing audio files interactively) in Second Life; the third was connected to the same Second Life server to provide a high-quality screen-capture angle for recording video during the task performance. The test itself took between 16 and 20 minutes per person. When combined with the follow-up stimulated recall, the entire process lasted approximately 50 minutes per test taker.
For the test rating, two expert air traffic controllers, each with more than 15 years of ATC experience, were recruited to holistically rate the test takers’ task performance. For analytical assessment using ICAO’s LPRs’ rating criteria, three applied linguists in the US (one native speaker of English from the US and two international but native-like speakers) were also recruited to rate the same task performance audio files.
In addition to test takers’ performance scores, stimulated recall data from the 10 test takers were collected to answer the research question on test takers’ strategy use during the target task performance in Second Life. From these verbal reports, recorded using a digital MP3 recorder, inferences could be made about what and how strategies were planned, used, and assessed during the virtual interactive task performance in Second Life. Immediately after administering the aviation English tasks in Second Life, the researcher trained the participants in how to verbalize their cognitive/metacognitive processes during their task performance. In each of the test takers’ radiotelephony transmissions, participants were encouraged to think out loud whenever they clicked on navigation buttons, moved the mouse cursor, or read particular information.
Following the approach of Purpura (1999) and Dörnyei and Scott (1997), the transcribed protocols were analyzed using grounded theory analysis techniques (Corbin & Strauss, 2008): open coding was used to identify an array of strategic actions, and axial coding was then used to group those actions into a certain number of core strategy types for completing the aviation English tasks in Second Life. For communication strategy use analysis, a coding scheme was adopted from Dörnyei and Scott (1997). The verbal reports focused on the types of cognitive, metacognitive, and communication strategies used in the VITAEA and their relationship with task performance. To assess the degree to which the test takers’ transcripts were consistently coded by the two coders, intercoder agreement was calculated for the entire data set, and intercoder reliability was found to be acceptable (87.4% agreement) (Paltridge & Phakiti, 2015).
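As an illustration of how an intercoder agreement figure of this kind can be obtained, the following minimal sketch computes simple percent agreement between two coders. It assumes that agreement is the share of segments given the same code by both coders; the study does not specify the exact formula, and the function name `percent_agreement` and the sample labels are hypothetical, not the study’s data.

```python
def percent_agreement(codes_a, codes_b):
    """Return the share (in %) of segments that two coders labelled identically."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must label the same segments.")
    if not codes_a:
        raise ValueError("At least one coded segment is required.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical labels for eight recall segments (illustrative only).
coder_1 = ["APR", "APR", "LPK", "CLAR", "MON", "INF", "APR", "SE"]
coder_2 = ["APR", "APR", "LPK", "CLAR", "ASSIT", "INF", "APR", "SE"]

agreement = percent_agreement(coder_1, coder_2)
print(f"{agreement:.1f}% agreement")
```

Simple percent agreement does not correct for chance agreement; a chance-corrected index could be substituted where that matters.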
Regarding the expected relationships between the participants’ test scores and their strategy use, it was hypothesized that the test scores and the number of strategies the test takers adopted would be positively correlated. To operationalize this, the total number of cognitive and metacognitive strategy use, the number of cognitive and metacognitive strategy types, the number of communication strategy use, task-centered rating scores, and language-centered rating scores were calculated and analyzed. However, due to the small sample size, descriptive statistics and data visualizations are used to display the findings.
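The counting step described above can be sketched as follows. This is a minimal illustration, assuming the coded recall data are available as (test taker, strategy code) pairs; the data layout, the function name `summarize`, and the sample segments are hypothetical and do not reproduce the study’s data.

```python
from collections import Counter, defaultdict

# Hypothetical coded segments: (test taker id, strategy code).
coded_segments = [
    (1, "CLAR"), (1, "ASSIT"), (1, "APR"), (1, "APR"),
    (2, "LPK"), (2, "APR"),
    (4, "APR"), (4, "MON"), (4, "MON"),
]


def summarize(segments):
    """Tally, per test taker, the total strategy uses and the distinct strategy types."""
    per_taker = defaultdict(Counter)
    for taker, code in segments:
        per_taker[taker][code] += 1
    return {
        taker: {
            "strategy_total": sum(counts.values()),  # all coded uses
            "strategy_types": len(counts),           # distinct codes used
        }
        for taker, counts in per_taker.items()
    }


summary = summarize(coded_segments)
for taker in sorted(summary):
    print(taker, summary[taker])
```

Each test taker’s two tallies can then be set beside the task-centered and language-centered scores for descriptive comparison.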
4 Results and discussion
In this section, the researcher presents the results of the study sequentially based on the research questions.
To find the answer to the first research question, verbal reports from stimulated recalls with the 10 air traffic controllers were transcribed and analyzed. The verbal reports focused on the types of cognitive, metacognitive, and communication strategies used in the VITAEA and their relationship to task performance.
Instances of each type of cognitive and metacognitive strategy identified from the stimulated recall data analysis are provided below:
∙ Clarifying/verifying (CLAR)
I had to ask the pilot for repetition of his last transmission as I was not familiar with the pilot’s way of reading numbers. (Test taker #1)
∙ Assessing the situation before events <ASSIT>
I changed the screen view of Second Life to the right side to follow up the aircraft when I received the call. As I could see the aircraft is leaving the control zone from the east bound, I approved frequency change. (Test taker #1)
∙ Applying rules (APR)
As the pilot said the aircraft is at two miles east of the airbase and requested landing, I approved the landing. (Test taker #4)
∙ Linking with prior knowledge (LPK)
This request of amending a flight plan was very similar to IFR plan change report, so I wanted to let the pilot know that I am ready to copy the changed flight plan. (Test taker #2)
∙ Monitoring <MON>
I moved the mouse and screen in Second Life as I wanted to see the moving aircraft clearer. (Test taker #4)
∙ Inferencing (INF)
I was confused, but tried to guess. (Test taker #8)
∙ Self-evaluation <SE>
On the runway the pilot just said “ready for take-off”. I realized that I should have approved take-off in advance. (Test taker #7)
Findings from the stimulated recall data analysis reveal a variety of cognitive and metacognitive strategies that were adopted during the test task performance (see Table 5). Applying rules accounts for the largest portion of the total number of strategy uses (44.09%), followed by Linking with prior knowledge (16.13%). Together, these two strategies constitute more than 60% of the entire reported strategy use, which corresponds to the findings from the actual aviation English communication contexts in the current study. As aviation English consists of standard phraseology, a simplified, situational, and highly structured version of English used in a very restricted context, test takers may need to utilize these linguistic rules, as well as learned situational rules, in order to perform the aviation English communication tasks. This contextual feature may account for the high frequency of use of the two strategies. Additionally, test takers who had been involved with ATC for longer than the others tended to apply more experience and/or knowledge to the problem-solving process.
Table 5 Identified strategy use during task performance in VITAEA
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180413083803334-0229:S0958344017000362:S0958344017000362_tab5.gif?pub-status=live)
In addition to test takers’ cognitive and metacognitive strategy use, their communication strategies, adopted during their test task performance, were coded by two coders (the researcher and an applied linguist) and analyzed. Table 6 displays identified communication strategies adopted by the test takers during the VITAEA task performance.
Table 6 Identified communication strategy use during task performance in VITAEA
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180413083803334-0229:S0958344017000362:S0958344017000362_tab6.gif?pub-status=live)
As seen in Table 6, a total of 13 types of communication strategies were identified. The four most frequently used strategies are Direct appeal for help (i.e. by using test takers’ first language during the target task performance) (18.46%), Use of filler (e.g. Uhm …) (15.38%), Omission (i.e. no response) (13.85%), and Asking for repetition (e.g. Can you say it again; one more time; say again) (12.31%). These four strategies account for 60% of the total communication strategy use by the test takers.
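The combined share reported above can be checked with a short calculation. The sketch below uses only the four percentages given in the text; the helper `share` shows how such a percentage would be derived from a raw count, and its name and arguments are hypothetical, since the underlying counts are not stated here.

```python
def share(count, total):
    """Return a strategy's share of all coded uses, as a percentage."""
    return round(100.0 * count / total, 2)


# Percentages as reported in the text for the four most frequent strategies.
reported_shares = {
    "Direct appeal for help": 18.46,
    "Use of filler": 15.38,
    "Omission": 13.85,
    "Asking for repetition": 12.31,
}

combined = round(sum(reported_shares.values()), 2)
print(combined)
```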
As Douglas (2001) highlighted, communication strategies were indeed employed even when there were no obvious difficulties during the test task performance. Collected data also show that test takers often repeat the pilot’s transmission (i.e. read back) to acknowledge that the delivered information was successfully received. Although the limited number of participants and the scarcity of data call for caution, the most experienced air traffic controllers tended not to apply communication strategies as frequently as novice controllers. Novice controllers tended to utilize more communication strategies because they encountered more communication breakdowns or comprehension issues than the experienced controllers. To achieve communication goals in each air traffic control procedure and to compensate for limited communication, novice air traffic controllers in the current study adopted various communication strategies throughout the task performance.
With regard to the second research question, which explores the relationship between test takers’ strategy uses and their VITAEA scores, the sample size was too small to compute correlations. However, positive relationships between the number of different strategies used and test takers’ performance (test scores) were identified. The results are presented in Table 7 and Figure 3.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180413083803334-0229:S0958344017000362:S0958344017000362_fig3g.jpeg?pub-status=live)
Figure 3 Relationships between VITAEA scores and strategy uses
Table 7 Summary of the nine test takers’ strategy uses and test scores
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180413083803334-0229:S0958344017000362:S0958344017000362_tab7.gif?pub-status=live)
Note. Strategy total = total number of cognitive/metacognitive strategy uses; Strategy type = cognitive/metacognitive strategy types; Com-Strategy = communication strategy use; TaskCS = task-centered scores of VITAEA; LangCS = language-centered scores of VITAEA.
Despite the limitations of the small sample size and the units of the variables (test scores and frequency of strategy use/type), Figure 3 suggests an overall positive relationship between test scores and strategy use. In particular, Figure 3(a) and (b) suggest a positive relationship between the total number of strategies used and the two scores on the VITAEA.
This may imply that test takers with higher test scores on the VITAEA tended to use a greater number of cognitive and metacognitive strategies during the test task performance.
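The study judged this relationship from the plots because the sample was too small for correlations. For a replication with a larger sample, one conventional rank-based option is Spearman's rho; the sketch below is only an illustration of that technique, not the author's analysis. The function names and the five paired values are invented placeholders and are not data from Table 7.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical placeholder values, NOT taken from the study.
strategy_totals = [5, 8, 8, 12, 15]
task_scores = [2.0, 3.0, 2.5, 4.0, 4.5]
print(round(spearman_rho(strategy_totals, task_scores), 3))
```

A rank-based coefficient is used here because it makes no assumption about the units or distribution of the two variables, which is the concern the text raises about comparing test scores with strategy counts.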
More significantly, as is apparent in Figure 3(a), the positive relationship between task-centered VITAEA scores and strategy use may also indicate that test takers who were given a high score on the ATC task utilized a greater number of cognitive and metacognitive strategies during the task performance. This may imply that the use of interactive assessment tasks in a virtual world could facilitate situated cognition and promote test takers’ use of various cognitive and metacognitive strategies during the task performance. Furthermore, the total number of strategy types test takers utilized was also positively related to their task-centered scores on the VITAEA. This could suggest that language performance in an authentic virtual environment promotes the use of strategic competence not only in the total frequency of strategy use, but also in the variety of strategy types.
Overall, the positive relationship between the VITAEA test scores and test takers’ strategy use provides some backing for the assumption that the strategies required by the tasks are construct relevant. Moreover, the results are in accordance with the interactionalist construct definition (Chapelle, 1998), in which strategic competence directs and assesses world knowledge and language knowledge during task performance. This critical finding of a positive relationship between cognitive/metacognitive strategy use and task-centered scores on the VITAEA provides some empirical evidence that corresponds to Douglas’s emphasis on the test developers’ responsibility for “providing sufficient contextual information to enable the test takers to establish the context, to know where they are, and engage an appropriate discourse domain” (2000: 76) in the authentic virtual environment by actively utilizing strategic competence.
In the case of cognitive and metacognitive strategy use, data were extracted through a retrospective recall conducted right after the test to identify test takers’ cognitive processes. By contrast, communication strategy use data were gathered from test takers’ actual test task performance: the test takers’ oral responses during the test were coded and analyzed to examine the number and types of communication strategies they adopted while trying to accomplish the given tasks. A closer look at the coded communication strategy data reveals that, in many cases, test takers utilized a variety of communication strategies (e.g. direct appeal for help; asking for repetition) when they faced difficulties during the task performance; yet there were also some cases in which test takers adopted communication strategies when there had been no obvious difficulties in task completion.
5 Conclusions
The design and implementation of VITAEA for this study was motivated by the need to develop an aviation English test in a virtual world that was well grounded in principles from second language acquisition and language testing. The VITAEA incorporates a sequence of authentic aviation English tasks in Second Life to boost interactive completion of target tasks and is designed to promote test takers’ strategy uses. The findings reveal that interactive tasks in a virtual world facilitated the frequent use of a variety of strategy types to accomplish target tasks. Furthermore, a positive relationship between participants’ test scores and the number of adopted strategies suggests that strategies engaged in during task completion in Second Life are construct relevant and in accordance with theoretical expectations (Chapelle, 1998, 2005; Douglas, 2000).
The current study has several limitations. The first limitation concerns the small sample size used for statistical analysis. Due to limited access to the entire population of military air traffic controllers, only 10 participants’ data could be used for a non-parametric statistical analysis. The results of the study therefore cannot be generalized at this time, and it would have been beneficial to include more participants in the data collection by conducting various surveys, interviews, and tests. Second, despite the researcher’s efforts to create an authentic environment, there is much room for improvement in establishing life-like simulations in the TLU situations and test tasks.
Findings from the current study still hold implications for the field of language assessment in language for specific purposes and computer-assisted language learning. Despite the great potential of virtual environments, the field of instructed second language acquisition has mainly constructed an experimental arena for language learning or immersion based on a Vygotskian approach (Schwienhorst, 2002); so why could these environments not be used for language assessment? The researcher believes that the use of virtual environments could dramatically improve language assessment, especially in language for specific purposes assessment, by allowing the observation of test takers’ use of situated cognition (cognitive and metacognitive strategies) in addition to the collection of their verbal responses. Although the current study could not distinguish what specific portion of test takers’ strategy use accounts for their task accomplishment, a positive relationship between test takers’ use of cognitive/metacognitive strategies and their task accomplishment scores was identified. This meaningful finding could not have been obtained if the current study had utilized a paper-based test or a computer-based test with only motionless pictures and hypertext on the screen. In this regard, an immersive interface and simulated real TLU situations in virtual environments could provide test takers with more authentic opportunities to accomplish the target tasks. For future research, the researcher hopes that the use of virtual environments for language for specific purposes assessment will motivate and encourage task-based researchers, language testers, and computer-assisted language learning practitioners who are especially interested in virtual worlds to collaborate in designing and developing more authentic language learning and testing arenas for language learners.
Acknowledgements
I wish to thank the anonymous reviewers and editors for their insightful and constructive comments. My sincere appreciation is extended to the recruited military air traffic controllers for expressing their enthusiasm for and willingness to participate in this research.
About the author
Moonyoung Park, PhD, is an assistant professor in the Department of Curriculum and Instruction at the Chinese University of Hong Kong. His areas of specialization are aviation English assessment, language teacher education, computer-assisted language learning and assessment, and curriculum and instructional design.