1 Introduction
Idioms, as a class of lexica, have long been regarded as non-compositional, meaning that the literal parts of an idiom do not cumulatively contribute to what it means figuratively. Stated simply, idioms cannot be understood solely via a literal interpretation. As some scholars (Nunberg et al. 1994; Moon 1998) have rightly pointed out, however, idioms are only non-compositional in the strictest literal sense, and such a view of non-compositionality does not take into consideration the ability of native speakers to decompose (see Gibbs & Nayak 1989) and metaphorically analyze the literal parts of an idiom in order to make sense of its figurative meaning. Though the terminology used in the literature to describe the degree to which an idiom lends itself to metaphorical analysis varies (e.g. decomposable/transparent; abnormally decomposable/semi-transparent; unanalyzable/opaque), I will refer to this as levels of semantic transparency (high, mid and low), as I believe this terminology better reflects the scalable nature of transparency and is therefore better suited to what is under investigation here.
As it relates specifically to metaphorical idioms, semantic transparency could be defined as the degree to which the literal parts of an idiom are collectively perceived to contribute to that idiom's figurative meaning. Thus, when a native speaker analyzes an idiom such as skating on thin ice, the relative transparency of its figurative meaning, to do something risky, is a positive function of how well the literal parts taken together metaphorically or metonymically elucidate said meaning. But what factors underlie how semantic transparency is perceived by native speakers? The role of conceptual metaphor in idiom comprehension has largely driven the discussion, as evidenced by the plethora of studies (e.g. Lakoff & Kovecses 1987; Gibbs et al. 1997; Boers 2000) examining commonly cited conceptual metaphors and their linguistic instantiations, such as anger is heat and anger is heated fluid in a container to account for how metaphorical idioms such as he was breathing fire and he flipped his lid are understood. However, the strong version of Conceptual Metaphor Theory (CMT) (Lakoff & Johnson 1980) based on embodied experience has not been without its fair share of critics (Murphy 1997; Rakova 2002; McGlone 2007; Howe 2008; Steen 2011), and, importantly, some studies (Keysar & Bly 1995, 1999) have found that semantic transparency intuitions are more arbitrary than cognitive linguists would perhaps be inclined to believe.
1.1 Semantic transparency intuitions and arbitrariness
Keysar & Bly (1995, 1999) contend that native speakers impose their own arbitrary interpretations based on the stipulated meaning of idioms in hindsight. In one study (Keysar & Bly 1995), the researchers selected a number of idioms that were no longer used in modern English to ensure no prior knowledge, and taught either the actual figurative meaning or its conceptual opposite to the native speakers. Following this, the native speakers tried to make sense of the relationship between the literal parts and the figurative meaning they received. The results showed that the native speakers tended to ascribe a meaning that made sense to them based on which type of figurative meaning they were given (actual or conceptual opposite). In other words, irrespective of whether they were exposed to the true figurative meaning, or its conceptual opposite, they tended to derive their own interpretation that led them to perceive the expression as being more semantically transparent.
Keysar & Bly (1995) argue that these findings suggest that even for very semantically transparent idioms, individuals unaware of the true stipulated meaning could produce a range of interpretations, suggesting an arbitrariness that CMT does not address. While the authors did not discount the claims of CMT, they emphasized the need to consider the fact that prior knowledge of the stipulated meaning of idioms could influence the perceived semantic transparency, and it should therefore be part of any discussion relating to semantic transparency intuitions.
These findings do, in fact, suggest that knowing the stipulated meaning of idioms a priori can influence how they are decomposed and analyzed in terms of semantic transparency. Yet there are some caveats about the design and claims that need to be pointed out so that the overall gravity of these findings can be weighed and assessed. The authors caution that even ‘the most transparent idioms’ (1995: 103) are susceptible to the effect of prior knowledge. However, as Skoufaki (2009: 32) noted, Keysar & Bly's (1995) study employed ‘highly biasing contexts’ and many of the idioms used in their study were later revealed to be perceived as low transparency (e.g. the goose hangs high [fig. things are looking good], to lay out in lavender [fig. chastise harshly and in no uncertain terms]). It therefore does not necessarily follow that the same would be true of idioms that have been categorized as transparent in the literature, such as skating on thin ice (Moon 1998) and add fuel to the fire (Fernando & Flavell 1981). In light of this, and given the finding that semantically opaque idioms tend to be rare (Vega Moreno 2005), it is unclear whether prior knowledge of the stipulated meaning of idioms impacts the perceived semantic transparency for all or even most metaphorical idioms. Semantic transparency intuitions, however, are subjective (Moon 1998), and can therefore be difficult to measure. This is perhaps one of the reasons why this issue over the nature of semantic transparency remains unsettled.
1.2 Motivated semantic transparency intuitions of idioms
In addition to the methodological caveats mentioned above, evidence has been reported that shows that the figurative meaning of unknown, semantically transparent idioms can be correctly predicted (Irujo 1993; Kovecses & Szabo 1996; Boers & Demecheleer 2001; Bortfeld 2002). This further implies that the degree of arbitrariness Keysar & Bly (1995) found for semantic transparency intuitions might be limited to very opaque idioms. Better understanding of the extent to which Keysar & Bly's findings apply to idioms of greater and lesser degrees of semantic transparency, however, will aid in informing the debate on the nature of semantic transparency intuitions. Although there is little doubt that conceptual metaphors play a role in some idiom comprehension and by extension semantic transparency intuitions, not all metaphors are conceptual (Steen 2008; see also Steen 2014), and there are other, less examined accounts of how metaphor can be interpreted and therefore contribute to understanding the figurative meaning of idioms. The studies presented here focus on one such account, which is how native speakers can draw upon their general world knowledge (termed here as encyclopedic world knowledge) to pragmatically infer how metaphorical and metonymical elements in idioms elucidate their figurative meanings.
1.3 Pragmatic inferencing via encyclopedic world knowledge
Cognitive Linguistics offers a tenable theoretical framework to account for how encyclopedic world knowledge and pragmatic inferencing can influence how metaphors and metonymies in idioms are interpreted. Essentially, encyclopedic world knowledge, as defined within the context of the two studies reported here, is the sum of explicit general knowledge a speaker has about the world around him or her. The process by which this knowledge is consciously acted upon in an effort to interpret the meaning of a figurative expression is pragmatic inferencing (see Vega Moreno 2005). Viewed through the lens of the cognitivist, encyclopedic world knowledge can be informed and bolstered by repeated schematicity judgments between related superordinate and subordinate categories (see Langacker 1987). What this means is that speakers, as a fundamental cognitive exercise, continually make mental comparisons between categories that range in specificity. Similarities between categories over time become entrenched and thus the activation of one category can likewise activate corresponding superordinate or subordinate categories. As it relates to the studies presented here, these categories of knowledge can be drawn upon to aid speakers when they are interpreting literal and figurative relationships in language.
To better understand how this might operate in practice, consider the idiom give someone the green light (fig. give someone permission to do something). A speaker's encyclopedic world knowledge of traffic regulations and associated subordinate categories can motivate how he or she interprets the figurative meaning of said idiom in the following way:
A green light, which universally indicates a kind of permission to drive through or walk across an intersection or crosswalk, is an elaboration of traffic light rules, which in turn is an elaboration of traffic regulations. Green light considered in isolation may carry very little semantic value. Yet when analyzed in the domain of transportation and in conjunction with its superordinate categories, it could be that speakers can pragmatically infer that green light is a kind of metonymy for permission, thereby potentially shaping how they perceive the relationship between the literal words and the figurative meaning (further examples of encyclopedic world knowledge and its possible relationship with CMT are explored in sections 3.3 and 3.4).
In spite of the potential relevance pragmatic inferencing and encyclopedic world knowledge have for shedding light on how metaphor and metonymy are interpreted and understood in idioms, there is a noticeable lack of experimental research investigating this as it relates to semantic transparency (for an overview of pragmatic inferencing and figurative language, see Vega Moreno 2007). It is not inconceivable that pragmatic inferencing via encyclopedic world knowledge plays a key role, perhaps even more so than do conceptual metaphors, in contributing to elucidating the figurative meaning of idioms, both before and after the fact. Stated differently, semantic transparency could be connected to the degree to which encyclopedic world knowledge of the literal constituent parts informs the overall interpretation of an idiom's figurative meaning.
1.4 Research questions
In sum, there is a good deal of controversy surrounding semantic transparency intuitions. First and foremost, it is important to determine the extent to which semantic transparency intuitions are consistent among raters, both in terms of rating data and qualitative support. A high degree of quantitative and qualitative convergence among native-speaker semantic transparency intuitions would further cast doubt on the strong claims of arbitrariness made by Keysar & Bly (1995, 1999). Furthermore, there remains the issue of how varying degrees of semantic transparency affect rater interpretations. It could be the case that idioms that elicit higher semantic transparency ratings also elicit more similar interpretations among the raters, since higher-transparency idioms would presumably have a narrower scope for interpretation. These two considerations underlie the motivation for the first research question:
To what extent do raters agree in their semantic transparency ratings?
Furthermore, aside from conceptual metaphors, it is important to determine in what ways and to what extent pragmatic inferencing via encyclopedic world knowledge can account for how native speakers interpret metaphors and metonymies in metaphorical idioms. I hypothesize that metaphorical idioms that are perceived to be of higher transparency should provide comparatively more instantiations motivated by encyclopedic world knowledge, as such knowledge will be readily available to raters and thus aid in elucidating the relationship between the literal parts and figurative meaning. This prompted the second research question:
What characteristics of idioms lead raters to rate them as being of higher or lower transparency?
2 Study 1 method
2.1 Participants
Fifteen (13 male and 2 female) university-level native-speaker English teachers participated as raters in the study. The raters had lived in Japan from 4 to 24 years, with 11.5 years being the median. All of them were experienced teachers of English.
2.2 Materials
The rater questionnaire comprised four sections: participant background, literature background and rating instructions, qualitative rating of idioms, and, importantly, quantitative rating of idioms. The background and rating instructions provided basic information related to semantic transparency and idioms. In the qualitative section, a total of 30 idioms were spread out over three different versions of the questionnaire; participants rated each idiom and then wrote out a justification for their rating. In the last section, the remainder of the 222 idioms were listed for rating (see appendix I for the full list of rated idioms).
2.3 Procedures
One of the transparency rater questionnaire's purposes was to create a semantic transparency scale to which metaphorical idioms could correspond. As semantic transparency intuitions may vary from individual to individual, it was important to elicit ratings from a sufficient number of raters for a sizable number of metaphorical idioms to have a clearer sense of the scale's precision and reliability. The semantic transparency ratings were based on a 6-point Likert scale, with 1 being highly opaque and 6 being highly transparent. Descriptors for each number were written in parallel structure across the scale with 3–1 being progressively more opaque and 4–6 being progressively more transparent (see table 1).
2.3.1 Qualitative rating of idioms
This section of the questionnaire served three purposes: (i) to act as an activation and priming task by having raters explain and justify qualitatively a small subset of idioms before proceeding to the quantitative section; (ii) to serve as a means of identifying raters who were not following the procedures laid out in the instructions (which would also call into question their ratings on the subsequent, more important quantitative section); and (iii) to collect qualitative data on a mix of higher- and lower-transparency idioms to be later analyzed in order to aid in answering the research questions. These 30 idioms that comprised the qualitative section were drawn directly from the original 222 idioms. As my intention was to have a balance of transparency levels, I selected 10 high-, 10 mid- and 10 low-transparency-level idioms based on ratings from a pilot study I had previously carried out. Within each transparency level, the majority of the idioms were selected at random, though I intentionally selected the highest- (skating on thin ice) and lowest-rated (go cold turkey) idioms, as well as two idioms with literary references (an Achilles heel and have your pound of flesh).
2.3.2 Quantitative rating of idioms
In this section of the questionnaire, raters, utilizing the same 6-point Likert scale shown in table 1, rated the complete set of 222 English idioms. If an idiom was known, the rater would circle the number corresponding to the ratings in the semantic transparency scale. Due to the large number of items that the raters were asked to rate, there was no qualitative aspect to this section. In fact, part of the reason for including the previous section was to encourage raters, through written output, to think about and consider the transparency more carefully, so as to prime them for assessing the longer complete list in this quantitative section.
3 Study 1 results and discussion
Analysis of the semantic transparency ratings revealed a high level of agreement among the 15 native-speaker raters. Cronbach's alpha, used as a measure of inter-rater reliability, was .914, which indicates an ‘excellent’ degree of internal consistency (George & Mallery 2003) among the raters. A strong degree of internal consistency is not only critical to support the remainder of this study, but it also provides evidence in line with a Cognitive Linguistics view of semantic transparency. That is, semantic transparency of idioms is not arbitrary to the degree suggested by previous studies (Keysar & Bly 1995, 1999), but rather is motivated by factors such as underlying conceptual metaphors and widely understood encyclopedic world knowledge (see sections 3.3 and 3.4 for additional qualitative support for these claims). Insofar as the rating questionnaire is valid as an instrument to measure semantic transparency, the high inter-rater reliability corroborates to a certain extent this non-arbitrary relationship between semantic transparency and some idioms.
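To make the reliability statistic concrete, the following is a minimal sketch of how Cronbach's alpha could be computed from a ratings matrix in which each row is an idiom and each column is a rater (so that raters play the role of "items"). The function name and the toy data are hypothetical, and the sketch assumes a complete matrix with no missing ratings and the use of sample variances; the study does not report how unrated idioms were handled or which software was used.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a complete ratings matrix.

    ratings: list of rows, one row per idiom, one value per rater.
    Assumption: no missing values; sample variances (n - 1) are used.
    """
    k = len(ratings[0])  # number of raters
    # Variance of each rater's column of ratings across idioms.
    rater_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    # Variance of the summed ratings per idiom.
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(rater_vars) / total_var)


# Hypothetical toy data: three idioms rated by two raters on the 1-6 scale.
toy = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy)
```

Alpha approaches 1 as the raters' columns covary more strongly, which is why a value such as the .914 reported above is read as high agreement.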
For this study, rater consistency, as a scalable phenomenon, can provide tentative insights into the possible scope and delimitations of interpreting idioms via their literal constituent parts. If, for instance, the ratings for one particular idiom show a much smaller degree of variance when compared to a different idiom, this could indicate a more constrained interpretation for that idiom (be it transparent or opaque) relative to the other. Conversely, an idiom that has been rated more erratically could imply a stronger degree of arbitrariness and a larger scope for interpretation. Naturally, other confounding variables can unfortunately muddle the data, and for this reason, it is essential to compare ratings of individual idioms and also draw from qualitative data gleaned from the priming task in the rating questionnaire in order to have a clearer understanding and more informed discussion about what can account for differences in ratings.
3.1 Rater consistency for individual idioms
Closer inspection shows that the standard deviations of the semantic transparency ratings of individual idioms vary substantially, from 0.3 for go cold turkey to 1.9 for an Achilles heel. This suggests that despite overall high agreement among raters, some idioms elicited far more erratic ratings than others. By grouping idioms according to how consistently they were rated, it might be possible to tease out some group-wide features or commonalities that contributed to the greater or lesser degree of agreement. In order to do this, I will first examine idioms from the extremes of both ends of the standard deviation range in tables 2 and 3.
The first notable observation is that only high- (4.1–6.0) and low- (1.0–2.9) semantic-transparency idioms are instantiated in table 2, while table 3 is almost entirely composed of mid- (3.0–4.0) transparency idioms. This is unsurprising, as average ratings that fall at either extreme of the semantic transparency scale must necessarily have relatively high agreement among raters. This is because the stronger the disagreement among raters, the more likely it is that there will be ratings from both ends of the scale, yielding an average that falls closer to a middle rating. This suggests that, as a whole, idioms falling into the high- and low-transparency groups enjoy stronger internal consistency than do their mid-transparency counterparts. Indeed, the average standard deviations according to transparency level in table 4 substantiate this claim. A one-way ANOVA also showed that the standard deviations differed significantly across the levels of semantic transparency, F(2, 219) = 28.065, p < .001.
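As an illustration of this step, the sketch below computes a mean and standard deviation per idiom, assigns a transparency level using the bands given in this section (low 1.0–2.9, mid 3.0–4.0, high 4.1–6.0), and orders idioms by how consistently they were rated. The ratings shown are invented for demonstration and are not the study's data; the function names are hypothetical, and the use of the sample standard deviation and of means rounded to one decimal place before banding are assumptions.

```python
from statistics import mean, stdev


def transparency_level(avg):
    """Map a mean rating to a level; the mean is rounded to one decimal first."""
    r = round(avg, 1)
    if r < 3.0:
        return "low"
    if r <= 4.0:
        return "mid"
    return "high"


def summarize(ratings_by_idiom):
    """Return {idiom: (mean, sample SD, level)} for each idiom's ratings."""
    summary = {}
    for idiom, ratings in ratings_by_idiom.items():
        avg = mean(ratings)
        summary[idiom] = (avg, stdev(ratings), transparency_level(avg))
    return summary


# Invented ratings on the 6-point scale (NOT the study's data).
demo = {
    "idiom A": [6, 6, 6, 5],  # raters largely agree
    "idiom B": [1, 6, 2, 5],  # raters disagree
}
summary = summarize(demo)
# Idioms ordered from most to least consistently rated.
by_consistency = sorted(summary, key=lambda idiom: summary[idiom][1])
```

Sorting on the standard deviation in this way is one simple route to the two extreme groups of idioms that are compared next.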
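Both parts of this argument can be sketched in a few lines. The first function expresses the ceiling that a bounded scale places on disagreement: for ratings confined to the interval from 1 to 6, the population standard deviation cannot exceed the square root of (mean − 1)(6 − mean), a general result known as the Bhatia–Davis inequality, so means near either end of the scale leave little room for spread. The second function computes a one-way ANOVA F statistic from groups of values, such as per-idiom standard deviations grouped by transparency level. The function names and demonstration values are hypothetical and are not the study's data or analysis code.

```python
from math import sqrt
from statistics import mean


def max_sd(avg, lo=1.0, hi=6.0):
    """Largest possible population SD for ratings in [lo, hi] with mean avg.

    Note: this bounds the population SD; a sample SD over n raters can be
    larger by a factor of sqrt(n / (n - 1)).
    """
    return sqrt((avg - lo) * (hi - avg))


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical demonstration values for three groups.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

With three transparency levels and 222 idioms, the same degrees-of-freedom arithmetic gives 3 − 1 = 2 and 222 − 3 = 219, which matches the F(2, 219) reported above.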
3.2 Accounting for disagreement in ratings
Another possible explanation to account for greater or lesser degrees of internal consistency found in the data could be linked to the raters’ background knowledge. In the rater instructions, raters were told that they could draw from their background knowledge while rating. Given that background knowledge varies among speakers, it is unsurprising that idioms with highly specific historical or cultural references elicited greatly different ratings, such as an Achilles heel (std. 1.9). As I was aware from the outset that this was a possibility, I included this idiom in the priming task in one form of the rater questionnaire. Table 5 gives the four qualitative responses for an Achilles heel.
As can be seen, there seems to be a consensus among these four raters in terms of their understanding that this idiom is historically based. Yet, as Rater 4 acknowledges, lack of this historical knowledge would render the literal–figurative relationship completely opaque, which could have been the source of disagreement (it elicited the most disagreement of all 222 idioms in the study).
Similarly, background knowledge of somewhat common etymologies possibly impacted the ratings as well because raters who were familiar with an idiom's etymology would have more context to discern a relationship between the literal parts and the figurative meaning. The idiom throw in the towel (std. 1.8) is a good example of this because an informal survey of rater participants revealed that many but not all of them were aware of the connection between this idiom and the source domain of boxing. In boxing, when a boxer is badly losing a match, his or her coach can signal to stop the fight by throwing a towel into the boxing ring. Since the meaning of the idiom is to admit failure or defeat, knowing the etymological origin would clearly elucidate the meaning of the idiom via its literal constituent parts. Given that the etymology for this particular idiom seemed to be familiar to many but not all of the raters, it is unsurprising that the ratings for this idiom were not consistent.
Interestingly, qualitative data from the priming task were strikingly similar even for idioms that tended to elicit a higher standard deviation in the rating. Justification for the rating of the mid-transparency idiom, to ruffle feathers, illustrates this well in table 6.
Ruffle feathers had a semantic transparency rating of 3.2 and a standard deviation of 1.2, which was close to the average for mid-transparency idioms (1.27). In spite of the notable standard deviation, however, all four raters who encountered this idiom on the priming task explained their semantic transparency rating in a similar fashion by citing birds and employing comparable descriptive language such as dissatisfied, disturbed and disgruntled in their explanations. This qualitative convergence shown for ruffle feathers should highlight the fact that some measure of rater disagreement does not necessarily entail vastly different interpretations. Importantly, this kind of qualitative convergence is evidence for the way in which raters, through their encyclopedic world knowledge of birds, can pragmatically infer in a very similar fashion the relationship between the literal parts and figurative meaning.
3.3 Further evidence of qualitative convergence
In the previous sections, the ratings were shown to be highly consistent overall, and a few representative outliers were examined to account for those ratings that were not. As with the case of ruffle feathers, there were other examples of qualitative convergence among those 30 idioms in the initial priming task, and it is important to draw from these in order to observe how encyclopedic world knowledge could contribute to figurative interpretations. To this end, I will summarize points of convergence for the idioms come out of your shell, go bananas and skating on thin ice.
For come out of your shell (fig. become less shy in social situations; trans. 4.5), all four raters related shell to the outer protective covering of a timid animal. These are clear instantiations of encyclopedic world knowledge connected to animal behavior and physiology. Animal behavior was similarly referenced for go bananas (fig. become excited; trans. 2.0), in which three of the four raters related their interpretation to the association between monkeys and bananas. One important distinction between these two idioms, however, is their transparency. Though most of the raters arrived at a similar interpretation for go bananas, the perceived low saliency of the literal–figurative relationship could have contributed to its perceived low semantic transparency. As one rater observed, ‘The connection between bananas and the meaning of this idiom is very tenuous.’ This shows that encyclopedic world knowledge in and of itself does not always lead to a perceived high transparency and it is constrained by other factors.
With regard to skating on thin ice (fig. do something risky), three of the four raters referenced the danger associated with this activity. Presumably, all the raters are familiar with the basic properties of ice and what occurs when too much pressure is applied across a thin layer of it, as well as the subsequent peril of plunging into the icy cold waters beneath it. Owing to an increasingly globalized world, it is also not unreasonable to presume that most speakers have at least heard of ice skating even if they do not live in a cold climate. Thus, though this idiom is rated as high transparency, it involves the interaction of encyclopedic world knowledge from a variety of sources, such as the properties of ice, the danger of hypothermia and the sport of ice skating.
3.4 The relationship between encyclopedic world knowledge and conceptual metaphors
The three idioms described above illustrate how encyclopedic world knowledge via pragmatic inferencing can shape how speakers interpret figurative meaning in idioms. Yet, what relationship does encyclopedic world knowledge have with conceptual metaphors (if any), and how might such a relationship influence semantic transparency intuitions? Though conceptual metaphors were not the focus of the studies reported here, it is worth briefly mentioning the rater data for be under the weather (fig. feel ill or in low spirits), as it exemplifies how both conceptual metaphors and encyclopedic world knowledge could work in conjunction to shape speakers’ interpretations. For instance, one rater referred to the effect of poor weather on one's mood or health, which would appear to be encyclopedic world knowledge at work. Conversely, another rater connected the word under with feeling down or depressed, which could be an instantiation of the good is up/bad is down conceptual metaphors. Though anecdotal, this illustrates that the role of encyclopedic world knowledge and conceptual metaphors could be complementary, and need not be viewed as competing models.
4 Study 1 conclusion
Based on evidence collected from both the semantic transparency ratings and priming task, it appears that on the whole the native-speaker raters in this study rate consistently among themselves and often draw upon similar encyclopedic world knowledge to explain their understanding of many idioms. Furthermore, the qualitative data tentatively suggest that conceptual metaphors could function in conjunction with encyclopedic world knowledge in shaping how some idioms are interpreted some of the time.
Therefore, the answer to the first research question would be that native-speaker raters tend to agree substantially in their ratings of semantic transparency of idioms in this study. These findings are consistent with the cognitive linguistic view that the relationship between the literal constituent parts and figurative meaning of many idioms is not completely arbitrary and as such can be exploited for both pedagogical and research-oriented purposes. Furthermore, these results refute the strong degree of arbitrariness of semantic transparency intuitions of idioms suggested by Keysar & Bly (1995, 1999).
5 Study 2 method
Based on the initial findings in Study 1, I hypothesized that idioms that can be understood via encyclopedic world knowledge would tend to elicit higher semantic transparency ratings. This is because the motivation of such idioms would presumably be more accessible to raters irrespective of background or country of origin. For instance, common encyclopedic world knowledge about fire and how it interacts with fuel might facilitate understanding, through pragmatic inferencing, of add fuel to the fire (fig. make a bad situation worse). On the other hand, idioms with more culturally restricted metaphors and metonymies could be more likely perceived as opaque. As an example, consider red tape (fig. official rules and documents that seem unnecessary and cause delay), which alludes to the former British practice of tying together documents with a red ribbon. If such an origin is unknown, then it is potentially more difficult to understand the relationship between the literal parts and figurative meaning.
5.1 Participants
Study 2 does not make use of participants per se. However, the semantic transparency ratings obtained in Study 1 are important for informing the analysis and subsequent discussion for Study 2.
5.2 Materials
Much of the analysis in Study 2 relied upon the etymological notes found in a number of idiom dictionaries. In table 7, I have included all of the idiom dictionaries that I initially consulted in compiling the data on the etymologies of the idioms concerned. Further citation details for each can be found at the end of the reference section.
5.3 Procedures
The hypothesis underlying this research question presupposes that the way in which raters perceive an idiom's semantic transparency could be related, in part, to how an idiom is motivated. Although linguists offer definitions of motivation that vary slightly (see Radden & Panther 2004), I am defining motivation in the context of this study as the non-arbitrary elements that exist between an idiom's literal words and figurative meaning, which is adapted from Hiraga's (1994) notion of motivation as the ‘non-arbitrary relationship between form and meaning’ (p. 8). Although all idioms are assumed to be motivated, motivation can occur through a number of channels: conceptual, encyclopedic and cultural to name a few. As culturally motivated etymologies and origins of many idioms are often restricted, obscure, forgotten, uncertain or unknown, it seems reasonable to postulate that these idioms will more often than not be perceived as lower transparency. Conversely, conceptually or encyclopedically motivated idioms might be perceived, on the whole, as being more semantically transparent, as their motivation derives from cross-cultural encyclopedic world knowledge, such as in the example of give someone the green light illustrated in section 1.3. In Study 2, I examine how pragmatic inferencing via such knowledge can contribute to elucidating the figurative meaning of idioms in a non-arbitrary way, thus resulting in higher semantic transparency intuitions.
The first step I took in testing this hypothesis was to analyze the idioms falling into the highest- and lowest-perceived semantic transparency ratings. By juxtaposing these two groups, patterns and commonalities pertaining to each grouping (see tables 8 and 9 for details) could be explored. Furthermore, I could consider how different motivational sources might have contributed to the idioms’ higher or lower semantic transparency rating in order to get a sense of the tendencies of each grouping. Since it was hypothesized that the higher-transparency idioms would be more likely to be conceptually or encyclopedically motivated than lower-transparency idioms, I could compare idioms from both extremes of the scales to determine if there was any evidence for this before proceeding to a more systematic and in-depth analysis.
Following the initial approach mentioned above, I furthered the investigation more objectively by referring to the etymological notes made available by the idiom dictionaries listed in table 7. In such a way, I could curtail any of my own biases by referring to the stipulated etymologies written by lexicographers. By surveying a sufficient number of well-known idiom dictionaries, I could better determine the frequency of etymological notes and associated origins for the idioms of interest. It is important to note that half of the idiom dictionaries surveyed had no or very few etymological notes, and for that reason they were not helpful in addressing the research question for Study 2 and were therefore discarded from the analysis.
The etymologies gathered and compiled from the OAI, HDI, IORI and CID allowed me to uncover, both quantitatively and qualitatively, evidence of the link between motivational source and perceived semantic transparency. In the results and discussion that follow, the findings from each stage of the analysis will be presented. This includes (i) the initial comparison of the highest- and lowest-semantic-transparency idioms as determined by the raters in Study 1 and (ii) the quantitative differences between etymological frequencies and discrepancies found among the four dictionaries. In the next section, I address these results and discuss how they aid in answering the research question for Study 2:
What characteristics of idioms lead raters to rate them as being of higher or lower transparency?
6 Study 2 results and discussion
To answer this research question, it is first necessary to compare the idioms that fell at each extreme of the semantic transparency scale, that is, the idioms that best instantiate high and low transparency. In this way, any emergent patterns across each group can be identified and analyzed. Tables 8 and 9 present the idioms falling in the upper and lower 10 percent of the semantic transparency scale.
6.1 Observations of quantitative data
The data generated by comparing the etymological notes for the idioms in tables 8 and 9 revealed a number of relevant considerations that should be addressed before dissecting any individual idioms in detail. Firstly, it should be noted that a substantial number of idioms were missing from one or more of the four dictionaries used. Though this was expected, since these dictionaries varied greatly in the total number of entries included (from about 600 in IORI to over 10,000 in HDI), it is important to establish that there was no great difference in the total number of unlisted entries between transparency groups. Otherwise, it could be the case that one transparency grouping included a disproportionate number of idioms deemed either too infrequent or too unimportant by the lexicographers who compiled these dictionaries. Such a case might introduce a further unwanted variable that would complicate the analysis. A tally of each group, however, showed that the cumulative number of unlisted idiom entries was 26 out of a possible 88 (22 idioms × 4 dictionaries), or 29.5 percent, for the high-transparency idioms and 18 out of 88, or 20.5 percent, for the low-transparency idioms. This shows that, though there were a number of unlisted idioms, a comparable majority of the possible entries (62 of 88, or 70.5 percent, for the high-transparency idioms and 70 of 88, or 79.5 percent, for the low-transparency idioms) were provided among the four dictionaries for the idioms under scrutiny in Study 2.
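The tallies of unlisted entries can be re-derived with a few lines of Python. This is only an illustrative check of the arithmetic reported above, not part of the original analysis; the names `percent`, `unlisted` and `listed` are hypothetical, and the counts are copied directly from the text.

```python
def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)

IDIOMS_PER_GROUP = 22   # idioms in each transparency group (from the text)
DICTIONARIES = 4        # OAI, HDI, IORI and CID
possible = IDIOMS_PER_GROUP * DICTIONARIES  # 88 possible entries per group

# Unlisted entries per transparency group, as reported in the text.
unlisted = {"high": 26, "low": 18}

unlisted_pct = {group: percent(n, possible) for group, n in unlisted.items()}
listed = {group: possible - n for group, n in unlisted.items()}

for group in unlisted:
    print(f"{group}: {unlisted[group]}/{possible} unlisted "
          f"({unlisted_pct[group]}%), {listed[group]} listed")
```

The resulting counts of listed entries (62 and 70) are the denominators used when the incidence of etymological notes is compared across the two groups.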
A second consideration is the frequency with which listed idioms have accompanying notes on etymological origin. This is important because the omission of etymological notes could possibly offer insights about the way in which idioms are motivated. It is of course impossible to know with any certainty why a particular idiom may not have accompanying etymological notes. It could be the case that there is no known etymology, or perhaps the existing explanations are tenuous or unreliable and therefore go unlisted. However, it is not unreasonable to posit that one potential reason lexicographers might exclude etymological notes is that the origin is so clear that notes would prove superfluous. That is, idioms whose origins strongly relate to encyclopedic world knowledge or embodied experience might comparatively have fewer or no etymological entries among the four dictionaries, because this kind of knowledge is not necessarily culturally bound and is therefore less likely to require explanation. If this is true, then the incidence of etymological notes should be higher among the low-transparency idiom group, because, as posited at the beginning of Study 2, perceived lower transparency might be a result, at least in part, of culturally or historically bound knowledge.
In order to determine whether there is evidence of this, I examined the data to see if there was any marked contrast in the number of cases of etymological notes between the two transparency groups. For the high-transparency group, a tally revealed that etymological notes were provided for 27 of the 62 (43.5 percent) listed idiom entries. In contrast, for the low-transparency group, etymological notes were included in 44 of the 70 (62.9 percent) listed idiom entries. This amounts to a relative increase of 44.3 percent (a difference of 19.3 percentage points) in the incidence of etymological notes for the low-transparency group of idioms. The two obvious caveats to this finding are that the idioms in this study were not randomly selected, as they were intended to address other research questions that go beyond the scope of the studies reported here, and, perhaps more importantly, that it cannot be certain why etymological notes were excluded. The author of IORI, however, did convey to me that idioms whose etymologies were ‘self-explanatory’ were sometimes excluded unless there was some compelling reason to include them (L. Flavell, personal communication, 24 September 2015). If the lexicographers of the remaining three dictionaries adopted a similar methodology, then it is possible that the markedly lower incidence of etymological notes among the higher-transparency idioms is connected to the motivational source obviating explicit explanations. In other words, the motivation of some high-transparency idioms, such as skating on thin ice (described in detail in section 3.3 (Study 1)), is obvious to the extent that lexicographers need not include such superfluous details. Thus the marked increase in etymological entries for the low-transparency idioms could indicate that they tend to be motivated differently than high-transparency idioms.
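Because the 44.3 percent figure is a relative increase rather than a difference in percentage points, it may help to see both quantities side by side. The following is a minimal sketch using only the counts reported above; the variable names are hypothetical and the calculation is offered as a check on the arithmetic, not as the procedure actually used in the study.

```python
# (entries with etymological notes, listed entries) per transparency group,
# as reported in the text.
notes = {"high": (27, 62), "low": (44, 70)}

# Proportion of listed entries that carry an etymological note.
rate = {group: with_notes / listed
        for group, (with_notes, listed) in notes.items()}

# Absolute gap between the two groups, in percentage points.
point_difference = 100 * (rate["low"] - rate["high"])

# Relative increase of the low-transparency rate over the high-transparency rate.
relative_increase = 100 * (rate["low"] / rate["high"] - 1)

print(f"high: {100 * rate['high']:.1f}%  low: {100 * rate['low']:.1f}%")
print(f"difference: {point_difference:.1f} percentage points")
print(f"relative increase: {relative_increase:.1f}%")
```

Reporting the two measures together avoids reading the 44.3 percent figure as though 44.3 percentage points separated the groups.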
Finally, another way to attempt to answer this research question quantitatively would be to compare the number of disputed etymological origins across both transparency groups. If it were the case that an idiom's etymology is very clear due to encyclopedic world knowledge, then there is likely to be just one, uncontested explanation. If, on the other hand, the precise origin of the idiom is disputed, this could be indicative of idioms whose etymology derives from a specific historical event or culturally bound context. Among those idioms in tables 8 and 9, there was one (4.5 percent) disputed case for the high-transparency group (alive and kicking), but there were six (27.3 percent) such cases for the low-transparency group (tie the knot, spill the beans, face the music, kick the bucket, cut the mustard and go cold turkey). Again, while this cannot be taken as strong evidence for the reasons mentioned earlier, cumulatively, along with the previous data, patterns that support the hypothesis appear to be emerging between the two groups of idioms.
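The disputed-etymology percentages follow from the idioms named above and the group size of 22. A small sketch, assuming that each group contains exactly the 22 idioms implied by the earlier tally of 22 idioms across 4 dictionaries; the dictionary `disputed` is a hypothetical container for the idioms listed in the text.

```python
GROUP_SIZE = 22  # idioms per transparency group (assumed from the earlier tally)

# Idioms with disputed etymological origins, as named in the text.
disputed = {
    "high": ["alive and kicking"],
    "low": ["tie the knot", "spill the beans", "face the music",
            "kick the bucket", "cut the mustard", "go cold turkey"],
}

# Share of each group with a disputed origin, to one decimal place.
disputed_pct = {group: round(100 * len(idioms) / GROUP_SIZE, 1)
                for group, idioms in disputed.items()}

for group, pct in disputed_pct.items():
    print(f"{group}: {len(disputed[group])}/{GROUP_SIZE} disputed ({pct}%)")
```

With counts this small, a single reclassified idiom shifts a group's percentage by about 4.5 points, which is consistent with treating the contrast as suggestive rather than as strong evidence.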
7 Study 2 conclusion
In Study 2, I have attempted to build upon the findings in Study 1 and contribute to the discussion on the relationship between the perceived semantic transparency of idioms and their motivational sources. In particular, I have considered how encyclopedic world knowledge can, through pragmatic inferencing, elucidate the figurative meaning of idioms and lead to higher-semantic-transparency intuitions. In spite of the limitations of Study 2 discussed in the previous section, there does appear to be some support for the claim that the way in which an idiom is motivated can impact its perceived transparency. Quantitatively, I observed that the high-transparency idioms in Study 2 had a lower incidence of etymological notes than did the low-transparency idioms. One possible way of accounting for this is that high-transparency idioms are more likely to be motivated by encyclopedic world knowledge, and as a result, these idioms might be more self-explanatory than idioms motivated by delimited cultural or historical contexts. Furthermore, the incidence of conflicting etymologies cited in the dictionaries was higher among the low-transparency idioms in Study 2. If the hypothesis in Study 2 is true, then this is expected, as low-transparency idioms with motivational sources rooted in obscure or otherwise unknown cultural or historical contexts would be more likely, compared to high-transparency idioms, to have multiple etymological accounts.
In sum, the semantic transparency ratings from Study 1 provide support for the notion that semantic transparency intuitions are motivated to a substantial degree across transparency levels, and such motivation is in part connected to encyclopedic world knowledge. The data in Study 2 corroborate the findings in Study 1 by examining trends between etymological notes in idiom dictionaries for higher- and lower-semantic-transparency idioms.
8 Limitations
Upon considering the methodologies and implementation of Study 1 and Study 2, I have identified some limitations that might help direct future research in this area. In terms of the methodology of Study 1, the raters were all native-speaker university-level English teachers. Though these participants did originate from different countries (USA, UK, New Zealand, Canada and Australia), it must be said that their profiles were otherwise very similar. Given that they shared the same L1 and profession, this could have influenced how similarly they interpreted the semantic transparency of the idioms in question. Had the raters come from more dissimilar backgrounds, the data might not have yielded such a high degree of agreement as they did.
Another important limitation was the relative focus on quantitative data over qualitative data. Initially, I had conceived of the priming task in Study 1 as a means to encourage participants to carefully consider their justifications for the ratings – in such a way, the raters might be less likely to rate in a haphazard or unsystematic way. Since its purpose was to support the quantitative ratings, it involved only a much smaller subset of the idioms used in the semantic transparency ratings. Given the utility of the qualitative data in buttressing the findings from the ratings, however, more idioms and more raters per idiom should have been part of the design in the priming task.
Lastly, Study 2 was a much smaller, exploratory study intended to triangulate findings from Study 1 by relying on etymological entries in idiom dictionaries. Yet many of the initial dictionaries I consulted lacked the idiom in question or did not provide etymological notes. This reduced the sample size, and the findings are therefore only supplementary to those of Study 1. Given the limited availability of idiom dictionaries with etymological notes, the value of such dictionaries in investigating semantic transparency appears to be somewhat limited.