The information, stories and advertisements that we are exposed to through the media contribute to the formation of our memory in the same way as other life experiences (Johnson, 2007). Radio is a widespread mass medium, but very few studies have explored memory for radio advertisements, as opposed to television ads (Potter & Callison, 2000). Our primary aim is to study memory for radio advertisements as a function of the type of program in which they are embedded and the typicality of the elements of the advertised product.
Research on the influence of programs on the memory for advertisements has yielded contradictory results. Some studies carried out with television programs have found a positive relationship between the entertaining or enjoyable nature of a program and different memory measures for the advertisements it contained (Moorman, Neijens, & Smit, 2007). The authors maintain that the cognitive involvement induced by the program has a carry-over effect on the advertisement. Other studies focusing on television programs (Norris & Colman, 1994) or the written press (Norris & Colman, 1992) have found a negative relationship between program ratings and the memory for embedded advertisements. The authors in these cases suggest that our attention is captured by the program content and that, since there is a limit to the amount of information we can process, we do not have sufficient cognitive resources left to focus on the information in the advertisements. This divergence might be due to methodological differences (Moorman et al., 2007). In field experiments, where the participants chose the program, there was a positive relationship; in contrast, a negative relationship appeared in laboratory experiments, in which the program was selected by the experimenter.
To date, only one study has been published on the effect of program type on the memory for radio advertisements, in this case a laboratory experiment. Norris and Colman (1996) had subjects listen to advertisements embedded in different context programs: popular music, music of the 50s, 60s and 70s, or a phone-in radio debate on topics including exercise addiction, statistics, etc. The participants then completed a series of questionnaires in which they rated different program characteristics (enjoyment, entertainment, boredom and level of involvement in the program content, among others) and several memory tests. The results showed that memory performance was better for the advertisements in programs rated enjoyable and entertaining, and associated with a high level of involvement, whilst worse for ads in programs rated boring or enjoyable.
This finding favors the carry-over effect as an explanation of the relationship between program type and memory for advertisements. Still, none of the studies on advertisements have used recall tasks or recognition tests to actually measure the memory for programs. By simultaneously measuring the memory for programs and advertisements, the carry-over effect hypothesis could be verified, if a similar memory pattern between each program and its advertisements is found.
The typicality of the elements of the radio advertisements can also be a key component in determining the recall of information. In this study, we are interested in the units of information that appear in the advertisement, specifically the characteristics that are attributed to the advertised product. Hence, we can refer to two different types of elements, high typicality and low typicality (Brewer & Treyens, 1981; Lampinen, Copeland, & Neuschatz, 2001; Luna & Migueles, 2008; Neuschatz, Lampinen, Preston, Hawkins, & Toglia, 2002). High-typicality elements are the most representative characteristics, the ones that best define the product (e.g., a detergent ad that refers to its cleaning capacity). Low-typicality elements are generally contents coherent with the nature of the product, but less likely to appear in an ad (e.g., a detergent ad that distinguishes different types of fabrics). The standard result when typicality is manipulated is that participants perform better when recalling high-typicality information. But since high-typicality information also generates more intrusions or false alarms, accuracy tends to be better for low-typicality elements (Lampinen et al., 2001; Neuschatz et al., 2002). With this manipulation we want to examine whether recall is better for high-typicality or low-typicality elements.
The higher accuracy with low-typicality elements is usually explained by the use of schemata (Neuschatz et al., 2002). When we deal with new information, we activate the schema that is most closely associated with the situation and helps us understand it. Thus, high-typicality elements are quickly and automatically encoded since they are easily identified as part of the schema (Grafman, 2002). Low-typicality elements are not a representative part of the schema and therefore require deeper processing (Migueles & García-Bajos, 2012; Trafimow & Wyer, 1993). Therefore, high-typicality elements will be more difficult to discriminate from other typical contents representative of the schema, while the opposite occurs with low-typicality elements. As a result, when asked questions about high-typicality elements, whether previously presented or not, we have a strong tendency to report that they did in fact appear; a recognition test will therefore yield a high number of hits, but also of false alarms. Conversely, when we are asked about low-typicality elements, we are better at identifying whether they have appeared or not since they do not fit into our initial schema. As a result, we make fewer false alarms, which leads to greater accuracy with these contents than with high-typicality elements. Given the consistency of typicality effects in the literature, in this study we expect to find a similar pattern of results, namely more hits and false alarms with high-typicality elements and better accuracy with low-typicality elements.
To examine the impact of program type and content typicality on memory we conducted an experiment in which participants listened to three radio programs each containing two embedded advertisements. Finally, participants completed a recognition test for advertisement contents.
Method
Participants
Twenty-nine psychology students (21 females; M = 22.10 years, SD = 1.05) from the Universidad del País Vasco volunteered to take part in this experiment.
Design
A 3 (Program Type: enjoyable, boring, interesting) x 2 (Typicality: high, low) experimental design was used with repeated measures in both variables.
Materials
Two normative studies were conducted, from which we selected the advertisement and program elements for the main experiment.
Advertisement normative study. Six advertisements were selected from catalogues used by professional advertising agencies (discotheques, travel, driving license, mattresses, furniture store and home decor shop). A 72-item questionnaire was constructed using six true elements from each ad and six false elements. A sample of 15 participants (12 female; mean age M = 32.5, SD = 2.23) rated the likelihood that the elements appeared in similar ads on a scale of 1 (unlikely) to 5 (very likely). Twenty-four elements were extracted from the normative study on the advertisements, 12 true and 12 false. Half of the elements were of high typicality (e.g., “Student discounts”, M = 4.52; SD = 0.09) and half were of low typicality (e.g., “Driving lessons in high-end vehicles”, M = 2.39; SD = 0.16).
Program normative study. A new sample of 13 participants (11 female; mean age M = 33.71, SD = 2.23) listened to nine recorded programs and then rated them on a scale of 1 (very low) to 5 (very high) on three dimensions: enjoyable, boring and interesting. The three programs with the highest ratings in each category (enjoyable M = 4.69, SD = 0.48; boring M = 4.08, SD = 0.95; interesting M = 4.54, SD = 0.66) were chosen as experimental material. Comparisons between program ratings using Student’s t-test revealed no differences (all p values > .05). The program selected as boring was on economics, the program selected as interesting explained the phenomenon known as synesthesia, and the program selected as enjoyable was a comedy sketch.
Four true elements were selected from each program and an equal number of false elements were invented. Thus, the recognition test on program content consisted of 24 questions, 12 about true elements (e.g., The host says: How many people attend to the last protest organized by Church?) and another 12 about false elements (e.g., The program ends with: Don’t forget to join us tomorrow at the same time).
Recordings. Using the selected programs and advertisements, six different recordings were created. Two ads were embedded in each radio program after a short musical interlude, which was heard again before returning to the program. All materials were counterbalanced. The total audio time was 18 minutes and 12 seconds. Each program had a similar duration of approximately 5 minutes, and each ad ran approximately 35 seconds. The musical interludes lasted 3 seconds each.
Recognition test. The final recognition test consisted of 48 questions, half about the advertisements and the other half about the radio programs. For each question, the participants had to indicate whether the information presented was “True” or “False”. The wording in all of the questions, for both the ads and the programs, was transcribed literally from the recordings.
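The text does not specify how the six recordings were built. One arrangement that yields exactly six recordings from three programs is full counterbalancing of program order (3! = 6). The following minimal Python sketch enumerates those orders under that assumption; the labels and the function name `program_orders` are illustrative only, and the rotation of the ads across programs is not modeled.

```python
from itertools import permutations

# Program labels taken from the design; using them as order keys is an assumption.
PROGRAMS = ("enjoyable", "boring", "interesting")


def program_orders(programs=PROGRAMS):
    """Return every possible presentation order of the given programs."""
    return list(permutations(programs))


orders = program_orders()
for number, order in enumerate(orders, start=1):
    # One line per hypothetical recording, e.g. "Recording 1: enjoyable -> boring -> interesting"
    print(f"Recording {number}: " + " -> ".join(order))
```

Under this scheme each program occupies each serial position in exactly two of the six recordings, so position effects are balanced across programs.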
Procedure
The participants entered the laboratory and were told that they would listen to a recording and would be evaluated on the content afterward. They were not told that the recording consisted of programs and advertisements, nor were they given any other information on the content of the test. After listening, participants performed a 3-minute filler task consisting of a word search puzzle and a crossword puzzle. They were then asked to complete the recognition test at their own pace. For each question they had to indicate whether the information had appeared in the original recording (true) or not (false).
Results
Recognition of advertisement elements
The means and standard deviations are shown in Table 1. The hits, false alarms, A’ scores (accuracy) and B’’D scores (response criterion) were analyzed using a within-subjects 3 (Program Type: enjoyable, boring, interesting) x 2 (Typicality: high, low) Analysis of Variance (ANOVA). The post-hoc analyses were performed using the Student’s t-test.
Table 1. Mean (Standard Deviation) of Advertisement Elements as a Function of Program Type and Typicality
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170409155747-02474-mediumThumb-S1138741613000802_tab1.jpg?pub-status=live)
* = Indicates values not different from 0, which corresponds to the neutral response criterion value.
Hits. An analysis was conducted on the proportion of correct answers for the true contents of the recording. Differences in hits were found depending on program type, F(2, 52) = 7.45, p = .001, ηp² = .22, and typicality, F(1, 26) = 55.46, p < .001, ηp² = .68. The interaction was not significant. There were more hits for the interesting program than for the enjoyable program, t(27) = 2.07, p = .048, Cohen’s d = 0.40, or the boring program, t(26) = 3.71, p = .001, Cohen’s d = 0.91, but there were no differences between the latter two. There were also more hits for high-typicality elements, i.e., those that appear in most advertisements of this type, than for low-typicality elements.
False alarms. False alarms are reflected by the proportion of “true” answers when presented with false sentences. Differences were found according to program type, F(2, 56) = 9.13, p < .001, ηp² = .25, and typicality, F(1, 28) = 34.46, p < .001, ηp² = .55. The interaction was not significant. There were more false alarms for the ads embedded in the interesting program than in the boring program, t(28) = 4.25, p < .001, Cohen’s d = 0.83, and in the enjoyable program than in the boring program, t(28) = 3.42, p = .002, Cohen’s d = 0.71. There were no differences in the number of false alarms between the ads in the interesting and the enjoyable programs. There were also more false alarms with high- than with low-typicality elements, as the participants made mistakes discriminating between representative product characteristics present in or absent from the recording.
A’ scores. The A’ scores (Snodgrass & Corwin, 1988) indicate the participants’ accuracy, taking into account the number of both hits and false alarms. Values range from 0 to 1. A score of .5 indicates that the participants’ performance was at or close to chance. Scores above .5 reflect higher accuracy and scores below .5 indicate lower accuracy. There were no significant differences in accuracy as a function of program or typicality, nor was there any interaction.
B’’D scores. B’’D scores (Donaldson, 1992) indicate the criterion adopted by the participants in their answers. The values range from –1 to +1, with a score of 0 indicating a neutral response criterion. Negative scores indicate a lax or liberal criterion (tendency to respond “true”) and positive scores, a stringent or conservative criterion (tendency to respond “false”). There were significant differences in response criterion as a function of program type, F(2, 56) = 17.41, p < .001, ηp² = .38, typicality, F(1, 28) = 82.85, p < .001, ηp² = .75, and the interaction, F(2, 56) = 4.25, p = .019, ηp² = .13. A more stringent criterion was employed for the advertisements in the boring program than in the enjoyable program, t(28) = 4.28, p < .001, Cohen’s d = 1.07, or the interesting program, t(28) = 4.72, p < .001, Cohen’s d = 1.16, and there were no differences between the latter two.
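The partial eta squared values reported with each F test can be recovered from the F statistic and its degrees of freedom through the standard identity ηp² = (F × df effect) / (F × df effect + df error). The following minimal Python sketch applies that identity to the two main effects reported for hits; the function name is illustrative and the check relies only on the rounded values printed in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom.

    Uses the identity eta_p^2 = SS_effect / (SS_effect + SS_error),
    rewritten in terms of F, df_effect and df_error.
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Main effects on hits, as reported in the text.
program_effect = partial_eta_squared(7.45, 2, 52)
typicality_effect = partial_eta_squared(55.46, 1, 26)
print(round(program_effect, 2), round(typicality_effect, 2))
```

Because the published F values are themselves rounded, a recomputed effect size can differ from the printed one in the second decimal place.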
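As an illustration of how the two proportions are obtained from the True/False answers, the following minimal Python sketch scores one set of responses against an answer key. The response vector is invented for the example and is not data from the experiment; the function name `recognition_rates` is likewise hypothetical.

```python
def recognition_rates(responses, answer_key):
    """Return (hit rate, false alarm rate) for True/False recognition answers.

    responses:  list of bool, True when the participant answered "True".
    answer_key: list of bool, True when the item really appeared in the recording.
    """
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")
    n_true = sum(answer_key)
    n_false = len(answer_key) - n_true
    if n_true == 0 or n_false == 0:
        raise ValueError("the key needs at least one true and one false item")
    # Hit: answering "True" to an item that was presented.
    hits = sum(1 for r, k in zip(responses, answer_key) if k and r)
    # False alarm: answering "True" to an item that was not presented.
    false_alarms = sum(1 for r, k in zip(responses, answer_key) if not k and r)
    return hits / n_true, false_alarms / n_false


# Hypothetical participant: four true items followed by four false items.
key = [True, True, True, True, False, False, False, False]
answers = [True, True, True, False, True, False, False, False]
hit_rate, fa_rate = recognition_rates(answers, key)
print(hit_rate, fa_rate)
```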
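The text does not print the formula used for A’. The following minimal Python sketch assumes the commonly used nonparametric definition, computed from the hit rate H and the false alarm rate FA, with a separate branch for below-chance performance (H < FA). The function name `a_prime` and the example rates are illustrative.

```python
def a_prime(hit_rate, fa_rate):
    """Nonparametric accuracy index A' (0 to 1, with .5 meaning chance).

    Assumed formula (not printed in the text):
      H >= FA: 0.5 + (H - FA)(1 + H - FA) / (4 H (1 - FA))
      H <  FA: 0.5 - (FA - H)(1 + FA - H) / (4 FA (1 - H))
    """
    h, fa = hit_rate, fa_rate
    if h == fa:
        # Equal rates mean no discrimination; this also avoids 0/0 when both are 0 or 1.
        return 0.5
    if h > fa:
        return 0.5 + ((h - fa) * (1 + h - fa)) / (4 * h * (1 - fa))
    return 0.5 - ((fa - h) * (1 + fa - h)) / (4 * fa * (1 - h))


# Hypothetical rates, not values from the experiment.
print(round(a_prime(0.75, 0.25), 3))
```

With H = .75 and FA = .25 the index is about .83, well above the chance value of .5; swapping the two rates gives the mirror-image value below .5.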
The interaction showed that the participants adopted a more stringent response criterion with the high-typicality elements from the boring program than from the enjoyable program, t(28) = 5.27, p < .001, Cohen’s d = 1.39 or the interesting program, t(28) = 3.67, p = .001, Cohen’s d = 0.85, and there were no differences between the latter two. When indicating whether an element had appeared or not, the boring program seemed to elicit a more stringent criterion when participants were asked about elements consistent with our knowledge schema. The response criterion was also more stringent when applied to low-typicality elements in the enjoyable program than in the interesting program, t(28) = 3.36, p = .002, Cohen’s d = 0.37, and more stringent with the low-typicality elements in the boring program than in the interesting program, t(28) = 3.42, p = .002, Cohen’s d = 0.14; there were no differences between the advertisement elements in the enjoyable and boring programs. When asked about more easily identifiable elements such as the low-typicality elements, programs with both boring and enjoyable contents prompted a stringent response criterion. The response criteria for high-typicality elements in the boring program and for low-typicality elements in the interesting program were neutral, both p > .25.
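To make the sign convention of the response criterion concrete, the following minimal Python sketch computes B’’D from a hit rate H and a false alarm rate FA. The formula is the one usually attributed to Donaldson (1992); since the text does not print it, its use here is an assumption, and the function name and example rates are illustrative.

```python
def b_double_prime_d(hit_rate, fa_rate):
    """Response criterion B''D (-1 = fully liberal, 0 = neutral, +1 = fully conservative).

    Assumed formula (not printed in the text):
      [(1 - H)(1 - FA) - H * FA] / [(1 - H)(1 - FA) + H * FA]
    """
    h, fa = hit_rate, fa_rate
    misses_cr = (1 - h) * (1 - fa)  # joint tendency to answer "False"
    hits_fa = h * fa                # joint tendency to answer "True"
    denominator = misses_cr + hits_fa
    if denominator == 0:
        # Happens for perfect performance (H = 1, FA = 0) or its mirror image.
        raise ValueError("B''D is undefined for these rates")
    return (misses_cr - hits_fa) / denominator


# Hypothetical rates, not values from the experiment.
print(b_double_prime_d(0.9, 0.5))  # frequent "True" answers give a negative (lax) score
print(b_double_prime_d(0.5, 0.1))  # frequent "False" answers give a positive (stringent) score
```

A participant who answers “true” often produces many hits and many false alarms, which drives the score below 0; a participant who answers “false” often produces the opposite pattern.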
Recognition of program elements
The means and standard deviations are shown in Table 2. A one-way repeated measures ANOVA was conducted with Program Type (enjoyable, boring, interesting) as the variable. The typicality of the program elements was not manipulated.
Table 2. Mean (Standard Deviation) of Program Elements as a Function of Program Type
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170409155747-62824-mediumThumb-S1138741613000802_tab2.jpg?pub-status=live)
Hits. Significant differences were found according to program type, F(2, 56) = 30.58, p < .001, ηp² = .53. There were more hits in the interesting than in the boring program, t(28) = 6.43, p < .001, Cohen’s d = 1.98, and more hits in the enjoyable than in the boring program, t(28) = 6.34, p < .001, Cohen’s d = 1.64; no differences were found between the enjoyable and interesting programs.
False alarms. Differences were found in the false alarm rate as a function of program type, F(2, 56) = 14.16, p < .001, ηp² = .34. More false alarms were generated for the boring program than for the interesting program, t(28) = 5.20, p < .001, Cohen’s d = 1.28, and for the enjoyable than for the interesting program, t(28) = 3.70, p = .001, Cohen’s d = 0.88; no other significant differences were found.
A’ scores. Differences in accuracy were found in the program elements depending on the type of program, F(2, 56) = 27.51, p < .001, ηp² = .49. Accuracy was higher for the interesting and enjoyable programs than for the boring program, t(28) = 5.38, p < .001, Cohen’s d = 1.90, and t(28) = –6.39, p < .001, Cohen’s d = 0.73, respectively.
B’’D scores. Differences were found in the criterion adopted depending on the program, F(2, 56) = 5.61, p = .006, ηp² = .17. Participants adopted a more conservative criterion with the boring than with the enjoyable program, t(28) = –3.15, p = .004, Cohen’s d = 0.73, but there were no other differences.
Comparison between advertisement and program results
We compared advertisements and programs on the recognition of elements (hits and false alarms), accuracy (A’) and response criterion (B’’D). No differences were found between the hits for ads and programs (lowest p = .34), but accuracy was higher for the programs (M = .75, SD = .08) than for the ads (M = .63, SD = .12), t(28) = 3.85, p < .002, Cohen’s d = 0.85, with both values different from .5. There were also more false alarms for the ads (M = .42, SD = .18) than for the programs (M = .24, SD = .14), t(28) = –5.23, p < .001, Cohen’s d = 1.15. The response criterion was more stringent for the programs (M = .04, SD = .31) than for the ads (M = –.18, SD = .35), t(28) = 3.81, p < .002, Cohen’s d = 0.66. The response criterion for the ads was different from 0, t(28) = –2.73, p = .010, but it was not for the programs (p = .48).
From this comparison between the results of advertisements and programs, we can conclude that memory performance was better for program contents than for the contents of the embedded ads.
Discussion
The aim of this work was to study the influence of radio program type and element typicality on the memory for radio advertisements. The findings indicate that the numbers of hits and false alarms were influenced by the type of program in which the ads were embedded and by the typicality of the constituent elements. Moreover, although type of program and typicality had a significant impact on the response criterion adopted for the task, they did not affect accuracy.
In applied advertising settings, it is very important to know where to place advertisements for maximum exposure. Experimental research has suggested that the higher the ratings of the programs or media context, the more effective the ads (Norris & Colman, 1996). A cursory look at our data would lead us to support this hypothesis. There were more hits for the ads inserted in the interesting program than for the ads in the enjoyable or boring programs. In the absence of other data, we could conclude that the ads in the interesting program were better encoded and, hence, better remembered. However, the proportion of false alarms was higher for the ads embedded in the interesting and enjoyable programs than for the ads in the boring program. From this result, we could conclude quite the opposite of what was deduced from the hits alone. In other words, the participants did not adequately encode the information from the ads inserted in the interesting and enjoyable programs and were ready to accept false contents.
The response criterion seems to be a determining factor. A different response criterion was found depending on the type of program. The participants adopted a more lenient criterion for the advertisements presented in the interesting and enjoyable programs. This lax criterion implies a greater tendency to answer “true”, whether asked about true or false information (Macmillan & Creelman, 2005). We can therefore conclude that the primary effect of program type on the recall of advertisements seems to be a shift in response criterion: the criterion tends to be more lax for ads embedded in interesting or enjoyable programs than for ads embedded in programs viewed as boring.
The criterion adopted for advertisements was influenced not only by program type, but also by the typicality of the elements. For high-typicality elements, the criterion used for the ads inserted in the enjoyable and interesting programs was more lax than for the boring program. High-typicality elements tend to be accepted (Lampinen et al., 2001), and if they are also associated with information we find interesting or entertaining, the tendency to answer “true” is very strong. Conversely, when elements are of low typicality, not representative of the product and have little representation in the knowledge schema, there is a greater tendency to answer “false” (Luna & Migueles, 2008). In our study, this tendency grew stronger when the advertisements were included in the enjoyable or boring programs. Thus, the main shift in criterion was due to the effect of the enjoyable program. When the elements in the embedded ads were of high typicality, the response criterion was lax and similar to that of the interesting program. But when the elements were of low typicality, the criterion became more stringent and similar to that of the boring program. Future studies should explore the reasons why this shift in criterion happens specifically in the enjoyable program and as a function of the elements of the products advertised.
The second important finding is that the typicality of the elements in the ads influenced later recall. To the authors’ knowledge, this is the first time that memory has been studied in relation to the typicality of advertisement elements. We found that there were more hits with high-typicality elements, i.e., elements more likely to appear in ads for similar products, but that there were also more false alarms for high-typicality elements. Moreover, a more stringent response criterion was adopted for the low-typicality elements. These findings replicate the results from studies in other areas of memory (Luna & Migueles, 2008), and show that participants used their schemata about the type of content usually included in similar advertisements (Nakamura, Graesser, Zimmerman, & Riha, 1985). However, unlike other studies that have manipulated element typicality (Lampinen et al., 2001; Neuschatz et al., 2002), in this study we observed no differences in accuracy between high- and low-typicality elements.
This result is not unusual in the literature (Luna & Migueles, 2008) and can be explained by the definition of typicality. Many of the studies on typicality, such as the study by Lampinen et al. (2001), did not use low-typicality elements, but rather atypical elements incongruent or inconsistent with the schema (for example, a toy car in an office). In this study, even though the low-typicality false elements had a low probability of appearance, they were all consistent with the schema, for example, “Driving lessons in high-end vehicles” in a driving school advertisement. Our results are consistent with other studies that, while manipulating the typicality of the information, use situation-coherent contents (e.g., Luna & Migueles, 2008). Another possible reason might be that, although we performed normative studies to select the high- and low-typicality elements with the most extreme values, the differences in typicality were not large enough to produce differences in accuracy.
To take a deeper look at how advertisements are influenced by the programs in which they appear, we also measured the memory for the programs. Considering the two explanations for how programs influence their advertisements (carry-over effect and the limited capacity for processing information), we would expect better recall for the contents of the interesting and entertaining programs than for the boring program. Although the results of the memory for the programs are compatible with both explanations, if either held true here, the differences in accuracy for the programs should have elicited differences in accuracy for the advertisements. However, we found no differences in accuracy for the advertisements. Thus, the memory measurements for the programs do not help us determine which of the two theories best explains our results. Since the participants did not know what content they were going to be asked about, they may have assumed that it was about the programs and taken advantage of the advertisement time to try to remember the program content. If so, they would have been encoding the advertisements superficially, which would explain why there were no differences between the ads in the different programs, while there were differences between the contents of the programs. Another factor to consider is that our sample consisted of students. Older populations with other interests might have yielded different results (Fung & Carstensen, 2003). Future research should examine whether the variables we manipulated have the same effect in different populations.
Lastly, it is important to mention that not all advertisements have commercial purposes. Ads for social or awareness-raising campaigns can also benefit from the study of the impact of ads. The systematic study of the impact of different types of media will also help gain a better understanding of the cognitive system and the social processes by prompting new questions on their effects and relationships, and by enabling the generalization of theoretical ideas and results from other areas (Johnson, 2007).