The growing proportion of older adults in the population of most countries has been accompanied by a reported increase in their well-being. Measurement of well-being has often involved research into the related concepts of positive and negative affect (Schimmack, 2008). Affect in such studies is measured primarily by self-report methods, ranging from sets of single adjectives (Kercher, 1992; Watson, Clark, & Tellegen, 1988) to a variety of questionnaires designed to assess positive and negative emotional states.
One of the more popular questionnaire measures of affect has been the Bradburn Affect Balance Scale (ABS; Bradburn, 1969). Its popularity stems in part from its brevity (10 items) and from its simple, straightforward scoring with only two response options (yes or no). It is axiomatic that the scoring of psychological measures instantiates the underlying conceptual model. Unfortunately, in the ABS case, the underlying model is unclear. Positive and negative affect were initially established as two independent but correlated constructs by Bradburn. The counter-argument for the bipolarity of affect has been made by Russell and Carroll (1999a, 1999b) and by Yik, Russell, and Barrett (1999) among others; however, Watson and Tellegen (1999) and Tellegen, Watson, and Clark (1999) have presented other perspectives on this issue. If positive and negative affect are separate and independent dimensions, then separate scores for each domain are meaningful and might contribute separately to the prediction of other variables.
If affect comprises two domains, this implies that some people can display both high (or low) positive affect and high (or low) negative affect. If affect were a single bipolar dimension, then such a state of being simultaneously high or low on the two domains would be impossible. Debate on this point involves both theory and measurement practice, because how constructs are conceptualized in part depends on how they are measured (Russell & Carroll, 1999a, 1999b; Tellegen et al.; Watson & Tellegen). For example, Russell and Carroll (1999a) made the case that the adjectives used in the Positive and Negative Affect Scales (PANAS; Watson et al., 1988) measure both emotion (valence) and activity, whereas other scales may measure only affect. Tellegen et al. (1999) argued for the importance of the role of acquiescence in evaluating the structure of affect and for a hierarchical model of affect and activation.
The ABS, in contrast to related developments arising from research done well after its inception, embodies something of a contradiction. Some evidence suggests that the ABS is primarily a measure of two separate constructs, but the recommended scoring procedure for the ABS assumes that the two scales form opposite ends of a single bipolar dimension. Accordingly, a difference score has been assumed to reflect the balance between positive and negative affect (Bradburn, 1969). The procedure of subtracting negative affect from positive affect clearly assumes that the two forms of affect define opposite poles of a single bipolar construct. The assumption is clear because the subtraction procedure is equivalent to simply recoding the negative affect items in the positive direction and adding these scores to those of the positive affect items. This contentious underlying model is implemented in the procedure in which the total for the five negative affect items is subtracted from the total of the five positive affect items, with a constant added to remove any negative numbers. Later analyses of the scale have generally agreed with Bradburn’s own findings in showing that the positive and negative affect scores are largely two distinct dimensions, but many of these studies did not report results for the affect balance difference score (e.g., Charles, Reynolds, & Gatz, 2001; Diener & Emmons, 1985; Harding, 1982; Kempen, 1992; Macintosh, 1998).
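The equivalence between subtraction and recoding can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes dichotomous items coded 0 (no) and 1 (yes), in which case the constant works out to 5, and the function names and example responses are hypothetical rather than taken from Bradburn (1969).

```python
def balance_by_subtraction(pos, neg, constant=5):
    """Positive total minus negative total, plus a constant to avoid negatives."""
    return sum(pos) - sum(neg) + constant


def balance_by_recoding(pos, neg):
    """Reverse-code each 0/1 negative item (1 - x), then sum all ten items."""
    return sum(pos) + sum(1 - x for x in neg)


# Hypothetical respondent: five positive and five negative items, coded 0/1.
pos_items = [1, 1, 0, 1, 1]
neg_items = [0, 1, 0, 0, 1]

score_a = balance_by_subtraction(pos_items, neg_items)
score_b = balance_by_recoding(pos_items, neg_items)
print(score_a, score_b)  # the two procedures yield the same score
```

Because the two procedures always agree under this coding, the balance score treats every negative item as a reversed positive item, which is exactly the bipolar assumption at issue.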
The debate over the dimensions of affect dates to the 1970s and has evolved over time to incorporate various issues, but the debate seems to have had comparatively modest impact upon actual practice. Studies using the ABS (e.g., Stacey & Gatz, 1991) generally do not refer to the fundamental issue of how the scale is scored, which in turn affects how the scale is interpreted. Other studies using the ABS do address this issue. Harding (1982), for example, found that the balance score was a better predictor of well-being than the two individual components of affect. At the same time, Kim and Mueller (2001) expressed doubt that the combination of two independent components would invariably be a better predictor of a criterion than the components considered separately. Certainly, Harding’s study is much the exception in finding better prediction by the difference score than the two independent scale scores.
Russell and Carroll (1999a), Yik et al. (1999), and Tellegen et al. (1999) all noted multiple conceptual and empirical issues relevant to the assessment of positive and negative affect and models of affect and related concepts, such as arousal. At the conceptual level, whether or not the theoretical emphasis should focus on bipolar circumplex structures or hierarchical models has been an issue (Tellegen et al.; Yik et al.), as has the role of activation or arousal in measures using adjectives and, at the empirical level, what the most appropriate methods of analysis are (Watson & Tellegen, 1999). How one conceptualizes “independence” is also an issue (Russell & Carroll, 1999b). These debates might be influenced as well by considerations such as underlying neurological models that posit two different systems of neurotransmitters and associated structures for positive and negative affect, akin to Gray’s (1982) model of approach (positive affect) and avoidance (negative affect) systems. These examples illustrate different influences upon how one can conceptualize the dimensions of affect.
With regard to the ABS, formal evaluation of its underlying structural model is, however, rare in the literature on the use of this scale in assessing affect. The ABS was not included, for example, in a comprehensive evaluation of models of affect (Yik et al., 1999). Our purpose in this study was to evaluate the degree to which the model of affect proposed by Bradburn (1969) was supported in a sample of older adults.
The literature on the ABS includes many studies of older adults (for example, McMullin & Marshall, 1996; Richardson, 2007; Shmotkin, 1990), although most studies using the ABS have focused on younger groups. Most studies of affect in general have also involved younger groups, but the increasing proportion of older adults in the population of many countries makes the issue of the appropriate scoring model for the ABS in older adults more salient. Lawton, Ruckdeschel, Winter, and Kleban (1999) pointed out the complexities in the progression of changes in affect regulation from middle to older age that combines individual differences and developmental processes. Such developmental processes lead to less stability of personality attributes in children than in adults (Roberts & DelVecchio, 2000). However, Roberts and DelVecchio noted that negative emotionality is one of the more stable aspects of temperament. There were insufficient numbers of studies of positive affect at the time of their review for comment on its stability, but the overall stability of personality attributes reached a maximum after the age of 59. A scale such as the ABS should therefore be evaluated to determine if it has comparable properties across the age span, and there is some importance in demonstrating that the underlying dimensionality of a measure holds for all groups in which it is used.
Indeed, research on the comparative utility of balance scores and the components of such scores is not as extensive in the domain of affect as might be expected. Kim and Mueller (2001) conducted one of the few studies to report correlations of all ABS scores, including the balance score. A more common practice has been to use only the two separate affect scores (Kempen, 1992; Stacey & Gatz, 1991). If the practice of using the difference score to assess affect balance were supported in a sample of older adults but not in younger adults, then the assumption of the stability of domains of affect across the lifespan would be challenged.
The literature is extensive on positive and negative affect in young adults, some of which is based on use of the ABS. Much of this literature deals with correlations of the ABS with other measures of positive and negative affect. Forming a balance score by taking the difference between positive and negative affect leads to a loss of information and less interpretability of the single difference score. For example, the same balance score would result from being very high in positive affect and moderately high in negative affect, or from being moderately high in positive affect and low in negative affect. Even though both situations result in the same balance score, the respective individuals’ comportment could be markedly different. This loss of information will also likely be reflected in lower predictability of relevant criteria than would be provided by the two separate scores (Loevinger, 1957). Difference scores are also less reliable than their constituent scores, which makes the use of a difference score either as a predictor or a criterion measure more problematic. The use of a difference score in combination with its constituent scores in multivariate analyses is likely to cause multicollinearity problems, and the difference score is likely to have lower variance than the constituent scales, which may in turn limit the maximum possible size of correlation coefficients.
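Both points, the loss of information and the lower reliability, can be made concrete with a small sketch. It uses the standard classical test theory expression for the reliability of a difference between two scores with uncorrelated errors; the reliabilities, the correlation, and the two respondent profiles below are hypothetical values chosen for illustration, not results from any study cited here.

```python
def difference_reliability(r_xx, r_yy, r_xy, sd_x=1.0, sd_y=1.0):
    """Classical test theory reliability of the difference X - Y,
    assuming the measurement errors of X and Y are uncorrelated."""
    cov = r_xy * sd_x * sd_y
    true_var = r_xx * sd_x ** 2 + r_yy * sd_y ** 2 - 2 * cov
    observed_var = sd_x ** 2 + sd_y ** 2 - 2 * cov
    return true_var / observed_var


# Hypothetical values: two scales with reliability .75 that correlate .24.
rel_diff = difference_reliability(r_xx=0.75, r_yy=0.75, r_xy=0.24)
print(round(rel_diff, 3))

# Two hypothetical profiles (positive total, negative total) with equal balance.
profile_high = (5, 3)  # very high positive, moderately high negative
profile_low = (3, 1)   # moderately high positive, low negative
print(profile_high[0] - profile_high[1] == profile_low[0] - profile_low[1])
```

Under this formula the difference score is less reliable than its components whenever the components are positively correlated; with a correlation of zero and equal component reliabilities, the reliability of the difference simply equals that of the components.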
Russell and Carroll (1999a) observed that most truly bipolar scales are those arising from the use of adjectives that are logical antonyms, such as happy-sad and solitary-gregarious, which should also show high negative correlations. High correlations between the Positive and Negative Affect scales are rarely, if ever, reported for the ABS. Most literature on the ABS in fact reports low correlations between the two ABS affect scales. Perkinson, Albert, Luborsky, and Moss (1994) reported an interesting analysis using open-ended responses to the ABS items by a group of mostly middle-aged women. They found that the positive and negative items had a different focus for female respondents. Responses to positive affect items were based on a sense of personal accomplishment and recognition by others, whereas negative affect was based more on internal personal characteristics. Such findings strongly suggest that the positive and negative scores of the ABS are not operating at the same basic level, which would make taking the difference between them as a measure of balance even less tenable.
Consequently, the purpose of this study was to clarify the ambiguity of the ABS scoring model and the implications of its use of the balance score for studying associations of affect with measures of the related construct of morale in older adults. The first goal of this study was thus to test the presumed structure of the ABS using structural equation modeling. Second, we explored the relationships of positive and negative affect, and affect balance, with morale in terms of discriminant validity, particularly with the response style of social desirability.
Method
Participants
A total of 187 older, community-dwelling people participated (117 females). The mean age was 69.7 years (SD = 6.24, range of 59 to 92 years), with the majority (64.7%) being married and 28.3 per cent widowed. Approximately two thirds of participants were contacted through the membership list of a recreation center, and of these approximately 50 per cent of those contacted by telephone agreed to participate. The remaining participants were recruited by word of mouth through the original participants.
Measures
Bradburn Affect Balance Scale
The ABS consists of 10 dichotomous items, five of which are intended to measure positive affect while the remaining five are intended to measure negative affect. In this study, the Andrews and Withey (1976) modification was used, which has four response categories per item instead of the original dichotomous scoring. This was done to remove the influence of any operating acquiescent response style that has been claimed to influence the yes or no ABS response format (see Russell & Carroll, 1999a, for a summary of the debate over acquiescence in measuring affect with the ABS).
Philadelphia Geriatric Center Morale Scale (PGCMS)
The PGCMS (Lawton, 1972) was scored using the original scoring key and the 22 original items. It was scored as a single global scale as per its originator’s instructions. Smith, Sim, Scharf, and Thomas (2004) reported a convergent validity correlation for a shortened version of the PGCMS with the Satisfaction with Life Scale (Diener, Emmons, Larsen, & Griffin, 1985) of .59 and a coefficient alpha value of .89 in a sample of older residents of three cities in the U.K. Stock, Okun, and Benito (1994) reported coefficient alpha values ranging from .50 to .72 in Spanish samples using Castilian and Catalan translations of the PGCMS. Correlations with the Positive and Negative Affect scales of the ABS in that study were .35 and –.62 respectively.
Kutner Morale Scale
The seven items of this scale (Kutner, Fanshel, Togo, & Langer, 1956) are dichotomous and were scored unidimensionally. The Kutner scale has been used much less extensively than the PGCMS. Dick and Friedsam (1964) reported a Guttman coefficient of reproducibility of .93 for the scale. Gilhooly (1984) reported correlations of .71 of the Kutner scale with a measure of mental health, .29 with use of a home help service, and .34 for visits by a community nurse.
Social Desirability
The measure of social desirability we used was the 16-item Desirability scale of the PRF-E (Jackson, 1984). The items of the scale were generated for the purpose of measuring social desirability independent of any particular psychological construct.
Procedure
Questionnaires were presented in one of two different orders of presentation, within which the questionnaires were randomly ordered. A total of 134 questionnaires were self-administered, with another 53 collected in the form of interviews. Follow-up analyses (not presented here) found no effect attributable to the mode of administration.
Analysis
All measures were scored according to the instructions provided in the original sources. Three people omitted one item on the PGCMS, and seven people omitted one item on the ABS. These were handled by substituting the relevant mean response for those items. The hypothesis relating to the number of dimensions underlying the Bradburn ABS was tested using EQS version 6.1 (Bentler, 2006) with covariance matrices, and descriptive statistics were calculated using SYSTAT 12 (Systat Software, 2007). Goodness of fit was evaluated using the well-known Comparative Fit Index (CFI), Akaike Information Criterion (AIC), and Root Mean Square Error of Approximation (RMSEA) indices. The CFI (Bentler, 1990) adjusts goodness of fit test statistics for sample size. Hu and Bentler (1999) suggested a value of .95 or better is associated with satisfactory levels of model fit. Browne and Cudeck (1993) suggested a value of .05 or lower for the value of the RMSEA coefficient. Smaller values of the AIC index reflect better fit of the model and can be used to compare two models of the same data set (Akaike, 1987).
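The mean substitution step for omitted items can be sketched as follows. This is a minimal illustration under an assumption: the text does not say whether "the relevant mean response" is the item mean across respondents or the respondent's own mean on the scale, and the sketch takes the former reading. The function name and the small response matrix are hypothetical.

```python
def impute_item_means(responses):
    """Replace missing item responses (None) with that item's mean,
    computed over the respondents who answered the item (assumed reading)."""
    n_items = len(responses[0])
    item_means = []
    for j in range(n_items):
        observed = [row[j] for row in responses if row[j] is not None]
        item_means.append(sum(observed) / len(observed))
    return [
        [item_means[j] if value is None else value for j, value in enumerate(row)]
        for row in responses
    ]


# Hypothetical data: three respondents, three items rated 1 to 4.
raw = [
    [1, 2, 4],
    [2, None, 3],  # this respondent omitted the second item
    [3, 4, 1],
]
completed = impute_item_means(raw)
print(completed[1])
```

With only one omitted item per affected respondent, as reported here, either reading of the substitution rule would change the scale totals very little.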
Results
Table 1 reports means, standard deviations, and coefficient alphas for the various measures. A preliminary principal component analysis of the 10 items of the ABS showed two eigenvalues greater than one that together accounted for 50 per cent of the variance. The scree plot also suggested the retention of two components.
Table 1 notes: ABS = Affect Balance Scale; GC = Geriatric Center; PRF = Personality Research Form.
The EQS confirmatory analysis of a model with one underlying dimension, with positive and negative at opposite poles, gave a χ2 with 35 df = 194.14, with associated CFI = .597, RMSEA = .158, and AIC = 124.14. A model with positive and negative affect items defining two separate dimensions gave a χ2 with 34 df = 49.68 with associated CFI = .960, RMSEA = .050, and AIC = –18.32. The difference between χ2 values for the two models was highly significant (Δχ2 = 144.46, 1 df, p < .001).
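The model comparison reported above can be retraced from the χ2 values and degrees of freedom alone. The sketch below assumes that the AIC takes the form χ2 minus twice the degrees of freedom, which is inferred from the reported numbers rather than stated in the text, and it uses the fact that the upper-tail probability of a χ2 variate with 1 df equals erfc(√(x/2)). The function names are hypothetical.

```python
import math


def model_aic(chi2, df):
    """AIC in the chi-square-minus-twice-df form (assumed from the reported values)."""
    return chi2 - 2 * df


def chi2_upper_tail_1df(x):
    """P(X > x) for a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


# Chi-square and df for the one- and two-dimensional ABS models, as reported.
one_dim = {"chi2": 194.14, "df": 35}
two_dim = {"chi2": 49.68, "df": 34}

aic_one = model_aic(**one_dim)
aic_two = model_aic(**two_dim)
delta_chi2 = one_dim["chi2"] - two_dim["chi2"]
delta_df = one_dim["df"] - two_dim["df"]
p_value = chi2_upper_tail_1df(delta_chi2)

print(f"AIC: {aic_one:.2f} vs {aic_two:.2f}")
print(f"Delta chi-square = {delta_chi2:.2f} on {delta_df} df, p < .001: {p_value < 0.001}")
```

The difference test is appropriate here because the one-dimensional model is nested within the two-dimensional model: it is the special case in which the two factors are constrained to be a single factor.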
Table 2 reports the correlations among the various measures. The positive and negative subscales of the ABS correlated .24 with one another, with high scores reflecting both high positive and high negative affect by the standard scoring method. Both subscales correlated in the same direction and in approximately the same magnitude with both measures of morale and with the social desirability scale. The two morale scales correlated more highly with one another than with either affect or desirability. Notably, the correlations of the affect balance score with both morale measures and desirability were much lower and close to zero.
Table 2 notes: ABS = Affect Balance Scale; PGCMS = Philadelphia Geriatric Center Morale Scale; PRF = Personality Research Form. Correlations in excess of .21 are significantly different from 0 at p < .01.
A second set of confirmatory analyses was performed on the structure of the five scales used in the study. In order to have a sufficient number of markers for the presumed negative affect dimension, the 10 items from the ABS were used, together with scale scores for the PGCMS, Kutner, and Desirability scales. The first model tested a single-factor structure, with all measures set to load on a single factor. This model had a χ2 with 65 df = 255.79, p < .001, CFI = .710, RMSEA = .127, and AIC = 125.79. A second model evaluated two dimensions, with the Negative Affect items of the ABS defining the second factor. This model fit somewhat better in absolute terms (χ2 with 64 df = 228.00, CFI = .750, RMSEA = .118, and AIC = 100.00), and the improvement over the one-dimensional model was statistically significant (Δχ2 = 27.79, 1 df, p < .001).
Discussion
These results are broadly consistent with existing literature that supports two independent dimensions of affect (Charles et al., 2001; Harding, 1982; Kempen, 1992; Kim & Mueller, 2001; Maitland, Dixon, Hultsch, & Hertzog, 2001). The correlation here between positive and negative affect was .24, again roughly consistent with the values reported in these studies, all of which have reported null to low correlations between the two ABS scales. These figures are consistent with separate positive and negative affect subscales as opposed to the originally recommended unidimensional scoring.
Our results suggest that the original unidimensional, bipolar scoring method cannot be supported and should not be used, consistent with the argument that the concept of affect balance cannot be supported empirically or logically (Kim & Mueller, 2001). Whereas expecting a high negative correlation between any measures of positive and negative affect may not be reasonable in most conditions (Russell & Carroll, 1999a), an argument for the independence of the two types of affect can be defended more strongly. The use of a difference score, as recommended to calculate the affect balance measure of the ABS, brings with it all the problems associated with an analysis of difference or change scores involving two scores (Cronbach & Furby, 1970), including the much lower reliability of the difference score than of the two component scores. A point commonly overlooked in the original use of the difference score to assess affect balance is that such scores are a function not only of the measure but also of the particular sample in question (Streiner & Norman, 2008). Our use of structural equation modeling, a different computational model from that used by Kim and Mueller (2001), provided the ability to test explicitly the goodness of fit of one- and two-dimensional models. The finding that the two-dimensional model fit significantly better than the unidimensional model provides additional support for using the two dimensions of positive and negative affect on the ABS and not the balance score.
In addition, the internal consistency reliability of both scales exceeded .70, at odds with reports that suggest that the short (five item) scales have insufficient reliability for research use (see Russell & Carroll, 1999a, on this point). In part, this may have been due to our use of a four-point rating rather than the original binary response format. The four-point scoring method should likely be used in future research with the ABS, as it appears both to resolve the issue of the scales’ reliability and to moderate the influence of acquiescence associated with the binary response format. Short scales are often wanting in reliability, and comparisons with longer scales should make use of the Spearman-Brown formula to equate scales of different lengths to a common number of items.
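The Spearman-Brown adjustment mentioned above can be sketched in a few lines. The formula projects the reliability of a scale lengthened (or shortened) by a given factor, assuming the added items are parallel to the existing ones. The starting reliability of .70 is a hypothetical round figure, since the text reports only that both alphas exceeded that value, and the function name is illustrative.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when test length is multiplied by length_factor,
    assuming the added items are parallel to the original items."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical case: a 5-item scale with reliability .70 projected to 22 items,
# the length of the PGCMS.
projected = spearman_brown(reliability=0.70, length_factor=22 / 5)
print(round(projected, 3))
```

Equating lengths in this way separates the question of how well the items perform from the simple arithmetic advantage that longer scales enjoy.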
Our results are also consistent with some other research in showing different outcomes with external criteria for the affect balance score and the two separate dimensions of affect. Harding’s (1982) paper was one of the few to report on the use of positive and negative affect scores as well as the balance score, reporting different correlations for the two affect scales. Although both positive and negative affect were positively associated with both measures of morale and with social desirability in the present study, correlations of these three measures with the balance measure of the ABS were close to zero. This was to be expected from the perspective of a simple aggregation of independent measures resulting in a loss of important information and predictive power.
Our results contribute to the base of knowledge on the constructs of positive and negative affect by exploring the associations with measures of morale. The constructs are clearly related because the correlations between both measures of affect and morale are roughly the same magnitude, although the confirmatory analysis supported the expectation that morale is more associated with positive than negative affect. The structural model that incorporated the morale scales was consistent with two dimensions of affect, with both morale and desirability associated with the positive affect domain. Neither model associated with the analysis of the morale scales fit the data well, but our concern was more for the relative fit of the two models and not for the best fitting model(s). Further research should expand on the evaluation of the bipolarity model of affect from the use of adjectives (Russell & Carroll, 1999b; Watson & Tellegen, 1999) to other aspects of self-report.
Our study showed the same pattern of correlations as in the earlier report of Kim and Mueller (2001). If such findings are replicated further, then a strong argument based on extensive empirical evidence can be made for not using the ABS balance score. We therefore recommend using the two affect scales of the ABS and using the balance score, if at all, only with great caution. Our findings suggest that the results of studies that relied heavily on the interpretation of the ABS balance measure, such as that of Smith (1995), should be viewed with caution and that any other study making use of the affect balance score from the ABS should also be regarded cautiously.
Our sample size was acceptable for the first set of structural equation models that we tested on the basis of common guidelines (e.g., see MacCallum et al., 1996, for one perspective, but also see Goffin, 2007, and Aguinis & Harden, 2008, for arguments as to the absence of empirical support for many such guidelines on sample size). Nonetheless, a larger sample would have permitted separation into two or more groups for replication of the results. In addition, the sampling procedure that was used could have led to a sample that was less representative of older adults in general as most participants were members of a recreational club and volunteered to participate. Also, in the future, the use of other scales for positive and negative affect might provide broader evidence for the validity of the ABS than the domains of morale that we sampled.
In general, it is clear that the interpretation of the balance measure in the Bradburn (1969) Affect Balance Scale is far from straightforward. The evidence from this study is consistent with the position that positive and negative affect are separate constructs. Further research using both adjectives and self-descriptive statements with all age groups will be needed to provide final resolution of this issue.