Work design is a hotly debated area of indisputable importance in the organizational sciences. To thoroughly comprehend this concept, it must be considered in relation to that of work organization. Strictly speaking, organizing work has to do with understanding and breaking down the process of production or service provision to enable effective job performance and coordinate the final integration of different subprocesses (Fernández Ríos & Sánchez, 1997). Work design, which is becoming more and more inclusive, concerns the way in which work tasks are configured within the broader system of work, acknowledging the close connection between work activities and the organizational context in which they take place.
Work design was at first closely connected to how jobs are configured, which was justified by reasons such as productivity, workforce qualifications, and organizational efficiency. It later came to be associated primarily with motivational aspects (Hackman & Oldham, 1980), and the current, more holistic view is that it incorporates processes and results into explanations of how work is structured, organized, experienced, and represented (Grant, Fried, & Juillerat, 2010; Morgeson & Humphrey, 2008). Furthermore, there is a clear, growing tendency to consider context a relevant factor in design decisions and, at the same time, to depart from classic, static job descriptions and instead embrace more dynamic features, like those associated with the concept of role (Ilgen & Hollenbeck, 1991).
Morgeson and Humphrey’s (2008) conceptualization is fully aligned with those trends. Integrating the design of jobs as well as teams, they formulated the most complete definition of work design to date:
The study, creation, and modification of the composition, content, structure, and environment within which jobs and roles are enacted. As such, it concerns who is doing the work, what is done at work, the interrelationship of different work elements, and the interplay of job and role enactment with the broader task, social, physical, and organizational context. (p. 47).
This definition meets the current objectives and purposes of work design research, incorporates elements of earlier theoretical models, does not contradict any, and opens new horizons that were in dire need of theoretical development and research.
Work design is becoming increasingly relevant, which is why the objective of this study is to adapt Morgeson and Humphrey’s (2006) Work Design Questionnaire (WDQ; see Footnotes 1 and 2) into Spanish, thereby providing researchers and practitioners with a reliable, valid tool with which they can investigate, measure, and change the reality of work.
Several empirical studies have emphasized work design’s impact on a wide range of individual, group, and organizational outcomes (Fried & Ferris, 1987; Humphrey, Nahrgang, & Morgeson, 2007); that is, without a doubt, why it has played a crucial role in bridging the theory and practice of organizational science. In addition, it represents an important synthesis of multiple disciplines (Morgeson & Campion, 2003), and its study is crucial to the effective implementation of new forms of work organization (Fernández Ríos, San Martín, & De Miguel, 2008; Smith, 1997).
Morgeson and Campion (2003) maintained that despite its enormous impact on organizational success and individual well-being, research interest in work design has gradually waned since the ’80s, as reflected by the dearth of articles about it in the most prominent journals. This reached its most critical point when, after twenty years of investigation, researchers presumed to have a “clear picture” of the psychological and behavioral effects of work design (Humphrey et al., 2007).
Despite contributions from the empowerment movement of the ’80s and the lean production literature of the ’90s, it came to be accepted that theory and practice in this field of research were relatively mature (Parker, Wall, & Cordery, 2001). That perpetuated the dominant paradigm of the time, the job characteristics model (Hackman & Oldham, 1980), and discouraged further developments on the subject, especially in terms of theory. Proof of that lies in the fact that this model and the other dominant perspective, the sociotechnical systems model (Trist, 1981), have barely changed in the last four or five decades despite abundant criticism for their lack of theoretical substantiation and applicability to the content and context of real-life work situations (Roberts & Glick, 1981). These shortcomings were even greater in terms of instrument development; that was apparent in mounting distrust of the available tools and the resulting abandonment of the field by researchers as well as practitioners.
Today the scientific community is seeing a resurgence of interest in this subject, as reflected in the formation of comprehensive theoretical frameworks that go beyond the motivational features of work, actively incorporating social and contextual design elements and thus incentivizing empirical research. This trend is apparent in select papers by Morgeson and Humphrey (2006), Humphrey et al. (2007), and Grant et al. (2010), among others.
This resurgence is not just a reaction to the research stagnation described above, but also a warranted response to changes in the nature of work at contemporary organizations in the globalized context. It is characterized by questioning the underlying assumptions of previous paradigms, and by a joint effort to spur new theory production. It constitutes a new paradigm, one of integration, redesign, and reinvention. Specific examples of efforts to integrate different ideas include Parker et al. (2001), Morgeson and Campion (2003), Humphrey et al. (2007), and Grant et al. (2010), and examples of more ground-breaking, emerging contributions (reinventions) include Wrzesniewski and Dutton (2001) and Clegg and Spencer (2007), among others.
Various formulations have been configured along those lines, and a new theory seems to be emerging that, in short, proposes that work design is a constructive process (Rico & Fernández Ríos, 2002; Wrzesniewski & Dutton, 2001) that should include certain antecedents (environmental features, available technology, company culture, etc.) and certain outcomes that do not solely result from the work design in place. Instead, the relationship is influenced by individual, group, and organizational contingencies, and by intermediary mechanisms. That is, work design is an essential component, but ultimately it is one component among many, and whatever work design is put in place, its results will be conditioned, if not determined, by elements beyond the work itself that, being part of context, cannot be ignored when designing work at an organization.
This theoretical perspective is exceptional not only for the contributions it holds, in and of itself, or for its great potential for future contribution, but also because a) it neither excludes nor directly contradicts previous theories; rather, it partially fuses them and moves them forward as a whole; and b) it clarifies the basic elements that were pondered and explored previously (Fernández Ríos, 1996; Fernández Ríos, Rico, & San Martín, 2004; Fernández Ríos & Sánchez, 1997; Fernández Ríos et al., 2008; Rico & Fernández Ríos, 2002) but never clearly presented as a set.
Formulating organizational theory, branching into design, and rooting work design in that theory do not happen “just because,” or for aesthetic reasons. They happen because they have individual, group, organization, and extraorganization-level consequences. Those consequences are intentional, worked for, and desired. In other words, work design involves actions that clearly, intentionally try to change organizations; it is a series of explicit efforts to improve the organization, boost productivity, and reap positive results for individuals, their families, society, and the non-social environment. These outcomes are numerous and varied, often affecting systems beyond the confines of the formal organization.
Fortunately, around the turn of the century, several authors in different parts of the world had similar concerns about work design; they included F. P. Morgeson and S. E. Humphrey in the United States, S. K. Parker and J. Cordery in Australia, and T. D. Wall in England. In the Spanish-speaking world, studies by M. Fernández Ríos and R. San Martín were noteworthy. But a qualitative leap was needed, a broader vision of work analysis, to overcome the conceptual and metric limitations of the main instruments in use, like the Job Diagnostic Survey (JDS; Hackman & Oldham, 1980) and the Multimethod Job Design Questionnaire (MJDQ; Campion & Thayer, 1985).
A new way to conceptualize and measure work design was needed, without the limitations of 20th century research advancements (for an in-depth analysis of said limitations, see Morgeson & Humphrey, 2006, and Humphrey et al., 2007). That need had to be addressed at the level of theory, methodology, and measurement. While Parker et al. (2001) shone a light on theory, Morgeson and Humphrey (2006) did so on measurement, with the Work Design Questionnaire (WDQ).
The Work Design Questionnaire (WDQ)
The WDQ is a comprehensive, integrative measurement instrument that according to its authors, is needed for three reasons: a) to date, the available measures were either highly specific, like those of tasks in very concrete jobs, or quite general, like those of work characteristics. A measure was needed to bridge the gap between tasks and characteristics; b) when designing or redesigning jobs, practitioners were only able to act with a quite limited range of work characteristics (autonomy, variety, etc.). By taking more characteristics into account, many more changes would be possible; and c) theoretical debate needed to resume. Along those lines, after the last 35 years of meager contributions to theory, we need to progress toward greater integration across disciplines and fusion of perspectives.
The WDQ focuses on work (rather than job), considering not only the person’s job, but also relations between workers and the larger environment, as Parker et al. (2001) suggested. To develop the WDQ, the work design literature was reviewed to identify key work characteristics and measures used previously. An item pool was developed to encompass all the work characteristics identified in the specialized literature to date and produce a more complete set of the scales considered previously.
Interest in adapting the WDQ into Spanish
Spanish-language adaptation of a tool like the WDQ is of tremendous research interest for two fundamental reasons: a) because it is a new, powerful measurement instrument that in a way synthesizes all the available knowledge, operationalizes it, and makes it available for use by researchers and practitioners; and b) because this instrument is consistent with new notions of work design. It certainly does not cover everything in what have come to be called “expanded models or perspectives” but it does cover virtually the full spectrum of variables related to work characteristics.
The benefits of the WDQ have attracted the attention of numerous research teams around the world. It has been adapted into German (Stegmann et al., 2010), Chinese (Chiou, Chou, & Lin, 2010), and Polish (Hauk, 2014), in addition to versions not yet published in Italian, Portuguese, and French, among other languages, and preliminary versions in languages including Arabic, Hebrew, Japanese, and Korean (Morgeson, 2011). Given the interest it attracts, we decided to create a Spanish adaptation (in Spain), something unprecedented in this context so far, with the conviction that in Spanish-speaking countries too, this aspect of the reality of work can and should be developed, and that the WDQ is an indispensable tool.
Bases for and development of the WDQ
The WDQ rests on a three-factor, integrated typology proposed by Morgeson and Campion (2003) that has abundant theoretical and empirical evidence to support it. Those authors posited that the field of work design could be broken down and analyzed in terms of three major components:
- Job complexity: This dimension encompasses the motivational work features studied most extensively (e.g., autonomy, variety, significance), and others such as cognitive demands and specialization. When these features are more prevalent or increase, one’s work tends to be more complex, which is more demanding of the worker.
- Social environment: This dimension includes features of the relational or social context in which work is done, including for instance interdependence, social support, and feedback from others. This dimension has gradually demonstrated its relation to important outcomes of work design.
- Physical demands: This dimension encompasses the features of the physical environment in which work is done, including aspects like physical activity, work conditions, ergonomic design, and the use of technology. Its importance is inescapable anytime work activity takes place under such conditions.
With that definition in mind, which synthesized the bulk of work design research and made sense of it, Morgeson and Humphrey (2006) adapted their main ideas and proposed that work design comprises three categories of work characteristics corresponding to the original three-component structure presented above.
The first category, motivational characteristics, has been the most extensively studied in the literature and reflects the overall complexity of work. These characteristics are divided into task characteristics and knowledge characteristics. Task characteristics relate to how work is done, and to the range and nature of the tasks associated with a specific job. Meanwhile, knowledge characteristics address the kinds of demands placed on the individual (knowledge, skills, individual abilities) as a function of what he or she does on the job.
The second category, social characteristics, reflects the fact that work is done within a broader social and relational setting. Historically, these have been studied less than motivational features, but that trend has slowly shifted as their important role in various outcomes of work design is revealed. The third category, physical or contextual characteristics, corresponds to the physical and material context in which work is done. With the exception of the MJDQ (Campion & Thayer, 1985), the physical context of work has mostly been neglected in the scientific research on work design (Morgeson & Humphrey, 2006). Thus, the work characteristics in the WDQ are organized into these three larger categories: motivational, social, and physical/contextual.
The creation and development of the WDQ took place through a series of steps briefly summarized in this section. First, work characteristics were identified; an extensive literature review identified 107 characteristics that had been debated and/or measured. Then, through various processes involving the definition of different characteristics, comparative analysis, sorting into categories, etc., 18 categories of work characteristics were established. Those were then grouped into the three higher-order categories described above: motivational, social, and physical/contextual. The first was divided into two subcategories: task and knowledge work characteristics. Hence four larger factors will be discussed: task motivational characteristics, knowledge motivational characteristics, social work characteristics, and physical or contextual work characteristics. Those categories encompass 18 features: 5 task motivational, 5 knowledge motivational, 4 social, and 4 contextual. On that foundation, one model of 19 characteristics (interdependence was subdivided, adding a social dimension) and another with 20 (this one has 7 task motivational characteristics because it separates autonomy into three) have been discussed. Finally, there is a model of 21 characteristics that subdivides autonomy as well as interdependence: 12 motivational, 5 social, and 4 physical. The above appears in Figure 1 and will be required in the Method section to analyze the instrument’s factor structure.
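The counts in the preceding paragraph can be checked with a short sketch. It is only an illustration of the arithmetic stated in the text: the function name `model_counts` and its two flags are hypothetical, and the only data used are the 5/5/4/4 baseline and the two subdivisions described above.

```python
# Baseline counts for the 18-characteristic model, as given in the text.
BASE_COUNTS = {"task": 5, "knowledge": 5, "social": 4, "contextual": 4}


def model_counts(split_interdependence=False, split_autonomy=False):
    """Return the number of characteristics per category for one model variant.

    Hypothetical helper: the flags mirror the two subdivisions named in the text.
    """
    counts = dict(BASE_COUNTS)
    if split_interdependence:
        counts["social"] += 1  # interdependence subdivided: one extra social scale
    if split_autonomy:
        counts["task"] += 2  # one autonomy scale becomes three
    return counts


# Enumerate the four variants (18, 19, 20, and 21 characteristics).
for split_i, split_a in [(False, False), (True, False), (False, True), (True, True)]:
    counts = model_counts(split_i, split_a)
    print(split_i, split_a, counts, sum(counts.values()))
```

With both subdivisions applied, the task and knowledge categories together hold 12 characteristics, which matches the "12 motivational, 5 social, and 4 physical" breakdown of the 21-characteristic model.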
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-38425-mediumThumb-S1138741617000245_fig1g.jpg?pub-status=live)
Figure 1. Conceptual Structure of the WDQ: Definitions.
After identifying different categories, subcategories, and specific variables or characteristics, Morgeson and Humphrey explored whether or not specific items already existed in the scientific literature for each of the constructs or variables to measure. They used pre-existing items, modified some, and created new ones, always striving for consistency with the definitions compiled in Figure 1. Thus, the WDQ is a mix of previously existing items (17%), adapted items (33%), and new items (50%). They used a relatively simple response scale to avoid extraneous construct variance: a 5-point scale from 1 (strongly disagree) to 5 (strongly agree). Furthermore, the authors indicate that all items are phrased in positive terms so as to avoid the factor structure issues reported about other, earlier work design measures (e.g., Idaszak & Drasgow, 1987). The only exceptions to that rule were items on the job complexity and ergonomics scales, which were easier to comprehend when phrased negatively.
To achieve adequate internal consistency while maintaining reasonable scope, all scales have at least four items unless there is suspicion that various dimensions exist within the same construct, as in the cases of autonomy, feedback, interdependence, and contextual variables. All those have just three items. Many refer to the work itself, not individual responses to work, since it is the properties of work itself that are of interest, not idiosyncratic reactions. Items were grouped according to the features they examine, not randomly distributed. That choice was in keeping with Schriesheim, Solomon, and Kopelman (1989), who showed that grouping items had several psychometric advantages (e.g., convergent and discriminant validity), particularly when measuring work characteristics.
The data were collected by students, juniors and seniors in a business administration course. They were asked to administer the questionnaire in paper-and-pencil form to family members, kin, and acquaintances with at least 15 years of full-time work experience. The questionnaire was administered first, then a brief interview was conducted to gauge the main tasks and other duties of the job, and identify the corresponding job name or title in the Dictionary of Occupational Titles (DOT; U.S. Department of Labor, 1991) and its O*NET code. This guaranteed data from a very heterogeneous range of jobs. Data were collected from 540 workers holding 243 different jobs, covering 22 of the 23 professional groups represented in O*NET, with an average job tenure of 15 years (SD = 9.80).
As we will see later on in detail (in the Method and Results sections), to determine the instrument’s validity, Morgeson and Humphrey (2006) used confirmatory factor analysis (CFA) to compare different factorial models, but they also tried to determine to what extent scores on the WDQ scales are consistent with data published previously about jobs and occupations. Therefore, information about indicators of cognitive skills or social/interactive or contextual aspects of work, provided by O*NET or the Dictionary of Occupational Titles, could be interpreted as independent, preliminary evidence for the discriminant and convergent validity of the main categories of work characteristics. As Morgeson and Humphrey (2006) propose, “evidence that responses to the WDQ are related to these external measures would be powerful because it suggests that the measures correspond to some larger objective reality unaffected by perceptual biases” (p. 1327).
The authors’ argument about what relations to expect between the different WDQ measures and external measures (coming from the O*NET or gathered using other measurement techniques) is long and thorough, and incorporates measures like job descriptions (cognitive, interpersonal, and physical), occupational categories, and varied results. Specifically, in the present study, according to the data available for this adaptation and adjustment of procedure, the following original hypotheses from Morgeson and Humphrey’s (2006) study are important to consider and will go on to be empirically tested.
- Hypothesis 4a: Jobs in professional occupations will have higher levels of knowledge characteristics and autonomy than jobs in non-professional occupations.
- Hypothesis 4b: Jobs in non-professional occupations will have higher levels of physical demands and less positive work conditions than jobs in professional occupations.
- Hypothesis 4c: Jobs in “human life” occupations will have higher levels of task significance than jobs in other occupations.
- Hypothesis 4d: Jobs in sales occupations will have higher levels of interaction outside the organization than jobs in other occupations.
Method
Participants
A total of 1035 subjects participated in this study, representing 492 different jobs. To ensure consistency with the U.S. study, this sample of workers represented many of the various occupational groups compiled in the Standard Occupational Classification (SOC) (U.S. Department of Commerce, 2000). Morgeson and Humphrey used the same classification system as a criterion in their original study.
Participants’ average age was approximately 39 years (SD = 12.50), with an average of 11 years’ (SD = 11.07) tenure at their respective jobs. The sample’s equal distribution according to sex was noteworthy, with 49% men and 51% women. Essential characteristics of the sample, that is, number of participants, age, work experience in their job, and sex are displayed in Table 1. That information is organized according to the occupational groups proposed in the SOC.
Table 1. Descriptive Statistics of the Spanish Sample, by Occupational Category
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-47019-mediumThumb-S1138741617000245_tab1.jpg?pub-status=live)
Note: N = 1035
In reference to the sample’s descriptive data, we would like to emphasize certain important aspects. First of all, all 23 SOC occupational groups are represented in the sample, but in one case, only by a minimal number (“farming, fishing and forestry”). In Morgeson and Humphrey’s original study, 22 were represented, all but “building and grounds cleaning and maintenance occupations.” Second, as we mentioned above, an equal representation of men and women was achieved, which is noteworthy and unexpected given the convenience sampling strategy used. Third, the sample tends to represent more professional (e.g., “management,” “business and financial,” “office and administrative support,” “education”) than non-professional jobs (e.g., “building and grounds cleaning and maintenance,” “transportation and material moving”), but not as dramatically as in the U.S. sample. Fourth, in all occupations, tenure in current job was relatively long, which is consistent with the U.S. sample and ensured employees were knowledgeable about the main characteristics of their jobs. On that note, we should point out that everyone was required to have at least three years of general work experience and six months’ tenure in their current job.
Procedure
This section will first describe the process of adapting the instrument into Spanish. Then it will detail the data collection and data analysis procedures utilized.
In adapting the WDQ into Spanish, we took into account the International Test Commission Guidelines for Translating and Adapting Tests (International Test Commission, 2010). With that in mind, we present some important specifications relating to the Spanish-language version and adaptation of items to the sociocultural context.
This translation of the WDQ into Spanish pertains to European Spanish. Generally speaking, this version is expected to be valid for the Spanish spoken in Hispanic American contexts; at least, that is what Spanish speakers from countries like Chile, Colombia, Uruguay, Argentina, and Mexico have said. Nevertheless, there are no doubt differences that would justify empirical research to verify results, or at least an expert panel for each country in the Spanish-speaking world.
The sociocultural adaptation of items might be the most critical factor of all in the translation process. A back-translation technique was used, along with an expert panel in which up to eight judges took part, all with knowledge of the subject and adequate mastery of both languages. First of all, experts on this subject with mastery of both languages translated the original instrument. In so doing, their criterion was to conserve each question’s exact meaning, varying only its idiomatic expression when necessary. We then proceeded to back-translation, and high equivalency was found between the original test and the back-translated version generated by independent translators. Last, the aforementioned expert panel was convened. Its main objective was to make sure the questions made sense and were clear, and to correct any possible errors of content or format, thereby producing the final version of the instrument.
A notable departure from the original test is the items’ order. As described above, Morgeson and Humphrey grouped items pertaining to each characteristic together, as Schriesheim et al. (1989) recommend, instead of randomly distributing items throughout the data-collection instrument. In the present case, items were sequentially distributed to maximize the distance on the final form between any two items corresponding to the same characteristic. To do so, items were numbered like on the original form, then reordered according to the following pattern: 1, 10, 20, 30, 40, 50, 60, 70, 2, 11, 21, … etc. Thus, nine items from other categories were presented between any two items that measure the same characteristic. Obviously, since not all characteristics have the same number of items, it was not always possible to maintain the exact same distance between them. This approach avoided an important issue that many subjects reported: the similarity of various items, which in a few cases led participants to reject the test entirely and stop. The appendix includes the correspondence between the original WDQ items and their respective versions in Spanish, and the details about their order.
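The idea behind this reordering can be sketched as a fixed-stride interleave. This is an illustration under stated assumptions, not the authors’ procedure: the function name `stride_order`, the item count of 77, and the stride of 10 are all hypothetical, and a regular stride yields 1, 11, 21, … rather than the exact sequence quoted above (1, 10, 20, …). The actual order used is the one documented in the appendix.

```python
def stride_order(n_items, stride):
    """Return item numbers 1..n_items reordered by a fixed stride.

    Consecutive positions on the new form hold items whose original
    numbers differ by `stride`, so items that were neighbours on the
    original form (same characteristic) are pulled apart.
    """
    order = []
    for start in range(1, stride + 1):
        # Take every `stride`-th item, beginning at `start`.
        order.extend(range(start, n_items + 1, stride))
    return order


# Illustrative values only: neither 77 nor 10 is taken from the text.
order = stride_order(77, 10)
print(order[:10])
```

Because the number of items is not an exact multiple of the stride, later passes are shorter than earlier ones, which is one reason the distance between same-characteristic items cannot be kept perfectly constant.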
The data collection process was quite similar to the one Morgeson and Humphrey utilized, recruiting students in their last year of university. They were asked to administer the questionnaire in paper-and-pencil form to family members, kin, and acquaintances with full-time work experience of three years or more, and at least six months’ tenure in their current job. The questionnaire was administered first, then a brief interview was conducted to determine the job’s main tasks and other duties, and match those to jobs in the Dictionary of Occupational Titles and their corresponding O*NET codes. This ensured data from a wide variety of jobs.
As for data analysis procedures, to establish the instrument’s factor structure, we followed a plan similar to that of the WDQ’s authors. Using a confirmatory factor analysis (CFA) technique, we were able to obtain empirical evidence of the instrument’s construct validity and internal dimensionality (Williams, Ford, & Nguyen, 2004). Six different models were compared; they were based on conceptual elements of the instrument. The first, a one-factor model, was used to test whether participants would manage to distinguish among the instrument’s different dimensions. The second model has four factors, corresponding to the four major categories discussed in the work characteristics literature review (Task Motivational, Knowledge Motivational, Social, and Contextual). The third model has 18 factors, corresponding to the dimensions of work specified a priori. The fourth model is the same as the third, except Interdependence is split into Initiated and Received, so it consists of 19 factors. The fifth model has 20 factors; it is the 18-factor model (third model) with the Autonomy variable broken down into three components: Work scheduling autonomy, Decision-making autonomy, and Work methods autonomy. The sixth and final model, with 21 factors, makes both those changes to the third model, that is, it divides both Interdependence and Autonomy.
All models but the first, with one factor, were extracted from the authors’ original model, displayed in Figure 1. This technique was believed to be the best for several reasons: a) Theoretically, various factor structures were possible, and CFA made it possible to test different alternative models’ goodness of fit; thus, a model would not be selected based on its goodness of fit alone, but on its goodness of fit relative to different available options. b) Theoretical models are previously defined, and by testing various models, the researcher is less likely to favor one model over the others, which tends to occur in exploratory factor analysis (EFA).
To determine the six models’ (1, 4, 18, 19, 20, and 21 factors) goodness of fit, we utilized the four goodness of fit indicators that were used in constructing the original instrument: χ2 /df, comparative fit index (CFI), root mean square error of approximation (RMSEA), and standardized root mean square residual (SRMR). In addition, three indicators were calculated that were not included in Morgeson and Humphrey’s original study: the Tucker-Lewis index (TLI), Akaike information criterion (AIC), and Bayesian information criterion (BIC).
Regarding the cutoffs applied to the CFA results, according to the indices that appear in the literature, a χ²/df ratio less than or equal to 3 indicates acceptable goodness of fit, but that index is strongly affected by sample size (Hair, Black, Babin, & Anderson, Reference Hair, Black, Babin and Anderson2010). Values of RMSEA under 0.03 indicate excellent goodness of fit to the data, under 0.05 very good, and under 0.08 good (Williams et al., Reference Williams, Ford, Nguyen and Rogelberg2004). SRMR values under 0.08 indicate good fit to the data, while values less than or equal to 0.09 are acceptable as long as RMSEA or CFI corroborates the model’s goodness of fit (Hu & Bentler, Reference Hu and Bentler1999). Finally, values of CFI equal to or greater than 0.95 indicate that a model shows good fit to the data (Hair et al., Reference Hair, Black, Babin and Anderson2010; Hu & Bentler, Reference Hu and Bentler1999), though some authors maintain that values of .90 or even .80 are acceptable (Hair et al., Reference Hair, Black, Babin and Anderson2010). As for the additional indicators, TLI values closest to 1 show the best fit; meanwhile, BIC and AIC, which serve to compare models, tend to penalize complexity, so the higher their value, the lower a model’s goodness of fit (Arbuckle, Reference Arbuckle2013). In addition, magnitudes of increase and decrease were applied as criteria: first, as Chen (Reference Chen2007) proposed, RMSEA increases of less than 0.015 indicate irrelevant differences that may support the most parsimonious model; and second, as Cheung and Rensvold (Reference Cheung and Rensvold2002) suggested, decreases in CFI of more than 0.01 will be considered relevant.
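The cutoffs above can be written as a small set of decision rules. The Python sketch below is only an illustration of those rules. The function names and the label for values outside the cited bands are hypothetical. Two readings are assumptions: that SRMR is "corroborated" when RMSEA is under 0.08 or CFI is at least 0.95, and that the RMSEA and CFI change criteria are combined with a logical "and".

```python
def rmsea_band(rmsea):
    """Label an RMSEA value using the bands cited from Williams et al."""
    if rmsea < 0.03:
        return "excellent"
    if rmsea < 0.05:
        return "very good"
    if rmsea < 0.08:
        return "good"
    return "outside cited bands"  # label is an assumption of this sketch


def cfi_band(cfi):
    """Label a CFI value; .90 and .80 are the more lenient cited values."""
    if cfi >= 0.95:
        return "good"
    if cfi >= 0.80:
        return "acceptable to some authors"
    return "outside cited bands"


def srmr_band(srmr, rmsea, cfi):
    """Label SRMR, requiring corroboration between 0.08 and 0.09."""
    if srmr < 0.08:
        return "good"
    # Assumption: corroboration means RMSEA < 0.08 or CFI >= 0.95.
    if srmr <= 0.09 and (rmsea < 0.08 or cfi >= 0.95):
        return "acceptable"
    return "outside cited bands"


def simpler_model_tenable(simple, complex_):
    """Apply the change criteria to two models given as dicts.

    The simpler model is treated as tenable when its RMSEA increase is
    under 0.015 and its CFI decrease does not exceed 0.01 (assumed "and").
    """
    rmsea_increase = simple["rmsea"] - complex_["rmsea"]
    cfi_decrease = complex_["cfi"] - simple["cfi"]
    return rmsea_increase < 0.015 and cfi_decrease <= 0.01


# Hypothetical fit values for two competing models.
tenable = simpler_model_tenable({"rmsea": 0.050, "cfi": 0.910},
                                {"rmsea": 0.045, "cfi": 0.930})
```

In this hypothetical comparison the RMSEA increase alone would favor parsimony, but the CFI decrease of about 0.02 counts as relevant, so the simpler model is not retained.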
In analyzing convergent and discriminant validity indexes, we followed Shipp, Burns, and Desmul’s guidelines (2010). They suggest that as a convergent validity index, item-factor loadings should exceed .70, and that how many of those loadings are significant should be considered as well. Meanwhile, for discriminant validity, they suggest that correlations between factors not exceed .85.
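As a minimal sketch of how these two guidelines can be screened, the Python snippet below flags loadings that do not exceed .70 and factor correlations above .85. The function names are hypothetical, and comparing absolute values is an assumption made so that negative correlations are handled. The example inputs are a few loadings and correlations quoted in the Results section.

```python
def low_loadings(loadings, threshold=0.70):
    """Return, sorted, the items whose loading does not exceed the threshold."""
    return sorted(item for item, value in loadings.items()
                  if abs(value) <= threshold)


def high_correlations(factor_corr, limit=0.85):
    """Return, sorted, the factor pairs whose correlation exceeds the limit."""
    return sorted(pair for pair, r in factor_corr.items() if abs(r) > limit)


# A few values quoted in the Results section (not the full matrices).
loadings = {"R41": 0.92, "R66": 0.42, "R55": 0.08}
factor_corr = {
    ("Information Processing", "Problem Solving"): 0.96,
    ("Job Complexity", "Problem Solving"): 0.51,
    ("Ergonomics", "Work Conditions"): 0.96,
    ("Initiated Interdependence", "Received Interdependence"): 0.67,
    ("Physical Demands", "Work Conditions"): -0.50,
}

weak_items = low_loadings(loadings)
overlapping = high_correlations(factor_corr)
```

A screen like this only lists candidates for closer inspection; whether a flagged pair is kept separate remains a theoretical decision.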
CFA was carried out using maximum likelihood estimation in the statistics program IBM SPSS AMOS, version 20. Because maximum likelihood estimation is sensitive to violations of the normality assumption, we tested for noticeable deviations in the data using Mardia’s test of multivariate normality (Mardia, Reference Mardia1974). We also retested the proposed structure using a bootstrap method, refitting it in 200 random samples drawn with replacement.
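The logic of a bootstrap bias check can be sketched in a few lines of Python. This is not the AMOS procedure: the study refits a CFA in every resample, whereas the sketch swaps in a simple statistic (the mean) to show how bias and standard error are obtained from 200 resamples. The function name, the seeds, and the simulated responses are hypothetical.

```python
import random
import statistics


def bootstrap_bias(data, statistic, n_resamples=200, seed=0):
    """Estimate the bootstrap bias and standard error of a statistic."""
    rng = random.Random(seed)
    original = statistic(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw a sample of the same size, with replacement.
        resample = rng.choices(data, k=len(data))
        estimates.append(statistic(resample))
    # Bias: mean of the bootstrap estimates minus the original estimate.
    bias = statistics.fmean(estimates) - original
    std_error = statistics.stdev(estimates)
    return original, bias, std_error, estimates


# Simulated responses on a 1-5 scale (hypothetical data).
data_rng = random.Random(42)
scores = [data_rng.randint(1, 5) for _ in range(100)]
original, bias, std_error, estimates = bootstrap_bias(scores, statistics.fmean)
```

A bias close to zero relative to its standard error is the pattern that would support the maximum likelihood estimates.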
The sample’s descriptive statistics were analyzed as well, including data disaggregated by occupational category and analysis of the skewness and kurtosis of all the scales’ items. Furthermore, we applied reliability analysis based on Cronbach’s alpha, and mean difference analysis to determine whether the dimensions of the WDQ can detect differences between occupations (as in the original study). All those analyses were carried out using the statistics program IBM SPSS, version 20.
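For reference, Cronbach's alpha for a scale of k items is k/(k − 1) times one minus the ratio of the summed item variances to the variance of the total score. The Python sketch below implements that formula and shows why a reverse-worded item must be recoded first. The function names, the five-point response range, and the response matrix are hypothetical illustrations, not study data.

```python
import statistics


def reverse_score(value, low=1, high=5):
    """Recode a reverse-worded item (the 1-5 range is an assumption)."""
    return low + high - value


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    items = list(zip(*rows))  # transpose to one tuple per item
    item_variances = [statistics.variance(col) for col in items]
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses; the fourth item is reverse-worded.
responses = [
    [4, 5, 4, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 1],
    [2, 3, 2, 4],
    [4, 4, 5, 2],
]
recoded = [row[:3] + [reverse_score(row[3])] for row in responses]
alpha_raw = cronbach_alpha(responses)
alpha_recoded = cronbach_alpha(recoded)
```

In this toy example the unrecoded matrix yields a much lower (here negative) alpha than the recoded one, which is why reverse-worded items are scored in the scale's direction before reliability is computed.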
Results
Factor structure of the WDQ in Spanish
In determining multivariate normality, Mardia’s coefficient yielded standardized values all under the recommended maximum of 5 points: Task Motivational Work Characteristics = 2.197; Knowledge Motivational Work Characteristics = 1.844; Social Work Characteristics = 1.1756; Physical or Contextual Work Characteristics = 1.316.
Bootstrap estimations yielded almost negligible biases in the estimators obtained through maximum likelihood estimation. The bias of loadings on the dimensions of Task Motivational Work Characteristics ranged from –.004 to .008, with bias in standard error ranging from –.003 to .002 on task dimensions. Bias in factor loadings on Knowledge Motivational Characteristics fell between 0 and 0.151, with bias in corresponding standard error ranging from –.005 to .001. The bias in factor loadings on Social Work Characteristics ranged from –.005 to 0.103, with bias in standard error between –.007 and .004. Bias in factor loadings on Contextual Work Characteristics ranged from –.003 to .007, with bias in corresponding standard error ranging from –.004 to .001. Nevertheless, none of these biases reached the level of statistical significance with respect to zero.
As for the CFA results, Table 2 presents goodness of fit statistics for each proposed model, in the cultural adaptation as well as the original version of the instrument (except the single-factor model). We observed that the one-factor model, which was not included in the U.S. version but was in this adaptation, had the poorest goodness of fit, especially judging from the comparative fit indices, which were the most important in this case (CFI, BIC, and AIC); this indicates that participants were indeed able to distinguish among factors. Second, we observed that the pattern of goodness of fit improvement across the proposed models was similar in the two populations, consistent with the U.S. study, particularly when it comes to RMSEA. Larger discrepancies were observed, however, in the goodness of fit indices CFI and SRMR; but the models’ ranking according to comparative fit followed a similar pattern. Two things should be kept in mind. First, the sample size used in the cultural adaptation is practically twice that of the U.S. study, which has a negative impact, increasing the magnitude of different goodness of fit statistics in the Spanish sample. Second, every single model showed fewer degrees of freedom in the U.S. sample, leading us to assume that the original authors set restrictions on their models to improve goodness of fit (perhaps by correlating error terms) that were not sufficiently documented. With that in mind, the goodness of fit values reported for the U.S. sample may be overly favorable; but that does not prevent us from comparing models relative to one another.
Table 2. Confirmatory Factor Analysis Results from the WDQ by Morgeson and Humphrey (Reference Morgeson and Humphrey2006) and Its Spanish Adaptation
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-87220-mediumThumb-S1138741617000245_tab2.jpg?pub-status=live)
Note: N = 540 (US: United States sample); N = 1035 (SP: Spanish sample); SRMR = standardized root mean square residual; RMSEA = root mean square error of approximation; CFI = comparative fit index. Additional indicators (Spanish SP sample only): TLI = Tucker-Lewis index; AIC = Akaike information criterion; BIC = Bayesian information criterion. The 1-factor model was only tested in the Spanish adaptation process.
Regarding the additional indicators utilized (TLI, and to compare models, BIC and AIC), the 21-factor model continues to show the best goodness of fit, despite being the most complex. In effect, these indicators penalize complexity, yet comparing it to the other models, it seems to be the best. Regarding our analysis of the magnitude of increases and decreases, those data also tend to align with the 21-factor model proposed by the authors. While some increases in RMSEA are slightly less than 0.015, which might prompt a search for a more parsimonious solution, the decreases in CFI are slightly greater than .01, supporting the factor solution the authors proposed. That finding along with considerations presented above support the 21-factor solution.
Once the 21-factor model was identified as fitting the Spanish adaptation best (same case for the U.S. version), we analyzed factor configurations corresponding to each of the four major categories of work characteristics. Thus, CFA was applied to Task Motivational Work Characteristics (7 factors), another to Knowledge Motivational Work Characteristics (5 factors), a third CFA explored Social Work Characteristics (5 factors), and a fourth examined Physical or Contextual Work Characteristics (4 factors). Table 3 presents goodness of fit indices obtained using said analysis strategy.
Table 3. Confirmatory Factor Analysis Results for the Spanish Adaptation of the WDQ by Morgeson and Humphrey (Reference Morgeson and Humphrey2006), According to Macro Work Design Factors
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-31629-mediumThumb-S1138741617000245_tab3.jpg?pub-status=live)
Note: N = 1035 (SP: Spanish sample). SRMR = standardized root mean square residual; RMSEA = root mean square error of approximation; CFI = comparative fit index; TLI = Tucker-Lewis index.
Based on those criteria, it can be said that all the indexes computed, except χ²/df, suggest good fit between the models and the data, indicating that the underlying factor structure of work design, per the WDQ, was reproduced in this empirical study’s data. Figures 2, 3, 4, and 5 present the structures resulting from the various CFAs that determined the 21-factor model to be superior. Given the large sample size, the poor performance of the χ²/df statistic was to be expected.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-06134-mediumThumb-S1138741617000245_fig2g.jpg?pub-status=live)
Figure 2. Task Motivational Work Characteristics.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-35225-mediumThumb-S1138741617000245_fig3g.jpg?pub-status=live)
Figure 3. Knowledge Motivational Work Characteristics.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-74082-mediumThumb-S1138741617000245_fig4g.jpg?pub-status=live)
Figure 4. Social Work Characteristics.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-48469-mediumThumb-S1138741617000245_fig5g.jpg?pub-status=live)
Figure 5. Contextual Work Characteristics.
Convergent and discriminant validity indices of the WDQ in Spanish
The following results present item-factor loadings, which provide convergent validity indexes, and correlations between factors, which provide discriminant validity indexes. Observing Figure 2, the three subfactors of Task Motivational Characteristics (Work Scheduling Autonomy, Decision-making Autonomy, and Work Methods Autonomy) are highly correlated with one another, with values ranging from .82 to .91, and may constitute a unified autonomy factor, which is consistent with the current literature. Nonetheless, differentiating between these factors improves goodness of fit and, as the Discussion section will analyze, that is the most theory-consistent choice although it is not entirely consistent with discriminant validity indexes. The other four factors (Task Variety, Task Significance, Task Identity, and Feedback from Job) correlated with each other with coefficients ranging from .03 to .39, allowing for a discussion of relative independence between them all, with reasonable discriminant validity indexes. It is noteworthy that the Feedback from Job subfactor showed important correlations with the other subfactors, ranging from .30 to .49. All indicators’ loadings on their corresponding factors were significant (p < .001), ranging in magnitude from .42 (pertaining to R66) to .92 (corresponding to R41), so generally speaking, they meet the convergent validity requirements.
Figure 3, which portrays CFA results for the dimension Knowledge Motivational Characteristics, shows the correlations between the different subfactors and the items in each subfactor. Relations between subfactors were high, ranging from .51 (Job Complexity and Problem Solving) to .96 (Information Processing and Problem Solving). In other words, the Problem Solving subfactor had the highest correlation, with Information Processing, as well as the lowest, with Job Complexity. These results suggest a set of interrelated factors should be discussed, which would justify their treatment as an independent block. In terms of correlation coefficients between factors, we maintain they are reasonable evidence of discriminant validity. In this case, item loadings onto their respective factors were relatively low, and two were under .30; nonetheless, the remaining item-factor correlations easily account for convergent validity.
Figure 4 describes the CFA pertaining to Social Work Characteristics. Here, five subfactors are found with correlations ranging from .03 (Social Support and Initiated Interdependence) to .67 (Initiated Interdependence and Received Interdependence). Relations among different subfactors were moderate, which can provide discriminant validity evidence. Most of the loadings were over .60, with more than half over .70 and only two under .30; the above leads us to maintain that in general, evidence of convergent validity was found.
And finally, Figure 5 describes the CFA corresponding to Contextual Work Characteristics, finding four subfactors with correlations ranging from .03 to .96. Notice there were negative correlations between Ergonomics and Physical Demands (–.47), Physical Demands and Work Conditions (–.50), and Work Conditions and Equipment Use (–.20). The highest correlation was found between Ergonomics and Work Conditions (.96), and the lowest was between Ergonomics and Equipment Use (.03); in general, correlation coefficients were moderate, so it can be said that evidence of discriminant validity was found. All loadings were over 0.5, except for item R55, which was extremely low (0.08). Given these data, in this case, there is no clear, identifiable evidence of convergent validity.
Reliability of the Spanish-language WDQ scales
Table 4 presents different descriptive statistics pertaining to items on the scales of the Spanish-language WDQ, including skewness and kurtosis data. Responses to each item clearly show adequate dispersion across response options, demonstrating the instrument’s ability to measure different levels of these features. Furthermore, the items in general show good correlations with their respective dimensions, conveying their theoretical belonging. The Contextual/Physical dimensions showed weaker performance, which, as shown below, is associated with results on the Ergonomics dimension.
Table 4. Descriptive Statistics of Items on the Different Scales of the WDQ in Spanish
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-63420-mediumThumb-S1138741617000245_tab4.jpg?pub-status=live)
Note: N = 1035
Table 5 presents various descriptive and psychometric statistics pertaining to the WDQ. Means and standard deviations are displayed in the first two columns. The data indicate acceptable variability, but as expected, a certain trend toward the middle of the response scale. The third column addresses internal consistency, reporting Cronbach’s alpha values for each scale. The numbers are good or very good, except on the Ergonomics (.38) and Problem Solving (.60) scales, where internal consistency did not meet the accepted minimums for psychometric goodness of fit (around .70); on the weakest dimension, Ergonomics, although a slight change in internal consistency is observed when the inverse item is eliminated (it becomes .56), it was not enough to impact the instrument’s reliability (global internal consistency and mean reliability were unchanged) or other results (descriptive statistics and the instrument’s goodness of fit results), so it was kept. On the whole, Cronbach’s alpha values are slightly lower in the Spanish sample than in the U.S. sample, but on various scales – Decision Making Autonomy, Work Methods Autonomy, and Physical Demands – they are slightly higher in the Spanish sample. The global internal consistency (of 77 items) obtained for this adaptation of the WDQ was Cronbach’s alpha of .92, indicating a high level of homogeneity across items. Furthermore, mean reliability for the set of scales was Cronbach’s alpha of .77, compared to .87 in the U.S. sample.
Table 5. Means, Deviations, Reliability, and Statistics According to the 21-factor Model in U.S. and Spanish Samples
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-11352-mediumThumb-S1138741617000245_tab5.jpg?pub-status=live)
Note: a = alpha coefficient; b = ICC(2); c = rwg.
The last two columns of Table 5 present interrater correlations (the extent to which judges’ ratings of their jobs covary with other workers’ ratings, represented as an intra-class correlation coefficient, ICC(2); Bliese, Reference Bliese, Klein and Kozlowski2000) and interrater agreement (the absolute level of agreement among workers, that is, to what extent raters assign the same values on average, with the index rwg; James, Demaree, & Wolf, Reference James, Demaree and Wolf1984). In general, results suggest that worker agreement is relatively high when they assign scores to work characteristics, with overall agreement in the results of the U.S. and Spanish samples. Differences were found in the ICC(2) values of only three work characteristics that Morgeson and Humphrey’s original study did not find significant (see Table 5). High levels of interrater agreement (rwg index) in the U.S. and Spanish samples suggest that the results do not stem from idiosyncratic perceptions of the people in those samples, given that multiple judges were in broad agreement in their ratings of work characteristics.
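As a rough illustration of the agreement index, the Python sketch below computes the single-item and multi-item forms of rwg as commonly attributed to James et al. (1984). Each form compares the observed variance of judges' ratings with the variance expected under a uniform (random) response distribution, (A² − 1)/12 for A response options. The function names, the five-option scale, and the ratings are hypothetical.

```python
import statistics


def expected_uniform_variance(n_options):
    """Variance of a discrete uniform distribution over n_options points."""
    return (n_options ** 2 - 1) / 12


def rwg_single(ratings, n_options=5):
    """Single-item rwg: 1 minus observed variance over expected variance."""
    observed = statistics.variance(ratings)
    return 1 - observed / expected_uniform_variance(n_options)


def rwg_multi(item_ratings, n_options=5):
    """Multi-item rwg(J); item_ratings holds one list of ratings per item."""
    j = len(item_ratings)
    mean_variance = statistics.fmean(
        [statistics.variance(ratings) for ratings in item_ratings]
    )
    ratio = mean_variance / expected_uniform_variance(n_options)
    return j * (1 - ratio) / (j * (1 - ratio) + ratio)


# Four hypothetical judges rating the same job on a 1-5 scale.
one_item = rwg_single([3, 4, 4, 5])
three_items = rwg_multi([[3, 4, 4, 5], [3, 4, 4, 5], [3, 4, 4, 5]])
```

With identical per-item variances, the multi-item value is higher than the single-item value, which is one reason scale-level rwg figures tend to look favorable.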
Occupational indexes of the Spanish WDQ’s construct validity
As discussed at the end of the Introduction, Morgeson and Humphrey (Reference Morgeson and Humphrey2006) formulated four hypotheses about the likelihood that certain occupations would show high or low levels of specific work design characteristics, that being an index of construct validity. Considering the available data about this adaptation and adjusted procedure, we chose to replicate this analysis.
In empirically testing hypotheses, we used the same criteria the original authors used to form occupational categories. Thus, non-professional occupations included jobs in the SOC categories (U.S. Department of Commerce, 2000) “food preparation and serving related occupations,” “farming, fishing, and forestry,” “construction and extraction,” “installation, maintenance, and repair,” “production,” “transportation and material moving,” “military specific occupations,” and “building and grounds cleaning and maintenance” (the latter was added in the present study; it was not covered by Morgeson & Humphrey). Professional occupations, on the other hand, included jobs in the remaining SOC categories. Human-life focused occupations, meanwhile, included jobs in the categories “community and social services,” “healthcare practitioners and technical,” “healthcare support,” and “protective service” occupations; not human life-focused occupations included all remaining occupational categories. Finally, sales occupations were jobs in the category “sales and related;” job titles in all other categories were considered non-commercial – or non-sales – jobs.
The occupational structure described above, and results, appear in Table 6. Evidently, jobs in professional occupations had higher levels of Knowledge Characteristics, that is, Job Complexity, t(1033) = 3.20, p < .001; Information Processing, t(1033) = 7.80, p < .001; Problem Solving, t(1033) = 3.95, p < .001; Skill Variety, t(1033) = 3.33, p < .001; and Specialization, t(1033) = 1.99, p = .046. Professional jobs also showed higher levels of Work Scheduling Autonomy, t(1033) = 5.36, p < .001; Decision Making Autonomy, t(1033) = 5.21, p < .001; and Work Methods Autonomy, t(1033) = 4.93, p < .001. As a result, Hypothesis 4a was confirmed for the eight work design characteristics considered. Morgeson and Humphrey (Reference Morgeson and Humphrey2006) did not manage to gather support for the Specialization characteristic; the present study did, though its effect size was small.
Table 6. Means on Work Characteristics by Occupational Category
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-49852-mediumThumb-S1138741617000245_tab6.jpg?pub-status=live)
Note: All means were significantly different across occupational categories.
With respect to Hypothesis 4b, we found that jobs in non-professional occupations showed higher levels of Physical Demands, t(1033) = 11.25, p < .001, and less favorable Work Conditions, t(1033) = 9.56, p < .001, lending empirical support to our hypothesis. As for Hypothesis 4c, it has empirical support in the finding that jobs in human-life focused occupations displayed higher levels of Task Significance, t(1033) = 6.71, p < .001. Finally, jobs in sales occupations exhibited higher levels of Interaction Outside the Organization, t(1033) = 5.23, p < .001, thus confirming Hypothesis 4d.
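The mean comparisons above are reported as t(1033), which is what an equal-variance two-group t test yields for 1,035 respondents (1,035 − 2 = 1,033). The Python sketch below shows that computation together with Cohen's d. Whether SPSS was run with exactly this pooled-variance option is an assumption, and the function name and toy scores are hypothetical.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Equal-variance (Student) t statistic, its df, and Cohen's d."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.fmean(group_a) - statistics.fmean(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each group's variance by its own df.
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    pooled_sd = math.sqrt(pooled_var)
    t_value = mean_diff / (pooled_sd * math.sqrt(1 / n_a + 1 / n_b))
    cohens_d = mean_diff / pooled_sd
    return t_value, df, cohens_d


# Toy scale scores for two occupational groups (hypothetical).
t_value, df, cohens_d = pooled_t([4, 5, 6], [1, 2, 3])
```

Reporting d alongside t helps here because, with a sample this large, even small mean differences reach statistical significance.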
Discussion
This study’s main objective was to adapt the most comprehensive, thorough measure of work design available, the Work Design Questionnaire by Morgeson and Humphrey (Reference Morgeson and Humphrey2006), into the Spanish language to give researchers and practitioners alike a good instrument with which to investigate, measure, and change the reality of work. The WDQ attempts to exhaustively measure work design. Toward that end, its authors considered the specialized literature on this subject, particularly the literature about Hackman and Oldham’s (1980) model of job characteristics and subsequent models. The result was an extensive questionnaire (77 items) that measures 21 different dimensions of work design. It is indeed a very complete measure of work design, especially considering the number of measures other instruments use to do the same: the Job Diagnostic Survey –JDS– by Hackman and Oldham (Reference Hackman and Oldham1980) (5 measures); the Multimethod Job Design Questionnaire –MJDQ– by Campion and Thayer (Reference Campion and Thayer1985) (4 measures); the Job Content Questionnaire –JCQ– by Karasek et al. (Reference Karasek, Brisson, Kawakami, Houtman, Bongers and Amick1998) (7 measures); the Measurement of Job Characteristics by Sims, Szilagyi, and Keller (Reference Sims, Szilagyi and Keller1976) (6 measures); and the New Scales of Timing Control, Method Control, Monitoring Demand, Problem-solving Demand and Production Responsibility by Wall, Jackson, and Mullarkey (Reference Wall, Jackson and Mullarkey1995) (5 measures).
The WDQ appears in the context – it is partly a result of this context – of a paradigm shift in work design. From solely examining work itself, there has been movement toward considering it an activity undertaken within a social and technological context, thus broadening the scope of interest, and explaining and justifying the instrument’s large number of measures. As a work measurement tool preferred by researchers around the world, the WDQ is playing an important role in this paradigm shift.
In short, this questionnaire has been translated and adapted into many languages (e.g., German by Stegmann et al., Reference Stegmann, van Dick, Ullrich, Egold, Wu, Charalambous and Menzel2010; Chinese by Chiou et al., Reference Chiou, Chou and Lin2010; Polish by Hauk, Reference Hauk2014; and according to Morgeson, 2011, there are unpublished, preliminary adaptations into French, Italian, and Portuguese, and especially preliminary versions in Arabic, Hebrew, Japanese, and Korean), and it was deemed highly advantageous to translate it into Spanish, too. With that in mind, this study’s objective was to premiere the Spanish version of the WDQ, and try to determine its internal consistency, factorial structure, and relation to certain other criteria that indicate validity. The discussion that follows will address and expound upon various aspects of the WDQ and its Spanish adaptation.
First, the WDQ lacks a clear theoretical framework. The set of dimensions that comprise it came from an in-depth review of documents, which identified numerous variables examined in the research literature of the last 60 years. A process of selection, synthesis, differentiation, and definition identified the 21 factors that were ultimately included in the questionnaire. The task of identifying, recognizing, and constructing items was similar. The authors indicate which items they took from other sources, and which they newly created. Even so, probably not every potential dimension of work design was tapped. Aspects like motivation, the worker’s emotional well-being, job security, a deeper exploration of the worker’s personality, etc., and the time factor all warrant consideration.
That being said, its lack of a theoretical framework is an important limitation that is largely ameliorated by the fact that recently, probably semi-coincidentally, several purely theoretical articles were published that advocate for a reconsideration of work design. The views they propose constitute a real paradigm shift on the subject toward adoption of what, one way or another, has come to be called “extended work design theory” or “work design in situ.” Such is the case of papers by Parker et al. (Reference Parker, Wall and Cordery2001), Humphrey et al. (Reference Humphrey, Nahrgang and Morgeson2007), and others. The simultaneity of these contributions to the field in different parts of the world, adopting almost a shared perspective, might be explained by the spirit of the times, as a need that is collectively, intuitively felt and answered. In any case, the question is how far contextual design should go, and what variables it should take into account, and conversely, what environmental determinants of design should be examined. So far, what has been done is to break the old molds and propose more or less extensive lists of variables that are presumed to be important without the least bit of empirical support.
Based on this process of theoretical reflection and empirical construction of the instrument, the WDQ’s length is justified. The time invested by each participant is acceptable, which becomes especially important when it is compared to other common methods; therefore, the WDQ seems ideal for working with large samples for mainly research purposes. However, its advantage in terms of data collection has a downside: certain constructs are evaluated quite superficially. In addition, it is missing high- and low-difficulty items; this makes it harder to compare jobs with extreme characteristics and reduces the instrument’s sensitivity to small deviations within homogenous samples. In addition to this lack of depth, the WDQ cannot attempt to evaluate all relevant characteristics. Some recent theoretical reflections suggest a need to select characteristics based on the situation (Parker & Ohly, Reference Parker, Ohly, Kanfer, Chen and Pritchard2008). Despite those limitations, the WDQ includes strategies to avoid earlier instruments’ failures and limitations; for instance, its response format is very simple, and all items are phrased affirmatively even though that means reverse scoring certain items.
After looking at certain general features of the instrument, there are aspects of the Spanish adaptation to consider. In the adaptation process, there were certain deviations from the original test that should be kept in mind: a) The first has to do with participants’ differential work experience. Whereas Morgeson and Humphrey required respondents to have 15 years’ experience in their job, this study merely required a minimum work experience of three years, and six months’ tenure in their current job. Yet the data indicate that in this study, respondents’ average experience in their current jobs was 11 years, which was basically equivalent to the original sample. b) Morgeson and Humphrey, based on some research, chose to group each dimension’s items together when collecting data. This adaptation process began the same way, but changes had to be made because of respondents’ high level of rejection of certain items being repeated almost word for word. Some of the differences in results might be due to that variation in item order. In particular, somewhat inferior results were observed in the internal consistency of scales compared to those Morgeson and Humphrey reported (α values between .80 and .95 with an average of .87). Nevertheless, the total instrument’s internal consistency (of 77 items) was Cronbach’s alpha of .92, indicating a high level of homogeneity between items. The various scales’ reliability ranged from Cronbach’s alpha of .70 to .96, except for three: Job Complexity (α = .69), Problem Solving (α = .60), and Ergonomics (α = .38). 
In the latter, which deviated a lot from expected values, α would change slightly if the dimension’s inverted item were eliminated (α = .56), but that change was not satisfactory and did not significantly improve other results, like the reliability of the scale on the whole (its internal consistency metrics stayed the same), descriptive statistics, or the instrument’s goodness of fit results, which did not show greater variation. Considering that eliminating the Ergonomics item has little impact on the instrument’s results, and that this problem is shared by Morgeson and Humphrey’s original version as well as successive adaptations of it in other languages (e.g., Stegmann et al., Reference Stegmann, van Dick, Ullrich, Egold, Wu, Charalambous and Menzel2010), we chose to keep the original instrument’s structure. That being said, this is clearly a weakness to resolve, and an issue for this dimension’s measurement stability (probably related to how items were worded), so to researchers and practitioners alike, we must urge care and caution when interpreting this factor of the Spanish WDQ.
Despite the above, it is noteworthy that the average global reliability of all the different dimensions was α = .77, and that every dimension’s reliability was higher (with the aforementioned exceptions) than its JDS equivalent (with reliability ranging from .65 to .70 and average reliability of .68 according to Taber & Taylor, Reference Taber and Taylor1990). This highlights the psychometric potential of this work design questionnaire.
It is also important to point out the high levels of interrater agreement (especially per the rwg index) found in both the U.S. and Spanish samples. This suggests the results are not idiosyncratic perceptions, because multiple judges were in broad agreement in their appraisals of work characteristics. In any case, those values Morgeson and Humphrey (Reference Morgeson and Humphrey2006) regarded so highly are not as important if one considers that the test is reliable and valid, and therefore reliably captures true differences between the wide array of jobs that were studied. In other words, low interrater correlation and agreement could also be the result of meaningful variability in job design, variability which the WDQ captures reliably. In that case, low interrater correlation and agreement would be a better indicator than if the opposite were true.
In relation to the instrument’s construct validity – and independently of the WDQ’s lack of theoretical backing, which was reasonably well resolved above – what is certain is that the hypothesized 21-factor structure received considerable empirical support. The one-factor model did not show goodness of fit (that is, participants distinguish between factors), and other models did (with 4, 18, 19, and 20 factors, consistent with Morgeson & Humphrey). However, it was the 21-factor model that showed the highest goodness of fit despite keeping all items in the Ergonomics dimension together (for reasons explained above). Along those lines, apparently the pattern of improvement in goodness of fit across the proposed models is similar in the U.S. and Spanish populations, especially in terms of the index RMSEA. There is, however, a larger discrepancy in goodness of fit according to the indices CFI and SRMR, even when the models’ ranking in terms of comparative fit followed a similar trend. It is important to bear in mind that support for the authors’ factor solution is consistent with the analysis of increase and decrease magnitudes. Thus, while some increases in RMSEA were slightly less than 0.015 (Chen, Reference Chen2007), which might invite a quest for a more parsimonious solution, decreases in CFI were slightly greater than 0.01 (Cheung & Rensvold, Reference Cheung and Rensvold2002), supporting the factor solution the authors proposed. Therefore, considering these antecedents altogether, we chose to keep the 21-factor solution, because there was not enough reason to select a solution other than Morgeson and Humphrey’s (Reference Morgeson and Humphrey2006).
Moreover, based on this model, applying CFA to the four major factorial dimensions generated important information about the goodness of fit of the different latent factors, that is, Task Motivational (7 factors), Knowledge Motivational (5 factors), Social Work (5 factors), and Physical or Contextual Work Characteristics (4 factors). All models showed goodness of fit, thus confirming the appropriateness and stability of Morgeson and Humphrey’s (Reference Morgeson and Humphrey2006) original model. We also computed convergent and discriminant validity indices for the Spanish WDQ. Generally speaking, the convergent validity requirements described by Shipp et al. (Reference Shipp, Burns and Desmul2010) were met. The exception was Ergonomics, where the shortfall could relate to the dimension’s measurement stability, a problem that eliminating an item would not correct. Furthermore, in terms of discriminant validity, the requirement that most correlations between factors be under .85 was met, except in the case of the Autonomy characteristics. These could be interpreted as a single factor, were it not that dividing them into three components is highly consistent with theory and improves the goodness of fit of the factorial structure.
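The discriminant validity check mentioned above (inter-factor correlations under .85) can be illustrated with a short sketch. The correlation values below are invented for the example and are not the study’s results; the function name is likewise an assumption.

```python
from itertools import combinations

THRESHOLD = 0.85  # cutoff for inter-factor correlations, as stated in the text


def flag_high_correlations(names, corr, threshold=THRESHOLD):
    """Return factor pairs whose absolute correlation reaches the threshold."""
    flagged = []
    for i, j in combinations(range(len(names)), 2):
        if abs(corr[i][j]) >= threshold:
            flagged.append((names[i], names[j], corr[i][j]))
    return flagged


# Hypothetical latent-factor correlation matrix (symmetric, unit diagonal).
names = [
    "Work Scheduling Autonomy",
    "Decision-Making Autonomy",
    "Work Methods Autonomy",
    "Task Variety",
]
corr = [
    [1.00, 0.88, 0.86, 0.41],
    [0.88, 1.00, 0.90, 0.38],
    [0.86, 0.90, 1.00, 0.45],
    [0.41, 0.38, 0.45, 1.00],
]
flagged = flag_high_correlations(names, corr)
```

With these made-up values only the three autonomy pairs are flagged, which mirrors the situation described in the text: a statistical case for merging them that is weighed against theoretical reasons for keeping them apart.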
Notwithstanding the above, when CFA is applied to the total instrument, goodness of fit falls to medium or low levels due to the instrument’s large number of items. The number of inter-item correlations that the model must explain rises steeply (roughly with the square of the number of items), which keeps adequate overall fit from being obtained. Nevertheless, we considered that applying CFA to the macro work design factors would be more suitable, since it evaluates the internal dimensionality of each set more specifically, and the results bear this out. The lower overall goodness of fit should undoubtedly be addressed in future studies.
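To give a sense of scale, the number of distinct inter-item correlations among p items is p(p - 1)/2, so it grows with the square of the number of items. The sketch below uses 77 items, which is the length of the original English WDQ and is taken here as an assumption for the Spanish version, alongside a hypothetical 20-item block.

```python
def n_unique_correlations(n_items: int) -> int:
    """Number of distinct pairwise correlations among n_items variables."""
    return n_items * (n_items - 1) // 2


# 77 is assumed from the original WDQ; 20 is a hypothetical subset size.
whole_instrument = n_unique_correlations(77)
single_block = n_unique_correlations(20)
```

The whole-instrument model has to reproduce thousands of correlations, whereas a model for a single block of factors has to reproduce only a few hundred, which is one reason fit indices tend to look better when each macro dimension is analyzed separately.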
Additionally, the fact that different studies and adaptations have found the same factorial structure is evidence of confirmatory factor validity, because it rests on certain expectations about structure. At the same time, the authors’ reasoning about which relations to expect between the various WDQ measures and external measures derived from other measurement techniques holds up, despite certain inconsistencies.
In view of the available data, we sought further evidence of the instrument’s validity. We replicated the validity analysis of differences between occupational categories, examining whether WDQ dimensions can detect differences between jobs belonging to different occupations using cognitive, interpersonal, and physical variables. In other words, we tried to determine whether certain occupations, based on estimated scores on variables external to the WDQ, are more likely to present high or low levels of specific work design characteristics. Toward that end, Morgeson and Humphrey’s (Reference Morgeson and Humphrey2006) four original hypotheses about the expected relation were tested. Results confirmed the hypothesized relationship across the board. From the above, we concluded that the results provide important empirical evidence favoring the Spanish WDQ’s validity.
As for possible methodological limitations, data on all constructs were collected solely through questionnaires; thus, relations could potentially have surfaced as a result of common-method bias. That is a risk we are willing to take, not only because differences were found in objective criteria (professional group, autonomy, management responsibility), but also because not “everything is correlated with everything else.” Rather, the patterns of results are differential; in addition, confirmatory factor analysis establishes that different constructs were indeed evaluated. For these reasons, and because these results were reproduced in various other studies and adaptations into other languages, we are confident in the results obtained. That being said, the validation process still has a long way to go.
With that in mind, it would be good to establish the consistency over time of the measures obtained. The response format is so simple that, observing respondents as they answer items, one tends to think it does not matter to them whether they answer 1 or 2, or 3 or 4, for example, when the reality is different. The response format should indeed be quite simple, but it should also prevent respondents from falling into a routine that could nullify differences between responses. Therefore, we suggest creating an instrument with far fewer items and clearly differentiated response options, where choosing between 1 and 2, for example, is a decision the respondent must make responsibly and with a clear sense of the reality they presumably know.
Finally, though we have repeatedly argued that the WDQ has many dimensions yet cannot include them all, we are strongly convinced that an instrument that truly allows work to be designed or redesigned should have fewer measures. Work certainly is a complex reality, but is it really necessary to establish so many independent dimensions? Are so many dimensions truly independent? If a person were tasked with redesigning a job or other activity, would he or she know what to do with so many dimensions? As much as the empirical results lend their support, does it not seem that various aspects of the same dimension are being accounted for, rather than distinct dimensions? Therefore, it would be beneficial in the future to test these dimensions in practice and, in so doing, try to ascertain the different dimensions’ higher or lower authenticity (Footnote 1, Footnote 2).
Appendix
Relation between WDQ Items and their Spanish-language Versions
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170630082402-19457-mediumThumb-S1138741617000245_tabau1.jpg?pub-status=live)