One of the main challenges in human resource (HR) management is assessing and developing the competencies that allow for adaptation and success in increasingly competitive environments (Armstrong & Taylor, 2014). As the display and development of individual competencies is expected to stimulate workers to propel their organizations and themselves toward success, there has been growing investment in competency modeling and development (Du Gay, Salaman, & Rees, 1996). However, although the body of knowledge on competency-based approaches to managing and developing personnel is significant (Armstrong & Taylor, 2014), there is a gap in the literature with regard to the psychometric evaluation of competency assessment instruments specific to the languages and cultures to which they are applied.
The usual application of Anglo-Saxon-language questionnaires to different cultural contexts has been shown to yield less accurate, less comparable data (Batista-Foguet, Boyatzis, Guillén, & Serlavós, 2008; Extremera, Fernández-Berrocal, & Salovey, 2006). Indeed, the back-translation of questionnaires cannot guarantee construct equivalence, and a series of methodological problems, such as invalid substantive inferences and the perpetuation of unreliable measures, may follow, thus compromising the systematic accumulation of research findings (Cheung, 2004). This is in line with a growing call for the development of valid HR assessment instruments for cultures other than Anglo-Saxon ones (e.g., Extremera et al., 2006; Vandenberg & Lance, 2000).
In addition, the assessment of behavior at work through managerial measurement instruments is controversial, and scholars have raised concerns about the validity of such measures for research purposes (e.g., Scullen, Mount, & Judge, 2003). Researchers mostly focus on internal, statistical or external validity rather than on bridging the gap between abstract theoretical constructs and their measurements, i.e., on the prior and necessary test of an instrument's construct validity (Bagozzi, 1984). The present paper addresses this issue while responding to the appeals made by Bagozzi (1984) and Jarvis, Mackenzie and Podsakoff (2003) to devote more effort to theoretically justifying measurement (i.e., epistemic) hypotheses by proposing and illustrating a comprehensive approach to construct validity. We argue that the common use of either Exploratory or Confirmatory Factor Analysis (EFA or CFA) and the (ab)use of Cronbach's alpha coefficient do not guarantee construct validity. These practices over-simplify construct validity assessment; consequently, measurement misspecifications lead to biased estimates of the structural relationships among the variables. This caveat is important because it can seriously limit the validity of further research findings and ultimately result in the development of interventions that lack the appropriate empirical foundation. Thus, the objective of this paper is twofold: first, to address the need for more culturally sensitive assessment instruments in the field of human resources by developing a Spanish Questionnaire of Personal and Motive-based Competencies; and second, to test its psychometric properties and factorial invariance across gender and work experience while critically reflecting on the common use of those testing techniques that provide validity evidence (e.g., Exploratory or Confirmatory Factor Analysis) but alone cannot guarantee construct validity.
In this paper, we attempt to contribute to two streams of literature. First, we add to the research on managerial competencies by developing a questionnaire for Spanish-speaking countries, thereby avoiding construct bias due to the different meanings attached to the test dimensions by different groups (Libbrecht, De Beuckelaer, Lievens, & Rockstuhl, 2014). Second, we supplement the literature on methodological issues related to the measurement of competencies (e.g., Batista-Foguet, Saris, Boyatzis, Guillén, & Serlavòs, 2009) by illustrating good practices for the validation of a newly designed questionnaire. Herein, we guide the reader through the steps that scientists and practitioners embarking on similar projects may need to consider; this emphasis on methodology supplements the previous literature on competency development by providing a tool that can foster accurate assessment and development of competencies in Spanish-speaking countries.
Theoretical Framework: A motive-based structure of competencies
Competencies have been proposed to predict life and job outcomes. A growing number of studies have examined this claim and have shown that competencies lead to numerous positive job outcomes, such as popularity, individual performance and status achievement (e.g., Du Gay et al., 1996; Emmerling & Boyatzis, 2012). Unlike other constructs, such as cognitive intelligence, which is considered relatively stable across the lifespan, competencies are shaped by experience and can be developed (Emmerling & Boyatzis, 2012). In fact, people learn, change and grow over the course of their careers, and the assessment of competencies followed by feedback on strengths and weaknesses provides a road map for the development of executives throughout their careers.
Nonetheless, scholars have raised concerns regarding the structure of competencies that can best serve organizations (Bartram, 2005). The management literature suggests two types of competencies: personal competencies, which are based on the idea that effective leadership depends on the characteristics of the leader, such as his/her personality, and social competencies, which are based on the idea of leadership as a social process (Bartram, 2005; Petrides & Furnham, 2001). Since we understand that being effective in the work environment requires both adequate personal characteristics and relational skills, our instrument draws on both streams of literature to propose personal and motive-based competencies.
With respect to social competencies, scholars advocate that competencies should be placed within the motivational realm (e.g., McClelland & Boyatzis, 1982). The three-factor structure based on the three social motives of affiliation, achievement and power of McClelland, Atkinson, Clark, and Lowell (1953) has been repeatedly used across studies to classify social competencies into motive-based dimensions (e.g., Batista-Foguet et al., 2008; Guillén & Saris, 2013). Indeed, the three motive-based clusters of competencies have been shown to have different personality correlates and are related to different facets of performance at work (Guillén & Saris, 2013). Thus, drawing on previous research linking motives with behavior at work (e.g., Bartram, 2005; Guillén & Saris, 2013), we classified social competencies into three higher-order dimensions: collaboration (manifestations of the affiliation motive), mobilization (manifestations of the power motive) and achievement (manifestations of the achievement motive). Following a literature review targeting the competencies to include in each motive-based dimension (e.g., Bartram, 2005; Guillén & Saris, 2013), we propose the following: mobilization includes inspirational leadership, influence, communication, conflict management, service orientation and development of others; collaboration encompasses empathy, teamwork and flexibility; achievement includes achievement orientation, responsibility, problem solving, and planning and organization.
With respect to the personal competencies included in our model, we drew on the emotional intelligence (EI) and management literature. A review of this literature indicates that the following personal characteristics are commonly linked to effective behavior at work and are the basis for effective management: self-efficacy, self-control, optimism, assertiveness, initiative and stress management (Bar-On, 2006; Bartram, 2005; Du Gay et al., 1996; Petrides & Furnham, 2001). This body of research demonstrates that these competencies are related to positive job outcomes, such as commitment, performance and satisfaction. For these reasons, our personal competency cluster includes self-confidence, self-control, positive outlook, stress management and assertiveness. The 19 competencies are defined in Table 1.
Table 1. Competencies included in the SQPMBC

In summary, the competencies in our Spanish questionnaire comprise personal and motive-based competencies. However, the discussion of its content validity remains rather broad, and further empirical evidence is needed to understand how well such a taxonomy suits a Spanish-speaking culture. As noted earlier, assessment instruments are often developed for Anglo-Saxon cultures and then translated into different languages (Batista-Foguet et al., 2008). Given the risk of a lack of semantic (linguistic differences), conceptual (disparity of measures) and scaling (scoring-format interpretation or calibration) equivalence across samples (Libbrecht et al., 2014), this procedure has drawbacks. In the ensuing discussion, we explore this issue by addressing the psychometric properties of a Spanish questionnaire of personal and motive-based competencies. We focus on construct validity, which is a major concern when using questionnaires to assess behavioral dimensions at work (Scullen et al., 2003), since its absence can lead to weak content validity, unstable factor structures and a lack of empirical support for divergent or convergent validity (McEnrue & Groves, 2006).
Figure 1 illustrates the overall research plan and structure of this paper. First, we provide a brief introduction to the sample in this study and present the competency model development process of the questionnaire (steps 1–5 in Fig. 1). Second, we illustrate the process for assessing the content validity of the items intended to measure the competencies (step 6). After agreeing on the definitive competency model (step 7), we performed a second study to reassess the psychometric properties of the scale and study its external validity by evaluating the effects of gender and work experience on each of the proposed competencies (step 8). Because we are targeting a general audience in Spanish-speaking countries, it is important to check the external validity of our test. The paper concludes by discussing the implications of the results for the research and practice of management assessment methods.

Figure 1. Competency model development and validation process.
Method
Participants
Two different groups in two independent studies completed the questionnaire. A study group composed of scholars, coaches and organizational experts in competency development (n1 = 274) participated in the first testing of the questionnaire. The sample for the second study consisted of participants (n2 = 482) from a public institution, the Chamber of Commerce in Navarra, Spain, who voluntarily enrolled for an assessment of their competencies. The assessment of the psychometric properties of the Spanish Questionnaire of Personal and Motive-based Competencies (SQPMBC) was based on this definitive sample, which included 244 men and 238 women, with a mean age of 35.7 years (SD = 7.83) and a mean of 3.1 years of work experience (SD = 2.0) for the women and 3.8 years (SD = 2.2) for the men.
Both samples can be considered convenience samples: subjects were recruited for their knowledge of the topic in the first study and for their accessibility (in fact, self-selection) in the second study.
Instrument Development
As a preliminary step, we undertook a systematic literature review with a special focus on recognized questionnaires such as the Trait EI Questionnaire (TEIQue; Petrides & Furnham, 2001), the EI Questionnaire (EIQ; Dulewicz, Higgs, & Slaski, 2003) and the Inventario Bochum de Personalidad y Competencias (Hossiep & Parchen, 2006). We also analyzed best practices in competency modeling in other European settings (a review of the handbook of the competency conference held in London in 1997) and drew on our twenty years of experience in teaching the evaluation and development of emotional and social competencies at a Spanish business school and other organizations. Once the theoretical foundations of the dimensions were identified and ratified by scholars and expert practitioners, we continued the scale development process by taking a deductive approach. The appropriateness of this approach in our particular situation did not, however, preclude us from taking preventive measures to assure face and content validity in the generation of the initial items.
The original questionnaire attached six items to each of the 19 competencies displayed in Table 1, for a total of 114 items.
Since an 11-point scale has been shown to be an answer modality that provides higher-quality data when the aim is to assess the frequency of behaviors (Batista-Foguet et al., 2009), which is the case with our questionnaire, we used this scale. Accordingly, respondents were asked to indicate the frequency of the behavior described by each item on an 11-point scale ranging from 0 ('the behavior is never shown') to 10 ('the behavior is consistently shown').
Procedure
To gather evidence of the validity of the proposed taxonomy of competencies, ten HR managers were asked to classify each of the competencies in Table 1 according to the proposed personal and motive-based theoretical structure: personal, collaboration, mobilization and achievement. The managers were asked to complete the task individually and then reach a group consensus. All competencies were successfully assigned to the theoretical clusters. The next step was to generate a set of six items per competency and to conduct rounds of discussion to achieve consensus and content validity. Following this, three external groups (coaches involved in leadership development, potential users of the questionnaire and a group of professionals from HR departments) were invited to take part in the content validity process. Thus, after the generation of items, a preliminary version of the questionnaire was distributed to these three external groups, who were asked to comment on the clarity of the wording and the face validity of the items and to assign each item to the hypothesized competency. To do so, the items, competency labels and competency definitions were randomly presented to the external groups with a request to first match each competency definition to what they believed to be its corresponding competency label. Subsequently, they were asked to match each item to its corresponding competency label. This process led to a few amendments before the questionnaire was made available for the first study. The first version of the SQPMBC comprised 114 items aimed at assessing the 19 competencies listed in Table 1. An ad hoc web platform was built to collect the participants' responses.
Data Analysis
Once the data were gathered, we screened for missing values. We found a maximum of 1% of missing values per variable, and missing values were imputed using the SPSS EM maximum likelihood method (Cuesta & Fonseca-Pedrero, 2014; Fernández-Alonso, Suárez-Álvarez, & Muñiz, 2012).
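To make the imputation step concrete, the following is a minimal sketch of EM-style imputation under a multivariate-normal model. It is illustrative only, not the SPSS implementation used in the study: missing cells are replaced by their conditional expectation given the observed cells, the mean and covariance are re-estimated, and the two steps alternate. All data below are simulated.

```python
import numpy as np

def em_impute(X, n_iter=50):
    """Minimal EM-style imputation under a multivariate-normal model.

    Illustrative sketch only (not the SPSS algorithm): each row's missing
    entries are replaced by their conditional expectation given the row's
    observed entries; the mean vector and covariance matrix are then
    re-estimated from the completed data, and the steps are repeated.
    """
    X = np.array(X, dtype=float)
    miss = np.isnan(X)
    # Initialize missing cells with column means
    col_means = np.nanmean(X, axis=0)
    X[miss] = np.take(col_means, np.where(miss)[1])
    for _ in range(n_iter):
        mu = X.mean(axis=0)
        S = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # small ridge for stability
        for i in range(X.shape[0]):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            # Conditional mean of the missing block given the observed block
            X[i, m] = mu[m] + S[np.ix_(m, o)] @ np.linalg.solve(S[np.ix_(o, o)], X[i, o] - mu[o])
    return X

# Simulated responses with roughly 1% missing values, as in the study's data
rng = np.random.default_rng(0)
complete = rng.normal(7.0, 1.5, size=(200, 5))
observed = complete.copy()
holes = rng.random(observed.shape) < 0.01
observed[holes] = np.nan
imputed = em_impute(observed)
```

Note that only the missing cells are replaced; observed values pass through unchanged.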
In order to assess the underlying factor structure of a questionnaire, it is first necessary to study the nature of the items, i.e., whether they are formative or reflective (see Bisbe, Batista-Foguet, & Chenhall, 2007; Jarvis et al., 2003). While it is common in management research to consider constructs as reflective, scholars are beginning to acknowledge the importance of the distinction between reflective and formative items (Bisbe et al., 2007; Jarvis et al., 2003).
We suggest that management-related competencies can be formed of both reflective and formative items and that scientists and practitioners should be aware of this difference when developing and assessing their questionnaires. Otherwise, the "ritual" usually followed, rooted in Classical Test Theory (Nunnally, 1978), leads to the use of tools based on internal consistency, such as EFA and Cronbach's alpha, which can lead to misspecification of the epistemic relationships and, consequently, to biased estimates of the structural relationships (Bisbe et al., 2007; Jarvis et al., 2003).
Thus, our concern is not with Cronbach's alpha and factor analysis per se, but with the unreflective (or ritualistic) use of both. With regard to factor analysis, the paper proposes first assessing the formative or reflective nature of the items; otherwise, some items would have been discarded because they would have been erroneously deemed unreliable. With regard to Cronbach's alpha, what we point out, and correct for in our study, is its frequent use without checking its preconditions, such as the need for tau-equivalence.
To test the hypothesis that the relationships among the items can be accounted for by the nineteen hypothesized factors (i.e., the competencies' distinct scales), we used LISREL 8.80 on the covariance matrix to estimate the factorial structure of the questionnaire.
The different nature of the items (i.e., reflective or formative) entails assessing them differently when reducing the questionnaire's length to avoid boredom and fatigue. We examined the relevance of the reflective items in our pilot study according to their internal consistency within each competency (loading magnitude) and thereby eliminated items with factor loadings below .65. Formative indicators, however, are not expected to correlate with each other; therefore, the traditional measures used to find validity evidence are not appropriate (Bisbe et al., 2007). Thus, we assessed the formative indicators by their relevance to the domain.
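The .65 retention rule for reflective items amounts to a simple filter over the standardized loadings; the item names and loading values below are hypothetical.

```python
def prune_reflective_items(loadings, threshold=0.65):
    """Keep reflective items whose standardized factor loading reaches the
    pilot-study retention threshold (.65 in this paper)."""
    return {item: lam for item, lam in loadings.items() if lam >= threshold}

# Hypothetical pilot loadings for one competency's six reflective items
pilot = {"emp1": 0.82, "emp2": 0.71, "emp3": 0.58, "emp4": 0.66,
         "emp5": 0.61, "emp6": 0.74}
kept = prune_reflective_items(pilot)
```

This criterion applies only to reflective items; as the text notes, formative indicators are screened by their relevance to the domain, not by loadings.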
Concerning reliability, we used Cronbach's alpha to assess the internal consistency of each set of reflective items. However, for those competencies for which tau-equivalence was not fulfilled, we used Heise and Bohrnstedt's Ω coefficient (Heise & Bohrnstedt, 1970), which only requires that the factor analysis model fit.
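Both coefficients can be computed directly from the data and the fitted loadings. The sketch below uses simulated data and a single-factor omega in the spirit of Heise and Bohrnstedt's Ω; it is an illustration of the two formulas, not the LISREL-based estimates reported in the paper.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha; an unbiased reliability estimate only under
    (essential) tau-equivalence, i.e. equal true-score loadings."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def omega(loadings):
    """Composite reliability for a congeneric one-factor model (in the
    spirit of Heise & Bohrnstedt's Omega): requires only that the factor
    model fit, not equal loadings. Assumes standardized loadings."""
    lam = np.asarray(loadings, dtype=float)
    unique_var = 1.0 - lam ** 2
    return lam.sum() ** 2 / (lam.sum() ** 2 + unique_var.sum())

# Simulated tau-equivalent data: four items with equal loadings of .7
rng = np.random.default_rng(0)
factor = rng.normal(size=2000)
items = 0.7 * factor[:, None] + np.sqrt(1 - 0.7 ** 2) * rng.normal(size=(2000, 4))
alpha = cronbach_alpha(items)

# With unequal (congeneric) loadings, omega is the appropriate coefficient
omega_val = omega([0.80, 0.75, 0.70, 0.65])
```

When tau-equivalence fails, alpha underestimates reliability, which is why the paper falls back on Ω for those competencies.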
Results
Pilot and First Study
After the three external groups reached a consensus regarding the test content, as described in the Procedure section, we assessed the psychometric properties of the items. Since the questionnaire included both formative and reflective items, we specified a multiple indicators-multiple causes (MIMIC) model to assess the scale's psychometric properties. In the pilot study (n = 274), CFA did not reject the hypothesized 19-factor structure.
This pre-test led to the elimination of a number of items; the questionnaire was reduced from 114 to 95 items and then to 76 items (i.e., from the original six items per competency, to five, and finally to four). The items were pruned according to the following criteria: a) we excluded the formative items that made a less substantial contribution, and b) we excluded the reflective items with the lowest reliabilities.
Definitive Study Results
Social Desirability
It is well known that the characteristics of a questionnaire and its respondents can lead to systematic patterns of answering (known as response styles) that undermine conclusions. According to Steenkamp, De Jong, and Baumgartner (2010), one such pattern is due to respondents' enduring tendency to provide overly positive self-descriptions (i.e., socially desirable responding, SDR). In fact, people may delude themselves into thinking that they are doing what is desired.
On the one hand, as the SQPMBC is a self-assessment questionnaire, it is likely that the data provided biased information due to the inflated self-perceptions of the respondents. To address this potential SDR problem, we included a subset of 20 items from Paulhus's (1991) Balanced Inventory of Desirable Responding (BIDR), which assess the degree of a conscious moralistic response tendency (MRT, 10 items) and egoistic response tendency (ERT, 10 items).
On the other hand, our data collection process allowed us to assume that the respondents' incentives toward favorable self-assessment were minimal. Indeed, there was no incentive to make an impression because the respondents answered the questionnaire on a voluntary basis and under conditions of guaranteed anonymity. However, the average scores were generally fairly high, indicating a ceiling effect due to the frequency of responses on the right side of the scale. We initially interpreted this tendency toward higher scores as related to social desirability. However, the results showed that most of the correlations between each of the 19 competencies and the ERT or MRT were either non-significant or negligible, i.e., none of the standardized regression coefficients exceeded 0.2 (Steenkamp et al., 2010).
Global Goodness of Fit and Detailed Diagnosis
All loadings of the selected reflective items per competency in the final version of the SQPMBC are above .65. All global indexes, such as the χ2/df ratio, Root Mean Square Error of Approximation (RMSEA), Comparative Fit Index (CFI), Non-normed Fit Index (NNFI) and Parsimony Goodness-of-Fit Index (PGFI), met the usual thresholds (Hu & Bentler, 1999). However, it is well known that these indexes may have important drawbacks that can lead to erroneous conclusions (Saris, Satorra, & van der Veld, 2009). Therefore, in the diagnostic stage, we avoided looking only at indexes of overall model fit and also examined more detailed diagnostic indicators. We checked whether 1) all the estimated values were reasonable and of the expected sign, 2) the correlation residuals suggested the addition of parameters and 3) the modification indexes and expected parameter changes led to plausible estimates. This process is in agreement with a recent proposal (Saris et al., 2009) to focus more attention on the detection of misspecification errors rather than solely on the global fit and to consider the power of the test in addition to the significance levels. Because our initial model led to some misspecifications, generally magnified by the high-power situation (large sample size and high reliability), we released a few justified constraints on uncorrelated uniquenesses. As a result, the model fit was much better, with Satorra-Bentler χ2 = 4043 (df = 2484), 90% CI for RMSEA = (0.048; 0.056), CFI = 1.00 and SRMR = 0.0413.
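For reference, the global indexes discussed above are simple functions of the model and baseline chi-squares. The helpers below use the standard (unscaled) formulas with hypothetical fit values; they will not reproduce the robust Satorra-Bentler figures reported here, which use a scaling correction.

```python
import math

def rmsea(chi2, df, n):
    """Point estimate of RMSEA from the model chi-square
    (standard, unscaled formula)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2, df, chi2_base, df_base):
    """Comparative Fit Index relative to the independence baseline model."""
    d_model = max(chi2 - df, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 - d_model / d_base if d_base > 0 else 1.0

# Hypothetical values, not the study's
fit_rmsea = rmsea(chi2=300.0, df=100.0, n=482)
fit_cfi = cfi(chi2=300.0, df=100.0, chi2_base=5000.0, df_base=120.0)
```

Because both indexes summarize misfit into a single number, they can mask local misspecifications, which is precisely why the diagnostic stage above also inspects residuals and modification indexes.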
Convergent and Discriminant Validity and Reliability
The results showed that all the reflective items had loadings above 0.65 and that the corresponding competencies had an Average Variance Extracted (AVE; i.e., the average communality per competency) above 0.5. As mentioned, for those competencies that had only reflective items, reliability was assessed with Cronbach's alpha or Heise and Bohrnstedt's Ω coefficient.
Discriminant validity was assessed by comparing the square root of the AVE (Table 3) of each reflective construct with the correlations between the constructs (Table 4). The results suggested that the competencies were adequately discriminated, despite the relatively high magnitude of some correlations. In fact, every model with the correlation between two competencies constrained to one was rejected. Therefore, these results suggested the appropriateness of maintaining these competencies as separate facets of the clusters of competencies.
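The comparison just described (square root of each construct's AVE against the inter-construct correlations) is the Fornell-Larcker criterion; it can be sketched as follows, with all loadings, AVE values and correlations hypothetical.

```python
import numpy as np

def ave(loadings):
    """Average variance extracted: mean squared standardized loading."""
    lam = np.asarray(loadings, dtype=float)
    return float(np.mean(lam ** 2))

def discriminant_ok(aves, corr):
    """Fornell-Larcker check: the square root of each construct's AVE must
    exceed that construct's correlation with every other construct."""
    root = np.sqrt(np.asarray(aves, dtype=float))
    R = np.asarray(corr, dtype=float)
    k = len(root)
    return all(abs(R[i, j]) < min(root[i], root[j])
               for i in range(k) for j in range(k) if i != j)

ave_a = ave([0.80, 0.75, 0.70, 0.65])  # above the conventional 0.5 rule
ok = discriminant_ok([ave_a, 0.55], [[1.0, 0.60], [0.60, 1.0]])    # passes
bad = discriminant_ok([ave_a, 0.55], [[1.0, 0.80], [0.80, 1.0]])   # fails
```

The chi-square comparison mentioned above (constraining a pair's correlation to one and testing the deterioration in fit) is a complementary, model-based way of reaching the same conclusion.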
Table 2. Loading range of the reflective items and number of formative items out of the total 4 items per competency; the formative item loadings are excluded from the table

Table 3. AVE*, Cronbach’s alpha and Omega of the 19 competencies

* AVE was computed excluding the formative items.
** The correlation is provided for those competencies that only have 2 reflective items.
*** Ω was computed instead of α for the reflective items of those competencies that were not tau-equivalent.
Table 4. Correlation Matrix of the 19 competencies

Intra-cluster correlations are shaded.
Effect of Gender and Work Experience on the SQPMBC Self-Evaluation
Since one of the main challenges in the use of questionnaires is coping with external validity issues, we addressed this by assessing whether the structure of the 19 competencies was invariant across gender and work experience. If invariance were not fulfilled, it would suggest that the differences in the competencies among groups were likely due to different meanings attached to those factors. This problem is present in both the psychology (Meredith, 1993) and management literature (Batista-Foguet et al., 2008; Vandenberg & Lance, 2000).
The equivalence of the underlying competencies of the SQPMBC across gender and work experience can be established through sequential steps in nested multi-group mean and covariance structure models. To test gender and work experience invariance, we specified a CFA model for each of the 4 proposed clusters of competencies (Personal, Achievement, Mobilization and Collaboration). Table 5 illustrates how we tested the three equivalence requirements (i.e., configural, metric and scalar invariance) for gender (columns a, b and c, respectively). The results show that males and females used the same factor model (column a), the same loadings on each competency (column b) and the same origins of the measurement scales (column c).
Table 5. Factorial Invariance of the SQPMBC: Females vs. Males

The results also showed that the equality restrictions did not lead to a deterioration of the global fit of the model, and the equality constraints held for respondents with different work experience, except in the Mobilization cluster. It should be noted that, if the three requirements were not fulfilled, the comparison of competencies across gender or work experience would not be meaningful because differences in loadings and intercepts could easily mask differences in the meaning of the underlying construct.
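Each step in the invariance sequence compares a more constrained model against the previous one via a chi-square difference test. The sketch below shows that decision rule with hypothetical fit values; with Satorra-Bentler estimates, as used here, a scaled difference test would be required instead of this plain version.

```python
from scipy.stats import chi2

def invariance_step(chi2_restricted, df_restricted, chi2_free, df_free, alpha=0.05):
    """Chi-square difference test between two nested multi-group models.
    A non-significant difference means the added equality constraints
    (equal loadings for metric invariance, then equal intercepts for
    scalar invariance) are tenable."""
    diff = chi2_restricted - chi2_free
    ddf = df_restricted - df_free
    p = chi2.sf(diff, ddf)
    return p, bool(p > alpha)

# Hypothetical fit values for a configural -> metric comparison that holds
p_metric, metric_ok = invariance_step(512.3, 110, 500.0, 100)
# ...and a metric -> scalar step whose added constraints clearly fail
p_scalar, scalar_ok = invariance_step(560.0, 120, 512.3, 110)
```

Only when all steps hold, as they did here for gender, are the mean comparisons in the next section interpretable.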
Once factorial invariance was assured, we were able to evaluate the effect of gender and work experience on each competency. Table 6 shows the results of the comparison of competency means by gender and work experience. Taylor and Hood (2011) showed that professional women tend to be under-estimators (their self-assessment is often lower than the assessment of their behavior by others), whereas professional men tend to be over-estimators. However, our data (Table 6) showed that males and females typically rate their competencies quite similarly. There are some exceptions, such as the competencies of "Service Orientation" and "Empathy", on which women (coded 1) rated themselves higher than men (coded 0); none of the other competencies show significant differences between the means (as shown by the t-tests and Mann-Whitney U tests in Table 6).
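The paired parametric and non-parametric comparisons reported in Table 6 can be reproduced with standard routines; the self-ratings below are simulated on the questionnaire's 0-10 scale (with the study's group sizes), not the study's data.

```python
import numpy as np
from scipy import stats

# Simulated self-ratings on the 0-10 scale: women coded 1, men coded 0,
# with a deliberately large group difference for illustration
rng = np.random.default_rng(1)
women = rng.normal(7.5, 1.2, 238)   # e.g. hypothetical "Empathy" ratings
men = rng.normal(6.8, 1.2, 244)

# Parametric and non-parametric location comparisons, as in Table 6
t_stat, t_p = stats.ttest_ind(women, men, equal_var=False)  # Welch's t-test
u_stat, u_p = stats.mannwhitneyu(women, men, alternative="two-sided")
```

Reporting both tests, as Table 6 does, guards the conclusion against departures from normality in the bounded 0-10 ratings.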
Table 6. Student's t-tests and Mann-Whitney U tests showing the results of the comparison of competency means by gender and work experience

* p < .05.
** p < .001.
Regarding work experience, we had predicted that it would be positively related to self-assessments of the proposed constructs. However, except for "Influence" and "Empathy", which showed non-significant mean differences, and "Assertiveness", on which more experienced workers rated themselves higher, workers with less experience generally rated themselves as more competent.
Discussion
In this study, we sought to contribute to HR management research and practice by providing an instrument to assess personal and motive-based competencies in Spanish-speaking countries. We led the reader through the steps to test the construct validity of the questionnaire. Moreover, we addressed the methodological issues that researchers developing questionnaires in the HR field might face by guiding them through the process of the SQPMBC's development. The identification of personal and motive-based competencies that may promote achievement in workplace settings is an important goal. However, in the last two decades, the few published tests that promised to measure similar constructs, such as emotional-social intelligence, have not been empirically evaluated or have been developed without much attention to the standard methods used to establish validity evidence (McEnrue & Groves, 2006). Furthermore, most of the published tests have been validated in English-speaking countries, compromising their construct validity when applied in other countries (Batista-Foguet et al., 2008; Extremera et al., 2006).
The construct validity of the proposed 76 items for measuring the 19 competencies, which are clustered into four broad groups according to personality and motives, was first supported by establishing the items' face and content validity with a group of experienced coaches, a group of potential users and five experts in the field. The CFA results in the second sample (n = 482) were also consistent with the hypothesized 19-factor model with four clusters. The fit indexes of the measurement model were satisfactory, as were the factor loadings. The results of the discriminant validity analysis also showed that all the competencies are adequately discriminated. Once evidence of validity was established, we addressed reliability issues. However, because of the inclusion of formative items in the questionnaire, the routinely applied reliability thresholds of 0.70 for basic research and 0.90 for applied settings with substantial stakes (Nunnally, 1978) could not be applied. If the formative nature of the identified items (Table 2) had not been considered, they would have been discarded because they would have been erroneously deemed unreliable.
Furthermore, our results provided evidence that the SQPMBC is relatively free of the social desirability threat, likely because the data were collected in a non-evaluative situation: the correlations with Paulhus' (Reference Paulhus, Robinson, Shaver and Wrightsman1991) social desirability scale were negligible or very low (below the usual threshold of .2). However, social desirability might still challenge the validity of competency assessments in applications other than purely developmental ones (e.g., those linked to performance evaluations). Future research should address this issue.
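The screening procedure described above can be sketched as follows. This is an illustrative fragment on simulated scores (the data, the 19-column layout, and the variable names are assumptions, not the study's data): each competency score is correlated with the social-desirability score, and any scale whose absolute correlation reaches the .2 threshold is flagged for inspection.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 19 competency scale scores and one social-desirability score
competencies = rng.normal(size=(300, 19))
desirability = rng.normal(size=300)

# Pearson correlation of each competency with social desirability
r = np.array([np.corrcoef(competencies[:, j], desirability)[0, 1]
              for j in range(19)])

# Scales at or above the conventional .2 threshold warrant closer scrutiny
flagged = np.flatnonzero(np.abs(r) >= 0.2)
```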
In the routine use of HR-related questionnaires, it is common to compare sub-groups by running ANOVAs between females and males, novice and experienced professionals, or U.S. and European test takers. However, as Vandenberg and Lance (Reference Vandenberg and Lance2000) have noted, an approach based only on mean differences is problematic because researchers assume that the scale's measurements are equivalent across groups without testing whether this assumption holds. Establishing factorial invariance is a logical prerequisite for making decisions based on any selection tool, yet it is often overlooked by researchers who use surveys developed in other countries and by practitioners in comparative studies. By testing factorial invariance across gender and work experience, we addressed this problem and thereby strengthened the external validity of our assessment instrument. Our results supported the stability of the SQPMBC structure across gender, but factorial invariance did not hold across work-experience groups for the competencies within the mobilization cluster (i.e., the meaning of those competency items varies with work experience). We therefore explored gender and work-experience differences only in those competencies that fulfilled the necessary requirement of scalar invariance.
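Invariance testing of this kind typically proceeds by comparing nested multigroup CFA models (e.g., a model with loadings constrained equal across groups against the configural model) via a chi-square difference test. The sketch below shows only that final comparison step; the fit statistics plugged in are hypothetical numbers for illustration, not results from this study.

```python
from scipy.stats import chi2

def chisq_diff_test(chi2_constrained: float, df_constrained: int,
                    chi2_free: float, df_free: int):
    """Likelihood-ratio test between nested multigroup CFA models:
    the constrained model (e.g., equal loadings across groups)
    versus the less restricted (configural) model."""
    delta_chi2 = chi2_constrained - chi2_free
    delta_df = df_constrained - df_free
    p_value = chi2.sf(delta_chi2, delta_df)  # upper-tail chi-square probability
    return delta_chi2, delta_df, p_value

# Hypothetical fit statistics for a metric- vs. configural-invariance comparison
delta, ddf, p = chisq_diff_test(chi2_constrained=312.4, df_constrained=270,
                                chi2_free=289.1, df_free=255)
# A non-significant difference (p > .05) would support metric invariance
```

Scalar invariance is tested the same way, adding equality constraints on the item intercepts on top of the loading constraints; only competencies surviving that step admit meaningful mean comparisons.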
With respect to gender, our results show significant gender-related self-rating differences in only two of the nineteen competencies, empathy and service orientation, on which women rated themselves higher than men. Previous studies claim that women are better at emotional attention, empathy, appraisal of emotions, and social skills, whereas men are better at regulating or utilizing their emotions (e.g., Gouveia, Milfont, Gouveia, Neto, & Galvão, Reference Gouveia, Milfont, Gouveia, Neto and Galvão2012). Thus, our results on gender differences are consistent with research showing that females display a greater empathic response than males (Mestre, Samper, Frías, & Tur, Reference Mestre, Samper, Frías and Tur2009). With respect to work experience, to our knowledge there are no studies relating work experience to personal and motive-based competencies. We found that people with different levels of work experience did not share a common conceptualization of the six competencies in the mobilization cluster.
In contrast with gender, the mean differences related to work experience were statistically significant in 12 of the remaining 13 competencies, with less experienced workers rating themselves higher than more experienced workers. A possible explanation for this pattern is that more experienced and mature people would likely agree with the Aristotelian adage, "the more you know, the more you know you do not know", whereas less experienced people tend to view their competencies more positively. This finding calls for increased attention to how the development of novice managers can be guided in the initial stages of their careers. Nevertheless, because our sample was not gathered randomly, no population inferences can be drawn from the statistical tests; the statistical significance reported throughout the paper should therefore be interpreted descriptively rather than inferentially.
Despite the satisfactory results obtained, the questionnaire has some limitations. Further evidence of predictive utility (validity) is required, particularly with regard to achievement criteria. In addition, we must acknowledge that the discriminant validity of the SQPMBC with respect to existing measures of personality (Costa & McCrae, Reference Costa and McCrae1992) has yet to be tested.
Nevertheless, we have developed an instrument that fills the existing gap in competency measurement in Spanish-speaking countries, and the paper adds clarity to the process of developing a competency measurement tool. The potential usefulness and applicability of the SQPMBC is broad. Professionals in the area of human resources may benefit from a reliable and valid competency assessment instrument that can guide feedback and talent development; similarly, competency theory may benefit from this new measure for research in Spanish-speaking countries.
This research is part of the Spanish MICINN project (EDU2010-15250) on “Emotional and Social Competencies Development Program within the European Higher Education”.