Comparing Two Instruments of Transformational Leadership

Abstract
Purpose: To compare two transformational leadership instruments, Bass’s Full Range Leadership Model with its instrument, the Multifactor Leadership Questionnaire, and the Transformational Leadership Scale proposed by Rafferty and Griffin (2004), using empirical evidence from a single sample.
Methodology: The sample includes participants from different levels of the Estonian Defence Forces’ military hierarchy (N = 2570). The structures of the Transformational Leadership Scale and the Multifactor Leadership Questionnaire were examined with exploratory and confirmatory factor analysis, while other methods were used to compare the two instruments.
Findings: The Multifactor Leadership Questionnaire is statistically significantly better at predicting outcome variables such as satisfaction with the leader, effectiveness, and extra effort; however, the Transformational Leadership Scale did predict outcome variables at a sufficient level.
Research & Practical Limitations/Implications: The current research contributes to the validation of the Transformational Leadership Scale proposed by Rafferty and Griffin (2004). The results indicated that the Transformational Leadership Scale is a valuable research tool to study transformational leadership; however, some subscales require further development. Moreover, we may conclude that subsamples (e.g. commanders and conscripts) differ in how well the Transformational Leadership Scale describes outcome variables.
Originality/value: There is very limited research that compares different transformational leadership instruments.


Introduction
In the eighties, management and leadership researchers started to focus on emotional and symbolic aspects (Yukl, 2013, p. 300). One of the most influential approaches arose from what has become known as the full-range leadership theory (FRLT), which consists of transformational, transactional, and laissez-faire components (Bass, 1998). It has been very successful and widely accepted in the management and leadership literature, as it was developed in an integrative manner and garnered a remarkable amount of empirical examination (Antonakis and House, 2002, p. 4; Lowe, Avolio, and Dumdum, 2013, p. 71). Moreover, several studies demonstrated positive relations between components of FRLT and performance in various contexts (Wang et al., 2011), including the military (e.g. Dvir et al., 2002; Bass et al., 2003). The first component of FRLT is laissez-faire, which is defined as the "avoidance or absence of leadership." It is the most inactive and ineffective leadership style (Bass and Riggio, 2006, p. 8).
The second component, that of transactional leadership, focuses on the process of exchange. This type of leadership "occurs when the leader rewards or disciplines the follower, depending on the adequacy of performance" (Bass and Riggio, 2006, p. 8).
On the other hand, transformational leaders "act as agents of change by arousing and transforming followers' attitudes, beliefs and motives" (Antonakis and House, 2002, p. 8). They typically contribute to followers' commitment, trust, and loyalty to the organization (Dumdum, Lowe, and Avolio, 2002, p. 42). Despite its popularity, FRLT and especially its measurement instrument, the Multifactor Leadership Questionnaire (MLQ), received some criticism in the leadership literature. This was mainly due to the difficulty of differentiating between the subdimensions and the lack of consistent empirical support for the factor structure of the model (see Carless, 1998; Rafferty and Griffin, 2004; Tejeda, Scandura, and Pillai, 2001; Yukl, 1999).
At the same time, several researchers developed their own transformational leadership models, along with measurement instruments; e.g. Alimo-Metcalfe and Alban-Metcalfe (2001), Bennis and Nanus (1985), Conger and Kanungo (1994), Kouzes and Posner (1987), Larsson (2006), Nissinen (2001), Podsakoff et al. (1990), Rafferty and Griffin (2004), and Carless, Wearing, and Mann (2000). Several of these showed promising results in empirical studies. However, there seems to be a lack of research published by independent research groups studying different cultures and contexts to confirm the validity of these models/instruments (Antonakis, 2012, p. 269-274). Therefore, there is a clear gap in the literature when it comes to investigating alternative approaches to transformational leadership and comparing them with FRLT/MLQ to better understand how they may be related to each other and whether they really measure the same concept of leadership.
Vol. 28, No. 1/2020 Antek Kasemaa, Reelika Suviste
The current study selected the instruments proposed by Rafferty and Griffin (2004) and by Bass and Avolio (1997) as the basis of its research. The literature sometimes notes that different models of leadership (e.g. charismatic and transformational) should nonetheless conceptualize and measure a single phenomenon (Antonakis, 2012, p. 273-274). This is especially so if we take into account the content of those models. However, in several cases this was not tested empirically. The Transformational Leadership Scale (TLS; Rafferty and Griffin, 2004) is assessed in the literature as promising (Antonakis, 2012, p. 273-274), while the MLQ is one of the most popular instruments of transformational leadership both in consultancy businesses and as a research tool (Gill, 2011, p. 82).
Therefore, the current study aims to compare the two models/instruments in the setting of the Estonian military in order to 1) identify the extent to which they measure the same construct (transformational leadership); 2) ascertain how well they predict outcome variables; 3) provide additional evidence of construct, criterion, and concurrent validity for the TLS. All this will contribute to the long-term goal of developing a valid and reliable, reasonably short, and easily administrable TL research and feedback instrument that is openly available in the Estonian language.
The sample of the current study is from the military, so the results further contribute to transformational leadership research in the military hierarchy. The military framework is justified by the argument that transformational leadership is applicable to both military and non-military environments (Wong, Bliese, and McGurk, 2003). This understanding allows using military samples for theory testing, because the concept as such is assumed to be context-free. The methods applied to analyze the data include exploratory and confirmatory factor analysis and regression analysis.
The article is structured as follows. First, it discusses the differences and similarities of the two transformational leadership instruments, followed by the development of propositions. Second, we explain the research method, including an explicit description of the data analysis strategy. Finally, we present the results, discuss them, and conclude.

Comparison of the Multifactor Leadership Questionnaire and the Transformational Leadership Scale
To establish a solid basis for statistical analysis, one needs a theory-based comparison of the two transformational leadership instruments. The MLQ consists of nine factors that form three components: transformational, transactional, and laissez-faire. The factors of the transformational component are the following (Kark and Shamir, 2002, p. 79):
1) idealized influence, attributed and behaviors (IIA and IIB), involves behaviors such as putting group benefits above the leader's own, using personal example, and showing high moral and ethical standards. Initially, this factor was one-dimensional; however, in order to respond to criticism, later versions of the MLQ divided it between attributional and behavioral components (Antonakis, 2012, p. 266);
2) inspirational motivation (IM) includes the creation and rendering of an appealing vision and demonstration of optimism, enthusiasm, symbols, and emotional arguments;
3) intellectual stimulation (IS) involves actions that enhance the awareness of possible problems and influence the solving of those problems from different angles;
4) individualized consideration (IC) embraces support, encouragement, and coaching.
On the other hand, the transactional component consists of contingent reward (CR) and active and passive management-by-exception factors (Bass, 1997; Bass and Riggio, 2006, p. 21). Despite some controversial results about the factor structure of the MLQ (e.g. summarized by Northouse, 2010, p. 188-190), Antonakis and House (2002) note that the original nine-factor structure works best, especially for homogeneous samples, although results may differ if the analysis includes several subsamples. Therefore, FRLT/MLQ is somewhat context-sensitive, yet remains universal across conditions. However, taking into account the published criticism of the FRL Model and its instrument, Rafferty and Griffin (2004) proposed a theory-driven approach to transformational leadership, which could demonstrate better discriminant validity with subdimensions and with outcomes. The model is somewhat different from the FRLM and consists of five transformational leadership factors: 1) vision (VIS); 2) inspirational communication (IC); 3) supportive leadership (SL); 4) intellectual stimulation (IS); and 5) personal recognition (PR).
To measure this, they administered a 15-item instrument, with three items for each of the above factors, to a sample from the Australian public sector. The leadership literature indicates that this model may be a reasonable alternative to the MLQ; however, this approach was never widely researched by separate research groups, including groups from different cultures (Antonakis, 2012, p. 269, 274). Moreover, the sample used by Rafferty [...]
1) Vision expresses an idealized picture of the future based on organizational values;
2) Inspirational communication conveys positive and encouraging messages about the organization, along with statements that build motivation and confidence;
3) Supportive leadership demonstrates concern for followers and their individual needs;
4) Intellectual stimulation foregrounds the enhancement of employees' interest in and awareness of problems but also shows how their ability to think about problems in new ways increases; and
5) Personal recognition shows the provision of rewards such as praise and acknowledgment of effort for the achievement of specified goals.

Similarities and Differences
A comparison of MLQ and TLS factors is presented in Table 1. One important difference between these two models/instruments is that the MLQ considers contingent reward (CR) part of transactional leadership. However, several studies found remarkably high correlations between CR and transformational leadership factors (e.g. Antonakis and House, 2014; Edwards and Gill, 2012; Kanste, Miettunen, and Kyngäs, 2007; Judge and Piccolo, 2004), and also between CR and outcome variables (Bass and Riggio, 2006; Dumdum et al., 2002, p. 52). Moreover, Goodwin, Wofford, and Whittington (2001) find empirical support for a two-factor structure of CR, concluding that some aspects of CR may belong to transformational leadership and others to transactional leadership. The TLS, in turn, includes a factor called personal recognition, which is meant to cover the aspects of contingent reward in transformational leadership (Rafferty and Griffin, 2004). The second difference is that the TLS stresses vision in a narrower way compared to charisma (Conger and Kanungo, 1998) or idealized influence (Bass and Riggio, 2006). This means that the articulation of the idealized picture is the key component here (Rafferty and Griffin, 2004), rather than charismatic influence. Third, inspirational communication (TLS) is presented in a way that covers some aspects of individualized influence (Bass, 1998).
The similarities between the two instruments are the following: 1) Both instruments use the same definition for the intellectual stimulation component of TL (Rafferty and Griffin, 2004; Bass, 1998); 2) Supportive leadership (TLS) is closely connected to individualized consideration (MLQ). Both emphasize follower needs; however, the MLQ seems to have a broader definition, as it includes elements of the leader's empowering behavior (Bass, 1998).
CEMJ 7
Table 1 (comparison of MLQ and TLS factors)
MLQ, contingent reward: The contraction of exchange of rewards for effort, promising rewards for good performance, recognition of accomplishments
TLS, personal recognition: The provision of rewards such as praise and acknowledgement of effort for achievement of specified goals
Note. a Only those factors are presented that have conceptual similarities with the TLS (active and passive management-by-exception and laissez-faire were excluded). b Earlier works from Bass and his associates labelled it as "Charisma," later it became "Idealized influence," which has two aspects: leader's behaviour and the elements attributed to leaders by followers (Bass and Riggio, 2006, p. 6).

Propositions
We first focus on the convergent validity of the two instruments. Thus, we assume that both measures are highly correlated, which means that e.g. all TLS subscales demonstrate the highest correlations with their respective MLQ subscales (see Table 1) [...] effectiveness, their extra effort, and their satisfaction with leaders were used for statistical analysis; all three are part of the standard MLQ package. Nevertheless, one important assumption of transformational leadership is that it will lead to the augmentation effect. Originally, this assumption was offered by Bass (1985), who argues that transformational leadership makes a unique contribution over transactional leadership to organizational outcomes. Antonakis and House (2002) conclude that the correlations between leadership factors and outcome variables (effectiveness, extra effort, and satisfaction) might differ among organization types; e.g. in the military environment these correlations tend to be remarkably higher than in civilian organizations.
However, several meta-analyses show the strong predictive validity of transformational leadership for the subjective outcome variables (e.g. Dumdum et al., 2002; Judge and Piccolo, 2004; Lowe et al., 2013). Third, we concentrated on the assumption that the TLS adds something unique to the MLQ's predictive power for the subjective outcomes. As mentioned in the literature, the TLS may be a promising model of transformational leadership (Antonakis, 2012, p. 273-274). Therefore, we were interested in the incremental validity (Sackett and Lievens, 2008) of the TLS over the MLQ. Fourth, we examined the impact of the military hierarchy on the predictive power of the TLS for the subjective outcomes; e.g. Rowold and Heinitz (2007) find that higher-level managers use more transformational and charismatic leadership than lower-level managers. Based on the above discussion and comparison of the models, the current study has the following hypotheses:
1) H1: Vision from Rafferty and Griffin (2004) is more positively correlated to idealized influence (attributed and behaviors) from Bass and Riggio (2006) than other dimensions of the MLQ;
2) H2: Inspirational communication from Rafferty and Griffin (2004) is more positively correlated to inspirational motivation from Bass and Riggio (2006) than other dimensions of the MLQ;
3) H3: Supportive leadership from Rafferty and Griffin (2004) is more positively correlated to individualized considerations from Bass and Riggio (2006) than other dimensions of the MLQ;
4) H4: Intellectual stimulation from Rafferty and Griffin (2004) is more positively correlated to intellectual stimulation from Bass and Riggio (2006) than other dimensions of the MLQ;
5) H5: Personal recognition from Rafferty and Griffin (2004) is more positively correlated to contingent reward from Bass and Riggio (2006) than other dimensions of the MLQ;
6) H6: The TLS (Rafferty and Griffin, 2004) predicts transformational leadership outcome measures (Bass, 1998) at least to the same extent as the MLQ;
7) H7: The TLS (Rafferty and Griffin, 2004) accounts for variance in followers' leadership outcomes (the sum of followers' extra effort, leadership effectiveness, and job satisfaction) above that accounted for by the MLQ.

Method

Sample
The main sample of the research consisted of 2570 military service members from the Estonian Defence Forces (EDF). A detailed description of the sample and data collection is available elsewhere (Kasemaa, 2015; Meerits, Suviste, and Kasemaa, 2015). The questionnaire was sent to 3351 Estonian military personnel from the rank of private to captain in winter 2012 by MA students of the Estonian Military Academy (EMA). The response rate was 77% (2584); however, some responses were excluded from the analysis due to their incompleteness. The final sample included 350 full contract service members and 2220 conscripts (n = 2570). The majority of respondents were male (more than 99%) with a mean age of 22.4 years (SD = 3.1); 90% were Estonian nationals, and 87% held secondary education. By the time of the survey, the conscripts had completed six or nine months of their mandatory service out of the required eight or eleven months. Professional service members had served in the EDF for an average of 6.5 years. The final sample represented 12 battalions or equivalents, divided between 29 companies and 94 platoons.

Instruments and Procedure
Leadership
This study employed the transformational leadership questionnaire (Rafferty and Griffin, 2004), which includes five subscales, each containing three items. A four-point Likert-type scale was used, in which point 1 represented "strongly disagree" and point 4 "strongly agree." The respondents were asked to evaluate the leader who had been posted as a commander of their units or subunits. The questionnaire subscales were the following: articulating a vision, intellectual stimulation, inspirational communication, supportive leadership, and personal recognition. This questionnaire was used before, e.g. by Rafferty and Griffin (2006; 2006a) and Strauss, Griffin, and Rafferty (2009) [...]
The second leadership measure used in this study was the MLQ 5X (Avolio and Bass, 2004), which has 36 items; permission to use it was granted by Mind Garden, Inc. We used 24 of the items: all the transformational leadership items (20) and the contingent reward items (4) from transactional leadership. A four-point Likert-type scale was used, in which point 1 represented "strongly disagree" and point 4 "strongly agree." The respondents were asked to evaluate the leader who had been posted as a commander of their units or subunits. The subscales were the following: idealized influence, inspirational motivation, intellectual stimulation, individualized considerations, and contingent reward. The MLQ is the most widely used instrument of transformational leadership (Antonakis, 2012, p. 264; Gill, 2011, p. 82).
The questionnaire was administered in the following manner: platoon members (mostly conscripts, except for one professional battalion) evaluated platoon commanders, platoon commanders evaluated company commanders, and company commanders with battalion staff members evaluated battalion commanders. Paper and pencil administration was used, and participation in the survey was voluntary. All questionnaires were delivered to the potential participants with the instruction to put filled-out papers into collection boxes.

Analysis Strategy
The convergent validity was first assessed through zero-order correlations, using t-tests to evaluate the statistical significance of differences between compared correlations (Diedenhofen and Musch, 2015). For that purpose, we used aggregated TLS dimensions: for instance, our first hypothesis predicted that vision is highly correlated with idealized influence (attributed and behaviors) from the MLQ. Thus, we compared the correlation between VIS and II (attributed and behaviors) with the correlation between II and the average of the other four TLS dimension scores. In order to find supporting arguments for the convergent validity of the TLS, a series of exploratory factor analyses (EFA) were conducted using principal axis factoring, with Varimax rotation if necessary, and the eigenvalue > 1 criterion for factor extraction. We assumed that items hypothesized to form one single dimension, whether from the MLQ or the TLS, should load onto one single factor, because they are supposed to measure similar components of transformational leadership. Moreover, we assessed the average variance extracted (AVE) and composite reliability (CoRel) to obtain further arguments for convergent validity. We used .50 (AVE > .50) as a cutoff value for the AVE (Hair, Black, and Babin, 2014, p. 619) and .70 for CoRel (CoRel > .70). We assumed that if the items from the respective subscales of the TLS and the MLQ loaded onto one component, the values of the AVE and CoRel would exceed the threshold.
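The AVE and CoRel checks described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' computation: it assumes standardized loadings and uncorrelated errors, so that each item's error variance is one minus its squared loading. The function names and the example loadings are hypothetical.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability from standardized loadings.

    Assumes uncorrelated errors, so each error variance is 1 - loading**2.
    """
    sum_loadings = sum(loadings)
    error_variance = sum(1 - l ** 2 for l in loadings)
    return sum_loadings ** 2 / (sum_loadings ** 2 + error_variance)


def meets_convergent_cutoffs(loadings, ave_cutoff=0.50, corel_cutoff=0.70):
    """Apply the AVE > .50 and CoRel > .70 cutoffs named in the text."""
    ave = average_variance_extracted(loadings)
    corel = composite_reliability(loadings)
    return ave > ave_cutoff and corel > corel_cutoff


# Hypothetical standardized loadings for a combined TLS + MLQ factor.
example_loadings = [0.72, 0.68, 0.75, 0.61, 0.70, 0.66]
print(round(average_variance_extracted(example_loadings), 3))
print(round(composite_reliability(example_loadings), 3))
print(meets_convergent_cutoffs(example_loadings))
```

Because composite reliability grows with the number of items while AVE does not, a combined factor can pass the .70 cutoff and still fall short of .50 on AVE, which is why both values are checked.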
CFA and structural equation modelling (SEM) models were analyzed with LISREL 8.80 (Jöreskog and Sörbom, 2006). For the models using raw item data (on the Likert scale from 1 to 4), we applied diagonally weighted least squares (DWLS) as the estimation method. For the models using continuous variables (aggregated items) as observed indicators, we applied robust maximum likelihood (RML) estimation, because the multivariate normality assumption was not met. The ML method is robust to minor deviations from normality; however, data with excessive kurtosis should be analyzed by methods other than maximum likelihood (Brown, 2006, p. 379). A normality test was conducted in SPSS and LISREL, and both indicated that the assumption of a multivariate normal data distribution may be violated. Univariate normality: a) SPSS: kurtosis z values between 5.17 and 2.56 and skewness z values between 3.00 and 12.70. The Shapiro-Wilk test of normality was statistically significant for all variables (p < .001); b) LISREL: kurtosis z values between 4.45 and 2.55 and skewness z values between 2.50 and 11.11. Therefore, the robust maximum likelihood estimator (Browne, 1987; Satorra and Bentler, 2001; Jöreskog et al., 2001) was used because it is robust to non-normality (Brown, 2006, p. 379).
In all models, the observed variables were specified to their corresponding latent variable; no modification indices between observed variables were used. The following fit indices, recommended by the literature (Hooper, Coughlan, and Muller, 2008; Kline, 2011, p. 204), were assessed to evaluate the models:
1) Comparative fit index (CFI; Bentler, 1990): values above .90 are considered acceptable;
2) Non-normed fit index (NNFI; Bentler and Bonett, 1980), also known as the Tucker-Lewis index (TLI): values starting from .90 are considered indicators of good fit (Van de Schoot, Lugtig, and Hox, 2012), although Hu and Bentler (1999) recommended a cutoff value of NNFI ≥ .95 as more appropriate;
3) Root mean square error of approximation (RMSEA; Browne and Cudeck, 1992): indicates the best fit at a value of zero (Kline, 2011, p. 205). The threshold for the RMSEA is < .05 (Kelley and Lai, 2011), although Hoyle (2011, p. 47-48) summarizes the interpretation of RMSEA as follows: a) from zero to .05, certainly acceptable fit; b) .05-.08, close fit; c) .08-.10, marginal fit; and d) over .10, unacceptable fit;
4) Chi-square (χ²): needs to be nonsignificant (p > .05), although this is very difficult to achieve, especially if the sample size is rather large (Hu and Bentler, 1999); therefore, we accepted values of χ² that are statistically significant. Moreover, we followed the recommendation of Hair et al. (2014, p. 579) not to take into account the χ²/df ratio due to the large sample size (N > 750);
5) Standardized root mean square residual (SRMR; Kline, 2011, p. 208): ideally, SRMR is close to zero, which indicates perfect fit of the model; however, Furr and Bacharach (2014, p. 343) propose a suitable cutoff value of ≤ .06.
Moreover, a Δχ² difference test was conducted to compare different models (James, Mulaik, and Brett, 1982; Hoyle, 2011, p. 49; Kline, 2011, p. 215), along with the model CFI difference (Widaman, 1985). A nonsignificant chi-square comparison (p ≥ .05) would indicate the better model, and a CFI difference greater than .01 demonstrates a practical difference in model fit. Under the assumption that model fit is generally acceptable, a more complex model should be preferred; if not, the simpler model would be favored (Kline, 2011, p. 216).
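The cutoffs listed above can be collected into a small helper, sketched here as an illustration only. The RMSEA point estimate uses the common (N - 1) form, which is an assumption, since software packages differ on N versus N - 1. The handling of values that fall exactly on a range boundary is also an assumption, because the ranges quoted from Hoyle (2011) share their endpoints. The function names and the example numbers are hypothetical.

```python
import math


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA from a model chi-square.

    Uses the (N - 1) form; some programs use N instead (assumption).
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def rmsea_label(value):
    """Verbal label following the ranges attributed to Hoyle (2011) in the text.

    Boundary values are assigned to the better category (assumption).
    """
    if value <= 0.05:
        return "acceptable"
    if value <= 0.08:
        return "close"
    if value <= 0.10:
        return "marginal"
    return "unacceptable"


def passes_cutoffs(cfi, nnfi, srmr):
    """Apply the CFI > .90, NNFI >= .90 and SRMR <= .06 cutoffs from the text."""
    return cfi > 0.90 and nnfi >= 0.90 and srmr <= 0.06


# Hypothetical model: chi-square = 250 on 50 degrees of freedom, N = 2570.
estimate = rmsea(250.0, 50, 2570)
print(round(estimate, 3), rmsea_label(estimate))
print(passes_cutoffs(cfi=0.97, nnfi=0.96, srmr=0.04))
```

With a sample this large, the χ²/df ratio of the hypothetical model is 5:1 while its RMSEA stays below .05, which illustrates why the ratio was set aside in favor of the other indices.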

Finally, usefulness analysis was employed to assess incremental validity (Darlington, 1990; Judge et al., 2003) through the comparison of two basic regression models: 1) all MLQ dimensions predicting leadership (LS) outcomes; 2) all TLS dimensions predicting LS outcomes. For the first model, we added all the TLS dimensions one by one, observing the incremental contribution of each TLS dimension over the MLQ in predicting the outcome variable. For the second model, we used the converse procedure, adding MLQ dimensions to the TLS model predicting the outcome. For model comparison, we used ∆R² (Tabachnick and Fidell, 2007, p. 152) and ∆AIC (Akaike information criterion; Burnham and Anderson, 2002; 2004). We interpreted ∆R² as indicating how much the additional variables entered into the model increased the proportion of variance in the dependent variable (summary of outcome) explained by the independent variables (leadership components). The individual AIC values are not interpretable; however, we used the principle that the model with the smaller AIC value is preferred (Kline, 2011, p. 220, 222; Weakliem, 2004).
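The two comparison criteria can be computed from summary statistics alone, as the following sketch shows. It assumes ordinary least squares models fitted to the same cases and the same outcome, and the least-squares form AIC = n ln(RSS/n) + 2k, in which the total sum of squares cancels when two models are subtracted. The function names and the R² values are hypothetical and are not taken from the study's tables.

```python
import math


def delta_r2(r2_base, r2_full):
    """Increase in explained variance when predictors are added."""
    return r2_full - r2_base


def f_change(r2_base, r2_full, n, k_full, m_added):
    """F statistic for the R-squared change after adding m_added predictors.

    k_full is the number of predictors in the larger model.
    """
    return ((r2_full - r2_base) / m_added) / ((1 - r2_full) / (n - k_full - 1))


def delta_aic(r2_base, r2_full, n, k_base, k_full):
    """AIC(full) - AIC(base); a negative value favors the larger model.

    Based on AIC = n * ln(RSS / n) + 2k with RSS = (1 - R^2) * TSS, so the
    difference depends only on R-squared, n and the predictor counts.
    """
    return n * math.log((1 - r2_full) / (1 - r2_base)) + 2 * (k_full - k_base)


# Hypothetical example: a six-predictor base model plus one added dimension.
n, k_base, k_full = 2452, 6, 7
r2_base, r2_full = 0.60, 0.61
print(round(delta_r2(r2_base, r2_full), 3))
print(round(f_change(r2_base, r2_full, n, k_full, 1), 2))
print(round(delta_aic(r2_base, r2_full, n, k_base, k_full), 2))
```

The two criteria can disagree: with a large n, a very small ∆R² may be statistically significant, whereas ∆AIC charges two points for each added parameter and so can still favor the smaller model.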

Results
First, we studied the correlation matrix between the TLS, the MLQ, and the outcome variables. Table 2 presents the results, which show that TLS factors had remarkable correlations with all the MLQ and outcome factors represented in the current research. The differences were rather small. The MLQ sum and TLS sum were related to each other with a correlation of r = .76 (p < .001). All TLS subscales were significantly correlated with the summarized outcome variable, lowest for VIS and IS (both r = .55; p < .001) and highest for SL (r = .68; p < .001). Means and standard deviations of all variables are discussed elsewhere (Kasemaa, 2015; Meerits, Suviste, and Kasemaa, 2015). VIS (TLS) correlated .49 and .42 with IIA and IIB (both MLQ), respectively, while VIS (TLS) and CR (MLQ) correlated .51 (p < .001), which gives no support to the first hypothesis.
For convergent validity, we compared dependent correlations (comparingcorrelations.org), following the calculations offered by Diedenhofen and Musch (2015). As Table 3 shows, at the p < .01 level, H1, H4, and H5 are not supported; however, at the p < .05 level, only H1 is not supported. Therefore, we can conclude that there was no stronger relationship between VIS (TLS) and II (attributed and behaviors), contrary to what H1 predicted.
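A comparison of two dependent correlations that share one variable can be sketched with Williams's t, one of several tests available for this case. The text does not state which specific test was reported, so this choice is an assumption. The two compared correlations (.49 and .51) come from the results above; the correlation between the two non-shared variables (.70) and the use of N = 2570 are hypothetical placeholders.

```python
import math


def williams_t(r_jk, r_jh, r_kh, n):
    """Williams's t for two dependent correlations sharing variable j.

    r_jk, r_jh: the two correlations being compared (both involve j).
    r_kh: correlation between the two non-shared variables k and h.
    Returns (t, degrees_of_freedom), with df = n - 3.
    """
    # Determinant of the 3 x 3 correlation matrix of j, k and h.
    det = 1 - r_jk ** 2 - r_jh ** 2 - r_kh ** 2 + 2 * r_jk * r_jh * r_kh
    r_bar = (r_jk + r_jh) / 2
    numerator = (n - 1) * (1 + r_kh)
    denominator = 2 * ((n - 1) / (n - 3)) * det + r_bar ** 2 * (1 - r_kh) ** 3
    t = (r_jk - r_jh) * math.sqrt(numerator / denominator)
    return t, n - 3


# VIS-IIA = .49 and VIS-CR = .51 as reported; IIA-CR = .70 is hypothetical.
t_value, df = williams_t(0.49, 0.51, 0.70, 2570)
print(round(t_value, 2), df)
```

A negative t here means the first correlation is the smaller of the two; the p value would be read from a t distribution with n - 3 degrees of freedom.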
Moreover, for the convergent validity of the TLS, a series of EFAs were conducted. The assumption was that the items hypothesized to measure the same TL components should load onto one single factor. The results of the EFA were the following:
1) Vision (TLS) and idealized influence (MLQ) formed one single factor, although communalities and loadings on the factor were rather low: .08-.37 and .29-.61, respectively. The model described 26% of all variance;
2) IC (TLS) and IM (MLQ) clearly formed a single factor, which described 40% of variance, with loadings between .54 and .72;
3) SL (TLS) and IC (MLQ) formed two factors, with 44% of variance, and one item from the MLQ loaded onto the TLS factor;
4) IS (TLS) and IS (MLQ) formed two separate factors, with 23% and 18% of variance, respectively;
5) PR (TLS) and CR (MLQ) formed one single factor, with 43% of variance; loadings ranged from .46 to .80;
6) EFA with all TLS and MLQ transformational components plus contingent reward from the MLQ clearly showed one single dimension describing 57% of all variance, with loadings from .63 (VIS from TLS) to .83 (IS from MLQ).
In conclusion, the EFA results were not fully consistent with the hypothesized pattern of relationships between the TLS and the MLQ, because some of them demonstrated singularity, as predicted, while others supported two-factor solutions. To confirm the results from the EFA, we conducted a series of CFA analyses (see Table 4 for results) and compared two models for each pair of hypothesized subscales: 1) a one-factor model with all MLQ and TLS items on one subscale; 2) a two-factor model with items assigned to either the MLQ or the TLS factor. The results indicated that the two-factor solution may be better for all analyzed pairs; however, the differences were rather small for IC (TLS) and IM (MLQ), and for SL (TLS) and IC (MLQ). The former was also supported by the figures of AVE and CoRel, which showed only a minor change between the one-factor and two-factor solutions (∆AVE .02 and ∆CoRel .01). These results were not consistent with the EFA results, indicating that TLS subscales may measure different constructs than MLQ subscales. Therefore, all tests conducted for the singularity of the respective TLS and MLQ factors demonstrated support for H2; the results were mixed for hypotheses 1, 3, 4, and 5.
To analyze how much transformational leadership factors predict the leadership outcome (summarized variable), we performed standard multiple regression, in which all IVs were entered simultaneously into the analysis; the results are shown in Table 5. The first regression model included MLQ dimensions, while the second model included TLS dimensions. Table 5 displays the unstandardized regression coefficients (B), the standardized regression coefficients (β), the partial correlations (r_k), the semipartial correlations (sr_i²), R², and adjusted R². The correlations between MLQ subscales are presented in Table 6 and those between TLS subscales in Table 7. [...] (TLS) of the variability in outcome is predicted by MLQ or TLS factors; we need to remember that all data were collected using the one-source and one-method approach. Furthermore, the results revealed that in Model 1 (MLQ) the IIB factor (Idealized Influence-Behavior) did not make a significant contribution to the outcome variable; therefore, we may assume that its correlation with the outcome (r = .57, p < .01, Table 2) is covered by other variables. Thus, both models predicted a remarkable share of the outcome variable, measured as the sum of satisfaction with the leader, readiness to provide extra effort, and feeling of effectiveness.
Note. N = 2452; a II-B was omitted due to the non-significant contribution to the model: β = -.004; t = -.254 (p = .800); b p < .001.
The next step was to conduct usefulness analyses in order to assess the incremental validity between the two instruments: the MLQ and the TLS. We ran the basic regression model including all MLQ factors (Model 1) and added all the TLS factors one by one to assess the additional value of those over the basic model (MLQ) on the outcome variable. For that reason, we assessed the change of R² (∆R²) and also the ∆AIC criterion. The second set of models was organized conversely, meaning that all TLS factors were used as the basic model (Model 2) and MLQ factors were added one by one. The results are presented in Table 8.
All TLS factors added additional value to the MLQ model; however, some of the increments were very small. All ∆R² were statistically significant at the p < .001 level; nevertheless, the AIC and ∆AIC values for VIS and PR (from the TLS) did not show a smaller value than the AIC of the basic model (MLQ). Therefore, this criterion allows us to conclude that the variance of those two TLS factors is covered by the MLQ. At the same time, looking at Model 2, we can conclude that all MLQ factors added a significant contribution to the TLS (∆R² between .065 and .112). Therefore, the MLQ predicts the outcome variables (measured by extra effort, satisfaction, and the feeling of effectiveness) better than the TLS.
The next step was to conduct structural equation modeling/confirmatory factor analysis procedures for the MLQ and TLS subscales and outcome variables that compared various subsamples. The aim of this comparison was to demonstrate model validity. The models tested (see Figure 1 for a graphical overview) were specified as follows:
1) Model 1: All TLS and MLQ components loaded onto one single latent variable (LS – leadership), which had a path to the summarized outcome variable;
2) Model 2: TLS components loaded onto the TLS latent variable, which had two direct paths: one to the MLQ latent variable (measured by all 6 MLQ components) and one to the summarized outcome variable;
3) Model 3: All TLS components loaded onto the TLS latent variable, which had a direct path to the summarized outcome variable;
4) Model 4: All MLQ components loaded onto the MLQ latent variable, which had a direct path to the summarized outcome variable.
All models were tested with four different subsamples (indicated in Table 9).
Note. a -for simplicity reasons, only a selection of MLQ and TLS components is presented.
The results indicated that almost all models were acceptable for all subsamples. The exception was Model 1, which showed RMSEA values between .092 and .095, although all other fit indices met their thresholds (CFI > .90; NNFI > .95; SRMR < .06). Moreover, Model 3 did not fit well for the full sample or for the subsamples of conscripts and professionals (RMSEA: .095-.101), although it fit rather well for commanders. Looking at the values of χ² (Table 9) and taking into consideration the ratio between χ² and degrees of freedom, we might conclude that the models for all participants and for conscripts were not fully acceptable (ratio > 3:1). However, we disregarded these figures following the recommendation of Hair et al. (2014, p. 579). The results for Model 3 may indicate that the MLQ is a better instrument to analyze the samples of conscripts and professionals, whereas the TLS is acceptable in the analysis of the commanders' subsample. Considering path coefficients, we may conclude that the TLS's direct effect on the outcome variable is the greatest for the conscript subsample (.23) and the lowest for commanders (.016; ns.). On the other hand, the indirect effect was the greatest for the commander subsample (.86) and the lowest for conscripts (.65). This may mean that the TLS makes a unique contribution to describing the outcomes for conscripts, whereas this effect is rather weak or nonsignificant for professionals and commanders.
In the last step, we used CFA/SEM to assess the fit of different models, adding TLS components – each measured by three items – one by one into the MLQ-Outcome model. The same fit indices were used as explained above, and the results are presented in Table 10. The purpose of this analysis was to assess how much value different TLS components may add to the MLQ when predicting the summarized outcome variable.
Note. For all models N = 2162; RMSEA – root mean square error of approximation; CFI – comparative fit index; NNFI – non-normed fit index; SRMR – standardized root mean square residual; a – all paths are statistically significant at the level p < .001.
Model 1. VIS from TLS measured by three items, paths to OC (measured by extra effort, effectiveness, and satisfaction) first directly and second indirectly through MLQ (measured by six LS factors) and OC.
Model 2. IC from TLS measured by three items, paths to OC (measured by extra effort, effectiveness, and satisfaction) first directly and second indirectly through MLQ (measured by six LS factors) and OC.
Model 3. SL from TLS measured by three items, paths to OC (measured by extra effort, effectiveness, and satisfaction) first directly and second indirectly through MLQ (measured by six LS factors) and OC.
Model 4. IS from TLS measured by three items, paths to OC (measured by extra effort, effectiveness, and satisfaction) first directly and second indirectly through MLQ (measured by six LS factors) and OC.
Model 5. PR from TLS measured by three items, paths to OC (measured by extra effort, effectiveness and satisfaction) first directly and second indirectly through MLQ (measured by six LS factors) and OC.
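Two of the fit criteria used for Tables 9 and 10, the χ²/df ratio and RMSEA, follow directly from a model's χ², its degrees of freedom, and the sample size. The sketch below uses the common point-estimate formula with N − 1 in the denominator (some programs use N instead); the input values are hypothetical and are not taken from the tables.

```python
import math

def chi2_df_ratio(chi2, df):
    """Ratio of the model chi-square to its degrees of freedom."""
    return chi2 / df

def rmsea(chi2, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Hypothetical model: chi-square of 120 on 40 degrees of freedom, 2000 cases.
ratio = chi2_df_ratio(120.0, 40)
fit_rmsea = rmsea(120.0, 40, 2000)

# When chi-square does not exceed df, the estimate is truncated at zero.
perfect_rmsea = rmsea(30.0, 40, 2000)
```

Because the sample size enters RMSEA only through the denominator, a model can show a χ²/df ratio at the 3:1 boundary and still have a small RMSEA when the sample is large.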
Looking at Table 10, we may conclude that all models fit the data very well. The range of χ² is 54.00-.07, all statistically nonsignificant; the RMSEA is .000-.001; CFI and NNFI equal 1.00 for all models; SRMR is .031-.053. More interesting information came from the path coefficients:
1) the model with VIS had an indirect effect on OC (through the MLQ) = .71 (directly .10; ns.);
2) the model with SL had an indirect effect on OC = .62 and a direct effect = .20, altogether = .82;
3) the model with IC had an indirect effect on OC = .62 and a direct effect = .20, altogether = .82;
4) PR had an indirect effect on OC = .67 (directly .03; ns.); and
5) IS had an indirect effect on OC = .64 (directly .06; ns.).
In conclusion, the SL and IC components of the TLS had direct effects on the outcome variable over and above the indirect effects through the MLQ.
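The "altogether" figures reported for SL and IC follow the usual decomposition of a total effect into a direct and an indirect part. The notation below is an assumption borrowed from standard mediation analysis and does not appear in the article: a is the path from the TLS component to the MLQ latent variable, b the path from the MLQ to OC, and c' the direct path from the TLS component to OC. Only the product ab (the indirect effect) and c' are reported in the text.

```latex
\[
\text{total effect} = c' + a\,b ,
\qquad
\text{SL and IC: } \; 0.20 + 0.62 = 0.82 .
\]
```

For VIS, PR, and IS the direct path c' was nonsignificant, so their effect on OC runs essentially through the MLQ.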

Discussion
The aim of the current study was to compare two TL models/instruments: the FRLT/MLQ (Bass, 1998) with the TLS (Rafferty and Griffin, 2004). This aim was related to statements from the leadership literature that the TLS is not a fully validated transformational leadership model and has not been extensively studied by research groups other than the model's authors (Antonakis, 2012, pp. 269-274). The study assumed that this statement stems from a lack of evidence in the literature and supplemented the literature with data from a different culture and organization. The sample used for data collection and analysis came from the Estonian Defence Forces.
Firstly, we analyzed correlations between the MLQ and the TLS; all hypotheses were constructed based on the content of the respective subscales. Moreover, EFA and CFA were conducted to gather additional evidence for the research hypotheses. Thus, we assumed higher correlations between VIS (TLS) and II (MLQ), IC (TLS) and IM (MLQ), SL (TLS) and IC (MLQ), IS (TLS) and IS (MLQ), PR (TLS) and CR (MLQ).
Taking into consideration the correlations and the EFA and CFA results, we concluded that H1 was not supported: VIS from TLS did not demonstrate the highest correlation with the corresponding MLQ subscales. The items did load onto a single factor, but the loadings and communalities were not very high, and the model explained a rather low percentage of the variance in the data. CFA confirmed that the best model was the three-factor model (VIS+IIA+IIB); nevertheless, the differences were rather small, and all analyzed models received acceptable fit indices. At the same time, the reliability of the VIS (TLS) subscale was rather low (α = .50); moreover, Kasemaa (2015) reported a remarkable difference in reliability between subsamples. This may suggest that VIS has an ambiguous meaning for the lower level of the military hierarchy; e.g. conscripts are not as interested in an idealized picture of the future, because organizational membership is limited for them by the compulsory time of military service. At the same time, permanent organizational members – professional officers, noncommissioned officers, and soldiers – perhaps perceive more value in the future of the organization, compared to the conscripts. On the other hand, VIS (TLS) showed the highest correlation with IM (MLQ). We may explain this through the definitions of both subscales. Thus, we assume that the sense of the future or mission is not perceived differently, regardless of its source. This means that an idealized picture of the future (VIS from TLS) or the articulation of shared goals and mutual understanding using symbols (IM from MLQ) may be understood similarly by the respondents. Rafferty and Griffin (2004) argue that vision items (TLS) reflect general vision as such, without addressing the aspect of optimism and confidence. We believe that this might be the case for the lower level of the military hierarchy, e.g. conscripts.
The second hypothesis was supported by the correlation between the respective components and additionally by the EFA and CFA results: inspirational communication (TLS) and inspirational motivation (MLQ) demonstrated the highest correlation. This means that respondents perceived the expression of positive and encouraging messages about the organization, along with positive and motivating statements (TLS), in the same way as the communication of high expectations and the expression of important purposes (MLQ). Therefore, we may say that these subscales have something in common, which supports the assumption that they may measure similar concepts. However, considering the above discussion about vision, we argue along with Rafferty and Griffin (2004) that further research is necessary to clarify the distinction between these two components of transformational leadership.
The third hypothesis was supported by the correlations: supportive leadership (TLS) and individualized consideration (MLQ) demonstrated the highest correlation. However, EFA and CFA provided controversial results: items from SL (TLS) and IC (MLQ) were separated between two factors and one item from MLQ loaded onto the TLS factor (EFA); nevertheless, differences between the one-factor and two-factor models were rather small (CFA). Therefore, we may say that those subscales have something in common, which supports the assumption that they measure related concepts. Nevertheless, the perception of expressing concern for followers and taking into account their personal needs (TLS) may be different from the perception of understanding individual needs and followers' abilities, not to mention the development of their individual strengths (MLQ). Rafferty and Griffin (2004) argue that supportive leadership does not make a unique contribution to outcome measures; nevertheless, we did not find evidence to support such indications, so we conclude that SL makes an important contribution to transformational leadership. However, we agree with Rafferty and Griffin (2006a) that – considering the items from this subscale – SL (TLS) is missing the aspect of followers' development, which individualized consideration (MLQ) clearly reflects (Avolio and Bass, 2004). At the same time, considering the MLQ items, Schriesheim, Wu, and Scandura (2009) find differences between them in the level of analysis. For instance, the items from the IC (MLQ) reflect an individual-level reference except one, which generally states that the leader deals with teaching and coaching. Thus, by eliminating this MLQ item from the analysis we found full support for the third hypothesis.
The fourth hypothesis was not supported by correlations, which means that the pattern of correlations did not clearly demonstrate the highest relation between the respective subscales. Moreover, EFA and CFA both demonstrated that items from TLS and MLQ – supposed to measure the same concept (intellectual stimulation) – did not behave in the predicted way. Therefore, the two concepts are clearly perceived differently by the Estonian military sample: one means enhancing followers' interests and awareness of intellectual problems (from TLS), while the other focuses on challenging the assumptions of followers' beliefs and values, their analysis of problems, and the solutions they generate (from MLQ). This result is rather controversial, because both instruments used the same definition for IS (Avolio and Bass, 2004; Rafferty and Griffin, 2004). However, looking closely at the respective subscales, TLS items are more general and not as focused on problem-solving or assignments compared to the MLQ items.
Moreover, the fifth hypothesis was not supported by correlations, which means that the pattern of correlations did not clearly demonstrate the highest relation between personal recognition (TLS) and contingent reward (MLQ). EFA indicated that those items formed a single factor, but CFA confirmed that the two-factor model had clearly better fit indices than the one-factor model. This might be explained by findings from previous studies (e.g. Goodwin et al., 2001) that CR has two sides: one side represents more transformational leadership, while the other side reflects transactional leadership. Items that measure PR from TLS clearly represent the transformational side, whereas CR items from MLQ might be divided into two pairs. The first pair, which correlates more weakly with the TLS items, reflects the exchange process common to transactional leadership, while the second pair is closer to transformational leadership, as it is less precisely focused on meeting performance goals agreed between the follower and the leader.
The second question of this study was to examine how well the two instruments predict outcome variables: the willingness for extra effort, satisfaction with the direct leader, and the effectiveness of the leader; all of them in followers' perceptions. Thus, the sixth hypothesis assumed that the TLS predicts an outcome at least to the same extent as the MLQ. We started our analysis by considering the relation between TLS and MLQ summarized scores. The correlation was rather high, corresponding to approximately 61% of shared variance, in line with previous studies that used military samples (e.g. Antonakis and House, 2002). We may conclude that the TLS generally measured the same concept of leadership as the MLQ. This finding is further supported by the relations between the summarized outcome variable and the TLS or MLQ – with 58% and 72% of the variance explained, respectively – and additionally by the EFA results, which clearly demonstrated unidimensionality. The regression analysis showed that MLQ components are better predictors of the outcome variable than TLS components; however, TLS components predict outcomes at an acceptable level. Therefore, there is no support so far for the sixth hypothesis that the TLS's predictive power is at least at the same level as the MLQ's. Usefulness analysis confirmed this conclusion. Some TLS components – IC, SL, and IS – added predictive power to the MLQ. However, model differences were very small. When analyzing the TLS model by adding MLQ components, the reverse held, which leads to the conclusion that MLQ components added significant predictive power to the TLS. However, there are some indications in the literature about the MLQ and its outcome variables, stating that this particular outcome measure may favor the MLQ (Kane and Tremble, 2000). Thus, we may still argue that the TLS has comparable predictive power over transformational leadership outcomes.
The third research question was to collect additional arguments for the construct, criterion, and concurrent validity of the TLS. Therefore, the seventh hypothesis stated that the TLS accounts for variance in followers' leadership outcomes above that accounted for by the MLQ. CFA/SEM models using various subsamples indicated that the TLS had a mostly indirect effect (through the MLQ) on the outcome variable. However, this indirect effect was highest for the subsamples of professionals and commanders. At the same time, the direct effect was highest for conscripts. This might mean that the TLS measures something supplementary to the MLQ for conscripts and does not do the same for professionals. Moreover, CFA/SEM models demonstrated that VIS, PR, and IS did not have a statistically significant direct effect on outcomes – i.e. additional to the effect through the MLQ – whereas IC and SL did have such an effect. Therefore, our conclusion is that the seventh hypothesis was not fully supported. However, we admit that supportive leadership and inspirational communication (from TLS) add variance to the MLQ components. The regression analysis led to the same conclusion.

Conclusion
The general aim of our research was to compare two transformational leadership models and thus collect additional arguments for the construct, criterion, and concurrent validity of the TLS. As a general conclusion, we note that the transformational leadership scale proposed by Rafferty and Griffin (2004) demonstrates good psychometric properties and predicts the outcome variable proposed by Avolio and Bass (2004).
The value of the research lies in showing that the TLS is an acceptable research instrument to measure transformational leadership in different languages and cultural contexts. For researchers, it means that there is an alternative to the MLQ, the most widely used transformational leadership questionnaire, and that this alternative is very easy to administer and free to use for scientific purposes. However, the subscale reflecting vision in particular should be conceptually rethought, because different personnel categories in the military ranks may perceive it dissimilarly; it is therefore difficult to measure vision by using the same items. The same conclusion applies to practitioners.
The current research used a homogeneous sample from the Estonian military, which may be considered a limitation of this study. This means that further work is required to confirm the results. For instance, Antonakis and House (2002) conclude that the correlations between leadership factors and outcome variables – effectiveness, extra effort, and satisfaction – can differ between organization types. In the military environment, these correlations tend to be noticeably higher compared to the civilian environment. This conclusion is supported by a comparison of the meta-analytic correlation coefficients reported by Dumdum et al. (2002, pp. 48, 53) with the figures from this study, which indicates that respondents from the military demonstrate stronger relations between transformational leadership and outcome variables. Therefore, it seems reasonable to administer both tools to samples outside the military so as to make the necessary comparisons between different contexts.