Introduction
Anxiety disorders are associated with pervasive functional impairment. The cost to society in income or productivity loss for patients who meet criteria for one of these disorders was $46.6 billion in 1990 (DuPont et al. 1996). In addition to the financial burden, patients with anxiety disorders experience interpersonal problems such as marital discord and other family and social problems (Markowitz et al. 1989; Lochner et al. 2003), education problems (Stein et al. 1997; Wittchen et al. 2000), and increased use of non-prescription drugs (Lochner et al. 2003). Thus, the functional impairment associated with anxiety disorders is as important to understand and address as the symptoms of anxiety.
Throughout this paper, for the sake of clarity, symptoms will refer to the anxiety and depression characteristics that are operationalized in DSM-5 (APA, 2013), including physical, behavioral, and cognitive disturbances. Functioning will refer to activities of daily living such as the ability to work or attend school, or fulfill social or familial obligations. The relationship between symptom severity and functional impairment in patients with anxiety disorders is not as straightforward as might be expected. For instance, while the majority of studies demonstrate that the frequency of panic attacks or number of depression symptoms is predictive of disability (Katerndahl & Realini, 1997; Rubin et al. 2000; Chan et al. 2009), others have found that the frequency of panic attacks is not correlated with functioning (Michelson et al. 1998). In addition, the number of anxiety symptoms endorsed accounts for only a small percentage of variance in quality of life or functioning (Leon et al. 1992; Michelson et al. 1998; Rapaport et al. 2005). These data suggest that improvements in functioning are unlikely to occur merely through symptom reduction.
Improvement in functioning is often measured as a sign of treatment effectiveness. A number of studies of pharmacological treatment report significant improvements on measures of functioning alongside improvements on symptom measures for anxiety disorders (Jacobs et al. 1997; Lecrubier & Judge, 1997; Mavissakalian et al. 1998; Michelson et al. 1998; Malik et al. 1999; Stein et al. 1999). However, the situation is complicated by other studies that fail to find differences in functioning between patients receiving medication v. placebo (Hoehn-Saric et al. 1993) or find differences on self-report measures of functioning but fail to find differences on clinician-rated measures of functioning (Michelson et al. 1998). Evidence for changes in functioning from pre- to post-cognitive behavior therapy (CBT) is more limited, particularly compared to other active treatments, but several studies indicate improved quality of life for patients with anxiety disorders following CBT (Safren et al. 1996; Moritz et al. 2005; Arch et al. 2012). Some studies indicate equivalent improvements in functioning following CBT compared to pharmacotherapy (Kilic et al. 1997).
Few studies have attempted to parse the complicated relationship between symptom levels and functional impairment. Those that have done so examined only unidirectional relationships, typically by testing whether improvements in symptom levels predict improvements in functioning. The results from such studies have been contradictory. On the one hand, several studies show that symptom improvements are associated with, or predict, functional improvement following pharmacotherapy (Jacobs et al. 1997) and CBT (Telch et al. 1995; Moritz et al. 2005). On the other hand, other studies fail to substantiate the link between symptom and functional impairment outcomes (Tenney et al. 2003; Monson et al. 2006).
No study to our knowledge has simultaneously analyzed the predictive role of symptom improvement on functioning while examining the role of functional improvement on symptom alleviation. This is an essential question that has clear implications for treatment development. If functional improvement is a stronger predictor of symptom reduction than the converse, then clinicians would be encouraged to target ways of improving functioning in patients’ daily lives early in the treatment process. Alternatively, if symptom alleviation more strongly predicts functional improvement, clinicians would be justified in dedicating more time to targeting symptom alleviation from the onset of treatment. The current study aimed to evaluate the bidirectional nature of the relationship between symptoms and functioning. We used data from a sample of patients with principal anxiety disorders presenting to their primary-care physicians as part of the Coordinated Anxiety Learning and Management (CALM; Roy-Byrne et al. 2010; Craske et al. 2011) study. Analyses were conducted on data covering an 18-month period, providing an extended time period to capture changes in both functioning and symptoms. The primary hypothesis was that symptom levels at a given time point would predict functioning at a subsequent time point, and simultaneously that functioning at a given time point would predict symptom levels at a subsequent time point.
Method
Participants
Participants were recruited from 13 primary-care clinics throughout the United States (Roy-Byrne et al. 2010; Craske et al. 2011). They were diagnosed with a principal anxiety disorder of panic disorder (n = 262), generalized anxiety disorder (n = 549), social anxiety disorder (n = 132), or post-traumatic stress disorder (n = 61). Participants were at least 18 years old, spoke either English or Spanish, were not currently suicidal, had no marked cognitive impairment or life-threatening medical conditions, and had no diagnoses of bipolar I disorder or psychosis. With the exception of alcohol and marijuana abuse, substance abuse or dependence was also an exclusion factor. Over half of participants had co-morbid anxiety disorders and two-thirds had co-morbid major depression. Participants were referred through their primary-care physician or nursing staff and were screened for eligibility by an anxiety clinical specialist (ACS). Full details about recruitment are available in the primary outcome papers (Roy-Byrne et al. 2010; Craske et al. 2011). Participants averaged 43.5 years of age (s.d. = 13.4), and were primarily white (69.6%) and female (71.1%).
Intervention
Participants were randomized to either usual care (UC) or the CALM intervention (ITV), which comprised CBT, medication recommendations, or both.
Intervention (ITV)
The CBT component was a computer-assisted program that guided the ACS and the patient, and included generic modules (self-monitoring, psychoeducation, fear hierarchies, breathing retraining, and relapse prevention) and modules that were tailored to the most distressing/disabling anxiety disorder (cognitive restructuring and exposure; Craske et al. 2009). Full details of the computer-assisted CBT are described by Craske et al. (2009), as are details of ACS training by Rose et al. (2011).
ACS relayed medication recommendations from study psychiatrists to the primary-care providers, who prescribed medications. Medications were monitored by the ACS (56% in person, 43% over the phone). This included tracking adherence to medication, as well as providing counseling to avoid alcohol and caffeine and improve sleep quality. After 10–12 weeks in ITV, patients who remained symptomatic could opt to continue in the same modality (CBT or medication) or the alternative modality, for up to 12 months (although the majority completed active treatment within 6 months of baseline assessment). Following active treatment, participants received monthly follow-up phone calls to reinforce CBT concepts and/or medication adherence, again up to 12 months following baseline.
Usual Care (UC)
Participants (70.9% female, mean age 43.7, s.d. = 13.7, 57.7% white) who were randomized to UC received continued care with their primary-care provider. Participants in this condition received medication, counseling as typically available based on the clinic, and referrals to mental health providers. Their only contact with study personnel was for assessment purposes.
Measures
For the current study, all measures of interest were completed at baseline, and 6, 12, and 18 months thereafter during telephone assessments conducted by interviewers at the RAND Corporation who were blind to treatment condition and the timing of the assessment. These include three measures of functioning from two different instruments, and three measures of symptoms.
Short-form health survey (SF-12; Ware et al. 1996)
The SF-12 is a 12-item questionnaire about mental and physical health functioning with items that are particularly relevant to depression. The measure has demonstrated high reliability and construct validity (Ware et al. 1996). In the current study, the oblique subscales for physical health (PHS) and mental health (MHS) were calculated using weights derived from confirmatory factor analysis in Fleishman et al. (2010). These subscales include items focused on the interference in activities of daily living caused by physical and mental health issues. The MHS and PHS were highly correlated with each other (0.71, 0.74, 0.73, and 0.77 at baseline, 6, 12, and 18 months, respectively), but they represent distinct domains of functioning (see Appendices 1 and 2 for summary statistics and correlations for all variables). Therefore, they were analyzed separately.
Sheehan Disability Scale (SDS; Sheehan, 1983)
The SDS measures functional impairment with three items that are rated on a 0–10 point scale, where 0 = not at all and 10 = extremely. This measure queries patients about their functional impairment due to anxiety, tension, or worry. Specifically, it rates participants’ interference with work, school, social, or family obligations. It has demonstrated high internal consistency and construct validity (Leon et al. 1997).
Brief Symptom Inventory (BSI; Derogatis & Melisaratos, 1983)
The BSI-18 is a self-report measure of psychological symptoms. Examples of symptoms rated on this measure include nervousness, shakiness, and spells of panic. The current study used a 12-item version of the measure that included somatization and anxiety subscales, but excluded the depression subscale. Items are rated on a 0–4 point Likert scale, where 0 = not at all and 4 = extremely. The BSI has demonstrated good test–retest reliability, as well as high correlations with its parent measure, the Symptom Checklist-90R (Derogatis, 1994).
Anxiety Sensitivity Index (ASI; Reiss et al. 1986)
The ASI is a 16-item self-report measure of beliefs that anxiety is harmful, rated on a 0–4 point Likert scale, where 0 = very little and 4 = very much. Examples of cognitive symptoms on this measure include fears of heart racing and fears of feeling shaky. It has demonstrated good reliability and is factorially independent from other measures of anxiety (Peterson & Heilbronner, 1987).
Patient Health Questionnaire (PHQ-8; Spitzer et al. 1999)
The PHQ-8 is a measure of depression severity (the item assessing suicidal ideation and intent from the PHQ-9 was dropped), with each item rated on a 0–3 point Likert scale, where 0 = not at all and 3 = nearly every day. Examples of depression symptoms rated on this measure include hopelessness and low energy. Given the high co-morbidity with depression in this principal anxiety disorder sample (64.5% had co-morbid major depression), this measure was included in the current study to examine the relationship between depression and functional impairment. The PHQ-8 has demonstrated adequate reliability and validity (Kroenke et al. 2001).
Procedure and data analysis
This study examined four pairings of symptom and functioning measures. These include: (1) Mental Health Subscale of SF-12 (MHS; functioning) and Patient Health Questionnaire (PHQ; depression symptoms); (2) Physical Health Subscale of SF-12 (PHS; functioning) and Patient Health Questionnaire (PHQ; depression symptoms); (3) Sheehan Disability Scale (SDS; functioning) and Brief Symptom Inventory (BSI; anxiety symptoms); and (4) Sheehan Disability Scale (SDS; functioning) and Anxiety Sensitivity Index (ASI; beliefs about anxiety symptoms). Cross-lagged panel analyses were conducted for each of the four variable pairings based on procedures outlined in Martens & Haase (2006) using Mplus software v. 6.2 (Muthén & Muthén, 2011).
The Martens & Haase (2006) procedure involves testing a series of four models to determine the form of the relationship between the two variables of interest. The first model is the base model and consists of the paths labeled ‘A’ in Fig. 1. These include autoregressive paths that represent stability over time across successive observations for each variable. Achieving adequate model fit for the base model in this dataset also required the addition of paths from each prior observation of a variable to each later observation of the same variable. This elaboration of the base model involves only paths between observations of the same construct and will make tests of cross-lagged paths between different constructs (described below) more conservative than the base model originally described by Martens & Haase (2006). The base model also includes time-specific residual variances. Initial observations of the two constructs were allowed to correlate in the base model and in all subsequent models, as were the error terms associated with the two variables at each of the later time points. The second model adds prospective paths predicting symptoms at a given time point from functioning at the previous time point (the paths labeled ‘B’ in Fig. 1) to the base model. The third model adds paths predicting functioning from symptoms (the paths labeled ‘C’ in Fig. 1) to the base model. The final model contains all prospective paths outlined above (A, B, and C) simultaneously.

Fig. 1. Diagram of the analytic approach based on the Martens & Haase (2006) cross-lagged panel design. The values 00, 06, 12, and 18 represent the measurements at baseline, 6, 12, and 18 months, respectively. The paths labeled ‘A’ represent the regressions included in the base model, the paths labeled ‘A’ and ‘B’ represent the second model, the paths labeled ‘A’ and ‘C’ represent the third model, and the paths labeled ‘A’, ‘B’, and ‘C’ represent the fourth/full model. The ε's represent error terms, and the error terms were correlated, as were the baseline observations. Time-specific error variances were part of the base model.
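The four nested models can also be written as a pair of regression equations. The following is only a sketch in notation introduced here for illustration (the symbols S, F, a, b, c, φ, and ψ do not appear in the source, and the inclusion of intercepts is an assumption): S_w and F_w denote the symptom and functioning scores at wave w, where w = 0, 1, 2, 3 corresponds to baseline, 6, 12, and 18 months.

```latex
\begin{align}
S_w &= \alpha^{S}_{w} + \sum_{k=0}^{w-1} a^{S}_{wk}\, S_k + b_w\, F_{w-1} + \varepsilon^{S}_{w}, \\
F_w &= \alpha^{F}_{w} + \sum_{k=0}^{w-1} a^{F}_{wk}\, F_k + c_w\, S_{w-1} + \varepsilon^{F}_{w},
\qquad w = 1, 2, 3, \\
\operatorname{Cov}(S_0, F_0) &= \phi, \qquad
\operatorname{Cov}\!\left(\varepsilon^{S}_{w}, \varepsilon^{F}_{w}\right) = \psi_w .
\end{align}
% Base model (paths A only):  b_w = 0 and c_w = 0 for all w
% Model 2 (A and B):          c_w = 0, b_w free   (functioning -> symptoms)
% Model 3 (A and C):          b_w = 0, c_w free   (symptoms -> functioning)
% Full model (A, B, and C):   b_w and c_w free
```

In this notation the a coefficients are the ‘A’ paths (including the added paths from every earlier observation of the same construct), the b coefficients are the ‘B’ paths, and the c coefficients are the ‘C’ paths, so each of models 2 and 3 frees three cross-lagged coefficients relative to the base model.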
A series of χ² difference tests were conducted to compare the fit of the different models. Comparison of the base and second models indicates whether the addition of prospective paths from functioning to symptoms improves model fit, and comparison of the base and third models indicates whether the addition of prospective paths from symptoms to functioning improves model fit. Comparison of models 2 and 3 with the full model indicates whether the addition of a particular set of prospective paths improves model fit over and above the other.
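Because the models were estimated with robust (MLR) estimation and compared with the Satorra–Bentler scaled χ², the difference between two scaled χ² values cannot simply be subtracted. The sketch below shows the commonly documented scaled difference computation, using only the degrees of freedom, scaled χ² value, and scaling correction factor of each model. The function names and the numbers in the usage example are hypothetical and are not taken from the study's tables.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from Q(1/2, y) = erfc(sqrt(y)) for odd df, or Q(1, y) = exp(-y)
    # for even df, then apply Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def scaled_chi2_difference(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference test for two nested models.

    Model 0 is the more restrictive (nested) model and model 1 the less
    restrictive one. t = scaled chi-square, c = scaling correction factor,
    d = degrees of freedom. Returns (statistic, df difference, p value).
    """
    if d0 <= d1:
        raise ValueError("model 0 must have more degrees of freedom than model 1")
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)  # scaling correction for the difference
    if cd <= 0:
        raise ValueError("non-positive difference scaling factor; test undefined")
    trd = (t0 * c0 - t1 * c1) / cd
    ddf = d0 - d1
    return trd, ddf, chi2_sf(trd, ddf)


# Hypothetical values: a base model against a model with three added paths.
stat, ddf, p = scaled_chi2_difference(50.0, 1.2, 10, 30.0, 1.1, 7)
print(f"scaled chi-square difference = {stat:.3f}, df = {ddf}, p = {p:.5f}")
```

The p value is computed with a small closed-form routine for integer degrees of freedom so that the example needs only the standard library; a statistics package would normally be used instead.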
Given that participants were randomized to the UC and ITV groups, a secondary series of analyses repeated the Martens & Haase (2006) procedure described above within each group to examine potential differences between treatment conditions. Because the primary purpose of this study was to focus on the effects of symptoms and functioning, not specific effects of treatment, results are reported separately by treatment group only when the Martens & Haase (2006) procedure indicated different forms of association between the symptoms and functioning variables in best-fitting models. When no such differences were found, the best-fitting model for the full sample is presented.
On average across all observations of all dependent variables, 13% of data were missing (range 0–20%). In addition, the variables were generally not normally distributed, and skewness (absolute range 0.044–1.364) and kurtosis (range −0.94 to 1.688) increased at later time points. Although these univariate skew/kurtosis measures do not exceed problematic thresholds (2 for skew and 7 for kurtosis; West et al. 1995), indices of multivariate non-normality were significant for each of the four variable combinations (Mardia skewness range 5.85–10.51; Mardia kurtosis range 94.84–108.49; Mardia, 1970). To adjust for these factors, maximum-likelihood estimation with robust standard errors (s.e.s) was employed using the MLR option in Mplus. MLR estimates of s.e. use a sandwich estimator, which is more accurate than conventional maximum-likelihood estimation in the face of normality violations in terms of s.e. and χ² estimation (Maas & Hox, 2004). However, parameter estimates do not differ significantly regardless of whether MLR or traditional maximum-likelihood estimates are used (Hox et al. 2010). Model fit comparisons are made using the Satorra–Bentler scaled χ² (Satorra & Bentler, 2001), taking into account the degrees of freedom, χ² value, and scaling factor for both models under comparison.
In addition, the 1004 subjects in this study were nested within 13 clinics, raising the potentially problematic issue of modeling within-clinic similarity in responses. Imposition of a multilevel structure on the analysis was not possible in Mplus because the number of parameters estimated in the model was greater than the number of clinics. However, univariate intraclass correlation coefficients for the nine variables included in this study were relatively low (0.0139–0.0969) and multilevel random coefficient analyses conducted separately on each variable did not detect any significant between-clinic variance, which suggests that the results of this analysis are unlikely to be affected by the nested data structure.
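To make the clinic-level check concrete, the sketch below computes a univariate intraclass correlation from scores grouped by clinic. The text does not state which estimator was used, so the one-way random-effects ANOVA estimator (with an adjustment for unequal clinic sizes) is an assumption, and the function name and toy data are hypothetical.

```python
from statistics import fmean


def icc_oneway(groups):
    """ICC(1) from a one-way random-effects ANOVA.

    `groups` is a list of lists, one inner list of scores per clinic.
    """
    g = len(groups)
    sizes = [len(scores) for scores in groups]
    n = sum(sizes)
    if g < 2 or min(sizes) < 1 or n <= g:
        raise ValueError("need at least two non-empty groups and n > number of groups")

    grand_mean = sum(sum(scores) for scores in groups) / n
    means = [fmean(scores) for scores in groups]

    # Between-clinic and within-clinic sums of squares.
    ss_between = sum(s * (m - grand_mean) ** 2 for s, m in zip(sizes, means))
    ss_within = sum((v - m) ** 2 for scores, m in zip(groups, means) for v in scores)

    ms_between = ss_between / (g - 1)
    ms_within = ss_within / (n - g)

    # Effective group size when clinics differ in the number of patients.
    k0 = (n - sum(s * s for s in sizes) / n) / (g - 1)

    return (ms_between - ms_within) / (ms_between + (k0 - 1) * ms_within)


# Toy data: three hypothetical clinics whose means differ strongly.
toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(round(icc_oneway(toy), 4))
```

Values near zero, such as the 0.0139–0.0969 range reported above, mean that little of the variance in a measure lies between clinics, which is the basis for treating the nesting as unlikely to affect the results.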
Constrained analysis: After determining the best-fitting model for each pair of constructs, a subsequent set of analyses was conducted to establish whether there were differences in the relative strengths of the predictive associations. These analyses are referred to as ‘constrained analysis’ below for simplicity. They involve comparing the fit of models in which the standardized cross-lagged path coefficients (B and C in Fig. 1) at each time point are constrained to be equal v. when they are freely estimated. For example, the cross-lagged path predicting symptoms from functioning from 6 to 12 months was constrained to be equal to the cross-lagged path predicting functioning from symptoms within the same time period. These analyses were conducted by importing a correlation matrix into Mplus and fitting two models, one constraining the appropriate unstandardized paths to be equal, the other allowing all paths to be freely estimated. A significant difference in fit between the constrained and unconstrained correlation-based models would indicate a significant difference in the strength of the standardized paths in the default covariance-based analysis conducted on raw data. MLR estimates are not available in Mplus for correlation-based analyses, and the maximum-likelihood estimation that was used does not directly adjust for missing data or non-normality. However, the correlation matrix input to this procedure was calculated using full-information maximum likelihood to account for missing data using the corFiml function of the ‘psych’ package in R (Revelle, 2013). Moreover, the extent to which χ² values are overestimated due to non-normality should be similar for both the constrained and unconstrained models, and thus the difference between them should still provide a useful test of the equality between standardized path coefficients (albeit one that should be evaluated with some degree of caution).
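The reason a correlation matrix is used here is that, when every variable has unit variance, ordinary regression weights are already standardized, so an equality constraint on them is a constraint on standardized paths. The sketch below illustrates only that point for a single simplified lag with two predictors; it is not the Mplus equality-constraint test and it omits the additional same-construct paths of the elaborated base model. The function name and all correlations are hypothetical.

```python
def standardized_betas(r_x1x2, r_x1y, r_x2y):
    """Standardized weights for y regressed on two correlated predictors.

    With all variables standardized, the weights follow directly from the
    three correlations; returns (beta for x1, beta for x2).
    """
    denom = 1.0 - r_x1x2 ** 2
    if denom <= 0.0:
        raise ValueError("predictors must not be perfectly correlated")
    beta_x1 = (r_x1y - r_x1x2 * r_x2y) / denom
    beta_x2 = (r_x2y - r_x1x2 * r_x1y) / denom
    return beta_x1, beta_x2


# Hypothetical correlations for one lag (earlier wave -> later wave).
r_s_f = 0.5       # symptoms and functioning at the earlier wave
r_s_s = 0.6       # symptoms earlier with symptoms later
r_f_s = 0.5       # functioning earlier with symptoms later
r_f_f = 0.6       # functioning earlier with functioning later
r_s_f_next = 0.4  # symptoms earlier with functioning later

# Later symptoms on earlier symptoms (stability) and functioning ('B' path).
stab_s, path_b = standardized_betas(r_s_f, r_s_s, r_f_s)
# Later functioning on earlier functioning (stability) and symptoms ('C' path).
stab_f, path_c = standardized_betas(r_s_f, r_f_f, r_s_f_next)

print(f"B (functioning -> symptoms) = {path_b:.3f}")
print(f"C (symptoms -> functioning) = {path_c:.3f}")
```

In the constrained analysis the question is whether forcing the two cross-lagged weights for a lag to share one value worsens model fit, which is assessed by the model comparison described above rather than by inspecting the two estimates.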
Results
Model fit indices for all four combinations of functioning and symptomatology variables indicated good fit for the final models in terms of the (ranging from 0.996 to 1.0, compared to a recommended value >0.95; Bentler & Bonett, 1980; Hu & Bentler, 1999) Comparative Fit Index (CFI, ranging from 0.994 to 1.000, compared to a recommended value >0.95; Bentler, 1995; Hu & Bentler, 1999), standardized root mean squared residual (ranging from 0.008 to 0.038, compared to a recommended value <0.08; Hu & Bentler, 1999), root mean square error of approximation (RMSEA, ranging from 0.013 to 0.05 for the final models, compared to a recommended value <0.06; Steiger & Lind, 1980; Hu & Bentler, 1999), and change in Akaike's Information Criterion (AIC) from the base model to the final model (range of difference 26.106–163.496; Akaike, 1976).
PHQ and MHS analysis
There were no differences in the best-fitting model for the PHQ and MHS variables based on treatment condition, and therefore the results from the full sample are reported. The best-fitting model for PHQ and MHS is the full model, which includes prospective paths from depression to mental functioning and prospective paths from mental functioning to depression (see column 5 in Table 1 for tests of the various model comparisons; path coefficients are provided in Fig. 2). The results of the constrained analysis indicated that there were no significant differences in the magnitude of the cross-lagged paths (see column 6 in Table 1). This is an indication that prospective paths predicting depression from mental functioning are as important to model fit as those predicting mental functioning from depression.
PHQ and PHS analysis
There were differences in the best-fitting model for the PHQ and PHS variables based on treatment condition, and therefore the two treatment conditions are discussed separately. In the UC sample, the third model, including only prospective paths from PHQ (symptoms) to predict PHS (functioning), was the best-fitting model (see column 5 in the upper panel of Table 2 and Fig. 3a). Adding the paths from PHS to predict PHQ did not improve the fit of the model. The constrained analysis was not conducted on the UC group because the best-fitting model was not the full model. In the ITV sample, the best-fitting model is the full model, which includes prospective paths from depression to physical functioning and prospective paths from physical functioning to depression (see column 5 in the lower panel of Table 2 for tests of the various model comparisons; path coefficients are provided in Fig. 3b). The results of the constrained analysis indicated that there were no significant differences in the magnitude of the cross-lagged paths (see column 6 in Table 2). This indicates that in the ITV sample, prospective paths predicting depression from physical functioning are as important to model fit as those predicting physical functioning from depression.

Fig. 2. Comparison between the Patient Health Questionnaire (PHQ) and Mental Health Subscale (MHS) in the full sample (usual care and intervention). *p < 0.05, **p < 0.01.

Fig. 3. Comparison of Patient Health Questionnaire (PHQ) and Physical Health Subscale (PHS) in the usual care (UC) and intervention (ITV) groups. (a) Diagram of model 3 for the UC group; (b) diagram of the full model for the ITV group. These were the best-fitting models for the respective groups (*p < 0.05, **p < 0.01).
Table 1. Fit statistics of Mental Health Subscale (MHS) and Patient Health Questionnaire (PHQ) model including the full sample

CFI, Comparative Fit Index; AIC, Akaike's Information Criterion; SRMR, standardized root mean square residual; RMSEA, root mean square error of approximation.
The correlation matrix was used in the constrained analyses, whereas the covariance matrix was used in all other comparisons.
**p < 0.01.
Table 2. Fit statistics of Patient Health Questionnaire (PHQ) and Physical Health Subscale (PHS) model including the full sample

CFI, Comparative Fit Index; AIC, Akaike's Information Criterion; SRMR, standardized root mean square residual; RMSEA, root mean square error of approximation.
The correlation matrix was used in the constrained analyses, whereas the covariance matrix was used in all other comparisons.
**p < 0.001, *p < 0.01, †p < 0.05.
SDS and BSI analysis
There were no differences in the best-fitting model for the SDS and BSI variables based on treatment condition, and therefore results from the full sample are reported. The best-fitting model for BSI and SDS is the full model, which includes prospective paths predicting anxiety from functioning and predicting functioning from anxiety (see column 5 in Table 3 for tests of the various model comparisons; path coefficients are provided in Fig. 4). As with the prior comparisons, the results of the constrained analysis indicated that there were no significant differences in the magnitude of the cross-lagged paths (see column 6 in Table 3). This is an indication that prospective paths predicting anxiety from functioning are as important to model fit as those predicting functioning from anxiety.
SDS and ASI analysis
There were differences in the best-fitting model for the SDS and ASI variables based on treatment condition, and therefore the treatment conditions are discussed separately. In the UC sample, the third model, including only cross-lagged paths from beliefs about anxiety to functioning, was the best-fitting model (see columns 4 and 5 in the upper panel of Table 4 for tests of the various model comparisons; path coefficients are provided in Fig. 5a). The constrained analysis was not conducted on the UC group because the best-fitting model was not the full model. In the ITV sample, the best-fitting model was the full model (see column 5 in the lower panel of Table 4 for tests of the various model comparisons; path coefficients are provided in Fig. 5b), and the constrained analysis did not reveal a significant difference in the magnitude of the paths (see column 6 in Table 4). This indicates that in the ITV sample, prospective paths predicting beliefs about anxiety from functioning are as important to model fit as those predicting functioning from beliefs about anxiety.

Fig. 4. Brief Symptom Inventory (BSI) and Sheehan Disability Scale (SDS) in the full sample (usual care and intervention). *p < 0.05, **p < 0.01, †p = 0.05.

Fig. 5. Comparison of Anxiety Sensitivity Index (ASI) and Sheehan Disability Scale (SDS) in the usual care (UC) and intervention (ITV) groups. (a) Diagram of model 3 for the UC group; (b) diagram of the full model for the ITV group. These were the best-fitting models for the respective groups (*p < 0.05, **p < 0.01).
Table 3. Fit statistics of Sheehan Disability Scale (SDS) and Brief Symptom Inventory (BSI) model including the full sample

CFI, Comparative Fit Index; AIC, Akaike's Information Criterion; SRMR, standardized root mean square residual; RMSEA, root mean square error of approximation.
The correlation matrix was used in the constrained analyses, whereas the covariance matrix was used in all other comparisons.
**p < 0.01.
Table 4. Fit statistics of Anxiety Sensitivity Index (ASI) and Sheehan Disability Scale (SDS) model including the full sample

CFI, Comparative Fit Index; AIC, Akaike's Information Criterion; SRMR, standardized root mean square residual; RMSEA, root mean square error of approximation; UC, usual care; ITV, intervention.
The correlation matrix was used in the constrained analyses, whereas the covariance matrix was used in all other comparisons.
**p < 0.001, *p < 0.01.
Discussion
The relationship between anxiety disorders and functional impairment has been clearly established, but the temporal relationship between symptom improvement and functional improvement has not been adequately studied to this point. This study fills this gap in the literature. The data indicate a bidirectional relation, with anxiety and depression symptom severity predicting functional impairment and conversely, functional impairment predicting anxiety and depression symptom severity.
Prior research has shown that changes in anxiety symptoms predict levels of functional impairment (Telch et al. 1995; Moritz et al. 2005). The results of the current study extend prior research by showing that changes in level of functioning are as predictive of symptoms of anxiety and depression as changes in symptom severity are of functioning. This was the case for somatic and cognitive symptoms of anxiety (measured using the BSI), symptoms of depression (measured using the PHQ-8), and beliefs about anxiety (measured by the ASI). In sum, our study hypotheses were supported. Symptom severity at a particular time point predicted functioning at a subsequent time point while controlling for previous symptom severity, and functioning at a particular time point predicted symptom severity at a subsequent time point while controlling for previous functioning levels.
This study also demonstrates that the relationship between functioning and beliefs about anxiety varies as a function of treatment condition. In particular, in the ITV condition, the prospective prediction of functioning from beliefs about anxiety and of beliefs about anxiety from functioning were both important in modeling these associations. However, in the UC group, only the prospective prediction of functioning from beliefs about anxiety was important for modeling the associations. This suggests that the relationship between functioning and beliefs about anxiety is not static, and a strong dose of evidence-based treatment can change the relationship between these constructs. Interestingly, in the UC group, beliefs about anxiety at baseline predicted functioning at 6 months, whereas this relationship was not significant in the ITV group. One possibility is that the effect of the intervention overrode the impact of baseline beliefs about anxiety while treatment was ongoing. The relationship between beliefs about anxiety and outcome may be dependent upon type of treatment. For instance, type of treatment moderates the relationship between beliefs about anxiety and distress anxiety symptoms, with no clear relation between beliefs about anxiety and outcome in mindfulness-based treatments (Hayes et al. Reference Hayes, Strosahl and Wilson1999; Wolitzky-Taylor et al. Reference Wolitzky-Taylor, Arch, Rosenfield and Craske2012; Arch & Ayers, Reference Arch and Ayers2013). Therefore, it is not entirely surprising that this difference emerged in the current study. A similar finding occurred in the comparison between physical health functioning and depression, in that in the UC group, only the prospective prediction of functioning from depression was important for modeling the associations. However, the fact that functioning predicts symptomatology in the ITV condition indicates the importance of focusing on functional impairment during evidence-based treatment.
Overall, these results highlight the importance of focusing on both functional impairment and symptom severity when treating patients with anxiety disorders and co-morbid depression. A common approach to treatment is to wait until symptoms subside before encouraging participants to increase their functioning. That is, patients often are discouraged from engaging in behaviors aimed at improving relationship satisfaction, work performance, or other activities of daily living until symptoms have subsided. The underlying assumption is that patients are not capable of functional living until their symptoms have abated. In contrast, the current findings suggest that encouraging patients to engage in more functional behaviors may facilitate symptom stabilization. Behavioral therapies such as exposure therapy to feared situations and behavioral activation for depression target functioning by encouraging patients to engage in activities they are avoiding. The current findings reinforce this approach. The results of this study also indicate the importance of monitoring functioning levels over the course of treatment. This information will provide clinicians with an additional marker of patient progress, and may alert clinicians to those patients who require additional intervention to improve their functioning.
This study has several strengths, including a generalizable sample recruited from primary-care centers throughout the United States. Another strength is the data-analytic approach, which allows for an examination of the relationship between functioning and symptoms at multiple time points simultaneously rather than including single time points in separate analyses. Finally, this study included multiple measures (completed by blind assessors) of both functioning and symptoms, and the results were largely consistent across the measures, indicating that they are robust and likely to hold across multiple measures of functioning and symptomatology.
The study also has limitations, including the possibility that an unidentified third variable, not included in this analysis, accounts for the predictive power of both functioning and symptom levels. One possibility is Axis II disorders, which may help explain disruption in both functioning and psychological symptoms. Additionally, functioning was measured by self-report scales rather than by objective data (e.g. number of work days missed as measured by a third party). Objective report may identify a greater degree of impairment (Maor et al. Reference Maor, Olmer and Mozes2001). Furthermore, this sample is heterogeneous in terms of principal diagnosis. While the sample size is large enough to address the research questions proposed in this study with the pooled group, it is not sufficiently large to run separate models by diagnosis. Therefore, future studies should compare models based on principal diagnosis. In addition, future research needs to be conducted on other patient populations to determine the relevance of these findings to other psychiatric disorders. Further, the sample size in this particular study was not large enough to model more than two predictors in a given analysis, and future research with larger samples should address whether this pattern of findings holds when symptoms of anxiety and depression are combined in one model. Finally, this work merits replication to ensure the validity of the findings.
Overall, these results emphasize the importance of focusing on both symptom alleviation and improvement in functioning in patients with anxiety disorders. While prior research has demonstrated that symptom severity predicts functioning levels, to our knowledge, no other studies have parsed the bidirectional relationship between these two constructs. Although more research is needed in this area, particularly for replication beyond anxiety disorders, this research should guide clinicians to be mindful of their patients' functioning in everyday activities in addition to their symptom severity, and to encourage functioning even in the presence of ongoing symptoms.
Acknowledgements
This work was supported by the following National Institute of Mental Health grants: U01 MH070018, U01 MH058915, U01 MH057835, U01 MH057858, U01 MH070022, K24 MH64122, and K24 MH065324.
Appendix 1 Summary statistics

Appendix 2
(a) Correlation matrix

(b) Correlation matrix for the primary analyses
