1. Introduction
As one of the most populous countries in the world, China is rapidly ageing due to improvements in life expectancy and low fertility rates in past decades. In 2015, one in five older persons aged 65+ globally lived in China, while in 2050 one in four elderly – over 370 million people – will be Chinese. China’s old age dependency ratio was 15% in 2015, but will rise to 50% by mid-century (United Nations, 2015). The need for health care, aged care and financial services for the elderly in China is already large and will keep growing in the future.
Traditionally, older Chinese were cared for by family members, but the availability of family caregivers is declining due to demographic changes, the weakening of traditional values, greater geographic mobility and improved gender equality (see, e.g. Lu et al., 2015; Zhu, 2015). In China, the current social security programmes for older people provide basic medical insurance and a low pension income. However, they do not cover the full cost of residential-aged care facilities and also do not fund community-based services (Yang et al., 2013). The resulting unmet aged care needs have a measurable impact on the mortality risk of older Chinese (Zhen et al., 2015). Hence, there is a need for social security programmes specialising in the provision of aged care (Zhen et al., 2015) and the development of private market solutions such as long-term care insurance or specialised home equity release products.
These challenges motivate our study on health transitions of older Chinese. There is a large and growing actuarial literature on multi-state health transition models (see, e.g. Pitacco, 1995; Renshaw & Haberman, 1995; Ferri & Olivieri, 2000; Rickayzen & Walsh, 2002; Fong et al., 2015), but these studies focus on the mortality and morbidity experience of developed countries such as the United Kingdom and United States. As far as we know, there is a lack of specific studies on China on this topic. Since the demographic changes in China are occurring very rapidly, it is important to consider time effects in health transitions in order to develop more accurate projections. Several studies have developed different approaches to consider time effects in multi-state health transition models. Important early contributions include Renshaw & Haberman (2000) and Rickayzen & Walsh (2002) based on UK data. Majer et al. (2013) modelled health transition probabilities in the Netherlands based on the Lee–Carter framework with stochastic time trends. Li et al. (2017) adopted a multi-state model with latent factors to capture systematic time trends in US health transition intensities. Aro et al. (2015) developed a semi-Markov model with stochastic period effects using disability claims data from Sweden. Their model was extended by Djehiche & Lofdahl (2018) to a hidden Markov model with a stochastic time trend.
In this paper, we develop a generalised linear model (GLM) that incorporates age effects, time trends and age–time interactions in the transition rates in a Markov model with three health states (healthy, functionally disabled and dead). Our model extends existing literature by allowing for time trends and age–time interactions. Another strength of our approach is the ability to tailor different functional forms for each transition intensity in different subpopulations. This provides greater flexibility in the model structure. We apply this new model to provide first evidence on the health transitions of older Chinese males and females in urban and rural areas.
We use individual-level panel data from the Chinese Longitudinal Healthy Longevity Survey (CLHLS) for ages 65–105 over the period 1998–2012. CLHLS is the largest longitudinal survey of the “oldest old” (aged 80+) internationally (Zeng, 2012). Mortality and morbidity data in the CLHLS have been found to be of good quality (Zeng, 2012) and have been used in many studies analysing health patterns of older Chinese (e.g. Peng et al., 2010; Peng & Wu, 2015; Fong & Feng, 2016). With a sample size of over 128,000 exposure years we are able to estimate separate models for male and female residents in both urban and rural areas. This distinction is important as large economic and demographic differences continue to exist between urban and rural areas in China (Wang & Yu, 2016). We classify individuals’ health status based on the Activities of Daily Living (ADL) information collected by CLHLS. ADL limitations are widely used internationally to measure an individual’s functional status and long-term care needs for insurance purposes, including a recent long-term care insurance pilot programme in the city of Qingdao in Eastern China (Yang et al., 2016). Six basic ADLs are considered in our study including bathing, dressing, eating, using the toilet, continence, and transferring in and out of bed.
The empirical results confirm that age and time effects are important factors for modelling health transitions at higher ages. Many of the selected models for the health transition rates of the different subpopulations also include age–time interactions which capture time trends that differ by age. Our results suggest that the recent improvements in the mortality rates of older Chinese are largely driven by the decline in the mortality rates for functionally disabled older persons rather than by the mortality rates of the non-disabled population. Using the estimated health transition models, we also provide new estimates for life expectancies and healthy life expectancies at different ages for 1998, 2011 and 2020.
The remainder of the paper is organised as follows. Section 2 introduces the new model consisting of the three-state Markov process and the GLM framework. Section 3 describes the CLHLS data used in this study. Section 4 presents and discusses the results. Section 5 concludes.
2. A Multi-State Health Transition Model With Time Trends
2.1. A three-state time-inhomogeneous Markov process
Following previous literature (see, e.g. Olivieri & Pitacco, 2001; Rickayzen & Walsh, 2002; Fong et al., 2015; Shao et al., 2017), we assume that individuals’ health transitions can be modelled as a multi-state Markov process, where the conditional probability distribution of future states of the process (conditional on both past and present values) only depends on the state presently occupied and is independent of the process history. We define a three-state Markov process as in Figure 1. The process has two transient states, “N” (non-disabled) and “F” (functionally disabled), and one absorbing state, “D” (dead). It allows for three transitions (see footnotes 1 and 2):
∙ σ: N→F, the intensity to become functionally disabled.
∙ μ: N→D, the mortality intensity for a healthy person.
∙ ν: F→D, the mortality intensity for a disabled person.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_fig1g.jpeg?pub-status=live)
Figure 1 Three-state Markov process.
We assume that health transitions follow a time-inhomogeneous Markov process, where the transition probability depends on the time at which the transition takes place:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_eqnU1.gif?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_eqnU2.gif?pub-status=live)
where x represents age, t the time and h≥0. S(x,t) denotes the stochastic health status of an individual at age x and time t, and i,j∈{N,F,D}. $P_{i,j}(x,t,h)$ denotes the transition probability from state i at age x and time t to state j at age x+h and time t+h. $\alpha_{i,j}(x,t)$ denotes the corresponding transition intensity at age x and time t.
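To make the transition probabilities concrete, the following minimal sketch evaluates them over an interval of length h under the simplifying assumption that the three intensities are constant over that interval. The function name and the numerical intensities are illustrative assumptions, not estimates from the paper.

```python
import math

def transition_probabilities(sigma, mu, nu, h):
    """Transition probabilities over an interval of length h for the
    three-state model N -> F, N -> D, F -> D (no recovery), assuming the
    intensities sigma, mu and nu are constant over the interval."""
    leave_n = sigma + mu                      # total intensity of leaving N
    p_nn = math.exp(-leave_n * h)             # remain non-disabled
    p_ff = math.exp(-nu * h)                  # remain disabled (alive)
    if math.isclose(leave_n, nu):
        # Limiting case of the general formula when sigma + mu == nu.
        p_nf = sigma * h * math.exp(-nu * h)
    else:
        p_nf = sigma / (leave_n - nu) * (p_ff - p_nn)
    return {
        "N": {"N": p_nn, "F": p_nf, "D": 1.0 - p_nn - p_nf},
        "F": {"N": 0.0, "F": p_ff, "D": 1.0 - p_ff},
        "D": {"N": 0.0, "F": 0.0, "D": 1.0},
    }

# Illustrative (made-up) annual intensities for a single age and year.
P = transition_probabilities(sigma=0.05, mu=0.03, nu=0.20, h=1.0)
```

Each row of `P` sums to one, and the state D row reflects that death is absorbing.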
2.2. Model specification
Following earlier works of Renshaw & Haberman (1995) and Fong et al. (2015), we model the transition intensities using a GLM approach. Separate GLMs are estimated for each of the three transition intensities σ, μ and ν. The models are specified by three components: the link function, the linear predictor and the probability distribution.
Link function: We adopt a log link function g(⋅):
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_eqnU3.gif?pub-status=live)
where $\alpha_{i,j}(x,t)$ are the respective transition intensities $\sigma_{x,t}$, $\mu_{x,t}$ or $\nu_{x,t}$ for age x at time t, and $\eta_{x,t}$ is a linear predictor of regressors.
Linear predictor: As our primary interest is to explore time trends in health transitions, we introduce time effects as additional covariates besides the age factors considered in Fong et al. (2015) and allow for age–time interactions (see footnote 3). The linear predictor is given by:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_eqnU4.gif?pub-status=live)
where x represents age, t the time and the $\beta_j$ are unknown coefficients that need to be estimated. The model allows some of the $\beta_j$ to be 0, which gives flexibility in the functional form.
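As a sketch of how the log link and the linear predictor combine, the snippet below evaluates a transition intensity as the exponential of a predictor built from an intercept, age, age squared, time, age–time and age squared–time terms (the functional form stated in the note to Table 3). The coefficient values are invented for illustration and are not the paper's estimates; setting a coefficient to 0 drops the corresponding term.

```python
import math

def linear_predictor(x, t, beta):
    """eta = b0 + b1*x + b2*x^2 + b3*t + b4*t*x + b5*t*x^2,
    where x = age - 65 and t = years since 1998."""
    b0, b1, b2, b3, b4, b5 = beta
    return b0 + b1 * x + b2 * x**2 + b3 * t + b4 * t * x + b5 * t * x**2

def intensity(x, t, beta):
    """Invert the log link: the transition intensity is exp(eta)."""
    return math.exp(linear_predictor(x, t, beta))

# Hypothetical coefficients: age and age^2 effects plus a linear time trend.
beta_example = (-4.0, 0.10, -0.001, -0.02, 0.0, 0.0)
rate_1998 = intensity(x=10, t=0.0, beta=beta_example)   # age 75 in 1998
rate_2011 = intensity(x=10, t=13.0, beta=beta_example)  # age 75 in 2011
```

With a negative time coefficient and no interactions, the ratio of the two rates depends only on the elapsed time, which is what a pure linear time trend on the log scale implies.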
We include age factors up to the quadratic effect in agreement with the findings of Fong et al. (2015). This is also in line with common practice in mortality modelling (see, e.g. Cairns et al., 2009). Since the CLHLS data only allow us to compute at most five transition intensities per individual, we focus on a linear time trend in this study. The inclusion of age–time interaction effects has been an important feature in recent developments in mortality modelling (Cairns et al., 2006; Li et al., 2016; Plat, 2009). It ensures that the improvement in mortality has a non-trivial correlation structure across different age groups. Moreover, several studies have recognised the benefits of including quadratic age–time effects for model fitting and forecasting (Cairns et al., 2009; Dowd et al., 2010). Therefore, to build on recent developments in mortality and health transition modelling, and to keep the model parsimonious and interpretable, we consider the aforementioned age, time and age–time factors in this study.
Probability distribution: The Poisson distribution has been widely used in the actuarial literature to model the number of deaths (see, e.g. Brouhns et al., 2002; Cairns et al., 2006; Haberman & Renshaw, 2009). In this paper, assuming that each transition intensity described in section 2.1 is constant for each 1-year age group in a given time interval, we assume that the numbers of health transitions between survey waves follow independent Poisson distributions. For illustrative purposes, we use the mortality rate $\mu_{x,t}$ from the healthy state as an example in the following. Let $n_{x,t}$ be the number of transitions from state N to D at age x and time t:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_eqnU5.gif?pub-status=live)
where $e_{x,t}^{H}$ is the central exposure to risk in the healthy state at age x and time t.
The Poisson assumption implies that the dispersion parameter equals 1, which means that the mean and variance of transition counts should be the same. However, several recent mortality studies have found that death data has an “overdispersed” feature in many countries (see, e.g. Cairns et al., 2009; Li et al., 2015), implying that the variance of the number of deaths is much larger than the mean. Heterogeneity is often considered as a key reason for overdispersion in insurance data. Moreover, strong dependence in the observed sample can also lead to overdispersion. Therefore, we test the dispersion parameter in each of the transition counts before estimating the GLMs. We found that in most of the cases the dispersion parameter is close to 1 (see footnote 4). Only where the dispersion parameter differs significantly from 1 do we relax the restriction on its value and estimate it from the underlying data. Note that the estimates of the model parameters remain unchanged when a dispersion parameter is introduced into the Poisson distribution (for discussions, see Hinde & Demetrio, 1998; McCullagh & Nelder, 1989).
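One common way to check the Poisson assumption is the Pearson estimate of the dispersion parameter: the sum of squared Pearson residuals divided by the residual degrees of freedom. The sketch below illustrates that statistic only. The text does not state which dispersion test was applied, so the choice of estimator is an assumption, and the counts and fitted means are invented.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square statistic divided by residual degrees of freedom.
    Values near 1 are consistent with the Poisson variance assumption;
    values well above 1 suggest overdispersion."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((n - m) ** 2 / m for n, m in zip(observed, fitted))
    return chi2 / dof

# Invented transition counts and fitted Poisson means for six age-time cells.
counts = [12, 7, 20, 15, 9, 30]
means = [10.0, 8.0, 18.0, 16.0, 10.0, 28.0]
phi_hat = pearson_dispersion(counts, means, n_params=2)
```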
2.3. Estimation and model selection
Maximum likelihood estimation (MLE) is used to obtain estimates of the proposed GLM models. We define Φ as the parameter set. Using the mortality rates of healthy individuals again as an example, the log-likelihood function is given by:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_eqnU7.gif?pub-status=live)
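Under the Poisson assumption, each age–time cell contributes the log-probability of its observed transition count, with mean equal to exposure times intensity. The sketch below evaluates that log-likelihood for hypothetical data and considers the simplest case, a constant intensity, where the maximiser is total transitions divided by total exposure. Function names and data are illustrative assumptions.

```python
import math

def poisson_loglik(counts, exposures, rates):
    """Sum over cells of log Poisson(n; mean = exposure * rate)."""
    total = 0.0
    for n, e, r in zip(counts, exposures, rates):
        mean = e * r
        total += n * math.log(mean) - mean - math.lgamma(n + 1)
    return total

# Hypothetical cells: numbers of N -> D transitions and healthy exposure years.
deaths = [3, 5, 8, 12]
exposure = [150.0, 140.0, 120.0, 100.0]

# For a constant intensity the MLE has a closed form: total deaths / total exposure.
mu_hat = sum(deaths) / sum(exposure)

def loglik_at(mu):
    """Log-likelihood when every cell shares the same intensity mu."""
    return poisson_loglik(deaths, exposure, [mu] * len(deaths))
```

Evaluating `loglik_at` slightly above and below `mu_hat` gives smaller values, as expected at a maximum. For the full model, the rates would instead come from the exponentiated linear predictor and the maximisation would be numerical.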
We select the model specification for each transition intensity by comparing the Bayesian information criterion (BIC) for all possible combinations of the six terms in equation (4). That is, we do not impose a single model structure for all health transition rates. Our modelling approach is data-driven and more flexible than relevant earlier works (e.g. Haberman & Renshaw, 2009; Fong et al., 2015). For each type of transition, we tailor the model design to only include those age, time and age–time effects that are important and relevant.
We choose the BIC for model selection because it is widely used in statistics and is proven to be consistent (Schwarz, 1978). The BIC penalises the number of parameters estimated in the model as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_eqnU8.gif?pub-status=live)
where $l(\hat{\Phi})$ is the log-likelihood based on the MLE estimators, N the total number of observations and k the number of parameters in the model. The model with the smallest BIC value is selected as the preferred model. We also provide the results of a stepwise comparison of several nested model variants based on the BIC in section 4.2.
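The exhaustive search can be sketched as a loop over all subsets of candidate terms, scoring each fit with the BIC. The sketch assumes the standard definition, BIC = −2 l + k ln N, and assumes that the intercept is always retained, which the text does not state explicitly. The callable `fit_loglik` is a hypothetical stand-in for whatever routine fits the GLM and returns its maximised log-likelihood.

```python
import itertools
import math

TERMS = ("x", "x^2", "t", "t*x", "t*x^2")  # optional terms; intercept always kept

def bic(loglik, k, n_obs):
    """Standard BIC: -2 * log-likelihood + k * ln(N); smaller is better."""
    return -2.0 * loglik + k * math.log(n_obs)

def select_by_bic(fit_loglik, n_obs, terms=TERMS):
    """Evaluate every subset of `terms` and return (best_bic, best_subset).

    `fit_loglik` is a user-supplied (hypothetical) callable that fits the
    GLM with the given terms and returns its maximised log-likelihood."""
    best = None
    for size in range(len(terms) + 1):
        for subset in itertools.combinations(terms, size):
            k = len(subset) + 1  # +1 for the intercept
            score = bic(fit_loglik(subset), k, n_obs)
            if best is None or score < best[0]:
                best = (score, subset)
    return best

# Toy stand-in for a real fit: only "x" and "t" genuinely improve the likelihood.
def toy_loglik(subset):
    gain = {"x": 50.0, "t": 20.0}
    return -500.0 + sum(gain.get(term, 0.5) for term in subset)

best_bic, best_terms = select_by_bic(toy_loglik, n_obs=200)
```

In the toy example, terms that add only 0.5 to the log-likelihood do not repay the ln N penalty, so the search keeps just the age and time terms.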
3. Data
3.1. CLHLS survey
We use longitudinal data from the CLHLS, which provides information on the health status and quality of life of the elderly in 22 provinces of China over the period 1998–2011. The survey contains detailed information on health, socioeconomic characteristics, family, lifestyle and other demographic variables. It has been conducted by the Center for Healthy Aging and Family Studies at the National School of Development at Peking University.
The baseline survey was carried out in 1998 in a randomly selected half of the counties and cities in 22 provinces of China. The survey areas contained about 85% of China’s total population in 1998. The data was collected through face-to-face home-based interviews and basic physical capacity tests. The survey team tried to interview all centenarians who agreed to participate in the study in the sample counties and cities (see footnote 5). For each centenarian interviewee, one octogenarian (aged 80–89) living nearby, one nearby nonagenarian (aged 90–99) and one nearby younger elder aged 65–79 of predesignated age and sex were also interviewed. Follow-up surveys with replacement for deceased elders were conducted in 2000, 2002, 2005, 2008 and 2011 (see footnote 6). In 2002, a sub-sample of adult children of survey participants was included in the survey. More details about the survey design can be found, for example, in Yi et al. (2001) and Zeng (2012).
The sample size of CLHLS is sufficiently large even at higher ages and allows us to estimate models using 1-year age groups for the age range 65–105. We consider males and females separately and distinguish between urban and rural residency, which is important in the context of China. We use residency status as reported in CLHLS: urban (city and town) and rural. About 5% of the sample lives in a nursing home, which is consistent with the low number of nursing homes reported for China (see, e.g. Lu et al., 2017).
In this study, information on ADL limitations is used as a measure of health status. Six ADL items were consistently evaluated in all waves of CLHLS: bathing, dressing, eating, using the toilet, continence, and transferring in and out of bed. Individuals reported their ability to perform these activities in three categories (1=do not need help, 2=need partial assistance, 3=need full assistance). We classify an individual as able to perform an ADL only if she/he does not need help. We define individuals as functionally disabled if they have difficulty with two or more (i.e. 2+) ADLs, which is consistent with the main analysis presented in Fong et al. (2015) based on data from the US Health and Retirement Study. In addition, this disability definition is in line with the trigger of benefit payments for many existing long-term care insurance policies in the US market.
3.2. Descriptive statistics
We analyse health transitions between waves of data collection (see footnote 7). To fully utilise the available information, we use an unbalanced panel design which includes all individuals with at least two consecutive observations. Every individual can have up to five health transitions between the six CLHLS waves 1998, 2000, 2002, 2005, 2008 and 2011. The numbers of transition counts are given in Table 1. We observe in total 27,659 health transitions, of which 16% are disability transitions, 59% are deaths of healthy individuals and 26% are deaths of disabled individuals.
Table 1 Number of transition counts.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_tab1.gif?pub-status=live)
To calculate the central exposure to risk of the sample population in both healthy and functionally disabled states, we use the exact interview, birth and death date from the survey or the 15th of the reported month in case the exact day was missing. We assume that disability happened at the mid-point between survey waves. Table 2 gives the number of exposure years. The total number of exposure years is 128,206. The sample split is 42%:58% between urban and rural areas; and 43%:57% between males and females, which allows us to estimate separate models for these four populations.
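The exposure rule can be illustrated with a small helper that splits the time between two interviews into healthy and disabled exposure years, placing a transition into disability at the mid-point. This is a simplified sketch: the function name, the 365.25-day year and the treatment of cases not covered by the three modelled transitions are assumptions.

```python
from datetime import date

DAYS_PER_YEAR = 365.25

def exposure_between_waves(start, end, state_start, state_end, death=None):
    """Return (healthy_years, disabled_years) contributed between two interviews.

    state_start/state_end are "N" (non-disabled) or "F" (functionally disabled).
    If the person died before the next wave, pass the death date and
    state_end=None. A transition N -> F is assumed to occur at the mid-point.
    Recoveries (F -> N) are not part of the model and are not handled here."""
    stop = death if death is not None else end
    years = (stop - start).days / DAYS_PER_YEAR
    if state_start == "F":
        return 0.0, years
    if state_end == "F":
        return years / 2.0, years / 2.0
    return years, 0.0

# Non-disabled in mid-2002, functionally disabled at the 2005 interview.
healthy_yrs, disabled_yrs = exposure_between_waves(
    date(2002, 6, 15), date(2005, 6, 15), "N", "F")
```

Summing these contributions over individuals within each age and time cell would give the central exposure used as the denominator of the crude rates.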
Table 2 Number of exposure years.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_tab2.gif?pub-status=live)
We calculate crude transition intensities as the number of health transitions divided by the corresponding central exposure to risk for a given age and time. Figures 2–4 show the crude transition intensities on a log scale. Blank areas in the graphs indicate missing data for younger age groups in the first waves 1998 and 2000 of CLHLS. Darker colours indicate lower rates. There are age patterns in most of the graphs and some also show time trends. In particular, the mortality rates ν from the functionally disabled state “F” decrease over time (see Figure 4). The model estimates presented in the following section will show which age and time factors are statistically significant drivers of the different health transitions.
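A minimal sketch of the crude-rate calculation follows; the cell values are invented, and returning `None` for cells with no exposure or no transitions is an assumption about how blank areas might be represented.

```python
import math

def crude_log_intensity(transitions, exposure_years):
    """log(transitions / exposure); None where the rate is undefined
    (no exposure) or its log is undefined (zero transitions)."""
    if exposure_years <= 0 or transitions <= 0:
        return None
    return math.log(transitions / exposure_years)

# Invented cells keyed by (age, wave-interval mid-point t).
cells = {(80, 1.0): (6, 120.0), (80, 3.0): (4, 110.0), (66, 1.0): (0, 0.0)}
crude = {key: crude_log_intensity(n, e) for key, (n, e) in cells.items()}
```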
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_fig2g.jpeg?pub-status=live)
Figure 2 Crude log disability intensities (σ: N→F). (a) Males, urban, (b) males, rural, (c) females, urban, (d) females, rural.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_fig3g.jpeg?pub-status=live)
Figure 3 Crude log mortality rates for healthy individuals (μ: N→D). (a) Males, urban, (b) males, rural, (c) females, urban, (d) females, rural.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_fig4g.jpeg?pub-status=live)
Figure 4 Crude log mortality rates for disabled individuals (ν: F→D). (a) Males, urban, (b) males, rural, (c) females, urban, (d) females, rural.
For the model estimation, we define the year 1998 as t=0 and set the data points in the model to t=(1, 3, 5.5, 8.5, 11.5) to reflect the fact that the transition intensities refer to the middle of the time intervals between survey waves and to account for the different interval lengths between survey waves. We define the age variable as x=age−65, with a range of [0,40]. These definitions ensure that both covariates have similar magnitudes.
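The covariate definitions can be reproduced directly from the wave years. The helper names below are illustrative.

```python
WAVES = [1998, 2000, 2002, 2005, 2008, 2011]
BASE_YEAR = 1998
BASE_AGE = 65

def interval_midpoints(waves, base_year=BASE_YEAR):
    """Mid-point of each between-wave interval, in years since base_year."""
    return [(a + b) / 2 - base_year for a, b in zip(waves, waves[1:])]

def age_covariate(age, base_age=BASE_AGE):
    """Rescaled age x = age - 65, so ages 65-105 map to 0-40."""
    return age - base_age

t_values = interval_midpoints(WAVES)  # [1.0, 3.0, 5.5, 8.5, 11.5]
```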
4. Empirical Results
4.1. Selected models: estimation results
We estimated the GLM described in section 2.2 separately for the three transition intensities σ, μ and ν for each sub-population in our sample. Table 3 gives the estimation results of the selected linear predictor for each case based on a comparison of all possible model variants as described in section 2.3. The selected models all have highly significant parameters and the results are interpretable.
Table 3 Selected model: parameter estimates.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_tab3.gif?pub-status=live)
Note: The functional form of the linear predictor is $\eta_{x,t} = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 t + \beta_4 tx + \beta_5 tx^2$.
BIC=Bayesian information criterion.
*p<0.05; **p<0.01; ***p<0.001.
The selected models for the disability rate $\sigma_{x,t}$ in each subpopulation all include a positive linear and a negative quadratic age term, implying that disability rates increase with age, but at a decreasing rate. Moreover, the negative age–time interaction effects in all subpopulations except urban males show that there has been an overall improvement in disability rates over time. They also show that the rate of improvement in disability rates over time differs across age groups.
For all subpopulations, the mortality rates for healthy individuals $\mu_{x,t}$ increase with age but again there is a deceleration for higher age groups. We note that there are no significant time effects or age–time interaction effects in any of the four sub-populations. This agrees with the fact that the plots of $\mu_{x,t}$ show fairly stable patterns throughout the sample period (see Figure 3).
The models for the mortality rates of the disabled $\nu_{x,t}$ all include positive linear age effects and negative time/age–time effects. For urban and rural males and urban females, a linear negative mortality trend is found for all age groups. The speed of mortality decline over time is similar in these three sub-populations. The model for rural females includes a negative quadratic age–time effect, indicating that mortality decline for this subpopulation is more rapid for older age groups.
Overall, these results show that both age and time effects are important factors explaining patterns in health transitions at higher ages in China. In addition, several of the models rely on age–time interactions which capture time trends that differ by age. Our results also suggest that the improvements in mortality rates of older Chinese aged 65+ are largely driven by the decline of mortality rates of functionally disabled elderly, rather than by the mortality rates of healthy individuals. China has experienced rapid economic growth since the beginning of its market reforms in 1978. Over our sample period 1998–2012, the average annual disposable income increased 3.5 times for urban households and 2.6 times for rural households (National Bureau of Statistics of China, 2013). This economic development allowed households to afford higher living standards and better medical care. Furthermore, several policy reforms have improved people’s access to health services and increased health insurance coverage in China (Meng et al., 2012). As a result, China has seen major changes in the causes of death (Zhou et al., 2016) and in the survival and disability of the elderly (Zeng et al., 2017). We argue that these factors can explain our results.
Figures A.1–A.3 in the Appendix show the residuals for the 12 selected models, computed as the difference between the crude and estimated transition rates. The errors fluctuate around 0 and show no systematic patterns. We conclude that the selected models effectively capture age and time patterns in the data.
4.2. Stepwise model selection
The previous section discussed the selected model specifications for each transition intensity and each subpopulation that were identified by comparing all possible model variants. It is interesting to compare these results with those from a stepwise model selection process where additional terms are added in each step. Table 4 gives the BIC values for six nested model variants and compares these with the BIC values for the selected models identified in Table 3.
Table 4 Stepwise model selection: goodness-of-fit of nested models.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_tab4.gif?pub-status=live)
Note 1: The table gives the Bayesian information criterion (BIC) for several nested model variants.
Note 2: Bold font indicates minimum BIC values. Selected model refers to the model identified in Table 3.
We note that in six of the 12 cases the stepwise model selection identifies models with higher BIC values (representing a worse model fit) than the models selected in section 4.1. This is the case for all four models for the mortality rate from the functionally disabled state $\nu_{x,t}$, and for two of the models for the disability rate $\sigma_{x,t}$. The limitations of stepwise model selection algorithms are widely recognised in statistics (Hurvich & Tsai, 1990; Grafen et al., 2002; Whittingham et al., 2006). One of the weaknesses of this method is that model selection is very sensitive to factors such as the order of parameter entry and whether forward selection or backward elimination is used (Derksen & Keselman, 1992). Therefore, to avoid these limitations, in this paper we have considered all possible model designs for the three health transition models. The preferred models turn out to be parsimonious and fit the data well.
Nevertheless, the detailed analysis of the stepwise model comparison confirms that including time effects and age–time interaction terms improves the model fit for most health transition models except for the mortality rates $\mu_{x,t}$ of the non-disabled.
4.3. Likelihood ratio tests (LRTs)
In order to further verify the significance of the time effects in the three transition intensities, we conduct the LRT on the selected models given in Table 3. For cases where time effects are included in the selected models, we drop these effects to construct the null model and then perform the LRT. For cases where the selected model does not contain time effects, we construct an alternative model with all the selected age terms and at least one time effect. We select the alternative model using the BIC. The test statistic based on the deviance D is defined as:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_eqnU10.gif?pub-status=live)
with
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_eqnU11.gif?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_eqnU12.gif?pub-status=live)
where $L_{\text{null}}$ represents the likelihood of the null (simpler) model and $L_{\text{alternative}}$ the likelihood of the alternative (larger) model. The saturated model has the same number of parameters as the sample size and $L_{\text{saturated}}$ is the corresponding likelihood.
Under the null hypothesis that the simpler model is preferred, ΔD follows a $\chi^2$ distribution with degrees of freedom equal to the difference in the number of parameters between the null model and the alternative model. We conduct the LRT at the 5% level of significance. The test results given in Table 5 show that for all selected models which include time effects, adding these time effects significantly improves the model fit. The results also show that adding time effects does not significantly improve the fit for the selected models without time effects. The only exception is the model for the mortality rate μ of healthy urban females, where the LRT indicates that including time effects would improve the model fit. We note this result but decide to keep the more parsimonious model that was selected in Table 3 based on the BIC. Apart from this exception, the results of the LRTs confirm our model choices based on the BIC.
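The test can be sketched in a few lines. Because the saturated-model likelihood cancels in the difference of deviances, the statistic reduces to twice the difference in maximised log-likelihoods. The chi-square tail probability is computed here with the closed-form series for integer degrees of freedom, so that only the standard library is needed. The log-likelihood values in the example are invented.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1,
    using the closed-form finite series for integer degrees of freedom."""
    if not isinstance(df, int) or df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        terms = (half**i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * sum(terms)
    series = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                 for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series

def likelihood_ratio_test(loglik_null, loglik_alt, extra_params, alpha=0.05):
    """Return (delta_D, p_value, reject_null) for nested models."""
    delta_d = 2.0 * (loglik_alt - loglik_null)
    p_value = chi2_sf(delta_d, extra_params)
    return delta_d, p_value, p_value < alpha

# Invented log-likelihoods: the alternative model adds one time-effect parameter.
delta_d, p_value, reject = likelihood_ratio_test(-1205.4, -1200.1, extra_params=1)
```

In this made-up example the statistic exceeds the 5% critical value for one degree of freedom (about 3.84), so the null model without the time effect would be rejected.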
Table 5 Results of the likelihood ratio test.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_tab5.gif?pub-status=live)
Note: D denotes the deviance and ΔD is the test statistic defined in section 4.3.
4.4. Life expectancy and healthy life expectancy
We use the models identified in section 4.1 to compute estimates for life expectancy and healthy life expectancy. Table 6 shows the estimated life expectancies at age 65 and 75 conditional on the initial health status, where “healthy” is defined as having at most one ADL limitation (see section 3.1). We provide estimates for the first and last time point of the investigation period (1998 and 2011) and out-of-sample forecasts for 2020. A death time dispersion measure for each subgroup, which is calculated as the standard deviation of the death time distribution, is also shown in Table 6. This dispersion measure captures idiosyncratic mortality risk. As the death time distribution for older age groups can be very different from a normal distribution, this standard deviation-based measure is only an approximate indicator of the dispersion of death times around life expectancy.
Table 6 Life expectancy and death time dispersion conditional on health status at age 65 and 75.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_tab6.gif?pub-status=live)
The estimated life expectancies vary in plausible ways: life expectancies of urban residents are higher than those of rural residents, females have higher life expectancies than males and healthy individuals have higher life expectancies than disabled ones. Time trends are included in three of the four models for the disability rate and in all models for the disabled mortality rates. These trends are reflected in the life expectancies, which increase over time for all population subgroups and show the largest improvements for disabled individuals. The computed life expectancies in Table 6 are consistent with the results of several related studies. For example, Luo et al. (Reference Luo, Wong, Lum, Luo, Gong and Kendig2016) report for the age group 65–69 in 2011 a remaining life expectancy of 15.0 years for males and 18.7 years for females (Luo et al., Reference Luo, Wong, Lum, Luo, Gong and Kendig2016, Table 2). From Table 6, we can also see that death time dispersions for the disabled subgroups increase over time, a trend which is potentially driven by the mortality improvement of the disabled individuals. In contrast, the death time dispersions of the healthy subgroups are generally stable over time.
Table 7 gives the estimated healthy life expectancies at ages 65 and 75 (see footnote 8). Female urban residents have the highest healthy life expectancy and male rural residents have the lowest healthy life expectancy. Healthy life expectancies of females improve faster over time than those of males. We find that the ratios of healthy life expectancy to life expectancy are quite stable over the period 1998–2020, indicating a dynamic equilibrium where both life expectancy and healthy life expectancy shift to the right, a finding which agrees with the results of several related studies on China (see, e.g. Liu et al., Reference Liu, Chen, Song, Chi and Zheng2009; Guo, Reference Guo2017).
Table 7 Healthy life expectancy at age 65 and 75.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_tab7.gif?pub-status=live)
Overall, our results show persistent health differences between urban and rural China. For life expectancy, we find that the existing urban–rural gaps increase over time for healthy males and for healthy and disabled females. For disabled males, the gap seems to be slowly decreasing (see Table 6). For healthy life expectancy, our results suggest convergence between urban and rural males, but divergence for females.
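To make the quantities in Tables 6 and 7 concrete, the following sketch shows one way to obtain life expectancy, healthy life expectancy and the death time standard deviation from a set of transition intensities σ, μ and ν. It is an illustration under stated assumptions rather than the procedure used to produce the tables: the intensities are hypothetical Gompertz-type functions of age only (no time trend), recovery from disability is assumed not to occur, survivors at the maximum age of 105 are treated as dying at that age, and the occupancy probabilities are advanced with a simple Euler step.

```python
import math


def expectancies(sigma, mu, nu, age0, start_healthy=True, max_age=105.0, h=0.01):
    """Return (life expectancy, healthy life expectancy, std of death time).

    sigma, mu, nu: functions of age giving the intensities healthy->disabled,
    healthy->dead and disabled->dead. Assumes no recovery from disability.
    """
    p_healthy, p_disabled = (1.0, 0.0) if start_healthy else (0.0, 1.0)
    le = hle = second_moment = 0.0
    steps = int(round((max_age - age0) / h))
    for k in range(steps):
        t = k * h
        x = age0 + t
        # Euler step for the Kolmogorov forward equations of the three-state model.
        new_healthy = p_healthy * (1.0 - (sigma(x) + mu(x)) * h)
        new_disabled = p_disabled + (sigma(x) * p_healthy - nu(x) * p_disabled) * h
        s0 = p_healthy + p_disabled      # survival probability at time t
        s1 = new_healthy + new_disabled  # survival probability at time t + h
        # Trapezoidal rule: E[T] is the integral of S(t), E[T^2] is 2 * integral of t * S(t).
        le += 0.5 * (s0 + s1) * h
        hle += 0.5 * (p_healthy + new_healthy) * h
        second_moment += (t * s0 + (t + h) * s1) * h
        p_healthy, p_disabled = new_healthy, new_disabled
    std = math.sqrt(max(second_moment - le ** 2, 0.0))
    return le, hle, std


def gompertz(a, b):
    """Hypothetical log-linear intensity exp(a + b * age)."""
    return lambda x: math.exp(a + b * x)


# Illustrative parameter values only; they are not estimates from this study.
sigma = gompertz(-9.0, 0.08)
mu = gompertz(-10.0, 0.09)
nu = gompertz(-6.0, 0.06)

for healthy in (True, False):
    le, hle, sd = expectancies(sigma, mu, nu, age0=65.0, start_healthy=healthy)
    label = "healthy" if healthy else "disabled"
    print(f"start {label}: LE = {le:.2f}, HLE = {hle:.2f}, HLE/LE = {hle / le:.2f}, SD = {sd:.2f}")
```

Under the no-recovery assumption the healthy life expectancy of an initially disabled individual is zero by construction. Replacing the age-only intensities with functions that also depend on calendar time would allow the comparison across years discussed above.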
5. Conclusions
In this article, we develop a new flexible approach to modelling health transitions at higher ages based on the GLM framework. Our model extends existing modelling approaches by allowing for time trends and age–time interactions in the linear predictor in addition to the commonly used age effects. We apply the model to health transitions of older Chinese aged 65–105 and consider males and females in urban and rural areas separately.
We identify important factors explaining the health transition intensities σ, μ and ν in each subpopulation using the BIC model selection algorithm. Different functional forms are selected for the different health transitions in each subpopulation. The selected models all include age effects, which were also used in previous studies such as Renshaw & Haberman (Reference Renshaw and Haberman1995) and Fong et al. (Reference Fong, Shao and Sherris2015). The models for the disability rates and the disabled mortality rates also include time trends and age–time interactions, which confirms that these factors should be considered when modelling health transitions at higher ages.
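The idea behind BIC-based selection can be sketched as follows, using the standard definition BIC = −2 log L + k ln(n), where k is the number of parameters and n the number of observations. The candidate labels, log-likelihoods and sample size below are hypothetical values chosen for illustration; they are not results from this study, and the full selection algorithm may search over more functional forms than shown here.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 * log L + k * ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_by_bic(candidates, n_obs):
    """candidates maps a model label to (log_likelihood, n_params).

    Returns the label with the smallest BIC and the dict of all scores.
    """
    scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits of three nested linear predictors for one transition.
candidates = {
    "age": (-512.3, 2),
    "age + time": (-505.9, 3),
    "age + time + age:time": (-505.1, 4),
}
best, scores = select_by_bic(candidates, n_obs=400)
for name, score in scores.items():
    print(f"{name}: BIC = {score:.2f}")
print("selected:", best)
```

In this illustration the interaction term raises the log-likelihood only slightly, so the penalty k ln(n) outweighs the gain and the model with age and time effects is retained.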
Using the selected models for each group, we compute estimates of life expectancies and healthy life expectancies. The results are largely consistent with the results of previous studies on health expectancies in China (Liu et al., Reference Liu, Chen, Song, Chi and Zheng2009; Luo et al., Reference Luo, Wong, Lum, Luo, Gong and Kendig2016; Guo, Reference Guo2017). We also confirm that health differences persist between urban and rural China, which agrees with recent findings by Wang & Yu (Reference Wang and Yu2016). In addition, our study adds new findings on the life expectancy and healthy life expectancy for urban and rural populations over a longer time horizon, conditional on initial health status.
We developed this model as an input for further research on population ageing and retirement financial planning in China. Our model can be used, for example, to estimate the demand for long-term care insurance based on the disability rate and life expectancy of disabled individuals produced by the model. The outputs of the model can also be used to assist the design and pricing of new retirement financial products for the Chinese market, including reverse mortgages and other home equity release products (see, e.g. Alai et al., Reference Alai, Chen, Cho, Hanewald and Sherris2014; Shao et al., Reference Shao, Hanewald and Sherris2015). Moreover, in this paper we have used an ADL limitation-based definition of disability. The approach developed here can be easily adjusted to capture other dimensions of health such as chronic diseases or cognitive impairment.
We adopted the GLM framework in this study due to the limited amount and length of data available for the estimation. Once more data become available for China, the model can be extended to consider, for example, stochastic volatilities in the time effects as well as smoothness in the age dimension. Potentially, our model could also be modified to have a structure similar to that of the Lee–Carter model.
Acknowledgement
The authors acknowledge the financial support of the Australian Research Council Centre of Excellence in Population Ageing Research (CEPAR). The authors thank Professor Colin O’Hare, Dr Anastasios Panagiotelis, Dr Andrés Villegas and Dr Hong Li for their valuable comments and suggestions.
Appendix
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_fig5g.jpeg?pub-status=live)
Figure A.1 Estimated errors for the disability rates (σ: N→F). (a) Males, urban, (b) males, rural, (c) females, urban, (d) females, rural. The model estimates are in Table 3.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_fig6g.jpeg?pub-status=live)
Figure A.2 Estimated errors for the mortality rates of healthy individuals (μ: N→D). (a) Males, urban, (b) males, rural, (c) females, urban, (d) females, rural. The model estimates are in Table 3.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190124120144938-0714:S1748499518000167:S1748499518000167_fig7g.jpeg?pub-status=live)
Figure A.3 Estimated errors for the mortality rates of disabled individuals (ν: F→D). (a) Males, urban, (b) males, rural, (c) females, urban, (d) females, rural. The model estimates are in Table 3.