1. Introduction
China’s population ageing is exceptional in many ways. China is home to the largest elderly population in the world: 158 million people aged 65+ lived in China in 2018, representing 23% of the world’s elderly population (United Nations, 2017). China is also ageing much faster than most other countries: the old-age dependency ratio in China is expected to double from 16% in 2018 to 32% in 2035, an increase that took most high-income countries several decades. China’s median population age is expected to exceed that of high-income countries from 2035 onwards (United Nations, 2017). At the same time, China is still a middle-income country: while life expectancy in China is only about 2.4 years lower than in the United States (76.3 versus 78.7 years), GDP per capita in China is only 15% of that in the United States (World Bank, 2019).
The Chinese government is addressing the consequences of population ageing with major social insurance reforms and by promoting the private insurance sector. New public pension programmes for rural and urban residents have been introduced, and existing pension programmes have been reformed, changing both contribution and payout structures (Fang & Feng, 2018). There are plans to raise the pension age for urban employees from age 60 for men and age 50–55 for women to age 65 for both men and women (footnote 1). The government also promotes the development of occupational pensions and has introduced tax benefits for contributions to “enterprise annuities” (for private sector employees) and “occupational annuities” (for public sector employees) (Fang & Feng, 2018). At the same time, the Chinese government aims to set up a multi-layered health insurance system centred on (near-universal) basic health insurance and supplemented by commercial health insurance (Li & Fu, 2017). The government has also authorised several local programmes testing different forms of long-term care financing, including social insurance and government subsidies (e.g. Lu et al., 2017; Yang et al., 2016). Furthermore, a pilot programme to introduce reverse mortgages, which started in mid-2014 in four cities, was extended nationwide in mid-2018 (e.g. Hanewald et al., 2020).
Our study aims to provide a new evidence base for the development of public and private insurance in China. To do so, we develop a new approach for estimating regional healthy life expectancy and apply the model to Chinese provinces. Healthy life expectancy (HLE) is an important population health indicator (Sanders, 1964), which is increasingly used in the actuarial literature (Hammond et al., 2016; Hanewald et al., 2019; Li et al., 2017). HLE measures the average number of years that an individual will spend in good health. By taking both mortality and morbidity into account, HLE captures the quality as well as the quantity of life. Depending on the definition of health, HLE can quantify how long individuals can participate in the workforce, the age at which they will need increased medical services or the onset of long-term care needs. HLE has been used in the “Outline of the Healthy China 2030 Plan”, a national medium- and long-term strategic plan for China’s health sector issued by the Chinese Central Government and State Council on 25 October 2016.
Most earlier studies rely on the Sullivan method to compute HLE, which requires detailed information on age-specific morbidity that is often not publicly available at the province level. In this paper, we develop a new approach to estimate regional HLE at birth and apply the model to China, where longevity and health have substantially improved in recent decades, but health inequalities across provinces are still large (Hanewald et al., 2019; Huang, 2017). We propose a multiple regression model for HLE that does not rely on age-specific morbidity data, but rather on longevity and socio-economic variables that are widely available. This new approach can be used to construct and estimate regional-level HLE for countries where detailed mortality and morbidity information is not readily available at the regional level.
We estimate the proposed model using data from 139 countries in the years 1990, 2005 and 2013 from a number of international databases including the Global Burden of Disease Study, the World Bank and the Organization for Economic Co-operation and Development (OECD). Due to the large variation in health and economic development across Chinese provinces, we treat each province-level region in China as a “country”. This allows us to use the estimated model to predict province-level HLEs for China. We evaluate the out-of-sample predictive performance of the model by comparing the model-predicted HLE with published HLE figures. These results give us confidence in applying our model to province-level regions in China.
Using the estimated model, we calculate HLE for 31 province-level administrative units in China in 2015, which gives the most up-to-date estimates of regional HLE for China. The results show that HLE varies by about 10 years across Chinese provinces for both males and females. The estimated male HLE in Beijing is 70.5 years, which is similar to that in developed European countries such as Sweden. In contrast, Yunnan’s estimated male HLE of only 60.4 years is comparable to that in African developing countries such as Egypt. Our research provides a new approach to estimating regional-level HLE at birth for countries where the Sullivan method cannot be applied due to incomplete regional morbidity data.
The rest of the paper is organised as follows. In the next section, we further discuss the background and motivation for our work. We introduce the multiple regression model in section 3. Section 4 describes the data used in this study. The estimation results, validation tests and province-level HLE projections are shown in section 5. Section 6 concludes the paper and discusses the implications of our research for the design of public policies and the development of insurance and banking products in China.
2. Background and Motivation
Depending on how being “healthy” is defined, HLE can be measured in different ways. Examples in the literature include life expectancy in good perceived health (Smith et al., 2008; Brønnum-Hansen, 2005), disease-free life expectancy (e.g. Dubois & Hébert, 2006; Lièvre et al., 2008), disability-free life expectancy (Crimmins et al., 1997; Imai & Soneji, 2007) and active life expectancy (e.g. Kaneda et al., 2005; Manton et al., 2006). In this study, we adopt health-adjusted life years (HALE), which is a widely used HLE measure published by the World Health Organization. HALE measures the expected number of years lived in full health, taking into account severity-weighted disability prevalence estimated in the Global Burden of Disease Study (Murray & Lopez, 1997).
HALE in the Global Burden of Disease Study was estimated using the Sullivan method (Murray et al., 2015). The Sullivan method is the most common method used to estimate HLE. It requires information on age-specific prevalence rates of ill-health and age-specific mortality data (Jagger et al., 2014). The first step of the Sullivan method is to use age-specific population and death counts to calculate the person-years lived in each age group. Alternatively, the person-years lived can be obtained directly from published life tables (Jagger et al., 2014). The person-years lived are then multiplied by one minus the age-specific prevalence of ill-health to compute the person-years lived without illness. Next, the person-years lived without illness are cumulated to compute the total years lived without illness. Based on that, HLE is calculated in a final step (Jagger et al., 2014).
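The steps of the Sullivan method can be sketched in a few lines. This is a minimal illustration rather than the procedure used in the Global Burden of Disease Study: the function name `sullivan_hle`, the three-group life table and the prevalence values are all made up for the example, and the final step is taken to be the standard division of total healthy person-years by the number of life-table survivors at the starting age.

```python
def sullivan_hle(person_years, prevalence, survivors_at_start):
    """Healthy life expectancy at the start of the first age group.

    person_years: person-years lived in each age group (from a life table)
    prevalence: share of each age group in ill-health, between 0 and 1
    survivors_at_start: life-table survivors at the starting age
    """
    if len(person_years) != len(prevalence):
        raise ValueError("one prevalence value is needed per age group")
    # Person-years lived without illness in each age group.
    healthy_years = [L * (1.0 - p) for L, p in zip(person_years, prevalence)]
    # Cumulate over all age groups, then divide by the survivors.
    return sum(healthy_years) / survivors_at_start


# Hypothetical three-group life table with 100,000 births (illustrative only).
L_x = [4_900_000.0, 2_000_000.0, 600_000.0]
ill = [0.05, 0.20, 0.50]
hle = sullivan_hle(L_x, ill, 100_000.0)
le = sum(L_x) / 100_000.0  # ordinary life expectancy from the same table
```

The sketch makes the data requirement explicit: without an age-specific prevalence value for every age group, the calculation cannot be carried out.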
The biggest challenge in conducting a province-by-province HLE study for China is the lack of publicly available, high-quality age-specific morbidity and mortality data. There are several household-level longitudinal surveys collecting morbidity information in different regions, such as the China Health and Nutrition Survey and the Chinese Longitudinal Healthy Longevity Survey. However, the sample sizes of these surveys are not large enough for province-level analysis. The China Disabled Persons’ Federation has carried out two National Sample Surveys on Disability, in 1987 and 2006. Using face-to-face interviews, these surveys collected detailed information on disabled individuals such as age, gender, residence, education and employment. Liu et al. (2010) used the second National Sample Survey on Disability to compute the disability-free life expectancy at age 60 for 31 province-level administrative units in 2006 with the Sullivan method.
To date, the study by Liu et al. (2010) remains the main reference for regional differences in HLE in China. Since province-level information on ill-health rates is sparse, regional variations in HLE across China are still under-researched. As China is rapidly ageing, more research in this area is needed to inform policymakers, health care providers and insurance companies in order to identify the differences in the demand for health care, aged care and financial services across China.
Liu et al. (2010) found a large degree of variation in HLE across regions, with estimates ranging from 11.2 to 20.8 years, which reflects the patterns in regional economic development. It is well known that China’s provinces are at very different stages of development (e.g. Evandrou et al., 2014; Zhang & Li, 2015). This motivates our approach of modelling the HLE of a province based on other countries with similar social, economic and demographic conditions. A similar idea has been applied in The Economist (2015), where province-level life expectancies (LE) in China were compared with those of other countries. The comparison showed that in 2013 LE in Shanghai was as high as in Switzerland, while LE in Xinjiang was roughly the same as in Algeria.
Therefore, we argue that by treating each province as a separate “country”, we can learn from other countries’ experience and predict province-level HLE in China. By sampling mortality and morbidity experience from a wide range of countries, we “borrow” information to overcome the problem of not having sufficient data for province-level analyses for China. In the next section, we develop a multiple regression model that utilises the predictive power of factors that drive changes in HLE. We estimate the model using data from a wide range of countries.
3. A Predictive Regression Model for HLE
Regression modelling serves two purposes: (i) to explain the relationship between dependent and independent variables or (ii) to make predictions of the dependent variable based on a set of independent variables (e.g. Mac Nally, 2000; Shmueli, 2010). Our primary interest is to predict HLE using information on LE and observable socio-economic variables; hence, we develop a predictive regression model.
3.1 Model specification
In the last two decades, several studies have analysed which socio-economic variables have a strong influence on LE and HLE (e.g. Banister & Hill, 2004; Liu et al., 2010; Murthy & Okunade, 2014; Babiarz et al., 2015). Socio-economic variables such as GDP per capita, government expenditure on health care and the number of physicians have been frequently used to explain the variations in HLE (see, e.g. Mondal & Shitan, 2014; Sede & Ohemeng, 2015). Banister & Hill (2004) found that the determinants of LE at birth in China during the period 1981–1995 included per capita consumption, the illiteracy rate, the number of doctors and the share of education and health care expenditures. Liu et al. (2010) explored the impact of additional variables such as modern household utilities and health care infrastructure on province-level HLE. However, as pointed out by Jagger (2015), projections of HLE purely based on socio-economic factors are still rare in practice, and the failure to include certain important factors could lead to a misunderstanding of future HLE trends.
To overcome this problem and better predict HLE, we also include LE in our model, utilising the strong positive correlation between LE and HLE which has been found in many studies (e.g. Law & Yip, 2003; Liu et al., 2010). Intuitively, this strong correlation is easy to understand, as total LE is the sum of HLE and the healthy life years “lost” due to disability. Moreover, socio-economic factors that affect LE will also have an impact on HLE (Jagger, 2015). Therefore, rather than performing a regression on HLE itself, we model the ratio of HLE to LE instead.
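Let $y_{it} = \operatorname{logit}(\mathrm{HLE}_{it}/\mathrm{LE}_{it})$, where the logit function is defined as $\operatorname{logit}(x) = \ln\left(\frac{x}{1-x}\right)$ for $0 < x < 1$.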
The predictive regression model is introduced as follows:
where $i \in [1, N]$ represents the country/region and $t \in [1, T]$ represents time. $\beta_{j,k}$ is the coefficient of the $j$th variable of order $k$, where $K$ is the highest polynomial order to be considered and $J$ represents the number of explanatory variables in the model. Note that the model selection process described in the following section can set some of these coefficients to zero. $\epsilon_{it}$ denotes the error term, which is assumed to be normally distributed. We model the logit transformation of the ratio of HLE to LE to ensure that the value of HLE will not exceed the value of LE.
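To see why the logit transformation keeps HLE below LE, it helps to write out the back-transformation from a predicted value of the response to HLE in years. The sketch below is illustrative only; the function names `logit` and `hle_from_logit` and the numbers in the example are assumptions made here, not part of the model description.

```python
import math


def logit(p):
    """Logit of a ratio strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def hle_from_logit(y, le):
    """Map a predicted logit(HLE/LE) back to HLE in years."""
    # The inverse logit lies between 0 and 1, so HLE cannot exceed LE.
    ratio = 1.0 / (1.0 + math.exp(-y))
    return le * ratio


# Hypothetical example: an HLE/LE ratio of 0.9 and an LE of 80 years.
y_example = logit(0.9)
hle_example = hle_from_logit(y_example, 80.0)
```

Whatever value the linear predictor takes, the implied ratio stays between 0 and 1, which a regression on HLE in levels would not guarantee.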
The selection of socio-economic variables in our model is based on the studies mentioned earlier in this section. In addition, we require the selected socio-economic variables to be both available and consistently defined for all countries included in the estimation as well as the provinces in China. Based on these criteria, we have selected the following five socio-economic variables:
GDP: gross domestic product per capita (in 1,000 USD)
Health exp: public health expenditure as a percentage of GDP
Education exp: public education expenditure as a percentage of GDP
Hospital beds: number of hospital beds per 1,000 people
Physicians: number of physicians per 1,000 people
In addition, we introduce an East-Asia binary indicator variable $D_{\text{East Asia}}$ into the model to capture any distinctive characteristics of East-Asian countries that would affect the estimation of HLE. Several studies have found that there is a certain degree of advantage in Asian mortality experience compared to other ethnic groups (e.g. Elo & Preston, 1997; Acciai et al., 2015). Also, over the last few decades, the most rapid improvements in LE at birth have occurred in East-Asia (National Institute on Ageing, 2011). In our study, apart from China, other countries included in the East-Asia region are Cambodia, Indonesia, Japan, South Korea, Laos, Mongolia, Philippines, Thailand and Singapore. We have included these countries because all of them have strong cultural, historical or linguistic ties with China and in some cases also have significant Chinese minorities.
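Collecting the definitions of this section, the model referred to as equation (2) can be written out as in the sketch below. The intercept $\beta_0$, the symbol $x_{j,it}$ for the $j$th socio-economic variable and the label $\gamma$ for the coefficient of the East-Asia indicator are notational assumptions made here, as is the additive placement of the indicator. With the five socio-economic variables listed above, $J = 5$.

```latex
y_{it} = \beta_0
       + \sum_{j=1}^{J} \sum_{k=1}^{K} \beta_{j,k}\, x_{j,it}^{k}
       + \gamma\, D_{\text{East Asia},\,i}
       + \epsilon_{it}
```

For $K = 1$ the double sum reduces to a linear model in the five variables; for $K = 2$ each variable may additionally enter as a squared term.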
3.2 Estimation and model selection
The $\beta_{j,k}$ coefficients in the model are estimated by the ordinary least squares (OLS) method. We obtain estimates of these coefficients by minimising the following sum of squared residuals (footnote 2):
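In the notation of section 3.1, this objective can be sketched as follows. The symbol $\hat{y}_{it}$ for the fitted value from equation (2) is introduced here for illustration, and the sum is assumed to run over the country-year pairs that are actually observed.

```latex
\mathrm{SSR}(\beta) = \sum_{(i,t)\ \text{observed}} \left( y_{it} - \hat{y}_{it} \right)^{2}
```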
As mentioned in section 3.1, some of the coefficients in equation (2) can be set to zero as we only want to include those variables that have high predictive power for the dependent variable. We therefore adopt a model selection process that reflects the predictive purpose of the proposed model. We need to consider the trade-off between goodness of fit and parsimony of the model, as a non-parsimonious model can lead to poor prediction results (Tibshirani, 1996). Obviously, a “better” fit can often be obtained by introducing more model terms, but not all of the terms will have high predictive power and thus the forecasting results can sometimes be worsened (Härdle, 1990; Mac Nally, 2000).
We identify the optimal model using the following selection process:
(1) We start with a polynomial order of K = 1 for all variables in equation (2) and compare all possible model specifications based on the Bayesian information criterion (BIC). The BIC is a widely used model selection criterion due to its many desirable properties (Schwarz, 1978). We select the model with the lowest BIC value.
(2) We use the Ramsey Regression Equation Specification Error Test (RESET) to test for misspecification of the selected model (Ramsey, 1969). If the model is misspecified, we move on to step (3).
(3) We repeat steps (1) and (2) for the next higher polynomial order until the Ramsey RESET test is passed.
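Step (1) of this process can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not the implementation used in the paper: the function names, the Gaussian form of the BIC (up to an additive constant) and the small synthetic data set are all made up here, and the RESET check of step (2) is left out.

```python
import itertools
import math


def ols_rss(rows, y):
    """Residual sum of squares of an OLS fit (rows already hold the intercept)."""
    p = len(rows[0])
    # Normal equations (X'X) b = X'y, solved by Gauss-Jordan elimination.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(p):
            if k != col:
                f = a[k][col] / a[col][col]
                a[k] = [u - f * w for u, w in zip(a[k], a[col])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    return sum((v - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, v in zip(rows, y))


def bic(rss, n, n_params):
    """Gaussian BIC up to an additive constant: n*ln(RSS/n) + p*ln(n)."""
    return n * math.log(rss / n) + n_params * math.log(n)


def best_subset_by_bic(columns, y):
    """Fit every non-empty subset of candidate columns; keep the lowest BIC."""
    n = len(y)
    best = None
    for size in range(1, len(columns) + 1):
        for subset in itertools.combinations(sorted(columns), size):
            rows = [[1.0] + [columns[name][i] for name in subset]
                    for i in range(n)]
            score = bic(ols_rss(rows, y), n, len(subset) + 1)
            if best is None or score < best[0]:
                best = (score, subset)
    return best


# Synthetic data: y depends on x1 only; x2 is pure noise by construction.
x1 = [float(i) for i in range(8)]
x2 = [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0]
noise = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5]
y = [1.0 + 2.0 * a + e for a, e in zip(x1, noise)]
best_score, best_vars = best_subset_by_bic({"x1": x1, "x2": x2}, y)
```

In this toy example the uninformative column `x2` does not reduce the residual sum of squares, so the BIC penalty for the extra parameter leads to the model with `x1` alone. Moving to a higher polynomial order, as in step (3), would amount to adding squared columns to the candidate set.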
4. Data
4.1 Data used for estimation
The proposed model is estimated using data from a wide range of countries worldwide. We obtained data on LE and HLE at birth for 188 countries in 1990, 2005 and 2013 from the Global Burden of Disease report (Murray et al., 2015), for both males and females. The explanatory variables, including GDP per capita (in 1,000 USD), public health expenditure as a percentage of GDP, public education expenditure as a percentage of GDP, the number of hospital beds per 1,000 people and the number of physicians per 1,000 people, were collected from two main sources: the World Bank (2017) and the OECD (2017). For each explanatory variable, when the World Bank data for a given country in a given year were missing, we used data from the OECD instead. If the observation was not available from either data source, a nearest neighbour interpolation with a 2-year bandwidth was used to approximate the value based on information from the two data sources. Otherwise, the observation was treated as missing. Our final sample has 222 observations from 139 countries that have complete data in at least one of the years 1990, 2005 and 2013.
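The fallback order for filling in an explanatory variable can be sketched as follows. This is a minimal illustration: the function name `fill_value`, the dictionary layout keyed by (country, year) and the tie-breaking rule (the earlier year and the World Bank are tried first) are assumptions made here and are not specified in the text.

```python
def fill_value(country_year, world_bank, oecd, bandwidth=2):
    """Return a value following the fallback order described above, else None."""
    country, year = country_year
    # 1) World Bank first, 2) OECD second, for the exact year.
    for source in (world_bank, oecd):
        value = source.get((country, year))
        if value is not None:
            return value
    # 3) Nearest neighbouring year within the bandwidth, from either source.
    for gap in range(1, bandwidth + 1):
        for neighbour in (year - gap, year + gap):
            for source in (world_bank, oecd):
                value = source.get((country, neighbour))
                if value is not None:
                    return value
    # 4) Otherwise the observation is treated as missing.
    return None


# Hypothetical inputs, keyed by (country, year).
wb = {("A", 2005): 3.1, ("B", 2004): 1.0}
oe = {("A", 1990): 2.0, ("B", 2006): 9.0, ("C", 2009): 4.0}
filled = [fill_value(key, wb, oe)
          for key in (("A", 2005), ("A", 1990), ("B", 2005), ("C", 2005))]
```

A country-year enters the estimation sample only if every explanatory variable can be filled in this way.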
4.2 Data used for model validation
Before applying the estimated model to predict province-level HLE for China, we want to make sure that the model fits the data well and provides reliable out-of-sample predictions for HLE. Two sets of validation are conducted in section 5. First, we compare the model-predicted male and female HLE for 128 countries in 2010 with the corresponding published figures in the Global Burden of Disease Study (Salomon et al., 2013). Information on all socio-economic variables included in the model was collected from the World Bank (2017) and the OECD (2017). We note that the 2010 data were not used for estimating the model (footnote 3).
Second, we assess the performance of the model based on its out-of-sample prediction accuracy for Taiwan in the years 2005, 2010 and 2013. Even though information on HLE and LE for Taiwan can be obtained from the Global Burden of Disease Study, Taiwan is not included in the estimation dataset as its socio-economic variables are not available from either the World Bank or the OECD. However, due to its strong historical and cultural links to China, the accuracy of HLE predictions for Taiwan provides us with a credible indication of whether the model is suitable for predicting HLE for Chinese provinces. Therefore, we compute the HLE estimates for Taiwan and compare them with published figures. Explanatory variables included in the model are obtained from three sources: World Data Atlas (2017), National Statistics Republic of China (Taiwan) (2017) and Chowdhury (2007).
4.3 Data used for prediction of province-level HLE
Using the estimated model, we predict HLE at birth for 31 province-level regions in China in 2015. The National Bureau of Statistics of China publishes province-level male and female LE based on census data every 10 years. We first estimate the province-level LEs at birth in 2015 using linear extrapolation from the published province-level LEs in 2000 and 2010. We then adjust the extrapolated values to ensure that the national-level LE computed based on our estimates is consistent with the national-level LE figure published in the Global Burden of Disease Study (Kassebaum et al., 2016). The adjustment requires data on the population size for each province, which were obtained from the China 1% Population Sample Census 2015 (National Bureau of Statistics of China, 2017a). A detailed description of the adjustment method is provided in Appendix B. The province-level socio-economic data are obtained from two main sources: the National Bureau of Statistics of China (2017b) and China Data Online (2017) (footnote 4).
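The two steps can be sketched as follows. The linear extrapolation follows directly from the description above. The adjustment step is only a stand-in: Appendix B is not reproduced here, so the proportional rescaling to a population-weighted national average is an assumption of this sketch, as are the function names and the numbers.

```python
def extrapolate_le(le_2000, le_2010, target_year=2015):
    """Linear extrapolation of LE from the 2000 and 2010 census values."""
    slope = (le_2010 - le_2000) / 10.0
    return le_2010 + slope * (target_year - 2010)


def rescale_to_national(province_le, population, national_le):
    """ASSUMED adjustment: scale all provinces by one common factor so that
    the population-weighted average equals the published national LE."""
    total = sum(population[p] for p in province_le)
    implied = sum(province_le[p] * population[p] for p in province_le) / total
    factor = national_le / implied
    return {p: le * factor for p, le in province_le.items()}


# Hypothetical two-province example.
raw = {"P1": extrapolate_le(72.0, 76.0), "P2": extrapolate_le(68.0, 70.0)}
pop = {"P1": 1.0, "P2": 1.0}
adjusted = rescale_to_national(raw, pop, national_le=75.0)
```

Other adjustment rules, such as an additive shift, would also reproduce the national figure; the sketch only illustrates why province-level population sizes are needed.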
5. Results
5.1 The estimated model
Following the model selection procedure described in section 3.2, we identify the optimal models using the BIC measure and the Ramsey RESET test. The results are shown in Table 1. When we only include first-order terms (K = 1), the selected models for both male and female HLE fail the Ramsey RESET test at the 5% level of significance, suggesting possible model misspecification. When quadratic terms are allowed in the model (K = 2), both optimal models pass the Ramsey RESET test and have lower BIC values, indicating a better model fit. Therefore, these are our final selected models.
Note 1: Each line represents the selected model for a given polynomial order K of the explanatory variables. Models are selected using the BIC.
Note 2: No. of parameters is the number of parameters in the model including the intercept.
The detailed estimation results for the final selected models are shown in Table 2. For both genders, the selected variables are jointly and individually significant at the 5% level. Male HLE can be best predicted by a combination of GDP per capita (GDP), public health expenditure as a percentage of GDP (Health exp), public education expenditure as a percentage of GDP (Education exp), the number of hospital beds per 1,000 people (Hospital beds), Hospital beds squared and the East-Asia binary indicator. The HLE model for females shares similar variables with the male model but does not include public health or education expenditure and instead includes GDP squared (footnote 5). The $R^2$ statistics show that both models explain the variations in HLE well.
Note 1: The response variable in both columns is the logit transformation of the ratio of HLE to LE.
Note 2: Robust standard errors (Huber–White) are shown in parentheses.
Note 3: F-stat is the test statistic for the joint significance of all independent variables.
Note 4: * and ** indicate significance at the 5% and 1% level, respectively.
Note 5: $R^2$ represents the percentage of the variation in HLE explained by the model.
We note that some of the coefficients of socio-economic variables in Table 2 are negative. For example, the coefficients of GDP for both males and females are negative. Considering GDP as a single factor in isolation, a large degree of variation in HLE across regions would be due to the differences in regional economic development. However, since the model contains multiple factors, we cannot interpret the coefficients without taking into account the interactions across all factors. One way to interpret the negative coefficient of GDP is that, when everything else is held constant, an increase in GDP is associated with a lower ratio of HLE to LE. This could result from side effects of high-speed economic development such as pollution. In fact, the negative correlation between GDP and HLE (as a ratio of LE) is consistent with international evidence. In many countries, the medical advancement from economic prosperity in recent decades has significantly improved old-age LE but has increased HLE to a much lesser extent. Using comparable survey data across 13 EU member states over the period 1995–2001, Jagger et al. (2009) find that the majority of countries experienced an expansion of morbidity. We do not attempt to further interpret the coefficients in Table 2, as the primary objective of the proposed model is to predict HLE, rather than to explain how different socio-economic factors affect HLE.
We also plot the residuals from the estimated models in Figure 1. The figure shows that both models produce unbiased estimates of the dependent variable and the residuals are approximately normally distributed. Also, it can be seen that the residual volatility for the male model is slightly higher than for the female model.
5.2 Out-of-sample prediction performance
We first assess the prediction performance of the estimated model by comparing the model-predicted HLE with the published HLE for 128 countries in 2010. Table 3 shows the results for both males and females. Of the 128 countries in this comparison, 113 are also included in the estimation sample; the remaining 15 are counted as out-of-sample countries. For males, the model-predicted HLE falls within the confidence interval of the published HLE for 97% of the in-sample countries and 87% of the out-of-sample countries. We find similar results for the female model, with only two countries’ model-predicted HLE falling outside the published confidence interval. Overall, the results show that the proposed models capture the cross-sectional variations in HLE well. We find a high level of prediction accuracy for both the in-sample countries and out-of-sample countries.
Note 1: We compare model-predicted HLEs with HLEs and corresponding confidence intervals (in parentheses) published in Salomon et al. (2013).
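The share of countries whose predicted HLE falls inside the published confidence interval can be computed as in the sketch below. The function name `coverage` and all values are hypothetical and serve only to illustrate the comparison.

```python
def coverage(predicted, published_intervals):
    """Share of countries whose predicted HLE lies within the published interval."""
    hits = 0
    for name, value in predicted.items():
        lower, upper = published_intervals[name]
        if lower <= value <= upper:
            hits += 1
    return hits / len(predicted)


# Hypothetical predictions and published intervals for four countries.
pred = {"A": 65.0, "B": 70.0, "C": 60.0, "D": 72.0}
intervals = {"A": (64.0, 66.0), "B": (68.0, 69.0),
             "C": (59.0, 61.0), "D": (70.0, 73.0)}
share_inside = coverage(pred, intervals)
```

Applying the same function separately to the in-sample and out-of-sample countries gives the two shares reported for each gender.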
Note 2: In-sample countries are those which are included in the dataset for the estimation of the model.
Note 3: Out-of-sample countries are those which are not included in the estimation.
In addition, we want to test whether the model is suitable to predict province-level HLE and can capture the development of HLE over time. Table 4 compares the model-predicted HLE for Taiwan with the published figures for both genders in the years 2005, 2010 and 2013. We find that the model-predicted HLE for males increased from 66.51 years in 2005 to 68.14 years in 2013. These numbers are very close to the published HLE, which increased from 66.68 to 68.11 years over the same time period. Similarly, the model-predicted HLE for females increases in line with the published figures. On average, the model-predicted HLE deviates from the published HLE by less than 0.5 years. These results show that the estimated model provides very accurate predictions of HLE for Taiwan over the period from 2005 to 2013. Because Taiwan has strong historical and cultural links with China, this validation result gives us additional confidence to apply the model to predict HLE for the province-level units in China.
Note: We compare model-predicted HLEs with HLEs and corresponding prediction intervals (in parentheses) published in Salomon et al. (2013).
5.3 Province-level HLE for China
Using the estimated models, we predict the HLE for 31 province-level regions in China in 2015. Tables 5 and 6 show the model-predicted HLE and the associated prediction intervals for males and females. We can see from the two tables that there is a large degree of heterogeneity in HLE across China for both males and females. Beijing has the highest HLE for both genders: HLE is 70.53 and 73.78 years for males and females, respectively. Conversely, Yunnan has the lowest HLE for males (60.44 years), while Tibet has the lowest HLE for females (63.37 years). For both genders, HLE varies by about 10 years across China. The prediction intervals for HLE computed by the model are around 3.7–4.0 years for males and 2.7–2.9 years for females.
Note: The table is ranked by the value of predicted HLEs from the highest to the lowest.
Note: The table is ranked by the value of predicted HLEs from the highest to the lowest.
Tables 5 and 6 also show the corresponding LE and number of years spent in disability for each province. Based on our model, provinces with high LE also tend to have high HLE. Years spent in disability are calculated as the difference between LE and HLE. Interestingly, males and females in provinces that have high LEs and HLEs also spend more years in disability. In the case of males, Tianjin has the highest number of years in disability (8.58 years). On the other hand, the years in disability are only 6.14 years for Qinghai, which has the third lowest HLE across all provinces. It is well documented that, on average, females have longer lifespans than males (see, e.g. Oeppen & Vaupel, 2002; Barford et al., 2006). The results in Tables 5 and 6 confirm this finding and show that the HLEs of females are generally higher than those of males by roughly 2–3 years. Females also spend more years in disability. For females, the years in disability range from 8.57 to 10.43 years, while for males, the corresponding range is only 6.14 to 8.58 years. This gender difference in disability years has also been found in several previous studies (see, e.g. Murtagh & Hubert, 2004; Van Oyen et al., 2013).
In Figures 2 and 3, we show two heat maps to further explore the geographic distribution of HLE across China. The figures show that HLE generally decreases from eastern to western China, with Beijing, Tianjin, Shanghai, Zhejiang and Hainan having the highest HLEs. The HLE in provinces along China’s east coast is above 65 years for males and above 70 years for females, which is comparable to the values for European developed countries published in the Global Burden of Disease Study (Kassebaum et al., 2016). Provinces in central China have moderate levels of HLE, ranging from 63 to 67 years for males and from 68 to 71 years for females. Western China generally has lower levels of HLE compared to central and eastern China. In particular, Tibet, Qinghai and Yunnan in the southwest of China have the lowest HLE figures, with male and female HLE lower than 62 and 67 years, respectively. These numbers are comparable to the HLE in African developing countries such as Egypt.
Liu et al. (2010) estimated province-level HLEs for China using the Sullivan method. They computed disability-free life expectancy at age 60, which differs from the HLE measure adopted in this study. Nevertheless, the regional patterns of HLE in 2006 based on their estimates are very similar to our results shown in Figures 2 and 3. Moreover, Liu et al. (2010) also report a gap of about 10 years between the highest and the lowest province-level HLE estimates. Using single and multiple linear regression analyses, Liu et al. (2010) find that disability-free life expectancy is highly correlated with GDP per capita, which implies that the differences in disability-free life expectancy by region mirror the differences in regional development in China. Our approach incorporates this relationship into the model structure. As a result, based on the HLE estimates in Tables 5 and 6, we find that HLE is strongly correlated with GDP per capita. For males, the correlation coefficient is 0.83 and for females it is 0.58.
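As a minimal illustration of how such a correlation coefficient can be computed, the Python sketch below implements the Pearson correlation from its definition. The function name `pearson` and the two short input lists are hypothetical placeholders chosen for illustration only; they are not the province-level data from Tables 5 and 6, and the sketch does not assume any particular transformation of GDP per capita.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Placeholder values (NOT from the paper): GDP per capita and HLE for five regions
gdp_per_capita = [30.0, 45.0, 60.0, 90.0, 110.0]
hle = [62.0, 64.5, 66.0, 67.5, 69.0]

r = pearson(gdp_per_capita, hle)
print(f"correlation: {r:.2f}")
```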
6. Discussion and Conclusions
In this paper, we propose a new method to estimate HLE at birth for different regions within a country and apply the model to China’s provinces. Our model does not require the age-specific morbidity and mortality data needed by the commonly used Sullivan method. Instead, we develop a predictive multiple regression model based on socio-economic variables including GDP per capita, public health expenditure, public education expenditure, number of hospital beds and number of physicians. Information on LE is also used to assist with the prediction of HLE. The proposed model is estimated using data from a wide range of countries globally.
We contribute to the understanding of health inequalities in China by providing the most up-to-date estimates of HLE at birth for 31 province-level units in 2015. Our results show that the inequalities in health outcomes across Chinese provinces persist. For both males and females, the difference between the highest and the lowest province-level HLEs is approximately 10 years. We also find regional clustering of HLEs with high HLEs mainly in eastern China and low HLEs mainly in the southwestern part of China.
The estimated province-level HLE can inform the design of public policies in China. The Chinese government is planning to gradually increase the pension eligibility age for urban employees to age 65 for both men and women. The current pension eligibility age is 60 for males and 50–55 for females. Previous research has shown that health is an important factor in individuals’ retirement decisions in China (Giles et al., 2015). Our HLE estimates indicate that there is potential for increased labour force participation at higher ages. However, we also find that there are 12 provinces with a male HLE of under 65 years, suggesting that in these provinces many men may not be able to work until age 65 for health reasons.
Another challenge China is currently facing is the development of its health care and long-term care system for the elderly. There are several ongoing pilot projects to develop public long-term care insurance in China (e.g. Lu et al., 2017; Yang et al., 2016). Our study provides estimates of both HLE and the number of years spent in disability for each province. These results are useful for assessing regional differences in the demand for health care and long-term care services. Since China will maintain a very high level of demand for health-related services in the coming decades, the government is also actively seeking private market solutions to share the burden of health costs. Our study can contribute to the development of private market retirement financial products including private health insurance, private long-term care insurance and home equity release products such as reverse mortgages.
Finally, since the proposed predictive regression model does not require detailed age-specific mortality and morbidity rates, it can also be applied to estimate HLE for other countries in years where age-specific morbidity information is not available. Furthermore, as methods for forecasting life expectancy (Majer et al., 2013; Wong & Tsui, 2015) and macro-level socio-economic variables (see, e.g. Litterman, 1986) are readily available, the model can also be used to forecast HLE.
Acknowledgements
The authors acknowledge the financial support from the Australian Research Council Centre of Excellence in Population Ageing Research (CEPAR). The authors are indebted to colleagues from Monash University and UNSW Sydney and, in particular, to Professor Farshid Vahid, Professor Rob Hyndman and Dr Andrés Villegas for their valuable feedback. The authors thank the participants at the 21st International Congress on Insurance: Mathematics and Economics (Vienna, 2017) and the 8th China International Conference on Insurance and Risk Management (Guilin, 2017) for their insightful comments and suggestions. The authors also thank Kevin Krahe for his excellent research assistance on the project.
Appendix A. Matrix Expression of the OLS Estimators and Prediction Intervals
To derive the matrix form expressions of the OLS estimators and their standard errors, we further define the following terms:
$\tilde{Y}=(Y^\prime _1,\ldots, Y^\prime _N)^\prime$ , where $Y_i=(y_{i1},\ldots, y_{iT})^\prime$ for $i=1,2, \ldots, N$ .
$\tilde{X}=({X}^\prime _1,{X}^\prime _2,\ldots,{X}^\prime _N)^\prime$ with $X_i= \left( \begin{array}{cccccccc} 1&x_{i1,1}& \ldots &x_{i1,J}&\ldots &x^K_{i1,1}& \ldots &x^K_{i1,J}\\ \vdots&\vdots & \ddots& \vdots&\ldots &\vdots& \ddots& \vdots\\ 1&x_{iT,1}&\ldots& x_{iT,J}&\ldots &x^K_{iT,1}&\ldots& x^K_{iT,J}\\ \end{array} \right)$ for $i=1,2, \ldots, N$ .
$\beta=(\beta_0,\beta_{1,1},\ldots,\beta_{J,1},\ldots,\beta_{1,K},\ldots,\beta_{J,K})^\prime.$
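Therefore, we can re-express equation (3) in matrix form as:
$$\tilde{Y}=\tilde{X}\beta+\tilde{\varepsilon},$$
where $\tilde{\varepsilon}$ denotes the stacked vector of error terms, and the OLS estimators are thus expressed as:
$$\hat{\beta}=(\tilde{X}^\prime\tilde{X})^{-1}\tilde{X}^\prime\tilde{Y}.$$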
We define λ to be the logistic function. The point estimate for HLE is then given by
where $\lambda(b)=\frac{e^b}{1+e^b}$ , and $b \in \mathbb{R}$ . To construct prediction intervals for HLE, we use the delta method described in Greene (2003) for approximating moments of the parameters in the model. We define f(β) to be the matrix of partial derivatives with respect to the β coefficients:
The 100 × (1-α)% prediction interval is then given by
where n is the sample size and p is the number of parameters in the model. $t_{(1-\frac{\alpha}{2},n-p)}$ is the two-sided critical value at the significance level α from the Student’s t-distribution with (n − p) degrees of freedom. We define $s^2=(\tilde{Y}-\tilde{X}\hat\beta)^\prime(\tilde{Y}-\tilde{X}\hat\beta)/(n-p)$ .
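The quantities defined in this appendix can be illustrated with a short numerical sketch. The Python code below computes the OLS estimator and $s^2$ for a stacked design matrix, and then builds an approximate delta-method interval for the logistic transform of a new linear predictor. It is a sketch under stated assumptions rather than the exact formulas of this appendix: the function names, the synthetic data, the inclusion of the residual variance term and the user-supplied critical value `t_crit` are all assumptions, and the result is on the (0, 1) scale of λ, so any further mapping to HLE is not shown.

```python
import numpy as np

def logistic(b):
    """Logistic function lambda(b) = e^b / (1 + e^b)."""
    return 1.0 / (1.0 + np.exp(-b))

def ols_fit(X, y):
    """OLS for y = X beta + error; returns (beta_hat, s2, inverse of X'X)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    s2 = float(resid @ resid) / (n - p)  # residual variance with n - p degrees of freedom
    return beta_hat, s2, XtX_inv

def delta_interval(x_new, beta_hat, s2, XtX_inv, t_crit):
    """Approximate interval for lambda(x_new' beta) via the delta method.

    Assumption: the variance combines the estimation uncertainty of beta_hat
    and the residual variance, both mapped through the derivative of lambda.
    """
    point = float(logistic(x_new @ beta_hat))
    deriv = point * (1.0 - point)   # lambda'(b) = lambda(b) * (1 - lambda(b))
    grad = deriv * x_new            # partial derivatives of lambda(x'beta) w.r.t. beta
    var_pred = s2 * (deriv ** 2 + float(grad @ XtX_inv @ grad))
    half_width = t_crit * np.sqrt(var_pred)
    return point, point - half_width, point + half_width

# Synthetic data for illustration only (not the data used in the paper)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
beta_true = np.array([2.0, 0.3, -0.2])
y = X @ beta_true + rng.normal(scale=0.1, size=50)

beta_hat, s2, XtX_inv = ols_fit(X, y)
# t_crit is approximately the 97.5% quantile of a t-distribution with 47 degrees of freedom
point, lower, upper = delta_interval(X[0], beta_hat, s2, XtX_inv, t_crit=2.012)
print(beta_hat, point, lower, upper)
```

Passing the critical value as an argument keeps the sketch free of any dependency beyond NumPy; in practice it would be taken from a Student's t quantile function with (n − p) degrees of freedom.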
Appendix B. Linear Extrapolation Method to Compute Life Expectancy in 2015
Linear extrapolation was used to compute LE in 2015 for the 31 province-level regions included in this study. We used the LE figures published by the National Bureau of Statistics of China in 2000 and 2010. The formula used to extrapolate the 2015 LE for province i is given by
$$\text{LE}_{2015,i}=\text{LE}_{2010,i}+\frac{2015-2010}{2010-2000}\left(\text{LE}_{2010,i}-\text{LE}_{2000,i}\right).$$
We then compute a constant scaling factor c for both male and female populations such that
where we have
The estimated constant scaling factor $\hat c$ was 0.9840 for males and 0.9988 for females.
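Finally, we obtain $\text{LE}^\ast_{2015,i}$ , the adjusted LE for province i in the year 2015, by applying the estimated scaling factor to the linearly extrapolated value $\text{LE}_{2015,i}$ :
$$\text{LE}^\ast_{2015,i}=\hat{c}\times\text{LE}_{2015,i}.$$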
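The two steps of this appendix, linear extrapolation followed by a constant rescaling, can be sketched in Python as follows. The extrapolation uses only the 2000 and 2010 values, as described above. The criterion used to determine c in the sketch is an assumption made purely for illustration: c is chosen so that a weighted average of the scaled provincial values matches a given national figure. The function names, the region labels, the weights and all numerical values are hypothetical placeholders, not data from the paper.

```python
def extrapolate_le(le_2000, le_2010, target_year=2015):
    """Linearly extrapolate LE from the 2000 and 2010 values to target_year."""
    slope = (le_2010 - le_2000) / (2010 - 2000)  # average annual change
    return le_2010 + slope * (target_year - 2010)

def scaling_factor(le_by_region, weights, national_le):
    """Constant c such that the weighted mean of c * LE_i equals national_le.

    Assumption: this calibration target is illustrative only.
    """
    total_weight = sum(weights[r] for r in le_by_region)
    weighted_mean = sum(le_by_region[r] * weights[r] for r in le_by_region) / total_weight
    return national_le / weighted_mean

# Placeholder inputs (NOT from the paper)
le_2000 = {"A": 70.0, "B": 66.0}
le_2010 = {"A": 74.0, "B": 69.0}
weights = {"A": 1.0, "B": 3.0}   # e.g. relative population sizes
national_le_2015 = 71.0

# Step 1: extrapolate each region to 2015
le_2015 = {r: extrapolate_le(le_2000[r], le_2010[r]) for r in le_2000}

# Step 2: estimate the scaling factor and apply it to every region
c_hat = scaling_factor(le_2015, weights, national_le_2015)
le_2015_adjusted = {r: c_hat * value for r, value in le_2015.items()}

print(le_2015, round(c_hat, 4), le_2015_adjusted)
```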