Published on 07.01.23 in Vol 9, No 1 (2023): Jan-Dec
Preprints (earlier versions) of this paper are available at http://preprints.jmir.org/preprint/40659, first published Jul 01, 2022.
Original Paper
The Environmental and Socioeconomic Effects and Prediction of Patients With Tuberculosis in Different Age Groups in Southwest China: A Population-Based Study
ABSTRACT
Background: While the End Tuberculosis (TB) Strategy has been implemented worldwide, the cause of the TB epidemic is multifactorial and not fully understood.
Objective: This study aims to investigate the risk factors of TB and incorporate these factors to forecast the incidence of TB infection across different age groups in Sichuan, China.
Methods: Correlation and linear regression analyses were conducted to assess the relationships between TB cases and ecological factors, including environmental, economic, and social factors, in Sichuan Province from 2006 to 2017. The transfer function-noise model was used to forecast trends, considering both time and multifactor effects.
Results: From 2006 to 2017, Sichuan Province had a reported cumulative incidence rate of 1321.08 cases per 100,000 individuals in male patients and 583.04 cases per 100,000 individuals in female patients. There were significant sex differences in the distribution of cases among age groups (trend χ2=12,544.4; P<.001). Ganzi Tibetan Autonomous Prefecture had the highest incidence rates of TB in both male and female patients in Sichuan. Correlation and regression analyses showed that the total illiteracy rate and average pressure at each measuring station (for individuals aged 15-24 years) were risk factors for TB. The protective factors were as follows: the number of families with the minimum living standard guarantee in urban areas, the average wind speed, the number of discharged patients with invasive TB, the number of people with the minimum living standard guarantee in rural areas, the total health expenditure as a percentage of regional gross domestic product, and being a single male individual (for those aged 0-14 years); the number of hospitals and number of health workers in infectious disease hospitals (for individuals aged 25-64 years); and the amount of daily morning and evening exercise, the number of people with the urban minimum living standard guarantee, and being married (for female individuals aged ≥65 years). The transfer function-noise model indicated that the incidence of TB in male patients aged 0-14 and 15-24 years will continue to increase, and the incidence of TB in female patients aged 0-14 and ≥65 years will continue to increase rapidly in Sichuan by 2035.
Conclusions: The End TB Strategy in Sichuan should consider environmental, educational, medical, social, personal, and other conditions, and further substantial efforts are needed especially for male patients aged 0-24 years, female patients aged 0-14 years, and female patients older than 64 years.
JMIR Public Health Surveill 2023;9:e40659
doi:10.2196/40659
Introduction
Tuberculosis (TB) is a leading cause of the global disease burden. Approximately 25% of the population worldwide is infected with Mycobacterium tuberculosis. In 2019, according to the World Health Organization, there were approximately 10 million new cases of TB globally and 1.2 million deaths. Of note, almost 90% of individuals who are infected with TB each year originate from low-income countries. Poor health services, malnutrition, and crowded working and living conditions in these countries lead to an increased risk of TB across populations. Fighting poverty has become a major theme for the World Health Organization, which aims to end the global TB epidemic using “the End TB Strategy.” According to the End TB Strategy, the targeted reductions in the absolute number of TB deaths and in the incidence rate by 2035 are 95% and 90%, respectively, relative to the 2015 baseline. However, the cause of the TB epidemic is multifactorial and not fully understood.
Most people (approximately 90%) develop the disease in adulthood, with men being more susceptible than women. Previous ecological studies have demonstrated a significant association between TB cases and ecological factors, including environmental, economic, health, and social conditions. Although ecological risk factors for TB at the population and individual levels have raised great concern, the reported results are inconsistent. The cause of TB varies across populations and countries, especially across groups (eg, of different ages and sexes) and in areas with a high TB burden, which is a major barrier for the End TB Strategy. China had the third-largest TB burden in the world in 2019, and Sichuan Province, known for its geographical and ethnic diversity, has a top-ranked TB burden, thus providing the opportunity to comprehensively identify specific risk factors in a complex setting.
Furthermore, elucidating the trend of TB incidence together with the identified TB factors helps to assess the effectiveness of containment measures, which may aid policy makers' decisions and public health practice. Therefore, our research aimed to identify potential risk factors for TB in Southwest China using multiple regression and to predict the trend of TB incidence using a transfer function-noise (TFN) model. The overarching goal of our study was to gain insight into the secular TB trend, providing implications for advancing TB prevention and control strategies to achieve the targets of the End TB Strategy.
Methods
Data Sources and Objects
Geographic Information
We obtained data from China’s Geographic Information Center on Sichuan Province in 2009. The map of county-level administrative divisions included prefectures, counties, cities, and districts. In total, Sichuan Province has 21 cities or prefectures and 181 districts, counties, and cities, with a total of 1.4 million inhabitants.
Social, Economic, Environmental, Education, and Health Information
Data were extracted from the 2006-2017 Statistical Yearbook of Sichuan Provincial Bureau of Statistics. All TB cases were grouped by sex and age. The age groups were as follows: children (aged 0-14 years), youths (aged 15-24 years), adults (aged 25-64 years), and older individuals (aged >64 years).
From the age-stratified population of the “Epidemic Information Network Direct Reporting System” from 2006 to 2017, we enrolled the subpopulation in Sichuan Province. All population data were for permanent residents, specific to county administrative divisions, and included population data by age and sex. Information on TB and HIV/AIDS was obtained from the TB Information Management System of the Chinese Disease Prevention and Control Information System and the Statistical Yearbook published by the Sichuan Provincial Bureau of Statistics. The incidence of TB was analyzed in different age groups.
Statistical Analysis
The ecological analysis used data such as case reports, registered case data, and ecological information to explore the risk factors related to the prevalence of TB. Data on TB cases and the incidence rate of TB were collected per administrative division, sex, and age group. The variables for each model were chosen by auto-modeling.
All 128 ecological factors were obtained from 12 years (2006-2017) of data. The transfer function model was built on a time series autoregressive integrated moving average (ARIMA) model, using the expert modeler and adding independent variables, which were chosen according to the fitted and predicted data. We compared 3 models (the Grey model, the ARIMA model, and the TFN model) to identify the one that most closely matched the real TB data of 2018.
TFN Model
The expert modeler selected the optimal model from multiple fitted models when the R2 value reached the ideal state. The linear regression analysis adopted the stepwise method, fitting multiple models to achieve a better R2 value in each of the 4 age groups; only the best model is displayed. Multiple stepwise regression was carried out for the multivariate analysis (stepwise regression rules: F-to-enter≥3.840, F-to-remove≤2.710), with factors entering and leaving the models step by step.
In the stepwise regression summary, modeling concluded when the R2 and the standard error of the estimate reached their best values.
Univariate analyses (Pearson correlation analysis) and multivariate analyses (regression analysis) were used to analyze the protective factors and risk factors for TB in Sichuan Province according to sex and age group (trend χ2=12,544.4; P<.001). Statistical analysis (descriptive analysis and cluster analysis of spatiotemporal scans) and prediction of TB incidence in 2035 were performed according to sex and age.
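The stepwise entry and removal rules quoted above (F-to-enter≥3.840, F-to-remove≤2.710) can be illustrated with a minimal pure-Python sketch. This is not the SPSS procedure used in the study: the helper names `rss`, `partial_f`, and `stepwise` are hypothetical, the toy data are made up, and the sketch assumes the predictors are not perfectly collinear.

```python
def rss(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus `columns`."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting (assumes X'X is invertible).
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))


def partial_f(rss_small, rss_big, df_big):
    """F statistic for the single extra term separating two nested OLS fits."""
    return float("inf") if rss_big == 0 else (rss_small - rss_big) / (rss_big / df_big)


def stepwise(predictors, y, f_enter=3.840, f_remove=2.710, max_rounds=100):
    """Return indices of selected predictors using F-to-enter / F-to-remove rules."""
    n = len(y)
    selected = []

    def cols(idx):
        return [predictors[j] for j in idx]

    for _ in range(max_rounds):
        changed = False
        # Forward step: add the candidate with the largest F if it meets F-to-enter.
        current = rss(cols(selected), y)
        scored = [(partial_f(current, rss(cols(selected + [j]), y), n - len(selected) - 2), j)
                  for j in range(len(predictors)) if j not in selected]
        if scored:
            best_f, best_j = max(scored)
            if best_f >= f_enter:
                selected.append(best_j)
                changed = True
        # Backward step: drop the weakest selected term if it falls to F-to-remove.
        if selected:
            full = rss(cols(selected), y)
            df = n - len(selected) - 1
            scored = [(partial_f(rss(cols([m for m in selected if m != j]), y), full, df), j)
                      for j in selected]
            worst_f, worst_j = min(scored)
            if worst_f <= f_remove:
                selected.remove(worst_j)
                changed = True
        if not changed:
            break
    return selected


# Toy data (made up): 12 "years", a trend predictor, and an unrelated alternating one.
x0 = [float(t) for t in range(12)]
x1 = [1.0 if t % 2 == 0 else -1.0 for t in range(12)]
noise = [0.5 if (t // 2) % 2 == 0 else -0.5 for t in range(12)]
y = [2.0 * a + e for a, e in zip(x0, noise)]
print(stepwise([x0, x1], y))  # only the trend predictor (index 0) is retained
```

Because the entry threshold is higher than the removal threshold, a factor that has just entered cannot be removed in the same round, which keeps the procedure from cycling.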
The autocorrelation test of the residuals uses the Durbin-Watson (DW) test, with the following test statistic:
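$$DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$$
where e_t is the regression residual for observation t and n is the number of observations.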
DW values range from 0 to 4: a value of 0 indicates complete positive autocorrelation, values between 0 and 1.5 indicate positive autocorrelation, values between 1.5 and 2.5 indicate no autocorrelation, values between 2.5 and 4 indicate negative autocorrelation, and a value of 4 indicates complete negative autocorrelation. The closer the DW value is to 2, the more independent the observations in the multiple linear regression are. In addition, the autocorrelation function and partial autocorrelation function were used to check whether the data sequence had become stationary. The R2 and Bayesian information criterion values of the TFN model, before and after data unit standardization, were used to find the best model.
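As a minimal sketch of the DW check described above, the statistic can be computed directly from a list of residuals and mapped to the bands given in the text. The function names `durbin_watson` and `interpret_dw` and the example residuals are illustrative assumptions, not part of the study's SPSS workflow.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    if den == 0:
        raise ValueError("all residuals are zero")
    return num / den


def interpret_dw(dw):
    """Map a DW value to the bands used in the text (cutoffs at 1.5 and 2.5)."""
    if dw < 1.5:
        return "positive autocorrelation"
    if dw <= 2.5:
        return "no autocorrelation"
    return "negative autocorrelation"


# Made-up residual series for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]   # sign flips every step
constant = [1.0, 1.0, 1.0, 1.0]        # identical successive residuals
print(durbin_watson(alternating), interpret_dw(durbin_watson(alternating)))
print(durbin_watson(constant), interpret_dw(durbin_watson(constant)))
```

Residuals that flip sign at every step push the statistic toward 4, whereas residuals that barely change from one observation to the next push it toward 0.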
Considering the effect of time and multiple factors, we also tried panel regression, Poisson regression, and Lasso regression; however, the data were not suitable for these methods. For example, our study explored a large number of factors, so the panel regression could not be fitted with all 128 factors together with the ID and time variables. The pilot analyses revealed that the random effects model was better than the pooled model and the fixed effects model. For these reasons, we did not use other regression methods.
We included all reported cases of TB during 2006-2017; these data may be subject to information bias and selection bias. The TFN model is a multivariate time series analysis method that can be seen as a combination of the ARIMA model and a multiple regression model. Its three main steps were model identification, parameter estimation, and model testing. We used SPSS 23.0 (IBM Corp) and ArcGIS Map 10.6 (ESRI Inc) to create spatiotemporal scans and to conduct the analyses that predicted TB trends by sex and age. P values were calculated under the assumption of unequal variances, and P<.05 was considered statistically significant.
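For readers unfamiliar with the TFN model, its description as a combination of ARIMA and multiple regression can be made concrete with the generic Box-Jenkins form below. This is the textbook formulation, not the specific model fitted in this study; the orders, delays, and symbols are assumptions introduced for illustration.

```latex
y_t = \mu + \sum_{i=1}^{k} \frac{\omega_i(B)}{\delta_i(B)}\, B^{b_i}\, x_{i,t} + N_t,
\qquad
\phi(B)\,(1 - B)^{d}\, N_t = \theta(B)\, a_t
```

Here y_t is the TB incidence at time t, x_{i,t} is the i-th ecological factor, B is the backshift operator (B x_t = x_{t-1}), b_i is a delay, \omega_i(B) and \delta_i(B) are polynomials in B, N_t is a noise series following an ARIMA(p, d, q) process, and a_t is white noise. When each ratio \omega_i(B)/\delta_i(B) reduces to a constant \beta_i and b_i = 0, the first part is an ordinary multiple regression on the factors, and the remaining autocorrelation is absorbed by the ARIMA noise term.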
Ethics Approval
The collection of TB data is required by the Law of the People’s Republic of China on Prevention and Treatment of Infectious Diseases. Ethics approval for this study was granted by the Ethics Committee of Sichuan Center for Disease Control and Prevention (SCCDCIRB2022-001).
Results
Annual TB Cases and Incidence Rate
From 2006 to 2017, Sichuan Province reported 548,584 cases of pulmonary TB in male patients, with a reported cumulative incidence rate of 1321.08 cases per 100,000 individuals (average 110.09 cases/100,000 individuals), and 235,149 cases of pulmonary TB in female patients, with a reported cumulative incidence rate of 583.04 cases per 100,000 individuals (average 48.59 cases/100,000 individuals). Thus, there were approximately 2.33 times more cases in male patients than in female patients. The reported cumulative incidence of TB in Sichuan Province from 2006 to 2017 was 961.71 cases per 100,000 individuals (average 80.14 cases/100,000 individuals). These TB cases mainly occurred in individuals aged 15-64 years, which accounted for 82.02% (n=642,808) of the total cases.
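The derived figures in this paragraph can be cross-checked from the reported counts and cumulative rates. The short sketch below uses only numbers quoted in the text; the variable names are illustrative.

```python
# Reported counts and cumulative incidence rates, 2006-2017 (from the text).
male_cases = 548_584
female_cases = 235_149
cases_aged_15_64 = 642_808
years = 12

total_cases = male_cases + female_cases            # all reported cases
sex_ratio = male_cases / female_cases              # male-to-female case ratio
share_15_64 = 100 * cases_aged_15_64 / total_cases # percentage of cases aged 15-64

# Average annual rate = cumulative rate per 100,000 divided by the number of years.
avg_male = 1321.08 / years
avg_female = 583.04 / years
avg_total = 961.71 / years

print(total_cases)
print(round(sex_ratio, 2), round(share_15_64, 2))
print(round(avg_male, 2), round(avg_female, 2), round(avg_total, 2))
```

The rounded results agree with the reported values: a ratio of 2.33, a share of 82.02%, and average annual rates of 110.09, 48.59, and 80.14 cases per 100,000 individuals.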
There were significant sex differences in the distribution of cases among age groups (trend χ2=12,544.4; P<.001). The number of TB cases in Sichuan Province peaked in individuals aged 20-24 years and gradually decreased in individuals older than 64 years. During these 12 years (2006-2017), the incidence rate in those aged 80-85 years was lower than in those aged 60-79 years, while individuals older than 70 years had the highest TB incidence peak.