The baseline characteristics of the covariates are given in Table 1. Approximately 15,400 women of childbearing age who had breastfed their babies were included in this study. Of these, around 27.6% were censored, meaning that their actual duration of breastfeeding is unknown; the remaining participants experienced the event and have a known duration of breastfeeding. Around 65.7% lived in rural areas, 42.6% of the women came from poor households, 18.2% came from middle-income households, and the rest came from high-income households. Table 1 also shows that approximately 63.6% of the participants had female children and 65.4% were unemployed. Of the 10,072 unemployed women, 27.7% were still breastfeeding and 72.3% had stopped breastfeeding (event). Of the 5,328 working women, 27.5% were still breastfeeding and 72.5% had stopped breastfeeding.
Among the participants, approximately 92.1% were non-smokers. About 17.9% of the women had secondary or higher education, 49.7% had an elementary school education, and the rest had no formal education. In terms of religion, Table 1 shows that approximately 41.1% of the participants were Orthodox, 39.4% were Muslim, and 17.9% were Protestant. By region, about 11.8% were from Addis Ababa, 12.0% from Oromia, 11.7% from the former SNNP region, 11.0% from Amhara, 7.1% from Afar, 7.2% from Benishangul-Gumuz, 7.3% from Dire Dawa, 6.5% from Gambela, 5.8% from Harari, and 10.8% from Tigray.
Nonparametric Survival Analysis
The non-parametric analysis was performed using Kaplan-Meier survival and hazard curves for the duration of breastfeeding, as shown in Fig. 1 and Fig. 2, respectively. The survival curve decreased steeply at the beginning and continued to decrease over time. This implies that most women breastfed heavily shortly after giving birth. The hazard curve, on the other hand, increased at an increasing rate initially and continued to increase over time.
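As a minimal sketch of how a Kaplan-Meier survival curve of this kind is computed, the product-limit estimate can be written in a few lines of Python. The function name `kaplan_meier` and its argument names are illustrative assumptions; this is not the study's code, and the software actually used is not stated in this section.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: observed breastfeeding durations (e.g. in months)
    events:    1 if breastfeeding stopped (event), 0 if censored
    Returns a list of (time, S(time)) pairs at each distinct event time.
    """
    # Number of events at each time, and number of subjects leaving
    # the risk set (event or censoring) at each time.
    n_events = Counter(t for t, e in zip(durations, events) if e)
    n_exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(n_exits):
        d = n_events.get(t, 0)
        if d:
            # Multiply by the conditional probability of "surviving" time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone observed at t (event or censored) leaves the risk set.
        at_risk -= n_exits[t]
    return curve
```

Censored women contribute to the risk set up to their last observed time but never trigger a drop in the curve, which is why censoring does not bias the estimate under independent censoring.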
Log-rank test
The log-rank test was used at the 5% level of significance to assess differences in survival time between the groups of each factor. The null hypothesis tested was that there is no difference between the groups in the probability of an event occurring at any given time.
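For a factor with two levels, the log-rank comparison can be sketched as follows. This is an illustrative implementation of the standard two-sample statistic, assuming 0/1 group labels; the function name `logrank_two_groups` is hypothetical, and factors with more than two levels would need the k-sample version of the test.

```python
import math


def logrank_two_groups(times, events, groups):
    """Two-sample log-rank test.

    times:  observed durations
    events: 1 if the event occurred, 0 if censored
    groups: 0 or 1 for each subject
    Returns (chi-square statistic with 1 df, p-value).
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Risk set sizes just before time t (overall and in group 1).
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == 1)
        # Events at time t (overall and in group 1).
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e and g == 1)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 under the null hypothesis.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A p-value below 0.05 would lead to rejecting the null hypothesis of equal survival experience in the two groups at the 5% level.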
Cox proportional hazards regression model
After comparing survival experience between groups of covariates, the next step was model development. The first step in the model-building process was to identify the explanatory variables that could be included in the linear component of a multivariable proportional hazards model. Cox proportional hazards regression models, including univariable models, were fitted. In the univariable analysis, place of residence, household wealth index, child’s sex, child’s age, women’s smoking status, birth interval, and women’s educational level were statistically significant for the variable of interest.
Model diagnostics for Cox proportional hazards model
In the current study, the two basic assumptions of the Cox regression model, log-linearity and proportional hazards, were tested, as shown in Table 2. The log-linearity test found that the relationship between the log hazard (or log cumulative hazard) and a covariate was linear. The proportional hazards test examines whether the ratio of the hazard functions for two individuals with different covariate values stays constant over time.
The global goodness-of-fit test in Table 2 shows that the Wald chi-square test statistic was significant, suggesting that the proportional hazards assumption was violated. In other words, the plots for the breastfeeding duration covariates were not parallel to each other. The residuals were not random but showed a systematic pattern: the smoothed plot did not look like a straight line and deviated from the horizontal line, which again indicates a violation of the proportional hazards assumption.
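For reference, the proportional hazards assumption checked above can be stated in the usual Cox model notation. The symbols here (baseline hazard h_0, coefficient vector β, covariate vectors x_1 and x_2) are standard textbook notation introduced for illustration and do not appear in the source.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\bigl(\boldsymbol{\beta}^{\top}\mathbf{x}\bigr),
\qquad
\frac{h(t \mid \mathbf{x}_1)}{h(t \mid \mathbf{x}_2)}
  = \exp\!\bigl(\boldsymbol{\beta}^{\top}(\mathbf{x}_1 - \mathbf{x}_2)\bigr)
```

The right-hand ratio does not depend on t, so under the assumption the hazard ratio between any two women is constant over time. A significant global test is evidence against this time-constancy.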
Because the proportional hazards assumption was not met, accelerated failure time (AFT) models, including univariable and multivariable analyses, were used for the current data analysis. Univariable AFT models were fitted for each covariate among the participants’ baseline characteristics. For the breastfeeding duration data, AFT models with Weibull, exponential, log-logistic, and log-normal distributions were considered. These models were compared using AIC and BIC, with the model having the smallest AIC and BIC taken as the one that fits the data best, as shown in Table 3. Table 3 shows that the Weibull distribution had the smallest AIC and BIC. Therefore, it was chosen for the univariable and multivariable analyses in the current study.
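The AIC/BIC comparison described above can be sketched as follows. The helper names (`aic`, `bic`, `rank_models`) are hypothetical, and the sketch assumes that BIC uses the number of observations as the sample size (some software uses the number of events instead). Log-likelihoods and parameter counts would come from the fitted AFT models; none are reproduced here.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def rank_models(fits, n_obs):
    """Score candidate models and pick the smallest AIC and BIC.

    fits: dict mapping a model name (e.g. "weibull") to a tuple
          (log_likelihood, n_params) taken from the fitted model.
    Returns (scores, best_by_aic, best_by_bic), where scores maps
    each model name to its (AIC, BIC) pair.
    """
    scores = {
        name: (aic(ll, k), bic(ll, k, n_obs))
        for name, (ll, k) in fits.items()
    }
    best_by_aic = min(scores, key=lambda name: scores[name][0])
    best_by_bic = min(scores, key=lambda name: scores[name][1])
    return scores, best_by_aic, best_by_bic
```

Both criteria penalize the maximized log-likelihood by model size; BIC penalizes additional parameters more heavily once the sample size exceeds about eight observations, so the two criteria can disagree for models with different numbers of parameters.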
In the univariable Weibull AFT models, women’s age, smoking status, place of residence, household wealth index, child’s sex, women’s educational level, child’s age, and birth interval were significantly associated with the duration of breastfeeding at the 5% level of significance. The summary of the univariable analysis is given in Table 4. Variables that were significant at the 5% level in the univariable models were included in the multivariable analysis, which is presented in Table 5. In Table 5, the main effects of the covariates, namely place of residence, age of the women, birth interval, educational level of the women, smoking status, and age of the child, were considered as potential predictors for the variable of interest. The multivariable Weibull AFT model and the corresponding AIC and BIC values are reported in Table 5.
The results of the Weibull AFT model in Table 5 show that educational level, women’s age, birth interval, place of residence, child’s sex, smoking status, and wealth status were statistically significant for the variable of interest.
Analysis and model comparisons for the parametric shared frailty model
To test the effect of region on the variable of interest, multivariable survival analysis was performed with a gamma shared frailty term. The covariates used were place of residence, woman’s level of education, religion, employment status, child’s age, child’s sex, wealth index, smoking status, and birth interval. The AIC and BIC were used to compare the candidate parametric shared frailty models, with the model having the smallest AIC and BIC considered the best.
Parametric shared frailty models with gamma, log-normal, and inverse Gaussian distributions were fitted and compared, treating the women’s regions as the frailty term. The effect of the random component (frailty) was significant, and the Weibull gamma shared frailty model had the smallest AIC and BIC values. The final Weibull gamma shared frailty model is given in Table 6.
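As a sketch of what a shared gamma frailty term means, assume the usual multiplicative formulation, in which all women j in region i share an unobserved factor z_i that scales their hazard. The notation (z_i, θ, S_ij) is introduced here for illustration; the source does not spell out its parameterization.

```latex
S_{ij}(t \mid z_i) = \bigl[S_{ij}(t)\bigr]^{z_i},
\qquad
g(z) = \frac{z^{1/\theta - 1}\,\exp(-z/\theta)}{\Gamma(1/\theta)\,\theta^{1/\theta}},
\quad z > 0
```

Here S_ij(t) is the survival function of the Weibull AFT model without frailty, and g is a gamma density with mean 1 and variance θ. A value of θ close to zero would indicate little unexplained between-region heterogeneity, whereas a significant θ indicates that breastfeeding durations are correlated within regions.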
Table 6 shows that place of residence had a significant impact on breastfeeding duration. The expected duration of breastfeeding was 4% shorter for urban women than for rural women, with the other covariates held constant (Φ = 0.96; 95% CI: (0.94, 0.97); p-value = 0.001). This suggests that rural women breastfed longer than urban women.
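The percentage statements attached to each acceleration factor Φ follow from simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes the usual AFT convention that Φ is the exponential of a regression coefficient on the log-duration scale.

```python
import math


def acceleration_factor(beta):
    """Acceleration factor for a coefficient on the log-time scale."""
    return math.exp(beta)


def percent_change(phi):
    """Percentage change in expected duration implied by a factor phi.

    phi < 1 shortens the expected duration; phi > 1 lengthens it.
    """
    return (phi - 1.0) * 100.0


# Reported factor for urban versus rural residence (phi = 0.96):
# (0.96 - 1) * 100, i.e. about a 4% shorter expected duration.
urban_vs_rural = percent_change(0.96)
```

The same calculation applies to every Φ reported from Table 6: a factor of 1.13 corresponds to a 13% longer expected duration, and a factor of 1.60 to a 60% longer one.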
Education level had a significant impact on the variation in breastfeeding duration. The expected duration of breastfeeding was 3% longer for uneducated women than for women with secondary education and above, with the other covariates held constant (Φ = 1.03; 95% CI: (1.00, 1.06); p-value = 0.039). Similarly, the expected duration of breastfeeding was 13% longer for women with elementary education than for women with secondary education and above, with the other covariates held constant (Φ = 1.13; 95% CI: (1.11, 1.15); p-value < 0.001). This result indicates that women with higher levels of education had a shorter duration of breastfeeding than women with no or lower levels of education.
The age of the child also played a significant role in the variation in breastfeeding duration. For each one-month increase in the child’s age, the expected duration of breastfeeding decreased by 1%, with the other covariates held constant (Φ = 0.99; 95% CI: (0.76, 0.99); p-value < 0.001). Thus, increasing child age was associated with a shorter duration of breastfeeding.
The smoking status of the women also played a significant role in the variation in breastfeeding duration. The expected duration of breastfeeding was 60% longer for non-smokers than for smokers, with the other covariates held constant (Φ = 1.60; 95% CI: (1.57, 1.63); p-value < 0.001).
The interval between consecutive births also had a significant influence on the variation in breastfeeding duration. The expected duration of breastfeeding was 2% longer for a woman with a birth interval of 2-3 years than for a woman with a birth interval of < 2 years, with the other covariates held constant (Φ = 1.02; 95% CI: (1.09, 1.25); p-value < 0.027). Similarly, the expected duration of breastfeeding was 28% longer for a woman with a birth interval of > 4 years than for a woman with a birth interval of < 2 years (Φ = 1.28; 95% CI: (1.06, 1.43); p-value < 0.01).
The age of the women was another significant variable for the variation in breastfeeding duration. The expected duration of breastfeeding was 4% longer for a woman aged 40-44 than for a woman aged 15-19, with the other covariates held constant (Φ = 1.041; 95% CI: (1.01, 1.22); p-value = 0.019), and 3% longer for a woman aged 35-39 than for a woman aged 15-19 (Φ = 1.030; 95% CI: (1.01, 1.55); p-value = 0.005). Thus, in the current study, older women tended to breastfeed for longer.