The rising frequency of business insolvencies underscores the critical need for advanced predictive tools that offer real-time risk assessment. Traditional bankruptcy prediction models, which primarily depend on retrospective financial statement data, often fail to provide timely insights due to reporting delays and data incompleteness. This study proposes a machine learning-based framework for proactive bankruptcy prediction, leveraging ensemble learning techniques for greater accuracy and reliability. The analysis draws on a robust dataset comprising 2,800 firm-year observations from companies listed on the Bombay Stock Exchange (BSE) over the period 2020 to 2025. Quarterly financial indicators and market-based variables were systematically compiled to ensure comprehensive coverage of firm performance. To enhance model performance and mitigate overfitting, three advanced ensemble algorithms—XGBoost, AdaBoost, and Random Forest—were implemented. These models combine bagging and boosting mechanisms to optimize predictive capabilities. Recursive Feature Elimination (RFE) was used to identify the most significant predictors influencing financial distress. The study empirically tested four hypotheses: liquidity constraints as indicators of distress; operational cash flow deficits correlating with default risk; high debt-to-equity ratios signaling instability; and declining return on assets (ROA) as an early warning sign. The results demonstrate that the ensemble-based approach delivers high classification accuracy, with key variables conforming to recognized financial risk indicators. The findings highlight the potential of machine learning models to serve as early detection tools, aiding firms and investors in strategic decision-making. This research emphasizes the transformative role of artificial intelligence in improving financial risk management systems, enabling more dynamic and informed responses to emerging threats of bankruptcy.
Early bankruptcy prediction based on financial ratios has emerged as a crucial instrument in financial risk management, allowing stakeholders to identify signs of financial distress long before a business goes bankrupt. Financial ratios such as liquidity ratios are derived from balance sheets, income statements, and cash flow statements, and they provide numerical information about a business’s financial and operational performance. These ratios can be examined over time to identify declining patterns that might point to an increasing bankruptcy risk. Traditional models such as Altman’s Z-score have long combined these ratios into a single predictive measure, but more recent methods use machine learning algorithms to improve accuracy by capturing intricate non-linear relationships among variables. Early bankruptcy prediction helps investors avoid possible losses, helps managers take corrective action, aids lenders in determining credit risk, and helps regulators protect the financial system. Overall, this predictive ability is essential to maintaining financial stability, encouraging transparency, and guiding well-informed corporate decision-making.
Figure 1 Bankruptcy prediction using financial ratios
Machine learning techniques have been thoroughly investigated in the field of bankruptcy prediction, as shown in Figure 1. Their use has demonstrated a notable increase in classification accuracy, particularly when contrasted with conventional statistical models. Numerous algorithms, including ensemble methods, decision trees, and support vector machines, have been used in various industries to improve prediction reliability [1]. Additionally, financial ratios and artificial intelligence models were successfully combined to provide more complex risk evaluations. By combining quantitative metrics with AI-driven techniques, businesses were able to identify financial distress more accurately and before actual insolvency [2].
Subsequently, machine learning models applied exclusively to American companies proved highly effective in identifying early warning indicators of bankruptcy, thereby promoting economic stability. These models took into account complex financial patterns and trends that traditional models frequently missed [3]. Moreover, a targeted investigation of the U.S. healthcare industry emphasized that integrating machine learning techniques with domain-specific financial ratios enabled a deeper understanding of risk, especially in high-liability sectors [4]. A study predicting bankruptcy among Polish non-public companies likewise found that combining machine learning techniques with historical data enhanced the ability to predict insolvency.
Compared to conventional credit scoring techniques, these models showed higher accuracy and were able to adjust to sector-specific factors [5]. In addition, a study of micro and small businesses in the Lithuanian construction industry included macroeconomic and non-financial factors alongside financial indicators, highlighting the multifaceted character of bankruptcy risk [6]. Similarly, a study carried out in Spain compared machine learning models with conventional analytical methods and found that, although financial ratios remained important, AI-based approaches provided greater accuracy and flexibility in predicting insolvency [7]. Furthermore, integrating macroeconomic trends and corporate governance indicators with financial ratios in Indonesia added a more thorough perspective on predicting financial distress and improved the model’s robustness [8]. Meanwhile, a meta-analysis that examined bankruptcy prediction models between 2010 and 2022 revealed a trend toward the use of sophisticated computational models. The review identified methodological flaws, important contributors, and new trends that have influenced the present course of research in this field [9]. Another systematic review supported this, highlighting the predictive power of numerical indicators such as liquidity, profitability, and solvency ratios and confirming their usefulness in both conventional and AI-driven models [10].
The development of bankruptcy predictors using a combination of macroeconomic, non-financial, and financial factors was also emphasized. This all-encompassing framework enabled a more accurate understanding of business failures, particularly during uncertain economic times [11]. Furthermore, by incorporating operational and behavioral data, a novel model proposed for Indian companies went beyond financial ratios and emphasized the significance of contextual factors in bankruptcy prediction [12]. Additionally, a predictive model created for Indonesian businesses during economic downturns used both external and internal indicators to predict financial distress more accurately. The model was modified to account for the region’s distinct financial environment [13]. Financial ratios such as the debt-to-equity ratio, interest coverage, and current ratio were also successfully used in Sri Lanka to forecast distress in listed companies, demonstrating their applicability in emerging markets [14]. On a different note, text mining techniques were used to extract linguistic cues from corporate annual reports, and these communicative elements were then used to train machine learning models and improve bankruptcy predictions [15].
Furthermore, comparative cognitive modeling with a variety of AI algorithms showed that no single model was always better, but ensemble methods frequently yielded the best results across a range of datasets [16]. In Kenya, profitability ratios were also found to be important predictors of dairy cooperative bankruptcy in a model that also took local operational constraints into account [17]. Similarly, a 2020–2023 study of an Indonesian tech conglomerate using the DuPont system and the Altman Z-score demonstrated successful early detection of possible financial failures using composite financial performance metrics [18]. The post-pandemic era then presented additional difficulties for predicting bankruptcy, especially in the Visegrad nations, where patterns of economic recovery affected the predictive ability of conventional indicators. As a result, improved models that took regional dynamics into consideration were required [19]. The necessity for transparent and interpretable AI systems was also highlighted by a systematic review of AI’s role in preventing and identifying bankruptcy in financial institutions, which found both opportunities and ethical challenges [20]. Finally, a study that questioned the dominance of AI in bankruptcy prediction concluded that, although AI models frequently performed better than conventional ones, their efficacy depended largely on contextual relevance, model transparency, and data quality. The results supported combining AI and traditional financial analytics in a balanced manner [21].
This section explains the entire methodology, which includes data collection, data measurement, preprocessing, tool usage, the structured research methodology, the underlying proposed technique, and hypothesis validation, in order to assess the predictive ability of ensemble machine learning techniques in real-time corporate bankruptcy detection. Based on secondary data collected from publicly traded companies on the Bombay Stock Exchange (BSE) between 2020 and 2025, the methodology is empirical in nature. Robust modeling and statistical inference are made possible by this empirical technique, which guarantees the accuracy of financial and market-based indicators. In order to create a real-time bankruptcy warning system, the techniques used seek to combine algorithmic optimization, statistical rigor, and predictive analytics.
Data Collection
The study’s dataset consists of 2,800 company-year observations spanning the six years from 2020 to 2025. The data were drawn from official BSE records, company annual reports, quarterly financial statements, and financial databases such as Capitaline and CMIE Prowess. Businesses from a variety of industries, including manufacturing, services, infrastructure, technology, and finance, are represented in the sample. A crucial aspect of the data collection process was the demographic profiling of companies according to factors such as listing tenure, market capitalization, firm size (as determined by total assets), and sectoral classification. Table 1 provides an overview of the dataset’s demographic composition.
Table 1 Data collection
Sector | No. of Firms | Avg. Total Assets (INR Crores) | Avg. Market Cap (INR Crores) | Listing Tenure (Years)
Manufacturing | 600 | 3,500 | 2,800 | 15
Services | 500 | 2,700 | 3,200 | 12
Infrastructure | 400 | 4,100 | 2,100 | 10
Technology | 700 | 2,000 | 6,000 | 8
Financial Services | 600 | 5,200 | 4,800 | 20
This table presents a stratified profile of the sample used, ensuring sectoral representation and heterogeneity in firm characteristics to improve generalizability of the findings.
Data Measurement
A set of market indicators and standardized financial ratios, calculated quarterly, was used to evaluate each company’s financial health. The following key metrics were measured: (i) the debt-to-equity ratio, to show levels of leverage; (ii) operating cash flow, to capture internal cash generation capabilities; (iii) the current ratio, to evaluate short-term liquidity; and (iv) return on assets (ROA), to show operational efficiency. These metrics, which adhere to SEBI’s required reporting format, were calculated from publicly available quarterly financial statements. To capture current investor sentiment and market perception, market indicators such as share price volatility, beta values, and trading volume trends were also included. Each variable was normalized using the Z-score transformation to remove scale discrepancies and prepare the data for further processing.
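The Z-score transformation mentioned above can be sketched as follows. This is a minimal illustration with NumPy; the function name `zscore` and the sample values are hypothetical and are not taken from the study's data.

```python
import numpy as np

def zscore(columns: np.ndarray) -> np.ndarray:
    """Standardize each column to zero mean and unit variance."""
    mean = columns.mean(axis=0)
    std = columns.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    return (columns - mean) / std

# Hypothetical values: columns are current ratio, debt-to-equity, ROA (%)
ratios = np.array([
    [1.8, 1.2, 5.0],
    [0.9, 3.4, -2.1],
    [2.5, 0.6, 9.3],
    [1.4, 1.9, 3.2],
])
scaled = zscore(ratios)
```

After the transformation, each column has mean 0 and standard deviation 1, so ratios measured on very different scales become directly comparable.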
Data Preprocessing
Extensive preprocessing steps were carried out before model creation to guarantee data consistency, integrity, and applicability for machine learning applications. In order to maintain data structure and variability, missing values—which made up around 3.5% of the data—were first imputed using multiple imputation by chained equations (MICE). Outliers were identified using the Interquartile Range (IQR) method and winsorized at the 1st and 99th percentiles to prevent undue influence on model estimates. Multicollinearity among independent variables was assessed through the Variance Inflation Factor (VIF), and variables with VIF > 5 were excluded. Feature selection was performed using Recursive Feature Elimination (RFE) with cross-validation to retain the most predictive features. Categorical variables were encoded using one-hot encoding where necessary, and all inputs were standardized to zero mean and unit variance to support convergence of ensemble algorithms.
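Two of the preprocessing steps described above, percentile winsorization and VIF screening, can be sketched with NumPy alone. This is an illustrative sketch on synthetic data, not the study's code; the helper names `winsorize` and `vif` are assumptions, and only the 1st/99th percentile limits and the VIF > 5 cutoff come from the text.

```python
import numpy as np

def winsorize(x: np.ndarray, lower: float = 1.0, upper: float = 99.0) -> np.ndarray:
    """Clip each column at its lower and upper percentiles."""
    lo, hi = np.percentile(x, [lower, upper], axis=0)
    return np.clip(x, lo, hi)

def vif(x: np.ndarray) -> np.ndarray:
    """VIF_j = 1 / (1 - R_j^2), regressing column j on all other columns."""
    n, p = x.shape
    out = np.empty(p)
    for j in range(p):
        target = x[:, j]
        others = np.column_stack([np.ones(n), np.delete(x, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid.var() / target.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

# Synthetic features: column c is nearly collinear with column a
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.1 * rng.normal(size=500)
features = winsorize(np.column_stack([a, b, c]))
scores = vif(features)
keep = scores <= 5.0  # cutoff stated in the text
```

Here both collinear columns exceed the cutoff. In practice one would usually drop the variable with the highest VIF, recompute, and repeat, rather than remove every flagged variable at once.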
Data Tool
The analysis and modeling were carried out in the Python programming language, primarily with the Scikit-learn library for machine learning and data processing operations. Additional libraries included Pandas for data manipulation, NumPy for numerical operations, Matplotlib and Seaborn for visualization, and XGBoost and LightGBM for advanced ensemble techniques. The machine learning pipeline was developed using Scikit-learn’s Pipeline module, enabling integrated preprocessing, model fitting, and evaluation. To ensure computational efficiency and repeatability, the final models were deployed and evaluated on an Intel Core i9 computer with 32 GB of RAM using JupyterLab.
Proposed methodology
To create a trustworthy early-warning system for corporate bankruptcy detection using ensemble machine learning, the research methodology followed a structured multi-step process (Figure 2). The dataset was first prepared through feature transformation, data wrangling, and cleaning procedures. Feature selection was then performed using Recursive Feature Elimination with 10-fold cross-validation to ensure that only predictive and statistically significant variables were kept. Next, a training set consisting of 80% of the data was used to train several ensemble classifiers, including XGBoost, AdaBoost, and Random Forest, on the selected features. The remaining 20% of the data was used for testing. To enhance generalization, grid search with stratified K-fold cross-validation (K=10) was used to optimize hyperparameters.
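The workflow described above (80/20 split, Recursive Feature Elimination with cross-validation, then grid search with stratified 10-fold cross-validation) can be sketched with scikit-learn. This is a minimal sketch under stated assumptions: the data are synthetic, Random Forest stands in for all three classifiers, and the hyperparameter grid is a small illustrative one rather than the grid used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Synthetic stand-in for the firm-level feature matrix (not the study's data)
X, y = make_classification(n_samples=400, n_features=8, n_informative=4,
                           weights=[0.8, 0.2], random_state=0)

# 80/20 split, stratified so both sets keep the same class balance
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Recursive Feature Elimination with cross-validation, fitted on training data only
selector = RFECV(RandomForestClassifier(n_estimators=25, random_state=0),
                 step=1, cv=cv, scoring="roc_auc")
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Grid search over a deliberately small, illustrative hyperparameter grid
grid = GridSearchCV(RandomForestClassifier(n_estimators=25, random_state=0),
                    param_grid={"max_depth": [3, None], "min_samples_leaf": [1, 5]},
                    cv=cv, scoring="roc_auc")
grid.fit(X_train_sel, y_train)

# With scoring="roc_auc", score() returns the AUC on the held-out 20%
test_auc = grid.score(X_test_sel, y_test)
```

Fitting the selector on the training portion only keeps the held-out 20% untouched until the final evaluation, which avoids leaking test information into feature selection.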
Figure 2 Proposed model
Ensemble learning was executed in two phases: bagging using Random Forest to reduce variance, and boosting using XGBoost and AdaBoost to reduce bias and sequentially improve prediction accuracy. The final step involved model comparison using performance metrics including Area Under the Receiver Operating Characteristic Curve (AUC-ROC), Precision-Recall, F1-score, and Matthews Correlation Coefficient (MCC). Feature importance was extracted and aligned with financial theory to validate interpretability.
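Of the metrics listed above, AUC-ROC is perhaps the least obvious to compute by hand. One way to read it is as the probability that a randomly chosen bankrupt firm receives a higher score than a randomly chosen non-bankrupt firm. The sketch below implements that pairwise definition in plain Python; the labels and scores are hypothetical.

```python
def auc_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = bankrupt) and model scores
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = auc_roc(labels, scores)
```

The pairwise loop is quadratic in the number of observations, so it is suited to illustration rather than large datasets.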
Proposed Technique
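The proposed hybrid ensemble technique integrates both bagging and boosting approaches to leverage the advantages of each while minimizing their weaknesses. Let the input feature space be denoted as $X = \{x_1, x_2, \ldots, x_n\}$ and the binary target variable as $Y \in \{0, 1\}$, representing bankrupt or not bankrupt. The technique is expressed in equations 1 to 7.
Let $f_i(X)$ be the prediction from the $i$-th base learner. Each $f_i$ is trained on a bootstrap sample:
$$f_i(X) = f(X; \theta_i) \tag{1}$$
where $\theta_i$ denotes the parameters specific to the $i$-th model trained on resampled data.
In bagging, the final prediction $F_B(X)$ is the average of all $M$ base predictions:
$$F_B(X) = \frac{1}{M} \sum_{i=1}^{M} f_i(X) \tag{2}$$
This reduces variance and stabilizes prediction accuracy across different subsets of data.
Boosting iteratively improves weak classifiers using weighted errors:
$$F_t(X) = F_{t-1}(X) + \alpha_t h_t(X) \tag{3}$$
where $\alpha_t$ is the learning rate and $h_t(X)$ is the weak hypothesis at iteration $t$.
For boosting, the objective is to minimize an additive loss function:
$$\mathcal{L}_t = \sum_{j} l\big(Y_j,\; F_{t-1}(X_j) + \alpha_t h_t(X_j)\big) \tag{4}$$
where $l(\cdot)$ is a differentiable loss function such as binary cross-entropy or logistic loss, and the sum runs over the training observations $(X_j, Y_j)$.
Each tree in XGBoost is defined as:
$$f(X) = w_{q(X)} \tag{5}$$
where $q$ maps an instance to a leaf index, $w$ is the vector of leaf weights, and $T$ is the number of leaves.
XGBoost optimizes the following objective:
$$\text{Obj} = \sum_{j} l\big(Y_j, \hat{Y}_j\big) + \sum_{k} \Omega(f_k) \tag{6}$$
where $\hat{Y}_j$ is the prediction for observation $j$, $f_k$ is the $k$-th tree, and $\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^2$ represents regularization on the number of leaves and leaf weights to prevent overfitting.
The final output prediction $\hat{Y}$ combines bagging and boosting as:
$$\hat{Y} = \beta_1 F_B(X) + \beta_2 F_{\text{boost}}(X) \tag{7}$$
where $F_{\text{boost}}(X)$ is the final boosted model from equation 3 and $\beta_1 + \beta_2 = 1$ are ensemble weights determined through optimization on validation performance.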
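The text does not specify how the ensemble weights β1 and β2 are optimized. One simple possibility, shown here purely as an assumption, is a grid search over β1 in [0, 1] that minimizes validation log loss, with β2 = 1 − β1. The names `fit_blend_weight`, `p_bag`, and `p_boost` and all numeric values are hypothetical.

```python
import numpy as np

def log_loss(y_true: np.ndarray, p: np.ndarray) -> float:
    """Binary cross-entropy, with probabilities clipped away from 0 and 1."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

def fit_blend_weight(y_val, p_bag, p_boost, steps: int = 101):
    """Grid-search beta_1 in [0, 1]; beta_2 is fixed to 1 - beta_1."""
    candidates = np.linspace(0.0, 1.0, steps)
    losses = [log_loss(y_val, b * p_bag + (1 - b) * p_boost) for b in candidates]
    best = int(np.argmin(losses))
    return float(candidates[best]), float(1.0 - candidates[best])

# Hypothetical validation labels and predicted bankruptcy probabilities
y_val = np.array([0, 0, 1, 1, 0, 1])
p_bag = np.array([0.2, 0.4, 0.6, 0.7, 0.3, 0.5])    # bagging output
p_boost = np.array([0.1, 0.3, 0.8, 0.9, 0.4, 0.6])  # boosting output

beta1, beta2 = fit_blend_weight(y_val, p_bag, p_boost)
y_hat = beta1 * p_bag + beta2 * p_boost
```

Because the grid includes β1 = 0 and β1 = 1, the blended prediction can never have a higher validation loss than either component on its own.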
Hypothesis
The following hypotheses serve as the study's compass; each is examined using the proper statistical tests in the machine learning pipeline and confirmed by the results of the ensemble model:
H1—Liquidity constraints are a key indicator of financial distress
H2—Operating cash deficits are strongly linked to default risk
H3—Elevated debt-to-equity levels signal impending failure
H4—A downward trend in ROA serves as an early warning.
This research examines the prediction of corporate bankruptcy in Indian companies between 2020 and 2025, assessing sophisticated ensemble-based machine learning models and concentrating on the financial indicators that influence the likelihood of bankruptcy. The analysis comprises a synopsis of the dataset’s statistical characteristics, significant features, model evaluation outcomes, conclusions from hypothesis testing, sectoral risk segmentation, and temporal bankruptcy risk patterns.
Dataset Characteristics and Statistical Overview
Over a six-year period (2020–2025), 2,800 company-year records were gathered quarterly to make up the financial dataset examined in this study. Important details about these organizations' financial structures were uncovered by the descriptive data (Table 2). The average Liquidity Ratio stood at 1.83 with a standard deviation of 0.65, indicating modest variability in short-term solvency across firms. The Operating Cash Flow, measured in crores of rupees, averaged ₹152.4 Cr, though it exhibited considerable dispersion (SD = ₹98.6 Cr), with some companies showing negative flows as low as ₹-202.5 Cr. The Debt-to-Equity Ratio had a mean value of 1.54 and a notably high skewness (1.04), suggesting that some firms carried significantly higher leverage. Return on Assets (ROA) had an average of 5.21%, with a wide range from -8.11% to 15.8%, hinting at differing profitability levels. Lastly, Market Capitalization displayed the highest variability and skewness (mean = ₹3,212 Cr, max = ₹29,780 Cr), reflecting the heterogeneity in company sizes across the dataset.
Table 2: Summary of Dataset Statistics (2020–2025)
Quarterly Data of 2800 company-year records
Metric | Mean | Std Dev | Min | Max | Skewness | Kurtosis
Liquidity Ratio | 1.83 | 0.65 | 0.22 | 4.72 | 0.84 | 3.26
Operating Cash Flow (₹ Cr) | 152.4 | 98.6 | -202.5 | 402.6 | -0.22 | 4.18
Debt-to-Equity Ratio | 1.54 | 1.12 | 0.01 | 7.32 | 1.04 | 5.01
Return on Assets (ROA %) | 5.21 | 4.42 | -8.11 | 15.8 | -0.71 | 2.89
Market Capitalization (₹ Cr) | 3,212 | 5,414 | 124 | 29,780 | 1.26 | 6.42
Financial Feature Significance in Bankruptcy Prediction
Using XGBoost combined with Recursive Feature Elimination (RFE), the analysis identified the top financial indicators driving bankruptcy risk (Table 3). The Debt-to-Equity Ratio emerged as the most critical feature with an importance score of 0.273, highlighting the role of financial leverage in corporate failure. Following this, the Liquidity Ratio (0.214) and ROA (0.191) also held considerable predictive power. Operating Cash Flow ranked fourth, reinforcing the connection between poor cash generation and distress. Features such as Interest Coverage Ratio, Asset Turnover, Quick Ratio, and EPS Growth were found to be relatively less impactful.
Table 3: Feature Importance from XGBoost with RFE
Rank | Feature | Importance Score
1 | Debt-to-Equity Ratio | 0.273
2 | Liquidity Ratio | 0.214
3 | ROA (%) | 0.191
4 | Operating Cash Flow | 0.166
5 | Interest Coverage Ratio | 0.082
6 | Asset Turnover Ratio | 0.037
7 | Quick Ratio | 0.022
8 | EPS Growth (QoQ) | 0.015
Model Performance and Comparative Evaluation
The ensemble learning models were evaluated based on multiple performance metrics including accuracy, precision, recall, F1-score, and AUC-ROC (Table 4). XGBoost outperformed the others, achieving an accuracy of 94.1%, precision of 0.93, recall of 0.91, and a high AUC-ROC score of 0.97, underscoring its robust classification capability. Random Forest also delivered strong results (accuracy = 92.5%, AUC = 0.95), while AdaBoost lagged slightly behind with an accuracy of 89.7%. The superiority of XGBoost was further supported by its balance between sensitivity and specificity, making it the most reliable model in this context.
Table 4 Model Accuracy & Performance Metrics
Model | Accuracy (%) | Precision | Recall | F1-Score | AUC-ROC
Random Forest | 92.5 | 0.91 | 0.89 | 0.90 | 0.95
AdaBoost | 89.7 | 0.87 | 0.85 | 0.86 | 0.92
XGBoost | 94.1 | 0.93 | 0.91 | 0.92 | 0.97
Confusion Matrix Insights for XGBoost
The confusion matrix results (Table 5) highlighted the real-world applicability of the XGBoost model. Out of the total instances, it correctly predicted 2,115 companies as non-bankrupt and 521 as bankrupt, with only 79 false negatives and 85 false positives. This indicated a low misclassification rate and strong generalization capability, particularly in correctly identifying actual bankruptcy cases.
Table 5: Confusion Matrix for XGBoost Model
Actual class | Predicted: No Bankruptcy | Predicted: Bankruptcy
Actual: No | 2,115 | 85
Actual: Yes | 79 | 521
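The summary metrics implied by the confusion matrix can be recomputed directly from its four counts. The sketch below uses the values in Table 5 and the standard definitions, treating bankruptcy as the positive class; it is a worked check, not the study's evaluation code.

```python
import math

# Counts taken from Table 5 (XGBoost confusion matrix)
tn, fp, fn, tp = 2115, 85, 79, 521

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)      # bankrupt-class precision
recall = tp / (tp + fn)         # bankrupt-class recall (sensitivity)
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
print(f"specificity={specificity:.3f} f1={f1:.3f} mcc={mcc:.3f}")
```

The accuracy obtained this way is about 94.1%, in line with the XGBoost figure in Table 4. The class-specific precision and recall computed here need not coincide with the Table 4 values if those were averaged across classes or computed on a different data split.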
Hypotheses Testing Outcomes
The research also empirically validated several financial hypotheses related to bankruptcy (Table 6). The Liquidity Ratio showed a statistically significant impact (t = 3.91, p < 0.001), confirming that reduced liquidity predicts failure. Negative Operating Cash Flow correlated strongly with bankruptcy risk (z = -4.72, p < 0.0001). A high Debt-to-Equity Ratio was also associated with increased failure probability (χ² = 16.2, p = 0.001). Finally, declining ROA preceded bankruptcy events (t = -2.87, p = 0.0043), further affirming the importance of profitability in sustaining business viability.
Table 6: Hypotheses Testing Summary
Hypothesis | Test Statistic | p-Value | Significance | Conclusion
H1 | t = 3.91 | 0.0001 | Yes | Liquidity Ratio predicts bankruptcy
H2 | z = -4.72 | < 0.0001 | Yes | Neg. Operating Cash Flow correlates
H3 | χ² = 16.2 | 0.001 | Yes | High D/E Ratio increases failure
H4 | t = -2.87 | 0.0043 | Yes | Falling ROA precedes bankruptcy
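As a worked illustration of how a test statistic maps to a p-value, the sketch below converts a z statistic into a two-sided p-value under a standard normal reference distribution. The two-sided convention is an assumption, since the text does not state which was used; the function name is hypothetical.

```python
import math

def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

p_h2 = two_sided_p_from_z(-4.72)  # z statistic reported for H2
```

For the H2 statistic, the resulting p-value falls well below 0.0001, consistent with the significance reported for that hypothesis.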
ROC Curve Threshold Behavior
To further assess model discrimination, the ROC curve thresholds were analyzed across the models (Table 7). At a threshold of 0.6, XGBoost maintained a true positive rate (TPR) of 0.85 with a false positive rate (FPR) of 0.12, showing optimal trade-offs. In comparison, Random Forest also performed consistently well with TPR/FPR of 0.84/0.10 at the same threshold. These trends suggested that ensemble models, especially XGBoost, effectively maintained high sensitivity with controlled false alarms across various decision thresholds.
Table 7: ROC Curve Values for Models (Selected Thresholds)
Threshold | XGBoost (TPR/FPR) | AdaBoost (TPR/FPR) | Random Forest (TPR/FPR)
0.2 | 0.98 / 0.42 | 0.94 / 0.48 | 0.96 / 0.39
0.4 | 0.91 / 0.21 | 0.88 / 0.25 | 0.89 / 0.19
0.6 | 0.85 / 0.12 | 0.81 / 0.16 | 0.84 / 0.10
0.8 | 0.76 / 0.05 | 0.69 / 0.08 | 0.72 / 0.03
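One common way to make the trade-off between sensitivity and false alarms concrete is Youden's J statistic, J = TPR − FPR. The study does not state that this criterion was used, so the sketch below is only an illustration applied to the values in Table 7.

```python
# (TPR, FPR) pairs copied from Table 7, keyed by decision threshold
roc_points = {
    "XGBoost": {0.2: (0.98, 0.42), 0.4: (0.91, 0.21),
                0.6: (0.85, 0.12), 0.8: (0.76, 0.05)},
    "AdaBoost": {0.2: (0.94, 0.48), 0.4: (0.88, 0.25),
                 0.6: (0.81, 0.16), 0.8: (0.69, 0.08)},
    "Random Forest": {0.2: (0.96, 0.39), 0.4: (0.89, 0.19),
                      0.6: (0.84, 0.10), 0.8: (0.72, 0.03)},
}

def best_threshold(points):
    """Return the (threshold, J) pair that maximizes Youden's J = TPR - FPR."""
    return max(((thr, tpr - fpr) for thr, (tpr, fpr) in points.items()),
               key=lambda item: item[1])

best = {model: best_threshold(points) for model, points in roc_points.items()}
```

Under this criterion, all three models reach their highest J among the listed thresholds at 0.6.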
Sectoral Bankruptcy Risk Distribution
Sector-wise bankruptcy risk revealed notable trends (Table 8). The Financial Services sector showed the highest bankruptcy rate at 21.2%, indicating elevated systemic vulnerabilities. The Infra & Realty (13.2%), Technology (12.1%), and Manufacturing (11.9%) sectors also exhibited relatively high bankruptcy predictions. Conversely, sectors like FMCG (4.6%) and Pharma & Health (5.5%) appeared more resilient during the study period. These sectoral insights provided valuable guidance for industry-specific risk mitigation strategies.
Table 8 Sector-Wise Bankruptcy Predictions (Sample)
Sector | Companies | Bankruptcies Predicted | Bankruptcy Rate (%)
Manufacturing | 620 | 74 | 11.9
Pharma & Health | 380 | 21 | 5.5
Technology | 340 | 41 | 12.1
Energy & Utilities | 290 | 18 | 6.2
Financial Services | 420 | 89 | 21.2
FMCG | 280 | 13 | 4.6
Infra & Realty | 470 | 62 | 13.2
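The bankruptcy rates in Table 8 follow from dividing predicted bankruptcies by the number of companies in each sector. The short sketch below reproduces that arithmetic from the table's own counts and also derives the overall rate across sectors.

```python
# (companies, predicted bankruptcies) per sector, copied from Table 8
sectors = {
    "Manufacturing": (620, 74),
    "Pharma & Health": (380, 21),
    "Technology": (340, 41),
    "Energy & Utilities": (290, 18),
    "Financial Services": (420, 89),
    "FMCG": (280, 13),
    "Infra & Realty": (470, 62),
}

# Rate in percent, rounded to one decimal place as in the table
rates = {name: round(100 * failed / n, 1) for name, (n, failed) in sectors.items()}

total_companies = sum(n for n, _ in sectors.values())
total_predicted = sum(failed for _, failed in sectors.values())
overall_rate = round(100 * total_predicted / total_companies, 1)
```

Sorting `rates` in descending order gives the ranking discussed in the text, with Financial Services first and FMCG last.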
Temporal Bankruptcy Risk Score Patterns
An analysis of quarterly bankruptcy risk scores from 2020 to 2025 (Table 9) showed an overall downward trend. The average risk score started at 0.63 in Q1-2020, rose to a peak of 0.71 in Q3-2020, and then declined gradually to 0.37 by Q4-2025. This decrease suggested improving financial health or more cautious financial practices post-pandemic. The number of actual bankruptcy events also fell substantially, from 38 in Q1-2020 (after a peak of 45 in Q2-2020) to just 4 by the end of 2025. Manufacturing, Financial Services, Infra & Realty, and Technology alternated as the highest-risk sectors across most quarters, reaffirming the need for dynamic monitoring.
Table 9: Quarterly Bankruptcy Risk Score Trend (2020–2025)
Quarter | Avg Risk Score (XGBoost) | Std Dev | Highest Sector Risk | Bankruptcy Events
Q1-2020 | 0.63 | 0.17 | Manufacturing | 38
Q2-2020 | 0.68 | 0.19 | Financial Services | 45
Q3-2020 | 0.71 | 0.18 | Infra & Realty | 42
Q4-2020 | 0.69 | 0.16 | Manufacturing | 39
Q1-2021 | 0.66 | 0.15 | Technology | 34
Q2-2021 | 0.65 | 0.14 | Pharma & Health | 29
Q3-2021 | 0.62 | 0.13 | Manufacturing | 27
Q4-2021 | 0.64 | 0.12 | Energy & Utilities | 25
Q1-2022 | 0.61 | 0.13 | Infra & Realty | 23
Q2-2022 | 0.59 | 0.12 | Technology | 21
Q3-2022 | 0.57 | 0.14 | Financial Services | 20
Q4-2022 | 0.60 | 0.15 | FMCG | 19
Q1-2023 | 0.56 | 0.11 | Manufacturing | 18
Q2-2023 | 0.54 | 0.10 | Infra & Realty | 16
Q3-2023 | 0.53 | 0.09 | Financial Services | 14
Q4-2023 | 0.52 | 0.08 | Technology | 12
Q1-2024 | 0.49 | 0.07 | Manufacturing | 11
Q2-2024 | 0.47 | 0.07 | Financial Services | 10
Q3-2024 | 0.45 | 0.06 | Infra & Realty | 9
Q4-2024 | 0.44 | 0.06 | FMCG | 8
Q1-2025 | 0.42 | 0.05 | Technology | 7
Q2-2025 | 0.40 | 0.05 | Pharma & Health | 6
Q3-2025 | 0.39 | 0.04 | Infra & Realty | 5
Q4-2025 | 0.37 | 0.04 | Financial Services | 4
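The direction of the trend in Table 9 can be quantified with an ordinary least-squares slope over the 24 quarterly scores. The sketch below is an illustrative calculation on the table's values; the helper `ols_slope` is hypothetical, and the study does not report fitting such a line.

```python
# Average XGBoost risk scores by quarter, copied from Table 9 (Q1-2020 to Q4-2025)
risk = [0.63, 0.68, 0.71, 0.69, 0.66, 0.65, 0.62, 0.64,
        0.61, 0.59, 0.57, 0.60, 0.56, 0.54, 0.53, 0.52,
        0.49, 0.47, 0.45, 0.44, 0.42, 0.40, 0.39, 0.37]

def ols_slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

slope = ols_slope(risk)  # change in average risk score per quarter
peak_index = max(range(len(risk)), key=risk.__getitem__)  # 0-based quarter index
```

A negative slope confirms the overall decline, while `peak_index` locates the quarter with the highest average score, which falls in 2020 rather than at the start of the series.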
The results of this study affirm the effectiveness of ensemble-based machine learning models, particularly XGBoost, in predicting corporate bankruptcy in the Indian context between 2020 and 2025. By leveraging quarterly financial and market-based data, the models provide forward-looking insights that significantly outperform traditional, lagging indicators. The superior performance metrics of XGBoost, including a 94.1% accuracy and an AUC-ROC of 0.97, reflect its robustness in handling class imbalances and capturing nonlinear relationships among features. The feature importance analysis validates established financial theory, underscoring the predictive power of debt-to-equity ratio, liquidity, and ROA. These results support the theoretical proposition that financial leverage, solvency, and operational profitability are key determinants of business failure. From a practical standpoint, the findings offer corporate stakeholders, investors, and regulators a scalable tool for dynamic risk monitoring. The sectoral and temporal segmentation further enhances the model’s utility by allowing tailored strategies based on industry-specific risk exposure and evolving macroeconomic conditions. Notably, the declining bankruptcy risk trend post-2021 suggests a post-pandemic stabilization effect, which could guide future policymaking and credit assessment models. The study also opens up several directions for future research, including integration of ESG indicators, supply chain disruptions, and sentiment analysis from unstructured data sources such as news and social media to enhance model responsiveness in real-time scenarios.
The findings of this study reveal the powerful predictive capability of ensemble-based machine learning models in forecasting corporate bankruptcy among Indian firms from 2020 to 2025.
These insights not only align with financial theory but also have practical applications for early warning systems in corporate governance. Future studies should explore hybrid frameworks incorporating macroeconomic indicators, qualitative data from news and sentiment analysis, ESG scores, and global supply chain disruptions to enhance real-time adaptability and decision-making precision across various economic environments.