Page 63 - ISC PROCEEDINGS 21.4
Where millions of payment transactions are processed annually, manual sampling
can examine only a small proportion of them. ML models such as Isolation
Forest or Local Outlier Factor can scan the entire dataset to detect anomalous patterns.
Common types of anomalies include:
Year-end disbursement spikes: For example, if disbursement reaches 55% of the
annual plan over the first 11 months but increases by an additional 35% in December
alone, the system may identify this as an anomaly relative to historical sectoral patterns.
Excessive capital adjustments: If the construction sector’s average capital increase is
5-7%, but a specific project records an increase of 20-30%, the model would flag that
project as an outlier.
Repeated or unusually frequent payments: Detection of multiple transactions with
identical values to the same contractor within a short time frame.
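The repeated-payment check can be illustrated with a simple rule rather than an ML model. The following is a minimal sketch, assuming payments arrive as (contractor, amount, date) records; the function name find_repeated_payments, the 7-day window, and the sample records are hypothetical and not taken from the study.

```python
from collections import defaultdict
from datetime import date, timedelta


def find_repeated_payments(payments, window_days=7, min_count=2):
    """Flag identical-value payments to one contractor within a short window.

    payments: iterable of (contractor_id, amount, payment_date) tuples.
    Returns a list of (contractor_id, amount, dates) for each flagged group.
    """
    # Group payment dates by (contractor, amount).
    groups = defaultdict(list)
    for contractor, amount, paid_on in payments:
        groups[(contractor, amount)].append(paid_on)

    window = timedelta(days=window_days)
    flagged = []
    for (contractor, amount), dates in groups.items():
        dates.sort()
        start = 0
        best = []
        # Sliding window: keep the largest run of dates spanning <= window.
        for end in range(len(dates)):
            while dates[end] - dates[start] > window:
                start += 1
            if end - start + 1 > len(best):
                best = dates[start:end + 1]
        if len(best) >= min_count:
            flagged.append((contractor, amount, best))
    return flagged


# Hypothetical sample records, for illustration only.
payments = [
    ("C001", 250_000, date(2024, 3, 1)),
    ("C001", 250_000, date(2024, 3, 4)),
    ("C001", 250_000, date(2024, 3, 6)),
    ("C002", 90_000, date(2024, 3, 1)),
    ("C002", 90_000, date(2024, 6, 1)),
]
flagged = find_repeated_payments(payments)
```

In this sample only contractor C001 is flagged, because its three identical payments fall within the window, while the two C002 payments are three months apart. A flag of this kind is a prompt for review, not evidence of wrongdoing.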
Based on simulation testing with a hypothetical dataset of 10,000 projects, anomaly
detection models can narrow the review to approximately 5-8% of projects, namely those
with the highest estimated risk, enabling inspectors to concentrate resources on a
targeted subset rather than conducting broad, unfocused audits.
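The idea of flagging projects that deviate from the sectoral pattern can be sketched with a much simpler stand-in for Isolation Forest or Local Outlier Factor: a robust z-score based on the median and the median absolute deviation (MAD). This is an illustrative substitute, not the method evaluated above. The December disbursement shares, project identifiers, and the 3.5 cutoff (a commonly used value for this score) are assumptions.

```python
from statistics import median


def robust_z_scores(values):
    """Modified z-scores using the median and MAD instead of mean and std."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No measurable spread: this simple rule cannot rank the points.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so the score is comparable to a normal z-score.
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(project_ids, values, threshold=3.5):
    """Return the ids whose robust z-score exceeds the threshold in magnitude."""
    scores = robust_z_scores(values)
    return [pid for pid, z in zip(project_ids, scores) if abs(z) > threshold]


# Hypothetical share of the annual plan disbursed in December, per project.
december_share = {
    "P01": 0.10, "P02": 0.12, "P03": 0.09, "P04": 0.11,
    "P05": 0.13, "P06": 0.08, "P07": 0.35,
}
flagged = flag_outliers(list(december_share), list(december_share.values()))
```

Here project P07, with 35% of its annual plan disbursed in December against a sector median of 11%, is the only one flagged. A one-dimensional rule like this examines a single indicator at a time, whereas the ML models named above can consider several variables jointly.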
(ii) Contractor Network Analysis
Network analysis uses graph theory to model relationships among contractors,
project owners, and projects. Each contractor is represented as a node, while joint
participation or co-award relationships form edges.
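The graph construction described here can be sketched as follows, assuming the input is a mapping from bidding package to the contractors that participated in it. The function names, package identifiers, and contractor labels are hypothetical. Degree centrality is computed as the number of neighbours divided by n - 1, where n is the number of nodes.

```python
from itertools import combinations


def build_cobid_graph(packages):
    """Build an undirected graph: contractors are nodes, joint participation
    in the same package creates an edge.

    packages: mapping of package id -> iterable of contractor ids.
    Returns a mapping of contractor id -> set of neighbouring contractor ids.
    """
    graph = {}
    for contractors in packages.values():
        members = sorted(set(contractors))
        for contractor in members:
            # Register every contractor, including those with no partners.
            graph.setdefault(contractor, set())
        for a, b in combinations(members, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Share of the other nodes to which each node is directly connected."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


# Hypothetical participation data, for illustration only.
packages = {
    "PKG-1": ["A", "B", "C"],
    "PKG-2": ["A", "D"],
    "PKG-3": ["A", "B"],
    "PKG-4": ["E"],
}
centrality = degree_centrality(build_cobid_graph(packages))
```

In this sample, contractor A is linked to three of the four other contractors (centrality 0.75), while E never appears alongside another firm (centrality 0). Betweenness centrality and modularity require more involved algorithms and are usually taken from a graph library rather than written by hand.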
Using metrics such as degree centrality, betweenness centrality, and modularity,
the system can identify:
Abnormal concentration of contract awards: For instance, if a group of five firms
accounts for 60% of total awarded contract value in a locality over three consecutive
years, significantly higher than the assumed national average of 25-30%, this may signal
the need for further review.
Cross-enterprise linkages: Firms sharing the same legal representative, registered
address, or repeatedly appearing in joint ventures.
Bid-rigging risks: Detection of clusters of contractors that rotate winning bids within
the same project group.
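Two of the checks listed above reduce to simple aggregations and can be sketched without a graph library: the share of awarded value held by the top five firms, and the grouping of firms by a shared attribute such as the legal representative. The function names, award values, registry fields, and the 30% benchmark (the upper end of the assumed national average quoted above) are illustrative assumptions.

```python
from collections import defaultdict


def top_k_share(awards, k=5):
    """Share of total awarded value captured by the k largest contractors.

    awards: iterable of (contractor_id, awarded_value) tuples.
    """
    totals = defaultdict(float)
    for contractor, value in awards:
        totals[contractor] += value
    grand_total = sum(totals.values())
    if grand_total == 0:
        return 0.0
    top = sorted(totals.values(), reverse=True)[:k]
    return sum(top) / grand_total


def linked_firms(registry, key):
    """Group firms that share the same value for a registry attribute."""
    groups = defaultdict(list)
    for firm, attributes in registry.items():
        groups[attributes[key]].append(firm)
    # Keep only attribute values shared by more than one firm.
    return {value: firms for value, firms in groups.items() if len(firms) > 1}


# Hypothetical award values (arbitrary currency units) in one locality.
awards = [
    ("F1", 12.0), ("F1", 8.0), ("F2", 15.0), ("F3", 10.0), ("F4", 8.0),
    ("F5", 7.0), ("F6", 7.0), ("F7", 7.0), ("F8", 7.0), ("F9", 7.0),
    ("F10", 6.0), ("F11", 6.0),
]
BENCHMARK_SHARE = 0.30  # assumed benchmark, taken from the 25-30% range above
share = top_k_share(awards, k=5)
needs_review = share > BENCHMARK_SHARE

# Hypothetical business-registry extract.
registry = {
    "F1": {"legal_rep": "R-01"},
    "F2": {"legal_rep": "R-01"},
    "F3": {"legal_rep": "R-02"},
}
links = linked_firms(registry, "legal_rep")
```

With these sample figures the top five firms hold 60% of awarded value, so the locality is marked for review, and firms F1 and F2 are linked through a common legal representative. Detecting bid rotation would additionally need the sequence of winners over time within each project group, which this sketch does not cover.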
Applying network analysis shifts oversight from monitoring individual bidding
packages to supervising the broader structure of the procurement market.
(iii) ML in State Budget Expenditure Management for Public Investment
A risk scoring model can be developed using variables such as project size, number
of project adjustments, implementation delays, contractor award history, and local
characteristics.
The output is a probability estimate of potential risks (e.g., cost overruns, delays,
audit findings). Supervisory agencies can prioritize inspections for projects with the
highest risk scores, thereby optimizing oversight resources.
To ensure transparency, the model should incorporate explainable AI (XAI)
techniques to clarify which variables most significantly influence the risk score.
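A minimal sketch of such a score, assuming a logistic model, is given below. The intercept and coefficients are hypothetical placeholders rather than estimated values, and the feature names are illustrative. For a model that is linear in the log-odds, each variable's contribution is simply its coefficient multiplied by its value, which gives a basic form of explanation. More complex models generally need dedicated attribution methods.

```python
import math

# Hypothetical coefficients; in practice these would be estimated from
# historical project data, not set by hand.
INTERCEPT = -4.0
WEIGHTS = {
    "log_total_investment": 0.15,
    "n_adjustments": 0.50,
    "delay_months": 0.10,
    "prior_audit_findings": 0.60,
}


def risk_score(features):
    """Return (risk probability, per-variable contributions to the log-odds)."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    logit = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, contributions


# Hypothetical project record, for illustration only.
project = {
    "log_total_investment": math.log(500.0),  # total investment, arbitrary units
    "n_adjustments": 3,
    "delay_months": 12,
    "prior_audit_findings": 1,
}
probability, contributions = risk_score(project)

# The variable with the largest contribution is the main driver of the score.
top_driver = max(contributions, key=contributions.get)
```

For this sample project the number of adjustments contributes most to the log-odds, followed by the implementation delay. Ranking projects by the resulting probability gives the inspection priority order described above.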
(iv) Risk Scoring Model
The risk scoring model uses quantitative variables to calculate the risk probability of
each project. Variables may include:
Capital scale (logarithm of total investment);
Number of total investment adjustments;
Relative difference between the winning bid price and the package price;
History of audit recommendations;

