Page 479 - ISC PROCEEDINGS 21.4
2.2. Why infrastructure, not just data, is critical
Data alone is insufficient; the economic benefits arise only when data is embedded
in reliable infrastructure that enforces governance, quality, and interoperability:
Secure, auditable storage and provenance. Infrastructure must keep sensitive
records confidential, capture provenance and lineage, and provide
immutable audit trails so regulators can verify data origin and transformations.
Controlled access and fine-grained authorization. Role-based access, attribute-
based policies, and time-limited credentials let institutions share data with regulators,
approved researchers, or sandbox participants without wholesale exposure.
Privacy-preserving services and synthetic data capabilities. Techniques such as
differential privacy, secure multiparty computation, federated learning, and synthetic
data generation let stakeholders extract utility from data while limiting re-identification
and leakage risks.
Validation, monitoring, and model governance. Data pipelines must include
validation suites (schema checks, distributional tests), monitoring for drift, and
governance workflows for model approval, retraining, and retirement.
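The validation and drift-monitoring item above can be illustrated with a minimal sketch. It assumes tabular records held as Python dictionaries; the schema, the column names, and the drift threshold of 0.2 are hypothetical and are not taken from any specific system. The distributional test shown is the two-sample Kolmogorov-Smirnov statistic, which is one common choice among several.

```python
from bisect import bisect_right

# Hypothetical schema for illustration: column name -> expected Python type.
SCHEMA = {"account_id": str, "amount": float, "is_fraud": bool}


def schema_errors(rows, schema=SCHEMA):
    """Return a list of human-readable schema violations (empty if valid)."""
    errors = []
    for i, row in enumerate(rows):
        missing = set(schema) - set(row)
        if missing:
            errors.append(f"row {i}: missing columns {sorted(missing)}")
        for name, expected in schema.items():
            if name in row and not isinstance(row[name], expected):
                errors.append(f"row {i}: {name} is not {expected.__name__}")
    return errors


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples (0 = identical, 1 = disjoint)."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    # Both CDFs are step functions, so the gap only changes at observed values.
    for x in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def has_drifted(reference, current, threshold=0.2):
    """Flag drift when the KS statistic exceeds an assumed threshold."""
    return ks_statistic(reference, current) > threshold
```

In a pipeline of this kind, `schema_errors` would typically run on every incoming batch, while `has_drifted` would compare a recent window of a feature against a stored reference sample. The threshold would need to be calibrated per feature rather than fixed globally.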
2.3. Theoretical foundations of synthetic financial data infrastructure
Synthetic financial data should not be viewed only as a machine learning output
generated for privacy protection or data augmentation. In the context of the digital economy
and financial innovation, it can also be interpreted as an institutional and infrastructural
response to three closely related problems: information asymmetry, weak data
governance, and the need for scalable digital infrastructure. These perspectives help
explain why Financial GANs are relevant not only from a technical standpoint, but also
from an economic and organizational one.
Information Asymmetry Theory
Information asymmetry theory explains that economic agents do not have equal
access to information, and this imbalance is especially visible in financial markets where
data are often proprietary, fragmented, costly, and restricted by regulation. In this
context, synthetic financial data generated by GANs can reduce informational barriers by
creating realistic datasets that preserve key statistical patterns while protecting
confidential records. This makes it easier for researchers, fintech firms, regulators, and
financial institutions to test models, compare methods, and conduct analysis without
direct exposure to sensitive original data.
Data Governance Theory
Data governance theory emphasizes that data become valuable only when they are
managed through clear rules, responsibilities, and accountability mechanisms. In financial
contexts, this includes data quality, privacy protection, access control, auditability,
provenance, and compliance with institutional and legal requirements. Synthetic financial
data should therefore be considered not only as a technical output, but also as a
governed data product that requires documentation, validation, and monitoring before
use. GAN-generated datasets must be assessed for realism, leakage, bias, and
downstream reliability, especially when they are intended for high-stakes tasks such as
credit analysis, fraud detection, or stress testing. From this perspective, Financial GANs fit
into a broader governance framework in which data are not simply generated, but also
classified, controlled, and evaluated according to institutional standards.
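As a rough illustration of the leakage assessment mentioned above, the following sketch flags synthetic rows that coincide, or nearly coincide, with a real record. It uses the distance to the closest real record, a heuristic that is widely used for synthetic tabular data but is not prescribed by the text. The function names, the tolerance `min_distance`, and the assumption that all features are numeric and already scaled are hypothetical. Passing this check is not a formal privacy guarantee.

```python
import math


def distance_to_closest_record(synthetic_row, real_rows):
    """Euclidean distance from one synthetic row to its nearest real row."""
    return min(math.dist(synthetic_row, real) for real in real_rows)


def leakage_report(synthetic_rows, real_rows, min_distance=1e-6):
    """Count synthetic rows that sit (almost) on top of a real record.

    min_distance is an assumed tolerance; features are assumed to be
    numeric and already scaled to comparable ranges.
    """
    distances = [
        distance_to_closest_record(row, real_rows) for row in synthetic_rows
    ]
    flagged = [i for i, d in enumerate(distances) if d <= min_distance]
    return {
        "n_synthetic": len(synthetic_rows),
        "n_flagged": len(flagged),
        "flagged_indices": flagged,
        "min_distance_seen": min(distances),
    }


# Toy data: the first synthetic row is an exact copy of a real record.
real = [(0.0, 0.0), (1.0, 1.0)]
synthetic = [(1.0, 1.0), (3.0, 5.0)]
report = leakage_report(synthetic, real)
```

A report of this kind could be attached to the documentation of a governed synthetic dataset, alongside separate checks for realism, bias, and downstream reliability.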
Digital Infrastructure Theory
Digital infrastructure theory views data, models, standards, and platforms as

