Enables secure data sharing without exposing sensitive information.
Finance | Data | Generative AI
Generating anonymized synthetic financial data that mimics real datasets for safe testing, development, and analytics.
Goldman Sachs creates fake but realistic financial data that looks and behaves like real data. This lets teams test and build tools safely without risking anyone’s private information.
Goldman Sachs developed a synthetic data generator that produces anonymized datasets closely matching the statistical, structural, and polymorphic properties of real financial data, especially complex financial contracts. This technology uses advanced generative AI models to learn the intricate relationships and distributions in production data and then generate new, artificial data points that preserve these characteristics without containing any real customer information.
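As a rough illustration of learning relationships from real records and then sampling new ones, the Python sketch below fits a two-column normal model to toy data and draws synthetic rows from it. It is a deliberately simple stand-in, not the generative AI models described above. The column names (notional, rate), the toy "real" dataset, and both function names are assumptions made for this example.

```python
import math
import random


def fit_bivariate_normal(rows):
    """Estimate means, standard deviations, and correlation of two numeric columns."""
    n = len(rows)
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    cov = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return {"mx": mx, "my": my, "sx": sx, "sy": sy, "rho": cov / (sx * sy)}


def sample_synthetic(params, n, rng):
    """Draw n synthetic rows that share the fitted means, spreads, and correlation."""
    rho = params["rho"]
    # Guard against tiny floating-point overshoot when |rho| is very close to 1.
    residual = math.sqrt(max(0.0, 1.0 - rho ** 2))
    out = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = params["mx"] + params["sx"] * z1
        # Mixing z1 into the second column reproduces the fitted correlation.
        y = params["my"] + params["sy"] * (rho * z1 + residual * z2)
        out.append((x, y))
    return out


rng = random.Random(0)

# Hypothetical "real" data: a notional amount and a rate that depends on it.
real = []
for _ in range(2000):
    notional = rng.gauss(100.0, 15.0)
    rate = 0.02 * notional + rng.gauss(0.0, 0.2)
    real.append((notional, rate))

params = fit_bivariate_normal(real)
synthetic = sample_synthetic(params, 5000, rng)
refit = fit_bivariate_normal(synthetic)

print("real correlation:     ", round(params["rho"], 3))
print("synthetic correlation:", round(refit["rho"], 3))
```

The synthetic rows are drawn only from the fitted parameters, so no original record is copied into the output. A model this simple captures only linear relationships between two numeric columns; structured data such as financial contracts would need far richer models.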
The generator supports a no-code interface enabling users, including non-technical staff, to create customized synthetic datasets tailored to specific testing or analytical needs. This democratizes access to synthetic data generation across the firm. The synthetic data maintains high fidelity to real data, ensuring that downstream applications such as software testing, model training, and analytics yield meaningful and reliable results.
By leveraging generative adversarial networks and other generative modeling techniques, the system can simulate rare events and edge cases that may be underrepresented in real data, enhancing robustness in model development. The synthetic data also facilitates compliance with privacy regulations like GDPR and CCPA by eliminating exposure of sensitive information, enabling safe collaboration internally and with external partners.
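The rare-event idea can be sketched in Python in a much simpler form than a GAN: fit a separate distribution for each labeled regime, then sample with the rare regime deliberately over-weighted. The labels ("normal", "stress"), the toy daily-return data, and the function names below are hypothetical and exist only for this illustration.

```python
import random
import statistics


def fit_by_label(records):
    """Estimate a (mean, stdev) pair for each label from (label, value) records."""
    groups = {}
    for label, value in records:
        groups.setdefault(label, []).append(value)
    return {
        label: (statistics.mean(values), statistics.stdev(values))
        for label, values in groups.items()
    }


def sample_with_rare_share(params, n, rare_label, rare_share, rng):
    """Draw n synthetic records where rare_label makes up roughly rare_share of them."""
    common = [label for label in params if label != rare_label]
    out = []
    for _ in range(n):
        label = rare_label if rng.random() < rare_share else rng.choice(common)
        mean, stdev = params[label]
        out.append((label, rng.gauss(mean, stdev)))
    return out


rng = random.Random(1)

# Toy "real" data: stress days are only 1% of the records.
real = [("normal", rng.gauss(0.0005, 0.01)) for _ in range(1980)]
real += [("stress", rng.gauss(-0.04, 0.03)) for _ in range(20)]

params = fit_by_label(real)

# Ask for about 25% stress days in the synthetic set instead of 1%.
boosted = sample_with_rare_share(params, 2000, "stress", 0.25, rng)
stress_share = sum(1 for label, _ in boosted if label == "stress") / len(boosted)

print("stress share in real data:     ", 20 / len(real))
print("stress share in synthetic data:", round(stress_share, 3))
```

A model trained or tested on the boosted set sees many more stress examples than the real data provides. The trade-off is that the rare-regime parameters are estimated from very few real observations, so the synthetic stress records are only as reliable as that small sample.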
This approach accelerates development cycles, reduces operational risks, and supports innovation in financial services by providing a secure, flexible, and realistic data environment. The technology is integrated into Goldman Sachs’ broader data and analytics infrastructure, supporting multiple business lines and use cases.
It is like creating a high-quality movie prop that looks exactly like the real thing but can be handled freely without any risk, allowing filmmakers to rehearse scenes without damaging anything valuable.
4/5
This project combines several state-of-the-art generative AI techniques (GANs, VAEs, LLMs) with statistical models to produce high-fidelity, privacy-compliant synthetic financial data, a leading but increasingly established practice in finance. The no-code interface adds significant practical innovation by opening access to non-technical staff. The solution also addresses complex regulatory constraints and supports rare event simulation, which raises its impact above standard synthetic data efforts.
Timeline: 14 months
Cost: $2,700,000
Headcount: 12