Generating High Fidelity, Synthetic Time Series Datasets with DoppelGANger
Dr. Giulia Fanti from Carnegie Mellon University
QuantUniversity partnered with PRMIA to host the first QuantUniversity Summer School in Machine Learning and AI in Finance. More than 1000 participants from more than 20 countries, including India, China, Australia, the UK, Turkey, and South Africa, attended the summer school series.
This year, we offered 3 courses in Data Science, Machine Learning, and Model Risk Management:
- Just Enough Python for Data Science
- Machine Learning and AI for Financial Professionals
- Model Risk Management for Machine Learning Models
In addition, we had 10 lectures from eminent quants, innovators, and thinkers on various topics in AI/ML and fintech.
In Week 4, Dr. Giulia Fanti from Carnegie Mellon University discussed her work on generating synthetic data with generative adversarial networks (GANs). Here is a summary of the workshop.
Limited data access continues to be a barrier to data-driven product development. In this talk, we explore if and how generative adversarial networks (GANs) can be used to incentivize data sharing by enabling a generic framework for sharing synthetic datasets with minimal expert knowledge.
We identify key challenges of existing GAN approaches with respect to fidelity (e.g., capturing complex multidimensional correlations, mode collapse) and privacy (i.e., existing guarantees are poorly understood and can sacrifice fidelity).
To address fidelity challenges, we discuss our experiences designing a custom workflow called DoppelGANger and demonstrate that across diverse real-world datasets (e.g., bandwidth measurements, cluster requests, web sessions) and use cases (e.g., structural characterization, predictive modeling, algorithm comparison), DoppelGANger achieves up to 43% better fidelity than baseline models.
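This summary does not say which metrics were used to measure fidelity. As one illustrative possibility, for time series one could compare how well the synthetic data reproduces the temporal correlations of the real data. The sketch below is a minimal example under that assumption: the function names, the toy sinusoid data, and the choice of autocorrelation as the metric are all hypothetical and are not taken from the talk or from the DoppelGANger code.

```python
import numpy as np

def autocorrelation(series, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - lag], x[lag:]) / denom for lag in range(max_lag + 1)]
    )

def autocorrelation_mse(real, synthetic, max_lag=20):
    """Mean squared error between the average autocorrelation curves.

    real, synthetic: arrays of shape (num_series, series_length).
    Lower values mean the synthetic data tracks the real temporal structure more closely.
    """
    real_acf = np.mean([autocorrelation(s, max_lag) for s in real], axis=0)
    synth_acf = np.mean([autocorrelation(s, max_lag) for s in synthetic], axis=0)
    return float(np.mean((real_acf - synth_acf) ** 2))

# Toy data (hypothetical): "real" series are noisy sinusoids with period 25.
rng = np.random.default_rng(0)
t = np.arange(200)
signal = np.sin(2 * np.pi * t / 25)
real = signal + 0.3 * rng.standard_normal((50, 200))

# A stand-in for a good generator: same structure, fresh noise.
good_synthetic = signal + 0.3 * rng.standard_normal((50, 200))
# A stand-in for a poor generator: white noise with no temporal structure.
bad_synthetic = rng.standard_normal((50, 200))

good_score = autocorrelation_mse(real, good_synthetic)
bad_score = autocorrelation_mse(real, bad_synthetic)
print(f"good generator: {good_score:.4f}, poor generator: {bad_score:.4f}")
```

A score like this captures only one aspect of fidelity. A fuller evaluation would also look at marginal distributions, cross-feature correlations, and performance on downstream tasks such as the predictive modeling use case mentioned above.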
With respect to privacy, we identify fundamental challenges with both classical notions of privacy as well as recent advances to improve the privacy properties of GANs, and suggest a potential roadmap for addressing these challenges.
Slides, demos, and videos are available at https://academy.qusandbox.com/#/market/5f29eb1699aa4a24691da53a
If you want to try out the demos yourself on QuAcademy:
Use ‘QUSUMMERSCHOOL’ as the registration code.