Generating High-Fidelity Synthetic Time Series Datasets with DoppelGANger

QuantUniversity partnered with PRMIA to host the first QuantUniversity summer school in Machine Learning and AI in Finance. More than 1000 participants from more than 20 countries, including India, China, Australia, the UK, Turkey, and South Africa, attended the summer school series.

This year, we offered 3 courses in Data Science, Machine Learning, and Model Risk Management:

  1. Just Enough Python for Data Science
  2. Machine Learning and AI for Financial Professionals
  3. Model Risk Management for Machine Learning Models

In addition, we had 10 lectures from eminent quants, innovators, and thinkers on various topics in AI/ML and fintech.

Lecture 4

In Week 4, Dr. Giulia Fanti from Carnegie Mellon University discussed her work on generating synthetic data with generative adversarial networks (GANs). Here is a summary of the workshop.

Summary:

Limited data access continues to be a barrier to data-driven product development. In this talk, we explore if and how generative adversarial networks (GANs) can be used to incentivize data sharing by enabling a generic framework for sharing synthetic datasets with minimal expert knowledge.

We identify key challenges of existing GAN approaches with respect to fidelity (e.g., capturing complex multidimensional correlations, mode collapse) and privacy (i.e., existing guarantees are poorly understood and can sacrifice fidelity).

To address fidelity challenges, we discuss our experiences designing a custom workflow called DoppelGANger and demonstrate that across diverse real-world datasets (e.g., bandwidth measurements, cluster requests, web sessions) and use cases (e.g., structural characterization, predictive modeling, algorithm comparison), DoppelGANger achieves up to 43% better fidelity than baseline models.
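To make the idea of "fidelity" concrete, here is a minimal sketch of one way to compare a synthetic time series against a real one: compute each series' autocorrelation curve and take the mean squared error between the two curves. This is an illustrative assumption, not necessarily the metric behind the 43% figure above, and it does not use the DoppelGANger code. The helper names (`autocorrelation`, `autocorrelation_mse`, `toy_series`) and the toy data are hypothetical.

```python
import math
import random


def autocorrelation(series, max_lag):
    """Return the sample autocorrelation of `series` for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)) / var
        for lag in range(1, max_lag + 1)
    ]


def autocorrelation_mse(real, synthetic, max_lag=20):
    """Mean squared error between two autocorrelation curves (lower is closer)."""
    acf_real = autocorrelation(real, max_lag)
    acf_syn = autocorrelation(synthetic, max_lag)
    return sum((a - b) ** 2 for a, b in zip(acf_real, acf_syn)) / max_lag


def toy_series(n, period, noise, rng):
    """Hypothetical stand-in for a measured series: a noisy periodic signal."""
    return [math.sin(2 * math.pi * t / period) + rng.gauss(0, noise) for t in range(n)]


rng = random.Random(0)
real = toy_series(500, period=24, noise=0.1, rng=rng)

# A "good" synthetic series shares the periodic structure of the real one;
# a "poor" one is plain white noise with no temporal correlation.
good_synthetic = toy_series(500, period=24, noise=0.1, rng=rng)
poor_synthetic = [rng.gauss(0, 1) for _ in range(500)]

score_good = autocorrelation_mse(real, good_synthetic)
score_poor = autocorrelation_mse(real, poor_synthetic)
print(f"good synthetic: {score_good:.4f}, poor synthetic: {score_poor:.4f}")
```

A lower score means the synthetic series reproduces the temporal correlations of the real data more closely. In practice, a single statistic like this would be only one of several checks, since a generator can match autocorrelation while still missing other structure in the data.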

With respect to privacy, we identify fundamental challenges with both classical notions of privacy as well as recent advances to improve the privacy properties of GANs, and suggest a potential roadmap for addressing these challenges.

Slides, demos, and videos are available at https://academy.qusandbox.com/#/market/5f29eb1699aa4a24691da53a

If you want to try out the demos yourself on QuAcademy:
https://academy.qusandbox.com
Use ‘QUSUMMERSCHOOL’ as the registration code.

@quantuniversity
