20 Apr 2023
by Harry Keen

Synthetic data & GDPR compliance (Guest blog by Hazy)

Guest blog by Harry Keen, CEO & Co-founder at Hazy #AIWeek2023

Overview

  • The obligations relating to the processing of personal data under the UK GDPR do not apply to anonymous data.
  • Synthetic data that has built-in differential privacy guarantees can meet the criteria for anonymous data as defined by the draft guidelines of the UK regulator, the Information Commissioner’s Office (ICO).
  • Hazy’s Synthetic Data Platform generates synthetic data that is sufficiently anonymous that the UK GDPR does not apply to it.

The regulatory landscape

Let’s start with the GDPR. The UK GDPR makes clear that its data protection principles do not apply to personal data that has been rendered anonymous in such a manner that the data subject is not, or is no longer, identifiable.

This is all well and good, but in practice:

  • The actual identifiability of individuals can be highly context-specific;
  • Different types of information have different levels of identifiability risk depending on the circumstances in which they are processed;
  • The process of creating synthetic data using a dataset containing personal data requires the processing of personal data.

So, what constitutes a sufficient level of anonymisation in the resultant synthetic dataset? To answer this, we turn to the Information Commissioner’s Office (ICO) draft guidance:

“Effective anonymisation reduces identifiability risk to a sufficiently remote level”

ICO draft anonymisation, pseudonymisation and privacy enhancing technologies (PETs) guidance published September 2022

Based on this guidance, whether the resultant synthetic dataset constitutes personal data or anonymous information is a question to be determined based on an assessment of the identifiability risk.

Hazy’s comprehensive approach to privacy protection

The Hazy platform is a sophisticated privacy enhancing technology that combines two well-known privacy methods to produce synthetic datasets where the identifiability risk is sufficiently remote:

Generative Models

Generative models break the one-to-one mapping between real and synthetic data points and greatly reduce the risk of singling out and linkability. The technical part: Hazy relies on generative machine learning models to generate synthetic data in a two-step process:

  • Fitting: the generative model training algorithm takes the real data as an input, updates its internal parameters to learn a (lower-dimensional) representation of the probability distribution of the real data, and outputs a trained model.
  • Generation: the trained model is sampled to produce a synthetic dataset, breaking the one-to-one mapping from a single real data point to a single synthetic data point.
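To make the two steps concrete, here is a minimal sketch in Python. It is a toy illustration only, not Hazy's models: it assumes a table of numeric columns and fits one independent Gaussian per column. The names `GaussianModel`, `fit` and `generate` are hypothetical and chosen for this example.

```python
import random
import statistics
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GaussianModel:
    """Toy generative model: one independent Gaussian per numeric column."""
    means: List[float]
    stdevs: List[float]


def fit(real_rows: List[List[float]]) -> GaussianModel:
    """Fitting: learn a compact representation (mean and spread per column)."""
    columns = list(zip(*real_rows))
    return GaussianModel(
        means=[statistics.fmean(col) for col in columns],
        stdevs=[statistics.pstdev(col) for col in columns],
    )


def generate(model: GaussianModel, n_rows: int,
             seed: Optional[int] = None) -> List[List[float]]:
    """Generation: sample new rows from the trained model, not from real rows."""
    rng = random.Random(seed)
    return [
        [rng.gauss(mean, stdev) for mean, stdev in zip(model.means, model.stdevs)]
        for _ in range(n_rows)
    ]


# Made-up example data: (age, salary) for four fictional people.
real = [[34.0, 52000.0], [45.0, 61000.0], [29.0, 48000.0], [51.0, 75000.0]]
model = fit(real)
synthetic = generate(model, n_rows=6, seed=0)
```

Note that the number of synthetic rows is chosen freely and no synthetic row is derived from any single real row. On its own, though, a fitted model like this carries no formal privacy guarantee.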

Differential Privacy (DP)

Differential Privacy mechanisms are designed to eliminate singling-out, linkability, and other re-identifiability concerns even if faced with a resourceful and strategic adversary.

The technical part: We incorporate Differential Privacy in the generative model fitting step. Differentially private mechanisms add calibrated random noise during computation, which limits how much any single individual's record can influence the trained model.
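As an illustration of noise perturbation in the fitting step, the sketch below fits a histogram over one categorical column and adds Laplace noise to each count before sampling from it. This is a textbook mechanism picked for the example, not a description of how Hazy implements Differential Privacy. It assumes the list of categories is fixed and public, that neighbouring datasets differ by adding or removing one record (so each count has sensitivity 1), and the function names are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) noise, drawn as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def fit_dp_histogram(values: List[str], categories: List[str],
                     epsilon: float, rng: random.Random) -> Dict[str, float]:
    """Fitting with noise: perturb every category count with Laplace(1/epsilon)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    counts = Counter(values)
    scale = 1.0 / epsilon  # assumed sensitivity of 1 per count
    # Clamping at zero is post-processing, so it does not weaken the guarantee.
    return {
        cat: max(0.0, counts.get(cat, 0) + laplace_noise(scale, rng))
        for cat in categories
    }


def generate_from_histogram(noisy_counts: Dict[str, float], n_rows: int,
                            rng: random.Random) -> List[str]:
    """Generation: sample categories in proportion to the noisy counts."""
    cats = list(noisy_counts)
    weights = [noisy_counts[cat] for cat in cats]
    if sum(weights) <= 0:  # every count was clamped to zero
        weights = [1.0] * len(cats)
    return rng.choices(cats, weights=weights, k=n_rows)


# Made-up example: a column with a public domain of four categories.
real_values = ["A", "A", "B", "A", "C", "B"]
domain = ["A", "B", "C", "D"]
rng = random.Random(7)
noisy_counts = fit_dp_histogram(real_values, domain, epsilon=1.0, rng=rng)
synthetic_values = generate_from_histogram(noisy_counts, 5, rng)
```

A smaller `epsilon` means more noise and stronger privacy, at the cost of a less faithful synthetic column. Because only the noisy counts are used for generation, adding or removing one real record changes the output distribution by a bounded amount.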

“Using differential privacy with synthetic data can protect any outlier records from linkage attacks with other data.”

ICO draft anonymisation, pseudonymisation and privacy enhancing technologies (PETs) guidance published September 2022

By combining synthetic data with differential privacy, Hazy’s approach produces synthetic data with a sufficiently low risk of re-identification. This is consistent with the ICO guidance, which takes into account the concept of identifiability in its broadest sense.

The approach does not simply focus on removing obvious information that relates to an individual. Instead, it is designed to anonymise the data sufficiently to reduce the possibility of:

  • Singling out or individuation
  • Linkability
  • A motivated intruder re-identifying individuals
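Risks like these can be probed empirically. The sketch below shows two deliberately crude proxy checks in Python: the share of synthetic rows that exactly copy a real row, and the share of rows that are unique on a chosen set of columns (a rough indicator of how easily someone could be singled out). These are illustrative assumptions only; they are neither the ICO's test nor Hazy's evaluation method, and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Row = Tuple[object, ...]


def exact_match_rate(real_rows: List[Row], synthetic_rows: List[Row]) -> float:
    """Share of synthetic rows that are verbatim copies of some real row."""
    if not synthetic_rows:
        return 0.0
    real_set = set(real_rows)
    matches = sum(1 for row in synthetic_rows if row in real_set)
    return matches / len(synthetic_rows)


def unique_fraction(rows: List[Row], key_columns: Sequence[int]) -> float:
    """Share of rows whose values on key_columns appear exactly once."""
    if not rows:
        return 0.0
    keys = [tuple(row[i] for i in key_columns) for row in rows]
    key_counts = Counter(keys)
    return sum(1 for key in keys if key_counts[key] == 1) / len(rows)


# Made-up example data: (sex, age, postcode area).
real = [("F", 34, "SW1"), ("M", 45, "N7"), ("F", 29, "E2")]
synthetic = [("F", 34, "SW1"), ("M", 31, "E2"), ("F", 40, "N7"), ("M", 45, "SW1")]

copy_rate = exact_match_rate(real, synthetic)    # one of four rows is a copy
real_unique = unique_fraction(real, [0])         # unique on sex alone
```

A low score on checks like these is not proof of anonymity; a motivated intruder may combine attributes or outside data in ways a simple count does not capture.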

Hazy's platform generates synthetic data that is sufficiently anonymous that the UK GDPR does not apply to it.
