Protecting sensitive data in huge datasets: Cloud tools you can use
Before releasing a public dataset, practitioners need to thread the needle between utility and protection of individuals. Felipe Hoffa explores how to handle massive public datasets, taking you from theory to real life as he showcases newly available tools that help with PII detection and bring concepts like k-anonymity and l-diversity to the practical realm. You’ll also cover options for handling sensitive fields, such as removing, masking, and coarsening them.
What you'll learn:
- Learn how to identify PII in massive datasets
- Explore k-anonymity, l-diversity, and related research and options such as removing, masking, and coarsening
- Gain experience with practical demos over massive datasets
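
To make the concepts above concrete, here is a minimal Python sketch rather than anything from the session itself. It assumes a tiny made-up table and hypothetical helpers (`coarsen_age`, `mask_zip`, `k_and_l`, `anonymize`). It removes a direct identifier, coarsens ages into ranges, masks ZIP codes, and then measures k-anonymity (the smallest group of rows sharing the same quasi-identifiers) and l-diversity (the fewest distinct sensitive values within any such group).

```python
from collections import defaultdict

def coarsen_age(age, width=10):
    """Coarsen an exact age into a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def mask_zip(zip_code, keep=3):
    """Mask a ZIP code, keeping only its first `keep` characters."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

def k_and_l(rows, quasi_ids, sensitive):
    """Return (k, l) for the given quasi-identifier and sensitive columns."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_ids)
        groups[key].append(row[sensitive])
    k = min(len(values) for values in groups.values())       # smallest group
    l = min(len(set(values)) for values in groups.values())  # least diverse group
    return k, l

# Illustrative, made-up records (not real data).
RAW_ROWS = [
    {"name": "A", "age": 34, "zip": "94107", "diagnosis": "flu"},
    {"name": "B", "age": 36, "zip": "94110", "diagnosis": "asthma"},
    {"name": "C", "age": 38, "zip": "94133", "diagnosis": "flu"},
    {"name": "D", "age": 52, "zip": "10001", "diagnosis": "diabetes"},
    {"name": "E", "age": 57, "zip": "10027", "diagnosis": "flu"},
]

def anonymize(rows):
    """Remove the name, coarsen the age, and mask the ZIP code."""
    return [
        {
            "age": coarsen_age(r["age"]),
            "zip": mask_zip(r["zip"]),
            "diagnosis": r["diagnosis"],
        }
        for r in rows
    ]

if __name__ == "__main__":
    print("raw (k, l):", k_and_l(RAW_ROWS, ["age", "zip"], "diagnosis"))
    print("anonymized (k, l):", k_and_l(anonymize(RAW_ROWS), ["age", "zip"], "diagnosis"))
```

On the raw rows every (age, ZIP) pair is unique, so k is 1 and any row can be singled out. After coarsening and masking, rows fall into shared groups, which raises k at the cost of detail. On a massive dataset the same grouping would typically run as an aggregation in a data warehouse or through a dedicated risk-analysis tool instead of in memory.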