Homepage of Sarah Diot-Girard

PyConDE 2019

Privacy-preserving text analysis

Data privacy is probably one of the most important challenges we face in Data Science. Applications collect more and more personal data, and it is paramount to ensure anonymity. Privacy cannot be guaranteed just by removing personal identifiers, and concepts such as k-anonymity have been developed to help with structured data. But what if you are working with unstructured text data? Things can get even trickier… This talk presents a few tips and tricks to help ensure privacy when working with text, and identifies research questions that are still open. No silver bullet here, but hopefully a step in the right direction.

