PyData London 2022
Engulfed in a tedious refactoring of your code, you’re adding the seventh layer of
mocks to a test when you realise something must have gone wrong somewhere, but what?
You’ve written readable code, split into functions and classes to avoid long chunks
of code, and yet, every time, you end up with code that is hard to test, a test suite
that runs for hours, functions with seventeen arguments, and you wonder if it’s
you mocking the code or the code mocking you.
Data privacy is probably one of the most important challenges we are facing in Data Science.
Applications are collecting more and more personal data and it is paramount to ensure anonymity.
Privacy cannot be solved just by removing personal identifiers, and concepts such as k-anonymity have been developed
to help with structured data. But what if you are working with unstructured text data? Things can get even trickier…
This talk presents a few tips and tricks for preserving privacy when working with text, and identifies research questions
that are still open. No silver bullet here, but hopefully a step in the right direction.
This short talk, given in French, highlights in a fun way that many technical words are used in both data science and
web development, but with very different meanings, which can lead to misunderstandings when collaborating.
We all know that we should test our code more, but somehow, we never seem to find the time. Test writing is sometimes
perceived as tedious, boring, and unappealing, but it doesn’t have to be that way!
This talk, given in collaboration with my coworker Stéphanie Bracaloni, introduces our open-source project MLV-tools.
PyData Amsterdam 2019
Have you ever heard about Machine Learning versioning solutions? Have you ever tried one of them?
And what about automation? Learn how to easily build versionable pipelines! This tutorial explains, through small
exercises, how to set up a project using DVC and MLV-tools.
Python is known as a slow programming language. It is nonetheless very popular in the scientific community, and is used
to perform massive numerical computations. How can that be? In a word: NumPy.
This workshop is intended for Python developers with no previous experience in Machine Learning.
Company internal meetup
This talk is a high-level overview of the principles of data privacy. Featuring toucans, koalas and lemurs.
Data Science is gonna save the world, right? Or is it? Epic Machine Learning fails are widely discussed.
It’s easy to convince ourselves that they are due to the careless misuse of Data Science. But is it really so?
Is it possible that innocuous choices lead an honest team to disaster?