Sarah DIOT-GIRARD

sdg@jlbl.net

github.com/SdgJlbl

Machine Learning Specialist

I'm good at
Machine Learning, Python, Natural Language Processing, pandas, NumPy, PyTorch, scikit-learn, French, English

I'm proficient in
pytest, git, Linux CLI, Computer Vision, data viz, DVC (Data Version Control), ML interpretability, German

I'm dabbling in
bash, Ansible, SQL, Flask, sphinx, html/css, Python packaging

I know a bit about
task queues, Docker, Kubernetes

I care about
algorithmic fairness, data privacy, ML pipeline explainability, code quality, DataOps

Since November 2017

Machine Learning Engineer

PeopleDoc

Built an automated classification pipeline for HR documents from scratch, covering all formats.

  • Improved HR users' day-to-day workflow and reduced error rates.
  • Facilitated the onboarding of new customers onto the internal data model.
  • Validated technical feasibility through a POC model working on both text and scanned documents (images).
  • Productionised the code and improved execution performance.
  • Designed the prediction API for integrating with other internal applications.

Set up the Machine Learning team in the company.

  • Promoted a Machine Learning culture inside the company, through demos, talks and workshops.
  • Supervised the development of an internal ML platform.
  • Initiated an ML mindset in different departments (hardware requirements, data access, ...).

Developed MLV-tools, an open-source toolkit for easily versioning Machine Learning pipelines.

June 2016 - September 2017

Lead Data Scientist

WayKonect

Developed data-driven algorithms for connected cars.

Implemented the data pipeline from scratch, with a focus on code quality and reproducibility of results.

Designed the corporate data strategy to improve data gathering in the long term.

Developed a personalised coaching algorithm to improve driver safety and promote eco-driving.

May 2012 - June 2016

Machine Learning Research Engineer

Dassault Systèmes

Research and development on a rule inference engine (quality analysis on manufacturing processes).

Refactored legacy code, updated documentation and tests, improved rule intelligibility.

Improved predictive power of the rule engine using boosting techniques.

Added a prescriptive module for correcting poor quality outcomes.

6-month internship: Inference of gene regulatory networks from DNA chip data

October 2011 - April 2012

Research assistant

TU München / DLR

Implemented optimisation algorithms for Deep Neural Networks in Python.

Education

  • Master of Science "Robotics, Cognition, Intelligence", TU München, 2012
  • Engineering degree, ENSTA ParisTech, 2012

Interests

  • Open-source contributor
  • Speaker at tech conferences (EuroPython, PyData, EuroSciPy, national PyCons)
  • Horseback archer competing at international level