Violetta Mishechkina

I'm a Solution Engineer at dltHub. I started my professional journey as a Data Scientist at Nokia, where I worked a lot with telecommunication data and time series. Both classical ML and neural networks were popular at the time, so I used them in my projects. My next step was moving from Data Science to MLOps, because I felt that the problem of taking an ML model to production was highly important but still unsolved. Questions about getting data, versioning, and proper testing are still on the table.

When I joined dltHub 4 months ago, I entered the world of Data Engineering. With experience mostly in ML, I had to learn to speak the language of Data Engineering. Terms like schema evolution, data ingestion, and semantic layer were all new to me. That is partly why I was so impressed by the dlt Python library: it abstracts away a lot of these issues. Personally, I believe that dlt could become part of the standard modern open-source data stack, because it was built by people who know what they are doing and have tackled the problem of data ingestion hundreds of times.

LinkedIn: https://www.linkedin.com/in/violetta-mishechkina/


Session

07-10
14:35
30min
From Pandas to production: ELT with dlt
Violetta Mishechkina, Adrian Brudaru

We created the “data load tool” (dlt), an open-source Python library, to bridge the gap between data engineers and data scientists. In this talk you will learn how dlt can help you overcome typical roadblocks in your data science workflows and how it streamlines the transition from data exploration to production. We will also discuss the pains of maintaining data pipelines and how dlt can help you avoid common engineering headaches.

Join us to learn best practices around data handling and managing failures with real-life examples!

PyData: Data Engineering
North Hall