Profile, Optimize, Repeat: One Core Is All You Need™
2024-07-11, Terrace 2A

Your data analysis pipeline works. Nice!
Could it be faster? Probably.
Do you need to parallelize? Not yet.

Discover optimization steps that boost the performance of your data analysis pipeline on a single core, reducing both runtime and cost.

This walkthrough shows tools for identifying bottlenecks via profiling and strategies for mitigating them, demonstrated on an example. To improve memory and runtime performance, we will use NumPy, Numba JIT compilation, and pybind11 extensions.
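To give a flavor of the profile-then-optimize loop, here is a minimal sketch. It is not taken from the talk: the functions `normalize_loop`, `normalize_numpy`, and the helper `profile` are hypothetical names chosen for illustration, and the standard library's `cProfile` stands in for whichever profilers the walkthrough actually uses. It profiles a pure-Python step, then swaps in a vectorized NumPy version and checks that both agree.

```python
import cProfile
import io
import pstats

import numpy as np


def normalize_loop(values):
    """Hypothetical pipeline step in pure Python: zero mean, unit variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = var ** 0.5
    return [(v - mean) / std for v in values]


def normalize_numpy(values):
    """Same computation, vectorized with NumPy (population std, ddof=0)."""
    arr = np.asarray(values, dtype=np.float64)
    return (arr - arr.mean()) / arr.std()


def profile(func, *args):
    """Run func under cProfile; return its result and a short text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the five most expensive entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


# Illustrative input data, not from the talk.
data = [float(i % 97) for i in range(200_000)]

slow_result, slow_report = profile(normalize_loop, data)
fast_result, fast_report = profile(normalize_numpy, data)

print(slow_report)  # shows where the pure-Python version spends its time
print(fast_report)  # the same step after vectorization
```

Comparing the two reports shows how the time per step changes after vectorization. Checking `np.allclose(slow_result, fast_result)` before trusting a speed-up is a cheap guard against optimizing the pipeline into a different answer.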


Expected audience expertise

Intermediate

See also: Slides (5.6 MB)

Jonathan is a senior ML software engineer at Aignostics in Berlin, Germany. He works on machine-learning pipelines for medical image analysis, ensuring scalability and maintainability.

Valentin is a software and machine learning engineer at scalable minds. He works on implementing the newest models for biological image analysis and makes sure the data analysis pipeline scales on the cluster.