Alessandro has been a Python developer since 2001 and has relied on Python as his primary development language for more than 20 years.
He has worked as CTO and Director of Engineering with Python teams for the past 10 years and is currently Senior Director of Open Source Engineering.
Alessandro is also the author of the books Crafting Test-Driven Software with Python and Modern Python Standard Library Cookbook.
Apache Arrow and its Python library, PyArrow, are becoming the de facto standard for data transfer and interoperability between libraries and languages. As more compute engines, storage systems, and databases start to speak Arrow, you might be relying on it without even knowing.
The same transformation is happening with Substrait, which is on track to become the standard representation of query plans themselves. This allows queries to be routed to different engines as long as they speak Substrait, or even decomposed and forwarded to multiple engines.
This talk will provide a quick introduction to the Arrow ecosystem, showing Python developers how libraries like Pandas, Polars, and PyArrow itself leverage Arrow, and how compute engines like Velox, DataFusion, and Acero are embracing Arrow and Substrait.
The talk will also show how a basic database system based on Arrow and Substrait can be built with a minimal amount of code thanks to the foundations they provide.