Justine Wezenaar EuroPython 2024

Justine Wezenaar

Justine Wezenaar is a Software Engineering Team Lead for Bloomberg’s ESG (Environmental, Social & Governance) Quant team, which owns the implementation and maintenance of the firm’s quantitative parametric scoring models. She took a less traditional route to software engineering, studying mathematics and theoretical physics at McGill University, then working as a data scientist for a healthtech startup in her hometown of Halifax, Canada, before joining Bloomberg Engineering in New York City in 2018. Before joining the ESG team in 2022, Justine was on the quant engineering team for Bloomberg’s Evaluated Pricing (BVAL) product, where she worked on pricing models for mortgage-backed securities. In her current role, her team builds systems that must satisfy the performance and reliability requirements of Engineering while remaining flexible and agile enough to accommodate the Research and Product teams’ responses to the dynamic ESG market landscape.

Session

07-11

16:05

45min

How we used vectorization for 1000x Python speedups (no C or Spark needed!)

Justine Wezenaar, Jonathan Hollenbeck

Want to make all your code faster? With matrices, library knowledge, and a sprinkle of creativity, you can consistently speed up multivariate Python functions by 1000x!

Modal optimization requires simple axioms - arithmetic, checking a case, calling the right sklearn function, and so on. When that’s not sufficient, three core tricks - converting conditional logic to set theory, stacking vectors into a matrix, and shaping data to match library expectations - cover the vast majority of real-world cases (90% of the ~400 functions we vectorized).
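To make the first two tricks concrete, here is a minimal NumPy sketch, not the speakers' actual code. The scoring rule, the thresholds, and the names `score_loop`, `score_vectorized`, `revenue`, and `emissions` are all hypothetical. It shows an if/elif chain rewritten as boolean masks (sets of rows) and two input vectors stacked into a matrix so that a weighted sum becomes one matrix product.

```python
import numpy as np


def score_loop(revenue, emissions):
    """Scalar reference: one Python-level branch per row (hypothetical rule)."""
    out = np.empty(len(revenue))
    for i in range(len(revenue)):
        if revenue[i] <= 0:
            out[i] = 0.0
        elif emissions[i] / revenue[i] < 0.5:
            out[i] = 1.0
        else:
            out[i] = 0.5
    return out


def score_vectorized(revenue, emissions):
    """Same rule with each branch expressed as a boolean mask over all rows."""
    no_revenue = revenue <= 0
    # Only divide where revenue is positive; other rows keep the fill value inf.
    intensity = np.divide(
        emissions,
        revenue,
        out=np.full(revenue.shape, np.inf),
        where=~no_revenue,
    )
    low_intensity = ~no_revenue & (intensity < 0.5)
    # np.select takes the first matching condition per row, like if/elif/else.
    return np.select([no_revenue, low_intensity], [0.0, 1.0], default=0.5)


# Hypothetical inputs, one entry per company.
revenue = np.array([10.0, 0.0, 4.0, -1.0])
emissions = np.array([2.0, 1.0, 3.0, 5.0])

scores = score_vectorized(revenue, emissions)

# Stacking: put the vectors side by side as columns of a matrix, so a
# weighted combination of the inputs is a single matrix-vector product.
factors = np.column_stack([revenue, emissions])  # shape (4, 2)
weights = np.array([0.25, 0.75])                 # hypothetical weights
weighted = factors @ weights
```

The loop version is kept only as a reference to check the vectorized one against; on large inputs the masked version avoids the per-row Python overhead, though the actual speedup depends on the data and the function.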

At Bloomberg, ESG (Environmental, Social, and Governance) Scores require complex computations on large data sets. Time-series computations are fundamental for Governance - one user-defined function (UDF) infers board support for a policy from prior cyclical votes and other time-offset inputs. By rewriting the pandas backfill as a series of reductions on a 4-tensor, we reduced the runtime from 45 minutes to 10 milliseconds! Analogously, due to real-world complexity, finance UDFs can end up with 100+ if/else branches in one function. With a mix of De Morgan’s laws and sparse matrix representations, we simplified the cases and achieved 1000x+ speedups.
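The 4-tensor rewrite itself is not described here, so the following is only a simplified 2-D illustration of the general idea: replacing a row-by-row fill with whole-array operations. It assumes the fill carries the most recent prior observation forward in time, with NaN marking a missing value. The function name `forward_fill` and the `votes` data are hypothetical.

```python
import numpy as np


def forward_fill(values):
    """Carry the last non-NaN value forward along axis 1 (time).

    `values` is a 2-D float array shaped (entities, time steps).
    Leading NaNs, which have no prior observation, stay NaN.
    """
    valid = ~np.isnan(values)
    # Column index where an observation exists, 0 elsewhere.
    idx = np.where(valid, np.arange(values.shape[1]), 0)
    # The running maximum is the index of the most recent valid observation.
    last_valid = np.maximum.accumulate(idx, axis=1)
    # Gather the value at that index for every cell, with no Python loop.
    return np.take_along_axis(values, last_valid, axis=1)


# Hypothetical support levels for two entities over five periods.
votes = np.array([
    [np.nan, 0.8, np.nan, np.nan, 0.6],
    [0.3, np.nan, np.nan, 0.9, np.nan],
])

filled = forward_fill(votes)
```

The same pattern (build an index array, accumulate along the time axis, gather) extends to higher-dimensional arrays by choosing the axis, which is one plausible way a fill can be expressed without per-row iteration.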

We’ll conclude with a quick overview of cutting-edge tools, and hope you’ll leave with a concrete strategy for vectorizing financial models!

PyData: Machine Learning, Stats

Forum Hall
