Picking What to Watch Next - build a recommendation system :: EuroPython 2022

Picking What to Watch Next - build a recommendation system

2022-07-12 09:30–11:00, Liffey Hall 2
2022-07-12 11:15–12:45, Liffey Hall 2

All times in Europe/Dublin

Recommendation algorithms are the driving force of many businesses: e-commerce, personalized advertisement, on-demand entertainment. Computer algorithms know what you like and present you with things that are customized for you. Here we will explore how to do that by building a system ourselves.

Get the workshop project repo in advance: https://github.com/Cheukting/knn_recommender

In this workshop, we will use the MovieLens datasets to build a very basic recommendation engine. The model will be item-based KNN (k-nearest neighbours): using information about the shows, it suggests to users who have watched show X a show Y that is similar to it. Along the way, we will transform the data with pandas and SciPy, and train the model with scikit-learn.

For whom is your Workshop
Data scientists or developers who have no experience in building a recommendation engine and are curious about how it can be done. The model we build is in no way good enough to be deployed as a product, but it is a very good first model to learn from and to get an idea of how machine learning can be used to find correlations between items.

Short Format of your Workshop
Overview - 10 mins, Lecture - 30 mins, Break - 20 mins, Hands-on training - 120 mins, Closing - 10 mins

Workshop Agenda
Overview - 10 mins

In this session, we will go through the workshop structure and introduce the steps that we will take and the tools that we will be using in the workshop.

Lecture - 30 mins

In this session, through slides and presentation, we will cover some background on recommendation engines: which models are available and how they work. We will discuss their advantages and disadvantages and the fundamental theory behind them. This will include the model that we are going to build and the more complicated ones that are more commonly used in business.

Break - 20 mins

A short break, overrun buffer, answering questions and setup for the hands-on training.

Hands-on training - 120 mins

At the start of this session, we will have a look at the project skeleton and look at the functions that we will implement in this workshop. (10 mins)

Then we will work on the part that transforms the data into a format that is ready for training the KNN model. (50 mins)
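As a rough illustration of this transformation step, the sketch below pivots a long table of ratings into an item-user matrix and stores it in sparse form. It is not taken from the workshop repo: the toy ratings are made up, the column names assume the MovieLens `ratings.csv` layout (`userId`, `movieId`, `rating`), and the variable names `item_user`, `item_user_sparse` and `row_to_movie` are chosen here for illustration only.

```python
import pandas as pd
from scipy.sparse import csr_matrix

# Toy ratings using the assumed MovieLens column layout (userId, movieId, rating).
# The values are invented for illustration.
ratings = pd.DataFrame(
    {
        "userId": [1, 1, 2, 2, 3, 3, 3],
        "movieId": [10, 20, 10, 30, 10, 20, 30],
        "rating": [4.0, 5.0, 3.0, 2.0, 5.0, 4.0, 1.0],
    }
)

# Rows are movies (the "items"), columns are users; unrated cells become 0.
item_user = ratings.pivot(index="movieId", columns="userId", values="rating").fillna(0)

# Sparse storage: in real rating data most user/movie pairs are empty.
item_user_sparse = csr_matrix(item_user.to_numpy())

# Map matrix row positions back to movie ids for later lookups.
row_to_movie = dict(enumerate(item_user.index))

print(item_user)
print(item_user_sparse.shape, item_user_sparse.nnz)
```

Putting movies on the rows is what makes the model item-based: each movie is described by the vector of ratings it received, so two movies are "close" when the same users rated them similarly.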

Afterwards, we will work on the part that trains the KNN model. Here we will run some experiments and play with different training parameters, for example different similarity metrics. (50 mins)
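One possible way to run such an experiment with scikit-learn is sketched below, using `NearestNeighbors` with a brute-force search so that the metric can be swapped freely. The rating matrix, the movie titles and the `recommend` helper are all hypothetical and are not the functions from the workshop repo.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Hypothetical item-user rating matrix: one row per movie, one column per user.
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
item_user = csr_matrix(
    np.array(
        [
            [5.0, 4.0, 0.0, 1.0],
            [4.0, 5.0, 0.0, 0.0],
            [0.0, 0.0, 5.0, 4.0],
            [1.0, 0.0, 4.0, 5.0],
        ]
    )
)


def recommend(item_index, metric, n_recommendations=1):
    """Return the titles of the items nearest to item_index under the given metric."""
    model = NearestNeighbors(metric=metric, algorithm="brute")
    model.fit(item_user)
    # Ask for one extra neighbour because the query item is its own closest match.
    _, indices = model.kneighbors(
        item_user[item_index], n_neighbors=n_recommendations + 1
    )
    return [titles[i] for i in indices[0] if i != item_index][:n_recommendations]


# Compare how the choice of metric affects the recommendation for "Movie A".
for metric in ("cosine", "euclidean", "manhattan"):
    print(metric, recommend(0, metric))
```

On this tiny matrix all three metrics pick the same neighbour, but on real rating data they can disagree: cosine distance ignores how generous a movie's raters are overall, while Euclidean and Manhattan distances are sensitive to the rating magnitudes.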

Finally, we will test our recommender in the CLI. (10 mins)

Bonus: we will test our recommender in the browser with PyScript

Closing - 10 mins

In this session, we will wrap up what we learned and suggest further learning materials for those who want to study this topic further.

What is required from attendees
A computer with a stable internet connection; Python 3.9 or above and Poetry installed; an open mind and readiness to learn something new

What Attendees will Learn
By the end of the workshop, you will have built your first recommendation engine. You will be given enough information about how to build a better one and where to study further if you are inspired to become an expert in this field.

Course Benefits
You will have learned a new skill set that may assist you in your next data science project. You will be inspired to study further to be able to build a better recommendation engine or do your own research on related topics.

Expected audience expertise: Domain:

none

Expected audience expertise: Python:

some

Abstract as a tweet:

Recommendation algorithms are the driving force of many businesses. Here we will explore how to do that by building a system ourselves.

Cheuk Ting Ho

Before working in Developer Relations, Cheuk was a Data Scientist at various companies, a role that demands strong numerical and programming skills, especially in Python. To follow her passion for the tech community, Cheuk is now the Developer Relations Lead at TerminusDB, an open-source graph database. Cheuk maintains its Python client and engages with its user community daily.

Besides her work, Cheuk enjoys talking about Python on personal streaming platforms and podcasts. Cheuk has also been a speaker at universities and various conferences. Besides speaking at conferences, Cheuk also organises events for developers. Conferences that Cheuk has organised include EuroPython (of which she is a board member), PyData Global and Pyjamas Conf. Believing in tech diversity and inclusion, Cheuk regularly organises workshops and mentored sprints for minority groups. In 2021, Cheuk became a Python Software Foundation Fellow.

This speaker also appears in:

I have to Confess, I still Love Pandas

Picking What to Watch Next - build a recommendation system