2022-07-15, Liffey Hall 1
Online platforms have a hard time combating hate, hate speech, explicit content and other NSFW material. Most solutions are rule-based keyword approaches, which are brittle and easily bypassed. At PayPal, we have a wide range of user-generated content, and there is a great need to automatically identify and flag hate, explicit content and other typologies, both to improve user experience and to adhere to regulatory policies. In this talk we showcase how AI can help us identify such content with great precision.
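To make the brittleness of keyword approaches concrete, here is a minimal sketch of a blocklist filter and a few trivial ways around it. The blocklist, the `keyword_flag` function and the sample strings are hypothetical illustrations, not PayPal's system or anything shown in the talk.

```python
import re

# Hypothetical blocklist, for illustration only.
BLOCKLIST = {"idiot", "scum"}


def keyword_flag(text: str) -> bool:
    """Flag text if any alphabetic token exactly matches a blocklisted word."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(token in BLOCKLIST for token in tokens)


samples = [
    "you are an idiot",      # exact match: caught
    "you are an id1ot",      # digit substitution splits the token: missed
    "you are an i d i o t",  # spacing between letters: missed
]

for sample in samples:
    print(f"{sample!r} -> flagged={keyword_flag(sample)}")
```

Each bypass can be patched with one more rule, but every patch invites a new variant, which is the maintenance burden that motivates learned classifiers.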
Online content moderation at scale is a non-trivial task, especially as the landscape of hate and hate speech keeps shifting with changing geopolitical scenarios. Moderation platforms need to support multiple typologies, such as hate, sexually explicit content, violence, bullying, spam and other toxic material. Add multi-language support for all typologies and it becomes an uphill task. In this talk we will cover the following topics:
- Why is text content moderation hard? Why do we need AI?
- What are the available open-source datasets to train models?
- What are the available pre-trained models for content moderation?
- Why do pre-trained models not always work?
- What are the data labelling strategies, and how can we leverage open data and models?
- How can we build multi-language support, and what are the challenges?
Expected audience expertise: Python: some
Abstract as a tweet: Moderating content on online platforms at scale is an ever-changing landscape. Keyword-based and rule-based systems are brittle and easy to bypass. We showcase how we leverage AI at PayPal to solve such hard problems.
Raghotham is an AI Architect at PayPal and leads AI teams for the Customer Success Platform. He comes with a rich background in building AI platforms and teams for startups and large enterprises. Drawing on his deep love for data science and neural networks and his passion for teaching, Raghotham has conducted workshops across the world and given talks at a number of data science conferences. Apart from getting his hands dirty with data, he loves traveling, Pink Floyd, and masala dosas.
Ryan is an NLP Engineer at PayPal, primarily working on text content moderation for the Customer Success Platform. He is fairly new to the field of NLP, with just over 1 year of experience, but has a great passion for all things statistics and data science. This is his first conference appearance (but hopefully not his last!), and he is very excited to present and learn! He's also very enthusiastic about ice cream. Probably too enthusiastic...