On October 29, 2019, the women of Pinterest Engineering and WiBD hosted a casual evening of mind-sharing and lightning talks about Pinterest’s work. The event took place at the Pinterest headquarters in San Francisco, CA, USA.
Regina Karson, WiBD SF Bay Area Chapter Director, kicked off the well-attended evening with Clara Timpe and Tian-Ying Chang from Pinterest, who talked about Pinterest’s passion for supporting WiBD and introduced the speakers of the five lightning talks summarized below.
Goku – Pinterest’s In-House Time-Series Database
Goku is a highly scalable, cost-effective, high-performance online time-series database service. It stores and serves a massive amount of time-series data without losing granularity. Goku can write tens of millions of data points per second and retrieve millions of data points within tens of milliseconds. It supports high compression ratios, downsampling, interpolation, and multidimensional aggregation. It can be used in a wide range of monitoring tasks, including production safety and IoT. It can also be used for real-time analytics that make use of time-series data. The latest version of Goku will also support long-term disk-based storage, as well as downsampling, to reduce the cost of storing data that spans years.
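To make the term concrete, here is a minimal sketch of downsampling in general: averaging raw points into fixed-width time buckets. It is not Goku's implementation or API; the function name `downsample`, the `(timestamp, value)` tuple layout, and the choice of averaging as the aggregation are all assumptions made for illustration.

```python
from collections import defaultdict


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    Hypothetical helper for illustration only; not part of Goku.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the bucket it falls in.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    # Emit one averaged point per bucket, ordered by bucket start time.
    return [
        (start, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]


# Five raw points at 30-second spacing, reduced to 60-second buckets.
raw_points = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0), (120, 9.0)]
coarse_points = downsample(raw_points, 60)
```

Storing one averaged point per bucket instead of every raw point is what lets older data be kept at lower cost, at the price of coarser granularity.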
Airflow at Pinterest
We recently decided to adopt Airflow to schedule workflows at Pinterest, replacing our old system, Pinball. In this talk, we’ll go through how this decision was made, the challenges we faced in bringing this technology into our environment, and the future of workflows at Pinterest.
Experimentation at Pinterest
An A/B experiment is a method for comparing two (or more) variations of something to determine which one performs better against your target metrics. It is key to metrics-driven development. This presentation will discuss how experimentation works at Pinterest and present our whole experimentation framework, which handles data at large scale.
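As a generic illustration of comparing two variations against a metric, the sketch below runs a two-proportion z-test on conversion counts. This is a textbook technique chosen for the example, not a description of Pinterest's experimentation framework; the function name `two_proportion_z_test` and the sample counts are assumptions.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    Hypothetical helper for illustration; assumes large, independent samples.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: control converts 100 of 1000, treatment 130 of 1000.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
```

A small p-value suggests the observed difference between the variants is unlikely to be due to chance alone; a production framework would also need to handle issues such as multiple metrics and repeated looks at the data.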
Large Scale Batch Jobs at Pinterest—Why and how to run them on Kubernetes
Running large-scale batch jobs, including both big data processing and machine learning jobs, has been a pain point within Pinterest for a long time. The existing Hadoop-based platform has huge limitations on compute resources as well as tenant management. On the ML side, the previous Mesos-based compute platform lacks community support. Here we present two cases, TF training and Pintext data processing, focusing on why and how they migrated to our in-house Kubernetes compute platform.
Processing Offline Jobs at Pinterest
This talk takes a look at how Pinterest handles billions of offline jobs per day. The Pinlater service supports a large number of use cases in Pinterest’s day-to-day business. We will talk about the architecture behind the service and some of the issues we had to consider during implementation.
We are building a visual discovery engine that powers recommendations for 250M+ people every month. On Pinterest, people have saved more than 175B Pins in categories like food, style and home. We use a mix of machine learning and human curation (how people are labeling Pins and naming boards) to describe taste, and computer vision to identify visually similar objects. Every day, people around the world use Pinterest to dream about, plan and prepare for things they want to do in life, and we’re just getting started.
Thank you, Pinterest, for such a great event. WiBD looks forward to more of the same.
Presentation deck is available here.