Felix Zhou | Shopping, Weiran Li | Shopping, Somnath Banerjee | Shopping
Pinterest’s mission is to bring everyone the inspiration to create a life they love. Shopping is at the core of that mission: it helps Pinners find and purchase the products they like. Often, when Pinners want to buy something, they are not aware that they can go to a product-only feed, where every Pin is a product from a trustworthy merchant. To increase awareness, we present an explicit portal to this product-only page in the main search results, which we refer to as “shopping upsells” (Fig. 1, highlighted in the pink box). Showing upsells for non-shoppable queries like “how to make a mask” is a poor Pinner experience, so the shopping upsell appears only when the search query shows high shopping intent, e.g., “white dress”.
While it may be fairly straightforward to decide whether a shopping upsell is appropriate for a single search query, scaling that decision to hundreds of millions of daily searches on Pinterest is a nontrivial task. To handle searches at this scale, we trained a high-accuracy machine learning model to determine the query shopping intent, that is, whether Pinners would like to shop when they search for a given query. We also optimized model serving with a combination of online and offline inference to reduce latency and infra cost.
Shopping Intent model
What is query shopping intent? Different interpretations lead to different models. We first interpreted it as the upsell click rate (V1 model). We then proposed interpreting it as the product Pin long click propensity (V2 model), which mitigated the issues of the V1 model.
The V1 model characterized the query shopping intent as the click rate of the upsell for a given query. The higher the upsell click rate, the more likely the query has shopping intent. At the very beginning, we didn’t have any training data because no shopping upsell had been displayed. To collect training data, we randomly displayed upsells for queries within a small portion of production traffic.
For the collected training data, we defined the feature as the raw query and the label as the upsell click rate. We used a 100-dimensional pre-trained embedding to featurize the queries and built a three-layer vanilla deep neural network trained with a binary cross-entropy loss. The model output is the predicted upsell click rate.
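To make the V1 setup concrete, here is a minimal sketch of the forward pass and loss, not the production model. It assumes "three-layer" means three dense layers, and the hidden width, initialization, and random stand-in embedding are all hypothetical. It shows that binary cross-entropy works with a fractional label such as a click rate. Training is omitted.

```python
import math
import random

EMBED_DIM = 100  # matches the 100-dimensional embedding described above
HIDDEN = 16      # hypothetical hidden width, not from the source


def init_layer(n_in, n_out, rng):
    """Create one dense layer as (weights, biases) with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, layer, activation):
    """Apply one dense layer followed by an element-wise activation."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def predict_click_rate(query_embedding, layers):
    """Two ReLU layers, then a sigmoid output in (0, 1)."""
    h = dense(query_embedding, layers[0], relu)
    h = dense(h, layers[1], relu)
    return dense(h, layers[2], sigmoid)[0]


def binary_cross_entropy(label, pred, eps=1e-7):
    """BCE with a soft label in [0, 1], e.g. an observed upsell click rate."""
    pred = min(max(pred, eps), 1.0 - eps)
    return -(label * math.log(pred) + (1.0 - label) * math.log(1.0 - pred))


rng = random.Random(0)
layers = [
    init_layer(EMBED_DIM, HIDDEN, rng),
    init_layer(HIDDEN, HIDDEN, rng),
    init_layer(HIDDEN, 1, rng),
]
# Stand-in for the pre-trained embedding of a query (hypothetical values).
query_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
predicted = predict_click_rate(query_embedding, layers)
loss = binary_cross_entropy(0.12, predicted)  # 0.12 is an illustrative label
```

Because the label is a rate rather than a 0/1 outcome, the loss is minimized when the prediction equals the observed rate, which is why the sigmoid output can be read as a predicted upsell click rate.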
The upsell click rate-based model has two disadvantages:
- False positive signal: We observed that users often clicked upsells without any engagement with the Pins on the shopping search page. This indicated that either the users had no shopping intent and clicked the upsell accidentally, or the search quality on the shopping page was not good. In either case, we should not trigger the upsell because it does not achieve the desired engagement outcome.
- Biased feedback loop of the positive signal: All the positive signals came from the clicked upsells predicted by models in the past. A newer version of the model just tried to fit the signal from the past.
To address both issues, we switched to an improved V2 model.
Pinterest shows product Pins on both the shopping upsell and the organic search page, but we don’t show price tags for product Pins on the organic search page. Users’ engagement with product Pins gives strong signals of whether they are interested in shopping: users may save the Pin to a board, or click through the Pin to visit the associated website and remain offsite for an extended period of time (called a ‘long click’ at Pinterest). In the V2 model, we characterize the query shopping intent as the long click propensity of product Pins for a query.
For a search query, we define the long click propensity as
We use the save signal to calculate the long click propensity because (1) a save is like add-to-cart on shopping websites, since users may save Pins for future purchases, and (2) it generates more training samples and further breaks the feedback loop. We trained the model defined in Fig. 2.
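The exact definition of the long click propensity is not reproduced in this text, so the following is only a sketch under a stated assumption: that the propensity is the number of product Pin long clicks plus saves, divided by the number of product Pin impressions for the query. The event names and the `long_click_propensity` helper are hypothetical.

```python
from collections import defaultdict

EVENT_TYPES = ("impression", "long_click", "save")


def long_click_propensity(events):
    """Aggregate per-query propensity from (query, event_type) pairs.

    ASSUMPTION: propensity = (long clicks + saves) / impressions.
    The source's own formula may differ.
    """
    counts = defaultdict(lambda: dict.fromkeys(EVENT_TYPES, 0))
    for query, event_type in events:
        if event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {event_type}")
        counts[query][event_type] += 1

    propensity = {}
    for query, c in counts.items():
        if c["impression"] == 0:
            continue  # no impressions, so no propensity can be estimated
        propensity[query] = (c["long_click"] + c["save"]) / c["impression"]
    return propensity


# Illustrative product Pin events for one query.
events = [
    ("white dress", "impression"),
    ("white dress", "impression"),
    ("white dress", "impression"),
    ("white dress", "impression"),
    ("white dress", "long_click"),
    ("white dress", "save"),
]
scores = long_click_propensity(events)
```

Under this assumption, counting saves alongside long clicks raises the number of positive examples per query, which is the second benefit described above.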
To reduce the infra cost of serving a deep learning model, we don’t call the model for the following queries:
- Head queries, which contribute one third of our yearly search traffic: Wrong predictions on head queries have a worse impact than on non-head queries. Since we cannot ensure 100% prediction accuracy, we precompute the shopping intent scores for the head queries and keep them in a key-value store where the lookup latency is on the order of milliseconds.
- Queries belonging to non-shoppable categories, such as ‘recipe’ or ‘finance’: We leverage the existing query-to-product-category model in production to filter out those queries.
After filtering, we reduced traffic to the deep learning model by 70%. Fig. 3 shows the whole workflow.
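The serving path described above can be sketched roughly as follows. This is an illustration, not the production workflow. The order of the checks, the 0.0 score for non-shoppable categories, and the 0.5 threshold are assumptions. `head_scores` is a plain dict standing in for the key-value store, and `categorize` and `model_predict` are hypothetical callables standing in for the query-to-product-category model and the deep learning model.

```python
from typing import Callable, Dict, Set


def shopping_intent_score(
    query: str,
    head_scores: Dict[str, float],
    non_shoppable_categories: Set[str],
    categorize: Callable[[str], str],
    model_predict: Callable[[str], float],
) -> float:
    """Return a shopping intent score, calling the model only when needed."""
    key = query.strip().lower()

    # 1. Head queries: use the precomputed (offline) score.
    if key in head_scores:
        return head_scores[key]

    # 2. Non-shoppable categories: skip the model entirely.
    if categorize(key) in non_shoppable_categories:
        return 0.0  # assumed score for filtered queries

    # 3. Everything else: online inference with the deep learning model.
    return model_predict(key)


def should_show_upsell(score: float, threshold: float = 0.5) -> bool:
    """Decide whether to show the upsell; the threshold is hypothetical."""
    return score >= threshold


# Stubs for illustration only.
model_calls = []


def fake_model(query: str) -> float:
    model_calls.append(query)
    return 0.9


def fake_categorize(query: str) -> str:
    return "recipe" if "recipe" in query else "fashion"


head = {"white dress": 0.8}
blocked = {"recipe", "finance"}

head_score = shopping_intent_score(
    "White Dress", head, blocked, fake_categorize, fake_model)
recipe_score = shopping_intent_score(
    "soup recipe", head, blocked, fake_categorize, fake_model)
calls_before_tail = list(model_calls)
tail_score = shopping_intent_score(
    "linen curtains", head, blocked, fake_categorize, fake_model)
```

In this sketch only the third query reaches the model. The first two are answered by the lookup and the category filter, which is how the filtering keeps most traffic away from the deep learning model.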
In the launch experiment, the model drove a more than 2X increase in traffic to the shopping search page without hurting overall search metrics in terms of long clicks or saves. It also drove a more than 2X increase in product impressions and product long clicks through the upsell.
Displaying shopping upsells for high shopping intent queries is just one of the strategies to acquaint users with Pinterest shopping. In the future, we will incorporate more signals to improve the model performance, helping more users enjoy shopping on Pinterest.
Acknowledgments: The authors would like to thank the following people for their contributions: Jinyu Xie, Karthik Anantha Padmanabhan, and Tien Nguyen.
Driving Shopping Upsells from Pinterest Search was originally published in Pinterest Engineering Blog on Medium.