{"id":12,"date":"2017-01-06T10:54:33","date_gmt":"2017-01-06T10:54:33","guid":{"rendered":"http:\/\/www.zombiesoftwares.com\/blog\/?p=12"},"modified":"2017-01-13T13:53:01","modified_gmt":"2017-01-13T13:53:01","slug":"search-serving-and-ranking-at-pinterest","status":"publish","type":"post","link":"http:\/\/www.zombiesoftwares.com\/blog\/search-serving-and-ranking-at-pinterest\/","title":{"rendered":"Search serving and ranking at Pinterest"},"content":{"rendered":"<p>Pinterest Search handles billions of queries every month. Every day, we help millions of Pinners discover useful ideas by delivering results from among billions of Pins saved by people with overlapping tastes. In the early days, we built our first search system on top of Solr and Lucene. Over the past few years, we\u2019ve evolved our search stack by adding new layers, designing services and experimenting with ranking functions. These advancements have driven product changes across platforms, ranging from the <a href=\"http:\/\/www.zombiesoftwares.com\/blog\/search-serving-and-ranking-at-pinterest\/\">search guides<\/a> that have become industry standard, to improved results based on <a href=\"http:\/\/www.zombiesoftwares.com\/blog\/search-serving-and-ranking-at-pinterest\/\">signals like interests and location<\/a>, to <a href=\"http:\/\/www.zombiesoftwares.com\/blog\/search-serving-and-ranking-at-pinterest\/\">visual search<\/a> that uses the latest in computer vision. 
In this post, we\u2019ll provide an overview of our search serving and ranking stack, and look ahead to future improvements.<\/p>\n<h2>Life of a search query<\/h2>\n<p>The following diagram shows the life of a search query on Pinterest.<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/www.zombiesoftwares.com\/blog\/wp-content\/uploads\/2017\/01\/ScreenShot2016-12-09at10.02.22AM-1.png?w=840\" alt=\"\" \/><\/p>\n<p>When a Pinner searches on Pinterest, the query goes from our API layer to our search backend. In the search backend, an Anticlimax service understands the query, Obelix machines find the most relevant Pins for the query and an Asterix service coordinates the returned results.<\/p>\n<h2>Asterix<\/h2>\n<p>Asterix serves as the super root of our search system. It first calls Anticlimax to understand the query intent and rewrite the query. With the query rewrite result, Asterix constructs a personalized ranking function that may favor certain results, such as local content, fresh content or partial matches.<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/www.zombiesoftwares.com\/blog\/wp-content\/uploads\/2017\/01\/ScreenShot2016-12-09at10.03.46AM-1.png?w=840\" alt=\"\" \/><\/p>\n<p>Inside Asterix, there are three major components: cluster client, rerankers and blenders. The cluster client is a scatter\/gather service that distributes search requests to Obelix nodes, waits for results and then merges them. (The cluster client also retries on outliers and handles partial results and other failures.) The merged search result is then reranked based on different business logic. For example, a machine-learned reranker generates a new ranking score for each Pin based on context features, while a local reranker boosts Pins in a Pinner\u2019s language.<\/p>\n<p>The search results from different clusters are blended using both simple and complex blending logic. 
For instance, we can use proportional blending to insert 10 percent fresh Pins into results. More complex blending logic allows us to, for example, surface Buyable Pins in results based on query intent, such as for the query \u201cmen\u2019s black sneakers\u201d.<\/p>\n<h2>Anticlimax<\/h2>\n<p>Anticlimax is our query understanding and rewrite service. It has a pluggable interface so engineers can plug in their rewriters and datasets. Each query rewriter takes in a structured query and user information from the previous worker and rewrites it into another structured query.<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/www.zombiesoftwares.com\/blog\/wp-content\/uploads\/2017\/01\/ScreenShot2016-12-09at10.05.41AM-1.png?w=840\" alt=\"\" \/><\/p>\n<p>All query rewriters are chained in a sequence, and we execute them one by one. Spell correction and query segmentation must be executed before the other rewriters. Query expansion, query category prediction and other workers can switch order or execute in parallel.<\/p>\n<p>We support <a href=\"http:\/\/www.zombiesoftwares.com\/blog\/search-serving-and-ranking-at-pinterest\/\">different data sources<\/a> for each query rewriter. For example, the spell correction model is stored in memory, larger dictionaries are stored as HFiles on disk and query category prediction data is stored in a different service (which we query for every search).<\/p>\n<h2>Obelix<\/h2>\n<p>Obelix is a single leaf search server. It receives search requests from Asterix, retrieves and scores matching Pins from the index and returns the top results to Asterix.<\/p>\n<p>An Obelix server may have multiple index segments. The Pins inside each index segment are ordered by a query-independent score, the static rank. This score measures Pin quality, which is an important factor in the final ranking. With the static rank, we\u2019re able to score only the first few retrieved Pins for most search queries and still guarantee the best Pins are scored. 
Static rank also lets us afford more complex scoring functions.<\/p>\n<p>The searcher in Obelix scans and scores multiple index segments in parallel. This maximizes our CPU usage during non-peak hours and improves latency.<\/p>\n<h2>Search Ranking<\/h2>\n<p>Search ranking is the process of choosing the most relevant, useful and personalized Pins for a search query. We approach this unique problem from four different aspects.<\/p>\n<ol>\n<li><strong>Query. <\/strong>Similar to other search engines, Pinterest query rewrite does spell correction, query segmentation, category prediction and other rewrites. However, a unique aspect of Pinterest search is <em>what<\/em> Pinners are searching for. On Pinterest, people issue exploratory queries for ideas rather than asking objective questions. To provide diverse results, we developed a context-based query expansion. By analyzing query logs and engagement data, we extract query pairs that have similar keyword context and engaged results, and use them to construct term expansions. For example, \u201crelief\u201d can expand to \u201cremedies stress\u201d under the context \u201canxiety\u201d. After expanding the results, we provide <a href=\"http:\/\/www.zombiesoftwares.com\/blog\/search-serving-and-ranking-at-pinterest\/\">guides<\/a> to help users drill down into specific interests.<\/li>\n<li><strong>Content.<\/strong> Pinterest has a unique, human-curated dataset made up of Pins, boards and Pinners. We explore different signals from our content, some human readable (e.g. board titles) and some not (e.g. embedding vectors for Pins, boards and Pinners).<\/li>\n<\/ol>\n<p>Let\u2019s look at an example of work we\u2019ve done to better understand board titles. 
Let\u2019s say a Pinner saves a Pin about \u201cDaisy Ridley at the 2016 Academy Awards\u201d to a board called \u201cOscar gowns.\u201d We extract dozens of signals, such as board title frequency, Pin topic distribution and board quality, and can then infer that this Pin is not only about \u201cDaisy Ridley\u201d but also related to \u201coscar gowns\u201d and, more specifically, \u201c2016 oscar gowns.\u201d<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/www.zombiesoftwares.com\/blog\/wp-content\/uploads\/2017\/01\/ScreenShot2016-12-09at10.07.59AM-1.png?w=840\" alt=\"\" \/><\/p>\n<h2>Personalization<\/h2>\n<p>Now that 150 million people use Pinterest every month, and more than half of Pinners are outside the U.S., the ideas and interests on the platform are more diverse than ever. This presents a huge technical challenge: showing the right idea to the right person at the right time. For example, if a Pinner searches for \u201cfootball\u201d in the UK, he or she is likely not looking for American football content. To make search on Pinterest more personal for every Pinner, we boost results based on certain preferences, such as gender, location and language.<\/p>\n<p>There\u2019s a lot more we can do to personalize the search experience for Pinners, and we\u2019re exploring strategies and building plans for the future of personalized search.<\/p>\n<h2>Ranking<\/h2>\n<p>We use a machine learning model to score search results. The model aims to optimize a Pinner\u2019s engagement with results after they issue a query, such as saving a result or clicking through, depending on the query\u2019s intent. As mentioned, model features come from the query, content and Pinners, such as the text matching score between a rewritten query and a specific text source (e.g. board title), the match between search query category and Pin category or the match between Pin description language and a Pinner\u2019s language. 
In the end, all features are combined in a linear function, so we can still understand and tweak certain parameters.<\/p>\n<p>We\u2019re also experimenting with different models, including neural networks and gradient-boosted decision trees. There\u2019s a lot of ongoing work, both on search quality and on the infrastructure side, so we can better understand content and apply complicated scoring functions online.<\/p>\n<h2>Conclusion<\/h2>\n<p>Building a personalized search experience is a huge technical challenge. Our search infrastructure and ranking are still very young, and there are endless opportunities for engineers to make a big impact. We\u2019re building our next generation serving system with capabilities for better personalization. We\u2019re also investigating different signals, ranking models and verticals that provide better user experiences.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pinterest Search handles billions of queries every month. 
Every day, we help millions of Pinners discover useful ideas by delivering <span class=\"more-text\">&hellip;<\/span><\/p>\n","protected":false},"author":1,"featured_media":435,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":true,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false},"version":2}},"categories":[24],"tags":[68,51,62,57],"class_list":["post-12","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-site-search","tag-autocomplete","tag-conversion","tag-site-search","tag-social-media"],"jetpack_publicize_connections":[],"jetpack_sharing_enabled":true,"jetpack_featured_media_url":"https:\/\/i0.wp.com\/www.zombiesoftwares.com\/blog\/wp-content\/uploads\/2017\/01\/pinterest-name-white-fade-1920.png?fit=450%2C450","jetpack_shortlink":"https:\/\/wp.me\/p8i7fD-c","jetpack-related-posts":[],"_links":{"self":[{"href":"http:\/\/www.zombiesoftwares.com\/blog\/wp-json\/wp\/v2\/posts\/12","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.zombiesoftwares.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.zombiesoftwares.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.zombiesoftwares.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.zombiesoftwares.com\/blog\/wp-json\/wp\/v2\/comments?post=12"}],"version-history":[{"count":6,"href":"http:\/\/www.zombiesoftwares.com\
/blog\/wp-json\/wp\/v2\/posts\/12\/revisions"}],"predecessor-version":[{"id":724,"href":"http:\/\/www.zombiesoftwares.com\/blog\/wp-json\/wp\/v2\/posts\/12\/revisions\/724"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/www.zombiesoftwares.com\/blog\/wp-json\/wp\/v2\/media\/435"}],"wp:attachment":[{"href":"http:\/\/www.zombiesoftwares.com\/blog\/wp-json\/wp\/v2\/media?parent=12"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.zombiesoftwares.com\/blog\/wp-json\/wp\/v2\/categories?post=12"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.zombiesoftwares.com\/blog\/wp-json\/wp\/v2\/tags?post=12"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}