Modern Product Discovery Solutions Solve for Complex Shopper Intent
By: Curt Brown, Director, Solutions Consulting, Monetate
Providing rich relevance, especially on matters of modern product discovery, is a hard problem to solve. Even the Googles and Amazons of the world struggle with it.
Take this example on Amazon.
A search for “desktop hp 3d printer” yields less-than-ideal results.
Instead of surfacing desktop 3D printers, HP wireless color printers are exposed to the shopper:
However, a slight tweak from “desktop hp 3d printer” to “desktop 3d printer” reveals that lots of 3D printers are available.
So why is this happening?
Most search engines are built on a baseline algorithm called term frequency-inverse document frequency (TFIDF), which was designed in the 1970s and 1980s for document search.
TFIDF is a popularity algorithm: it scores a document or product detail page (PDP) by how many times a search term appears in it, discounted by how common that term is across the whole collection. This might have worked great in the early days of the internet when everything online was a document or a news article. But in today’s eCommerce-rich digital landscape, TFIDF has become a real problem.
Consider the search term “blue dress shirt.”
Using TFIDF alone on an apparel site, your search would yield a collection of dresses, shirts, and any products marked blue, each term considered as relevant to the retail shopper by TFIDF as the other.
The product term, not its adjectives or attributes, should be treated as the most important part of the query. In this case, “shirt” is the product the shopper is looking for, not “dresses” or “blue products.”
Frustratingly, TFIDF struggles to return the most relevant products to shoppers, and tuning relevancy becomes a manual process that administrators must constantly maintain.
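To make the “blue dress shirt” problem concrete, here is a minimal sketch of TFIDF scoring over a tiny catalog. The product titles are hypothetical, and the smoothed IDF formula is one common variant, not necessarily what any particular search engine uses.

```python
import math
from collections import Counter

# Hypothetical product titles standing in for an apparel catalog.
CATALOG = [
    "blue dress shirt",
    "white dress shirt",
    "blue summer dress",
    "red evening dress",
    "blue denim jeans",
    "green polo shirt",
]

def tokenize(text):
    return text.lower().split()

def idf(term, docs):
    # Smoothed inverse document frequency: rarer terms get a higher weight.
    containing = sum(1 for doc in docs if term in doc)
    return math.log((1 + len(docs)) / (1 + containing)) + 1

def tfidf_score(query, doc, docs):
    # Sum of (term frequency * IDF) over every query term.
    counts = Counter(doc)
    return sum(counts[term] * idf(term, docs) for term in tokenize(query))

docs = [tokenize(title) for title in CATALOG]
scores = {
    title: tfidf_score("blue dress shirt", doc, docs)
    for title, doc in zip(CATALOG, docs)
}

for title, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{score:.2f}  {title}")
```

In this toy catalog, “blue summer dress” scores exactly as high as “white dress shirt,” and “blue denim jeans” ties with “green polo shirt.” Nothing in the scoring knows that “shirt” is the product and “blue” is only an attribute.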
How Do We Solve for Poor Search Relevancy?
The use of contextual or semantic understanding, as well as natural language processing, has improved search significantly for retailers in the last 10 years.
There has also been a lot of progress with Artificial Intelligence (AI) and machine learning for automated relevancy tuning.
The first place shoppers typically express their intent is the search box. It is the most direct and clear way shoppers can tell a retailer exactly what they are shopping for.
However, some shoppers are at the top of the shopping funnel and are in discovery mode. This is where you will find short queries like “jeans” or “dresses” which are referred to as head queries.
Long-tail queries, or more complex queries, are signs the shopper has narrowed in on the product they are looking for and is closer to the bottom of the shopping funnel.
Both of these shoppers can be catered to through a good autosuggest, or type-ahead, sometimes called visual search.
This is where the search solution will look at the characters being typed and refine the suggested products and categories of products directly in the search box without executing a search.
These search box visual suggestions come in a variety of formats or themes to meet merchandiser preferences. AI and machine learning can use shoppers’ engagement at the search box as well as engagement with products on the result page to predict which products are best suited to the characters being typed into the search box in real time.
This provides a shortcut for both modes of shoppers (Discovery vs Buying).
If I’m in discovery mode, I may click on one of the categories to review a variety of products, but if I’m in buying mode, I may click directly on the product that most interests me in the auto-suggested listing.
This helps meet both kinds of shoppers’ expectations without even executing a search query. Shoppers who fall in between, or whose needs are more refined than what the visual search exposes, will type their full query and hit enter. This is where relevancy is incredibly important.
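The type-ahead behavior described above can be sketched as a lookup from typed characters to suggestions ranked by past engagement. The click log, the `category:` and `product:` labels, and the function names are all hypothetical; a production system would use a learned model and far richer signals than raw click counts.

```python
from collections import defaultdict

# Hypothetical engagement log: (characters typed, suggestion the shopper clicked).
CLICK_LOG = [
    ("je", "category:Jeans"),
    ("je", "category:Jeans"),
    ("je", "product:Slim Fit Jeans"),
    ("jea", "product:Slim Fit Jeans"),
    ("jea", "product:Slim Fit Jeans"),
    ("je", "product:Jersey Knit Top"),
]

def build_suggester(click_log):
    # Count how often each suggestion was clicked for each typed prefix.
    counts = defaultdict(lambda: defaultdict(int))
    for typed, clicked in click_log:
        counts[typed.lower()][clicked] += 1
    return counts

def suggest(counts, typed, limit=3):
    # Rank suggestions by click count, breaking ties alphabetically.
    candidates = counts.get(typed.lower(), {})
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:limit]]

counts = build_suggester(CLICK_LOG)
print(suggest(counts, "je"))
print(suggest(counts, "jea"))
```

With two characters typed, the category suggestion leads, which suits a shopper in discovery mode. One more character shifts the list toward a specific product, which suits a shopper in buying mode.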
Step 1
The first step with any search solution deployment is building an index. This is done by ingesting a retailer’s product catalog, sometimes referred to as a product feed.
In this step, being able to take a feed and understand all the attributes within the feed like color, material, or size is a critical part of building relevancy.
Think of it as a tree: categories of products are big branches, subcategories are smaller branches, and the products are leaves.
In this way, we are building a relevancy tree.
Of course, it gets tricky when you think about a product that can belong to more than one category, but modern product discovery solutions handle this nicely. This is also where logic or perhaps AI should be able to understand what the product is and what the attributes are.
Some human-assisted AI can also be applied here to weight certain fields in the feed higher in importance than others. For example, if the brand is an important field, then you weight it higher in relevancy than, say, color or gender.
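As a rough illustration of field weighting, the sketch below scores a product by which feed fields the query terms match. The field names, the weights, and the sample products are assumptions made up for this example.

```python
# Hypothetical field weights: a brand match counts more than color or gender.
FIELD_WEIGHTS = {"brand": 3.0, "title": 2.0, "color": 1.0, "gender": 1.0}

def weighted_match_score(query, product, weights=FIELD_WEIGHTS):
    # Add the field's weight for every query term found in that field.
    terms = set(query.lower().split())
    score = 0.0
    for field, weight in weights.items():
        field_terms = set(str(product.get(field, "")).lower().split())
        score += weight * len(terms & field_terms)
    return score

products = [
    {"brand": "Acme", "title": "Running Shoe", "color": "Red", "gender": "Men"},
    {"brand": "Zenith", "title": "Acme Style Sandal", "color": "Red", "gender": "Women"},
]

for product in products:
    print(product["title"], weighted_match_score("red acme", product))
```

Both products match “red” and “acme,” but the product whose brand is Acme scores higher than the one that only mentions the word in its title.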
An important part of any modern site search is a synonyms library and the ability to have logic take attributes in a feed and match them to the library.
Many new search solutions have a feature to add new synonyms through machine learning: essentially watching query terms and what shoppers click on and buy and making associations with other products or similar attributes.
Components That Drive Modern Product Discovery
Here are some other concepts that are important to ensuring a relevant shopper experience:
Semantic Search/Natural Language Processing
Contextual Relevancy
- Product vs. Attribute
- What’s the shopper’s intent – increase the weight of keywords based on shopper’s intent by utilizing natural language models.
Error Checking
- Synonym tagging – ensures the most relevant results are returned by automatically searching for synonyms of keywords within the query.
- Stemming – enables the search engine to understand that all words with the same root should be treated equally, for example hike and hiking, swim and swimming, dress and dresses.
- Stop words – eliminates any unnecessary filler words from the original query to avoid false matches.
- Spellcheck – automatically recognizes and corrects misspelled words.
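The four error-checking steps above can be chained into one query-normalization pass. This is a minimal sketch: the stop-word list, synonym library, and vocabulary are hypothetical, the stemmer is deliberately naive (real engines use a proper stemming algorithm), and spellcheck is approximated with `difflib.get_close_matches` from the Python standard library.

```python
import difflib

# Hypothetical data a real solution would derive from the product feed.
STOP_WORDS = {"a", "an", "the", "for", "of", "in", "with"}
SYNONYMS = {"sofa": ["couch"], "sneakers": ["trainers"]}
VOCABULARY = ["hiking", "boots", "swimming", "dresses", "sofa", "couch", "sneakers"]

def stem(word):
    # Naive suffix stripping so that related word forms share one root.
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        if len(word) > 2 and word[-1] == word[-2]:
            word = word[:-1]  # swimm -> swim
    elif word.endswith("es") and len(word) > 4:
        word = word[:-2]
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]  # hike -> hik, matching hiking -> hik
    return word

def correct(word):
    # Replace an unknown word with its closest vocabulary entry, if any.
    if word in VOCABULARY:
        return word
    matches = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.8)
    return matches[0] if matches else word

def normalize(query):
    terms = []
    for word in query.lower().split():
        if word in STOP_WORDS:
            continue
        word = correct(word)
        # Expand synonyms, then stem the word and each of its synonyms.
        for candidate in [word, *SYNONYMS.get(word, [])]:
            stemmed = stem(candidate)
            if stemmed not in terms:
                terms.append(stemmed)
    return terms

print(normalize("the hikng boots"))
print(normalize("a sofa"))
```

The stems do not need to be real words. They only need to be the same for every form of a word, so that the query and the indexed product text meet on a common root.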
Classification
- Analyzes queries to determine the most relevant categories.
Relational Queries
- Determine user intent through the understanding of grammatical constructs.
- Example: For theme-based searches like “boots for hiking,” search logic should determine that “boots” is the product type and that “hiking” is a secondary attribute related to “boots.” Likewise, “purple dress for weddings” and “table lamps for bedroom” are theme-based searches where the second part is an additional qualifier on the first part. Recognizing this structure helps showcase more relevant results.
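A very simple way to picture this is a pattern that splits a query on the word “for.” The sketch below is only an illustration under that assumption. Real solutions rely on natural language models instead of a single regular expression, and the function name is hypothetical.

```python
import re

# Matches "<product> for <theme>", e.g. "boots for hiking".
RELATIONAL_PATTERN = re.compile(
    r"^(?P<product>.+?)\s+for\s+(?P<theme>.+)$", re.IGNORECASE
)

def parse_relational_query(query):
    # Return the primary product phrase and the secondary qualifier, if any.
    query = query.strip()
    match = RELATIONAL_PATTERN.match(query)
    if match is None:
        return {"product": query, "theme": None}
    return {"product": match.group("product"), "theme": match.group("theme")}

for q in ["boots for hiking", "purple dress for weddings", "jeans"]:
    print(q, "->", parse_relational_query(q))
```

A search engine could then require a match on the product part and use the theme part only to boost or filter results.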
Clustering
- Are there similar queries that have returned better results? For example, if I search for “bags,” AI logic can look at related queries (trash bags, garbage bags, freezer bags, sandwich bags, etc.) and use search frequency and search success, i.e., conversion, to determine the best set of products to return, thus improving the results for the shopper.
Range Queries
- Example: “45-52-inch television.” Logic is applied so that any television with a size within this range is included in the results.
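The range logic can be sketched as two steps: pull the numeric range out of the query, then keep products whose size falls inside it. The regular expression, the `screen_inches` field, and the sample products are assumptions for illustration only.

```python
import re

# Matches size ranges such as "45-52-inch" or "45 to 52 inch".
RANGE_PATTERN = re.compile(
    r"(\d+(?:\.\d+)?)\s*(?:-|to)\s*(\d+(?:\.\d+)?)\s*-?\s*inch\b",
    re.IGNORECASE,
)

def parse_size_range(query):
    # Return (low, high, remaining query text), or None if no range is found.
    match = RANGE_PATTERN.search(query)
    if match is None:
        return None
    low, high = sorted((float(match.group(1)), float(match.group(2))))
    remainder = (query[:match.start()] + query[match.end():]).strip()
    return low, high, remainder

def search_by_size(query, products):
    parsed = parse_size_range(query)
    if parsed is None:
        return [p for p in products if query.lower() in p["title"].lower()]
    low, high, remainder = parsed
    return [
        p for p in products
        if low <= p["screen_inches"] <= high
        and remainder.lower() in p["title"].lower()
    ]

# Hypothetical catalog entries.
PRODUCTS = [
    {"title": "43-inch LED Television", "screen_inches": 43},
    {"title": "50-inch OLED Television", "screen_inches": 50},
    {"title": "52-inch QLED Television", "screen_inches": 52},
    {"title": "55-inch LED Television", "screen_inches": 55},
    {"title": "48-inch Soundbar", "screen_inches": 48},
]

for product in search_by_size("45-52-inch television", PRODUCTS):
    print(product["title"])
```

The range is inclusive in this sketch, so a 52-inch television is returned while the 43-inch and 55-inch models are not.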
Diversity Handling
- Resolves the lack of identifiable shopper intent from head queries by showing a diverse result set.
- Example: If a shopper searches for a broad keyword like ‘phone,’ the solution shows a variety of phones across different brands, models, colors, and more in search results.
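One simple way to produce a diverse result set is to interleave results across an attribute such as brand. The sketch below assumes the input is already ordered by relevancy. The `brand` key and the sample products are hypothetical, and round-robin interleaving is only one of several possible diversification strategies.

```python
from itertools import zip_longest

def diversify(ranked_products, key="brand"):
    # Group products by the chosen attribute, keeping relevancy order per group.
    groups = {}
    for product in ranked_products:
        groups.setdefault(product[key], []).append(product)
    # Take one product from each group in turn until all groups are exhausted.
    diversified = []
    for round_of_products in zip_longest(*groups.values()):
        diversified.extend(p for p in round_of_products if p is not None)
    return diversified

ranked = [
    {"name": "Phone A1", "brand": "A"},
    {"name": "Phone A2", "brand": "A"},
    {"name": "Phone A3", "brand": "A"},
    {"name": "Phone B1", "brand": "B"},
    {"name": "Phone C1", "brand": "C"},
]

print([p["name"] for p in diversify(ranked)])
```

Without this step, the first three results for “phone” would all come from brand A. With it, the first three results cover three brands.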
Handling Low Intent or Long Tail Queries
- Maps long tail queries to the relevant popular queries to ensure better recall.
- Example: “little black cocktail dress” query can be mapped to “little black dress” or “cocktail dress.” These more popular queries will ensure search results, instead of showing a zero results page.
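The mapping in the example above can be approximated by looking for popular queries whose terms all appear in the long-tail query. The popular queries and their search counts below are made up for illustration, and term-subset matching is an assumption rather than a description of any specific product.

```python
# Hypothetical popular queries with their search counts.
POPULAR_QUERIES = {
    "little black dress": 950,
    "cocktail dress": 720,
    "black boots": 400,
}

def fallback_queries(long_tail_query, popular=POPULAR_QUERIES):
    # Keep popular queries whose terms are all contained in the long-tail query.
    terms = set(long_tail_query.lower().split())
    candidates = [q for q in popular if set(q.split()) <= terms]
    # Most frequently searched queries first.
    return sorted(candidates, key=lambda q: -popular[q])

print(fallback_queries("little black cocktail dress"))
```

If the long-tail query itself returns nothing, the engine can run these fallback queries in order instead of showing a zero results page.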
As you can see, there is a lot of logic to consider when designing site search.
For example, when a bag of products is returned (this is referred to as “recall”) via a search, some logic needs to look at the results and grade each product as to its query relevancy, i.e., a relevancy score.
Products below a certain score should be rejected or filtered out of the result set. The same score also determines the ranking of products in the results: the most relevant products at the top, the least relevant at the bottom.
What if all the products returned for a query are equally relevant? This is where performance-based ranking algorithms come into play: they rank the products based on how each product performs, either at the site level or at the query level.
Performance metrics can include Click-Through Rate (CTR), Add-to-Cart (ATC), checkout, Revenue Per Visit (RPV), Average Order Value (AOV), and many others. Using these signals, a modern product discovery solution can rank best-performing products not only on the search results page but also on your category pages.
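Filtering by relevancy score and then breaking ties with a performance metric can be sketched in a few lines. The threshold value, the `relevancy` and `ctr` fields, and the sample recall set are hypothetical. CTR stands in here for whichever performance signal a retailer chooses.

```python
MIN_RELEVANCY = 0.5  # Hypothetical cut-off; real systems tune this value.

def rank_results(recall_set, min_relevancy=MIN_RELEVANCY):
    # Drop products that score below the relevancy threshold.
    kept = [p for p in recall_set if p["relevancy"] >= min_relevancy]
    # Sort by relevancy first, then by click-through rate to break ties.
    return sorted(kept, key=lambda p: (-p["relevancy"], -p["ctr"]))

recall_set = [
    {"id": "a", "relevancy": 0.9, "ctr": 0.02},
    {"id": "b", "relevancy": 0.9, "ctr": 0.08},
    {"id": "c", "relevancy": 0.7, "ctr": 0.20},
    {"id": "d", "relevancy": 0.3, "ctr": 0.50},
]

print([p["id"] for p in rank_results(recall_set)])
```

Product “d” is filtered out despite its high CTR because it is not relevant enough to the query, while “b” outranks the equally relevant “a” on performance.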
Modern Product Discovery: Personalizing Based on Shopper Intent
Personalizing the experience is also an important part of any modern product discovery solution.
Understanding shoppers’ affinities and bubbling up products to the top of listing pages, such as search results and category pages, has an impact on shopper engagement. With modern machine learning models and AI, a shopper’s past and in-session affinities can be arranged to show their preference. If I have a brand preference or I’m price-sensitive, AI can recognize this and rank products that match those affinities to the top of these pages.
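As a rough sketch of affinity-based ranking, the function below adds a small boost to products that match a shopper’s brand preference or price sensitivity. The affinity structure, the boost value, and the sample products are assumptions. A production system would learn these signals with machine learning models instead of using fixed rules.

```python
def personalize(ranked_products, affinities, boost=0.2):
    # Re-rank products by relevancy plus a bonus for each matched affinity.
    def score(product):
        bonus = 0.0
        if product["brand"] in affinities.get("brands", set()):
            bonus += boost
        max_price = affinities.get("max_price")
        if max_price is not None and product["price"] <= max_price:
            bonus += boost
        return product["relevancy"] + bonus
    return sorted(ranked_products, key=score, reverse=True)

products = [
    {"name": "X", "brand": "Acme", "price": 120, "relevancy": 0.80},
    {"name": "Y", "brand": "Zenith", "price": 60, "relevancy": 0.75},
]

# A shopper with no known affinities sees the default relevancy order.
print([p["name"] for p in personalize(products, {})])
# A shopper who prefers Zenith sees that brand bubble up to the top.
print([p["name"] for p in personalize(products, {"brands": {"Zenith"}})])
```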
Refined, easy-to-use merchandising controls are also critical for merchandisers. AI is great, but what if a low-margin product keeps ranking at the top of search results? Manual overrides, or merchandising controls, give the merchandiser the ability to override the AI defaults. This is sometimes referred to as human-assisted AI.
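A manual override can be pictured as a thin layer applied after the AI ranking. In this minimal sketch, “pinned” and “buried” product lists are hypothetical merchandiser rules, and the product IDs are invented for the example.

```python
def apply_merchandising_rules(ranked_ids, pinned=(), buried=()):
    # Pinned products go first, in the order the merchandiser listed them.
    pinned_first = [pid for pid in pinned if pid in ranked_ids]
    # Everything else keeps the order the ranking algorithm chose.
    middle = [pid for pid in ranked_ids if pid not in pinned and pid not in buried]
    # Buried products are pushed to the end of the results.
    buried_last = [pid for pid in ranked_ids if pid in buried and pid not in pinned]
    return pinned_first + middle + buried_last

ai_ranking = ["low-margin-tee", "premium-tee", "basic-tee"]
print(apply_merchandising_rules(
    ai_ranking, pinned=["premium-tee"], buried=["low-margin-tee"]
))
```

The AI ranking stays intact for every product the merchandiser has not touched, so the override changes only what it is meant to change.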
Modern product discovery solutions have bridged the gap between traditional document search and complex shopper intent with many different layers of architecture and functionality.
The variety of ways shoppers express their intent, the breadth of products, as well as changing social trends all require that retailers rely on a modern product discovery solution.
To continue meeting shoppers’ high expectations, these solutions must provide good visual search, relevant search results, performance-based ranking of results, personalization, and refined merchandising controls.