awesome-search

Search resource hub

A curated collection of resources and information on search algorithms and techniques for building effective search applications

Awesome Search - this is all about the (e-commerce, but not only) search and its awesomeness

GitHub

1k stars
64 watching
117 forks
Language: HTML
last commit: about 1 month ago
Linked from 2 awesome lists

autocomplete-suggestions, ecommerce-search, evaluating-search, knowledge-graph, learning-to-rank, natural-language-processing, query-understanding, ranking, relevance-algorithms, relevant-search, search, search-engine, search-engines, search-intents, search-ui, search-ux, semantic-search, spelling-correction, suggestions, synonyms

Awesome Search / Topics / Types of search / Vectors/Semantic search / Embeddings / Encoder models

Query/Document tokens interaction

Awesome Search / Topics / Areas of application

Geo-Spatial Search
Medical and Healthcare Search
Social Media and User-Generated Content Search
Question Answering Systems
Personal Information Management

Awesome Search / Unsorted

sandbox Jun 2021 1,389 about 1 month ago
sandbox May 2021 1,389 about 1 month ago
sandbox April 2021 1,389 about 1 month ago
sandbox Dec 2020 1,389 about 1 month ago
sandbox Jan 2020 1,389 about 1 month ago

Awesome Search / General, fun, philosophy

Falsehoods Programmers Believe About Search
Ethical Search: Designing an irresistible journey with a positive impact
On Semantic Search
Feedback debt: what the segway teaches search teams
Supporting the Searcher’s Journey: When and How
Shopping is Hard, Let’s go Searching!
An Introduction to Search Quality
On-Site Search Design Patterns for E-Commerce: Schema Structure, Data Driven Ranking & More
In Search of Recall
Balance Your Search Budget!
Evolution of Search Technology: A Look Ahead
Etsy. Targeting Broad Queries in Search
How Etsy Uses Thermodynamics to Help You Search for “Geeky”
Broad and Ambiguous Search Queries
Deconstructing E-Commerce Search: The 12 Query Types
Nearest Neighbor Indexes for Similarity Search
The Missing WHERE Clause in Vector Search
Symmetric vs. Asymmetric Semantic Search
Bi-encoder vs Cross encoder? When to use which one?
What is ColBERT and Late Interaction and Why They Matter in Search?
Choosing the best model for semantic search
Announcing the Vespa ColBERT embedder
Matryoshka embeddings: faster OpenAI vector search using Adaptive Retrieval
Introduction to Matryoshka Embedding Models
Matryoshka representations. A guide to faster semantic search
Hybrid Search: SPLADE (Sparse Encoder)
SPLADE for Sparse Vector Search Explained
Hybrid search > sum of its parts?
On Hybrid Search
Hybrid search with Re-ranking
Reciprocal rank fusion
Muves: Multimodal & multilingual vector search w/ Hardware Acceleration
Model Selection for Multimodal Search
GenAI Can Improve Enterprise Search, But Remains a Work In Progress
The influence of TF-IDF algorithms in eCommerce search
Search as a Conversation
Affordances for Conversational Search
Query Understanding and Chatbots

Awesome Search / Search Results / Retrieval

Humans Search for Things not for Strings
What is a ‘Relevant’ Search Result?
How to Achieve Ecommerce Search Relevance
Setting up a relevance evaluation program
Understanding the BM25 full text search algorithm
Practical BM25: How Shards Affect Relevance Scoring in Elasticsearch
The influence of TF-IDF algorithms in eCommerce search
BM25 The Next Generation of Lucene Relevance
Lucene Similarities (BM25, DFR, DFI, IB, LM) Explained

Awesome Search / Search Results / Ranking

Multi stage ranking
How is search different than other machine learning problems?
Reinforcement learning assisted search ranking
E-commerce Search Re-Ranking as a Reinforcement Learning Problem
When to use a machine learned vs. score-based search ranker
What is Learning To Rank?
Using AI and Machine Learning to Overcome Position Bias within Adobe Stock Search
Train and Test Sets Split for Evaluating Learning To Rank Models
How LambdaMART works - optimizing product ranking goals
Click models 16 almost 4 years ago
Click Modeling for eCommerce
Using Behavioral Data to Improve Search

Awesome Search / Search Results / Bias

What is Presentation Bias in search?
Dealing with Position Bias in Recommendations and Search

Awesome Search / Search Results / Diversification

Search Result Diversification using Causal Language Models
Learning to Diversify for E-commerce Search with Multi-Armed Bandit
Search Quality for Discovery & Inspiration
How to measure Diversity of Search Results
Searching for Goldilocks
Broad and Ambiguous Search Queries - Recognizing When Search Results Need Diversification
Thoughts on Search Result Diversity
How to Calculate MMR?
Maximal Marginal Relevance to Re-rank results in Unsupervised KeyPhrase Extraction

Awesome Search / Search Results / Personalisation

Patterns for Personalization in Recommendations and Search
Daniel Tunkelang. Personalization
Airbnb - Real-time personalization in search
98 personal data points that facebook uses to target ads to you
Architecture of real world recommendation systems
Feature engineering for personalized search

Awesome Search / Search Results / Zero search results

Strategies for using alternative queries to mitigate zero results and their application to online marketplaces
Semantic Equivalence of e-Commerce Queries

Awesome Search / Search UX / Baymard Institute

Deconstructing E-Commerce Search: The 12 Query Types
Autodirect or Guide Users to Matching Category
13 Design Patterns for Autocomplete Suggestions (27% Get it Wrong)
E-Commerce Search Needs to Support Users’ Non-Product Search Queries (15% Don’t)
Search UX: 6 Essential Elements for ‘No Results’ Pages
Product Thumbnails Should Dynamically Update to Match the Variation Searched For (54% Don’t)
Faceted Sorting - A New Method for Sorting Search Results
The Current State of E-Commerce Search
E-Commerce Sites Need Multiple of These 5 ‘Search Scope’ Features
E-Commerce Search Field Design and Its Implications
E-Commerce Sites Should Include Contextual Search Snippets (96% Get it Wrong)
E-Commerce Search Usability: Report & Benchmark
Six ‘COVID-19’ Related E-Commerce UX Improvements to Make

Awesome Search / Search UX / Nielsen Norman Group

The Love-at-First-Sight Gaze Pattern on Search-Results Pages
Good Abandonment on Search Results Pages
Complex Search-Results Pages Change Search Behavior: The Pinball Pattern
Site Search Suggestions
Search-Log Analysis: The Most Overlooked Opportunity in Web UX Research
Scoped Search: Dangerous, but Sometimes Useful
3 Guidelines for Search Engine "No Results" Pages

Awesome Search / Search UX / Enterprise Knowledge LLC

Optimizing Your Search Experience: A Human-Centered Approach to Search Design

Awesome Search / Search UX / Facets

Facets of Faceted Search
Coffee, Coffee, Coffee!
Faceted Search (start here!)
How to implement faceted search the right way
Metadata and Faceted Search
Metacrap: Putting the torch to seven straw-men of the meta-utopia
7 Filtering Implementations That Make Macy’s Best-in-Class
Facet Search: The Most Comprehensive Guide. Best Practices, Design Patterns, Hidden Caveats, And Workarounds
Facets: Constraints or Preferences?
Facets, But Which Ones?
How Many Facets Should a Taxonomy Have
When a Taxonomy Should not be Hierarchical
Customizing Taxonomy Facets

Awesome Search / Search UX / Other

Learning from Friction to Improve the Search Experience
Why is it so hard to sort by price?
Faceted Sorting
Google kills Instant Search

Awesome Search / Spelling correction

Peter Norvig. "How to Write a Spelling Corrector". Classic publication
Daniel Tunkelang. "Spelling Correction"
A simple spell checker built from word vectors
A closer look into the spell correction problem: Part 1
Deep Spelling
Modeling Spelling Correction for Search at Etsy
SymSpell 3,169 3 months ago By Wolf Garbe
Chars2vec: character-based language model for handling real world texts with spelling errors and
JamSpell 618 5 months ago Spelling correction library that takes surrounding context into account
Embedding for spelling correction
What are some algorithms of spelling correction that are used by search engines?
Moman 28 almost 5 years ago Lucene/Solr/Elasticsearch spell correction/autocorrect is (was?) actually powered by this library
Query Segmentation and Spelling Correction
Applying Context Aware Spell Checking in Spark NLP
Autocorrect in Google, Amazon and Pinterest and how to write your own one

Awesome Search / Synonyms

Boosting the power of Elasticsearch with synonyms
Real Talk About Synonyms and Search
Synonyms in Solr I — The good, the bad and the ugly
Synonyms and Antonyms from WordNet
Synonyms and Antonyms in Python
Dive into WordNet with NLTK
Creating Better Searches Through Automatic Synonym Detection
Multiword synonyms in search using Querqy
How to Build a Smart Synonyms Model
The importance of Synonyms in eCommerce Search

Awesome Search / Stopwords

Do all-stopword queries matter?

Awesome Search / Suggestions

Giovanni Fernandez-Kincade. Bootstrapping Autosuggest
On two types of suggestions
Improving Search Suggestions for eCommerce
Autocomplete Search Best Practices to Increase Conversions
Why we’ve developed the searchhub smartSuggest module and why it might matter to you
Nielsen Norman Group: Site Search Suggestions
13 Design Patterns for Autocomplete Suggestions
Autocomplete
Autocomplete and User Experience
Implementing a LinkedIn-like Search as You Type with Elasticsearch
Smart autocomplete best practices: improve search relevance and sales
OLX: Building Corpus for AutoSuggest (Part 1)
Autocomplete, Live Search Suggestions, and Autocorrection: Best Practice Design Patterns
Mirror, Mirror, What Am I Typing Next? All About Search Suggestions
How we built the lightning fast autosuggest for otto.de

Awesome Search / Graphs/Taxonomies/Knowledge Graph / Integrating Search and Knowledge Graphs (by Enterprise Knowledge)

Part 1: Displaying Relationships
Search query expansion with query embeddings

Awesome Search / Query expansion

Fundamentals of query rewriting (part 1): introduction to query expansion

Awesome Search / Query understanding

Daniel Tunkelang. Query Understanding
Query Understanding, Divided into Three Parts
Search for Things not for Strings
Understanding the Search Query: Part 1
Food Discovery with Uber Eats: Building a Query Understanding Engine
AI for Query Understanding

Awesome Search / Query understanding / Search Intent

Mapping Search Queries To Search Intents
Search: Intent, Not Inventory

Awesome Search / Query understanding / Query segmentation

Paper: Unsupervised Query Segmentation Using only Query Logs
Paper: Towards Semantic Query Segmentation

Awesome Search / Algorithms / BERT

Understanding BERT and Search Relevance
Google is improving web search with BERT – can we use it for enterprise search too?

Awesome Search / Algorithms / ColBERT

Pretrained Transformer Language Models for Search - part 3

Awesome Search / Algorithms / Collocations, common phrases

Automatically detect common phrases – multi-word expressions / word n-grams – from a stream of sentences.
The Unreasonable Effectiveness of Collocations

Awesome Search / Algorithms / Other Algorithms

One hot encoding
Writing a full-text search engine using Bloom filters
Locality Sensitive Hashing
Locality Sensitive Hashing (LSH): The Practical and Illustrated Guide
Minhash
Better than Average: Sort by Best Rating
How Not To Sort By Average Rating
Keyword Extraction using RAKE
Yet Another Keyword Extractor (Yake) 1,658 about 1 year ago
Keyword Extraction with BERT

Awesome Search / Tracking, profiling, GDPR, Analysis / Tools, platforms, helpers for search tracking

OpenSearch User Behavior Insights 22 about 1 month ago
Site Search tracking with Google Analytics 4
Snowplow
search-collector 41 5 months ago
OpenTelemetry with search additions
Pulse Query Analytics
Tracking who's hot and who's not presents an algorithmic challenge

Awesome Search / Tracking, profiling, GDPR, Analysis / Resources

Anonymisation: managing data protection risk (code of practice)
The Anonymisation Decision-Making Framework
98 personal data points that facebook uses to target ads to you
Opportunity Analysis for Search
A Face Is Exposed for AOL Searcher No. 4417749
AOL search data leak
Personal data

Awesome Search / Experiments

Common Pitfalls of Search Experimentation
Improving Search @scale with efficient query experimentation

Awesome Search / Experiments / A/B testing, MABs

A/B Testing for Search is Different
A/B Testing Search: thinking like a scientist

Awesome Search / Testing, metrics, KPIs / Metrics

Discounted cumulative gain
Flavors of NDCG - normalized to what!?
Mean reciprocal rank
P@k
Demystifying nDCG and ERR
Choosing your search relevance evaluation metric
How to Implement a Normalized Discounted Cumulative Gain (NDCG) Ranking Quality Scorer in Quepid
https://en.wikipedia.org/wiki/Precision_and_recall
https://en.wikipedia.org/wiki/F1_score
Visualizing search metrics
Compute Mean Reciprocal Rank (MRR) using Pandas
Recommender Systems: Machine Learning Metrics and Business Metrics

Awesome Search / Testing, metrics, KPIs / KPIs

5 Right Ways to Measure How Search Is Performing
E-commerce Site-Search KPIs. Part 1 – Customers
Learning from Friction to Improve the Search Experience
Behind the Wizardry of a Seamless Search Experience
Analyzing online search relevance metrics with the Elastic Stack
How to Gain Insight From Search Analytics

Awesome Search / Testing, metrics, KPIs / Evaluating Search (by Daniel Tunkelang)

Measure It
Measuring Searcher Behavior
Using Human Judgement
When There’s No Conversion Rate

Awesome Search / Testing, metrics, KPIs / Measuring Search (by James Rubinstein)

Statistical and human-centered approaches to search engine improvement
A Human Approach
Setting up a relevance evaluation program
Metrics Matter
A/B Testing Search: thinking like a scientist
Query Triage: The Secret Weapon for Search Relevance
The Launch Review: bringing it all together…

Awesome Search / Testing, metrics, KPIs / Three Pillars of Search Relevancy (by Andreas Wagner)

Part 1: Findability
Part 2: Search Quality For Discovery & Inspiration

Awesome Search / Architecture

The Art Of Abstraction – Revisiting Webshop Architecture

Awesome Search / Architecture / Canva

Part One: outline of the challenges faced
Part Two: new search architecture

Awesome Search / Architecture

Event-Driven Architecture for Efficient Search Indexing

Awesome Search / Education and networking / Conferences

Activate
Berlin Buzzwords
Haystack
Elastic{ON}
MIX-CAMP E-COMMERCE SEARCH
SIGIR eCommerce

Awesome Search / Education and networking / Conferences / SIGIR eCommerce

2019
2018
2017

Awesome Search / Education and networking / Trainings and courses

Elasticsearch "Think Like a Relevance Engineer"
Solr "Think Like a Relevance Engineer"
Beyond Search Relevance: Understanding and Measuring Search Result Quality
Hello LTR

Awesome Search / Education and networking / Books

AI-Powered Search
Relevant Search
Deep Learning for Search
Interactions with Search Systems
Embeddings in Natural Language Processing. Theory and Advances in Vector Representation of Meaning
Search User Interfaces
Search Patterns
Search Analytics for Your Site: Conversations with Your Customers
Click Models for Web Search
Optimization Algorithms

Awesome Search / Education and networking / Blogs and Portals

Searchnews

Awesome Search / Education and networking / Papers

List of papers

Awesome Search / Management, Search Team

Search is a Team Sport
Thoughts about Managing Search Teams
On Search Leadership
Building an Effective Search Team: the key to great search & relevancy
Query Triage: The Secret Weapon for Search Relevance
The Launch Review: bringing it all together
The Role of Search Product Owners
Search Product Management: The Most Misunderstood Role in Search?
Search relevance for understaffed teams

Awesome Search / Management, Search Team / Job Interviews

Interview Questions for Search Relevance Engineers, Data Scientists, and Product Managers
Data Science Interviews: Ranking and search 9,002 4 months ago

Awesome Search / Management, Search Team / Engineering

Technical debt in search

Awesome Search / Blogposts series / Search Optimization 101 (by Charlie Hull)

How do I know that my search is broken?
What does it mean if my search is ‘broken’?
How do you fix a broken search?
Reducing business risk by optimizing search

Awesome Search / Blogposts series / Query Understanding (by Daniel Tunkelang)

An Introduction
Language Identification
Character Filtering
Tokenization
Spelling Correction
Stemming and Lemmatization
Query Rewriting: An Overview
Query Expansion
Query Relaxation
Query Segmentation
Query Scoping
Entity Recognition
Taxonomies and Ontologies
Autocomplete
Autocomplete and User Experience
Contextual Query Understanding: An Overview
Session Context
Location as Context
Seasonality
Personalization
Search as a Conversation
Clarification Dialogues
Relevance Feedback
Faceted Search
Search Results Presentation
Search Result Snippets
Search Results Clustering
Question Answering
Query Understanding and Voice Interfaces
Query Understanding and Chatbots

Awesome Search / Blogposts series / Grid Dynamics

Not your father’s search engine: a brief history of retail search
Semantic vector search: the new frontier in product discovery
Boosting product discovery with semantic search
Semantic query parsing blueprint

Awesome Search / Blogposts series / Considering Search: Search Topics (by Derek Sisson)

Intro
Assumptions About Search
Assumptions About User Search Behavior
Types of Information Collections
A Structural Look at Search
Users and the Task of Information Retrieval
Testing Search
Useful Search Links and References

Awesome Search / Industry players / Personalities and influencers

Daniel Tunkelang (he is God of Search)
Max Irwin
Doug Turnbull
Baymard Institute

Awesome Search / Industry players / Products and services

Algolia
Elasticsearch 71,007 about 1 month ago Distributed search & analytics engine
Solr Solr is the blazing-fast, open source, multi-modal search platform built on the full-text, vector, and geospatial search capabilities of Apache Lucene
Fess Enterprise Search Server 1,006 about 2 months ago
Typesense 21,516 about 1 month ago an open source alternative to Algolia
SearchHub.io
Datafari an open source enterprise search solution
Qdrant an open source vector database
Awakari Real-time search from unlimited sources like RSS, Fediverse, Telegram. Text keyword matching conditions, numeric conditions, condition groups. Based on a reverse search index
Meilisearch Open source search API that supports full-text, vector, geospatial & faceted search

Awesome Search / Industry players / Consulting companies

BigData Boutique
OpenSource Connections
https://sease.io/
Sematext

Awesome Search / Case studies

Airbnb - Machine Learning-Powered Search Ranking of Airbnb Experiences
Airbnb - Listing Embeddings in Search Ranking
Algolia - The Architecture Of Algolia’s Distributed Search Network
Meituan - BERT在美团搜索核心排序的探索和实践: Exploration and practice of BERT in the core ranking of Meituan search (🇨🇳)
Netflix - How Netflix Content Engineering makes a federated graph searchable (Part 1)
Netflix - Elasticsearch Indexing Strategy in Asset Management Platform (AMP)
Skyscanner - Learning to Rank for Flight Itinerary Search
Slack - Search at Slack
Twitter - Stability and scalability for search
Amazon SEO Explained: How to Rank Your Products #1 in Amazon Search Results in 2020
Building a Better Search Engine for Semantic Scholar
How Bing Ranks Search Results: Core Algorithm & Blue Links
How Google Search Ranking Works – Darwinism in Search

Awesome Search / Case studies / E-commerce

Searchandising

Awesome Search / Case studies / Multisided markets

Discover How Cassini (The eBay Search Engine) Works and Rank

Awesome Search / Videos / Channels

Lucid Thoughts
Lucidworks
MIX-CAMP E-commerce Search
OpenSource Connections
SIGIR eCom
Relevant Facets

Awesome Search / Datasets

Shopping Queries Dataset: A Large-Scale ESCI Benchmark for Improving Product Search 256 3 months ago
ESCI-S: extended metadata for Amazon ESCI dataset 38 about 2 years ago
Home Depot Product Search Relevance
WANDS - Wayfair ANnotation Dataset 66 11 months ago

Awesome Search / Tools / Word2Vec

Word2Vec For Phrases — Learning Embeddings For More Than One Word
Gensim Word2Vec Tutorial
How to incorporate phrases into Word2Vec – a text mining approach
Word2Vec — a baby step in Deep Learning but a giant leap towards Natural Language Processing
How to Develop Word Embeddings in Python with Gensim

Awesome Search / Tools / Libs

Query Segmenter 1 over 4 years ago
https://github.com/zentity-io/zentity 158 6 months ago
https://github.com/mammothb/symspellpy 809 about 1 month ago
https://github.com/searchhub/search-collector 41 5 months ago
Kiri 243 over 3 years ago State-of-the-art semantic search made easy
Haystack 18,094 about 1 month ago End-to-end Python framework for building natural language search interfaces to data
https://github.com/castorini/docTTTTTquery 359 almost 2 years ago

Awesome Search / Tools / Other

Chorus 144 about 1 month ago
Quepid 285 about 1 month ago
Rated Ranking Evaluator 180 10 months ago
Jina AI 21,180 2 months ago A neural search framework

Awesome Search / Other awesome stuff

Awesome Knowledge Graphs 75 over 3 years ago
Awesome time series 13 over 4 years ago
Awesome Spacy 15 about 5 years ago
Query-Understanding 56 almost 7 years ago
Click models 16 almost 4 years ago
