Fine-Tuning Text Embeddings For Domain-Specific Search

An overview with Python code

Embedding models represent text as semantically meaningful vectors. Although they can be applied out of the box to countless tasks (e.g. retrieval, classification), general-purpose embedding models may perform poorly on domain-specific tasks. One way to overcome this limitation is fine-tuning. In this article, I will discuss the key ideas behind this technique and share a concrete example of fine-tuning embeddings for matching queries to AI job postings.


A popular use of text embedding models is in retrieval augmented generation (RAG). This is where, given an input to an LLM-based system (e.g. a customer question), the relevant context (e.g. FAQs) is automatically retrieved from a knowledge base and passed to an LLM.

Embeddings enable retrieval through three steps (sketched in code after the figure below).

  1. Vector representations (i.e. embeddings) are computed for all items in a knowledge base
  2. Input text is translated into a vector representation (using the same embedding model as in Step 1)
  3. The similarity between the input text vector and each item in the knowledge base is computed, and the most similar items are returned
3-step search process with embeddings. Image by author.
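To make these three steps concrete, here is a minimal sketch using the sentence-transformers library. The model name and the toy knowledge base are illustrative assumptions, not the data used later in this article.

```python
from sentence_transformers import SentenceTransformer, util

# Load a general-purpose embedding model (example choice)
model = SentenceTransformer("all-MiniLM-L6-v2")

# Step 1: compute embeddings for every item in a (toy) knowledge base
knowledge_base = [
    "How to reset your password",
    "Updating your billing details",
    "Contacting customer support",
]
kb_embeddings = model.encode(knowledge_base)

# Step 2: embed the input text with the same model
query = "How do I update my payment method?"
query_embedding = model.encode(query)

# Step 3: score each item by cosine similarity and return the best matches
scores = util.cos_sim(query_embedding, kb_embeddings)[0]
for idx in scores.argsort(descending=True).tolist():
    print(f"{scores[idx].item():.3f}  {knowledge_base[idx]}")
```

In practice, the knowledge base embeddings are computed once and stored (e.g. in a vector database), so only the query needs to be embedded at search time.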

This process (i.e., semantic search) provides a simple and flexible way to search over arbitrary text items. However, there’s an issue.

Problem with Similarity Search

Despite the popularity of semantic search, it has a central problem. Namely, just because a query and knowledge base item are similar (i.e. the angle between their associated embedding vectors is small), that doesn’t necessarily mean the item helps answer the query.
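Here, "similar" means a high cosine similarity, i.e. a small angle between the two embedding vectors. A minimal sketch of that score (the vectors themselves would come from whichever embedding model is used):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Note that this score only measures how close two texts are in embedding space; it says nothing about whether one actually answers the other.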

For example, consider the question, “How do I update my payment method?”
