
How to Store and Query Embeddings Using Vector Databases

Introduction

As AI-powered applications like chatbots, recommendation systems, and semantic search engines become more common, one concept is becoming increasingly important: embeddings.

Embeddings are numerical representations of text, images, or other data that help machines understand meaning instead of just keywords.

But generating embeddings is only half the story.

To make them useful, you need a way to store and search them efficiently. This is where vector databases come into play.

In this article, we will explore, step by step, how to store and query embeddings using vector databases, with practical examples along the way.

What are Embeddings?

Embeddings are vectors (arrays of numbers) that represent the meaning of data.

For example:

  • "apple" (fruit) and "banana" will have similar embeddings

  • "car" and "engine" will be closer to each other than "car" and "banana"

This allows machines to understand semantic similarity instead of just exact matches.

Example (Conceptual)

Text:
"Machine learning is powerful"

Embedding:
[0.12, -0.45, 0.67, ...]

Each number captures part of the meaning.
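
Semantic similarity between embeddings is commonly measured with cosine similarity. Here is a minimal sketch using toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and the values below are made up purely for illustration):

```python
import numpy as np

def cosine_similarity(a, b):
    """1.0 means same direction (similar meaning), near 0 means unrelated."""
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings: fruits point in a similar direction, "car" does not
apple  = [0.9, 0.1, 0.0]
banana = [0.8, 0.2, 0.1]
car    = [0.0, 0.9, 0.4]

print(cosine_similarity(apple, banana))  # high: related meanings
print(cosine_similarity(apple, car))     # low: unrelated meanings
```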

What is a Vector Database?

A vector database is a specialized database designed to store and search embeddings efficiently.

Unlike traditional databases that use exact matching, vector databases use similarity search.

This means:

  • You search by meaning

  • Not by exact keywords

Common vector databases include:

  • Pinecone

  • FAISS

  • Weaviate

  • Milvus

Why Use Vector Databases?

Traditional databases are not optimized for high-dimensional vector search.

Vector databases provide:

  • Fast similarity search

  • Scalable storage

  • Efficient indexing (like HNSW, IVF)

These features are essential for AI applications like RAG systems and semantic search.

Step 1: Generate Embeddings

Before storing anything, you need embeddings.

Example using Python

from openai import OpenAI
client = OpenAI()

# Convert text into a vector using OpenAI's embeddings API
response = client.embeddings.create(
    input="Artificial Intelligence is transforming industries",
    model="text-embedding-3-small"
)

# A list of floats (1536 dimensions for text-embedding-3-small)
embedding = response.data[0].embedding

Explanation

  • The model converts text into a vector

  • The output is a list of floating-point numbers

  • This vector represents semantic meaning

Step 2: Choose a Vector Database

Select a database based on your needs.

Options

  • FAISS → Local, fast, good for small projects

  • Pinecone → Managed, scalable

  • Weaviate → Open-source + cloud

For beginners, FAISS is often the easiest to start with.

Step 3: Store Embeddings in Vector Database

Example using FAISS

import faiss
import numpy as np

# Sample embeddings
embeddings = np.array([
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6]
]).astype('float32')

# Create index
index = faiss.IndexFlatL2(3)

# Add vectors
index.add(embeddings)

Explanation

  • IndexFlatL2 uses Euclidean distance for similarity

  • 3 is the vector dimension

  • index.add() stores embeddings

Now your data is ready for searching.

Step 4: Attach Metadata (Important)

Embeddings alone are not enough.

You also need metadata like:

  • Original text

  • Document ID

  • Source

Example

documents = [
    "AI is powerful",
    "Cloud computing is scalable"
]

Store metadata separately or alongside embeddings depending on the database.
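
Since FAISS stores only vectors, one simple approach for small projects is a parallel Python structure keyed by insertion position. The field names below are hypothetical, chosen just for illustration:

```python
# Position i in the FAISS index corresponds to documents[i],
# so metadata can be kept in an ordinary list alongside the index
documents = [
    {"id": "doc-1", "text": "AI is powerful", "source": "blog"},
    {"id": "doc-2", "text": "Cloud computing is scalable", "source": "docs"},
]

def lookup(position):
    """Map a FAISS result position back to its metadata."""
    return documents[position]

print(lookup(1)["text"])  # Cloud computing is scalable
```

Managed databases like Pinecone and Weaviate can instead store this metadata alongside each vector.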

Step 5: Query the Vector Database

To search, convert the query into an embedding.

Example

query_embedding = np.array([[0.1, 0.2, 0.25]]).astype('float32')

distances, indices = index.search(query_embedding, k=1)

Explanation

  • query_embedding: Vector of search query

  • k=1: Number of results

  • indices: Matching vector positions

Step 6: Retrieve Original Data

Use the index to fetch original content.

Example

result = documents[indices[0][0]]
print(result)

Explanation

  • The index maps back to stored documents

  • This gives meaningful results to users

Step 7: Real-World Use Case (Semantic Search)

Imagine a knowledge base system:

  • User searches: "How to secure APIs"

  • System converts query to embedding

  • Vector DB finds similar documents

  • Relevant articles are returned

This works even if exact keywords do not match.

Step 8: Use in RAG Applications

In Retrieval-Augmented Generation:

  • Query → embedding

  • Retrieve documents from vector DB

  • Pass context to LLM

  • Generate answer

This improves accuracy and reduces hallucination.
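
The flow above can be sketched as a single function. Here embed, vector_search, and llm_answer are hypothetical placeholders for your embedding model, vector database query, and LLM call:

```python
# Minimal sketch of a RAG pipeline, assuming the three callables are
# provided by your embedding model, vector DB client, and LLM client
def answer_question(question, embed, vector_search, llm_answer, k=3):
    query_vec = embed(question)                   # Query -> embedding
    context_docs = vector_search(query_vec, k=k)  # Retrieve documents
    context = "\n".join(context_docs)             # Assemble context
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return llm_answer(prompt)                     # Generate answer
```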

Common Mistakes to Avoid

  • Not normalizing embeddings (cosine similarity assumes unit-length vectors)

  • Ignoring metadata storage

  • Using the wrong distance metric for your embedding model

  • Storing vectors of inconsistent dimensions

These issues can reduce search quality.

Advantages of Vector Databases

  • Fast similarity search

  • Handles high-dimensional data

  • Ideal for AI applications

Limitations to Consider

  • Requires understanding of embeddings

  • Setup can be complex for beginners

When Should You Use Vector Databases?

Use them when:

  • Building semantic search systems

  • Implementing RAG pipelines

  • Working with AI-driven recommendations

Summary

Storing and querying embeddings using vector databases is a key technique in modern AI applications. By converting data into embeddings, storing them efficiently, and performing similarity-based searches, you can build intelligent systems that understand meaning rather than just keywords. Whether you are building a chatbot, search engine, or recommendation system, mastering vector databases will significantly enhance your application's capabilities.