
Vector Databases vs Relational Databases: Understanding, Implementation, and Use Cases

What is a Relational Database (RDBMS)?

A relational database stores data in tables made of rows and columns. Each row represents a record, and each column represents an attribute of that record. An RDBMS is queried with SQL (Structured Query Language) and is well suited to structured, transactional data.

Key Features of RDBMS

  • Schema-based: strict table definitions

  • ACID compliance (Atomicity, Consistency, Isolation, Durability)

  • Supports joins, transactions, and indexing

  • Best for structured data like users, orders, products, and financial records
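The features above can be seen together in a short sketch. It uses Python's built-in sqlite3 module with an in-memory database; the accounts and transfers tables, the owner names, and the transfer() helper are hypothetical and exist only for illustration. The CHECK constraint makes an overdraft fail, and the transaction rolls back so that no partial update survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    id INTEGER PRIMARY KEY,
    owner TEXT NOT NULL,
    balance INTEGER NOT NULL CHECK (balance >= 0)
);
CREATE TABLE transfers (
    id INTEGER PRIMARY KEY,
    from_id INTEGER NOT NULL REFERENCES accounts(id),
    to_id INTEGER NOT NULL REFERENCES accounts(id),
    amount INTEGER NOT NULL
);
CREATE INDEX idx_transfers_from ON transfers(from_id);
INSERT INTO accounts (id, owner, balance) VALUES (1, 'Asha', 100), (2, 'Ben', 50);
""")

def transfer(conn, from_id, to_id, amount):
    """Move money atomically. Returns True on success, False if rolled back."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, to_id))
            # This update violates the CHECK constraint when funds are insufficient.
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                         (amount, from_id))
            conn.execute("INSERT INTO transfers (from_id, to_id, amount) VALUES (?, ?, ?)",
                         (from_id, to_id, amount))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 30)        # succeeds
rejected = transfer(conn, 1, 2, 500)  # fails, and the credit to account 2 is undone

balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()

# Join transfers back to accounts to get readable names
history = conn.execute("""
SELECT a.owner, b.owner, t.amount
FROM transfers t
JOIN accounts a ON a.id = t.from_id
JOIN accounts b ON b.id = t.to_id
""").fetchall()
```

The failed transfer leaves both balances exactly as they were after the first one, which is the atomicity guarantee in practice.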

Use Cases

  • Banking systems (transactions, accounts)

  • Inventory management

  • Employee and HR records

What is a Vector Database (Vector DB)?

A vector database stores high-dimensional numeric vectors, typically embeddings produced by an AI model. Each vector captures the semantic meaning of an object such as a piece of text, an image, or an audio clip.

Important Points

  • Similarity search runs on numeric vectors, not on raw text. Some vector databases (Chroma, for example) can embed documents for you, but the comparison itself is always made between vectors.

  • To store text (or any other unstructured data), you must first convert it into numbers with an embedding model or a similar mechanism.

  • Vector DBs excel at similarity search using metrics such as cosine similarity or Euclidean distance.
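To make the two metrics concrete, here is a minimal sketch in plain Python. The three vectors are hand-written toy values, not real embeddings (which usually have hundreds of dimensions), and the function names are chosen for this example only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between a and b: 0.0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy 3-dimensional vectors (hypothetical values for illustration)
v1 = [1.0, 0.0, 0.0]
v2 = [2.0, 0.0, 0.0]  # same direction as v1, larger magnitude
v3 = [0.0, 1.0, 0.0]  # orthogonal to v1

print(cosine_similarity(v1, v2), euclidean_distance(v1, v2))
print(cosine_similarity(v1, v3), euclidean_distance(v1, v3))
```

Note that v1 and v2 are a perfect match by cosine similarity but still one unit apart by Euclidean distance. Cosine ignores vector length, which is one reason it is a common choice for text embeddings.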

Key Features of Vector DB

  • Stores embeddings as numeric vectors

  • Optimized for Approximate Nearest Neighbor (ANN) search

  • Can store metadata along with vectors

  • Scales horizontally for large AI workloads

  • Ideal for semantic search, recommendations, and AI-powered applications
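ANN indexes trade a little accuracy for speed. The exact search they approximate is a linear scan over every stored vector, which can be sketched as below. This is not how any particular vector database is implemented; the record layout, the search() function, and its where filter are hypothetical and only show how vectors and metadata work together.

```python
import heapq
import math

def search(records, query, k=2, where=None):
    """Exact top-k by Euclidean distance, with an optional metadata filter."""
    candidates = [
        r for r in records
        if where is None
        or all(r["metadata"].get(key) == value for key, value in where.items())
    ]
    # nsmallest returns the k closest records, nearest first
    return heapq.nsmallest(k, candidates, key=lambda r: math.dist(r["vector"], query))

# Toy 2-dimensional vectors with metadata (hypothetical values)
records = [
    {"id": "doc1", "vector": [0.9, 0.1], "metadata": {"lang": "en"}},
    {"id": "doc2", "vector": [0.1, 0.9], "metadata": {"lang": "en"}},
    {"id": "doc3", "vector": [0.8, 0.2], "metadata": {"lang": "de"}},
]
query = [1.0, 0.0]

top = [r["id"] for r in search(records, query, k=2)]
english_only = [r["id"] for r in search(records, query, k=2, where={"lang": "en"})]
print(top, english_only)
```

A scan like this costs time proportional to the number of stored vectors, which is why production systems build an ANN index instead once collections grow large.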

Use Cases

  • Semantic document search (RAG workflows)

  • Chatbots and question-answering systems

  • Image and video retrieval

  • Recommendation engines

RDBMS vs Vector DB

| Feature         | RDBMS                                 | Vector DB                                           |
|-----------------|---------------------------------------|-----------------------------------------------------|
| Data Type       | Structured tables                     | High-dimensional numeric vectors                    |
| Query           | SQL, exact match                      | Similarity search (distance-based)                  |
| Use Case        | Transactions, reporting               | AI/ML, semantic search                              |
| Scalability     | Vertical/horizontal                   | Optimized for horizontal scaling                    |
| Example Systems | MySQL, PostgreSQL, Oracle, SQL Server | Chroma, Pinecone, FAISS (a library), Milvus, Weaviate |
| Strength        | Reliable, mature, ACID-compliant      | Handles unstructured data efficiently               |

Implementation Example: RDBMS

A simple Python example using SQLite, a lightweight RDBMS:

import sqlite3

# Connect to database (or create it)
conn = sqlite3.connect("enterprise.db")
cursor = conn.cursor()

# Create a table
cursor.execute("""
CREATE TABLE IF NOT EXISTS employees (
    id INTEGER PRIMARY KEY,
    name TEXT,
    department TEXT
)
""")

# Insert records
cursor.execute("INSERT INTO employees (name, department) VALUES (?, ?)", ("Jayant", "IT"))
cursor.execute("INSERT INTO employees (name, department) VALUES (?, ?)", ("Riya", "HR"))

conn.commit()

# Query records
cursor.execute("SELECT * FROM employees WHERE department=?", ("IT",))
rows = cursor.fetchall()
for row in rows:
    print(row)

# Close the connection
conn.close()

Implementation Example: Vector DB Using AI Embeddings

A simple Python script that uses Ollama, Gemma, and ChromaDB (a vector DB) to store and retrieve information:

  • Ollama runs Gemma locally and provides embeddings via its REST API.

  • The embeddings are stored in ChromaDB.

  • A query is converted into an embedding and matched against stored vectors.


import requests
import chromadb

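# 1. Function to get embeddings from Ollama (Gemma).
#    Assumes Ollama is running locally on its default port
#    and the model has already been pulled.
def get_embedding(text, model="gemma"):
    url = "http://localhost:11434/api/embeddings"
    payload = {"model": model, "prompt": text}
    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()  # fail loudly on HTTP errors instead of a KeyError later
    return response.json()["embedding"]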

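# 2. Initialize ChromaDB (in-memory client; data is lost when the script exits)
client = chromadb.Client()
# get_or_create_collection avoids an error if the collection already exists
collection = client.get_or_create_collection(name="ollama_gemma_embeddings")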

# 3. Store some texts
texts = ["My name is Jayant", "I love Python programming", "ChromaDB with Ollama and Gemma"]

for i, text in enumerate(texts):
    embedding = get_embedding(text)
    collection.add(
        ids=[f"text_{i+1}"],
        documents=[text],
        embeddings=[embedding]
    )

# 4. Query example
query = "Who is Jayant?"
query_embedding = get_embedding(query)

results = collection.query(query_embeddings=[query_embedding], n_results=2)

print("Search Results:")
for i, doc in enumerate(results["documents"][0]):
    print(f"{i+1}. {doc}")

Practical Insights

  • RDBMS: Essential for core enterprise applications; handles structured, transactional, and relational data.

  • Vector DB: Gaining importance for AI-driven applications, especially where semantic similarity matters.

  • Vector Requirement: Anything you want to search by similarity must first be converted into numeric vectors, either hand-built feature vectors or AI embeddings.

  • Hybrid Approach: Many enterprises use both: RDBMS for transactions, Vector DB for AI-powered search and recommendations.
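A minimal sketch of the hybrid pattern, under stated assumptions: SQLite stands in for the transactional store, a plain dictionary stands in for the vector DB, and the products table, the 2-dimensional vectors, and the similar_products() function are all hypothetical. The vector side finds the nearest ids, and the relational side supplies the authoritative records.

```python
import math
import sqlite3

# Relational side: authoritative product records
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
    [(1, "Trail running shoes", 89.0),
     (2, "Road bike helmet", 59.0),
     (3, "Hiking boots", 120.0)],
)
conn.commit()

# Vector side: toy embeddings keyed by product id (hypothetical values)
vectors = {1: [0.9, 0.1], 2: [0.1, 0.9], 3: [0.8, 0.3]}

def similar_products(query_vector, k=2):
    """Rank ids by vector distance, then fetch the full rows with SQL."""
    ranked = sorted(vectors, key=lambda pid: math.dist(vectors[pid], query_vector))[:k]
    placeholders = ",".join("?" for _ in ranked)
    rows = conn.execute(
        f"SELECT id, name, price FROM products WHERE id IN ({placeholders})",
        ranked,
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    return [by_id[pid] for pid in ranked]  # keep similarity order

results = similar_products([1.0, 0.0])
for row in results:
    print(row)
```

Sharing a primary key between the two stores keeps the vector index small and leaves prices and other transactional fields in one place.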

Conclusion

Vector databases are not a replacement for RDBMS but a complementary technology. Enterprises typically rely on RDBMS for reliability and structure, while leveraging Vector DB for AI-enhanced features. Understanding the numeric vector requirement and embedding mechanisms is critical when implementing vector databases in real-world applications.