ASP.NET Core  

Integrating Vector Databases (like Pinecone) in ASP.NET Core Search

Modern search needs go beyond keyword matching — users expect semantic search, fuzzy relevance, and contextual retrieval. Vector databases (Pinecone, Milvus, Weaviate, etc.) store vector embeddings and provide efficient similarity search. Combined with OpenAI (or other embedding models), they let you add semantic search to ASP.NET Core applications with minimal latency and high relevance.

This article explains how to design, implement, and operate a vector-search workflow using ASP.NET Core, OpenAI embeddings, and Pinecone, with practical code examples, architecture guidance and diagrams. The language is kept simple, and the content suits beginners through experts.

Overview

High-level steps

  1. Generate embeddings for documents/items using an embedding model (OpenAI or local model).

  2. Index embeddings into Pinecone with an ID and metadata.

  3. Store canonical data and metadata in SQL Server (for full retrieval and joins).

  4. For a user query, create an embedding and query Pinecone for nearest neighbors.

  5. Fetch full records from SQL Server using returned IDs and display results in Angular.

Use cases

  • Product search (e-commerce)

  • Document / knowledge base search

  • FAQ and support tickets

  • Code search and snippets

  • Semantic filtering and recommendations

High-Level Workflow

User Query (Angular) 
       ↓
ASP.NET Core API → Create embedding (OpenAI) 
       ↓
Query Pinecone (vector DB) → Pinecone returns nearest vector IDs + scores
       ↓
Fetch full metadata from SQL Server using IDs
       ↓
Aggregate, rank and return results to Angular
       ↓
User sees semantic search results

Flowchart

+--------------------+
|   User submits     |
|   search query     |
+---------+----------+
          |
          v
+---------+----------+
| API receives query |
+---------+----------+
          |
          v
+---------+----------+
| Create embedding   |
| (OpenAI/Model)     |
+---------+----------+
          |
          v
+---------+----------+
| Query Pinecone     |
+---------+----------+
          |
          v
+---------+----------+
| Pinecone returns   |
| top N vector IDs   |
+---------+----------+
          |
          v
+---------+----------+
| Fetch rows from DB |
+---------+----------+
          |
          v
+---------+----------+
| Return results to  |
| frontend (Angular) |
+--------------------+

Architecture Diagram

+----------------------+       +---------------------+       +-------------------+
|   Angular Frontend   | <---> |  ASP.NET Core API   | <---> | OpenAI Embedding  |
| (Search UI, Filters) |       | (Search Controller) |       |  Service / Model  |
+----------------------+       +---------------------+       +-------------------+
                                         |
                                         |
                                         v
                                +------------------+
                                |   Pinecone       |
                                |   Vector DB      |
                                +------------------+
                                         |
                                         v
                                +------------------+
                                |   SQL Server     |
                                | (metadata, source|
                                |  documents)      |
                                +------------------+

ER Diagram (Minimal mapping)

+----------------------+         +----------------------+
|    Documents         | 1 --- * |  VectorIndex         |
+----------------------+         +----------------------+
| DocumentId (PK)      |         | VectorId (PK)        |
| Title                |         | DocumentId (FK)      |
| Content              |         | Namespace            |
| Language             |         | EmbeddingDimension   |
| CreatedOn            |         | PineconeId           |
+----------------------+         | Metadata (JSON)      |
                                 | InsertedOn           |
                                 +----------------------+
  • Documents stores canonical text, titles, URLs, etc.

  • VectorIndex tracks which vector belongs to which document and stores Pinecone ID / namespace / metadata for joins.

Sequence Diagram

User -> Angular: enters query
Angular -> API: /api/search?q=...
API -> OpenAI: POST embeddings (query)
OpenAI -> API: embedding vector
API -> Pinecone: query top K vectors
Pinecone -> API: vector IDs + scores
API -> SQL Server: SELECT * FROM Documents WHERE DocumentId IN (...)
SQL Server -> API: document rows
API -> Angular: aggregated search results
Angular -> User: display results

Implementation — Practical Guide

Prerequisites

  • ASP.NET Core 7/8 project

  • OpenAI API key (or other embedding service)

  • Pinecone account + API key + environment (or any vector DB)

  • SQL Server for metadata

  • Angular front-end to call API

Note: Use secure secrets management (Azure Key Vault, GitHub Secrets) — never hardcode keys.
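
For local development, the .NET user-secrets tool keeps these keys out of source control. A sketch (the key names match the configuration["..."] lookups used in the code below; the values are placeholders):

```shell
# Enable user-secrets for the current project (values are stored outside the repo)
dotnet user-secrets init

# Store the keys that configuration["..."] reads in the service registrations below
dotnet user-secrets set "OPENAI_API_KEY" "sk-..."
dotnet user-secrets set "PINECONE_API_KEY" "your-pinecone-key"
dotnet user-secrets set "PINECONE_BASE_URL" "https://your-project.svc.pinecone.io"
dotnet user-secrets set "PINECONE_INDEX" "documents"
```

In production, swap user-secrets for Azure Key Vault or an equivalent secret manager.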

Data Model (SQL Server)

Example DDL

CREATE TABLE Documents (
  DocumentId UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWID(),
  Title NVARCHAR(500),
  Content NVARCHAR(MAX),
  Url NVARCHAR(1000),
  Language NVARCHAR(20),
  CreatedOn DATETIME2 DEFAULT SYSUTCDATETIME()
);

CREATE TABLE VectorIndex (
  VectorId UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWID(),
  DocumentId UNIQUEIDENTIFIER NOT NULL,
  PineconeId NVARCHAR(200) NOT NULL,
  Namespace NVARCHAR(200) DEFAULT 'default',
  EmbeddingDim INT,
  Metadata NVARCHAR(MAX),
  InsertedOn DATETIME2 DEFAULT SYSUTCDATETIME(),
  FOREIGN KEY (DocumentId) REFERENCES Documents(DocumentId)
);
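
The controller later in this article uses Document, VectorIndex, and an ApplicationDbContext without showing them. A minimal EF Core mapping for the tables above might look like this (the property shapes are assumptions derived from the DDL, not a fixed contract):

```csharp
using System;
using Microsoft.EntityFrameworkCore;

// Entity for the Documents table (canonical source of truth).
public class Document
{
    public Guid DocumentId { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public string Url { get; set; }
    public string Language { get; set; }
    public DateTime CreatedOn { get; set; }
}

// Entity for the VectorIndex table (maps documents to Pinecone vectors).
public class VectorIndex
{
    public Guid VectorId { get; set; }
    public Guid DocumentId { get; set; }
    public string PineconeId { get; set; }
    public string Namespace { get; set; }
    public int? EmbeddingDim { get; set; }
    public string Metadata { get; set; }
    public DateTime InsertedOn { get; set; }
}

public class ApplicationDbContext : DbContext
{
    public ApplicationDbContext(DbContextOptions<ApplicationDbContext> options)
        : base(options) { }

    public DbSet<Document> Documents { get; set; }
    public DbSet<VectorIndex> VectorIndex { get; set; }
}
```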

Embedding & Pinecone Integration (ASP.NET Core)

Below is a simple implementation using HttpClient. In production you may want to use an official SDK if available.

1. Configure services (Startup / Program.cs)

builder.Services.AddHttpClient("OpenAI", c =>
{
    c.BaseAddress = new Uri("https://api.openai.com/");
    c.DefaultRequestHeaders.Authorization = 
        new AuthenticationHeaderValue("Bearer", configuration["OPENAI_API_KEY"]);
});

builder.Services.AddHttpClient("Pinecone", c =>
{
    c.BaseAddress = new Uri(configuration["PINECONE_BASE_URL"]); // e.g. https://your-project.svc.pinecone.io
    c.DefaultRequestHeaders.Add("Api-Key", configuration["PINECONE_API_KEY"]);
});

builder.Services.AddScoped<IEmbeddingService, OpenAiEmbeddingService>();
builder.Services.AddScoped<IVectorService, PineconeVectorService>();

2. Embedding service (OpenAI)

public class OpenAiEmbeddingService : IEmbeddingService
{
    private readonly HttpClient _client;
    private readonly string _model = "text-embedding-3-small"; // choose model

    public OpenAiEmbeddingService(IHttpClientFactory http)
    {
        _client = http.CreateClient("OpenAI");
    }

    public async Task<float[]> GetEmbeddingAsync(string text)
    {
        var payload = new
        {
            model = _model,
            input = text
        };

        var resp = await _client.PostAsJsonAsync("v1/embeddings", payload);
        resp.EnsureSuccessStatusCode();

        var body = await resp.Content.ReadFromJsonAsync<JsonDocument>();
        var arr = body.RootElement.GetProperty("data")[0].GetProperty("embedding").EnumerateArray()
                  .Select(e => e.GetSingle()).ToArray();
        return arr;
    }
}
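
For bulk indexing, the same endpoint accepts an array of inputs, so many texts can be embedded in one call. A hedged batch variant to add to OpenAiEmbeddingService (GetEmbeddingsAsync is this article's naming, not part of any SDK):

```csharp
// Batch variant: embed many texts in a single HTTP call.
// The "data" array in the response preserves input order.
public async Task<List<float[]>> GetEmbeddingsAsync(IReadOnlyList<string> texts)
{
    var payload = new { model = _model, input = texts };

    var resp = await _client.PostAsJsonAsync("v1/embeddings", payload);
    resp.EnsureSuccessStatusCode();

    var body = await resp.Content.ReadFromJsonAsync<JsonDocument>();
    return body.RootElement.GetProperty("data").EnumerateArray()
        .Select(d => d.GetProperty("embedding").EnumerateArray()
                      .Select(e => e.GetSingle()).ToArray())
        .ToList();
}
```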

3. Pinecone vector service (indexing & query)

public class PineconeVectorService : IVectorService
{
    private readonly HttpClient _client;
    private readonly string _indexName;

    public PineconeVectorService(IHttpClientFactory http, IConfiguration config)
    {
        _client = http.CreateClient("Pinecone");
        _indexName = config["PINECONE_INDEX"];
    }

    public async Task UpsertAsync(string id, float[] vector, object metadata)
    {
        var body = new
        {
            vectors = new[] {
                new {
                    id = id,
                    values = vector,
                    metadata = metadata
                }
            }
        };

        var resp = await _client.PostAsJsonAsync("/vectors/upsert", body);
        resp.EnsureSuccessStatusCode();
    }

    public async Task<IEnumerable<(string id, float score)>> QueryAsync(float[] vector, int topK = 5)
    {
        var body = new {
            vector = vector,
            topK = topK,
            includeValues = false,
            includeMetadata = false
        };

        var resp = await _client.PostAsJsonAsync("/query", body);
        resp.EnsureSuccessStatusCode();

        var result = await resp.Content.ReadFromJsonAsync<JsonDocument>();
        var matches = result.RootElement.GetProperty("matches").EnumerateArray();

        return matches.Select(m => (
            id: m.GetProperty("id").GetString(),
            score: m.GetProperty("score").GetSingle()
        )).ToList();
    }
}

Pinecone REST endpoints depend on your Pinecone project URL and API version. Some deployments require /indexes/{indexName}/query and /indexes/{indexName}/vectors/upsert. Adjust paths accordingly.
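
Pinecone computes similarity scores server-side, but it helps to know what a score means. With the cosine metric, the score is the cosine of the angle between the query embedding and a stored embedding; a self-contained reference implementation:

```csharp
using System;

public static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|). Range is [-1, 1];
    // 1 means the vectors point the same way (semantically similar text),
    // 0 means they are unrelated (orthogonal).
    public static float CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
    }
}
```

The Pinecone index dimension must match the embedding model; text-embedding-3-small produces 1536-dimension vectors by default.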

Example Controller: Index and Search

[ApiController]
[Route("api/search")]
public class SearchController : ControllerBase
{
    private readonly IEmbeddingService _embed;
    private readonly IVectorService _vectors;
    private readonly ApplicationDbContext _db;

    public SearchController(IEmbeddingService embed, IVectorService vectors, ApplicationDbContext db)
    {
        _embed = embed;
        _vectors = vectors;
        _db = db;
    }

    [HttpPost("index")]
    public async Task<IActionResult> Index([FromBody] DocumentDto dto)
    {
        // Save to DB
        var doc = new Document { Title = dto.Title, Content = dto.Content, Url = dto.Url };
        _db.Documents.Add(doc);
        await _db.SaveChangesAsync();

        // Create embedding
        var vector = await _embed.GetEmbeddingAsync(dto.Content);

        // Upsert to Pinecone with id = doc.DocumentId
        var meta = new { title = dto.Title, url = dto.Url };
        await _vectors.UpsertAsync(doc.DocumentId.ToString(), vector, meta);

        // Save vector index record
        _db.VectorIndex.Add(new VectorIndex {
          DocumentId = doc.DocumentId, PineconeId = doc.DocumentId.ToString(), EmbeddingDim = vector.Length,
          Metadata = JsonSerializer.Serialize(meta)
        });
        await _db.SaveChangesAsync();

        return Ok();
    }

    [HttpGet]
    public async Task<IActionResult> Search([FromQuery] string q, [FromQuery] int k = 5)
    {
        var v = await _embed.GetEmbeddingAsync(q);
        var matches = await _vectors.QueryAsync(v, k);

        var ids = matches.Select(m => Guid.Parse(m.id)).ToArray();
        var docs = await _db.Documents.Where(d => ids.Contains(d.DocumentId)).ToListAsync();

        // Preserve Pinecone's score order; skip any IDs missing from the DB
        var ordered = ids
            .Select(id => docs.FirstOrDefault(d => d.DocumentId == id))
            .Where(d => d != null);
        return Ok(ordered);
    }
}

Angular Frontend (simple)

Service

search(q: string) {
  return this.http.get<any[]>(`/api/search?q=${encodeURIComponent(q)}`);
}

Component

onSearch(text: string) {
  this.searchService.search(text).subscribe(res => {
    this.results = res;
  });
}

Operational Considerations & Best Practices

  1. Dimension and model choice

    • Choose an embedding model that balances cost, latency, and quality. Common OpenAI embedding models: text-embedding-3-small / text-embedding-3-large.

    • Confirm Pinecone index dimension equals embedding size.

  2. Batching

    • For large indexing jobs, batch embeddings and upserts to reduce API calls and improve throughput.

  3. Namespace & Metadata

    • Use Pinecone namespaces to separate environments (dev/prod) or tenants.

    • Store searchable metadata (title, url, type) for filtering and quick display.

  4. Filtering

    • Use metadata filters in Pinecone queries to restrict search to a subset (category, language, tenant).

  5. Consistency

    • Keep SQL Server as source of truth. Rebuild or reconcile vectors periodically (scheduled job) in case of drift.

  6. Cost & Rate Limits

    • Monitor usage of embedding API and Pinecone. Cache embeddings for repeated queries or reuse document embeddings.

  7. Security

    • Secure API keys using Key Vault / secret manager.

    • Never expose Pinecone or OpenAI keys to frontend.

  8. Latency

    • Embed creation adds latency. For interactive search, consider caching popular query embeddings or precomputing suggestions.

  9. Relevance Tuning

    • Adjust topK, distance metric (cosine vs dot product) and post-filter re-ranking (BM25 hybrid rerank using keywords) for best UX.

  10. Logging & Monitoring

    • Log query latency, failure rates, top-K clickthrough for continuous improvement.
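
Several of the points above (namespaces, metadata filters) show up as extra fields on the Pinecone query body. A sketch of a filtered query payload; the category and language field names are illustrative, and the filter syntax follows Pinecone's MongoDB-style operators ($eq, $in, ...):

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Query body with a metadata filter: restrict nearest-neighbour search
// to documents whose metadata matches the given category and language.
var queryVector = new float[] { 0.1f, 0.2f, 0.3f }; // placeholder embedding
var body = new
{
    vector = queryVector,
    topK = 5,
    includeMetadata = true,
    filter = new Dictionary<string, object>
    {
        ["category"] = new Dictionary<string, object> { ["$eq"] = "faq" },
        ["language"] = new Dictionary<string, object> { ["$in"] = new[] { "en", "hi" } }
    }
};

Console.WriteLine(JsonSerializer.Serialize(body));
```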

Advanced Patterns

  • Hybrid search (Vector + BM25): run a keyword search in SQL/Elasticsearch to narrow candidates, then re-rank by vector similarity. Good for precision and filter controls.

  • Re-ranking with a cross-encoder: for top-N candidates, use a heavier model to compute final relevance.

  • Semantic chunking: split long documents into overlapping chunks and store chunk-level vectors for fine-grained retrieval. Keep mapping to parent document.

  • Pinecone + embeddings cache: use Redis for caching top queries or precomputed embeddings.

  • Multi-lingual: detect language and use appropriate embeddings or translate before embedding.
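
The semantic-chunking pattern above can be sketched as a small helper that splits text into overlapping chunks (the 1000/200 character defaults are illustrative, not recommendations from any library):

```csharp
using System;
using System.Collections.Generic;

public static class Chunker
{
    // Split text into fixed-size chunks with overlap, so a sentence cut at a
    // boundary still appears whole in the neighbouring chunk. Each chunk gets
    // its own embedding; keep a mapping back to the parent document.
    public static List<string> Chunk(string text, int chunkSize = 1000, int overlap = 200)
    {
        if (overlap >= chunkSize)
            throw new ArgumentException("Overlap must be smaller than chunk size.");

        var chunks = new List<string>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.Length; start += step)
        {
            int length = Math.Min(chunkSize, text.Length - start);
            chunks.Add(text.Substring(start, length));
            if (start + length >= text.Length) break;
        }
        return chunks;
    }
}
```

In practice, splitting on sentence or paragraph boundaries near the chunk size gives better retrieval than hard character cuts.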

Sample Reconciliation Job (pseudo)

  • Run nightly job

    • Query DB for new/updated documents since last run

    • Create embeddings in batches

    • Upsert to Pinecone

    • Update VectorIndex table with timestamp

This keeps the vector index consistent with the source data.
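
The reconciliation decision itself is a timestamp comparison. A self-contained sketch of the "new/updated since last run" check; the tuple and dictionary shapes stand in for rows from the Documents and VectorIndex tables:

```csharp
using System;
using System.Collections.Generic;

public static class ReconcileHelper
{
    // Decide which documents the nightly job must re-embed: anything updated
    // after its vector was last inserted, or never vectorised at all.
    public static List<Guid> NeedsReindex(
        IEnumerable<(Guid DocId, DateTime UpdatedOn)> documents,
        IReadOnlyDictionary<Guid, DateTime> vectorInsertedOn)
    {
        var stale = new List<Guid>();
        foreach (var (docId, updatedOn) in documents)
        {
            if (!vectorInsertedOn.TryGetValue(docId, out var insertedOn)
                || updatedOn > insertedOn)
            {
                stale.Add(docId);
            }
        }
        return stale;
    }
}
```

The job would then embed the stale documents in batches, upsert to Pinecone, and stamp VectorIndex.InsertedOn.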

Conclusion

Integrating Pinecone (or any vector DB) with ASP.NET Core and OpenAI embeddings offers powerful semantic search capabilities. Keep SQL Server as canonical storage for documents and metadata, while the vector DB handles fast nearest-neighbour search.

Key takeaways

  • Use embeddings to convert text into vectors.

  • Store vectors in Pinecone and metadata in SQL Server.

  • Query flow: embed query → Pinecone query → fetch metadata from SQL → present to user.

  • Apply batching, namespaces, caching, and monitoring to make it production-ready.