Modern search needs go beyond keyword matching — users expect semantic search, fuzzy relevance, and contextual retrieval. Vector databases (Pinecone, Milvus, Weaviate, etc.) store vector embeddings and provide efficient similarity search. Combined with OpenAI (or other embedding models), they let you add semantic search to ASP.NET Core applications with minimal latency and high relevance.
This article explains how to design, implement, and operate a vector-search workflow using ASP.NET Core, OpenAI embeddings, and Pinecone, with practical code examples, architecture guidance, and diagrams. The language is kept simple so the content suits beginners through experts.
Overview
High-level steps
Generate embeddings for documents/items using an embedding model (OpenAI or local model).
Index embeddings into Pinecone with an ID and metadata.
Store canonical data and metadata in SQL Server (for full retrieval and joins).
For a user query, create an embedding and query Pinecone for nearest neighbors.
Fetch full records from SQL Server using returned IDs and display results in Angular.
Use cases
Product search (e-commerce)
Document / knowledge base search
FAQ and support tickets
Code search and snippets
Semantic filtering and recommendations
High-Level Workflow
User Query (Angular)
↓
ASP.NET Core API → Create embedding (OpenAI)
↓
Query Pinecone (vector DB) —> Pinecone returns nearest vector IDs + scores
↓
Fetch full metadata from SQL Server using IDs
↓
Aggregate, rank and return results to Angular
↓
User sees semantic search results
Flowchart
+--------------------+
| User submits |
| search query |
+---------+----------+
|
v
+---------+----------+
| API receives query |
+---------+----------+
|
v
+---------+----------+
| Create embedding |
| (OpenAI/Model) |
+---------+----------+
|
v
+---------+----------+
| Query Pinecone |
+---------+----------+
|
v
+---------+----------+
| Pinecone returns |
| top N vector IDs |
+---------+----------+
|
v
+---------+----------+
| Fetch rows from DB |
+---------+----------+
|
v
+---------+----------+
| Return results to |
| frontend (Angular) |
+--------------------+
Architecture Diagram (Visio-style)
+----------------------+ +---------------------+ +-------------------+
| Angular Frontend | <---> | ASP.NET Core API | <---> | OpenAI Embedding |
| (Search UI, Filters) | | (Search Controller) | | Service / Model |
+----------------------+ +---------------------+ +-------------------+
|
|
v
+------------------+
| Pinecone |
| Vector DB |
+------------------+
|
v
+------------------+
| SQL Server |
| (metadata, source|
| documents) |
+------------------+
ER Diagram (Minimal mapping)
+----------------------+ +----------------------+
| Documents | 1 --- * | VectorIndex |
+----------------------+ +----------------------+
| DocumentId (PK) | | VectorId (PK) |
| Title | | DocumentId (FK) |
| Content | | Namespace |
| Language | | EmbeddingDimension |
| CreatedOn | | PineconeId |
+----------------------+ | Metadata (JSON) |
| InsertedOn |
+----------------------+
Documents stores canonical text, titles, URLs, etc.
VectorIndex tracks which vector belongs to which document and stores Pinecone ID / namespace / metadata for joins.
Sequence Diagram
User -> Angular: enters query
Angular -> API: /api/search?q=...
API -> OpenAI: POST embeddings (query)
OpenAI -> API: embedding vector
API -> Pinecone: query top K vectors
Pinecone -> API: vector IDs + scores
API -> SQL Server: SELECT * FROM Documents WHERE DocumentId IN (...)
SQL Server -> API: document rows
API -> Angular: aggregated search results
Angular -> User: display results
Implementation — Practical Guide
Prerequisites
ASP.NET Core 7/8 project
OpenAI API key (or other embedding service)
Pinecone account + API key + environment (or any vector DB)
SQL Server for metadata
Angular front-end to call API
Note: Use secure secrets management (Azure Key Vault, GitHub Secrets) — never hardcode keys.
Data Model (SQL Server)
Example DDL
CREATE TABLE Documents (
DocumentId UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWID(),
Title NVARCHAR(500),
Content NVARCHAR(MAX),
Url NVARCHAR(1000),
Language NVARCHAR(20),
CreatedOn DATETIME2 DEFAULT SYSUTCDATETIME()
);
CREATE TABLE VectorIndex (
VectorId UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWID(),
DocumentId UNIQUEIDENTIFIER NOT NULL,
PineconeId NVARCHAR(200) NOT NULL,
Namespace NVARCHAR(200) DEFAULT 'default',
EmbeddingDim INT,
Metadata NVARCHAR(MAX),
InsertedOn DATETIME2 DEFAULT SYSUTCDATETIME(),
FOREIGN KEY (DocumentId) REFERENCES Documents(DocumentId)
);
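The controller code later in this article reads and writes these tables through EF Core. A minimal sketch of the matching entity classes and context (property names follow the DDL above; `ApplicationDbContext` is the name the controller assumes):

```csharp
public class Document
{
    public Guid DocumentId { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public string Url { get; set; }
    public string Language { get; set; }
    public DateTime CreatedOn { get; set; }
}

public class VectorIndex
{
    public Guid VectorId { get; set; }
    public Guid DocumentId { get; set; }
    public string PineconeId { get; set; }
    public string Namespace { get; set; } = "default";
    public int EmbeddingDim { get; set; }
    public string Metadata { get; set; }
    public DateTime InsertedOn { get; set; }
}

public class ApplicationDbContext : DbContext
{
    public ApplicationDbContext(DbContextOptions<ApplicationDbContext> options)
        : base(options) { }

    public DbSet<Document> Documents => Set<Document>();
    public DbSet<VectorIndex> VectorIndex => Set<VectorIndex>();
}
```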
Embedding & Pinecone Integration (ASP.NET Core)
Below is a simple implementation using HttpClient. In production you may want to use an official SDK if available.
1. Configure services (Startup / Program.cs)
builder.Services.AddHttpClient("OpenAI", c =>
{
c.BaseAddress = new Uri("https://api.openai.com/");
c.DefaultRequestHeaders.Authorization =
new AuthenticationHeaderValue("Bearer", configuration["OPENAI_API_KEY"]);
});
builder.Services.AddHttpClient("Pinecone", c =>
{
c.BaseAddress = new Uri(configuration["PINECONE_BASE_URL"]); // e.g. https://your-project.svc.pinecone.io
c.DefaultRequestHeaders.Add("Api-Key", configuration["PINECONE_API_KEY"]);
});
builder.Services.AddScoped<IEmbeddingService, OpenAiEmbeddingService>();
builder.Services.AddScoped<IVectorService, PineconeVectorService>();
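The two interfaces registered above are not defined elsewhere in the article; a minimal definition consistent with the service implementations that follow:

```csharp
public interface IEmbeddingService
{
    // Returns the embedding vector for a single piece of text.
    Task<float[]> GetEmbeddingAsync(string text);
}

public interface IVectorService
{
    // Inserts or updates one vector with its metadata in the vector DB.
    Task UpsertAsync(string id, float[] vector, object metadata);

    // Returns the nearest-neighbour IDs and similarity scores.
    Task<IEnumerable<(string id, float score)>> QueryAsync(float[] vector, int topK = 5);
}
```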
2. Embedding service (OpenAI)
public class OpenAiEmbeddingService : IEmbeddingService
{
private readonly HttpClient _client;
private readonly string _model = "text-embedding-3-small"; // choose model
public OpenAiEmbeddingService(IHttpClientFactory http)
{
_client = http.CreateClient("OpenAI");
}
public async Task<float[]> GetEmbeddingAsync(string text)
{
var payload = new
{
model = _model,
input = text
};
var resp = await _client.PostAsJsonAsync("v1/embeddings", payload);
resp.EnsureSuccessStatusCode();
var body = await resp.Content.ReadFromJsonAsync<JsonDocument>();
var arr = body.RootElement.GetProperty("data")[0].GetProperty("embedding").EnumerateArray()
.Select(e => e.GetSingle()).ToArray();
return arr;
}
}
3. Pinecone vector service (indexing & query)
public class PineconeVectorService : IVectorService
{
private readonly HttpClient _client;
private readonly string _indexName;
public PineconeVectorService(IHttpClientFactory http, IConfiguration config)
{
_client = http.CreateClient("Pinecone");
_indexName = config["PINECONE_INDEX"];
}
public async Task UpsertAsync(string id, float[] vector, object metadata)
{
var body = new
{
vectors = new[] {
new {
id = id,
values = vector,
metadata = metadata
}
}
};
var resp = await _client.PostAsJsonAsync("/vectors/upsert", body);
resp.EnsureSuccessStatusCode();
}
public async Task<IEnumerable<(string id, float score)>> QueryAsync(float[] vector, int topK = 5)
{
var body = new {
vector = vector,
topK = topK,
includeValues = false,
includeMetadata = false
};
var resp = await _client.PostAsJsonAsync("/query", body);
resp.EnsureSuccessStatusCode();
var result = await resp.Content.ReadFromJsonAsync<JsonDocument>();
var matches = result.RootElement.GetProperty("matches").EnumerateArray();
return matches.Select(m => (
id: m.GetProperty("id").GetString(),
score: m.GetProperty("score").GetSingle()
)).ToList();
}
}
Pinecone REST endpoints depend on your Pinecone project URL and API version. Some deployments require /indexes/{indexName}/query and /indexes/{indexName}/vectors/upsert. Adjust paths accordingly.
Example Controller: Index and Search
[ApiController]
[Route("api/search")]
public class SearchController : ControllerBase
{
private readonly IEmbeddingService _embed;
private readonly IVectorService _vectors;
private readonly ApplicationDbContext _db;
public SearchController(IEmbeddingService embed, IVectorService vectors, ApplicationDbContext db)
{
_embed = embed;
_vectors = vectors;
_db = db;
}
[HttpPost("index")]
public async Task<IActionResult> Index([FromBody] DocumentDto dto)
{
// Save to DB
var doc = new Document { Title = dto.Title, Content = dto.Content, Url = dto.Url };
_db.Documents.Add(doc);
await _db.SaveChangesAsync();
// Create embedding
var vector = await _embed.GetEmbeddingAsync(dto.Content);
// Upsert to Pinecone with id = doc.DocumentId
var meta = new { title = dto.Title, url = dto.Url };
await _vectors.UpsertAsync(doc.DocumentId.ToString(), vector, meta);
// Save vector index record
_db.VectorIndex.Add(new VectorIndex {
DocumentId = doc.DocumentId, PineconeId = doc.DocumentId.ToString(), EmbeddingDim = vector.Length,
Metadata = JsonSerializer.Serialize(meta)
});
await _db.SaveChangesAsync();
return Ok();
}
[HttpGet]
public async Task<IActionResult> Search([FromQuery] string q, [FromQuery] int k = 5)
{
var v = await _embed.GetEmbeddingAsync(q);
var matches = await _vectors.QueryAsync(v, k);
var ids = matches.Select(m => Guid.Parse(m.id)).ToArray();
var docs = await _db.Documents.Where(d => ids.Contains(d.DocumentId)).ToListAsync();
// Preserve Pinecone's score order; skip IDs with no matching row
var ordered = ids
.Select(id => docs.FirstOrDefault(d => d.DocumentId == id))
.Where(d => d != null);
return Ok(ordered);
}
}
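The `DocumentDto` accepted by the index endpoint is a plain request model; a minimal version matching the fields the controller reads:

```csharp
public class DocumentDto
{
    public string Title { get; set; }
    public string Content { get; set; }
    public string Url { get; set; }
}
```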
Angular Frontend (simple)
Service
search(q: string) {
return this.http.get<any[]>(`/api/search?q=${encodeURIComponent(q)}`);
}
Component
onSearch(text: string) {
this.searchService.search(text).subscribe(res => {
this.results = res;
});
}
Operational Considerations & Best Practices
Dimension and model choice
Choose an embedding model that balances cost, latency, and quality. Common OpenAI embedding models: text-embedding-3-small / text-embedding-3-large.
Confirm Pinecone index dimension equals embedding size.
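A mismatch between embedding size and index dimension fails only at query time with a confusing error, so it is worth asserting early. A hedged sketch (the `PINECONE_DIMENSION` configuration key is an assumption, not a Pinecone convention):

```csharp
// Guard before upserting: the vector length must match the Pinecone index dimension.
int expectedDim = int.Parse(config["PINECONE_DIMENSION"]); // assumed config key
if (vector.Length != expectedDim)
    throw new InvalidOperationException(
        $"Embedding dimension {vector.Length} does not match index dimension {expectedDim}.");
```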
Batching
Embed and upsert documents in batches (for example, 50-100 at a time) to cut API round-trips and cost.
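The OpenAI embeddings endpoint accepts an array of inputs, so many documents can be embedded in one call. A sketch extending the embedding service above (the `data` array in the response preserves input order):

```csharp
public async Task<List<float[]>> GetEmbeddingsAsync(IEnumerable<string> texts)
{
    var payload = new { model = _model, input = texts.ToArray() };
    var resp = await _client.PostAsJsonAsync("v1/embeddings", payload);
    resp.EnsureSuccessStatusCode();

    var body = await resp.Content.ReadFromJsonAsync<JsonDocument>();
    // One embedding per input, in the same order as the request.
    return body.RootElement.GetProperty("data").EnumerateArray()
        .Select(d => d.GetProperty("embedding").EnumerateArray()
            .Select(e => e.GetSingle()).ToArray())
        .ToList();
}
```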
Namespace & Metadata
Use Pinecone namespaces to separate environments (dev/prod) or tenants.
Store searchable metadata (title, url, type) for filtering and quick display.
Filtering
Store filterable attributes (type, tenant, date) in vector metadata at upsert time and apply Pinecone metadata filters at query time.
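Pinecone accepts a `filter` object in the query body using MongoDB-style operators such as `$eq` and `$in`. A hedged sketch of a filtered query payload (the `type` metadata field is an assumption; it must have been stored at upsert time):

```csharp
// Restrict nearest-neighbour matches to one document type.
var body = new
{
    vector = vector,
    topK = topK,
    includeMetadata = true,
    filter = new Dictionary<string, object>
    {
        ["type"] = new Dictionary<string, object> { ["$eq"] = "faq" } // assumed metadata field
    }
};
var resp = await _client.PostAsJsonAsync("/query", body);
```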
Consistency
Keep the VectorIndex table in sync with Pinecone: re-embed and upsert when a document changes, and delete the vector when the document is removed.
Cost & Rate Limits
Batch embedding calls, cache repeated queries, and retry with exponential backoff to stay within provider rate limits.
Security
Keep API keys in a secrets store, validate user input, and never expose OpenAI or Pinecone keys to the frontend.
Latency
Keep topK small, reuse HttpClient instances via IHttpClientFactory, and cache embeddings for frequent queries.
Relevance Tuning
Experiment with topK, score thresholds, chunk sizes, and hybrid keyword + vector ranking.
Logging & Monitoring
Log query latency, match scores, and error rates; monitor index size and embedding spend.
Advanced Patterns
Hybrid search (Vector + BM25): run a keyword search in SQL/ElasticSearch to narrow candidates, then re-rank by vector similarity. Good for precision and filter controls.
Re-ranking with a cross-encoder: for top-N candidates, use a heavier model to compute final relevance.
Semantic chunking: split long documents into overlapping chunks and store chunk-level vectors for fine-grained retrieval. Keep mapping to parent document.
Pinecone + embeddings cache: use Redis for caching top queries or precomputed embeddings.
Multi-lingual: detect language and use appropriate embeddings or translate before embedding.
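As a sketch of the caching pattern, an in-process `IMemoryCache` decorator over the embedding service avoids re-embedding repeated queries; Redis would follow the same shape for multi-instance deployments:

```csharp
public class CachedEmbeddingService : IEmbeddingService
{
    private readonly IEmbeddingService _inner;
    private readonly IMemoryCache _cache;

    public CachedEmbeddingService(IEmbeddingService inner, IMemoryCache cache)
    {
        _inner = inner;
        _cache = cache;
    }

    public async Task<float[]> GetEmbeddingAsync(string text)
    {
        // Cache by query text; expire entries that go unused for an hour.
        return await _cache.GetOrCreateAsync("emb:" + text, entry =>
        {
            entry.SlidingExpiration = TimeSpan.FromHours(1);
            return _inner.GetEmbeddingAsync(text);
        });
    }
}
```

Register it by wrapping the real service, and the controller code needs no changes.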
Sample Reconciliation Job (pseudo)
Run nightly job
Query DB for new/updated documents since last run
Create embeddings in batches
Upsert to Pinecone
Update VectorIndex table with timestamp
This keeps the Pinecone index consistent with the database.
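The nightly job above can be hosted as an ASP.NET Core `BackgroundService`. A minimal sketch (selecting by `CreatedOn` is a simplification; a real job would track a last-synced timestamp, which this schema does not include):

```csharp
public class ReindexWorker : BackgroundService
{
    private readonly IServiceScopeFactory _scopes;

    public ReindexWorker(IServiceScopeFactory scopes) => _scopes = scopes;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            using (var scope = _scopes.CreateScope())
            {
                var db = scope.ServiceProvider.GetRequiredService<ApplicationDbContext>();
                var embed = scope.ServiceProvider.GetRequiredService<IEmbeddingService>();
                var vectors = scope.ServiceProvider.GetRequiredService<IVectorService>();

                // Documents added since the previous run (simplified change detection).
                var since = DateTime.UtcNow.AddDays(-1);
                var changed = await db.Documents
                    .Where(d => d.CreatedOn > since)
                    .ToListAsync(stoppingToken);

                foreach (var doc in changed)
                {
                    var v = await embed.GetEmbeddingAsync(doc.Content);
                    await vectors.UpsertAsync(doc.DocumentId.ToString(),
                        v, new { title = doc.Title, url = doc.Url });
                }
            }
            await Task.Delay(TimeSpan.FromHours(24), stoppingToken);
        }
    }
}
```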
Conclusion
Integrating Pinecone (or any vector DB) with ASP.NET Core and OpenAI embeddings offers powerful semantic search capabilities. Keep SQL Server as canonical storage for documents and metadata, while the vector DB handles fast nearest-neighbour search.
Key takeaways
Use embeddings to convert text into vectors.
Store vectors in Pinecone and metadata in SQL Server.
Query flow: embed query → Pinecone query → fetch metadata from SQL → present to user.
Apply batching, namespaces, caching, and monitoring to make it production-ready.