SharpCoreDB.VectorSearch 1.3.0

There is a newer version of this package available.
See the version list below for details.

🔍 SharpCoreDB.VectorSearch

High-performance vector search extension for SharpCoreDB: SIMD-accelerated similarity search with HNSW indexing, quantization, and encrypted storage.



🚀 Overview

SharpCoreDB.VectorSearch enables semantic search, similarity matching, and AI/RAG applications by storing and querying high-dimensional embeddings directly within your SharpCoreDB database. It's built for production workloads with:

  • ✅ Pure managed C# 14: zero native dependencies
  • ✅ SIMD-accelerated: AVX-512, AVX2, and ARM NEON support
  • ✅ HNSW indexing: logarithmic-time approximate nearest neighbor search
  • ✅ Quantization: scalar and binary quantization for memory efficiency
  • ✅ Encrypted storage: AES-256-GCM for sensitive embeddings
  • ✅ NativeAOT compatible: deploy as trimmed, self-contained executables
  • ✅ SQL integration: native VECTOR(N) type and vec_*() functions
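
To make the "SIMD-accelerated" bullet concrete, here is a minimal sketch of a vectorized dot product built on the standard `System.Numerics.Vector<float>` type. It is not taken from the package; `DotSimd` is a hypothetical helper, and the library's own kernels may be structured quite differently.

```csharp
using System;
using System.Numerics;

// Hypothetical helper (not part of SharpCoreDB): dot product using Vector<float>,
// which the JIT maps to hardware SIMD registers where they are available.
static float DotSimd(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    int width = Vector<float>.Count;   // number of float lanes per SIMD register
    var acc = Vector<float>.Zero;
    int i = 0;

    // Multiply-accumulate one full SIMD register at a time.
    for (; i <= a.Length - width; i += width)
        acc += new Vector<float>(a.Slice(i, width)) * new Vector<float>(b.Slice(i, width));

    float sum = Vector.Sum(acc);       // horizontal add across all lanes

    // Scalar loop for the elements that did not fill a whole register.
    for (; i < a.Length; i++)
        sum += a[i] * b[i];

    return sum;
}

float[] x = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
float[] y = { 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 };

Console.WriteLine($"Lanes: {Vector<float>.Count}, accelerated: {Vector.IsHardwareAccelerated}");
Console.WriteLine(DotSimd(x, y)); // 220
```

The lane count printed on the first line depends on the CPU and runtime configuration, so the same code processes 4, 8, or more floats per iteration without changes.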

Performance Highlights

| Operation                | Typical Latency      | Notes                                      |
|------------------------- |--------------------- |------------------------------------------- |
| Vector Search (k=10)     | 0.5-2 ms             | 1M vectors, HNSW index, cosine similarity  |
| Index Build (1M vectors) | 2-5 seconds          | M=16, efConstruction=200                   |
| Memory Overhead          | 200-400 bytes/vector | HNSW graph structure (M=16)                |
| Throughput               | 500-2000 queries/sec | Single-threaded on modern CPU              |

Benchmarks run on AMD Ryzen 9 5950X with 1536-dim vectors. See tests/SharpCoreDB.Benchmarks/VectorSearchPerformanceBenchmark.cs for reproducible results.


📦 Installation

# Install SharpCoreDB core (if not already installed)
dotnet add package SharpCoreDB --version 1.3.0

# Install vector search extension
dotnet add package SharpCoreDB.VectorSearch --version 1.3.0

Requirements:

  • .NET 10.0 or later
  • SharpCoreDB 1.3.0+
  • 64-bit runtime (x64, ARM64)

🎯 Quick Start

1. Register Vector Support

using Microsoft.Extensions.DependencyInjection;
using SharpCoreDB;
using SharpCoreDB.VectorSearch;

var services = new ServiceCollection();
services.AddSharpCoreDB()
    .AddVectorSupport(options =>
    {
        options.EnableQueryOptimization = true;  // Auto-select indexes
        options.DefaultIndexType = VectorIndexType.Hnsw;
        options.MaxCacheSize = 1_000_000;       // Cache 1M vectors
    });

var provider = services.BuildServiceProvider();
var factory = provider.GetRequiredService<DatabaseFactory>();

using var db = factory.Create("./vector_db", "StrongPassword!");

2. Create Vector Schema

// Create table with VECTOR column
await db.ExecuteSQLAsync(@"
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        title TEXT,
        content TEXT,
        embedding VECTOR(1536)  -- OpenAI text-embedding-3-small dimensions
    )
");

// Build HNSW index for fast similarity search
await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_doc_embedding ON documents(embedding)
    WITH (index_type='hnsw', m=16, ef_construction=200)
");

3. Insert Vectors

// Insert embeddings (e.g., from OpenAI API)
var embedding = new float[1536]; // Your embedding vector
// ... populate embedding from your ML model ...

await db.ExecuteSQLAsync(@"
    INSERT INTO documents (id, title, content, embedding)
    VALUES (?, ?, ?, ?)
", [1, "AI Overview", "Artificial Intelligence is...", embedding]);

4. Search for Similar Vectors

// Search for similar documents
var queryEmbedding = new float[1536]; // Query embedding
var k = 10;  // Top-10 results

// vec_distance_cosine returns a distance, so lower values mean closer matches
var results = await db.ExecuteSQLAsync(@"
    SELECT id, title, vec_distance_cosine(embedding, ?) AS distance
    FROM documents
    ORDER BY distance ASC
    LIMIT ?
", [queryEmbedding, k]);

foreach (var row in results)
{
    Console.WriteLine($"Document: {row["title"]}, Distance: {row["distance"]:F3}");
}

🛠️ Features

Distance Metrics

Choose the right metric for your embeddings:

| Metric         | Use Case                               | SQL Function                 |
|--------------- |--------------------------------------- |----------------------------- |
| Cosine         | Text embeddings (normalized)           | vec_distance_cosine(v1, v2)  |
| Euclidean (L2) | Image embeddings, general purpose      | vec_distance_l2(v1, v2)      |
| Dot Product    | Recommendation systems, max similarity | vec_dot_product(v1, v2)      |
| Hamming        | Binary embeddings                      | vec_distance_hamming(v1, v2) |

// Example: Dot product search (higher = more similar)
var results = await db.ExecuteSQLAsync(@"
    SELECT id, title, vec_dot_product(embedding, ?) AS score
    FROM documents
    ORDER BY score DESC
    LIMIT 10
", [queryEmbedding]);
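
For readers who want to see what the four metrics compute, the following sketch implements them in plain C# using their textbook definitions. The function names are illustrative only and are not the package's API; the SQL functions above may differ in details such as precision or input validation.

```csharp
using System;
using System.Numerics; // BitOperations.PopCount

// Dot product: larger values mean more similar.
static double Dot(float[] a, float[] b)
{
    double sum = 0;
    for (int i = 0; i < a.Length; i++) sum += (double)a[i] * b[i];
    return sum;
}

// Euclidean (L2) distance: straight-line distance between the two points.
static double L2Distance(float[] a, float[] b)
{
    double sum = 0;
    for (int i = 0; i < a.Length; i++)
    {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return Math.Sqrt(sum);
}

// Cosine distance = 1 - cosine similarity, so it ranges from 0 to 2.
static double CosineDistance(float[] a, float[] b) =>
    1.0 - Dot(a, b) / (Math.Sqrt(Dot(a, a)) * Math.Sqrt(Dot(b, b)));

// Hamming distance over bit-packed vectors: number of differing bits.
static int HammingDistance(ulong[] a, ulong[] b)
{
    int bits = 0;
    for (int i = 0; i < a.Length; i++) bits += BitOperations.PopCount(a[i] ^ b[i]);
    return bits;
}

float[] east = { 1f, 0f };
float[] north = { 0f, 1f };
float[] west = { -1f, 0f };

Console.WriteLine(CosineDistance(east, east));   // 0: same direction
Console.WriteLine(CosineDistance(east, north));  // 1: orthogonal
Console.WriteLine(CosineDistance(east, west));   // 2: opposite direction
Console.WriteLine(Dot(east, west));              // -1
Console.WriteLine(L2Distance(east, north));      // square root of 2
Console.WriteLine(HammingDistance(new[] { 0b1011UL }, new[] { 0b0010UL })); // 2
```

Note the sort direction this implies: distances (cosine, L2, Hamming) are ordered ascending, while dot product scores are ordered descending.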

Index Types

HNSW (Hierarchical Navigable Small World)

Best for: Large datasets (10K+ vectors), fast approximate search

await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_hnsw ON vectors(embedding)
    WITH (
        index_type='hnsw',
        m=16,               -- Neighbors per layer (higher = more recall, slower build)
        ef_construction=200, -- Build-time beam search width
        ef_search=50        -- Query-time beam search width
    )
");

Tuning Guide:

  • M=8-16: good default (16 for high recall, 8 for faster build)
  • ef_construction=100-400: higher = better quality, slower build
  • ef_search=10-100: higher = better recall, slower search
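
When tuning these parameters, "recall" is usually measured as recall@k: the fraction of the true k nearest neighbors (from an exact search) that the approximate index also returns. The sketch below shows that calculation; `RecallAtK` is a hypothetical helper and the ID lists are made-up sample data, not benchmark results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: fraction of the exact top-k IDs that also
// appear in the approximate top-k IDs.
static double RecallAtK(IReadOnlyList<int> approximateIds, IReadOnlyList<int> exactIds, int k)
{
    if (k <= 0) throw new ArgumentOutOfRangeException(nameof(k));

    var truth = new HashSet<int>(exactIds.Take(k));
    int hits = approximateIds.Take(k).Count(truth.Contains);
    return (double)hits / truth.Count;
}

// Made-up sample data: IDs from an exact (flat) search and from an HNSW search.
int[] exact = { 7, 3, 9, 1, 4 };
int[] approximate = { 7, 9, 3, 8, 2 };

// 3 of the 5 true neighbors were found, so recall@5 is 3/5.
double recall = RecallAtK(approximate, exact, k: 5);
Console.WriteLine($"recall@5 = {recall}");
```

A practical workflow is to run the same queries against a flat index and an HNSW index, then raise ef_search until recall@k stops improving enough to justify the extra latency.
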

Flat Index

Best for: Small datasets (<1K vectors), exact search

await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_flat ON vectors(embedding)
    WITH (index_type='flat')
");
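
Conceptually, a flat index compares the query against every stored vector and keeps the k closest, which is why it is exact but scales linearly with dataset size. The following sketch illustrates the idea in plain C#; `FlatSearch` is a hypothetical function and does not reflect how the package stores or scans vectors.

```csharp
using System;
using System.Linq;

static double CosineDistance(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += (double)a[i] * b[i];
        normA += (double)a[i] * a[i];
        normB += (double)b[i] * b[i];
    }
    return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Hypothetical exact search: score every vector, sort, keep the k closest.
// The array index doubles as the vector ID.
static (int Id, double Distance)[] FlatSearch(float[][] vectors, float[] query, int k) =>
    vectors
        .Select((v, id) => (Id: id, Distance: CosineDistance(v, query)))
        .OrderBy(r => r.Distance)
        .Take(k)
        .ToArray();

float[][] stored =
{
    new[] { 1f, 0f },   // id 0
    new[] { 0f, 1f },   // id 1
    new[] { 1f, 1f },   // id 2
    new[] { -1f, 0f },  // id 3
};

var top2 = FlatSearch(stored, new[] { 1f, 0.1f }, k: 2);
foreach (var (id, distance) in top2)
    Console.WriteLine($"id={id}, distance={distance:F3}");
// The two closest vectors are id 0, then id 2.
```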

Quantization

Reduce memory usage by 4-32x with minimal accuracy loss:

// Scalar Quantization (4x reduction: float32 → int8)
var indexManager = provider.GetRequiredService<VectorIndexManager>();
await indexManager.CreateIndexAsync(
    tableName: "documents",
    columnName: "embedding",
    indexType: VectorIndexType.Hnsw,
    quantization: QuantizationType.Scalar
);

// Binary Quantization (32x reduction: float32 → bit)
await indexManager.CreateIndexAsync(
    tableName: "documents",
    columnName: "embedding",
    indexType: VectorIndexType.Hnsw,
    quantization: QuantizationType.Binary
);

Tradeoffs:

  • Scalar: ~1-3% recall drop, 4x memory savings
  • Binary: ~5-10% recall drop, 32x memory savings, best for cosine similarity
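
The sketch below shows, under simplifying assumptions, where the 4x and 32x figures come from. It uses a single fixed [min, max] range and 8-bit unsigned codes for scalar quantization, and one sign bit per dimension for binary quantization. All function names are hypothetical, and the package's actual encoding may differ (for example, per-dimension ranges or signed int8 codes).

```csharp
using System;
using System.Linq;
using System.Numerics; // BitOperations.PopCount

// Scalar quantization (assumed scheme): map [min, max] linearly onto 0..255.
static byte[] QuantizeScalar(float[] v, float min, float max)
{
    float scale = 255f / (max - min);
    var codes = new byte[v.Length];
    for (int i = 0; i < v.Length; i++)
        codes[i] = (byte)Math.Clamp(MathF.Round((v[i] - min) * scale), 0f, 255f);
    return codes;
}

// Approximate reconstruction of the original floats from the 8-bit codes.
static float[] DequantizeScalar(byte[] codes, float min, float max)
{
    float step = (max - min) / 255f;
    return codes.Select(c => min + c * step).ToArray();
}

// Binary quantization (assumed scheme): keep only the sign of each dimension.
static ulong[] QuantizeBinary(float[] v)
{
    var bits = new ulong[(v.Length + 63) / 64];
    for (int i = 0; i < v.Length; i++)
        if (v[i] > 0f) bits[i / 64] |= 1UL << (i % 64);
    return bits;
}

static int HammingDistance(ulong[] a, ulong[] b)
{
    int diff = 0;
    for (int i = 0; i < a.Length; i++) diff += BitOperations.PopCount(a[i] ^ b[i]);
    return diff;
}

float[] original = { -1f, -0.45f, 0.1f, 0.6f, 1f };
byte[] codes = QuantizeScalar(original, min: -1f, max: 1f);      // 1 byte per dimension instead of 4
float[] restored = DequantizeScalar(codes, min: -1f, max: 1f);
float maxError = original.Zip(restored, (x, r) => Math.Abs(x - r)).Max();
Console.WriteLine($"Max reconstruction error: {maxError}");      // at most about half a quantization step

ulong[] a = QuantizeBinary(new[] { 0.3f, -0.2f, 0.9f, -0.7f });  // sign bits 1,0,1,0
ulong[] b = QuantizeBinary(new[] { 0.1f, 0.4f, 0.8f, -0.5f });   // sign bits 1,1,1,0
Console.WriteLine(HammingDistance(a, b)); // 1
```

Binary codes discard magnitude entirely and keep only direction, which is one way to see why the source recommends them for cosine similarity.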

SQL Functions

-- Distance/similarity functions
vec_distance_cosine(v1, v2)    -- Returns 0-2 (lower = more similar)
vec_distance_l2(v1, v2)        -- Euclidean distance
vec_dot_product(v1, v2)        -- Dot product (higher = more similar)
vec_distance_hamming(v1, v2)   -- Hamming distance (binary vectors)

-- Vector operations
vec_length(v)                  -- Vector L2 norm
vec_normalize(v)               -- Normalize to unit length
vec_add(v1, v2)                -- Element-wise addition
vec_subtract(v1, v2)           -- Element-wise subtraction
vec_multiply(v, scalar)        -- Scalar multiplication

-- Metadata
vec_dimensions(v)              -- Get vector dimensions

📊 Use Cases

1. AI/RAG Applications

Store document embeddings for retrieval-augmented generation:

// Index knowledge base
var docs = await LoadDocumentsAsync();
foreach (var doc in docs)
{
    var embedding = await GetEmbeddingAsync(doc.Content);  // OpenAI, Ollama, etc.
    await db.ExecuteSQLAsync(@"
        INSERT INTO knowledge_base (id, content, embedding)
        VALUES (?, ?, ?)
    ", [doc.Id, doc.Content, embedding]);
}

// Retrieve context for LLM
var userQuestion = "What is vector search?";
var queryEmbedding = await GetEmbeddingAsync(userQuestion);
var context = await db.ExecuteSQLAsync(@"
    SELECT content
    FROM knowledge_base
    ORDER BY vec_distance_cosine(embedding, ?)
    LIMIT 5
", [queryEmbedding]);

// Send context + question to LLM...
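
The final step, combining the retrieved rows with the question, is left open above. One possible shape for it is sketched here; `BuildPrompt`, the prompt wording, and the character budget are all assumptions made for illustration, and the right format depends on the LLM you use.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: number each retrieved chunk and stop once the
// (assumed) character budget for context is used up.
static string BuildPrompt(IEnumerable<string> contextChunks, string question, int maxContextChars = 4000)
{
    var sb = new StringBuilder("Answer the question using only the context below.\n\nContext:\n");
    int used = 0, index = 0;

    foreach (var chunk in contextChunks)
    {
        if (used + chunk.Length > maxContextChars) break;
        sb.Append('[').Append(++index).Append("] ").Append(chunk).Append('\n');
        used += chunk.Length;
    }

    sb.Append("\nQuestion: ").Append(question);
    return sb.ToString();
}

// Stand-in for the "content" values returned by the similarity query.
string[] retrieved =
{
    "Vector search finds items whose embeddings are close to the query embedding.",
    "HNSW is a graph-based index for approximate nearest neighbor search.",
};

string prompt = BuildPrompt(retrieved, "What is vector search?");
Console.WriteLine(prompt);
```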

2. Semantic Search

Search by meaning, not just keywords:

// Traditional keyword search (may miss relevant docs)
var results = await db.ExecuteSQLAsync(@"
    SELECT * FROM articles
    WHERE content LIKE '%machine learning%'
");

// Semantic vector search (finds conceptually similar docs)
var queryEmbedding = await GetEmbeddingAsync("machine learning");
var semanticResults = await db.ExecuteSQLAsync(@"
    SELECT id, title, vec_distance_cosine(embedding, ?) AS relevance
    FROM articles
    ORDER BY relevance ASC
    LIMIT 10
", [queryEmbedding]);

3. Recommendation Systems

Find similar products, users, or content:

// Find similar products based on embedding similarity
var productEmbedding = await GetProductEmbeddingAsync(productId);
var recommendations = await db.ExecuteSQLAsync(@"
    SELECT id, name, price, vec_dot_product(embedding, ?) AS score
    FROM products
    WHERE id != ?
    ORDER BY score DESC
    LIMIT 5
", [productEmbedding, productId]);

4. Image/Audio Similarity

Compare media by their embeddings (e.g., CLIP, Wav2Vec):

// Find visually similar images
var imageEmbedding = await GetImageEmbeddingAsync(imagePath);  // CLIP model
var similarImages = await db.ExecuteSQLAsync(@"
    SELECT id, path, vec_distance_l2(embedding, ?) AS distance
    FROM images
    ORDER BY distance ASC
    LIMIT 20
", [imageEmbedding]);

🔐 Security

Encrypted Vector Storage

All vectors are encrypted at rest using AES-256-GCM when you create an encrypted database:

using var db = factory.CreateEncrypted(
    dbPath: "./secure_vectors",
    password: "YourStrongPassword123!",
    options: new DatabaseOptions
    {
        EnableEncryption = true  // Vectors encrypted automatically
    }
);

What's encrypted:

  • ✅ Vector embeddings (VECTOR columns)
  • ✅ HNSW graph structure
  • ✅ Quantization tables
  • ✅ All metadata
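
To illustrate what AES-256-GCM encryption of an embedding involves, here is a standalone sketch using the .NET `System.Security.Cryptography.AesGcm` class. It is not the package's storage format: the key derivation (PBKDF2 with an assumed iteration count), the salt and nonce handling, and the byte layout are all assumptions chosen for the example.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Security.Cryptography;

float[] embedding = { 0.12f, -0.53f, 0.98f, 0.07f };

// Reinterpret the floats as raw bytes (4 bytes per dimension).
byte[] plaintext = MemoryMarshal.AsBytes(embedding.AsSpan()).ToArray();

// Derive a 256-bit key from a password. The iteration count is an assumption.
byte[] salt = RandomNumberGenerator.GetBytes(16);
byte[] key = Rfc2898DeriveBytes.Pbkdf2(
    "YourStrongPassword123!", salt, 100_000, HashAlgorithmName.SHA256, 32);

byte[] nonce = RandomNumberGenerator.GetBytes(12);  // must be unique per encryption with the same key
byte[] ciphertext = new byte[plaintext.Length];
byte[] tag = new byte[16];                          // authentication tag

using var aes = new AesGcm(key, tagSizeInBytes: 16);
aes.Encrypt(nonce, plaintext, ciphertext, tag);

// Decrypt verifies the tag and throws if the ciphertext or tag was modified.
byte[] decrypted = new byte[ciphertext.Length];
aes.Decrypt(nonce, ciphertext, tag, decrypted);

float[] roundTrip = MemoryMarshal.Cast<byte, float>(decrypted.AsSpan()).ToArray();
Console.WriteLine(string.Join(", ", roundTrip));
```

Because GCM is authenticated, tampering with stored data is detected at decryption time rather than silently producing corrupted vectors.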

⚡ Performance Tips

1. Choose the Right Index

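| Dataset Size     | Recommended Index          | Search Time |
|----------------- |--------------------------- |------------ |
| < 1K vectors     | Flat                       | 0.1-1 ms    |
| 1K-10K vectors   | HNSW (M=8)                 | 0.2-0.5 ms  |
| 10K-100K vectors | HNSW (M=16)                | 0.5-2 ms    |
| 100K+ vectors    | HNSW (M=16) + Quantization | 1-5 ms      |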

2. Tune HNSW Parameters

// High recall (slower)
await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_high_recall ON vectors(embedding)
    WITH (index_type='hnsw', m=32, ef_construction=400, ef_search=100)
");

// Fast search (lower recall)
await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_fast ON vectors(embedding)
    WITH (index_type='hnsw', m=8, ef_construction=100, ef_search=10)
");

3. Use Quantization for Large Datasets

// 1M vectors, 1536 dimensions:
// - Unquantized: ~6GB RAM
// - Scalar:      ~1.5GB RAM (4x reduction)
// - Binary:      ~200MB RAM (32x reduction)

var indexManager = provider.GetRequiredService<VectorIndexManager>();
await indexManager.CreateIndexAsync(
    tableName: "large_embeddings",
    columnName: "embedding",
    indexType: VectorIndexType.Hnsw,
    quantization: QuantizationType.Scalar  // 4x memory savings
);
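
The RAM figures in the comment above follow directly from the bytes stored per dimension. This short calculation reproduces them for raw vector data only; the HNSW graph overhead quoted earlier (200-400 bytes per vector) comes on top and is shown separately.

```csharp
using System;

const long vectors = 1_000_000;
const int dims = 1536;

long float32Bytes = vectors * dims * sizeof(float);  // 4 bytes per dimension = 6,144,000,000
long int8Bytes    = vectors * dims * sizeof(sbyte);  // 1 byte per dimension  = 1,536,000,000
long binaryBytes  = vectors * dims / 8;              // 1 bit per dimension   =   192,000,000

// HNSW graph overhead, using the 200-400 bytes/vector range from the highlights table.
long graphLow  = vectors * 200;                      // 200,000,000
long graphHigh = vectors * 400;                      // 400,000,000

Console.WriteLine($"float32: {float32Bytes / 1e9} GB");
Console.WriteLine($"int8:    {int8Bytes / 1e9} GB");
Console.WriteLine($"binary:  {binaryBytes / 1e6} MB");
Console.WriteLine($"graph:   {graphLow / 1e6}-{graphHigh / 1e6} MB");
```

With binary quantization the graph structure can be as large as or larger than the vector data itself, so the overall saving for the whole index is smaller than 32x.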

4. Batch Operations

// ✅ DO: Batch inserts
using var transaction = db.BeginTransaction();
foreach (var doc in documents)
{
    await db.ExecuteSQLAsync(@"
        INSERT INTO documents (id, embedding) VALUES (?, ?)
    ", [doc.Id, doc.Embedding]);
}
transaction.Commit();

// ❌ DON'T: Individual transactions
foreach (var doc in documents)
{
    using var tx = db.BeginTransaction();
    await db.ExecuteSQLAsync("INSERT INTO documents ...");
    tx.Commit();  // Slow!
}
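
For very large imports, a middle ground between one giant transaction and one transaction per row is to commit in fixed-size batches. The sketch below only demonstrates the batching itself with the standard `Enumerable.Chunk` method; the batch size of 1000 is an arbitrary assumption, and the database calls are indicated in comments rather than executed.

```csharp
using System;
using System.Linq;

// Stand-in for the documents to insert.
int[] documentIds = Enumerable.Range(1, 2500).ToArray();
const int batchSize = 1000;  // assumed value; tune for your workload
int commits = 0;

foreach (int[] batch in documentIds.Chunk(batchSize))
{
    // In real code: begin a transaction here, insert every item in `batch`
    // with db.ExecuteSQLAsync(...), then commit once for the whole batch.
    commits++;
    Console.WriteLine($"Batch {commits}: {batch.Length} rows");
}

Console.WriteLine($"{commits} commits for {documentIds.Length} rows");
```

With 2500 rows and a batch size of 1000 this produces three commits (1000, 1000, and 500 rows) instead of 2500.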

🧪 Testing

Run the included benchmarks to verify performance on your hardware:

cd tests/SharpCoreDB.Benchmarks
dotnet run -c Release -- --filter "*VectorSearch*"

Example output:

| Method        | VectorCount | Dimensions | K   | Mean      | Error   | StdDev  | Allocated |
|-------------- |------------ |----------- |---- |----------:|--------:|--------:|----------:|
| HnswSearch    | 100000      | 1536       | 10  | 1.845 ms  | 0.032 ms| 0.028 ms|     2.1 KB|
| FlatSearch    | 100000      | 1536       | 10  | 89.32 ms  | 1.23 ms | 1.15 ms |     2.1 KB|

📚 Documentation


🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Areas for Contribution

  • 🚀 Additional distance metrics (Manhattan, Mahalanobis, etc.)
  • 🔬 New quantization strategies (e.g., product quantization)
  • 📊 Performance benchmarks on different hardware
  • 📖 Documentation improvements and examples
  • 🐛 Bug reports and fixes

📄 License

This project is licensed under the MIT License. See LICENSE for details.


🙏 Acknowledgments


📞 Support


Made with ❤️ by MPCoreDeveloper

Product Compatible and additional computed target framework versions.
.NET net10.0 is compatible.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 

NuGet packages (1)

Showing the top 1 NuGet packages that depend on SharpCoreDB.VectorSearch:

Package Downloads
SharpCoreDB.Graph.Advanced

`SharpCoreDB.Graph` provides traversal and pathfinding; `SharpCoreDB.Graph.Advanced` adds analytics, centrality metrics, subgraph analysis, and GraphRAG with vector search integration.

GitHub repositories

This package is not used by any popular GitHub repositories.

| Version | Downloads | Last Updated |
|-------- |---------- |------------- |
| 1.5.0   | 93        | 3/14/2026    |
| 1.4.1   | 90        | 2/28/2026    |
| 1.3.5   | 102       | 2/21/2026    |
| 1.3.0   | 100       | 2/14/2026    |