SharpCoreDB.VectorSearch 1.3.0

There is a newer version of this package available.
See the version list below for details.


🔍 SharpCoreDB.VectorSearch

High-performance vector search extension for SharpCoreDB — SIMD-accelerated similarity search with HNSW indexing, quantization, and encrypted storage.

🚀 Overview

SharpCoreDB.VectorSearch enables semantic search, similarity matching, and AI/RAG applications by storing and querying high-dimensional embeddings directly within your SharpCoreDB database. It's built for production workloads with:

✅ Pure managed C# 14 — Zero native dependencies
✅ SIMD-accelerated — AVX-512, AVX2, ARM NEON support
✅ HNSW indexing — Logarithmic-time approximate nearest neighbor search
✅ Quantization — Scalar and binary quantization for memory efficiency
✅ Encrypted storage — AES-256-GCM for sensitive embeddings
✅ NativeAOT compatible — Deploy as trimmed, self-contained executables
✅ SQL integration — Native VECTOR(N) type and vec_*() functions
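
To illustrate what "SIMD-accelerated" means in managed code, here is a minimal sketch of a dot product built on `System.Numerics.Vector<float>`. The helper name `DotSimd` is hypothetical and this is not SharpCoreDB's internal kernel; it only shows the general pattern of processing several floats per instruction and finishing the remainder with a scalar loop.

```csharp
using System;
using System.Numerics;

// Hypothetical helper for illustration only (not part of the SharpCoreDB API).
float DotSimd(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    int width = Vector<float>.Count;   // floats per hardware vector, decided by the runtime
    var acc = Vector<float>.Zero;      // one running partial sum per lane
    int i = 0;

    // Vectorized main loop: multiply and accumulate `width` elements at a time.
    for (; i <= a.Length - width; i += width)
    {
        acc += new Vector<float>(a.Slice(i, width)) * new Vector<float>(b.Slice(i, width));
    }

    // Collapse the lanes, then handle the leftover elements one by one.
    float sum = Vector.Sum(acc);
    for (; i < a.Length; i++)
        sum += a[i] * b[i];

    return sum;
}

float[] x = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
float[] y = { 9, 8, 7, 6, 5, 4, 3, 2, 1 };
Console.WriteLine(DotSimd(x, y)); // prints 165
```

The runtime picks the vector width for the current CPU, so the same code runs on x64 and ARM64 without changes.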

Performance Highlights

Operation	Typical Latency	Notes
Vector Search (k=10)	0.5-2ms	1M vectors, HNSW index, cosine similarity
Index Build (1M vectors)	2-5 seconds	M=16, efConstruction=200
Memory Overhead	200-400 bytes/vector	HNSW graph structure (M=16)
Throughput	500-2000 queries/sec	Single-threaded on modern CPU

Benchmarks run on AMD Ryzen 9 5950X with 1536-dim vectors. See tests/SharpCoreDB.Benchmarks/VectorSearchPerformanceBenchmark.cs for reproducible results.

📦 Installation

# Install SharpCoreDB core (if not already installed)
dotnet add package SharpCoreDB --version 1.3.0

# Install vector search extension
dotnet add package SharpCoreDB.VectorSearch --version 1.3.0

Requirements:

.NET 10.0 or later
SharpCoreDB 1.3.0+
64-bit runtime (x64, ARM64)

🎯 Quick Start

1. Register Vector Support

using Microsoft.Extensions.DependencyInjection;
using SharpCoreDB;
using SharpCoreDB.VectorSearch;

var services = new ServiceCollection();
services.AddSharpCoreDB()
    .AddVectorSupport(options =>
    {
        options.EnableQueryOptimization = true;  // Auto-select indexes
        options.DefaultIndexType = VectorIndexType.Hnsw;
        options.MaxCacheSize = 1_000_000;       // Cache 1M vectors
    });

var provider = services.BuildServiceProvider();
var factory = provider.GetRequiredService<DatabaseFactory>();

using var db = factory.Create("./vector_db", "StrongPassword!");

2. Create Vector Schema

// Create table with VECTOR column
await db.ExecuteSQLAsync(@"
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        title TEXT,
        content TEXT,
        embedding VECTOR(1536)  -- OpenAI text-embedding-3-small dimensions
    )
");

// Build HNSW index for fast similarity search
await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_doc_embedding ON documents(embedding)
    WITH (index_type='hnsw', m=16, ef_construction=200)
");

3. Insert Vectors

// Insert embeddings (e.g., from OpenAI API)
var embedding = new float[1536]; // Your embedding vector
// ... populate embedding from your ML model ...

await db.ExecuteSQLAsync(@"
    INSERT INTO documents (id, title, content, embedding)
    VALUES (?, ?, ?, ?)
", [1, "AI Overview", "Artificial Intelligence is...", embedding]);

4. Semantic Search

// Search for similar documents
var queryEmbedding = new float[1536]; // Query embedding
var k = 10;  // Top-10 results

// vec_distance_cosine returns a distance, so lower values are closer matches
var results = await db.ExecuteSQLAsync(@"
    SELECT id, title, vec_distance_cosine(embedding, ?) AS distance
    FROM documents
    ORDER BY distance ASC
    LIMIT ?
", [queryEmbedding, k]);

foreach (var row in results)
{
    Console.WriteLine($"Document: {row["title"]}, Distance: {row["distance"]:F3}");
}

🛠️ Features

Distance Metrics

Choose the right metric for your embeddings:

Metric	Use Case	SQL Function
Cosine	Text embeddings (normalized)	`vec_distance_cosine(v1, v2)`
Euclidean (L2)	Image embeddings, general purpose	`vec_distance_l2(v1, v2)`
Dot Product	Recommendation systems, max similarity	`vec_dot_product(v1, v2)`
Hamming	Binary embeddings	`vec_distance_hamming(v1, v2)`
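
For readers who want to see what these metrics compute, the sketch below gives plain C# reference versions of dot product, cosine distance, and Euclidean distance. It assumes cosine distance is defined as 1 minus cosine similarity, which matches the 0-2 range documented for `vec_distance_cosine`; the function names are hypothetical and the library's own implementation may differ.

```csharp
using System;

// Reference implementations for illustration (hypothetical names, not the library API).
float Dot(float[] a, float[] b)
{
    float sum = 0f;
    for (int i = 0; i < a.Length; i++) sum += a[i] * b[i];
    return sum;
}

// Assumed definition: 1 - cos(angle). 0 = same direction, 1 = orthogonal, 2 = opposite.
float CosineDistance(float[] a, float[] b)
{
    return 1f - Dot(a, b) / (MathF.Sqrt(Dot(a, a)) * MathF.Sqrt(Dot(b, b)));
}

float L2Distance(float[] a, float[] b)
{
    float sum = 0f;
    for (int i = 0; i < a.Length; i++)
    {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return MathF.Sqrt(sum);
}

float[] east = { 1f, 0f };
float[] north = { 0f, 1f };
float[] west = { -1f, 0f };

Console.WriteLine(CosineDistance(east, east));   // same direction
Console.WriteLine(CosineDistance(east, north));  // orthogonal
Console.WriteLine(CosineDistance(east, west));   // opposite direction
Console.WriteLine(L2Distance(new float[] { 0f, 0f }, new float[] { 3f, 4f }));
```

Distances sort ascending (smaller is closer), while dot product sorts descending (larger is closer), which is why the queries in this README use `ASC` with the `vec_distance_*` functions and `DESC` with `vec_dot_product`.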

// Example: Dot product search (higher = more similar)
var results = await db.ExecuteSQLAsync(@"
    SELECT id, title, vec_dot_product(embedding, ?) AS score
    FROM documents
    ORDER BY score DESC
    LIMIT 10
", [queryEmbedding]);

Index Types

HNSW (Hierarchical Navigable Small World)

Best for: Medium to large datasets (1K+ vectors), fast approximate search

await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_hnsw ON vectors(embedding)
    WITH (
        index_type='hnsw',
        m=16,               -- Neighbors per layer (higher = more recall, slower build)
        ef_construction=200, -- Build-time beam search width
        ef_search=50        -- Query-time beam search width
    )
");

Tuning Guide:

M=8-16 — Good default (16 for high recall, 8 for faster build)
ef_construction=100-400 — Higher = better quality, slower build
ef_search=10-100 — Higher = better recall, slower search
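
When tuning these parameters, it helps to measure recall directly: run the same query against an exact (flat) search and against the HNSW index, then compare the returned IDs. The sketch below shows one way to compute recall@k from two ID lists. `RecallAtK` is a hypothetical helper, not part of the package, and the ID arrays are made-up sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: fraction of the exact top-k that the approximate search also returned.
double RecallAtK(IReadOnlyList<int> approxIds, IReadOnlyList<int> exactIds, int k)
{
    var truth = new HashSet<int>(exactIds.Take(k));
    if (truth.Count == 0) return 0.0;

    int hits = approxIds.Take(k).Count(truth.Contains);
    return (double)hits / truth.Count;
}

// Sample data: the approximate search missed ID 4 and returned ID 9 instead.
int[] exact = { 1, 2, 3, 4, 5 };
int[] approx = { 1, 2, 3, 9, 5 };

double recall = RecallAtK(approx, exact, 5);
Console.WriteLine($"recall@5 = {recall:P0}");
```

Averaging this value over a few hundred representative queries gives a number you can track while raising or lowering `ef_search`.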

Flat Index

Best for: Small datasets (<1K vectors), exact search

await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_flat ON vectors(embedding)
    WITH (index_type='flat')
");

Quantization

Reduce memory usage by 4x (scalar) or 32x (binary) in exchange for a small recall drop:

// Scalar Quantization (4x reduction: float32 → int8)
var indexManager = provider.GetRequiredService<VectorIndexManager>();
await indexManager.CreateIndexAsync(
    tableName: "documents",
    columnName: "embedding",
    indexType: VectorIndexType.Hnsw,
    quantization: QuantizationType.Scalar
);

// Binary Quantization (32x reduction: float32 → bit)
await indexManager.CreateIndexAsync(
    tableName: "documents",
    columnName: "embedding",
    indexType: VectorIndexType.Hnsw,
    quantization: QuantizationType.Binary
);

Tradeoffs:

Scalar: ~1-3% recall drop, 4x memory savings
Binary: ~5-10% recall drop, 32x memory savings, best for cosine similarity
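
To make the 4x figure concrete, here is a minimal sketch of scalar quantization: each float32 component (4 bytes) is mapped to one int8 code (1 byte) using the vector's own min/max range. This is a generic textbook scheme with hypothetical helper names; SharpCoreDB's actual quantizer may choose ranges and rounding differently.

```csharp
using System;

// Hypothetical helpers for illustration (not the library's quantizer).
sbyte[] QuantizeScalar(float[] v, out float min, out float scale)
{
    min = 0f;
    scale = 1f;
    if (v.Length == 0) return Array.Empty<sbyte>();

    min = float.MaxValue;
    float max = float.MinValue;
    foreach (float x in v)
    {
        min = MathF.Min(min, x);
        max = MathF.Max(max, x);
    }

    // Spread the observed range over the 256 available int8 codes.
    scale = (max - min) / 255f;
    if (scale == 0f) scale = 1f;   // constant vector: any scale works

    var codes = new sbyte[v.Length];
    for (int i = 0; i < v.Length; i++)
        codes[i] = (sbyte)(MathF.Round((v[i] - min) / scale) - 128f);
    return codes;
}

float[] Dequantize(sbyte[] codes, float min, float scale)
{
    var v = new float[codes.Length];
    for (int i = 0; i < codes.Length; i++)
        v[i] = (codes[i] + 128) * scale + min;
    return v;
}

float[] original = { -1f, 0.25f, 1f };
sbyte[] quantized = QuantizeScalar(original, out float lo, out float step);
float[] restored = Dequantize(quantized, lo, step);

for (int i = 0; i < original.Length; i++)
    Console.WriteLine($"{original[i]} -> {quantized[i]} -> {restored[i]}");
```

The round trip is lossy: each component can move by up to about half a quantization step, which is where the small recall drop comes from.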

SQL Functions

-- Distance/similarity functions
vec_distance_cosine(v1, v2)    -- Returns 0-2 (lower = more similar)
vec_distance_l2(v1, v2)        -- Euclidean distance
vec_dot_product(v1, v2)        -- Dot product (higher = more similar)
vec_distance_hamming(v1, v2)   -- Hamming distance (binary vectors)

-- Vector operations
vec_length(v)                  -- Vector L2 norm
vec_normalize(v)               -- Normalize to unit length
vec_add(v1, v2)                -- Element-wise addition
vec_subtract(v1, v2)           -- Element-wise subtraction
vec_multiply(v, scalar)        -- Scalar multiplication

-- Metadata
vec_dimensions(v)              -- Get vector dimensions
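
As an illustration of what a Hamming distance over binary vectors involves, the sketch below packs the sign of each float component into one bit and counts differing bits with `BitOperations.PopCount`. The sign-based packing and the helper names are assumptions made for this example; the README does not specify how the library encodes binary vectors.

```csharp
using System;
using System.Numerics;

// Hypothetical helper: one bit per dimension, set when the component is positive.
ulong[] PackSigns(float[] v)
{
    var bits = new ulong[(v.Length + 63) / 64];
    for (int i = 0; i < v.Length; i++)
    {
        if (v[i] > 0f)
            bits[i / 64] |= 1UL << (i % 64);
    }
    return bits;
}

// Hamming distance: number of bit positions where the two codes differ.
int Hamming(ulong[] a, ulong[] b)
{
    int distance = 0;
    for (int i = 0; i < a.Length; i++)
        distance += BitOperations.PopCount(a[i] ^ b[i]);
    return distance;
}

ulong[] first = PackSigns(new float[] { 0.7f, -0.2f, 0.1f, -0.9f });
ulong[] second = PackSigns(new float[] { 0.3f, 0.4f, -0.5f, -0.1f });

Console.WriteLine(Hamming(first, second)); // prints 2
```

A 1536-dimensional vector packed this way occupies 24 `ulong` values (192 bytes), and comparing two codes takes 24 XOR and popcount operations.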

📊 Use Cases

1. AI/RAG Applications

Store document embeddings for retrieval-augmented generation:

// Index knowledge base
var docs = await LoadDocumentsAsync();
foreach (var doc in docs)
{
    var embedding = await GetEmbeddingAsync(doc.Content);  // OpenAI, Ollama, etc.
    await db.ExecuteSQLAsync(@"
        INSERT INTO knowledge_base (id, content, embedding)
        VALUES (?, ?, ?)
    ", [doc.Id, doc.Content, embedding]);
}

// Retrieve context for LLM
var userQuestion = "What is vector search?";
var queryEmbedding = await GetEmbeddingAsync(userQuestion);
var context = await db.ExecuteSQLAsync(@"
    SELECT content
    FROM knowledge_base
    ORDER BY vec_distance_cosine(embedding, ?)
    LIMIT 5
", [queryEmbedding]);

// Send context + question to LLM...
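
The final step, combining retrieved rows with the question, is ordinary string building. The sketch below shows one possible prompt layout; `BuildPrompt`, the instruction wording, and the sample chunks are all hypothetical, and in practice the chunks would come from the `content` column of the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: numbers each retrieved chunk and appends the user's question.
string BuildPrompt(IReadOnlyList<string> contextChunks, string question)
{
    var sb = new StringBuilder();
    sb.AppendLine("Answer the question using only the context below.");
    sb.AppendLine();

    for (int i = 0; i < contextChunks.Count; i++)
        sb.AppendLine($"[{i + 1}] {contextChunks[i]}");

    sb.AppendLine();
    sb.Append("Question: ").Append(question);
    return sb.ToString();
}

// Sample chunks standing in for rows returned by the similarity query.
var chunks = new[]
{
    "Vector search ranks rows by embedding distance.",
    "HNSW is an approximate nearest neighbor index."
};

string prompt = BuildPrompt(chunks, "What is vector search?");
Console.WriteLine(prompt);
```

Numbering the chunks makes it easier to ask the model to cite which passage supported its answer.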

2. Semantic Search

Search by meaning, not just keywords:

// Traditional keyword search (may miss relevant docs)
var results = await db.ExecuteSQLAsync(@"
    SELECT * FROM articles
    WHERE content LIKE '%machine learning%'
");

// Semantic vector search (finds conceptually similar docs)
var queryEmbedding = await GetEmbeddingAsync("machine learning");
var semanticResults = await db.ExecuteSQLAsync(@"
    SELECT id, title, vec_distance_cosine(embedding, ?) AS distance
    FROM articles
    ORDER BY distance ASC  -- lower distance = more relevant
    LIMIT 10
", [queryEmbedding]);

3. Recommendation Systems

Find similar products, users, or content:

// Find similar products based on embedding similarity
var productEmbedding = await GetProductEmbeddingAsync(productId);
var recommendations = await db.ExecuteSQLAsync(@"
    SELECT id, name, price, vec_dot_product(embedding, ?) AS score
    FROM products
    WHERE id != ?
    ORDER BY score DESC
    LIMIT 5
", [productEmbedding, productId]);

4. Image/Audio Similarity

Compare media by their embeddings (e.g., CLIP, Wav2Vec):

// Find visually similar images
var imageEmbedding = await GetImageEmbeddingAsync(imagePath);  // CLIP model
var similarImages = await db.ExecuteSQLAsync(@"
    SELECT id, path, vec_distance_l2(embedding, ?) AS distance
    FROM images
    ORDER BY distance ASC
    LIMIT 20
", [imageEmbedding]);

🔐 Security

Encrypted Vector Storage

All vectors are encrypted at rest using AES-256-GCM when you create an encrypted database:

using var db = factory.CreateEncrypted(
    dbPath: "./secure_vectors",
    password: "YourStrongPassword123!",
    options: new DatabaseOptions
    {
        EnableEncryption = true  // Vectors encrypted automatically
    }
);

What's encrypted:

✅ Vector embeddings (VECTOR columns)
✅ HNSW graph structure
✅ Quantization tables
✅ All metadata

⚡ Performance Tips

1. Choose the Right Index

Dataset Size	Recommended Index	Search Time
< 1K vectors	Flat	0.1-1ms
1K-10K vectors	HNSW (M=8)	0.2-0.5ms
10K-100K vectors	HNSW (M=16)	0.5-2ms
100K+ vectors	HNSW (M=16) + Quantization	1-5ms
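
If you create indexes programmatically, the table above can be encoded as a small helper. The sketch below is one way to do that; `SuggestWithClause` is a hypothetical function, the thresholds are copied from the table, and the returned text reuses the `WITH (...)` options shown elsewhere in this README.

```csharp
using System;

// Hypothetical helper: maps a vector count to the index options suggested in the table.
string SuggestWithClause(int vectorCount, out bool suggestQuantization)
{
    // The table recommends adding quantization from 100K vectors upward.
    suggestQuantization = vectorCount >= 100_000;

    if (vectorCount < 1_000) return "index_type='flat'";
    if (vectorCount < 10_000) return "index_type='hnsw', m=8";
    return "index_type='hnsw', m=16";
}

foreach (int count in new[] { 500, 5_000, 50_000, 500_000 })
{
    string options = SuggestWithClause(count, out bool quantize);
    Console.WriteLine($"{count} vectors: WITH ({options}), quantization: {quantize}");
}
```

Treat the thresholds as starting points and confirm them with the benchmarks on your own hardware.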

2. Tune HNSW Parameters

// High recall (slower)
await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_high_recall ON vectors(embedding)
    WITH (index_type='hnsw', m=32, ef_construction=400, ef_search=100)
");

// Fast search (lower recall)
await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_fast ON vectors(embedding)
    WITH (index_type='hnsw', m=8, ef_construction=100, ef_search=10)
");

3. Use Quantization for Large Datasets

// 1M vectors, 1536 dimensions (raw vector storage, excluding the HNSW graph):
// - Unquantized: ~6GB RAM
// - Scalar:      ~1.5GB RAM (4x reduction)
// - Binary:      ~200MB RAM (32x reduction)

var indexManager = provider.GetRequiredService<VectorIndexManager>();
await indexManager.CreateIndexAsync(
    tableName: "large_embeddings",
    columnName: "embedding",
    indexType: VectorIndexType.Hnsw,
    quantization: QuantizationType.Scalar  // 4x memory savings
);
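
The memory figures in the comment above follow from simple arithmetic on raw vector storage (vector count × dimensions × bytes per component):

```latex
\begin{aligned}
\text{float32:} \quad & 10^{6} \times 1536 \times 4\ \text{B} = 6.144 \times 10^{9}\ \text{B} \approx 6.1\ \text{GB} \\
\text{int8 (scalar):} \quad & 10^{6} \times 1536 \times 1\ \text{B} = 1.536 \times 10^{9}\ \text{B} \approx 1.5\ \text{GB} \\
\text{1 bit (binary):} \quad & 10^{6} \times \tfrac{1536}{8}\ \text{B} = 1.92 \times 10^{8}\ \text{B} \approx 192\ \text{MB}
\end{aligned}
```

These numbers cover the vectors only. The HNSW graph adds the 200-400 bytes per vector listed under Performance Highlights, or roughly 0.2-0.4 GB for 1M vectors, so with binary quantization the graph can take as much memory as the vectors themselves.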

4. Batch Operations

// ✅ DO: Batch inserts
using var transaction = db.BeginTransaction();
foreach (var doc in documents)
{
    await db.ExecuteSQLAsync(@"
        INSERT INTO documents (id, embedding) VALUES (?, ?)
    ", [doc.Id, doc.Embedding]);
}
transaction.Commit();

// ❌ DON'T: Individual transactions
foreach (var doc in documents)
{
    using var tx = db.BeginTransaction();
    await db.ExecuteSQLAsync("INSERT INTO documents ...");
    tx.Commit();  // Slow!
}
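
For very large imports, one option is to commit in bounded batches instead of one huge transaction. The sketch below shows the chunking logic only, using `Enumerable.Chunk` from LINQ. `ForEachBatch` is a hypothetical helper and the batch size of 1000 is an arbitrary example; the callback is where you would open a transaction, run the inserts, and commit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: hands fixed-size batches to a callback and returns the batch count.
int ForEachBatch<T>(IEnumerable<T> items, int batchSize, Action<T[]> processBatch)
{
    int batches = 0;
    foreach (T[] batch in items.Chunk(batchSize))
    {
        // In real code: begin a transaction, insert every item in `batch`, then commit.
        processBatch(batch);
        batches++;
    }
    return batches;
}

var ids = Enumerable.Range(1, 2500);   // stand-in for 2500 documents
var sizes = new List<int>();

int count = ForEachBatch(ids, 1000, batch => sizes.Add(batch.Length));
Console.WriteLine($"{count} batches: {string.Join(", ", sizes)}"); // prints "3 batches: 1000, 1000, 500"
```

Bounded batches keep the amount of uncommitted work predictable while still avoiding the per-row transaction cost shown in the DON'T example.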

🧪 Testing

Run the included benchmarks to verify performance on your hardware:

cd tests/SharpCoreDB.Benchmarks
dotnet run -c Release -- --filter "*VectorSearch*"

Example output:

| Method        | VectorCount | Dimensions | K   | Mean      | Error   | StdDev  | Allocated |
|-------------- |------------ |----------- |---- |----------:|--------:|--------:|----------:|
| HnswSearch    | 100000      | 1536       | 10  | 1.845 ms  | 0.032 ms| 0.028 ms|     2.1 KB|
| FlatSearch    | 100000      | 1536       | 10  | 89.32 ms  | 1.23 ms | 1.15 ms |     2.1 KB|

📚 Documentation

Full Vector Search Guide — Complete documentation
Implementation Details — Architecture overview
Migration Guide — Upgrade from older versions
API Reference — Full API documentation

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Areas for Contribution

🚀 Additional distance metrics (Manhattan, Mahalanobis, etc.)
🔬 New quantization strategies, such as product quantization (PQ)
📊 Performance benchmarks on different hardware
📖 Documentation improvements and examples
🐛 Bug reports and fixes

📄 License

This project is licensed under the MIT License. See LICENSE for details.

🙏 Acknowledgments

HNSW Algorithm: Based on Malkov & Yashunin (2018)
SIMD Optimizations: Inspired by Faiss and Qdrant
C# 14 Features: Built with modern .NET practices from Microsoft

📞 Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Email: support@sharpcoredb.com

Made with ❤️ by MPCoreDeveloper

Product	Compatible and additional computed target framework versions.
.NET	net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed.


net10.0
- SharpCoreDB (>= 1.3.0)

NuGet packages (1)

Showing the top NuGet package that depends on SharpCoreDB.VectorSearch:

Package	Downloads
SharpCoreDB.Graph.Advanced `SharpCoreDB.Graph` provides traversal and pathfinding; `SharpCoreDB.Graph.Advanced` adds analytics, centrality metrics, subgraph analysis, and GraphRAG with vector search integration.	75

GitHub repositories

This package is not used by any popular GitHub repositories.

Version	Downloads	Last Updated
1.5.0	93	3/14/2026
1.4.1	90	2/28/2026
1.3.5	102	2/21/2026
1.3.0	100	2/14/2026