
pgvector & Vector Search

How HyperSaaS uses pgvector for document embeddings and similarity search.

Why pgvector?

Instead of running a separate vector database (Pinecone, Weaviate, Qdrant), HyperSaaS keeps embeddings inside PostgreSQL using pgvector. This means:

  • No extra infrastructure to manage
  • Transactional consistency with your relational data
  • Standard SQL queries with vector operations
  • Production-ready up to ~10M vectors with HNSW indexing

Setup

pgvector must be installed as a PostgreSQL extension. The Docker image pgvector/pgvector:pg17 includes it pre-installed. For manual setups:

CREATE EXTENSION IF NOT EXISTS vector;

The Django migration handles this automatically via VectorExtension().
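Such a migration is a one-liner around pgvector's `VectorExtension` operation. A sketch (the migration file name and `dependencies` are placeholders; HyperSaaS's actual migration may differ):

```python
# documents/migrations/0001_enable_pgvector.py (hypothetical path)
from django.db import migrations
from pgvector.django import VectorExtension


class Migration(migrations.Migration):
    # Placeholder; the real migration depends on the app's migration graph.
    dependencies = []

    operations = [
        VectorExtension(),  # runs CREATE EXTENSION IF NOT EXISTS vector
    ]
```

Because the extension is created inside a migration, it is applied transactionally along with the schema that depends on it.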

DocumentChunk Model

Each document chunk stores its text content alongside a 1536-dimensional embedding vector:

class DocumentChunk(BaseModel):
    document = models.ForeignKey(Document, on_delete=models.CASCADE)
    chunk_index = models.PositiveIntegerField()
    content = models.TextField()                          # Raw text
    embedding = VectorField(dimensions=1536)              # pgvector
    embedding_model = models.CharField(max_length=100)
    page_number = models.PositiveIntegerField(null=True)
    section_heading = models.CharField(max_length=500, blank=True)
    token_count = models.PositiveIntegerField(default=0)
    chunk_metadata = models.JSONField(default=dict)       # Docling metadata

HNSW Index

An HNSW (Hierarchical Navigable Small World) index enables fast approximate nearest-neighbor search:

HnswIndex(
    name="doc_chunk_embedding_hnsw_idx",
    fields=["embedding"],
    m=16,                              # Connections per node
    ef_construction=64,                # Build-time search width
    opclasses=["vector_cosine_ops"],   # Cosine distance
)

Index parameters

Parameter          Value              Effect
m                  16                 Higher = better recall, more memory
ef_construction    64                 Higher = better index quality, slower build
opclasses          vector_cosine_ops  Cosine similarity distance metric
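At query time, HNSW recall can be traded against latency with pgvector's `hnsw.ef_search` setting (default 40). A sketch of raising it per-session through Django's connection (assumes a configured database; the helper name is illustrative):

```python
from django.db import connection


def set_ef_search(ef: int = 100) -> None:
    """Raise pgvector's query-time search width; higher = better recall, slower."""
    with connection.cursor() as cursor:
        # SET is a utility statement and cannot take bind parameters,
        # so the value is validated and interpolated as an integer.
        cursor.execute(f"SET hnsw.ef_search = {int(ef)}")
```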

Querying

Semantic search uses pgvector's CosineDistance annotation:

from pgvector.django import CosineDistance

DocumentChunk.objects.filter(
    document_id__in=document_ids
).annotate(
    distance=CosineDistance("embedding", query_embedding)
).order_by("distance")[:top_k]

Cosine distance ranges from 0.0 (same direction) to 2.0 (opposite), so the result is converted to a similarity score as score = 1.0 - distance.
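The conversion can be illustrated with a toy re-implementation of cosine distance in plain Python (for clarity only; the real computation happens inside PostgreSQL):

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance as pgvector defines it: 1 - cos(angle between a and b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


# Identical vectors: distance 0.0, so similarity = 1.0 - 0.0 = 1.0
score = 1.0 - cosine_distance([1.0, 0.0], [1.0, 0.0])
```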

Embedding Model

HyperSaaS uses OpenAI's text-embedding-3-small (1536 dimensions) by default. This is configurable:

Setting                        Default                 Description
DOCUMENT_EMBEDDING_MODEL       text-embedding-3-small  OpenAI model name
DOCUMENT_EMBEDDING_DIMENSIONS  1536                    Must match VectorField dimensions
DOCUMENT_EMBEDDING_BATCH_SIZE  512                     Texts per API call
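The batch size caps how many texts go into each embeddings API call; the splitting itself is straightforward (an illustrative helper, not HyperSaaS's actual code):

```python
def batch_texts(texts: list[str], batch_size: int = 512) -> list[list[str]]:
    """Split texts into consecutive batches of at most batch_size per API call."""
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
```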
