Intelligence · Retrieval

GraphRAG on VectorScaleDB

The coupling matrix is an explicit graph — computed continuously from behavior, not extracted after the fact. If you know GraphRAG, you already know how to use it. This page translates the vocabulary.

Why the Mapping Matters

Recent peer-reviewed research from NYU Shanghai compared GraphRAG-style explicit-graph retrieval with vector-only RAG on multi-hop reasoning tasks. The explicit-graph approach performed clearly better on questions that require chaining relationships across entities, which is exactly the workload where flat nearest-neighbor search degrades.

External Validation
Explicit Graphs Win Multi-Hop
Independent academic work supports what the VectorScaleDB architecture assumed from day one: retrieval that traverses an explicit relationship graph beats retrieval that only ranks by embedding similarity on questions that span more than one hop.
Native, Not Retrofit
The Graph Is the Substrate
GraphRAG pipelines typically extract a graph as a post-processing step on top of a vector store. VectorScaleDB computes the graph continuously as behavioral coupling between entities. There is no extract-then-retrieve pipeline — the graph is always current.
One Query Surface
Vectors + Graph, Same API
You do not choose between vector search and graph traversal. A single query plan uses the coupling structure for expansion and vector similarity for ranking. The query language does not split along the seam.
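
The expand-then-rank idea can be sketched in a few lines of Python. This is an illustrative toy, not the VectorScaleDB API: the in-memory `COUPLING` and `VECTORS` structures, the function name `coupling_neighbor_query`, and the `min_weight` and `top_k` parameters are all assumptions made for the example.

```python
import math

# Hypothetical data: directed, weighted coupling edges and toy 2-D vectors.
COUPLING = {
    "pump_a": {"valve_b": 0.9, "sensor_c": 0.4, "report_d": 0.7},
    "valve_b": {"sensor_c": 0.8},
}
VECTORS = {
    "pump_a": [1.0, 0.0],
    "valve_b": [0.6, 0.8],
    "sensor_c": [0.0, 1.0],
    "report_d": [0.9, 0.1],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def coupling_neighbor_query(seed, query_vec, min_weight=0.5, top_k=5):
    """Expand one hop along coupling edges, then rank by vector similarity."""
    # Expansion: keep only neighbors whose coupling weight clears the threshold.
    candidates = [n for n, w in COUPLING.get(seed, {}).items() if w >= min_weight]
    # Ranking: order the surviving candidates by similarity to the query vector.
    ranked = sorted(candidates, key=lambda n: cosine(VECTORS[n], query_vec), reverse=True)
    return ranked[:top_k]


result = coupling_neighbor_query("pump_a", [1.0, 0.0])
print(result)
```

In this sketch the graph decides which entities are eligible and the vectors decide their order, so `sensor_c` (coupling weight 0.4) never reaches the ranking stage at the default threshold.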

Vocabulary Translation

If you are coming from a GraphRAG implementation — Microsoft Research's open-source reference, LangChain GraphRAG, LlamaIndex property graph, or a hand-rolled Neo4j + embeddings pipeline — the concepts map cleanly. Only the vocabulary changes.

| GraphRAG Term | VectorScaleDB Equivalent | Notes |
| --- | --- | --- |
| Entity extraction (LLM over chunks) | Entity-type classification at ingest | 200+ first-class entity types across 20+ domains; classification happens in the ingest adapter, not as a separate LLM pass. |
| Relationship extraction | Coupling discovery | Relationships are inferred from co-behavior over time, not parsed from prose. No LLM in the extraction loop. |
| Knowledge graph | Coupling matrix | The coupling matrix is the explicit graph: weighted, directed, and versioned alongside the underlying segments. |
| Community detection (Leiden, Louvain) | Regime clustering | Behavioral regimes group entities that move together. Regimes are produced by the compression engine and visible in every query response. |
| Community summary | Regime summary / domain composition | Each regime carries a centroid, drift magnitude, member count, and per-domain composition breakdown. Available at /v1/query/forecast and related endpoints. |
| Subgraph retrieval / local search | Coupling-neighbor query | Given a seed entity, expand along coupling edges above a configurable weight, then rank with vector similarity. One call, not two. |
| Global search (community-level answers) | Regime-level query | Answer over regime summaries instead of individual segments. Same query shape, different resolution level. |
| Hierarchical summarization | Multi-resolution hierarchical summary | Summaries exist at segment, regime, and cross-domain cluster levels. The query planner chooses the resolution that matches the question. |
| Multi-hop reasoning | Cascade / coupling traversal | Follow coupling edges across entity-type boundaries to reach related entities that no single embedding would have returned. Cascade prediction is the same mechanism exposed as a forecasting endpoint. |
| Graph embeddings | Coupled vectors | Every stored vector is already positioned relative to its neighbors in the coupling structure. No separate embedding step is required to blend structure and content. |
| Anomaly / novelty detection | Behavioral anomaly detection | An entity that does not fit its regime or coupling neighborhood scores high on anomaly. Works across domains with the unified segment format. |
| Re-indexing after schema change | Not applicable | The coupling matrix evolves continuously as entities behave. There is no batch rebuild step because there is no batch extraction step. |
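
The cascade / coupling traversal row can be illustrated with a small multi-hop walk. This is a sketch under stated assumptions, not the product's algorithm: the function name `cascade`, the entity identifiers, the `max_hops` and `min_strength` parameters, and the choice to score a path as the product of its edge weights (with weights in the range 0 to 1) are all hypothetical.

```python
def cascade(graph, seed, max_hops=2, min_strength=0.3):
    """Return {entity: best path strength} for entities within max_hops of seed.

    Assumption: path strength is the product of edge weights along the path,
    and every weight lies in (0, 1], so strength can only shrink with each hop.
    """
    best = {seed: 1.0}
    frontier = {seed: 1.0}
    for _ in range(max_hops):
        next_frontier = {}
        for node, strength in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                s = strength * weight
                # Keep a neighbor only if the path is strong enough and
                # improves on anything found in fewer or equal hops.
                if s >= min_strength and s > best.get(neighbor, 0.0):
                    best[neighbor] = s
                    next_frontier[neighbor] = s
        frontier = next_frontier
    best.pop(seed)
    return best


# Hypothetical coupling edges that cross entity-type boundaries
# (document -> sensor -> financial ticker -> document).
graph = {
    "doc:incident-42": {"sensor:pump-7": 0.8},
    "sensor:pump-7": {"ticker:ACME": 0.5, "doc:incident-42": 0.6},
    "ticker:ACME": {"doc:earnings-q3": 0.9},
}

two_hops = cascade(graph, "doc:incident-42", max_hops=2)
three_hops = cascade(graph, "doc:incident-42", max_hops=3)
print(two_hops)
print(three_hops)
```

Starting from a text document, two hops reach a sensor and a ticker; a third hop reaches a second document that shares no direct edge with the seed. This is the kind of result a single nearest-neighbor lookup would not be expected to return.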

What This Means in Practice

Three concrete consequences of the coupling matrix being native rather than retrofitted.

Freshness
No Graph Rebuild Lag
GraphRAG pipelines typically rebuild the knowledge graph on a batch schedule — nightly, weekly, or on demand. Between runs, the graph drifts out of sync with the underlying data. VectorScaleDB updates couplings as segments are ingested, so the graph is never stale.
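
To show what updating couplings at ingest time could look like, here is a minimal streaming sketch. The page does not describe how coupling weights are actually computed, so everything here is an assumption chosen for illustration: the `CouplingTracker` class, the `alpha` smoothing factor, and the rule itself, an exponentially weighted average of whether entity `b` moved in the same direction as entity `a` did in the previous segment.

```python
from collections import defaultdict


def _sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


class CouplingTracker:
    """Toy incremental coupling estimate; weights[a][b] means 'a leads b'."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha                  # weight given to the newest observation
        self.weights = defaultdict(dict)    # directed: weights[a][b] != weights[b][a]
        self._previous = {}                 # deltas seen in the previous segment

    def ingest(self, deltas):
        """deltas: {entity: signed change observed in this segment}."""
        for a, da in self._previous.items():
            for b, db in deltas.items():
                if a == b:
                    continue
                # +1 if b followed a in the same direction, -1 if opposite.
                agreement = _sign(da) * _sign(db)
                old = self.weights[a].get(b, 0.0)
                self.weights[a][b] = (1 - self.alpha) * old + self.alpha * agreement
        self._previous = dict(deltas)


tracker = CouplingTracker(alpha=0.5)
tracker.ingest({"pump": 1.0})    # pump rises
tracker.ingest({"valve": 0.5})   # valve rises afterwards
tracker.ingest({"pump": -1.0})   # pump falls
tracker.ingest({"valve": -0.2})  # valve falls afterwards
print(dict(tracker.weights))
```

Each call touches only the edges between the previous and current segments, so no batch rebuild is needed in this sketch, and the `pump -> valve` weight ends up different from the `valve -> pump` weight, which keeps the graph directed.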
Cost
No LLM in the Write Path
Entity and relationship extraction in classic GraphRAG runs an LLM over every ingested chunk. VectorScaleDB does not put an LLM in the write path. Coupling is derived from behavior. LLMs are a consumer of the graph, never a dependency for building it.
Scope
Works Past Text
GraphRAG was designed for document corpora. The coupling matrix treats documents, telemetry, trajectories, financial ticks, and biological signals with the same math. A retrieval query can legitimately cross from a text passage to a sensor regime via shared coupling.

Related reading

Looking for the human-reasoning angle?

This page is deliberately scoped to the GraphRAG isomorphism. If you are here because you care about how the coupling matrix supports dialectical, multi-perspective reasoning — the cyborg-readiness framing — see the cross-domain intelligence page and the architecture overview.

Cross-Domain Intelligence · Architecture

Related Capabilities

Bring your GraphRAG workload onto a native graph

Keep the mental model. Drop the extract-then-retrieve pipeline.