Storage

Unified Multi-Model Storage

One system, one API, one query language. Replace Postgres, Elasticsearch, Neo4j, and S3 with a single embedded engine.

Five Data Models, One Engine

Stop stitching together separate databases for vectors, documents, graphs, blobs, and text search. VectorScaleDB stores and queries all five natively, with cross-model joins at query time.

Vectors
Temporal-semantic vector storage
The core data model. High-dimensional vectors with temporal context, behavioral drift tracking, and regime-aware compression. Sub-millisecond similarity search across billions of vectors with time-bounded queries.
Documents
Structured metadata storage
JSON-compatible metadata attached to any entity. Indexed for efficient filtering and faceted queries. Metadata participates in vector queries as pre-filters, post-filters, or scoring signals — no separate document store required.
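As a rough illustration of metadata acting as a pre-filter for a vector query, the sketch below filters entities on metadata equality first and only then ranks the survivors by cosine similarity. It is a minimal in-memory model, not the VectorScaleDB API: the `ENTITIES` list, the `search` function, and the `where` parameter are all hypothetical names.

```python
import math

# Hypothetical in-memory records: each entity has a vector and JSON-like metadata.
ENTITIES = [
    {"id": "a", "vector": [1.0, 0.0], "meta": {"site": "berlin", "status": "ok"}},
    {"id": "b", "vector": [0.9, 0.1], "meta": {"site": "paris", "status": "ok"}},
    {"id": "c", "vector": [0.0, 1.0], "meta": {"site": "berlin", "status": "ok"}},
]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def search(query, where, k=10):
    """Pre-filter on metadata equality, then rank the survivors by similarity."""
    candidates = [
        e for e in ENTITIES
        if all(e["meta"].get(key) == value for key, value in where.items())
    ]
    ranked = sorted(candidates, key=lambda e: cosine(query, e["vector"]), reverse=True)
    return [e["id"] for e in ranked[:k]]

if __name__ == "__main__":
    print(search([1.0, 0.0], {"site": "berlin"}))
```

Filtering before scoring keeps the similarity computation limited to entities that can appear in the result; a post-filter would score everything first and discard afterwards.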
Graphs
Entity relationship traversal
Cross-entity relationships stored as first-class edges. Traverse from a sensor to its device, from a device to its fleet, from a fleet to its operator. Graph traversal combines with vector similarity for relationship-aware nearest neighbor search.
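One way to picture relationship-aware nearest neighbor search is to restrict the candidate set to entities reachable from a starting node, then rank only those by distance. The sketch below assumes a toy adjacency list and hypothetical names (`EDGES`, `VECTORS`, `nearest_within`); it does not reflect how the engine stores edges internally.

```python
from collections import deque

# Hypothetical adjacency list: parent entity -> child entities.
EDGES = {
    "operator-1": ["fleet-1"],
    "fleet-1": ["device-1", "device-2"],
    "device-1": ["sensor-1", "sensor-2"],
    "device-2": ["sensor-3"],
}

# Hypothetical vectors; sensor-9 belongs to no fleet in EDGES.
VECTORS = {
    "sensor-1": [1.0, 0.0],
    "sensor-2": [0.0, 1.0],
    "sensor-3": [0.8, 0.2],
    "sensor-9": [1.0, 0.0],
}

def reachable(start):
    """Breadth-first traversal returning every node reachable from start."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def nearest_within(start, query, k=1):
    """Nearest neighbors of query among vectors reachable from start."""
    scope = reachable(start)
    scored = [(sq_dist(query, vec), name) for name, vec in VECTORS.items() if name in scope]
    return [name for _, name in sorted(scored)[:k]]
```

Here `sensor-9` has the same vector as `sensor-1` but is never returned for `fleet-1`, because the traversal never reaches it.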
Blobs
Content-addressable binary storage
Raw binary data — images, model weights, sensor firmware, configuration snapshots — stored with content-addressable deduplication. Identical blobs are stored once regardless of how many entities reference them. Automatic garbage collection reclaims unreferenced blobs.
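The idea of storing identical blobs once and reclaiming unreferenced ones can be sketched with a hash-keyed dictionary and a set of referencing entities per blob. This is an assumption-laden toy: the `BlobStore` class and its methods are hypothetical, and SHA-256 is used as the content hash only for illustration.

```python
import hashlib

class BlobStore:
    """Toy content-addressable store with reference tracking (illustrative only)."""

    def __init__(self):
        self._blobs = {}  # hex digest -> blob bytes, stored once per distinct content
        self._refs = {}   # hex digest -> set of entity ids that reference the blob

    def __len__(self):
        return len(self._blobs)

    def put(self, entity_id, data: bytes) -> str:
        """Store data (if new) and record that entity_id references it."""
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        self._refs.setdefault(digest, set()).add(entity_id)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def release(self, entity_id, digest: str) -> None:
        """Drop one entity's reference; the blob stays until garbage collection."""
        self._refs.get(digest, set()).discard(entity_id)

    def collect_garbage(self) -> int:
        """Delete blobs with no remaining references; return how many were removed."""
        dead = [d for d, owners in self._refs.items() if not owners]
        for d in dead:
            del self._blobs[d]
            del self._refs[d]
        return len(dead)
```

Two entities that upload the same firmware image receive the same digest, and the bytes are held once until both references are released and a collection pass runs.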
Full-Text
Integrated text search
Full-text search over entity metadata, log messages, annotations, and any text field. Combined with vector similarity for hybrid search: find entities whose behavior matches a vector pattern AND whose metadata matches a text query, in a single request.

Automatic Tiered Storage

Data moves automatically between storage tiers based on access patterns. Hot data stays in memory for sub-microsecond access. Cold data is compressed on disk. You set the policy; the engine manages placement.

Hot Tier
In-memory LRU cache
Frequently accessed segments and active entities reside in memory with LRU eviction. Sub-microsecond access latency. Cache size adapts automatically to available memory, respecting the resource governor's budget constraints.
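A minimal sketch of LRU eviction under a byte budget, using Python's `OrderedDict`. The `HotTier` class and its `budget_bytes` parameter are hypothetical stand-ins for the hot tier and the resource governor's budget; the real cache is presumably far more elaborate.

```python
from collections import OrderedDict

class HotTier:
    """Toy LRU cache bounded by total bytes rather than entry count."""

    def __init__(self, budget_bytes: int):
        self.budget_bytes = budget_bytes
        self.used_bytes = 0
        self._segments = OrderedDict()  # oldest (least recently used) entry first

    def get(self, key):
        """Return the cached segment, marking it most recently used; None on a miss."""
        if key not in self._segments:
            return None
        self._segments.move_to_end(key)
        return self._segments[key]

    def put(self, key, segment: bytes) -> None:
        """Insert or replace a segment, evicting least recently used entries if over budget."""
        if key in self._segments:
            self.used_bytes -= len(self._segments.pop(key))
        self._segments[key] = segment
        self.used_bytes += len(segment)
        # A segment larger than the whole budget ends up evicting itself.
        while self.used_bytes > self.budget_bytes and self._segments:
            _, evicted = self._segments.popitem(last=False)
            self.used_bytes -= len(evicted)
```

Reading a segment moves it to the recent end of the ordering, so a segment that keeps being accessed survives while untouched ones are evicted first.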
Warm Tier
Optimized on-disk storage
Recently active data stored on disk with Bloom filters for fast negative lookups. Read latency in the low microseconds for SSD-backed deployments. Automatic promotion to hot tier on repeated access.
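To show why a Bloom filter gives fast negative lookups, here is a small sketch: if any of a key's bit positions is unset, the key is definitely absent and the disk read can be skipped. The `BloomFilter` class, its sizes, and the SHA-256 based double hashing are illustrative assumptions, not the engine's actual parameters.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive two base hashes from one digest and combine them (double hashing).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))
```

A lookup would consult the filter first and touch the on-disk segment only when `might_contain` returns True.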
Cold Tier
Maximum compression archival
Historical data compressed with aggressive algorithms (up to 19x additional compression beyond behavioral compression). Slightly higher read latency in exchange for dramatically reduced storage costs. Transparent promotion on access — queries work identically across all tiers.
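Transparent promotion can be pictured as a read path that checks memory first, falls back to the compressed archive, and caches the decompressed result. The sketch below uses zlib purely as a stand-in for whatever "aggressive algorithms" the cold tier uses; `TieredStore`, `archive`, and `get` are hypothetical names.

```python
import zlib

class TieredStore:
    """Toy two-tier store: raw bytes in memory, zlib-compressed bytes as the cold tier."""

    def __init__(self):
        self.hot = {}   # key -> raw bytes
        self.cold = {}  # key -> compressed bytes

    def archive(self, key, data: bytes) -> None:
        """Demote data to the cold tier at the highest zlib compression level."""
        self.hot.pop(key, None)
        self.cold[key] = zlib.compress(data, 9)

    def get(self, key) -> bytes:
        """Read from whichever tier holds the key; callers never see the difference."""
        if key in self.hot:
            return self.hot[key]
        data = zlib.decompress(self.cold[key])  # raises KeyError for unknown keys
        self.hot[key] = data  # promote on access
        return data
```

The caller uses the same `get` call regardless of tier; the first cold read pays the decompression cost and later reads are served from memory.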

Content-Addressable Deduplication

Identical data is stored once, regardless of how many entities reference it. Content hashes serve as universal identifiers — enabling deduplication, integrity verification, and decentralized caching in a single mechanism.

Deduplication
Automatic cross-entity dedup
When multiple entities produce identical behavioral segments, model weights, or binary blobs, the storage layer detects the duplication via content hashing and stores a single copy. References are lightweight pointers to the canonical copy. In fleets with thousands of similar devices, deduplication can reduce storage by 40-80%.
Integrity
Built-in corruption detection
Every stored object carries its content hash. On read, the hash is recomputed and verified. Any corruption — from disk failure, bit rot, or tampering — is detected immediately. Combined with the SHA-256 audit chain, this provides end-to-end data integrity from ingestion to query.
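Verify-on-read can be sketched in a few lines: keep the hash next to the data, recompute it when reading, and refuse to return bytes that no longer match. The record layout and the `store_object` and `read_verified` functions are hypothetical, and SHA-256 is assumed as the content hash for illustration.

```python
import hashlib
import hmac

class CorruptionError(Exception):
    """Raised when stored bytes no longer match their recorded content hash."""

def store_object(data: bytes) -> dict:
    """Return a record that carries the data together with its content hash."""
    return {"hash": hashlib.sha256(data).hexdigest(), "data": data}

def read_verified(record: dict) -> bytes:
    """Recompute the hash on read and fail loudly on any mismatch."""
    actual = hashlib.sha256(record["data"]).hexdigest()
    if not hmac.compare_digest(actual, record["hash"]):
        raise CorruptionError(f"expected {record['hash']}, got {actual}")
    return record["data"]
```

A single flipped bit in the stored bytes changes the recomputed digest, so the read fails instead of silently returning damaged data.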

Semantic Chunk Compression

Beyond behavioral regime compression, the unified storage layer applies semantic-aware compression that understands the structure of your data.

Structure-Aware
Per-column encoding
Each data type is encoded with a strategy chosen automatically for that type. The engine applies multi-stage compression and selects the approach per column, with no manual tuning.
Cross-Entity
Fleet-level compression
Entities of the same type often share structural similarities. The compression engine identifies common patterns across entities in the same domain and compresses against shared baselines, achieving an additional 2-5x compression beyond per-entity behavioral compression.
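Compressing against a shared baseline can be approximated with zlib's preset dictionary feature: bytes that already appear in the baseline are encoded as short back-references. This is only an analogy for the fleet-level technique described above; the function names and the sample records are made up for the example.

```python
import zlib

def compress_against(baseline: bytes, payload: bytes) -> bytes:
    """Deflate payload using baseline as a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=baseline)
    return comp.compress(payload) + comp.flush()

def decompress_against(baseline: bytes, blob: bytes) -> bytes:
    """Inverse of compress_against; the same baseline must be supplied."""
    decomp = zlib.decompressobj(zdict=baseline)
    return decomp.decompress(blob) + decomp.flush()

if __name__ == "__main__":
    # Hypothetical records from two devices of the same type.
    baseline = b'{"model":"TX-200","firmware":"1.4.2","units":"celsius","interval_s":60}'
    payload = b'{"model":"TX-200","firmware":"1.4.3","units":"celsius","interval_s":60}'
    shared = compress_against(baseline, payload)
    alone = zlib.compress(payload, 9)
    print(len(payload), len(alone), len(shared))
```

The baseline has to be available at read time, which is why a shared baseline makes sense for entities in the same domain rather than for unrelated data.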
Queryable
Compression-transparent queries
All compression is transparent to the query layer. Queries operate on compressed data directly when possible: centroid comparisons, range checks, and Bloom filter lookups happen without decompression. Only final result materialization requires full decompression.
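A range check that avoids decompression can be sketched by keeping a small min/max summary beside each compressed chunk and skipping chunks whose summary cannot overlap the query. The chunk layout, `make_chunk`, and `range_query` are hypothetical, and zlib plus JSON stand in for the real encoding.

```python
import json
import zlib

def make_chunk(values):
    """Compress a list of numbers and keep an uncompressed min/max summary beside it."""
    return {
        "min": min(values),
        "max": max(values),
        "payload": zlib.compress(json.dumps(values).encode("utf-8")),
    }

def range_query(chunks, lo, hi):
    """Return values in [lo, hi] and the number of chunks that had to be decompressed."""
    hits, decompressed = [], 0
    for chunk in chunks:
        # The range check runs on the summary only; non-overlapping chunks stay compressed.
        if chunk["max"] < lo or chunk["min"] > hi:
            continue
        decompressed += 1
        values = json.loads(zlib.decompress(chunk["payload"]))
        hits.extend(v for v in values if lo <= v <= hi)
    return hits, decompressed
```

Only chunks that might contain matching values are materialized, so a selective query pays decompression cost in proportion to its result rather than to the data scanned.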

Replace your data stack with one engine

See how unified storage simplifies your architecture and reduces operational overhead.