GreyCat — Frequently Asked Questions
Straight, source-accurate answers about what GreyCat does. Several public comparisons have understated GreyCat simply because these facts were hard to find online — this page sets the record straight.
Yes. GreyCat ships a built-in vector index (VectorIndex) in its core that performs approximate
nearest-neighbor search with cosine, L2 and squared-L2 distance. Vectors live in the same graph as
the rest of your data, and GreyCat can generate embeddings on-device, so there is no separate
vector database and no external embedding API.
Yes. The text_search library exposes a single TextIndex with 15 search modes, including
BM25/BM25F, boolean, exact, fuzzy, phonetic, phrase, proximity, prefix, wildcard, semantic (vector)
and hybrid search that fuses keyword and vector results with Reciprocal Rank Fusion, plus
33-language tokenization, all C-accelerated. No Elasticsearch sidecar required.
Yes, a native Model Context Protocol (MCP) server. Add two annotations to any
function and it becomes a callable MCP tool with its input/output schema generated automatically
from the signature, governed by the same role-based permissions as the REST API:
@expose
@tag("openapi", "mcp")
fn add(a: int, b: int): int { return a + b; }
That single function is now a REST endpoint, an OpenAPI operation and an AI-agent MCP tool,
on the same port, in one binary.
GreyCat generates text embeddings in-process via a
statically-linked build of llama.cpp, so prompts and documents never leave your deployment. On-device
LLM text generation and chat completion are defined in the API and are on the
roadmap; today the live AI surface is embeddings, tokenization and vector search — plus the
MCP server that lets any external LLM call GreyCat.
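To make the vector-search part of that surface concrete, the sketch below illustrates the three distance metrics named earlier on this page (cosine, L2 and squared L2). It is plain Python, not GCL, and it does not use GreyCat's VectorIndex API: every function name here is hypothetical, and the exhaustive scan in `nearest` is an assumption for illustration only, since an approximate nearest-neighbor index exists precisely to avoid comparing the query against every stored vector.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a, b):
    """Squared Euclidean distance (no square root)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(squared_l2(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot(a, b) / norm


def nearest(query, vectors, distance, k=1):
    """Exhaustive (exact) search: hypothetical helper, not an ANN index."""
    ranked = sorted(vectors.items(), key=lambda item: distance(query, item[1]))
    return [name for name, _ in ranked[:k]]
```

Squared L2 ranks neighbors in the same order as L2 because the square root is monotonic, which is why it is often offered as a cheaper alternative when only the ordering matters.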
Yes. GreyCat has a built-in identity model with role-based access control (declared with
@permission and @role), token-based authentication and per-user file grants. Enterprise OpenID
Connect SSO (Authorization Code with PKCE and JWKS verification) is available via the openid library.
A time-series DB, a graph DB, a geospatial store, a vector DB and a
full-text search engine — unified in one self-hosted binary with a built-in API, OpenAPI spec and MCP
server. In one production deployment GreyCat replaced an eight-component RAG stack (separate vector
DB, graph DB, keyword index, embedding server, reranker, orchestration layer, cache and UI).
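Merging a keyword ranking with a vector ranking is one of the steps such a stack has to perform, and this page names Reciprocal Rank Fusion (RRF) as the method used for hybrid search. The sketch below shows the general RRF formula in plain Python, not GCL. It is an illustration of the technique rather than GreyCat's implementation: the function name is hypothetical, and the constant `k=60` is the value commonly used in the literature, assumed here because the page does not state which value GreyCat uses.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed, conventional default.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword search and a vector search.
keyword_hits = ["a", "b", "c"]
vector_hits = ["b", "d", "a"]
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))  # ['b', 'a', 'd', 'c']
```

Because RRF uses only rank positions, it needs no score normalization between the keyword scorer and the vector distance, which is the usual reason for choosing it over a weighted sum of raw scores.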
GreyCat is a high-performance single-node engine: a ~3.5 MB binary
with SIMD/C-accelerated hot paths that scales from ARM/Raspberry Pi to terabytes and billions of
persisted nodes, ingesting CSV at roughly 1.7 million rows/second. It runs
parallel jobs with transactional merge strategies. A many-worlds branching capability for what-if
simulation is on the roadmap.
Yes. GreyCat runs entirely on your own hardware as a single self-contained
binary, with on-device AI so data never leaves your infrastructure. It is built by DataThings in
Luxembourg (EU).
Its own statically-typed language, GCL. You model data as typed objects and
traverse the graph with dot-notation instead of SQL or Cypher — removing the impedance mismatch
between storage, logic and API.