When engineering teams decide to run graph workloads directly against their existing PostgreSQL data, the first question they inevitably ask is: "Why not just use Apache AGE?" It's an established, open-source project that implements openCypher directly inside Postgres. On paper, it seems like the obvious choice.
The short answer is architecture: different layer, different job.
Apache AGE and pgGraph (the open-source engine powering Evokoa) both bring graph-style querying to Postgres, but their underlying architectures are fundamentally different. AGE converts Cypher queries into recursive SQL executed inside Postgres. This works well for shallow queries, but breaks down as paths get deeper. pgGraph, by contrast, builds a highly optimized virtual graph layer entirely in memory. Postgres remains your single source of truth and primary query interface, while pgGraph handles deep relationship traversals (including 10+ hop paths) at bare-metal speed.
Different Execution Models
To understand why this architectural divergence matters, we need to look at how traversals are executed.
Apache AGE translates Cypher queries into recursive SQL that runs through the standard Postgres planner and executor. It stores graph topology in ordinary relational rows and JSONB columns. When you execute a traversal, the database performs a cascade of recursive SQL joins, chasing pointers across B-tree indexes.
For simple 2- or 3-hop queries, this relational emulation works fine. But AI agent workloads and enterprise operational intelligence do not stop at 3 hops. They require deep, multi-hop reasoning. They need to traverse from a patient, to a transcript, to an SOP, to a branch, to an outcome, and back again, continuously.
When you push AGE-style recursive queries to 10 or 15 hops across a multi-million-edge dataset, the Postgres execution engine chokes. The recursive joins blow up the working set in memory, latency degrades from milliseconds to seconds, and eventually the queries simply time out.
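To make the emulation concrete, here is a toy sketch of how a traversal looks when edges live in a relational table and hops are expressed as a recursive CTE. It uses SQLite's `WITH RECURSIVE` for portability (Postgres syntax is nearly identical); the table and column names are illustrative, not AGE's actual internal schema. Note how every additional hop means another pass through the recursive join.

```python
import sqlite3

# Toy graph stored relationally: one row per edge.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany("INSERT INTO edges VALUES (?, ?)",
                 [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")])

# Relational emulation of a traversal: each hop is another
# self-join produced by the recursive term of the CTE.
rows = conn.execute("""
    WITH RECURSIVE reachable(node, depth) AS (
        SELECT 'a', 0
        UNION ALL
        SELECT e.dst, r.depth + 1
        FROM reachable r JOIN edges e ON e.src = r.node
        WHERE r.depth < 3          -- cap the hop count
    )
    SELECT DISTINCT node FROM reachable ORDER BY node
""").fetchall()

print([n for (n,) in rows])  # nodes reachable from 'a' within 3 hops
```

At 3 hops on four edges this is instant; the problem described above is what happens when the same recursive-join pattern is asked to do 10+ hops over tens of millions of edge rows.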
Built for Deep Traversal
When we built Evokoa, we realized that to get microsecond latency at 20-hop depth, we could not rely on the Postgres query planner to walk edges. Emulation was not enough. We needed a data structure built specifically for traversal.
Instead of emulating edges in tables, pgGraph compiles your Postgres relationships into a highly compact, memory-mapped virtual graph layer. We use a Compressed Sparse Row (CSR) array format—the same structure used in high-performance scientific computing and GPU algorithms.
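The CSR idea is simple enough to sketch in a few lines. The following is an illustrative Python version (pgGraph's actual implementation is a compiled, memory-mapped structure, not Python lists): all out-edges are packed into one flat `targets` array, and a second `offsets` array records where each node's slice begins.

```python
# Compressed Sparse Row (CSR): two flat arrays instead of edge rows + indexes.
# targets[offsets[i] : offsets[i+1]] is the neighbor list of node i.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]   # (src, dst) pairs over nodes 0..3
num_nodes = 4

# 1. Count each node's out-degree.
degree = [0] * num_nodes
for src, _ in edges:
    degree[src] += 1

# 2. Prefix-sum the degrees into offsets.
offsets = [0] * (num_nodes + 1)
for i in range(num_nodes):
    offsets[i + 1] = offsets[i] + degree[i]

# 3. Fill the flat neighbor array.
targets = [0] * len(edges)
fill = offsets[:-1].copy()
for src, dst in edges:
    targets[fill[src]] = dst
    fill[src] += 1

def neighbors(node):
    # One offset calculation and a contiguous slice: no index lookup, no join.
    return targets[offsets[node]:offsets[node + 1]]

print(neighbors(0))  # [1, 2]
```

Because both arrays are contiguous, the whole structure is cache-friendly and trivially memory-mappable, which is exactly why the format is popular in scientific computing and GPU graph algorithms.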
In this in-memory virtual layer, finding a node's neighbors does not require an index lookup or a table join. It requires a single array offset calculation. There are no pointers to chase. There are no recursive SQL statements. The CPU simply walks contiguous blocks of memory in a hot loop.
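A deep traversal over CSR is just that neighbor slice repeated in a loop. The sketch below runs a breadth-first walk over a small hard-coded CSR graph; it is a minimal illustration of the access pattern, not pgGraph's traversal engine.

```python
from collections import deque

# CSR arrays for a small illustrative graph (nodes 0..5):
# node 0 -> 1, 2;  node 1 -> 3;  node 2 -> 3;  node 3 -> 4;  node 4 -> 5
offsets = [0, 2, 3, 4, 5, 6, 6]
targets = [1, 2, 3, 3, 4, 5]

def hops_within(start, max_depth):
    """BFS whose inner loop only slices contiguous arrays."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue
        for nbr in targets[offsets[node]:offsets[node + 1]]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return sorted(seen)

print(hops_within(0, 2))  # [0, 1, 2, 3]
print(hops_within(0, 5))  # the full chain: [0, 1, 2, 3, 4, 5]
```

Each extra hop costs one more pass over flat arrays rather than another recursive SQL join, which is why depth scales so differently under the two models.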
Postgres Remains the Interface
We are open-sourcing pgGraph as a Postgres extension. This means builders can run complex graph queries directly inside Postgres, alongside their existing data. Your application keeps writing data the exact same way it always has, and Postgres remains the interface. Meanwhile, behind the scenes, pgGraph maintains the virtual relationship graph needed for massive scale.
pgGraph Performance at Scale
To demonstrate the power of the in-memory virtual graph layer, we ran extensive benchmarks on pgGraph using two distinct datasets: the PANAMA dataset (2M+ nodes) and the massive LDBC Social Network Benchmark (3.1M+ nodes, 34.5M+ edges).
Note: These benchmarks measure pgGraph's performance in isolation. We do not plot Apache AGE here, as deep traversals (10+ hops) on datasets of this size resulted in query timeouts under the recursive SQL model.
