Z-GASAB benchmark
How to read the interactive comparison on the homepage and the full benchmarks page.
Zephex publishes a repository-grounded benchmark (Z-GASAB) that scores MCP stacks on tasks that agents fail when they have only parametric knowledge or static docs. The live chart is on the homepage and at /benchmarks.
What each metric means
- Typosquat pass (%): Share of tasks where the agent correctly rejects malicious or typosquatted packages. Zephex uses audit_package and live registry data on your lockfile.
- AST search (%): Correct file + symbol for in-repo call sites. Zephex uses find_code and read_code on your tree.
- Path tracing (%): Multi-hop accuracy (e.g. billing webhook → handler → DB). Zephex uses explain_architecture on your app structure.
- API docs (%): External API documentation tasks. Tools like Context7 and Exa often score higher here because they focus on library docs or web retrieval rather than your private repo layout.
- Upgrade success (%): Derived from dependency upgrade error rate (100% − error%). Lower raw errors mean a taller bar.
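As a rough illustration of the arithmetic behind the percentage metrics above, the sketch below computes a pass share and the upgrade-success value from a raw error rate. The function names and sample numbers are hypothetical and are not taken from the Z-GASAB harness.

```python
def pass_share(passed: int, total: int) -> float:
    """Percentage of tasks passed, e.g. for a metric like typosquat pass."""
    if total <= 0:
        raise ValueError("total must be a positive task count")
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    return 100.0 * passed / total


def upgrade_success(error_pct: float) -> float:
    """Upgrade success as 100% minus the raw upgrade error rate."""
    if not 0.0 <= error_pct <= 100.0:
        raise ValueError("error_pct must be between 0 and 100")
    return 100.0 - error_pct


# Made-up inputs for illustration only (not benchmark results).
example_pass = pass_share(18, 20)        # 18 of 20 tasks passed
example_upgrade = upgrade_success(12.5)  # 12.5% raw upgrade errors
```

With these made-up inputs, a 12.5% error rate maps to an upgrade success of 87.5%, which is why a lower raw error rate produces a taller bar.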
Where the numbers come from
Scores come from the internal Z-GASAB harness (June 2026) and are aligned to each product's public capabilities, e.g. Context7 resolve-library-id / query-docs, Exa get_code_context_exa, and Snyk Studio MCP scans. Clicking a bar opens an explanation, and the panel cites its sources (vendor docs, Nia research posts, etc.).
The benchmark does not produce a single overall “winner” score. A docs-only MCP can top API-doc tasks while scoring low on AST search; that is expected, not a mistake.
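To make the per-metric reading concrete, here is a minimal sketch that finds the leading tool for each metric separately instead of collapsing everything into one total. The tool names (`repo_tool`, `docs_tool`) and all scores are placeholders invented for illustration; they are not Z-GASAB results.

```python
# Placeholder scores for two made-up tools; these are NOT benchmark results.
scores = {
    "repo_tool": {"ast_search": 80.0, "api_docs": 55.0},
    "docs_tool": {"ast_search": 30.0, "api_docs": 90.0},
}


def leaders_by_metric(scores: dict[str, dict[str, float]]) -> dict[str, str]:
    """Return the top-scoring tool for each metric, considered independently."""
    metrics = {metric for per_tool in scores.values() for metric in per_tool}
    return {
        metric: max(
            scores,
            # A tool with no score for this metric sorts below every scored tool.
            key=lambda tool: scores[tool].get(metric, float("-inf")),
        )
        for metric in sorted(metrics)
    }


leaders = leaders_by_metric(scores)
```

In this toy data the docs-focused tool leads on `api_docs` and the repo-focused tool leads on `ast_search`, so neither is an overall winner.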
Using the chart
- Pick a metric tab (typosquat, AST search, path tracing, API docs, upgrade success).
- Click a bar or table row. The explanation stays until you pick another tool.
- Follow the link under the chart for vendor docs or Zephex tool guides.