Z-GASAB benchmark

How MCP tools score on real repo work

Scores come from our internal Z-GASAB harness (June 2026) and are mapped to what each product actually ships: Context7 docs, Exa web code search, Snyk scans, and more.

Key Metrics

Malicious/Typosquat pass rate (%)

Higher bars are better on pass-rate metrics.

Zephex Hosted MCP: 98.2% (Typosquat pass)

Why this score?

Zephex Hosted MCP — 98.2% on this chart

Hosted MCP with check_package, project_memory, read_code, and explain_architecture — built for your repo and installed versions, not cached docs alone.

Zephex runs check_package against live npm and OSV data on your lockfile, so typosquats and risky installs get caught before merge.
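The page does not describe how check_package works internally. As a rough illustration of the kind of check a typosquat task exercises, the sketch below flags package names that sit within a small edit distance of a well-known name. The `POPULAR` set, the `looks_like_typosquat` helper, and the distance threshold are all hypothetical; this is not Zephex's implementation or API, and it does not query npm or OSV.

```python
from typing import Optional

# Illustrative allowlist of well-known package names (an assumption, not real data
# from the benchmark).
POPULAR = {"react", "lodash", "express", "axios"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def looks_like_typosquat(name: str, max_distance: int = 1) -> Optional[str]:
    """Return the popular package that `name` appears to imitate, or None."""
    if name in POPULAR:
        return None  # the genuine package is never flagged
    for known in sorted(POPULAR):
        if edit_distance(name, known) <= max_distance:
            return known
    return None
```

For example, `looks_like_typosquat("expres")` returns `"express"`, while an unrelated name such as `"left-pad"` returns `None`. A production check would need far more signal (publisher history, download counts, advisory data) than name similarity alone.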

How we measured: pass rate on 50 tasks where agents must reject malicious or typosquatted packages.

Z-GASAB internal harness · zephex.dev/mcp-tools

MCP Configuration

MCP configuration	Upgrade error	Typosquat pass	AST search	Path tracing	API docs
Zephex Hosted MCP	4.8%	98.2%	96.5%	94.2%	81.5%
Snyk MCP (Local)	12.4%	95.4%	32.1%	18.2%	14.2%
Nia Oracle	31.6%	42.1%	89.4%	74.2%	92.5%
Context7 (Upstash)	68.2%	12.4%	8.5%	5.2%	94.8%
Exa Code	42.5%	24.1%	28.5%	12.4%	91.2%
DeepWiki MCP	78.4%	5.2%	54.1%	62.5%	48.2%
GPT-5 Baseline (No RAG)	89.5%	2.1%	0%	0%	19.5%

Upgrade error: lower is better. All other columns: higher is better.
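Because one column runs in the opposite direction from the others, any comparison across the table has to track the direction per column. The sketch below copies the table's figures and picks the best configuration for each metric; the names `SCORES`, `LOWER_IS_BETTER`, and `best_per_column` are illustrative and not part of the Z-GASAB harness.

```python
# Column order matches the table header.
COLUMNS = ["Upgrade error", "Typosquat pass", "AST search", "Path tracing", "API docs"]
LOWER_IS_BETTER = {"Upgrade error"}  # every other column is higher-is-better

# Percentages copied from the table.
SCORES = {
    "Zephex Hosted MCP":       [4.8, 98.2, 96.5, 94.2, 81.5],
    "Snyk MCP (Local)":        [12.4, 95.4, 32.1, 18.2, 14.2],
    "Nia Oracle":              [31.6, 42.1, 89.4, 74.2, 92.5],
    "Context7 (Upstash)":      [68.2, 12.4, 8.5, 5.2, 94.8],
    "Exa Code":                [42.5, 24.1, 28.5, 12.4, 91.2],
    "DeepWiki MCP":            [78.4, 5.2, 54.1, 62.5, 48.2],
    "GPT-5 Baseline (No RAG)": [89.5, 2.1, 0.0, 0.0, 19.5],
}


def best_per_column(scores):
    """Map each column name to the configuration with the best value in it."""
    best = {}
    for idx, column in enumerate(COLUMNS):
        pick = min if column in LOWER_IS_BETTER else max
        best[column] = pick(scores, key=lambda name: scores[name][idx])
    return best


if __name__ == "__main__":
    for column, name in best_per_column(SCORES).items():
        print(f"{column}: {name}")
```

On these figures, Zephex Hosted MCP leads every column except API docs, where Context7 (Upstash) has the highest score.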

What we measured

Z-GASAB tests agents on tasks that need your repository—not only cached library docs: rejecting typosquatted packages, finding symbols with AST search, tracing webhook paths, and generating API notes from live structure.

Typosquat pass: Can the stack block malicious or confusing npm installs?
AST search: Does the tool return the right file and symbol in your tree?
Path tracing: Can it follow real routes (e.g. Stripe webhook → handler)?
API docs: How well does it answer bleeding-edge third-party API questions?
Upgrade error: Dependency-upgrade error rate; lower is better (shown as success % in the chart).
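To make the AST search metric concrete, here is a minimal sketch of what "return the right file and symbol" can mean in practice. It uses Python's standard `ast` module purely for illustration; the benchmark targets npm repositories, and the `find_symbol` helper and the sample source are hypothetical, not taken from the harness or from any tool in the table.

```python
import ast


def find_symbol(source: str, symbol: str, filename: str = "<memory>"):
    """Return sorted (filename, line, kind) tuples for each def or class named `symbol`."""
    definitions = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, definitions) and node.name == symbol:
            hits.append((filename, node.lineno, type(node).__name__))
    return sorted(hits)


# Hypothetical sample file with a top-level function and a method of the same name.
SAMPLE = '''\
def handle_webhook(event):
    return event["type"]

class Handler:
    def handle_webhook(self, event):
        return None
'''

if __name__ == "__main__":
    for hit in find_symbol(SAMPLE, "handle_webhook"):
        print(hit)
```

Unlike a plain text search, this matches only real definitions, so a comment or string that merely mentions `handle_webhook` is not reported.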

