Zephex applies multi-layer rate limits to keep the service responsive and prevent abuse. This page explains exactly what counts, what doesn't, and how limits are enforced.
| Plan | Monthly Requests | Per-Minute Burst | Sessions per Key | Price |
|---|---|---|---|---|
| Free | 555 | 50 | 25 | $0 |
| Pro | 3,500 | 300 | 25 | $7/mo |
| Max | 10,000 | 1,000 | 25 | $19/mo |
Only tools/call counts. Every time your AI agent invokes a tool (such as get_project_context, find_code, or read_code), that is one request against your monthly limit.
These do not count:
- initialize (MCP handshake)
- tools/list (listing available tools)
- resources/list and prompts/list
- notifications/* (any notification message)

Your editor can reconnect, refresh the tool list, and send notifications as often as needed without consuming your quota.
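The counting rule above can be expressed as a small client-side check. This is only an illustrative sketch for estimating your own usage; the names `COUNTED_METHODS`, `counts_against_quota`, and `count_requests` are hypothetical and not part of any Zephex API.

```python
# Illustrative sketch: estimate how many requests a stream of JSON-RPC
# method names would consume, given that only tools/call is counted.
# All names here are hypothetical helpers, not a Zephex API.
COUNTED_METHODS = {"tools/call"}


def counts_against_quota(method: str) -> bool:
    """Return True if this JSON-RPC method consumes one monthly request."""
    return method in COUNTED_METHODS


def count_requests(methods: list[str]) -> int:
    """Count how many of the given method calls are billable."""
    return sum(1 for method in methods if counts_against_quota(method))


traffic = [
    "initialize",
    "tools/list",
    "tools/call",
    "notifications/initialized",
    "tools/call",
    "resources/list",
]
print(count_requests(traffic))  # 2
```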
The per-minute cap prevents a single user from overwhelming the endpoint during a burst. These are token-bucket limits with a 1.5× burst allowance; the per-plan values are listed in the Per-Minute Burst column of the table above.
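For readers unfamiliar with token buckets, here is a minimal sketch of the general technique. It assumes the bucket capacity is 1.5 times the per-minute value and that tokens refill evenly at the per-minute rate; the server's real parameters and implementation are not documented here, so treat both as assumptions. The `TokenBucket` class and its `clock` parameter are hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Generic token bucket. Assumed (not confirmed) parameters:
    capacity = per_minute * burst_factor, steady refill of per_minute per 60 s."""

    def __init__(
        self,
        per_minute: int,
        burst_factor: float = 1.5,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.per_minute = per_minute
        self.capacity = per_minute * burst_factor
        self.tokens = self.capacity  # start with a full bucket
        self._clock = clock
        self._last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.per_minute / 60.0)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Example: a bucket sized like the Free plan's per-minute value of 50.
bucket = TokenBucket(per_minute=50)
print(bucket.capacity)  # 75.0
```

Under these assumptions, a client could pace itself locally with `try_acquire()` before sending each tools/call rather than waiting for the server to reject the request.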
A typical AI coding session makes 13–35 tool calls over 5–6 minutes (an average of 2–7 calls per minute). Even the Free tier handles this comfortably with room for bursts.
Each API key supports up to 25 concurrent MCP sessions. A session is created when your editor sends an initialize request. You can have multiple editors (Cursor + Claude Code + VS Code) connected to the same key without hitting session limits.
Within each session, up to 8 tool calls can be in-flight simultaneously, which handles parallel tool invocations made by some AI agents.
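If you write your own client, you can stay under the in-flight cap by gating calls with a semaphore. The sketch below uses `asyncio.Semaphore`; `SessionCallLimiter` and `fake_tool_call` are hypothetical stand-ins, and the fake call only sleeps instead of contacting any server.

```python
import asyncio

MAX_IN_FLIGHT = 8  # per-session in-flight cap described above


class SessionCallLimiter:
    """Hypothetical client-side guard that caps concurrent tool calls."""

    def __init__(self, max_in_flight: int = MAX_IN_FLIGHT) -> None:
        self._sem = asyncio.Semaphore(max_in_flight)
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, for inspection

    async def run(self, make_call):
        """Await make_call() once a slot is free."""
        async with self._sem:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await make_call()
            finally:
                self.in_flight -= 1


async def fake_tool_call(i: int) -> int:
    """Stand-in for a real tools/call round trip."""
    await asyncio.sleep(0.01)
    return i


async def main():
    limiter = SessionCallLimiter()
    calls = (limiter.run(lambda i=i: fake_tool_call(i)) for i in range(20))
    results = await asyncio.gather(*calls)
    return results, limiter.peak


if __name__ == "__main__":
    results, peak = asyncio.run(main())
    print(len(results), peak)
```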
Every API response includes:

    X-RateLimit-Limit: 300
    X-RateLimit-Remaining: 247
    X-RateLimit-Reset: 1714521600

- X-RateLimit-Limit: your plan's monthly request cap
- X-RateLimit-Remaining: requests left this billing period
- X-RateLimit-Reset: Unix timestamp (seconds) when the counter resets

Monthly limits reset on the first day of each calendar month at 00:00 UTC. Usage is tracked per user account (not per individual key). If you have multiple API keys, they all share the same monthly pool.
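As a rough illustration, the headers above can be read and the reset time derived as follows. The helper names `parse_rate_limit_headers` and `next_monthly_reset` are hypothetical, and the sketch assumes the header values arrive as plain decimal strings.

```python
from datetime import datetime, timezone


def parse_rate_limit_headers(headers: dict[str, str]) -> dict:
    """Extract the three rate-limit headers (header names are case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    limit = int(lowered["x-ratelimit-limit"])
    remaining = int(lowered["x-ratelimit-remaining"])
    reset_at = datetime.fromtimestamp(int(lowered["x-ratelimit-reset"]), tz=timezone.utc)
    return {
        "limit": limit,
        "remaining": remaining,
        "used": limit - remaining,
        "reset_at": reset_at,
    }


def next_monthly_reset(now: datetime) -> datetime:
    """First day of the next calendar month at 00:00 UTC.
    Expects a timezone-aware datetime."""
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        return datetime(now.year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(now.year, now.month + 1, 1, tzinfo=timezone.utc)


example = {
    "X-RateLimit-Limit": "300",
    "X-RateLimit-Remaining": "247",
    "X-RateLimit-Reset": "1714521600",
}
info = parse_rate_limit_headers(example)
print(info["used"], info["reset_at"].isoformat())  # 53 2024-05-01T00:00:00+00:00
```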
When the monthly cap is reached, tools/call returns:

    {
      "jsonrpc": "2.0",
      "error": {
        "code": -32003,
        "message": "Usage limit exceeded. Upgrade to Pro for 3,500 requests/month.",
        "data": {
          "tier": "free",
          "current_usage": 555,
          "limit": 555,
          "reset_date": "2026-06-01T00:00:00Z",
          "upgrade_url": "https://zephex.dev/dashboard/billing"
        }
      }
    }

Your editor still shows tools as connected. tools/list still works. Only tools/call is blocked until the reset date or until you upgrade.
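A client that wants to surface this condition to the user might detect it along these lines. This is a sketch based only on the fields shown in the example response; `describe_usage_limit` is a hypothetical helper, and the field names are assumed to match the example exactly.

```python
import json
from datetime import datetime

USAGE_LIMIT_EXCEEDED = -32003  # error code from the example response above


def describe_usage_limit(response_text: str):
    """Return a one-line summary if the response is a monthly-cap error, else None."""
    payload = json.loads(response_text)
    error = payload.get("error")
    if not error or error.get("code") != USAGE_LIMIT_EXCEEDED:
        return None
    data = error["data"]
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    reset = datetime.fromisoformat(data["reset_date"].replace("Z", "+00:00"))
    return (
        f"{data['tier']} plan: {data['current_usage']}/{data['limit']} requests used, "
        f"resets {reset:%Y-%m-%d}, upgrade at {data['upgrade_url']}"
    )


example_response = """
{ "jsonrpc": "2.0", "error": { "code": -32003,
  "message": "Usage limit exceeded. Upgrade to Pro for 3,500 requests/month.",
  "data": { "tier": "free", "current_usage": 555, "limit": 555,
            "reset_date": "2026-06-01T00:00:00Z",
            "upgrade_url": "https://zephex.dev/dashboard/billing" } } }
"""
print(describe_usage_limit(example_response))
```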
When the per-minute burst is exceeded, you'll receive:

    {
      "jsonrpc": "2.0",
      "error": {
        "code": -32002,
        "message": "Rate limit exceeded. Please try again later.",
        "data": {
          "retry_after": 12
        }
      }
    }

The Retry-After header tells your client how many seconds to wait. Most MCP clients handle this automatically with exponential backoff.
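If your client does not retry on its own, a loop like the following is one way to do it. It is a sketch under assumptions: `send` is a hypothetical callable that performs one tools/call and returns the parsed JSON-RPC response as a dict, and the sketch reads the wait time from `data.retry_after` in the body, assuming it carries the same value as the Retry-After header. When no hint is present it falls back to exponential backoff.

```python
import time

RATE_LIMITED = -32002  # error code from the example response above


def call_with_retry(send, max_attempts: int = 5, base_delay: float = 1.0, sleep=time.sleep):
    """Call send() until it stops returning a rate-limit error or attempts run out.

    send: hypothetical zero-argument callable returning a parsed JSON-RPC dict.
    sleep: injectable so the wait can be stubbed out in tests.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        error = response.get("error")
        if not error or error.get("code") != RATE_LIMITED:
            return response  # success, or an error that retrying will not fix
        if attempt == max_attempts - 1:
            break  # out of attempts; return the last rate-limit error
        hinted = (error.get("data") or {}).get("retry_after")
        # Prefer the server's hint; otherwise back off 1s, 2s, 4s, ...
        delay = hinted if hinted is not None else base_delay * (2 ** attempt)
        sleep(delay)
    return response
```

Note that the loop deliberately does not retry the monthly-cap error (-32003), since waiting a few seconds will not clear it.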
- Call get_project_context once per session. It caches internally, so don't call it before every tool invocation.
- Run check_test before reading code. It tells you which 3–8 files to read, avoiding wasted read_code calls.
- Use find_code with exhaustive: true for renames. One call finds all occurrences instead of multiple searches.
- Prefer read_code mode symbol over mode file. Reading a specific function is cheaper than reading an entire file.

A secondary IP-based rate limit (200 requests per minute per IP) exists independently of API key limits. This prevents abuse from a single network regardless of how many API keys are used. Normal usage never approaches this limit.
When you hit your monthly limit, the error response includes an upgrade_url. You can also upgrade anytime from Dashboard → Billing → Upgrade Plan. Upgrades take effect immediately — your new higher limit applies for the rest of the current month.
Billing: billing@zephex.dev
Rate limit issues: support@zephex.dev