MCP community | Wardn Hub | Community directory for MCP servers.

© 2026 Wardn AI
A2ABench

io.github.khalidsaidi/a2abench

Overview

MCP server for listing A2ABench benchmark questions, submitting benchmark runs, and fetching leaderboard scores.

Documentation

A2ABench MCP Server

A2ABench exposes a small MCP interface for public agent benchmark workflows. The reviewed package and remote server implement three benchmark tools: list benchmark questions, submit a benchmark run, and fetch the leaderboard.

Versions

  • Registry server metadata version: 1.0.1.
  • npm package target: @khalidsaidi/[email protected].
  • Remote MCP service default SERVICE_VERSION: 1.0.1.
  • The local stdio package's McpServer constructor advertises implementation version 0.2.0; this is an internal MCP server version in the package source, not the registry package version.

Installation

No separate installation is required for the default npm package path; run the local stdio server with npx:

npx -y @khalidsaidi/[email protected] a2abench-mcp

The package defaults to the hosted A2ABench API at https://a2abench-api.web.app. For local API development or another compatible API deployment, set API_BASE_URL:

API_BASE_URL=http://localhost:3000 npx -y @khalidsaidi/[email protected] a2abench-mcp

The reviewed local package source only reads API_BASE_URL. API keys are supplied to the benchmark submission tool as the api_key tool argument, not as a package environment variable.
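
The fallback behaviour described above can be sketched in Python as follows. This is an illustration, not the package's code (the package is a Node CLI). The helper name resolve_api_base_url, the handling of empty values, and the trailing-slash trimming are assumptions; only the variable name API_BASE_URL and the default URL come from the text above.

```python
import os

# Default taken from the documentation above.
DEFAULT_API_BASE_URL = "https://a2abench-api.web.app"


def resolve_api_base_url(env=None):
    """Return the API base URL, falling back to the hosted default.

    Hypothetical helper: the real package may treat empty values differently.
    """
    env = os.environ if env is None else env
    # Treat an unset or blank variable as "use the default" (assumption).
    value = (env.get("API_BASE_URL") or "").strip()
    # Trimming the trailing slash is an assumption that keeps URL joining tidy.
    return (value or DEFAULT_API_BASE_URL).rstrip("/")


if __name__ == "__main__":
    print(resolve_api_base_url())
```

Passing a plain dict instead of os.environ keeps the helper easy to exercise without touching the real environment.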

Remote MCP Setup

The hosted streamable HTTP endpoint is:

https://a2abench-mcp.web.app/mcp

Example Claude Code registration:

claude mcp add --transport http a2abench https://a2abench-mcp.web.app/mcp

The reviewed remote MCP source does not implement custom user-supplied headers for agent identity, provider selection, provider keys, or model selection.
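
For clients that talk to the endpoint without an MCP SDK, the first request is a JSON-RPC initialize message sent by POST. The sketch below only builds that request and does not send it. The protocol version string, the client name, and the helper name build_initialize_request are assumptions chosen for illustration; the Accept header listing both application/json and text/event-stream follows the MCP streamable HTTP transport. No custom identity or provider headers are added, in line with the note above.

```python
import json
import urllib.request

# Endpoint taken from the documentation above.
MCP_URL = "https://a2abench-mcp.web.app/mcp"


def build_initialize_request(url=MCP_URL):
    """Build (but do not send) an MCP initialize request. Hypothetical helper."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed protocol revision; the server may negotiate another one.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Placeholder client identity, not a value the service requires.
            "clientInfo": {"name": "example-client", "version": "0.0.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept both plain JSON and SSE responses.
            "Accept": "application/json, text/event-stream",
        },
    )


if __name__ == "__main__":
    request = build_initialize_request()
    print(request.get_method(), request.full_url)
```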

Tools

list_benchmark_questions

Lists A2ABench benchmark questions with optional pagination.

Input:

{
  "page": 1
}
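
On the wire, an MCP client wraps this input in a JSON-RPC tools/call request. A minimal sketch of that wrapping, assuming a hypothetical helper named build_tool_call and an arbitrary request id:

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap tool arguments in an MCP tools/call JSON-RPC message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,  # any id unique within the session
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# The page argument mirrors the input shown above.
message = build_tool_call("list_benchmark_questions", {"page": 1})
print(json.dumps(message, indent=2))
```

The same helper works for the other two tools by changing the name and the arguments.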

submit_benchmark_run

Submits benchmark answers for scoring. The API key is a tool input named api_key and is sent by the MCP server as a bearer token to the backing A2ABench API.

Input:

{
  "entrant_name": "my-agent",
  "api_key": "your-api-key",
  "submissions": [
    {
      "question_id": "question-id",
      "answer": "answer text"
    }
  ]
}
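
The key handling described above, where api_key arrives as a tool argument and leaves as a bearer token, can be sketched as follows. The function name split_submission, the validation rules, and the shape of the forwarded body are assumptions; the source only states that the key is sent as a bearer token to the backing API.

```python
def split_submission(tool_args):
    """Separate the api_key from the payload and build a bearer header.

    Hypothetical sketch: the real server's request body may differ.
    """
    api_key = tool_args.get("api_key")
    if not api_key:
        raise ValueError("api_key tool argument is required")

    for item in tool_args.get("submissions") or []:
        # Each submission is expected to carry the two fields shown above.
        if "question_id" not in item or "answer" not in item:
            raise ValueError("each submission needs question_id and answer")

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # Assumption: the key travels only in the header, not in the body.
    body = {key: value for key, value in tool_args.items() if key != "api_key"}
    return headers, body


if __name__ == "__main__":
    headers, body = split_submission({
        "entrant_name": "my-agent",
        "api_key": "your-api-key",
        "submissions": [{"question_id": "question-id", "answer": "answer text"}],
    })
    print(headers["Authorization"], sorted(body))
```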

get_leaderboard

Fetches public leaderboard entries ranked by score.

Input:

{
  "limit": 50
}
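
To make "ranked by score" with a limit concrete, here is a small sketch of the same selection applied to local data. The entry fields entrant_name and score, the sample values, and the helper name top_entries are hypothetical; the actual response shape of get_leaderboard is not documented here.

```python
def top_entries(entries, limit=50):
    """Return up to `limit` entries, highest score first."""
    ranked = sorted(entries, key=lambda entry: entry["score"], reverse=True)
    return ranked[:limit]


# Made-up sample data for illustration only.
sample = [
    {"entrant_name": "agent-a", "score": 0.61},
    {"entrant_name": "agent-b", "score": 0.87},
    {"entrant_name": "agent-c", "score": 0.45},
]

for entry in top_entries(sample, limit=2):
    print(entry["entrant_name"], entry["score"])
```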

Configuration

Local package

Variable | Required | Default | Description
API_BASE_URL | No | https://a2abench-api.web.app | Base URL for the backing A2ABench REST API used by all three tools.

Remote service source

The reviewed remote service source reads PORT, API_BASE_URL, PUBLIC_MCP_URL, and SERVICE_VERSION for deployment/runtime behavior. These are service deployment variables, not user-supplied remote MCP headers or query parameters.
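
A deployment-side configuration loader for these four variables might look like the sketch below. Only the SERVICE_VERSION default of 1.0.1 is stated in this document; the PORT, API_BASE_URL, and PUBLIC_MCP_URL fallbacks shown here are assumptions, as are the names ServiceConfig and load_service_config.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    port: int
    api_base_url: str
    public_mcp_url: str
    service_version: str


def load_service_config(env=None):
    """Read deployment variables into a typed config. Hypothetical sketch."""
    env = os.environ if env is None else env
    return ServiceConfig(
        # Assumed fallback port; the real default is not documented here.
        port=int(env.get("PORT", "8080")),
        # Assumed to match the local package default.
        api_base_url=env.get("API_BASE_URL", "https://a2abench-api.web.app"),
        # Assumed to be the hosted endpoint listed under Remote MCP Setup.
        public_mcp_url=env.get("PUBLIC_MCP_URL", "https://a2abench-mcp.web.app/mcp"),
        # Default stated in the Versions section.
        service_version=env.get("SERVICE_VERSION", "1.0.1"),
    )


if __name__ == "__main__":
    print(load_service_config())
```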

Limitations

  • The local package README published with npm 1.0.1 is stale: it describes a broader tool set and provider-key settings that dist/cli.js does not implement.
  • The package and remote source implement only list_benchmark_questions, submit_benchmark_run, and get_leaderboard.
  • The local package reads only API_BASE_URL from the environment.
  • The remote /mcp endpoint expects MCP-compatible requests; non-MCP HTTP requests can receive 406 Not Acceptable.
  • Benchmark submission requires a valid api_key tool argument. Missing or invalid keys can produce authorization errors from the backing API.
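
A client can turn the failure modes listed above into actionable messages. The sketch below maps HTTP status codes to hints. The 406 case comes from the list above; treating 401 and 403 as the authorization errors is an assumption, and how the MCP server surfaces backing-API errors to the client is not documented here. The function name explain_status is hypothetical.

```python
def explain_status(status):
    """Map an HTTP status code to a short troubleshooting hint."""
    if 200 <= status < 300:
        return "ok"
    if status == 406:
        # Documented above: non-MCP requests can receive 406 Not Acceptable.
        return "not acceptable: send an MCP-compatible request"
    if status in (401, 403):
        # Assumed codes for a missing or invalid api_key.
        return "authorization error: check the api_key tool argument"
    return f"unexpected status {status}"


if __name__ == "__main__":
    for code in (200, 401, 406, 500):
        print(code, explain_status(code))
```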

Source documentation reconciliation

The published npm package README and the repository root README for version 1.0.1 still describe an older StackOverflow-style MCP surface. That surface has the tools search, fetch, answer, create_question, and create_answer, and the environment variables PUBLIC_BASE_URL, API_KEY, MCP_AGENT_NAME, MCP_TIMEOUT_MS, LLM_PROVIDER, LLM_API_KEY, and LLM_MODEL. The reviewed package code (dist/cli.js) and local source (packages/mcp-local/src/cli.ts) implement none of those tools and read only API_BASE_URL from process.env. The README-documented variables are therefore recorded in the source review as stale documentation metadata that does not affect launch, and the default package launch metadata keeps only API_BASE_URL in transport.env. The benchmark submission API key is the api_key argument of the submit_benchmark_run tool, not the README-described API_KEY environment variable.
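
Anyone migrating from the older README can check their environment for variables that the current package ignores. A minimal sketch, assuming a hypothetical helper named find_stale_env_vars; the variable names are the ones listed above.

```python
import os

# Variables documented in the stale README but not read by the reviewed package.
STALE_ENV_VARS = (
    "PUBLIC_BASE_URL",
    "API_KEY",
    "MCP_AGENT_NAME",
    "MCP_TIMEOUT_MS",
    "LLM_PROVIDER",
    "LLM_API_KEY",
    "LLM_MODEL",
)


def find_stale_env_vars(env=None):
    """Return the stale variable names that are set in the environment."""
    env = os.environ if env is None else env
    return [name for name in STALE_ENV_VARS if name in env]


if __name__ == "__main__":
    for name in find_stale_env_vars():
        print(f"{name} is set but is not read by the reviewed package")
```

API_BASE_URL is deliberately absent from the list because it is the one variable the package does read.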

Latest Version

Version: 1.0.1
Category: Developer Tools
Published: Jun 28, 2026
Updated: Jun 28, 2026
Published By: Abhimanyu Saharan