EvalGate

MCP Integration

MCP-compatible tool discovery and execution for AI agents

Cursor compatible

Claude compatible

ChatGPT compatible

Overview

EvalGate exposes an MCP-style tool discovery and execution API for AI agents. Tools map to platform services: evaluations, quality scores, traces, spans, and test cases.

Available Tools

• Quality score retrieval
• Evaluation management
• Trace and span operations
• Test case management

Supported Agents

• Cursor IDE
• Claude Desktop
• ChatGPT Plugins
• Custom MCP clients

API Endpoints

Method	Endpoint	Auth	Description
GET	`/api/mcp/tools`	None	List available tools
POST	`/api/mcp/call`	Required	Execute a tool

Authentication

Use either method for authenticated requests:

Session Cookie

When you use the platform in a browser, the session cookie is sent automatically with each request.

API Key

Authorization: Bearer <EVALAI_API_KEY>

Get API keys from Settings → Developer in the app.
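
As a minimal sketch of the API-key method, the helper below builds the header in Python. It assumes the key is kept in an environment variable named EVALAI_API_KEY (borrowed from the placeholder above; the platform does not require that name). `auth_headers` is a hypothetical helper, not part of any EvalGate SDK.

```python
import os


def auth_headers(api_key=None):
    """Return request headers carrying the API key in Bearer form."""
    # Assumption: when no key is passed in, it is read from the
    # EVALAI_API_KEY environment variable.
    key = api_key or os.environ.get("EVALAI_API_KEY")
    if not key:
        raise ValueError("no API key provided")
    return {"Authorization": f"Bearer {key}"}
```

Raising early on a missing key keeps a misconfigured client from sending unauthenticated calls to `/api/mcp/call`.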

Tool Discovery

Example Request

curl -X GET "https://eval.ai/api/mcp/tools"

Response Format

{
  "tools": [
    {
      "name": "eval.quality.latest",
      "description": "Get the latest quality score for an evaluation.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "evaluationId": { 
            "type": "number", 
            "description": "ID of the evaluation" 
          },
          "baseline": { 
            "type": "string", 
            "enum": ["published", "previous", "production"] 
          }
        },
        "required": ["evaluationId"]
      }
    }
  ]
}
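
A client can use the `inputSchema` from discovery to check arguments before calling a tool. The sketch below handles only the two schema keywords that appear in the response above (`required` and `enum`); it is not a full JSON Schema validator, and `check_arguments` is a hypothetical helper name.

```python
def check_arguments(tool, arguments):
    """Return a list of problems found when checking arguments against a tool's inputSchema."""
    schema = tool.get("inputSchema", {})
    properties = schema.get("properties", {})
    problems = []
    # Every name listed under "required" must be present.
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    # Values for properties that declare an "enum" must be one of its members.
    for name, value in arguments.items():
        spec = properties.get(name, {})
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"invalid value for {name}: {value!r}")
    return problems


# Tool definition copied from the discovery response above.
QUALITY_LATEST = {
    "name": "eval.quality.latest",
    "inputSchema": {
        "type": "object",
        "properties": {
            "evaluationId": {"type": "number"},
            "baseline": {"type": "string", "enum": ["published", "previous", "production"]},
        },
        "required": ["evaluationId"],
    },
}
```

An empty list means the arguments passed both checks, for example `check_arguments(QUALITY_LATEST, {"evaluationId": 42})`.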

Tool Execution

Example Request

curl -X POST "https://eval.ai/api/mcp/call" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "tool": "eval.quality.latest",
    "arguments": { 
      "evaluationId": 42, 
      "baseline": "published" 
    }
  }'
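
The same request can be sketched in Python with only the standard library. `build_call_request` is a hypothetical helper. It only constructs the request, so nothing is sent until the caller passes the result to `urllib.request.urlopen`.

```python
import json
import urllib.request

CALL_URL = "https://eval.ai/api/mcp/call"


def build_call_request(tool, arguments, api_key):
    """Build a POST request for /api/mcp/call without sending it."""
    body = json.dumps({"tool": tool, "arguments": arguments}).encode("utf-8")
    return urllib.request.Request(
        CALL_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_call_request(
    "eval.quality.latest",
    {"evaluationId": 42, "baseline": "published"},
    "YOUR_API_KEY",
)
# To send it: urllib.request.urlopen(request)
```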

Success (200)

{
  "ok": true,
  "content": [
    { 
      "type": "text", 
      "text": "{\"score\":85,\"baselineScore\":82,...}" 
    }
  ]
}

Error (400)

{
  "ok": false,
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Evaluation not found",
    "requestId": "uuid"
  }
}
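
Both response shapes can be handled by one function on the client side. The sketch below assumes, based on the success example, that each `text` item holds a JSON-encoded object. `ToolCallError` and `parse_call_response` are hypothetical names introduced for illustration.

```python
import json


class ToolCallError(Exception):
    """Raised when the response carries "ok": false."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.request_id = request_id


def parse_call_response(payload):
    """Return the decoded text items of a successful call, or raise ToolCallError."""
    if not payload.get("ok"):
        error = payload.get("error", {})
        raise ToolCallError(
            error.get("code"), error.get("message"), error.get("requestId")
        )
    results = []
    for item in payload.get("content", []):
        # Assumption: text items contain JSON, as in the success example.
        if item.get("type") == "text":
            results.append(json.loads(item["text"]))
    return results
```

Keeping `requestId` on the exception makes it easy to quote when reporting a failed call.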

Available Tools

eval.quality.latest

Get the latest quality score for an evaluation

Parameters: evaluationId (required), baseline (optional)

eval.run.create

Create a new evaluation run

Parameters: evaluationId (required), environment (optional)

eval.trace.create

Create a distributed trace

Parameters: name (required), metadata (optional)

eval.testcase.list

List test cases for an evaluation

Parameters: evaluationId (required), limit (optional)
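
For illustration, the sketch below builds a request body for each of the four tools and leaves out optional parameters that are not set. `call_payload` is a hypothetical helper, and the argument values (such as "staging", "checkout-flow", and the metadata contents) are made-up placeholders, since the accepted types and values are not specified here.

```python
def call_payload(tool, **arguments):
    """Build the body for /api/mcp/call, omitting arguments whose value is None."""
    return {
        "tool": tool,
        "arguments": {k: v for k, v in arguments.items() if v is not None},
    }


# One payload per tool listed above; all values are illustrative only.
payloads = [
    call_payload("eval.quality.latest", evaluationId=42, baseline=None),
    call_payload("eval.run.create", evaluationId=42, environment="staging"),
    call_payload("eval.trace.create", name="checkout-flow", metadata={"source": "ci"}),
    call_payload("eval.testcase.list", evaluationId=42, limit=10),
]
```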

Integration Examples

Cursor IDE

Add the following MCP server configuration to your Cursor settings to manage evaluations directly from the IDE.

{
  "mcpServers": {
    "evalai": {
      "command": "curl",
      "args": ["https://eval.ai/api/mcp/tools"]
    }
  }
}

Claude Desktop

Add the following entry to the Claude Desktop configuration to use EvalGate tools for evaluation management and quality scoring.

{
  "mcpServers": {
    "evalai": {
      "url": "https://eval.ai/api/mcp/tools",
      "auth": "Bearer YOUR_API_KEY"
    }
  }
}

