Quick Start Guide
Get started with the EvalGate SDK in under 5 minutes
EvalGate is CI for AI behavior. LLMs drift silently — EvalGate turns evaluations into CI gates so regressions never reach production.
```yaml
# Add this to .github/workflows/evalai.yml
name: EvalGate CI
on: [push, pull_request]
jobs:
  evalai:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npx evalgate ci --format github --write-results --base main
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: evalai-results
          path: .evalai/
```

That's it! Your CI now automatically discovers specs, runs only impacted tests, compares against baseline, and posts rich summaries in PRs.
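The baseline comparison above can be sketched conceptually. This is a minimal illustration of the idea of gating on a stored baseline, not EvalGate's actual implementation; the metric names, baseline values, and tolerance below are hypothetical.

```python
# Conceptual sketch of a regression gate (not EvalGate's real logic):
# compare the current run's scores against a stored baseline and collect
# any metric that regressed beyond a small tolerance.
BASELINE = {"accuracy": 0.92, "pass_rate": 0.98}  # captured on the main branch
TOLERANCE = 0.01                                  # allowed slack per metric

def gate(current):
    """Return the metrics that regressed past the tolerance."""
    return [
        name for name, base in BASELINE.items()
        if current.get(name, 0.0) < base - TOLERANCE
    ]

regressions = gate({"accuracy": 0.90, "pass_rate": 0.98})
print(regressions)  # ['accuracy'] -> this run would fail the gate
```

If the returned list is non-empty, a real gate would exit non-zero so the CI job, and therefore the PR, is blocked.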
```shell
npx @evalgate/sdk init   # detects repo, creates baseline, installs CI workflow
git add evals/ .github/workflows/evalai-gate.yml evalai.config.json
git commit -m "chore: add EvalGate regression gate"
git push                 # open a PR → CI blocks regressions
```
That's it. `evalai init` detects your package manager, runs your tests to capture a baseline, and scaffolds everything. No account is required for local gating.
Run gate locally
```shell
npx evalgate gate
```

Update baseline

```shell
npx evalgate baseline update
```

Upgrade to full gate

```shell
npx evalgate upgrade --full
```

```shell
pip install pauly4010-evalgate-sdk
```
```python
from evalgate_sdk import expect

result = expect("The capital of France is Paris.").to_contain("Paris")
print(result.passed)  # True
```

No API key is needed for local assertions. For platform traces and evaluations, use AIEvalClient. See the SDK page for full Python examples.
- Node.js 18.0.0 or higher
- npm, yarn, or pnpm package manager
- Python 3.8+ (for Python SDK)
- An EvalGate account (only needed for platform features such as traces and evaluations; local gating and assertions work without one)
Create an API Key
1. Navigate to the Developer Dashboard
2. Scroll down to the "API Keys" section
3. Click "Create API Key"
4. Enter a name (e.g., "Development Key")
5. Select the scopes you need (start with all for testing)
6. Click "Create Key"
7. Copy and save your API key immediately - you won't see it again!
Important: Your API key is shown only once. Store it securely!
Install the SDK
Install the EvalGate SDK in your project using your preferred package manager:
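A sketch of the install commands, assuming the package names that appear elsewhere in this guide (`@evalgate/sdk` for Node.js, `pauly4010-evalgate-sdk` for Python):

```shell
# Node.js (pick your package manager)
npm install @evalgate/sdk
# yarn add @evalgate/sdk
# pnpm add @evalgate/sdk

# Python
pip install pauly4010-evalgate-sdk
```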
Configure Environment Variables
Create a .env file in your project root:
```
EVALAI_API_KEY=sk_test_your_api_key_here
EVALAI_ORGANIZATION_ID=your_org_id_here
```
Where to find these values:
- `EVALAI_API_KEY` - from the API key creation dialog
- `EVALAI_ORGANIZATION_ID` - shown in the API key creation dialog
Security Tip: Add .env to your .gitignore file to prevent committing secrets!
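To fail fast when these variables are missing, here is a minimal Python sketch. The `load_evalgate_env` helper is hypothetical, not part of the SDK; in practice a loader such as python-dotenv would populate the environment from `.env` first.

```python
import os

# Hypothetical helper: read the required EvalGate settings from the
# environment and fail with a clear error when one is missing.
def load_evalgate_env():
    required = ["EVALAI_API_KEY", "EVALAI_ORGANIZATION_ID"]
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "api_key": os.environ["EVALAI_API_KEY"],
        "organization_id": int(os.environ["EVALAI_ORGANIZATION_ID"]),
    }

# Demo values only; normally these come from your .env loader or CI secrets.
os.environ["EVALAI_API_KEY"] = "sk_test_example"
os.environ["EVALAI_ORGANIZATION_ID"] = "42"
config = load_evalgate_env()
print(config["organization_id"])  # 42
```

Parsing the organization ID to an integer here mirrors the `int(...)`/`parseInt(...)` calls in the client-initialization examples below.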
Initialize the Client
Import and initialize the SDK in your code:
TypeScript
```typescript
import { AIEvalClient } from '@evalgate/sdk'

// Auto-loads from environment variables
const client = AIEvalClient.init()

// Or with explicit configuration
const client = new AIEvalClient({
  apiKey: process.env.EVALAI_API_KEY,
  organizationId: parseInt(process.env.EVALAI_ORGANIZATION_ID!),
  debug: true // Enable debug logging
})
```

Python
```python
import os

from evalgate_sdk import AIEvalClient

# Auto-loads from environment variables
client = AIEvalClient.init()

# Or with explicit configuration
client = AIEvalClient(
    api_key=os.environ["EVALAI_API_KEY"],
    organization_id=int(os.environ["EVALAI_ORGANIZATION_ID"]),
    debug=True
)
```

Create Your First Trace
Track your first LLM call:
TypeScript
```typescript
// Create a trace
const trace = await client.traces.create({
  name: 'Chat Completion',
  traceId: 'trace-' + Date.now(),
  metadata: {
    userId: 'user-123',
    model: 'gpt-4'
  }
})
console.log('Trace created:', trace.id)

// Add a span to track the LLM call
const span = await client.traces.createSpan(trace.id, {
  name: 'OpenAI API Call',
  spanId: 'span-' + Date.now(),
  type: 'llm',
  startTime: new Date().toISOString(),
  input: 'What is AI?',
  output: 'AI is artificial intelligence...',
  metadata: {
    model: 'gpt-4',
    tokens: 150,
    latency: 1200
  }
})
console.log('Span created:', span.id)
```

Python
```python
import time
from datetime import datetime

from evalgate_sdk.types import CreateTraceParams, CreateSpanParams

# Create a trace
trace = await client.traces.create(CreateTraceParams(
    name="Chat Completion",
    trace_id=f"trace-{int(time.time() * 1000)}",
    metadata={"userId": "user-123", "model": "gpt-4"}
))
print(f"Trace created: {trace.id}")

# Add a span to track the LLM call
span = await client.traces.create_span(trace.id, CreateSpanParams(
    name="OpenAI API Call",
    span_id=f"span-{int(time.time() * 1000)}",
    type="llm",
    start_time=datetime.now().isoformat(),
    input="What is AI?",
    output="AI is artificial intelligence...",
    metadata={"model": "gpt-4", "tokens": 150, "latency": 1200}
))
print(f"Span created: {span.id}")
```

Write Your First Eval
Now that you can trace, let's evaluate. The SDK includes a test suite runner with 20+ built-in assertions designed for LLM outputs.
TypeScript
```typescript
import { createTestSuite, expect } from '@evalgate/sdk';

const suite = createTestSuite('My First Eval', {
  executor: async (input) => await myLLM(input),
  cases: [{
    input: 'Summarize this document...',
    assertions: [
      (output) => expect(output).toHaveLength({ min: 50, max: 500 }),
      (output) => expect(output).toNotContainPII(),
      (output) => expect(output).toHaveSentiment('neutral'),
    ]
  }]
});

const { total, passed, failed } = await suite.run();
console.log(`Results: ${passed}/${total} passed`);
```

Python
```python
from evalgate_sdk import create_test_suite, expect

suite = create_test_suite("My First Eval",
    executor=lambda input: my_llm(input),
    cases=[{
        "input": "Summarize this document...",
        "assertions": [
            lambda output: expect(output).to_have_length(min=50, max=500),
            lambda output: expect(output).to_not_contain_pii(),
            lambda output: expect(output).to_have_sentiment("neutral"),
        ]
    }]
)

results = await suite.run()
print(f"Results: {results.passed}/{results.total} passed")
```

Explore all 20+ assertions including hallucination detection, JSON validation, and profanity checks. View the full assertion library →