
n8n-mcp Performance Benchmarks

Overview

The n8n-mcp project includes a comprehensive suite of performance benchmarks covering all critical operations. These benchmarks help identify performance regressions and guide optimization efforts.

Running Benchmarks

Local Development

# Run all benchmarks
npm run benchmark

# Run in watch mode
npm run benchmark:watch

# Run with UI
npm run benchmark:ui

# Run specific benchmark suite
npm run benchmark tests/benchmarks/node-loading.bench.ts

Continuous Integration

Benchmarks run automatically on:

  • Every push to main branch
  • Every pull request
  • Manual workflow dispatch

Results are:

  • Posted as comments on pull requests
  • Stored in GitHub Pages for historical tracking
  • Used to flag regressions of more than 10%

Benchmark Suites

1. Node Loading Performance

Tests the performance of loading n8n node packages and parsing their metadata.

Key Metrics:

  • Package loading time (< 100ms target)
  • Individual node file loading (< 5ms target)
  • Package.json parsing (< 1ms target)
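
As a concrete illustration, a package-parsing benchmark might look like the following minimal sketch (the path is hypothetical; the real suite is tests/benchmarks/node-loading.bench.ts):

import { bench, describe } from 'vitest';
import { readFile } from 'node:fs/promises';

// Hypothetical path; the real suite resolves installed n8n node packages.
const PACKAGE_JSON = 'node_modules/n8n-nodes-base/package.json';

describe('node loading', () => {
  bench('parse package.json', async () => {
    const raw = await readFile(PACKAGE_JSON, 'utf8');
    JSON.parse(raw); // the < 1ms parsing target applies here
  }, { iterations: 100, warmupIterations: 10 });
});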

2. Database Query Performance

Measures database operation performance including queries, inserts, and updates.

Key Metrics:

  • Node retrieval by type (< 5ms target)
  • Search operations (< 50ms target)
  • Bulk operations (< 100ms target)
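
A sketch of a query benchmark against an in-memory SQLite database (the schema and names here are illustrative; the real suites build a realistic database via SQLiteStorageService):

import { bench, describe } from 'vitest';
import Database from 'better-sqlite3';

// Illustrative in-memory schema; the real database holds full node metadata.
const db = new Database(':memory:');
db.exec(`
  CREATE TABLE nodes (node_type TEXT PRIMARY KEY, properties TEXT);
  INSERT INTO nodes VALUES ('nodes-base.httpRequest', '{}');
`);
const byType = db.prepare('SELECT * FROM nodes WHERE node_type = ?');

describe('database queries', () => {
  bench('node retrieval by type', () => {
    byType.get('nodes-base.httpRequest'); // < 5ms target
  });
});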

3. Search Operations

Tests various search modes and their performance characteristics.

Key Metrics:

  • Simple word search (< 10ms target)
  • Multi-word OR search (< 20ms target)
  • Fuzzy search (< 50ms target)
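
As an illustration, a multi-word OR search can be modeled as one LIKE clause per word, which is why its target (< 20ms) is looser than the single-word target (< 10ms). This is a simplified sketch, not the project's actual query builder:

// Simplified sketch of a multi-word OR search: one LIKE clause per word.
function buildOrSearch(words: string[]): { sql: string; params: string[] } {
  const clauses = words.map(() => 'display_name LIKE ?').join(' OR ');
  return {
    sql: `SELECT * FROM nodes WHERE ${clauses}`,
    params: words.map((w) => `%${w}%`),
  };
}

const { sql, params } = buildOrSearch(['http', 'webhook']);
// db.prepare(sql).all(...params) — each additional word widens the scan.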

4. Validation Performance

Measures configuration and workflow validation speed.

Key Metrics:

  • Simple config validation (< 1ms target)
  • Complex config validation (< 10ms target)
  • Workflow validation (< 50ms target)
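
A sketch of a config-validation benchmark (the validator is a stand-in; the real one checks node schemas):

import { bench, describe } from 'vitest';

// Stand-in validator: reports required fields missing from a config.
function validateConfig(config: Record<string, unknown>, required: string[]): string[] {
  return required.filter((field) => !(field in config));
}

describe('validation', () => {
  bench('simple config validation', () => {
    // < 1ms target for simple configs
    validateConfig({ url: 'https://example.com', method: 'GET' }, ['url', 'method']);
  });
});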

5. MCP Tool Execution

Tests the overhead of MCP tool execution.

Key Metrics:

  • Tool invocation overhead (< 5ms target)
  • Complex tool operations (< 50ms target)
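
Invocation overhead can be isolated by dispatching a no-op tool through the same code path as real tools. The dispatcher below is a hypothetical stand-in (the repository's MCPEngine wrapper serves this role in the real suite):

import { bench, describe } from 'vitest';

// Hypothetical minimal dispatcher; the goal is to measure dispatch cost,
// not the work done inside any particular tool.
const tools: Record<string, (args: unknown) => unknown> = {
  noop: () => null,
};
function executeTool(name: string, args: unknown): unknown {
  const tool = tools[name];
  if (!tool) throw new Error(`unknown tool: ${name}`);
  return tool(args);
}

describe('MCP tool execution', () => {
  bench('invocation overhead (no-op tool)', () => {
    executeTool('noop', {}); // < 5ms target
  });
});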

Performance Targets

Operation Category    Target     Warning    Critical
Node Loading          < 100ms    > 150ms    > 200ms
Database Query        < 5ms      > 10ms     > 20ms
Search (simple)       < 10ms     > 20ms     > 50ms
Search (complex)      < 50ms     > 100ms    > 200ms
Validation            < 10ms     > 20ms     > 50ms
MCP Tools             < 50ms     > 100ms    > 200ms
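
One way to make these thresholds machine-checkable is to encode the table as data; a minimal sketch (this encoding is illustrative, not the project's actual format):

// One possible encoding of the thresholds table above (values in ms).
type Threshold = { target: number; warning: number; critical: number };

const thresholds: Record<string, Threshold> = {
  'node-loading':   { target: 100, warning: 150, critical: 200 },
  'database-query': { target: 5,   warning: 10,  critical: 20 },
  'search-simple':  { target: 10,  warning: 20,  critical: 50 },
  'search-complex': { target: 50,  warning: 100, critical: 200 },
  'validation':     { target: 10,  warning: 20,  critical: 50 },
  'mcp-tools':      { target: 50,  warning: 100, critical: 200 },
};

function classify(category: string, meanMs: number): 'ok' | 'warning' | 'critical' {
  const t = thresholds[category];
  if (meanMs > t.critical) return 'critical';
  if (meanMs > t.warning) return 'warning';
  return 'ok';
}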

Optimization Guidelines

Current Optimizations

  1. In-memory caching: Frequently accessed nodes are cached (see the sketch after this list)
  2. Indexed database: Key fields are indexed for fast lookups
  3. Lazy loading: Large properties are loaded on demand
  4. Batch operations: Multiple operations are batched when possible
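
A minimal sketch of the read-through caching pattern from item 1 (illustrative only):

// Read-through node cache: consult the map before hitting the database.
const nodeCache = new Map<string, unknown>();

function getNodeCached(nodeType: string, loadFromDb: (t: string) => unknown): unknown {
  const hit = nodeCache.get(nodeType);
  if (hit !== undefined) return hit;
  const node = loadFromDb(nodeType);
  nodeCache.set(nodeType, node);
  return node;
}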

Future Optimizations

  1. FTS5 Search: Implement SQLite FTS5 for faster full-text search (see the sketch after this list)
  2. Connection pooling: Reuse database connections
  3. Query optimization: Analyze and optimize slow queries
  4. Parallel loading: Load multiple packages concurrently
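
For reference, the FTS5 idea from item 1 might look like the following (a sketch of a future optimization, not shipped code):

import Database from 'better-sqlite3';

const db = new Database(':memory:');
// Sketch of the FTS5 idea: a virtual table indexed for full-text search.
db.exec(`
  CREATE VIRTUAL TABLE nodes_fts USING fts5(node_type, display_name, description);
  INSERT INTO nodes_fts VALUES ('nodes-base.httpRequest', 'HTTP Request', 'Makes HTTP requests');
`);
// MATCH uses the full-text index instead of scanning with LIKE.
const rows = db
  .prepare('SELECT * FROM nodes_fts WHERE nodes_fts MATCH ?')
  .all('http');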

Benchmark Implementation

Writing New Benchmarks

import { bench, describe } from 'vitest';

describe('My Performance Suite', () => {
  bench('operation name', async () => {
    // Code to benchmark
  }, {
    iterations: 100,        // measured iterations per run
    warmupIterations: 10,   // unmeasured iterations before timing starts
    warmupTime: 500,        // minimum warmup duration in ms
    time: 3000              // minimum total benchmark time in ms
  });
});

Best Practices

  1. Isolate operations: Benchmark specific operations, not entire workflows
  2. Use realistic data: Load actual n8n nodes for accurate measurements
  3. Include warmup: Allow JIT compilation to stabilize
  4. Consider memory: Monitor memory usage for memory-intensive operations
  5. Statistical significance: Run enough iterations for reliable results

Interpreting Results

Key Metrics

  • hz: Operations per second (higher is better)
  • mean: Average time per operation (lower is better)
  • p99: 99th percentile (worst-case performance)
  • rme: Relative margin of error (lower is more reliable)

Performance Regression Detection

A performance regression is flagged when:

  1. Operation time increases by >10% from baseline (see the sketch below)
  2. Multiple related operations show degradation
  3. P99 latency exceeds critical thresholds
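
In code form, the core check in item 1 is simple (a sketch; in CI, github-action-benchmark performs the actual comparison against the stored baseline):

// Flag a regression when an operation's mean slows down >10% vs. baseline.
function isRegression(baselineMs: number, currentMs: number, tolerance = 0.1): boolean {
  return currentMs > baselineMs * (1 + tolerance);
}

isRegression(4.2, 4.9); // true: ~17% slower than the baseline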

Trend Analysis

When reviewing historical results, watch for:

  1. Gradual degradation: Often indicates growing technical debt
  2. Sudden spikes: Usually from specific code changes
  3. Seasonal patterns: May indicate cache effectiveness
  4. Outliers: Check p99 vs mean for consistency

Troubleshooting

Common Issues

  1. Inconsistent results: Increase warmup iterations
  2. High variance: Check for background processes
  3. Memory issues: Reduce iteration count
  4. CI failures: Verify runner resources

Performance Debugging

  1. Use --reporter=verbose for detailed output
  2. Profile with node --inspect for bottlenecks
  3. Check database query plans
  4. Monitor memory allocation patterns

Contributing

When submitting performance improvements:

  1. Run benchmarks before and after changes
  2. Include benchmark results in PR description
  3. Explain optimization approach
  4. Consider trade-offs (memory vs speed)
  5. Add new benchmarks for new features
