Enhanced tools documentation, duplicate ID errors, and AI Agent validator based on telemetry analysis of 593 validation errors across 3 categories: - 378 errors: Duplicate node IDs (64%) - 179 errors: AI Agent configuration (30%) - 36 errors: Other validations (6%) Quick Win #1: Enhanced tools documentation (src/mcp/tools-documentation.ts) - Added prominent warnings to call get_node_essentials() FIRST before configuring nodes - Emphasized 5KB vs 100KB+ size difference between essentials and full info - Updated workflow patterns to prioritize essentials over get_node_info Quick Win #2: Improved duplicate ID error messages (src/services/workflow-validator.ts) - Added crypto import for UUID generation examples - Enhanced error messages with node indices, names, and types - Included crypto.randomUUID() example in error messages - Helps AI agents understand EXACTLY which nodes conflict and how to fix Quick Win #3: Added AI Agent node-specific validator (src/services/node-specific-validators.ts) - Validates prompt configuration (promptType + text requirement) - Checks maxIterations bounds (1-50 recommended) - Suggests error handling (onError + retryOnFail) - Warns about high iteration limits (cost/performance impact) - Integrated into enhanced-config-validator.ts Test Coverage: - Added duplicate ID validation tests (workflow-validator.test.ts) - Added AI Agent validator tests (node-specific-validators.test.ts:2312-2491) - All new tests passing (3527 total passing) Version: 2.22.12 → 2.22.13 Expected Impact: 30-40% reduction in AI agent validation errors Technical Details: - Telemetry analysis: 593 validation errors (Dec 2024 - Jan 2025) - 100% error recovery rate maintained (validation working correctly) - Root cause: Documentation/guidance gaps, not validation logic failures - Solution: Proactive guidance at decision points References: - Telemetry analysis findings - Issue #392 (helpful error messages pattern) - Existing Slack validator pattern (node-specific-validators.ts:98-230) Concieved by Romuald Członkowski - www.aiadvisors.pl/en
15 KiB
n8n-MCP Telemetry Analysis - Complete Index
Navigation Guide for All Analysis Documents
Analysis Period: August 10 - November 8, 2025 (90 days) Report Date: November 8, 2025 Data Quality: High (506K+ events, 36/90 days with errors) Status: Critical Issues Identified - Action Required
Document Overview
This telemetry analysis consists of 5 comprehensive documents designed for different audiences and use cases.
Document Map
┌─────────────────────────────────────────────────────────────┐
│ TELEMETRY ANALYSIS COMPLETE PACKAGE │
├─────────────────────────────────────────────────────────────┤
│ │
│ 1. EXECUTIVE SUMMARY (this file + next level up) │
│ ↓ Start here for quick overview │
│ └─→ TELEMETRY_EXECUTIVE_SUMMARY.md │
│ • For: Decision makers, leadership │
│ • Length: 5-10 minutes read │
│ • Contains: Key stats, risks, ROI │
│ │
│ 2. MAIN ANALYSIS REPORT │
│ ↓ For comprehensive understanding │
│ └─→ TELEMETRY_ANALYSIS_REPORT.md │
│ • For: Product, engineering teams │
│ • Length: 30-45 minutes read │
│ • Contains: Detailed findings, patterns, trends │
│ │
│ 3. TECHNICAL DEEP-DIVE │
│ ↓ For root cause investigation │
│ └─→ TELEMETRY_TECHNICAL_DEEP_DIVE.md │
│ • For: Engineering team, architects │
│ • Length: 45-60 minutes read │
│ • Contains: Root causes, hypotheses, gaps │
│ │
│ 4. IMPLEMENTATION ROADMAP │
│ ↓ For actionable next steps │
│ └─→ IMPLEMENTATION_ROADMAP.md │
│ • For: Engineering leads, project managers │
│ • Length: 20-30 minutes read │
│ • Contains: Detailed implementation steps │
│ │
│ 5. VISUALIZATION DATA │
│ ↓ For presentations and dashboards │
│ └─→ TELEMETRY_DATA_FOR_VISUALIZATION.md │
│ • For: All audiences (chart data) │
│ • Length: Reference material │
│ • Contains: Charts, graphs, metrics data │
│ │
└─────────────────────────────────────────────────────────────┘
Quick Navigation
By Role
Executive Leadership / C-Level
Time Available: 5-10 minutes Priority: Understanding business impact
- Start: TELEMETRY_EXECUTIVE_SUMMARY.md
- Focus: Risk assessment, ROI, timeline
- Reference: Key Statistics (below)
Product Management
Time Available: 30 minutes Priority: User impact, feature decisions
- Start: TELEMETRY_ANALYSIS_REPORT.md (Section 1-3)
- Then: TELEMETRY_TECHNICAL_DEEP_DIVE.md (Section 1-2)
- Reference: TELEMETRY_DATA_FOR_VISUALIZATION.md (charts)
Engineering / DevOps
Time Available: 1-2 hours Priority: Root causes, implementation details
- Start: TELEMETRY_TECHNICAL_DEEP_DIVE.md
- Then: IMPLEMENTATION_ROADMAP.md
- Reference: TELEMETRY_ANALYSIS_REPORT.md (for metrics)
Engineering Leads / Architects
Time Available: 2-3 hours Priority: System design, priority decisions
- Start: TELEMETRY_ANALYSIS_REPORT.md (all sections)
- Then: TELEMETRY_TECHNICAL_DEEP_DIVE.md (all sections)
- Then: IMPLEMENTATION_ROADMAP.md
- Reference: Visualization data for presentations
Customer Support / Success
Time Available: 20 minutes Priority: Common issues, user guidance
- Start: TELEMETRY_EXECUTIVE_SUMMARY.md (Top 5 Issues section)
- Then: TELEMETRY_ANALYSIS_REPORT.md (Section 6: Search Queries)
- Reference: Top error messages list (below)
Marketing / Communications
Time Available: 15 minutes Priority: Messaging, external communications
- Start: TELEMETRY_EXECUTIVE_SUMMARY.md
- Focus: Business impact statement
- Key message: "We're fixing critical issues this week"
Key Statistics Summary
Error Metrics
| Metric | Value | Status |
|---|---|---|
| Total Errors (90 days) | 8,859 | Baseline |
| Daily Average | 60.68 | Stable |
| Peak Day | 276 (Oct 30) | Outlier |
| ValidationError | 3,080 (34.77%) | Largest |
| TypeError | 2,767 (31.23%) | Second |
Tool Performance
| Metric | Value | Status |
|---|---|---|
| Critical Tool: get_node_info | 11.72% failure | Action Required |
| Average Success Rate | 98.4% | Good |
| Highest Risk Tools | 5.5-6.4% failure | Monitor |
Performance
| Metric | Value | Status |
|---|---|---|
| Sequential Updates Latency | 55.2 seconds | Bottleneck |
| Read-After-Write Latency | 96.6 seconds | Bottleneck |
| Search Retry Rate | 17% | High |
User Engagement
| Metric | Value | Status |
|---|---|---|
| Daily Sessions | 895 avg | Healthy |
| Daily Users | 572 avg | Healthy |
| Sessions per User | 1.52 avg | Good |
Top 5 Critical Issues
1. Workflow-Level Validation Failures (39% of errors)
- File: TELEMETRY_ANALYSIS_REPORT.md, Section 2.1
- Detail: TELEMETRY_TECHNICAL_DEEP_DIVE.md, Section 1.1
- Fix: IMPLEMENTATION_ROADMAP.md, Section Phase 1, Issue 1.2
2. get_node_info Unreliability (11.72% failure)
- File: TELEMETRY_ANALYSIS_REPORT.md, Section 3.2
- Detail: TELEMETRY_TECHNICAL_DEEP_DIVE.md, Section 4.1
- Fix: IMPLEMENTATION_ROADMAP.md, Section Phase 1, Issue 1.1
3. Slow Sequential Updates (55+ seconds)
- File: TELEMETRY_ANALYSIS_REPORT.md, Section 4.1
- Detail: TELEMETRY_TECHNICAL_DEEP_DIVE.md, Section 6.1
- Fix: IMPLEMENTATION_ROADMAP.md, Section Phase 1, Issue 1.3
4. Search Inefficiency (17% retry rate)
- File: TELEMETRY_ANALYSIS_REPORT.md, Section 6.1
- Detail: TELEMETRY_TECHNICAL_DEEP_DIVE.md, Section 6.3
- Fix: IMPLEMENTATION_ROADMAP.md, Section Phase 2, Issue 2.2
5. Type-Related Validation Errors (31.23% of errors)
- File: TELEMETRY_ANALYSIS_REPORT.md, Section 1.2
- Detail: TELEMETRY_TECHNICAL_DEEP_DIVE.md, Section 2
- Fix: IMPLEMENTATION_ROADMAP.md, Section Phase 2, Issue 2.3
Implementation Timeline
Week 1 (Immediate)
Expected Impact: 40-50% error reduction
-
Fix
get_node_inforeliability- File: IMPLEMENTATION_ROADMAP.md, Phase 1, Issue 1.1
- Effort: 1 day
-
Improve validation error messages
- File: IMPLEMENTATION_ROADMAP.md, Phase 1, Issue 1.2
- Effort: 2 days
-
Add batch workflow update operation
- File: IMPLEMENTATION_ROADMAP.md, Phase 1, Issue 1.3
- Effort: 2-3 days
Week 2-3 (High Priority)
Expected Impact: +30% additional improvement
-
Implement validation caching
- File: IMPLEMENTATION_ROADMAP.md, Phase 2, Issue 2.1
- Effort: 1-2 days
-
Improve search ranking
- File: IMPLEMENTATION_ROADMAP.md, Phase 2, Issue 2.2
- Effort: 2 days
-
Add TypeScript types for top nodes
- File: IMPLEMENTATION_ROADMAP.md, Phase 2, Issue 2.3
- Effort: 3 days
Week 4 (Optimization)
Expected Impact: +10% additional improvement
-
Return updated state in responses
- File: IMPLEMENTATION_ROADMAP.md, Phase 3, Issue 3.1
- Effort: 1-2 days
-
Add workflow diff generation
- File: IMPLEMENTATION_ROADMAP.md, Phase 3, Issue 3.2
- Effort: 1-2 days
Key Findings by Category
Validation Issues
- Most common error category (96.6% of all errors)
- Workflow-level validation: 39.11% of validation errors
- Generic error messages prevent self-resolution
- See: TELEMETRY_ANALYSIS_REPORT.md, Section 2
Tool Reliability Issues
get_node_infocritical (11.72% failure rate)- Information retrieval tools less reliable than state management tools
- Validation tools consistently underperform (5.5-6.4% failure)
- See: TELEMETRY_ANALYSIS_REPORT.md, Section 3 & TECHNICAL_DEEP_DIVE.md, Section 4
Performance Bottlenecks
- Sequential operations extremely slow (55+ seconds)
- Read-after-write pattern inefficient (96.6 seconds)
- Search refinement rate high (17% need multiple searches)
- See: TELEMETRY_ANALYSIS_REPORT.md, Section 4 & TECHNICAL_DEEP_DIVE.md, Section 6
User Behavior
- Top searches: test (5.8K), webhook (5.1K), http (4.2K)
- Most searches indicate where users struggle
- Session metrics show healthy engagement
- See: TELEMETRY_ANALYSIS_REPORT.md, Section 6
Temporal Patterns
- Error rate volatile with significant spikes
- October incident period with slow recovery
- Currently stabilizing at 60-65 errors/day baseline
- See: TELEMETRY_ANALYSIS_REPORT.md, Section 9 & TECHNICAL_DEEP_DIVE.md, Section 5
Metrics to Track Post-Implementation
Primary Success Metrics
get_node_infofailure rate: 11.72% → <1%- Validation error clarity: Generic → Specific (95% have guidance)
- Update latency: 55.2s → <5s
- Overall error count: 8,859 → <2,000 per quarter
Secondary Metrics
- Tool success rates across board: >99%
- Search retry rate: 17% → <5%
- Workflow validation time: <2 seconds
- User satisfaction: +50% improvement
Dashboard Recommendations
- See: TELEMETRY_DATA_FOR_VISUALIZATION.md, Section 14
- Create live dashboard in Grafana/Datadog
- Update daily; review weekly
SQL Queries Reference
All analysis derived from these core queries:
Error Analysis
-- Error type distribution
SELECT error_type, SUM(error_count) as total_occurrences
FROM telemetry_errors_daily
WHERE date >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY error_type ORDER BY total_occurrences DESC;
-- Temporal trends
SELECT date, SUM(error_count) as daily_errors
FROM telemetry_errors_daily
WHERE date >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY date ORDER BY date DESC;
Tool Performance
-- Tool success rates
SELECT tool_name, SUM(usage_count), SUM(success_count),
ROUND(100.0 * SUM(success_count) / SUM(usage_count), 2) as success_rate
FROM telemetry_tool_usage_daily
WHERE date >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY tool_name
ORDER BY success_rate ASC;
Validation Errors
-- Validation errors by node type
SELECT node_type, error_type, SUM(error_count) as total
FROM telemetry_validation_errors_daily
WHERE date >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY node_type, error_type
ORDER BY total DESC;
Complete query library in: TELEMETRY_ANALYSIS_REPORT.md, Section 12
FAQ
Q: Which document should I read first?
A: TELEMETRY_EXECUTIVE_SUMMARY.md (5 min) to understand the situation
Q: What's the most critical issue?
A: Workflow-level validation failures (39% of errors) with generic error messages that prevent users from self-fixing
Q: How long will fixes take?
A: Week 1: 40-50% improvement; Full implementation: 4-5 weeks
Q: What's the ROI?
A: ~26x return in first year; payback in <2 weeks
Q: Should we implement all recommendations?
A: Phase 1 (Week 1) is mandatory; Phase 2-3 are high-value optimization
Q: How confident are these findings?
A: Very high; based on 506K events across 90 days with consistent patterns
Q: What should support/success team do?
A: Review Section 6 of ANALYSIS_REPORT.md for top user pain points and search patterns
Additional Resources
For Presentations
- Use TELEMETRY_DATA_FOR_VISUALIZATION.md for all chart/graph data
- Recommend audience: TELEMETRY_EXECUTIVE_SUMMARY.md, Section "Stakeholder Questions & Answers"
For Team Meetings
- Stand-up briefing: Key Statistics Summary (above)
- Engineering sync: IMPLEMENTATION_ROADMAP.md
- Product review: TELEMETRY_ANALYSIS_REPORT.md, Sections 1-3
For Documentation
- User-facing docs: TELEMETRY_ANALYSIS_REPORT.md, Section 6 (search queries reveal documentation gaps)
- Error code docs: IMPLEMENTATION_ROADMAP.md, Phase 4
For Monitoring
- KPI dashboard: TELEMETRY_DATA_FOR_VISUALIZATION.md, Section 14
- Alert thresholds: IMPLEMENTATION_ROADMAP.md, success metrics
Contact & Questions
Analysis Prepared By: AI Telemetry Analyst Date: November 8, 2025 Data Freshness: Last updated October 31, 2025 (daily updates) Review Frequency: Weekly recommended
For questions about specific findings, refer to:
- Executive level: TELEMETRY_EXECUTIVE_SUMMARY.md
- Technical details: TELEMETRY_TECHNICAL_DEEP_DIVE.md
- Implementation: IMPLEMENTATION_ROADMAP.md
Document Checklist
Use this checklist to ensure you've reviewed appropriate documents:
Essential Reading (Everyone)
- TELEMETRY_EXECUTIVE_SUMMARY.md (5-10 min)
- Top 5 Issues section above (5 min)
Role-Specific
- Leadership: TELEMETRY_EXECUTIVE_SUMMARY.md (Risk & ROI sections)
- Engineering: TELEMETRY_TECHNICAL_DEEP_DIVE.md (all sections)
- Product: TELEMETRY_ANALYSIS_REPORT.md (Sections 1-3)
- Project Manager: IMPLEMENTATION_ROADMAP.md (Timeline section)
- Support: TELEMETRY_ANALYSIS_REPORT.md (Section 6: Search Queries)
For Implementation
- IMPLEMENTATION_ROADMAP.md (all sections)
- TELEMETRY_TECHNICAL_DEEP_DIVE.md (root cause analysis)
For Presentations
- TELEMETRY_DATA_FOR_VISUALIZATION.md (all chart data)
- TELEMETRY_EXECUTIVE_SUMMARY.md (key statistics)
Version History
| Version | Date | Changes |
|---|---|---|
| 1.0 | Nov 8, 2025 | Initial comprehensive analysis |
Next Steps
- Today: Review TELEMETRY_EXECUTIVE_SUMMARY.md
- Tomorrow: Schedule team review meeting
- This Week: Estimate Phase 1 implementation effort
- Next Week: Begin Phase 1 development
Status: Analysis Complete - Ready for Action
All documents are located in:
/Users/romualdczlonkowski/Pliki/n8n-mcp/n8n-mcp/
Files:
- TELEMETRY_ANALYSIS_INDEX.md (this file)
- TELEMETRY_EXECUTIVE_SUMMARY.md
- TELEMETRY_ANALYSIS_REPORT.md
- TELEMETRY_TECHNICAL_DEEP_DIVE.md
- IMPLEMENTATION_ROADMAP.md
- TELEMETRY_DATA_FOR_VISUALIZATION.md