# N8N-Fixer Dataset: Telemetry Infrastructure Analysis
**Analysis Completed:** November 12, 2025
**Scope:** N8N-MCP Telemetry Database Schema & Workflow Mutation Tracking
**Status:** Ready for Implementation Planning
---
## Overview
This document synthesizes a comprehensive analysis of the n8n-mcp telemetry infrastructure and provides actionable recommendations for building an n8n-fixer dataset with before/instruction/after workflow snapshots.
**Key Findings:**
- Telemetry system is production-ready with 276K+ events tracked
- Supabase PostgreSQL backend stores all events
- Current system **does NOT capture workflow mutations** (before→after transitions)
- Requires new table + instrumentation to collect fixer dataset
- Implementation is straightforward with 3-4 weeks of development
---
## Documentation Map
### 1. TELEMETRY_ANALYSIS.md (Primary Reference)
**Length:** 720 lines | **Read Time:** 20-30 minutes
**Contains:**
- Complete schema analysis (tables, columns, types)
- All 12 event types with examples
- Current workflow tracking capabilities
- Missing data for mutation tracking
- Recommended schema additions
- Technical implementation details
**Start Here If:** You need the complete picture of current capabilities and gaps
---
### 2. TELEMETRY_MUTATION_SPEC.md (Implementation Blueprint)
**Length:** 918 lines | **Read Time:** 30-40 minutes
**Contains:**
- Detailed SQL schema for `workflow_mutations` table
- Complete TypeScript interfaces and types
- Integration points with existing tools
- Mutation analyzer service specification
- Batch processor extensions
- Query examples for dataset analysis
**Start Here If:** You're ready to implement the mutation tracking system
---
### 3. TELEMETRY_QUICK_REFERENCE.md (Developer Guide)
**Length:** 503 lines | **Read Time:** 10-15 minutes
**Contains:**
- Supabase connection details
- Common queries and patterns
- Performance tips and tricks
- Code file references
- Quick lookup for event types
**Start Here If:** You need to query existing telemetry data or reference specific details
---
### 4. Archived Documents
These documents from November 8 contain additional context:
- `TELEMETRY_ANALYSIS_REPORT.md` - Executive summary with visualizations
- `TELEMETRY_EXECUTIVE_SUMMARY.md` - High-level overview
- `TELEMETRY_TECHNICAL_DEEP_DIVE.md` - Architecture details
- `TELEMETRY_DATA_FOR_VISUALIZATION.md` - Sample data for dashboards
---
## Current State Summary
### Telemetry Backend
```
URL:      https://ydyufsohxdfpopqbubwk.supabase.co
Database: PostgreSQL
Tables:   telemetry_events    (276K rows)
          telemetry_workflows (6.5K rows)
Privacy:  PII sanitization enabled
Scope:    Anonymous tool usage, workflows, errors
```
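For orientation, a minimal sketch of querying these tables with `@supabase/supabase-js`: the URL is the one listed above, while the `SUPABASE_ANON_KEY` environment variable name is an assumption (the key itself is not part of this document).
```typescript
import { createClient } from '@supabase/supabase-js';

// The URL is the one listed above; the anon-key variable name is an assumption.
const supabase = createClient(
  'https://ydyufsohxdfpopqbubwk.supabase.co',
  process.env.SUPABASE_ANON_KEY!
);

// Counts rows in telemetry_events without fetching them (head: true returns only the count).
async function countTelemetryEvents(): Promise<number | null> {
  const { count, error } = await supabase
    .from('telemetry_events')
    .select('*', { count: 'exact', head: true });
  if (error) throw error;
  return count;
}
```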
### Tracked Event Categories
1. **Tool Usage** (40-50%) - Which tools users employ
2. **Tool Sequences** (20-30%) - How tools are chained together
3. **Errors** (10-15%) - Error types and context
4. **Validation** (5-10%) - Configuration validation details
5. **Workflows** (5-10%) - Workflow creation and structure
6. **Performance** (5-10%) - Operation latency
7. **Sessions** (misc) - User session metadata
### What's Missing for N8N-Fixer
```
MISSING: Workflow Mutation Events
- No before workflow capture
- No instruction/transformation storage
- No after workflow snapshot
- No mutation success metrics
- No validation improvement tracking
```
---
## Recommended Implementation Path
### Phase 1: Infrastructure (1-2 weeks)
1. Create `workflow_mutations` table in Supabase
   - See TELEMETRY_MUTATION_SPEC.md Section 2.1 for the full SQL
   - Includes 20+ strategic indexes
   - Supports compression for large workflows
2. Update TypeScript types (a sketch follows this list)
   - New `WorkflowMutation` interface
   - New `WorkflowMutationEvent` event type
   - Mutation analyzer service
3. Add data validators
   - Hash verification
   - Deduplication logic
   - Size validation
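
A minimal sketch of what the `WorkflowMutation` interface from step 2 could look like, assembled from the fields listed under "Key Metrics You'll Collect" below; the authoritative definition lives in TELEMETRY_MUTATION_SPEC.md, so treat the exact names here as assumptions.
```typescript
// Sketch only; the authoritative field list is in TELEMETRY_MUTATION_SPEC.md.
type InstructionType = 'auto_fix' | 'user_provided' | 'validation_correction';

interface WorkflowMutation {
  userId: string;                 // anonymized user identifier
  workflowId: string;
  timestamp: string;              // ISO 8601
  workflowBefore: object;         // full workflow JSON (possibly compressed at rest)
  workflowAfter: object;
  hashBefore: string;             // integrity hash of the serialized "before" state
  hashAfter: string;
  instruction: string;            // sanitized transformation prompt/directive
  instructionType: InstructionType;
  nodesModified: number;
  connectionsModified: number;
  mutationSuccess: boolean;
  validationImproved: boolean;
  validationErrorsFixed: number;
}
```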
---
### Phase 2: Core Integration (1-2 weeks)
1. Extend TelemetryManager (a sketch of the new method follows this list)
   - Add `trackWorkflowMutation()` method
   - Auto-flush mutations to prevent loss
2. Extend EventTracker
   - Add mutation queue
   - Mutation analyzer integration
   - Validation state detection
3. Extend BatchProcessor
   - Flush workflow mutations to Supabase
   - Retry logic and dead-letter queue
   - Performance monitoring
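
A hedged sketch of the `trackWorkflowMutation()` surface from step 1, inferred from the call sites under "Integration Points" further down; the parameter names, options shape, and internal queue are assumptions.
```typescript
// Sketch only: illustrates the non-blocking queueing contract described above.
// Names other than trackWorkflowMutation() are assumptions.
interface MutationOptions {
  instructionType: 'auto_fix' | 'user_provided' | 'validation_correction';
  success?: boolean;
}

class TelemetryManagerSketch {
  private mutationQueue: Array<Record<string, unknown>> = [];

  async trackWorkflowMutation(
    before: object,
    instruction: string,
    after: object,
    options: MutationOptions
  ): Promise<void> {
    try {
      // Queue the mutation; the BatchProcessor drains this queue and flushes
      // it to Supabase asynchronously.
      this.mutationQueue.push({ before, instruction, after, ...options, ts: Date.now() });
    } catch {
      // Silent failure: telemetry must never block or break tool execution.
    }
  }
}
```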
---
### Phase 3: Tool Integration (1 week)
Instrument 3 key tools to capture mutations:
1. **n8n_autofix_workflow**
   - Before: Broken workflow
   - Instruction: "Auto-fix validation errors"
   - After: Fixed workflow
   - Type: `auto_fix`
2. **n8n_update_partial_workflow**
   - Before: Current workflow
   - Instruction: Diff operations
   - After: Updated workflow
   - Type: `user_provided`
3. **Validation Engine** (if applicable)
   - Before: Invalid workflow
   - Instruction: Validation correction
   - After: Valid workflow
   - Type: `validation_correction`
---
### Phase 4: Validation & Analysis (1 week)
1. Data quality verification (a validator sketch follows this list)
   - Hash validation
   - Size checks
   - Deduplication effectiveness
2. Sample query execution
   - Success rate by instruction type
   - Common mutations
   - Complexity impact
3. Dataset assessment
   - Volume estimates
   - Data distribution
   - Quality metrics
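
A minimal sketch of the size and deduplication checks implied by step 1, assuming the 10 MB ceiling mentioned in the privacy section and SHA-256 hashing; the real validator may use different thresholds and hash inputs.
```typescript
import { createHash } from 'node:crypto';

// Sketch only; the 10 MB ceiling comes from the data-safety section below,
// everything else is an assumption.
const MAX_WORKFLOW_BYTES = 10 * 1024 * 1024;

function sha256(json: object): string {
  return createHash('sha256').update(JSON.stringify(json)).digest('hex');
}

function isTrackableMutation(before: object, after: object): boolean {
  const beforeStr = JSON.stringify(before);
  const afterStr = JSON.stringify(after);
  // Size validation: skip workflows above the storage ceiling.
  if (Buffer.byteLength(beforeStr) > MAX_WORKFLOW_BYTES) return false;
  if (Buffer.byteLength(afterStr) > MAX_WORKFLOW_BYTES) return false;
  // Deduplication: skip no-op mutations where nothing actually changed.
  return sha256(before) !== sha256(after);
}
```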
---
## Key Metrics You'll Collect
### Per Mutation Record
- **Identification:** User ID, Workflow ID, Timestamp
- **Before State:** Full workflow JSON, hash, validation status
- **Instruction:** The transformation prompt/directive
- **After State:** Full workflow JSON, hash, validation status
- **Changes:** Nodes modified, properties changed, connections modified
- **Outcome:** Success boolean, validation improvement, errors fixed
### Aggregate Analysis
```sql
-- Success rates by instruction type
SELECT instruction_type,
       COUNT(*) AS count,
       ROUND(100.0 * COUNT(*) FILTER (WHERE mutation_success) / COUNT(*), 2) AS success_rate
FROM workflow_mutations
GROUP BY instruction_type;

-- Validation improvement distribution
SELECT validation_errors_fixed, COUNT(*) AS count
FROM workflow_mutations
WHERE validation_improved = true
GROUP BY 1
ORDER BY 2 DESC;

-- Complexity transitions
SELECT complexity_before, complexity_after, COUNT(*) AS transitions
FROM workflow_mutations
GROUP BY 1, 2;
```
---
## Storage Requirements
### Data Size Estimates
```
Average Before Workflow:       10 KB
Average After Workflow:        10 KB
Average Instruction:           500 B
Indexes & Metadata:            5 KB
Per-Mutation Total:            ~25 KB

Monthly Mutations (estimate):  10K-50K
Monthly Storage:               ~250 MB - 1.25 GB
Annual Storage:                ~3-15 GB
```
### Optimization Strategies
1. **Compression:** Gzip workflows >1 MB (see the sketch after this list)
2. **Deduplication:** Skip identical before/after pairs
3. **Retention:** Define archival policy (90 days? 1 year?)
4. **Indexing:** Materialized views for common queries
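
A sketch of strategy 1 using Node's built-in `zlib`; the 1 MB threshold comes from the list above, while the base64 payload and `compressed` flag are assumptions about how the column might be stored.
```typescript
import { gzipSync, gunzipSync } from 'node:zlib';

const COMPRESSION_THRESHOLD = 1024 * 1024; // 1 MB, per strategy 1 above

// Returns a payload plus a flag so the reader knows whether to gunzip.
function encodeWorkflow(workflow: object): { payload: string; compressed: boolean } {
  const json = JSON.stringify(workflow);
  if (Buffer.byteLength(json) <= COMPRESSION_THRESHOLD) {
    return { payload: json, compressed: false };
  }
  return { payload: gzipSync(json).toString('base64'), compressed: true };
}

function decodeWorkflow(payload: string, compressed: boolean): object {
  const json = compressed
    ? gunzipSync(Buffer.from(payload, 'base64')).toString('utf8')
    : payload;
  return JSON.parse(json);
}
```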
---
## Data Safety & Privacy
### Current Protections
- User IDs are anonymized
- Credentials are stripped from workflows
- Email addresses are masked as `[EMAIL]`
- API keys are masked as `[KEY]`
- URLs are masked as `[URL]` (see the illustrative sketch after this list)
- Error messages are sanitized
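
An illustrative sketch of the masking rules above; the actual intent sanitizer's patterns are not reproduced in this document, so these regexes are assumptions.
```typescript
// Illustrative only: the real sanitizer's patterns may be stricter or broader.
function sanitizeInstruction(text: string): string {
  return text
    // Email addresses -> [EMAIL]
    .replace(/[\w.+-]+@[\w-]+\.[\w.-]+/g, '[EMAIL]')
    // URLs -> [URL]
    .replace(/https?:\/\/\S+/g, '[URL]')
    // Long opaque tokens that look like API keys -> [KEY]
    .replace(/\b[A-Za-z0-9_-]{32,}\b/g, '[KEY]');
}
```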
### For Mutations Table
- Continue PII sanitization
- Hash verification for integrity
- Size limits (10 MB per workflow with compression)
- User consent (telemetry opt-in)
---
## Integration Points
### Where to Add Tracking Calls
```typescript
// In n8n_autofix_workflow
await telemetry.trackWorkflowMutation(
  originalWorkflow,
  'Auto-fix validation errors',
  fixedWorkflow,
  { instructionType: 'auto_fix', success: true }
);

// In n8n_update_partial_workflow
await telemetry.trackWorkflowMutation(
  currentWorkflow,
  formatOperationsAsInstruction(operations),
  updatedWorkflow,
  { instructionType: 'user_provided' }
);
```
### No Breaking Changes
- Fully backward compatible
- Existing telemetry unaffected
- Optional feature (can be disabled if needed)
- Doesn't require a version bump
---
## Success Criteria
### Phase 1 Complete When:
- [ ] `workflow_mutations` table created with all indexes
- [ ] TypeScript types defined and compiling
- [ ] Validators written and tested
- [ ] No schema changes needed (validated against use cases)
### Phase 2 Complete When:
- [ ] TelemetryManager has `trackWorkflowMutation()` method
- [ ] EventTracker queues mutations properly
- [ ] BatchProcessor flushes mutations to Supabase
- [ ] Integration tests pass
### Phase 3 Complete When:
- [ ] 3+ tools instrumented with tracking calls
- [ ] Manual testing shows mutations captured
- [ ] Sample mutations visible in Supabase
- [ ] No performance regression in tools
### Phase 4 Complete When:
- [ ] 100+ mutations collected and validated
- [ ] Sample queries execute correctly
- [ ] Data quality metrics acceptable
- [ ] Dataset ready for ML training
---
## File Structure for Implementation
```
src/telemetry/
├── telemetry-types.ts        (Update: Add WorkflowMutation interface)
├── telemetry-manager.ts      (Update: Add trackWorkflowMutation method)
├── event-tracker.ts          (Update: Add mutation tracking)
├── batch-processor.ts        (Update: Add mutation flushing)
├── mutation-analyzer.ts      (NEW: Analyze workflow diffs)
├── mutation-validator.ts     (NEW: Validate mutation data)
└── index.ts                  (Update: Export new functions)

tests/
└── unit/telemetry/
    ├── mutation-analyzer.test.ts         (NEW)
    ├── mutation-validator.test.ts        (NEW)
    └── telemetry-integration.test.ts     (Update)
```
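
For `mutation-analyzer.ts`, a minimal sketch of deriving node-level change counts from before/after snapshots; it assumes the standard n8n `{ nodes, connections }` shape and node identity by name, which the real analyzer may handle differently.
```typescript
// Sketch only: counts added/removed/modified nodes by comparing snapshots.
interface WorkflowNode {
  name: string;
  type: string;
  parameters?: Record<string, unknown>;
}

interface WorkflowSnapshot {
  nodes: WorkflowNode[];
  connections: Record<string, unknown>;
}

function analyzeNodeChanges(before: WorkflowSnapshot, after: WorkflowSnapshot) {
  const beforeByName = new Map<string, WorkflowNode>();
  for (const node of before.nodes) beforeByName.set(node.name, node);
  const afterByName = new Map<string, WorkflowNode>();
  for (const node of after.nodes) afterByName.set(node.name, node);

  let added = 0;
  let removed = 0;
  let modified = 0;
  for (const name of afterByName.keys()) {
    if (!beforeByName.has(name)) added++;
  }
  for (const [name, node] of beforeByName) {
    const counterpart = afterByName.get(name);
    if (!counterpart) removed++;
    // Crude "modified" check: any difference in the serialized node counts once.
    else if (JSON.stringify(counterpart) !== JSON.stringify(node)) modified++;
  }
  return { nodesAdded: added, nodesRemoved: removed, nodesModified: modified };
}
```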
---
## Risk Assessment
### Low Risk
- No changes to existing event system
- Supabase table addition is non-breaking
- TypeScript types only (no runtime impact)
### Medium Risk
- Large workflows may impact performance if not compressed
- Storage costs if dataset grows faster than estimated
- Mitigation: Compression + retention policy
### High Risk
- None identified if implemented as specified
---
## Next Steps
1. **Review This Analysis**
   - Read TELEMETRY_ANALYSIS.md (main reference)
   - Review TELEMETRY_MUTATION_SPEC.md (implementation guide)
2. **Plan Implementation**
   - Estimate developer hours
   - Assign implementation tasks
   - Create Jira tickets or equivalent
3. **Phase 1: Create Infrastructure**
   - Create the Supabase table
   - Define TypeScript types
   - Write validators
4. **Phase 2: Integrate Core**
   - Extend the telemetry system
   - Write integration tests
5. **Phase 3: Instrument Tools**
   - Add tracking calls to 3+ mutation sources
   - Test end-to-end
6. **Phase 4: Validate**
   - Collect sample data
   - Run analysis queries
   - Begin dataset collection
---
## Questions to Answer Before Starting
1. **Data Retention:** How long should mutations be kept? (90 days? 1 year?)
2. **Storage Budget:** What's acceptable monthly storage cost?
3. **Workflow Size:** What's the max workflow size to store? (with or without compression?)
4. **Dataset Timeline:** When do you need first 1K/10K/100K samples?
5. **Privacy:** Any additional PII to sanitize beyond current approach?
6. **User Consent:** Should mutation tracking be separate opt-in from telemetry?
---
## Useful Commands
### View Current Telemetry Tables
```sql
SELECT table_name FROM information_schema.tables
WHERE table_schema = 'public'
AND table_name LIKE 'telemetry%';
```
### Count Current Events
```sql
SELECT event, COUNT(*) FROM telemetry_events
GROUP BY event ORDER BY 2 DESC;
```
### Check Workflow Deduplication Rate
```sql
SELECT COUNT(*) AS total,
       COUNT(DISTINCT workflow_hash) AS unique_hashes
FROM telemetry_workflows;
```
---
## Document References
All documents are in the n8n-mcp repository root:
| Document | Purpose | Read Time |
|----------|---------|-----------|
| TELEMETRY_ANALYSIS.md | Complete schema & event analysis | 20-30 min |
| TELEMETRY_MUTATION_SPEC.md | Implementation specification | 30-40 min |
| TELEMETRY_QUICK_REFERENCE.md | Developer quick lookup | 10-15 min |
| TELEMETRY_ANALYSIS_REPORT.md | Executive summary (archive) | 15-20 min |
| TELEMETRY_TECHNICAL_DEEP_DIVE.md | Architecture (archive) | 20-25 min |
---
## Summary
The n8n-mcp telemetry infrastructure is mature, privacy-conscious, and well-designed. It currently tracks user interactions effectively but lacks workflow mutation capture needed for the n8n-fixer dataset.
**The solution is straightforward:** Add a single `workflow_mutations` table, extend the tracking system, and instrument 3-4 key tools.
**Implementation effort:** 3-4 weeks for a complete, production-ready system.
**Result:** A high-quality dataset of before/instruction/after workflow transformations suitable for training ML models to fix broken n8n workflows automatically.
---
**Analysis completed by:** Telemetry Data Analyst
**Date:** November 12, 2025
**Status:** Ready for implementation planning
For questions or clarifications, refer to the detailed specifications or raise issues on GitHub.