fix: critical telemetry improvements for data quality and security (#421)

* fix: critical telemetry improvements for data quality and security

Fixed three critical issues in workflow mutation telemetry:

1. Fixed Inconsistent Sanitization (Security Critical)
   - Problem: 30% of workflows unsanitized, exposing credentials/tokens
   - Solution: Use robust WorkflowSanitizer.sanitizeWorkflowRaw()
   - Impact: 100% sanitization with 17 sensitive patterns redacted
   - Files: workflow-sanitizer.ts, mutation-tracker.ts

2. Enabled Validation Data Capture (Data Quality)
   - Problem: Zero validation metrics captured (all NULL)
   - Solution: Add pre/post mutation validation with WorkflowValidator
   - Impact: Measure mutation quality, track error resolution
   - Non-blocking validation that captures errors/warnings
   - Files: handlers-workflow-diff.ts

3. Improved Intent Capture (Data Quality)
   - Problem: 92.62% generic "Partial workflow update" intents
   - Solution: Enhanced docs + automatic intent inference
   - Impact: Meaningful intents auto-generated from operations
   - Files: n8n-update-partial-workflow.ts, handlers-workflow-diff.ts

Expected Results:
- 100% sanitization coverage (up from 70%)
- 100% validation capture (up from 0%)
- 50%+ meaningful intents (up from 7.33%)

Version bumped to 2.22.17

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Conceived by Romuald Członkowski - https://www.aiadvisors.pl/en

Co-Authored-By: Claude <noreply@anthropic.com>

* perf: implement validator instance caching to avoid redundant initialization

- Add module-level cached WorkflowValidator instance
- Create getValidator() helper to reuse validator across mutations
- Update pre/post mutation validation to use cached instance
- Avoids redundant NodeSimilarityService initialization on every mutation

Conceived by Romuald Członkowski - https://www.aiadvisors.pl/en

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: restore backward-compatible sanitization with context preservation

Fixed CI test failures by updating WorkflowSanitizer to use pattern-specific
placeholders while maintaining backward compatibility:

Changes:
- Convert SENSITIVE_PATTERNS to PatternDefinition objects with specific placeholders
- Update sanitizeString() to preserve context (Bearer prefix, URL paths)
- Refactor sanitizeObject() to handle sensitive fields vs URL fields differently
- Remove overly greedy field patterns that conflicted with token patterns

Pattern-specific placeholders:
- [REDACTED_URL_WITH_AUTH] for URLs with credentials
- [REDACTED_TOKEN] for long tokens (32+ chars)
- [REDACTED_APIKEY] for OpenAI-style keys
- Bearer [REDACTED] for Bearer tokens (preserves "Bearer " prefix)
- [REDACTED] for generic sensitive fields

Test Results:
- All 13 mutation-tracker tests passing
- URL with auth: preserves path after credentials
- Long tokens: properly detected and marked
- OpenAI keys: correctly identified
- Bearer tokens: prefix preserved
- Sensitive field names: generic redaction for non-URL fields

Fixes #421 CI failures

Conceived by Romuald Członkowski - https://www.aiadvisors.pl/en

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: prevent double-redaction in workflow sanitizer

Added safeguard to stop pattern matching once a placeholder is detected,
preventing token patterns from matching text inside placeholders like
[REDACTED_URL_WITH_AUTH].

Also expanded database URL pattern to match full URLs including port and
path, and updated test expectations to match context-preserving sanitization.

Fixes:
- Database URLs now properly sanitized to [REDACTED_URL_WITH_AUTH]
- Prevents [[REDACTED]] double-redaction issue
- All 25 workflow-sanitizer tests passing
- No regression in mutation-tracker tests

Conceived by Romuald Członkowski - www.aiadvisors.pl/en

---------

Co-authored-by: Claude <noreply@anthropic.com>
This commit is contained in:
Romuald Członkowski
2025-11-13 22:13:31 +01:00
committed by GitHub
parent 99c5907b71
commit 597bd290b6
11 changed files with 630 additions and 137 deletions

View File

@@ -9,7 +9,8 @@ export const n8nUpdatePartialWorkflowDoc: ToolDocumentation = {
example: 'n8n_update_partial_workflow({id: "wf_123", operations: [{type: "rewireConnection", source: "IF", from: "Old", to: "New", branch: "true"}]})',
performance: 'Fast (50-200ms)',
tips: [
'Include intent parameter in every call - helps to return better responses',
'ALWAYS provide intent parameter describing what you\'re doing (e.g., "Add error handling", "Fix webhook URL", "Connect Slack to error output")',
'DON\'T use generic intent like "update workflow" or "partial update" - be specific about your goal',
'Use rewireConnection to change connection targets',
'Use branch="true"/"false" for IF nodes',
'Use case=N for Switch nodes',
@@ -367,7 +368,7 @@ n8n_update_partial_workflow({
],
performance: 'Very fast - typically 50-200ms. Much faster than full updates as only changes are processed.',
bestPractices: [
'Always include intent parameter - it helps provide better responses',
'Always include intent parameter with specific description (e.g., "Add error handling to HTTP Request node", "Fix authentication flow", "Connect Slack notification to errors"). Avoid generic phrases like "update workflow" or "partial update"',
'Use rewireConnection instead of remove+add for changing targets',
'Use branch="true"/"false" for IF nodes instead of sourceIndex',
'Use case=N for Switch nodes instead of sourceIndex',