Commit Graph

6 Commits

Author SHA1 Message Date
Romuald Członkowski
5575630711 fix: eliminate stack overflow in session removal (#427) (#428)
Critical bug fix for production crashes during session cleanup.

**Root Cause:**
Infinite recursion caused by circular event handler chain:
- removeSession() called transport.close()
- transport.close() triggered onclose event handler
- onclose handler called removeSession() again
- Loop continued until stack overflow

**Solution:**
Delete transport from registry BEFORE closing to break circular reference:
1. Store transport reference
2. Delete from this.transports first
3. Close transport after deletion
4. When onclose fires, transport no longer found, no recursion

**Impact:**
- Eliminates "RangeError: Maximum call stack size exceeded" errors
- Fixes session cleanup crashes every 5 minutes in production
- Prevents potential memory leaks from failed cleanup

**Testing:**
- Added regression test for infinite recursion prevention
- All 39 session management tests pass
- Build and typecheck succeed

Conceived by Romuald Członkowski - https://www.aiadvisors.pl/en

Closes #427
2025-11-18 17:41:17 +01:00
Romuald Członkowski
8d20c64f5c Revert to v2.18.10 - Remove session persistence (v2.19.0-v2.19.5) (#322)
After 5 consecutive hotfix attempts, session persistence has proven
architecturally incompatible with the MCP SDK. Rolling back to last
known stable version.

## Removed
- 16 new files (session types, docs, tests, planning docs)
- 1,100+ lines of session persistence code
- Session restoration hooks and lifecycle events
- Retry policy and warm-start implementations

## Restored
- Stable v2.18.10 codebase
- Library export fields (from PR #310)
- All core MCP functionality

## Breaking Changes
- Session persistence APIs removed
- onSessionNotFound hook removed
- Session lifecycle events removed

This reverts commits fe13091 through 1d34ad8.
Restores commit 4566253 (v2.18.10, PR #310).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-14 10:13:43 +02:00
Romuald Członkowski
318986f546 🚨 HOTFIX v2.19.2: Fix critical session cleanup stack overflow (#316)
* fix: Fix critical session cleanup stack overflow bug (v2.19.2)

This commit fixes a critical P0 bug that caused stack overflow during
container restart, making the service unusable for all users with
session persistence enabled.

Root Causes:
1. Missing await in cleanupExpiredSessions() line 206 caused
   overlapping async cleanup attempts
2. Transport event handlers (onclose, onerror) triggered recursive
   cleanup during shutdown
3. No recursion guard to prevent concurrent cleanup of same session

Fixes Applied:
- Added cleanupInProgress Set recursion guard
- Added isShuttingDown flag to prevent recursive event handlers
- Implemented safeCloseTransport() with timeout protection (3s)
- Updated removeSession() with recursion guard and safe close
- Fixed cleanupExpiredSessions() to properly await with error isolation
- Updated all transport event handlers to check shutdown flag
- Enhanced shutdown() method for proper sequential cleanup

Impact:
- Service now survives container restarts without stack overflow
- No more hanging requests after restart
- Individual session cleanup failures don't cascade
- All 77 session lifecycle tests passing

Version: 2.19.2
Severity: CRITICAL
Priority: P0

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: Bump package.runtime.json to v2.19.2

* test: Fix transport cleanup test to work with safeCloseTransport

The test was manually triggering mockTransport.onclose() to simulate
cleanup, but our stack overflow fix sets transport.onclose = undefined
in safeCloseTransport() before closing.

Updated the test to call removeSession() directly instead of manually
triggering the onclose handler. This properly tests the cleanup behavior
with the new recursion-safe approach.

Changes:
- Call removeSession() directly to test cleanup
- Verify transport.close() is called
- Verify onclose and onerror handlers are cleared
- Verify all session data structures are cleaned up

Test Results: All 115 session tests passing 

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-13 11:54:18 +02:00
czlonkowski
275e573d8d fix: update session validation tests to match relaxed validation behavior
- Updated "should return 400 for empty session ID" test to expect "Mcp-Session-Id header is required"
  instead of "Invalid session ID format" (empty strings are treated as missing headers)
- Updated "should return 404 for non-existent session" test to verify any non-empty string format is accepted
- Updated "should accept any non-empty string as session ID" test to comprehensively test all session ID formats
- All 38 session management tests now pass

This aligns with the relaxed session ID validation introduced in PR #309 for multi-tenant support.
The server now accepts any non-empty string as a session ID to support various MCP clients
(UUIDv4, instance-prefixed, mcp-remote, custom formats).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-11 22:31:07 +02:00
czlonkowski
6264bcff33 fix: resolve TypeScript errors and enhance test script
- Fix TypeScript errors in session management tests
  - Add null checks for sessionInfo.sessions access
  - Use type assertion for delete operator on process.env
  - Ensure proper cleanup of NODE_ENV in tests
- Enhance test-n8n-integration.sh script
  - Add Docker installation check and auto-install for multiple OS
  - Implement n8n API key flow for management tools
  - Fix misleading Bearer token instruction
  - Add colored output for better UX
  - Check for optional jq installation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-01 08:54:24 +02:00
czlonkowski
916825634b test: add comprehensive session management tests to improve patch coverage
- Add 37 test cases covering all session management features
- Test session creation, limits, expiration, and cleanup
- Test security features including production mode validation
- Test transport management and cleanup
- Test new DELETE /mcp endpoint for session termination
- Test enhanced health endpoint with session statistics
- Improve statement coverage from 50.43% to 71.94%
- Improve function coverage from 55.55% to 80.95%

This addresses the codecov patch coverage failure by adding tests
for the ~600 new lines of session management code.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-01 08:40:20 +02:00