fix: critical memory leak from per-session database connections (#554)

* fix: critical memory leak from per-session database connections (#542)

Each MCP session was creating its own database connection (~900MB),
causing OOM kills every ~20 minutes with 3-4 concurrent sessions.

Changes:
- Add SharedDatabase singleton pattern - all sessions share ONE connection
- Reduce session timeout from 30 min to 5 min (configurable)
- Add eager cleanup for reconnecting instances
- Fix telemetry event listener leak

Memory impact: ~900MB/session → ~68MB shared + ~5MB/session overhead
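
As a rough worked example of the figures above, the sketch below estimates total memory for a given number of concurrent sessions before and after the change. The helper names are hypothetical, and the constants are the approximate values quoted in this message, not measurements.

```typescript
// Approximate figures quoted in this commit message, in MB.
const PER_SESSION_DB_MB = 900;      // before: one connection per session
const SHARED_DB_MB = 68;            // after: one shared connection
const PER_SESSION_OVERHEAD_MB = 5;  // after: per-session overhead

// Hypothetical helper: estimate when every session opens its own connection.
function estimateBeforeMb(sessions: number): number {
  return sessions * PER_SESSION_DB_MB;
}

// Hypothetical helper: estimate when all sessions share one connection.
function estimateAfterMb(sessions: number): number {
  return SHARED_DB_MB + sessions * PER_SESSION_OVERHEAD_MB;
}

for (const sessions of [3, 4]) {
  console.log(
    `${sessions} sessions: ~${estimateBeforeMb(sessions)}MB before, ` +
      `~${estimateAfterMb(sessions)}MB after`
  );
}
```

Under these assumptions, 4 concurrent sessions drop from roughly 3600MB to roughly 88MB.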

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Conceived by Romuald Członkowski - https://www.aiadvisors.pl/en

* fix: resolve test failures from shared database race conditions

- Fix `shutdown()` to release the shared database instead of closing the connection directly
- Add `await this.initialized` in both `close()` and `shutdown()` to prevent
  race condition where cleanup runs while initialization is in progress
- Add double-shutdown protection with `isShutdown` flag
- Export `SharedDatabaseState` type for proper typing
- Include error details in debug logs
- Add MCP server close to `shutdown()` for consistency with `close()`
- Null out `earlyLogger` in `shutdown()` for consistency

The CI test failure "The database connection is not open" was caused by:
1. `shutdown()` directly calling `this.db.close()` which closed the SHARED
   database connection, breaking subsequent tests
2. Race condition where `shutdown()` ran before initialization completed
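
The two causes above can be illustrated with a minimal sketch, assuming a session object that acquires a reference to a shared connection during asynchronous initialization. `SessionServer`, `acquireShared`, `releaseShared`, and the `shared` counter are hypothetical stand-ins, not the project's actual classes; only `initialized`, `isShutdown`, and `shutdown()` are names taken from this message.

```typescript
// Hypothetical stand-in for the shared connection; it only tracks references.
const shared = { refCount: 0, open: false };

function acquireShared(): void {
  shared.open = true;
  shared.refCount += 1;
}

function releaseShared(): void {
  // Drop one reference; the shared connection stays open for other sessions.
  if (shared.refCount > 0) shared.refCount -= 1;
}

class SessionServer {
  private isShutdown = false;
  private acquired = false;
  // Resolves once asynchronous setup has finished.
  private readonly initialized: Promise<void>;

  constructor() {
    this.initialized = this.init();
  }

  private async init(): Promise<void> {
    // Simulate slow asynchronous initialization.
    await new Promise<void>((resolve) => setTimeout(resolve, 10));
    acquireShared();
    this.acquired = true;
  }

  async shutdown(): Promise<void> {
    // Double-shutdown protection.
    if (this.isShutdown) return;
    this.isShutdown = true;

    // Wait for initialization so cleanup never races with setup.
    try {
      await this.initialized;
    } catch {
      // Initialization failed, so there is nothing to release.
    }

    if (this.acquired) {
      releaseShared(); // release our reference instead of closing the connection
      this.acquired = false;
    }
  }
}
```

Without the `await this.initialized` step, a `shutdown()` issued immediately after construction would find nothing to release, and the reference acquired later by `init()` would never be returned.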

Conceived by Romuald Członkowski - www.aiadvisors.pl/en

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test: add unit tests for shared-database module

Add comprehensive unit tests covering:
- getSharedDatabase: initialization, reuse, different path error, concurrent requests
- releaseSharedDatabase: refCount decrement, double-release guard
- closeSharedDatabase: state clearing, error handling, re-initialization
- Helper functions: isSharedDatabaseInitialized, getSharedDatabaseRefCount

21 tests covering the singleton database connection pattern used to prevent
~900MB memory leaks per session.
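
For orientation, here is a minimal synchronous sketch of a reference-counted shared connection using the function names listed above. The signatures, the `FakeConnection` type, the `openConnection` helper, and the shape of `SharedDatabaseState` are assumptions made for illustration; the real module may differ, and this sketch omits handling of concurrent initialization requests.

```typescript
// Hypothetical connection type; the real database adapter is not shown here.
interface FakeConnection {
  path: string;
  closed: boolean;
  close(): void;
}

// Assumed shape; the exported `SharedDatabaseState` may look different.
interface SharedDatabaseState {
  connection: FakeConnection;
  dbPath: string;
  refCount: number;
}

let state: SharedDatabaseState | null = null;

// Hypothetical helper that stands in for opening a real connection.
function openConnection(path: string): FakeConnection {
  const conn: FakeConnection = {
    path,
    closed: false,
    close: () => {
      conn.closed = true;
    },
  };
  return conn;
}

function getSharedDatabase(dbPath: string): SharedDatabaseState {
  if (state) {
    // Reuse the existing connection, but refuse a different path.
    if (state.dbPath !== dbPath) {
      throw new Error(`Shared database already initialized with ${state.dbPath}`);
    }
    state.refCount += 1;
    return state;
  }
  state = { connection: openConnection(dbPath), dbPath, refCount: 1 };
  return state;
}

function releaseSharedDatabase(held: SharedDatabaseState): void {
  // Guard: ignore stale handles and never let the count go below zero.
  if (held !== state || held.refCount <= 0) return;
  held.refCount -= 1;
}

function closeSharedDatabase(): void {
  if (!state) return;
  try {
    state.connection.close();
  } finally {
    state = null; // clear state even if close() throws, so re-initialization works
  }
}

function isSharedDatabaseInitialized(): boolean {
  return state !== null;
}

function getSharedDatabaseRefCount(): number {
  return state?.refCount ?? 0;
}
```

In this sketch, releasing a reference never closes the connection; only `closeSharedDatabase()` does, which is the distinction the earlier `shutdown()` fix depends on.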

Conceived by Romuald Członkowski - www.aiadvisors.pl/en

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Romuald Członkowski
2026-01-23 19:51:22 +01:00
committed by GitHub
parent fad3437977
commit c8c76e435d
9 changed files with 761 additions and 45 deletions


@@ -58,6 +58,13 @@ export class TelemetryBatchProcessor {
   private flushTimes: number[] = [];
   private deadLetterQueue: (TelemetryEvent | WorkflowTelemetry | WorkflowMutationRecord)[] = [];
   private readonly maxDeadLetterSize = 100;
+  // Track event listeners for proper cleanup to prevent memory leaks
+  private eventListeners: {
+    beforeExit?: () => void;
+    sigint?: () => void;
+    sigterm?: () => void;
+  } = {};
+  private started: boolean = false;
   constructor(
     private supabase: SupabaseClient | null,
@@ -72,6 +79,12 @@ export class TelemetryBatchProcessor {
   start(): void {
     if (!this.isEnabled() || !this.supabase) return;
+    // Guard against multiple starts (prevents event listener accumulation)
+    if (this.started) {
+      logger.debug('Telemetry batch processor already started, skipping');
+      return;
+    }
     // Set up periodic flushing
     this.flushTimer = setInterval(() => {
       this.flush();
@@ -83,17 +96,22 @@ export class TelemetryBatchProcessor {
       this.flushTimer.unref();
     }
-    // Set up process exit handlers
-    process.on('beforeExit', () => this.flush());
-    process.on('SIGINT', () => {
+    // Set up process exit handlers with stored references for cleanup
+    this.eventListeners.beforeExit = () => this.flush();
+    this.eventListeners.sigint = () => {
       this.flush();
       process.exit(0);
-    });
-    process.on('SIGTERM', () => {
+    };
+    this.eventListeners.sigterm = () => {
       this.flush();
       process.exit(0);
-    });
+    };
+    process.on('beforeExit', this.eventListeners.beforeExit);
+    process.on('SIGINT', this.eventListeners.sigint);
+    process.on('SIGTERM', this.eventListeners.sigterm);
+    this.started = true;
     logger.debug('Telemetry batch processor started');
   }
@@ -105,6 +123,20 @@ export class TelemetryBatchProcessor {
       clearInterval(this.flushTimer);
       this.flushTimer = undefined;
     }
+    // Remove event listeners to prevent memory leaks
+    if (this.eventListeners.beforeExit) {
+      process.removeListener('beforeExit', this.eventListeners.beforeExit);
+    }
+    if (this.eventListeners.sigint) {
+      process.removeListener('SIGINT', this.eventListeners.sigint);
+    }
+    if (this.eventListeners.sigterm) {
+      process.removeListener('SIGTERM', this.eventListeners.sigterm);
+    }
+    this.eventListeners = {};
+    this.started = false;
     logger.debug('Telemetry batch processor stopped');
   }
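
The reason the handlers are stored before registration is that `removeListener` only removes the exact function reference that was registered, so an inline arrow function can never be removed later. The sketch below shows the same start/stop pattern in isolation. `ListenerOwner` and the `"tick"` event are hypothetical, and a plain `EventEmitter` stands in for `process` so the example is safe to run.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical class mirroring the start/stop pattern from the diff.
class ListenerOwner {
  private listeners: { tick?: () => void } = {};
  private started = false;

  constructor(private readonly emitter: EventEmitter) {}

  start(): void {
    if (this.started) return; // guard against listener accumulation
    this.listeners.tick = () => {
      // A real handler would flush pending work here.
    };
    this.emitter.on("tick", this.listeners.tick);
    this.started = true;
  }

  stop(): void {
    if (this.listeners.tick) {
      // Removal works because this is the same reference passed to on().
      this.emitter.removeListener("tick", this.listeners.tick);
    }
    this.listeners = {};
    this.started = false;
  }
}

const emitter = new EventEmitter();
const owner = new ListenerOwner(emitter);
owner.start();
owner.start(); // second call is a no-op
console.log(emitter.listenerCount("tick")); // 1
owner.stop();
console.log(emitter.listenerCount("tick")); // 0
```

By contrast, `emitter.on("tick", () => {})` followed by `emitter.removeListener("tick", () => {})` leaves the listener in place, because the two arrow functions are different objects.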