Files
n8n-mcp/docs/DOCKER_OPTIMIZATION_PLAN.md
czlonkowski 3ab8fbd60b feat: implement Docker image optimization - reduces size from 2.6GB to ~200MB
- Add optimized database schema with embedded source code storage
- Create optimized rebuild script that extracts source at build time
- Implement optimized MCP server reading from pre-built database
- Add Dockerfile.optimized with multi-stage build process
- Create comprehensive documentation and testing scripts
- Demonstrate 92% size reduction by removing runtime n8n dependencies

The optimization works by:
1. Building complete database at Docker build time
2. Extracting all node source code into the database
3. Creating minimal runtime image without n8n packages
4. Serving everything from pre-built SQLite database

This makes n8n-MCP suitable for resource-constrained production deployments.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-14 10:36:54 +02:00

198 lines
5.9 KiB
Markdown

# Docker Image Optimization Plan
## Current State Analysis
### Problems Identified:
1. **Image Size**: 2.61GB (way too large for an MCP server)
2. **Runtime Dependencies**: Includes entire n8n ecosystem (`n8n`, `n8n-core`, `n8n-workflow`, `@n8n/n8n-nodes-langchain`)
3. **Database Built at Runtime**: `docker-entrypoint.sh` runs `rebuild.js` on container start
4. **Runtime Node Extraction**: Several MCP tools try to extract node source code at runtime
### Root Cause:
The production `node_modules` includes massive n8n packages that are only needed for:
- Extracting node metadata during database build
- Source code extraction (which should be done at build time)
## Optimization Strategy
### Goal:
Reduce Docker image from 2.61GB to ~150-200MB by:
1. Building complete database at Docker build time
2. Including pre-extracted source code in database
3. Removing n8n dependencies from runtime image
## Implementation Plan
### Phase 1: Database Schema Enhancement
Modify `schema.sql` to store source code directly:
```sql
-- Add to nodes table
ALTER TABLE nodes ADD COLUMN node_source_code TEXT;
ALTER TABLE nodes ADD COLUMN credential_source_code TEXT;
ALTER TABLE nodes ADD COLUMN source_extracted_at INTEGER;
```
### Phase 2: Enhance Database Building
#### 2.1 Modify `rebuild.ts`:
- Extract and store node source code during build
- Extract and store credential source code
- Save all data that runtime tools need
#### 2.2 Create `build-time-extractor.ts`:
- Dedicated extractor for build-time use
- Extracts ALL information needed at runtime
- Stores in database for later retrieval
### Phase 3: Refactor Runtime Services
#### 3.1 Update `NodeDocumentationService`:
- Remove dependency on `NodeSourceExtractor` for runtime
- Read source code from database instead of filesystem
- Remove `ensureNodeDataAvailable` dynamic loading
#### 3.2 Modify MCP Tools:
- `get_node_source_code`: Read from database, not filesystem
- `list_available_nodes`: Query database, not scan packages
- `rebuild_documentation_database`: Remove or make it a no-op
### Phase 4: Dockerfile Optimization
```dockerfile
# Build stage - includes all n8n packages
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Database build stage - has n8n packages
FROM builder AS db-builder
WORKDIR /app
# Build complete database with all source code
RUN npm run rebuild
# Runtime stage - minimal dependencies
FROM node:20-alpine AS runtime
WORKDIR /app
# Only runtime dependencies (no n8n packages)
COPY package*.json ./
RUN npm ci --omit=dev --ignore-scripts && \
npm uninstall n8n n8n-core n8n-workflow @n8n/n8n-nodes-langchain && \
npm install @modelcontextprotocol/sdk better-sqlite3 express dotenv sql.js
# Copy built application
COPY --from=builder /app/dist ./dist
# Copy pre-built database
COPY --from=db-builder /app/data/nodes.db ./data/
# Copy minimal required files
COPY src/database/schema.sql ./src/database/
COPY .env.example ./
COPY docker/docker-entrypoint-optimized.sh /usr/local/bin/docker-entrypoint.sh
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
USER nodejs
EXPOSE 3000
HEALTHCHECK CMD curl -f http://localhost:3000/health || exit 1
ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
CMD ["node", "dist/mcp/index.js"]
```
### Phase 5: Runtime Adjustments
#### 5.1 Create `docker-entrypoint-optimized.sh`:
- Remove database building logic
- Only check if database exists
- Simple validation and startup
#### 5.2 Update `package.json`:
- Create separate `dependencies-runtime.json` for Docker
- Move n8n packages to `buildDependencies` section
## File Changes Required
### 1. Database Schema (`src/database/schema.sql`)
- Add source code columns
- Add extraction metadata
### 2. Rebuild Script (`src/scripts/rebuild.ts`)
- Extract and store source code during build
- Store all runtime-needed data
### 3. Node Repository (`src/database/node-repository.ts`)
- Add methods to save/retrieve source code
- Update data structures
### 4. MCP Server (`src/mcp/server.ts`)
- Modify `getNodeSourceCode` to use database
- Update `listAvailableNodes` to query database
- Remove/disable `rebuildDocumentationDatabase`
### 5. Node Documentation Service (`src/services/node-documentation-service.ts`)
- Remove runtime extractors
- Use database for all queries
- Simplify initialization
### 6. Docker Files
- Create optimized Dockerfile
- Create optimized entrypoint script
- Update docker-compose.yml
## Expected Results
### Before:
- Image size: 2.61GB
- Runtime deps: Full n8n ecosystem
- Startup: Slow (builds database)
- Memory: High usage
### After:
- Image size: ~150-200MB
- Runtime deps: Minimal (MCP + SQLite)
- Startup: Fast (pre-built database)
- Memory: Low usage
## Migration Strategy
1. **Keep existing functionality**: Current Docker setup continues to work
2. **Create new optimized version**: `Dockerfile.optimized`
3. **Test thoroughly**: Ensure all MCP tools work with pre-built database
4. **Gradual rollout**: Tag as `n8n-mcp:slim` initially
5. **Documentation**: Update guides for both versions
## Risks and Mitigations
### Risk 1: Dynamic Nodes
- **Issue**: New nodes added after build won't be available
- **Mitigation**: Document rebuild process, consider scheduled rebuilds
### Risk 2: Source Code Extraction
- **Issue**: Source code might be large
- **Mitigation**: Compress source code in database, lazy load if needed
### Risk 3: Compatibility
- **Issue**: Some tools expect runtime n8n access
- **Mitigation**: Careful testing, fallback mechanisms
## Success Metrics
1. ✅ Image size < 300MB
2. Container starts in < 5 seconds
3. All MCP tools functional
4. Memory usage < 100MB idle
5. No runtime dependency on n8n packages
## Implementation Order
1. **Database schema changes** (non-breaking)
2. **Enhanced rebuild script** (backward compatible)
3. **Runtime service refactoring** (feature flagged)
4. **Optimized Dockerfile** (separate file)
5. **Testing and validation**
6. **Documentation updates**
7. **Gradual rollout**