improve performance

This commit is contained in:
Auto
2026-01-23 14:37:43 +02:00
parent 1be42cc734
commit 874359fcf6
9 changed files with 396 additions and 672 deletions

View File

@@ -172,48 +172,12 @@ Use browser automation tools:
- [ ] Loading states appeared during API calls
- [ ] Error states handle failures gracefully
### STEP 5.6: MOCK DATA DETECTION SWEEP
### STEP 5.6: MOCK DATA DETECTION (Before marking passing)
**Run this sweep AFTER EVERY FEATURE before marking it as passing:**
#### 1. Code Pattern Search
Search the codebase for forbidden patterns:
```bash
# Search for mock data patterns
grep -r "mockData\|fakeData\|sampleData\|dummyData\|testData" --include="*.js" --include="*.ts" --include="*.jsx" --include="*.tsx"
grep -r "// TODO\|// FIXME\|// STUB\|// MOCK" --include="*.js" --include="*.ts" --include="*.jsx" --include="*.tsx"
grep -r "hardcoded\|placeholder" --include="*.js" --include="*.ts" --include="*.jsx" --include="*.tsx"
```
**If ANY matches related to your feature are found - FIX THEM before proceeding.**
#### 2. Runtime Verification
For ANY data displayed in UI:
1. Create NEW data with UNIQUE content (e.g., "TEST_12345_DELETE_ME")
2. Verify that EXACT content appears in the UI
3. Delete the record
4. Verify it's GONE from the UI
5. **If you see data that wasn't created during testing - IT'S MOCK DATA. Fix it.**
#### 3. Database Verification
Check that:
- Database tables contain only data you created during tests
- Counts/statistics match actual database record counts
- No seed data is masquerading as user data
#### 4. API Response Verification
For API endpoints used by this feature:
- Call the endpoint directly
- Verify response contains actual database data
- Empty database = empty response (not pre-populated mock data)
1. **Search code:** `grep -r "mockData\|fakeData\|TODO\|STUB" --include="*.ts" --include="*.tsx"`
2. **Runtime test:** Create unique data (e.g., "TEST_12345") → verify in UI → delete → verify gone
3. **Check database:** All displayed data must come from real DB queries
4. If unexplained data appears, it's mock data - fix before marking passing.
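As a concrete illustration of the runtime check, the sketch below runs the create → verify → delete → verify-gone cycle directly against an API. The `/api/todos` endpoint, the `title` field, and the use of the `requests` library are assumptions for the example; substitute the real endpoint and fields of the feature under test.

```python
# Minimal sketch of the runtime mock-data check, assuming a hypothetical REST
# endpoint at http://localhost:3000/api/todos that returns created records with an "id".
import requests

BASE = "http://localhost:3000/api/todos"
MARKER = "TEST_12345_DELETE_ME"

# 1. Create a record with unique, recognizable content
created = requests.post(BASE, json={"title": MARKER}, timeout=10).json()

# 2. Verify the exact content comes back from the API (and, separately, in the UI)
listing = requests.get(BASE, timeout=10).json()
assert any(item["title"] == MARKER for item in listing), "created record not returned"

# 3. Delete it and verify it is gone; anything left over at this point is mock data
requests.delete(f"{BASE}/{created['id']}", timeout=10)
listing = requests.get(BASE, timeout=10).json()
assert all(item["title"] != MARKER for item in listing), "record still present after delete"
```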
### STEP 6: UPDATE FEATURE STATUS (CAREFULLY!)
@@ -273,51 +237,11 @@ Before context fills up:
---
## TESTING REQUIREMENTS
## BROWSER AUTOMATION
**ALL testing must use browser automation tools.**
Use Playwright MCP tools (`browser_*`) for UI verification. Key tools: `navigate`, `click`, `type`, `fill_form`, `take_screenshot`, `console_messages`, `network_requests`. All tools have auto-wait built in.
Available tools:
**Navigation & Screenshots:**
- browser_navigate - Navigate to a URL
- browser_navigate_back - Go back to previous page
- browser_take_screenshot - Capture screenshot (use for visual verification)
- browser_snapshot - Get accessibility tree snapshot (structured page data)
**Element Interaction:**
- browser_click - Click elements (has built-in auto-wait)
- browser_type - Type text into editable elements
- browser_fill_form - Fill multiple form fields at once
- browser_select_option - Select dropdown options
- browser_hover - Hover over elements
- browser_drag - Drag and drop between elements
- browser_press_key - Press keyboard keys
**Debugging & Monitoring:**
- browser_console_messages - Get browser console output (check for errors)
- browser_network_requests - Monitor API calls and responses
- browser_evaluate - Execute JavaScript (USE SPARINGLY - debugging only, NOT for bypassing UI)
**Browser Management:**
- browser_close - Close the browser
- browser_resize - Resize browser window (use to test mobile: 375x667, tablet: 768x1024, desktop: 1280x720)
- browser_tabs - Manage browser tabs
- browser_wait_for - Wait for text/element/time
- browser_handle_dialog - Handle alert/confirm dialogs
- browser_file_upload - Upload files
**Key Benefits:**
- All interaction tools have **built-in auto-wait** - no manual timeouts needed
- Use `browser_console_messages` to detect JavaScript errors
- Use `browser_network_requests` to verify API calls succeed
Test like a human user with mouse and keyboard. Don't take shortcuts by using JavaScript evaluation.
Test like a human user with mouse and keyboard. Use `browser_console_messages` to detect errors. Don't bypass UI with JavaScript evaluation.
---
@@ -381,26 +305,7 @@ This allows you to fully test email-dependent flows without needing external ema
---
## IMPORTANT REMINDERS
**Your Goal:** Production-quality application with all tests passing
**This Session's Goal:** Complete at least one feature perfectly
**Priority:** Fix broken tests before implementing new features
**Quality Bar:**
- Zero console errors
- Polished UI matching the design specified in app_spec.txt
- All features work end-to-end through the UI
- Fast, responsive, professional
- **NO MOCK DATA - all data from real database**
- **Security enforced - unauthorized access blocked**
- **All navigation works - no 404s or broken links**
**You have unlimited time.** Take as long as needed to get it right. The most important thing is that you
leave the codebase in a clean state before terminating the session (Step 9).
**Remember:** One feature per session. Zero console errors. All data from real database. Leave codebase clean before ending session.
---

View File

@@ -26,82 +26,11 @@ which is the single source of truth for what needs to be built.
**Creating Features:**
Use the feature_create_bulk tool to add all features at once. Note: You MUST include `depends_on_indices`
to specify dependencies. Features with no dependencies can run first and enable parallel execution.
```
Use the feature_create_bulk tool with features=[
{
"category": "functional",
"name": "App loads without errors",
"description": "Application starts and renders homepage",
"steps": [
"Step 1: Navigate to homepage",
"Step 2: Verify no console errors",
"Step 3: Verify main content renders"
]
// No depends_on_indices = FOUNDATION feature (runs first)
},
{
"category": "functional",
"name": "User can create an account",
"description": "Basic user registration functionality",
"steps": [
"Step 1: Navigate to registration page",
"Step 2: Fill in required fields",
"Step 3: Submit form and verify account created"
],
"depends_on_indices": [0] // Depends on app loading
},
{
"category": "functional",
"name": "User can log in",
"description": "Authentication with existing credentials",
"steps": [
"Step 1: Navigate to login page",
"Step 2: Enter credentials",
"Step 3: Verify successful login and redirect"
],
"depends_on_indices": [0, 1] // Depends on app loading AND registration
},
{
"category": "functional",
"name": "User can view dashboard",
"description": "Protected dashboard requires authentication",
"steps": [
"Step 1: Log in as user",
"Step 2: Navigate to dashboard",
"Step 3: Verify personalized content displays"
],
"depends_on_indices": [2] // Depends on login only
},
{
"category": "functional",
"name": "User can update profile",
"description": "User can modify their profile information",
"steps": [
"Step 1: Log in as user",
"Step 2: Navigate to profile settings",
"Step 3: Update and save profile"
],
"depends_on_indices": [2] // ALSO depends on login (WIDE GRAPH - can run parallel with dashboard!)
}
]
```
Use the feature_create_bulk tool to add all features at once. You can create features in batches if there are many (e.g., 50 at a time).
**Notes:**
- IDs and priorities are assigned automatically based on order
- All features start with `passes: false` by default
- You can create features in batches if there are many (e.g., 50 at a time)
- **CRITICAL:** Use `depends_on_indices` to specify dependencies (see FEATURE DEPENDENCIES section below)
**DEPENDENCY REQUIREMENT:**
You MUST specify dependencies using `depends_on_indices` for features that logically depend on others.
- Features 0-9 should have NO dependencies (foundation/setup features)
- Features 10+ MUST have at least some dependencies where logical
- Create WIDE dependency graphs, not linear chains:
- BAD: A -> B -> C -> D -> E (linear chain, only 1 feature can run at a time)
- GOOD: A -> B, A -> C, A -> D, B -> E, C -> E (wide graph, multiple features can run in parallel)
**Requirements for features:**
@@ -114,7 +43,6 @@ You MUST specify dependencies using `depends_on_indices` for features that logic
- Mix of narrow tests (2-5 steps) and comprehensive tests (10+ steps)
- At least 25 tests MUST have 10+ steps each (more for complex apps)
- Order features by priority: fundamental features first (the API assigns priority based on order)
- All features start with `passes: false` automatically
- Cover every feature in the spec exhaustively
- **MUST include tests from ALL 20 mandatory categories below**
@@ -122,125 +50,68 @@ You MUST specify dependencies using `depends_on_indices` for features that logic
## FEATURE DEPENDENCIES (MANDATORY)
**THIS SECTION IS MANDATORY. You MUST specify dependencies for features.**
Dependencies enable **parallel execution** of independent features. When specified correctly, multiple agents can work on unrelated features simultaneously, dramatically speeding up development.
Dependencies enable **parallel execution** of independent features. When you specify dependencies correctly, multiple agents can work on unrelated features simultaneously, dramatically speeding up development.
**Why this matters:** Without dependencies, features execute in random order, causing logical issues (e.g., "Edit user" before "Create user") and preventing efficient parallelization.
**WARNING:** If you do not specify dependencies, ALL features will be ready immediately, which:
1. Overwhelms the parallel agents trying to work on unrelated features
2. Results in features being implemented in random order
3. Causes logical issues (e.g., "Edit user" attempted before "Create user")
### Dependency Rules
You MUST analyze each feature and specify its dependencies using `depends_on_indices`.
1. **Use `depends_on_indices`** (0-based array indices) to reference dependencies
2. **Can only depend on EARLIER features** (index must be less than current position)
3. **No circular dependencies** allowed
4. **Maximum 20 dependencies** per feature
5. **Foundation features (index 0-9)** should have NO dependencies
6. **60% of features after index 10** should have at least one dependency
### Why Dependencies Matter
### Dependency Types
1. **Parallel Execution**: Features without dependencies can run in parallel
2. **Logical Ordering**: Ensures features are built in the right order
3. **Blocking Prevention**: An agent won't start a feature until its dependencies pass
| Type | Example |
|------|---------|
| Data | "Edit item" depends on "Create item" |
| Auth | "View dashboard" depends on "User can log in" |
| Navigation | "Modal close works" depends on "Modal opens" |
| UI | "Filter results" depends on "Display results list" |
### How to Determine Dependencies
### Wide Graph Pattern (REQUIRED)
Ask yourself: "What MUST be working before this feature can be tested?"
Create WIDE dependency graphs, not linear chains:
- **BAD:** A -> B -> C -> D -> E (linear chain, only 1 feature runs at a time)
- **GOOD:** A -> B, A -> C, A -> D, B -> E, C -> E (wide graph, parallel execution)
| Dependency Type | Example |
|-----------------|---------|
| **Data dependencies** | "Edit item" depends on "Create item" |
| **Auth dependencies** | "View dashboard" depends on "User can log in" |
| **Navigation dependencies** | "Modal close works" depends on "Modal opens" |
| **UI dependencies** | "Filter results" depends on "Display results list" |
| **API dependencies** | "Fetch user data" depends on "API authentication" |
### Using `depends_on_indices`
Since feature IDs aren't assigned until after creation, use **array indices** (0-based) to reference dependencies:
```json
{
"features": [
{ "name": "Create account", ... }, // Index 0
{ "name": "Login", "depends_on_indices": [0] }, // Index 1, depends on 0
{ "name": "View profile", "depends_on_indices": [1] }, // Index 2, depends on 1
{ "name": "Edit profile", "depends_on_indices": [2] } // Index 3, depends on 2
]
}
```
### Rules for Dependencies
1. **Can only depend on EARLIER features**: Index must be less than current feature's position
2. **No circular dependencies**: A cannot depend on B if B depends on A
3. **Maximum 20 dependencies** per feature
4. **Foundation features have NO dependencies**: First features in each category typically have none
5. **Don't over-depend**: Only add dependencies that are truly required for testing
### Best Practices
1. **Start with foundation features** (index 0-10): Core setup, basic navigation, authentication
2. **Group related features together**: Keep CRUD operations adjacent
3. **Chain complex flows**: Registration -> Login -> Dashboard -> Settings
4. **Keep dependencies shallow**: Prefer 1-2 dependencies over deep chains
5. **Skip dependencies for independent features**: Visual tests often have no dependencies
### Minimum Dependency Coverage
**REQUIREMENT:** At least 60% of your features (after index 10) should have at least one dependency.
Target structure for a 150-feature project:
- Features 0-9: Foundation (0 dependencies) - App loads, basic setup
- Features 10-149: At least 84 should have dependencies (60% of 140)
This ensures:
- A good mix of parallelizable features (foundation)
- Logical ordering for dependent features
### Example: Todo App Feature Chain (Wide Graph Pattern)
This example shows the CORRECT wide graph pattern where multiple features share the same dependency,
enabling parallel execution:
### Complete Example
```json
[
// FOUNDATION TIER (indices 0-2, no dependencies)
// These run first and enable everything else
// FOUNDATION TIER (indices 0-2, no dependencies) - run first
{ "name": "App loads without errors", "category": "functional" },
{ "name": "Navigation bar displays", "category": "style" },
{ "name": "Homepage renders correctly", "category": "functional" },
// AUTH TIER (indices 3-5, depend on foundation)
// These can all run in parallel once foundation passes
// AUTH TIER (indices 3-5, depend on foundation) - run in parallel
{ "name": "User can register", "depends_on_indices": [0] },
{ "name": "User can login", "depends_on_indices": [0, 3] },
{ "name": "User can logout", "depends_on_indices": [4] },
// CORE CRUD TIER (indices 6-9, depend on auth)
// WIDE GRAPH: All 4 of these depend on login (index 4)
// This means all 4 can start as soon as login passes!
// CORE CRUD TIER (indices 6-9) - WIDE GRAPH: all 4 depend on login
// All 4 start as soon as login passes!
{ "name": "User can create todo", "depends_on_indices": [4] },
{ "name": "User can view todos", "depends_on_indices": [4] },
{ "name": "User can edit todo", "depends_on_indices": [4, 6] },
{ "name": "User can delete todo", "depends_on_indices": [4, 6] },
// ADVANCED TIER (indices 10-11, depend on CRUD)
// Note: filter and search both depend on view (7), not on each other
// ADVANCED TIER (indices 10-11) - both depend on view, not each other
{ "name": "User can filter todos", "depends_on_indices": [7] },
{ "name": "User can search todos", "depends_on_indices": [7] }
]
```
**Parallelism analysis of this example:**
- Foundation tier: 3 features can run in parallel
- Auth tier: 3 features wait for foundation, then can run (mostly parallel)
- CRUD tier: 4 features can start once login passes (all 4 in parallel!)
- Advanced tier: 2 features can run once view passes (both in parallel)
**Result:** With 3 parallel agents, this 12-feature project completes in ~5-6 cycles instead of 12 sequential cycles.
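The cycle estimate can be reproduced by grouping features into dependency levels (a feature's level is one more than the deepest level among its dependencies) and dividing each level across the available agents. A minimal sketch using the 12-feature example above; it assumes strict level-by-level execution, so a real scheduler that overlaps levels may finish slightly faster.

```python
import math

# Dependency lists from the 12-feature example (index -> depends_on_indices).
deps = [[], [], [], [0], [0, 3], [4], [4], [4], [4, 6], [4, 6], [7], [7]]

# A feature's level is one more than the deepest level among its dependencies.
levels: list[int] = []
for d in deps:
    levels.append(0 if not d else 1 + max(levels[i] for i in d))

agents = 3
by_level: dict[int, int] = {}
for lvl in levels:
    by_level[lvl] = by_level.get(lvl, 0) + 1

# Each level must finish before the next starts; within a level, features run in parallel.
cycles = sum(math.ceil(count / agents) for count in by_level.values())
print(by_level, "estimated cycles:", cycles)  # {0: 3, 1: 1, 2: 1, 3: 3, 4: 4} estimated cycles: 6
```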
---
## MANDATORY TEST CATEGORIES
The feature_list.json **MUST** include tests from ALL of these categories. The minimum counts scale by complexity tier.
The feature_list.json **MUST** include tests from ALL 20 categories. Minimum counts scale by complexity tier.
### Category Distribution by Complexity Tier
@@ -270,331 +141,47 @@ The feature_list.json **MUST** include tests from ALL of these categories. The m
---
### A. Security & Access Control Tests
### Category Descriptions
Test that unauthorized access is blocked and permissions are enforced.
**A. Security & Access Control** - Test unauthorized access blocking, permission enforcement, session management, role-based access, and data isolation between users.
**Required tests (examples):**
**B. Navigation Integrity** - Test all buttons, links, menus, breadcrumbs, deep links, back button behavior, 404 handling, and post-login/logout redirects.
- Unauthenticated user cannot access protected routes (redirect to login)
- Regular user cannot access admin-only pages (403 or redirect)
- API endpoints return 401 for unauthenticated requests
- API endpoints return 403 for unauthorized role access
- Session expires after configured inactivity period
- Logout clears all session data and tokens
- Invalid/expired tokens are rejected
- Each role can ONLY see their permitted menu items
- Direct URL access to unauthorized pages is blocked
- Sensitive operations require confirmation or re-authentication
- Cannot access another user's data by manipulating IDs in URL
- Password reset flow works securely
- Failed login attempts are handled (no information leakage)
**C. Real Data Verification** - Test data persistence across refreshes and sessions, CRUD operations with unique test data, related record updates, and empty states.
### B. Navigation Integrity Tests
**D. Workflow Completeness** - Test end-to-end CRUD for every entity, state transitions, multi-step wizards, bulk operations, and form submission feedback.
Test that every button, link, and menu item goes to the correct place.
**E. Error Handling** - Test network failures, invalid input, API errors, 404/500 responses, loading states, timeouts, and user-friendly error messages.
**Required tests (examples):**
**F. UI-Backend Integration** - Test request/response format matching, database-driven dropdowns, cascading updates, filters/sorts with real data, and API error display.
- Every button in sidebar navigates to correct page
- Every menu item links to existing route
- All CRUD action buttons (Edit, Delete, View) go to correct URLs with correct IDs
- Back button works correctly after each navigation
- Deep linking works (direct URL access to any page with auth)
- Breadcrumbs reflect actual navigation path
- 404 page shown for non-existent routes (not crash)
- After login, user redirected to intended destination (or dashboard)
- After logout, user redirected to login page
- Pagination links work and preserve current filters
- Tab navigation within pages works correctly
- Modal close buttons return to previous state
- Cancel buttons on forms return to previous page
**G. State & Persistence** - Test refresh mid-form, session recovery, multi-tab behavior, back-button after submit, and unsaved changes warnings.
### C. Real Data Verification Tests
**H. URL & Direct Access** - Test URL manipulation security, direct route access by role, malformed parameters, deep links to deleted entities, and shareable filter URLs.
Test that data is real (not mocked) and persists correctly.
**I. Double-Action & Idempotency** - Test double-click submit, rapid delete clicks, back-and-resubmit, button disabled during processing, and concurrent submissions.
**Required tests (examples):**
**J. Data Cleanup & Cascade** - Test parent deletion effects on children, removal from search/lists/dropdowns, statistics updates, and soft vs hard delete behavior.
- Create a record via UI with unique content → verify it appears in list
- Create a record → refresh page → record still exists
- Create a record → log out → log in → record still exists
- Edit a record → verify changes persist after refresh
- Delete a record → verify it's gone from list AND database
- Delete a record → verify it's gone from related dropdowns
- Filter/search → results match actual data created in test
- Dashboard statistics reflect real record counts (create 3 items, count shows 3)
- Reports show real aggregated data
- Export functionality exports actual data you created
- Related records update when parent changes
- Timestamps are real and accurate (created_at, updated_at)
- Data created by User A is not visible to User B (unless shared)
- Empty state shows correctly when no data exists
**K. Default & Reset** - Test form defaults, sensible date picker defaults, dropdown placeholders, reset button behavior, and filter/pagination reset on context change.
### D. Workflow Completeness Tests
**L. Search & Filter Edge Cases** - Test empty search, whitespace-only, special characters, quotes, long strings, zero-result combinations, and filter persistence.
Test that every workflow can be completed end-to-end through the UI.
**M. Form Validation** - Test required fields, email/password/numeric/date formats, min/max constraints, uniqueness, specific error messages, and server-side validation.
**Required tests (examples):**
**N. Feedback & Notification** - Test success/error feedback for all actions, loading spinners, disabled buttons during submit, progress indicators, and toast behavior.
- Every entity has working Create operation via UI form
- Every entity has working Read/View operation (detail page loads)
- Every entity has working Update operation (edit form saves)
- Every entity has working Delete operation (with confirmation dialog)
- Every status/state has a UI mechanism to transition to next state
- Multi-step processes (wizards) can be completed end-to-end
- Bulk operations (select all, delete selected) work
- Cancel/Undo operations work where applicable
- Required fields prevent submission when empty
- Form validation shows errors before submission
- Successful submission shows success feedback
- Backend workflow (e.g., user→customer conversion) has UI trigger
**O. Responsive & Layout** - Test layouts at desktop (1920px), tablet (768px), and mobile (375px), no horizontal scroll, touch targets, modal fit, and text overflow.
### E. Error Handling Tests
**P. Accessibility** - Test tab navigation, focus rings, screen reader compatibility, ARIA labels, color contrast, labels on form fields, and error announcements.
Test graceful handling of errors and edge cases.
**Q. Temporal & Timezone** - Test timezone-aware display, accurate timestamps, date picker constraints, overdue detection, and date sorting across boundaries.
**Required tests (examples):**
**R. Concurrency & Race Conditions** - Test concurrent edits, viewing deleted records, pagination during updates, rapid navigation, and late API response handling.
- Network failure shows user-friendly error message, not crash
- Invalid form input shows field-level errors
- API errors display meaningful messages to user
- 404 responses handled gracefully (show not found page)
- 500 responses don't expose stack traces or technical details
- Empty search results show "no results found" message
- Loading states shown during all async operations
- Timeout doesn't hang the UI indefinitely
- Submitting form with server error keeps user data in form
- File upload errors (too large, wrong type) show clear message
- Duplicate entry errors (e.g., email already exists) are clear
**S. Export/Import** - Test full/filtered export, import with valid/duplicate/malformed files, and round-trip data integrity.
### F. UI-Backend Integration Tests
Test that frontend and backend communicate correctly.
**Required tests (examples):**
- Frontend request format matches what backend expects
- Backend response format matches what frontend parses
- All dropdown options come from real database data (not hardcoded)
- Related entity selectors (e.g., "choose category") populated from DB
- Changes in one area reflect in related areas after refresh
- Deleting parent handles children correctly (cascade or block)
- Filters work with actual data attributes from database
- Sort functionality sorts real data correctly
- Pagination returns correct page of real data
- API error responses are parsed and displayed correctly
- Loading spinners appear during API calls
- Optimistic updates (if used) rollback on failure
### G. State & Persistence Tests
Test that state is maintained correctly across sessions and tabs.
**Required tests (examples):**
- Refresh page mid-form - appropriate behavior (data kept or cleared)
- Close browser, reopen - session state handled correctly
- Same user in two browser tabs - changes sync or handled gracefully
- Browser back after form submit - no duplicate submission
- Bookmark a page, return later - works (with auth check)
- LocalStorage/cookies cleared - graceful re-authentication
- Unsaved changes warning when navigating away from dirty form
### H. URL & Direct Access Tests
Test direct URL access and URL manipulation security.
**Required tests (examples):**
- Change entity ID in URL - cannot access others' data
- Access /admin directly as regular user - blocked
- Malformed URL parameters - handled gracefully (no crash)
- Very long URL - handled correctly
- URL with SQL injection attempt - rejected/sanitized
- Deep link to deleted entity - shows "not found", not crash
- Query parameters for filters are reflected in UI
- Sharing a URL with filters preserves those filters
### I. Double-Action & Idempotency Tests
Test that rapid or duplicate actions don't cause issues.
**Required tests (examples):**
- Double-click submit button - only one record created
- Rapid multiple clicks on delete - only one deletion occurs
- Submit form, hit back, submit again - appropriate behavior
- Multiple simultaneous API calls - server handles correctly
- Refresh during save operation - data not corrupted
- Click same navigation link twice quickly - no issues
- Submit button disabled during processing
### J. Data Cleanup & Cascade Tests
Test that deleting data cleans up properly everywhere.
**Required tests (examples):**
- Delete parent entity - children removed from all views
- Delete item - removed from search results immediately
- Delete item - statistics/counts updated immediately
- Delete item - related dropdowns updated
- Delete item - cached views refreshed
- Soft delete (if applicable) - item hidden but recoverable
- Hard delete - item completely removed from database
### K. Default & Reset Tests
Test that defaults and reset functionality work correctly.
**Required tests (examples):**
- New form shows correct default values
- Date pickers default to sensible dates (today, not 1970)
- Dropdowns default to correct option (or placeholder)
- Reset button clears to defaults, not just empty
- Clear filters button resets all filters to default
- Pagination resets to page 1 when filters change
- Sorting resets when changing views
### L. Search & Filter Edge Cases
Test search and filter functionality thoroughly.
**Required tests (examples):**
- Empty search shows all results (or appropriate message)
- Search with only spaces - handled correctly
- Search with special characters (!@#$%^&\*) - no errors
- Search with quotes - handled correctly
- Search with very long string - handled correctly
- Filter combinations that return zero results - shows message
- Filter + search + sort together - all work correctly
- Filter persists after viewing detail and returning to list
- Clear individual filter - works correctly
- Search is case-insensitive (or clearly case-sensitive)
### M. Form Validation Tests
Test all form validation rules exhaustively.
**Required tests (examples):**
- Required field empty - shows error, blocks submit
- Email field with invalid email formats - shows error
- Password field - enforces complexity requirements
- Numeric field with letters - rejected
- Date field with invalid date - rejected
- Min/max length enforced on text fields
- Min/max values enforced on numeric fields
- Duplicate unique values rejected (e.g., duplicate email)
- Error messages are specific (not just "invalid")
- Errors clear when user fixes the issue
- Server-side validation matches client-side
- Whitespace-only input rejected for required fields
### N. Feedback & Notification Tests
Test that users get appropriate feedback for all actions.
**Required tests (examples):**
- Every successful save/create shows success feedback
- Every failed action shows error feedback
- Loading spinner during every async operation
- Disabled state on buttons during form submission
- Progress indicator for long operations (file upload)
- Toast/notification disappears after appropriate time
- Multiple notifications don't overlap incorrectly
- Success messages are specific (not just "Success")
### O. Responsive & Layout Tests
Test that the UI works on different screen sizes.
**Required tests (examples):**
- Desktop layout correct at 1920px width
- Tablet layout correct at 768px width
- Mobile layout correct at 375px width
- No horizontal scroll on any standard viewport
- Touch targets large enough on mobile (44px min)
- Modals fit within viewport on mobile
- Long text truncates or wraps correctly (no overflow)
- Tables scroll horizontally if needed on mobile
- Navigation collapses appropriately on mobile
### P. Accessibility Tests
Test basic accessibility compliance.
**Required tests (examples):**
- Tab navigation works through all interactive elements
- Focus ring visible on all focused elements
- Screen reader can navigate main content areas
- ARIA labels on icon-only buttons
- Color contrast meets WCAG AA (4.5:1 for text)
- No information conveyed by color alone
- Form fields have associated labels
- Error messages announced to screen readers
- Skip link to main content (if applicable)
- Images have alt text
### Q. Temporal & Timezone Tests
Test date/time handling.
**Required tests (examples):**
- Dates display in user's local timezone
- Created/updated timestamps accurate and formatted correctly
- Date picker allows only valid date ranges
- Overdue items identified correctly (timezone-aware)
- "Today", "This Week" filters work correctly for user's timezone
- Recurring items generate at correct times (if applicable)
- Date sorting works correctly across months/years
### R. Concurrency & Race Condition Tests
Test multi-user and race condition scenarios.
**Required tests (examples):**
- Two users edit same record - last save wins or conflict shown
- Record deleted while another user viewing - graceful handling
- List updates while user on page 2 - pagination still works
- Rapid navigation between pages - no stale data displayed
- API response arrives after user navigated away - no crash
- Concurrent form submissions from same user handled
### S. Export/Import Tests (if applicable)
Test data export and import functionality.
**Required tests (examples):**
- Export all data - file contains all records
- Export filtered data - only filtered records included
- Import valid file - all records created correctly
- Import duplicate data - handled correctly (skip/update/error)
- Import malformed file - error message, no partial import
- Export then import - data integrity preserved exactly
### T. Performance Tests
Test basic performance requirements.
**Required tests (examples):**
- Page loads in <3s with 100 records
- Page loads in <5s with 1000 records
- Search responds in <1s
- Infinite scroll doesn't degrade with many items
- Large file upload shows progress
- Memory doesn't leak on long sessions
- No console errors during normal operation
**T. Performance** - Test page load with 100/1000 records, search response time, infinite scroll stability, upload progress, and memory/console errors.
---

View File

@@ -21,6 +21,7 @@ from sqlalchemy import (
Column,
DateTime,
ForeignKey,
Index,
Integer,
String,
Text,
@@ -39,6 +40,12 @@ class Feature(Base):
__tablename__ = "features"
# Composite index for common status query pattern (passes, in_progress)
# Used by feature_get_stats, get_ready_features, and other status queries
__table_args__ = (
Index('ix_feature_status', 'passes', 'in_progress'),
)
id = Column(Integer, primary_key=True, index=True)
priority = Column(Integer, nullable=False, default=999, index=True)
category = Column(String(100), nullable=False)
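For reference, the status queries this composite index targets look roughly like the sketch below. It assumes an open SQLAlchemy `session` and the `Feature` model above; whether SQLite actually picks the index depends on its query planner and table statistics.

```python
from sqlalchemy import case, func

# feature_get_stats-style aggregate: only (passes, in_progress) values are needed,
# which is exactly what ix_feature_status covers.
total, passing, in_progress = session.query(
    func.count(Feature.id),
    func.sum(case((Feature.passes == True, 1), else_=0)),       # noqa: E712
    func.sum(case((Feature.in_progress == True, 1), else_=0)),  # noqa: E712
).first()

# get_ready_features-style filter: candidates that are neither passing nor claimed.
candidates = (
    session.query(Feature)
    .filter(Feature.passes == False, Feature.in_progress == False)  # noqa: E712
    .order_by(Feature.priority)
    .all()
)
```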

View File

@@ -6,6 +6,7 @@ Provides dependency resolution using Kahn's algorithm for topological sorting.
Includes cycle detection, validation, and helper functions for dependency management.
"""
import heapq
from typing import TypedDict
# Security: Prevent DoS via excessive dependencies
@@ -55,19 +56,27 @@ def resolve_dependencies(features: list[dict]) -> DependencyResult:
if not dep.get("passes"):
blocked.setdefault(feature["id"], []).append(dep_id)
# Kahn's algorithm with priority-aware selection
queue = [f for f in features if in_degree[f["id"]] == 0]
queue.sort(key=lambda f: (f.get("priority", 999), f["id"]))
# Kahn's algorithm with priority-aware selection using a heap
# Heap entries are tuples: (priority, id, feature_dict) for stable ordering
heap = [
(f.get("priority", 999), f["id"], f)
for f in features
if in_degree[f["id"]] == 0
]
heapq.heapify(heap)
ordered: list[dict] = []
while queue:
current = queue.pop(0)
while heap:
_, _, current = heapq.heappop(heap)
ordered.append(current)
for dependent_id in adjacency[current["id"]]:
in_degree[dependent_id] -= 1
if in_degree[dependent_id] == 0:
queue.append(feature_map[dependent_id])
queue.sort(key=lambda f: (f.get("priority", 999), f["id"]))
dep_feature = feature_map[dependent_id]
heapq.heappush(
heap,
(dep_feature.get("priority", 999), dependent_id, dep_feature)
)
# Detect cycles (features not in ordered = part of cycle)
cycles: list[list[int]] = []
@@ -84,12 +93,19 @@ def resolve_dependencies(features: list[dict]) -> DependencyResult:
}
def are_dependencies_satisfied(feature: dict, all_features: list[dict]) -> bool:
def are_dependencies_satisfied(
feature: dict,
all_features: list[dict],
passing_ids: set[int] | None = None,
) -> bool:
"""Check if all dependencies have passes=True.
Args:
feature: Feature dict to check
all_features: List of all feature dicts
passing_ids: Optional pre-computed set of passing feature IDs.
If None, will be computed from all_features. Pass this when
calling in a loop to avoid O(n^2) complexity.
Returns:
True if all dependencies are satisfied (or no dependencies)
@@ -97,22 +113,31 @@ def are_dependencies_satisfied(feature: dict, all_features: list[dict]) -> bool:
deps = feature.get("dependencies") or []
if not deps:
return True
passing_ids = {f["id"] for f in all_features if f.get("passes")}
if passing_ids is None:
passing_ids = {f["id"] for f in all_features if f.get("passes")}
return all(dep_id in passing_ids for dep_id in deps)
def get_blocking_dependencies(feature: dict, all_features: list[dict]) -> list[int]:
def get_blocking_dependencies(
feature: dict,
all_features: list[dict],
passing_ids: set[int] | None = None,
) -> list[int]:
"""Get list of incomplete dependency IDs.
Args:
feature: Feature dict to check
all_features: List of all feature dicts
passing_ids: Optional pre-computed set of passing feature IDs.
If None, will be computed from all_features. Pass this when
calling in a loop to avoid O(n^2) complexity.
Returns:
List of feature IDs that are blocking this feature
"""
deps = feature.get("dependencies") or []
passing_ids = {f["id"] for f in all_features if f.get("passes")}
if passing_ids is None:
passing_ids = {f["id"] for f in all_features if f.get("passes")}
return [dep_id for dep_id in deps if dep_id not in passing_ids]
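A short sketch of the intended call pattern for the new `passing_ids` parameter: compute the passing set once, then reuse it for every feature checked in a loop. The feature dicts here are minimal stand-ins with only the keys these helpers read.

```python
# Without the pre-computed set, each call rebuilds it and a loop over all
# features degrades to O(n^2).
features = [
    {"id": 1, "passes": True, "dependencies": []},
    {"id": 2, "passes": False, "dependencies": [1]},
    {"id": 3, "passes": False, "dependencies": [2]},
]

passing_ids = {f["id"] for f in features if f.get("passes")}

ready = [
    f for f in features
    if not f.get("passes") and are_dependencies_satisfied(f, features, passing_ids)
]
blockers = {f["id"]: get_blocking_dependencies(f, features, passing_ids) for f in features}
# ready contains only feature 2; blockers[3] == [2]
```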

View File

@@ -12,7 +12,7 @@ import sys
from pathlib import Path
from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
from claude_agent_sdk.types import HookMatcher
from claude_agent_sdk.types import HookContext, HookInput, HookMatcher, SyncHookJSONOutput
from dotenv import load_dotenv
from security import bash_security_hook
@@ -55,7 +55,9 @@ FEATURE_MCP_TOOLS = [
# Core feature operations
"mcp__features__feature_get_stats",
"mcp__features__feature_get_by_id", # Get assigned feature details
"mcp__features__feature_get_summary", # Lightweight: id, name, status, deps only
"mcp__features__feature_mark_in_progress",
"mcp__features__feature_claim_and_get", # Atomic claim + get details
"mcp__features__feature_mark_passing",
"mcp__features__feature_mark_failing", # Mark regression detected
"mcp__features__feature_skip",
@@ -268,6 +270,45 @@ def create_client(
context["project_dir"] = str(project_dir.resolve())
return await bash_security_hook(input_data, tool_use_id, context)
# PreCompact hook for logging and customizing context compaction
# Compaction is handled automatically by Claude Code CLI when context approaches limits.
# This hook allows us to log when compaction occurs and optionally provide custom instructions.
async def pre_compact_hook(
input_data: HookInput,
tool_use_id: str | None,
context: HookContext,
) -> SyncHookJSONOutput:
"""
Hook called before context compaction occurs.
Compaction triggers:
- "auto": Automatic compaction when context approaches token limits
- "manual": User-initiated compaction via /compact command
The hook can customize compaction via hookSpecificOutput:
- customInstructions: String with focus areas for summarization
"""
trigger = input_data.get("trigger", "auto")
custom_instructions = input_data.get("custom_instructions")
if trigger == "auto":
print("[Context] Auto-compaction triggered (context approaching limit)")
else:
print("[Context] Manual compaction requested")
if custom_instructions:
print(f"[Context] Custom instructions: {custom_instructions}")
# Return empty dict to allow compaction to proceed with default behavior
# To customize, return:
# {
# "hookSpecificOutput": {
# "hookEventName": "PreCompact",
# "customInstructions": "Focus on preserving file paths and test results"
# }
# }
return SyncHookJSONOutput()
return ClaudeSDKClient(
options=ClaudeAgentOptions(
model=model,
@@ -281,10 +322,35 @@ def create_client(
"PreToolUse": [
HookMatcher(matcher="Bash", hooks=[bash_hook_with_context]),
],
# PreCompact hook for context management during long sessions.
# Compaction is automatic when context approaches token limits.
# This hook logs compaction events and can customize summarization.
"PreCompact": [
HookMatcher(hooks=[pre_compact_hook]),
],
},
max_turns=1000,
cwd=str(project_dir.resolve()),
settings=str(settings_file.resolve()), # Use absolute path
env=sdk_env, # Pass API configuration overrides to CLI subprocess
# Enable extended context beta for better handling of long sessions.
# This provides up to 1M tokens of context with automatic compaction.
# See: https://docs.anthropic.com/en/api/beta-headers
betas=["context-1m-2025-08-07"],
# Note on context management:
# The Claude Agent SDK handles context management automatically through the
# underlying Claude Code CLI. When context approaches limits, the CLI
# automatically compacts/summarizes previous messages.
#
# The SDK does NOT expose explicit compaction_control or context_management
# parameters. Instead, context is managed via:
# 1. betas=["context-1m-2025-08-07"] - Extended context window
# 2. PreCompact hook - Intercept and customize compaction behavior
# 3. max_turns - Limit conversation turns (set to 1000 for long sessions)
#
# Future SDK versions may add explicit compaction controls. When available,
# consider adding:
# - compaction_control={"enabled": True, "context_token_threshold": 80000}
# - context_management={"edits": [...]} for tool use clearing
)
)
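If compaction should be steered rather than just logged, a variant of the hook can return the `hookSpecificOutput` shape shown in the comment above. This is a sketch against the same `HookInput`/`HookContext`/`SyncHookJSONOutput` types; the exact shape accepted may vary between SDK versions.

```python
async def pre_compact_hook_with_focus(
    input_data: HookInput,
    tool_use_id: str | None,
    context: HookContext,
) -> SyncHookJSONOutput:
    """Like pre_compact_hook, but asks the summarizer to keep build-critical state."""
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreCompact",
            # Free-form guidance for the compaction summary.
            "customInstructions": "Preserve file paths, the assigned feature ID, and the latest test results.",
        }
    }
```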

View File

@@ -8,10 +8,12 @@ Provides tools to manage features in the autonomous coding system.
Tools:
- feature_get_stats: Get progress statistics
- feature_get_by_id: Get a specific feature by ID
- feature_get_summary: Get minimal feature info (id, name, status, deps)
- feature_mark_passing: Mark a feature as passing
- feature_mark_failing: Mark a feature as failing (regression detected)
- feature_skip: Skip a feature (move to end of queue)
- feature_mark_in_progress: Mark a feature as in-progress
- feature_claim_and_get: Atomically claim and get feature details
- feature_clear_in_progress: Clear in-progress status
- feature_release_testing: Release testing lock on a feature
- feature_create_bulk: Create multiple features at once
@@ -19,7 +21,7 @@ Tools:
- feature_add_dependency: Add a dependency between features
- feature_remove_dependency: Remove a dependency
- feature_get_ready: Get features ready to implement
- feature_get_blocked: Get features blocked by dependencies
- feature_get_blocked: Get features blocked by dependencies (with limit)
- feature_get_graph: Get the dependency graph
Note: Feature selection (which feature to work on) is handled by the
@@ -142,11 +144,20 @@ def feature_get_stats() -> str:
Returns:
JSON with: passing (int), in_progress (int), total (int), percentage (float)
"""
from sqlalchemy import case, func
session = get_session()
try:
total = session.query(Feature).count()
passing = session.query(Feature).filter(Feature.passes == True).count()
in_progress = session.query(Feature).filter(Feature.in_progress == True).count()
# Single aggregate query instead of 3 separate COUNT queries
result = session.query(
func.count(Feature.id).label('total'),
func.sum(case((Feature.passes == True, 1), else_=0)).label('passing'),
func.sum(case((Feature.in_progress == True, 1), else_=0)).label('in_progress')
).first()
total = result.total or 0
passing = int(result.passing or 0)
in_progress = int(result.in_progress or 0)
percentage = round((passing / total) * 100, 1) if total > 0 else 0.0
return json.dumps({
@@ -154,7 +165,7 @@ def feature_get_stats() -> str:
"in_progress": in_progress,
"total": total,
"percentage": percentage
}, indent=2)
})
finally:
session.close()
@@ -181,7 +192,38 @@ def feature_get_by_id(
if feature is None:
return json.dumps({"error": f"Feature with ID {feature_id} not found"})
return json.dumps(feature.to_dict(), indent=2)
return json.dumps(feature.to_dict())
finally:
session.close()
@mcp.tool()
def feature_get_summary(
feature_id: Annotated[int, Field(description="The ID of the feature", ge=1)]
) -> str:
"""Get minimal feature info: id, name, status, and dependencies only.
Use this instead of feature_get_by_id when you only need status info,
not the full description and steps. This reduces response size significantly.
Args:
feature_id: The ID of the feature to retrieve
Returns:
JSON with: id, name, passes, in_progress, dependencies
"""
session = get_session()
try:
feature = session.query(Feature).filter(Feature.id == feature_id).first()
if feature is None:
return json.dumps({"error": f"Feature with ID {feature_id} not found"})
return json.dumps({
"id": feature.id,
"name": feature.name,
"passes": feature.passes,
"in_progress": feature.in_progress,
"dependencies": feature.dependencies or []
})
finally:
session.close()
@@ -229,7 +271,7 @@ def feature_release_testing(
return json.dumps({
"message": f"Feature #{feature_id} testing {status}",
"feature": feature.to_dict()
}, indent=2)
})
except Exception as e:
session.rollback()
return json.dumps({"error": f"Failed to release testing claim: {str(e)}"})
@@ -250,7 +292,7 @@ def feature_mark_passing(
feature_id: The ID of the feature to mark as passing
Returns:
JSON with the updated feature details, or error if not found.
JSON with success confirmation: {success, feature_id, name}
"""
session = get_session()
try:
@@ -262,9 +304,8 @@ def feature_mark_passing(
feature.passes = True
feature.in_progress = False
session.commit()
session.refresh(feature)
return json.dumps(feature.to_dict(), indent=2)
return json.dumps({"success": True, "feature_id": feature_id, "name": feature.name})
except Exception as e:
session.rollback()
return json.dumps({"error": f"Failed to mark feature passing: {str(e)}"})
@@ -309,7 +350,7 @@ def feature_mark_failing(
return json.dumps({
"message": f"Feature #{feature_id} marked as failing - regression detected",
"feature": feature.to_dict()
}, indent=2)
})
except Exception as e:
session.rollback()
return json.dumps({"error": f"Failed to mark feature failing: {str(e)}"})
@@ -368,7 +409,7 @@ def feature_skip(
"old_priority": old_priority,
"new_priority": new_priority,
"message": f"Feature '{feature.name}' moved to end of queue"
}, indent=2)
})
except Exception as e:
session.rollback()
return json.dumps({"error": f"Failed to skip feature: {str(e)}"})
@@ -408,7 +449,7 @@ def feature_mark_in_progress(
session.commit()
session.refresh(feature)
return json.dumps(feature.to_dict(), indent=2)
return json.dumps(feature.to_dict())
except Exception as e:
session.rollback()
return json.dumps({"error": f"Failed to mark feature in-progress: {str(e)}"})
@@ -416,6 +457,48 @@ def feature_mark_in_progress(
session.close()
@mcp.tool()
def feature_claim_and_get(
feature_id: Annotated[int, Field(description="The ID of the feature to claim", ge=1)]
) -> str:
"""Atomically claim a feature (mark in-progress) and return its full details.
Combines feature_mark_in_progress + feature_get_by_id into a single operation.
If already in-progress, still returns the feature details (idempotent).
Args:
feature_id: The ID of the feature to claim and retrieve
Returns:
JSON with feature details including claimed status, or error if not found.
"""
session = get_session()
try:
feature = session.query(Feature).filter(Feature.id == feature_id).first()
if feature is None:
return json.dumps({"error": f"Feature with ID {feature_id} not found"})
if feature.passes:
return json.dumps({"error": f"Feature with ID {feature_id} is already passing"})
# Idempotent: if already in-progress, just return details
already_claimed = feature.in_progress
if not already_claimed:
feature.in_progress = True
session.commit()
session.refresh(feature)
result = feature.to_dict()
result["already_claimed"] = already_claimed
return json.dumps(result)
except Exception as e:
session.rollback()
return json.dumps({"error": f"Failed to claim feature: {str(e)}"})
finally:
session.close()
@mcp.tool()
def feature_clear_in_progress(
feature_id: Annotated[int, Field(description="The ID of the feature to clear in-progress status", ge=1)]
@@ -442,7 +525,7 @@ def feature_clear_in_progress(
session.commit()
session.refresh(feature)
return json.dumps(feature.to_dict(), indent=2)
return json.dumps(feature.to_dict())
except Exception as e:
session.rollback()
return json.dumps({"error": f"Failed to clear in-progress status: {str(e)}"})
@@ -549,7 +632,7 @@ def feature_create_bulk(
return json.dumps({
"created": len(created_features),
"with_dependencies": deps_count
}, indent=2)
})
except Exception as e:
session.rollback()
return json.dumps({"error": str(e)})
@@ -604,7 +687,7 @@ def feature_create(
"success": True,
"message": f"Created feature: {name}",
"feature": db_feature.to_dict()
}, indent=2)
})
except Exception as e:
session.rollback()
return json.dumps({"error": str(e)})
@@ -754,20 +837,25 @@ def feature_get_ready(
"features": ready[:limit],
"count": len(ready[:limit]),
"total_ready": len(ready)
}, indent=2)
})
finally:
session.close()
@mcp.tool()
def feature_get_blocked() -> str:
"""Get all features that are blocked by unmet dependencies.
def feature_get_blocked(
limit: Annotated[int, Field(default=20, ge=1, le=100, description="Max features to return")] = 20
) -> str:
"""Get features that are blocked by unmet dependencies.
Returns features that have dependencies which are not yet passing.
Each feature includes a 'blocked_by' field listing the blocking feature IDs.
Args:
limit: Maximum number of features to return (1-100, default 20)
Returns:
JSON with: features (list with blocked_by field), count (int)
JSON with: features (list with blocked_by field), count (int), total_blocked (int)
"""
session = get_session()
try:
@@ -787,9 +875,10 @@ def feature_get_blocked() -> str:
})
return json.dumps({
"features": blocked,
"count": len(blocked)
}, indent=2)
"features": blocked[:limit],
"count": len(blocked[:limit]),
"total_blocked": len(blocked)
})
finally:
session.close()
@@ -840,7 +929,7 @@ def feature_get_graph() -> str:
return json.dumps({
"nodes": nodes,
"edges": edges
}, indent=2)
})
finally:
session.close()
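For comparison with the two-step mark-in-progress + get-by-id flow, here is a sketch of how an agent might handle a `feature_claim_and_get` response. The sample JSON is illustrative only; a real response is the claimed feature's full `to_dict()` output plus the `already_claimed` flag.

```python
import json

# Illustrative response from the mcp__features__feature_claim_and_get tool.
raw = '{"id": 7, "name": "User can log in", "passes": false, "in_progress": true, "already_claimed": false}'

claimed = json.loads(raw)
if "error" in claimed:
    print(claimed["error"])  # not found, or already passing
elif claimed["already_claimed"]:
    # Another agent or a previous session claimed it first; details are still returned.
    print(f"Resuming feature #{claimed['id']}: {claimed['name']}")
else:
    print(f"Claimed feature #{claimed['id']}: {claimed['name']}")
```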

View File

@@ -186,6 +186,12 @@ class ParallelOrchestrator:
# Session tracking for logging/debugging
self.session_start_time: datetime = None
# Event signaled when any agent completes, allowing the main loop to wake
# immediately instead of waiting for the full POLL_INTERVAL timeout.
# This reduces latency when spawning the next feature after completion.
self._agent_completed_event: asyncio.Event = None # Created in run_loop
self._event_loop: asyncio.AbstractEventLoop = None # Stored for thread-safe signaling
# Database session for this orchestrator
self._engine, self._session_maker = create_database(project_dir)
@@ -311,6 +317,9 @@ class ParallelOrchestrator:
all_features = session.query(Feature).all()
all_dicts = [f.to_dict() for f in all_features]
# Pre-compute passing_ids once to avoid O(n^2) in the loop
passing_ids = {f.id for f in all_features if f.passes}
ready = []
skipped_reasons = {"passes": 0, "in_progress": 0, "running": 0, "failed": 0, "deps": 0}
for f in all_features:
@@ -329,8 +338,8 @@ class ParallelOrchestrator:
if self._failure_counts.get(f.id, 0) >= MAX_FEATURE_RETRIES:
skipped_reasons["failed"] += 1
continue
# Check dependencies
if are_dependencies_satisfied(f.to_dict(), all_dicts):
# Check dependencies (pass pre-computed passing_ids)
if are_dependencies_satisfied(f.to_dict(), all_dicts, passing_ids):
ready.append(f.to_dict())
else:
skipped_reasons["deps"] += 1
@@ -794,6 +803,52 @@ class ParallelOrchestrator:
finally:
self._on_agent_complete(feature_id, proc.returncode, agent_type, proc)
def _signal_agent_completed(self):
"""Signal that an agent has completed, waking the main loop.
This method is safe to call from any thread. It schedules the event.set()
call to run on the event loop thread to avoid cross-thread issues with
asyncio.Event.
"""
if self._agent_completed_event is not None and self._event_loop is not None:
try:
# Use the stored event loop reference to schedule the set() call
# This is necessary because asyncio.Event is not thread-safe and
# asyncio.get_event_loop() fails in threads without an event loop
if self._event_loop.is_running():
self._event_loop.call_soon_threadsafe(self._agent_completed_event.set)
else:
# Fallback: set directly if loop isn't running (shouldn't happen during normal operation)
self._agent_completed_event.set()
except RuntimeError:
# Event loop closed, ignore (orchestrator may be shutting down)
pass
async def _wait_for_agent_completion(self, timeout: float = POLL_INTERVAL):
"""Wait for an agent to complete or until timeout expires.
This replaces fixed `asyncio.sleep(POLL_INTERVAL)` calls with event-based
waiting. When an agent completes, _signal_agent_completed() sets the event,
causing this method to return immediately. If no agent completes within
the timeout, we return anyway to check for ready features.
Args:
timeout: Maximum seconds to wait (default: POLL_INTERVAL)
"""
if self._agent_completed_event is None:
# Fallback if event not initialized (shouldn't happen in normal operation)
await asyncio.sleep(timeout)
return
try:
await asyncio.wait_for(self._agent_completed_event.wait(), timeout=timeout)
# Event was set - an agent completed. Clear it for the next wait cycle.
self._agent_completed_event.clear()
debug_log.log("EVENT", "Woke up immediately - agent completed")
except asyncio.TimeoutError:
# Timeout reached without agent completion - this is normal, just check anyway
pass
def _on_agent_complete(
self,
feature_id: int | None,
@@ -832,6 +887,8 @@ class ParallelOrchestrator:
pid=proc.pid,
feature_id=feature_id,
status=status)
# Signal main loop that an agent slot is available
self._signal_agent_completed()
return
# Coding agent completion
@@ -843,40 +900,20 @@ class ParallelOrchestrator:
self.running_coding_agents.pop(feature_id, None)
self.abort_events.pop(feature_id, None)
# BEFORE dispose: Query database state to see if it's stale
session_before = self.get_session()
try:
session_before.expire_all()
feature_before = session_before.query(Feature).filter(Feature.id == feature_id).first()
all_before = session_before.query(Feature).all()
passing_before = sum(1 for f in all_before if f.passes)
debug_log.log("DB", f"BEFORE engine.dispose() - Feature #{feature_id} state",
passes=feature_before.passes if feature_before else None,
in_progress=feature_before.in_progress if feature_before else None,
total_passing_in_db=passing_before)
finally:
session_before.close()
# CRITICAL: Refresh database connection to see subprocess commits
# Refresh session cache to see subprocess commits
# The coding agent runs as a subprocess and commits changes (e.g., passes=True).
# SQLAlchemy may have stale connections. Disposing the engine forces new connections
# that will see the subprocess's committed changes.
debug_log.log("DB", "Disposing database engine now...")
self._engine.dispose()
# AFTER dispose: Query again to compare
# Using session.expire_all() is lighter weight than engine.dispose() for SQLite WAL mode
# and is sufficient to invalidate cached data and force fresh reads.
# engine.dispose() is only called on orchestrator shutdown, not on every agent completion.
session = self.get_session()
try:
session.expire_all()
feature = session.query(Feature).filter(Feature.id == feature_id).first()
all_after = session.query(Feature).all()
passing_after = sum(1 for f in all_after if f.passes)
feature_passes = feature.passes if feature else None
feature_in_progress = feature.in_progress if feature else None
debug_log.log("DB", f"AFTER engine.dispose() - Feature #{feature_id} state",
debug_log.log("DB", f"Feature #{feature_id} state after session.expire_all()",
passes=feature_passes,
in_progress=feature_in_progress,
total_passing_in_db=passing_after,
passing_changed=(passing_after != passing_before) if 'passing_before' in dir() else "unknown")
in_progress=feature_in_progress)
if feature and feature.in_progress and not feature.passes:
feature.in_progress = False
session.commit()
@@ -900,6 +937,9 @@ class ParallelOrchestrator:
# CRITICAL: This print triggers the WebSocket to emit agent_update with state='error' or 'success'
print(f"Feature #{feature_id} {status}", flush=True)
# Signal main loop that an agent slot is available
self._signal_agent_completed()
# NOTE: Testing agents are now spawned in start_feature() when coding agents START,
# not here when they complete. This ensures 1:1 ratio and proper termination.
@@ -949,6 +989,12 @@ class ParallelOrchestrator:
"""Main orchestration loop."""
self.is_running = True
# Initialize the agent completion event for this run
# Must be created in the async context where it will be used
self._agent_completed_event = asyncio.Event()
# Store the event loop reference for thread-safe signaling from output reader threads
self._event_loop = asyncio.get_running_loop()
# Track session start for regression testing (UTC for consistency with last_tested_at)
self.session_start_time = datetime.now(timezone.utc)
@@ -1100,8 +1146,8 @@ class ParallelOrchestrator:
at_capacity=(current >= self.max_concurrency))
if current >= self.max_concurrency:
debug_log.log("CAPACITY", "At max capacity, sleeping...")
await asyncio.sleep(POLL_INTERVAL)
debug_log.log("CAPACITY", "At max capacity, waiting for agent completion...")
await self._wait_for_agent_completion()
continue
# Priority 1: Resume features from previous session
@@ -1119,7 +1165,7 @@ class ParallelOrchestrator:
if not ready:
# Wait for running features to complete
if current > 0:
await asyncio.sleep(POLL_INTERVAL)
await self._wait_for_agent_completion()
continue
else:
# No ready features and nothing running
@@ -1138,7 +1184,7 @@ class ParallelOrchestrator:
# Still have pending features but all are blocked by dependencies
print("No ready features available. All remaining features may be blocked by dependencies.", flush=True)
await asyncio.sleep(POLL_INTERVAL * 2)
await self._wait_for_agent_completion(timeout=POLL_INTERVAL * 2)
continue
# Start features up to capacity
@@ -1174,7 +1220,7 @@ class ParallelOrchestrator:
except Exception as e:
print(f"Orchestrator error: {e}", flush=True)
await asyncio.sleep(POLL_INTERVAL)
await self._wait_for_agent_completion()
# Wait for remaining agents to complete
print("Waiting for running agents to complete...", flush=True)
@@ -1184,7 +1230,8 @@ class ParallelOrchestrator:
testing_done = len(self.running_testing_agents) == 0
if coding_done and testing_done:
break
await asyncio.sleep(1)
# Use short timeout since we're just waiting for final agents to finish
await self._wait_for_agent_completion(timeout=1.0)
print("Orchestrator finished.", flush=True)

View File

@@ -72,15 +72,31 @@ def count_passing_tests(project_dir: Path) -> tuple[int, int, int]:
try:
conn = sqlite3.connect(db_file)
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM features")
total = cursor.fetchone()[0]
cursor.execute("SELECT COUNT(*) FROM features WHERE passes = 1")
passing = cursor.fetchone()[0]
# Handle case where in_progress column doesn't exist yet
# Single aggregate query instead of 3 separate COUNT queries
# Handle case where in_progress column doesn't exist yet (legacy DBs)
try:
cursor.execute("SELECT COUNT(*) FROM features WHERE in_progress = 1")
in_progress = cursor.fetchone()[0]
cursor.execute("""
SELECT
COUNT(*) as total,
SUM(CASE WHEN passes = 1 THEN 1 ELSE 0 END) as passing,
SUM(CASE WHEN in_progress = 1 THEN 1 ELSE 0 END) as in_progress
FROM features
""")
row = cursor.fetchone()
total = row[0] or 0
passing = row[1] or 0
in_progress = row[2] or 0
except sqlite3.OperationalError:
# Fallback for databases without in_progress column
cursor.execute("""
SELECT
COUNT(*) as total,
SUM(CASE WHEN passes = 1 THEN 1 ELSE 0 END) as passing
FROM features
""")
row = cursor.fetchone()
total = row[0] or 0
passing = row[1] or 0
in_progress = 0
conn.close()
return passing, in_progress, total

View File

@@ -109,11 +109,11 @@ The orchestrator has already claimed this feature for you.
def get_single_feature_prompt(feature_id: int, project_dir: Path | None = None, yolo_mode: bool = False) -> str:
"""
Load the coding prompt with single-feature focus instructions prepended.
"""Prepend single-feature assignment header to base coding prompt.
When the orchestrator assigns a specific feature to a coding agent,
this prompt ensures the agent works ONLY on that feature.
Used in parallel mode to assign a specific feature to an agent.
The base prompt already contains the full workflow - this just
identifies which feature to work on.
Args:
feature_id: The specific feature ID to work on
@@ -122,38 +122,20 @@ def get_single_feature_prompt(feature_id: int, project_dir: Path | None = None,
handled by separate testing agents, not YOLO prompts.
Returns:
The prompt with single-feature instructions prepended
The prompt with single-feature header prepended
"""
# Always use the standard coding prompt
# (Testing/regression is handled by separate testing agents)
base_prompt = get_coding_prompt(project_dir)
# Prepend single-feature instructions
single_feature_header = f"""## ASSIGNED FEATURE
# Minimal header - the base prompt already contains the full workflow
single_feature_header = f"""## ASSIGNED FEATURE: #{feature_id}
**You are assigned to work on Feature #{feature_id} ONLY.**
This session is part of a parallel execution where multiple agents work on different features simultaneously.
### Your workflow:
1. **Get feature details** using `feature_get_by_id` with ID {feature_id}
2. **Mark as in-progress** using `feature_mark_in_progress` with ID {feature_id}
- If you get "already in-progress" error, that's OK - continue with implementation
3. **Implement the feature** following the steps from the feature details
4. **Test your implementation** to verify it works correctly
5. **Mark as passing** using `feature_mark_passing` with ID {feature_id}
6. **Commit your changes** and end the session
### Important rules:
- **Do NOT** work on any other features - other agents are handling them
- If blocked, use `feature_skip` and document the blocker in claude-progress.txt
Work ONLY on this feature. Other agents are handling other features.
Use `feature_claim_and_get` with ID {feature_id} to claim it and get details.
If blocked, use `feature_skip` and document the blocker.
---
"""
return single_feature_header + base_prompt