mirror of https://github.com/leonvanzyl/autocoder.git
synced 2026-01-29 22:02:05 +00:00
init

.claude/commands/create-spec.md

---
description: Create an app spec for autonomous coding (project)
---

# PROJECT DIRECTORY

This command **requires** the project directory as an argument via `$ARGUMENTS`.

**Example:** `/create-spec generations/my-app`

**Output location:** `$ARGUMENTS/prompts/app_spec.txt` and `$ARGUMENTS/prompts/initializer_prompt.md`

If `$ARGUMENTS` is empty, inform the user they must provide a project path and exit.

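That guard amounts to a simple path check. A sketch in Python (the path value is illustrative; in practice the check happens in conversation, not in a script):

```python
import sys
from pathlib import Path

# Hypothetical guard mirroring the $ARGUMENTS requirement above.
args = "generations/my-app"  # value of $ARGUMENTS; empty string means no path given
if not args:
    sys.exit("Error: provide a project path, e.g. /create-spec generations/my-app")

prompts_dir = Path(args) / "prompts"  # both output files are written here
print(prompts_dir)
```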

---

# GOAL

Help the user create a comprehensive project specification for a long-running autonomous coding process. This specification will be used by AI coding agents to build the user's application across multiple sessions.

This tool works for projects of any size - from simple utilities to large-scale applications.

---

# YOUR ROLE

You are the **Spec Creation Assistant** - an expert at translating project ideas into detailed technical specifications. Your job is to:

1. Understand what the user wants to build (in their own words)
2. Ask about features and functionality (things anyone can describe)
3. **Derive** the technical details (database, API, architecture) from their requirements
4. Generate the specification files that autonomous coding agents will use

**IMPORTANT: Cater to all skill levels.** Many users are product owners or have functional knowledge but aren't technical. They know WHAT they want to build, not HOW to build it. You should:

- Ask questions anyone can answer (features, user flows, what screens exist)
- **Derive** technical details (database schema, API endpoints, architecture) yourself
- Only ask technical questions if the user wants to be involved in those decisions

**USE THE AskUserQuestion TOOL** for structured questions. This provides a much better UX with:

- Multiple-choice options displayed as clickable buttons
- Tabs for grouping related questions
- A free-form "Other" option automatically included

Use AskUserQuestion whenever you have questions with clear options (involvement level, scale, yes/no choices, preferences). Use regular conversation for open-ended exploration (describing features, walking through user flows).

---

# CONVERSATION FLOW

There are two paths through this process:

**Quick Path** (recommended for most users): You describe what you want, and the agent derives the technical details
**Detailed Path**: You want input on technology choices, database design, API structure, etc.

**CRITICAL: This is a CONVERSATION, not a form.**

- Ask questions for ONE phase at a time
- WAIT for the user to respond before moving to the next phase
- Acknowledge their answers before continuing
- Do NOT bundle multiple phases into one message

---

## Phase 1: Project Overview

Start with simple questions anyone can answer:

1. **Project Name**: What should this project be called?
2. **Description**: In your own words, what are you building and what problem does it solve?
3. **Target Audience**: Who will use this?

**IMPORTANT: Ask these questions and WAIT for the user to respond before continuing.**
Do NOT immediately jump to Phase 2. Let the user answer, acknowledge their responses, then proceed.

---

## Phase 2: Involvement Level

**Use the AskUserQuestion tool here.** Example:

```
Question: "How involved do you want to be in technical decisions?"
Header: "Involvement"
Options:
- Label: "Quick Mode (Recommended)"
  Description: "I'll describe what I want, you handle database, API, and architecture"
- Label: "Detailed Mode"
  Description: "I want input on technology choices and architecture decisions"
```

**If Quick Mode**: Skip to Phase 3, then go to Phase 4 (Features). You will derive technical details yourself.
**If Detailed Mode**: Go through all phases, asking technical questions.

## Phase 3: Technology Preferences

**For Quick Mode users**, also ask about tech preferences (can combine in the same AskUserQuestion):

```
Question: "Any technology preferences, or should I choose sensible defaults?"
Header: "Tech Stack"
Options:
- Label: "Use defaults (Recommended)"
  Description: "React, Node.js, SQLite - solid choices for most apps"
- Label: "I have preferences"
  Description: "I'll specify my preferred languages/frameworks"
```

**For Detailed Mode users**, ask specific tech questions about the frontend, backend, database, etc.

## Phase 4: Features (THE MAIN PHASE)

This is where you spend most of your time. Ask questions in plain language that anyone can answer.

**Start broad with open conversation:**

> "Walk me through your app. What does a user see when they first open it? What can they do?"

**Then use AskUserQuestion for quick yes/no feature areas.** Example:

```
Questions (can ask up to 4 at once):
1. Question: "Do users need to log in / have accounts?"
   Header: "Accounts"
   Options: Yes (with profiles, settings) | No (anonymous use) | Maybe (optional accounts)

2. Question: "Should this work well on mobile phones?"
   Header: "Mobile"
   Options: Yes (fully responsive) | Desktop only | Basic mobile support

3. Question: "Do users need to search or filter content?"
   Header: "Search"
   Options: Yes | No | Basic only

4. Question: "Any sharing or collaboration features?"
   Header: "Sharing"
   Options: Yes | No | Maybe later
```

**Then drill into the "Yes" answers with open conversation:**

**4a. The Main Experience**

- What's the main thing users do in your app?
- Walk me through a typical user session

**4b. User Accounts** (if they said Yes)

- What can they do with their account?
- Any roles or permissions?

**4c. What Users Create/Manage**

- What "things" do users create, save, or manage?
- Can they edit or delete these things?
- Can they organize them (folders, tags, categories)?

**4d. Settings & Customization**

- What should users be able to customize?
- Light/dark mode? Other display preferences?

**4e. Search & Finding Things** (if they said Yes)

- What do they search for?
- What filters would be helpful?

**4f. Sharing & Collaboration** (if they said Yes)

- What can be shared?
- View-only or collaborative editing?

**4g. Any Dashboards or Analytics?**

- Does the user see any stats, reports, or metrics?

**4h. Domain-Specific Features**

- What else is unique to your app?
- Any features we haven't covered?

**4i. Security & Access Control** (if the app has authentication)

**Use AskUserQuestion for roles:**

```
Question: "Who are the different types of users?"
Header: "User Roles"
Options:
- Label: "Just regular users"
  Description: "Everyone has the same permissions"
- Label: "Users + Admins"
  Description: "Regular users and administrators with extra powers"
- Label: "Multiple roles"
  Description: "Several distinct user types (e.g., viewer, editor, manager, admin)"
```

**If multiple roles, explore in conversation:**

- What can each role see?
- What can each role do?
- Are there pages only certain roles can access?
- What happens if someone tries to access something they shouldn't?

**Also ask about authentication:**

- How do users log in? (email/password, social login, SSO)
- Password requirements? (for security testing)
- Session timeout? Auto-logout after inactivity?
- Any sensitive operations requiring extra confirmation?

**4j. Data Flow & Integration**

- What data do users create vs what's system-generated?
- Are there workflows that span multiple steps or pages?
- What happens to related data when something is deleted?
- Are there any external systems or APIs to integrate with?
- Any import/export functionality?

**4k. Errors & Edge Cases**

- What should happen if the network fails mid-action?
- What about duplicate entries (e.g., the same email twice)?
- Very long text inputs?
- Empty states (what shows when there's no data)?

**Keep asking follow-up questions until you have a complete picture.** For each feature area, understand:

- What the user sees
- What actions they can take
- What happens as a result
- Who is allowed to do it (permissions)
- What errors could occur

## Phase 4L: Derive Feature Count (DO NOT ASK THE USER)

After gathering all features, **you** (the agent) should tally up the testable features. Do NOT ask the user how many features they want - derive it from what was discussed.

**Typical ranges for reference:**

- **Simple apps** (todo list, calculator, notes): ~20-50 features
- **Medium apps** (blog, task manager with auth): ~100 features
- **Advanced apps** (e-commerce, CRM, full SaaS): ~150-200 features

These are just reference points - your actual count should come from the requirements discussed.

**How to count features:**
For each feature area discussed, estimate the number of discrete, testable behaviors:

- Each CRUD operation = 1 feature (create, read, update, delete)
- Each UI interaction = 1 feature (click, drag, hover effect)
- Each validation/error case = 1 feature
- Each visual requirement = 1 feature (styling, animation, responsive behavior)

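The tally itself is simple arithmetic. A sketch (the category names and counts are illustrative, not from any real session):

```python
# Hypothetical per-category counts of discrete, testable behaviors,
# derived from the Phase 4 conversation.
feature_breakdown = {
    "User accounts": 12,      # register, login, logout, profile CRUD, validation cases
    "Tasks CRUD": 16,         # create/read/update/delete + error and empty states
    "Search & filters": 8,    # queries, filter combinations, no-results state
    "Settings & theming": 6,  # preferences, light/dark mode, persistence
}

total = sum(feature_breakdown.values())
print(f"Total: ~{total} features")  # this number becomes feature_count in the spec
```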

**Present your estimate to the user:**

> "Based on what we discussed, here's my feature breakdown:
>
> - [Category 1]: ~X features
> - [Category 2]: ~Y features
> - [Category 3]: ~Z features
> - ...
>
> **Total: ~N features**
>
> Does this seem right, or should I adjust?"

Let the user confirm or adjust. This becomes your `feature_count` for the spec.

## Phase 5: Technical Details (DERIVED OR DISCUSSED)

**For Quick Mode users:**
Tell them: "Based on what you've described, I'll design the database, API, and architecture. Here's a quick summary of what I'm planning..."

Then briefly outline:

- Main data entities you'll create (in plain language: "I'll create tables for users, projects, documents, etc.")
- Overall app structure ("sidebar navigation with main content area")
- Any key technical decisions

Ask: "Does this sound right? Any concerns?"

**For Detailed Mode users:**
Walk through each technical area:

**5a. Database Design**

- What entities/tables are needed?
- Key fields for each?
- Relationships?

**5b. API Design**

- What endpoints are needed?
- How should they be organized?

**5c. UI Layout**

- Overall structure (columns, navigation)
- Key screens/pages
- Design preferences (colors, themes)

**5d. Implementation Phases**

- What order to build things in?
- Dependencies?

## Phase 6: Success Criteria

Ask in simple terms:

> "What does 'done' look like for you? When would you consider this app complete and successful?"

Prompt for:

- Must-have functionality
- Quality expectations (polished vs functional)
- Any specific requirements

## Phase 7: Review & Approval

Present everything gathered:

1. **Summary of the app** (in plain language)
2. **Feature count**
3. **Technology choices** (whether specified or derived)
4. **Brief technical plan** (for their awareness)

First ask in conversation if they want to make changes.

**Then use AskUserQuestion for final confirmation:**

```
Question: "Ready to generate the specification files?"
Header: "Generate"
Options:
- Label: "Yes, generate files"
  Description: "Create app_spec.txt and update prompt files"
- Label: "I have changes"
  Description: "Let me add or modify something first"
```

---

# FILE GENERATION

**Note: This section is for YOU (the agent) to execute. Do not burden the user with these technical details.**

## Output Directory

The output directory is: `$ARGUMENTS/prompts/`

Once the user approves, generate these files:

## 1. Generate `app_spec.txt`

**Output path:** `$ARGUMENTS/prompts/app_spec.txt`

Create a new file using this XML structure:

```xml
<project_specification>
  <project_name>[Project Name]</project_name>

  <overview>
    [2-3 sentence description from Phase 1]
  </overview>

  <technology_stack>
    <frontend>
      <framework>[Framework]</framework>
      <styling>[Styling solution]</styling>
      [Additional frontend config]
    </frontend>
    <backend>
      <runtime>[Runtime]</runtime>
      <database>[Database]</database>
      [Additional backend config]
    </backend>
    <communication>
      <api>[API style]</api>
      [Additional communication config]
    </communication>
  </technology_stack>

  <prerequisites>
    <environment_setup>
      [Setup requirements]
    </environment_setup>
  </prerequisites>

  <feature_count>[derived count from Phase 4L]</feature_count>

  <security_and_access_control>
    <user_roles>
      <role name="[role_name]">
        <permissions>
          - [Can do X]
          - [Can see Y]
          - [Cannot access Z]
        </permissions>
        <protected_routes>
          - /admin/* (admin only)
          - /settings (authenticated users)
        </protected_routes>
      </role>
      [Repeat for each role]
    </user_roles>
    <authentication>
      <method>[email/password | social | SSO]</method>
      <session_timeout>[duration or "none"]</session_timeout>
      <password_requirements>[if applicable]</password_requirements>
    </authentication>
    <sensitive_operations>
      - [Delete account requires password confirmation]
      - [Financial actions require 2FA]
    </sensitive_operations>
  </security_and_access_control>

  <core_features>
    <[category_name]>
      - [Feature 1]
      - [Feature 2]
      - [Feature 3]
    </[category_name]>
    [Repeat for all feature categories]
  </core_features>

  <database_schema>
    <tables>
      <[table_name]>
        - [field1], [field2], [field3]
        - [additional fields]
      </[table_name]>
      [Repeat for all tables]
    </tables>
  </database_schema>

  <api_endpoints_summary>
    <[category]>
      - [VERB] /api/[path]
      - [VERB] /api/[path]
    </[category]>
    [Repeat for all categories]
  </api_endpoints_summary>

  <ui_layout>
    <main_structure>
      [Layout description]
    </main_structure>
    [Additional UI sections as needed]
  </ui_layout>

  <design_system>
    <color_palette>
      [Colors]
    </color_palette>
    <typography>
      [Font preferences]
    </typography>
  </design_system>

  <implementation_steps>
    <step number="1">
      <title>[Phase Title]</title>
      <tasks>
        - [Task 1]
        - [Task 2]
      </tasks>
    </step>
    [Repeat for all phases]
  </implementation_steps>

  <success_criteria>
    <functionality>
      [Functionality criteria]
    </functionality>
    <user_experience>
      [UX criteria]
    </user_experience>
    <technical_quality>
      [Technical criteria]
    </technical_quality>
    <design_polish>
      [Design criteria]
    </design_polish>
  </success_criteria>
</project_specification>
```

## 2. Update `initializer_prompt.md`

**Output path:** `$ARGUMENTS/prompts/initializer_prompt.md`

If the output directory has an existing `initializer_prompt.md`, read it and update the feature count.
If not, copy from `.claude/templates/initializer_prompt.template.md` first, then update.

Update the feature count references to match the derived count from Phase 4L:

- Line containing "create ... test cases" - update to the derived feature count
- Line containing "Minimum ... features" - update to the derived feature count

**Note:** You do NOT need to update `coding_prompt.md` - the coding agent works through features one at a time regardless of total count.

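That substitution can be sketched in Python (the file contents and the count here are hypothetical - the real phrasing comes from the template):

```python
import re

# Hypothetical initializer_prompt.md content; the real file lives at
# $ARGUMENTS/prompts/initializer_prompt.md and is read from disk.
text = "You must create 50 test cases.\nMinimum 50 features required."
derived_count = 120  # the feature_count derived in Phase 4L

# Rewrite both count references to match the derived total.
text = re.sub(r"create \d+ test cases", f"create {derived_count} test cases", text)
text = re.sub(r"Minimum \d+ features", f"Minimum {derived_count} features", text)
print(text)
```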

---

# AFTER FILE GENERATION: NEXT STEPS

Once the files are generated, tell the user what to do next:

> "Your specification files have been created in `$ARGUMENTS/prompts/`!
>
> **Files created:**
>
> - `$ARGUMENTS/prompts/app_spec.txt`
> - `$ARGUMENTS/prompts/initializer_prompt.md`
>
> **Next step:** Type `/exit` to exit this Claude session. The autonomous coding agent will start automatically.
>
> **Important timing expectations:**
>
> - **First session:** The agent generates features in the database. This takes several minutes.
> - **Subsequent sessions:** Each coding iteration takes 5-15 minutes depending on complexity.
> - **Full app:** Building all [X] features will take many hours across multiple sessions.
>
> **Controls:**
>
> - Press `Ctrl+C` to pause the agent at any time
> - Run `start.bat` (Windows) or `./start.sh` (Mac/Linux) to resume where you left off"

Replace `[X]` with their feature count.

---

# IMPORTANT REMINDERS

- **Meet users where they are**: Not everyone is technical. Ask about what they want, not how to build it.
- **Quick Mode is the default**: Most users should be able to describe their app and let you handle the technical details.
- **Derive, don't interrogate**: For non-technical users, derive the database schema, API endpoints, and architecture from their feature descriptions. Don't ask them to specify these.
- **Use plain language**: Instead of "What entities need CRUD operations?", ask "What things can users create, edit, or delete?"
- **Be thorough on features**: This is where to spend time. Keep asking follow-up questions until you have a complete picture.
- **Derive the feature count, don't guess**: After gathering requirements, tally up testable features yourself and present the estimate. Don't use fixed tiers or ask users to guess.
- **Validate before generating**: Present a summary including your derived feature count and get explicit approval before creating files.

---

# BEGIN

Start by greeting the user warmly. Ask ONLY the Phase 1 questions:

> "Hi! I'm here to help you create a detailed specification for your app.
>
> Let's start with the basics:
>
> 1. What do you want to call this project?
> 2. In your own words, what are you building?
> 3. Who will use it - just you, or others too?"

**STOP HERE and wait for their response.** Do not ask any other questions yet. Do not use AskUserQuestion yet. Just have a conversation about their project basics first.

After they respond, acknowledge what they said, then move to Phase 2.

.claude/skills/frontend-design/LICENSE.txt

Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.

"Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:

(a) You must give any other recipients of the Work or Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.

You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

||||
42
.claude/skills/frontend-design/SKILL.md
Normal file
42
.claude/skills/frontend-design/SKILL.md
Normal file
@@ -0,0 +1,42 @@
---
name: frontend-design
description: Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, artifacts, posters, or applications (examples include websites, landing pages, dashboards, React components, HTML/CSS layouts, or when styling/beautifying any web UI). Generates creative, polished code and UI design that avoids generic AI aesthetics.
license: Complete terms in LICENSE.txt
---

This skill guides the creation of distinctive, production-grade frontend interfaces that avoid generic "AI slop" aesthetics. Implement real working code with exceptional attention to aesthetic details and creative choices.

The user provides frontend requirements: a component, page, application, or interface to build. They may include context about the purpose, audience, or technical constraints.

## Design Thinking

Before coding, understand the context and commit to a BOLD aesthetic direction:

- **Purpose**: What problem does this interface solve? Who uses it?
- **Tone**: Pick an extreme: brutally minimal, maximalist chaos, retro-futuristic, organic/natural, luxury/refined, playful/toy-like, editorial/magazine, brutalist/raw, art deco/geometric, soft/pastel, industrial/utilitarian, etc. There are many flavors to choose from; use these for inspiration, but design one that is true to the aesthetic direction.
- **Constraints**: Technical requirements (framework, performance, accessibility).
- **Differentiation**: What makes this UNFORGETTABLE? What's the one thing someone will remember?

**CRITICAL**: Choose a clear conceptual direction and execute it with precision. Bold maximalism and refined minimalism both work - the key is intentionality, not intensity.

Then implement working code (HTML/CSS/JS, React, Vue, etc.) that is:

- Production-grade and functional
- Visually striking and memorable
- Cohesive with a clear aesthetic point-of-view
- Meticulously refined in every detail

## Frontend Aesthetics Guidelines

Focus on:

- **Typography**: Choose fonts that are beautiful, unique, and interesting. Avoid generic fonts like Arial and Inter; opt instead for distinctive, characterful choices that elevate the frontend's aesthetics. Pair a distinctive display font with a refined body font.
|
||||
- **Color & Theme**: Commit to a cohesive aesthetic. Use CSS variables for consistency. Dominant colors with sharp accents outperform timid, evenly-distributed palettes.
|
||||
- **Motion**: Use animations for effects and micro-interactions. Prioritize CSS-only solutions for HTML. Use Motion library for React when available. Focus on high-impact moments: one well-orchestrated page load with staggered reveals (animation-delay) creates more delight than scattered micro-interactions. Use scroll-triggering and hover states that surprise.
|
||||
- **Spatial Composition**: Unexpected layouts. Asymmetry. Overlap. Diagonal flow. Grid-breaking elements. Generous negative space OR controlled density.
|
||||
- **Backgrounds & Visual Details**: Create atmosphere and depth rather than defaulting to solid colors. Add contextual effects and textures that match the overall aesthetic. Apply creative forms like gradient meshes, noise textures, geometric patterns, layered transparencies, dramatic shadows, decorative borders, custom cursors, and grain overlays.
|
||||
|
||||
NEVER use generic AI-generated aesthetics like overused font families (Inter, Roboto, Arial, system fonts), cliched color schemes (particularly purple gradients on white backgrounds), predictable layouts and component patterns, and cookie-cutter design that lacks context-specific character.
|
||||
|
||||
Interpret creatively and make unexpected choices that feel genuinely designed for the context. No design should be the same. Vary between light and dark themes, different fonts, different aesthetics. NEVER converge on common choices (Space Grotesk, for example) across generations.
|
||||
|
||||
**IMPORTANT**: Match implementation complexity to the aesthetic vision. Maximalist designs need elaborate code with extensive animations and effects. Minimalist or refined designs need restraint, precision, and careful attention to spacing, typography, and subtle details. Elegance comes from executing the vision well.
|
||||
|
||||
Remember: Claude is capable of extraordinary creative work. Don't hold back, show what can truly be created when thinking outside the box and committing fully to a distinctive vision.
|
||||
331
.claude/templates/app_spec.template.txt
Normal file
@@ -0,0 +1,331 @@
<!--
Project Specification Template
==============================

This is a placeholder template. Replace with your actual project specification.

You can either:
1. Use the /create-spec command to generate this interactively with Claude
2. Manually edit this file following the structure below

See existing projects in generations/ for examples of complete specifications.
-->

<project_specification>
  <project_name>YOUR_PROJECT_NAME</project_name>

  <overview>
    Describe your project in 2-3 sentences. What are you building? What problem
    does it solve? Who is it for? Include key features and design goals.
  </overview>

  <technology_stack>
    <frontend>
      <framework>React with Vite</framework>
      <styling>Tailwind CSS</styling>
      <state_management>React hooks and context</state_management>
      <routing>React Router for navigation</routing>
      <port>3000</port>
    </frontend>
    <backend>
      <runtime>Node.js with Express</runtime>
      <database>SQLite with better-sqlite3</database>
      <port>3001</port>
    </backend>
    <communication>
      <api>RESTful endpoints</api>
    </communication>
  </technology_stack>

  <prerequisites>
    <environment_setup>
      - Node.js 18+ installed
      - npm or pnpm package manager
      - Any API keys or external services needed
    </environment_setup>
  </prerequisites>

  <core_features>
    <!--
    List features grouped by category. Each feature should be:
    - Specific and testable
    - Independent where possible
    - Written as a capability ("User can...", "System displays...")
    -->

    <authentication>
      - User registration with email/password
      - User login with session management
      - User logout
      - Password reset flow
      - Profile management
    </authentication>

    <main_functionality>
      <!-- Replace with your app's primary features -->
      - Create new items
      - View list of items with pagination
      - Edit existing items
      - Delete items with confirmation
      - Search and filter items
    </main_functionality>

    <user_interface>
      - Responsive layout (mobile, tablet, desktop)
      - Dark/light theme toggle
      - Loading states and skeletons
      - Error handling with user feedback
      - Toast notifications for actions
    </user_interface>

    <data_management>
      - Data validation on forms
      - Auto-save drafts
      - Export data functionality
      - Import data functionality
    </data_management>

    <!-- Add more feature categories as needed -->
  </core_features>

  <database_schema>
    <tables>
      <users>
        - id (PRIMARY KEY)
        - email (UNIQUE, NOT NULL)
        - password_hash (NOT NULL)
        - name
        - avatar_url
        - preferences (JSON)
        - created_at, updated_at
      </users>

      <!-- Add more tables for your domain entities -->
      <items>
        - id (PRIMARY KEY)
        - user_id (FOREIGN KEY -> users.id)
        - title (NOT NULL)
        - description
        - status (enum: draft, active, archived)
        - created_at, updated_at
      </items>

      <!-- Add additional tables as needed -->
    </tables>
  </database_schema>

  <api_endpoints_summary>
    <authentication>
      - POST /api/auth/register
      - POST /api/auth/login
      - POST /api/auth/logout
      - GET /api/auth/me
      - PUT /api/auth/profile
      - POST /api/auth/forgot-password
      - POST /api/auth/reset-password
    </authentication>

    <items>
      - GET /api/items (list with pagination, search, filters)
      - POST /api/items (create)
      - GET /api/items/:id (get single)
      - PUT /api/items/:id (update)
      - DELETE /api/items/:id (delete)
    </items>

    <!-- Add more endpoint categories as needed -->
  </api_endpoints_summary>

  <ui_layout>
    <main_structure>
      Describe the overall layout structure:
      - Header with navigation and user menu
      - Sidebar for navigation (collapsible on mobile)
      - Main content area
      - Footer (optional)
    </main_structure>

    <sidebar>
      - Logo/brand at top
      - Navigation links
      - Quick actions
      - User profile at bottom
    </sidebar>

    <main_content>
      - Page header with title and actions
      - Content area with cards/lists/forms
      - Pagination or infinite scroll
    </main_content>

    <modals_overlays>
      - Confirmation dialogs
      - Form modals for create/edit
      - Settings modal
      - Help/keyboard shortcuts reference
    </modals_overlays>
  </ui_layout>

  <design_system>
    <color_palette>
      - Primary: #3B82F6 (blue)
      - Secondary: #10B981 (green)
      - Accent: #F59E0B (amber)
      - Background: #FFFFFF (light), #1A1A1A (dark)
      - Surface: #F5F5F5 (light), #2A2A2A (dark)
      - Text: #1F2937 (light), #E5E5E5 (dark)
      - Border: #E5E5E5 (light), #404040 (dark)
      - Error: #EF4444
      - Success: #10B981
      - Warning: #F59E0B
    </color_palette>

    <typography>
      - Font family: Inter, system-ui, -apple-system, sans-serif
      - Headings: font-semibold
      - Body: font-normal, leading-relaxed
      - Code: JetBrains Mono, Consolas, monospace
    </typography>

    <components>
      <buttons>
        - Primary: colored background, white text, rounded
        - Secondary: border style, hover fill
        - Ghost: transparent, hover background
        - Icon buttons: square with hover state
      </buttons>

      <inputs>
        - Rounded borders with focus ring
        - Clear placeholder text
        - Error states with red border
        - Disabled state styling
      </inputs>

      <cards>
        - Subtle border or shadow
        - Rounded corners (8px)
        - Hover state for interactive cards
      </cards>
    </components>

    <animations>
      - Smooth transitions (150-300ms)
      - Fade in for new content
      - Slide animations for modals/sidebars
      - Loading spinners
      - Skeleton loaders
    </animations>
  </design_system>

  <key_interactions>
    <!-- Describe the main user flows -->
    <user_flow_1>
      1. User arrives at landing page
      2. Clicks "Get Started" or "Sign Up"
      3. Fills registration form
      4. Receives confirmation
      5. Redirected to main dashboard
    </user_flow_1>

    <user_flow_2>
      1. User clicks "Create New"
      2. Form modal opens
      3. User fills in details
      4. Clicks save
      5. Item appears in list with success toast
    </user_flow_2>

    <!-- Add more key interactions as needed -->
  </key_interactions>

  <implementation_steps>
    <step number="1">
      <title>Project Setup and Database</title>
      <tasks>
        - Initialize frontend with Vite + React
        - Set up Express backend
        - Create SQLite database with schema
        - Configure CORS and middleware
        - Set up environment variables
      </tasks>
    </step>

    <step number="2">
      <title>Authentication System</title>
      <tasks>
        - Implement user registration
        - Build login/logout flow
        - Add session management
        - Create protected routes
        - Build user profile page
      </tasks>
    </step>

    <step number="3">
      <title>Core Features</title>
      <tasks>
        - Build main CRUD operations
        - Implement list views with pagination
        - Add search and filtering
        - Create form validation
        - Handle error states
      </tasks>
    </step>

    <step number="4">
      <title>UI Polish and Responsiveness</title>
      <tasks>
        - Implement responsive design
        - Add dark/light theme
        - Create loading states
        - Add animations and transitions
        - Implement toast notifications
      </tasks>
    </step>

    <step number="5">
      <title>Testing and Refinement</title>
      <tasks>
        - Test all user flows
        - Fix edge cases
        - Optimize performance
        - Ensure accessibility
        - Final UI polish
      </tasks>
    </step>
  </implementation_steps>

  <success_criteria>
    <functionality>
      - All features work as specified
      - No console errors in browser
      - Proper error handling throughout
      - Data persists correctly in database
    </functionality>

    <user_experience>
      - Intuitive navigation and workflows
      - Responsive on all device sizes
      - Fast load times (< 2s)
      - Clear feedback for all actions
      - Accessible (keyboard navigation, ARIA labels)
    </user_experience>

    <technical_quality>
      - Clean, maintainable code structure
      - Consistent coding style
      - Proper separation of concerns
      - Secure authentication
      - Input validation and sanitization
    </technical_quality>

    <design_polish>
      - Consistent visual design
      - Smooth animations
      - Professional appearance
      - Both themes fully implemented
      - No layout issues or overflow
    </design_polish>
  </success_criteria>
</project_specification>
407
.claude/templates/coding_prompt.template.md
Normal file
@@ -0,0 +1,407 @@
## YOUR ROLE - CODING AGENT

You are continuing work on a long-running autonomous development task.
This is a FRESH context window - you have no memory of previous sessions.

### STEP 1: GET YOUR BEARINGS (MANDATORY)

Start by orienting yourself:

```bash
# 1. See your working directory
pwd

# 2. List files to understand project structure
ls -la

# 3. Read the project specification to understand what you're building
cat app_spec.txt

# 4. Read progress notes from previous sessions
cat claude-progress.txt

# 5. Check recent git history
git log --oneline -20
```

Then use MCP tools to check feature status:

```
# 6. Get progress statistics (passing/total counts)
Use the feature_get_stats tool

# 7. Get the next feature to work on
Use the feature_get_next tool
```

Understanding `app_spec.txt` is critical - it contains the full requirements
for the application you're building.

### STEP 2: START SERVERS (IF NOT RUNNING)

If `init.sh` exists, run it:

```bash
chmod +x init.sh
./init.sh
```

Otherwise, start servers manually and document the process.

### STEP 3: VERIFICATION TEST (CRITICAL!)

**MANDATORY BEFORE NEW WORK:**

The previous session may have introduced bugs. Before implementing anything
new, you MUST run verification tests.

Run 1-2 of the features marked as passing that are most core to the app's functionality to verify they still work.

To get passing features for regression testing:

```
Use the feature_get_for_regression tool (returns up to 3 random passing features)
```

For example, if this were a chat app, you should perform a test that logs into the app, sends a message, and gets a response.

**If you find ANY issues (functional or visual):**

- Mark that feature as "passes": false immediately
- Add issues to a list
- Fix all issues BEFORE moving to new features
- This includes UI bugs like:
  - White-on-white text or poor contrast
  - Random characters displayed
  - Incorrect timestamps
  - Layout issues or overflow
  - Buttons too close together
  - Missing hover states
  - Console errors

### STEP 4: CHOOSE ONE FEATURE TO IMPLEMENT

Get the next feature to implement:

```
# Get the highest-priority pending feature
Use the feature_get_next tool
```

Focus on completing one feature perfectly, including its testing steps, in this session before moving on to other features.
It's OK if you only complete one feature in this session; there will be more sessions later that continue to make progress.

#### If You Cannot Implement the Feature

Sometimes a feature cannot be implemented yet. Valid reasons to skip:

- **Dependency**: The feature requires another feature to be implemented first
- **Missing prerequisite**: Core infrastructure (auth, database schema) isn't ready
- **Unclear requirements**: The feature description is ambiguous and needs clarification

If you encounter a blocker, **skip the feature** to move it to the end of the queue:

```
# Skip feature #42 - moves it to end of priority queue
Use the feature_skip tool with feature_id=42
```

After skipping, use the feature_get_next tool again to get the next feature to work on.

**Do NOT skip features just because they seem difficult.** Only skip when there is a genuine dependency or blocker. Document why you skipped in `claude-progress.txt`.

### STEP 5: IMPLEMENT THE FEATURE

Implement the chosen feature thoroughly:

1. Write the code (frontend and/or backend as needed)
2. Test manually using browser automation (see Step 6)
3. Fix any issues discovered
4. Verify the feature works end-to-end

### STEP 6: VERIFY WITH BROWSER AUTOMATION

**CRITICAL:** You MUST verify features through the actual UI.

Use browser automation tools:

- Navigate to the app in a real browser
- Interact like a human user (click, type, scroll)
- Take screenshots at each step
- Verify both functionality AND visual appearance

**DO:**

- Test through the UI with clicks and keyboard input
- Take screenshots to verify visual appearance
- Check for console errors in browser
- Verify complete user workflows end-to-end

**DON'T:**

- Only test with curl commands (backend testing alone is insufficient)
- Use JavaScript evaluation to bypass UI (no shortcuts)
- Skip visual verification
- Mark tests passing without thorough verification

### STEP 6.5: MANDATORY VERIFICATION CHECKLIST (BEFORE MARKING ANY TEST PASSING)

**You MUST complete ALL of these checks before marking any feature as "passes": true**

#### Security Verification (for protected features)

- [ ] Feature respects user role permissions
- [ ] Unauthenticated access is blocked (redirects to login)
- [ ] API endpoint checks authorization (returns 401/403 appropriately)
- [ ] Cannot access other users' data by manipulating URLs

#### Real Data Verification (CRITICAL - NO MOCK DATA)

- [ ] Created unique test data via UI (e.g., "TEST_12345_VERIFY_ME")
- [ ] Verified the EXACT data I created appears in UI
- [ ] Refreshed page - data persists (proves database storage)
- [ ] Deleted the test data - verified it's gone everywhere
- [ ] NO unexplained data appeared (would indicate mock data)
- [ ] Dashboard/counts reflect real numbers after my changes

#### Navigation Verification

- [ ] All buttons on this page link to existing routes
- [ ] No 404 errors when clicking any interactive element
- [ ] Back button returns to correct previous page
- [ ] Related links (edit, view, delete) have correct IDs in URLs

#### Integration Verification

- [ ] Console shows ZERO JavaScript errors
- [ ] Network tab shows successful API calls (no 500s)
- [ ] Data returned from API matches what UI displays
- [ ] Loading states appeared during API calls
- [ ] Error states handle failures gracefully

### STEP 6.6: MOCK DATA DETECTION SWEEP

**Run this sweep AFTER EVERY FEATURE before marking it as passing:**

#### 1. Code Pattern Search

Search the codebase for forbidden patterns:

```bash
# Search for mock data patterns
grep -r "mockData\|fakeData\|sampleData\|dummyData\|testData" --include="*.js" --include="*.ts" --include="*.jsx" --include="*.tsx"
grep -r "// TODO\|// FIXME\|// STUB\|// MOCK" --include="*.js" --include="*.ts" --include="*.jsx" --include="*.tsx"
grep -r "hardcoded\|placeholder" --include="*.js" --include="*.ts" --include="*.jsx" --include="*.tsx"
```

**If ANY matches are found related to your feature - FIX THEM before proceeding.**

#### 2. Runtime Verification

For ANY data displayed in the UI:

1. Create NEW data with UNIQUE content (e.g., "TEST_12345_DELETE_ME")
2. Verify that EXACT content appears in the UI
3. Delete the record
4. Verify it's GONE from the UI
5. **If you see data that wasn't created during testing - IT'S MOCK DATA. Fix it.**
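Unique markers like the ones above can come from a tiny helper. This is a minimal sketch; the `makeTestMarker` name is illustrative, not an existing project API:

```javascript
// Generate a unique, easy-to-spot marker string for runtime verification.
// Any data visible in the UI that does NOT carry a marker created this
// session is a sign of leftover mock/seed data.
function makeTestMarker(action = "VERIFY_ME") {
  // Base-36 timestamp keeps the marker short but distinct per run.
  const stamp = Date.now().toString(36).toUpperCase();
  return `TEST_${stamp}_${action}`;
}

console.log(makeTestMarker());            // e.g. TEST_MB2K1QZ0_VERIFY_ME
console.log(makeTestMarker("DELETE_ME")); // e.g. TEST_MB2K1QZ0_DELETE_ME
```

Creating data through the UI with such a marker, then searching for that exact string, makes it unambiguous whether displayed data is real.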
#### 3. Database Verification

Check that:

- Database tables contain only data you created during tests
- Counts/statistics match actual database record counts
- No seed data is masquerading as user data

#### 4. API Response Verification

For API endpoints used by this feature:

- Call the endpoint directly
- Verify the response contains actual database data
- Empty database = empty response (not pre-populated mock data)

### STEP 7: UPDATE FEATURE STATUS (CAREFULLY!)

**YOU CAN ONLY MODIFY ONE FIELD: "passes"**

After thorough verification, mark the feature as passing:

```
# Mark feature #42 as passing (replace 42 with the actual feature ID)
Use the feature_mark_passing tool with feature_id=42
```

**NEVER:**

- Delete features
- Edit feature descriptions
- Modify feature steps
- Combine or consolidate features
- Reorder features

**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**

### STEP 8: COMMIT YOUR PROGRESS

Make a descriptive git commit:

```bash
git add .
git commit -m "Implement [feature name] - verified end-to-end

- Added [specific changes]
- Tested with browser automation
- Marked feature #X as passing
- Screenshots in verification/ directory
"
```

### STEP 9: UPDATE PROGRESS NOTES

Update `claude-progress.txt` with:

- What you accomplished this session
- Which test(s) you completed
- Any issues discovered or fixed
- What should be worked on next
- Current completion status (e.g., "45/200 tests passing")

### STEP 10: END SESSION CLEANLY

Before context fills up:

1. Commit all working code
2. Update claude-progress.txt
3. Mark features as passing if tests verified
4. Ensure no uncommitted changes
5. Leave the app in a working state (no broken features)

---

## TESTING REQUIREMENTS

**ALL testing must use browser automation tools.**

Available tools:

**Navigation & Screenshots:**

- browser_navigate - Navigate to a URL
- browser_navigate_back - Go back to previous page
- browser_take_screenshot - Capture screenshot (use for visual verification)
- browser_snapshot - Get accessibility tree snapshot (structured page data)

**Element Interaction:**

- browser_click - Click elements (has built-in auto-wait)
- browser_type - Type text into editable elements
- browser_fill_form - Fill multiple form fields at once
- browser_select_option - Select dropdown options
- browser_hover - Hover over elements
- browser_drag - Drag and drop between elements
- browser_press_key - Press keyboard keys

**Debugging & Monitoring:**

- browser_console_messages - Get browser console output (check for errors)
- browser_network_requests - Monitor API calls and responses
- browser_evaluate - Execute JavaScript (use sparingly)

**Browser Management:**

- browser_close - Close the browser
- browser_resize - Resize browser window (use to test mobile: 375x667, tablet: 768x1024, desktop: 1280x720)
- browser_tabs - Manage browser tabs
- browser_wait_for - Wait for text/element/time
- browser_handle_dialog - Handle alert/confirm dialogs
- browser_file_upload - Upload files

**Key Benefits:**

- All interaction tools have **built-in auto-wait** - no manual timeouts needed
- Use `browser_console_messages` to detect JavaScript errors
- Use `browser_network_requests` to verify API calls succeed

Test like a human user with mouse and keyboard. Don't take shortcuts by using JavaScript evaluation.

---

## FEATURE TOOL USAGE RULES (CRITICAL - DO NOT VIOLATE)

The feature tools exist to reduce token usage. **DO NOT make exploratory queries.**

### ALLOWED Feature Tools (ONLY these):

```
# 1. Get progress stats (passing/total counts)
feature_get_stats

# 2. Get the NEXT feature to work on (one feature only)
feature_get_next

# 3. Get up to 3 random passing features for regression testing
feature_get_for_regression

# 4. Mark a feature as passing (after verification)
feature_mark_passing with feature_id={id}

# 5. Skip a feature (moves to end of queue) - ONLY when blocked by dependency
feature_skip with feature_id={id}
```

### RULES:

- Do NOT try to fetch lists of all features
- Do NOT query features by category
- Do NOT list all pending features

**You do NOT need to see all features.** The feature_get_next tool tells you exactly what to work on. Trust it.

---

## EMAIL INTEGRATION (DEVELOPMENT MODE)

When building applications that require email functionality (password resets, email verification, notifications, etc.), you typically won't have access to a real email service or the ability to read email inboxes.

**Solution:** Configure the application to log emails to the terminal instead of sending them.

- Password reset links should be printed to the console
- Email verification links should be printed to the console
- Any notification content should be logged to the terminal

**During testing:**

1. Trigger the email action (e.g., click "Forgot Password")
2. Check the terminal/server logs for the generated link
3. Use that link directly to verify the functionality works

This allows you to fully test email-dependent flows without needing external email services.
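The pattern above can be sketched as a development-mode "transport" that logs instead of sending. The `buildResetLink` and `sendMail` names are illustrative assumptions, not existing project APIs:

```javascript
// Build the password-reset URL that would normally go in an email.
function buildResetLink(baseUrl, token) {
  return `${baseUrl}/reset-password?token=${encodeURIComponent(token)}`;
}

// Development "email": print to the terminal so the testing agent can
// copy the link straight from the server logs instead of reading an inbox.
function sendMail({ to, subject, body }) {
  console.log(`[DEV EMAIL] to=${to} subject="${subject}"`);
  console.log(body);
  return { delivered: false, logged: true };
}

const link = buildResetLink("http://localhost:3001", "abc 123");
const result = sendMail({
  to: "user@example.com",
  subject: "Password reset",
  body: `Reset your password: ${link}`,
});
```

In production this function would be swapped for a real email service; keeping the same call signature makes that swap a one-line change.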

---

## IMPORTANT REMINDERS

**Your Goal:** Production-quality application with all tests passing

**This Session's Goal:** Complete at least one feature perfectly

**Priority:** Fix broken tests before implementing new features

**Quality Bar:**

- Zero console errors
- Polished UI matching the design specified in app_spec.txt
- All features work end-to-end through the UI
- Fast, responsive, professional
- **NO MOCK DATA - all data from real database**
- **Security enforced - unauthorized access blocked**
- **All navigation works - no 404s or broken links**

**You have unlimited time.** Take as long as needed to get it right. The most important thing is that you
leave the codebase in a clean state before terminating the session (Step 10).

---

Begin by running Step 1 (Get Your Bearings).
513
.claude/templates/initializer_prompt.template.md
Normal file
@@ -0,0 +1,513 @@
## YOUR ROLE - INITIALIZER AGENT (Session 1 of Many)

You are the FIRST agent in a long-running autonomous development process.
Your job is to set up the foundation for all future coding agents.

### FIRST: Read the Project Specification

Start by reading `app_spec.txt` in your working directory. This file contains
the complete specification for what you need to build. Read it carefully
before proceeding.

### CRITICAL FIRST TASK: Create Features

Based on `app_spec.txt`, create features using the feature_create_bulk tool. The features are stored in a SQLite database,
which is the single source of truth for what needs to be built.

**Creating Features:**

Use the feature_create_bulk tool to add all features at once:

```
Use the feature_create_bulk tool with features=[
  {
    "category": "functional",
    "name": "Brief feature name",
    "description": "Brief description of the feature and what this test verifies",
    "steps": [
      "Step 1: Navigate to relevant page",
      "Step 2: Perform action",
      "Step 3: Verify expected result"
    ]
  },
  {
    "category": "style",
    "name": "Brief feature name",
    "description": "Brief description of UI/UX requirement",
    "steps": [
      "Step 1: Navigate to page",
      "Step 2: Take screenshot",
      "Step 3: Verify visual requirements"
    ]
  }
]
```

**Notes:**

- IDs and priorities are assigned automatically based on order
- All features start with `passes: false` by default
- You can create features in batches if there are many (e.g., 50 at a time)
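Batching can be as simple as chunking the feature array before each feature_create_bulk call. A minimal sketch; the `chunk` helper is illustrative, not a project API:

```javascript
// Split a long feature list into batches of up to `size` items so that
// each feature_create_bulk call stays small.
function chunk(items, size = 50) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Example: 120 features become batches of 50, 50, and 20.
const features = Array.from({ length: 120 }, (_, i) => ({
  category: "functional",
  name: `Feature ${i + 1}`,
}));
const batches = chunk(features);
console.log(batches.map((b) => b.length)); // [ 50, 50, 20 ]
```

Because priorities are assigned in order, batches must be submitted in the same order the features were listed.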
|
||||
|
||||
**Requirements for features:**
|
||||
|
||||
- Feature count must match the `feature_count` specified in app_spec.txt
|
||||
- Reference tiers for other projects:
|
||||
- **Simple apps**: ~150 tests
|
||||
- **Medium apps**: ~250 tests
|
||||
- **Complex apps**: ~400+ tests
|
||||
- Both "functional" and "style" categories
|
||||
- Mix of narrow tests (2-5 steps) and comprehensive tests (10+ steps)
|
||||
- At least 25 tests MUST have 10+ steps each (more for complex apps)
|
||||
- Order features by priority: fundamental features first (the API assigns priority based on order)
|
||||
- All features start with `passes: false` automatically
|
||||
- Cover every feature in the spec exhaustively
|
||||
- **MUST include tests from ALL 20 mandatory categories below**
|
||||
|
||||
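For illustration, a comprehensive (10+ step) feature entry might look like the following sketch. The entity and step wording here is hypothetical, not taken from any particular spec:

```python
# Hypothetical example of one comprehensive feature entry, as it would be
# passed to the feature_create_bulk tool. The wording is illustrative only.
comprehensive_feature = {
    "category": "functional",
    "name": "Task lifecycle end-to-end",
    "description": "Create, edit, complete, and delete a task, verifying persistence at each stage",
    "steps": [
        "Step 1: Log in as a regular user",
        "Step 2: Navigate to the task list page",
        "Step 3: Click 'New Task' and fill in a unique title",
        "Step 4: Submit the form and verify success feedback",
        "Step 5: Refresh the page and verify the task still appears",
        "Step 6: Open the task and edit its title",
        "Step 7: Save and verify the change persists after refresh",
        "Step 8: Mark the task complete and verify its status updates",
        "Step 9: Filter by 'completed' and verify the task is listed",
        "Step 10: Delete the task with confirmation",
        "Step 11: Verify the task is gone from the list and from filters",
    ],
}

# Counts toward the "at least 25 tests with 10+ steps" requirement:
assert len(comprehensive_feature["steps"]) >= 10
```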
---

## MANDATORY TEST CATEGORIES

The feature list **MUST** include tests from ALL of these categories. The minimum counts scale by complexity tier.

### Category Distribution by Complexity Tier

| Category                         | Simple  | Medium  | Complex  |
| -------------------------------- | ------- | ------- | -------- |
| A. Security & Access Control     | 5       | 20      | 40       |
| B. Navigation Integrity          | 15      | 25      | 40       |
| C. Real Data Verification        | 20      | 30      | 50       |
| D. Workflow Completeness         | 10      | 20      | 40       |
| E. Error Handling                | 10      | 15      | 25       |
| F. UI-Backend Integration        | 10      | 20      | 35       |
| G. State & Persistence           | 8       | 10      | 15       |
| H. URL & Direct Access           | 5       | 10      | 20       |
| I. Double-Action & Idempotency   | 5       | 8       | 15       |
| J. Data Cleanup & Cascade        | 5       | 10      | 20       |
| K. Default & Reset               | 5       | 8       | 12       |
| L. Search & Filter Edge Cases    | 8       | 12      | 20       |
| M. Form Validation               | 10      | 15      | 25       |
| N. Feedback & Notification       | 8       | 10      | 15       |
| O. Responsive & Layout           | 8       | 10      | 15       |
| P. Accessibility                 | 8       | 10      | 15       |
| Q. Temporal & Timezone           | 5       | 8       | 12       |
| R. Concurrency & Race Conditions | 5       | 8       | 15       |
| S. Export/Import                 | 5       | 6       | 10       |
| T. Performance                   | 5       | 5       | 10       |
| **TOTAL**                        | **160** | **260** | **434+** |

The TOTAL row is the sum of the per-category minimums.

---

### A. Security & Access Control Tests

Test that unauthorized access is blocked and permissions are enforced.

**Required tests (examples):**

- Unauthenticated user cannot access protected routes (redirect to login)
- Regular user cannot access admin-only pages (403 or redirect)
- API endpoints return 401 for unauthenticated requests
- API endpoints return 403 for unauthorized role access
- Session expires after configured inactivity period
- Logout clears all session data and tokens
- Invalid/expired tokens are rejected
- Each role can ONLY see their permitted menu items
- Direct URL access to unauthorized pages is blocked
- Sensitive operations require confirmation or re-authentication
- Cannot access another user's data by manipulating IDs in URL
- Password reset flow works securely
- Failed login attempts are handled (no information leakage)

### B. Navigation Integrity Tests

Test that every button, link, and menu item goes to the correct place.

**Required tests (examples):**

- Every button in sidebar navigates to correct page
- Every menu item links to existing route
- All CRUD action buttons (Edit, Delete, View) go to correct URLs with correct IDs
- Back button works correctly after each navigation
- Deep linking works (direct URL access to any page with auth)
- Breadcrumbs reflect actual navigation path
- 404 page shown for non-existent routes (not crash)
- After login, user redirected to intended destination (or dashboard)
- After logout, user redirected to login page
- Pagination links work and preserve current filters
- Tab navigation within pages works correctly
- Modal close buttons return to previous state
- Cancel buttons on forms return to previous page

### C. Real Data Verification Tests

Test that data is real (not mocked) and persists correctly.

**Required tests (examples):**

- Create a record via UI with unique content → verify it appears in list
- Create a record → refresh page → record still exists
- Create a record → log out → log in → record still exists
- Edit a record → verify changes persist after refresh
- Delete a record → verify it's gone from list AND database
- Delete a record → verify it's gone from related dropdowns
- Filter/search → results match actual data created in test
- Dashboard statistics reflect real record counts (create 3 items, count shows 3)
- Reports show real aggregated data
- Export functionality exports actual data you created
- Related records update when parent changes
- Timestamps are real and accurate (created_at, updated_at)
- Data created by User A is not visible to User B (unless shared)
- Empty state shows correctly when no data exists
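The pattern behind these tests is always the same: create data with a unique marker, then assert that the exact marker round-trips through the backend. A minimal sketch, using an in-memory stand-in where a real test would drive the running app's API:

```python
import uuid

def real_data_roundtrip(api):
    """Create a uniquely-marked record and verify it round-trips through the backend."""
    marker = f"TEST_{uuid.uuid4().hex[:8]}_VERIFY_ME"
    api.create({"title": marker})
    titles = [r["title"] for r in api.list()]
    assert marker in titles, "created record must appear in the list"
    api.delete(marker)
    assert marker not in [r["title"] for r in api.list()], "deleted record must disappear"
    return marker

# Stand-in for a real HTTP client; an actual feature test would hit the app instead.
class FakeApi:
    def __init__(self): self._rows = []
    def create(self, row): self._rows.append(row)
    def list(self): return list(self._rows)
    def delete(self, title): self._rows = [r for r in self._rows if r["title"] != title]

marker = real_data_roundtrip(FakeApi())
```

Because the marker is random per run, any matching data that the test did not create is, by construction, mock data.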
### D. Workflow Completeness Tests

Test that every workflow can be completed end-to-end through the UI.

**Required tests (examples):**

- Every entity has working Create operation via UI form
- Every entity has working Read/View operation (detail page loads)
- Every entity has working Update operation (edit form saves)
- Every entity has working Delete operation (with confirmation dialog)
- Every status/state has a UI mechanism to transition to next state
- Multi-step processes (wizards) can be completed end-to-end
- Bulk operations (select all, delete selected) work
- Cancel/Undo operations work where applicable
- Required fields prevent submission when empty
- Form validation shows errors before submission
- Successful submission shows success feedback
- Backend workflow (e.g., user→customer conversion) has UI trigger

### E. Error Handling Tests

Test graceful handling of errors and edge cases.

**Required tests (examples):**

- Network failure shows user-friendly error message, not crash
- Invalid form input shows field-level errors
- API errors display meaningful messages to user
- 404 responses handled gracefully (show not found page)
- 500 responses don't expose stack traces or technical details
- Empty search results show "no results found" message
- Loading states shown during all async operations
- Timeout doesn't hang the UI indefinitely
- Submitting form with server error keeps user data in form
- File upload errors (too large, wrong type) show clear message
- Duplicate entry errors (e.g., email already exists) are clear

### F. UI-Backend Integration Tests

Test that frontend and backend communicate correctly.

**Required tests (examples):**

- Frontend request format matches what backend expects
- Backend response format matches what frontend parses
- All dropdown options come from real database data (not hardcoded)
- Related entity selectors (e.g., "choose category") populated from DB
- Changes in one area reflect in related areas after refresh
- Deleting parent handles children correctly (cascade or block)
- Filters work with actual data attributes from database
- Sort functionality sorts real data correctly
- Pagination returns correct page of real data
- API error responses are parsed and displayed correctly
- Loading spinners appear during API calls
- Optimistic updates (if used) rollback on failure

### G. State & Persistence Tests

Test that state is maintained correctly across sessions and tabs.

**Required tests (examples):**

- Refresh page mid-form - appropriate behavior (data kept or cleared)
- Close browser, reopen - session state handled correctly
- Same user in two browser tabs - changes sync or handled gracefully
- Browser back after form submit - no duplicate submission
- Bookmark a page, return later - works (with auth check)
- LocalStorage/cookies cleared - graceful re-authentication
- Unsaved changes warning when navigating away from dirty form

### H. URL & Direct Access Tests

Test direct URL access and URL manipulation security.

**Required tests (examples):**

- Change entity ID in URL - cannot access others' data
- Access /admin directly as regular user - blocked
- Malformed URL parameters - handled gracefully (no crash)
- Very long URL - handled correctly
- URL with SQL injection attempt - rejected/sanitized
- Deep link to deleted entity - shows "not found", not crash
- Query parameters for filters are reflected in UI
- Sharing a URL with filters preserves those filters

### I. Double-Action & Idempotency Tests

Test that rapid or duplicate actions don't cause issues.

**Required tests (examples):**

- Double-click submit button - only one record created
- Rapid multiple clicks on delete - only one deletion occurs
- Submit form, hit back, submit again - appropriate behavior
- Multiple simultaneous API calls - server handles correctly
- Refresh during save operation - data not corrupted
- Click same navigation link twice quickly - no issues
- Submit button disabled during processing
### J. Data Cleanup & Cascade Tests

Test that deleting data cleans up properly everywhere.

**Required tests (examples):**

- Delete parent entity - children removed from all views
- Delete item - removed from search results immediately
- Delete item - statistics/counts updated immediately
- Delete item - related dropdowns updated
- Delete item - cached views refreshed
- Soft delete (if applicable) - item hidden but recoverable
- Hard delete - item completely removed from database

### K. Default & Reset Tests

Test that defaults and reset functionality work correctly.

**Required tests (examples):**

- New form shows correct default values
- Date pickers default to sensible dates (today, not 1970)
- Dropdowns default to correct option (or placeholder)
- Reset button clears to defaults, not just empty
- Clear filters button resets all filters to default
- Pagination resets to page 1 when filters change
- Sorting resets when changing views

### L. Search & Filter Edge Cases

Test search and filter functionality thoroughly.

**Required tests (examples):**

- Empty search shows all results (or appropriate message)
- Search with only spaces - handled correctly
- Search with special characters (!@#$%^&\*) - no errors
- Search with quotes - handled correctly
- Search with very long string - handled correctly
- Filter combinations that return zero results - shows message
- Filter + search + sort together - all work correctly
- Filter persists after viewing detail and returning to list
- Clear individual filter - works correctly
- Search is case-insensitive (or clearly case-sensitive)

### M. Form Validation Tests

Test all form validation rules exhaustively.

**Required tests (examples):**

- Required field empty - shows error, blocks submit
- Email field with invalid email formats - shows error
- Password field - enforces complexity requirements
- Numeric field with letters - rejected
- Date field with invalid date - rejected
- Min/max length enforced on text fields
- Min/max values enforced on numeric fields
- Duplicate unique values rejected (e.g., duplicate email)
- Error messages are specific (not just "invalid")
- Errors clear when user fixes the issue
- Server-side validation matches client-side
- Whitespace-only input rejected for required fields

### N. Feedback & Notification Tests

Test that users get appropriate feedback for all actions.

**Required tests (examples):**

- Every successful save/create shows success feedback
- Every failed action shows error feedback
- Loading spinner during every async operation
- Disabled state on buttons during form submission
- Progress indicator for long operations (file upload)
- Toast/notification disappears after appropriate time
- Multiple notifications don't overlap incorrectly
- Success messages are specific (not just "Success")

### O. Responsive & Layout Tests

Test that the UI works on different screen sizes.

**Required tests (examples):**

- Desktop layout correct at 1920px width
- Tablet layout correct at 768px width
- Mobile layout correct at 375px width
- No horizontal scroll on any standard viewport
- Touch targets large enough on mobile (44px min)
- Modals fit within viewport on mobile
- Long text truncates or wraps correctly (no overflow)
- Tables scroll horizontally if needed on mobile
- Navigation collapses appropriately on mobile

### P. Accessibility Tests

Test basic accessibility compliance.

**Required tests (examples):**

- Tab navigation works through all interactive elements
- Focus ring visible on all focused elements
- Screen reader can navigate main content areas
- ARIA labels on icon-only buttons
- Color contrast meets WCAG AA (4.5:1 for text)
- No information conveyed by color alone
- Form fields have associated labels
- Error messages announced to screen readers
- Skip link to main content (if applicable)
- Images have alt text

### Q. Temporal & Timezone Tests

Test date/time handling.

**Required tests (examples):**

- Dates display in user's local timezone
- Created/updated timestamps accurate and formatted correctly
- Date picker allows only valid date ranges
- Overdue items identified correctly (timezone-aware)
- "Today", "This Week" filters work correctly for user's timezone
- Recurring items generate at correct times (if applicable)
- Date sorting works correctly across months/years
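A timezone-aware "today" check can be sketched with the standard library's zoneinfo; the function name and field here are illustrative:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def is_today_for_user(created_at_utc: datetime, user_tz: str) -> bool:
    """True if a UTC timestamp falls on 'today' in the user's local timezone."""
    local_now = datetime.now(timezone.utc).astimezone(ZoneInfo(user_tz))
    local_created = created_at_utc.astimezone(ZoneInfo(user_tz))
    return local_created.date() == local_now.date()

# A record created at this instant is "today" in any timezone:
now_utc = datetime.now(timezone.utc)
assert is_today_for_user(now_utc, "America/New_York")
assert is_today_for_user(now_utc, "Asia/Tokyo")
```

The key point the tests above probe is that comparisons happen after conversion to the user's zone, never on raw UTC dates.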
### R. Concurrency & Race Condition Tests

Test multi-user and race condition scenarios.

**Required tests (examples):**

- Two users edit same record - last save wins or conflict shown
- Record deleted while another user viewing - graceful handling
- List updates while user on page 2 - pagination still works
- Rapid navigation between pages - no stale data displayed
- API response arrives after user navigated away - no crash
- Concurrent form submissions from same user handled
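One common way to make the "two users edit the same record" case deterministic is optimistic locking with a version column; a minimal sqlite3 sketch (schema and names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT, version INTEGER)")
conn.execute("INSERT INTO notes VALUES (1, 'draft', 1)")

def save(note_id: int, body: str, expected_version: int) -> bool:
    """Update only if nobody else saved first; returns False on conflict."""
    cur = conn.execute(
        "UPDATE notes SET body = ?, version = version + 1 WHERE id = ? AND version = ?",
        (body, note_id, expected_version),
    )
    return cur.rowcount == 1

# Both users loaded version 1; only the first save wins.
assert save(1, "user A's edit", 1) is True
assert save(1, "user B's edit", 1) is False  # conflict: version already moved to 2
```

A feature test for this category would then verify that the losing save surfaces a conflict message in the UI rather than silently overwriting.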
### S. Export/Import Tests (if applicable)

Test data export and import functionality.

**Required tests (examples):**

- Export all data - file contains all records
- Export filtered data - only filtered records included
- Import valid file - all records created correctly
- Import duplicate data - handled correctly (skip/update/error)
- Import malformed file - error message, no partial import
- Export then import - data integrity preserved exactly

### T. Performance Tests

Test basic performance requirements.

**Required tests (examples):**

- Page loads in <3s with 100 records
- Page loads in <5s with 1000 records
- Search responds in <1s
- Infinite scroll doesn't degrade with many items
- Large file upload shows progress
- Memory doesn't leak on long sessions
- No console errors during normal operation

---

## ABSOLUTE PROHIBITION: NO MOCK DATA

The feature list must include tests that **actively verify real data** and **detect mock data patterns**.

**Include these specific tests:**

1. Create unique test data (e.g., "TEST_12345_VERIFY_ME")
2. Verify that EXACT data appears in UI
3. Refresh page - data persists
4. Delete data - verify it's gone
5. If data appears that wasn't created during test - FLAG AS MOCK DATA

**The agent implementing features MUST NOT use:**

- Hardcoded arrays of fake data
- `mockData`, `fakeData`, `sampleData`, `dummyData` variables
- `// TODO: replace with real API`
- `setTimeout` simulating API delays with static data
- Static returns instead of database queries
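A reviewer (human or agent) can mechanically scan source files for these forbidden patterns; a minimal sketch:

```python
import re

# Identifier patterns the implementing agent must not introduce (from the list above).
FORBIDDEN = re.compile(r"\b(mockData|fakeData|sampleData|dummyData)\b|TODO: replace with real API")

def find_mock_patterns(source: str) -> list[str]:
    """Return the forbidden identifiers/markers found in a source string."""
    return [m.group(0) for m in FORBIDDEN.finditer(source)]

clean = "const rows = await db.query('SELECT * FROM tasks');"
dirty = "const mockData = [{id: 1}]; // TODO: replace with real API"
assert find_mock_patterns(clean) == []
assert len(find_mock_patterns(dirty)) == 2
```

This only catches the named patterns; the runtime tests above (unique markers that must round-trip) remain the real safeguard.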
---

**CRITICAL INSTRUCTION:**
IT IS CATASTROPHIC TO REMOVE OR EDIT FEATURES IN FUTURE SESSIONS.
Features can ONLY be marked as passing (via the `feature_mark_passing` tool with the feature_id).
Never remove features, never edit descriptions, never modify testing steps.
This ensures no functionality is missed.

### SECOND TASK: Create init.sh

Create a script called `init.sh` that future agents can use to quickly set up and run the development environment. The script should:

1. Install any required dependencies
2. Start any necessary servers or services
3. Print helpful information about how to access the running application

Base the script on the technology stack specified in `app_spec.txt`.

### THIRD TASK: Initialize Git

Create a git repository and make your first commit with:

- init.sh (environment setup script)
- README.md (project overview and setup instructions)
- Any initial project structure files

Note: Features are stored in the SQLite database (features.db), not in a JSON file.

Commit message: "Initial setup: init.sh, project structure, and features created via API"

### FOURTH TASK: Create Project Structure

Set up the basic project structure based on what's specified in `app_spec.txt`. This typically includes directories for frontend, backend, and any other components mentioned in the spec.

### OPTIONAL: Start Implementation

If you have time remaining in this session, you may begin implementing the highest-priority features. Get the next feature with:

```
Use the feature_get_next tool
```

Remember:

- Work on ONE feature at a time
- Test thoroughly before marking as passing
- Commit your progress before session ends
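Taken together, the per-feature workflow looks like:

```
Use the feature_get_next tool            -> highest-priority feature with passes: false
Implement the feature
Walk through its steps end-to-end to verify
Use the feature_mark_passing tool with the feature_id
Commit, then repeat
```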
### ENDING THIS SESSION

Before your context fills up:

1. Commit all work with descriptive messages
2. Create `claude-progress.txt` with a summary of what you accomplished
3. Verify features were created using the feature_get_stats tool
4. Leave the environment in a clean, working state

The next agent will continue from here with a fresh context window.

---

**Remember:** You have unlimited time across many sessions. Focus on quality over speed. Production-ready is the goal.
2
.env.example
Normal file
@@ -0,0 +1,2 @@
# Optional: N8N webhook for progress notifications
# PROGRESS_N8N_WEBHOOK_URL=https://your-n8n-instance.com/webhook/...
10
.gitignore
vendored
Normal file
@@ -0,0 +1,10 @@
# Agent-generated output directories
generations/

# Log files
logs/

venv/
.env
__pycache__/
*.pyc
242
README.md
Normal file
@@ -0,0 +1,242 @@
# Autonomous Coding Agent

A long-running autonomous coding agent powered by the Claude Agent SDK. This tool can build complete applications over multiple sessions using a two-agent pattern (initializer + coding agent).

## Video Walkthrough

[](https://youtu.be/YW09hhnVqNM)

> **[Watch the setup and usage guide →](https://youtu.be/YW09hhnVqNM)**

---

## Prerequisites

### Claude Code CLI (Required)

This project requires the Claude Code CLI to be installed. Install it using one of these methods:

**macOS / Linux:**

```bash
curl -fsSL https://claude.ai/install.sh | bash
```

**Windows (PowerShell):**

```powershell
irm https://claude.ai/install.ps1 | iex
```

### Authentication

You need one of the following:

- **Claude Pro/Max Subscription** - Use `claude login` to authenticate (recommended)
- **Anthropic API Key** - Pay-per-use from https://console.anthropic.com/

---

## Quick Start

### 1. Clone the Repository

```bash
git clone https://github.com/your-repo/autonomous-coding.git
cd autonomous-coding
```

### 2. Run the Start Script

**Windows:**

```cmd
start.bat
```

**macOS / Linux:**

```bash
./start.sh
```

The start script will:

1. Check if Claude CLI is installed
2. Check if you're authenticated (prompt to run `claude login` if not)
3. Create a Python virtual environment
4. Install dependencies
5. Launch the main menu

### 3. Create or Continue a Project

You'll see a menu with options to:

- **Create new project** - Start a fresh project with AI-assisted spec generation
- **Continue existing project** - Resume work on a previous project

For new projects, you can use the built-in `/create-spec` command to interactively create your app specification with Claude's help.

---

## How It Works

### Two-Agent Pattern

1. **Initializer Agent (First Session):** Reads your app specification, creates the feature list (stored in the project's SQLite database, `features.db`), sets up the project structure, and initializes git.

2. **Coding Agent (Subsequent Sessions):** Picks up where the previous session left off, implements features one by one, and marks them as passing in the features database.

### Session Management

- Each session runs with a fresh context window
- Progress is persisted via the features database and git commits
- The agent auto-continues between sessions (3-second delay)
- Press `Ctrl+C` to pause; run the start script again to resume

---

## Important Timing Expectations

> **Note: Building complete applications takes time!**

- **First session (initialization):** The agent generates feature test cases. This takes several minutes and may appear to hang - this is normal.

- **Subsequent sessions:** Each coding iteration can take **5-15 minutes** depending on complexity.

- **Full app:** Building all features typically requires **many hours** of total runtime across multiple sessions.

**Tip:** The feature count in the prompts determines scope. For faster demos, you can modify your app spec to target fewer features (e.g., 20-50 features for a quick demo).

---

## Project Structure

```
autonomous-coding/
├── start.bat                  # Windows start script
├── start.sh                   # macOS/Linux start script
├── start.py                   # Main menu and project management
├── autonomous_agent_demo.py   # Agent entry point
├── agent.py                   # Agent session logic
├── client.py                  # Claude SDK client configuration
├── security.py                # Bash command allowlist and validation
├── progress.py                # Progress tracking utilities
├── prompts.py                 # Prompt loading utilities
├── .claude/
│   ├── commands/
│   │   └── create-spec.md     # Interactive spec creation command
│   └── templates/             # Prompt templates
├── generations/               # Generated projects go here
├── requirements.txt           # Python dependencies
└── .env                       # Optional configuration (N8N webhook)
```

---

## Generated Project Structure

After the agent runs, your project directory will contain:

```
generations/my_project/
├── features.db                # Feature test cases (source of truth)
├── prompts/
│   ├── app_spec.txt           # Your app specification
│   ├── initializer_prompt.md  # First session prompt
│   └── coding_prompt.md       # Continuation session prompt
├── init.sh                    # Environment setup script
├── claude-progress.txt        # Session progress notes
└── [application files]        # Generated application code
```

---

## Running the Generated Application

After the agent completes (or pauses), you can run the generated application:

```bash
cd generations/my_project

# Run the setup script created by the agent
./init.sh

# Or manually (typical for Node.js apps):
npm install
npm run dev
```

The application will typically be available at `http://localhost:3000` or similar.

---

## Security Model

This project uses a defense-in-depth security approach (see `security.py` and `client.py`):

1. **OS-level Sandbox:** Bash commands run in an isolated environment
2. **Filesystem Restrictions:** File operations restricted to the project directory only
3. **Bash Allowlist:** Only specific commands are permitted:
   - File inspection: `ls`, `cat`, `head`, `tail`, `wc`, `grep`
   - Node.js: `npm`, `node`
   - Version control: `git`
   - Process management: `ps`, `lsof`, `sleep`, `pkill` (dev processes only)

Commands not in the allowlist are blocked by the security hook.
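The allowlist check in `security.py` presumably looks something like the following sketch; the function body here is illustrative, not the actual implementation (only the `ALLOWED_COMMANDS` name and command list come from this README):

```python
import shlex

# Illustrative subset of the allowlist described above.
ALLOWED_COMMANDS = {"ls", "cat", "head", "tail", "wc", "grep",
                    "npm", "node", "git", "ps", "lsof", "sleep", "pkill"}

def is_allowed(command_line: str) -> bool:
    """Allow a bash command only if every pipeline segment starts with an allowlisted binary."""
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        return False  # unparseable input is rejected outright
    if not tokens:
        return False
    # Check the first word of each pipe/&&/;-separated segment.
    segments, current = [], []
    for tok in tokens:
        if tok in ("|", "&&", ";"):
            segments.append(current)
            current = []
        else:
            current.append(tok)
    segments.append(current)
    return all(seg and seg[0] in ALLOWED_COMMANDS for seg in segments)

assert is_allowed("git status")
assert not is_allowed("curl https://example.com")
assert not is_allowed("cat file | sh")  # every segment must be allowlisted
```

Checking each pipeline segment, not just the first word, is what prevents smuggling a blocked binary behind an allowed one.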
---

## Configuration (Optional)

### N8N Webhook Integration

The agent can send progress notifications to an N8N webhook. Create a `.env` file:

```bash
# Optional: N8N webhook for progress notifications
PROGRESS_N8N_WEBHOOK_URL=https://your-n8n-instance.com/webhook/your-webhook-id
```

When test progress increases, the agent sends:

```json
{
  "event": "test_progress",
  "passing": 45,
  "total": 200,
  "percentage": 22.5,
  "project": "my_project",
  "timestamp": "2025-01-15T14:30:00.000Z"
}
```
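Posting that payload needs only the standard library; `progress.py` presumably does something close to this sketch (the function name and error handling here are assumptions, not the actual implementation):

```python
import json
import os
import urllib.request

def send_progress(passing: int, total: int, project: str, timestamp: str) -> None:
    """POST a test_progress event to the configured N8N webhook, if any."""
    url = os.environ.get("PROGRESS_N8N_WEBHOOK_URL")
    if not url:
        return  # webhook is optional; silently skip when unconfigured
    payload = {
        "event": "test_progress",
        "passing": passing,
        "total": total,
        "percentage": round(passing / total * 100, 1),
        "project": project,
        "timestamp": timestamp,
    }
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=10)
```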
---

## Customization

### Changing the Application

Use the `/create-spec` command when creating a new project, or manually edit the files in your project's `prompts/` directory:

- `app_spec.txt` - Your application specification
- `initializer_prompt.md` - Controls feature generation

### Modifying Allowed Commands

Edit `security.py` to add or remove commands from `ALLOWED_COMMANDS`.

---

## Troubleshooting

**"Claude CLI not found"**
Install the Claude Code CLI using the instructions in the Prerequisites section.

**"Not authenticated with Claude"**
Run `claude login` to authenticate. The start script will prompt you to do this automatically.

**"Appears to hang on first run"**
This is normal. The initializer agent is generating detailed test cases, which takes significant time. Watch for `[Tool: ...]` output to confirm the agent is working.

**"Command blocked by security hook"**
The agent tried to run a command not in the allowlist. This is the security system working as intended. If needed, add the command to `ALLOWED_COMMANDS` in `security.py`.

---

## License

Internal Anthropic use.
213
agent.py
Normal file
@@ -0,0 +1,213 @@
"""
|
||||
Agent Session Logic
|
||||
===================
|
||||
|
||||
Core agent interaction functions for running autonomous coding sessions.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
from pathlib import Path
|
||||
from typing import Optional
|
||||
|
||||
from claude_agent_sdk import ClaudeSDKClient
|
||||
|
||||
from client import create_client
|
||||
from progress import print_session_header, print_progress_summary, has_features
|
||||
from prompts import (
|
||||
get_initializer_prompt,
|
||||
get_coding_prompt,
|
||||
copy_spec_to_project,
|
||||
has_project_prompts,
|
||||
)
|
||||
|
||||
|
||||
# Configuration
|
||||
AUTO_CONTINUE_DELAY_SECONDS = 3
|
||||
|
||||
|
||||
async def run_agent_session(
|
||||
client: ClaudeSDKClient,
|
||||
message: str,
|
||||
project_dir: Path,
|
||||
) -> tuple[str, str]:
|
||||
"""
|
||||
Run a single agent session using Claude Agent SDK.
|
||||
|
||||
Args:
|
||||
client: Claude SDK client
|
||||
message: The prompt to send
|
||||
project_dir: Project directory path
|
||||
|
||||
Returns:
|
||||
(status, response_text) where status is:
|
||||
- "continue" if agent should continue working
|
||||
- "error" if an error occurred
|
||||
"""
|
||||
print("Sending prompt to Claude Agent SDK...\n")
|
||||
|
||||
try:
|
||||
# Send the query
|
||||
await client.query(message)
|
||||
|
||||
# Collect response text and show tool use
|
||||
response_text = ""
|
||||
async for msg in client.receive_response():
|
||||
msg_type = type(msg).__name__
|
||||
|
||||
# Handle AssistantMessage (text and tool use)
|
||||
if msg_type == "AssistantMessage" and hasattr(msg, "content"):
|
||||
for block in msg.content:
|
||||
block_type = type(block).__name__
|
||||
|
||||
if block_type == "TextBlock" and hasattr(block, "text"):
|
||||
response_text += block.text
|
||||
print(block.text, end="", flush=True)
|
||||
elif block_type == "ToolUseBlock" and hasattr(block, "name"):
|
||||
print(f"\n[Tool: {block.name}]", flush=True)
|
||||
if hasattr(block, "input"):
|
||||
input_str = str(block.input)
|
||||
if len(input_str) > 200:
|
||||
print(f" Input: {input_str[:200]}...", flush=True)
|
||||
else:
|
||||
print(f" Input: {input_str}", flush=True)
|
||||
|
||||
# Handle UserMessage (tool results)
|
||||
elif msg_type == "UserMessage" and hasattr(msg, "content"):
|
||||
for block in msg.content:
|
||||
block_type = type(block).__name__
|
||||
|
||||
if block_type == "ToolResultBlock":
|
||||
result_content = getattr(block, "content", "")
|
||||
is_error = getattr(block, "is_error", False)
|
||||
|
||||
# Check if command was blocked by security hook
|
||||
if "blocked" in str(result_content).lower():
|
||||
print(f" [BLOCKED] {result_content}", flush=True)
|
||||
elif is_error:
|
||||
# Show errors (truncated)
|
||||
error_str = str(result_content)[:500]
|
||||
print(f" [Error] {error_str}", flush=True)
|
||||
else:
|
||||
# Tool succeeded - just show brief confirmation
|
||||
print(" [Done]", flush=True)
|
||||
|
||||
print("\n" + "-" * 70 + "\n")
|
||||
return "continue", response_text
|
||||
|
||||
except Exception as e:
|
||||
print(f"Error during agent session: {e}")
|
||||
return "error", str(e)
|
||||
|
||||
|
||||
async def run_autonomous_agent(
    project_dir: Path,
    model: str,
    max_iterations: Optional[int] = None,
) -> None:
    """
    Run the autonomous agent loop.

    Args:
        project_dir: Directory for the project
        model: Claude model to use
        max_iterations: Maximum number of iterations (None for unlimited)
    """
    print("\n" + "=" * 70)
    print(" AUTONOMOUS CODING AGENT DEMO")
    print("=" * 70)
    print(f"\nProject directory: {project_dir}")
    print(f"Model: {model}")
    if max_iterations:
        print(f"Max iterations: {max_iterations}")
    else:
        print("Max iterations: Unlimited (will run until completion)")
    print()

    # Create project directory
    project_dir.mkdir(parents=True, exist_ok=True)

    # Check if this is a fresh start or continuation
    # Uses has_features() which checks if the database actually has features,
    # not just if the file exists (an empty db should still trigger the initializer)
    is_first_run = not has_features(project_dir)

    if is_first_run:
        print("Fresh start - will use initializer agent")
        print()
        print("=" * 70)
        print(" NOTE: First session takes 10-20+ minutes!")
        print(" The agent is generating 200 detailed test cases.")
        print(" This may appear to hang - it's working. Watch for [Tool: ...] output.")
        print("=" * 70)
        print()
        # Copy the app spec into the project directory for the agent to read
        copy_spec_to_project(project_dir)
    else:
        print("Continuing existing project")
        print_progress_summary(project_dir)

    # Main loop
    iteration = 0

    while True:
        iteration += 1

        # Check max iterations
        if max_iterations and iteration > max_iterations:
            print(f"\nReached max iterations ({max_iterations})")
            print("To continue, run the script again without --max-iterations")
            break

        # Print session header
        print_session_header(iteration, is_first_run)

        # Create client (fresh context)
        client = create_client(project_dir, model)

        # Choose prompt based on session type
        # Pass project_dir to enable project-specific prompts
        if is_first_run:
            prompt = get_initializer_prompt(project_dir)
            is_first_run = False  # Only use initializer once
        else:
            prompt = get_coding_prompt(project_dir)

        # Run session with async context manager
        async with client:
            status, response = await run_agent_session(client, prompt, project_dir)

        # Handle status
        if status == "continue":
            print(f"\nAgent will auto-continue in {AUTO_CONTINUE_DELAY_SECONDS}s...")
            print_progress_summary(project_dir)
            await asyncio.sleep(AUTO_CONTINUE_DELAY_SECONDS)

        elif status == "error":
            print("\nSession encountered an error")
            print("Will retry with a fresh session...")
            await asyncio.sleep(AUTO_CONTINUE_DELAY_SECONDS)

        # Small delay between sessions
        if max_iterations is None or iteration < max_iterations:
            print("\nPreparing next session...\n")
            await asyncio.sleep(1)

    # Final summary
    print("\n" + "=" * 70)
    print(" SESSION COMPLETE")
    print("=" * 70)
    print(f"\nProject directory: {project_dir}")
    print_progress_summary(project_dir)

    # Print instructions for running the generated application
    print("\n" + "-" * 70)
    print(" TO RUN THE GENERATED APPLICATION:")
    print("-" * 70)
    print(f"\n cd {project_dir.resolve()}")
    print(" ./init.sh # Run the setup script")
    print(" # Or manually:")
    print(" npm install && npm run dev")
    print("\n Then open http://localhost:3000 (or check init.sh for the URL)")
    print("-" * 70)

    print("\nDone!")
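The session loop above picks the initializer prompt exactly once, then coding prompts, and stops when `--max-iterations` is exceeded. A minimal stand-alone sketch of that control flow (the `run_loop` name and the recorded prompt kinds are illustrative, not from the repo; the real session call is stubbed with `asyncio.sleep`):

```python
import asyncio
from typing import List, Optional


async def run_loop(max_iterations: Optional[int] = None) -> List[str]:
    """Count sessions and record which prompt each one used, like the harness loop."""
    kinds: List[str] = []
    iteration = 0
    is_first_run = True
    while True:
        iteration += 1
        if max_iterations and iteration > max_iterations:
            break
        # Same selection rule as the harness: initializer once, then coding prompts
        kinds.append("initializer" if is_first_run else "coding")
        is_first_run = False
        await asyncio.sleep(0)  # stand-in for run_agent_session + delays
    return kinds


print(asyncio.run(run_loop(max_iterations=3)))  # ['initializer', 'coding', 'coding']
```

With `max_iterations=None` the real loop runs until every feature passes; the cap exists mainly for testing.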
10
api/__init__.py
Normal file
@@ -0,0 +1,10 @@
"""
API Package
============

Database models and utilities for feature management.
"""

from api.database import Feature, create_database, get_database_path

__all__ = ["Feature", "create_database", "get_database_path"]
99
api/database.py
Normal file
@@ -0,0 +1,99 @@
"""
Database Models and Connection
==============================

SQLite database schema for feature storage using SQLAlchemy.
"""

from pathlib import Path
from typing import Optional

from sqlalchemy import Boolean, Column, Integer, String, Text, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, Session
from sqlalchemy.types import JSON

Base = declarative_base()


class Feature(Base):
    """Feature model representing a test case/feature to implement."""

    __tablename__ = "features"

    id = Column(Integer, primary_key=True, index=True)
    priority = Column(Integer, nullable=False, default=999, index=True)
    category = Column(String(100), nullable=False)
    name = Column(String(255), nullable=False)
    description = Column(Text, nullable=False)
    steps = Column(JSON, nullable=False)  # Stored as JSON array
    passes = Column(Boolean, default=False, index=True)

    def to_dict(self) -> dict:
        """Convert feature to dictionary for JSON serialization."""
        return {
            "id": self.id,
            "priority": self.priority,
            "category": self.category,
            "name": self.name,
            "description": self.description,
            "steps": self.steps,
            "passes": self.passes,
        }


def get_database_path(project_dir: Path) -> Path:
    """Return the path to the SQLite database for a project."""
    return project_dir / "features.db"


def get_database_url(project_dir: Path) -> str:
    """Return the SQLAlchemy database URL for a project.

    Uses POSIX-style paths (forward slashes) for cross-platform compatibility.
    """
    db_path = get_database_path(project_dir)
    return f"sqlite:///{db_path.as_posix()}"


def create_database(project_dir: Path) -> tuple:
    """
    Create database and return engine + session maker.

    Args:
        project_dir: Directory containing the project

    Returns:
        Tuple of (engine, SessionLocal)
    """
    db_url = get_database_url(project_dir)
    engine = create_engine(db_url, connect_args={"check_same_thread": False})
    Base.metadata.create_all(bind=engine)
    SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
    return engine, SessionLocal


# Global session maker - will be set when server starts
_session_maker: Optional[sessionmaker] = None


def set_session_maker(session_maker: sessionmaker) -> None:
    """Set the global session maker."""
    global _session_maker
    _session_maker = session_maker


def get_db() -> Session:
    """
    Dependency for FastAPI to get database session.

    Yields a database session and ensures it's closed after use.
    """
    if _session_maker is None:
        raise RuntimeError("Database not initialized. Call set_session_maker first.")

    db = _session_maker()
    try:
        yield db
    finally:
        db.close()
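The `get_database_url` helper above leans on `Path.as_posix()` so the same code produces a valid forward-slash SQLite URL whether the project path came from Windows or POSIX. A quick stdlib-only check of that pattern (the `url_for` name is illustrative; no SQLAlchemy needed to see the behavior):

```python
from pathlib import PureWindowsPath, PurePosixPath


def url_for(path) -> str:
    # Same pattern as get_database_url: forward slashes regardless of platform
    return f"sqlite:///{path.as_posix()}"


print(url_for(PureWindowsPath(r"C:\proj\features.db")))  # sqlite:///C:/proj/features.db
print(url_for(PurePosixPath("proj/features.db")))        # sqlite:///proj/features.db
```

Without `as_posix()`, interpolating a `WindowsPath` directly would embed backslashes in the URL, which SQLAlchemy's SQLite dialect does not accept.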
154
api/migration.py
Normal file
@@ -0,0 +1,154 @@
"""
JSON to SQLite Migration
========================

Automatically migrates existing feature_list.json files to SQLite database.
"""

import json
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional

from sqlalchemy.orm import sessionmaker, Session

from api.database import Feature


def migrate_json_to_sqlite(
    project_dir: Path,
    session_maker: sessionmaker,
) -> bool:
    """
    Detect existing feature_list.json, import to SQLite, rename to backup.

    This function:
    1. Checks if feature_list.json exists
    2. Checks if database already has data (skips if so)
    3. Imports all features from JSON
    4. Renames JSON file to feature_list.json.backup.<timestamp>

    Args:
        project_dir: Directory containing the project
        session_maker: SQLAlchemy session maker

    Returns:
        True if migration was performed, False if skipped
    """
    json_file = project_dir / "feature_list.json"

    if not json_file.exists():
        return False  # No JSON file to migrate

    # Check if database already has data
    session: Session = session_maker()
    try:
        existing_count = session.query(Feature).count()
        if existing_count > 0:
            print(
                f"Database already has {existing_count} features, skipping migration"
            )
            return False
    finally:
        session.close()

    # Load JSON data
    try:
        with open(json_file, "r", encoding="utf-8") as f:
            features_data = json.load(f)
    except json.JSONDecodeError as e:
        print(f"Error parsing feature_list.json: {e}")
        return False
    except IOError as e:
        print(f"Error reading feature_list.json: {e}")
        return False

    if not isinstance(features_data, list):
        print("Error: feature_list.json must contain a JSON array")
        return False

    # Import features into database
    session = session_maker()
    try:
        imported_count = 0
        for i, feature_dict in enumerate(features_data):
            # Handle both old format (no id/priority/name) and new format
            feature = Feature(
                id=feature_dict.get("id", i + 1),
                priority=feature_dict.get("priority", i + 1),
                category=feature_dict.get("category", "uncategorized"),
                name=feature_dict.get("name", f"Feature {i + 1}"),
                description=feature_dict.get("description", ""),
                steps=feature_dict.get("steps", []),
                passes=feature_dict.get("passes", False),
            )
            session.add(feature)
            imported_count += 1

        session.commit()

        # Verify import
        final_count = session.query(Feature).count()
        print(f"Migrated {final_count} features from JSON to SQLite")

    except Exception as e:
        session.rollback()
        print(f"Error during migration: {e}")
        return False
    finally:
        session.close()

    # Rename JSON file to backup
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    backup_file = project_dir / f"feature_list.json.backup.{timestamp}"

    try:
        shutil.move(json_file, backup_file)
        print(f"Original JSON backed up to: {backup_file.name}")
    except IOError as e:
        print(f"Warning: Could not backup JSON file: {e}")
        # Continue anyway - the data is in the database

    return True


def export_to_json(
    project_dir: Path,
    session_maker: sessionmaker,
    output_file: Optional[Path] = None,
) -> Path:
    """
    Export features from database back to JSON format.

    Useful for debugging or if you need to revert to the old format.

    Args:
        project_dir: Directory containing the project
        session_maker: SQLAlchemy session maker
        output_file: Output file path (default: feature_list_export.json)

    Returns:
        Path to the exported file
    """
    if output_file is None:
        output_file = project_dir / "feature_list_export.json"

    session: Session = session_maker()
    try:
        features = (
            session.query(Feature)
            .order_by(Feature.priority.asc(), Feature.id.asc())
            .all()
        )

        features_data = [f.to_dict() for f in features]

        with open(output_file, "w", encoding="utf-8") as f:
            json.dump(features_data, f, indent=2)

        print(f"Exported {len(features_data)} features to {output_file}")
        return output_file

    finally:
        session.close()
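The migration ends by renaming the JSON file to a timestamped backup rather than deleting it, so the original data survives even if the import later proves wrong. The same rename pattern in isolation (stdlib only; the temp directory and demo payload are hypothetical):

```python
import json
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
json_file = workdir / "feature_list.json"
json_file.write_text(json.dumps([{"description": "demo", "steps": ["step 1"]}]))

# Rename to feature_list.json.backup.<timestamp> so the original data is never lost
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
backup_file = workdir / f"feature_list.json.backup.{timestamp}"
shutil.move(str(json_file), str(backup_file))

print(json_file.exists(), backup_file.exists())  # False True
```

The second-resolution timestamp keeps repeated migrations from clobbering earlier backups.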
119
autonomous_agent_demo.py
Normal file
@@ -0,0 +1,119 @@
#!/usr/bin/env python3
"""
Autonomous Coding Agent Demo
============================

A minimal harness demonstrating long-running autonomous coding with Claude.
This script implements the two-agent pattern (initializer + coding agent) and
incorporates all the strategies from the long-running agents guide.

Example Usage:
    python autonomous_agent_demo.py --project-dir ./claude_clone_demo
    python autonomous_agent_demo.py --project-dir ./claude_clone_demo --max-iterations 5
"""

import argparse
import asyncio
import os
from pathlib import Path

from dotenv import load_dotenv

# Load environment variables from .env file (if it exists)
# IMPORTANT: Must be called BEFORE importing other modules that read env vars at load time
load_dotenv()

from agent import run_autonomous_agent


# Configuration
# DEFAULT_MODEL = "claude-sonnet-4-5-20250929"
DEFAULT_MODEL = "claude-opus-4-5-20251101"


def parse_args() -> argparse.Namespace:
    """Parse command line arguments."""
    parser = argparse.ArgumentParser(
        description="Autonomous Coding Agent Demo - Long-running agent harness",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Start fresh project
  python autonomous_agent_demo.py --project-dir ./claude_clone

  # Use a specific model
  python autonomous_agent_demo.py --project-dir ./claude_clone --model claude-sonnet-4-5-20250929

  # Limit iterations for testing
  python autonomous_agent_demo.py --project-dir ./claude_clone --max-iterations 5

  # Continue existing project
  python autonomous_agent_demo.py --project-dir ./claude_clone

Authentication:
  Uses Claude CLI credentials from ~/.claude/.credentials.json
  Run 'claude login' to authenticate (handled by start.bat/start.sh)
""",
    )

    parser.add_argument(
        "--project-dir",
        type=Path,
        default=Path("./autonomous_demo_project"),
        help="Directory for the project (default: generations/autonomous_demo_project). Relative paths are automatically placed in the generations/ directory.",
    )

    parser.add_argument(
        "--max-iterations",
        type=int,
        default=None,
        help="Maximum number of agent iterations (default: unlimited)",
    )

    parser.add_argument(
        "--model",
        type=str,
        default=DEFAULT_MODEL,
        help=f"Claude model to use (default: {DEFAULT_MODEL})",
    )

    return parser.parse_args()


def main() -> None:
    """Main entry point."""
    args = parse_args()

    # Note: Authentication is handled by start.bat/start.sh before this script runs.
    # The Claude SDK auto-detects credentials from ~/.claude/.credentials.json

    # Automatically place projects in generations/ directory unless already specified
    project_dir = args.project_dir
    if not str(project_dir).startswith("generations/"):
        # Convert relative paths to be under generations/
        if project_dir.is_absolute():
            # If absolute path, use as-is
            pass
        else:
            # Prepend generations/ to relative paths
            project_dir = Path("generations") / project_dir

    try:
        # Run the agent (MCP server handles feature database)
        asyncio.run(
            run_autonomous_agent(
                project_dir=project_dir,
                model=args.model,
                max_iterations=args.max_iterations,
            )
        )
    except KeyboardInterrupt:
        print("\n\nInterrupted by user")
        print("To resume, run the same command again")
    except Exception as e:
        print(f"\nFatal error: {e}")
        raise


if __name__ == "__main__":
    main()
165
client.py
Normal file
@@ -0,0 +1,165 @@
"""
Claude SDK Client Configuration
===============================

Functions for creating and configuring the Claude Agent SDK client.
"""

import json
import os
import sys
from pathlib import Path

from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
from claude_agent_sdk.types import HookMatcher

from security import bash_security_hook


# Feature MCP tools for feature/test management
FEATURE_MCP_TOOLS = [
    "mcp__features__feature_get_stats",
    "mcp__features__feature_get_next",
    "mcp__features__feature_get_for_regression",
    "mcp__features__feature_mark_passing",
    "mcp__features__feature_skip",
    "mcp__features__feature_create_bulk",
]

# Playwright MCP tools for browser automation
PLAYWRIGHT_TOOLS = [
    # Core navigation & screenshots
    "mcp__playwright__browser_navigate",
    "mcp__playwright__browser_navigate_back",
    "mcp__playwright__browser_take_screenshot",
    "mcp__playwright__browser_snapshot",
    # Element interaction
    "mcp__playwright__browser_click",
    "mcp__playwright__browser_type",
    "mcp__playwright__browser_fill_form",
    "mcp__playwright__browser_select_option",
    "mcp__playwright__browser_hover",
    "mcp__playwright__browser_drag",
    "mcp__playwright__browser_press_key",
    # JavaScript & debugging
    "mcp__playwright__browser_evaluate",
    "mcp__playwright__browser_run_code",
    "mcp__playwright__browser_console_messages",
    "mcp__playwright__browser_network_requests",
    # Browser management
    "mcp__playwright__browser_close",
    "mcp__playwright__browser_resize",
    "mcp__playwright__browser_tabs",
    "mcp__playwright__browser_wait_for",
    "mcp__playwright__browser_handle_dialog",
    "mcp__playwright__browser_file_upload",
    "mcp__playwright__browser_install",
]

# Built-in tools
BUILTIN_TOOLS = [
    "Read",
    "Write",
    "Edit",
    "Glob",
    "Grep",
    "Bash",
]


def create_client(project_dir: Path, model: str):
    """
    Create a Claude Agent SDK client with multi-layered security.

    Args:
        project_dir: Directory for the project
        model: Claude model to use

    Returns:
        Configured ClaudeSDKClient (from claude_agent_sdk)

    Security layers (defense in depth):
    1. Sandbox - OS-level bash command isolation prevents filesystem escape
    2. Permissions - File operations restricted to project_dir only
    3. Security hooks - Bash commands validated against an allowlist
       (see security.py for ALLOWED_COMMANDS)

    Note: Authentication is handled by start.bat/start.sh before this runs.
    The Claude SDK auto-detects credentials from ~/.claude/.credentials.json
    """
    # Create comprehensive security settings
    # Note: Using relative paths ("./**") restricts access to the project directory
    # since cwd is set to project_dir
    security_settings = {
        "sandbox": {"enabled": True, "autoAllowBashIfSandboxed": True},
        "permissions": {
            "defaultMode": "acceptEdits",  # Auto-approve edits within allowed directories
            "allow": [
                # Allow all file operations within the project directory
                "Read(./**)",
                "Write(./**)",
                "Edit(./**)",
                "Glob(./**)",
                "Grep(./**)",
                # Bash permission granted here, but actual commands are validated
                # by the bash_security_hook (see security.py for allowed commands)
                "Bash(*)",
                # Allow Playwright MCP tools for browser automation
                *PLAYWRIGHT_TOOLS,
                # Allow Feature MCP tools for feature management
                *FEATURE_MCP_TOOLS,
            ],
        },
    }

    # Ensure project directory exists before creating settings file
    project_dir.mkdir(parents=True, exist_ok=True)

    # Write settings to a file in the project directory
    settings_file = project_dir / ".claude_settings.json"
    with open(settings_file, "w") as f:
        json.dump(security_settings, f, indent=2)

    print(f"Created security settings at {settings_file}")
    print(" - Sandbox enabled (OS-level bash isolation)")
    print(f" - Filesystem restricted to: {project_dir.resolve()}")
    print(" - Bash commands restricted to allowlist (see security.py)")
    print(" - MCP servers: playwright (browser), features (database)")
    print(" - Project settings enabled (skills, commands, CLAUDE.md)")
    print()

    return ClaudeSDKClient(
        options=ClaudeAgentOptions(
            model=model,
            system_prompt="You are an expert full-stack developer building a production-quality web application.",
            setting_sources=["project"],  # Enable skills, commands, and CLAUDE.md from project dir
            max_buffer_size=10 * 1024 * 1024,  # 10MB for large Playwright screenshots
            allowed_tools=[
                *BUILTIN_TOOLS,
                *PLAYWRIGHT_TOOLS,
                *FEATURE_MCP_TOOLS,
            ],
            mcp_servers={
                "playwright": {"command": "npx", "args": ["@playwright/mcp@latest", "--viewport-size", "1280x720"]},
                "features": {
                    "command": sys.executable,  # Use the same Python that's running this script
                    "args": ["-m", "mcp_server.feature_mcp"],
                    "env": {
                        "PROJECT_DIR": str(project_dir.resolve()),
                        "PYTHONPATH": str(Path(__file__).parent.resolve()),
                    },
                },
            },
            hooks={
                "PreToolUse": [
                    HookMatcher(matcher="Bash", hooks=[bash_security_hook]),
                ],
            },
            max_turns=1000,
            cwd=str(project_dir.resolve()),
            settings=str(settings_file.resolve()),  # Use absolute path
        )
    )
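`create_client` persists the permission settings as a JSON file and then hands the SDK the file path rather than the dict. The write/read round-trip in isolation (a trimmed-down settings dict and a temp directory stand in for the real project directory):

```python
import json
import tempfile
from pathlib import Path

# Trimmed version of the security_settings dict written by create_client
security_settings = {
    "sandbox": {"enabled": True, "autoAllowBashIfSandboxed": True},
    "permissions": {"defaultMode": "acceptEdits", "allow": ["Read(./**)", "Bash(*)"]},
}

settings_file = Path(tempfile.mkdtemp()) / ".claude_settings.json"
settings_file.write_text(json.dumps(security_settings, indent=2))

loaded = json.loads(settings_file.read_text())
print(loaded["permissions"]["defaultMode"])  # acceptEdits
```

Writing the file fresh on every client creation means each session starts from a known permission state, even if a previous run left a stale settings file behind.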
1
mcp_server/__init__.py
Normal file
@@ -0,0 +1 @@
"""MCP Server Package for Feature Management."""
332
mcp_server/feature_mcp.py
Normal file
@@ -0,0 +1,332 @@
#!/usr/bin/env python3
"""
MCP Server for Feature Management
==================================

Provides tools to manage features in the autonomous coding system,
replacing the previous FastAPI-based REST API.

Tools:
- feature_get_stats: Get progress statistics
- feature_get_next: Get next feature to implement
- feature_get_for_regression: Get random passing features for testing
- feature_mark_passing: Mark a feature as passing
- feature_skip: Skip a feature (move to end of queue)
- feature_create_bulk: Create multiple features at once
"""

import json
import os
import sys
from contextlib import asynccontextmanager
from pathlib import Path
from typing import Annotated

from mcp.server.fastmcp import FastMCP
from pydantic import BaseModel, Field
from sqlalchemy.sql.expression import func

# Add parent directory to path so we can import from the api module
sys.path.insert(0, str(Path(__file__).parent.parent))

from api.database import Feature, create_database
from api.migration import migrate_json_to_sqlite

# Configuration from environment
PROJECT_DIR = Path(os.environ.get("PROJECT_DIR", ".")).resolve()


# Pydantic models for input validation
class MarkPassingInput(BaseModel):
    """Input for marking a feature as passing."""
    feature_id: int = Field(..., description="The ID of the feature to mark as passing", ge=1)


class SkipFeatureInput(BaseModel):
    """Input for skipping a feature."""
    feature_id: int = Field(..., description="The ID of the feature to skip", ge=1)


class RegressionInput(BaseModel):
    """Input for getting regression features."""
    limit: int = Field(default=3, ge=1, le=10, description="Maximum number of passing features to return")


class FeatureCreateItem(BaseModel):
    """Schema for creating a single feature."""
    category: str = Field(..., min_length=1, max_length=100, description="Feature category")
    name: str = Field(..., min_length=1, max_length=255, description="Feature name")
    description: str = Field(..., min_length=1, description="Detailed description")
    steps: list[str] = Field(..., min_length=1, description="Implementation/test steps")


class BulkCreateInput(BaseModel):
    """Input for bulk creating features."""
    features: list[FeatureCreateItem] = Field(..., min_length=1, description="List of features to create")


# Global database session maker (initialized on startup)
_session_maker = None
_engine = None


@asynccontextmanager
async def server_lifespan(server: FastMCP):
    """Initialize database on startup, cleanup on shutdown."""
    global _session_maker, _engine

    # Create project directory if it doesn't exist
    PROJECT_DIR.mkdir(parents=True, exist_ok=True)

    # Initialize database
    _engine, _session_maker = create_database(PROJECT_DIR)

    # Run migration if needed (converts legacy JSON to SQLite)
    migrate_json_to_sqlite(PROJECT_DIR, _session_maker)

    yield

    # Cleanup
    if _engine:
        _engine.dispose()


# Initialize the MCP server
mcp = FastMCP("features", lifespan=server_lifespan)


def get_session():
    """Get a new database session."""
    if _session_maker is None:
        raise RuntimeError("Database not initialized")
    return _session_maker()


@mcp.tool()
def feature_get_stats() -> str:
    """Get statistics about feature completion progress.

    Returns the number of passing features, total features, and completion percentage.
    Use this to track overall progress of the implementation.

    Returns:
        JSON with: passing (int), total (int), percentage (float)
    """
    session = get_session()
    try:
        total = session.query(Feature).count()
        passing = session.query(Feature).filter(Feature.passes == True).count()
        percentage = round((passing / total) * 100, 1) if total > 0 else 0.0

        return json.dumps({
            "passing": passing,
            "total": total,
            "percentage": percentage
        }, indent=2)
    finally:
        session.close()


@mcp.tool()
def feature_get_next() -> str:
    """Get the highest-priority pending feature to work on.

    Returns the feature with the lowest priority number that has passes=false.
    Use this at the start of each coding session to determine what to implement next.

    Returns:
        JSON with feature details (id, priority, category, name, description, steps, passes)
        or an error message if all features are passing.
    """
    session = get_session()
    try:
        feature = (
            session.query(Feature)
            .filter(Feature.passes == False)
            .order_by(Feature.priority.asc(), Feature.id.asc())
            .first()
        )

        if feature is None:
            return json.dumps({"error": "All features are passing! No more work to do."})

        return json.dumps(feature.to_dict(), indent=2)
    finally:
        session.close()


@mcp.tool()
def feature_get_for_regression(
    limit: Annotated[int, Field(default=3, ge=1, le=10, description="Maximum number of passing features to return")] = 3
) -> str:
    """Get random passing features for regression testing.

    Returns a random selection of features that are currently passing.
    Use this to verify that previously implemented features still work
    after making changes.

    Args:
        limit: Maximum number of features to return (1-10, default 3)

    Returns:
        JSON with: features (list of feature objects), count (int)
    """
    session = get_session()
    try:
        features = (
            session.query(Feature)
            .filter(Feature.passes == True)
            .order_by(func.random())
            .limit(limit)
            .all()
        )

        return json.dumps({
            "features": [f.to_dict() for f in features],
            "count": len(features)
        }, indent=2)
    finally:
        session.close()


@mcp.tool()
def feature_mark_passing(
    feature_id: Annotated[int, Field(description="The ID of the feature to mark as passing", ge=1)]
) -> str:
    """Mark a feature as passing after successful implementation.

    Updates the feature's passes field to true. Use this after you have
    implemented the feature and verified it works correctly.

    Args:
        feature_id: The ID of the feature to mark as passing

    Returns:
        JSON with the updated feature details, or an error if not found.
    """
    session = get_session()
    try:
        feature = session.query(Feature).filter(Feature.id == feature_id).first()

        if feature is None:
            return json.dumps({"error": f"Feature with ID {feature_id} not found"})

        feature.passes = True
        session.commit()
        session.refresh(feature)

        return json.dumps(feature.to_dict(), indent=2)
    finally:
        session.close()


@mcp.tool()
def feature_skip(
    feature_id: Annotated[int, Field(description="The ID of the feature to skip", ge=1)]
) -> str:
    """Skip a feature by moving it to the end of the priority queue.

    Use this when a feature cannot be implemented yet due to:
    - Dependencies on other features that aren't implemented yet
    - External blockers (missing assets, unclear requirements)
    - Technical prerequisites that need to be addressed first

    The feature's priority is set to max_priority + 1, so it will be
    worked on after all other pending features.

    Args:
        feature_id: The ID of the feature to skip

    Returns:
        JSON with skip details: id, name, old_priority, new_priority, message
    """
    session = get_session()
    try:
        feature = session.query(Feature).filter(Feature.id == feature_id).first()

        if feature is None:
            return json.dumps({"error": f"Feature with ID {feature_id} not found"})

        if feature.passes:
            return json.dumps({"error": "Cannot skip a feature that is already passing"})

        old_priority = feature.priority

        # Get max priority and set this feature to max + 1
        max_priority_result = session.query(Feature.priority).order_by(Feature.priority.desc()).first()
        new_priority = (max_priority_result[0] + 1) if max_priority_result else 1

        feature.priority = new_priority
        session.commit()
        session.refresh(feature)

        return json.dumps({
            "id": feature.id,
            "name": feature.name,
            "old_priority": old_priority,
            "new_priority": new_priority,
            "message": f"Feature '{feature.name}' moved to end of queue"
        }, indent=2)
    finally:
        session.close()


@mcp.tool()
def feature_create_bulk(
    features: Annotated[list[dict], Field(description="List of features to create, each with category, name, description, and steps")]
) -> str:
    """Create multiple features in a single operation.

    Features are assigned sequential priorities based on their order.
    All features start with passes=false.

    This is typically used by the initializer agent to set up the initial
    feature list from the app specification.

    Args:
        features: List of features to create, each with:
            - category (str): Feature category
            - name (str): Feature name
            - description (str): Detailed description
            - steps (list[str]): Implementation/test steps

    Returns:
        JSON with: created (int) - number of features created
    """
    session = get_session()
    try:
        # Get the starting priority
        max_priority_result = session.query(Feature.priority).order_by(Feature.priority.desc()).first()
        start_priority = (max_priority_result[0] + 1) if max_priority_result else 1

        created_count = 0
        for i, feature_data in enumerate(features):
            # Validate required fields
            if not all(key in feature_data for key in ["category", "name", "description", "steps"]):
                return json.dumps({
                    "error": f"Feature at index {i} missing required fields (category, name, description, steps)"
                })

            db_feature = Feature(
                priority=start_priority + i,
                category=feature_data["category"],
                name=feature_data["name"],
                description=feature_data["description"],
                steps=feature_data["steps"],
                passes=False,
            )
            session.add(db_feature)
created_count += 1
|
||||
|
||||
session.commit()
|
||||
|
||||
return json.dumps({"created": created_count}, indent=2)
|
||||
except Exception as e:
|
||||
session.rollback()
|
||||
return json.dumps({"error": str(e)})
|
||||
finally:
|
||||
session.close()
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
mcp.run()
|
||||
215  progress.py  Normal file
@@ -0,0 +1,215 @@
"""
Progress Tracking Utilities
===========================

Functions for tracking and displaying progress of the autonomous coding agent.
Uses direct SQLite access for database queries.
"""

import json
import os
import sqlite3
import urllib.request
from datetime import datetime
from pathlib import Path


WEBHOOK_URL = os.environ.get("PROGRESS_N8N_WEBHOOK_URL")
PROGRESS_CACHE_FILE = ".progress_cache"


def has_features(project_dir: Path) -> bool:
    """
    Check if the project has features in the database.

    This is used to determine if the initializer agent needs to run.
    We check the database directly (not via API) since the API server
    may not be running yet when this check is performed.

    Returns True if:
    - features.db exists AND has at least 1 feature, OR
    - feature_list.json exists (legacy format)

    Returns False if no features exist (initializer needs to run).
    """
    # Check legacy JSON file first
    json_file = project_dir / "feature_list.json"
    if json_file.exists():
        return True

    # Check SQLite database
    db_file = project_dir / "features.db"
    if not db_file.exists():
        return False

    try:
        conn = sqlite3.connect(db_file)
        cursor = conn.cursor()
        cursor.execute("SELECT COUNT(*) FROM features")
        count = cursor.fetchone()[0]
        conn.close()
        return count > 0
    except Exception:
        # Database exists but can't be read or has no features table
        return False


def count_passing_tests(project_dir: Path) -> tuple[int, int]:
    """
    Count passing and total tests via direct database access.

    Args:
        project_dir: Directory containing the project

    Returns:
        (passing_count, total_count)
    """
    db_file = project_dir / "features.db"
    if not db_file.exists():
        return 0, 0

    try:
        conn = sqlite3.connect(db_file)
        cursor = conn.cursor()
        cursor.execute("SELECT COUNT(*) FROM features")
        total = cursor.fetchone()[0]
        cursor.execute("SELECT COUNT(*) FROM features WHERE passes = 1")
        passing = cursor.fetchone()[0]
        conn.close()
        return passing, total
    except Exception as e:
        print(f"[Database error in count_passing_tests: {e}]")
        return 0, 0


def get_all_passing_features(project_dir: Path) -> list[dict]:
    """
    Get all passing features for webhook notifications.

    Args:
        project_dir: Directory containing the project

    Returns:
        List of dicts with id, category, name for each passing feature
    """
    db_file = project_dir / "features.db"
    if not db_file.exists():
        return []

    try:
        conn = sqlite3.connect(db_file)
        cursor = conn.cursor()
        cursor.execute(
            "SELECT id, category, name FROM features WHERE passes = 1 ORDER BY priority ASC"
        )
        features = [
            {"id": row[0], "category": row[1], "name": row[2]}
            for row in cursor.fetchall()
        ]
        conn.close()
        return features
    except Exception:
        return []


def send_progress_webhook(passing: int, total: int, project_dir: Path) -> None:
    """Send webhook notification when progress increases."""
    if not WEBHOOK_URL:
        return  # Webhook not configured

    cache_file = project_dir / PROGRESS_CACHE_FILE
    previous = 0
    previous_passing_ids = set()

    # Read previous progress and passing feature IDs
    if cache_file.exists():
        try:
            cache_data = json.loads(cache_file.read_text())
            previous = cache_data.get("count", 0)
            previous_passing_ids = set(cache_data.get("passing_ids", []))
        except Exception:
            previous = 0

    # Only notify if progress increased
    if passing > previous:
        # Find which features are now passing
        completed_tests = []
        current_passing_ids = []

        # Detect transition from old cache format (had count but no passing_ids)
        # In this case, we can't reliably identify which specific tests are new
        is_old_cache_format = len(previous_passing_ids) == 0 and previous > 0

        # Get all passing features via direct database access
        all_passing = get_all_passing_features(project_dir)
        for feature in all_passing:
            feature_id = feature.get("id")
            current_passing_ids.append(feature_id)
            # Only identify individual new tests if we have previous IDs to compare
            if not is_old_cache_format and feature_id not in previous_passing_ids:
                # This feature is newly passing
                name = feature.get("name", f"Feature #{feature_id}")
                category = feature.get("category", "")
                if category:
                    completed_tests.append(f"{category} {name}")
                else:
                    completed_tests.append(name)

        payload = {
            "event": "test_progress",
            "passing": passing,
            "total": total,
            "percentage": round((passing / total) * 100, 1) if total > 0 else 0,
            "previous_passing": previous,
            "tests_completed_this_session": passing - previous,
            "completed_tests": completed_tests,
            "project": project_dir.name,
            "timestamp": datetime.utcnow().isoformat() + "Z",
        }

        try:
            req = urllib.request.Request(
                WEBHOOK_URL,
                data=json.dumps([payload]).encode("utf-8"),  # n8n expects an array
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req, timeout=5)
        except Exception as e:
            print(f"[Webhook notification failed: {e}]")

        # Update cache with count and passing IDs
        cache_file.write_text(
            json.dumps({"count": passing, "passing_ids": current_passing_ids})
        )
    else:
        # Update cache even if no change (for initial state)
        if not cache_file.exists():
            all_passing = get_all_passing_features(project_dir)
            current_passing_ids = [f.get("id") for f in all_passing]
            cache_file.write_text(
                json.dumps({"count": passing, "passing_ids": current_passing_ids})
            )


def print_session_header(session_num: int, is_initializer: bool) -> None:
    """Print a formatted header for the session."""
    session_type = "INITIALIZER" if is_initializer else "CODING AGENT"

    print("\n" + "=" * 70)
    print(f" SESSION {session_num}: {session_type}")
    print("=" * 70)
    print()


def print_progress_summary(project_dir: Path) -> None:
    """Print a summary of current progress."""
    passing, total = count_passing_tests(project_dir)

    if total > 0:
        percentage = (passing / total) * 100
        print(f"\nProgress: {passing}/{total} tests passing ({percentage:.1f}%)")
        send_progress_webhook(passing, total, project_dir)
    else:
        print("\nProgress: No features in database yet")
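The cache-diff step inside `send_progress_webhook` can be exercised on its own. The sketch below is an illustrative reduction, not the function above: the helper name `newly_passing` is an assumption, but the cache layout (`{"count": ..., "passing_ids": [...]}`) matches what the file writes.

```python
import json
import tempfile
from pathlib import Path

def newly_passing(cache_file: Path, current_ids: list[int]) -> list[int]:
    # Load previously-passing IDs from the cache (empty on first run or if
    # the cache is unreadable), diff against the current IDs, then rewrite
    # the cache in the {"count": ..., "passing_ids": [...]} shape.
    previous: set[int] = set()
    if cache_file.exists():
        try:
            previous = set(json.loads(cache_file.read_text()).get("passing_ids", []))
        except Exception:
            pass
    new_ids = [fid for fid in current_ids if fid not in previous]
    cache_file.write_text(json.dumps({"count": len(current_ids),
                                      "passing_ids": current_ids}))
    return new_ids

with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / ".progress_cache"
    first = newly_passing(cache, [1, 2])      # no cache yet: everything is new
    second = newly_passing(cache, [1, 2, 3])  # only feature 3 is newly passing
```

After the first call `first` is `[1, 2]`; after the second, `second` is `[3]`, which is what keeps the webhook from re-announcing features that already passed.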
223  prompts.py  Normal file
@@ -0,0 +1,223 @@
"""
Prompt Loading Utilities
========================

Functions for loading prompt templates with project-specific support.

Fallback chain:
1. Project-specific: generations/{project}/prompts/{name}.md
2. Base template: .claude/templates/{name}.template.md
"""

import shutil
from pathlib import Path


# Base templates location (generic templates)
TEMPLATES_DIR = Path(__file__).parent / ".claude" / "templates"


def get_project_prompts_dir(project_dir: Path) -> Path:
    """Get the prompts directory for a specific project."""
    return project_dir / "prompts"


def load_prompt(name: str, project_dir: Path | None = None) -> str:
    """
    Load a prompt template with fallback chain.

    Fallback order:
    1. Project-specific: {project_dir}/prompts/{name}.md
    2. Base template: .claude/templates/{name}.template.md

    Args:
        name: The prompt name (without extension), e.g., "initializer_prompt"
        project_dir: Optional project directory for project-specific prompts

    Returns:
        The prompt content as a string

    Raises:
        FileNotFoundError: If prompt not found in any location
    """
    # 1. Try project-specific first
    if project_dir:
        project_prompts = get_project_prompts_dir(project_dir)
        project_path = project_prompts / f"{name}.md"
        if project_path.exists():
            try:
                return project_path.read_text(encoding="utf-8")
            except (OSError, PermissionError) as e:
                print(f"Warning: Could not read {project_path}: {e}")

    # 2. Try base template
    template_path = TEMPLATES_DIR / f"{name}.template.md"
    if template_path.exists():
        try:
            return template_path.read_text(encoding="utf-8")
        except (OSError, PermissionError) as e:
            print(f"Warning: Could not read {template_path}: {e}")

    raise FileNotFoundError(
        f"Prompt '{name}' not found in:\n"
        f"  - Project: {project_dir / 'prompts' if project_dir else 'N/A'}\n"
        f"  - Templates: {TEMPLATES_DIR}"
    )


def get_initializer_prompt(project_dir: Path | None = None) -> str:
    """Load the initializer prompt (project-specific if available)."""
    return load_prompt("initializer_prompt", project_dir)


def get_coding_prompt(project_dir: Path | None = None) -> str:
    """Load the coding agent prompt (project-specific if available)."""
    return load_prompt("coding_prompt", project_dir)


def get_app_spec(project_dir: Path) -> str:
    """
    Load the app spec from the project.

    Checks in order:
    1. Project prompts directory: {project_dir}/prompts/app_spec.txt
    2. Project root (legacy): {project_dir}/app_spec.txt

    Args:
        project_dir: The project directory

    Returns:
        The app spec content

    Raises:
        FileNotFoundError: If no app_spec.txt found
    """
    # Try project prompts directory first
    project_prompts = get_project_prompts_dir(project_dir)
    spec_path = project_prompts / "app_spec.txt"
    if spec_path.exists():
        try:
            return spec_path.read_text(encoding="utf-8")
        except (OSError, PermissionError) as e:
            raise FileNotFoundError(f"Could not read {spec_path}: {e}") from e

    # Fallback to legacy location in project root
    legacy_spec = project_dir / "app_spec.txt"
    if legacy_spec.exists():
        try:
            return legacy_spec.read_text(encoding="utf-8")
        except (OSError, PermissionError) as e:
            raise FileNotFoundError(f"Could not read {legacy_spec}: {e}") from e

    raise FileNotFoundError(f"No app_spec.txt found for project: {project_dir}")


def scaffold_project_prompts(project_dir: Path) -> Path:
    """
    Create the project prompts directory and copy base templates.

    This sets up a new project with template files that can be customized.

    Args:
        project_dir: The project directory (e.g., generations/my-app)

    Returns:
        The path to the project prompts directory
    """
    project_prompts = get_project_prompts_dir(project_dir)
    project_prompts.mkdir(parents=True, exist_ok=True)

    # Define template mappings: (source_template, destination_name)
    templates = [
        ("app_spec.template.txt", "app_spec.txt"),
        ("coding_prompt.template.md", "coding_prompt.md"),
        ("initializer_prompt.template.md", "initializer_prompt.md"),
    ]

    copied_files = []
    for template_name, dest_name in templates:
        template_path = TEMPLATES_DIR / template_name
        dest_path = project_prompts / dest_name

        # Only copy if template exists and destination doesn't
        if template_path.exists() and not dest_path.exists():
            try:
                shutil.copy(template_path, dest_path)
                copied_files.append(dest_name)
            except (OSError, PermissionError) as e:
                print(f" Warning: Could not copy {dest_name}: {e}")

    if copied_files:
        print(f" Created prompt files: {', '.join(copied_files)}")

    return project_prompts


def has_project_prompts(project_dir: Path) -> bool:
    """
    Check if a project has valid prompts set up.

    A project has valid prompts if:
    1. The prompts directory exists, AND
    2. app_spec.txt exists within it, AND
    3. app_spec.txt contains the <project_specification> tag

    Args:
        project_dir: The project directory to check

    Returns:
        True if valid project prompts exist, False otherwise
    """
    project_prompts = get_project_prompts_dir(project_dir)
    app_spec = project_prompts / "app_spec.txt"

    if not app_spec.exists():
        # Also check legacy location in project root
        legacy_spec = project_dir / "app_spec.txt"
        if legacy_spec.exists():
            try:
                content = legacy_spec.read_text(encoding="utf-8")
                return "<project_specification>" in content
            except (OSError, PermissionError):
                return False
        return False

    # Check for valid spec content
    try:
        content = app_spec.read_text(encoding="utf-8")
        return "<project_specification>" in content
    except (OSError, PermissionError):
        return False


def copy_spec_to_project(project_dir: Path) -> None:
    """
    Copy the app spec file into the project root directory for the agent to read.

    This maintains backwards compatibility - the agent expects app_spec.txt
    in the project root directory.

    The spec is sourced from: {project_dir}/prompts/app_spec.txt

    Args:
        project_dir: The project directory
    """
    spec_dest = project_dir / "app_spec.txt"

    # Don't overwrite if already exists
    if spec_dest.exists():
        return

    # Copy from project prompts directory
    project_prompts = get_project_prompts_dir(project_dir)
    project_spec = project_prompts / "app_spec.txt"
    if project_spec.exists():
        try:
            shutil.copy(project_spec, spec_dest)
            print("Copied app_spec.txt to project directory")
            return
        except (OSError, PermissionError) as e:
            print(f"Warning: Could not copy app_spec.txt: {e}")
            return

    print("Warning: No app_spec.txt found to copy to project directory")
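The fallback chain in `load_prompt` can be exercised with throwaway directories. The sketch below inlines a simplified loader (`load_prompt_sketch` is an illustrative reduction of the function above, without the warning prints):

```python
import tempfile
from pathlib import Path

def load_prompt_sketch(name: str, project_dir: Path, templates_dir: Path) -> str:
    # Project-specific prompt wins; otherwise fall back to the base template.
    for path in (project_dir / "prompts" / f"{name}.md",
                 templates_dir / f"{name}.template.md"):
        if path.exists():
            return path.read_text(encoding="utf-8")
    raise FileNotFoundError(name)

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    templates = root / "templates"
    templates.mkdir()
    (templates / "coding_prompt.template.md").write_text("base template")

    project = root / "generations" / "my-app"
    (project / "prompts").mkdir(parents=True)

    # No project-specific file yet: the base template is returned
    from_template = load_prompt_sketch("coding_prompt", project, templates)

    # A project-specific prompt overrides the template
    (project / "prompts" / "coding_prompt.md").write_text("customized")
    from_project = load_prompt_sketch("coding_prompt", project, templates)
```

Here `from_template` is `"base template"` and `from_project` is `"customized"`: writing a file under `{project}/prompts/` is all it takes to override a base template.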
3  requirements.txt  Normal file
@@ -0,0 +1,3 @@
claude-agent-sdk>=0.1.0
python-dotenv>=1.0.0
sqlalchemy>=2.0.0
376  security.py  Normal file
@@ -0,0 +1,376 @@
"""
Security Hooks for Autonomous Coding Agent
==========================================

Pre-tool-use hooks that validate bash commands for security.
Uses an allowlist approach - only explicitly permitted commands can run.
"""

import os
import shlex


# Allowed commands for development tasks
# Minimal set needed for the autonomous coding demo
ALLOWED_COMMANDS = {
    # File inspection
    "ls",
    "cat",
    "head",
    "tail",
    "wc",
    "grep",
    "eof",
    # File operations (agent uses SDK tools for most file ops, but cp/mkdir needed occasionally)
    "cp",
    "mkdir",
    "chmod",  # For making scripts executable; validated separately
    # Directory
    "pwd",
    # Output
    "echo",
    # Node.js development
    "npm",
    "npx",
    "pnpm",  # Project uses pnpm
    "node",
    # Version control
    "git",
    # Docker (for PostgreSQL)
    "docker",
    # Process management
    "ps",
    "lsof",
    "sleep",
    "kill",  # Kill by PID
    "pkill",  # For killing dev servers; validated separately
    # Network/API testing
    "curl",
    # File operations
    "mv",
    "rm",  # Use with caution
    "touch",
    # Shell scripts
    "sh",
    "bash",
    # Script execution
    "init.sh",  # Init scripts; validated separately
}

# Commands that need additional validation even when in the allowlist
COMMANDS_NEEDING_EXTRA_VALIDATION = {"pkill", "chmod", "init.sh"}


def split_command_segments(command_string: str) -> list[str]:
    """
    Split a compound command into individual command segments.

    Handles command chaining (&&, ||, ;) but not pipes (those are single commands).

    Args:
        command_string: The full shell command

    Returns:
        List of individual command segments
    """
    import re

    # Split on && and || while preserving the ability to handle each segment
    # This regex splits on && or || that aren't inside quotes
    segments = re.split(r"\s*(?:&&|\|\|)\s*", command_string)

    # Further split on semicolons
    result = []
    for segment in segments:
        sub_segments = re.split(r'(?<!["\'])\s*;\s*(?!["\'])', segment)
        for sub in sub_segments:
            sub = sub.strip()
            if sub:
                result.append(sub)

    return result


def extract_commands(command_string: str) -> list[str]:
    """
    Extract command names from a shell command string.

    Handles pipes, command chaining (&&, ||, ;), and subshells.
    Returns the base command names (without paths).

    Args:
        command_string: The full shell command

    Returns:
        List of command names found in the string
    """
    commands = []

    # shlex doesn't treat ; as a separator, so we need to pre-process
    import re

    # Split on semicolons that aren't inside quotes (simple heuristic)
    # This handles common cases like "echo hello; ls"
    segments = re.split(r'(?<!["\'])\s*;\s*(?!["\'])', command_string)

    for segment in segments:
        segment = segment.strip()
        if not segment:
            continue

        try:
            tokens = shlex.split(segment)
        except ValueError:
            # Malformed command (unclosed quotes, etc.)
            # Return empty to trigger block (fail-safe)
            return []

        if not tokens:
            continue

        # Track when we expect a command vs arguments
        expect_command = True

        for token in tokens:
            # Shell operators indicate a new command follows
            if token in ("|", "||", "&&", "&"):
                expect_command = True
                continue

            # Skip shell keywords that precede commands
            if token in (
                "if",
                "then",
                "else",
                "elif",
                "fi",
                "for",
                "while",
                "until",
                "do",
                "done",
                "case",
                "esac",
                "in",
                "!",
                "{",
                "}",
            ):
                continue

            # Skip flags/options
            if token.startswith("-"):
                continue

            # Skip variable assignments (VAR=value)
            if "=" in token and not token.startswith("="):
                continue

            if expect_command:
                # Extract the base command name (handle paths like /usr/bin/python)
                cmd = os.path.basename(token)
                commands.append(cmd)
                expect_command = False

    return commands


def validate_pkill_command(command_string: str) -> tuple[bool, str]:
    """
    Validate pkill commands - only allow killing dev-related processes.

    Uses shlex to parse the command, avoiding regex bypass vulnerabilities.

    Returns:
        Tuple of (is_allowed, reason_if_blocked)
    """
    # Allowed process names for pkill
    allowed_process_names = {
        "node",
        "npm",
        "npx",
        "vite",
        "next",
    }

    try:
        tokens = shlex.split(command_string)
    except ValueError:
        return False, "Could not parse pkill command"

    if not tokens:
        return False, "Empty pkill command"

    # Separate flags from arguments
    args = []
    for token in tokens[1:]:
        if not token.startswith("-"):
            args.append(token)

    if not args:
        return False, "pkill requires a process name"

    # The target is typically the last non-flag argument
    target = args[-1]

    # For -f flag (full command line match), extract the first word as process name
    # e.g., "pkill -f 'node server.js'" -> target is "node server.js", process is "node"
    if " " in target:
        target = target.split()[0]

    if target in allowed_process_names:
        return True, ""
    return False, f"pkill only allowed for dev processes: {allowed_process_names}"


def validate_chmod_command(command_string: str) -> tuple[bool, str]:
    """
    Validate chmod commands - only allow making files executable with +x.

    Returns:
        Tuple of (is_allowed, reason_if_blocked)
    """
    try:
        tokens = shlex.split(command_string)
    except ValueError:
        return False, "Could not parse chmod command"

    if not tokens or tokens[0] != "chmod":
        return False, "Not a chmod command"

    # Look for the mode argument
    # Valid modes: +x, u+x, a+x, etc. (anything ending with +x for execute permission)
    mode = None
    files = []

    for token in tokens[1:]:
        if token.startswith("-"):
            # Reject flags like -R (we don't allow recursive chmod)
            return False, "chmod flags are not allowed"
        elif mode is None:
            mode = token
        else:
            files.append(token)

    if mode is None:
        return False, "chmod requires a mode"

    if not files:
        return False, "chmod requires at least one file"

    # Only allow +x variants (making files executable)
    # This matches: +x, u+x, g+x, o+x, a+x, ug+x, etc.
    import re

    if not re.match(r"^[ugoa]*\+x$", mode):
        return False, f"chmod only allowed with +x mode, got: {mode}"

    return True, ""


def validate_init_script(command_string: str) -> tuple[bool, str]:
    """
    Validate init.sh script execution - only allow ./init.sh.

    Returns:
        Tuple of (is_allowed, reason_if_blocked)
    """
    try:
        tokens = shlex.split(command_string)
    except ValueError:
        return False, "Could not parse init script command"

    if not tokens:
        return False, "Empty command"

    # The command should be exactly ./init.sh (possibly with arguments)
    script = tokens[0]

    # Allow ./init.sh or paths ending in /init.sh
    if script == "./init.sh" or script.endswith("/init.sh"):
        return True, ""

    return False, f"Only ./init.sh is allowed, got: {script}"


def get_command_for_validation(cmd: str, segments: list[str]) -> str:
    """
    Find the specific command segment that contains the given command.

    Args:
        cmd: The command name to find
        segments: List of command segments

    Returns:
        The segment containing the command, or empty string if not found
    """
    for segment in segments:
        segment_commands = extract_commands(segment)
        if cmd in segment_commands:
            return segment
    return ""


async def bash_security_hook(input_data, tool_use_id=None, context=None):
    """
    Pre-tool-use hook that validates bash commands using an allowlist.

    Only commands in ALLOWED_COMMANDS are permitted.

    Args:
        input_data: Dict containing tool_name and tool_input
        tool_use_id: Optional tool use ID
        context: Optional context

    Returns:
        Empty dict to allow, or {"decision": "block", "reason": "..."} to block
    """
    if input_data.get("tool_name") != "Bash":
        return {}

    command = input_data.get("tool_input", {}).get("command", "")
    if not command:
        return {}

    # Extract all commands from the command string
    commands = extract_commands(command)

    if not commands:
        # Could not parse - fail safe by blocking
        return {
            "decision": "block",
            "reason": f"Could not parse command for security validation: {command}",
        }

    # Split into segments for per-command validation
    segments = split_command_segments(command)

    # Check each command against the allowlist
    for cmd in commands:
        if cmd not in ALLOWED_COMMANDS:
            return {
                "decision": "block",
                "reason": f"Command '{cmd}' is not in the allowed commands list",
            }

        # Additional validation for sensitive commands
        if cmd in COMMANDS_NEEDING_EXTRA_VALIDATION:
            # Find the specific segment containing this command
            cmd_segment = get_command_for_validation(cmd, segments)
            if not cmd_segment:
                cmd_segment = command  # Fallback to full command

            if cmd == "pkill":
                allowed, reason = validate_pkill_command(cmd_segment)
                if not allowed:
                    return {"decision": "block", "reason": reason}
            elif cmd == "chmod":
                allowed, reason = validate_chmod_command(cmd_segment)
                if not allowed:
                    return {"decision": "block", "reason": reason}
            elif cmd == "init.sh":
                allowed, reason = validate_init_script(cmd_segment)
                if not allowed:
                    return {"decision": "block", "reason": reason}

    return {}
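The allowlist idea behind `extract_commands` and the hook can be sketched in miniature. This is an illustrative reduction, not the module above: `commands_in` handles only operators and flags (the real function also handles `;`, shell keywords, and `VAR=value` assignments), and the four-entry `ALLOWED` set is an assumption for the sketch.

```python
import os
import shlex

ALLOWED = {"npm", "node", "git", "ls"}  # miniature allowlist for the sketch

def commands_in(command_string: str) -> list[str]:
    # Walk shlex tokens; the token after a shell operator starts a new
    # command, and flag tokens (starting with "-") are skipped.
    commands, expect_command = [], True
    for token in shlex.split(command_string):
        if token in ("|", "||", "&&", "&"):
            expect_command = True
        elif expect_command and not token.startswith("-"):
            commands.append(os.path.basename(token))
            expect_command = False
    return commands

def is_allowed(command_string: str) -> bool:
    # Block empty/unparsable input and any command outside the allowlist.
    cmds = commands_in(command_string)
    return bool(cmds) and all(c in ALLOWED for c in cmds)

ok = is_allowed("npm install && node server.js")  # both commands allowlisted
blocked = is_allowed("curl example.com | sh")     # curl is not in ALLOWED
```

Because every command in a chain must be allowlisted, a single disallowed segment (`curl` above) blocks the whole compound command, mirroring the per-command loop in `bash_security_hook`.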
84  start.bat  Normal file
@@ -0,0 +1,84 @@
@echo off
cd /d "%~dp0"

echo.
echo ========================================
echo   Autonomous Coding Agent
echo ========================================
echo.

REM Check if Claude CLI is installed
where claude >nul 2>nul
if %errorlevel% neq 0 (
    echo [ERROR] Claude CLI not found
    echo.
    echo Please install Claude CLI first:
    echo   https://claude.ai/download
    echo.
    echo Then run this script again.
    echo.
    pause
    exit /b 1
)

echo [OK] Claude CLI found

REM Check if user has credentials (check for ~/.claude/.credentials.json)
set "CLAUDE_CREDS=%USERPROFILE%\.claude\.credentials.json"
if exist "%CLAUDE_CREDS%" (
    echo [OK] Claude credentials found
    goto :setup_venv
)

REM No credentials - prompt user to login
echo [!] Not authenticated with Claude
echo.
echo You need to run 'claude login' to authenticate.
echo This will open a browser window to sign in.
echo.
set /p "LOGIN_CHOICE=Would you like to run 'claude login' now? (y/n): "

if /i "%LOGIN_CHOICE%"=="y" (
    echo.
    echo Running 'claude login'...
    echo Complete the login in your browser, then return here.
    echo.
    call claude login

    REM Check if login succeeded
    if exist "%CLAUDE_CREDS%" (
        echo.
        echo [OK] Login successful!
        goto :setup_venv
    ) else (
        echo.
        echo [ERROR] Login failed or was cancelled.
        echo Please try again.
        pause
        exit /b 1
    )
) else (
    echo.
    echo Please run 'claude login' manually, then try again.
    pause
    exit /b 1
)

:setup_venv
echo.

REM Check if venv exists, create if not
if not exist "venv\Scripts\activate.bat" (
    echo Creating virtual environment...
    python -m venv venv
)

REM Activate the virtual environment
call venv\Scripts\activate.bat

REM Install dependencies
echo Installing dependencies...
pip install -r requirements.txt --quiet

REM Run the app
python start.py
366  start.py  Normal file
@@ -0,0 +1,366 @@
|
||||
#!/usr/bin/env python3
"""
Simple CLI launcher for the Autonomous Coding Agent.
Provides an interactive menu to create new projects or continue existing ones.

Supports two paths for new projects:
1. Claude path: Use /create-spec to generate spec interactively
2. Manual path: Edit template files directly, then continue
"""

import os
import sys
import subprocess
from pathlib import Path

from prompts import (
    scaffold_project_prompts,
    has_project_prompts,
    get_project_prompts_dir,
)


# Directory containing generated projects
GENERATIONS_DIR = Path(__file__).parent / "generations"


def check_spec_exists(project_dir: Path) -> bool:
    """
    Check if valid spec files exist for a project.

    Checks in order:
    1. Project prompts directory: {project_dir}/prompts/app_spec.txt
    2. Project root (legacy): {project_dir}/app_spec.txt
    """
    # Check project prompts directory first
    project_prompts = get_project_prompts_dir(project_dir)
    spec_file = project_prompts / "app_spec.txt"
    if spec_file.exists():
        try:
            content = spec_file.read_text(encoding="utf-8")
            return "<project_specification>" in content
        except (OSError, PermissionError):
            return False

    # Check legacy location in project root
    legacy_spec = project_dir / "app_spec.txt"
    if legacy_spec.exists():
        try:
            content = legacy_spec.read_text(encoding="utf-8")
            return "<project_specification>" in content
        except (OSError, PermissionError):
            return False

    return False


def get_existing_projects() -> list[str]:
    """Get list of existing projects from generations folder."""
    if not GENERATIONS_DIR.exists():
        return []

    projects = []
    for item in GENERATIONS_DIR.iterdir():
        if item.is_dir() and not item.name.startswith('.'):
            projects.append(item.name)

    return sorted(projects)


def display_menu(projects: list[str]) -> None:
    """Display the main menu."""
    print("\n" + "=" * 50)
    print("  Autonomous Coding Agent Launcher")
    print("=" * 50)
    print("\n[1] Create new project")

    if projects:
        print("[2] Continue existing project")

    print("[q] Quit")
    print()


def display_projects(projects: list[str]) -> None:
    """Display list of existing projects."""
    print("\n" + "-" * 40)
    print("  Existing Projects")
    print("-" * 40)

    for i, project in enumerate(projects, 1):
        print(f"  [{i}] {project}")

    print("\n  [b] Back to main menu")
    print()


def get_project_choice(projects: list[str]) -> str | None:
    """Get user's project selection."""
    while True:
        choice = input("Select project number: ").strip().lower()

        if choice == 'b':
            return None

        try:
            idx = int(choice) - 1
            if 0 <= idx < len(projects):
                return projects[idx]
            print(f"Please enter a number between 1 and {len(projects)}")
        except ValueError:
            print("Invalid input. Enter a number or 'b' to go back.")


def get_new_project_name() -> str | None:
    """Get name for new project."""
    print("\n" + "-" * 40)
    print("  Create New Project")
    print("-" * 40)
    print("\nEnter project name (e.g., my-awesome-app)")
    print("Leave empty to cancel.\n")

    name = input("Project name: ").strip()

    if not name:
        return None

    # Basic validation - OS-aware invalid characters
    # Windows has more restrictions than Unix
    if sys.platform == "win32":
        invalid_chars = '<>:"/\\|?*'
    else:
        # Unix only restricts / and null
        invalid_chars = '/'

    for char in invalid_chars:
        if char in name:
            print(f"Invalid character '{char}' in project name")
            return None

    return name


def ensure_project_scaffolded(project_name: str) -> Path:
    """
    Ensure project directory exists with prompt templates.

    Creates the project directory and copies template files if needed.

    Returns:
        The project directory path
    """
    project_dir = GENERATIONS_DIR / project_name

    # Create project directory if it doesn't exist
    project_dir.mkdir(parents=True, exist_ok=True)

    # Scaffold prompts (copies templates if they don't exist)
    print(f"\nSetting up project: {project_name}")
    scaffold_project_prompts(project_dir)

    return project_dir


def run_spec_creation(project_dir: Path) -> bool:
    """
    Run Claude Code with /create-spec command to create project specification.

    The project path is passed as an argument so create-spec knows where to write files.
    """
    print("\n" + "=" * 50)
    print("  Project Specification Setup")
    print("=" * 50)
    print(f"\nProject directory: {project_dir}")
    print(f"Prompts will be saved to: {get_project_prompts_dir(project_dir)}")
    print("\nLaunching Claude Code for interactive spec creation...")
    print("Answer the questions to define your project.")
    print("When done, Claude will generate the spec files.")
    print("Exit Claude Code (Ctrl+C or /exit) when finished.\n")

    try:
        # Launch Claude Code with /create-spec command
        # Project path included in command string so it populates $ARGUMENTS
        subprocess.run(
            ["claude", f"/create-spec {project_dir}"],
            check=False,  # Don't raise on non-zero exit
            cwd=str(Path(__file__).parent)  # Run from project root
        )

        # Check if spec was created in project prompts directory
        if check_spec_exists(project_dir):
            print("\n" + "-" * 50)
            print("Spec files created successfully!")
            return True
        else:
            print("\n" + "-" * 50)
            print("Spec creation incomplete.")
            print(f"Please ensure app_spec.txt exists in: {get_project_prompts_dir(project_dir)}")
            return False

    except FileNotFoundError:
        print("\nError: 'claude' command not found.")
        print("Make sure Claude Code CLI is installed:")
        print("  npm install -g @anthropic-ai/claude-code")
        return False
    except KeyboardInterrupt:
        print("\n\nSpec creation cancelled.")
        return False


def run_manual_spec_flow(project_dir: Path) -> bool:
    """
    Guide user through manual spec editing flow.

    Shows the paths to edit and waits for user to press Enter when ready.
    """
    prompts_dir = get_project_prompts_dir(project_dir)

    print("\n" + "-" * 50)
    print("  Manual Specification Setup")
    print("-" * 50)
    print("\nTemplate files have been created. Edit these files in your editor:")
    print(f"\n  Required:")
    print(f"    {prompts_dir / 'app_spec.txt'}")
    print(f"\n  Optional (customize agent behavior):")
    print(f"    {prompts_dir / 'initializer_prompt.md'}")
    print(f"    {prompts_dir / 'coding_prompt.md'}")
    print("\n" + "-" * 50)
    print("\nThe app_spec.txt file contains a template with placeholders.")
    print("Replace the placeholders with your actual project specification.")
    print("\nWhen you're done editing, press Enter to continue...")

    try:
        input()
    except KeyboardInterrupt:
        print("\n\nCancelled.")
        return False

    # Validate that spec was edited
    if check_spec_exists(project_dir):
        print("\nSpec file validated successfully!")
        return True
    else:
        print("\nWarning: The app_spec.txt file still contains the template placeholder.")
        print("The agent may not work correctly without a proper specification.")
        confirm = input("Continue anyway? [y/N]: ").strip().lower()
        return confirm == 'y'


def ask_spec_creation_choice() -> str | None:
    """Ask user whether to create spec with Claude or manually."""
    print("\n" + "-" * 40)
    print("  Specification Setup")
    print("-" * 40)
    print("\nHow would you like to define your project?")
    print("\n[1] Create spec with Claude (recommended)")
    print("    Interactive conversation to define your project")
    print("\n[2] Edit templates manually")
    print("    Edit the template files directly in your editor")
    print("\n[b] Back to main menu")
    print()

    while True:
        choice = input("Select [1/2/b]: ").strip().lower()
        if choice in ['1', '2', 'b']:
            return choice
        print("Invalid choice. Please enter 1, 2, or b.")


def create_new_project_flow() -> str | None:
    """
    Complete flow for creating a new project.

    1. Get project name
    2. Create project directory and scaffold prompts
    3. Ask: Claude or Manual?
    4. If Claude: Run /create-spec with project path
    5. If Manual: Show paths, wait for Enter
    6. Return project name if successful
    """
    project_name = get_new_project_name()
    if not project_name:
        return None

    # Create project directory and scaffold prompts FIRST
    project_dir = ensure_project_scaffolded(project_name)

    # Ask user how they want to handle spec creation
    choice = ask_spec_creation_choice()

    if choice == 'b':
        return None
    elif choice == '1':
        # Create spec with Claude
        success = run_spec_creation(project_dir)
        if not success:
            print("\nYou can try again later or edit the templates manually.")
            retry = input("Start agent anyway? [y/N]: ").strip().lower()
            if retry != 'y':
                return None
    elif choice == '2':
        # Manual mode - guide user through editing
        success = run_manual_spec_flow(project_dir)
        if not success:
            return None

    return project_name


def run_agent(project_name: str) -> None:
    """Run the autonomous agent with the given project."""
    project_dir = GENERATIONS_DIR / project_name

    # Final validation before running
    if not has_project_prompts(project_dir):
        print(f"\nWarning: No valid spec found for project '{project_name}'")
        print("The agent may not work correctly.")
        confirm = input("Continue anyway? [y/N]: ").strip().lower()
        if confirm != 'y':
            return

    print(f"\nStarting agent for project: {project_name}")
    print("-" * 50)

    # Build the command
    cmd = [sys.executable, "autonomous_agent_demo.py", "--project-dir", project_name]

    # Run the agent
    try:
        subprocess.run(cmd, check=False)
    except KeyboardInterrupt:
        print("\n\nAgent interrupted. Run again to resume.")


def main() -> None:
    """Main entry point."""
    # Ensure we're in the right directory
    script_dir = Path(__file__).parent.absolute()
    os.chdir(script_dir)

    while True:
        projects = get_existing_projects()
        display_menu(projects)

        choice = input("Select option: ").strip().lower()

        if choice == 'q':
            print("\nGoodbye!")
            break

        elif choice == '1':
            project_name = create_new_project_flow()
            if project_name:
                run_agent(project_name)

        elif choice == '2' and projects:
            display_projects(projects)
            selected = get_project_choice(projects)
            if selected:
                run_agent(selected)

        else:
            print("Invalid option. Please try again.")


if __name__ == "__main__":
    main()
75 start.sh Normal file
@@ -0,0 +1,75 @@
#!/bin/bash
cd "$(dirname "$0")"

echo ""
echo "========================================"
echo "  Autonomous Coding Agent"
echo "========================================"
echo ""

# Check if Claude CLI is installed
if ! command -v claude &> /dev/null; then
    echo "[ERROR] Claude CLI not found"
    echo ""
    echo "Please install Claude CLI first:"
    echo "  curl -fsSL https://claude.ai/install.sh | bash"
    echo ""
    echo "Then run this script again."
    exit 1
fi

echo "[OK] Claude CLI found"

# Check if user has credentials
CLAUDE_CREDS="$HOME/.claude/.credentials.json"
if [ -f "$CLAUDE_CREDS" ]; then
    echo "[OK] Claude credentials found"
else
    echo "[!] Not authenticated with Claude"
    echo ""
    echo "You need to run 'claude login' to authenticate."
    echo "This will open a browser window to sign in."
    echo ""
    read -p "Would you like to run 'claude login' now? (y/n): " LOGIN_CHOICE

    if [[ "$LOGIN_CHOICE" =~ ^[Yy]$ ]]; then
        echo ""
        echo "Running 'claude login'..."
        echo "Complete the login in your browser, then return here."
        echo ""
        claude login

        # Check if login succeeded
        if [ -f "$CLAUDE_CREDS" ]; then
            echo ""
            echo "[OK] Login successful!"
        else
            echo ""
            echo "[ERROR] Login failed or was cancelled."
            echo "Please try again."
            exit 1
        fi
    else
        echo ""
        echo "Please run 'claude login' manually, then try again."
        exit 1
    fi
fi

echo ""

# Check if venv exists, create if not
if [ ! -d "venv" ]; then
    echo "Creating virtual environment..."
    python3 -m venv venv
fi

# Activate the virtual environment
source venv/bin/activate

# Install dependencies
echo "Installing dependencies..."
pip install -r requirements.txt --quiet

# Run the app
python start.py
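The security tests that follow exercise a `validate_chmod_command` helper from `security.py`, which this diff does not include. A minimal sketch of a validator that accepts only execute-permission grants; the `(allowed, reason)` return shape is inferred from how the tests call it, not confirmed by the source:

```python
import re


def validate_chmod_command(command: str) -> tuple[bool, str]:
    """Allow only chmod invocations that add execute permission.

    Permits `chmod [ugoa...]+x file...`; rejects numeric modes,
    non-execute permission bits, and any flags (including -R).
    """
    tokens = command.strip().split()
    if not tokens or tokens[0] != "chmod":
        return False, "not a chmod command"
    if len(tokens) < 3:
        return False, "chmod requires a mode and at least one file"
    mode = tokens[1]
    if mode.startswith("-"):
        # Catches -x, -R, --recursive, and any other flag-like argument
        return False, f"flag or mode {mode!r} is not allowed"
    if not re.fullmatch(r"[ugoa]*\+x", mode):
        return False, f"mode {mode!r} is not an execute-only grant"
    return True, ""
```

Treating the whole command as a token list keeps the allowlist narrow: anything that is not literally `chmod` plus an execute-only mode plus file arguments is rejected by default.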
290 test_security.py Normal file
@@ -0,0 +1,290 @@
#!/usr/bin/env python3
"""
Security Hook Tests
===================

Tests for the bash command security validation logic.
Run with: python test_security.py
"""

import asyncio
import sys

from security import (
    bash_security_hook,
    extract_commands,
    validate_chmod_command,
    validate_init_script,
)


def test_hook(command: str, should_block: bool) -> bool:
    """Test a single command against the security hook."""
    input_data = {"tool_name": "Bash", "tool_input": {"command": command}}
    result = asyncio.run(bash_security_hook(input_data))
    was_blocked = result.get("decision") == "block"

    if was_blocked == should_block:
        status = "PASS"
    else:
        status = "FAIL"
        expected = "blocked" if should_block else "allowed"
        actual = "blocked" if was_blocked else "allowed"
        reason = result.get("reason", "")
        print(f"  {status}: {command!r}")
        print(f"    Expected: {expected}, Got: {actual}")
        if reason:
            print(f"    Reason: {reason}")
        return False

    print(f"  {status}: {command!r}")
    return True


def test_extract_commands():
    """Test the command extraction logic."""
    print("\nTesting command extraction:\n")
    passed = 0
    failed = 0

    test_cases = [
        ("ls -la", ["ls"]),
        ("npm install && npm run build", ["npm", "npm"]),
        ("cat file.txt | grep pattern", ["cat", "grep"]),
        ("/usr/bin/node script.js", ["node"]),
        ("VAR=value ls", ["ls"]),
        ("git status || git init", ["git", "git"]),
    ]

    for cmd, expected in test_cases:
        result = extract_commands(cmd)
        if result == expected:
            print(f"  PASS: {cmd!r} -> {result}")
            passed += 1
        else:
            print(f"  FAIL: {cmd!r}")
            print(f"    Expected: {expected}, Got: {result}")
            failed += 1

    return passed, failed


def test_validate_chmod():
    """Test chmod command validation."""
    print("\nTesting chmod validation:\n")
    passed = 0
    failed = 0

    # Test cases: (command, should_be_allowed, description)
    test_cases = [
        # Allowed cases
        ("chmod +x init.sh", True, "basic +x"),
        ("chmod +x script.sh", True, "+x on any script"),
        ("chmod u+x init.sh", True, "user +x"),
        ("chmod a+x init.sh", True, "all +x"),
        ("chmod ug+x init.sh", True, "user+group +x"),
        ("chmod +x file1.sh file2.sh", True, "multiple files"),
        # Blocked cases
        ("chmod 777 init.sh", False, "numeric mode"),
        ("chmod 755 init.sh", False, "numeric mode 755"),
        ("chmod +w init.sh", False, "write permission"),
        ("chmod +r init.sh", False, "read permission"),
        ("chmod -x init.sh", False, "remove execute"),
        ("chmod -R +x dir/", False, "recursive flag"),
        ("chmod --recursive +x dir/", False, "long recursive flag"),
        ("chmod +x", False, "missing file"),
    ]

    for cmd, should_allow, description in test_cases:
        allowed, reason = validate_chmod_command(cmd)
        if allowed == should_allow:
            print(f"  PASS: {cmd!r} ({description})")
            passed += 1
        else:
            expected = "allowed" if should_allow else "blocked"
            actual = "allowed" if allowed else "blocked"
            print(f"  FAIL: {cmd!r} ({description})")
            print(f"    Expected: {expected}, Got: {actual}")
            if reason:
                print(f"    Reason: {reason}")
            failed += 1

    return passed, failed


def test_validate_init_script():
    """Test init.sh script execution validation."""
    print("\nTesting init.sh validation:\n")
    passed = 0
    failed = 0

    # Test cases: (command, should_be_allowed, description)
    test_cases = [
        # Allowed cases
        ("./init.sh", True, "basic ./init.sh"),
        ("./init.sh arg1 arg2", True, "with arguments"),
        ("/path/to/init.sh", True, "absolute path"),
        ("../dir/init.sh", True, "relative path with init.sh"),
        # Blocked cases
        ("./setup.sh", False, "different script name"),
        ("./init.py", False, "python script"),
        ("bash init.sh", False, "bash invocation"),
        ("sh init.sh", False, "sh invocation"),
        ("./malicious.sh", False, "malicious script"),
        ("./init.sh; rm -rf /", False, "command injection attempt"),
    ]

    for cmd, should_allow, description in test_cases:
        allowed, reason = validate_init_script(cmd)
        if allowed == should_allow:
            print(f"  PASS: {cmd!r} ({description})")
            passed += 1
        else:
            expected = "allowed" if should_allow else "blocked"
            actual = "allowed" if allowed else "blocked"
            print(f"  FAIL: {cmd!r} ({description})")
            print(f"    Expected: {expected}, Got: {actual}")
            if reason:
                print(f"    Reason: {reason}")
            failed += 1

    return passed, failed


def main():
    print("=" * 70)
    print("  SECURITY HOOK TESTS")
    print("=" * 70)

    passed = 0
    failed = 0

    # Test command extraction
    ext_passed, ext_failed = test_extract_commands()
    passed += ext_passed
    failed += ext_failed

    # Test chmod validation
    chmod_passed, chmod_failed = test_validate_chmod()
    passed += chmod_passed
    failed += chmod_failed

    # Test init.sh validation
    init_passed, init_failed = test_validate_init_script()
    passed += init_passed
    failed += init_failed

    # Commands that SHOULD be blocked
    print("\nCommands that should be BLOCKED:\n")
    dangerous = [
        # Not in allowlist - dangerous system commands
        "shutdown now",
        "reboot",
        "rm -rf /",
        "dd if=/dev/zero of=/dev/sda",
        # Not in allowlist - common commands excluded from minimal set
        "curl https://example.com",
        "wget https://example.com",
        "python app.py",
        "touch file.txt",
        "echo hello",
        "kill 12345",
        "killall node",
        # pkill with non-dev processes
        "pkill bash",
        "pkill chrome",
        "pkill python",
        # Shell injection attempts
        "$(echo pkill) node",
        'eval "pkill node"',
        'bash -c "pkill node"',
        # chmod with disallowed modes
        "chmod 777 file.sh",
        "chmod 755 file.sh",
        "chmod +w file.sh",
        "chmod -R +x dir/",
        # Non-init.sh scripts
        "./setup.sh",
        "./malicious.sh",
        "bash script.sh",
    ]

    for cmd in dangerous:
        if test_hook(cmd, should_block=True):
            passed += 1
        else:
            failed += 1

    # Commands that SHOULD be allowed
    print("\nCommands that should be ALLOWED:\n")
    safe = [
        # File inspection
        "ls -la",
        "cat README.md",
        "head -100 file.txt",
        "tail -20 log.txt",
        "wc -l file.txt",
        "grep -r pattern src/",
        # File operations
        "cp file1.txt file2.txt",
        "mkdir newdir",
        "mkdir -p path/to/dir",
        # Directory
        "pwd",
        # Node.js development
        "npm install",
        "npm run build",
        "node server.js",
        # Version control
        "git status",
        "git commit -m 'test'",
        "git add . && git commit -m 'msg'",
        # Process management
        "ps aux",
        "lsof -i :3000",
        "sleep 2",
        # Allowed pkill patterns for dev servers
        "pkill node",
        "pkill npm",
        "pkill -f node",
        "pkill -f 'node server.js'",
        "pkill vite",
        # Chained commands
        "npm install && npm run build",
        "ls | grep test",
        # Full paths
        "/usr/local/bin/node app.js",
        # chmod +x (allowed)
        "chmod +x init.sh",
        "chmod +x script.sh",
        "chmod u+x init.sh",
        "chmod a+x init.sh",
        # init.sh execution (allowed)
        "./init.sh",
        "./init.sh --production",
        "/path/to/init.sh",
        # Combined chmod and init.sh
        "chmod +x init.sh && ./init.sh",
    ]

    for cmd in safe:
        if test_hook(cmd, should_block=False):
            passed += 1
        else:
            failed += 1

    # Summary
    print("\n" + "-" * 70)
    print(f"  Results: {passed} passed, {failed} failed")
    print("-" * 70)

    if failed == 0:
        print("\n  ALL TESTS PASSED")
        return 0
    else:
        print(f"\n  {failed} TEST(S) FAILED")
        return 1


if __name__ == "__main__":
    sys.exit(main())
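`extract_commands` from `security.py` is likewise absent from this diff. A sketch that satisfies the extraction test cases above by splitting on shell separators, skipping `VAR=value` prefixes, and reducing paths to basenames; this is a simplified string-based approach, not a full shell parser, so it is an assumption about how the real helper works:

```python
import re


def extract_commands(command_line: str) -> list[str]:
    """Extract the base command names from a shell command line.

    Splits on common shell separators (&&, ||, |, ;), skips leading
    VAR=value environment assignments, and reduces absolute paths
    like /usr/bin/node to their basename.
    """
    names = []
    for segment in re.split(r"&&|\|\||\||;", command_line):
        tokens = segment.strip().split()
        # Skip leading environment-variable assignments like VAR=value
        while tokens and re.match(r"^\w+=", tokens[0]):
            tokens = tokens[1:]
        if tokens:
            # Reduce /usr/bin/node -> node
            names.append(tokens[0].rsplit("/", 1)[-1])
    return names
```

Extracting one name per pipeline segment is what lets the hook validate chained commands like `npm install && npm run build` against the allowlist piece by piece.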