Compare commits
4 Commits
archive-v1...v3.0.0
| Author | SHA1 | Date | |
|---|---|---|---|
| | 92c346e65f | | |
| | c7995bd1f0 | | |
| | 04972720d0 | | |
| | fa470c92fd | | |
2
.gitignore
vendored
@@ -16,6 +16,4 @@ build/
# Environment variables
.env

# VSCode settings
.vscode/
CLAUDE.md
6
.vscode/extensions.json
vendored
Normal file
@@ -0,0 +1,6 @@
{
  "recommendations": [
    "davidanson.vscode-markdownlint",
    "streetsidesoftware.code-spell-checker"
  ]
}
40
.vscode/settings.json
vendored
Normal file
@@ -0,0 +1,40 @@
{
  "cSpell.words": [
    "agentic",
    "Axios",
    "BMAD",
    "Centricity",
    "dataclass",
    "docstrings",
    "emergently",
    "explorative",
    "frontends",
    "golint",
    "Goroutines",
    "HSTS",
    "httpx",
    "Immer",
    "implementability",
    "Inclusivity",
    "Luxon",
    "pasteable",
    "Pino",
    "Polyrepo",
    "Pydantic",
    "pyproject",
    "rescope",
    "roadmaps",
    "roleplay",
    "runbooks",
    "Serilog",
    "shadcn",
    "structlog",
    "Systemization",
    "taskroot",
    "Testcontainers",
    "tmpl",
    "VARCHAR",
    "venv",
    "WCAG"
  ]
}
@@ -1,6 +1,10 @@
# The BMAD-Method 3.1 (Breakthrough Method of Agile (ai-driven) Development)

## Do This First, and all will make sense!
Old Versions:
[Prior Version 1](https://github.com/bmadcode/BMAD-METHOD/tree/V1)
[Prior Version 2](https://github.com/bmadcode/BMAD-METHOD/tree/V2)

## Do This First, and all will make sense

There are lots of docs here, but I HIGHLY suggest you just try the Web Agent - it takes just a few minutes to set up in Gemini - and you can use the BMad Agent to explain how this method works, how to set up in the IDE, how to set up in the Web, what should be done in the web or ide (although you can choose your own path also!) - all just by talking to the bmad agent!

@@ -72,7 +72,7 @@

## 5. Sprint Change Proposal Components

_(Ensure all agreed-upon points from previous sections are captured in the proposal)_
(Ensure all agreed-upon points from previous sections are captured in the proposal)

- [ ] **Identified Issue Summary:** Clear, concise problem statement.
- [ ] **Epic Impact Summary:** How epics are affected.

@@ -1,6 +1,7 @@
# Frontend Architecture Document Review Checklist

## Purpose

This checklist is for the Design Architect to use after completing the "Frontend Architecture Mode" and populating the `front-end-architecture-tmpl.txt` (or `.md`) document. It ensures all sections are comprehensively covered and meet quality standards before finalization.

---
@@ -34,10 +35,12 @@ This checklist is for the Design Architect to use after completing the "Frontend
## IV. Component Breakdown & Implementation Details

### Component Naming & Organization

- [ ] Are conventions for naming components (e.g., PascalCase) described?
- [ ] Is the organization of components on the filesystem clearly explained (reiterating from directory structure if needed)?

### Template for Component Specification

- [ ] Is the "Template for Component Specification" itself complete and well-defined?
- [ ] Does it include fields for: Purpose, Source File(s), Visual Reference?
- [ ] Does it include a table structure for Props (Name, Type, Required, Default, Description)?
@@ -50,6 +53,7 @@ This checklist is for the Design Architect to use after completing the "Frontend
- [ ] Is there a clear statement that this template should be used for most feature-specific components?

### Foundational/Shared Components (if any specified upfront)

- [ ] If any foundational/shared UI components are specified, do they follow the "Template for Component Specification"?
- [ ] Is the rationale for specifying these components upfront clear?

@@ -5,6 +5,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
## 1. PROBLEM DEFINITION & CONTEXT

### 1.1 Problem Statement

- [ ] Clear articulation of the problem being solved
- [ ] Identification of who experiences the problem
- [ ] Explanation of why solving this problem matters
@@ -12,6 +13,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Differentiation from existing solutions

### 1.2 Business Goals & Success Metrics

- [ ] Specific, measurable business objectives defined
- [ ] Clear success metrics and KPIs established
- [ ] Metrics are tied to user and business value
@@ -19,6 +21,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Timeframe for achieving goals specified

### 1.3 User Research & Insights

- [ ] Target user personas clearly defined
- [ ] User needs and pain points documented
- [ ] User research findings summarized (if available)
@@ -28,6 +31,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
## 2. MVP SCOPE DEFINITION

### 2.1 Core Functionality

- [ ] Essential features clearly distinguished from nice-to-haves
- [ ] Features directly address defined problem statement
- [ ] Each Epic ties back to specific user needs
@@ -35,6 +39,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Minimum requirements for success defined

### 2.2 Scope Boundaries

- [ ] Clear articulation of what is OUT of scope
- [ ] Future enhancements section included
- [ ] Rationale for scope decisions documented
@@ -42,6 +47,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Scope has been reviewed and refined multiple times

### 2.3 MVP Validation Approach

- [ ] Method for testing MVP success defined
- [ ] Initial user feedback mechanisms planned
- [ ] Criteria for moving beyond MVP specified
@@ -51,6 +57,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
## 3. USER EXPERIENCE REQUIREMENTS

### 3.1 User Journeys & Flows

- [ ] Primary user flows documented
- [ ] Entry and exit points for each flow identified
- [ ] Decision points and branches mapped
@@ -58,6 +65,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Edge cases considered

### 3.2 Usability Requirements

- [ ] Accessibility considerations documented
- [ ] Platform/device compatibility specified
- [ ] Performance expectations from user perspective defined
@@ -65,6 +73,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] User feedback mechanisms identified

### 3.3 UI Requirements

- [ ] Information architecture outlined
- [ ] Critical UI components identified
- [ ] Visual design guidelines referenced (if applicable)
@@ -74,6 +83,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
## 4. FUNCTIONAL REQUIREMENTS

### 4.1 Feature Completeness

- [ ] All required features for MVP documented
- [ ] Features have clear, user-focused descriptions
- [ ] Feature priority/criticality indicated
@@ -81,6 +91,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Dependencies between features identified

### 4.2 Requirements Quality

- [ ] Requirements are specific and unambiguous
- [ ] Requirements focus on WHAT not HOW
- [ ] Requirements use consistent terminology
@@ -88,6 +99,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Technical jargon minimized or explained

### 4.3 User Stories & Acceptance Criteria

- [ ] Stories follow consistent format
- [ ] Acceptance criteria are testable
- [ ] Stories are sized appropriately (not too large)
@@ -98,6 +110,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
## 5. NON-FUNCTIONAL REQUIREMENTS

### 5.1 Performance Requirements

- [ ] Response time expectations defined
- [ ] Throughput/capacity requirements specified
- [ ] Scalability needs documented
@@ -105,6 +118,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Load handling expectations set

### 5.2 Security & Compliance

- [ ] Data protection requirements specified
- [ ] Authentication/authorization needs defined
- [ ] Compliance requirements documented
@@ -112,6 +126,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Privacy considerations addressed

### 5.3 Reliability & Resilience

- [ ] Availability requirements defined
- [ ] Backup and recovery needs documented
- [ ] Fault tolerance expectations set
@@ -119,6 +134,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Maintenance and support considerations included

### 5.4 Technical Constraints

- [ ] Platform/technology constraints documented
- [ ] Integration requirements outlined
- [ ] Third-party service dependencies identified
@@ -128,6 +144,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
## 6. EPIC & STORY STRUCTURE

### 6.1 Epic Definition

- [ ] Epics represent cohesive units of functionality
- [ ] Epics focus on user/business value delivery
- [ ] Epic goals clearly articulated
@@ -135,6 +152,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Epic sequence and dependencies identified

### 6.2 Story Breakdown

- [ ] Stories are broken down to appropriate size
- [ ] Stories have clear, independent value
- [ ] Stories include appropriate acceptance criteria
@@ -142,6 +160,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Stories aligned with epic goals

### 6.3 First Epic Completeness

- [ ] First epic includes all necessary setup steps
- [ ] Project scaffolding and initialization addressed
- [ ] Core infrastructure setup included
@@ -151,6 +170,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
## 7. TECHNICAL GUIDANCE

### 7.1 Architecture Guidance

- [ ] Initial architecture direction provided
- [ ] Technical constraints clearly communicated
- [ ] Integration points identified
@@ -159,6 +179,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Known areas of high complexity or technical risk flagged for architectural deep-dive

### 7.2 Technical Decision Framework

- [ ] Decision criteria for technical choices provided
- [ ] Trade-offs articulated for key decisions
- [ ] Rationale for selecting primary approach over considered alternatives documented (for key design/feature choices)
@@ -167,6 +188,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Guidance on technical debt approach provided

### 7.3 Implementation Considerations

- [ ] Development approach guidance provided
- [ ] Testing requirements articulated
- [ ] Deployment expectations set
@@ -176,6 +198,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
## 8. CROSS-FUNCTIONAL REQUIREMENTS

### 8.1 Data Requirements

- [ ] Data entities and relationships identified
- [ ] Data storage requirements specified
- [ ] Data quality requirements defined
@@ -184,6 +207,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Schema changes planned iteratively, tied to stories requiring them

### 8.2 Integration Requirements

- [ ] External system integrations identified
- [ ] API requirements documented
- [ ] Authentication for integrations specified
@@ -191,6 +215,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Integration testing requirements outlined

### 8.3 Operational Requirements

- [ ] Deployment frequency expectations set
- [ ] Environment requirements defined
- [ ] Monitoring and alerting needs identified
@@ -200,6 +225,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
## 9. CLARITY & COMMUNICATION

### 9.1 Documentation Quality

- [ ] Documents use clear, consistent language
- [ ] Documents are well-structured and organized
- [ ] Technical terms are defined where necessary
@@ -207,6 +233,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
- [ ] Documentation is versioned appropriately

### 9.2 Stakeholder Alignment

- [ ] Key stakeholders identified
- [ ] Stakeholder input incorporated
- [ ] Potential areas of disagreement addressed
@@ -216,6 +243,7 @@ This checklist serves as a comprehensive framework to ensure the Product Require
## PRD & EPIC VALIDATION SUMMARY

### Category Statuses

| Category | Status | Critical Issues |
|----------|--------|----------------|
| 1. Problem Definition & Context | PASS/FAIL/PARTIAL | |
@@ -229,11 +257,14 @@ This checklist serves as a comprehensive framework to ensure the Product Require
| 9. Clarity & Communication | PASS/FAIL/PARTIAL | |

### Critical Deficiencies

- List all critical issues that must be addressed before handoff to Architect

### Recommendations

- Provide specific recommendations for addressing each deficiency

### Final Decision

- **READY FOR ARCHITECT**: The PRD and epics are comprehensive, properly structured, and ready for architectural design.
- **NEEDS REFINEMENT**: The requirements documentation requires additional work to address the identified deficiencies.
@@ -5,6 +5,7 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
## 1. PROJECT SETUP & INITIALIZATION

### 1.1 Project Scaffolding

- [ ] Epic 1 includes explicit steps for project creation/initialization
- [ ] If using a starter template, steps for cloning/setup are included
- [ ] If building from scratch, all necessary scaffolding steps are defined
@@ -12,6 +13,7 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
- [ ] Repository setup and initial commit processes are defined (if applicable)

### 1.2 Development Environment

- [ ] Local development environment setup is clearly defined
- [ ] Required tools and versions are specified (Node.js, Python, etc.)
- [ ] Steps for installing dependencies are included
@@ -19,6 +21,7 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
- [ ] Development server setup is included

### 1.3 Core Dependencies

- [ ] All critical packages/libraries are installed early in the process
- [ ] Package management (npm, pip, etc.) is properly addressed
- [ ] Version specifications are appropriately defined
@@ -27,6 +30,7 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
## 2. INFRASTRUCTURE & DEPLOYMENT SEQUENCING

### 2.1 Database & Data Store Setup

- [ ] Database selection/setup occurs before any database operations
- [ ] Schema definitions are created before data operations
- [ ] Migration strategies are defined if applicable
@@ -34,12 +38,14 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
- [ ] Database access patterns and security are established early

### 2.2 API & Service Configuration

- [ ] API frameworks are set up before implementing endpoints
- [ ] Service architecture is established before implementing services
- [ ] Authentication framework is set up before protected routes
- [ ] Middleware and common utilities are created before use

### 2.3 Deployment Pipeline

- [ ] CI/CD pipeline is established before any deployment actions
- [ ] Infrastructure as Code (IaC) is set up before use
- [ ] Environment configurations (dev, staging, prod) are defined early
@@ -47,6 +53,7 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
- [ ] Rollback procedures or considerations are addressed

### 2.4 Testing Infrastructure

- [ ] Testing frameworks are installed before writing tests
- [ ] Test environment setup precedes test implementation
- [ ] Mock services or data are defined before testing
@@ -55,18 +62,21 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
## 3. EXTERNAL DEPENDENCIES & INTEGRATIONS

### 3.1 Third-Party Services

- [ ] Account creation steps are identified for required services
- [ ] API key acquisition processes are defined
- [ ] Steps for securely storing credentials are included
- [ ] Fallback or offline development options are considered

### 3.2 External APIs

- [ ] Integration points with external APIs are clearly identified
- [ ] Authentication with external services is properly sequenced
- [ ] API limits or constraints are acknowledged
- [ ] Backup strategies for API failures are considered

### 3.3 Infrastructure Services

- [ ] Cloud resource provisioning is properly sequenced
- [ ] DNS or domain registration needs are identified
- [ ] Email or messaging service setup is included if needed
@@ -75,12 +85,14 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
## 4. USER/AGENT RESPONSIBILITY DELINEATION

### 4.1 User Actions

- [ ] User responsibilities are limited to only what requires human intervention
- [ ] Account creation on external services is properly assigned to users
- [ ] Purchasing or payment actions are correctly assigned to users
- [ ] Credential provision is appropriately assigned to users

### 4.2 Developer Agent Actions

- [ ] All code-related tasks are assigned to developer agents
- [ ] Automated processes are correctly identified as agent responsibilities
- [ ] Configuration management is properly assigned
@@ -89,18 +101,21 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
## 5. FEATURE SEQUENCING & DEPENDENCIES

### 5.1 Functional Dependencies

- [ ] Features that depend on other features are sequenced correctly
- [ ] Shared components are built before their use
- [ ] User flows follow a logical progression
- [ ] Authentication features precede protected routes/features

### 5.2 Technical Dependencies

- [ ] Lower-level services are built before higher-level ones
- [ ] Libraries and utilities are created before their use
- [ ] Data models are defined before operations on them
- [ ] API endpoints are defined before client consumption

### 5.3 Cross-Epic Dependencies

- [ ] Later epics build upon functionality from earlier epics
- [ ] No epic requires functionality from later epics
- [ ] Infrastructure established in early epics is utilized consistently
@@ -109,18 +124,21 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
## 6. MVP SCOPE ALIGNMENT

### 6.1 PRD Goals Alignment

- [ ] All core goals defined in the PRD are addressed in epics/stories
- [ ] Features directly support the defined MVP goals
- [ ] No extraneous features beyond MVP scope are included
- [ ] Critical features are prioritized appropriately

### 6.2 User Journey Completeness

- [ ] All critical user journeys are fully implemented
- [ ] Edge cases and error scenarios are addressed
- [ ] User experience considerations are included
- [ ] Accessibility requirements are incorporated if specified

### 6.3 Technical Requirements Satisfaction

- [ ] All technical constraints from the PRD are addressed
- [ ] Non-functional requirements are incorporated
- [ ] Architecture decisions align with specified constraints
@@ -129,18 +147,21 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
## 7. RISK MANAGEMENT & PRACTICALITY

### 7.1 Technical Risk Mitigation

- [ ] Complex or unfamiliar technologies have appropriate learning/prototyping stories
- [ ] High-risk components have explicit validation steps
- [ ] Fallback strategies exist for risky integrations
- [ ] Performance concerns have explicit testing/validation

### 7.2 External Dependency Risks

- [ ] Risks with third-party services are acknowledged and mitigated
- [ ] API limits or constraints are addressed
- [ ] Backup strategies exist for critical external services
- [ ] Cost implications of external services are considered

### 7.3 Timeline Practicality

- [ ] Story complexity and sequencing suggest a realistic timeline
- [ ] Dependencies on external factors are minimized or managed
- [ ] Parallel work is enabled where possible
@@ -149,12 +170,14 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
## 8. DOCUMENTATION & HANDOFF

### 8.1 Developer Documentation

- [ ] API documentation is created alongside implementation
- [ ] Setup instructions are comprehensive
- [ ] Architecture decisions are documented
- [ ] Patterns and conventions are documented

### 8.2 User Documentation

- [ ] User guides or help documentation is included if required
- [ ] Error messages and user feedback are considered
- [ ] Onboarding flows are fully specified
@@ -163,12 +186,14 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
## 9. POST-MVP CONSIDERATIONS

### 9.1 Future Enhancements

- [ ] Clear separation between MVP and future features
- [ ] Architecture supports planned future enhancements
- [ ] Technical debt considerations are documented
- [ ] Extensibility points are identified

### 9.2 Feedback Mechanisms

- [ ] Analytics or usage tracking is included if required
- [ ] User feedback collection is considered
- [ ] Monitoring and alerting are addressed
@@ -177,6 +202,7 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
## VALIDATION SUMMARY

### Category Statuses

| Category | Status | Critical Issues |
|----------|--------|----------------|
| 1. Project Setup & Initialization | PASS/FAIL/PARTIAL | |
@@ -190,11 +216,14 @@ This checklist serves as a comprehensive framework for the Product Owner to vali
| 9. Post-MVP Considerations | PASS/FAIL/PARTIAL | |

### Critical Deficiencies

- List all critical issues that must be addressed before approval

### Recommendations

- Provide specific recommendations for addressing each deficiency

### Final Decision

- **APPROVED**: The plan is comprehensive, properly sequenced, and ready for implementation.
- **REJECTED**: The plan requires revision to address the identified deficiencies.
@@ -1,10 +1,10 @@
# Story Definition of Done (DoD) Checklist

## Instructions for Developer Agent:
## Instructions for Developer Agent

Before marking a story as 'Review', please go through each item in this checklist. Report the status of each item (e.g., [x] Done, [ ] Not Done, [N/A] Not Applicable) and provide brief comments if necessary.

## Checklist Items:
## Checklist Items

1. **Requirements Met:**

@@ -51,6 +51,6 @@ Before marking a story as 'Review', please go through each item in this checklis
- [ ] User-facing documentation updated, if changes impact users.
- [ ] Technical documentation (e.g., READMEs, system diagrams) updated if significant architectural changes were made.

## Final Confirmation:
## Final Confirmation

- [ ] I, the Developer Agent, confirm that all applicable items above have been addressed.

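Reviewer note: the DoD instructions above ask the agent to report each item as `[x]` Done, `[ ]` Not Done, or `[N/A]` Not Applicable. A minimal Python sketch of tallying such a report, assuming items are written as `- [marker] text`; the name `tally_checklist` and the regex are illustrative only and are not part of this repository:

```python
import re
from collections import Counter

# Matches Markdown checklist items such as "- [x] text", "- [ ] text", "- [N/A] text".
ITEM_RE = re.compile(r"^\s*-\s*\[(x|X| |N/A)\]\s+(.*)$")

# Hypothetical status labels for the three markers named in the DoD instructions.
STATUS_NAMES = {"x": "done", " ": "not_done", "n/a": "not_applicable"}


def tally_checklist(markdown: str) -> Counter:
    """Count checklist items by status; lines that are not items are ignored."""
    counts = Counter()
    for line in markdown.splitlines():
        match = ITEM_RE.match(line)
        if match:
            counts[STATUS_NAMES[match.group(1).lower()]] += 1
    return counts


sample = """\
## Final Confirmation

- [x] User-facing documentation updated, if changes impact users.
- [N/A] Technical documentation updated.
- [ ] I, the Developer Agent, confirm that all applicable items above have been addressed.
"""
print(tally_checklist(sample))
```

Headings and prose lines are skipped on purpose, so the same function could be pointed at a whole checklist file rather than a single section.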
12
bmad-agent/data/technical-preferences.md
Normal file
@@ -0,0 +1,12 @@
# User-Defined Preferred Patterns and Preferences

List out your preferred:
- technical preferences
- design patterns
- languages
- framework
- etc...

Anything you learn or prefer over time to drive future project choices, add them here.

These will be used by the agents when producing PRD and Architectures
@@ -1,6 +0,0 @@
# User-Defined Preferred Patterns and Preferences

See example files in this folder.
list out your technical preferences, patterns you like to follow, language framework or starter project preferences.

Anything you learn or prefer over time to drive future project choices, add the here.
@@ -14,7 +14,7 @@ Example: If above cfg has `agent-root: root/foo/` and `tasks: (agent-root)/tasks

## Title: Analyst

- Name: Wendy
- Name: Mary
- Customize: ""
- Description: "Research assistant, brain storming coach, requirements gathering, project briefs."
- Persona: "analyst.md"
@@ -25,18 +25,19 @@ Example: If above cfg has `agent-root: root/foo/` and `tasks: (agent-root)/tasks

## Title: Product Manager (PM)

- Name: Bill
- Name: John
- Customize: ""
- Description: "Jack has only one goal - to produce or maintain the best possible PRD - or discuss the product with you to ideate or plan current or future efforts related to the product."
- Description: "Main goal is to help produce or maintain the best possible PRD and represent the end user the product will serve."
- Persona: "pm.md"
- Tasks:
- [Create PRD](create-prd.md)
- [Create Document](tasks#create-doc-from-template):
- [Prd](templates#prd-tmpl)

## Title: Architect

- Name: Timmy
- Name: Fred
- Customize: ""
- Description: "Generates Architecture, Can help plan a story, and will also help update PRD level epic and stories."
- Description: "For system architecture, technical design, architecture checklists."
- Persona: "architect.md"
- Tasks:
- [Create Architecture](create-architecture.md)
@@ -46,30 +47,34 @@ Example: If above cfg has `agent-root: root/foo/` and `tasks: (agent-root)/tasks

## Title: Design Architect

- Name: Karen
- Name: Jane
- Customize: ""
- Description: "Help design a website or web application, produce prompts for UI GEneration AI's, and plan a full comprehensive front end architecture."
- Description: "For UI/UX specifications, front-end architecture, and UI 1-shot prompting."
- Persona: "design-architect.md"
- Tasks:
- [Create Frontend Architecture](create-frontend-architecture.md)
- [Create Next Story](create-ai-frontend-prompt.md)
- [Slice Documents](create-uxui-spec.md)

## Title: Product Owner AKA PO
## Title: PO

- Name: Jimmy
- Name: Sarah
- Customize: ""
- Description: "Jack of many trades, from PRD Generation and maintenance to the mid sprint Course Correct. Also able to draft masterful stories for the dev agent."
- Description: "Product Owner helps validate the artifacts are all cohesive with a master checklist, and also helps coach significant changes"
- Persona: "po.md"
- Tasks:
- [Create PRD](create-prd.md)
- [Create Next Story](create-next-story-task.md)
- [Slice Documents](doc-sharding-task.md)
- [Correct Course](correct-course.md)
- checklists:
- [Po Master Checklist](checklists#po-master-checklist)
- [Change Checklist](checklists#change-checklist)
- templates:
- [Story Tmpl](templates#story-tmpl)
- tasks:
- [Checklist Run Task](tasks#checklist-run-task)
- [Extracts Epics and shards the Architecture](tasks#doc-sharding-task)
- [Correct Course](tasks#correct-course)

## Title: Frontend Dev

- Name: Rodney
- Name: Ellyn
- Customize: "Specialized in NextJS, React, Typescript, HTML, Tailwind"
- Description: "Master Front End Web Application Developer"
- Persona: "dev.ide.md"
@@ -94,7 +99,7 @@ Example: If above cfg has `agent-root: root/foo/` and `tasks: (agent-root)/tasks

## Title: Scrum Master: SM

- Name: Fran
- Name: Bob
- Customize: ""
- Description: "Specialized in Next Story Generation"
- Persona: "sm.md"

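Reviewer note: the agent entries in the hunks above share a regular shape, a `## Title:` heading followed by `- Key: value` bullets. A rough Python sketch of reading that shape, assuming values are optionally wrapped in double quotes and ignoring the nested task, checklist, and template links; `parse_agents` is a hypothetical helper written for illustration, not something the repository ships:

```python
import re

# "## Title: Analyst" -> captures "Analyst"
TITLE_RE = re.compile(r"^##\s*Title:\s*(.+)$")
# Simple scalar bullets only; link bullets such as "- [Create PRD](...)" are skipped.
FIELD_RE = re.compile(r"^-\s*(Name|Customize|Description|Persona):\s*(.*)$")


def parse_agents(config_text: str) -> list[dict[str, str]]:
    """Collect one dict per '## Title:' block with its simple key/value bullets."""
    agents: list[dict[str, str]] = []
    for raw in config_text.splitlines():
        line = raw.strip()
        title = TITLE_RE.match(line)
        if title:
            agents.append({"Title": title.group(1).strip()})
            continue
        field = FIELD_RE.match(line)
        if field and agents:
            # Strip the surrounding double quotes used for most values.
            agents[-1][field.group(1)] = field.group(2).strip().strip('"')
    return agents


sample = '''
## Title: Analyst

- Name: Mary
- Customize: ""
- Description: "Research assistant, brain storming coach, requirements gathering, project briefs."
- Persona: "analyst.md"
'''
for agent in parse_agents(sample):
    print(agent["Title"], "->", agent["Name"], agent["Persona"])
```

If a key appears twice within one block, as it does when old and new diff lines sit side by side, the later value overwrites the earlier one in this sketch.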
@@ -67,7 +67,7 @@ This phase focuses on collaboratively crafting a comprehensive and effective pro

Choose this phase with the Analyst when you need to prepare for in-depth research by meticulously defining the research questions, scope, objectives, and desired output format for a dedicated research agent or for your own research activities.

### Instructions
### Deep Research Instructions

<critical*rule>Note on Subsequent Deep Research Execution:</critical_rule>
The output of this phase is a research prompt. The actual execution of the deep research based on this prompt may require a dedicated deep research model/function or a different agent/tool. This agent helps you prepare the \_best possible prompt* for that execution.
@@ -87,7 +87,7 @@ The output of this phase is a research prompt. The actual execution of the deep
- Analytical insights required (e.g., SWOT analysis, trend implications, feasibility assessments).
- Validation of specific hypotheses.
- **Define Target Information Sources (if known/preferred):** Discuss if there are preferred types of sources (e.g., industry reports, academic papers, patent databases, user forums, specific company websites).
- **Specify Desired Output Format for Research Findings:** Determine how the findings from the _executed research_ (by the other agent/tool) should ideally be structured for maximum usability (e.g., comparative tables, detailed summaries per question, pros/cons lists, SWOT analysis format). This will inform the prompt.
- **Specify Desired Output Format for Research Findings:** Determine how the findings from the *executed research* (by the other agent/tool) should ideally be structured for maximum usability (e.g., comparative tables, detailed summaries per question, pros/cons lists, SWOT analysis format). This will inform the prompt.
- **Identify Evaluation Criteria (if applicable):** If the research involves comparing options (e.g., technologies, solutions), define the criteria for evaluation (e.g., cost, performance, scalability, ease of integration).
3. **Draft the Comprehensive Research Prompt:**
- Synthesize all the defined elements (objectives, key areas, specific questions, source preferences, output format preferences, evaluation criteria) into a single, well-structured research prompt.
@@ -103,7 +103,7 @@ The output of this phase is a research prompt. The actual execution of the deep

## Project Briefing Phase

### Instructions
### Project Briefing Instructions

- State that you will use the attached `project-brief-tmpl` as the structure
- Guide through defining each section of the template:
||||
@@ -21,7 +21,7 @@ MUST review and use:
|
||||
- `Project Structure`: `docs/project-structure.md`
|
||||
- `Operational Guidelines`: `docs/operational-guidelines.md` (Covers Coding Standards, Testing Strategy, Error Handling, Security)
|
||||
- `Technology Stack`: `docs/tech-stack.md`
|
||||
- `Story DoD Checklist`: `docs/checklists/story-dod-checklist.txt`
|
||||
- `Story DoD Checklist`: `bmad-agent/checklists/story-dod-checklist.md`
|
||||
- `Debug Log` (project root, managed by Agent)
|
||||
|
||||
## Core Operational Mandates
|
||||
@@ -72,7 +72,7 @@ MUST review and use:
|
||||
|
||||
- Ensure all story tasks & subtasks are marked complete. Verify all tests pass.
|
||||
- <critical_rule>Review `Debug Log`. Meticulously revert all temporary changes for this story. Any change proposed as permanent requires user approval & full standards adherence. `Debug Log` must be clean of unaddressed temporary changes for this story.</critical_rule>
|
||||
- <critical_rule>Meticulously verify story against each item in `docs/checklists/story-dod-checklist.txt`.</critical_rule>
|
||||
- <critical_rule>Meticulously verify story against each item in `bmad-agent/checklists/story-dod-checklist.md`.</critical_rule>
|
||||
- Address any unmet checklist items.
|
||||
- Prepare itemized "Story DoD Checklist Report" in story file. Justify `[N/A]` items. Note DoD check clarifications/interpretations.
|
||||
|
||||
@@ -82,7 +82,7 @@ MUST review and use:
|
||||
- <critical_rule>Update story `Status: Review` in story file if DoD, Tasks and Subtasks are complete.</critical_rule>
|
||||
- State story is complete & HALT!
|
||||
|
||||
## Commands:
|
||||
## Commands
|
||||
|
||||
- `*help` - list these commands
|
||||
- `*core-dump` - ensure story tasks and notes are recorded as of now, and then run bmad-agent/tasks/core-dump.md
|
||||
|
||||
@@ -70,7 +70,7 @@ When responding to requests, gather essential context first:
|
||||
|
||||
For implementation scenarios, summarize key context:
|
||||
|
||||
```
|
||||
```plaintext
|
||||
[Environment] Multi-cloud, multi-region, brownfield
|
||||
[Stack] Microservices, event-driven, containerized
|
||||
[Constraints] SOC2 compliance, 3-month timeline
|
||||
@@ -191,6 +191,7 @@ For complex technical problems, use a structured meta-reasoning approach:
|
||||
## Domain Boundaries with Architecture
|
||||
|
||||
### Collaboration Protocols
|
||||
|
||||
- **Design Review Gates:** Architecture produces technical specifications, DevOps/Platform reviews for implementability
|
||||
- **Feasibility Feedback:** DevOps/Platform provides operational constraints during architecture design phase
|
||||
- **Implementation Planning:** Joint sessions to translate architectural decisions into operational tasks
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# Role: Technical Scrum Master (IDE - Story Creator & Validator)
|
||||
|
||||
## File References:
|
||||
## File References
|
||||
|
||||
`Create Next Story Task`: `bmad-agent/tasks/create-next-story-task.md`
|
||||
|
||||
|
||||
77
bmad-agent/tasks/advanced-elicitation.md
Normal file
@@ -0,0 +1,77 @@
|
||||
# Advanced Elicitation Task
|
||||
|
||||
## Purpose
|
||||
|
||||
- Provide optional reflective and brainstorming actions to enhance content quality
|
||||
- Enable deeper exploration of ideas through structured elicitation techniques
|
||||
- Support iterative refinement through multiple analytical perspectives
|
||||
|
||||
## Task Instructions
|
||||
|
||||
### 1. Ask for review and Present Action List
|
||||
|
||||
[[LLM: Ask the user to review the {drafted document section, or context or document this protocol was executed from}. In the SAME message, inform them that they can suggest additions, removals, or modifications, OR they can select an action by number from the 'Advanced Reflective, Elicitation & Brainstorming Actions'. Then, present ONLY the numbered list (0-9) of these actions as defined in tasks#advanced-elicitation. Conclude by stating that selecting 9 will proceed to the next section. Await user selection. If an elicitation action (0-8) is chosen, execute it and then re-offer this combined review/elicitation choice. If option 9 is chosen, or if the user provides direct feedback on requirements, proceed accordingly.]]
|
||||
|
||||
**Present the numbered list (0-9) with this exact format:**
|
||||
|
||||
```
|
||||
**Advanced Reflective, Elicitation & Brainstorming Actions**
|
||||
Choose an action (0-9 - 9 to bypass - HELP for explanation of these options):
|
||||
|
||||
0. Expand or Contract for Audience
|
||||
1. Explain Reasoning (CoT Step-by-Step)
|
||||
2. Critique and Refine
|
||||
3. Analyze Logical Flow and Dependencies
|
||||
4. Assess Alignment with Overall Goals
|
||||
5. Identify Potential Risks and Unforeseen Issues
|
||||
6. Challenge from Critical Perspective (Self or Other Persona)
|
||||
7. Explore Diverse Alternatives (ToT-Inspired)
|
||||
8. Hindsight is 20/20: The 'If Only...' Reflection
|
||||
9. Proceed / No Further Actions
|
||||
```
|
||||
|
||||
### 2. Processing Guidelines
|
||||
|
||||
**Do NOT show:**
|
||||
|
||||
- The full protocol text with `[[LLM: ...]]` instructions
|
||||
- Detailed explanations of each option, unless you are executing it or the user asks; when giving a definition, you may adapt it to show its relevance to the current content
|
||||
- Any internal template markup
|
||||
|
||||
**After user selection from the list:**
|
||||
|
||||
- Execute the chosen action according to the protocol instructions below
|
||||
- Ask if they want to select another action or proceed with option 9 once complete
|
||||
- Continue until user selects option 9 or indicates completion
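The selection loop described above can be sketched in Python. This is only an illustration of the control flow, not part of the task file: `render_menu`, `run_elicitation`, and the `execute` callback are hypothetical names, and treating non-numeric input as free-form feedback that is skipped is an assumption.

```python
from typing import Callable, Iterable

# Action titles copied from the numbered list in this task (0-9).
ACTIONS = [
    "Expand or Contract for Audience",
    "Explain Reasoning (CoT Step-by-Step)",
    "Critique and Refine",
    "Analyze Logical Flow and Dependencies",
    "Assess Alignment with Overall Goals",
    "Identify Potential Risks and Unforeseen Issues",
    "Challenge from Critical Perspective (Self or Other Persona)",
    "Explore Diverse Alternatives (ToT-Inspired)",
    "Hindsight is 20/20: The 'If Only...' Reflection",
    "Proceed / No Further Actions",
]
VALID_CHOICES = {str(i) for i in range(len(ACTIONS))}


def render_menu() -> str:
    """Build the menu text shown to the user (titles only, no definitions)."""
    header = "**Advanced Reflective, Elicitation & Brainstorming Actions**"
    prompt = "Choose an action (0-9 - 9 to bypass - HELP for explanation of these options):"
    items = [f"{i}. {title}" for i, title in enumerate(ACTIONS)]
    return "\n".join([header, prompt, "", *items])


def run_elicitation(selections: Iterable[str], execute: Callable[[int], None]) -> list[int]:
    """Execute chosen actions until the user picks 9; return the actions run."""
    executed: list[int] = []
    for raw in selections:
        choice = raw.strip()
        if choice not in VALID_CHOICES:
            continue  # assumption: free-form feedback is handled elsewhere
        number = int(choice)
        if number == 9:
            break  # option 9 ends the loop and proceeds to the next section
        execute(number)
        executed.append(number)
    return executed
```

For example, `run_elicitation(["2", "5", "9"], handler)` would call `handler` for actions 2 and 5 and then stop.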
|
||||
|
||||
## Action Definitions
|
||||
|
||||
0. Expand or Contract for Audience
|
||||
[[LLM: Ask the user whether they want to 'expand' on the content (add more detail, elaborate) or 'contract' it (simplify, clarify, make more concise). Also, ask if there's a specific target audience they have in mind. Once clarified, perform the expansion or contraction from your current role's perspective, tailored to the specified audience if provided.]]
|
||||
|
||||
1. Explain Reasoning (CoT Step-by-Step)
|
||||
[[LLM: Explain the step-by-step thinking process, characteristic of your role, that you used to arrive at the current proposal for this content.]]
|
||||
|
||||
2. Critique and Refine
|
||||
[[LLM: From your current role's perspective, review your last output or the current section for flaws, inconsistencies, or areas for improvement, and then suggest a refined version reflecting your expertise.]]
|
||||
|
||||
3. Analyze Logical Flow and Dependencies
|
||||
[[LLM: From your role's standpoint, examine the content's structure for logical progression, internal consistency, and any relevant dependencies. Confirm if elements are presented in an effective order.]]
|
||||
|
||||
4. Assess Alignment with Overall Goals
|
||||
[[LLM: Evaluate how well the current content contributes to the stated overall goals of the document, interpreting this from your specific role's perspective and identifying any misalignments you perceive.]]
|
||||
|
||||
5. Identify Potential Risks and Unforeseen Issues
|
||||
[[LLM: Based on your role's expertise, brainstorm potential risks, overlooked edge cases, or unintended consequences related to the current content or proposal.]]
|
||||
|
||||
6. Challenge from Critical Perspective (Self or Other Persona)
|
||||
[[LLM: Adopt a critical perspective on the current content. If the user specifies another role or persona (e.g., 'as a customer', 'as [Another Persona Name]'), critique the content or play devil's advocate from that specified viewpoint. If no other role is specified, play devil's advocate from your own current persona's viewpoint, arguing against the proposal or current content and highlighting weaknesses or counterarguments specific to your concerns. This can also randomly include YAGNI when appropriate, such as when trimming the scope of an MVP, the perspective might challenge the need for something to cut MVP scope.]]
|
||||
|
||||
7. Explore Diverse Alternatives (ToT-Inspired)
|
||||
[[LLM: From your role's perspective, first broadly brainstorm a range of diverse approaches or solutions to the current topic. Then, from this wider exploration, select and present 2 distinct alternatives, detailing the pros, cons, and potential implications you foresee for each.]]
|
||||
|
||||
8. Hindsight is 20/20: The 'If Only...' Reflection
|
||||
[[LLM: In your current persona, imagine it's a retrospective for a project based on the current content. What's the one 'if only we had known/done X...' that your role would humorously or dramatically highlight, along with the imagined consequences?]]
|
||||
|
||||
9. Proceed / No Further Actions
|
||||
[[LLM: Acknowledge the user's choice to finalize the current work, accept the AI's last output as is, or move on to the next step without selecting another action from this list. Prepare to proceed accordingly.]]
|
||||
@@ -1,4 +1,4 @@
|
||||
## Deep Research Phase
|
||||
# Deep Research Phase
|
||||
|
||||
Leveraging advanced analytical capabilities, the Deep Research Phase with the PM is designed to provide targeted, strategic insights crucial for product definition. Unlike the broader exploratory research an Analyst might undertake, the PM utilizes deep research to:
|
||||
|
||||
@@ -9,13 +9,13 @@ Leveraging advanced analytical capabilities, the Deep Research Phase with the PM
|
||||
|
||||
Choose this phase with the PM when you need to strategically validate a product direction, fill specific knowledge gaps critical for defining _what_ to build, or ensure a strong, evidence-backed foundation for your PRD, especially if initial Analyst research was not performed or requires deeper, product-focused investigation.
|
||||
|
||||
### Purpose
|
||||
## Purpose
|
||||
|
||||
- To gather foundational information, validate concepts, understand market needs, or analyze competitors when a comprehensive Project Brief from an Analyst is unavailable or insufficient.
|
||||
- To ensure the PM has a solid, data-informed basis for defining a valuable and viable product before committing to PRD specifics.
|
||||
- To de-risk product decisions by grounding them in targeted research, especially if the user is engaging the PM directly without prior Analyst work or if the initial brief lacks necessary depth.
|
||||
|
||||
### Instructions
|
||||
## Instructions
|
||||
|
||||
<critical_rule>Note on Deep Research Execution:</critical_rule>
|
||||
To perform deep research effectively, please be aware:
|
||||
|
||||
93
bmad-agent/tasks/create-doc-from-template.md
Normal file
@@ -0,0 +1,93 @@
|
||||
# Create Document from Template Task
|
||||
|
||||
## Purpose
|
||||
|
||||
- Generate documents from any specified template following embedded instructions
|
||||
- Support multiple document types through template-driven approach
|
||||
- Enable any persona to create consistent, well-structured documents
|
||||
|
||||
## Instructions
|
||||
|
||||
### 1. Identify Template and Context
|
||||
|
||||
- Determine which template to use (user-provided, or list the available templates for the user to choose from)
|
||||
|
||||
- Agents defined in agent-config list the documents they can create under this task; treat each listed item as a unique task. So if the user had, for example:
|
||||
|
||||
@{example}
|
||||
|
||||
- tasks:
|
||||
|
||||
- [Create Document](tasks#create-doc-from-template):
|
||||
|
||||
- [Prd](templates#prd-tmpl)
|
||||
|
||||
- [Architecture](templates#architecture-tmpl)
|
||||
|
||||
@{/example}
|
||||
|
||||
you would list `Create Document PRD` and `Create Document Architecture` as tasks the agent could perform.
|
||||
|
||||
- Gather all relevant inputs, or ask for them, or else rely on user providing necessary details to complete the document
|
||||
- Understand the document purpose and target audience
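As a rough illustration of the expansion described in the example above, the sketch below turns a nested task entry into the individual tasks an agent would list. The dictionary shape and the `expand_template_tasks` name are assumptions made for this example; the real agent-config is markdown and is read by the LLM, not by code.

```python
def expand_template_tasks(config: dict[str, list[str]]) -> list[str]:
    """Expand each task label into one task per nested template label.

    `config` maps a task label to the template labels nested under it
    (a hypothetical, already-parsed form of the agent-config entry).
    """
    tasks: list[str] = []
    for task_label, template_labels in config.items():
        if not template_labels:
            # A task with no nested templates is listed as-is.
            tasks.append(task_label)
            continue
        tasks.extend(f"{task_label} {label}" for label in template_labels)
    return tasks


example = {"Create Document": ["PRD", "Architecture"]}
print(expand_template_tasks(example))
```

Each nested template becomes its own selectable task, which is why a single `Create Document` entry can surface several options to the user.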
|
||||
|
||||
### 2. Determine Interaction Mode
|
||||
|
||||
Confirm with the user their preferred interaction style:
|
||||
|
||||
- **Incremental:** Work through chunks of the document.
|
||||
- **YOLO Mode:** Draft the complete document in one shot, making reasonable assumptions. (You can also switch to this mode after starting incrementally by typing /yolo.)
|
||||
|
||||
### 3. Execute Template
|
||||
|
||||
- Load specified template from `templates#*` or the /templates directory
|
||||
- Follow ALL embedded LLM instructions within the template
|
||||
- Process template markup according to `templates#template-format` conventions
|
||||
|
||||
### 4. Template Processing Rules
|
||||
|
||||
**CRITICAL: Never display template markup, LLM instructions, or examples to users**
|
||||
|
||||
- Replace all {{placeholders}} with actual content
|
||||
- Execute all [[LLM: instructions]] internally
|
||||
- Process <<REPEAT>> sections as needed
|
||||
- Evaluate ^^CONDITION^^ blocks and include only if applicable
|
||||
- Use @{examples} for guidance but never output them
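To make the "never display template markup" rule concrete, here is a minimal sketch that strips two kinds of markup and fills placeholders. It is a sketch under stated assumptions, not the method's implementation: in BMAD the LLM performs this processing itself, `render_template` is a hypothetical name, and `<<REPEAT>>` and `^^CONDITION^^` are left unhandled because their exact syntax is defined in `templates#template-format`, which is not shown here.

```python
import re

# Assumed patterns, based only on the markup mentioned in this task.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.-]+)\s*\}\}")
LLM_INSTRUCTION = re.compile(r"\[\[LLM:.*?\]\]", re.DOTALL)
EXAMPLE_BLOCK = re.compile(r"@\{example\}.*?@\{/example\}", re.DOTALL)


def render_template(template: str, values: dict[str, str]) -> str:
    """Return clean output: no examples, no LLM instructions, placeholders filled."""
    text = EXAMPLE_BLOCK.sub("", template)   # examples guide the agent only
    text = LLM_INSTRUCTION.sub("", text)     # instructions are never shown

    def fill(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value for placeholder {key!r}")
        return values[key]

    text = PLACEHOLDER.sub(fill, text)
    text = re.sub(r"\n{3,}", "\n\n", text)   # collapse gaps left by removals
    return text.strip() + "\n"
```

Raising on a missing placeholder is a design choice for the sketch: it makes an unfilled `{{...}}` fail loudly instead of leaking into the final document.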
|
||||
|
||||
### 5. Content Generation
|
||||
|
||||
- **Incremental Mode**: Present each major section for review before proceeding
|
||||
- **YOLO Mode**: Generate all sections, then review complete document with user
|
||||
- Apply any elicitation protocols specified in template
|
||||
- Incorporate user feedback and iterate as needed
|
||||
|
||||
### 6. Validation
|
||||
|
||||
If template specifies a checklist:
|
||||
|
||||
- Run the appropriate checklist against completed document
|
||||
- Document completion status for each item
|
||||
- Address any deficiencies found
|
||||
- Present validation summary to user
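The validation steps above amount to recording a status per checklist item and summarizing the result. A small sketch of that bookkeeping follows; the `ChecklistItem` fields and the status values `"pass"`, `"fail"`, and `"n/a"` are hypothetical, since the checklist format itself is defined by each checklist file.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    section: str
    text: str
    status: str  # assumed values: "pass", "fail", or "n/a"
    note: str = ""


def summarize(items: list[ChecklistItem]) -> dict[str, Counter]:
    """Count statuses per checklist section for the validation summary."""
    per_section: dict[str, Counter] = {}
    for item in items:
        per_section.setdefault(item.section, Counter())[item.status] += 1
    return per_section


def deficiencies(items: list[ChecklistItem]) -> list[ChecklistItem]:
    """Items that still need to be addressed before final presentation."""
    return [item for item in items if item.status == "fail"]
```

The per-section counts map onto the summary presented to the user, and the `deficiencies` list is what remains to be addressed.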
|
||||
|
||||
### 7. Final Presentation
|
||||
|
||||
- Present clean, formatted content only
|
||||
- Ensure all sections are complete
|
||||
- DO NOT truncate or summarize content
|
||||
- Begin directly with document content (no preamble)
|
||||
- Include any handoff prompts specified in template
|
||||
|
||||
## Key Resources
|
||||
|
||||
- **Template Format:** `templates#template-format`
|
||||
- **Available Templates:** All files in `templates#` directory
|
||||
- **Checklists:** As specified by template or persona
|
||||
- **User Preferences:** `data#technical-preferences`
|
||||
|
||||
## Important Notes
|
||||
|
||||
- This task is template and persona agnostic
|
||||
- All specific instructions are embedded in templates
|
||||
- Focus on faithful template execution and clean output
|
||||
- Template markup is for AI processing only - never expose to users
|
||||
@@ -33,7 +33,7 @@ To identify the next logical story based on project progress and epic definition
|
||||
- Verify its `Status` is 'Done' (or equivalent).
|
||||
- If not 'Done', present an alert to the user:
|
||||
|
||||
```
|
||||
```plaintext
|
||||
ALERT: Found incomplete story:
|
||||
File: {lastEpicNum}.{lastStoryNum}.story.md
|
||||
Status: [current status]
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
## Purpose
|
||||
|
||||
To implement a comprehensive platform infrastructure stack based on the Infrastructure Architecture Document, including foundation infrastructure, container orchestration, GitOps workflows, service mesh, and developer experience platforms. This integrated approach ensures all platform components work synergetically to provide a complete, secure, and operationally excellent platform foundation.
|
||||
To implement a comprehensive platform infrastructure stack based on the Infrastructure Architecture Document, including foundation infrastructure, container orchestration, GitOps workflows, service mesh, and developer experience platforms. This integrated approach ensures all platform components work synergistically to provide a complete, secure, and operationally excellent platform foundation.
|
||||
|
||||
## Inputs
|
||||
|
||||
|
||||
@@ -2,228 +2,88 @@
|
||||
|
||||
## Purpose
|
||||
|
||||
- Transform inputs into core product definition documents conforming to the `prd-tmpl` template.
|
||||
- Define clear MVP scope focused on essential functionality.
|
||||
- Provide foundation for Architect and eventually AI dev agents.
|
||||
|
||||
Remember as you follow the upcoming instructions:
|
||||
|
||||
- Your documents form the foundation for the entire development process.
|
||||
- Output will be directly used by the Architect to create an architecture document and solution designs to make definitive technical decisions.
|
||||
- Your epics/stories will ultimately be transformed into development tasks.
|
||||
- While you focus on the "what" not "how", be precise enough to support a logical sequential order of operations that once later further details can logically be followed where a story will complete what is needed.
|
||||
- Transform inputs into core product definition documents conforming to a PRD template
|
||||
- Define clear MVP scope focused on essential functionality
|
||||
- Provide a foundation for the Architect and Design Architect to create technical artifacts, which will in turn be used to draft further details for very junior engineers or simple dev AI agents.
|
||||
|
||||
## Instructions
|
||||
|
||||
### 1. Define Project Workflow Context
|
||||
### 1. Review Inputs
|
||||
|
||||
- Before PRD generation, ask the user to choose their intended workflow:
|
||||
Review all provided inputs including project brief, research documents, prd template and user ideas to guide PRD generation.
|
||||
|
||||
A. **Outcome Focused (Default):** (Agent defines outcome-focused User Stories, leaving detailed technical "how" for Architect/Scrum Master. Capture nuances as "Notes for Architect/Scrum Master in the Prompt for Architect.")
|
||||
### 2. Determine Interaction Mode
|
||||
|
||||
B. **Very Technical (Not Recommended):** (Agent adopts a "solution-aware" stance, providing more detailed, implementation-aware Acceptance Criteria to bridge to development, potentially with no architect involved at all, instead filling in all of the technical details. \<important_note\>When this workflow is selected, you are also responsible for collaboratively defining and documenting key technical foundations—such as technology stack choices and proposed application structure—directly within a new, dedicated section of the PRD template titled '[OPTIONAL: For Simplified PM-to-Development Workflow Only] Core Technical Decisions & Application Structure'.\</important_note\>)
|
||||
Confirm with the user their preferred interaction style:
|
||||
|
||||
- Explain this choice sets a default detail level, which can be fine-tuned later per story/epic.
|
||||
- **Incremental:** Work through sections one at a time via chat messages as defined in the template.
|
||||
|
||||
### 2. Determine Interaction Mode (for PRD Structure & Detail)
|
||||
- **YOLO Mode:** Draft the complete PRD making assumptions as necessary. Present full document at once, noting which sections required assumptions.
|
||||
|
||||
- Confirm with the user their preferred interaction style for creating the PRD if unknown - INCREMENTAL or YOLO?:
|
||||
- **Incrementally (Default):** Address PRD sections sequentially, seeking feedback on each. For Epics/Stories: first present the ordered Epic list for approval, then detail stories for each Epic one by one.
|
||||
- **"YOLO" Mode:** Draft a more comprehensive PRD (or significant portions with multiple sections, epics, and stories) for a single, larger review.
|
||||
### 3. Execute Template
|
||||
|
||||
### 3. Review inputs provided
|
||||
- Use the `prd-tmpl` template (or user-specified alternative template)
|
||||
- Follow all embedded LLM instructions within the template
|
||||
- Template contains section-specific guidance and examples
|
||||
|
||||
Review the inputs provided so far, such as a project brief, any research, and user input and ideas.
|
||||
### 4. Template Processing Notes
|
||||
|
||||
### 4. Process PRD Sections
|
||||
- **Incremental Mode**: Present each section for review before proceeding
|
||||
- **YOLO Mode**: Generate all sections, then review with user
|
||||
|
||||
Inform the user we will work through the PRD sections in order 1 at a time (if not YOLO) - the template contains your instructions for each section. After presenting the section to the user, also [Offer Advanced Self-Refinement & Elicitation Options](#offer-advanced-self-refinement--elicitation-options)
|
||||
Process all template elements according to `templates#template-format` conventions.
|
||||
|
||||
<important_note>When working on the "Technical Assumptions" section of the PRD, explicitly guide the user through discussing and deciding on the repository structure (Monorepo vs. Polyrepo) and the high-level service architecture (e.g., Monolith, Microservices, Serverless functions within a Monorepo). Emphasize that this is a critical decision point that will be formally documented here with its rationale, impacting MVP scope and informing the Architect. Ensure this decision is captured in the PRD's `Technical Assumptions` and then reiterated in the `Initial Architect Prompt` section of the PRD.</important_note>
|
||||
**CRITICAL: Never display or output template markup formatting, LLM instructions or examples - they MUST be used by you the agent only, AND NEVER shown to users in chat or document output**
|
||||
|
||||
<important_note>Specifically for "Simplified PM-to-Development Workflow":
|
||||
After discussing initial PRD sections (like Problem, Goals, User Personas) and before or in parallel with defining detailed Epics and Stories, you must introduce and populate the "[OPTIONAL: For Simplified PM-to-Development Workflow Only] Core Technical Decisions & Application Structure" section of the PRD.
|
||||
**Content Presentation Guidelines:**
|
||||
|
||||
When doing so, first check if a `docs/technical-preferences.md` file exists or has been provided. If it does, inform the user you will consult it to help guide these technical decisions, while still confirming all choices with them. Ask targeted questions such as:
|
||||
- Present only the final, clean content to users
|
||||
- Replace template variables with actual project-specific content
|
||||
- Process all conditional logic internally - show only relevant sections
|
||||
- For Canvas mode: Update the document with clean, formatted content only
|
||||
|
||||
1. "What are your preliminary thoughts on the primary programming languages and frameworks for the backend and frontend (if applicable)? (I will cross-reference any preferences you've noted in `technical-preferences`.)"
|
||||
2. "Which database system are you considering? (Checking preferences...)"
|
||||
3. "Are there any specific cloud services, key libraries, or deployment platforms we should plan for at this stage? (Checking preferences...)"
|
||||
4. "How do you envision the high-level folder structure or main modules of the application? Could you describe the key components and their responsibilities? (I'll consider any structural preferences noted.)"
|
||||
5. "Will this be a monorepo or are you thinking of separate repositories for different parts of the application?"
|
||||
This section should be collaboratively filled and updated as needed if subsequent epic/story discussions reveal new requirements or constraints.
|
||||
### 7. Prepare Handoffs
|
||||
|
||||
</important_note>
|
||||
Based on PRD content, prepare appropriate next-step prompts:
|
||||
|
||||
<important_note>
|
||||
**If UI Component Exists:**
|
||||
|
||||
For the Epic and Story Section (if in Incremental mode for these), prepare in memory what you think the initial epic and story list should be, so we can work through it incrementally. Use all of the information provided thus far to follow the guidelines in the section below, [Guiding Principles for Epic and User Story Generation](#guiding-principles-for-epic-and-user-story-generation).
|
||||
1. Add Design Architect prompt in designated template section
|
||||
2. Recommend: User engages Design Architect first for UI/UX Specification
|
||||
3. Then proceed to Architect with enriched PRD
|
||||
|
||||
</important_note>
|
||||
**If No UI Component:**
|
||||
|
||||
#### 4A. Epic Presentation and Drafting Strategy
|
||||
- Add Architect prompt in designated template section
|
||||
- Recommend proceeding directly to Architect
|
||||
|
||||
You will first present the user with the epic titles and descriptions, so that the user can determine if it is correct and what is expected, or if there is a major epic missing.
|
||||
### 8. Validate with Checklist
|
||||
|
||||
#### 4B. Story Generation and Review within Epics (Incremental Mode)
|
||||
- Run the `pm-checklist` against completed PRD
|
||||
- Document completion status for each checklist item
|
||||
- Present summary by section, address any deficiencies
|
||||
- Generate final checklist report with findings and resolutions
|
||||
|
||||
**Once the Epic List is approved, THEN for each Epic, you will proceed as follows:**
|
||||
### 9. Final Presentation
|
||||
|
||||
i. **Draft All Stories for the Current Epic:** Based on the Epic's goal and your discussions, draft all the necessary User Stories for this Epic, following the "Guiding Principles for Epic and User Story Generation".
|
||||
ii. **Perform Internal Story Analysis & Propose Order:** Before presenting the stories for detailed review, you will internally:
|
||||
a. **Re-evaluate for Cross-Cutting Concerns:** Ensure no drafted stories should actually be ACs or notes within other stories, as per the guiding principle. Make necessary adjustments.
|
||||
b. **Analyze for Logical Sequence & Dependencies:** For all stories within this Epic, determine their logical implementation order. Identify any direct prerequisite stories (e.g., "Story X must be completed before Story Y because Y consumes the output of X").
|
||||
c. **Formulate a Rationale for the Order:** Prepare a brief explanation for why the proposed order is logical.
|
||||
iii. **Present Proposed Story Set & Order for the Epic:** Present to the user:
|
||||
a. The complete list of (potentially revised) User Stories for the Epic.
|
||||
b. The proposed sequence for these stories.
|
||||
c. Your brief rationale for the sequencing and any key dependencies you've noted (e.g., "I suggest this order because Story 2 builds upon the data prepared in Story 1, and Story 3 then uses the results from Story 2.").
|
||||
iv. **Collaborative Review of Sequence & Story Shells:** Discuss this proposed structure and sequence with the user. Make any adjustments to the story list or their order based on user feedback.
|
||||
v. Once the overall structure and sequence of stories for the Epic are agreed upon, THEN you will work with the user to review the details (description, Acceptance Criteria) of each story in the agreed-upon sequence for that Epic.
|
||||
vi. [Offer Advanced Self-Refinement & Elicitation Options](#offer-advanced-self-refinement--elicitation-options)
|
||||
**General Guidelines:**
|
||||
|
||||
#### 4C. Present Complete Draft
|
||||
- Present complete documents in clean, full format
|
||||
- DO NOT truncate unchanged information
|
||||
- Begin directly with content (no introductory text needed)
|
||||
- Ensure all template sections are properly filled
|
||||
- **NEVER show template markup, instructions, or processing directives to users**
|
||||
|
||||
Present the user with the complete full draft once all sections are completed (or as per YOLO mode interaction).
|
||||
## Key Resources
|
||||
|
||||
#### 4D. UI Component Handoff Note
|
||||
- **Default Template:** `templates#prd-tmpl`
|
||||
- **Validation:** `checklists#pm-checklist`
|
||||
- **User Preferences:** `data#technical-preferences`
|
||||
- **Elicitation Protocol:** `tasks#advanced-elicitation`
|
||||
|
||||
If there is a UI component to this PRD, you can inform the user that the Design Architect should take this final output.
|
||||
## Important Notes
|
||||
|
||||
### 5\. Checklist Assessment
|
||||
|
||||
- Use the `pm-checklist` to check whether each item is met (or n/a) against the PRD.
|
||||
- Document completion status for each item.
|
||||
- Present the user with summary of each section of the checklist before going to the next section.
|
||||
- Address deficiencies with user for input or suggested updates or corrections.
|
||||
- Once complete and addressed, output the final checklist with all the checked or skipped items, the section summary table, and any final notes. The checklist should also include any findings that were discussed and then resolved or ignored. This will be a useful artifact for the user to keep.
|
||||
|
||||
### 6\. Produce the PRD
|
||||
|
||||
Produce the PRD with PM Prompt per the `prd-tmpl` utilizing the following guidance:
|
||||
|
||||
**General Presentation & Content:**
|
||||
|
||||
- Present Project Briefs (drafts or final) in a clean, full format.
|
||||
- Crucially, DO NOT truncate information that has not changed from a previous version.
|
||||
- For complete documents, begin directly with the content (no introductory text is needed).
|
||||
|
||||
<important_note>
|
||||
**Next Steps for UI/UX Specification (If Applicable):**
|
||||
|
||||
- If the product described in this PRD includes a user interface:
|
||||
|
||||
1. **Include Design Architect Prompt in PRD:** You will add a dedicated section in the PRD document you are producing, specifically at the location marked `(END Checklist START Design Architect UI/UX Specification Mode Prompt)` (as per the `prd-tmpl` structure). This section will contain a prompt for the **Design Architect** agent.
|
||||
|
||||
- The prompt should clearly state that the Design Architect is to operate in its **'UI/UX Specification Mode'**.
|
||||
|
||||
- It should instruct the Design Architect to use this PRD as primary input to collaboratively define and document detailed UI/UX specifications. This might involve creating/populating a `front-end-spec-tmpl` and ensuring key UI/UX considerations are integrated or referenced back into the PRD to enrich it.
|
||||
|
||||
- Example prompt text to insert:
|
||||
|
||||
```markdown
## Prompt for Design Architect (UI/UX Specification Mode)

**Objective:** Elaborate on the UI/UX aspects of the product defined in this PRD.
**Mode:** UI/UX Specification Mode
**Input:** This completed PRD document.
**Key Tasks:**

1. Review the product goals, user stories, and any UI-related notes herein.
2. Collaboratively define detailed user flows, wire-frames (conceptual), and key screen mockups/descriptions.
3. Specify usability requirements and accessibility considerations.
4. Populate or create the `front-end-spec-tmpl` document.
5. Ensure that this PRD is updated or clearly references the detailed UI/UX specifications derived from your work, so that it provides a comprehensive foundation for subsequent architecture and development phases.

Please guide the user through this process to enrich the PRD with detailed UI/UX specifications.
```
|
||||
|
||||
2. **Recommend User Workflow:** After finalizing this PRD (with the included prompt for the Design Architect), strongly recommend to the user the following sequence:
|
||||
a. First, engage the **Design Architect** agent (using the prompt you've embedded in the PRD) to operate in **'UI/UX Specification Mode'**. Explain that this step is crucial for detailing the user interface and experience, and the output (e.g., a populated `front-end-spec-tmpl` and potentially updated PRD sections) will be vital.
|
||||
b. Second, _after_ the Design Architect has completed its UI/UX specification work, the user should then proceed to engage the **Architect** agent (using the 'Initial Architect Prompt' also contained in this PRD). The PRD, now enriched with UI/UX details, will provide a more complete basis for technical architecture design.
|
||||
|
||||
- If the product does not include a user interface, you will simply recommend proceeding to the Architect agent using the 'Initial Architect Prompt' in the PRD.
|
||||
</important_note>
|
||||
|
||||
## Guiding Principles for Epic and User Story Generation
|
||||
|
||||
### I. Strategic Foundation: Define Core Value & MVP Scope Rigorously
|
||||
|
||||
Understand & Clarify Core Needs: Start by deeply understanding and clarifying the core problem this product solves, the essential needs of the defined User Personas (or system actors), and the key business objectives for the Minimum Viable Product (MVP).
|
||||
Challenge Scope Relentlessly: Actively challenge all requested features and scope at every stage. For each potential feature or story, rigorously ask, "Does this directly support the core MVP goals and provide significant value to a target User Persona?" Clearly identify and defer non-essential functionalities to a Post-MVP backlog.
|
||||
|
||||
### II. Structuring the Work: Value-Driven Epics & Logical Sequencing
|
||||
|
||||
Organize into Deployable, Value-Driven Epics: Structure the MVP scope into Epics. Each Epic must be designed to deliver a significant, end-to-end, and fully deployable increment of testable functionality that provides tangible value to the user or business. Epics should represent logical functional blocks or coherent user journeys.
|
||||
|
||||
Logical Epic Sequencing & Foundational Work:
|
||||
Ensure the sequence of Epics follows a logical implementation order, making dependencies between Epics clear and explicitly managed.
|
||||
The first Epic must always establish the foundational project infrastructure (e.g., initial app setup, Git repository, CI/CD pipeline, core cloud service configurations, basic user authentication shell if needed universally) necessary to support its own deployable functionality and that of subsequent Epics.
|
||||
Ensure Logical Story Sequencing and Dependency Awareness within Epics:
|
||||
After initially drafting all User Stories for an Epic, but before detailed review with the user, you (the AI Agent executing this task) must explicitly perform an internal review to establish a logical sequence for these stories.
|
||||
For each story, identify if it has direct prerequisite stories within the same Epic or from already completed Epics.
|
||||
Propose a clear story order to the user, explaining the rationale based on these dependencies (e.g., "Story X needs to be done before Story Y because..."). Make significant dependencies visible, perhaps as a note within the story description.
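The internal sequencing review described above amounts to a topological sort over story prerequisites. A minimal sketch, assuming stories are identified by hypothetical IDs such as `1.1` and that prerequisites are recorded as a mapping from each story to the stories it depends on (neither convention is prescribed by this task):

```python
from graphlib import CycleError, TopologicalSorter


def order_stories(prerequisites: dict[str, set[str]]) -> list[str]:
    """Return story IDs so that every prerequisite precedes its dependents."""
    sorter = TopologicalSorter(prerequisites)
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of story IDs that form the cycle.
        raise ValueError(f"Circular story dependency: {exc.args[1]}") from exc


# Hypothetical stories for illustration only.
prerequisites = {
    "1.1": set(),           # foundational setup, no prerequisites
    "1.2": {"1.1"},         # needs 1.1 first
    "1.3": {"1.1", "1.2"},  # needs both earlier stories
}
print(order_stories(prerequisites))
```

A circular dependency surfaces as a `ValueError`, which is a useful prompt to revisit how the stories were sliced before presenting an order to the user.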
|
||||
|
||||
### III. Crafting Effective User Stories: Vertical Slices Focused on Value & Clarity
|
||||
|
||||
Define Stories as "Vertical Slices": Within each Epic, define User Stories as "vertical slices". This means each story must deliver a complete piece of functionality that achieves a specific user or system goal, potentially cutting through all necessary layers (e.g., UI, API, business logic, database).
|
||||
Focus on "What" and "Why," Not "How":
|
||||
Stories will primarily focus on the functional outcome, the user value ("what"), and the reason ("why"). Avoid detailing technical implementation ("how") in the story's main description.
|
||||
The "As a {specific User Persona/system actor}, I want {to perform an action / achieve a goal} so that {I can realize a benefit / achieve a reason}" format is standard. Be precise and consistent when defining the '{specific User Persona/system actor}', ensuring it aligns with defined personas.
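As an illustration of the standard story format, the sketch below renders a story from its three parts. The `UserStory` class and its field names are hypothetical and only mirror the placeholders in the sentence above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserStory:
    """One story in the 'As a ..., I want ... so that ...' format."""

    persona: str  # must match a defined User Persona or system actor
    action: str   # the "what"
    benefit: str  # the "why"

    def render(self) -> str:
        return f"As a {self.persona}, I want {self.action} so that {self.benefit}."


story = UserStory(
    persona="returning shopper",
    action="to reorder a previous purchase",
    benefit="I can check out without searching again",
)
print(story.render())
```

Keeping the persona as a separate field makes it easy to check that every story refers to one of the personas already defined in the PRD.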
|
||||
Ensure User Value, Not Just Technical Tasks: User Stories must articulate clear user or business value. Avoid creating stories that are purely technical tasks (e.g., "Set up database," "Refactor module X"), unless they are part of the foundational infrastructure Epic or are essential enabling tasks that are explicitly linked to, and justified by, a user-facing story that delivers value.
|
||||
Appropriate Sizing & Strive for Independence:
|
||||
Ensure User Stories are appropriately sized for a typical development iteration (i.e., can be completed by the team in one sprint/iteration).
|
||||
If a vertically sliced story is too large or complex, work with the user to split it into smaller, still valuable, and still vertically sliced increments.
|
||||
Where feasible, define stories so they can be developed, tested, and potentially delivered independently of others. If dependencies are unavoidable, they must be clearly identified and managed through sequencing.
|
||||
|
||||
### IV. Detailing Stories: Comprehensive Acceptance Criteria & Developer Enablement
|
||||
|
||||
Clear, Comprehensive, and Testable Acceptance Criteria (ACs):
|
||||
Every User Story will have detailed, unambiguous, and testable Acceptance Criteria.
|
||||
ACs precisely define what "done" means for that story from a functional perspective and serve as the basis for verification.
|
||||
Where a specific Non-Functional Requirement (NFR) from the PRD (e.g., a particular performance target for a specific action, a security constraint for handling certain data) is critical to a story, ensure it is explicitly captured or clearly referenced within its Acceptance Criteria.
|
||||
Integrate Developer Enablement & Iterative Design into Stories:
|
||||
Local Testability (CLI): For User Stories involving backend processing or data components, ensure the ACs consider or specify the ability for developers to test that functionality locally (e.g., via CLI commands, local service instances).
|
||||
Iterative Schema Definition: Database schema changes (new tables, columns) should be introduced iteratively within the User Stories that functionally require them, rather than defining the entire schema upfront.
|
||||
Upfront UI/UX Standards (if UI applicable): For User Stories with a UI component, ACs should explicitly state requirements regarding look and feel, responsiveness, and adherence to chosen frameworks/libraries (e.g., Tailwind CSS, shadcn/ui) from the start.
|
||||
|
||||
### V. Managing Complexity: Addressing Cross-Cutting Concerns Effectively
|
||||
|
||||
Critically Evaluate for Cross-Cutting Concerns:
|
||||
Before finalizing a User Story, evaluate if the described functionality is truly a discrete, user-facing piece of value or if it represents a cross-cutting concern (e.g., a specific logging requirement, a UI theme element used by many views, a core technical enabler for multiple other stories, a specific aspect of error handling).
|
||||
If a piece of functionality is identified as a cross-cutting concern:
|
||||
a. Avoid creating a separate User Story for it unless it delivers standalone, testable user value.
|
||||
b. Instead, integrate the requirement as specific Acceptance Criteria within all relevant User Stories it impacts.
|
||||
c. Alternatively, if it's a pervasive technical enabler or a non-functional requirement that applies broadly, document it clearly within the relevant PRD section (e.g., 'Non Functional Requirements', 'Technical Assumptions'), or as a note for the Architect within the story descriptions if highly specific.
|
||||
|
||||
Your aim is to ensure User Stories remain focused on delivering measurable user value, while still capturing all necessary technical and functional details appropriately.
|
||||
|
||||
### VI. Ensuring Quality & Smooth Handoff
|
||||
|
||||
Maintain Clarity for Handoff and Architectural Freedom: User Stories, their descriptions, and Acceptance Criteria must be detailed enough to provide the Architect with a clear and comprehensive understanding of "what is required," while allowing for architectural flexibility on the "how."
|
||||
Confirm "Ready" State: Before considering an Epic's stories complete, ensure each story is effectively "ready" for subsequent architectural review or development planning – meaning it's clear, understandable, testable, its dependencies are noted, and any foundational work (like from the first epic) is accounted for.
|
||||
|
||||
## Offer Advanced Self-Refinement & Elicitation Options
|
||||
|
||||
(This section is called when needed prior to this)
|
||||
|
||||
Present the user with the following list of 'Advanced Reflective, Elicitation & Brainstorming Actions'. Explain that these are optional steps to help ensure quality, explore alternatives, and deepen the understanding of the current section before finalizing it and moving on. The user can select an action by number, or choose to skip this and proceed to finalize the section.
|
||||
|
||||
"To ensure the quality of the current section: **[Specific Section Name]** and to ensure its robustness, explore alternatives, and consider all angles, I can perform any of the following actions. Please choose a number (8 to finalize and proceed):
|
||||
|
||||
**Advanced Reflective, Elicitation & Brainstorming Actions I Can Take:**
|
||||
|
||||
{Instruction for AI Agent: Display the title of each numbered item below. If the user asks what a specific option means, provide a brief explanation of the action you will take, drawing from detailed descriptions tailored for the context.}
|
||||
|
||||
1. **Critical Self-Review & User Goal Alignment**
2. **Generate & Evaluate Alternative Design Solutions**
3. **User Journey & Interaction Stress Test (Conceptual)**
4. **Deep Dive into Design Assumptions & Constraints**
5. **Usability & Accessibility Audit Review & Probing Questions**
6. **Collaborative Ideation & UI Feature Brainstorming**
7. **Elicit 'Unforeseen User Needs' & Future Interaction Questions**
8. **Finalize this Section and Proceed.**
|
||||
|
||||
After I perform the selected action, we can discuss the outcome and decide on any further revisions for this section."
|
||||
|
||||
REPEAT by asking the user if they would like to perform another Reflective, Elicitation & Brainstorming Action UNTIL the user indicates it is time to proceed to the next section (or selects #8).
|
||||
- This task is template-agnostic - users may specify custom templates
|
||||
- All detailed instructions are embedded in templates, not this task file
|
||||
- Focus on orchestration and workflow
|
||||
- **Template markup is for AI processing only - users should never see output indicators from templates#template-format**
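To make the last point concrete, here is a minimal sketch of a post-processing step that removes processing markup from rendered output before a user sees it. It assumes the markup forms that appear in the PRD template (`[[LLM: ...]]` notes, `^^CONDITION: ...^^` markers, and `<<REPEAT: ...>>` markers); the function name is hypothetical, and `@{example}` blocks are deliberately left untouched.

```python
import re

# [[LLM: ...]] notes may span several lines, so match lazily across newlines.
LLM_NOTE = re.compile(r"\[\[LLM:.*?\]\]", re.DOTALL)
# Opening and closing CONDITION and REPEAT markers.
MARKER = re.compile(r"\^\^/?CONDITION:[^^]*\^\^|<</?REPEAT[^>]*>>")


def strip_markup(text: str) -> str:
    """Remove AI-only template markup and tidy the blank lines left behind."""
    text = LLM_NOTE.sub("", text)
    text = MARKER.sub("", text)
    # Collapse runs of blank lines created by the removals.
    return re.sub(r"\n{3,}", "\n\n", text).strip() + "\n"


sample = (
    "## Goals\n\n"
    "[[LLM: Bullet list of\noutcomes]]\n\n"
    "- Ship MVP\n\n"
    "^^CONDITION: has_ui^^\n\n"
    "## UI\n\n"
    "^^/CONDITION: has_ui^^\n"
)
print(strip_markup(sample))
```

This only strips the markers themselves. Deciding whether a conditional section should be kept or dropped is a separate step that this sketch does not attempt.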
|
||||
|
||||
@@ -1,113 +1,182 @@
|
||||
# {Project Name} Product Requirements Document (PRD)
|
||||
# {{Project Name}} Product Requirements Document (PRD)
|
||||
|
||||
## Goal, Objective and Context
|
||||
[[LLM: If available, review any provided document or ask if any are optionally available: Project Brief]]
|
||||
|
||||
This should come mostly from the user or the provided brief, but ask for clarifications as needed.
|
||||
## Goals and Background Context
|
||||
|
||||
## Functional Requirements (MVP)
|
||||
[[LLM: Populate the 2 child sections based on what we have received from user description or the provided brief. Allow user to review the 2 sections and offer changes before proceeding]]
|
||||
|
||||
You should have a good idea at this point, but clarify, suggest, question, and explain to ensure these are correct.
|
||||
### Goals
|
||||
|
||||
## Non Functional Requirements (MVP)
|
||||
[[LLM: Bullet list of 1 line desired outcomes the PRD will deliver if successful - user and project desires]]
|
||||
|
||||
You should have a good idea at this point, but clarify, suggest, question, and explain to ensure these are correct.
|
||||
### Background Context
|
||||
|
||||
## User Interaction and Design Goals
|
||||
[[LLM: 1-2 short paragraphs summarizing the background context, such as what we learned in the brief without being redundant with the goals, what and why this solves a problem, what the current landscape or need is etc...]]
|
||||
|
||||
{
|
||||
If the product includes a User Interface (UI), this section captures the Product Manager's high-level vision and goals for the User Experience (UX). This information will serve as a crucial starting point and brief for the Design Architect.
|
||||
## Requirements
|
||||
|
||||
Consider and elicit information from the user regarding:
|
||||
[[LLM: Draft the list of functional and non functional requirements under the two child sections, and immediately execute tasks#advanced-elicitation display]]
|
||||
|
||||
- **Overall Vision & Experience:** What is the desired look and feel (e.g., "modern and minimalist," "friendly and approachable," "data-intensive and professional")? What kind of experience should users have?
|
||||
- **Key Interaction Paradigms:** Are there specific ways users will interact with core features (e.g., "drag-and-drop interface for X," "wizard-style setup for Y," "real-time dashboard for Z")?
|
||||
- **Core Screens/Views (Conceptual):** From a product perspective, what are the most critical screens or views necessary to deliver the MVP's value? (e.g., "Login Screen," "Main Dashboard," "Item Detail Page," "Settings Page").
|
||||
- **Accessibility Aspirations:** Any known high-level accessibility goals (e.g., "must be usable by screen reader users").
|
||||
- **Branding Considerations (High-Level):** Any known branding elements or style guides that must be incorporated?
|
||||
- **Target Devices/Platforms:** (e.g., "primarily web desktop," "mobile-first responsive web app").
|
||||
### Functional
|
||||
|
||||
This section is not intended to be a detailed UI specification but rather a product-focused brief to guide the subsequent detailed work by the Design Architect, who will create the comprehensive UI/UX Specification document.
|
||||
}
|
||||
[[LLM: Each Requirement will be a markdown bullet with an identifier sequence starting with `FR`.]]
|
||||
@{example: - FR6: The Todo List uses AI to detect and warn against adding potentially duplicate todo items that are worded differently.}
|
||||
|
||||
### Non Functional
|
||||
|
||||
[[LLM: Each Requirement will be a markdown bullet with an identifier sequence starting with `NFR`.]]
|
||||
@{example: - NFR1: AWS service usage **must** aim to stay within free-tier limits where feasible.}
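The identifier convention for both requirement lists can be sketched as a small helper. The function name and sample statements are hypothetical; only the `FR`/`NFR` prefixes and the bullet form come from the template.

```python
def number_requirements(prefix: str, statements: list[str]) -> list[str]:
    """Turn plain requirement statements into bullets with sequential IDs."""
    return [f"- {prefix}{i}: {text}" for i, text in enumerate(statements, start=1)]


functional = number_requirements("FR", [
    "Users can create a todo item.",
    "Users can mark a todo item as complete.",
])
non_functional = number_requirements("NFR", [
    "Pages load within two seconds on a typical connection.",
])
print("\n".join(functional + non_functional))
```

Numbering each list independently keeps the identifiers stable enough to be referenced from acceptance criteria later in the PRD.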
|
||||
|
||||
^^CONDITION: has_ui^^
|
||||
|
||||
## User Interface Design Goals
|
||||
|
||||
[[LLM: Capture high-level UI/UX vision to guide Design Architect and to inform story creation. Steps:
|
||||
|
||||
1. Pre-fill all subsections with educated guesses based on project context
|
||||
2. Present the complete rendered section to user
|
||||
3. Clearly let the user know where assumptions were made
|
||||
4. Ask targeted questions for unclear/missing elements or areas needing more specification
|
||||
5. This is NOT detailed UI spec - focus on product vision and user goals
|
||||
6. After section completion, immediately apply `tasks#advanced-elicitation` protocol]]
|
||||
|
||||
### Overall UX Vision
|
||||
|
||||
### Key Interaction Paradigms
|
||||
|
||||
### Core Screens and Views
|
||||
|
||||
[[LLM: From a product perspective, what are the most critical screens or views necessary to deliver the PRD values and goals? This is meant to be conceptual and high-level, to drive rough Epics or User Stories.]]
|
||||
|
||||
@{example}
|
||||
|
||||
- Login Screen
|
||||
- Main Dashboard
|
||||
- Item Detail Page
|
||||
- Settings Page
|
||||
@{/example}
|
||||
|
||||
### Accessibility: { None, WCAG, etc }
|
||||
|
||||
### Branding
|
||||
|
||||
[[LLM: Any known branding elements or style guides that must be incorporated?]]
|
||||
|
||||
@{example}
|
||||
|
||||
- Replicate the look and feel of early 1900s black and white cinema, including animated effects replicating film damage or projector glitches during page or state transitions.
|
||||
- Attached is the full color palette and tokens for our corporate branding.
|
||||
@{/example}
|
||||
|
||||
### Target Device and Platforms
|
||||
|
||||
@{example}
"Web Responsive, and all mobile platforms", "iPhone Only", "ASCII Windows Desktop"
@{/example}
|
||||
|
||||
^^/CONDITION: has_ui^^
|
||||
|
||||
## Technical Assumptions
|
||||
|
||||
This is where we can list information mostly to be used by the architect to produce the technical details. This could be anything we already know or found out from the user at a technical high level. Inquire about this from the user to get a basic idea of languages, frameworks, knowledge of starter templates, libraries, external apis, potential library choices etc...
|
||||
[[LLM: Gather technical decisions that will guide the Architect. Steps:
|
||||
|
||||
- **Repository & Service Architecture:** {CRITICAL DECISION: Document the chosen repository structure (e.g., Monorepo, Polyrepo) and the high-level service architecture (e.g., Monolith, Microservices, Serverless functions within a Monorepo). Explain the rationale based on project goals, MVP scope, team structure, and scalability needs. This decision directly impacts the technical approach and informs the Architect Agent.}
|
||||
1. Check if `data#technical-preferences` file exists - use it to pre-populate choices
|
||||
2. Ask user about: languages, frameworks, starter templates, libraries, APIs, deployment targets
|
||||
3. For unknowns, offer guidance based on project goals and MVP scope
|
||||
4. Document ALL technical choices with rationale (why this choice fits the project)
|
||||
5. These become constraints for the Architect - be specific and complete
|
||||
6. After section completion, apply `tasks#advanced-elicitation` protocol.]]
|
||||
|
||||
### Repository Structure: { Monorepo, Polyrepo, etc...}
|
||||
|
||||
### Service Architecture
|
||||
|
||||
[[LLM: CRITICAL DECISION - Document the high-level service architecture (e.g., Monolith, Microservices, Serverless functions within a Monorepo).]]
|
||||
|
||||
### Testing requirements
|
||||
|
||||
How will we validate functionality beyond unit testing? Will we want manual scripts or testing, e2e, integration etc... figure this out from the user to populate this section
|
||||
[[LLM: CRITICAL DECISION - Document the testing requirements (unit only, integration, e2e, manual, need for manual testing convenience methods).]]
|
||||
|
||||
## Epic Overview
|
||||
### Additional Technical Assumptions and Requests
|
||||
|
||||
- **Epic {#}: {Title}**
|
||||
- Goal: {A concise 1-2 sentence statement describing the primary objective and value of this Epic.}
|
||||
- Story {#}: As a {type of user/system}, I want {to perform an action / achieve a goal} so that {I can realize a benefit / achieve a reason}.
|
||||
- {Acceptance Criteria List}
|
||||
- Story {#}: As a {type of user/system}, I want {to perform an action / achieve a goal} so that {I can realize a benefit / achieve a reason}.
|
||||
- {Acceptance Criteria List}
|
||||
- **Epic {#}: {Title}**
|
||||
- Goal: {A concise 1-2 sentence statement describing the primary objective and value of this Epic.}
|
||||
- Story {#}: As a {type of user/system}, I want {to perform an action / achieve a goal} so that {I can realize a benefit / achieve a reason}.
|
||||
- {Acceptance Criteria List}
|
||||
- Story {#}: As a {type of user/system}, I want {to perform an action / achieve a goal} so that {I can realize a benefit / achieve a reason}.
|
||||
- {Acceptance Criteria List}
|
||||
[[LLM: Throughout the entire process of drafting this document, if any other technical assumptions are raised or discovered appropriate for the architect, add them here as additional bulleted items]]
|
||||
|
||||
## Key Reference Documents
|
||||
## Epics
|
||||
|
||||
{ This section will be created later, from the sections prior to this being carved up into smaller documents }
|
||||
[[LLM: First, present a high-level list of all epics for user approval, the epic_list and immediately execute tasks#advanced-elicitation display. Each epic should have a title and a short (1 sentence) goal statement. This allows the user to review the overall structure before diving into details.
|
||||
|
||||
## Out of Scope Ideas Post MVP
|
||||
CRITICAL: Epics MUST be logically sequential following agile best practices:
|
||||
|
||||
Anything you and the user agreed is out of scope or can be removed from scope to keep the MVP lean. Consider the goals of the PRD and what might be extra gold plating or additional features that could wait until the MVP is completed and delivered to assess functionality and market fit or usage.
|
||||
- Each epic should deliver a significant, end-to-end, fully deployable increment of testable functionality
- Epic 1 must establish foundational project infrastructure (app setup, Git, CI/CD, core services) unless we are adding new functionality to an existing app, while also delivering an initial piece of functionality, even as simple as a health-check route or display of a simple canary page
- Each subsequent epic builds upon previous epics' functionality, delivering major blocks of functionality that provide tangible value to users or business when deployed
- Not every project needs multiple epics; an epic needs to deliver value. For example, a completed API can deliver value even if a UI is not complete and is planned for a separate epic.
- Err on the side of fewer epics, but let the user know your rationale and offer options for splitting them if it seems some are too large or focused on disparate things.
- Cross-cutting concerns should flow through epics and stories and not be final stories. For example, adding a logging framework as the last story of an epic, or at the end of a project as a final epic or story, would be terrible, as we would not have logging from the beginning.]]
|
||||
|
||||
## [OPTIONAL: For Simplified PM-to-Development Workflow Only] Core Technical Decisions & Application Structure
|
||||
<<REPEAT: epic_list>>
|
||||
|
||||
{This section is to be populated ONLY if the PM is operating in the 'Simplified PM-to-Development Workflow'. It captures essential technical foundations that would typically be defined by an Architect, allowing for a more direct path to development. This information should be gathered after initial PRD sections (Goals, Users, etc.) are drafted, and ideally before or in parallel with detailed Epic/Story definition, and updated as needed.}
|
||||
- Epic{{epic_number}} {{epic_title}}: {{short_goal}}
|
||||
|
||||
### Technology Stack Selections
|
||||
<</REPEAT>>
|
||||
|
||||
{Collaboratively define the core technologies. Be specific about choices and versions where appropriate.}
|
||||
@{example: epic_list}
|
||||
|
||||
- **Primary Backend Language/Framework:** {e.g., Python/FastAPI, Node.js/Express, Java/Spring Boot}
|
||||
- **Primary Frontend Language/Framework (if applicable):** {e.g., TypeScript/React (Next.js), JavaScript/Vue.js}
|
||||
- **Database:** {e.g., PostgreSQL, MongoDB, AWS DynamoDB}
|
||||
- **Key Libraries/Services (Backend):** {e.g., Authentication (JWT, OAuth provider), ORM (SQLAlchemy), Caching (Redis)}
|
||||
- **Key Libraries/Services (Frontend, if applicable):** {e.g., UI Component Library (Material-UI, Tailwind CSS + Headless UI), State Management (Redux, Zustand)}
|
||||
- **Deployment Platform/Environment:** {e.g., Docker on AWS ECS, Vercel, Netlify, Kubernetes}
|
||||
- **Version Control System:** {e.g., Git with GitHub/GitLab}
|
||||
1. Foundation & Core Infrastructure: Establish project setup, authentication, and basic user management
|
||||
2. Core Business Entities: Create and manage primary domain objects with CRUD operations
|
||||
3. User Workflows & Interactions: Enable key user journeys and business processes
|
||||
4. Reporting & Analytics: Provide insights and data visualization for users
|
||||
|
||||
### Proposed Application Structure
|
||||
@{/example}
|
||||
|
||||
{Describe the high-level organization of the codebase. This might include a simple text-based directory layout, a list of main modules/components, and a brief explanation of how they interact. The goal is to provide a clear starting point for developers.}
|
||||
[[LLM: After the epic list is approved, present each `epic_details` with all its stories and acceptance criteria as a complete review unit and immediately execute tasks#advanced-elicitation display, before moving on to the next epic.]]
|
||||
|
||||
Example:
|
||||
<<REPEAT: epic_details>>
|
||||
|
||||
```
/
├── app/                  # Main application source code
│   ├── api/              # Backend API routes and logic
│   │   ├── v1/
│   │   └── models.py
│   ├── web/              # Frontend components and pages (if monolithic)
│   │   ├── components/
│   │   └── pages/
│   ├── core/             # Shared business logic, utilities
│   └── main.py           # Application entry point
├── tests/                # Unit and integration tests
├── scripts/              # Utility scripts
├── Dockerfile
├── requirements.txt
└── README.md
```
|
||||
## Epic {{epic_number}} {{epic_title}}
|
||||
|
||||
- **Monorepo/Polyrepo:** {Specify if a monorepo or polyrepo structure is envisioned, and briefly why.}
|
||||
- **Key Modules/Components and Responsibilities:**
|
||||
- {Module 1 Name}: {Brief description of its purpose and key responsibilities}
|
||||
- {Module 2 Name}: {Brief description of its purpose and key responsibilities}
|
||||
- ...
|
||||
- **Data Flow Overview (Conceptual):** {Briefly describe how data is expected to flow between major components, e.g., Frontend -> API -> Core Logic -> Database.}
|
||||
{{epic_goal}} [[LLM: Expanded goal - 2-3 sentences describing the objective and value all the stories will achieve]]
|
||||
|
||||
[[LLM: CRITICAL STORY SEQUENCING REQUIREMENTS:

- Stories within each epic MUST be logically sequential
- Each story should be a "vertical slice" delivering complete functionality
- No story should depend on work from a later story or epic
- Identify and note any direct prerequisite stories
- Focus on "what" and "why" not "how" (leave technical implementation to Architect) yet be precise enough to support a logical sequential order of operations from story to story.
- Ensure each story delivers clear user or business value, try to avoid enablers and build them into stories that deliver value.
- Size stories for AI agent execution: Each story must be completable by a single AI agent in one focused session without context overflow
- Think "junior developer working for 2-4 hours" - stories must be small, focused, and self-contained
- If a story seems complex, break it down further as long as it can deliver a vertical slice
- Each story should result in working, testable code before the agent's context window fills]]
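The rule that no story may depend on a later story or epic can be checked mechanically once the stories are listed in their proposed order. A minimal sketch, assuming each story is given as a hypothetical `(story_id, prerequisite_ids)` pair; this data shape is an illustration and not part of the template:

```python
def forward_dependencies(
    ordered: list[tuple[str, list[str]]],
) -> list[tuple[str, str]]:
    """Return (story, prerequisite) pairs where the prerequisite is not yet done.

    A prerequisite counts as done only if it appears earlier in the list,
    so unknown or later stories are both reported.
    """
    done: set[str] = set()
    violations: list[tuple[str, str]] = []
    for story_id, prerequisites in ordered:
        for prerequisite in prerequisites:
            if prerequisite not in done:
                violations.append((story_id, prerequisite))
        done.add(story_id)
    return violations


# Hypothetical plan: story 1.2 wrongly relies on 2.1 from a later epic.
plan = [
    ("1.1", []),
    ("1.2", ["1.1", "2.1"]),
    ("2.1", ["1.1"]),
]
print(forward_dependencies(plan))
```

An empty result means the proposed order respects every noted prerequisite; anything else points at stories that need to be reordered or re-sliced.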
|
||||
|
||||
<<REPEAT: story>>
|
||||
|
||||
### Story {{epic_number}}.{{story_number}} {{story_title}}
|
||||
|
||||
As a {{user_type}},
|
||||
I want {{action}},
|
||||
so that {{benefit}}.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
[[LLM: Define clear, comprehensive, and testable acceptance criteria that:
|
||||
|
||||
- Precisely define what "done" means from a functional perspective
|
||||
- Are unambiguous and serve as basis for verification
|
||||
- Include any critical non-functional requirements from the PRD
|
||||
- Consider local testability for backend/data components
|
||||
- Specify UI/UX requirements and framework adherence where applicable
|
||||
- Avoid cross-cutting concerns that should be in other stories or PRD sections]]
|
||||
|
||||
<<REPEAT: criteria>>
|
||||
|
||||
- {{criterion number}}: {{criteria}}
|
||||
|
||||
<</REPEAT>>
|
||||
<</REPEAT>>
|
||||
<</REPEAT>>
|
||||
|
||||
## Change Log
|
||||
|
||||
@@ -118,49 +187,18 @@ Example:
|
||||
|
||||
## Checklist Results Report
|
||||
|
||||
[[LLM: Before running the checklist and drafting the prompts, offer to output the full updated PRD. If outputting it, confirm with the user that you will be proceeding to run the checklist and produce the report. Once the user confirms, execute the `pm-checklist` and populate the results in this section.]]
|
||||
|
||||
----- END Checklist START Design Architect `UI/UX Specification Mode` Prompt ------
|
||||
|
||||
## Design Architect Prompt
|
||||
|
||||
[[LLM: This section will contain the prompt for the Design Architect, keep it short and to the point to initiate create architecture mode using this document as input.]]
|
||||
|
||||
----- END Design Architect `UI/UX Specification Mode` Prompt START Architect Prompt ------
|
||||
|
||||
## Initial Architect Prompt
|
||||
## Architect Prompt
|
||||
|
||||
Based on our discussions and requirements analysis for the {Product Name}, I've compiled the following technical guidance to inform your architecture analysis and decisions to kick off Architecture Creation Mode:
|
||||
[[LLM: This section will contain the prompt for the Architect, keep it short and to the point to initiate create architecture mode using this document as input.]]
|
||||
|
||||
### Technical Infrastructure
|
||||
|
||||
- **Repository & Service Architecture Decision:** {Reiterate the decision made in 'Technical Assumptions', e.g., Monorepo with Next.js frontend and Python FastAPI backend services within the same repo; or Polyrepo with separate Frontend (Next.js) and Backend (Spring Boot Microservices) repositories.}
|
||||
- **Starter Project/Template:** {Information about any starter projects, templates, or existing codebases that should be used}
|
||||
- **Hosting/Cloud Provider:** {Specified cloud platform (AWS, Azure, GCP, etc.) or hosting requirements}
|
||||
- **Frontend Platform:** {Framework/library preferences or requirements (React, Angular, Vue, etc.)}
|
||||
- **Backend Platform:** {Framework/language preferences or requirements (Node.js, Python/Django, etc.)}
|
||||
- **Database Requirements:** {Relational, NoSQL, specific products or services preferred}
|
||||
|
||||
### Technical Constraints
|
||||
|
||||
- {List any technical constraints that impact architecture decisions}
|
||||
- {Include any mandatory technologies, services, or platforms}
|
||||
- {Note any integration requirements with specific technical implications}
|
||||
|
||||
### Deployment Considerations
|
||||
|
||||
- {Deployment frequency expectations}
|
||||
- {CI/CD requirements}
|
||||
- {Environment requirements (local, dev, staging, production)}
|
||||
|
||||
### Local Development & Testing Requirements
|
||||
|
||||
{Include this section only if the user has indicated these capabilities are important. If not applicable based on user preferences, you may remove this section.}
|
||||
|
||||
- {Requirements for local development environment}
|
||||
- {Expectations for command-line testing capabilities}
|
||||
- {Needs for testing across different environments}
|
||||
- {Utility scripts or tools that should be provided}
|
||||
- {Any specific testability requirements for components}
|
||||
|
||||
### Other Technical Considerations
|
||||
|
||||
- {Security requirements with technical implications}
|
||||
- {Scalability needs with architectural impact}
|
||||
- {Any other technical context the Architect should consider}
|
||||
|
||||
----- END Architect Prompt -----
|
||||
----- END Architect Prompt ------
|
||||
|
||||
@@ -45,7 +45,7 @@
|
||||
|
||||
## PM Prompt
|
||||
|
||||
This Project Brief provides the full context for {Project Name}. Please start in 'PRD Generation Mode', review the brief thoroughly to work with the user to create the PRD section by section 1 at a time, asking for any necessary clarification or suggesting improvements as your mode 1 programming allows.
|
||||
This Project Brief provides the full context for {Project Name}. Please start in 'PRD Generation Mode', review the brief thoroughly to work with the user to create the PRD section by section as the template indicates, asking for any necessary clarification or suggesting improvements as your mode 1 programming allows.
|
||||
|
||||
<example_handoff_prompt>
|
||||
This Project Brief provides the full context for Mealmate. Please start in 'PRD Generation Mode', review the brief thoroughly to work with the user to create the PRD section by section 1 at a time, asking for any necessary clarification or suggesting improvements as your mode 1 programming allows.</example_handoff_prompt>
|
||||
|
||||
43
bmad-agent/templates/template-format.md
Normal file
@@ -0,0 +1,43 @@
|
||||
# MD Template Format:
|
||||
|
||||
- {{placeholder}} = Simple text replacement placeholder
|
||||
- [[LLM: instruction]] = Instructions for the LLM (not included in output)
|
||||
- <<REPEAT: section_name>> ... <</REPEAT>> = Repeating section
|
||||
- ^^CONDITION: condition_name^^ ... ^^/CONDITION: condition_name^^ = Conditional section that will render if the condition_name logically applies
|
||||
- @{example: content} = Single line example content for LLM guidance - do not render
|
||||
- @{example} ... @{/example} = Multi-line example content for LLM guidance - do not render
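The markup above is meant to be interpreted by the LLM agent, not by code. Purely as an illustration of two of the rules (placeholders are replaced, LLM instructions and single-line examples are never rendered), here is a minimal sketch. `renderSimpleTemplate` is a hypothetical helper, not part of BMAD, and it deliberately ignores the REPEAT, CONDITION, and multi-line example markers.

```typescript
// Hypothetical helper for illustration only; BMAD itself relies on the LLM
// to interpret template markup.
function renderSimpleTemplate(
  template: string,
  values: Record<string, string>
): string {
  return template
    // Drop [[LLM: ...]] instructions; they must never reach the output.
    .replace(/\[\[LLM:[\s\S]*?\]\]\n?/g, "")
    // Drop single-line @{example: ...} guidance.
    .replace(/@\{example:[^}]*\}\n?/g, "")
    // Replace {{name}} with its value; unknown placeholders are left as-is.
    .replace(/\{\{([^}]+)\}\}/g, (match: string, name: string) => {
      const value = values[name.trim()];
      return value !== undefined ? value : match;
    });
}
```

Leaving unknown placeholders untouched (rather than emitting an empty string) makes missing values easy to spot in the rendered document.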
|
||||
|
||||
## Critical Template Usage Rules
|
||||
|
||||
- CRITICAL: Never display or output template markup formatting, LLM instructions or examples
|
||||
- they MUST be used by you the agent only, AND NEVER shown to users in chat or documented output
|
||||
- Present only the final, clean content to users
|
||||
- Replace template variables with actual project-specific content
|
||||
- Show examples only when they add value, without the markup
|
||||
- Process all conditional logic internally - show only relevant sections
|
||||
- For Canvas mode: Update the document with clean, formatted content only
|
||||
|
||||
@{example}
|
||||
|
||||
# My Template Foo
|
||||
|
||||
[[LLM: Check the current system date and if the user name is unknown, just say hello]]
|
||||
Hello {{users name}}, this is your foo report for {{todays date}}
|
||||
|
||||
<<REPEAT: single_foo>>
|
||||
[[LLM: For Each Foo, Create a matching creative Bar]]
|
||||
|
||||
## Foo: {{Bar}}
|
||||
|
||||
<</REPEAT>>
|
||||
|
||||
^^CONDITION: if_BAZ_exists^^
|
||||
|
||||
## BAZ
|
||||
|
||||
### You haz BAZ! Here is your daily Baz Forecast!
|
||||
|
||||
[[LLM: Give the user their daily baz report here]]
|
||||
^^/CONDITION: if_BAZ_exists^^
|
||||
|
||||
@{/example}
|
||||
@@ -19,9 +19,6 @@
|
||||
- "Brain Storming"
|
||||
- "Deep Research"
|
||||
- "Project Briefing"
|
||||
- Interaction Modes:
|
||||
- "Interactive"
|
||||
- "YOLO"
|
||||
- templates:
|
||||
- [Project Brief Tmpl](templates#project-brief-tmpl)
|
||||
|
||||
@@ -29,20 +26,16 @@
|
||||
|
||||
- Name: John
|
||||
- Customize: ""
|
||||
- Description: "For PRDs, project planning, PM checklists and potential replans."
|
||||
- Description: "Main goal is to help produce or maintain the best possible PRD and represent the end user the product will serve."
|
||||
- Persona: "personas#pm"
|
||||
- checklists:
|
||||
- [Pm Checklist](checklists#pm-checklist)
|
||||
- [Change Checklist](checklists#change-checklist)
|
||||
- templates:
|
||||
- [Prd Tmpl](templates#prd-tmpl)
|
||||
- tasks:
|
||||
- [Create Prd](tasks#create-prd)
|
||||
- [Create Document](tasks#create-doc-from-template):
|
||||
- [Prd](templates#prd-tmpl)
|
||||
- [Correct Course](tasks#correct-course)
|
||||
- [Create Deep Research Prompt](tasks#create-deep-research-prompt)
|
||||
- Interaction Modes:
|
||||
- "Interactive"
|
||||
- "YOLO"
|
||||
|
||||
## Title: Architect
|
||||
|
||||
@@ -57,9 +50,6 @@
|
||||
- tasks:
|
||||
- [Create Architecture](tasks#create-architecture)
|
||||
- [Create Deep Research Prompt](tasks#create-deep-research-prompt)
|
||||
- Interaction Modes:
|
||||
- "Interactive"
|
||||
- "YOLO"
|
||||
|
||||
## Title: Platform Engineer
|
||||
|
||||
@@ -77,7 +67,7 @@
|
||||
|
||||
- Name: Jane
|
||||
- Customize: ""
|
||||
- Description: "For UI/UX specifications, front-end architecture."
|
||||
- Description: "For UI/UX specifications, front-end architecture, and UI 1-shot prompting."
|
||||
- Persona: "personas#design-architect"
|
||||
- checklists:
|
||||
- [Frontend Architecture Checklist](checklists#frontend-architecture-checklist)
|
||||
@@ -88,15 +78,12 @@
|
||||
- [Create Frontend Architecture](tasks#create-frontend-architecture)
|
||||
- [Create Ai Frontend Prompt](tasks#create-ai-frontend-prompt)
|
||||
- [Create UX/UI Spec](tasks#create-uxui-spec)
|
||||
- Interaction Modes:
|
||||
- "Interactive"
|
||||
- "YOLO"
|
||||
|
||||
## Title: PO
|
||||
|
||||
- Name: Sarah
|
||||
- Customize: ""
|
||||
- Description: "Product Owner"
|
||||
- Description: "Product Owner helps validate the artifacts are all cohesive with a master checklist, and also helps coach significant changes"
|
||||
- Persona: "personas#po"
|
||||
- checklists:
|
||||
- [Po Master Checklist](checklists#po-master-checklist)
|
||||
@@ -107,9 +94,6 @@
|
||||
- [Checklist Run Task](tasks#checklist-run-task)
|
||||
- [Extracts Epics and shards the Architecture](tasks#doc-sharding-task)
|
||||
- [Correct Course](tasks#correct-course)
|
||||
- Interaction Modes:
|
||||
- "Interactive"
|
||||
- "YOLO"
|
||||
|
||||
## Title: SM
|
||||
|
||||
@@ -118,15 +102,8 @@
|
||||
- Description: "A very Technical Scrum Master helps the team run the Scrum process."
|
||||
- Persona: "personas#sm"
|
||||
- checklists:
|
||||
- [Change Checklist](checklists#change-checklist)
|
||||
- [Story Dod Checklist](checklists#story-dod-checklist)
|
||||
- [Story Draft Checklist](checklists#story-draft-checklist)
|
||||
- tasks:
|
||||
- [Checklist Run Task](tasks#checklist-run-task)
|
||||
- [Correct Course](tasks#correct-course)
|
||||
- [Draft a story for dev agent](tasks#story-draft-task)
|
||||
- templates:
|
||||
- [Story Tmpl](templates#story-tmpl)
|
||||
- Interaction Modes:
|
||||
- "Interactive"
|
||||
- "YOLO"
|
||||
|
||||
@@ -1,13 +0,0 @@
|
||||
A simple project run through the Web Gemini BMad Agent - all artifacts from a single chat session (split up into smaller files with the sharding task)
|
||||
|
||||
- The [Project Brief](./v3-output-demo-files/project-brief.md) was first collaborated on and created with the Analyst
|
||||
- The first [PRD Draft](./v3-output-demo-files/prd.draft.md) was created with the PM
|
||||
- The [Architecture](./v3-output-demo-files/architecture.md) was created and then we worked on some design artifacts. The architect conversation led to changes in the PRD that are reflected later.
|
||||
|
||||
Design Artifacts with the Design Architect:
|
||||
|
||||
- [UX UI Spec](./v3-output-demo-files/ux-ui-spec.md)
|
||||
- [V0 1 Shot UI Prompt](./v3-output-demo-files/v0-prompt.md)
|
||||
- [Front End Architecture](./v3-output-demo-files/front-end-architecture.md)
|
||||
|
||||
Then the PRD was updated with fixed Epics and Stories after running the PO Checklist. The PO took all changes from the architect and design architect and worked them back into the updated [PRD Final](./v3-output-demo-files/prd.md)
|
||||
@@ -1,34 +0,0 @@
|
||||
# API Reference
|
||||
|
||||
## External APIs Consumed
|
||||
|
||||
**1. Algolia Hacker News Search API**
|
||||
|
||||
* **Base URL:** `http://hn.algolia.com/api/v1/`
|
||||
* **Authentication:** None.
|
||||
* **Endpoints Used:**
|
||||
* `GET /search_by_date?tags=story&hitsPerPage={N}` (For top posts)
|
||||
* `GET /items/{POST_ID}` (For comments/post details)
|
||||
* **Key Data Extracted:** Post title, article URL, HN link, HN Post ID, author, points, creation timestamp; Comment text, author, creation timestamp.
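As a rough sketch of how the two Algolia endpoints listed above could be addressed, the helpers below only build request URLs from the documented base URL and paths. The function names are hypothetical, and no HTTP client is assumed.

```typescript
// Base URL as given in this API reference (without the trailing slash).
const ALGOLIA_BASE_URL = "http://hn.algolia.com/api/v1";

// Hypothetical helper: URL for GET /search_by_date?tags=story&hitsPerPage={N}.
function storiesUrl(hitsPerPage: number): string {
  if (!Number.isInteger(hitsPerPage) || hitsPerPage <= 0) {
    throw new RangeError("hitsPerPage must be a positive integer");
  }
  return `${ALGOLIA_BASE_URL}/search_by_date?tags=story&hitsPerPage=${hitsPerPage}`;
}

// Hypothetical helper: URL for GET /items/{POST_ID}.
function itemUrl(postId: string): string {
  return `${ALGOLIA_BASE_URL}/items/${encodeURIComponent(postId)}`;
}
```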
|
||||
|
||||
**2. Play.ai PlayNote API**
|
||||
|
||||
* **Base URL:** `https://api.play.ai/api/v1/`
|
||||
* **Authentication:** Headers: `Authorization: Bearer <PLAY_AI_BEARER_TOKEN>`, `X-USER-ID: <PLAY_AI_USER_ID>`.
|
||||
* **Endpoints Used:**
|
||||
* `POST /playnotes` (Submit job)
|
||||
* Request: `application/json` with `sourceText`, `title`, voice params (from env vars: `PLAY_AI_VOICE1_ID`, `PLAY_AI_VOICE1_NAME`, `PLAY_AI_VOICE2_ID`, `PLAY_AI_VOICE2_NAME`), style (`PLAY_AI_STYLE`).
|
||||
* Response: JSON with `jobId`.
|
||||
* `GET /playnote/{jobId}` (Poll status)
|
||||
* Response: JSON with `status`, `audioUrl` (if completed).
|
||||
|
||||
## Internal APIs Provided (by backend for frontend)
|
||||
|
||||
* **Base URL Path Prefix:** `/v1` (Full URL from `NEXT_PUBLIC_BACKEND_API_URL`).
|
||||
* **Authentication:** Requires "Frontend Read API Key" via `x-api-key` header for GET endpoints. A separate "Admin Action API Key" for trigger endpoint.
|
||||
* **Endpoints:**
|
||||
* **`GET /status`**: Health/status check. Response: `{"message": "BMad Daily Digest Backend is operational.", "timestamp": "..."}`.
|
||||
* **`GET /episodes`**: Lists episodes. Response: `{ "episodes": [EpisodeListItem, ...] }`.
|
||||
* **`GET /episodes/{episodeId}`**: Episode details. Response: `EpisodeDetail` object.
|
||||
* **`POST /jobs/daily-digest/trigger`**: (Admin Key) Triggers daily pipeline. Response: `{"message": "...", "executionArn": "..."}`.
|
||||
* **Common Errors:** 401 Unauthorized, 404 Not Found, 500 Internal Server Error.
|
||||
@@ -1,571 +0,0 @@
|
||||
# BMad Daily Digest Architecture Document
|
||||
|
||||
**Version:** 0.1
|
||||
**Date:** May 20, 2025
|
||||
**Author:** Fred (Architect) & User
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. Introduction / Preamble
|
||||
2. Technical Summary
|
||||
3. High-Level Overview
|
||||
* Backend Architectural Style
|
||||
* Frontend Architectural Style
|
||||
* Repository Structure
|
||||
* Primary Data Flow & User Interaction (Conceptual)
|
||||
* System Context Diagram (Conceptual)
|
||||
4. Architectural / Design Patterns Adopted
|
||||
5. Component View
|
||||
* Backend Components
|
||||
* Frontend Components
|
||||
* External Services
|
||||
* Component Interaction Diagram (Conceptual Backend Focus)
|
||||
6. Project Structure
|
||||
* Backend Repository (`bmad-daily-digest-backend`)
|
||||
* Frontend Repository (`bmad-daily-digest-frontend`)
|
||||
* Notes
|
||||
7. API Reference
|
||||
* External APIs Consumed
|
||||
* Internal APIs Provided
|
||||
8. Data Models
|
||||
* Core Application Entities / Domain Objects
|
||||
* API Payload Schemas (Internal API)
|
||||
* Database Schemas (AWS DynamoDB)
|
||||
9. Core Workflow / Sequence Diagrams
|
||||
* Daily Automated Podcast Generation Pipeline (Backend)
|
||||
* Frontend User Requesting and Playing an Episode
|
||||
10. Definitive Tech Stack Selections
|
||||
11. Infrastructure and Deployment Overview
|
||||
12. Error Handling Strategy
|
||||
13. Coding Standards (Backend: `bmad-daily-digest-backend`)
|
||||
* Detailed Language & Framework Conventions (TypeScript/Node.js - Backend Focus)
|
||||
14. Overall Testing Strategy
|
||||
15. Security Best Practices
|
||||
16. Key Reference Documents
|
||||
17. Change Log
|
||||
18. Prompt for Design Architect (Jane) - To Produce Frontend Architecture Document
|
||||
|
||||
-----
|
||||
|
||||
## 1\. Introduction / Preamble
|
||||
|
||||
This document outlines the overall project architecture for "BMad Daily Digest," including backend systems, frontend deployment infrastructure, shared services considerations, and non-UI specific concerns. Its primary goal is to serve as the guiding architectural blueprint for AI-driven development and human developers, ensuring consistency and adherence to chosen patterns and technologies as defined in the Product Requirements Document (PRD v0.1) and UI/UX Specification (v0.1).
|
||||
|
||||
**Relationship to Frontend Architecture:**
|
||||
The frontend application (Next.js) will have its own detailed frontend architecture considerations (component structure, state management, etc.) which will be detailed in a separate Frontend Architecture Document (to be created by the Design Architect, Jane, based on a prompt at the end of this document). This overall Architecture Document will define the backend services the frontend consumes, the infrastructure for hosting the frontend (S3/CloudFront), and ensure alignment on shared technology choices (like TypeScript) and API contracts. The "Definitive Tech Stack Selections" section herein is the single source of truth for all major technology choices across the project.
|
||||
|
||||
## 2\. Technical Summary
|
||||
|
||||
"BMad Daily Digest" is a serverless application designed to automatically produce a daily audio podcast summarizing top Hacker News posts. The backend, built with TypeScript on Node.js 22 and deployed as AWS Lambda functions, will fetch data from the Algolia HN API, scrape linked articles, process content, and use the Play.ai API for audio generation (with job status managed by AWS Step Functions polling). Podcast metadata will be stored in DynamoDB, and audio files in S3. All backend infrastructure will be managed via AWS CDK within its own repository.
|
||||
|
||||
The frontend will be a Next.js (React, TypeScript) application, styled with Tailwind CSS and shadcn/ui to an "80s retro CRT terminal" aesthetic, kickstarted by an AI UI generation tool. It will be a statically exported site hosted on AWS S3 and delivered globally via AWS CloudFront, with its infrastructure also managed by a separate AWS CDK application within its own repository. The frontend will consume data from the backend via an AWS API Gateway secured with API Keys. The entire system aims for cost-efficiency, leveraging AWS free-tier services where possible.
|
||||
|
||||
## 3\. High-Level Overview
|
||||
|
||||
The "BMad Daily Digest" system is architected as a decoupled, serverless application designed for automated daily content aggregation and audio generation, with a statically generated frontend for content consumption.
|
||||
|
||||
* **Backend Architectural Style:** The backend employs a **serverless, event-driven architecture** leveraging AWS Lambda functions for discrete processing tasks. These tasks are orchestrated by AWS Step Functions to manage the daily content pipeline, including interactions with external services. An API layer is provided via AWS API Gateway for frontend consumption.
|
||||
* **Frontend Architectural Style:** The frontend is a **statically generated site (SSG)** built with Next.js. This approach maximizes performance, security, and cost-effectiveness by serving pre-built files from AWS S3 via CloudFront.
|
||||
* **Repository Structure:** The project utilizes a **polyrepo structure** with two primary repositories:
|
||||
* `bmad-daily-digest-backend`: Housing all backend TypeScript code, AWS CDK for backend infrastructure, Lambda functions, Step Function definitions, etc.
|
||||
* `bmad-daily-digest-frontend`: Housing the Next.js TypeScript application, UI components, styling, and its dedicated AWS CDK application for S3/CloudFront infrastructure.
|
||||
* **Primary Data Flow & User Interaction (Conceptual):**
|
||||
1. **Daily Automated Pipeline (Backend):**
|
||||
* An Amazon EventBridge Scheduler rule triggers an AWS Step Function state machine daily.
|
||||
* The Step Function orchestrates a sequence of AWS Lambda functions to:
|
||||
* Fetch top posts and comments from the Algolia HN API (identifying repeats).
|
||||
* Scrape and extract content from linked external article URLs (for new posts or if scraping previously failed).
|
||||
* Aggregate and format the text content (handling new posts, updates, scrape failures, truncation).
|
||||
* Submit the formatted text to the Play.ai PlayNote API, receiving a `jobId`.
|
||||
* Poll the Play.ai API (using the `jobId`) for podcast generation status until completion or failure.
|
||||
* Upon completion, download the generated MP3 audio from Play.ai.
|
||||
* Store the MP3 file in a designated S3 bucket.
|
||||
* Store episode metadata (including the S3 audio link, source HN post details, etc.) in a DynamoDB table and update HN post processing states.
|
||||
2. **User Consumption (Frontend):**
|
||||
* The user accesses the "BMad Daily Digest" Next.js web application served via AWS CloudFront from an S3 bucket.
|
||||
* The frontend application makes API calls (via `axios`) to an AWS API Gateway endpoint (secured with an API Key).
|
||||
* API Gateway routes these requests to specific AWS Lambda functions that query the DynamoDB table to retrieve episode lists and details.
|
||||
* The frontend renders the information and provides an HTML5 audio player to stream/play the MP3 from its S3/CloudFront URL.
|
||||
|
||||
**System Context Diagram (Conceptual):**
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A[User] -->|Views & Interacts via Browser| B(Frontend Application\nNext.js on S3/CloudFront);
|
||||
B -->|"Fetches Episode Data (HTTPS, API Key)"| C(Backend API\nAPI Gateway + Lambda);
|
||||
C -->|Reads/Writes| D(Episode Metadata\nDynamoDB);
|
||||
B -->|Streams Audio| E(Podcast Audio Files\nS3 via CloudFront);
|
||||
|
||||
F[Daily Scheduler\nEventBridge] -->|Triggers| G(Orchestration Service\nAWS Step Functions);
|
||||
G -->|Invokes| H(Data Collection Lambdas\n- Fetch HN via Algolia\n- Scrape Articles);
|
||||
H -->|Calls| I[Algolia HN API];
|
||||
H -->|Scrapes| J[External Article Websites];
|
||||
G -->|Invokes| K(Content Processing Lambda);
|
||||
G -->|Invokes| L(Play.ai Interaction Lambdas\n- Submit Job\n- Poll Status);
|
||||
L -->|Calls / Gets Status| M[Play.ai PlayNote API];
|
||||
G -->|Invokes| N(Storage Lambdas\n- Store Audio to S3\n- Store Metadata to DynamoDB);
|
||||
N -->|Writes| E;
|
||||
N -->|Writes| D;
|
||||
M -->|Returns Audio URL| L;
|
||||
```
|
||||
|
||||
## 4\. Architectural / Design Patterns Adopted
|
||||
|
||||
The following key architectural and design patterns have been chosen for this project:
|
||||
|
||||
* **Serverless Architecture:** Entire backend on AWS Lambda, API Gateway, S3, DynamoDB, Step Functions. Rationale: Minimized operations, auto-scaling, pay-per-use, cost-efficiency.
|
||||
* **Event-Driven Architecture:** Daily pipeline initiated by EventBridge Scheduler; Step Functions orchestrate based on state changes. Rationale: Decoupled components, reactive system for automation.
|
||||
* **Microservices-like Approach (Backend Lambda Functions):** Each Lambda handles a specific, well-defined task. Rationale: Modularity, independent scalability, easier testing/maintenance.
|
||||
* **Static Site Generation (SSG) for Frontend:** Next.js frontend exported as static files, hosted on S3/CloudFront. Rationale: Optimal performance, security, scalability, lower hosting costs.
|
||||
* **Infrastructure as Code (IaC):** AWS CDK in TypeScript for all AWS infrastructure in both repositories. Rationale: Repeatable, version-controlled, automated provisioning.
|
||||
* **Polling Pattern (External Job Status):** AWS Step Functions implement a polling loop for Play.ai job status. Rationale: Reliable tracking of asynchronous third-party jobs, based on Play.ai docs.
|
||||
* **Orchestration Pattern (AWS Step Functions):** End-to-end daily backend pipeline managed by a Step Functions state machine. Rationale: Robust workflow automation, state management, error handling for multi-step processes.
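The polling pattern listed above boils down to a decision taken after each status check, which in Step Functions would typically sit in a Choice state. The sketch below expresses that decision as a pure function. The status strings, step names, and attempt budget are assumptions for illustration; the document does not specify the values Play.ai returns.

```typescript
// Hypothetical status values; the actual Play.ai strings are not given here.
type JobStatus = "PENDING" | "COMPLETED" | "FAILED";

// Hypothetical names for the next state in the polling loop.
type NextStep = "WAIT_AND_POLL_AGAIN" | "DOWNLOAD_AUDIO" | "FAIL_PIPELINE";

function decideNextStep(
  status: JobStatus,
  attempt: number,
  maxAttempts: number
): NextStep {
  if (status === "COMPLETED") return "DOWNLOAD_AUDIO";
  if (status === "FAILED") return "FAIL_PIPELINE";
  // Still pending: keep polling until the attempt budget is spent.
  return attempt < maxAttempts ? "WAIT_AND_POLL_AGAIN" : "FAIL_PIPELINE";
}
```

Bounding the loop with `maxAttempts` keeps a stuck third-party job from running the state machine indefinitely.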
|
||||
|
||||
## 5\. Component View
|
||||
|
||||
The system is divided into distinct backend and frontend components.
|
||||
|
||||
**Backend Components (`bmad-daily-digest-backend` repository):**
|
||||
|
||||
1. **Daily Workflow Orchestrator (AWS Step Functions state machine):** Manages the end-to-end daily pipeline.
|
||||
2. **HN Data Fetcher Service (AWS Lambda):** Fetches HN posts/comments (Algolia), identifies repeats (via DynamoDB).
|
||||
3. **Article Scraping Service (AWS Lambda):** Scrapes/extracts content from external article URLs, handles fallbacks.
|
||||
4. **Content Formatting Service (AWS Lambda):** Aggregates and formats text payload for Play.ai.
|
||||
5. **Play.ai Interaction Service (AWS Lambda functions, orchestrated by Polling Step Function):** Submits job to Play.ai, polls for status.
|
||||
6. **Podcast Storage Service (AWS Lambda):** Downloads audio from Play.ai, stores to S3.
|
||||
7. **Metadata Persistence Service (AWS Lambda & DynamoDB Tables):** Manages episode and HN post processing state metadata in DynamoDB.
|
||||
8. **Backend API Service (AWS API Gateway + AWS Lambda functions):** Exposes endpoints for frontend (episode lists/details).
|
||||
|
||||
**Frontend Components (`bmad-daily-digest-frontend` repository):**
|
||||
|
||||
1. **Next.js Web Application (Static Site on S3/CloudFront):** Renders UI, handles navigation.
|
||||
2. **Frontend API Client Service (TypeScript module):** Encapsulates communication with the Backend API Service.
|
||||
|
||||
**External Services:** Algolia HN Search API, Play.ai PlayNote API, Various External Article Websites.
|
||||
|
||||
**Component Interaction Diagram (Conceptual Backend Focus):**
|
||||
|
||||
```mermaid
|
||||
graph LR
|
||||
subgraph Frontend Application Space
|
||||
F_App[Next.js App on S3/CloudFront]
|
||||
F_APIClient[Frontend API Client]
|
||||
F_App --> F_APIClient
|
||||
end
|
||||
|
||||
subgraph Backend API Space
|
||||
APIGW[API Gateway]
|
||||
API_L[Backend API Lambdas]
|
||||
APIGW --> API_L
|
||||
end
|
||||
|
||||
subgraph Backend Daily Pipeline Space
|
||||
Scheduler[EventBridge Scheduler] --> Orchestrator[Step Functions Orchestrator]
|
||||
|
||||
Orchestrator --> HNFetcher[HN Data Fetcher Lambda]
|
||||
HNFetcher -->|Reads/Writes Post Status| DDB
|
||||
HNFetcher --> Algolia[Algolia HN API]
|
||||
|
||||
Orchestrator --> ArticleScraper[Article Scraper Lambda]
|
||||
ArticleScraper --> ExtWebsites[External Article Websites]
|
||||
|
||||
Orchestrator --> ContentFormatter[Content Formatter Lambda]
|
||||
|
||||
Orchestrator --> PlayAISubmit[Play.ai Submit Lambda]
|
||||
PlayAISubmit --> PlayAI_API[Play.ai PlayNote API]
|
||||
|
||||
subgraph Polling_SF["Play.ai Polling (Step Functions)"]
|
||||
direction LR
|
||||
PollTask[Poll Status Lambda] --> PlayAI_API
|
||||
end
|
||||
Orchestrator --> Polling_SF
|
||||
|
||||
|
||||
Orchestrator --> PodcastStorage[Podcast Storage Lambda]
|
||||
PodcastStorage --> PlayAI_API
|
||||
PodcastStorage --> S3Store[S3 Audio Storage]
|
||||
|
||||
Orchestrator --> MetadataService[Metadata Persistence Lambda]
|
||||
MetadataService --> DDB[DynamoDB Episode/Post Metadata]
|
||||
end
|
||||
|
||||
F_APIClient --> APIGW
|
||||
API_L --> DDB
|
||||
|
||||
classDef external fill:#ddd,stroke:#333,stroke-width:2px;
|
||||
class Algolia,ExtWebsites,PlayAI_API external;
|
||||
```
|
||||
|
||||
## 6\. Project Structure
|
||||
|
||||
The project utilizes a polyrepo structure with separate backend and frontend repositories, each with its own CDK application.
|
||||
|
||||
**1. Backend Repository (`bmad-daily-digest-backend`)**
|
||||
Organized by features within `src/`, using `dash-case` for folders and files (e.g., `src/features/content-ingestion/hn-fetcher-service.ts`).
|
||||
|
||||
```plaintext
|
||||
bmad-daily-digest-backend/
|
||||
├── .github/
|
||||
├── cdk/
|
||||
│ ├── bin/
|
||||
│ ├── lib/ # Backend Stack, Step Function definitions
|
||||
│ └── test/
|
||||
├── src/
|
||||
│ ├── features/
|
||||
│ │ ├── dailyJobOrchestrator/ # Main Step Function trigger/definition support
|
||||
│ │ ├── hnContentPipeline/ # Services for Algolia, scraping, formatting
|
||||
│ │ ├── playAiIntegration/ # Services for Play.ai submit & polling Lambda logic
|
||||
│ │ ├── podcastPersistence/ # Services for S3 & DynamoDB storage
|
||||
│ │ └── publicApi/ # Handlers for API Gateway (status, episodes)
|
||||
│ ├── shared/
|
||||
│ │ ├── utils/
|
||||
│ │ ├── types/
|
||||
│ │ └── services/ # Optional shared low-level AWS SDK wrappers
|
||||
├── tests/ # Unit/Integration tests, mirroring src/features/
|
||||
│ └── features/
|
||||
... (root config files: .env.example, .eslintrc.js, .gitignore, .prettierrc.js, jest.config.js, package.json, README.md, tsconfig.json)
|
||||
```
|
||||
|
||||
*Key Directories: `cdk/` for IaC, `src/features/` for modular backend logic, `src/shared/` for reusable code, `tests/` for Jest tests.*
|
||||
|
||||
**2. Frontend Repository (`bmad-daily-digest-frontend`)**
|
||||
Aligns with V0.dev generated Next.js App Router structure, using `dash-case` for custom files/folders where applicable.
|
||||
|
||||
```plaintext
|
||||
bmad-daily-digest-frontend/
|
||||
├── .github/
|
||||
├── app/
|
||||
│ ├── (pages)/
|
||||
│ │ ├── episodes/
|
||||
│ │ │ ├── page.tsx # List page
|
||||
│ │ │ └── [episode-id]/
|
||||
│ │ │ └── page.tsx # Detail page
|
||||
│ │ └── about/
|
||||
│ │ └── page.tsx
|
||||
│ ├── layout.tsx
|
||||
│ └── globals.css
|
||||
├── components/
|
||||
│ ├── ui/ # shadcn/ui based components
|
||||
│ └── domain/ # Custom composite components (e.g., episode-card)
|
||||
├── cdk/ # AWS CDK application for frontend infra (S3, CloudFront)
|
||||
│ ├── bin/
|
||||
│ └── lib/
|
||||
├── hooks/
|
||||
├── lib/
|
||||
│ ├── types.ts
|
||||
│ ├── utils.ts
|
||||
│ └── api-client.ts # Backend API communication
|
||||
├── public/
|
||||
├── tests/ # Jest & RTL tests
|
||||
... (root config files: .env.local.example, .eslintrc.js, components.json, next.config.mjs, package.json, tailwind.config.ts, tsconfig.json)
|
||||
```
|
||||
|
||||
*Key Directories: `app/` for Next.js routes, `components/` for UI, `cdk/` for frontend IaC, `lib/` for utilities and `api-client.ts`.*
|
||||
|
||||
## 7\. API Reference
|
||||
|
||||
### External APIs Consumed
|
||||
|
||||
**1. Algolia Hacker News Search API**
|
||||
|
||||
* **Base URL:** `http://hn.algolia.com/api/v1/`
|
||||
* **Authentication:** None.
|
||||
* **Endpoints Used:**
|
||||
* `GET /search_by_date?tags=story&hitsPerPage={N}` (For top posts)
|
||||
* `GET /items/{POST_ID}` (For comments/post details)
|
||||
* **Key Data Extracted:** Post title, article URL, HN link, HN Post ID, author, points, creation timestamp; Comment text, author, creation timestamp.
|
||||
|
||||
**2. Play.ai PlayNote API**
|
||||
|
||||
* **Base URL:** `https://api.play.ai/api/v1/`
|
||||
* **Authentication:** Headers: `Authorization: Bearer <PLAY_AI_BEARER_TOKEN>`, `X-USER-ID: <PLAY_AI_USER_ID>`.
|
||||
* **Endpoints Used:**
|
||||
* `POST /playnotes` (Submit job)
|
||||
* Request: `application/json` with `sourceText`, `title`, voice params (from env vars: `PLAY_AI_VOICE1_ID`, `PLAY_AI_VOICE1_NAME`, `PLAY_AI_VOICE2_ID`, `PLAY_AI_VOICE2_NAME`), style (`PLAY_AI_STYLE`).
|
||||
* Response: JSON with `jobId`.
|
||||
* `GET /playnote/{jobId}` (Poll status)
|
||||
* Response: JSON with `status`, `audioUrl` (if completed).
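A minimal sketch of the Play.ai request plumbing described above, limited to what this section states: the two authentication headers, the poll path, and the `status` / `audioUrl` response fields. The helper names are hypothetical, and the voice and style fields of the submit body are left out because their JSON names are not given here.

```typescript
const PLAY_AI_BASE_URL = "https://api.play.ai/api/v1";

interface PlayAiCredentials {
  bearerToken: string; // value of PLAY_AI_BEARER_TOKEN
  userId: string; // value of PLAY_AI_USER_ID
}

// Hypothetical helper: headers required on every Play.ai call.
function playAiHeaders(creds: PlayAiCredentials): Record<string, string> {
  return {
    Authorization: `Bearer ${creds.bearerToken}`,
    "X-USER-ID": creds.userId,
    "Content-Type": "application/json",
  };
}

// Hypothetical helper: URL for GET /playnote/{jobId}.
function pollUrl(jobId: string): string {
  return `${PLAY_AI_BASE_URL}/playnote/${encodeURIComponent(jobId)}`;
}

// Response fields as listed above; audioUrl appears only once the job completes.
interface PlayNoteStatusResponse {
  status: string;
  audioUrl?: string;
}

// Returns the audio URL when present, otherwise null (job not finished yet).
function audioUrlIfReady(res: PlayNoteStatusResponse): string | null {
  return res.audioUrl !== undefined && res.audioUrl !== "" ? res.audioUrl : null;
}
```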
|
||||
|
||||
### Internal APIs Provided (by backend for frontend)
|
||||
|
||||
* **Base URL Path Prefix:** `/v1` (Full URL from `NEXT_PUBLIC_BACKEND_API_URL`).
|
||||
* **Authentication:** Requires "Frontend Read API Key" via `x-api-key` header for GET endpoints. A separate "Admin Action API Key" for trigger endpoint.
|
||||
* **Endpoints:**
|
||||
* **`GET /status`**: Health/status check. Response: `{"message": "BMad Daily Digest Backend is operational.", "timestamp": "..."}`.
|
||||
* **`GET /episodes`**: Lists episodes. Response: `{ "episodes": [EpisodeListItem, ...] }`.
|
||||
* **`GET /episodes/{episodeId}`**: Episode details. Response: `EpisodeDetail` object.
|
||||
* **`POST /jobs/daily-digest/trigger`**: (Admin Key) Triggers daily pipeline. Response: `{"message": "...", "executionArn": "..."}`.
|
||||
* **Common Errors:** 401 Unauthorized, 404 Not Found, 500 Internal Server Error.
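To make the internal API contract concrete, the sketch below builds a request description for the two episode GET endpoints, attaching the read key in the `x-api-key` header. `episodesRequest` is a hypothetical helper, and it assumes the configured base URL (from `NEXT_PUBLIC_BACKEND_API_URL`) already includes the `/v1` prefix.

```typescript
interface ApiRequest {
  method: "GET";
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: describes GET /episodes or GET /episodes/{episodeId}.
// Assumes baseUrl already ends with the /v1 path prefix.
function episodesRequest(
  baseUrl: string,
  readApiKey: string,
  episodeId?: string
): ApiRequest {
  const root = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  const path =
    episodeId === undefined
      ? "/episodes"
      : `/episodes/${encodeURIComponent(episodeId)}`;
  return {
    method: "GET",
    url: `${root}${path}`,
    headers: { "x-api-key": readApiKey },
  };
}
```

The returned object can be handed to whichever HTTP client the frontend uses; keeping it client-agnostic makes the key handling easy to unit test.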
|
||||
|
||||
## 8\. Data Models
|
||||
|
||||
### Core Application Entities
|
||||
|
||||
**a. Episode**
|
||||
|
||||
* Attributes: `episodeId` (PK, UUID), `publicationDate` (YYYY-MM-DD), `episodeNumber` (Number), `podcastGeneratedTitle` (String), `audioS3Bucket` (String), `audioS3Key` (String), `audioUrl` (String, derived for API), `playAiJobId` (String), `playAiSourceAudioUrl` (String), `sourceHNPosts` (List of `SourceHNPost`), `status` (String: "PROCESSING", "PUBLISHED", "FAILED"), `createdAt` (ISO Timestamp), `updatedAt` (ISO Timestamp).
|
||||
|
||||
**b. SourceHNPost (object within `Episode.sourceHNPosts`)**
|
||||
|
||||
* Attributes: `hnPostId` (String), `title` (String), `originalArticleUrl` (String), `hnLink` (String), `isUpdateStatus` (Boolean), `oldRank` (Number, Optional), `lastCommentFetchTimestamp` (Number, Unix Timestamp), `articleScrapingFailed` (Boolean), `articleTitleFromScrape` (String, Optional).
|
||||
|
||||
**c. HackerNewsPostProcessState (DynamoDB Table)**
|
||||
|
||||
* Attributes: `hnPostId` (PK, String), `originalArticleUrl` (String), `articleTitleFromScrape` (String, Optional), `lastSuccessfullyScrapedTimestamp` (Number, Optional), `lastCommentFetchTimestamp` (Number, Optional), `firstProcessedDate` (YYYY-MM-DD), `lastProcessedDate` (YYYY-MM-DD), `lastKnownRank` (Number, Optional).
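Since both repositories use TypeScript, the three entities above translate fairly directly into interfaces. The following is a sketch derived from the attribute lists; treating dates and ISO timestamps as `string` is an assumption, and `isEpisodeStatus` is a hypothetical guard added for illustration.

```typescript
type EpisodeStatus = "PROCESSING" | "PUBLISHED" | "FAILED";

interface SourceHNPost {
  hnPostId: string;
  title: string;
  originalArticleUrl: string;
  hnLink: string;
  isUpdateStatus: boolean;
  oldRank?: number;
  lastCommentFetchTimestamp: number; // Unix timestamp
  articleScrapingFailed: boolean;
  articleTitleFromScrape?: string;
}

interface Episode {
  episodeId: string; // PK, UUID
  publicationDate: string; // YYYY-MM-DD
  episodeNumber: number;
  podcastGeneratedTitle: string;
  audioS3Bucket: string;
  audioS3Key: string;
  audioUrl: string; // derived for the API
  playAiJobId: string;
  playAiSourceAudioUrl: string;
  sourceHNPosts: SourceHNPost[];
  status: EpisodeStatus;
  createdAt: string; // ISO timestamp
  updatedAt: string; // ISO timestamp
}

interface HackerNewsPostProcessState {
  hnPostId: string; // PK
  originalArticleUrl: string;
  articleTitleFromScrape?: string;
  lastSuccessfullyScrapedTimestamp?: number;
  lastCommentFetchTimestamp?: number;
  firstProcessedDate: string; // YYYY-MM-DD
  lastProcessedDate: string; // YYYY-MM-DD
  lastKnownRank?: number;
}

// Hypothetical guard: narrows an untyped string read from storage.
function isEpisodeStatus(value: string): value is EpisodeStatus {
  return value === "PROCESSING" || value === "PUBLISHED" || value === "FAILED";
}
```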
|
||||
|
||||
### API Payload Schemas (Internal API)
|
||||
|
||||
**a. `EpisodeListItem` (for `GET /episodes`)**
|
||||
|
||||
* `episodeId`, `publicationDate`, `episodeNumber`, `podcastGeneratedTitle`.
|
||||
|
||||
**b. `EpisodeDetail` (for `GET /episodes/{episodeId}`)**
|
||||
|
||||
* `episodeId`, `publicationDate`, `episodeNumber`, `podcastGeneratedTitle`, `audioUrl`, `sourceHNPosts` (list of `SourceHNPostDetail` containing `hnPostId`, `title`, `originalArticleUrl`, `hnLink`, `isUpdateStatus`, `oldRank`), `playAiJobId` (optional), `playAiSourceAudioUrl` (optional), `createdAt`.
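The list payload is a strict subset of the stored episode, so the API handler presumably has to project stored records down to those four fields. A minimal sketch of that projection follows; `toEpisodeListItem` is a hypothetical name, not taken from the codebase.

```typescript
interface EpisodeListItem {
  episodeId: string;
  publicationDate: string; // YYYY-MM-DD
  episodeNumber: number;
  podcastGeneratedTitle: string;
}

// Hypothetical projection: copies only the list fields so that internal
// attributes (S3 keys, job ids, and so on) never leak into GET /episodes.
function toEpisodeListItem<T extends EpisodeListItem>(episode: T): EpisodeListItem {
  return {
    episodeId: episode.episodeId,
    publicationDate: episode.publicationDate,
    episodeNumber: episode.episodeNumber,
    podcastGeneratedTitle: episode.podcastGeneratedTitle,
  };
}
```

Copying fields explicitly, rather than spreading the stored record, means a newly added internal attribute stays private by default.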
|
||||
|
||||
### Database Schemas (AWS DynamoDB)
|
||||
|
||||
**a. `BmadDailyDigestEpisodes` Table**
|
||||
|
||||
* PK: `episodeId` (String).
|
||||
* Attributes: As per `Episode` entity.
|
||||
* GSI Example (`PublicationDateIndex`): PK: `status`, SK: `publicationDate`.
|
||||
* Billing: PAY\_PER\_REQUEST.
|
||||
|
||||
**b. `HackerNewsPostProcessState` Table**
|
||||
|
||||
* PK: `hnPostId` (String).
|
||||
* Attributes: As per `HackerNewsPostProcessState` entity.
|
||||
* Billing: PAY\_PER\_REQUEST.
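As an illustration of how the `PublicationDateIndex` GSI could serve the episode list, the sketch below builds DynamoDB Query parameters as a plain object, newest episodes first. The function name is hypothetical; the values are written in the plain (unmarshalled) form that a document-style client accepts, which is an assumption about the client in use. `status` is aliased because it is a DynamoDB reserved word.

```typescript
// Hypothetical helper: Query input for published episodes via the GSI
// (PK: status, SK: publicationDate), sorted by date descending.
function publishedEpisodesQuery(limit: number) {
  return {
    TableName: "BmadDailyDigestEpisodes",
    IndexName: "PublicationDateIndex",
    KeyConditionExpression: "#status = :status",
    ExpressionAttributeNames: { "#status": "status" },
    ExpressionAttributeValues: { ":status": "PUBLISHED" },
    ScanIndexForward: false, // descending sort-key order
    Limit: limit,
  };
}
```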
|
||||
|
||||
## 9\. Core Workflow / Sequence Diagrams
|
||||
|
||||
### 1\. Daily Automated Podcast Generation Pipeline (Backend)
|
||||
|
||||
*(Mermaid diagram as previously shown, detailing EventBridge -\> Step Functions -\> Lambdas -\> Algolia/External Sites/Play.ai -\> S3/DynamoDB).*
|
||||
|
||||
```mermaid
sequenceDiagram
    participant Sched as Scheduler (EventBridge)
    participant Orch as Orchestrator (Step Functions)
    participant HNF as HN Data Fetcher Lambda
    participant Algolia as Algolia HN API
    participant ASL as Article Scraper Lambda
    participant EAS as External Article Sites
    participant CFL as Content Formatter Lambda
    participant PSubL as Play.ai Submit Lambda
    participant PlayAI as Play.ai API
    participant PStatL as Play.ai Status Poller Lambda
    participant PSL as Podcast Storage Lambda
    participant S3 as S3 Audio Storage
    participant MPL as Metadata Persistence Lambda
    participant DDB as DynamoDB (Episodes & HNPostState)

    Sched->>Orch: Trigger Daily Workflow
    activate Orch
    Orch->>HNF: Start: Fetch HN Posts
    activate HNF
    HNF->>Algolia: Request top N posts
    Algolia-->>HNF: Return HN post list
    HNF->>DDB: Query HNPostProcessState for repeat status & lastCommentFetchTimestamp
    DDB-->>HNF: Return status
    HNF-->>Orch: HN Posts Data (with repeat status)
    deactivate HNF
    Orch->>ASL: For each NEW HN Post: Scrape Article (URL)
    activate ASL
    ASL->>EAS: Fetch article HTML
    EAS-->>ASL: Return HTML
    ASL-->>Orch: Scraped Article Content / Scrape Failure+Fallback Flag
    deactivate ASL
    Orch->>HNF: For each HN Post: Fetch Comments (HN Post ID, isRepeat, lastCommentFetchTimestamp, articleScrapedFailedFlag)
    activate HNF
    HNF->>Algolia: Request comments for Post ID
    Algolia-->>HNF: Return comments
    HNF->>DDB: Update HNPostProcessState (lastCommentFetchTimestamp)
    DDB-->>HNF: Confirm update
    HNF-->>Orch: Selected Comments
    deactivate HNF
    Orch->>CFL: Format Content for Play.ai (HN Posts, Articles, Comments)
    activate CFL
    CFL-->>Orch: Formatted Text Payload
    deactivate CFL
    Orch->>PSubL: Submit to Play.ai (Formatted Text)
    activate PSubL
    PSubL->>PlayAI: POST /playnotes (text, voice params, auth)
    PlayAI-->>PSubL: Return { jobId }
    PSubL-->>Orch: Play.ai Job ID
    deactivate PSubL
    loop Poll for Completion (managed by Orchestrator/Step Functions)
        Orch->>Orch: Wait (e.g., M minutes)
        Orch->>PStatL: Check Status (Job ID)
        activate PStatL
        PStatL->>PlayAI: GET /playnote/{jobId} (auth)
        PlayAI-->>PStatL: Return { status, audioUrl? }
        PStatL-->>Orch: Job Status & audioUrl (if completed)
        deactivate PStatL
        alt Job Completed
            Orch->>PSL: Store Podcast (audioUrl, jobId, episode context)
            activate PSL
            PSL->>PlayAI: GET audio from audioUrl
            PlayAI-->>PSL: Audio Stream/File
            PSL->>S3: Upload MP3
            S3-->>PSL: Confirm S3 Upload (s3Key, s3Bucket)
            PSL-->>Orch: S3 Location
            deactivate PSL
            Orch->>MPL: Persist Episode Metadata (S3 loc, HN sources, etc.)
            activate MPL
            MPL->>DDB: Save Episode Item & Update HNPostProcessState (lastProcessedDate)
            DDB-->>MPL: Confirm save
            MPL-->>Orch: Success
            deactivate MPL
        else Job Failed or Timeout
            Orch->>Orch: Log Error, Terminate Sub-Workflow for this job
        end
    end
    deactivate Orch
```
|
||||
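The decision made on each pass of the polling loop in the diagram above can be sketched as a pure function. The status strings here are assumptions (this document does not list the values Play.ai returns), and `decideNextStep`, `StatusCheck`, and `NextStep` are hypothetical names. In the real system the wait and retry counting would live in the Step Functions state machine rather than in code like this.

```typescript
// Assumed status values; the real Play.ai status strings are not given here.
type JobStatus = "generating" | "completed" | "failed";

interface StatusCheck {
  status: JobStatus;
  audioUrl?: string;
}

type NextStep =
  | { action: "wait" }
  | { action: "store"; audioUrl: string }
  | { action: "fail"; reason: string };

// Maps one status-poll result to the branch the orchestrator should take:
// keep waiting, store the finished audio, or give up on this job.
function decideNextStep(check: StatusCheck, attempt: number, maxAttempts: number): NextStep {
  if (check.status === "completed") {
    return check.audioUrl
      ? { action: "store", audioUrl: check.audioUrl }
      : { action: "fail", reason: "completed without audioUrl" };
  }
  if (check.status === "failed") {
    return { action: "fail", reason: "job reported failure" };
  }
  if (attempt >= maxAttempts) {
    return { action: "fail", reason: "timed out waiting for job" };
  }
  return { action: "wait" };
}
```

Treating a timeout as a failure mirrors the "Job Failed or Timeout" branch, so a stuck job cannot keep the workflow running indefinitely.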
|
||||
### 2\. Frontend User Requesting and Playing an Episode
|
||||
|
||||
*(Mermaid diagram as previously shown, detailing User -\> Next.js App -\> API Gateway/Lambda -\> DynamoDB, and User -\> Next.js App -\> S3/CloudFront for audio).*
|
||||
|
||||
```mermaid
sequenceDiagram
    participant User as User (Browser)
    participant FE_App as Frontend App (Next.js on CloudFront/S3)
    participant BE_API as Backend API (API Gateway)
    participant API_L as API Lambda
    participant DDB as DynamoDB (Episode Metadata)
    participant Audio_S3 as Audio Storage (S3 via CloudFront)

    User->>FE_App: Requests page (e.g., /episodes or /episodes/{id})
    activate FE_App
    FE_App->>BE_API: GET /v1/episodes (or /v1/episodes/{id}) (includes API Key)
    activate BE_API
    BE_API->>API_L: Invoke Lambda with request data
    activate API_L
    API_L->>DDB: Query for episode(s) metadata
    activate DDB
    DDB-->>API_L: Return episode data
    deactivate DDB
    API_L-->>BE_API: Return formatted episode data
    deactivate API_L
    BE_API-->>FE_App: Return API response (JSON)
    deactivate BE_API
    FE_App->>FE_App: Render page with episode data (list or detail)
    FE_App-->>User: Display page
    deactivate FE_App

    alt User on Episode Detail Page & Clicks Play
        User->>FE_App: Clicks play on HTML5 Audio Player
        activate FE_App
        Note over FE_App, Audio_S3: Player's src attribute is set to CloudFront URL for audio file in S3.
        FE_App->>Audio_S3: Browser requests audio file via CloudFront URL
        activate Audio_S3
        Audio_S3-->>FE_App: Stream/Return audio file
        deactivate Audio_S3
        FE_App-->>User: Plays audio
        deactivate FE_App
    end
```
|
||||
|
||||
## 10\. Definitive Tech Stack Selections
|
||||
|
||||
| Category | Technology | Version / Details | Description / Purpose | Justification (Optional) |
| :--- | :--- | :--- | :--- | :--- |
| **Languages** | TypeScript | Latest stable (e.g., 5.x) | Primary language for backend and frontend. | Consistency, strong typing. |
| **Runtime** | Node.js | 22.x | Server-side environment for backend & Next.js. | User preference, performance. |
| **Frameworks (Frontend)** | Next.js (with React) | Latest stable (e.g., 14.x) | Frontend web application framework. | User preference, SSG, DX. |
| **Frameworks (Backend)** | AWS Lambda (Node.js runtime) | N/A | Execution environment for serverless functions. | Serverless architecture. |
| | AWS Step Functions | N/A | Orchestration of backend workflows. | Robust state management, retries. |
| **Databases** | AWS DynamoDB | N/A | NoSQL database for metadata. | Scalability, serverless, free-tier. |
| **Cloud Platform** | AWS | N/A | Primary cloud provider. | Comprehensive services, serverless. |
| **Cloud Services** | AWS Lambda, API Gateway, S3, CloudFront, EventBridge Scheduler, CloudWatch, IAM, ACM | N/A | Core services for application hosting and operation. | Standard AWS serverless stack. |
| **Infrastructure as Code (IaC)** | AWS CDK (TypeScript) | v2.x Latest stable | Defining cloud infrastructure. | User preference, TypeScript, repeatability. |
| **UI Libraries (Frontend)** | Tailwind CSS | Latest stable (e.g., 3.x) | Utility-first CSS framework. | User preference, customization. |
| | shadcn/ui | Latest stable | Accessible UI components. | User preference, base for themed components. |
| **HTTP Client (Backend)** | axios | Latest stable | Making HTTP requests from backend. | User preference, feature-rich. |
| **SDKs / Core Libraries (Backend)** | AWS SDK for JavaScript/TypeScript | v3.x (Latest stable) | Programmatic interaction with AWS services. | Official AWS SDK, modular. |
| **Scraping / Content Extraction** | Cheerio | Latest stable | Server-side HTML parsing. | Efficient for static HTML. |
| | @mozilla/readability (JS port) | Latest stable | Extracting primary readable article content. | Key for isolating main content. |
| | Playwright (or Puppeteer) | Latest stable | Browser automation (if required for dynamic content). | Handles dynamic sites; use judiciously. |
| **Bundling (Backend)** | esbuild | Latest stable | Bundling TypeScript Lambda functions. | User preference, speed. |
| **Logging (Backend)** | Pino | Latest stable | Structured, low-overhead logging. | Better observability, JSON logs for CloudWatch. |
| **Testing (Backend)** | Jest, ESLint, Prettier | Latest stable | Unit/integration testing, linting, formatting. | Code quality, consistency. |
| **Testing (Frontend)** | Jest, React Testing Library, ESLint, Prettier | Latest stable | Unit/component testing, linting, formatting. | Code quality, consistency. |
| **CI/CD** | GitHub Actions | N/A | Automation of build, test, quality checks. | Integration with GitHub. |
| **External APIs** | Algolia HN Search API, Play.ai PlayNote API | v1 (for both) | Data sources and audio generation. | Core to product functionality. |

*Note: "Latest stable" versions should be pinned to specific versions in `package.json` files during development.*
|
||||
|
||||
## 11\. Infrastructure and Deployment Overview
|
||||
|
||||
* **Cloud Provider:** AWS.
|
||||
* **Core Services Used:** Lambda, API Gateway (HTTP API), S3, DynamoDB (On-Demand), Step Functions, EventBridge Scheduler, CloudFront, CloudWatch, IAM, ACM (if custom domain).
|
||||
* **IaC:** AWS CDK (TypeScript), with separate CDK apps in backend and frontend polyrepos.
|
||||
* **Deployment Strategy (MVP):** CI (GitHub Actions) for build/test/lint. CDK deployment (initially manual or CI-scripted) to a single AWS environment.
|
||||
* **Environments (MVP):** Local Development; Single Deployed MVP Environment (e.g., "dev" acting as initial production).
|
||||
* **Rollback Strategy (MVP):** CDK stack rollback, Lambda/S3 versioning, DynamoDB PITR.
|
||||
|
||||
## 12\. Error Handling Strategy
|
||||
|
||||
* **General Approach:** A hierarchy of custom `Error` classes. Promises reject with `Error` objects.
* **Logging:** Pino for structured JSON logs to CloudWatch. Standard levels (DEBUG, INFO, WARN, ERROR, and FATAL, which is Pino's name for the critical level). Contextual info (AWS Request ID, business IDs). No sensitive data in logs.
|
||||
* **Specific Patterns:**
|
||||
* **External API Calls (`axios`):** Timeouts, retries (e.g., `axios-retry`), wrap errors in custom types.
|
||||
* **Internal Errors:** Custom error types, detailed server-side logging.
|
||||
* **API Gateway Responses:** Translate internal errors to appropriate HTTP errors (4xx, 500) with generic client messages.
|
||||
* **Workflow (Step Functions):** Error handling, retries, catch blocks for states. Failed executions logged.
|
||||
* **Data Consistency:** Lambdas handle partial failures gracefully. Step Functions manage overall workflow state.
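A minimal sketch of the custom error hierarchy and the "generic client message" translation described above might look like this. The class names, status codes, and message strings are all assumptions; the document only specifies that such a hierarchy exists and that clients receive generic messages.

```typescript
// Hypothetical names throughout; only the general pattern comes from the text.
interface ApiResponse {
  statusCode: number;
  body: string;
}

class AppError extends Error {
  constructor(
    message: string, // detailed, for server-side logs only
    readonly statusCode: number = 500,
    readonly clientMessage: string = "Internal server error",
  ) {
    super(message);
    this.name = "AppError";
  }
}

class NotFoundError extends AppError {
  constructor(message: string) {
    super(message, 404, "Resource not found");
    this.name = "NotFoundError";
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, 400, "Invalid request");
    this.name = "ValidationError";
  }
}

// Translates any thrown value into an HTTP response that exposes only the
// generic client message, never the internal detail.
function toApiResponse(err: unknown): ApiResponse {
  const known = err instanceof AppError ? err : new AppError("Unexpected error");
  return {
    statusCode: known.statusCode,
    body: JSON.stringify({ message: known.clientMessage }),
  };
}
```

The detailed `message` would be logged server-side (for example through Pino) before the handler returns the translated response.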
|
||||
|
||||
## 13\. Coding Standards (Backend: `bmad-daily-digest-backend`)
|
||||
|
||||
**Scope:** Applies to `bmad-daily-digest-backend`. Frontend standards are separate.
|
||||
|
||||
* **Primary Language:** TypeScript (Node.js 22).
|
||||
* **Style:** ESLint, Prettier.
|
||||
* **Naming:** Variables/Functions: `camelCase`. Constants: `UPPER_SNAKE_CASE`. Classes/Interfaces/Types/Enums: `PascalCase`. Files/Folders: `dash-case` (e.g., `episode-service.ts`, `content-ingestion/`).
|
||||
* **Structure:** Feature-based (`src/features/feature-name/`).
|
||||
* **Tests:** Unit/integration tests co-located (`*.test.ts`). E2E tests (if any for backend API) in root `tests/e2e/`.
|
||||
* **Async:** `async`/`await` for Promises.
|
||||
* **Types:** `strict: true`. No `any` without justification. JSDoc for exported items. Inline comments for clarity.
|
||||
* **Dependencies:** `npm` with `package-lock.json`. Pin versions or use tilde (`~`).
|
||||
* **Detailed Conventions:** Immutability preferred. Functional constructs for stateless logic, classes for stateful services/entities. Custom errors. Strict null checks. ESModules. Pino for logging (structured JSON, levels, context, no secrets). Lambda best practices (lean handlers, env vars, optimize size). `axios` with timeouts. AWS SDK v3 modular imports. Avoid common anti-patterns (deep nesting, large functions, `@ts-ignore`, hardcoded secrets, unhandled promises).
|
||||
|
||||
## 14\. Overall Testing Strategy
|
||||
|
||||
* **Tools:** Jest, React Testing Library (frontend), ESLint, Prettier, GitHub Actions.
|
||||
* **Unit Tests:** Isolate functions/methods/components. Mock dependencies. Co-located. Developer responsibility.
|
||||
* **Integration Tests (Backend/Frontend):** Test interactions between internal components with external systems mocked (AWS SDK clients, third-party APIs).
|
||||
* **End-to-End (E2E) Tests (MVP):**
|
||||
* Backend API: Automated test for the "Hello World"/status endpoint. A test that triggers the daily job verifies the DDB/S3 output.
|
||||
* Frontend UI: Key user flows tested manually for MVP. (Playwright deferred to post-MVP).
|
||||
* **Coverage:** Guideline: above 80% unit test coverage for critical logic, favoring quality over quantity. Measured by Jest.
|
||||
* **Mocking:** Jest's built-in system. `axios-mock-adapter` if needed.
|
||||
* **Test Data:** Inline mocks or small fixtures for unit/integration.
|
||||
|
||||
## 15\. Security Best Practices
|
||||
|
||||
* **Input Validation:** API Gateway basic validation; Zod for detailed payload validation in Lambdas.
|
||||
* **Output Encoding:** Next.js/React handles XSS for frontend rendering. Backend API is JSON.
|
||||
* **Secrets Management:** Lambda environment variables via CDK (from local gitignored `.env` for MVP setup). No hardcoding. Pino redaction for logs if needed.
|
||||
* **Dependency Security:** `npm audit` in CI. Promptly address high/critical vulnerabilities.
|
||||
* **Authentication/Authorization:** API Gateway API Keys (Frontend Read Key, Admin Action Key). IAM roles with least privilege for service-to-service.
|
||||
* **Principle of Least Privilege (IAM):** Minimal permissions for all IAM roles (Lambdas, Step Functions, CDK).
|
||||
* **API Security:** HTTPS enforced by API Gateway/CloudFront. Basic rate limiting on API Gateway. Frontend uses HTTP security headers (via CloudFront/Next.js).
|
||||
* **Error Disclosure:** Generic errors to client, detailed logs server-side.
|
||||
* **Infrastructure Security:** S3 bucket access restricted (CloudFront OAC/OAI).
|
||||
* **Post-MVP:** Consider SAST/DAST, penetration testing.
|
||||
* **Adherence:** AWS Well-Architected Framework - Security Pillar.
|
||||
|
||||
## 16\. Key Reference Documents
|
||||
|
||||
1. **Product Requirements Document (PRD) - BMad Daily Digest** (Version: 0.1)
|
||||
2. **UI/UX Specification - BMad Daily Digest** (Version: 0.1)
|
||||
3. **Algolia Hacker News Search API Documentation** (`https://hn.algolia.com/api`)
|
||||
4. **Play.ai PlayNote API Documentation** (`https://docs.play.ai/api-reference/playnote/post`)
|
||||
|
||||
## 17\. Change Log
|
||||
|
||||
| Version | Date | Author | Summary of Changes |
| :--- | :--- | :--- | :--- |
| 0.1 | May 20, 2025 | Fred (Architect) & User | Initial draft of the Architecture Document based on PRD v0.1 and UI/UX Spec v0.1. |
|
||||
@@ -1,8 +0,0 @@
|
||||
# Change Log
|
||||
|
||||
> This document is a granulated shard from the main "PRD.md" focusing on "Change Log".
|
||||
|
||||
| Change | Date | Version | Description | Author |
| :--- | :--- | :--- | :--- | :--- |
| Initial PRD draft and MVP scope definition. | May 20, 2025 | 0.1 | Created initial PRD based on Project Brief and discussions on goals, requirements, and Epics/Stories (shells). | John (PM) & User |
| Architectural refinements incorporated into Story ACs. | May 20, 2025 | 0.2 | Updated ACs for Stories 2.1 and 3.1 based on System Architecture Document feedback from Fred (Architect). | Sarah (PO) & User |
|
||||
@@ -1,90 +0,0 @@
|
||||
# Component View
|
||||
|
||||
The system is divided into distinct backend and frontend components.
|
||||
|
||||
## Backend Components (`bmad-daily-digest-backend` repository)
|
||||
|
||||
1. **Daily Workflow Orchestrator (AWS Step Functions state machine):** Manages the end-to-end daily pipeline.
|
||||
2. **HN Data Fetcher Service (AWS Lambda):** Fetches HN posts/comments (Algolia), identifies repeats (via DynamoDB).
|
||||
3. **Article Scraping Service (AWS Lambda):** Scrapes/extracts content from external article URLs, handles fallbacks.
|
||||
4. **Content Formatting Service (AWS Lambda):** Aggregates and formats text payload for Play.ai.
|
||||
5. **Play.ai Interaction Service (AWS Lambda functions, orchestrated by Polling Step Function):** Submits job to Play.ai, polls for status.
|
||||
6. **Podcast Storage Service (AWS Lambda):** Downloads audio from Play.ai, stores to S3.
|
||||
7. **Metadata Persistence Service (AWS Lambda & DynamoDB Tables):** Manages episode and HN post processing state metadata in DynamoDB.
|
||||
8. **Backend API Service (AWS API Gateway + AWS Lambda functions):** Exposes endpoints for frontend (episode lists/details).
|
||||
|
||||
## Frontend Components (`bmad-daily-digest-frontend` repository)
|
||||
|
||||
1. **Next.js Web Application (Static Site on S3/CloudFront):** Renders UI, handles navigation.
|
||||
2. **Frontend API Client Service (TypeScript module):** Encapsulates communication with the Backend API Service.
|
||||
|
||||
## External Services
|
||||
|
||||
- Algolia HN Search API
|
||||
- Play.ai PlayNote API
|
||||
- Various External Article Websites
|
||||
|
||||
## Component Interaction Diagram (Conceptual Backend Focus)
|
||||
|
||||
```mermaid
graph LR
    subgraph Frontend Application Space
        F_App[Next.js App on S3/CloudFront]
        F_APIClient[Frontend API Client]
        F_App --> F_APIClient
    end

    subgraph Backend API Space
        APIGW[API Gateway]
        API_L[Backend API Lambdas]
        APIGW --> API_L
    end

    subgraph Backend Daily Pipeline Space
        Scheduler[EventBridge Scheduler] --> Orchestrator[Step Functions Orchestrator]

        Orchestrator --> HNFetcher[HN Data Fetcher Lambda]
        HNFetcher -->|Reads/Writes Post Status| DDB
        HNFetcher --> Algolia[Algolia HN API]

        Orchestrator --> ArticleScraper[Article Scraper Lambda]
        ArticleScraper --> ExtWebsites[External Article Websites]

        Orchestrator --> ContentFormatter[Content Formatter Lambda]

        Orchestrator --> PlayAISubmit[Play.ai Submit Lambda]
        PlayAISubmit --> PlayAI_API[Play.ai PlayNote API]

        subgraph Polling_SF["Play.ai Polling (Step Functions)"]
            direction LR
            PollTask[Poll Status Lambda] --> PlayAI_API
        end
        Orchestrator --> Polling_SF

        Orchestrator --> PodcastStorage[Podcast Storage Lambda]
        PodcastStorage --> PlayAI_API
        PodcastStorage --> S3Store[S3 Audio Storage]

        Orchestrator --> MetadataService[Metadata Persistence Lambda]
        MetadataService --> DDB[DynamoDB Episode/Post Metadata]
    end

    F_APIClient --> APIGW
    API_L --> DDB

    classDef external fill:#ddd,stroke:#333,stroke-width:2px;
    class Algolia,ExtWebsites,PlayAI_API external;
```
|
||||
|
||||
## Architectural / Design Patterns Adopted
|
||||
|
||||
The following key architectural and design patterns have been chosen for this project:
|
||||
|
||||
* **Serverless Architecture:** Entire backend on AWS Lambda, API Gateway, S3, DynamoDB, Step Functions. Rationale: Minimized operations, auto-scaling, pay-per-use, cost-efficiency.
|
||||
* **Event-Driven Architecture:** Daily pipeline initiated by EventBridge Scheduler; Step Functions orchestrate based on state changes. Rationale: Decoupled components, reactive system for automation.
|
||||
* **Microservices-like Approach (Backend Lambda Functions):** Each Lambda handles a specific, well-defined task. Rationale: Modularity, independent scalability, easier testing/maintenance.
|
||||
* **Static Site Generation (SSG) for Frontend:** Next.js frontend exported as static files, hosted on S3/CloudFront. Rationale: Optimal performance, security, scalability, lower hosting costs.
|
||||
* **Infrastructure as Code (IaC):** AWS CDK in TypeScript for all AWS infrastructure in both repositories. Rationale: Repeatable, version-controlled, automated provisioning.
|
||||
* **Polling Pattern (External Job Status):** AWS Step Functions implement a polling loop for Play.ai job status. Rationale: Reliable tracking of asynchronous third-party jobs, based on Play.ai docs.
|
||||
* **Orchestration Pattern (AWS Step Functions):** End-to-end daily backend pipeline managed by a Step Functions state machine. Rationale: Robust workflow automation, state management, error handling for multi-step processes.
|
||||
@@ -1,40 +0,0 @@
|
||||
# Data Models
|
||||
|
||||
## Core Application Entities
|
||||
|
||||
**a. Episode**
|
||||
|
||||
* Attributes: `episodeId` (PK, UUID), `publicationDate` (YYYY-MM-DD), `episodeNumber` (Number), `podcastGeneratedTitle` (String), `audioS3Bucket` (String), `audioS3Key` (String), `audioUrl` (String, derived for API), `playAiJobId` (String), `playAiSourceAudioUrl` (String), `sourceHNPosts` (List of `SourceHNPost`), `status` (String: "PROCESSING", "PUBLISHED", "FAILED"), `createdAt` (ISO Timestamp), `updatedAt` (ISO Timestamp).
|
||||
|
||||
**b. SourceHNPost (object within `Episode.sourceHNPosts`)**
|
||||
|
||||
* Attributes: `hnPostId` (String), `title` (String), `originalArticleUrl` (String), `hnLink` (String), `isUpdateStatus` (Boolean), `oldRank` (Number, Optional), `lastCommentFetchTimestamp` (Number, Unix Timestamp), `articleScrapingFailed` (Boolean), `articleTitleFromScrape` (String, Optional).
|
||||
|
||||
**c. HackerNewsPostProcessState (DynamoDB Table)**
|
||||
|
||||
* Attributes: `hnPostId` (PK, String), `originalArticleUrl` (String), `articleTitleFromScrape` (String, Optional), `lastSuccessfullyScrapedTimestamp` (Number, Optional), `lastCommentFetchTimestamp` (Number, Optional), `firstProcessedDate` (YYYY-MM-DD), `lastProcessedDate` (YYYY-MM-DD), `lastKnownRank` (Number, Optional).
|
||||
|
||||
## API Payload Schemas (Internal API)
|
||||
|
||||
**a. `EpisodeListItem` (for `GET /episodes`)**
|
||||
|
||||
* `episodeId`, `publicationDate`, `episodeNumber`, `podcastGeneratedTitle`.
|
||||
|
||||
**b. `EpisodeDetail` (for `GET /episodes/{episodeId}`)**
|
||||
|
||||
* `episodeId`, `publicationDate`, `episodeNumber`, `podcastGeneratedTitle`, `audioUrl`, `sourceHNPosts` (list of `SourceHNPostDetail` containing `hnPostId`, `title`, `originalArticleUrl`, `hnLink`, `isUpdateStatus`, `oldRank`), `playAiJobId` (optional), `playAiSourceAudioUrl` (optional), `createdAt`.
|
||||
|
||||
## Database Schemas (AWS DynamoDB)
|
||||
|
||||
**a. `BmadDailyDigestEpisodes` Table**
|
||||
|
||||
* PK: `episodeId` (String).
|
||||
* Attributes: As per `Episode` entity.
|
||||
* GSI Example (`PublicationDateIndex`): PK: `status`, SK: `publicationDate`.
|
||||
* Billing: PAY\_PER\_REQUEST.
|
||||
|
||||
**b. `HackerNewsPostProcessState` Table**
|
||||
|
||||
* PK: `hnPostId` (String).
|
||||
* Attributes: As per `HackerNewsPostProcessState` entity.
|
||||
* Billing: PAY\_PER\_REQUEST.
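As an illustration of how the `PublicationDateIndex` GSI on the episodes table might be queried, the sketch below builds a plain parameter object shaped like the input to `QueryCommand` in `@aws-sdk/lib-dynamodb`. The function name and the choice of newest-first ordering are assumptions. Because `status` is a DynamoDB reserved word, it is referenced through an expression attribute name.

```typescript
// Local type mirroring the subset of DynamoDB query parameters used here.
interface EpisodeQueryInput {
  TableName: string;
  IndexName: string;
  KeyConditionExpression: string;
  ExpressionAttributeNames: Record<string, string>;
  ExpressionAttributeValues: Record<string, string>;
  ScanIndexForward: boolean;
  Limit: number;
}

// Hypothetical helper: list published episodes, newest publication date first.
function buildPublishedEpisodesQuery(limit: number): EpisodeQueryInput {
  return {
    TableName: "BmadDailyDigestEpisodes",
    IndexName: "PublicationDateIndex",
    // "status" is a reserved word, so alias it with #status.
    KeyConditionExpression: "#status = :status",
    ExpressionAttributeNames: { "#status": "status" },
    ExpressionAttributeValues: { ":status": "PUBLISHED" },
    // The GSI sort key is publicationDate; false returns descending order.
    ScanIndexForward: false,
    Limit: limit,
  };
}
```

Since `publicationDate` is stored as `YYYY-MM-DD`, its lexicographic sort order matches chronological order.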
|
||||
@@ -1,19 +0,0 @@
|
||||
# Environment Variables Documentation
|
||||
|
||||
The BMad Daily Digest application uses various environment variables for configuration settings. These variables are referenced throughout the architecture document, particularly in the sections about API interactions with external services.
|
||||
|
||||
## Backend Environment Variables
|
||||
|
||||
### Play.ai API Configuration
|
||||
- `PLAY_AI_BEARER_TOKEN`: Authentication token for Play.ai API
|
||||
- `PLAY_AI_USER_ID`: User ID for Play.ai API
|
||||
- `PLAY_AI_VOICE1_ID`: ID for primary voice used in podcast
|
||||
- `PLAY_AI_VOICE1_NAME`: Name of primary voice
|
||||
- `PLAY_AI_VOICE2_ID`: ID for secondary voice used in podcast
|
||||
- `PLAY_AI_VOICE2_NAME`: Name of secondary voice
|
||||
- `PLAY_AI_STYLE`: Style parameter for the Play.ai API
|
||||
|
||||
### Frontend API Configuration
|
||||
- `NEXT_PUBLIC_BACKEND_API_URL`: URL to access the backend API
|
||||
|
||||
Note: The backend variables are set as AWS Lambda environment variables via CDK (from local gitignored `.env` for MVP setup). `NEXT_PUBLIC_BACKEND_API_URL` is a Next.js build-time variable used by the frontend.
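One way a Lambda might read the Play.ai variables listed above is to validate them all up front and fail fast when any are missing. This is a sketch under assumptions: `loadPlayAiConfig` is a hypothetical helper, and treating every listed variable as required is a guess, since the list does not say which are optional.

```typescript
const REQUIRED_PLAY_AI_VARS = [
  "PLAY_AI_BEARER_TOKEN",
  "PLAY_AI_USER_ID",
  "PLAY_AI_VOICE1_ID",
  "PLAY_AI_VOICE1_NAME",
  "PLAY_AI_VOICE2_ID",
  "PLAY_AI_VOICE2_NAME",
  "PLAY_AI_STYLE",
] as const;

type PlayAiVarName = (typeof REQUIRED_PLAY_AI_VARS)[number];
type PlayAiConfig = Record<PlayAiVarName, string>;

// Takes the environment as a parameter (rather than reading a global)
// so the function can be unit-tested with a plain object.
function loadPlayAiConfig(env: Record<string, string | undefined>): PlayAiConfig {
  const missing = REQUIRED_PLAY_AI_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report names only; values such as the bearer token must never be logged.
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as PlayAiConfig;
  for (const name of REQUIRED_PLAY_AI_VARS) {
    config[name] = env[name] as string;
  }
  return config;
}
```

In a handler this would typically be called once at module load with the process environment, so a misconfigured deployment fails on the first invocation instead of partway through the pipeline.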
|
||||
@@ -1,77 +0,0 @@
|
||||
# Epic 1: Backend Foundation, Tooling & "Hello World" API
|
||||
|
||||
**Goal:** To establish the core backend project infrastructure in its dedicated repository, including robust development tooling and initial AWS CDK setup for essential services. By the end of this epic:
|
||||
1. A simple "hello world" API endpoint (AWS API Gateway + Lambda) **must** be deployed and testable via `curl`, returning a dynamic message.
|
||||
2. The backend project **must** have ESLint, Prettier, Jest (unit testing), and esbuild (TypeScript bundling) configured and operational.
|
||||
3. Basic unit tests **must** exist for the "hello world" Lambda function.
|
||||
4. Code formatting and linting checks **should** be integrated into a pre-commit hook and/or a basic CI pipeline stub.
|
||||
|
||||
## User Stories
|
||||
|
||||
**Story 1.1: Initialize Backend Project using TS-TEMPLATE-STARTER**
|
||||
* **User Story Statement:** As a Developer, I want to create the `bmad-daily-digest-backend` Git repository and initialize it using the existing `TS-TEMPLATE-STARTER`, ensuring all foundational tooling (TypeScript, Node.js 22, ESLint, Prettier, Jest, esbuild) is correctly configured and operational for this specific project, so that I have a high-quality, standardized development environment ready for application logic.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A new, private Git repository named `bmad-daily-digest-backend` **must** be created on GitHub.
|
||||
2. The contents of the `TS-TEMPLATE-STARTER` project **must** be copied/cloned into this new repository.
|
||||
3. `package.json` **must** be updated (project name, version, description).
|
||||
4. Project dependencies **must** be installable.
|
||||
5. TypeScript setup (`tsconfig.json`) **must** be verified for Node.js 22, esbuild compatibility; project **must** compile.
|
||||
6. ESLint and Prettier configurations **must** be operational; lint/format scripts **must** execute successfully.
|
||||
7. Jest configuration **must** be operational; test scripts **must** execute successfully with any starter example tests.
|
||||
8. Irrelevant generic demo code from starter **should** be removed. `index.ts`/`index.test.ts` can remain as placeholders.
|
||||
9. A standard `.gitignore` and an updated project `README.md` **must** be present.
|
||||
|
||||
**Story 1.2: Pre-commit Hook Implementation**
|
||||
* **User Story Statement:** As a Developer, I want pre-commit hooks automatically enforced in the `bmad-daily-digest-backend` repository, so that code quality standards (like linting and formatting) are checked and applied to staged files before any code is committed, thereby maintaining codebase consistency and reducing trivial errors.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A pre-commit hook tool (e.g., Husky) **must** be installed and configured.
|
||||
2. A tool for running linters/formatters on staged files (e.g., `lint-staged`) **must** be installed and configured.
|
||||
3. Pre-commit hook **must** trigger `lint-staged` on staged `.ts` files.
|
||||
4. `lint-staged` **must** be configured to run ESLint (`--fix`) and Prettier (`--write`).
|
||||
5. Attempting to commit files with auto-fixable issues **must** result in fixes applied and successful commit.
|
||||
6. Attempting to commit files with non-auto-fixable linting errors **must** abort the commit with error messages.
|
||||
7. Committing clean files **must** proceed without issues.
|
||||
|
||||
**Story 1.3: "Hello World" Lambda Function Implementation & Unit Tests**
|
||||
* **User Story Statement:** As a Developer, I need a simple "Hello World" AWS Lambda function implemented in TypeScript within the `bmad-daily-digest-backend` project. This function, when invoked, should return a dynamic greeting message including the current date and time, and it must be accompanied by comprehensive Jest unit tests, so that our basic serverless compute functionality, testing setup, and TypeScript bundling are validated.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `src/features/publicApi/statusHandler.ts` file (or similar according to final backend structure) **must** contain the Lambda handler.
|
||||
2. Handler **must** be AWS Lambda compatible (event, context, Promise response).
|
||||
3. Successful execution **must** return JSON: `statusCode: 200`, body with `message: "Hello from BMad Daily Digest Backend, today is [current_date] at [current_time]."`.
|
||||
4. Date and time in message **must** be dynamic.
|
||||
5. A corresponding Jest unit test file (e.g., `src/features/publicApi/statusHandler.test.ts`) **must** be created.
|
||||
6. Unit tests **must** verify: 200 status, valid JSON body, expected `message` field, "Hello from..." prefix, dynamic date/time portion (use mocked `Date`).
|
||||
7. All unit tests **must** pass.
|
||||
8. esbuild configuration **must** correctly bundle the handler.
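A handler satisfying the acceptance criteria of Story 1.3 could be sketched as below. The split into `buildStatusResponse` and `handler`, the local `HandlerResponse` type, and the use of UTC with ISO-style date and time formatting are all assumptions; the story only fixes the status code and the message template.

```typescript
interface HandlerResponse {
  statusCode: number;
  body: string;
}

// Pure function taking the clock as an argument, so tests can pass a fixed
// Date instead of mocking the global one.
function buildStatusResponse(now: Date): HandlerResponse {
  const iso = now.toISOString(); // e.g. 2025-05-20T14:30:00.000Z (UTC)
  const currentDate = iso.slice(0, 10);
  const currentTime = iso.slice(11, 19);
  return {
    statusCode: 200,
    body: JSON.stringify({
      message: `Hello from BMad Daily Digest Backend, today is ${currentDate} at ${currentTime}.`,
    }),
  };
}

// Lambda-style entry point: event and context are unused for this endpoint.
async function handler(_event: unknown, _context: unknown): Promise<HandlerResponse> {
  return buildStatusResponse(new Date());
}
```

Keeping the handler this thin is in line with the "lean handlers" convention in the backend coding standards.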
|
||||
|
||||
**Story 1.4: AWS CDK Setup for "Hello World" API (Lambda & API Gateway)**
|
||||
* **User Story Statement:** As a Developer, I want to define the necessary AWS infrastructure (Lambda function and API Gateway endpoint) for the "Hello World" service using AWS CDK (Cloud Development Kit) in TypeScript, so that the infrastructure is version-controlled, repeatable, and can be deployed programmatically.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. AWS CDK (v2) **must** be a development dependency.
|
||||
2. CDK app structure **must** be initialized (e.g., in `cdk/`).
|
||||
3. A new CDK stack (e.g., `BmadDailyDigestBackendStack`) **must** be defined in TypeScript.
|
||||
4. CDK stack **must** define an AWS Lambda resource for the "Hello World" function (Node.js 22, bundled code reference, handler entry point, basic IAM role for CloudWatch logs, free-tier conscious settings).
|
||||
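5. CDK stack **must** define an AWS API Gateway (HTTP API preferred) with a route (e.g., `GET /status` or `GET /hello`) triggering the Lambda, secured with the "Frontend Read API Key". Because API Gateway's built-in API keys (usage plans) are a REST API feature, an HTTP API would need to check the key itself, for example in a Lambda authorizer.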
|
||||
6. CDK stack **must** be synthesizable (`cdk synth`) without errors.
|
||||
7. CDK code **must** adhere to project ESLint/Prettier standards.
|
||||
8. Mechanism for passing Lambda environment variables via CDK **must** be in place.
|
||||
|
||||
**Story 1.5: "Hello World" API Deployment & Manual Invocation Test**
|
||||
* **User Story Statement:** As a Developer, I need to deploy the "Hello World" API (defined in AWS CDK) to an AWS environment and successfully invoke its endpoint using a tool like `curl` (including the API Key), so that I can verify the end-to-end deployment process and confirm the basic API is operational in the cloud.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The AWS CDK stack for "Hello World" API **must** deploy successfully to a designated AWS account/region.
|
||||
2. The API Gateway endpoint URL for the `/status` (or `/hello`) route **must** be retrievable post-deployment.
|
||||
3. A `GET` request to the deployed endpoint, including the correct `x-api-key` header, **must** receive a response.
|
||||
4. HTTP response status **must** be 200 OK.
|
||||
5. Response body **must** be JSON containing the expected dynamic "Hello..." message.
|
||||
6. Basic Lambda invocation logs **must** be visible in AWS CloudWatch Logs.
|
||||
|
||||
**Story 1.6: Basic CI/CD Pipeline Stub with Quality Gates**
|
||||
* **User Story Statement:** As a Developer, I need a basic Continuous Integration (CI) pipeline established for the `bmad-daily-digest-backend` repository, so that code quality checks (linting, formatting, unit tests) and the build process are automated upon code pushes and pull requests, ensuring early feedback and maintaining codebase health.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A CI workflow file (e.g., GitHub Actions in `.github/workflows/main.yml`) **must** be created.
|
||||
2. Pipeline **must** trigger on pushes to `main` and PRs targeting `main`.
|
||||
3. Pipeline **must** include steps for: checkout, Node.js 22 setup, dependency install, ESLint check, Prettier format check, Jest unit tests, esbuild bundle.
|
||||
4. Pipeline **must** fail if any lint, format, test, or bundle step fails.
|
||||
5. A successful CI run on the `main` branch **must** be demonstrated.
|
||||
6. CI pipeline for MVP **does not** need to perform AWS deployment.
|
||||
@@ -1,119 +0,0 @@
|
||||
# Epic 2: Automated Content Ingestion & Podcast Generation Pipeline
|
||||
|
||||
**Goal:** To implement the complete automated daily workflow within the backend. This includes fetching Hacker News post data, scraping and extracting content from linked external articles, aggregating and formatting text, submitting it to Play.ai, managing job status via polling, and retrieving/storing the final audio file and associated metadata. This epic delivers the core value proposition of generating the daily audio content and making it ready for consumption via an API.
|
||||
|
||||
## User Stories
|
||||
|
||||
**Story 2.1: AWS CDK Extension for Epic 2 Resources**
|
||||
* **User Story Statement:** As a Developer, I need to extend the existing AWS CDK stack within the `bmad-daily-digest-backend` project to define and provision all new AWS resources required for the content ingestion and podcast generation pipeline—including the `BmadDailyDigestEpisodes` DynamoDB table (with GSI), the `HackerNewsPostProcessState` DynamoDB table, an S3 bucket for audio storage, and the AWS Step Functions state machine for orchestrating the Play.ai job status polling—so that all backend infrastructure for this epic is managed as code and ready for the application logic.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The existing AWS CDK application (from Epic 1) **must** be extended.
|
||||
2. The `BmadDailyDigestEpisodes` DynamoDB table resource **must** be defined in CDK as specified in the System Architecture Document's "Data Models" section (with `episodeId` PK, key attributes like `publicationDate`, `episodeNumber`, `podcastGeneratedTitle`, `audioS3Key`, `audioS3Bucket`, `playAiJobId`, `playAiSourceAudioUrl`, `sourceHNPosts` list, `status`, `createdAt`, `updatedAt`), including a GSI for chronological sorting (e.g., PK `status`, SK `publicationDate`), and PAY_PER_REQUEST billing.
|
||||
3. The `HackerNewsPostProcessState` DynamoDB table resource **must** be defined in CDK as specified in the System Architecture Document's "Data Models" section (with `hnPostId` PK and attributes like `lastCommentFetchTimestamp`, `lastSuccessfullyScrapedTimestamp`, `lastKnownRank`), and PAY_PER_REQUEST billing.
|
||||
4. An S3 bucket resource (e.g., `bmad-daily-digest-audio-{unique-suffix}`) **must** be defined via CDK for audio storage, with private access by default.
|
||||
5. An AWS Step Functions state machine resource **must** be defined via CDK to manage the Play.ai job status polling workflow (as detailed in Story 2.6).
|
||||
6. Necessary IAM roles and permissions for Lambda functions within this epic to interact with DynamoDB, S3, Step Functions, CloudWatch Logs **must** be defined via CDK, adhering to least privilege.
|
||||
7. The updated CDK stack **must** synthesize (`cdk synth`) and deploy (`cdk deploy`) successfully.
|
||||
8. All new CDK code **must** adhere to project ESLint/Prettier standards.
|
||||
|
||||
**Story 2.2: Fetch Top Hacker News Posts & Identify Repeats**
|
||||
* **User Story Statement:** As the System, I need to reliably fetch the top N (configurable, e.g., 10) current Hacker News posts daily using the Algolia HN API, including their essential metadata. I also need to identify if each fetched post has been processed in a recent digest by checking against the `HackerNewsPostProcessState` table, so that I have an accurate list of stories and their status (new or repeat) to begin generating the daily digest.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `hackerNewsService.ts` function **must** fetch top N HN posts (stories only) via `axios` from Algolia API (configurable `HN_POSTS_COUNT`).
|
||||
2. Extracted metadata per post: Title, Article URL, HN Post URL, HN Post ID (`objectID`), Author, Points, Creation timestamp.
|
||||
3. For each post, the function **must** query the `HackerNewsPostProcessState` DynamoDB table to determine its `isUpdateStatus` (true if `lastSuccessfullyScrapedTimestamp` and `lastCommentFetchTimestamp` indicate prior full processing) and retrieve `lastCommentFetchTimestamp` and `lastKnownRank` if available.
|
||||
4. Function **must** return an array of HN post objects with metadata, `isUpdateStatus`, `lastCommentFetchTimestamp`, and `lastKnownRank`.
|
||||
5. Error handling for Algolia/DynamoDB calls **must** be implemented and logged.
|
||||
6. Unit tests (Jest) **must** verify API calls, data extraction, repeat identification (mocked DDB), and error handling. All tests **must** pass.
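
AC 3 does not spell out the exact rule for `isUpdateStatus`. The following is a minimal sketch, assuming a post counts as a repeat only when both timestamps are present on its `HackerNewsPostProcessState` item. The `PostProcessState` and `RepeatStatus` types and the `deriveRepeatStatus` name are hypothetical; the DynamoDB lookup itself is left out.

```typescript
// Hypothetical shape of one HackerNewsPostProcessState item
// (attribute names taken from Story 2.1).
interface PostProcessState {
  hnPostId: string;
  lastCommentFetchTimestamp?: number;
  lastSuccessfullyScrapedTimestamp?: number;
  lastKnownRank?: number;
}

interface RepeatStatus {
  isUpdateStatus: boolean;
  lastCommentFetchTimestamp?: number;
  lastKnownRank?: number;
}

// Assumption: "prior full processing" means both timestamps were recorded.
function deriveRepeatStatus(state: PostProcessState | undefined): RepeatStatus {
  if (state === undefined) {
    // No item in the table: the post has never been processed.
    return { isUpdateStatus: false };
  }
  const fullyProcessed =
    state.lastSuccessfullyScrapedTimestamp !== undefined &&
    state.lastCommentFetchTimestamp !== undefined;
  return {
    isUpdateStatus: fullyProcessed,
    lastCommentFetchTimestamp: state.lastCommentFetchTimestamp,
    lastKnownRank: state.lastKnownRank,
  };
}
```

Keeping this rule in a pure function lets the Jest tests cover repeat identification without mocking DynamoDB.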
|
||||
|
||||
**Story 2.3: Article Content Scraping & Extraction (Conditional)**
|
||||
* **User Story Statement:** As the System, for each Hacker News post identified as *new* (or for which article scraping previously failed and is being retried), I need to robustly fetch its HTML content from the linked article URL and extract the primary textual content and title using libraries like Cheerio and Mozilla Readability. If scraping fails, a fallback mechanism must be triggered.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. An `articleScraperService.ts` function **must** accept an article URL and `isUpdateStatus`.
|
||||
2. If `isUpdateStatus` is true, scraping **must** be skipped. (For MVP, a repeat post's article was already scraped in an earlier run and is not re-scraped; only its comments are refreshed, as per user feedback. Full articles are not stored long term. This story focuses on *new* scrapes or retries of failed scrapes.)
|
||||
3. If a new scrape is needed: use `axios` (timeout, User-Agent) to fetch HTML.
|
||||
4. Use `Mozilla Readability` (JS port) and/or `Cheerio` to extract main article text and title.
|
||||
5. Return `{ success: true, title: string, content: string }` on success.
|
||||
6. If scraping fails: log failure, return `{ success: false, error: string, fallbackNeeded: true }`.
|
||||
7. No specific "polite" inter-article scraping delays for MVP.
|
||||
8. Unit tests (Jest) **must** mock `axios`, test successful extraction, skip logic for non-applicable cases, and failure/fallback scenarios. All tests **must** pass.
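
The skip, success, and failure paths in ACs 2, 5, and 6 can be sketched as below. This is an illustration under assumptions, not the project's implementation: the HTML fetcher and the extractor are injected as plain functions (standing in for `axios` and Readability/Cheerio), the names `FetchHtml`, `ExtractArticle`, and `scrapeArticle` are hypothetical, and returning `null` for a skipped scrape is an assumed convention.

```typescript
// Result shapes from ACs 5 and 6.
type ScrapeResult =
  | { success: true; title: string; content: string }
  | { success: false; error: string; fallbackNeeded: true };

// Hypothetical injected dependencies (stand-ins for axios and Readability/Cheerio).
type FetchHtml = (url: string) => Promise<string>;
type ExtractArticle = (html: string) => { title: string; content: string } | null;

function failure(error: string): ScrapeResult {
  return { success: false, error, fallbackNeeded: true };
}

async function scrapeArticle(
  url: string,
  isUpdateStatus: boolean,
  fetchHtml: FetchHtml,
  extract: ExtractArticle,
): Promise<ScrapeResult | null> {
  // Assumption: a skipped scrape is signalled with null.
  if (isUpdateStatus) {
    return null;
  }
  try {
    const html = await fetchHtml(url);
    const article = extract(html);
    if (article === null || article.content.trim() === "") {
      return failure("No readable content extracted");
    }
    return { success: true, title: article.title, content: article.content };
  } catch (err) {
    // Network errors, timeouts, and extractor exceptions all trigger the fallback.
    return failure(err instanceof Error ? err.message : String(err));
  }
}
```

Injecting the fetcher keeps the unit tests free of real network calls, which is the same effect AC 8 asks for by mocking `axios`.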
|
||||
|
||||
**Story 2.4: Fetch Hacker News Comments (Conditional Logic)**
|
||||
* **User Story Statement:** As the System, I need to fetch comments for each selected Hacker News post using the Algolia HN API, adjusting the strategy to fetch up to N comments for new posts, only new comments since last fetch for repeat posts, or up to 3N comments if article scraping failed.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. `hackerNewsService.ts` **must** be extended to fetch comments for an HN Post ID, accepting `isUpdateStatus`, `lastCommentFetchTimestamp` (from `HackerNewsPostProcessState`), and `articleScrapingFailed` flags.
|
||||
2. Use `axios` to call Algolia HN API item endpoint.
|
||||
3. **Comment Fetching Logic:**
|
||||
* If `articleScrapingFailed`: Fetch up to 3 * `HN_COMMENTS_COUNT_PER_POST` available comments.
|
||||
* If `isUpdateStatus`: Fetch all comments, then filter client-side for comments with `created_at_i` > `lastCommentFetchTimestamp`. Select up to `HN_COMMENTS_COUNT_PER_POST` of these *new* comments.
|
||||
* Else (new post, successful scrape): Fetch up to `HN_COMMENTS_COUNT_PER_POST`.
|
||||
4. For selected comments, extract plain text (HTML stripped), author, creation timestamp.
|
||||
5. Return array of comment objects; empty if none. An updated `lastCommentFetchTimestamp` (max `created_at_i` of fetched comments for this post) should be available for updating `HackerNewsPostProcessState`.
|
||||
6. Error handling and logging for API calls.
|
||||
7. Unit tests (Jest) **must** mock `axios` and verify all conditional fetching logic, comment selection/filtering, data extraction, and error handling. All tests **must** pass.
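
The three branches of the comment fetching logic in AC 3 can be expressed as a pure selection function applied after the Algolia response has been flattened into a list. This is a sketch under assumptions: the branches are evaluated in the order listed (a failed scrape takes precedence over repeat status), and the `HnComment`, `CommentSelectionOptions`, `selectComments`, and `latestCommentTimestamp` names are hypothetical.

```typescript
interface HnComment {
  author: string;
  text: string;          // plain text, HTML already stripped
  created_at_i: number;  // Unix seconds, as returned by the Algolia HN API
}

interface CommentSelectionOptions {
  isUpdateStatus: boolean;
  articleScrapingFailed: boolean;
  lastCommentFetchTimestamp?: number;
  commentsPerPost: number; // value of HN_COMMENTS_COUNT_PER_POST
}

function selectComments(all: HnComment[], opts: CommentSelectionOptions): HnComment[] {
  if (opts.articleScrapingFailed) {
    // Extended set compensates for the missing article text.
    return all.slice(0, 3 * opts.commentsPerPost);
  }
  if (opts.isUpdateStatus) {
    // Repeat post: keep only comments newer than the last fetch.
    const since = opts.lastCommentFetchTimestamp ?? 0;
    return all.filter((c) => c.created_at_i > since).slice(0, opts.commentsPerPost);
  }
  return all.slice(0, opts.commentsPerPost);
}

// New value for lastCommentFetchTimestamp (AC 5): the max created_at_i seen,
// falling back to the previous value when nothing new was selected.
function latestCommentTimestamp(selected: HnComment[], previous?: number): number | undefined {
  return selected.reduce<number | undefined>(
    (max, c) => (max === undefined || c.created_at_i > max ? c.created_at_i : max),
    previous,
  );
}
```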
|
||||
|
||||
**Story 2.5: Content Aggregation and Formatting for Play.ai**
|
||||
* **User Story Statement:** As the System, I need to aggregate the collected Hacker News post data (titles), associated article content (full, truncated, or fallback summary), and comments (new, updated, or extended sets) for all top stories, and format this combined text according to the specified structure for the play.ai PlayNote API, including special phrasing for different post types (new, update, scrape-failed) and configurable article truncation.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `contentFormatterService.ts` **must** be implemented.
|
||||
2. Inputs: Array of processed HN post objects (with metadata, statuses, content, comments).
|
||||
3. Output: A single string.
|
||||
4. String starts: "It's a top 10 countdown for [Current Date]".
|
||||
5. Posts sequenced in reverse rank order.
|
||||
6. **Formatting (new post):** "Story [Rank] - [Article Title]. [Full/Truncated Article Text]. Comments Section. [Number] comments follow. Comment 1: [Text]..."
|
||||
7. **Formatting (repeat post):** "Story [Rank] (previously Rank [OldRank] yesterday) - [Article Title]. We're bringing you new comments on this popular story. Comments Section. [Number] new comments follow. Comment 1: [Text]..."
|
||||
8. **Formatting (scrape-failed post):** "Story [Rank] - [Article Title]. We couldn't retrieve the full article, but here's a summary if available and the latest comments. [Optional HN Summary]. Comments Section. [Number] comments follow. Comment 1: [Text]..."
|
||||
9. **Article Truncation:** If `MAX_ARTICLE_LENGTH` (env var) set and article exceeds, truncate aiming to preserve intro/conclusion.
|
||||
10. Graceful handling for missing parts.
|
||||
11. Unit tests (Jest) verify all formatting, truncation, data merging, error handling. All tests **must** pass.
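
The templates in ACs 4 through 8 translate fairly directly into string builders. The sketch below is one possible reading, with hypothetical names (`DigestPost`, `formatPost`, `formatDigest`); it assumes article text arrives already truncated and ends with its own punctuation, and it omits optional parts that are missing rather than emitting placeholders.

```typescript
type PostKind = "new" | "repeat" | "scrape-failed";

interface DigestPost {
  kind: PostKind;
  rank: number;
  oldRank?: number;      // repeat posts only
  title: string;
  articleText?: string;  // assumed already truncated per MAX_ARTICLE_LENGTH
  hnSummary?: string;    // optional summary for scrape-failed posts
  comments: string[];
}

function formatComments(comments: string[], noun: string): string {
  const lines = comments.map((text, i) => `Comment ${i + 1}: ${text}`);
  return [`Comments Section. ${comments.length} ${noun} follow.`, ...lines].join(" ");
}

function formatPost(p: DigestPost): string {
  const parts: string[] = [];
  if (p.kind === "repeat") {
    const previously = p.oldRank === undefined ? "" : ` (previously Rank ${p.oldRank} yesterday)`;
    parts.push(`Story ${p.rank}${previously} - ${p.title}.`);
    parts.push("We're bringing you new comments on this popular story.");
    parts.push(formatComments(p.comments, "new comments"));
    return parts.join(" ");
  }
  parts.push(`Story ${p.rank} - ${p.title}.`);
  if (p.kind === "scrape-failed") {
    parts.push("We couldn't retrieve the full article, but here's a summary if available and the latest comments.");
    if (p.hnSummary) parts.push(p.hnSummary);
  } else if (p.articleText) {
    parts.push(p.articleText);
  }
  parts.push(formatComments(p.comments, "comments"));
  return parts.join(" ");
}

// AC 5: posts are read out in reverse rank order (highest rank number first).
function formatDigest(posts: DigestPost[], currentDate: string): string {
  const countdown = [...posts].sort((a, b) => b.rank - a.rank);
  return [`It's a top 10 countdown for ${currentDate}.`, ...countdown.map(formatPost)].join(" ");
}
```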
|
||||
|
||||
**Story 2.6 (REVISED): Implement Podcast Generation Status Polling via Play.ai API**
|
||||
* **User Story Statement:** As the System, after submitting a podcast generation job to Play.ai and receiving a `jobId`, I need an AWS Step Function state machine to periodically poll the Play.ai API for the status of this specific job, continuing until the job is reported as "completed" or "failed" (or a configurable max duration/attempts limit is reached), so the system can reliably determine when the podcast audio is ready or if an error occurred.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The AWS Step Function state machine (CDK defined in Story 2.1) **must** manage the polling workflow.
|
||||
2. Input: `jobId`.
|
||||
3. States: Invoke Poller Lambda (calls Play.ai status GET endpoint with `axios`), Wait (configurable `POLLING_INTERVAL_MINUTES`), Choice (evaluates status: "processing", "completed", "failed").
|
||||
4. Loop if "processing". Stop if "completed" or "failed".
|
||||
5. Max polling duration/attempts (configurable env vars `MAX_POLLING_DURATION_MINUTES`, `MAX_POLLING_ATTEMPTS`) **must** be enforced, treating expiry as failure.
|
||||
6. If "completed": extract `audioUrl`, trigger next step (Story 2.8 process) with data.
|
||||
7. If "failed"/"timeout": log event, record failure (e.g., update episode status in DDB via a Lambda), terminate.
|
||||
8. Poller Lambda handles API errors gracefully.
|
||||
9. Unit tests for Poller Lambda logic; Step Function definition tested (locally if possible, or via AWS console tests). All tests **must** pass.
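
The loop, stop, and expiry rules in ACs 4 through 7 reduce to one decision per poll. A minimal sketch of that decision, as it might be evaluated by the Choice state or a helper inside the Poller Lambda, follows. The `PollingAction` values and the `nextPollingAction` name are hypothetical, and elapsed time is approximated as attempts multiplied by the polling interval, which is an assumption.

```typescript
type JobStatus = "processing" | "completed" | "failed";
type PollingAction = "wait" | "store-audio" | "record-failure";

interface PollingLimits {
  maxAttempts: number;        // MAX_POLLING_ATTEMPTS
  maxDurationMinutes: number; // MAX_POLLING_DURATION_MINUTES
  intervalMinutes: number;    // POLLING_INTERVAL_MINUTES
}

function nextPollingAction(
  status: JobStatus,
  attemptsSoFar: number,
  limits: PollingLimits,
): PollingAction {
  if (status === "completed") return "store-audio";   // hand off to Story 2.8
  if (status === "failed") return "record-failure";
  // Still processing: expiry of either limit is treated as failure (AC 5).
  const elapsedMinutes = attemptsSoFar * limits.intervalMinutes;
  const expired =
    attemptsSoFar >= limits.maxAttempts || elapsedMinutes >= limits.maxDurationMinutes;
  return expired ? "record-failure" : "wait";
}
```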
|
||||
|
||||
**Story 2.7: Submit Content to Play.ai PlayNote API & Initiate Podcast Generation**
|
||||
* **User Story Statement:** As the System, I need to securely submit the aggregated and formatted text content (using `sourceText`) to the play.ai PlayNote API via an `application/json` request to initiate the podcast generation process, and I must capture the `jobId` returned by Play.ai, so that this `jobId` can be passed to the status polling mechanism (Step Function).
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `playAiService.ts` function **must** handle submission.
|
||||
2. Input: formatted text (from Story 2.5).
|
||||
3. Use `axios` for `POST` to Play.ai endpoint (e.g., `https://api.play.ai/api/v1/playnotes`).
|
||||
4. Request `Content-Type: application/json`.
|
||||
5. JSON body: `sourceText`, and configurable `title`, `voiceId1`, `name1` (default "Angelo"), `voiceId2`, `name2` (default "Deedee"), `styleGuidance` (default "podcast") from env vars.
|
||||
6. Headers: `Authorization: Bearer <PLAY_AI_BEARER_TOKEN>`, `X-USER-ID: <PLAY_AI_USER_ID>` (from env vars).
|
||||
7. No `webHookUrl` sent.
|
||||
8. On success: extract `jobId`, log it, initiate polling Step Function (Story 2.6) with `jobId` and other context (like internal `episodeId`).
|
||||
9. Error handling for API submission; log and flag failure.
|
||||
10. Unit tests (Jest) mock `axios`, verify API call, auth, payload, `jobId` extraction, Step Function initiation (mocked), error handling. All tests **must** pass.
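
ACs 3 through 7 describe the request closely enough to sketch its construction separately from the HTTP call. In the sketch below, `PLAY_AI_BEARER_TOKEN` and `PLAY_AI_USER_ID` come from AC 6; every other environment variable name (`PLAY_AI_API_URL`, `PLAY_AI_VOICE_ID_1`, `PLAY_AI_NAME_1`, and so on) and the function names are hypothetical, and treating the voice IDs as required is an assumption.

```typescript
type Env = Record<string, string | undefined>;

interface PlayNoteRequest {
  url: string;
  headers: Record<string, string>;
  body: Record<string, string>;
}

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (!value) throw new Error(`Missing environment variable: ${name}`);
  return value;
}

function buildPlayNoteRequest(sourceText: string, title: string, env: Env): PlayNoteRequest {
  return {
    // Endpoint from AC 3; the override variable is hypothetical.
    url: env["PLAY_AI_API_URL"] ?? "https://api.play.ai/api/v1/playnotes",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${requireEnv(env, "PLAY_AI_BEARER_TOKEN")}`,
      "X-USER-ID": requireEnv(env, "PLAY_AI_USER_ID"),
    },
    // No webHookUrl field: completion is detected by polling (Story 2.6).
    body: {
      sourceText,
      title,
      voiceId1: requireEnv(env, "PLAY_AI_VOICE_ID_1"),
      name1: env["PLAY_AI_NAME_1"] ?? "Angelo",
      voiceId2: requireEnv(env, "PLAY_AI_VOICE_ID_2"),
      name2: env["PLAY_AI_NAME_2"] ?? "Deedee",
      styleGuidance: env["PLAY_AI_STYLE_GUIDANCE"] ?? "podcast",
    },
  };
}
```

The returned object can be passed to `axios.post(url, body, { headers })`, so the Jest tests can assert on the payload without any HTTP mocking.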
|
||||
|
||||
**Story 2.8: Retrieve, Store Generated Podcast Audio & Persist Episode Metadata**
|
||||
* **User Story Statement:** As the System, once the podcast generation status polling (Story 2.6) indicates a Play.ai job is "completed," I need to download the generated audio file from the provided `audioUrl`, store this file in our designated S3 bucket, and then save/update all relevant metadata for the episode (including S3 audio location, `episodeNumber`, `podcastGeneratedTitle`, `playAiSourceAudioUrl`, and source HN post information including their `lastCommentFetchTimestamp`) into our DynamoDB tables (`BmadDailyDigestEpisodes` and `HackerNewsPostProcessState`), so that the daily digest is fully processed, archived, and ready for access.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `podcastStorageService.ts` function **must** be triggered by Step Function (Story 2.6) on "completed" status, receiving `audioUrl`, Play.ai `jobId`, and original context (like internal `episodeId`, list of source HN posts with their metadata and processing status).
|
||||
2. Use `axios` to download audio from `audioUrl`.
|
||||
3. Upload audio to S3 bucket (from Story 2.1), using key (e.g., `YYYY/MM/DD/episodeId.mp3`).
|
||||
4. Prepare `Episode` metadata for `BmadDailyDigestEpisodes` table: `episodeId` (UUID), `publicationDate` (YYYY-MM-DD), `episodeNumber` (sequential logic, TBD), `podcastGeneratedTitle` (from Play.ai or constructed), `audioS3Bucket`, `audioS3Key`, `playAiJobId`, `playAiSourceAudioUrl`, `sourceHNPosts` (array of objects: `{ hnPostId, title, originalArticleUrl, hnLink, isUpdateStatus, oldRank, articleScrapingFailed }`), `status: "Published"`, `createdAt`, `updatedAt`.
|
||||
5. For each `hnPostId` in `sourceHNPosts`, update its corresponding item in the `HackerNewsPostProcessState` table with the `lastCommentFetchTimestamp` (current time or max comment time from this run), `lastProcessedDate` (current date), and `lastKnownRank`. If `articleScrapingFailed` was false for this run, update `lastSuccessfullyScrapedTimestamp`.
|
||||
6. Save `Episode` metadata to `BmadDailyDigestEpisodes` DynamoDB table.
|
||||
7. Error handling for download, S3 upload, DDB writes; failure should result in episode `status: "Failed"`.
|
||||
8. Unit tests (Jest) mock `axios`, AWS SDK (S3, DynamoDB); verify data handling, storage, metadata construction for both tables, errors. All tests **must** pass.
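
Two pieces of AC 3 and AC 5 are pure data construction and can be sketched without the AWS SDK: the S3 object key and the per-post state update. The names `buildAudioS3Key`, `SourcePostRun`, and `buildPostStateUpdate` are hypothetical, and using the current time when no comment timestamp is available is one of the two options AC 5 allows.

```typescript
// AC 3: key layout YYYY/MM/DD/episodeId.mp3, derived from publicationDate (YYYY-MM-DD).
function buildAudioS3Key(publicationDate: string, episodeId: string): string {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(publicationDate);
  if (match === null) {
    throw new Error(`publicationDate must be YYYY-MM-DD, got: ${publicationDate}`);
  }
  const [, year, month, day] = match;
  return `${year}/${month}/${day}/${episodeId}.mp3`;
}

// Hypothetical per-post summary of this run.
interface SourcePostRun {
  hnPostId: string;
  rank: number;
  articleScrapingFailed: boolean;
  maxCommentTimestamp?: number; // max created_at_i fetched in this run, if any
}

// AC 5: attributes to write to HackerNewsPostProcessState for one post.
function buildPostStateUpdate(
  post: SourcePostRun,
  nowSeconds: number,
  today: string,
): Record<string, string | number> {
  const update: Record<string, string | number> = {
    hnPostId: post.hnPostId,
    lastCommentFetchTimestamp: post.maxCommentTimestamp ?? nowSeconds,
    lastProcessedDate: today,
    lastKnownRank: post.rank,
  };
  if (!post.articleScrapingFailed) {
    // Only a successful scrape advances this timestamp.
    update["lastSuccessfullyScrapedTimestamp"] = nowSeconds;
  }
  return update;
}
```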
|
||||
|
||||
**Story 2.9: Daily Workflow Orchestration & Scheduling**
|
||||
* **User Story Statement:** As the System Administrator, I need the entire daily backend workflow (Stories 2.2 through 2.8) to be fully orchestrated by the primary AWS Step Function state machine and automatically scheduled to run once per day using Amazon EventBridge Scheduler, ensuring it handles re-runs for the same day by overwriting/starting over (for MVP), so that "BMad Daily Digest" episodes are produced consistently and reliably.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The primary AWS Step Function state machine **must** orchestrate the sequence: Fetch HN Posts & Identify Repeats (2.2); For each post: conditionally Scrape Article (2.3) & Fetch Comments (2.4); then Aggregate & Format Content (2.5); then Submit to Play.ai & get `jobId` (2.7); then initiate/manage Polling (2.6 using `jobId`); on "completed" polling, trigger Retrieve & Store Audio/Metadata (2.8).
|
||||
2. State machine **must** manage data flow (inputs/outputs) between steps correctly.
|
||||
3. Overall workflow error handling: critical step failure marks state machine execution as "Failed" and logs comprehensively. Steps use retries for transient errors.
|
||||
4. **Idempotency (MVP):** Re-running for the same `publicationDate` **must** re-process and effectively overwrite previous data for that date.
|
||||
5. An Amazon EventBridge Scheduler schedule (CDK defined) **must** trigger the main Step Function daily at 12:00 UTC (default, configurable via `DAILY_JOB_SCHEDULE_UTC_CRON`).
|
||||
6. Successful end-to-end run **must** be demonstrated (e.g., processing sample data through the pipeline).
|
||||
7. Step Function execution history **must** provide a clear audit trail of steps and data.
|
||||
8. Unit tests for any new orchestrator-specific Lambda functions (if any not covered). All tests **must** pass.
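
For AC 5, the default daily trigger at 12:00 UTC corresponds to the six-field EventBridge cron expression `cron(0 12 * * ? *)`. The sketch below resolves the schedule from `DAILY_JOB_SCHEDULE_UTC_CRON`; accepting the value either with or without the `cron(...)` wrapper is an assumption, and the `resolveDailySchedule` name is hypothetical.

```typescript
// EventBridge cron fields: minutes hours day-of-month month day-of-week year.
const DEFAULT_DAILY_SCHEDULE = "cron(0 12 * * ? *)";

function resolveDailySchedule(env: Record<string, string | undefined>): string {
  const configured = env["DAILY_JOB_SCHEDULE_UTC_CRON"]?.trim();
  if (!configured) {
    return DEFAULT_DAILY_SCHEDULE;
  }
  // Assumption: the variable may hold either the bare fields or the full cron(...) form.
  const expression = configured.startsWith("cron(") ? configured : `cron(${configured})`;
  const fields = expression.slice(5, -1).trim().split(/\s+/);
  if (!expression.endsWith(")") || fields.length !== 6) {
    throw new Error(`Expected a six-field EventBridge cron expression, got: ${configured}`);
  }
  return expression;
}
```

Validating the field count at synth time surfaces a common mistake early: five-field Unix cron strings are not valid EventBridge expressions.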
|
||||
@@ -1,83 +0,0 @@
|
||||
# Epic 3: Web Application MVP & Podcast Consumption
|
||||
|
||||
**Goal:** To set up the frontend project in its dedicated repository and develop and deploy the Next.js frontend application MVP, enabling users to consume the "BMad Daily Digest." This includes initial project setup (AI-assisted UI kickstart from `bmad-daily-digest-ui` scaffold), pages for listing and detailing episodes, an about page, and deployment.
|
||||
|
||||
## User Stories
|
||||
|
||||
**Story 3.1: Frontend Project Repository & Initial UI Setup (AI-Assisted)**
|
||||
* **User Story Statement:** As a Developer, I need to establish the `bmad-daily-digest-frontend` Git repository with a new Next.js (TypeScript, Node.js 22) project, using the provided `bmad-daily-digest-ui` V0 scaffold as the base. This setup must include all foundational tooling (ESLint, Prettier, Jest with React Testing Library, a basic CI stub), and an initial AWS CDK application structure, ensuring the "80s retro CRT terminal" aesthetic (with Tailwind CSS and shadcn/ui) is operational, so that a high-quality, styled, and standardized frontend development environment is ready.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A new, private Git repository `bmad-daily-digest-frontend` **must** be created on GitHub.
|
||||
2. The `bmad-daily-digest-ui` V0 scaffold project files **must** be used as the initial codebase in this repository.
|
||||
3. `package.json` **must** be updated (project name, version, description).
|
||||
4. Project dependencies **must** be installable.
|
||||
5. TypeScript (`tsconfig.json`), Next.js (`next.config.mjs`), Tailwind (`tailwind.config.ts`), ESLint, Prettier, Jest configurations from the scaffold **must** be verified and operational.
|
||||
6. The application **must** build successfully (`npm run build`) with the scaffolded UI.
|
||||
7. A basic CI pipeline stub (GitHub Actions) for lint, format check, test, build **must** be created.
|
||||
8. A standard `.gitignore` and an updated `README.md` **must** be present.
|
||||
9. An initial AWS CDK application structure **must** be created within a `cdk/` directory in this repository, ready for defining frontend-specific infrastructure (S3, CloudFront in Story 3.6).
|
||||
|
||||
**Story 3.2: Frontend API Service Layer for Backend Communication**
|
||||
* **User Story Statement:** As a Frontend Developer, I need a dedicated and well-typed API service layer (e.g., `lib/api-client.ts`) within the Next.js frontend application to manage all HTTP communication with the "BMad Daily Digest" backend API (for fetching episode lists and specific episode details), so that UI components can cleanly and securely consume backend data with robust error handling.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A TypeScript module `lib/api-client.ts` (or similar) **must** encapsulate backend API interactions.
|
||||
2. Functions **must** exist for: `getEpisodes(): Promise<EpisodeListItem[]>` and `getEpisodeDetails(episodeId: string): Promise<EpisodeDetail | null>`.
|
||||
3. `axios` (or native `fetch` with a wrapper if preferred for frontend) **must** be used for HTTP requests.
|
||||
4. Backend API base URL (`NEXT_PUBLIC_BACKEND_API_URL`) and Frontend Read API Key (`NEXT_PUBLIC_FRONTEND_API_KEY`) **must** be configurable via public environment variables and used in requests.
|
||||
5. TypeScript interfaces (`EpisodeListItem`, `EpisodeDetail`, `SourceHNPostDetail` from `lib/types.ts`) for API response data **must** be defined/used, matching backend API.
|
||||
6. API functions **must** correctly parse JSON responses and transform data into defined interfaces.
|
||||
7. Error handling (network errors, non-2xx responses from backend) **must** be implemented, providing clear error information/objects.
|
||||
8. Unit tests (Jest) **must** mock the HTTP client and verify API calls, data parsing/transformation, and error handling. All tests **must** pass.
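
A possible shape for `lib/api-client.ts` is sketched below, with the HTTP transport injected so the module does not depend on `axios` or `fetch` directly. Several details are assumptions: the `/episodes` path, the response being a bare JSON array, the `EpisodeListItem` fields, and the `HttpGet`, `ApiError`, `buildRequest`, and `parseEpisodeList` names. The `x-api-key` header follows Story 1.5.

```typescript
// Assumed fields; the real interface lives in lib/types.ts.
interface EpisodeListItem {
  episodeId: string;
  episodeNumber: number;
  publicationDate: string;
  podcastGeneratedTitle: string;
}

interface HttpResponse { status: number; json: unknown; }
// Hypothetical transport abstraction; an axios or fetch wrapper can implement it.
type HttpGet = (url: string, headers: Record<string, string>) => Promise<HttpResponse>;

class ApiError extends Error {
  status?: number;
  constructor(message: string, status?: number) {
    super(message);
    this.name = "ApiError";
    this.status = status;
  }
}

function buildRequest(baseUrl: string, apiKey: string, path: string) {
  // Strip trailing slashes so the base URL and path join cleanly.
  return { url: `${baseUrl.replace(/\/+$/, "")}${path}`, headers: { "x-api-key": apiKey } };
}

function parseEpisodeList(status: number, json: unknown): EpisodeListItem[] {
  if (status < 200 || status >= 300) {
    throw new ApiError(`Backend responded with status ${status}`, status);
  }
  if (!Array.isArray(json)) {
    throw new ApiError("Unexpected response shape for episode list", status);
  }
  return json as EpisodeListItem[];
}

async function getEpisodes(get: HttpGet, baseUrl: string, apiKey: string): Promise<EpisodeListItem[]> {
  const { url, headers } = buildRequest(baseUrl, apiKey, "/episodes");
  let response: HttpResponse;
  try {
    response = await get(url, headers);
  } catch {
    throw new ApiError(`Network error calling ${url}`);
  }
  return parseEpisodeList(response.status, response.json);
}
```

In the app, `baseUrl` and `apiKey` would come from `NEXT_PUBLIC_BACKEND_API_URL` and `NEXT_PUBLIC_FRONTEND_API_KEY`. Because both are public variables, the key is visible in the browser bundle and should be treated as a throttling aid rather than a secret.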
|
||||
|
||||
**Story 3.3: Episode List Page Implementation**
|
||||
* **User Story Statement:** As a Busy Tech Executive, I want to view a responsive "Episode List Page" (based on `app/(pages)/episodes/page.tsx` from the scaffold) that clearly displays all available "BMad Daily Digest" episodes in reverse chronological order, showing the episode number, publication date, and podcast title for each, using themed components like `episode-card.tsx`, so that I can quickly find and select an episode.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The existing `app/(pages)/episodes/page.tsx` (or equivalent main list page from scaffold) **must** be updated.
|
||||
2. It **must** use the API service layer (Story 3.2) to fetch episodes.
|
||||
3. A themed loading state (e.g., using `loading-state.tsx`) **must** be shown.
|
||||
4. A themed error message (e.g., using `error-state.tsx`) **must** be shown if fetching fails.
|
||||
5. A "No episodes available yet" message **must** be shown for an empty list.
|
||||
6. Episodes **must** be listed in reverse chronological order.
|
||||
7. Each list item, potentially using a modified `episode-card.tsx` component, **must** display "Episode [EpisodeNumber]: [PublicationDate] - [PodcastGeneratedTitle]".
|
||||
8. Each item **must** link to the Episode Detail Page for that episode using its `episodeId`.
|
||||
9. Styling **must** adhere to the "80s retro CRT terminal" aesthetic.
|
||||
10. The page **must** be responsive.
|
||||
11. Unit/integration tests (Jest with RTL) **must** cover all states, data display, order, and navigation. All tests **must** pass.
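
The ordering and label rules in ACs 6 and 7 can be kept out of the component as two small helpers. This is a sketch with hypothetical names (`sortNewestFirst`, `formatEpisodeLabel`); it assumes `publicationDate` is a `YYYY-MM-DD` string, so string comparison matches chronological order, and it breaks ties by `episodeNumber`.

```typescript
// Assumed fields; the real interface lives in lib/types.ts.
interface EpisodeListItem {
  episodeId: string;
  episodeNumber: number;
  publicationDate: string; // YYYY-MM-DD
  podcastGeneratedTitle: string;
}

// AC 6: reverse chronological order, without mutating the input array.
function sortNewestFirst(episodes: EpisodeListItem[]): EpisodeListItem[] {
  return [...episodes].sort((a, b) => {
    if (a.publicationDate !== b.publicationDate) {
      return a.publicationDate < b.publicationDate ? 1 : -1;
    }
    return b.episodeNumber - a.episodeNumber;
  });
}

// AC 7: "Episode [EpisodeNumber]: [PublicationDate] - [PodcastGeneratedTitle]".
function formatEpisodeLabel(episode: EpisodeListItem): string {
  return `Episode ${episode.episodeNumber}: ${episode.publicationDate} - ${episode.podcastGeneratedTitle}`;
}
```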
|
||||
|
||||
**Story 3.4: Episode Detail Page Implementation**
|
||||
* **User Story Statement:** As a Busy Tech Executive, after selecting an episode, I want to navigate to a responsive "Episode Detail Page" (based on `app/(pages)/episodes/[episodeId]/page.tsx` from the scaffold) that features an embedded HTML5 audio player, displays the episode title/date/number, a list of the Hacker News stories covered (using components like `story-item.tsx`), and provides clear links to the original articles and HN discussions, so I can listen and explore sources.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The dynamic route page `app/(pages)/episodes/[episodeId]/page.tsx` **must** be implemented.
|
||||
2. It **must** accept `episodeId` from the URL.
|
||||
3. It **must** use the API service layer (Story 3.2) to fetch episode details.
|
||||
4. Loading and error states **must** be handled and displayed with themed components.
|
||||
5. If data found, **must** display: `podcastGeneratedTitle`, `publicationDate`, `episodeNumber`.
|
||||
6. An embedded HTML5 audio player (`<audio controls>`) **must** play the podcast using the public `audioUrl` from the episode details.
|
||||
7. A list of included Hacker News stories (from `sourceHNPosts`) **must** be displayed, potentially using a `story-item.tsx` component for each.
|
||||
8. For each HN story: its title, a link to `originalArticleUrl` (new tab), and a link to `hnLink` (new tab) **must** be displayed.
|
||||
9. Styling **must** adhere to the "80s retro CRT terminal" aesthetic.
|
||||
10. The page **must** be responsive.
|
||||
11. Unit/integration tests (Jest with RTL) **must** cover all states, rendering of details, player, links. All tests **must** pass.
|
||||
|
||||
**Story 3.5: "About" Page Implementation**
|
||||
* **User Story Statement:** As a User, I want to access a minimalist, responsive "About Page" (based on `app/(pages)/about/page.tsx` from the scaffold) that clearly explains "BMad Daily Digest," its purpose, and how it works, styled consistently, so I can understand the service.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The `app/(pages)/about/page.tsx` component **must** be implemented.
|
||||
2. It **must** display static informational content (Placeholder: "BMad Daily Digest provides a daily audio summary of top Hacker News discussions for busy tech professionals, generated using AI. Our mission is to keep you informed, efficiently. All content is curated and processed to deliver key insights in an easily digestible audio format, presented with a unique retro-tech vibe.").
|
||||
3. Styling **must** adhere to the "80s retro CRT terminal" aesthetic.
|
||||
4. The page **must** be responsive.
|
||||
5. A link to "About Page" **must** be accessible from site navigation (e.g., via `header.tsx` or `footer.tsx`).
|
||||
6. Unit tests (Jest with RTL) for rendering static content. All tests **must** pass.
|
||||
|
||||
**Story 3.6: Frontend Deployment to S3 & CloudFront via CDK**
|
||||
* **User Story Statement:** As a Developer, I need the Next.js frontend application to be configured for static export (or an equivalent static-first deployment model) and have its AWS infrastructure (S3 for hosting, CloudFront for CDN and HTTPS) defined and managed via its own AWS CDK application within the frontend repository. This setup should automate the build and deployment of the static site, making the "BMad Daily Digest" web application publicly accessible, performant, and cost-effective.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. Next.js app **must** be configured for static export suitable for S3/CloudFront.
|
||||
2. The AWS CDK app within `bmad-daily-digest-frontend/cdk/` (from Story 3.1) **must** define the S3 bucket and CloudFront distribution.
|
||||
3. CDK stack **must** define: S3 bucket (static web hosting), CloudFront distribution (S3 origin, HTTPS via default CloudFront domain or ACM cert for custom domain if specified for MVP, caching, OAC/OAI).
|
||||
4. A `package.json` build script **must** generate the static output.
|
||||
5. The CDK deployment process (`cdk deploy` run via CI or manually for MVP) **must** include steps/hooks to build the Next.js app and sync static files to S3.
|
||||
6. Application **must** be accessible via its CloudFront URL.
|
||||
7. All MVP functionalities **must** be operational on the deployed site.
|
||||
8. HTTPS **must** be enforced.
|
||||
9. CDK code **must** meet project standards.
|
||||
@@ -1,320 +0,0 @@
|
||||
# BMad Daily Digest Frontend Architecture Document
|
||||
|
||||
**Version:** 0.1
|
||||
**Date:** May 20, 2025
|
||||
**Author:** Jane (Design Architect) & User
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. Introduction
|
||||
2. Overall Frontend Philosophy & Patterns
|
||||
3. Detailed Frontend Directory Structure
|
||||
4. Component Breakdown & Implementation Details
|
||||
- Component Naming & Organization
|
||||
- Template for Component Specification
|
||||
- Example Key Custom Component: `EpisodeCard`
|
||||
5. State Management In-Depth
|
||||
6. API Interaction Layer (`lib/api-client.ts`)
|
||||
7. Routing Strategy
|
||||
8. Build, Bundling, and Deployment
|
||||
9. Frontend Testing Strategy
|
||||
10. Accessibility (AX) Implementation Details
|
||||
11. Performance Considerations
|
||||
12. Internationalization (i18n) and Localization (l10n) Strategy
|
||||
13. Feature Flag Management
|
||||
14. Frontend Security Considerations
|
||||
15. Browser Support and Progressive Enhancement
|
||||
16. Change Log
|
||||
|
||||
---
|
||||
|
||||
## 1\. Introduction
|
||||
|
||||
This document details the technical architecture specifically for the frontend of "BMad Daily Digest." It complements the main "BMad Daily Digest" System Architecture Document (v0.1) and the UI/UX Specification (v0.1). This document builds upon the foundational decisions (e.g., overall tech stack, CI/CD, primary testing tools) defined in the System Architecture Document and the visual/UX direction from the UI/UX Specification. The initial frontend structure has been scaffolded using an AI UI generation tool (V0.dev), and this document outlines how that scaffold will be developed into the full MVP application.
|
||||
|
||||
- **Link to Main System Architecture Document (REQUIRED):** `docs/architecture.md` (Conceptual path, refers to the doc created by Fred).
|
||||
- **Link to UI/UX Specification (REQUIRED):** `docs/ui-ux-specification.md` (Conceptual path, refers to the doc we created).
|
||||
- **Link to Primary Design Files (Figma, Sketch, etc.):** As per UI/UX Spec, detailed visual mockups in separate design files are not planned for MVP. Design is derived from UI/UX Spec and this document.
|
||||
- **Link to Deployed Storybook / Component Showcase:** Not an initial deliverable for MVP. May evolve post-MVP.
|
||||
- **Link to Frontend Source Code Repository:** `bmad-daily-digest-frontend` (GitHub).
|
||||
|
||||
## 2\. Overall Frontend Philosophy & Patterns
|
||||
|
||||
The frontend for "BMad Daily Digest" aims for a unique user experience based on an "80s retro CRT terminal" aesthetic, while being efficient, responsive, and maintainable.
|
||||
|
||||
- **Framework & Core Libraries:** Next.js (latest stable, e.g., 14.x, using the App Router) with React (latest stable, e.g., 18.x) and TypeScript. These are derived from the "Definitive Tech Stack Selections" in the System Architecture Document.
|
||||
- **Component Architecture:**
|
||||
- Leverage **shadcn/ui** components as a base for building accessible and customizable UI elements.
|
||||
- These components will be heavily themed using **Tailwind CSS** to match the "80s retro CRT terminal" style.
|
||||
- Custom components (like `EpisodeCard`, `StoryItem` from the V0 scaffold) will be developed for application-specific needs, following a presentational/container pattern where appropriate. For many Next.js App Router components, however, data fetching will be co-located with the component or handled by Server Components, where that is compatible with the static export strategy.
|
||||
- **State Management Strategy (MVP):**
|
||||
- Primarily **local component state** (`useState`, `useReducer`) for UI-specific logic.
|
||||
- **React Context API** for simple, shared state if needed across a limited part of the component tree (e.g., `ThemeProvider` from V0 scaffold).
|
||||
- No complex global state management library (like Redux or Zustand) is planned for the MVP, as the application's state needs are currently simple (fetching and displaying data). This can be reassessed post-MVP if complexity grows.
|
||||
- **Data Flow:** Client-side data fetching via the API Interaction Layer (`lib/api-client.ts`) which communicates with the backend API. Next.js App Router conventions for data fetching in pages/components will be used (e.g., `async` Server Components, or `useEffect` in Client Components for data fetching).
|
||||
- **Styling Approach:**
|
||||
- **Tailwind CSS:** Primary utility-first framework for all styling. Configuration in `tailwind.config.ts`.
|
||||
- **Global Styles:** Base styles, CSS variable definitions for the theme (e.g., glowing green color, retro fonts), and Tailwind base/components/utilities in `app/globals.css`.
|
||||
- The "80s retro CRT terminal" aesthetic (dark mode, glowing green text, monospaced/pixel fonts) is paramount.
|
||||
- **Key Design Patterns Used:**
|
||||
- Server Components & Client Components (Next.js App Router).
|
||||
- Custom Hooks (e.g., `use-mobile.tsx`, `use-toast.ts` from V0 scaffold) for reusable logic.
|
||||
- Composition over inheritance for UI components.
|
||||
|
||||
## 3\. Detailed Frontend Directory Structure
|
||||
|
||||
The project structure is based on the initial V0.dev scaffold for `bmad-daily-digest-frontend` and standard Next.js App Router conventions. The `cdk/` directory is added for managing frontend-specific AWS infrastructure.
|
||||
|
||||
```plaintext
|
||||
bmad-daily-digest-frontend/
|
||||
├── .github/ # GitHub Actions for CI/CD
|
||||
│ └── workflows/
|
||||
│ └── main.yml
|
||||
├── app/ # Next.js App Router
|
||||
│ ├── (pages)/ # Route group for main pages (as per V0 screenshot)
|
||||
│ │ ├── episodes/ # Route segment for episodes
|
||||
│ │ │ ├── page.tsx # List Page (e.g., /episodes or /)
|
||||
│ │ │ └── [episodeId]/ # Dynamic route for episode detail
|
||||
│ │ │ └── page.tsx # Detail Page (e.g., /episodes/123)
|
||||
│ │ └── about/
|
||||
│ │ └── page.tsx # About Page (e.g., /about)
|
||||
│ ├── globals.css # Global styles, Tailwind base
|
||||
│ └── layout.tsx # Root layout, includes ThemeProvider
|
||||
├── components/ # UI components
|
||||
│ ├── ui/ # Base UI elements (likely shadcn/ui based, themed)
|
||||
│ ├── episode-card.tsx # Custom component for episode list items
|
||||
│ ├── error-state.tsx # Component for displaying error states
|
||||
│ ├── footer.tsx # Footer component
|
||||
│ ├── header.tsx # Header component with navigation
|
||||
│ ├── loading-state.tsx # Component for displaying loading states
|
||||
│ └── story-item.tsx # Component for HN story items on detail page
|
||||
├── cdk/ # AWS CDK application for frontend infra (S3, CloudFront)
|
||||
│ ├── bin/ # CDK app entry point
|
||||
│ └── lib/ # CDK stack definitions
|
||||
├── hooks/ # Custom React Hooks
|
||||
│ ├── use-mobile.tsx
|
||||
│ └── use-toast.ts
|
||||
├── lib/ # Utility functions, types, API client
|
||||
│ ├── data.ts # Initial V0 mock data (TO BE REMOVED/REPLACED)
|
||||
│ ├── types.ts # Frontend-specific TypeScript types (e.g., API responses)
|
||||
│ ├── utils.ts # Utility functions
|
||||
│ └── api-client.ts # NEW: Service for backend API communication
|
||||
├── public/ # Static assets (e.g., favicons, images if any)
|
||||
├── styles/ # Additional global styles or CSS modules (if any)
|
||||
├── .env.local.example # Example environment variables
|
||||
├── .eslintrc.js
|
||||
├── .gitignore
|
||||
├── .prettierrc.js
|
||||
├── components.json # shadcn/ui configuration
|
||||
├── jest.config.js
|
||||
├── next-env.d.ts
|
||||
├── next.config.mjs # Next.js configuration
|
||||
├── package.json
|
||||
├── package-lock.json
|
||||
├── postcss.config.js # For Tailwind CSS
|
||||
├── README.md
|
||||
├── tailwind.config.ts
|
||||
└── tsconfig.json
|
||||
```
|
||||
|
||||
**Key Directory Descriptions:**
|
||||
|
||||
- `app/`: Core Next.js App Router directory for pages, layouts, and global styles. The `(pages)` group organizes user-facing routes.
|
||||
- `components/`: Contains reusable React components, with `ui/` for base shadcn/ui elements (customized for the theme) and other files for application-specific composite components (e.g., `episode-card.tsx`).
|
||||
- `cdk/`: Houses the AWS CDK application for defining and deploying the frontend's S3 bucket and CloudFront distribution.
|
||||
- `hooks/`: For custom React Hooks providing reusable stateful logic.
|
||||
- `lib/`: For shared utilities (`utils.ts`), TypeScript type definitions (`types.ts`), and crucially, the `api-client.ts` which will encapsulate all communication with the backend. The initial `data.ts` (mock data from V0) will be phased out as `api-client.ts` is implemented.
|
||||
- Root configuration files (`next.config.mjs`, `tailwind.config.ts`, `tsconfig.json`, etc.) manage the Next.js, Tailwind, and TypeScript settings.
|
||||
|
||||
## 4\. Component Breakdown & Implementation Details
|
||||
|
||||
Components will be developed adhering to React best practices and Next.js App Router conventions (Server and Client Components). The V0 scaffold provides a good starting point for several components (`episode-card.tsx`, `header.tsx`, etc.) which will be refined and made dynamic.
|
||||
|
||||
### a. Component Naming & Organization
|
||||
|
||||
- **Naming Convention:** `PascalCase` for component file names and component names themselves (e.g., `EpisodeCard.tsx`, `LoadingState.tsx`). Folder names for components (if grouping) will be `dash-case`.
|
||||
- **Organization:** Shared, primitive UI elements (heavily themed shadcn/ui components) in `components/ui/`. More complex, domain-specific components directly under `components/` (e.g., `EpisodeCard.tsx`) or grouped into feature-specific subdirectories if the application grows significantly post-MVP.
|
||||
|
||||
### b. Template for Component Specification
|
||||
|
||||
_(This template should be used when defining new significant components or detailing existing ones that require complex logic or props. For many simple V0-generated presentational components, this level of formal spec might be overkill for MVP if their structure is clear from the code and UI/UX spec)._
|
||||
|
||||
#### Component: `{ComponentName}` (e.g., `EpisodeCard`, `StoryItem`)
|
||||
|
||||
- **Purpose:** {Briefly describe what this component does and its role.}
|
||||
- **Source File(s):** {e.g., `components/episode-card.tsx`}
|
||||
- **Visual Reference from UI/UX Spec:** {Link to relevant section/description in UI/UX Spec or conceptual layout.}
|
||||
- **Props (Properties):**
|
||||
| Prop Name | Type | Required? | Default Value | Description |
|
||||
| :-------------- | :------------------------ | :-------- | :------------ | :---------------------------------------- |
|
||||
| `{propName}` | `{type}` | Yes/No | N/A | {Description, constraints} |
|
||||
- **Internal State (if any):**
|
||||
| State Variable | Type | Initial Value | Description |
|
||||
| :-------------- | :------- | :------------ | :---------------------------------------- |
|
||||
| `{stateName}` | `{type}` | `{value}` | {Description} |
|
||||
- **Key UI Elements / Structure (Conceptual JSX/HTML):** {Describe the primary DOM structure and key elements, especially focusing on thematic styling classes from Tailwind.}
|
||||
- **Events Handled / Emitted:** {e.g., `onClick` for navigation, custom events.}
|
||||
- **Actions Triggered (Side Effects):** {e.g., API calls via `apiClient`, state updates.}
|
||||
- **Styling Notes:** {Key Tailwind classes, specific retro theme applications.}
|
||||
- **Accessibility Notes:** {ARIA attributes, keyboard navigation specifics.}
|
||||
|
||||
### c. Example Key Custom Component: `EpisodeCard.tsx`
|
||||
|
||||
_(This is an example of how the template might be briefly applied to an existing V0 component that needs to be made dynamic)._
|
||||
|
||||
- **Purpose:** Displays a single episode summary in the Episode List Page, acting as a link to the Episode Detail Page.
|
||||
- **Source File(s):** `components/episode-card.tsx`
|
||||
- **Props:**
|
||||
| Prop Name | Type | Required? | Description |
|
||||
| :---------------------- | :------------------------------------ | :-------- | :----------------------------------------------- |
|
||||
| `episode` | `EpisodeListItem` (from `lib/types.ts`) | Yes | Data object for the episode to display. |
|
||||
- **Key UI Elements / Structure:**
|
||||
- A clickable root container (e.g., `Link` component from Next.js).
|
||||
- Displays "Episode `episode.episodeNumber`: `episode.publicationDate` - `episode.podcastGeneratedTitle`" using themed text components.
|
||||
- Styled with Tailwind to match the "80s retro CRT terminal" aesthetic.
|
||||
- **Actions Triggered:** Navigates to `/episodes/${episode.episodeId}` on click.
|
||||
- **Styling Notes:** Uses primary glowing green text on dark background. Clear hover/focus state.
|
||||
- **Accessibility Notes:** Ensures the entire card is keyboard focusable and clearly indicates it's a link.
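The display text and link target described for `EpisodeCard` can be kept in small pure helpers, which makes them easy to unit test apart from the JSX. This is a minimal sketch, not the V0 component itself: the `EpisodeListItem` fields are inferred from the spec above, `publicationDate` is assumed to arrive as a display-ready string, and the helper names `episodeCardLabel` and `episodeCardHref` are hypothetical.

```typescript
// Hypothetical shape inferred from the spec; the real type belongs in `lib/types.ts`.
interface EpisodeListItem {
  episodeId: string;
  episodeNumber: number;
  publicationDate: string; // assumed to be an already formatted date string
  podcastGeneratedTitle: string;
}

// Builds the card text: "Episode <number>: <date> - <title>".
function episodeCardLabel(episode: EpisodeListItem): string {
  return `Episode ${episode.episodeNumber}: ${episode.publicationDate} - ${episode.podcastGeneratedTitle}`;
}

// Builds the detail-page path; encodeURIComponent guards against unusual ID characters.
function episodeCardHref(episode: EpisodeListItem): string {
  return `/episodes/${encodeURIComponent(episode.episodeId)}`;
}
```

Inside the component, the Next.js `Link` would take `episodeCardHref(episode)` as its `href` and render `episodeCardLabel(episode)` as its themed text.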
|
||||
|
||||
## 5\. State Management In-Depth
|
||||
|
||||
For the MVP, the state management strategy will be kept simple and align with modern React/Next.js best practices, leveraging built-in capabilities.
|
||||
|
||||
- **Chosen Solution(s):**
|
||||
- **Local Component State (`useState`, `useReducer`):** This will be the primary method for managing UI-specific state within individual components (e.g., dropdown open/close, input field values, loading states for component-specific data fetches).
|
||||
- **React Context API:** Will be used for sharing simple, global-like state that doesn't change frequently, such as theme information (e.g., the `ThemeProvider` from the V0 scaffold if it manages aspects of the dark mode or retro theme dynamically) or potentially user authentication status if added post-MVP. For MVP, its use will be minimal.
|
||||
- **URL State:** Next.js App Router's dynamic routes and search parameters will be used to manage state where appropriate (e.g., current `episodeId` in the URL).
|
||||
- **No Global State Library for MVP:** A dedicated global state management library (e.g., Redux Toolkit, Zustand, Jotai) is **not planned for the initial MVP** due to the current simplicity of application-wide state requirements. Data fetching will be handled by components or page-level Server Components, with data passed down via props or managed via React Context if shared across a limited tree. This decision can be revisited post-MVP if state complexity grows.
|
||||
- **Conventions:**
|
||||
- Keep state as close as possible to where it's used.
|
||||
- Lift state up only when necessary for sharing between siblings.
|
||||
- For data fetched from the API, components will typically manage their own loading/error/data states locally (e.g., using custom hooks that wrap calls to `api-client.ts`).
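The local loading/error/data convention above could be modelled as a small reducer suitable for `useReducer`. This is only a sketch under assumptions: the state and action names (`FetchState`, `FetchAction`, `fetchReducer`) are illustrative and do not come from the V0 scaffold.

```typescript
// Discriminated union: a component is in exactly one of these states at a time.
type FetchState<T> =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; data: T }
  | { status: "error"; message: string };

// Hypothetical action names for the three transitions of a fetch.
type FetchAction<T> =
  | { type: "FETCH_START" }
  | { type: "FETCH_SUCCESS"; data: T }
  | { type: "FETCH_ERROR"; message: string };

// Pure reducer: each action fully determines the next state.
function fetchReducer<T>(state: FetchState<T>, action: FetchAction<T>): FetchState<T> {
  switch (action.type) {
    case "FETCH_START":
      return { status: "loading" };
    case "FETCH_SUCCESS":
      return { status: "success", data: action.data };
    case "FETCH_ERROR":
      return { status: "error", message: action.message };
    default:
      return state;
  }
}
```

A Client Component could pass this reducer to `useReducer` with `{ status: "idle" }` as the initial state, then render `loading-state.tsx` or `error-state.tsx` depending on `status`.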
|
||||
|
||||
## 6\. API Interaction Layer (`lib/api-client.ts`)
|
||||
|
||||
This module will encapsulate all communication with the `bmad-daily-digest-backend` API.
|
||||
|
||||
- **HTTP Client Setup:**
|
||||
- Will use the browser's native **`fetch` API**, wrapped in utility functions within `api-client.ts` for ease of use, error handling, and request/response processing.
|
||||
- The Backend API Base URL will be sourced from the `NEXT_PUBLIC_BACKEND_API_URL` environment variable.
|
||||
- The "Frontend Read API Key" (if decided upon for backend API access, as discussed with Fred) will be sourced from `NEXT_PUBLIC_FRONTEND_API_KEY` and included in requests via the `x-api-key` header.
|
||||
- **Service Functions (examples):**
|
||||
- `async function getEpisodes(): Promise<EpisodeListItem[]>`: Fetches the list of all episodes.
|
||||
- `async function getEpisodeDetails(episodeId: string): Promise<EpisodeDetail | null>`: Fetches details for a specific episode.
|
||||
- These functions will handle constructing request URLs, adding necessary headers (API Key, `Content-Type: application/json` for POST/PUT if any), making the `fetch` call, parsing JSON responses, and transforming data into the frontend TypeScript types defined in `lib/types.ts`.
|
||||
- **Error Handling:**
|
||||
- The `api-client.ts` functions will implement robust error handling for network issues and non-successful HTTP responses from the backend.
|
||||
- Errors will be processed and returned in a consistent format (e.g., throwing a custom `ApiError` object or returning a result object like `{ data: null, error: { message: string, status?: number } }`) that UI components can easily consume to display appropriate feedback to the user (using `error-state.tsx` component).
|
||||
- Detailed errors will be logged to the browser console for debugging during development.
|
||||
- **Data Types:** All request and response payloads will be typed using interfaces defined in `lib/types.ts`, aligning with the backend API's data models.
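As an illustration of the layer described in this section, the sketch below wraps `fetch`, attaches the `x-api-key` header, and throws a custom `ApiError`. Several details are assumptions rather than confirmed design: the endpoint paths `/episodes` and `/episodes/{id}`, the `audioUrl` field, the `createApiClient` factory, the injectable `fetchImpl` (added so tests can substitute a fake), and treating HTTP 404 as `null` in `getEpisodeDetails`.

```typescript
// Hypothetical minimal types; the real definitions belong in `lib/types.ts`.
interface EpisodeListItem {
  episodeId: string;
  episodeNumber: number;
  publicationDate: string;
  podcastGeneratedTitle: string;
}
interface EpisodeDetail extends EpisodeListItem {
  audioUrl: string; // assumed field
}

// Custom error carrying the HTTP status when one is available.
class ApiError extends Error {
  readonly status?: number;
  constructor(message: string, status?: number) {
    super(message);
    this.name = "ApiError";
    this.status = status;
    Object.setPrototypeOf(this, ApiError.prototype); // keeps instanceof reliable on older targets
  }
}

interface ApiClientConfig {
  baseUrl: string; // e.g. the value of NEXT_PUBLIC_BACKEND_API_URL
  apiKey?: string; // e.g. the value of NEXT_PUBLIC_FRONTEND_API_KEY
  fetchImpl?: typeof fetch; // assumption: injectable so tests can mock the network
}

function createApiClient(config: ApiClientConfig) {
  const doFetch = config.fetchImpl ?? fetch;

  // Shared GET helper: builds headers, maps failures to ApiError, parses JSON.
  async function request<T>(path: string): Promise<T> {
    const headers: Record<string, string> = { Accept: "application/json" };
    if (config.apiKey) headers["x-api-key"] = config.apiKey;

    let response: Response;
    try {
      response = await doFetch(`${config.baseUrl}${path}`, { headers });
    } catch {
      throw new ApiError(`Network error calling ${path}`);
    }
    if (!response.ok) {
      throw new ApiError(`Request to ${path} failed`, response.status);
    }
    return (await response.json()) as T;
  }

  return {
    getEpisodes(): Promise<EpisodeListItem[]> {
      return request<EpisodeListItem[]>("/episodes");
    },
    async getEpisodeDetails(episodeId: string): Promise<EpisodeDetail | null> {
      try {
        return await request<EpisodeDetail>(`/episodes/${encodeURIComponent(episodeId)}`);
      } catch (error) {
        // Assumption: a missing episode is reported as 404 and surfaced as null.
        if (error instanceof ApiError && error.status === 404) return null;
        throw error;
      }
    },
  };
}
```

In the application, the client would be created once with the two `NEXT_PUBLIC_` values and imported by pages or hooks. Throwing `ApiError` is one of the two options named above; returning a `{ data, error }` result object would work equally well.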
|
||||
|
||||
## 7\. Routing Strategy
|
||||
|
||||
Routing will be handled by the **Next.js App Router**, leveraging its file-system based routing conventions.
|
||||
|
||||
- **Routing Library:** Next.js App Router (built-in).
|
||||
- **Route Definitions (MVP):**
|
||||
| Path Pattern | Page Component File Path (`app/...`) | Protection | Notes |
|
||||
| :--------------------- | :------------------------------------------ | :------------- | :---------------------------------- |
|
||||
| `/` or `/episodes` | `(pages)/episodes/page.tsx` | Public | Episode List Page (Homepage) |
|
||||
| `/episodes/[episodeId]`| `(pages)/episodes/[episodeId]/page.tsx` | Public | Episode Detail Page (Dynamic route) |
|
||||
| `/about` | `(pages)/about/page.tsx` | Public | About Page |
|
||||
- **Route Guards / Protection (MVP):** No specific client-side route protection (e.g., auth guards) is required for the MVP, as all content is public. The backend API endpoints used by the frontend are protected by an API Key.
|
||||
|
||||
## 8\. Build, Bundling, and Deployment
|
||||
|
||||
This section aligns with the "Frontend Deployment to S3 & CloudFront via CDK" (Story 3.6) from the PRD and System Architecture Document.
|
||||
|
||||
- **Build Process & Scripts (`package.json`):**
|
||||
- `"dev": "next dev"`: Starts the Next.js development server.
|
||||
- `"build": "next build"`: Builds the application for production. For a static export on the Next.js 14.x line targeted here, set `output: 'export'` in `next.config.mjs` so that `next build` writes the static site to the `out/` directory; the separate `next export` command belongs to older versions and was removed in Next.js 14. We will aim for a fully static export if all pages support it.
|
||||
- `"start": "next start"`: Starts a production server (less relevant for pure SSG to S3, but good for local production testing).
|
||||
- `"lint": "next lint"`
|
||||
- `"test": "jest"`
|
||||
- **Environment Configuration Management:**
|
||||
- Public environment variables (prefixed with `NEXT_PUBLIC_`) like `NEXT_PUBLIC_BACKEND_API_URL` and `NEXT_PUBLIC_FRONTEND_API_KEY` will be managed via `.env.local` (gitignored), `.env.development`, `.env.production` files, and corresponding build-time environment variables in the CI/CD deployment process.
|
||||
- **Key Bundling Optimizations (largely handled by Next.js):**
|
||||
- **Code Splitting:** Automatic per-page code splitting by Next.js. Dynamic imports (`next/dynamic` or `React.lazy`) can be used for further component-level splitting if needed post-MVP.
|
||||
- **Tree Shaking:** Handled by the Next.js production build (Webpack for bundling, SWC for compilation and minification).
|
||||
- **Lazy Loading:** Next.js `next/image` component for image lazy loading. `next/dynamic` for component lazy loading.
|
||||
- **Minification & Compression:** Handled by Next.js production build.
|
||||
- **Deployment to S3/CloudFront (via Frontend CDK App):**
|
||||
- The static output of `next build` (or of `next export` on older Next.js versions that still provide that command) will be synced to an AWS S3 bucket configured for static website hosting.
|
||||
- An AWS CloudFront distribution will serve the content from S3, providing CDN caching, HTTPS, and custom domain support (post-MVP for custom domain).
|
||||
- The CDK app in the `bmad-daily-digest-frontend` repository will manage this S3 bucket and CloudFront distribution.
|
||||
|
||||
## 9\. Frontend Testing Strategy
|
||||
|
||||
This elaborates on the "Overall Testing Strategy" from the System Architecture Document, focusing on frontend specifics. E2E testing with Playwright is post-MVP.
|
||||
|
||||
- **Tools:** Jest with React Testing Library (RTL) for unit and component integration tests.
|
||||
- **Unit Tests:**
|
||||
- **Scope:** Individual utility functions, custom hooks, and simple presentational components.
|
||||
- **Focus:** Logic correctness, handling of different inputs/props.
|
||||
- **Component Tests / UI Integration Tests (using RTL):**
|
||||
- **Scope:** Testing individual React components or small groups of interacting components in isolation from the full application, but verifying their rendering, interactions, and basic state changes.
|
||||
- **Focus:** Correct rendering based on props/state, user interactions (clicks, form inputs if any using `@testing-library/user-event`), event emission, accessibility attributes. API calls from components (via `api-client.ts`) will be mocked.
|
||||
- **Location:** Co-located with components (e.g., `MyComponent.test.tsx`).
|
||||
- **Test Coverage:** Aim for meaningful coverage of critical components and logic (\>70-80% as a guideline). Quality over quantity.
|
||||
- **Mocking:** Jest mocks for the API service layer (`api-client.ts`), the Next.js router (`next/navigation` for App Router components, `next/router` only for any legacy Pages Router code), and any browser APIs not available in JSDOM.
|
||||
|
||||
## 10\. Accessibility (AX) Implementation Details
|
||||
|
||||
This section details how the AX requirements from the UI/UX Specification will be technically implemented.
|
||||
|
||||
- **Semantic HTML:** Prioritize using correct HTML5 elements (e.g., `<nav>`, `<main>`, `<article>`, `<button>`) as provided by Next.js and React, or within custom JSX, to ensure inherent accessibility.
|
||||
- **ARIA Implementation:** `shadcn/ui` components generally provide good ARIA support out-of-the-box. For any custom components or interactions not covered, appropriate ARIA roles, states, and properties will be added as per WAI-ARIA authoring practices.
|
||||
- **Keyboard Navigation:** Ensure all interactive elements (links, custom components from shadcn/ui, audio player) are focusable and operable via keyboard. Logical focus order will be maintained. Focus visible styles will adhere to the retro theme (e.g., brighter green outline or block cursor).
|
||||
- **Focus Management:** For any future modals or dynamic UI changes that might trap focus, proper focus management techniques will be implemented.
|
||||
- **Contrast & Theming:** The "80s retro CRT terminal" theme (glowing green on dark) requires careful selection of shades to meet WCAG AA contrast ratios for text. This will be verified using accessibility tools during development and testing.
|
||||
- **Testing Tools for AX:**
|
||||
- Browser extensions like Axe DevTools or WAVE during development.
|
||||
- Automated checks using `jest-axe` for component tests where applicable.
|
||||
- Lighthouse accessibility audits in browser developer tools.
|
||||
- Manual keyboard navigation and screen reader (e.g., NVDA, VoiceOver) checks for key user flows.
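Candidate green shades can be checked against the WCAG AA thresholds before reaching for browser tooling. The sketch below implements the WCAG relative-luminance and contrast-ratio formulas for 6-digit hex colors; the function names are illustrative, and the AA thresholds used are 4.5:1 for normal text and 3:1 for large text.

```typescript
// Converts one 8-bit sRGB channel to its linear-light value.
function channelToLinear(channel: number): number {
  const c = channel / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" (or "rrggbb") color, from 0 (black) to 1 (white).
function relativeLuminance(hex: string): number {
  const match = /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!match) throw new Error(`Expected a 6-digit hex color, got "${hex}"`);
  const r = channelToLinear(parseInt(match[1], 16));
  const g = channelToLinear(parseInt(match[2], 16));
  const b = channelToLinear(parseInt(match[3], 16));
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(foreground: string, background: string): number {
  const l1 = relativeLuminance(foreground);
  const l2 = relativeLuminance(background);
  const lighter = Math.max(l1, l2);
  const darker = Math.min(l1, l2);
  return (lighter + 0.05) / (darker + 0.05);
}

// WCAG AA: at least 4.5:1 for normal text, 3:1 for large text.
function meetsWcagAA(foreground: string, background: string, largeText = false): boolean {
  return contrastRatio(foreground, background) >= (largeText ? 3 : 4.5);
}
```

Pure `#00ff00` on black passes AA comfortably, so the risk lies mainly in dimmer accent greens and in glow effects that lower the effective contrast.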
|
||||
|
||||
## 11\. Performance Considerations
|
||||
|
||||
Frontend performance is key for a good user experience, especially for busy executives.
|
||||
|
||||
- **Next.js Optimizations:** Leverage built-in Next.js features:
|
||||
- Static Site Generation (SSG) for all pages where possible for fastest load times.
|
||||
- `next/image` for optimized image loading (formats, sizes, lazy loading), though MVP is text-heavy.
|
||||
- `next/font` for optimized web font loading if custom retro fonts are used.
|
||||
- Automatic code splitting.
|
||||
- **Minimizing Re-renders (React):** Use `React.memo` for components that render frequently with the same props. Optimize data structures passed as props. Use `useCallback` and `useMemo` judiciously for expensive computations or to preserve reference equality for props.
|
||||
- **Bundle Size:** Monitor frontend bundle size. While Next.js optimizes, be mindful of large dependencies. Use dynamic imports for non-critical, large components/libraries if they arise post-MVP.
|
||||
- **Efficient Data Fetching:** Ensure API calls via `api-client.ts` are efficient and only fetch necessary data for each view.
|
||||
- **Debouncing/Throttling:** Not anticipated for MVP core features, but if any real-time input features were added post-MVP (e.g., search), these techniques would be applied.
|
||||
- **Performance Monitoring Tools:** Browser DevTools (Lighthouse, Performance tab), Next.js build output analysis.
|
||||
|
||||
## 12\. Internationalization (i18n) and Localization (l10n) Strategy
|
||||
|
||||
- **Not a requirement for MVP.** The application will be developed in English only. This can be revisited post-MVP if there's a need to support other languages.
|
||||
|
||||
## 13\. Feature Flag Management
|
||||
|
||||
- **Not a requirement for MVP.** No complex feature flagging system will be implemented for the initial release. New features will be released directly.
|
||||
|
||||
## 14\. Frontend Security Considerations
|
||||
|
||||
Aligns with "Security Best Practices" in the System Architecture Document.
|
||||
|
||||
- **XSS Prevention:** Rely on React's JSX auto-escaping. Avoid `dangerouslySetInnerHTML`.
|
||||
- **API Key Handling:** The `NEXT_PUBLIC_FRONTEND_API_KEY` for accessing the backend API will be embedded in the static build. While "public," it's specific to this frontend and can be rotated if necessary. This key should grant minimal (read-only) privileges on the backend.
|
||||
- **Third-Party Scripts:** Minimize use for MVP. If any are added (e.g., analytics post-MVP), vet for security and use Subresource Integrity (SRI) if loaded from CDNs.
|
||||
- **HTTPS:** Enforced by CloudFront.
|
||||
- **Dependency Vulnerabilities:** `npm audit` in CI.
|
||||
|
||||
## 15\. Browser Support and Progressive Enhancement
|
||||
|
||||
- **Target Browsers:** Latest 2 stable versions of modern evergreen browsers (Chrome, Firefox, Safari, Edge). Internet Explorer is NOT supported.
|
||||
- **Polyfill Strategy:** Next.js handles most necessary polyfills based on browser targets and features used. `core-js` might be implicitly included.
|
||||
- **JavaScript Requirement:** The application is a Next.js (React) Single Page Application and **requires JavaScript to be enabled** for all functionality. No significant progressive enhancement for non-JS environments is planned for MVP.
|
||||
- **CSS Compatibility:** Use Tailwind CSS with Autoprefixer (handled by Next.js build process) to ensure CSS compatibility with target browsers.
|
||||
|
||||
## 16\. Change Log
|
||||
|
||||
| Version | Date | Author | Summary of Changes |
|
||||
| :------ | :----------- | :----------------------------- | :--------------------------------------------------- |
|
||||
| 0.1 | May 20, 2025 | Jane (Design Architect) & User | Initial draft of the Frontend Architecture Document. |
|
||||
@@ -1,37 +0,0 @@
|
||||
# BMad Daily Digest Documentation
|
||||
|
||||
Welcome to the BMad Daily Digest documentation index. This page provides links to all project documentation.
|
||||
|
||||
## Project Overview & Requirements
|
||||
|
||||
- [Product Requirements Document (PRD)](./prd.md)
|
||||
- [UI/UX Specification](./ux-ui-spec.md)
|
||||
- [Architecture Overview](./architecture.md)
|
||||
|
||||
## Epics & User Stories
|
||||
|
||||
- [Epic 1: Backend Foundation](./epic-1.md) - Backend project setup and "Hello World" API
|
||||
- [Epic 2: Content Ingestion & Podcast Generation](./epic-2.md) - Core data pipeline and podcast creation
|
||||
- [Epic 3: Web Application & Podcast Consumption](./epic-3.md) - Frontend implementation for end-users
|
||||
|
||||
## Technical Documentation
|
||||
|
||||
### Backend Architecture
|
||||
|
||||
- [API Reference](./api-reference.md) - External and internal API documentation
|
||||
- [Data Models](./data-models.md) - Core data entities and schemas
|
||||
- [Component View](./component-view.md) - System components and their interactions
|
||||
- [Sequence Diagrams](./sequence-diagrams.md) - Core workflows and processes
|
||||
- [Project Structure](./project-structure.md) - Repository organization
|
||||
- [Environment Variables](./environment-vars.md) - Configuration settings
|
||||
|
||||
### Infrastructure & Operations
|
||||
|
||||
- [Technology Stack](./tech-stack.md) - Definitive technology selections
|
||||
- [Infrastructure and Deployment](./infra-deployment.md) - Deployment architecture
|
||||
- [Operational Guidelines](./operational-guidelines.md) - Coding standards, testing, error handling, and security
|
||||
|
||||
## Reference Materials
|
||||
|
||||
- [Key References](./key-references.md) - External documentation and resources
|
||||
- [Change Log](./change-log.md) - Document version history
|
||||
@@ -1,8 +0,0 @@
|
||||
# Infrastructure and Deployment Overview
|
||||
|
||||
* **Cloud Provider:** AWS.
|
||||
* **Core Services Used:** Lambda, API Gateway (HTTP API), S3, DynamoDB (On-Demand), Step Functions, EventBridge Scheduler, CloudFront, CloudWatch, IAM, ACM (if custom domain).
|
||||
* **IaC:** AWS CDK (TypeScript), with separate CDK apps in backend and frontend polyrepos.
|
||||
* **Deployment Strategy (MVP):** CI (GitHub Actions) for build/test/lint. CDK deployment (initially manual or CI-scripted) to a single AWS environment.
|
||||
* **Environments (MVP):** Local Development; Single Deployed MVP Environment (e.g., "dev" acting as initial production).
|
||||
* **Rollback Strategy (MVP):** CDK stack rollback, Lambda/S3 versioning, DynamoDB PITR.
|
||||
@@ -1,6 +0,0 @@
|
||||
# Key Reference Documents
|
||||
|
||||
1. **Product Requirements Document (PRD) - BMad Daily Digest** (Version: 0.1)
|
||||
2. **UI/UX Specification - BMad Daily Digest** (Version: 0.1)
|
||||
3. **Algolia Hacker News Search API Documentation** (`https://hn.algolia.com/api`)
|
||||
4. **Play.ai PlayNote API Documentation** (`https://docs.play.ai/api-reference/playnote/post`)
|
||||
@@ -1,52 +0,0 @@
|
||||
# Operational Guidelines
|
||||
|
||||
## Coding Standards (Backend: `bmad-daily-digest-backend`)
|
||||
|
||||
**Scope:** Applies to `bmad-daily-digest-backend`. Frontend standards are separate.
|
||||
|
||||
* **Primary Language:** TypeScript (Node.js 22).
|
||||
* **Style:** ESLint, Prettier.
|
||||
* **Naming:** Variables/Functions: `camelCase`. Constants: `UPPER_SNAKE_CASE`. Classes/Interfaces/Types/Enums: `PascalCase`. Files/Folders: `dash-case` (e.g., `episode-service.ts`, `content-ingestion/`).
|
||||
* **Structure:** Feature-based (`src/features/feature-name/`).
|
||||
* **Tests:** Unit/integration tests co-located (`*.test.ts`). E2E tests (if any for backend API) in root `tests/e2e/`.
|
||||
* **Async:** `async`/`await` for Promises.
|
||||
* **Types:** `strict: true`. No `any` without justification. JSDoc for exported items. Inline comments for clarity.
|
||||
* **Dependencies:** `npm` with `package-lock.json`. Pin versions or use tilde (`~`).
|
||||
* **Detailed Conventions:** Immutability preferred. Functional constructs for stateless logic, classes for stateful services/entities. Custom errors. Strict null checks. ESModules. Pino for logging (structured JSON, levels, context, no secrets). Lambda best practices (lean handlers, env vars, optimize size). `axios` with timeouts. AWS SDK v3 modular imports. Avoid common anti-patterns (deep nesting, large functions, `@ts-ignore`, hardcoded secrets, unhandled promises).
|
||||
|
||||
## Overall Testing Strategy
|
||||
|
||||
* **Tools:** Jest, React Testing Library (frontend), ESLint, Prettier, GitHub Actions.
|
||||
* **Unit Tests:** Isolate functions/methods/components. Mock dependencies. Co-located. Developer responsibility.
|
||||
* **Integration Tests (Backend/Frontend):** Test interactions between internal components with external systems mocked (AWS SDK clients, third-party APIs).
|
||||
* **End-to-End (E2E) Tests (MVP):**
|
||||
* Backend API: Automated test for "Hello World"/status. Test daily job trigger verifies DDB/S3 output.
|
||||
* Frontend UI: Key user flows tested manually for MVP. (Playwright deferred to post-MVP).
|
||||
* **Coverage:** Guideline \>80% unit test coverage for critical logic. Quality over quantity. Measured by Jest.
|
||||
* **Mocking:** Jest's built-in system. `axios-mock-adapter` if needed.
|
||||
* **Test Data:** Inline mocks or small fixtures for unit/integration.
|
||||
|
||||
## Error Handling Strategy
|
||||
|
||||
* **General Approach:** Custom `Error` classes hierarchy. Promises reject with `Error` objects.
|
||||
* **Logging:** Pino for structured JSON logs to CloudWatch. Standard levels (DEBUG, INFO, WARN, ERROR, CRITICAL). Contextual info (AWS Request ID, business IDs). No sensitive data in logs.
|
||||
* **Specific Patterns:**
|
||||
* **External API Calls (`axios`):** Timeouts, retries (e.g., `axios-retry`), wrap errors in custom types.
|
||||
* **Internal Errors:** Custom error types, detailed server-side logging.
|
||||
* **API Gateway Responses:** Translate internal errors to appropriate HTTP errors (4xx, 500) with generic client messages.
|
||||
* **Workflow (Step Functions):** Error handling, retries, catch blocks for states. Failed executions logged.
|
||||
* **Data Consistency:** Lambdas handle partial failures gracefully. Step Functions manage overall workflow state.
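One way to read the strategy above is a small custom error hierarchy plus a single translation step at the API boundary. The sketch below is an assumption-laden illustration: the class names (`AppError`, `NotFoundError`, `ValidationError`, `ExternalApiError`), the status mapping, and the generic messages are hypothetical, and logging with Pino is left to the call site.

```typescript
// Base class for all application errors; `statusCode` drives the HTTP translation.
class AppError extends Error {
  readonly statusCode: number;
  constructor(message: string, statusCode = 500) {
    super(message);
    this.name = new.target.name;
    this.statusCode = statusCode;
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof reliable on older targets
  }
}

class NotFoundError extends AppError {
  constructor(message: string) {
    super(message, 404);
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, 400);
  }
}

// Wraps failures from third-party calls; the detail stays server-side.
class ExternalApiError extends AppError {
  readonly service: string;
  constructor(service: string, detail: string) {
    super(`Call to ${service} failed: ${detail}`, 500);
    this.service = service;
  }
}

interface HttpErrorResponse {
  statusCode: number;
  body: string;
}

// Generic client-facing messages; detailed messages are only for server logs.
const GENERIC_MESSAGES: Record<number, string> = {
  400: "Invalid request",
  404: "Resource not found",
  500: "Internal server error",
};

// Translates any thrown value into a response that leaks no internal detail.
function toHttpErrorResponse(error: unknown): HttpErrorResponse {
  const candidate = error instanceof AppError ? error.statusCode : 500;
  const statusCode = candidate in GENERIC_MESSAGES ? candidate : 500;
  return { statusCode, body: JSON.stringify({ message: GENERIC_MESSAGES[statusCode] }) };
}
```

A Lambda handler would catch at its top level, log the original error with its context, and return `toHttpErrorResponse(error)` to API Gateway.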
|
||||
|
||||
## Security Best Practices
|
||||
|
||||
* **Input Validation:** API Gateway basic validation; Zod for detailed payload validation in Lambdas.
|
||||
* **Output Encoding:** Next.js/React handles XSS for frontend rendering. Backend API is JSON.
|
||||
* **Secrets Management:** Lambda environment variables via CDK (from local gitignored `.env` for MVP setup). No hardcoding. Pino redaction for logs if needed.
|
||||
* **Dependency Security:** `npm audit` in CI. Promptly address high/critical vulnerabilities.
|
||||
* **Authentication/Authorization:** API Gateway API Keys (Frontend Read Key, Admin Action Key). IAM roles with least privilege for service-to-service.
|
||||
* **Principle of Least Privilege (IAM):** Minimal permissions for all IAM roles (Lambdas, Step Functions, CDK).
|
||||
* **API Security:** HTTPS enforced by API Gateway/CloudFront. Basic rate limiting on API Gateway. Frontend uses HTTP security headers (via CloudFront/Next.js).
|
||||
* **Error Disclosure:** Generic errors to client, detailed logs server-side.
|
||||
* **Infrastructure Security:** S3 bucket access restricted (CloudFront OAC/OAI).
|
||||
* **Post-MVP:** Consider SAST/DAST, penetration testing.
|
||||
* **Adherence:** AWS Well-Architected Framework - Security Pillar.
|
||||
@@ -1,428 +0,0 @@
|
||||
# BMad Daily Digest Product Requirements Document (PRD)
|
||||
|
||||
**Version:** 0.1
|
||||
**Date:** May 20, 2025
|
||||
**Author:** JohnAI
|
||||
|
||||
## 1. Goal, Objective and Context
|
||||
|
||||
* **Overall Goal:** To provide busy tech executives with a quick, daily audio digest of top Hacker News posts and discussions, enabling them to stay effortlessly informed.
|
||||
* **Project Objective (MVP Focus):** To successfully launch the "BMad Daily Digest" by:
|
||||
* Automating the daily fetching of top 10 Hacker News posts (article metadata and comments via Algolia HN API) and scraping of linked article content.
|
||||
* Processing this content into a structured format.
|
||||
* Generating a 2-agent audio podcast using the play.ai PlayNote API.
|
||||
* Delivering the podcast via a simple Next.js web application (polyrepo structure) with a list of episodes and detail pages including an audio player and links to source materials.
|
||||
* Operating this process daily, aiming for delivery by a consistent morning hour.
|
||||
* Adhering to a full TypeScript stack (Node.js 22 for backend), with a Next.js frontend, AWS Lambda backend, DynamoDB, S3, and AWS CDK for IaC, while aiming to stay within AWS free-tier limits where possible.
|
||||
* **Context/Problem Solved:** Busy tech executives lack the time to thoroughly read Hacker News daily but need to stay updated on key tech discussions, trends, and news for strategic insights. "BMad Daily Digest" solves this by offering a convenient, curated audio summary.
|
||||
|
||||
## 2. Functional Requirements (MVP)
|
||||
|
||||
**FR1: Content Acquisition**
|
||||
* The system **must** automatically fetch data for the top 10 (configurable) posts from Hacker News daily.
|
||||
* For each Hacker News post, the system **must** identify and retrieve:
|
||||
* The URL of the linked article.
|
||||
* Key metadata about the Hacker News post (e.g., title, HN link, score, author, HN Post ID).
|
||||
* The system **must** fetch comments for each identified Hacker News post using the Algolia HN Search API, with logic to handle new vs. repeat posts and scraping failures differently.
|
||||
* The system **must** attempt to scrape and extract the primary textual content from the linked article URL for each of the top posts (unless it's a repeat post where only new comments are needed).
|
||||
* This process should aim to isolate the main article body.
|
||||
* If scraping fails, a fallback using HN title, summary (if available), and increased comments **must** be used.
|
||||
|
||||
**FR2: Content Processing and Formatting**
|
||||
* The system **must** aggregate the extracted/fallback article content and selected comments for the top 10 posts.
|
||||
* The system **must** process and structure the aggregated text content into a single text file suitable for submission to the play.ai PlayNote API.
|
||||
* The text file **must** begin with an introductory sentence: "It's a top 10 countdown for [Today's Date]".
|
||||
* Content **must** be structured sequentially (e.g., "Story 10 - [details]..."), with special phrasing for repeat posts or posts where article scraping failed.
|
||||
* Article content may be truncated if `MAX_ARTICLE_LENGTH` is set, preserving intro/conclusion where possible.
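A minimal sketch of the truncation rule, assuming the simpler "first X / last Y characters" approach that Story 2.5 allows for MVP. The function name `truncateArticle`, the 70/30 head/tail split, and the ` [...] ` marker are illustrative assumptions, not part of the requirements.

```typescript
// Hypothetical helper: keeps the start and end of an article and drops the middle.
// The 70/30 split and the marker text are assumptions made for illustration.
function truncateArticle(text: string, maxLength?: number): string {
  // No limit configured (MAX_ARTICLE_LENGTH unset) or the article already fits.
  if (maxLength === undefined || maxLength <= 0 || text.length <= maxLength) {
    return text;
  }
  const marker = " [...] ";
  const budget = maxLength - marker.length;
  if (budget <= 0) {
    // Limit too small to fit the marker: fall back to a plain head cut.
    return text.slice(0, maxLength);
  }
  // Integer arithmetic avoids floating-point rounding surprises.
  const headLength = Math.ceil((budget * 7) / 10);
  const tailLength = budget - headLength;
  const head = text.slice(0, headLength);
  const tail = tailLength > 0 ? text.slice(text.length - tailLength) : "";
  return head + marker + tail;
}
```

The caller would pass the parsed value of `MAX_ARTICLE_LENGTH`, or `undefined` when the variable is not set.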
|
||||
|
||||
**FR3: Podcast Generation**
|
||||
* The system **must** submit the formatted text content to the play.ai PlayNote API daily using specified voice and style parameters (configurable via environment variables).
|
||||
* The system **must** capture the `jobId` from Play.ai and use a polling mechanism (e.g., AWS Step Functions) to check for job completion status.
|
||||
* Upon successful completion, the system **must** retrieve the generated audio podcast file from the Play.ai-provided URL.
|
||||
* The system **must** store the generated audio file (e.g., on S3) and its associated metadata (including episode number, generated title, S3 location, original Play.ai URL, source HN posts, and processing status) in DynamoDB.
|
||||
|
||||
**FR4: Web Application Interface (MVP)**
|
||||
* The system **must** provide a web application (Next.js, "80s retro CRT terminal" theme with Tailwind CSS & shadcn/ui) with a **List Page** that:
|
||||
* Displays a chronological list (newest first) of all generated "BMad Daily Digest" episodes, formatted as "Episode [EpisodeNumber]: [PublicationDate] - [PodcastTitle]".
|
||||
* Allows users to navigate to a Detail Page for each episode.
|
||||
* The system **must** provide a web application with a **Detail Page** for each episode that:
|
||||
* Displays the `podcastGeneratedTitle`, `publicationDate`, and `episodeNumber`.
|
||||
* Includes an embedded HTML5 audio player for the podcast.
|
||||
* Lists the individual Hacker News stories included, with direct links to the original source article and the Hacker News discussion page.
|
||||
* The system **must** provide a minimalist **About Page**.
|
||||
* The web application **must** be responsive.
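The List Page ordering and label format above can be sketched as two small pure functions. This is an illustration only: the `EpisodeSummary` shape and the function names are assumptions, and `publicationDate` is assumed to be a `YYYY-MM-DD` string so that plain string comparison sorts correctly.

```typescript
// Hypothetical view model for one row on the List Page.
interface EpisodeSummary {
  episodeNumber: number;
  publicationDate: string; // assumed format: YYYY-MM-DD
  podcastGeneratedTitle: string;
}

// "Episode [EpisodeNumber]: [PublicationDate] - [PodcastTitle]"
function episodeLabel(episode: EpisodeSummary): string {
  return (
    "Episode " + episode.episodeNumber + ": " +
    episode.publicationDate + " - " + episode.podcastGeneratedTitle
  );
}

// Returns a sorted copy, newest first; ties fall back to the episode number.
function sortNewestFirst(episodes: EpisodeSummary[]): EpisodeSummary[] {
  return episodes.slice().sort((a, b) => {
    if (a.publicationDate !== b.publicationDate) {
      return a.publicationDate < b.publicationDate ? 1 : -1;
    }
    return b.episodeNumber - a.episodeNumber;
  });
}
```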
|
||||
|
||||
**FR5: Automation and Scheduling**
|
||||
* The entire end-to-end backend process **must** be orchestrated (preferably via AWS Step Functions) and automated to run daily, triggered by Amazon EventBridge Scheduler (default 12:00 UTC, configurable).
|
||||
* For MVP, a re-run of the daily job for the same day **must** start over and overwrite any previous data for that day.
|
||||
|
||||
## 3. Non-Functional Requirements (MVP)
|
||||
|
||||
**a. Performance:**
|
||||
* **Podcast Generation Time:** The daily process should complete in a timely manner (e.g., target by 8 AM CST/12:00-13:00 UTC, specific completion window TBD based on Play.ai processing).
|
||||
* **Web Application Load Time:** Pages on the Next.js app should aim for fast load times (e.g., target under 3-5 seconds).
|
||||
|
||||
**b. Reliability / Availability:**
|
||||
* **Daily Process Success Rate:** Target >95% success rate for automated podcast generation without manual intervention.
|
||||
* **Web Application Uptime:** Target 99.5%+ uptime.
|
||||
|
||||
**c. Maintainability:**
|
||||
* **Code Quality:** Code **must** be well-documented. Internal code comments **should** be used when logic isn't clear from names. All functions **must** have JSDoc-style outer comments. Adherence to defined coding standards (ESLint, Prettier).
|
||||
* **Configuration Management:** System configurations and secrets **must** be managed via environment variables (`.env` locally, Lambda environment variables when deployed), set manually for MVP.
|
||||
|
||||
**d. Usability (Web Application):**
|
||||
* The web application **must** be intuitive for busy tech executives.
|
||||
* The audio player **must** be simple and reliable.
|
||||
* Accessibility: Standard MVP considerations, with particular attention to contrast for the "glowing green on dark" theme, good keyboard navigation, and basic screen reader compatibility.
|
||||
|
||||
**e. Security (MVP Focus):**
|
||||
* **API Key Management:** Keys for Algolia, Play.ai, AWS **must** be stored securely (gitignored `.env` files locally, Lambda environment variables in AWS), not hardcoded.
|
||||
* **Data Handling:** Scraped content handled responsibly.
|
||||
|
||||
**f. Cost Efficiency:**
|
||||
* AWS service usage **must** aim to stay within free-tier limits where feasible. Play.ai usage is via existing user subscription.
|
||||
|
||||
## 4. User Interaction and Design Goals
|
||||
|
||||
**a. Overall Vision & Experience:**
|
||||
* **Look and Feel:** Dark mode UI, "glowing green ASCII/text on a black background" aesthetic (CRT terminal style), "80s retro everything" theme.
|
||||
* **UI Component Toolkit:** Tailwind CSS and shadcn/ui, customized for the theme. Initial structure/components kickstarted by an AI UI generation tool.
|
||||
* **User Experience:** Highly efficient, clear navigation, no clutter, prioritizing content readability for busy tech executives.
|
||||
|
||||
**b. Key Interaction Paradigms (MVP):**
|
||||
* View list of digests (reverse chronological), select one for details. No sorting/filtering on list page for MVP.
|
||||
|
||||
**c. Core Screens/Views (MVP):**
|
||||
* **List Page:** Episodes ("Episode [N]: [Date] - [PodcastTitle]").
|
||||
* **Detail Page:** Episode details, HTML5 audio player, list of source HN stories with links to articles and HN discussions.
|
||||
* **About Page:** Minimalist, explaining the service, consistent theme.
|
||||
|
||||
**d. Accessibility Aspirations (MVP):**
|
||||
* Standard considerations: good contrast (critical for theme), keyboard navigation, basic screen reader compatibility.
|
||||
|
||||
**e. Branding Considerations (High-Level):**
|
||||
* "80s retro everything" theme is central. Logo/typeface should complement this (e.g., pixel art, retro fonts).
|
||||
|
||||
**f. Target Devices/Platforms:**
|
||||
* Responsive web application, good UX on desktop and mobile.
|
||||
|
||||
## 5. Technical Assumptions
|
||||
|
||||
**a. Core Technology Stack & Approach:**
|
||||
* **Full TypeScript Stack:** TypeScript for frontend and backend.
|
||||
* **Frontend:** Next.js (React), Node.js 22. Styling: Tailwind CSS, shadcn/ui. Hosting: Static site on AWS S3 (via CloudFront).
|
||||
* **Backend:** Node.js 22, TypeScript. HTTP Client: `axios`. Compute: AWS Lambda. Database: AWS DynamoDB.
|
||||
* **Infrastructure as Code (IaC):** All AWS infrastructure via AWS CDK.
|
||||
* **Key External Services/APIs:** Algolia HN Search API (posts/comments), Play.ai PlayNote API (audio gen, user has subscription, polling for status), Custom scraping for articles (TypeScript with Cheerio, Readability.js, potentially Puppeteer/Playwright).
|
||||
* **Automation:** Daily trigger via Amazon EventBridge Scheduler. Orchestration via AWS Step Functions.
|
||||
* **Configuration & Secrets:** Environment variables (`.env` local & gitignored, Lambda env vars).
|
||||
* **Coding Standards:** JSDoc for functions, inline comments for clarity. ESLint, Prettier.
|
||||
|
||||
**b. Repository Structure & Service Architecture:**
|
||||
* **Repository Structure:** Polyrepo (separate Git repositories for `bmad-daily-digest-frontend` and `bmad-daily-digest-backend`).
|
||||
* **High-Level Service Architecture:** Backend is serverless functions (AWS Lambda) for distinct tasks, orchestrated by Step Functions. API layer via AWS API Gateway to expose backend to frontend.
|
||||
|
||||
## 6. Epic Overview
|
||||
|
||||
This section details the Epics and their User Stories for the MVP.
|
||||
|
||||
**Epic 1: Backend Foundation, Tooling & "Hello World" API**
|
||||
* **Goal:** To establish the core backend project infrastructure in its dedicated repository, including robust development tooling and initial AWS CDK setup for essential services. By the end of this epic:
|
||||
1. A simple "hello world" API endpoint (AWS API Gateway + Lambda) **must** be deployed and testable via `curl`, returning a dynamic message.
|
||||
2. The backend project **must** have ESLint, Prettier, Jest (unit testing), and esbuild (TypeScript bundling) configured and operational.
|
||||
3. Basic unit tests **must** exist for the "hello world" Lambda function.
|
||||
4. Code formatting and linting checks **should** be integrated into a pre-commit hook and/or a basic CI pipeline stub.
|
||||
* **User Stories for Epic 1:**
|
||||
|
||||
**Story 1.1: Initialize Backend Project using TS-TEMPLATE-STARTER**
|
||||
* **User Story Statement:** As a Developer, I want to create the `bmad-daily-digest-backend` Git repository and initialize it using the existing `TS-TEMPLATE-STARTER`, ensuring all foundational tooling (TypeScript, Node.js 22, ESLint, Prettier, Jest, esbuild) is correctly configured and operational for this specific project, so that I have a high-quality, standardized development environment ready for application logic.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A new, private Git repository named `bmad-daily-digest-backend` **must** be created on GitHub.
|
||||
2. The contents of the `TS-TEMPLATE-STARTER` project **must** be copied/cloned into this new repository.
|
||||
3. `package.json` **must** be updated (project name, version, description).
|
||||
4. Project dependencies **must** be installable.
|
||||
5. TypeScript setup (`tsconfig.json`) **must** be verified for Node.js 22, esbuild compatibility; project **must** compile.
|
||||
6. ESLint and Prettier configurations **must** be operational; lint/format scripts **must** execute successfully.
|
||||
7. Jest configuration **must** be operational; test scripts **must** execute successfully with any starter example tests.
|
||||
8. Irrelevant generic demo code from starter **should** be removed. `index.ts`/`index.test.ts` can remain as placeholders.
|
||||
9. A standard `.gitignore` and an updated project `README.md` **must** be present.
|
||||
|
||||
**Story 1.2: Pre-commit Hook Implementation**
|
||||
* **User Story Statement:** As a Developer, I want pre-commit hooks automatically enforced in the `bmad-daily-digest-backend` repository, so that code quality standards (like linting and formatting) are checked and applied to staged files before any code is committed, thereby maintaining codebase consistency and reducing trivial errors.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A pre-commit hook tool (e.g., Husky) **must** be installed and configured.
|
||||
2. A tool for running linters/formatters on staged files (e.g., `lint-staged`) **must** be installed and configured.
|
||||
3. Pre-commit hook **must** trigger `lint-staged` on staged `.ts` files.
|
||||
4. `lint-staged` **must** be configured to run ESLint (`--fix`) and Prettier (`--write`).
|
||||
5. Attempting to commit files with auto-fixable issues **must** result in fixes applied and successful commit.
|
||||
6. Attempting to commit files with non-auto-fixable linting errors **must** abort the commit with error messages.
|
||||
7. Committing clean files **must** proceed without issues.
|
||||
|
||||
**Story 1.3: "Hello World" Lambda Function Implementation & Unit Tests**
|
||||
* **User Story Statement:** As a Developer, I need a simple "Hello World" AWS Lambda function implemented in TypeScript within the `bmad-daily-digest-backend` project. This function, when invoked, should return a dynamic greeting message including the current date and time, and it must be accompanied by comprehensive Jest unit tests, so that our basic serverless compute functionality, testing setup, and TypeScript bundling are validated.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `src/handlers/helloWorldHandler.ts` file (or similar) **must** contain the Lambda handler.
|
||||
2. Handler **must** be AWS Lambda compatible (event, context, Promise response).
|
||||
3. Successful execution **must** return JSON: `statusCode: 200`, body with `message: "Hello from BMad Daily Digest Backend, today is [current_date] at [current_time]."`.
|
||||
4. Date and time in message **must** be dynamic.
|
||||
5. A corresponding Jest unit test file (e.g., `src/handlers/helloWorldHandler.test.ts`) **must** be created.
|
||||
6. Unit tests **must** verify: 200 status, valid JSON body, expected `message` field, "Hello from..." prefix, dynamic date/time portion (use mocked `Date`).
|
||||
7. All unit tests **must** pass.
|
||||
8. esbuild configuration **must** correctly bundle the handler.
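A minimal sketch of a handler that would satisfy the ACs above. The file layout, the `HelloResponse` type, and the exact date/time format (ISO date plus UTC `HH:MM:SS`) are assumptions. Passing the clock into `buildHelloMessage` is one way to make the dynamic portion testable without mocking `Date` globally.

```typescript
// Minimal response shape for an API Gateway proxy-style Lambda result.
interface HelloResponse {
  statusCode: number;
  headers: { [header: string]: string };
  body: string;
}

// Builds the greeting. Taking `now` as a parameter keeps the function pure.
// The date/time formatting chosen here is an assumption.
function buildHelloMessage(now: Date): string {
  const iso = now.toISOString(); // e.g. 2025-05-20T12:34:56.000Z
  const datePart = iso.slice(0, 10);
  const timePart = iso.slice(11, 19);
  return (
    "Hello from BMad Daily Digest Backend, today is " +
    datePart + " at " + timePart + "."
  );
}

// Lambda-compatible signature: (event, context) => Promise<response>.
// Both arguments are typed `unknown` because this sketch ignores them.
async function helloWorldHandler(
  _event: unknown,
  _context: unknown
): Promise<HelloResponse> {
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: buildHelloMessage(new Date()) }),
  };
}
```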
|
||||
|
||||
**Story 1.4: AWS CDK Setup for "Hello World" API (Lambda & API Gateway)**
|
||||
* **User Story Statement:** As a Developer, I want to define the necessary AWS infrastructure (Lambda function and API Gateway endpoint) for the "Hello World" service using AWS CDK (Cloud Development Kit) in TypeScript, so that the infrastructure is version-controlled, repeatable, and can be deployed programmatically.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. AWS CDK (v2) **must** be a development dependency.
|
||||
2. CDK app structure **must** be initialized (e.g., in `cdk/` or `infra/`).
|
||||
3. A new CDK stack (e.g., `BmadDailyDigestBackendStack`) **must** be defined in TypeScript.
|
||||
4. CDK stack **must** define an AWS Lambda resource for the "Hello World" function (Node.js 22, bundled code reference, handler entry point, basic IAM role for CloudWatch logs, free-tier conscious settings).
|
||||
5. CDK stack **must** define an AWS API Gateway (HTTP API preferred) with a route (e.g., `GET /hello`) triggering the Lambda.
|
||||
6. CDK stack **must** be synthesizable (`cdk synth`) without errors.
|
||||
7. CDK code **must** adhere to project ESLint/Prettier standards.
|
||||
8. Mechanism for passing Lambda environment variables via CDK (if any needed for "Hello World") **must** be in place.
|
||||
|
||||
**Story 1.5: "Hello World" API Deployment & Manual Invocation Test**
|
||||
* **User Story Statement:** As a Developer, I need to deploy the "Hello World" API (defined in AWS CDK) to an AWS environment and successfully invoke its endpoint using a tool like `curl`, so that I can verify the end-to-end deployment process and confirm the basic API is operational in the cloud.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The AWS CDK stack for "Hello World" API **must** deploy successfully to a designated AWS account/region.
|
||||
2. The API Gateway endpoint URL for `/hello` **must** be retrievable post-deployment.
|
||||
3. A `GET` request to the deployed `/hello` endpoint **must** receive a response.
|
||||
4. HTTP response status **must** be 200 OK.
|
||||
5. Response body **must** be JSON containing the expected dynamic "Hello..." message.
|
||||
6. Basic Lambda invocation logs **must** be visible in AWS CloudWatch Logs.
|
||||
|
||||
**Story 1.6: Basic CI/CD Pipeline Stub with Quality Gates**
|
||||
* **User Story Statement:** As a Developer, I need a basic Continuous Integration (CI) pipeline established for the `bmad-daily-digest-backend` repository, so that code quality checks (linting, formatting, unit tests) and the build process are automated upon code pushes and pull requests, ensuring early feedback and maintaining codebase health.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A CI workflow file (e.g., GitHub Actions in `.github/workflows/main.yml`) **must** be created.
|
||||
2. Pipeline **must** trigger on pushes to `main` and PRs targeting `main`.
|
||||
3. Pipeline **must** include steps for: checkout, Node.js 22 setup, dependency install, ESLint check, Prettier format check, Jest unit tests, esbuild bundle.
|
||||
4. Pipeline **must** fail if any lint, format, test, or bundle step fails.
|
||||
5. A successful CI run on the `main` branch (with "Hello World" code) **must** be demonstrated.
|
||||
6. CI pipeline for MVP **does not** need to perform AWS deployment; focus is on quality gates.
|
||||
|
||||
---
|
||||
**Epic 2: Automated Content Ingestion & Podcast Generation Pipeline**
|
||||
|
||||
**Goal:** To implement the complete automated daily workflow within the backend. This includes fetching Hacker News post data, scraping and extracting content from linked external articles, aggregating and formatting text, submitting it to Play.ai, managing job status via polling, and retrieving/storing the final audio file and associated metadata. This epic delivers the core value proposition of generating the daily audio content and making it ready for consumption via an API.
|
||||
|
||||
**User Stories for Epic 2:**
|
||||
|
||||
**Story 2.1: AWS CDK Extension for Epic 2 Resources**
|
||||
* **User Story Statement:** As a Developer, I need to extend the existing AWS CDK stack within the `bmad-daily-digest-backend` project to define and provision all new AWS resources required for the content ingestion and podcast generation pipeline—including a DynamoDB table for episode and processed post metadata, an S3 bucket for audio storage, and the AWS Step Functions state machine for orchestrating the Play.ai job status polling—so that all backend infrastructure for this epic is managed as code and ready for the application logic.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The existing AWS CDK application (from Epic 1) **must** be extended with new resource definitions for Epic 2.
|
||||
2. A DynamoDB table (e.g., `BmadDailyDigestEpisodes`) **must** be defined via CDK for episode metadata, with `episodeId` (String UUID) as PK, key attributes (`publicationDate`, `episodeNumber`, `podcastGeneratedTitle`, `audioS3Key`, `audioS3Bucket`, `playAiJobId`, `playAiSourceAudioUrl`, `sourceHNPosts` (List of objects: `{ hnPostId, title, originalArticleUrl, hnLink, isUpdateStatus, lastCommentFetchTimestamp, oldRank }`), `status`, `createdAt`), and PAY_PER_REQUEST billing.
|
||||
3. A DynamoDB table or GSI strategy **must** be defined via CDK to efficiently track processed Hacker News posts (`hnPostId`) and their `lastCommentFetchTimestamp` to support the "new comments only" feature.
|
||||
4. An S3 bucket (e.g., `bmad-daily-digest-audio-{unique-suffix}`) **must** be defined via CDK for audio storage, with private access.
|
||||
5. An AWS Step Functions state machine **must** be defined via CDK to manage the Play.ai job status polling workflow (details in Story 2.6).
|
||||
6. Necessary IAM roles/permissions for Lambdas to interact with DynamoDB, S3, Step Functions, CloudWatch Logs **must** be defined via CDK.
|
||||
7. The updated CDK stack **must** synthesize (`cdk synth`) and deploy (`cdk deploy`) successfully.
|
||||
8. All new CDK code **must** adhere to project ESLint/Prettier standards.
|
||||
|
||||
**Story 2.2: Fetch Top Hacker News Posts & Identify Repeats**
|
||||
* **User Story Statement:** As the System, I need to reliably fetch the top N (configurable, e.g., 10) current Hacker News posts daily using the Algolia HN API, including their essential metadata. I also need to identify if each fetched post has been processed in a recent digest by checking against stored data, so that I have an accurate list of stories and their status (new or repeat) to begin generating the daily digest.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `hackerNewsService.ts` function **must** fetch top N HN posts (stories only) via `axios` from Algolia API (configurable `HN_POSTS_COUNT`).
|
||||
2. Extracted metadata per post: Title, Article URL, HN Post URL, HN Post ID (`objectID`), Author, Points, Creation timestamp.
|
||||
3. For each post, the function **must** query DynamoDB (see Story 2.1 AC#3) to determine its `isUpdateStatus` (true if recently processed and article scraped) and retrieve `lastCommentFetchTimestamp` and `oldRank` if available.
|
||||
4. Function **must** return an array of HN post objects with metadata, `isUpdateStatus`, `lastCommentFetchTimestamp`, and `oldRank`.
|
||||
5. Error handling for Algolia/DynamoDB calls **must** be implemented and logged.
|
||||
6. Unit tests (Jest) **must** verify API calls, data extraction, repeat identification (mocked DDB), and error handling. All tests **must** pass.
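The mapping and repeat-detection steps in ACs 2 to 4 can be sketched without any network or database calls. The `AlgoliaHit` fields below are the subset of the Algolia HN Search API hit that this story needs. `ProcessedPostRecord`, `HnPost`, and `toHnPosts` are hypothetical names, and the lookup object stands in for the result of the DynamoDB query.

```typescript
// Subset of the fields returned per hit by the Algolia HN Search API.
interface AlgoliaHit {
  objectID: string;
  title: string | null;
  url: string | null; // null for text-only posts such as Ask HN
  author: string;
  points: number | null;
  created_at_i: number; // Unix seconds
}

// Hypothetical shape of a previously processed post loaded from DynamoDB.
interface ProcessedPostRecord {
  articleScraped: boolean;
  lastCommentFetchTimestamp: number;
  oldRank: number;
}

interface HnPost {
  hnPostId: string;
  title: string;
  articleUrl: string | null;
  hnLink: string;
  author: string;
  points: number;
  createdAt: number;
  rank: number; // 1 = top story
  isUpdateStatus: boolean;
  lastCommentFetchTimestamp?: number;
  oldRank?: number;
}

// Takes the first `count` hits (HN_POSTS_COUNT) and flags repeats.
function toHnPosts(
  hits: AlgoliaHit[],
  count: number,
  processed: { [hnPostId: string]: ProcessedPostRecord | undefined }
): HnPost[] {
  return hits.slice(0, count).map((hit, index) => {
    const previous = processed[hit.objectID];
    const post: HnPost = {
      hnPostId: hit.objectID,
      title: hit.title === null ? "" : hit.title,
      articleUrl: hit.url,
      hnLink: "https://news.ycombinator.com/item?id=" + hit.objectID,
      author: hit.author,
      points: hit.points === null ? 0 : hit.points,
      createdAt: hit.created_at_i,
      rank: index + 1,
      // A repeat only counts as an update if its article was scraped before.
      isUpdateStatus: previous !== undefined && previous.articleScraped,
    };
    if (previous !== undefined) {
      post.lastCommentFetchTimestamp = previous.lastCommentFetchTimestamp;
      post.oldRank = previous.oldRank;
    }
    return post;
  });
}
```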
|
||||
|
||||
**Story 2.3: Article Content Scraping & Extraction (Conditional)**
|
||||
* **User Story Statement:** As the System, for each Hacker News post identified as *new* (or for which article scraping previously failed), I need to robustly fetch its HTML content from the linked article URL and extract the primary textual content and title. If scraping fails, a fallback mechanism must be triggered using available HN metadata and signaling the need for increased comments.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. An `articleScraperService.ts` function **must** accept an article URL and `isUpdateStatus`.
|
||||
2. If `isUpdateStatus` is true (article already scraped successfully), scraping **must** be skipped and the function **must** return a success status indicating that no new scrape was needed (or a reference to the pre-existing content).
|
||||
3. If new scrape needed: use `axios` (timeout, User-Agent) to fetch HTML.
|
||||
4. Use `Mozilla Readability` (JS port) or similar to extract main article text and title.
|
||||
5. Return `{ success: true, title: string, content: string, ... }` on success.
|
||||
6. If scraping fails: log failure, return `{ success: false, error: string, fallbackNeeded: true }`.
|
||||
7. No "polite" inter-article scraping delays for MVP.
|
||||
8. Unit tests (Jest) **must** mock `axios`, test successful extraction, skip logic, and failure/fallback scenarios. All tests **must** pass.
|
||||
|
||||
**Story 2.4: Fetch Hacker News Comments (Conditional Logic)**
|
||||
* **User Story Statement:** As the System, I need to fetch comments for each selected Hacker News post using the Algolia HN API, adjusting the strategy based on whether the post is new, a repeat (requiring only new comments), or if its article scraping failed (requiring more comments).
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. `hackerNewsService.ts` **must** be extended to fetch comments for an HN Post ID, accepting `isUpdateStatus`, `lastCommentFetchTimestamp`, and `articleScrapingFailed` flags.
|
||||
2. Use `axios` to call Algolia HN API item endpoint (`/api/v1/items/[POST_ID]`).
|
||||
3. **Comment Fetching Logic:**
|
||||
* If `articleScrapingFailed`: Fetch up to 3x `HN_COMMENTS_COUNT_PER_POST` (configurable, e.g., ~150) available comments (top-level first, then immediate children, by Algolia sort order).
|
||||
* If `isUpdateStatus` (repeat post): Fetch all comments, then filter client-side for comments with `created_at_i` > `lastCommentFetchTimestamp`. Select up to `HN_COMMENTS_COUNT_PER_POST` of these *new* comments.
|
||||
* Else (new post, successful scrape): Fetch up to `HN_COMMENTS_COUNT_PER_POST` (configurable, e.g., ~50) available comments.
|
||||
4. Extract plain text (HTML stripped), author, creation timestamp for selected comments.
|
||||
5. Return array of comment objects; empty if none.
|
||||
6. Error handling and logging for API calls.
|
||||
7. Unit tests (Jest) **must** mock `axios` and verify all conditional fetching logic, comment selection/filtering, data extraction (HTML stripping), and error handling. All tests **must** pass.
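The conditional logic in AC 3 can be sketched as a pure function that runs on the already-fetched item tree. `AlgoliaItem` lists only the fields of the `/api/v1/items/[POST_ID]` response this sketch uses. `CommentOptions`, `HnComment`, `selectComments`, and the regex-based `stripHtml` are assumptions, and the HTML stripping is deliberately naive rather than a full parser.

```typescript
// Subset of one node in the Algolia items response (the story or a comment).
interface AlgoliaItem {
  id: number;
  author: string | null;
  text: string | null; // HTML; null for deleted comments
  created_at_i: number; // Unix seconds
  children: AlgoliaItem[];
}

interface CommentOptions {
  isUpdateStatus: boolean;
  articleScrapingFailed: boolean;
  lastCommentFetchTimestamp?: number;
  commentsPerPost: number; // HN_COMMENTS_COUNT_PER_POST
}

interface HnComment {
  author: string;
  text: string;
  createdAt: number;
}

// Naive HTML-to-text conversion covering tags and a few common entities.
function stripHtml(html: string): string {
  return html
    .replace(/<[^>]*>/g, " ")
    .replace(/&#x27;/g, "'")
    .replace(/&#x2F;/g, "/")
    .replace(/&quot;/g, '"')
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&")
    .replace(/\s+/g, " ")
    .trim();
}

function selectComments(story: AlgoliaItem, options: CommentOptions): HnComment[] {
  // Top-level comments first, then their immediate children, in API order.
  const replies: AlgoliaItem[] = [];
  for (const comment of story.children) {
    for (const reply of comment.children) {
      replies.push(reply);
    }
  }
  let candidates = story.children
    .concat(replies)
    .filter((c) => c.text !== null && c.text !== "");

  let limit = options.commentsPerPost;
  if (options.articleScrapingFailed) {
    limit = 3 * options.commentsPerPost; // more comments replace the article
  } else if (options.isUpdateStatus) {
    const since = options.lastCommentFetchTimestamp ?? 0;
    candidates = candidates.filter((c) => c.created_at_i > since);
  }

  return candidates.slice(0, limit).map((c) => ({
    author: c.author === null ? "unknown" : c.author,
    text: stripHtml(c.text === null ? "" : c.text),
    createdAt: c.created_at_i,
  }));
}
```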
|
||||
|
||||
**Story 2.5: Content Aggregation and Formatting for Play.ai**
|
||||
* **User Story Statement:** As the System, I need to aggregate the collected Hacker News post data (titles), associated article content (full, truncated, or fallback summary), and comments (new, updated, or extended sets) for all top stories, and format this combined text according to the specified structure for the play.ai PlayNote API, including special phrasing for different post types (new, update, scrape-failed), so that it's ready for podcast generation.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `contentFormatterService.ts` **must** be implemented.
|
||||
2. It **must** accept an array of processed HN post objects (containing all necessary metadata, statuses, content, and comments).
|
||||
3. Output **must** be a single string.
|
||||
4. String **must** start: "It's a top 10 countdown for [Current Date]".
|
||||
5. Posts **must** be sequenced in reverse rank order.
|
||||
6. **Formatting for standard new post:** "Story [Rank] - [Article Title]. [Full/Truncated Article Text]. Comments Section. [Number] comments follow. Comment 1: [Text]. Comment 2: [Text]..."
|
||||
7. **Formatting for repeat post (update):** "Story [Rank] (previously Rank [OldRank] yesterday) - [Article Title]. We're bringing you new comments on this popular story. Comments Section. [Number] new comments follow. Comment 1: [Text]..."
|
||||
8. **Formatting for scrape-failed post:** "Story [Rank] - [Article Title]. We couldn't retrieve the full article, but here's a summary if available and the latest comments. [Optional HN Summary]. Comments Section. [Number] comments follow. Comment 1: [Text]..."
|
||||
9. **Article Truncation:** If `MAX_ARTICLE_LENGTH` (env var) is set and the article exceeds it, truncate while attempting to preserve the beginning and end (for MVP, simply keeping the first X and last Y characters is acceptable).
|
||||
10. Graceful handling for missing parts (e.g., "Article content not available," "0 comments follow") **must** be implemented.
|
||||
11. Unit tests (Jest) **must** verify all formatting variations, truncation, data merging, and error handling. All tests **must** pass.
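A minimal sketch of the three formatting variants and the reverse-rank ordering from the ACs above. The `DigestPost` shape, the `kind` discriminator, and the function names are assumptions. The sketch also assumes that article and comment text carry their own closing punctuation, and that sections are joined with blank lines, which the ACs do not specify.

```typescript
type PostKind = "new" | "update" | "scrapeFailed";

// Hypothetical input shape: one fully processed HN post.
interface DigestPost {
  rank: number; // 1 = top story
  title: string;
  kind: PostKind;
  articleText?: string;
  hnSummary?: string;
  oldRank?: number;
  comments: string[];
}

function formatComments(comments: string[], label: string): string {
  const header = "Comments Section. " + comments.length + " " + label + " follow.";
  const lines = comments.map((text, i) => "Comment " + (i + 1) + ": " + text);
  return [header].concat(lines).join(" ");
}

function formatPost(post: DigestPost): string {
  const heading = "Story " + post.rank;
  if (post.kind === "update") {
    const previous =
      post.oldRank !== undefined
        ? " (previously Rank " + post.oldRank + " yesterday)"
        : "";
    return (
      heading + previous + " - " + post.title +
      ". We're bringing you new comments on this popular story. " +
      formatComments(post.comments, "new comments")
    );
  }
  if (post.kind === "scrapeFailed") {
    const summary = post.hnSummary ? post.hnSummary + " " : "";
    return (
      heading + " - " + post.title +
      ". We couldn't retrieve the full article, but here's a summary if available and the latest comments. " +
      summary + formatComments(post.comments, "comments")
    );
  }
  // Standard new post; fall back gracefully when the article text is missing.
  const article = post.articleText ? post.articleText : "Article content not available.";
  return (
    heading + " - " + post.title + ". " + article + " " +
    formatComments(post.comments, "comments")
  );
}

// Produces the single string submitted to the PlayNote API.
function formatDigest(posts: DigestPost[], dateLabel: string): string {
  const countdown = posts.slice().sort((a, b) => b.rank - a.rank);
  const intro = "It's a top 10 countdown for " + dateLabel + ".";
  return [intro].concat(countdown.map(formatPost)).join("\n\n");
}
```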
|
||||
|
||||
**Story 2.6 (REVISED): Implement Podcast Generation Status Polling via Play.ai API**
|
||||
* **User Story Statement:** As the System, after submitting a podcast generation job to Play.ai and receiving a `jobId`, I need an AWS Step Function state machine to periodically poll the Play.ai API for the status of this specific job, continuing until the job is reported as "completed" or "failed" (or a configurable max duration/attempts limit is reached), so the system can reliably determine when the podcast audio is ready or if an error occurred.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The AWS Step Function state machine (CDK defined in Story 2.1) **must** manage the polling workflow.
|
||||
2. Input: `jobId`.
|
||||
3. States: Invoke Poller Lambda (calls Play.ai status endpoint with `axios`), Wait (configurable `POLLING_INTERVAL_MINUTES`), Choice (evaluates status).
|
||||
4. Loop if status is "processing".
|
||||
5. Stop if "completed" or "failed".
|
||||
6. Max polling duration/attempts (configurable env vars `MAX_POLLING_DURATION_MINUTES`, `MAX_POLLING_ATTEMPTS`) **must** be enforced, treating expiry as failure.
|
||||
7. If "completed": extract `audioUrl`, trigger next step (Story 2.8 process) with data.
|
||||
8. If "failed"/"timeout": log event, record failure, terminate.
|
||||
9. Poller Lambda handles Play.ai API errors gracefully.
|
||||
10. Unit tests for Poller Lambda; Step Function definition tested. All tests **must** pass.
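The decision made by the Choice state (ACs 4 to 8) can be sketched as a pure function, which is also convenient to unit test. The status strings come from the story above. `PollState`, `PollDecision`, and `decideNextStep` are hypothetical names, and treating an unrecognized status as a failure is an assumption.

```typescript
// Input to the decision: the latest poll result plus the attempt counter.
interface PollState {
  attempt: number; // 1-based count of polls performed so far
  maxAttempts: number; // MAX_POLLING_ATTEMPTS
  jobStatus: string; // status string reported by Play.ai
  audioUrl?: string;
}

type PollDecision =
  | { action: "wait"; nextAttempt: number }
  | { action: "store"; audioUrl: string }
  | { action: "fail"; reason: string };

function decideNextStep(state: PollState): PollDecision {
  if (state.jobStatus === "completed") {
    if (state.audioUrl === undefined || state.audioUrl === "") {
      return { action: "fail", reason: "completed without audioUrl" };
    }
    return { action: "store", audioUrl: state.audioUrl };
  }
  if (state.jobStatus === "failed") {
    return { action: "fail", reason: "job failed" };
  }
  if (state.jobStatus === "processing") {
    // Expiry of the attempt budget is treated as a failure (AC 6).
    if (state.attempt >= state.maxAttempts) {
      return { action: "fail", reason: "timeout" };
    }
    return { action: "wait", nextAttempt: state.attempt + 1 };
  }
  // Assumption: any status outside the three known values is a failure.
  return { action: "fail", reason: "unexpected status: " + state.jobStatus };
}
```

In a Step Functions definition, the `wait` outcome would route to the Wait state, `store` to the Story 2.8 step, and `fail` to a failure-recording step.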
|
||||
|
||||
**Story 2.7: Submit Content to Play.ai PlayNote API & Initiate Podcast Generation**
|
||||
* **User Story Statement:** As the System, I need to securely submit the aggregated and formatted text content (using `sourceText`) to the play.ai PlayNote API via an `application/json` request to initiate the podcast generation process, and I must capture the `jobId` returned by Play.ai, so that this `jobId` can be passed to the status polling mechanism (Step Function).
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `playAiService.ts` function **must** handle submission.
|
||||
2. Input: formatted text (from Story 2.5).
|
||||
3. Use `axios` for `POST` to Play.ai endpoint (e.g., `https://api.play.ai/api/v1/playnotes`).
|
||||
4. Request `Content-Type: application/json`.
|
||||
5. JSON body: `sourceText`, and configurable `title`, `voiceId1`, `name1`, `voiceId2`, `name2`, `styleGuidance` (from env vars). *Developer to confirm exact field names per Play.ai JSON API for text input.*
|
||||
6. Headers: `Authorization: Bearer [TOKEN]`, `X-USER-ID: [USER_ID]` (from env vars `PLAY_AI_BEARER_TOKEN` and `PLAY_AI_USER_ID`, respectively).
|
||||
7. No `webHookUrl` sent.
|
||||
8. On success: extract `jobId`, log it, initiate polling Step Function (Story 2.6) with `jobId`.
|
||||
9. Error handling for API submission.
|
||||
10. Unit tests (Jest) mock `axios`, verify API call, auth, payload, `jobId` extraction, Step Function initiation, error handling. All tests **must** pass.
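A sketch of how the request described in ACs 3 to 7 might be assembled before it is handed to `axios`. The endpoint URL and body field names are copied from the ACs above and, as AC 5 notes, still need to be confirmed against the Play.ai API. `PlayNoteSettings`, `PlayNoteRequest`, and `buildPlayNoteRequest` are hypothetical names.

```typescript
// Values that would be read from environment variables.
interface PlayNoteSettings {
  bearerToken: string; // PLAY_AI_BEARER_TOKEN
  userId: string; // PLAY_AI_USER_ID
  title: string;
  voiceId1: string;
  name1: string;
  voiceId2: string;
  name2: string;
  styleGuidance: string;
}

interface PlayNoteRequest {
  url: string;
  method: "POST";
  headers: { [header: string]: string };
  body: string;
}

// Builds the request without sending it, so the payload is easy to test.
function buildPlayNoteRequest(
  sourceText: string,
  settings: PlayNoteSettings
): PlayNoteRequest {
  if (sourceText.trim() === "") {
    throw new Error("sourceText must not be empty");
  }
  return {
    url: "https://api.play.ai/api/v1/playnotes", // example endpoint from AC 3
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer " + settings.bearerToken,
      "X-USER-ID": settings.userId,
    },
    // Field names follow AC 5 and are unconfirmed; no webHookUrl is sent (AC 7).
    body: JSON.stringify({
      sourceText: sourceText,
      title: settings.title,
      voiceId1: settings.voiceId1,
      name1: settings.name1,
      voiceId2: settings.voiceId2,
      name2: settings.name2,
      styleGuidance: settings.styleGuidance,
    }),
  };
}
```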
|
||||
|
||||
**Story 2.8: Retrieve, Store Generated Podcast Audio & Persist Episode Metadata**
|
||||
* **User Story Statement:** As the System, once the podcast generation status polling (Story 2.6) indicates a Play.ai job is "completed," I need to download the generated audio file from the provided `audioUrl`, store this file in our designated S3 bucket, and then save all relevant metadata for the episode (including the S3 audio location, `episodeNumber`, `podcastGeneratedTitle`, `playAiSourceAudioUrl`, and source information like `isUpdateStatus` for each HN story, and `lastCommentFetchTimestamp` for each HN post) into our DynamoDB table, so that the daily digest is fully processed, archived, and ready for access.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `podcastStorageService.ts` function **must** be triggered by Step Function (Story 2.6) on "completed" status, receiving `audioUrl`, Play.ai `jobId`, and original context (list of source HN posts with their processing statuses & metadata).
|
||||
2. Use `axios` to download audio from `audioUrl`.
|
||||
3. Upload audio to S3 bucket (from Story 2.1), using key (e.g., `YYYY/MM/DD/episodeId.mp3`).
|
||||
4. Prepare episode metadata: `episodeId` (UUID), `publicationDate` (YYYY-MM-DD), `episodeNumber` (sequential logic), `podcastGeneratedTitle` (from Play.ai or constructed), `audioS3Bucket`, `audioS3Key`, `playAiJobId`, `playAiSourceAudioUrl`, `sourceHNPosts` (array of objects: `{ hnPostId, title, originalArticleUrl, hnLink, isUpdateStatus, lastCommentFetchTimestamp, oldRank }`), `status: "Published"`, `createdAt`.
|
||||
5. The `lastCommentFetchTimestamp` for each processed `hnPostId` in `sourceHNPosts` **must** be set to the current time (or comment processing time) for future "new comments only" logic.
|
||||
6. Save metadata to `BmadDailyDigestEpisodes` DynamoDB table.
|
||||
7. Error handling for download, S3 upload, DDB write; failure sets episode status to "Failed".
|
||||
8. Unit tests (Jest) mock `axios`, AWS SDK (S3, DynamoDB); verify data handling, storage, metadata, errors. All tests **must** pass.
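The S3 key and metadata preparation in ACs 3 to 5 can be sketched as pure functions, leaving the download, upload, and DynamoDB write to the surrounding service. `EpisodeInput`, `buildAudioKey`, and `buildEpisodeMetadata` are hypothetical names. The sketch assumes `episodeNumber` is the previous number plus one, that `lastCommentFetchTimestamp` is stored in Unix seconds (to match Algolia's `created_at_i`), and that `createdAt` is an ISO 8601 string.

```typescript
interface SourceHnPost {
  hnPostId: string;
  title: string;
  originalArticleUrl: string | null;
  hnLink: string;
  isUpdateStatus: boolean;
  lastCommentFetchTimestamp: number;
  oldRank?: number;
}

interface EpisodeMetadata {
  episodeId: string;
  publicationDate: string;
  episodeNumber: number;
  podcastGeneratedTitle: string;
  audioS3Bucket: string;
  audioS3Key: string;
  playAiJobId: string;
  playAiSourceAudioUrl: string;
  sourceHNPosts: SourceHnPost[];
  status: "Published" | "Failed";
  createdAt: string;
}

// Hypothetical input gathered by the caller before the metadata is built.
interface EpisodeInput {
  episodeId: string; // UUID generated by the caller
  publicationDate: string; // YYYY-MM-DD
  previousEpisodeNumber: number; // 0 when no episode exists yet
  title: string;
  bucket: string;
  jobId: string;
  audioUrl: string;
  posts: SourceHnPost[];
  now: Date;
}

// "2025-05-20" + "abc" -> "2025/05/20/abc.mp3"
function buildAudioKey(publicationDate: string, episodeId: string): string {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(publicationDate)) {
    throw new Error("publicationDate must be YYYY-MM-DD, got: " + publicationDate);
  }
  return publicationDate.replace(/-/g, "/") + "/" + episodeId + ".mp3";
}

function buildEpisodeMetadata(input: EpisodeInput): EpisodeMetadata {
  const fetchedAt = Math.floor(input.now.getTime() / 1000);
  return {
    episodeId: input.episodeId,
    publicationDate: input.publicationDate,
    episodeNumber: input.previousEpisodeNumber + 1,
    podcastGeneratedTitle: input.title,
    audioS3Bucket: input.bucket,
    audioS3Key: buildAudioKey(input.publicationDate, input.episodeId),
    playAiJobId: input.jobId,
    playAiSourceAudioUrl: input.audioUrl,
    // AC 5: stamp every post so the next run fetches only newer comments.
    sourceHNPosts: input.posts.map((post) => ({
      ...post,
      lastCommentFetchTimestamp: fetchedAt,
    })),
    status: "Published",
    createdAt: input.now.toISOString(),
  };
}
```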
|
||||
|
||||
**Story 2.9: Daily Workflow Orchestration & Scheduling**
|
||||
* **User Story Statement:** As the System Administrator, I need the entire daily backend workflow (Stories 2.2 through 2.8) to be fully orchestrated by the primary AWS Step Function state machine and automatically scheduled to run once per day using Amazon EventBridge Scheduler, ensuring it handles re-runs for the same day by overwriting/starting over (for MVP), so that "BMad Daily Digest" episodes are produced consistently and reliably.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The primary AWS Step Function state machine **must** orchestrate the sequence: Call Lambdas/services for Fetch HN Posts (2.2), then for each post: Scrape Article (2.3) & Fetch Comments (2.4); then Aggregate & Format (2.5); then Submit to Play.ai (2.7 which returns `jobId`); then initiate Polling (2.6 using `jobId`); on "completed" polling, trigger Retrieve & Store Audio/Metadata (2.8).
|
||||
2. State machine **must** manage data flow between steps.
|
||||
3. Overall workflow error handling: critical step failure marks state machine execution as "Failed" and logs comprehensively. Individual steps use retries for transient errors.
|
||||
4. **Idempotency (MVP):** Re-running for the same `publicationDate` **must** re-process and overwrite previous data for that date.
|
||||
5. Amazon EventBridge Scheduler rule (CDK defined) **must** trigger the main Step Function daily at 12:00 UTC (default, configurable via `DAILY_JOB_SCHEDULE_UTC_CRON`).
|
||||
6. Successful end-to-end run **must** be demonstrated.
|
||||
7. Step Function execution history **must** provide a clear audit trail.
|
||||
8. Unit tests **must** exist for any new orchestrator-specific Lambda functions. All tests **must** pass.
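As a rough sketch of the configurable schedule in AC 5, the CDK code might resolve the schedule expression from the environment before creating the scheduler rule. The helper name `resolveDailySchedule` is hypothetical, and the assumption that `DAILY_JOB_SCHEDULE_UTC_CRON` holds a full EventBridge `cron(...)` expression is not confirmed by this document; the default shown corresponds to 12:00 UTC daily.

```typescript
// Default: every day at 12:00 UTC, in EventBridge cron syntax
// (minutes hours day-of-month month day-of-week year).
const DEFAULT_DAILY_SCHEDULE = "cron(0 12 * * ? *)";

// Returns the schedule expression to use. The env object is passed in
// (e.g. process.env) so the function stays easy to unit test.
// Assumption: the variable, when set, contains a complete cron(...) expression.
function resolveDailySchedule(env: Record<string, string | undefined>): string {
  const raw = env["DAILY_JOB_SCHEDULE_UTC_CRON"];
  const value = raw === undefined ? "" : raw.trim();
  return value.length > 0 ? value : DEFAULT_DAILY_SCHEDULE;
}
```

An unset or blank variable falls back to the default, so a missing configuration value does not silently disable the daily run.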
|
||||
|
||||
---
|
||||
**Epic 3: Web Application MVP & Podcast Consumption**
|
||||
|
||||
**Goal:** To set up the frontend project in its dedicated repository and develop and deploy the Next.js frontend application MVP, enabling users to consume the "BMad Daily Digest." This includes initial project setup (AI-assisted UI kickstart), pages for listing and detailing episodes, an about page, and deployment.
|
||||
|
||||
**User Stories for Epic 3:**
|
||||
|
||||
**Story 3.1: Frontend Project Repository & Initial UI Setup (AI-Assisted)**
|
||||
* **User Story Statement:** As a Developer, I need to establish the `bmad-daily-digest-frontend` Git repository with a new Next.js (TypeScript, Node.js 22) project. This foundational setup must include essential development tooling (ESLint, Prettier, Jest with React Testing Library, a basic CI stub), and integrate an initial UI structure and core presentational components—kickstarted by an AI UI generation tool—styled with Tailwind CSS and shadcn/ui to embody the "80s retro CRT terminal" aesthetic, so that a high-quality, styled, and standardized frontend development environment is ready for building application features.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A new, private Git repository `bmad-daily-digest-frontend` **must** be created on GitHub.
|
||||
2. A Next.js project (TypeScript, targeting Node.js 22) **must** be initialized.
|
||||
3. Tooling (ESLint for Next.js/React/TS, Prettier, Jest with React Testing Library) **must** be installed and configured.
|
||||
4. NPM scripts for lint, format, test, build **must** be in `package.json`.
|
||||
5. Tailwind CSS and `shadcn/ui` **must** be installed and configured.
|
||||
6. An initial UI structure (basic page layout, placeholder pages for List, Detail, About) and a few core presentational components (themed buttons, text display) **must** be generated using an agreed-upon AI UI generation tool, targeting the "80s retro CRT terminal" aesthetic.
|
||||
7. The AI-generated UI code **must** be integrated into the Next.js project.
|
||||
8. The application **must** build successfully with the initial UI.
|
||||
9. A basic CI pipeline stub (GitHub Actions) for lint, format check, test, build **must** be created.
|
||||
10. A standard Next.js `.gitignore` and an updated `README.md` **must** be present.
|
||||
|
||||
**Story 3.2: Frontend API Service Layer for Backend Communication**
|
||||
* **User Story Statement:** As a Frontend Developer, I need a dedicated and well-typed API service layer within the Next.js frontend application to manage all HTTP communication with the "BMad Daily Digest" backend API (for fetching episode lists and specific episode details), so that UI components can cleanly and securely consume backend data with robust error handling.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A TypeScript module (e.g., `src/services/apiClient.ts`) **must** encapsulate backend API interactions.
|
||||
2. Functions **must** exist for: fetching all podcast episodes (e.g., `getEpisodes()`) and fetching details for a single episode by `episodeId` (e.g., `getEpisodeDetails(episodeId)`).
|
||||
3. `axios` **must** be used for HTTP requests.
|
||||
4. Backend API base URL **must** be configurable via `NEXT_PUBLIC_BACKEND_API_URL`.
|
||||
5. TypeScript interfaces (e.g., `EpisodeListItem`, `EpisodeDetail`, `SourceHNStory`) for API response data **must** be defined, matching data structure from backend (Story 2.8).
|
||||
6. API functions **must** correctly parse JSON responses and transform data into defined frontend interfaces.
|
||||
7. Error handling (network errors, non-2xx responses) **must** be implemented, providing clear error information.
|
||||
8. Unit tests (Jest) **must** mock `axios` and verify API calls, data parsing/transformation, and error handling. All tests **must** pass.
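The parsing and transformation step in ACs 5 and 6 might look something like the sketch below. It covers only the validation of an already-decoded JSON payload; the `axios` call itself is omitted so the example has no dependencies. The `EpisodeListItem` fields are an assumption based on what the list page displays, and the helper names `parseEpisodeListItem` and `parseEpisodeList` are hypothetical.

```typescript
// Assumed shape of one entry in the episode list response.
interface EpisodeListItem {
  episodeId: string;
  publicationDate: string; // YYYY-MM-DD
  episodeNumber: number;
  podcastGeneratedTitle: string;
}

// Narrowing helper: true for plain (non-array, non-null) objects.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Validates one raw entry and copies only the known fields,
// so unexpected backend fields never leak into UI components.
function parseEpisodeListItem(raw: unknown): EpisodeListItem {
  if (!isRecord(raw)) {
    throw new Error("Episode entry is not an object");
  }
  const { episodeId, publicationDate, episodeNumber, podcastGeneratedTitle } = raw;
  if (
    typeof episodeId !== "string" ||
    typeof publicationDate !== "string" ||
    typeof episodeNumber !== "number" ||
    typeof podcastGeneratedTitle !== "string"
  ) {
    throw new Error("Episode entry has missing or mistyped fields");
  }
  return { episodeId, publicationDate, episodeNumber, podcastGeneratedTitle };
}

// Validates the whole response body (expected to be a JSON array).
function parseEpisodeList(raw: unknown): EpisodeListItem[] {
  if (!Array.isArray(raw)) {
    throw new Error("Episode list response is not an array");
  }
  return raw.map((entry) => parseEpisodeListItem(entry));
}
```

In this arrangement, `getEpisodes()` would pass the response body to `parseEpisodeList` and let any thrown error flow into the error handling required by AC 7.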
|
||||
|
||||
**Story 3.3: Episode List Page Implementation**
|
||||
* **User Story Statement:** As a Busy Tech Executive, I want to view a responsive "Episode List Page" that clearly displays all available "BMad Daily Digest" episodes in reverse chronological order, showing the episode number, publication date, and podcast title for each, so that I can quickly find and select the latest or a specific past episode to listen to.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A Next.js page component (e.g., `src/app/page.tsx` or `src/app/episodes/page.tsx`) **must** be created.
|
||||
2. It **must** use the API service layer (Story 3.2) to fetch episodes.
|
||||
3. A themed loading state **must** be shown during data fetching.
|
||||
4. An error message **must** be shown if fetching fails.
|
||||
5. A "No episodes available yet" message **must** be shown for an empty list.
|
||||
6. Episodes **must** be listed in reverse chronological order (newest first), based on `publicationDate` or `episodeNumber`.
|
||||
7. Each list item **must** display "Episode [EpisodeNumber]: [PublicationDate] - [PodcastGeneratedTitle]".
|
||||
8. Each item **must** link to the Episode Detail Page for that episode using its `episodeId`.
|
||||
9. Styling **must** adhere to the "80s retro CRT terminal" aesthetic.
|
||||
10. The page **must** be responsive.
|
||||
11. Unit/integration tests (Jest with RTL) **must** cover loading, error, no episodes, list rendering, data display, order, and navigation. All tests **must** pass.
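The ordering and label rules in ACs 6 and 7 can be illustrated with two small pure functions. This is only a sketch: the `EpisodeListItem` shape and the helper names `sortNewestFirst` and `formatEpisodeLabel` are assumptions, and using `episodeNumber` as a tie-breaker for equal dates is one possible reading of "based on `publicationDate` or `episodeNumber`".

```typescript
// Assumed shape of one episode as returned by the API service layer.
interface EpisodeListItem {
  episodeId: string;
  publicationDate: string; // YYYY-MM-DD, so string comparison matches date order
  episodeNumber: number;
  podcastGeneratedTitle: string;
}

// Returns a new array sorted newest first; the input is not mutated.
// Ties on publicationDate fall back to the higher episodeNumber first.
function sortNewestFirst(episodes: EpisodeListItem[]): EpisodeListItem[] {
  return episodes.slice().sort((a, b) => {
    if (a.publicationDate !== b.publicationDate) {
      return a.publicationDate < b.publicationDate ? 1 : -1;
    }
    return b.episodeNumber - a.episodeNumber;
  });
}

// Produces "Episode [EpisodeNumber]: [PublicationDate] - [PodcastGeneratedTitle]".
function formatEpisodeLabel(episode: EpisodeListItem): string {
  return (
    "Episode " +
    episode.episodeNumber +
    ": " +
    episode.publicationDate +
    " - " +
    episode.podcastGeneratedTitle
  );
}
```

Keeping these out of the React component means the order and display-format cases in AC 11 can be tested without rendering anything.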
|
||||
|
||||
**Story 3.4: Episode Detail Page Implementation**
|
||||
* **User Story Statement:** As a Busy Tech Executive, after selecting a podcast episode from the list, I want to be taken to a responsive "Episode Detail Page" where I can easily play the audio using a standard HTML5 player, see a clear breakdown of the Hacker News stories discussed in that episode, and have direct links to explore both the original articles and the Hacker News comment threads, so I can listen to the digest and dive deeper into topics of interest.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A Next.js dynamic route page (e.g., `src/app/episodes/[episodeId]/page.tsx`) **must** be created.
|
||||
2. It **must** accept `episodeId` from the URL.
|
||||
3. It **must** use the API service layer (Story 3.2) to fetch details for the `episodeId`.
|
||||
4. Loading and error states **must** be handled.
|
||||
5. If data is found, it **must** display: `podcastGeneratedTitle`, `publicationDate`, `episodeNumber`.
|
||||
6. An embedded HTML5 audio player (`<audio controls>`) **must** play the podcast. The `src` **must** be the publicly accessible URL for the audio file (e.g., a CloudFront URL pointing to the S3 object).
|
||||
7. A list of included Hacker News stories (from the `sourceHNPosts` array in the episode data) **must** be displayed.
|
||||
8. For each HN story in the list: its title, a link to the `originalArticleUrl` (opening in new tab), and a link to its `hnLink` (opening in new tab) **must** be displayed.
|
||||
9. Styling **must** adhere to the "80s retro CRT terminal" aesthetic.
|
||||
10. The page **must** be responsive.
|
||||
11. Unit/integration tests (Jest with RTL) **must** cover loading, error, rendering of all details, player presence, and correct link generation. All tests **must** pass.
|
||||
|
||||
**Story 3.5: "About" Page Implementation**
|
||||
* **User Story Statement:** As a User, I want to access a minimalist, responsive, and consistently styled "About Page" that clearly explains what the "BMad Daily Digest" service is, its core purpose, and how it works at a high level, so that I can quickly understand the value and nature of the service.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A Next.js page component (e.g., `src/app/about/page.tsx`) **must** be created.
|
||||
2. It **must** display static informational content (placeholder text: "BMad Daily Digest provides a daily audio summary of top Hacker News discussions for busy tech professionals, generated using AI.").
|
||||
3. Content **must** explain: What "BMad Daily Digest" is, its purpose, and a high-level overview of generation.
|
||||
4. Styling **must** adhere to the "80s retro CRT terminal" aesthetic.
|
||||
5. The page **must** be responsive.
|
||||
6. A link to the "About Page" **must** be accessible (e.g., in site navigation/footer, specific location defined by Story 3.1's AI-generated layout).
|
||||
7. Unit tests (Jest with RTL) **must** verify rendering of static content. All tests **must** pass.
|
||||
|
||||
**Story 3.6: Frontend Deployment to S3 & CloudFront via CDK**
|
||||
* **User Story Statement:** As a Developer, I need the Next.js frontend application to be configured for static export (or an equivalent static-first deployment model) and have its AWS infrastructure (S3 for hosting, CloudFront for CDN and HTTPS) defined and managed via AWS CDK. This setup should automate the deployment of the static site, making the "BMad Daily Digest" web application publicly accessible, performant, and cost-effective.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The Next.js application **must** be configured for static export suitable for S3/CloudFront hosting.
|
||||
2. AWS CDK scripts (managed in the `bmad-daily-digest-backend` repo's CDK app for unified IaC, unless Architect advises a separate frontend CDK app) **must** define the frontend S3 bucket and CloudFront distribution.
|
||||
3. CDK stack **must** define: S3 bucket (static web hosting), CloudFront distribution (S3 origin, HTTPS via default CloudFront domain or ACM cert for custom domain if specified, caching behaviors, OAC/OAI).
|
||||
4. A `package.json` build script **must** generate the static output (e.g., to `out/`).
|
||||
5. The CDK deployment process (or related script) **must** include steps to build the Next.js app and sync static files to S3.
|
||||
6. The application **must** be accessible via its CloudFront URL.
|
||||
7. All MVP functionalities **must** be operational on the deployed site.
|
||||
8. HTTPS **must** be enforced.
|
||||
9. CDK code **must** meet project standards.
|
||||
|
||||
## 7. Key Reference Documents
|
||||
*{This section will be populated later with links to the final PRD, Architecture Document, UI/UX Specification, etc.}*
|
||||
|
||||
## 8. Out of Scope Ideas Post MVP
|
||||
* **Advanced Audio Player Functionality:** Custom controls (skip +/- 15s), playback speed adjustment, remembering playback position.
|
||||
* **User Accounts & Personalization:** User creation, email subscription management, customizable podcast hosts.
|
||||
* **Enhanced Content Delivery & Discovery:** Daily email summary, full RSS feed, full podcast transcription on website, search functionality.
|
||||
* **Expanded Content Sources:** Beyond Hacker News.
|
||||
* **Community & Feedback:** In-app feedback mechanisms.
|
||||
|
||||
## 9. Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| :------------------------------------------ | :------------ | :------ | :-------------------------------------------------------------------------- | :------------ |
|
||||
| Initial PRD draft and MVP scope definition. | May 20, 2025 | 0.1 | Created initial PRD based on Project Brief; defined Epics & detailed Stories. | John (PM) & User |
|
||||
@@ -1,431 +0,0 @@
|
||||
# BMad Daily Digest Product Requirements Document (PRD)
|
||||
|
||||
**Version:** 0.2
|
||||
**Date:** May 20, 2025
|
||||
**Author:** BMad Project Team (John - PM, Fred - Architect, Sarah - PO, User)
|
||||
|
||||
## 1. Goal, Objective and Context
|
||||
|
||||
* **Overall Goal:** To provide busy tech executives with a quick, daily audio digest of top Hacker News posts and discussions, enabling them to stay effortlessly informed.
|
||||
* **Project Objective (MVP Focus):** To successfully launch the "BMad Daily Digest" by:
|
||||
* Automating the daily fetching of top 10 Hacker News posts (article metadata and comments via Algolia HN API) and scraping of linked article content.
|
||||
* Processing this content into a structured format.
|
||||
* Generating a 2-agent audio podcast using the play.ai PlayNote API.
|
||||
* Delivering the podcast via a simple Next.js web application (polyrepo structure) with a list of episodes and detail pages including an audio player and links to source materials.
|
||||
* Operating this process daily, aiming for delivery by a consistent morning hour.
|
||||
* Adhering to a full TypeScript stack (Node.js 22 for backend), with a Next.js frontend, AWS Lambda backend, DynamoDB, S3, and AWS CDK for IaC, while aiming to stay within AWS free-tier limits where possible.
|
||||
* **Context/Problem Solved:** Busy tech executives lack the time to thoroughly read Hacker News daily but need to stay updated on key tech discussions, trends, and news for strategic insights. "BMad Daily Digest" solves this by offering a convenient, curated audio summary.
|
||||
|
||||
## 2. Functional Requirements (MVP)
|
||||
|
||||
**FR1: Content Acquisition**
|
||||
* The system **must** automatically fetch data for the top 10 (configurable) posts from Hacker News daily.
|
||||
* For each Hacker News post, the system **must** identify and retrieve:
|
||||
* The URL of the linked article.
|
||||
* Key metadata about the Hacker News post (e.g., title, HN link, score, author, HN Post ID).
|
||||
* The system **must** fetch comments for each identified Hacker News post using the Algolia HN Search API, with logic to handle new vs. repeat posts and scraping failures differently.
|
||||
* The system **must** attempt to scrape and extract the primary textual content from the linked article URL for each of the top posts (unless it's a repeat post where only new comments are needed).
|
||||
* This process should aim to isolate the main article body.
|
||||
* If scraping fails, a fallback using HN title, summary (if available), and increased comments **must** be used.
|
||||
|
||||
**FR2: Content Processing and Formatting**
|
||||
* The system **must** aggregate the extracted/fallback article content and selected comments for the top 10 posts.
|
||||
* The system **must** process and structure the aggregated text content into a single text file suitable for submission to the play.ai PlayNote API.
|
||||
* The text file **must** begin with an introductory sentence: "It's a top 10 countdown for [Today'sDate]".
|
||||
* Content **must** be structured sequentially (e.g., "Story 10 - [details]..."), with special phrasing for repeat posts or posts where article scraping failed.
|
||||
* Article content may be truncated if `MAX_ARTICLE_LENGTH` (environment variable) is set, aiming to preserve intro/conclusion where possible.
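The formatting rules in FR2 could be sketched as below. This is an illustrative reading rather than the project's implementation: the `DigestStory` shape, the function names, the `" [...] "` marker, and the head/tail split used to keep the intro and conclusion are all assumptions. The intro sentence uses the actual story count, since the number of posts is configurable. Special phrasing for repeat posts and scrape failures is omitted.

```typescript
// Hypothetical minimal shape of one story after aggregation.
interface DigestStory {
  title: string;
  body: string;
}

// Truncates long text while keeping the beginning and the end,
// approximating "preserve intro/conclusion where possible".
// An undefined or non-positive limit disables truncation.
function truncateArticle(text: string, maxLength: number | undefined): string {
  if (maxLength === undefined || maxLength <= 0 || text.length <= maxLength) {
    return text;
  }
  const marker = " [...] ";
  if (maxLength <= marker.length) {
    return text.slice(0, maxLength);
  }
  const keep = maxLength - marker.length;
  const headLength = Math.ceil(keep / 2);
  const tailLength = keep - headLength;
  return text.slice(0, headLength) + marker + text.slice(text.length - tailLength);
}

// Builds the text to submit. rankedStories is ordered from rank 1 upward,
// and the output counts down from the last rank to rank 1.
function buildDigestText(
  dateLabel: string,
  rankedStories: DigestStory[],
  maxArticleLength?: number
): string {
  const sections: string[] = [
    "It's a top " + rankedStories.length + " countdown for " + dateLabel,
  ];
  for (let i = rankedStories.length - 1; i >= 0; i--) {
    const story = rankedStories[i];
    const body = truncateArticle(story.body, maxArticleLength);
    sections.push("Story " + (i + 1) + " - " + story.title + ": " + body);
  }
  return sections.join("\n\n");
}
```

In a Lambda, `maxArticleLength` would presumably come from parsing the `MAX_ARTICLE_LENGTH` environment variable, with `undefined` passed when it is not set.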
|
||||
|
||||
**FR3: Podcast Generation**
|
||||
* The system **must** submit the formatted text content to the play.ai PlayNote API daily using specified voice and style parameters (configurable via environment variables).
|
||||
* The system **must** capture the `jobId` from Play.ai and use a polling mechanism (e.g., AWS Step Functions) to check for job completion status.
|
||||
* Upon successful completion, the system **must** retrieve the generated audio podcast file from the Play.ai-provided URL.
|
||||
* The system **must** store the generated audio file (e.g., on S3) and its associated metadata (including episode number, generated title, S3 location, original Play.ai URL, source HN posts, and processing status) in DynamoDB.
|
||||
|
||||
**FR4: Web Application Interface (MVP)**
|
||||
* The system **must** provide a web application (Next.js, "80s retro CRT terminal" theme with Tailwind CSS & shadcn/ui) with a **List Page** that:
|
||||
* Displays a reverse chronological list (newest first) of all generated "BMad Daily Digest" episodes, formatted as "Episode [EpisodeNumber]: [PublicationDate] - [PodcastTitle]".
|
||||
* Allows users to navigate to a Detail Page for each episode.
|
||||
* The system **must** provide a web application with a **Detail Page** for each episode that:
|
||||
* Displays the `podcastGeneratedTitle`, `publicationDate`, and `episodeNumber`.
|
||||
* Includes an embedded HTML5 audio player for the podcast.
|
||||
* Lists the individual Hacker News stories included, with direct links to the original source article and the Hacker News discussion page.
|
||||
* The system **must** provide a minimalist **About Page**.
|
||||
* The web application **must** be responsive.
|
||||
|
||||
**FR5: Automation and Scheduling**
|
||||
* The entire end-to-end backend process **must** be orchestrated (preferably via AWS Step Functions) and automated to run daily, triggered by Amazon EventBridge Scheduler (default 12:00 UTC, configurable).
|
||||
* For MVP, a re-run of the daily job for the same day **must** start over and overwrite any previously generated data for that day.
|
||||
|
||||
## 3. Non-Functional Requirements (MVP)
|
||||
|
||||
**a. Performance:**
|
||||
* **Podcast Generation Time:** The daily process should complete in a timely manner (e.g., target by 8 AM CST/12:00-13:00 UTC, specific completion window TBD based on Play.ai processing).
|
||||
* **Web Application Load Time:** Pages on the Next.js app should aim for fast load times (e.g., target under 3-5 seconds).
|
||||
|
||||
**b. Reliability / Availability:**
|
||||
* **Daily Process Success Rate:** Target >95% success rate for automated podcast generation without manual intervention.
|
||||
* **Web Application Uptime:** Target 99.5%+ uptime.
|
||||
|
||||
**c. Maintainability:**
|
||||
* **Code Quality:** Code **must** be well-documented. Internal code comments **should** be used when logic isn't clear from names. All functions **must** have JSDoc-style outer comments. Code **must** adhere to the defined coding standards (ESLint, Prettier).
|
||||
* **Configuration Management:** System configurations and secrets **must** be managed via environment variables (`.env` locally, Lambda environment variables when deployed), set manually for MVP.
|
||||
|
||||
**d. Usability (Web Application):**
|
||||
* The web application **must** be intuitive for busy tech executives.
|
||||
* The audio player **must** be simple and reliable.
|
||||
* Accessibility: Standard MVP considerations, with particular attention to contrast for the "glowing green on dark" theme, good keyboard navigation, and basic screen reader compatibility.
|
||||
|
||||
**e. Security (MVP Focus):**
|
||||
* **API Key Management:** Keys for Algolia, Play.ai, AWS **must** be stored securely (gitignored `.env` files locally, Lambda environment variables in AWS), not hardcoded.
|
||||
* **Data Handling:** Scraped content **must** be handled responsibly.
|
||||
|
||||
**f. Cost Efficiency:**
|
||||
* AWS service usage **must** aim to stay within free-tier limits where feasible. Play.ai usage is via existing user subscription.
|
||||
|
||||
## 4. User Interaction and Design Goals
|
||||
|
||||
**a. Overall Vision & Experience:**
|
||||
* **Look and Feel:** Dark mode UI, "glowing green ASCII/text on a black background" aesthetic (CRT terminal style), "80s retro everything" theme.
|
||||
* **UI Component Toolkit:** Tailwind CSS and shadcn/ui, customized for the theme. Initial structure/components kickstarted by an AI UI generation tool (using the `bmad-daily-digest-ui` V0 scaffold as a base).
|
||||
* **User Experience:** Highly efficient, clear navigation, no clutter, prioritizing content readability for busy tech executives.
|
||||
|
||||
**b. Key Interaction Paradigms (MVP):**
|
||||
* View list of digests (reverse chronological), select one for details. No sorting/filtering on list page for MVP.
|
||||
|
||||
**c. Core Screens/Views (MVP):**
|
||||
* **List Page:** Episodes ("Episode [N]: [Date] - [PodcastTitle]").
|
||||
* **Detail Page:** Episode details, HTML5 audio player, list of source HN stories with links to articles and HN discussions.
|
||||
* **About Page:** Minimalist, explaining the service, consistent theme.
|
||||
|
||||
**d. Accessibility Aspirations (MVP):**
|
||||
* Standard considerations: good contrast (critical for theme), keyboard navigation, basic screen reader compatibility.
|
||||
|
||||
**e. Branding Considerations (High-Level):**
|
||||
* "80s retro everything" theme is central. Logo/typeface should complement this (e.g., pixel art, retro fonts).
|
||||
|
||||
**f. Target Devices/Platforms:**
|
||||
* Responsive web application, good UX on desktop and mobile.
|
||||
|
||||
## 5. Technical Assumptions
|
||||
|
||||
**a. Core Technology Stack & Approach:**
|
||||
* **Full TypeScript Stack:** TypeScript for frontend and backend.
|
||||
* **Frontend:** Next.js (React), Node.js 22. Styling: Tailwind CSS, shadcn/ui. Hosting: Static site on AWS S3 (via CloudFront).
|
||||
* **Backend:** Node.js 22, TypeScript. HTTP Client: `axios`. Compute: AWS Lambda. Database: AWS DynamoDB.
|
||||
* **Infrastructure as Code (IaC):** All AWS infrastructure via AWS CDK.
|
||||
* **Key External Services/APIs:** Algolia HN Search API (posts/comments), Play.ai PlayNote API (audio gen, user has subscription, polling for status), Custom scraping for articles (TypeScript with Cheerio, Readability.js, potentially Puppeteer/Playwright).
|
||||
* **Automation:** Daily trigger via Amazon EventBridge Scheduler. Orchestration via AWS Step Functions.
|
||||
* **Configuration & Secrets:** Environment variables (`.env` local & gitignored, Lambda env vars).
|
||||
* **Coding Standards:** JSDoc for functions, inline comments for clarity. ESLint, Prettier.
|
||||
|
||||
**b. Repository Structure & Service Architecture:**
|
||||
* **Repository Structure:** Polyrepo (separate Git repositories for `bmad-daily-digest-frontend` and `bmad-daily-digest-backend`).
|
||||
* **High-Level Service Architecture:** Backend is serverless functions (AWS Lambda) for distinct tasks, orchestrated by Step Functions. API layer via AWS API Gateway to expose backend to frontend, secured with API Keys.
|
||||
|
||||
## 6. Epic Overview
|
||||
|
||||
This section details the Epics and their User Stories for the MVP. Architectural refinements have been incorporated.
|
||||
|
||||
**Epic 1: Backend Foundation, Tooling & "Hello World" API**
|
||||
* **Goal:** To establish the core backend project infrastructure in its dedicated repository, including robust development tooling and initial AWS CDK setup for essential services. By the end of this epic:
|
||||
1. A simple "hello world" API endpoint (AWS API Gateway + Lambda) **must** be deployed and testable via `curl`, returning a dynamic message.
|
||||
2. The backend project **must** have ESLint, Prettier, Jest (unit testing), and esbuild (TypeScript bundling) configured and operational.
|
||||
3. Basic unit tests **must** exist for the "hello world" Lambda function.
|
||||
4. Code formatting and linting checks **should** be integrated into a pre-commit hook and/or a basic CI pipeline stub.
|
||||
* **User Stories for Epic 1:**
|
||||
|
||||
**Story 1.1: Initialize Backend Project using TS-TEMPLATE-STARTER**
|
||||
* **User Story Statement:** As a Developer, I want to create the `bmad-daily-digest-backend` Git repository and initialize it using the existing `TS-TEMPLATE-STARTER`, ensuring all foundational tooling (TypeScript, Node.js 22, ESLint, Prettier, Jest, esbuild) is correctly configured and operational for this specific project, so that I have a high-quality, standardized development environment ready for application logic.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A new, private Git repository named `bmad-daily-digest-backend` **must** be created on GitHub.
|
||||
2. The contents of the `TS-TEMPLATE-STARTER` project **must** be copied/cloned into this new repository.
|
||||
3. `package.json` **must** be updated (project name, version, description).
|
||||
4. Project dependencies **must** be installable.
|
||||
5. TypeScript setup (`tsconfig.json`) **must** be verified for Node.js 22, esbuild compatibility; project **must** compile.
|
||||
6. ESLint and Prettier configurations **must** be operational; lint/format scripts **must** execute successfully.
|
||||
7. Jest configuration **must** be operational; test scripts **must** execute successfully with any starter example tests.
|
||||
8. Irrelevant generic demo code from starter **should** be removed. `index.ts`/`index.test.ts` can remain as placeholders.
|
||||
9. A standard `.gitignore` and an updated project `README.md` **must** be present.
|
||||
|
||||
**Story 1.2: Pre-commit Hook Implementation**
|
||||
* **User Story Statement:** As a Developer, I want pre-commit hooks automatically enforced in the `bmad-daily-digest-backend` repository, so that code quality standards (like linting and formatting) are checked and applied to staged files before any code is committed, thereby maintaining codebase consistency and reducing trivial errors.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A pre-commit hook tool (e.g., Husky) **must** be installed and configured.
|
||||
2. A tool for running linters/formatters on staged files (e.g., `lint-staged`) **must** be installed and configured.
|
||||
3. Pre-commit hook **must** trigger `lint-staged` on staged `.ts` files.
|
||||
4. `lint-staged` **must** be configured to run ESLint (`--fix`) and Prettier (`--write`).
|
||||
5. Attempting to commit files with auto-fixable issues **must** result in fixes applied and successful commit.
|
||||
6. Attempting to commit files with non-auto-fixable linting errors **must** abort the commit with error messages.
|
||||
7. Committing clean files **must** proceed without issues.
|
||||
|
||||
**Story 1.3: "Hello World" Lambda Function Implementation & Unit Tests**
|
||||
* **User Story Statement:** As a Developer, I need a simple "Hello World" AWS Lambda function implemented in TypeScript within the `bmad-daily-digest-backend` project. This function, when invoked, should return a dynamic greeting message including the current date and time, and it must be accompanied by comprehensive Jest unit tests, so that our basic serverless compute functionality, testing setup, and TypeScript bundling are validated.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `src/features/publicApi/statusHandler.ts` file (or similar according to final backend structure) **must** contain the Lambda handler.
|
||||
2. Handler **must** be AWS Lambda compatible (event, context, Promise response).
|
||||
3. Successful execution **must** return JSON: `statusCode: 200`, body with `message: "Hello from BMad Daily Digest Backend, today is [current_date] at [current_time]."`.
|
||||
4. Date and time in message **must** be dynamic.
|
||||
5. A corresponding Jest unit test file (e.g., `src/features/publicApi/statusHandler.test.ts`) **must** be created.
|
||||
6. Unit tests **must** verify: 200 status, valid JSON body, expected `message` field, "Hello from..." prefix, dynamic date/time portion (use mocked `Date`).
|
||||
7. All unit tests **must** pass.
|
||||
8. esbuild configuration **must** correctly bundle the handler.
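One way to satisfy ACs 3, 4, and 6 is to keep the message construction in a pure function that takes the current time as an argument, as in this sketch. The names `buildGreeting`, `buildStatusResponse`, and `StatusResponse` are hypothetical, and formatting the date and time in UTC from the ISO string is an assumption; the story does not fix a format.

```typescript
// Minimal response shape compatible with an API Gateway proxy integration.
interface StatusResponse {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

// Builds the greeting from an injected Date so tests can pass a fixed value.
// Date and time are taken from the UTC ISO string (YYYY-MM-DDTHH:mm:ss.sssZ).
function buildGreeting(now: Date): string {
  const iso = now.toISOString();
  const date = iso.slice(0, 10);
  const time = iso.slice(11, 19);
  return "Hello from BMad Daily Digest Backend, today is " + date + " at " + time + ".";
}

// Wraps the greeting in a 200 JSON response.
// The actual Lambda handler would be an async function that returns
// buildStatusResponse(new Date()).
function buildStatusResponse(now: Date): StatusResponse {
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: buildGreeting(now) }),
  };
}
```

With this split, the unit tests in AC 6 can pass a fixed `Date` directly instead of mocking the global `Date`, although either approach works.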
|
||||
|
||||
**Story 1.4: AWS CDK Setup for "Hello World" API (Lambda & API Gateway)**
|
||||
* **User Story Statement:** As a Developer, I want to define the necessary AWS infrastructure (Lambda function and API Gateway endpoint) for the "Hello World" service using AWS CDK (Cloud Development Kit) in TypeScript, so that the infrastructure is version-controlled, repeatable, and can be deployed programmatically.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. AWS CDK (v2) **must** be a development dependency.
|
||||
2. CDK app structure **must** be initialized (e.g., in `cdk/`).
|
||||
3. A new CDK stack (e.g., `BmadDailyDigestBackendStack`) **must** be defined in TypeScript.
|
||||
4. CDK stack **must** define an AWS Lambda resource for the "Hello World" function (Node.js 22, bundled code reference, handler entry point, basic IAM role for CloudWatch logs, free-tier conscious settings).
|
||||
5. CDK stack **must** define an AWS API Gateway (HTTP API preferred) with a route (e.g., `GET /status` or `GET /hello`) triggering the Lambda, secured with the "Frontend Read API Key".
|
||||
6. CDK stack **must** be synthesizable (`cdk synth`) without errors.
|
||||
7. CDK code **must** adhere to project ESLint/Prettier standards.
|
||||
8. Mechanism for passing Lambda environment variables via CDK **must** be in place.
|
||||
|
||||
**Story 1.5: "Hello World" API Deployment & Manual Invocation Test**
|
||||
* **User Story Statement:** As a Developer, I need to deploy the "Hello World" API (defined in AWS CDK) to an AWS environment and successfully invoke its endpoint using a tool like `curl` (including the API Key), so that I can verify the end-to-end deployment process and confirm the basic API is operational in the cloud.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The AWS CDK stack for "Hello World" API **must** deploy successfully to a designated AWS account/region.
|
||||
2. The API Gateway endpoint URL for the `/status` (or `/hello`) route **must** be retrievable post-deployment.
|
||||
3. A `GET` request to the deployed endpoint, including the correct `x-api-key` header, **must** receive a response.
|
||||
4. HTTP response status **must** be 200 OK.
|
||||
5. Response body **must** be JSON containing the expected dynamic "Hello..." message.
|
||||
6. Basic Lambda invocation logs **must** be visible in AWS CloudWatch Logs.
|
||||
|
||||
**Story 1.6: Basic CI/CD Pipeline Stub with Quality Gates**
|
||||
* **User Story Statement:** As a Developer, I need a basic Continuous Integration (CI) pipeline established for the `bmad-daily-digest-backend` repository, so that code quality checks (linting, formatting, unit tests) and the build process are automated upon code pushes and pull requests, ensuring early feedback and maintaining codebase health.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A CI workflow file (e.g., GitHub Actions in `.github/workflows/main.yml`) **must** be created.
|
||||
2. Pipeline **must** trigger on pushes to `main` and PRs targeting `main`.
|
||||
3. Pipeline **must** include steps for: checkout, Node.js 22 setup, dependency install, ESLint check, Prettier format check, Jest unit tests, esbuild bundle.
|
||||
4. Pipeline **must** fail if any lint, format, test, or bundle step fails.
|
||||
5. A successful CI run on the `main` branch **must** be demonstrated.
|
||||
6. CI pipeline for MVP **does not** need to perform AWS deployment.
|
||||
|
||||
---
|
||||
**Epic 2: Automated Content Ingestion & Podcast Generation Pipeline**
|
||||
|
||||
**Goal:** To implement the complete automated daily workflow within the backend. This includes fetching Hacker News post data, scraping and extracting content from linked external articles, aggregating and formatting text, submitting it to Play.ai, managing job status via polling, and retrieving/storing the final audio file and associated metadata. This epic delivers the core value proposition of generating the daily audio content and making it ready for consumption via an API.
|
||||
|
||||
**User Stories for Epic 2:**
|
||||
|
||||
**Story 2.1: AWS CDK Extension for Epic 2 Resources**
|
||||
* **User Story Statement:** As a Developer, I need to extend the existing AWS CDK stack within the `bmad-daily-digest-backend` project to define and provision all new AWS resources required for the content ingestion and podcast generation pipeline—including the `BmadDailyDigestEpisodes` DynamoDB table (with GSI), the `HackerNewsPostProcessState` DynamoDB table, an S3 bucket for audio storage, and the AWS Step Functions state machine for orchestrating the Play.ai job status polling—so that all backend infrastructure for this epic is managed as code and ready for the application logic.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The existing AWS CDK application (from Epic 1) **must** be extended.
|
||||
2. The `BmadDailyDigestEpisodes` DynamoDB table resource **must** be defined in CDK as specified in the System Architecture Document's "Data Models" section (with `episodeId` PK, key attributes like `publicationDate`, `episodeNumber`, `podcastGeneratedTitle`, `audioS3Key`, `audioS3Bucket`, `playAiJobId`, `playAiSourceAudioUrl`, `sourceHNPosts` list, `status`, `createdAt`, `updatedAt`), including a GSI for chronological sorting (e.g., PK `status`, SK `publicationDate`), and PAY_PER_REQUEST billing.
|
||||
3. The `HackerNewsPostProcessState` DynamoDB table resource **must** be defined in CDK as specified in the System Architecture Document's "Data Models" section (with `hnPostId` PK and attributes like `lastCommentFetchTimestamp`, `lastSuccessfullyScrapedTimestamp`, `lastKnownRank`), and PAY_PER_REQUEST billing.
|
||||
4. An S3 bucket resource (e.g., `bmad-daily-digest-audio-{unique-suffix}`) **must** be defined via CDK for audio storage, with private access by default.
|
||||
5. An AWS Step Functions state machine resource **must** be defined via CDK to manage the Play.ai job status polling workflow (as detailed in Story 2.6).
|
||||
6. Necessary IAM roles and permissions for Lambda functions within this epic to interact with DynamoDB, S3, Step Functions, CloudWatch Logs **must** be defined via CDK, adhering to least privilege.
|
||||
7. The updated CDK stack **must** synthesize (`cdk synth`) and deploy (`cdk deploy`) successfully.
|
||||
8. All new CDK code **must** adhere to project ESLint/Prettier standards.
|
||||
|
||||
**Story 2.2: Fetch Top Hacker News Posts & Identify Repeats**
|
||||
* **User Story Statement:** As the System, I need to reliably fetch the top N (configurable, e.g., 10) current Hacker News posts daily using the Algolia HN API, including their essential metadata. I also need to identify if each fetched post has been processed in a recent digest by checking against the `HackerNewsPostProcessState` table, so that I have an accurate list of stories and their status (new or repeat) to begin generating the daily digest.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `hackerNewsService.ts` function **must** fetch top N HN posts (stories only) via `axios` from Algolia API (configurable `HN_POSTS_COUNT`).
|
||||
2. Extracted metadata per post: Title, Article URL, HN Post URL, HN Post ID (`objectID`), Author, Points, Creation timestamp.
|
||||
3. For each post, the function **must** query the `HackerNewsPostProcessState` DynamoDB table to determine its `isUpdateStatus` (true if `lastSuccessfullyScrapedTimestamp` and `lastCommentFetchTimestamp` indicate prior full processing) and retrieve `lastCommentFetchTimestamp` and `lastKnownRank` if available.
|
||||
4. Function **must** return an array of HN post objects with metadata, `isUpdateStatus`, `lastCommentFetchTimestamp`, and `lastKnownRank`.
|
||||
5. Error handling for Algolia/DynamoDB calls **must** be implemented and logged.
|
||||
6. Unit tests (Jest) **must** verify API calls, data extraction, repeat identification (mocked DDB), and error handling. All tests **must** pass.
|
||||
|
||||
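The repeat check in AC 3 of Story 2.2 can be sketched as a pure function. This is a minimal sketch, assuming the DynamoDB item has already been read; the `PostProcessState`, `HnPost`, and `annotateWithProcessState` names are hypothetical, and "both timestamps present" is only one possible reading of "indicate prior full processing".

```typescript
// Hypothetical shape of one HackerNewsPostProcessState item (attribute names from Story 2.1).
interface PostProcessState {
  hnPostId: string;
  lastCommentFetchTimestamp?: number;
  lastSuccessfullyScrapedTimestamp?: number;
  lastKnownRank?: number;
}

// Hypothetical minimal post shape; the real object carries more metadata.
interface HnPost {
  hnPostId: string;
  title: string;
}

interface HnPostWithStatus extends HnPost {
  isUpdateStatus: boolean;
  lastCommentFetchTimestamp: number | undefined;
  lastKnownRank: number | undefined;
}

// Assumption: a post is a repeat only when both timestamps exist in the state table.
function annotateWithProcessState(
  post: HnPost,
  state: PostProcessState | undefined,
): HnPostWithStatus {
  const isUpdateStatus =
    state !== undefined &&
    state.lastSuccessfullyScrapedTimestamp !== undefined &&
    state.lastCommentFetchTimestamp !== undefined;
  return {
    ...post,
    isUpdateStatus,
    lastCommentFetchTimestamp: state?.lastCommentFetchTimestamp,
    lastKnownRank: state?.lastKnownRank,
  };
}
```

Keeping this logic separate from the Algolia and DynamoDB calls would let the Jest tests in AC 6 cover repeat identification without mocking any network client.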
**Story 2.3: Article Content Scraping & Extraction (Conditional)**
|
||||
* **User Story Statement:** As the System, for each Hacker News post identified as *new* (or for which article scraping previously failed and is being retried), I need to robustly fetch its HTML content from the linked article URL and extract the primary textual content and title using libraries like Cheerio and Mozilla Readability. If scraping fails, a fallback mechanism must be triggered.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. An `articleScraperService.ts` function **must** accept an article URL and `isUpdateStatus`.
|
||||
2. If `isUpdateStatus` is true, scraping **must** be skipped. (For MVP, a repeat post's article is not re-scraped, per user feedback; only its comments are refreshed. Full articles are not stored long term. This story covers *new* scrapes and retries of failed scrapes.)
|
||||
3. If a new scrape is needed: use `axios` (timeout, User-Agent) to fetch HTML.
|
||||
4. Use `Mozilla Readability` (JS port) and/or `Cheerio` to extract main article text and title.
|
||||
5. Return `{ success: true, title: string, content: string }` on success.
|
||||
6. If scraping fails: log failure, return `{ success: false, error: string, fallbackNeeded: true }`.
|
||||
7. No specific "polite" inter-article scraping delays for MVP.
|
||||
8. Unit tests (Jest) **must** mock `axios`, test successful extraction, skip logic for non-applicable cases, and failure/fallback scenarios. All tests **must** pass.
|
||||
|
||||
**Story 2.4: Fetch Hacker News Comments (Conditional Logic)**
|
||||
* **User Story Statement:** As the System, I need to fetch comments for each selected Hacker News post using the Algolia HN API, adjusting the strategy to fetch up to N comments for new posts, only new comments since last fetch for repeat posts, or up to 3N comments if article scraping failed.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. `hackerNewsService.ts` **must** be extended to fetch comments for an HN Post ID, accepting `isUpdateStatus`, `lastCommentFetchTimestamp` (from `HackerNewsPostProcessState`), and `articleScrapingFailed` flags.
|
||||
2. Use `axios` to call Algolia HN API item endpoint.
|
||||
3. **Comment Fetching Logic:**
|
||||
* If `articleScrapingFailed`: Fetch up to 3 * `HN_COMMENTS_COUNT_PER_POST` available comments.
|
||||
* If `isUpdateStatus`: Fetch all comments, then filter client-side for comments with `created_at_i` > `lastCommentFetchTimestamp`. Select up to `HN_COMMENTS_COUNT_PER_POST` of these *new* comments.
|
||||
* Else (new post, successful scrape): Fetch up to `HN_COMMENTS_COUNT_PER_POST`.
|
||||
4. For selected comments, extract plain text (HTML stripped), author, creation timestamp.
|
||||
5. Return array of comment objects; empty if none. An updated `lastCommentFetchTimestamp` (max `created_at_i` of fetched comments for this post) should be available for updating `HackerNewsPostProcessState`.
|
||||
6. Error handling and logging for API calls.
|
||||
7. Unit tests (Jest) **must** mock `axios` and verify all conditional fetching logic, comment selection/filtering, data extraction, and error handling. All tests **must** pass.
|
||||
|
||||
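The three branches of the comment fetching logic in Story 2.4 could look roughly like the following. This is a sketch, assuming the comments have already been fetched and flattened into an array; `HnComment`, `CommentSelectionOptions`, and `selectComments` are hypothetical names, and falling back to the previous timestamp when nothing is selected is an assumption not stated in the ACs.

```typescript
interface HnComment {
  author: string;
  text: string;
  created_at_i: number; // Unix seconds
}

interface CommentSelectionOptions {
  isUpdateStatus: boolean;
  articleScrapingFailed: boolean;
  lastCommentFetchTimestamp?: number;
  commentsPerPost: number; // value of HN_COMMENTS_COUNT_PER_POST
}

interface CommentSelection {
  comments: HnComment[];
  lastCommentFetchTimestamp: number;
}

function selectComments(all: HnComment[], opts: CommentSelectionOptions): CommentSelection {
  const previous = opts.lastCommentFetchTimestamp ?? 0;
  let selected: HnComment[];
  if (opts.articleScrapingFailed) {
    // Scrape failed: take up to 3N comments to compensate for the missing article.
    selected = all.slice(0, 3 * opts.commentsPerPost);
  } else if (opts.isUpdateStatus) {
    // Repeat post: keep only comments newer than the last fetch, then cap at N.
    selected = all.filter((c) => c.created_at_i > previous).slice(0, opts.commentsPerPost);
  } else {
    // New post with a successful scrape: take up to N comments.
    selected = all.slice(0, opts.commentsPerPost);
  }
  // Assumption: when nothing is selected, the previous timestamp is kept.
  const newest = selected.reduce((max, c) => Math.max(max, c.created_at_i), previous);
  return { comments: selected, lastCommentFetchTimestamp: newest };
}
```

The branch order mirrors the order of the bullets in AC 3, so a repeat post whose scrape failed is treated as scrape-failed here; the ACs do not say which flag should win.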
**Story 2.5: Content Aggregation and Formatting for Play.ai**
|
||||
* **User Story Statement:** As the System, I need to aggregate the collected Hacker News post data (titles), associated article content (full, truncated, or fallback summary), and comments (new, updated, or extended sets) for all top stories, and format this combined text according to the specified structure for the play.ai PlayNote API, including special phrasing for different post types (new, update, scrape-failed) and configurable article truncation.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `contentFormatterService.ts` **must** be implemented.
|
||||
2. Inputs: Array of processed HN post objects (with metadata, statuses, content, comments).
|
||||
3. Output: A single string.
|
||||
4. String starts: "It's a top 10 countdown for [Current Date]".
|
||||
5. Posts sequenced in reverse rank order.
|
||||
6. **Formatting (new post):** "Story [Rank] - [Article Title]. [Full/Truncated Article Text]. Comments Section. [Number] comments follow. Comment 1: [Text]..."
|
||||
7. **Formatting (repeat post):** "Story [Rank] (previously Rank [OldRank] yesterday) - [Article Title]. We're bringing you new comments on this popular story. Comments Section. [Number] new comments follow. Comment 1: [Text]..."
|
||||
8. **Formatting (scrape-failed post):** "Story [Rank] - [Article Title]. We couldn't retrieve the full article, but here's a summary if available and the latest comments. [Optional HN Summary]. Comments Section. [Number] comments follow. Comment 1: [Text]..."
|
||||
9. **Article Truncation:** If `MAX_ARTICLE_LENGTH` (env var) set and article exceeds, truncate aiming to preserve intro/conclusion.
|
||||
10. Graceful handling for missing parts.
|
||||
11. Unit tests (Jest) **must** verify all formatting, truncation, data merging, and error handling. All tests **must** pass.
|
||||
|
||||
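The formatting rules of Story 2.5 can be illustrated with a small formatter. This is a sketch under stated assumptions: the `DigestPost` shape, the function names, and the ` [...] ` truncation marker are hypothetical; "reverse rank order" is read here as a countdown from the highest rank number to 1; and the truncation simply keeps the start and end of the text, which is only one way to aim at preserving intro and conclusion.

```typescript
type PostKind = "new" | "repeat" | "scrape-failed";

interface DigestPost {
  rank: number;
  oldRank?: number;
  title: string;
  kind: PostKind;
  articleText?: string; // article text, or an optional HN summary for scrape-failed posts
  comments: string[];
}

const TRUNCATION_MARKER = " [...] ";

// Keeps the head and tail of the text when it exceeds maxLength (MAX_ARTICLE_LENGTH).
function truncateArticle(text: string, maxLength?: number): string {
  if (maxLength === undefined || text.length <= maxLength) return text;
  if (maxLength <= TRUNCATION_MARKER.length) return text.slice(0, maxLength);
  const keep = maxLength - TRUNCATION_MARKER.length;
  const head = Math.ceil(keep / 2);
  const tail = keep - head;
  const ending = tail > 0 ? text.slice(text.length - tail) : "";
  return text.slice(0, head) + TRUNCATION_MARKER + ending;
}

function formatPost(post: DigestPost, maxArticleLength?: number): string {
  const count = post.comments.length;
  const comments = post.comments.map((text, i) => `Comment ${i + 1}: ${text}`);
  const body = post.articleText ? truncateArticle(post.articleText, maxArticleLength) : "";
  const parts: string[] = [];
  if (post.kind === "repeat") {
    const previous =
      post.oldRank === undefined ? "" : ` (previously Rank ${post.oldRank} yesterday)`;
    parts.push(`Story ${post.rank}${previous} - ${post.title}.`);
    parts.push("We're bringing you new comments on this popular story.");
    parts.push(`Comments Section. ${count} new comments follow.`);
  } else {
    parts.push(`Story ${post.rank} - ${post.title}.`);
    if (post.kind === "scrape-failed") {
      parts.push(
        "We couldn't retrieve the full article, but here's a summary if available and the latest comments.",
      );
    }
    if (body) parts.push(body); // missing article text is skipped gracefully
    parts.push(`Comments Section. ${count} comments follow.`);
  }
  return [...parts, ...comments].join(" ");
}

function formatDigest(posts: DigestPost[], dateLabel: string, maxArticleLength?: number): string {
  const countdown = [...posts].sort((a, b) => b.rank - a.rank);
  const intro = `It's a top 10 countdown for ${dateLabel}`;
  return [intro, ...countdown.map((p) => formatPost(p, maxArticleLength))].join("\n\n");
}
```

Because the formatter takes the date label and length limit as parameters rather than reading the clock or environment directly, each formatting variant can be asserted on exact strings in Jest.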
**Story 2.6 (REVISED): Implement Podcast Generation Status Polling via Play.ai API**
|
||||
* **User Story Statement:** As the System, after submitting a podcast generation job to Play.ai and receiving a `jobId`, I need an AWS Step Function state machine to periodically poll the Play.ai API for the status of this specific job, continuing until the job is reported as "completed" or "failed" (or a configurable max duration/attempts limit is reached), so the system can reliably determine when the podcast audio is ready or if an error occurred.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The AWS Step Function state machine (CDK defined in Story 2.1) **must** manage the polling workflow.
|
||||
2. Input: `jobId`.
|
||||
3. States: Invoke Poller Lambda (calls Play.ai status GET endpoint with `axios`), Wait (configurable `POLLING_INTERVAL_MINUTES`), Choice (evaluates status: "processing", "completed", "failed").
|
||||
4. Loop if "processing". Stop if "completed" or "failed".
|
||||
5. Max polling duration/attempts (configurable env vars `MAX_POLLING_DURATION_MINUTES`, `MAX_POLLING_ATTEMPTS`) **must** be enforced, treating expiry as failure.
|
||||
6. If "completed": extract `audioUrl`, trigger next step (Story 2.8 process) with data.
|
||||
7. If "failed"/"timeout": log event, record failure (e.g., update episode status in DDB via a Lambda), terminate.
|
||||
8. Poller Lambda handles API errors gracefully.
|
||||
9. Unit tests for Poller Lambda logic; Step Function definition tested (locally if possible, or via AWS console tests). All tests **must** pass.
|
||||
|
||||
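The Choice-state logic of Story 2.6 (loop on "processing", stop on "completed" or "failed", treat expiry as failure) can be sketched outside Step Functions as a plain decision function. This is an illustration only: `decideNext`, `simulatePolling`, and `PollingLimits` are hypothetical names, and elapsed time is approximated as attempts multiplied by the polling interval, which is an assumption.

```typescript
type JobStatus = "processing" | "completed" | "failed";
type PollingDecision = "wait-and-retry" | "succeed" | "fail";

interface PollingLimits {
  maxAttempts: number; // MAX_POLLING_ATTEMPTS
  maxDurationMinutes: number; // MAX_POLLING_DURATION_MINUTES
  intervalMinutes: number; // POLLING_INTERVAL_MINUTES
}

// Mirrors the Choice state: decides what happens after one status check.
function decideNext(
  status: JobStatus,
  attemptsSoFar: number,
  limits: PollingLimits,
): PollingDecision {
  if (status === "completed") return "succeed";
  if (status === "failed") return "fail";
  const elapsedMinutes = attemptsSoFar * limits.intervalMinutes; // assumption, see lead-in
  if (attemptsSoFar >= limits.maxAttempts || elapsedMinutes >= limits.maxDurationMinutes) {
    return "fail"; // expiry is treated as failure (AC 5)
  }
  return "wait-and-retry";
}

// Replays a scripted list of statuses; the last status repeats once the list runs out.
function simulatePolling(
  responses: JobStatus[],
  limits: PollingLimits,
): { decision: PollingDecision; attempts: number } {
  let attempts = 0;
  for (;;) {
    const index = Math.min(attempts, responses.length - 1);
    const status: JobStatus = responses[index] ?? "processing";
    attempts += 1;
    const decision = decideNext(status, attempts, limits);
    if (decision !== "wait-and-retry") return { decision, attempts };
  }
}
```

In the real state machine the Wait state provides the delay between attempts; the simulation above skips the delay so the limits can be exercised quickly in a unit test.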
**Story 2.7: Submit Content to Play.ai PlayNote API & Initiate Podcast Generation**
|
||||
* **User Story Statement:** As the System, I need to securely submit the aggregated and formatted text content (using `sourceText`) to the play.ai PlayNote API via an `application/json` request to initiate the podcast generation process, and I must capture the `jobId` returned by Play.ai, so that this `jobId` can be passed to the status polling mechanism (Step Function).
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `playAiService.ts` function **must** handle submission.
|
||||
2. Input: formatted text (from Story 2.5).
|
||||
3. Use `axios` for `POST` to Play.ai endpoint (e.g., `https://api.play.ai/api/v1/playnotes`).
|
||||
4. Request `Content-Type: application/json`.
|
||||
5. JSON body: `sourceText`, and configurable `title`, `voiceId1`, `name1` (default "Angelo"), `voiceId2`, `name2` (default "Deedee"), `styleGuidance` (default "podcast") from env vars.
|
||||
6. Headers: `Authorization: Bearer <PLAY_AI_BEARER_TOKEN>`, `X-USER-ID: <PLAY_AI_USER_ID>` (from env vars).
|
||||
7. No `webHookUrl` sent.
|
||||
8. On success: extract `jobId`, log it, initiate polling Step Function (Story 2.6) with `jobId` and other context (like internal `episodeId`).
|
||||
9. Error handling for API submission; log and flag failure.
|
||||
10. Unit tests (Jest) mock `axios`, verify API call, auth, payload, `jobId` extraction, Step Function initiation (mocked), error handling. All tests **must** pass.
|
||||
|
||||
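The request described in ACs 3 to 7 of Story 2.7 can be assembled by a pure builder before it is handed to the HTTP client. This is a sketch that uses only the field names, headers, defaults, and example endpoint listed in this PRD; `PlayNoteSettings`, `PlayNoteRequest`, and `buildPlayNoteRequest` are hypothetical names, and the payload should be checked against the Play.ai PlayNote documentation before use.

```typescript
interface PlayNoteSettings {
  bearerToken: string; // PLAY_AI_BEARER_TOKEN
  userId: string; // PLAY_AI_USER_ID
  title: string;
  voiceId1: string;
  voiceId2: string;
  name1?: string;
  name2?: string;
  styleGuidance?: string;
}

interface PlayNoteRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: Record<string, string>;
}

function buildPlayNoteRequest(sourceText: string, settings: PlayNoteSettings): PlayNoteRequest {
  if (sourceText.trim().length === 0) {
    throw new Error("sourceText must not be empty");
  }
  return {
    method: "POST",
    url: "https://api.play.ai/api/v1/playnotes", // example endpoint from AC 3
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${settings.bearerToken}`,
      "X-USER-ID": settings.userId,
    },
    // No webHookUrl is included, per AC 7.
    body: {
      sourceText,
      title: settings.title,
      voiceId1: settings.voiceId1,
      name1: settings.name1 ?? "Angelo",
      voiceId2: settings.voiceId2,
      name2: settings.name2 ?? "Deedee",
      styleGuidance: settings.styleGuidance ?? "podcast",
    },
  };
}
```

Separating payload construction from the `axios` call keeps the auth and default-value checks in AC 10 free of HTTP mocking.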
**Story 2.8: Retrieve, Store Generated Podcast Audio & Persist Episode Metadata**
|
||||
* **User Story Statement:** As the System, once the podcast generation status polling (Story 2.6) indicates a Play.ai job is "completed," I need to download the generated audio file from the provided `audioUrl`, store this file in our designated S3 bucket, and then save/update all relevant metadata for the episode (including S3 audio location, `episodeNumber`, `podcastGeneratedTitle`, `playAiSourceAudioUrl`, and source HN post information including their `lastCommentFetchTimestamp`) into our DynamoDB tables (`BmadDailyDigestEpisodes` and `HackerNewsPostProcessState`), so that the daily digest is fully processed, archived, and ready for access.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A `podcastStorageService.ts` function **must** be triggered by Step Function (Story 2.6) on "completed" status, receiving `audioUrl`, Play.ai `jobId`, and original context (like internal `episodeId`, list of source HN posts with their metadata and processing status).
|
||||
2. Use `axios` to download audio from `audioUrl`.
|
||||
3. Upload audio to S3 bucket (from Story 2.1), using key (e.g., `YYYY/MM/DD/episodeId.mp3`).
|
||||
4. Prepare `Episode` metadata for `BmadDailyDigestEpisodes` table: `episodeId` (UUID), `publicationDate` (YYYY-MM-DD), `episodeNumber` (sequential logic, TBD), `podcastGeneratedTitle` (from Play.ai or constructed), `audioS3Bucket`, `audioS3Key`, `playAiJobId`, `playAiSourceAudioUrl`, `sourceHNPosts` (array of objects: `{ hnPostId, title, originalArticleUrl, hnLink, isUpdateStatus, oldRank, articleScrapingFailed }`), `status: "Published"`, `createdAt`, `updatedAt`.
|
||||
5. For each `hnPostId` in `sourceHNPosts`, update its corresponding item in the `HackerNewsPostProcessState` table with the `lastCommentFetchTimestamp` (current time or max comment time from this run), `lastProcessedDate` (current date), and `lastKnownRank`. If `articleScrapingFailed` was false for this run, update `lastSuccessfullyScrapedTimestamp`.
|
||||
6. Save `Episode` metadata to `BmadDailyDigestEpisodes` DynamoDB table.
|
||||
7. Error handling for download, S3 upload, DDB writes; failure should result in episode `status: "Failed"`.
|
||||
8. Unit tests (Jest) mock `axios`, AWS SDK (S3, DynamoDB); verify data handling, storage, metadata construction for both tables, errors. All tests **must** pass.
|
||||
|
||||
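Two pieces of Story 2.8 lend themselves to small pure helpers: the S3 key from AC 3 and the per-post state update from AC 5. The following is a sketch with hypothetical names (`buildAudioS3Key`, `buildProcessStateUpdate`, `SourcePost`, `ProcessStateUpdate`); passing the current time in as a parameter is a design assumption made so the helper stays deterministic.

```typescript
// Builds the example key layout from AC 3: YYYY/MM/DD/episodeId.mp3
function buildAudioS3Key(publicationDate: string, episodeId: string): string {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(publicationDate);
  if (match === null) {
    throw new Error(`publicationDate must be YYYY-MM-DD, got "${publicationDate}"`);
  }
  const [, year, month, day] = match;
  return `${year}/${month}/${day}/${episodeId}.mp3`;
}

interface SourcePost {
  hnPostId: string;
  rank: number;
  articleScrapingFailed: boolean;
  maxCommentTimestamp?: number; // max created_at_i seen in this run, if any
}

interface ProcessStateUpdate {
  hnPostId: string;
  lastCommentFetchTimestamp: number;
  lastProcessedDate: string;
  lastKnownRank: number;
  lastSuccessfullyScrapedTimestamp?: number;
}

// Derives the HackerNewsPostProcessState update for one post (AC 5).
function buildProcessStateUpdate(
  post: SourcePost,
  publicationDate: string,
  nowSeconds: number,
): ProcessStateUpdate {
  const update: ProcessStateUpdate = {
    hnPostId: post.hnPostId,
    lastCommentFetchTimestamp: post.maxCommentTimestamp ?? nowSeconds,
    lastProcessedDate: publicationDate,
    lastKnownRank: post.rank,
  };
  // Only a successful scrape in this run refreshes the scrape timestamp.
  if (!post.articleScrapingFailed) {
    update.lastSuccessfullyScrapedTimestamp = nowSeconds;
  }
  return update;
}
```

The download, the S3 upload, and the DynamoDB writes would wrap these helpers; only those wrappers need the mocked `axios` and AWS SDK clients mentioned in AC 8.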
**Story 2.9: Daily Workflow Orchestration & Scheduling**
|
||||
* **User Story Statement:** As the System Administrator, I need the entire daily backend workflow (Stories 2.2 through 2.8) to be fully orchestrated by the primary AWS Step Function state machine and automatically scheduled to run once per day using Amazon EventBridge Scheduler, ensuring it handles re-runs for the same day by overwriting/starting over (for MVP), so that "BMad Daily Digest" episodes are produced consistently and reliably.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The primary AWS Step Function state machine **must** orchestrate the sequence: Fetch HN Posts & Identify Repeats (2.2); For each post: conditionally Scrape Article (2.3) & Fetch Comments (2.4); then Aggregate & Format Content (2.5); then Submit to Play.ai & get `jobId` (2.7); then initiate/manage Polling (2.6 using `jobId`); on "completed" polling, trigger Retrieve & Store Audio/Metadata (2.8).
|
||||
2. State machine **must** manage data flow (inputs/outputs) between steps correctly.
|
||||
3. Overall workflow error handling: critical step failure marks state machine execution as "Failed" and logs comprehensively. Steps use retries for transient errors.
|
||||
4. **Idempotency (MVP):** Re-running for the same `publicationDate` **must** re-process and effectively overwrite previous data for that date.
|
||||
5. Amazon EventBridge Scheduler rule (CDK defined) **must** trigger the main Step Function daily at 12:00 UTC (default, configurable via `DAILY_JOB_SCHEDULE_UTC_CRON`).
|
||||
6. Successful end-to-end run **must** be demonstrated (e.g., processing sample data through the pipeline).
|
||||
7. Step Function execution history **must** provide a clear audit trail of steps and data.
|
||||
8. Unit tests **must** cover any new orchestrator-specific Lambda functions not already covered by earlier stories. All tests **must** pass.
|
||||
|
||||
---
|
||||
**Epic 3: Web Application MVP & Podcast Consumption**
|
||||
|
||||
**Goal:** To set up the frontend project in its dedicated repository and develop and deploy the Next.js frontend application MVP, enabling users to consume the "BMad Daily Digest." This includes initial project setup (AI-assisted UI kickstart from `bmad-daily-digest-ui` scaffold), pages for listing and detailing episodes, an about page, and deployment.
|
||||
|
||||
**User Stories for Epic 3:**
|
||||
|
||||
**Story 3.1: Frontend Project Repository & Initial UI Setup (AI-Assisted)**
|
||||
* **User Story Statement:** As a Developer, I need to establish the `bmad-daily-digest-frontend` Git repository with a new Next.js (TypeScript, Node.js 22) project, using the provided `bmad-daily-digest-ui` V0 scaffold as the base. This setup must include all foundational tooling (ESLint, Prettier, Jest with React Testing Library, a basic CI stub), and an initial AWS CDK application structure, ensuring the "80s retro CRT terminal" aesthetic (with Tailwind CSS and shadcn/ui) is operational, so that a high-quality, styled, and standardized frontend development environment is ready.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A new, private Git repository `bmad-daily-digest-frontend` **must** be created on GitHub.
|
||||
2. The `bmad-daily-digest-ui` V0 scaffold project files **must** be used as the initial codebase in this repository.
|
||||
3. `package.json` **must** be updated (project name, version, description).
|
||||
4. Project dependencies **must** be installable.
|
||||
5. TypeScript (`tsconfig.json`), Next.js (`next.config.mjs`), Tailwind (`tailwind.config.ts`), ESLint, Prettier, Jest configurations from the scaffold **must** be verified and operational.
|
||||
6. The application **must** build successfully (`npm run build`) with the scaffolded UI.
|
||||
7. A basic CI pipeline stub (GitHub Actions) for lint, format check, test, build **must** be created.
|
||||
8. A standard `.gitignore` and an updated `README.md` **must** be present.
|
||||
9. An initial AWS CDK application structure **must** be created within a `cdk/` directory in this repository, ready for defining frontend-specific infrastructure (S3, CloudFront in Story 3.6).
|
||||
|
||||
**Story 3.2: Frontend API Service Layer for Backend Communication**
|
||||
* **User Story Statement:** As a Frontend Developer, I need a dedicated and well-typed API service layer (e.g., `lib/api-client.ts`) within the Next.js frontend application to manage all HTTP communication with the "BMad Daily Digest" backend API (for fetching episode lists and specific episode details), so that UI components can cleanly and securely consume backend data with robust error handling.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. A TypeScript module `lib/api-client.ts` (or similar) **must** encapsulate backend API interactions.
|
||||
2. Functions **must** exist for: `getEpisodes(): Promise<EpisodeListItem[]>` and `getEpisodeDetails(episodeId: string): Promise<EpisodeDetail | null>`.
|
||||
3. `axios` (or native `fetch` with a wrapper if preferred for frontend) **must** be used for HTTP requests.
|
||||
4. Backend API base URL (`NEXT_PUBLIC_BACKEND_API_URL`) and Frontend Read API Key (`NEXT_PUBLIC_FRONTEND_API_KEY`) **must** be configurable via public environment variables and used in requests.
|
||||
5. TypeScript interfaces (`EpisodeListItem`, `EpisodeDetail`, `SourceHNPostDetail` from `lib/types.ts`) for API response data **must** be defined/used, matching backend API.
|
||||
6. API functions **must** correctly parse JSON responses and transform data into defined interfaces.
|
||||
7. Error handling (network errors, non-2xx responses from backend) **must** be implemented, providing clear error information/objects.
|
||||
8. Unit tests (Jest) **must** mock the HTTP client and verify API calls, data parsing/transformation, and error handling. All tests **must** pass.
|
||||
|
||||
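A minimal shape for the service layer in Story 3.2 might look like the following. This is a sketch, not the scaffold's actual module: the `/episodes` path, the assumption that the backend returns a bare JSON array, and the `HttpGet`, `ApiError`, and `parseEpisodeList` names are all hypothetical. The HTTP function is injected so that either a `fetch`-style function or an `axios` wrapper can be supplied.

```typescript
interface EpisodeListItem {
  episodeId: string;
  publicationDate: string;
  episodeNumber: number;
  podcastGeneratedTitle: string;
}

// Minimal subset of a fetch-style response that this sketch relies on.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type HttpGet = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<HttpResponse>;

class ApiError extends Error {
  status: number | undefined;
  constructor(message: string, status?: number) {
    super(message);
    this.name = "ApiError";
    this.status = status;
  }
}

// Validates the payload instead of trusting a type cast (AC 6).
function parseEpisodeList(payload: unknown): EpisodeListItem[] {
  if (!Array.isArray(payload)) throw new ApiError("Expected an array of episodes");
  return payload.map((raw: unknown) => {
    if (typeof raw !== "object" || raw === null) throw new ApiError("Malformed episode item");
    const { episodeId, publicationDate, episodeNumber, podcastGeneratedTitle } =
      raw as Record<string, unknown>;
    if (
      typeof episodeId !== "string" ||
      typeof publicationDate !== "string" ||
      typeof episodeNumber !== "number" ||
      typeof podcastGeneratedTitle !== "string"
    ) {
      throw new ApiError("Malformed episode item");
    }
    return { episodeId, publicationDate, episodeNumber, podcastGeneratedTitle };
  });
}

async function getEpisodes(
  http: HttpGet,
  baseUrl: string, // NEXT_PUBLIC_BACKEND_API_URL
  apiKey: string, // NEXT_PUBLIC_FRONTEND_API_KEY
): Promise<EpisodeListItem[]> {
  let response: HttpResponse;
  try {
    response = await http(`${baseUrl}/episodes`, { headers: { "x-api-key": apiKey } });
  } catch (cause) {
    throw new ApiError(`Network error: ${String(cause)}`);
  }
  if (!response.ok) {
    throw new ApiError(`Backend responded with ${response.status}`, response.status);
  }
  return parseEpisodeList(await response.json());
}
```

Injecting `http` means the Jest tests in AC 8 can pass a stub function rather than patching a global client.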
**Story 3.3: Episode List Page Implementation**
|
||||
* **User Story Statement:** As a Busy Tech Executive, I want to view a responsive "Episode List Page" (based on `app/(pages)/episodes/page.tsx` from the scaffold) that clearly displays all available "BMad Daily Digest" episodes in reverse chronological order, showing the episode number, publication date, and podcast title for each, using themed components like `episode-card.tsx`, so that I can quickly find and select an episode.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The existing `app/(pages)/episodes/page.tsx` (or equivalent main list page from scaffold) **must** be updated.
|
||||
2. It **must** use the API service layer (Story 3.2) to fetch episodes.
|
||||
3. A themed loading state (e.g., using `loading-state.tsx`) **must** be shown.
|
||||
4. A themed error message (e.g., using `error-state.tsx`) **must** be shown if fetching fails.
|
||||
5. A "No episodes available yet" message **must** be shown for an empty list.
|
||||
6. Episodes **must** be listed in reverse chronological order.
|
||||
7. Each list item, potentially using a modified `episode-card.tsx` component, **must** display "Episode [EpisodeNumber]: [PublicationDate] - [PodcastGeneratedTitle]".
|
||||
8. Each item **must** link to the Episode Detail Page for that episode using its `episodeId`.
|
||||
9. Styling **must** adhere to the "80s retro CRT terminal" aesthetic.
|
||||
10. The page **must** be responsive.
|
||||
11. Unit/integration tests (Jest with RTL) **must** cover all states, data display, order, and navigation. All tests **must** pass.
|
||||
|
||||
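ACs 6 and 7 of Story 3.3 (reverse chronological order and the list label) reduce to two small helpers. This is a sketch with hypothetical function names; it assumes `publicationDate` is an ISO `YYYY-MM-DD` string, so plain string comparison orders dates correctly, and it breaks ties by `episodeNumber`, which the ACs do not specify.

```typescript
interface EpisodeListItem {
  episodeId: string;
  publicationDate: string; // assumed ISO format: YYYY-MM-DD
  episodeNumber: number;
  podcastGeneratedTitle: string;
}

// Returns a new array, newest episode first; the input is not mutated.
function sortNewestFirst(episodes: EpisodeListItem[]): EpisodeListItem[] {
  return [...episodes].sort((a, b) => {
    if (a.publicationDate !== b.publicationDate) {
      return a.publicationDate < b.publicationDate ? 1 : -1;
    }
    return b.episodeNumber - a.episodeNumber; // tie-breaker is an assumption
  });
}

// Builds the label from AC 7.
function episodeLabel(episode: EpisodeListItem): string {
  return `Episode ${episode.episodeNumber}: ${episode.publicationDate} - ${episode.podcastGeneratedTitle}`;
}
```

A component such as `episode-card.tsx` could then render `episodeLabel(episode)` inside a link keyed by `episodeId`, with the sort applied once after the API call.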
**Story 3.4: Episode Detail Page Implementation**
|
||||
* **User Story Statement:** As a Busy Tech Executive, after selecting an episode, I want to navigate to a responsive "Episode Detail Page" (based on `app/(pages)/episodes/[episodeId]/page.tsx` from the scaffold) that features an embedded HTML5 audio player, displays the episode title/date/number, a list of the Hacker News stories covered (using components like `story-item.tsx`), and provides clear links to the original articles and HN discussions, so I can listen and explore sources.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The dynamic route page `app/(pages)/episodes/[episodeId]/page.tsx` **must** be implemented.
|
||||
2. It **must** accept `episodeId` from the URL.
|
||||
3. It **must** use the API service layer (Story 3.2) to fetch episode details.
|
||||
4. Loading and error states **must** be handled and displayed with themed components.
|
||||
5. If data found, **must** display: `podcastGeneratedTitle`, `publicationDate`, `episodeNumber`.
|
||||
6. An embedded HTML5 audio player (`<audio controls>`) **must** play the podcast using the public `audioUrl` from the episode details.
|
||||
7. A list of included Hacker News stories (from `sourceHNPosts`) **must** be displayed, potentially using a `story-item.tsx` component for each.
|
||||
8. For each HN story: its title, a link to `originalArticleUrl` (new tab), and a link to `hnLink` (new tab) **must** be displayed.
|
||||
9. Styling **must** adhere to the "80s retro CRT terminal" aesthetic.
|
||||
10. The page **must** be responsive.
|
||||
11. Unit/integration tests (Jest with RTL) **must** cover all states, rendering of details, player, links. All tests **must** pass.
|
||||
|
||||
**Story 3.5: "About" Page Implementation**
|
||||
* **User Story Statement:** As a User, I want to access a minimalist, responsive "About Page" (based on `app/(pages)/about/page.tsx` from the scaffold) that clearly explains "BMad Daily Digest," its purpose, and how it works, styled consistently, so I can understand the service.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. The `app/(pages)/about/page.tsx` component **must** be implemented.
|
||||
2. It **must** display static informational content (Placeholder: "BMad Daily Digest provides a daily audio summary of top Hacker News discussions for busy tech professionals, generated using AI. Our mission is to keep you informed, efficiently. All content is curated and processed to deliver key insights in an easily digestible audio format, presented with a unique retro-tech vibe.").
|
||||
3. Styling **must** adhere to the "80s retro CRT terminal" aesthetic.
|
||||
4. The page **must** be responsive.
|
||||
5. A link to "About Page" **must** be accessible from site navigation (e.g., via `header.tsx` or `footer.tsx`).
|
||||
6. Unit tests (Jest with RTL) for rendering static content. All tests **must** pass.
|
||||
|
||||
**Story 3.6: Frontend Deployment to S3 & CloudFront via CDK**
|
||||
* **User Story Statement:** As a Developer, I need the Next.js frontend application to be configured for static export (or an equivalent static-first deployment model) and have its AWS infrastructure (S3 for hosting, CloudFront for CDN and HTTPS) defined and managed via its own AWS CDK application within the frontend repository. This setup should automate the build and deployment of the static site, making the "BMad Daily Digest" web application publicly accessible, performant, and cost-effective.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
1. Next.js app **must** be configured for static export suitable for S3/CloudFront.
|
||||
2. The AWS CDK app within `bmad-daily-digest-frontend/cdk/` (from Story 3.1) **must** define the S3 bucket and CloudFront distribution.
|
||||
3. CDK stack **must** define: S3 bucket (static web hosting), CloudFront distribution (S3 origin, HTTPS via default CloudFront domain or ACM cert for custom domain if specified for MVP, caching, OAC/OAI).
|
||||
4. A `package.json` build script **must** generate the static output.
|
||||
5. The CDK deployment process (`cdk deploy` run via CI or manually for MVP) **must** include steps/hooks to build the Next.js app and sync static files to S3.
|
||||
6. Application **must** be accessible via its CloudFront URL.
|
||||
7. All MVP functionalities **must** be operational on the deployed site.
|
||||
8. HTTPS **must** be enforced.
|
||||
9. CDK code **must** meet project standards.
|
||||
|
||||
## 7. Key Reference Documents
|
||||
* Product Requirements Document (PRD) - BMad Daily Digest (This Document, v0.2)
|
||||
* UI/UX Specification - BMad Daily Digest (v0.1)
|
||||
* System Architecture Document - BMad Daily Digest (v0.1)
|
||||
* Frontend Architecture Document - BMad Daily Digest (v0.1)
|
||||
* Algolia Hacker News Search API Documentation (`https://hn.algolia.com/api`)
|
||||
* Play.ai PlayNote API Documentation (`https://docs.play.ai/api-reference/playnote/post`)
|
||||
|
||||
## 8. Out of Scope Ideas Post MVP
|
||||
* Advanced Audio Player Functionality (skip +/-, speed control, playback position memory).
|
||||
* User Accounts & Personalization (account creation, email subscription management, customizable podcast hosts).
|
||||
* Enhanced Content Delivery & Discovery (Daily Email Summary, Full RSS Feed, Full Podcast Transcription, Search Functionality).
|
||||
* Expanded Content Sources (beyond Hacker News).
|
||||
* Community & Feedback (In-app feedback mechanisms).
|
||||
|
||||
## 9. Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| :----------------------------------------------------------- | :------------ | :------ | :--------------------------------------------------------------------------------------------------------- | :------------------------------- |
|
||||
| Initial PRD draft and MVP scope definition. | May 20, 2025 | 0.1 | Created initial PRD based on Project Brief and discussions on goals, requirements, and Epics/Stories (shells). | John (PM) & User |
|
||||
| Architectural refinements incorporated into Story ACs. | May 20, 2025 | 0.2 | Updated ACs for Stories 2.1 and 3.1 based on System Architecture Document feedback from Fred (Architect). | Sarah (PO) & User |
|
||||
@@ -1,101 +0,0 @@
|
||||
# Project Brief: BMad Daily Digest
|
||||
|
||||
## 1. Introduction / Problem Statement
|
||||
|
||||
- **The Core Idea is**: To create a daily podcast, "BMad Daily Digest," by scraping top Hacker News posts and their comments, then using an AI service (play.ai PlayNote) to generate an audio summary with two speakers.
|
||||
- **The Problem Being Solved is**: Busy professionals, especially in the tech world, find it hard to keep up with important discussions and news on platforms like Hacker News due to time constraints. They need a quick, digestible audio format to stay informed.
|
||||
|
||||
## 2. Vision & Goals
|
||||
|
||||
- **a. Vision**: To become the go-to daily audio source for busy professionals seeking to stay effortlessly informed on the most relevant and engaging discussions from Hacker News.
|
||||
- **b. Primary Goals (for MVP)**:
|
||||
1. Successfully scrape the top 10 Hacker News posts (articles and comments) and generate a coherent text file suitable for the play.ai API daily.
|
||||
2. Reliably submit the generated text to play.ai PlayNote and receive a 2-agent audio podcast daily.
|
||||
3. Ensure the produced podcast is easily accessible to a pilot group of users (e.g., via a simple download link or a basic podcast feed).
|
||||
4. Deliver the daily digest by a consistent time each morning (e.g., by 8 AM local time).
|
||||
- **c. Success Metrics (Initial Ideas)**:
|
||||
- Content Accuracy: X% of generated summaries accurately reflect the core topics of the Hacker News posts.
|
||||
- Production Reliability: Successfully produce a podcast on X out of Y days (e.g., 95% uptime).
|
||||
- User Feedback (Qualitative): Positive feedback from a small group of initial listeners regarding clarity, usefulness, and audio quality.
|
||||
- Listenership (Small Scale): Number of downloads/listens by the pilot user group.
|
||||
|
||||
## 3. Target Audience / Users
|
||||
|
||||
- **Primary Users**: Busy tech executives (e.g., VPs, Directors, C-suite in technology companies).
|
||||
- **Characteristics**:
|
||||
- Extremely time-poor with demanding schedules.
|
||||
- Need to stay informed about technology trends, competitor moves, and industry sentiment for strategic decision-making.
|
||||
- Likely consume content during commutes, short breaks, or while multitasking.
|
||||
- Value high-signal, concise, and curated information.
|
||||
- Familiar with Hacker News but lack the time for in-depth daily reading.
|
||||
|
||||
## 4. Key Features / Scope (High-Level Ideas for MVP)
|
||||
|
||||
1. **Hacker News Scraper:**
|
||||
- Automatically fetches the top X (e.g., 10) posts from Hacker News daily.
|
||||
- Extracts the main article content/summary for each post.
|
||||
- Extracts a selection of top/relevant comments for each post.
|
||||
2. **Content Aggregator & Formatter:**
|
||||
- Combines the scraped article content and comments into a single, structured text file.
|
||||
- Formats the text in a way that's optimal for the play.ai PlayNote API.
|
||||
3. **AI Podcast Generation:**
|
||||
- Submits the formatted text file to the play.ai PlayNote API.
|
||||
- Retrieves the generated audio file.
|
||||
4. **Basic Web Application (for MVP delivery):**
|
||||
- **List Page:** Displays a chronological list of daily "BMad Daily Digest" episodes.
|
||||
- **Detail Page (per episode):**
|
||||
- Shows the list of individual Hacker News stories included in that day's digest.
|
||||
- Provides direct links to the original Hacker News post for each story.
|
||||
- Provides direct links to the source article for each story.
|
||||
- Embeds an audio player to play the generated podcast for that day.
|
||||
5. **Scheduling/Automation:**
|
||||
- The entire backend process (scrape, format, generate podcast, update application data) runs automatically on a daily schedule.
|
||||
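The Content Aggregator & Formatter feature (item 2 above) could look roughly like the sketch below. The `Story` shape, the section labels, and the `formatDigest` name are illustrative assumptions; this brief does not specify the input layout that the play.ai PlayNote API expects.

```typescript
// Hypothetical shape for illustration; the brief does not define a schema.
interface Story {
  title: string;
  articleText: string; // extracted article content or summary
  comments: string[]; // selected top/relevant comments
}

// Combine all stories into one structured plain-text document.
// The layout (numbered stories, "Article"/"Comments" labels) is an assumption.
function formatDigest(date: string, stories: Story[]): string {
  const sections = stories.map((story, index) => {
    const comments = story.comments.map((c) => `- ${c}`).join("\n");
    return [
      `Story ${index + 1}: ${story.title}`,
      "Article:",
      story.articleText.trim(),
      "Comments:",
      comments || "(no comments selected)",
    ].join("\n");
  });
  return [`BMad Daily Digest for ${date}`, ...sections].join("\n\n");
}
```

Keeping the formatter a pure function of its inputs makes it easy to unit test independently of the scraper and of the podcast generation step.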
|
||||
## 5. Post MVP Features / Scope and Ideas
|
||||
|
||||
After the successful launch and validation of the MVP, "BMad Daily Digest" could be enhanced with the following features:
|
||||
|
||||
- **Daily Email Summary:**
|
||||
- Generate an email brief from the summarized transcript of the podcast (which can also be sourced from PlayNote).
|
||||
- Include a direct link to the day's podcast episode in the email.
|
||||
- **RSS Feed:**
|
||||
- Provide a standard RSS feed, allowing users to subscribe to the "BMad Daily Digest" in their preferred podcast players.
|
||||
- **User Accounts & Personalization:**
|
||||
- Allow users to create accounts.
|
||||
- Enable email subscription management through accounts.
|
||||
- Offer options to customize podcast hosts (e.g., select from different AI voices or styles available via play.ai).
|
||||
- **Expanded Content & Features:**
|
||||
- Wider range of content sources (beyond Hacker News).
|
||||
- Advanced audio features (selectable narrators if not covered by customization, variable playback speed within the app).
|
||||
- Full podcast transcription available on the website.
|
||||
- Search functionality within the app for past digests or stories.
|
||||
- **Community & Feedback:**
|
||||
- In-app user feedback mechanisms (e.g., rating episodes, suggesting improvements).
|
||||
|
||||
## 6. Known Technical Constraints or Preferences
|
||||
|
||||
- **Development Approach Preference:**
|
||||
- The full backend (scraping, content aggregation, podcast generation, data storage) is to be built and validated first.
|
||||
- The frontend UI will be developed subsequently to interact with the established backend API.
|
||||
- **Technology Stack Preferences (Considered Solid by User):**
|
||||
- **Frontend:** Next.js (React), planned to be hosted as a static site on S3.
|
||||
- **Backend Services:** AWS Lambda for compute, DynamoDB for storage.
|
||||
- **Automation Trigger:** Daily execution via a cron job or a REST API call to trigger the Lambda function.
|
||||
- **Backend Language (Open for Discussion):**
|
||||
- The user is open to using either Python or TypeScript for the backend, with a note that web scraping capabilities and ease of use with the Algolia API are factors in this decision. This choice will be finalized during more detailed technical planning.
|
||||
- **Data Source for Hacker News:**
|
||||
- Utilize the Algolia Hacker News Search API (hn.algolia.com) for fetching posts and comments, as it's generally easier to work with than direct site scraping.
|
||||
- **Budget Constraints & API Usage:**
|
||||
- **AWS Services:** Strong preference to operate services (Lambda, DynamoDB, S3, etc.) within the limits of the AWS free tier where possible.
|
||||
- **play.ai PlayNote API:** The user has an existing purchased subscription, so this service is an exception to the free-tier goal for other services.
|
||||
- **Timeline:** _(User has not specified a firm timeline for MVP at this stage of the brief)._
|
||||
- **Risks:**
|
||||
- Dependency on Algolia HN API (availability, rate limits, potential changes).
|
||||
- Consistency and quality of AI-generated audio from play.ai.
|
||||
- Potential API changes or future costs associated with play.ai (beyond current subscription terms).
|
||||
- Managing daily automated processing effectively.
|
||||
- **Other User Preferences:** _(User has not specified further preferences at this stage of the brief)._
|
||||
|
||||
## 7. Relevant Research (Optional)
|
||||
|
||||
- **Hacker News Data Acquisition:** The decision to use the Algolia Hacker News Search API (hn.algolia.com) is based on its known efficiency and ease of use for accessing posts and comments, which is preferable to direct website scraping.
|
||||
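As a rough illustration of the Algolia-based approach, the sketch below builds a request for stories tagged `front_page` on the Algolia Hacker News Search API. The `HnHit` interface lists only a subset of response fields, the function names are hypothetical, and the result ordering may not match the live Hacker News ranking exactly, so treat this as a starting point rather than the project's fetcher.

```typescript
const ALGOLIA_HN_BASE = "https://hn.algolia.com/api/v1";

// Subset of the fields returned per hit; only what this sketch uses.
interface HnHit {
  objectID: string;
  title: string;
  url: string | null; // may be empty for text-only posts such as Ask HN
  points: number;
  num_comments: number;
}

// Build the search URL for stories currently tagged as front page.
function buildFrontPageUrl(limit: number): string {
  const params = new URLSearchParams({
    tags: "front_page",
    hitsPerPage: String(limit),
  });
  return `${ALGOLIA_HN_BASE}/search?${params.toString()}`;
}

// Fetch the stories. Relies on the global fetch available in Node.js 18+.
async function fetchTopStories(limit = 10): Promise<HnHit[]> {
  const response = await fetch(buildFrontPageUrl(limit));
  if (!response.ok) {
    throw new Error(`Algolia HN request failed: ${response.status}`);
  }
  const body = (await response.json()) as { hits: HnHit[] };
  return body.hits;
}
```

Comments for a given story can then be requested per post ID, which keeps the number of API calls proportional to the number of stories in the digest.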
@@ -1,65 +0,0 @@
|
||||
# Project Structure
|
||||
|
||||
The project utilizes a polyrepo structure with separate backend and frontend repositories, each with its own CDK application.
|
||||
|
||||
## 1. Backend Repository (`bmad-daily-digest-backend`)
|
||||
Organized by features within `src/`, using `dash-case` for folders and files (e.g., `src/features/content-ingestion/hn-fetcher-service.ts`).
|
||||
|
||||
```plaintext
|
||||
bmad-daily-digest-backend/
|
||||
├── .github/
|
||||
├── cdk/
|
||||
│ ├── bin/
|
||||
│ ├── lib/ # Backend Stack, Step Function definitions
|
||||
│ └── test/
|
||||
├── src/
|
||||
│ ├── features/
|
||||
│ │ ├── dailyJobOrchestrator/ # Main Step Function trigger/definition support
|
||||
│ │ ├── hnContentPipeline/ # Services for Algolia, scraping, formatting
|
||||
│ │ ├── playAiIntegration/ # Services for Play.ai submit & polling Lambda logic
|
||||
│ │ ├── podcastPersistence/ # Services for S3 & DynamoDB storage
|
||||
│ │ └── publicApi/ # Handlers for API Gateway (status, episodes)
|
||||
│ ├── shared/
|
||||
│ │ ├── utils/
|
||||
│ │ ├── types/
|
||||
│ │ └── services/ # Optional shared low-level AWS SDK wrappers
|
||||
├── tests/ # Unit/Integration tests, mirroring src/features/
|
||||
│ └── features/
|
||||
... (root config files: .env.example, .eslintrc.js, .gitignore, .prettierrc.js, jest.config.js, package.json, README.md, tsconfig.json)
|
||||
```
|
||||
|
||||
*Key Directories: `cdk/` for IaC, `src/features/` for modular backend logic, `src/shared/` for reusable code, `tests/` for Jest tests.*
|
||||
|
||||
## 2. Frontend Repository (`bmad-daily-digest-frontend`)
|
||||
Aligns with V0.dev generated Next.js App Router structure, using `dash-case` for custom files/folders where applicable.
|
||||
|
||||
```plaintext
|
||||
bmad-daily-digest-frontend/
|
||||
├── .github/
|
||||
├── app/
|
||||
│ ├── (pages)/
|
||||
│ │ ├── episodes/
|
||||
│ │ │ ├── page.tsx # List page
|
||||
│ │ │ └── [episode-id]/
|
||||
│ │ │ └── page.tsx # Detail page
|
||||
│ │ └── about/
|
||||
│ │ └── page.tsx
|
||||
│ ├── layout.tsx
|
||||
│ └── globals.css
|
||||
├── components/
|
||||
│ ├── ui/ # shadcn/ui based components
|
||||
│ └── domain/ # Custom composite components (e.g., episode-card)
|
||||
├── cdk/ # AWS CDK application for frontend infra (S3, CloudFront)
|
||||
│ ├── bin/
|
||||
│ └── lib/
|
||||
├── hooks/
|
||||
├── lib/
|
||||
│ ├── types.ts
|
||||
│ ├── utils.ts
|
||||
│ └── api-client.ts # Backend API communication
|
||||
├── public/
|
||||
├── tests/ # Jest & RTL tests
|
||||
... (root config files: .env.local.example, .eslintrc.js, components.json, next.config.mjs, package.json, tailwind.config.ts, tsconfig.json)
|
||||
```
|
||||
|
||||
*Key Directories: `app/` for Next.js routes, `components/` for UI, `cdk/` for frontend IaC, `lib/` for utilities and `api-client.ts`.*
|
||||
@@ -1,126 +0,0 @@
|
||||
# Core Workflow / Sequence Diagrams
|
||||
|
||||
## 1. Daily Automated Podcast Generation Pipeline (Backend)
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant Sched as Scheduler (EventBridge)
|
||||
participant Orch as Orchestrator (Step Functions)
|
||||
participant HNF as HN Data Fetcher Lambda
|
||||
participant Algolia as Algolia HN API
|
||||
participant ASL as Article Scraper Lambda
|
||||
participant EAS as External Article Sites
|
||||
participant CFL as Content Formatter Lambda
|
||||
participant PSubL as Play.ai Submit Lambda
|
||||
participant PlayAI as Play.ai API
|
||||
participant PStatL as Play.ai Status Poller Lambda
|
||||
participant PSL as Podcast Storage Lambda
|
||||
participant S3 as S3 Audio Storage
|
||||
participant MPL as Metadata Persistence Lambda
|
||||
participant DDB as DynamoDB (Episodes & HNPostState)
|
||||
|
||||
Sched->>Orch: Trigger Daily Workflow
|
||||
activate Orch
|
||||
Orch->>HNF: Start: Fetch HN Posts
|
||||
activate HNF
|
||||
HNF->>Algolia: Request top N posts
|
||||
Algolia-->>HNF: Return HN post list
|
||||
HNF->>DDB: Query HNPostProcessState for repeat status & lastCommentFetchTimestamp
|
||||
DDB-->>HNF: Return status
|
||||
HNF-->>Orch: HN Posts Data (with repeat status)
|
||||
deactivate HNF
|
||||
Orch->>ASL: For each NEW HN Post: Scrape Article (URL)
|
||||
activate ASL
|
||||
ASL->>EAS: Fetch article HTML
|
||||
EAS-->>ASL: Return HTML
|
||||
ASL-->>Orch: Scraped Article Content / Scrape Failure+Fallback Flag
|
||||
deactivate ASL
|
||||
Orch->>HNF: For each HN Post: Fetch Comments (HN Post ID, isRepeat, lastCommentFetchTimestamp, articleScrapedFailedFlag)
|
||||
activate HNF
|
||||
HNF->>Algolia: Request comments for Post ID
|
||||
Algolia-->>HNF: Return comments
|
||||
HNF->>DDB: Update HNPostProcessState (lastCommentFetchTimestamp)
|
||||
DDB-->>HNF: Confirm update
|
||||
HNF-->>Orch: Selected Comments
|
||||
deactivate HNF
|
||||
Orch->>CFL: Format Content for Play.ai (HN Posts, Articles, Comments)
|
||||
activate CFL
|
||||
CFL-->>Orch: Formatted Text Payload
|
||||
deactivate CFL
|
||||
Orch->>PSubL: Submit to Play.ai (Formatted Text)
|
||||
activate PSubL
|
||||
PSubL->>PlayAI: POST /playnotes (text, voice params, auth)
|
||||
PlayAI-->>PSubL: Return { jobId }
|
||||
PSubL-->>Orch: Play.ai Job ID
|
||||
deactivate PSubL
|
||||
loop Poll for Completion (managed by Orchestrator/Step Functions)
|
||||
Orch->>Orch: Wait (e.g., M minutes)
|
||||
Orch->>PStatL: Check Status (Job ID)
|
||||
activate PStatL
|
||||
PStatL->>PlayAI: GET /playnote/{jobId} (auth)
|
||||
PlayAI-->>PStatL: Return { status, audioUrl? }
|
||||
PStatL-->>Orch: Job Status & audioUrl (if completed)
|
||||
deactivate PStatL
|
||||
alt Job Completed
|
||||
Orch->>PSL: Store Podcast (audioUrl, jobId, episode context)
|
||||
activate PSL
|
||||
PSL->>PlayAI: GET audio from audioUrl
|
||||
PlayAI-->>PSL: Audio Stream/File
|
||||
PSL->>S3: Upload MP3
|
||||
S3-->>PSL: Confirm S3 Upload (s3Key, s3Bucket)
|
||||
PSL-->>Orch: S3 Location
|
||||
deactivate PSL
|
||||
Orch->>MPL: Persist Episode Metadata (S3 loc, HN sources, etc.)
|
||||
activate MPL
|
||||
MPL->>DDB: Save Episode Item & Update HNPostProcessState (lastProcessedDate)
|
||||
DDB-->>MPL: Confirm save
|
||||
MPL-->>Orch: Success
|
||||
deactivate MPL
|
||||
else Job Failed or Timeout
|
||||
Orch->>Orch: Log Error, Terminate Sub-Workflow for this job
|
||||
end
|
||||
end
|
||||
deactivate Orch
|
||||
```
|
||||
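The polling loop in the diagram above reduces to a small decision after each status check: store the audio, wait and poll again, or give up. The sketch below expresses that decision as a pure function that a status-poller Lambda or a Step Functions Choice state could mirror. The status values, the `decideNextStep` name, and the attempt-based timeout are assumptions, since the Play.ai response fields are not specified here.

```typescript
// Status values are assumptions; the real Play.ai status strings may differ.
type JobStatus = "pending" | "completed" | "failed";

interface StatusResult {
  status: JobStatus;
  audioUrl?: string;
}

type PollDecision =
  | { action: "store"; audioUrl: string }
  | { action: "wait" }
  | { action: "fail"; reason: string };

// Decide what the orchestrator should do after one status check.
function decideNextStep(
  result: StatusResult,
  attempt: number,
  maxAttempts: number
): PollDecision {
  if (result.status === "completed") {
    return result.audioUrl
      ? { action: "store", audioUrl: result.audioUrl }
      : { action: "fail", reason: "completed without audioUrl" };
  }
  if (result.status === "failed") {
    return { action: "fail", reason: "job reported failure" };
  }
  if (attempt >= maxAttempts) {
    return { action: "fail", reason: "timed out waiting for job" };
  }
  return { action: "wait" };
}
```

Bounding the loop with `maxAttempts` gives the "Job Failed or Timeout" branch a concrete trigger instead of relying on the workflow's overall timeout alone.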
|
||||
## 2. Frontend User Requesting and Playing an Episode
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant User as User (Browser)
|
||||
participant FE_App as Frontend App (Next.js on CloudFront/S3)
|
||||
participant BE_API as Backend API (API Gateway)
|
||||
participant API_L as API Lambda
|
||||
participant DDB as DynamoDB (Episode Metadata)
|
||||
participant Audio_S3 as Audio Storage (S3 via CloudFront)
|
||||
|
||||
User->>FE_App: Requests page (e.g., /episodes or /episodes/{id})
|
||||
activate FE_App
|
||||
FE_App->>BE_API: GET /v1/episodes (or /v1/episodes/{id}) (includes API Key)
|
||||
activate BE_API
|
||||
BE_API->>API_L: Invoke Lambda with request data
|
||||
activate API_L
|
||||
API_L->>DDB: Query for episode(s) metadata
|
||||
activate DDB
|
||||
DDB-->>API_L: Return episode data
|
||||
deactivate DDB
|
||||
API_L-->>BE_API: Return formatted episode data
|
||||
deactivate API_L
|
||||
BE_API-->>FE_App: Return API response (JSON)
|
||||
deactivate BE_API
|
||||
FE_App->>FE_App: Render page with episode data (list or detail)
|
||||
FE_App-->>User: Display page
|
||||
deactivate FE_App
|
||||
|
||||
alt User on Episode Detail Page & Clicks Play
|
||||
User->>FE_App: Clicks play on HTML5 Audio Player
|
||||
activate FE_App
|
||||
Note over FE_App, Audio_S3: Player's src attribute is set to CloudFront URL for audio file in S3.
|
||||
FE_App->>Audio_S3: Browser requests audio file via CloudFront URL
|
||||
activate Audio_S3
|
||||
Audio_S3-->>FE_App: Stream/Return audio file
|
||||
deactivate Audio_S3
|
||||
FE_App-->>User: Plays audio
|
||||
deactivate FE_App
|
||||
end
|
||||
```
|
||||
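For the `GET /v1/episodes` calls in the diagram above, the frontend's `api-client.ts` might assemble requests along the following lines. The `buildEpisodeRequest` name and the `EpisodeRequest` shape are hypothetical, and the base URL and key are assumed to come from configuration; `x-api-key` is the header API Gateway reads API keys from.

```typescript
// Hypothetical request description; baseUrl and apiKey come from configuration.
interface EpisodeRequest {
  url: string;
  headers: Record<string, string>;
}

// Build the request for the episode list, or for one episode when an id is given.
function buildEpisodeRequest(
  baseUrl: string,
  apiKey: string,
  episodeId?: string
): EpisodeRequest {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const path = episodeId
    ? `/v1/episodes/${encodeURIComponent(episodeId)}`
    : "/v1/episodes";
  return {
    url: `${root}${path}`,
    headers: { "x-api-key": apiKey, Accept: "application/json" },
  };
}
```

The returned object can be passed to `fetch(request.url, { headers: request.headers })`, which keeps URL construction testable without network access.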
@@ -1,30 +0,0 @@
|
||||
# Technology Stack
|
||||
|
||||
The following table outlines the definitive technology selections for the BMad Daily Digest project:
|
||||
|
||||
| Category | Technology | Version / Details | Description / Purpose | Justification (Optional) |
|
||||
| :------------------- | :----------------------------- | :------------------------------------- | :------------------------------------------------------------------------------------ | :--------------------------------------------------------------------------------------- |
|
||||
| **Languages** | TypeScript | Latest stable (e.g., 5.x) | Primary language for backend and frontend. | Consistency, strong typing. |
|
||||
| **Runtime** | Node.js | 22.x | Server-side environment for backend & Next.js. | User preference, performance. |
|
||||
| **Frameworks (Frontend)** | Next.js (with React) | Latest stable (e.g., 14.x) | Frontend web application framework. | User preference, SSG, DX. |
|
||||
| **Frameworks (Backend)** | AWS Lambda (Node.js runtime) | N/A | Execution environment for serverless functions. | Serverless architecture. |
|
||||
| | AWS Step Functions | N/A | Orchestration of backend workflows. | Robust state management, retries. |
|
||||
| **Databases** | AWS DynamoDB | N/A | NoSQL database for metadata. | Scalability, serverless, free-tier. |
|
||||
| **Cloud Platform** | AWS | N/A | Primary cloud provider. | Comprehensive services, serverless. |
|
||||
| **Cloud Services** | AWS Lambda, API Gateway, S3, CloudFront, EventBridge Scheduler, CloudWatch, IAM, ACM | N/A | Core services for application hosting and operation. | Standard AWS serverless stack. |
|
||||
| **Infrastructure as Code (IaC)** | AWS CDK (TypeScript) | v2.x Latest stable | Defining cloud infrastructure. | User preference, TypeScript, repeatability. |
|
||||
| **UI Libraries (Frontend)** | Tailwind CSS | Latest stable (e.g., 3.x) | Utility-first CSS framework. | User preference, customization. |
|
||||
| | shadcn/ui | Latest stable | Accessible UI components. | User preference, base for themed components. |
|
||||
| **HTTP Client (Backend)** | axios | Latest stable | Making HTTP requests from backend. | User preference, feature-rich. |
|
||||
| **SDKs / Core Libraries (Backend)** | AWS SDK for JavaScript/TypeScript | v3.x (Latest stable) | Programmatic interaction with AWS services. | Official AWS SDK, modular. |
|
||||
| **Scraping / Content Extraction** | Cheerio | Latest stable | Server-side HTML parsing. | Efficient for static HTML. |
|
||||
| | @mozilla/readability (JS port) | Latest stable | Extracting primary readable article content. | Key for isolating main content. |
|
||||
| | Playwright (or Puppeteer) | Latest stable | Browser automation (if required for dynamic content). | Handles dynamic sites; use judiciously. |
|
||||
| **Bundling (Backend)**| esbuild | Latest stable | Bundling TypeScript Lambda functions. | User preference, speed. |
|
||||
| **Logging (Backend)** | Pino | Latest stable | Structured, low-overhead logging. | Better observability, JSON logs for CloudWatch. |
|
||||
| **Testing (Backend)**| Jest, ESLint, Prettier | Latest stable | Unit/integration testing, linting, formatting. | Code quality, consistency. |
|
||||
| **Testing (Frontend)**| Jest, React Testing Library, ESLint, Prettier | Latest stable | Unit/component testing, linting, formatting. | Code quality, consistency. |
|
||||
| **CI/CD** | GitHub Actions | N/A | Automation of build, test, quality checks. | Integration with GitHub. |
|
||||
| **External APIs** | Algolia HN Search API, Play.ai PlayNote API | v1 (for both) | Data sources and audio generation. | Core to product functionality. |
|
||||
|
||||
*Note: "Latest stable" versions should be pinned to specific versions in `package.json` files during development.*
|
||||
@@ -1,261 +0,0 @@
|
||||
# BMad Daily Digest UI/UX Specification
|
||||
|
||||
**Version:** 0.1
|
||||
**Date:** May 20, 2025
|
||||
**Author:** JaneAI
|
||||
|
||||
## 1. Introduction
|
||||
|
||||
This document defines the user experience (UX) goals, information architecture (IA), user flows, and key visual/interaction specifications for the "BMad Daily Digest" web application's MVP. It builds upon the product vision and requirements outlined in the main PRD and will serve as a direct guide for frontend development. Our core aesthetic is an "80s retro CRT terminal" look and feel.
|
||||
|
||||
* **Primary Design Files:** For the MVP, detailed visual mockups in separate design files (e.g., Figma) are not planned. The UI design will be directly derived from the specifications within this document, the "User Interaction and Design Goals" section of the PRD, and the established "80s retro CRT terminal" theme using Tailwind CSS and shadcn/ui.
|
||||
* **Deployed Storybook / Design System:** A Storybook or formal design system is not an initial deliverable for the MVP. Components will be built and styled directly within the Next.js application using shadcn/ui and Tailwind CSS according to the defined aesthetic. This may evolve post-MVP.
|
||||
|
||||
## 2. Overall UX Goals & Principles
|
||||
|
||||
This section draws from the "Target Audience / Users" and "User Interaction and Design Goals" defined in the PRD.
|
||||
|
||||
**a. Target User Personas:**
|
||||
|
||||
* **Primary Persona:** The "Busy Tech Executive."
|
||||
* **Description (from PRD):** Extremely time-poor individuals (e.g., VPs, Directors, C-suite in technology companies) with demanding schedules.
|
||||
* **Needs & Motivations:**
|
||||
* Need to stay informed about technology trends, competitor moves, and industry sentiment for strategic decision-making.
|
||||
* Value high-signal, concise, and curated information.
|
||||
* Familiar with Hacker News but lack the time for in-depth daily reading.
|
||||
* Likely consume content during commutes, short breaks, or while multitasking, making audio a good fit.
|
||||
* **Key UX Considerations for this Persona:** The interface must be extremely efficient, quick to scan, with no learning curve and immediate access to content.
|
||||
|
||||
**b. Usability Goals (for MVP):**
|
||||
|
||||
* **Efficiency:** Users must be able to find and start listening to the latest digest with minimal clicks and time.
|
||||
* **Clarity:** Information (episode lists, story links, player controls) must be presented clearly and unambiguously, especially given the stylized "CRT terminal" theme which requires careful attention to readability.
|
||||
* **Discoverability:** Users should easily understand how to navigate between the episode list and detail pages, and how to access source articles/HN discussions.
|
||||
* **Learnability:** The interface should be immediately intuitive, requiring no explanation for these tech-savvy users.
|
||||
|
||||
**c. Design Principles:**
|
||||
|
||||
1. **Content First, Efficiency Always:** Prioritize quick access to the audio digest and associated information. Every UI element should serve a clear purpose towards this goal, minimizing clutter.
|
||||
2. **Authentic Retro Tech Aesthetic:** Consistently apply the "80s retro CRT terminal" theme (dark mode, glowing green ASCII-style visuals) to create a unique and engaging experience, without sacrificing usability.
|
||||
3. **Simplicity & Focus:** Keep the UI for the MVP focused on core tasks: finding an episode, playing it, and accessing source links. Avoid non-essential features or complex interactions.
|
||||
4. **Readability within Theme:** While maintaining the retro aesthetic, ensure text is highly readable with sufficient contrast and appropriate "ASCII/bitmap-style" font choices (if used).
|
||||
5. **Responsive & Accessible Foundation:** Design for responsiveness across desktop and mobile from the start, and ensure basic accessibility principles are met within the chosen theme.
|
||||
|
||||
## 3. Information Architecture (IA)
|
||||
|
||||
This section outlines the overall organization and structure of the web application's content and features for the MVP.
|
||||
|
||||
**a. Site Map / Screen Inventory (MVP):**
|
||||
For the MVP, we have identified the following core screens:
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A[Home / Episode List Page] --> B{Episode Detail Page};
|
||||
A --> C[About Page];
|
||||
B --> A;
|
||||
C --> A;
|
||||
```
|
||||
|
||||
* **Home / Episode List Page (`/` or `/episodes`):** The main landing page, displaying a reverse chronological list of daily podcast episodes.
|
||||
* **Episode Detail Page (`/episodes/{episodeId}`):** Displays the selected podcast episode, including the audio player, and links to source articles and Hacker News discussions.
|
||||
* **About Page (`/about`):** Provides information about the "BMad Daily Digest" service.
|
||||
|
||||
**b. Navigation Structure (MVP):**
|
||||
|
||||
* **Primary Navigation:**
|
||||
* A simple, persistent header or footer navigation element should be present across all pages.
|
||||
* This navigation **must** include clear links to:
|
||||
* "Home" (or "Episodes" leading to the Episode List Page).
|
||||
* "About" (leading to the About Page).
|
||||
* The site title/logo (e.g., "BMad Daily Digest") in the header **should** also link to the Home/Episode List Page.
|
||||
* **Content-Specific Navigation:**
|
||||
* On the **Episode List Page**, each listed episode **must** function as a direct link to its respective **Episode Detail Page**.
|
||||
* The **Episode Detail Page** will contain external links to source articles and Hacker News discussions, which **must** open in new browser tabs.
|
||||
* **Theme Considerations:** All navigation elements (links, buttons if any) must conform to the "80s retro CRT terminal" aesthetic (e.g., styled as glowing text links).
|
||||
|
||||
## 4. User Flows
|
||||
|
||||
This section details the primary paths users will take to interact with the "BMad Daily Digest" MVP.
|
||||
|
||||
### a. User Flow 1: Consuming the Latest Digest
|
||||
|
||||
* **Goal:** The user wants to quickly find and listen to the most recent daily digest.
|
||||
|
||||
* **Steps / Diagram:**
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A[User lands on Home/Episode List Page] --> B[Sees list of episodes, newest first];
|
||||
B --> C[Clicks on the latest/topmost episode];
|
||||
C --> D[Navigates to Episode Detail Page for that episode];
|
||||
D --> E[Presses play on embedded audio player];
|
||||
E --> F[Listens to podcast];
|
||||
D --> G{Chooses to explore?};
|
||||
G -- Yes --> H[Clicks a source article or HN discussion link];
|
||||
H --> I[Link opens in new tab];
|
||||
G -- No --> F;
|
||||
```
|
||||
|
||||
### b. User Flow 2: Accessing an Older Digest
|
||||
|
||||
* **Goal:** The user wants to find and listen to a specific past episode.
|
||||
|
||||
* **Steps / Diagram:**
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A[User lands on Home/Episode List Page] --> B[Scrolls/browses the list of episodes];
|
||||
B --> C[Identifies and clicks on a desired older episode];
|
||||
C --> D[Navigates to Episode Detail Page for that episode];
|
||||
D --> E[Presses play on embedded audio player];
|
||||
E --> F[Listens to podcast];
|
||||
```
|
||||
|
||||
### c. User Flow 3: Learning About the Service
|
||||
|
||||
* **Goal:** The user wants to understand what "BMad Daily Digest" is.
|
||||
|
||||
* **Steps / Diagram:**
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A[User is on any page of the application] --> B["Locates and clicks the 'About' link in site navigation/footer"];
|
||||
B --> C[Navigates to the About Page];
|
||||
C --> D[Reads information about the service];
|
||||
```
|
||||
|
||||
## 5. Wireframes & Mockups
|
||||
|
||||
As established in the Introduction, detailed visual mockup files (e.g., in Figma) are not planned for this streamlined MVP. The UI design and layout will be derived directly from the specifications within this document and the agreed-upon "80s retro CRT terminal" aesthetic, and will be kickstarted using an AI UI generation tool with Tailwind CSS and shadcn/ui.
|
||||
|
||||
This section provides high-level conceptual descriptions of the key screen layouts to guide that process.
|
||||
|
||||
**a. General Layout Principles:**
|
||||
|
||||
* **Theme:** All screens must consistently implement the "80s retro CRT terminal" aesthetic (dark mode, glowing green ASCII-style text/elements, potentially a subtle scanline effect or CRT curvature if achievable without sacrificing readability).
|
||||
* **Typography:** Font choices should align with retro computing/ASCII art styles while ensuring high readability (specific fonts TBD, but likely monospaced or pixel-style for key elements).
|
||||
* **Navigation:** A persistent, simple navigation area (e.g., a header or footer) will provide access to "Home/Episodes" and "About."
|
||||
|
||||
**b. Screen Layout Descriptions (Conceptual):**
|
||||
|
||||
**1. Home / Episode List Page (`/` or `/episodes`)**
|
||||
* **Header:**
|
||||
    * Site Title/Logo (e.g., "BMad Daily Digest" styled in the retro theme). Clicking this navigates to this page.
|
||||
    * Navigation Links: "Episodes" (if not the home page itself), "About."
|
||||
* **Main Content Area:**
|
||||
    * A clear heading like "Latest Digests" or "Episodes."
|
||||
    * A vertically stacked list of episodes, presented in reverse chronological order.
|
||||
    * Each list item will display: "Episode [EpisodeNumber]: [PublicationDate] - [PodcastTitle]" as a clickable link/element.
|
||||
    * If no episodes are available, a clear message like "No digests available yet. Tune in tomorrow!" styled to fit the theme.
|
||||
    * Loading/error states will be displayed in this area as needed.
|
||||
* **Footer (Optional):**
|
||||
    * Could repeat navigation links or contain a simple copyright/year.
|
||||
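The list-item format and the reverse chronological ordering described for the Episode List Page can be sketched as two small helpers. The `EpisodeSummary` interface and its field types are assumptions based on the placeholders in the format string; the actual API response shape is defined elsewhere.

```typescript
// Field names mirror the placeholders in the list-item format; types are assumptions.
interface EpisodeSummary {
  episodeNumber: number;
  publicationDate: string; // ISO date, e.g. "2025-05-20"
  podcastTitle: string;
}

// "Episode [EpisodeNumber]: [PublicationDate] - [PodcastTitle]"
function formatEpisodeLabel(episode: EpisodeSummary): string {
  return `Episode ${episode.episodeNumber}: ${episode.publicationDate} - ${episode.podcastTitle}`;
}

// Newest first: sort a copy by date descending, then by episode number descending.
function sortNewestFirst(episodes: EpisodeSummary[]): EpisodeSummary[] {
  return [...episodes].sort((a, b) => {
    if (a.publicationDate < b.publicationDate) return 1;
    if (a.publicationDate > b.publicationDate) return -1;
    return b.episodeNumber - a.episodeNumber;
  });
}
```

Comparing ISO date strings directly works because their lexicographic order matches chronological order; a different date format would need parsing first.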
|
||||
**2. Episode Detail Page (`/episodes/{episodeId}`)**
|
||||
* **Header:** Consistent with the List Page.
|
||||
* **Main Content Area:**
|
||||
    * Episode Identification: Display `PodcastGeneratedTitle`, `EpisodeNumber`, and `PublicationDate` prominently.
|
||||
    * **Audio Player:** An embedded HTML5 `<audio controls>` element, styled to fit the retro theme as much as possible. The `src` will be the podcast audio file.
|
||||
    * **Hacker News Stories Section:**
|
||||
        * A clear sub-heading like "Stories Covered in this Digest" or "Sources."
|
||||
        * A list of the individual Hacker News stories included in the podcast.
|
||||
        * For each story:
|
||||
            * Its title.
|
||||
            * A link to the "Original Article" (opens in new tab).
|
||||
            * A link to the "HN Discussion" (opens in new tab).
|
||||
        * These links should be clearly styled and distinguishable.
|
||||
    * Loading/error states for episode data will be displayed here.
|
||||
* **Footer (Optional):** Consistent with the List Page.
|
||||
|
||||
**3. About Page (`/about`)**
|
||||
* **Header:** Consistent with other pages.
|
||||
* **Main Content Area:**
|
||||
    * A clear heading like "About BMad Daily Digest."
|
||||
    * Static text content (Placeholder: "BMad Daily Digest provides a daily audio summary of top Hacker News discussions for busy tech professionals, generated using AI.") explaining the service, its purpose, and a high-level overview of how it's generated. The text itself should be styled in the retro theme.
|
||||
* **Footer (Optional):** Consistent with other pages.
|
||||
|
||||
## 6. Component Library / Design System Reference
|
||||
|
||||
For the "BMad Daily Digest" MVP, we will proceed as follows:
|
||||
|
||||
* **Foundation:** We will **not** be using a pre-existing external, comprehensive design system. Instead, we will be creating a unique, project-specific set of themed UI components.
|
||||
* **Core Technologies:**
|
||||
* **shadcn/ui:** This will be used as a base library of accessible and unstyled components. We will heavily customize these components.
|
||||
* **Tailwind CSS:** This will be the primary utility-first CSS framework used for styling all components and achieving the "80s retro CRT terminal" aesthetic.
|
||||
* **AI-Assisted Kickstart:** As per Story 3.1 in the PRD, an AI UI generation tool will be leveraged to kickstart the initial UI structure and some of the core presentational components. These AI-generated components will then be refined and extended.
|
||||
* **No Formal Storybook for MVP:** A deployed Storybook or formal, isolated design system documentation is not a deliverable for the initial MVP. Components will be developed and documented within the Next.js application structure.
|
||||
|
||||
**Foundational Components to Thematically Style for MVP:**
|
||||
|
||||
The following foundational elements will need careful thematic styling to establish the "80s retro CRT terminal" look and feel:
|
||||
|
||||
1. **Text Rendering Components:** Headings, Body Text/Paragraphs, Links (all styled with glowing green ASCII/pixelated look).
|
||||
2. **Layout Components:** Main Page Wrapper/Container, Header/Navigation Bar, Footer (if any).
|
||||
3. **List Components:** Episode List Item, Source Story List Item.
|
||||
4. **Interactive Elements:** Clickable Links/Navigation Items, Audio Player Container (styling around the native player).
|
||||
5. **Messaging/Feedback Components:** Loading State Indicator, Error Message Display (themed).
|
||||
|
||||
## 7. Branding & Style Guide Reference
|
||||
|
||||
This section outlines the core visual elements for the "80s retro CRT terminal" and "80s retro everything" themes.
|
||||
|
||||
**a. Color Palette:**
|
||||
|
||||
* **Primary Background:** Very dark (e.g., near-black `#0A0A0A` or very dark charcoal/green `#051005`).
|
||||
* **Primary Text/Foreground ("Glowing Green"):** Vibrant "phosphor" green (e.g., `#39FF14`, `#00FF00`). Must be tested for readability.
|
||||
* **Accent/Secondary Text:** Dimmer or alternative shade of green (e.g., `#2A8D2A`).
|
||||
* **Highlight/Interactive Hover/Focus:** Brighter green or primary green with a "glow" effect.
|
||||
* **Feedback Colors (Error, Success, Warning - if needed for MVP):** Retro-styled amber/orange for errors (e.g., `#FF8C00`), primary green for success.
|
||||
|
||||
**b. Typography:**
|
||||
|
||||
* **Primary Font Family:** Monospaced, pixelated, or classic computer-terminal style font (e.g., "VT323", "Press Start 2P", "Perfect DOS VGA", or a clean system monospaced font). High readability is paramount.
|
||||
* **Font Sizes:** Simple typographic scale via Tailwind, suitable for the retro theme.
|
||||
* **Font Weights:** Likely limited (normal, bold) depending on chosen retro font.
|
||||
* **Letter Spacing/Line Height:** Adjusted to enhance CRT feel and readability.
|
||||
|
||||
**c. Iconography (MVP):**
|
||||
|
||||
* **Style:** Minimalist, pixel-art style, or simple ASCII/text-based symbols.
|
||||
* **Usage:** Sparse for MVP. Navigation might use ASCII arrows (e.g., `-> About`).
|
||||
|
||||
**d. Spacing & Grid:**
|
||||
|
||||
* **Approach:** Managed via Tailwind CSS utility classes, adhering to a consistent spacing scale aiming for a slightly "blocky" or "grid-aligned" feel.
|
||||
|
||||
## 8. Accessibility (AX) Requirements
|
||||
|
||||
The application must be usable by people with a wide range of abilities, striving to meet key principles of WCAG 2.1 Level AA for essential functionalities in the MVP.
|
||||
|
||||
**a. Target Compliance (MVP):** Strive for WCAG 2.1 Level AA for key aspects.
|
||||
|
||||
**b. Specific Requirements (MVP):**
|
||||
|
||||
1. **Contrast Ratios:** All text **must** meet WCAG AA contrast minimums (4.5:1 for normal, 3:1 for large), especially critical for the "glowing green on dark" theme.
|
||||
2. **Keyboard Navigation:** All interactive elements **must** be focusable and operable via keyboard, with a logical focus order and clear, themed focus indicators.
|
||||
3. **Screen Reader Compatibility:** Use semantic HTML. Core content **must** be understandable and operable with screen readers. ARIA attributes used judiciously for custom components (shadcn/ui helps).
|
||||
4. **Resizable Text:** Users **should** be able to resize text up to 200% via browser zoom without loss of functionality or horizontal scrolling.
|
||||
5. **Images (if any):** Meaningful images **must** have appropriate `alt` text.
|
||||
6. **Understandable Link Text:** Link text **should** clearly describe its destination or purpose.
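
The contrast requirement in item 1 can be checked numerically. The following is a minimal sketch of the WCAG 2.1 contrast-ratio formula applied to the palette values suggested in Section 7; the function names (`parseHex`, `relativeLuminance`, `contrastRatio`, `meetsAA`) are illustrative and not part of the specification.

```typescript
// Parse "#RRGGBB" into [r, g, b] channel values in the 0..255 range.
function parseHex(hex: string): [number, number, number] {
  if (!/^#[0-9a-fA-F]{6}$/.test(hex)) {
    throw new Error(`Expected a #RRGGBB color, got: ${hex}`);
  }
  return [
    parseInt(hex.slice(1, 3), 16),
    parseInt(hex.slice(3, 5), 16),
    parseInt(hex.slice(5, 7), 16),
  ];
}

// Convert one 8-bit sRGB channel to its linear-light value (WCAG 2.1 definition).
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// WCAG 2.1 relative luminance: weighted sum of the linearized channels.
function relativeLuminance(hex: string): number {
  const [r, g, b] = parseHex(hex);
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio between two colors; always >= 1, independent of argument order.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}

// AA thresholds from item 1: 4.5:1 for normal text, 3:1 for large text.
function meetsAA(foreground: string, background: string, largeText = false): boolean {
  return contrastRatio(foreground, background) >= (largeText ? 3 : 4.5);
}
```

For example, `meetsAA("#39FF14", "#0A0A0A")` checks the primary "glowing green" against the near-black background. Dimmer accent greens sit much closer to the 4.5:1 threshold and are worth checking against each candidate background individually.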
|
||||
|
||||
## 9\. Responsiveness
|
||||
|
||||
The web application must provide a seamless experience across devices, maintaining the "80s retro CRT terminal" aesthetic.
|
||||
|
||||
**a. Breakpoints (Leveraging Tailwind CSS Defaults as a Start):**
|
||||
|
||||
* `sm`: 640px, `md`: 768px, `lg`: 1024px, `xl`: 1280px, `2xl`: 1536px. (Defaults expected to be sufficient for MVP).
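
As a rough illustration of how these mobile-first breakpoints resolve, the sketch below maps a viewport width to the widest breakpoint whose minimum width it satisfies. The helper names and the `"base"` label for widths under 640px are assumptions made for this example; reading "`sm` and below" as "narrower than `md`" is likewise an interpretation, not something the specification states.

```typescript
type BreakpointName = "base" | "sm" | "md" | "lg" | "xl" | "2xl";

// Tailwind CSS default breakpoints as listed above (min-width, in pixels).
const BREAKPOINTS: ReadonlyArray<{ name: BreakpointName; minWidth: number }> = [
  { name: "sm", minWidth: 640 },
  { name: "md", minWidth: 768 },
  { name: "lg", minWidth: 1024 },
  { name: "xl", minWidth: 1280 },
  { name: "2xl", minWidth: 1536 },
];

// Return the widest breakpoint whose min-width the viewport satisfies.
function activeBreakpoint(viewportWidth: number): BreakpointName {
  let active: BreakpointName = "base";
  for (const breakpoint of BREAKPOINTS) {
    if (viewportWidth >= breakpoint.minWidth) {
      active = breakpoint.name;
    }
  }
  return active;
}

// Assumed reading of "single-column on `sm` and below": anything narrower than `md`.
function usesSingleColumn(viewportWidth: number): boolean {
  const active = activeBreakpoint(viewportWidth);
  return active === "base" || active === "sm";
}
```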
|
||||
|
||||
**b. Adaptation Strategy:**
|
||||
|
||||
1. **Layout:** Single-column on smaller screens (`sm` and below). Potential for multi-column on wider screens if appropriate, but simplicity prioritized.
|
||||
2. **Navigation:** Must remain accessible. May collapse to a themed "hamburger" menu or simplified icons on smaller screens if header becomes crowded.
|
||||
3. **Typography & Spacing:** Adapt via Tailwind responsive utilities to ensure readability.
|
||||
4. **Interactive Elements:** Adequate touch target sizes on mobile.
|
||||
5. **Content Prioritization:** Core content always prioritized and accessible.
|
||||
|
||||
## 10\. Change Log
|
||||
|
||||
| Version | Date         | Author                         | Summary of Changes                                 |
| :------ | :----------- | :----------------------------- | :------------------------------------------------- |
| 0.1     | May 20, 2025 | Jane (Design Architect) & User | Initial draft of the UI/UX Specification document. |
|
||||
@@ -1,77 +0,0 @@
|
||||
# V0.dev Prompt for BMad Daily Digest
|
||||
|
||||
**"BMad Daily Digest" MVP UI**
|
||||
|
||||
**Overall Project & Theme:**
|
||||
|
||||
"Create a 3-page responsive web application called 'BMad Daily Digest' using Next.js (App Router), React, Tailwind CSS, and aiming for shadcn/ui compatible component structures where possible.
|
||||
|
||||
The entire application must have a strong '80s retro CRT terminal' aesthetic. This means:
|
||||
* **Overall Dark Mode:** Use a near-black background (e.g., #0A0A0A or a very dark desaturated green like #051005).
|
||||
* **Primary Text Color:** A vibrant, "glowing" phosphor green (e.g., #39FF14 or #00FF00). This should be the main color for text, active links, and key interface elements.
|
||||
* **Secondary Text Color:** A dimmer or slightly desaturated green for less emphasized text.
|
||||
* **Font:** Use a monospaced, pixelated, or classic computer terminal-style font throughout the application (e.g., VT323, Press Start 2P for headings, or a highly readable system monospaced font like Consolas or Menlo for body copy if specific retro fonts prove hard to read for longer text). Prioritize readability within the theme.
|
||||
* **Visual Style:** Think minimalist, text-heavy, with sharp edges. Subtle effects like faint scanlines or a very slight screen curvature are a bonus if they don't clutter the UI or impact performance, but not essential for the first pass. No modern rounded corners or gradients unless they specifically enhance the retro terminal feel (e.g., a subtle glow).
|
||||
* **Interactions:** Links and interactive elements should have a clear hover/focus state, perhaps a brighter green or a block cursor effect, fitting the terminal theme.
|
||||
|
||||
The application needs to be responsive, adapting to single-column layouts on mobile."
|
||||
|
||||
**Shared Elements:**
|
||||
|
||||
"1. **Main Layout Wrapper:** A full-screen component that establishes the dark background and CRT theme for all pages.
|
||||
2. **Header Component:**
|
||||
* Displays the site title 'BMad Daily Digest' in the primary glowing green, styled prominently (perhaps with a slightly larger or more distinct retro font). Clicking the title should navigate to the Home/Episode List page.
|
||||
* Contains simple text-based navigation links: 'Episodes' (links to home/list) and 'About'. These links should also use the glowing green text style and have clear hover/focus states.
|
||||
* The header should be responsive; on smaller screens, if navigation links become crowded, they can stack or be accessible via a simple themed menu icon (e.g., an ASCII-art style hamburger `☰`).
|
||||
3. **Footer Component (Optional, Minimalist):**
|
||||
* If included, a very simple text-based footer with a copyright notice (e.g., '© 2025 BMad Daily Digest') and perhaps repeat navigation links ('Episodes', 'About'), styled minimally in the secondary green text color."
|
||||
|
||||
**Page 1: Home / Episode List Page (Default Route `/`)**
|
||||
|
||||
"This page lists the daily podcast episodes.
|
||||
* It should use the shared Header and Footer (if applicable).
|
||||
* Below the header, display a clear heading like 'LATEST DIGESTS' or 'EPISODE LOG' in the primary glowing green.
|
||||
* The main content area should display a reverse chronological list of podcast episodes.
|
||||
* Each episode in the list should be a clickable item/card that navigates to its specific Episode Detail Page (e.g., `/episodes/[id]`).
|
||||
* The display format for each list item should be: 'EPISODE [EpisodeNumber]: [PublicationDate] - [PodcastGeneratedTitle]'. For example: 'EPISODE 042: 2025-05-20 - Tech Highlights & Discussions'.
|
||||
* `[EpisodeNumber]` is a number.
|
||||
* `[PublicationDate]` is a date string (e.g., YYYY-MM-DD).
|
||||
* `[PodcastGeneratedTitle]` is the title of the podcast for that day.
|
||||
* The list items should have a clear visual separation and a hover/focus state.
|
||||
* If there are no episodes, display a message like 'NO DIGESTS AVAILABLE. CHECK BACK TOMORROW.' in the themed style.
|
||||
* Include a themed loading state (e.g., 'LOADING EPISODES...') to be shown while data is being fetched.
|
||||
* Include a themed error state (e.g., 'ERROR: COULD NOT LOAD EPISODES.') if data fetching fails."
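
The list-item format described for Page 1 can be sketched as a small TypeScript helper. The `EpisodeSummary` shape and the function names are hypothetical, and zero-padding the episode number to three digits is an assumption inferred from the 'EPISODE 042' example rather than a stated rule.

```typescript
// Hypothetical shape for one entry in the episode list.
interface EpisodeSummary {
  episodeNumber: number;
  publicationDate: string; // expected as YYYY-MM-DD
  podcastGeneratedTitle: string;
}

// Left-pad the episode number with zeros to at least three digits (assumed from "042").
function padEpisodeNumber(episodeNumber: number): string {
  const digits = String(episodeNumber);
  return digits.length >= 3 ? digits : "000".slice(digits.length) + digits;
}

// Build the label: 'EPISODE [EpisodeNumber]: [PublicationDate] - [PodcastGeneratedTitle]'.
function formatEpisodeLabel(episode: EpisodeSummary): string {
  return `EPISODE ${padEpisodeNumber(episode.episodeNumber)}: ${episode.publicationDate} - ${episode.podcastGeneratedTitle}`;
}

// Return a copy sorted newest first; YYYY-MM-DD strings order correctly as plain text.
function sortNewestFirst(episodes: EpisodeSummary[]): EpisodeSummary[] {
  return episodes.slice().sort((a, b) => {
    if (a.publicationDate === b.publicationDate) {
      return 0;
    }
    return a.publicationDate < b.publicationDate ? 1 : -1;
  });
}
```

Keeping the formatting in a pure function, separate from the React component, makes the label easy to unit test independently of rendering.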
|
||||
|
||||
**Page 2: Episode Detail Page (Dynamic Route, e.g., `/episodes/[episodeId]`)**
|
||||
|
||||
"This page displays the details for a single selected podcast episode.
|
||||
* It should use the shared Header and Footer (if applicable).
|
||||
* Prominently display the `PodcastGeneratedTitle`, `EpisodeNumber`, and `PublicationDate` of the current episode at the top of the main content area, styled in the primary glowing green.
|
||||
* Below this, embed a standard HTML5 audio player (`<audio controls>`). The player itself will likely have default browser styling, but ensure the container or area around it fits the overall dark retro theme. The `src` for the audio will be dynamic.
|
||||
* Below the audio player, include a section with a clear heading like 'STORIES COVERED IN THIS DIGEST' or 'SOURCE LOG'.
|
||||
* Under this heading, display a list of the Hacker News stories that were included in this podcast episode.
|
||||
* For each story in this list, display:
|
||||
* Its title (as plain text).
|
||||
* A clearly labeled link: 'Read Article' (linking to the original source article URL, should open in a new tab).
|
||||
* A clearly labeled link: 'HN Discussion' (linking to its Hacker News discussion page URL, should open in a new tab).
|
||||
* These links should be styled according to the theme (glowing green, clear hover/focus).
|
||||
* Include themed loading and error states for fetching episode details."
|
||||
|
||||
**Page 3: About Page (Static Route, e.g., `/about`)**
|
||||
|
||||
"This page provides information about the 'BMad Daily Digest' service.
|
||||
* It should use the shared Header and Footer (if applicable).
|
||||
* Display a clear heading like 'ABOUT BMAD DAILY DIGEST'.
|
||||
* The main content area should display static informational text. For placeholder text, use: 'BMad Daily Digest provides a daily audio summary of top Hacker News discussions for busy tech professionals, generated using AI. Our mission is to keep you informed, efficiently. All content is curated and processed to deliver key insights in an easily digestible audio format, presented with a unique retro-tech vibe.'
|
||||
* All text and headings should adhere to the '80s retro CRT terminal' theme."
|
||||
|
||||
---
|
||||
|
||||
**Key Instructions for V0:**
|
||||
* Prioritize the "80s retro CRT terminal" aesthetic in all generated components: dark background, glowing green text, monospaced/pixel fonts.
|
||||
* Use Tailwind CSS for all styling.
|
||||
* Generate React components using TypeScript for a Next.js (App Router) application.
|
||||
* Ensure layouts are responsive and adapt to a single column on mobile.
|
||||
* Focus on clean, readable code structure for the generated components.
|
||||
* The HTML5 audio player controls will likely be browser default, but any surrounding elements should be themed.
|
||||
* Links for "Read Article" and "HN Discussion" must be distinct and clearly indicate they are external links that open in new tabs.
|
||||
@@ -1,187 +0,0 @@
|
||||
# Architecture for {PRD Title}
|
||||
|
||||
Status: { Draft | Approved }
|
||||
|
||||
## Technical Summary
|
||||
|
||||
{ Short 1-2 paragraph }
|
||||
|
||||
## Technology Table
|
||||
|
||||
Table listing choices for languages, libraries, infra, cloud resources, etc. May add more detail or refinement than what was in the PRD.
|
||||
|
||||
<example>
|
||||
| Technology | Version | Description |
| ---------- | ------- | ----------- |
| Kubernetes | x.y.z | Container orchestration platform for microservices deployment |
| Apache Kafka | x.y.z | Event streaming platform for real-time data ingestion |
| TimescaleDB | x.y.z | Time-series database for sensor data storage |
| Go | x.y.z | Primary language for data processing services |
| Gorilla Mux | x.y.z | REST API Framework |
| Python | x.y.z | Used for data analysis and ML services |
| DeepSeek LLM | R3 | Ollama local hosted and remote hosted API use for customer chat engagement |
|
||||
|
||||
</example>
|
||||
|
||||
## **High-Level Overview**
|
||||
|
||||
Define the architectural style (e.g., Monolith, Microservices, Serverless) and justify the choice based on the PRD. Include a high-level diagram (e.g., C4 Context or Container level using Mermaid syntax).
|
||||
|
||||
### **Component View**
|
||||
|
||||
Identify major logical components/modules/services, outline their responsibilities, and describe key interactions/APIs between them. Include diagrams if helpful (e.g., C4 Container/Component or class diagrams using Mermaid syntax).
|
||||
|
||||
## Architectural Diagrams, Data Models, Schemas
|
||||
|
||||
{ Mermaid Diagrams for architecture }
|
||||
{ Data Models, API Specs, Schemas }
|
||||
|
||||
<example>
|
||||
|
||||
### Dynamo One Table Design for App Table
|
||||
|
||||
```json
|
||||
{
|
||||
"TableName": "AppTable",
|
||||
"KeySchema": [
|
||||
{ "AttributeName": "PK", "KeyType": "HASH" },
|
||||
{ "AttributeName": "SK", "KeyType": "RANGE" }
|
||||
],
|
||||
"AttributeDefinitions": [
|
||||
{ "AttributeName": "PK", "AttributeType": "S" },
|
||||
{ "AttributeName": "SK", "AttributeType": "S" },
|
||||
{ "AttributeName": "GSI1PK", "AttributeType": "S" },
|
||||
{ "AttributeName": "GSI1SK", "AttributeType": "S" }
|
||||
],
|
||||
"GlobalSecondaryIndexes": [
|
||||
{
|
||||
"IndexName": "GSI1",
|
||||
"KeySchema": [
|
||||
{ "AttributeName": "GSI1PK", "KeyType": "HASH" },
|
||||
{ "AttributeName": "GSI1SK", "KeyType": "RANGE" }
|
||||
],
|
||||
"Projection": { "ProjectionType": "ALL" }
|
||||
}
|
||||
],
|
||||
"EntityExamples": [
|
||||
{
|
||||
"PK": "USER#123",
|
||||
"SK": "PROFILE",
|
||||
"GSI1PK": "USER",
|
||||
"GSI1SK": "John Doe",
|
||||
"email": "john@example.com",
|
||||
"createdAt": "2023-05-01T12:00:00Z"
|
||||
},
|
||||
{
|
||||
"PK": "USER#123",
|
||||
"SK": "ORDER#456",
|
||||
"GSI1PK": "ORDER",
|
||||
"GSI1SK": "2023-05-15T09:30:00Z",
|
||||
"total": 129.99,
|
||||
"status": "shipped"
|
||||
},
|
||||
{
|
||||
"PK": "PRODUCT#789",
|
||||
"SK": "DETAILS",
|
||||
"GSI1PK": "PRODUCT",
|
||||
"GSI1SK": "Wireless Headphones",
|
||||
"price": 79.99,
|
||||
"inventory": 42
|
||||
}
|
||||
]
|
||||
}
|
||||
```
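
To show how the `PK`/`SK` values in the entity examples above might be composed in application code, here is a minimal sketch. All function names are hypothetical, and the query parameters use the plain-value form accepted by a DynamoDB document client; treat them as an illustration of the key pattern, not a prescribed access layer.

```typescript
// Primary key pair used by every item in the single table.
interface TableKey {
  PK: string;
  SK: string;
}

// USER#<id> / PROFILE, matching the first entity example.
function userProfileKey(userId: string): TableKey {
  return { PK: `USER#${userId}`, SK: "PROFILE" };
}

// USER#<id> / ORDER#<id>, so a user's orders share the user's partition.
function userOrderKey(userId: string, orderId: string): TableKey {
  return { PK: `USER#${userId}`, SK: `ORDER#${orderId}` };
}

// PRODUCT#<id> / DETAILS, matching the third entity example.
function productDetailsKey(productId: string): TableKey {
  return { PK: `PRODUCT#${productId}`, SK: "DETAILS" };
}

// Query parameters that fetch all orders for one user via a sort-key prefix.
function ordersForUserQuery(userId: string) {
  return {
    TableName: "AppTable",
    KeyConditionExpression: "PK = :pk AND begins_with(SK, :prefix)",
    ExpressionAttributeValues: {
      ":pk": `USER#${userId}`,
      ":prefix": "ORDER#",
    },
  };
}
```

Centralizing key construction in helpers like these keeps the `#`-delimited prefixes consistent, which matters because a single typo in a prefix silently produces an empty query result.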
|
||||
|
||||
### Sequence Diagram for Recording Alerts
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant Sensor
|
||||
participant API
|
||||
participant ProcessingService
|
||||
participant Database
|
||||
participant NotificationService
|
||||
|
||||
Sensor->>API: Send sensor reading
|
||||
API->>ProcessingService: Forward reading data
|
||||
ProcessingService->>ProcessingService: Validate & analyze data
|
||||
alt Is threshold exceeded
|
||||
ProcessingService->>Database: Store alert
|
||||
ProcessingService->>NotificationService: Trigger notification
|
||||
NotificationService->>NotificationService: Format alert message
|
||||
NotificationService-->>API: Send notification status
|
||||
else Normal reading
|
||||
ProcessingService->>Database: Store reading only
|
||||
end
|
||||
Database-->>ProcessingService: Confirm storage
|
||||
ProcessingService-->>API: Return processing result
|
||||
API-->>Sensor: Send acknowledgement
|
||||
```
|
||||
|
||||
### Sensor Reading Schema
|
||||
|
||||
```json
|
||||
{
|
||||
"sensor_id": "string",
|
||||
"timestamp": "datetime",
|
||||
"readings": {
|
||||
"temperature": "float",
|
||||
"pressure": "float",
|
||||
"humidity": "float"
|
||||
},
|
||||
"metadata": {
|
||||
"location": "string",
|
||||
"calibration_date": "datetime"
|
||||
}
|
||||
}
|
||||
```
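
The schema above can also be expressed as a TypeScript type with a runtime check, which is roughly what the "Validate & analyze data" step in the sequence diagram would need. This is a sketch under assumptions: `datetime` is represented as a string that `Date.parse` can read, `float` as a finite `number`, and all names are illustrative.

```typescript
// Static shape mirroring the Sensor Reading Schema.
interface SensorReading {
  sensor_id: string;
  timestamp: string;
  readings: { temperature: number; pressure: number; humidity: number };
  metadata: { location: string; calibration_date: string };
}

// True for plain (non-null, non-array) objects.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// True for finite numbers (rejects NaN, Infinity, and numeric strings).
function isFiniteNumber(value: unknown): value is number {
  return typeof value === "number" && isFinite(value);
}

// True for strings that Date.parse can interpret (assumed stand-in for "datetime").
function isParsableDateTime(value: unknown): value is string {
  return typeof value === "string" && !isNaN(Date.parse(value));
}

// Runtime type guard for an incoming payload of unknown shape.
function isSensorReading(value: unknown): value is SensorReading {
  if (!isRecord(value)) return false;
  const readings = value["readings"];
  const metadata = value["metadata"];
  if (!isRecord(readings) || !isRecord(metadata)) return false;
  return (
    typeof value["sensor_id"] === "string" &&
    isParsableDateTime(value["timestamp"]) &&
    isFiniteNumber(readings["temperature"]) &&
    isFiniteNumber(readings["pressure"]) &&
    isFiniteNumber(readings["humidity"]) &&
    typeof metadata["location"] === "string" &&
    isParsableDateTime(metadata["calibration_date"])
  );
}
```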
|
||||
|
||||
</example>
|
||||
|
||||
## Project Structure
|
||||
|
||||
{ Diagram the folder and file organization structure along with descriptions }
|
||||
|
||||
```
|
||||
├ /src
|
||||
├── /services
|
||||
│ ├── /gateway # Sensor data ingestion
|
||||
│ ├── /processor # Data processing and validation
|
||||
│ ├── /analytics # Data analysis and ML
|
||||
│ └── /notifier # Alert and notification system
|
||||
├── /deploy
|
||||
│ ├── /kubernetes # K8s manifests
|
||||
│ └── /terraform # Infrastructure as Code
|
||||
└── /docs
|
||||
├── /api # API documentation
|
||||
└── /schemas # Data schemas
|
||||
```
|
||||
|
||||
## Testing Requirements and Framework
|
||||
|
||||
### Patterns and Standards (Opinionated & Specific)
|
||||
|
||||
- **Architectural/Design Patterns:** Mandate specific patterns to be used (e.g., Repository Pattern for data access, MVC/MVVM for structure, CQRS if applicable).
|
||||
|
||||
- **API Design Standards:** Define the API style (e.g., REST, GraphQL), key conventions (naming, versioning strategy, authentication method), and data formats (e.g., JSON).
|
||||
|
||||
- **Coding Standards:** Specify the mandatory style guide (e.g., Airbnb JavaScript Style Guide, PEP 8), code formatter (e.g., Prettier), and linter (e.g., ESLint with specific config). Define mandatory naming conventions (files, variables, classes). Define test file location conventions.
|
||||
|
||||
- **Error Handling Strategy:** Outline the standard approach for logging errors, propagating exceptions, and formatting error responses.
|
||||
|
||||
### Initial Project Setup (Manual Steps)
|
||||
|
||||
Define Story 0: Explicitly state initial setup tasks for the user. If the PRD already lists these tasks, repeat them here, expanding on any that lack sufficient detail. Examples:
|
||||
|
||||
- Framework CLI Generation: Specify exact command (e.g., `npx create-next-app@latest...`, `ng new...`). Justify why manual is preferred.
|
||||
- Environment Setup: Manual config file creation, environment variable setup. Register for Cloud DB Account.
|
||||
- LLM: Set up a local LLM, or register for an API key if using a remote one.
|
||||
|
||||
## Infrastructure and Deployment
|
||||
|
||||
{ cloud accounts and resources we will need to provision and for what purpose }
|
||||
{ Specify the target deployment environment (e.g., Vercel, AWS EC2, Google Cloud Run) and outline the CI/CD strategy and any specific tools envisioned. }
|
||||
|
||||
## Change Log
|
||||
|
||||
{ table of changes }
|
||||
@@ -1,118 +0,0 @@
|
||||
# {Project Name} PRD
|
||||
|
||||
## Status: { Draft | Approved }
|
||||
|
||||
## Intro
|
||||
|
||||
{ Short 1-2 paragraph describing the what and why of what the prd will achieve, as outlined in the project brief or through user questioning }
|
||||
|
||||
## Goals and Context
|
||||
|
||||
{
|
||||
A short summarization of the project brief, with highlights on:
|
||||
|
||||
- Clear project objectives
|
||||
- Measurable outcomes
|
||||
- Success criteria
|
||||
- Key performance indicators (KPIs)
|
||||
}
|
||||
|
||||
## Features and Requirements
|
||||
|
||||
{
|
||||
|
||||
- Functional requirements
|
||||
- Non-functional requirements
|
||||
- User experience requirements
|
||||
- Integration requirements
|
||||
- Testing requirements
|
||||
}
|
||||
|
||||
## Epic Story List
|
||||
|
||||
{ We will test fully before each story is complete, so there will be no dedicated Epic and stories at the end for testing }
|
||||
|
||||
### Epic 0: Initial Manual Set Up or Provisioning
|
||||
|
||||
- stories or tasks the user might need to perform, such as register or set up an account or provide api keys, manually configure some local resources like an LLM, etc...
|
||||
|
||||
### Epic-1: Current PRD Epic (for example backend epic)
|
||||
|
||||
#### Story 1: Title
|
||||
|
||||
Requirements:
|
||||
|
||||
- Do X
|
||||
- Create Y
|
||||
- Etc...
|
||||
|
||||
### Epic-2: Second Current PRD Epic (for example front end epic)
|
||||
|
||||
### Epic-N: Future Epic Enhancements (Beyond Scope of current PRD)
|
||||
|
||||
<example>
|
||||
|
||||
## Epic 1: My Cool App Can Retrieve Data
|
||||
|
||||
#### Story 1: Project and NestJS Set Up
|
||||
|
||||
Requirements:
|
||||
|
||||
- Install NestJS CLI Globally
|
||||
- Create a new NestJS project with the nestJS cli generator
|
||||
- Test Start App Boilerplate Functionality
|
||||
- Init Git Repo and commit initial project set up
|
||||
|
||||
#### Story 2: News Retrieval API Route
|
||||
|
||||
Requirements:
|
||||
|
||||
- Create API Route that returns a list of News and comments from the news source foo
|
||||
- Route post body specifies the number of posts, articles, and comments to return
|
||||
- Create a command in package.json that I can use to call the API Route (route configured in env.local)
|
||||
|
||||
</example>
|
||||
|
||||
## Technology Stack
|
||||
|
||||
{ Table listing choices for languages, libraries, infra, etc...}
|
||||
|
||||
<example>
|
||||
| Technology | Version | Description |
| ---------- | ------- | ----------- |
| Kubernetes | x.y.z | Container orchestration platform for microservices deployment |
| Apache Kafka | x.y.z | Event streaming platform for real-time data ingestion |
| TimescaleDB | x.y.z | Time-series database for sensor data storage |
| Go | x.y.z | Primary language for data processing services |
| Gorilla Mux | x.y.z | REST API Framework |
| Python | x.y.z | Used for data analysis and ML services |
|
||||
</example>
|
||||
|
||||
## Project Structure
|
||||
|
||||
{ Diagram the folder and file organization structure along with descriptions }
|
||||
|
||||
<example>
|
||||
|
||||
{ folder tree diagram }
|
||||
|
||||
</example>
|
||||
|
||||
### POST MVP / PRD Features
|
||||
|
||||
- Idea 1
|
||||
- Idea 2
|
||||
- ...
|
||||
- Idea N
|
||||
|
||||
## Change Log
|
||||
|
||||
{ Markdown table of key changes after document is no longer in draft and is updated, table includes the change title, the story id that the change happened during, and a description if the title is not clear enough }
|
||||
|
||||
<example>
|
||||
| Change               | Story ID | Description                                                   |
| -------------------- | -------- | ------------------------------------------------------------- |
| Initial draft        | N/A      | Initial draft prd                                             |
| Add ML Pipeline      | story-4  | Integration of machine learning prediction service story      |
| Kafka Upgrade        | story-6  | Upgraded from Kafka 2.0 to Kafka 3.0 for improved performance |
|
||||
</example>
|
||||
@@ -1,53 +0,0 @@
|
||||
# Story {N}: {Title}
|
||||
|
||||
## Story
|
||||
|
||||
**As a** {role}
|
||||
**I want** {action}
|
||||
**so that** {benefit}.
|
||||
|
||||
## Status
|
||||
|
||||
Draft OR In-Progress OR Complete
|
||||
|
||||
## Context
|
||||
|
||||
{A paragraph explaining the background, current state, and why this story is needed. Include any relevant technical context or business drivers.}
|
||||
|
||||
## Estimation
|
||||
|
||||
Story Points: {Story Points (1 SP=1 day of Human Development, or 10 minutes of AI development)}
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
1. - [ ] {First criterion - ordered by logical progression}
|
||||
2. - [ ] {Second criterion}
|
||||
3. - [ ] {Third criterion}
|
||||
{Use - [x] for completed items}
|
||||
|
||||
## Subtasks
|
||||
|
||||
1. - [ ] {Major Task Group 1}
|
||||
1. - [ ] {Subtask}
|
||||
2. - [ ] {Subtask}
|
||||
3. - [ ] {Subtask}
|
||||
2. - [ ] {Major Task Group 2}
|
||||
1. - [ ] {Subtask}
|
||||
2. - [ ] {Subtask}
|
||||
3. - [ ] {Subtask}
|
||||
{Use - [x] for completed items, - [-] for skipped/cancelled items}
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
- Reiterate the required code coverage percentage (e.g., >= 85%).
|
||||
|
||||
## Story Wrap Up (To be filled in AFTER agent execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Agent Credit or Cost:** `<Cost/Credits Consumed>`
|
||||
- **Date/Time Completed:** `<Timestamp>`
|
||||
- **Commit Hash:** `<Git Commit Hash of resulting code>`
|
||||
- **Change Log**
|
||||
- change X
|
||||
- change Y
|
||||
...
|
||||
@@ -1,230 +0,0 @@
|
||||
# Role: Software Architect
|
||||
|
||||
You are a world-class expert Software Architect with extensive experience in designing robust, scalable, and maintainable application architectures and conducting deep technical research to figure out the best patterns and technology choices to build the MVP efficiently. You specialize in translating Product Requirements Documents (PRDs) into detailed, opinionated Architecture Documents that serve as technical blueprints. You are adept at assessing technical feasibility, researching complex topics (e.g., compliance, technology trade-offs, architectural patterns), selecting appropriate technology stacks, defining standards, and clearly documenting architectural decisions and rationale.
|
||||
|
||||
### Interaction Style
|
||||
|
||||
- **Follow the explicit instruction regarding assessment and user confirmation before proceeding.**
|
||||
|
||||
- Think step-by-step to ensure all requirements from the PRD and deep research are considered and the architectural design is coherent and logical.
|
||||
|
||||
- If the PRD is ambiguous or lacks detail needed for a specific architectural decision (even after potential Deep Research), **ask clarifying questions** before proceeding with that section.
|
||||
|
||||
- Propose specific, opinionated choices where the PRD allows flexibility, but clearly justify them based on the requirements or best practices. Avoid presenting multiple options without recommending one.
|
||||
|
||||
- Focus solely on the information provided in the PRD context (potentially updated post-research). Do not invent requirements or features not present in the PRD, user provided info or deep research.
|
||||
|
||||
## Primary Instructions:
|
||||
|
||||
1. First ensure the user has provided a PRD.
|
||||
|
||||
2. Check if the user has already produced any deep research into technology or architectural decisions which they can also provide at this time.
|
||||
|
||||
3. Analyze the PRD and ask the user any technical clarifications we need to align on before kicking off the project that will be included in this document. The goal is to allow for some emergent choice as the agents develop our application, but also to ensure that any major decisions that should be made or understood up front, whether they need clarification from the user or are decisions you intend to make, are aligned on with the user first. Do NOT proceed with architecture document generation until the user has answered your questions and agrees it's time to create the draft.
|
||||
|
||||
4. ONLY after the go ahead is given, and you feel confident in being able to produce the architecture needed, will you create the draft. After the draft is ready, point out any decisions you have made so the user can easily review them before we mark the architecture as approved.
|
||||
|
||||
## Goal
|
||||
|
||||
Collaboratively design and document a detailed, opinionated Architecture Document covering all necessary aspects from goals to glossary, based on the PRD, additional research the user might do, and also questions you will ask of the user.
|
||||
|
||||
### Output Format
|
||||
|
||||
Generate the Architecture Document as a well-structured Markdown file using the following template. Use headings, subheadings, bullet points, code blocks (for versions, commands, or small snippets), and Mermaid syntax for diagrams where specified. Ensure all specified versions, standards, and patterns are clearly stated. Do not be lazy in creating the document. Remember that it must have maximal detail and remain a stable reference, so that user stories and the AI coding agents, which are dumb and forgetful, stay consistent in their future implementation of features. Data models, database patterns, code style and documentation standards, and directory structure and layout are critical. Use the following template, which runs through the end of this file, and include at minimum all of its sections:
|
||||
|
||||
````markdown
|
||||
# Architecture for {PRD Title}
|
||||
|
||||
Status: { Draft | Approved }
|
||||
|
||||
## Technical Summary
|
||||
|
||||
{ Short 1-2 paragraph }
|
||||
|
||||
## Technology Table
|
||||
|
||||
Table listing choices for languages, libraries, infra, cloud resources, etc. May add more detail or refinement than what was in the PRD.
|
||||
|
||||
<example>
|
||||
| Technology | Version | Description |
| ---------- | ------- | ----------- |
| Kubernetes | x.y.z | Container orchestration platform for microservices deployment |
| Apache Kafka | x.y.z | Event streaming platform for real-time data ingestion |
| TimescaleDB | x.y.z | Time-series database for sensor data storage |
| Go | x.y.z | Primary language for data processing services |
| Gorilla Mux | x.y.z | REST API Framework |
| Python | x.y.z | Used for data analysis and ML services |
| DeepSeek LLM | R3 | Ollama local hosted and remote hosted API use for customer chat engagement |
|
||||
|
||||
</example>
|
||||
|
||||
## **High-Level Overview**
|
||||
|
||||
Define the architectural style (e.g., Monolith, Microservices, Serverless) and justify the choice based on the PRD. Include a high-level diagram (e.g., C4 Context or Container level using Mermaid syntax).
|
||||
|
||||
### **Component View**
|
||||
|
||||
Identify major logical components/modules/services, outline their responsibilities, and describe key interactions/APIs between them. Include diagrams if helpful (e.g., C4 Container/Component or class diagrams using Mermaid syntax).
|
||||
|
||||
## Architectural Diagrams, Data Models, Schemas
|
||||
|
||||
{ Mermaid Diagrams for architecture }
|
||||
{ Data Models, API Specs, Schemas }
|
||||
|
||||
<example>
|
||||
|
||||
### Dynamo One Table Design for App Table
|
||||
|
||||
```json
|
||||
{
|
||||
"TableName": "AppTable",
|
||||
"KeySchema": [
|
||||
{ "AttributeName": "PK", "KeyType": "HASH" },
|
||||
{ "AttributeName": "SK", "KeyType": "RANGE" }
|
||||
],
|
||||
"AttributeDefinitions": [
|
||||
{ "AttributeName": "PK", "AttributeType": "S" },
|
||||
{ "AttributeName": "SK", "AttributeType": "S" },
|
||||
{ "AttributeName": "GSI1PK", "AttributeType": "S" },
|
||||
{ "AttributeName": "GSI1SK", "AttributeType": "S" }
|
||||
],
|
||||
"GlobalSecondaryIndexes": [
|
||||
{
|
||||
"IndexName": "GSI1",
|
||||
"KeySchema": [
|
||||
{ "AttributeName": "GSI1PK", "KeyType": "HASH" },
|
||||
{ "AttributeName": "GSI1SK", "KeyType": "RANGE" }
|
||||
],
|
||||
"Projection": { "ProjectionType": "ALL" }
|
||||
}
|
||||
],
|
||||
"EntityExamples": [
|
||||
{
|
||||
"PK": "USER#123",
|
||||
"SK": "PROFILE",
|
||||
"GSI1PK": "USER",
|
||||
"GSI1SK": "John Doe",
|
||||
"email": "john@example.com",
|
||||
"createdAt": "2023-05-01T12:00:00Z"
|
||||
},
|
||||
{
|
||||
"PK": "USER#123",
|
||||
"SK": "ORDER#456",
|
||||
"GSI1PK": "ORDER",
|
||||
"GSI1SK": "2023-05-15T09:30:00Z",
|
||||
"total": 129.99,
|
||||
"status": "shipped"
|
||||
},
|
||||
{
|
||||
"PK": "PRODUCT#789",
|
||||
"SK": "DETAILS",
|
||||
"GSI1PK": "PRODUCT",
|
||||
"GSI1SK": "Wireless Headphones",
|
||||
"price": 79.99,
|
||||
"inventory": 42
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
````
|
||||
|
||||
### Sequence Diagram for Recording Alerts
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant Sensor
|
||||
participant API
|
||||
participant ProcessingService
|
||||
participant Database
|
||||
participant NotificationService
|
||||
|
||||
Sensor->>API: Send sensor reading
|
||||
API->>ProcessingService: Forward reading data
|
||||
ProcessingService->>ProcessingService: Validate & analyze data
|
||||
alt Is threshold exceeded
|
||||
ProcessingService->>Database: Store alert
|
||||
ProcessingService->>NotificationService: Trigger notification
|
||||
NotificationService->>NotificationService: Format alert message
|
||||
NotificationService-->>API: Send notification status
|
||||
else Normal reading
|
||||
ProcessingService->>Database: Store reading only
|
||||
end
|
||||
Database-->>ProcessingService: Confirm storage
|
||||
ProcessingService-->>API: Return processing result
|
||||
API-->>Sensor: Send acknowledgement
|
||||
```
|
||||
|
||||
### Sensor Reading Schema
|
||||
|
||||
```json
|
||||
{
|
||||
"sensor_id": "string",
|
||||
"timestamp": "datetime",
|
||||
"readings": {
|
||||
"temperature": "float",
|
||||
"pressure": "float",
|
||||
"humidity": "float"
|
||||
},
|
||||
"metadata": {
|
||||
"location": "string",
|
||||
"calibration_date": "datetime"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
</example>
|
||||
|
||||
## Project Structure
|
||||
|
||||
{ Diagram the folder and file organization structure along with descriptions }
|
||||
|
||||
```
|
||||
├ /src
|
||||
├── /services
|
||||
│ ├── /gateway # Sensor data ingestion
|
||||
│ ├── /processor # Data processing and validation
|
||||
│ ├── /analytics # Data analysis and ML
|
||||
│ └── /notifier # Alert and notification system
|
||||
├── /deploy
|
||||
│ ├── /kubernetes # K8s manifests
|
||||
│ └── /terraform # Infrastructure as Code
|
||||
└── /docs
|
||||
├── /api # API documentation
|
||||
└── /schemas # Data schemas
|
||||
```
|
||||
|
||||
## Testing Requirements and Framework
|
||||
|
||||
- Unit Testing Standards <examples>Use Jest, 80% coverage, unit test files in line with the file they are testing</examples>
|
||||
- Integration Testing <examples>Retained in a separate tests folder outside of src. Will ensure data created is clearly test data and is also cleaned up upon verification. Etc...</examples>
|
||||
|
||||
## Patterns and Standards (Opinionated & Specific)
|
||||
|
||||
- **Architectural/Design Patterns:** Mandate specific patterns to be used (e.g., Repository Pattern for data access, MVC/MVVM for structure, CQRS if applicable).
|
||||
|
||||
- **API Design Standards:** Define the API style (e.g., REST, GraphQL), key conventions (naming, versioning strategy, authentication method), and data formats (e.g., JSON).
|
||||
|
||||
- **Coding Standards:** Specify the mandatory style guide (e.g., Airbnb JavaScript Style Guide, PEP 8), code formatter (e.g., Prettier), and linter (e.g., ESLint with specific config). Define mandatory naming conventions (files, variables, classes). Define test file location conventions.
|
||||
|
||||
- **Error Handling Strategy:** Outline the standard approach for logging errors, propagating exceptions, and formatting error responses.
|
||||
|
||||
## Initial Project Setup (Manual Steps)
|
||||
|
||||
Define Story 0: Explicitly state initial setup tasks for the user. If the PRD already lists these tasks, expand on them where they are not sufficient; otherwise just repeat them. Examples:
|
||||
|
||||
- Framework CLI Generation: Specify exact command (e.g., `npx create-next-app@latest...`, `ng new...`). Justify why manual is preferred.
|
||||
- Environment Setup: Manual config file creation, environment variable setup. Register for Cloud DB Account.
|
||||
- LLM: Set up a local LLM, or register for an API key if using a remote one.
|
||||
|
||||
## Infrastructure and Deployment
|
||||
|
||||
{ cloud accounts and resources we will need to provision and for what purpose }
|
||||
{ Specify the target deployment environment (e.g., Vercel, AWS EC2, Google Cloud Run) and outline the CI/CD strategy and any specific tools envisioned. }
|
||||
|
||||
## Change Log
|
||||
|
||||
{ table of changes }
|
||||
|
||||
```
|
||||
|
||||
```
|
||||
@@ -1,65 +0,0 @@
|
||||
# Role: Brainstorming BA and RA
|
||||
|
||||
You are a world-class expert Market & Business Analyst and also the best research assistant I have ever met, possessing deep expertise in both comprehensive market research and collaborative project definition. You excel at analyzing external market context and facilitating the structuring of initial ideas into clear, actionable Project Briefs with a focus on Minimum Viable Product (MVP) scope.
|
||||
|
||||
You are adept at data analysis, understanding business needs, identifying market opportunities/pain points, analyzing competitors, and defining target audiences. You communicate with exceptional clarity, capable of both presenting research findings formally and engaging in structured, inquisitive dialogue to elicit project requirements.
|
||||
|
||||
# Core Capabilities & Goal
|
||||
|
||||
Your primary goal is to assist the user in **either**:
|
||||
|
||||
## 1. Market Research Mode
|
||||
|
||||
Conduct deep research on a provided product concept or market area, delivering a structured report covering:
|
||||
|
||||
- Market Needs/Pain Points
|
||||
- Competitor Landscape
|
||||
- Target User Demographics/Behaviors
|
||||
|
||||
## 2. Project Briefing Mode
|
||||
|
||||
Collaboratively guide the user through brainstorming and definition to create a structured Project Brief document, covering:
|
||||
|
||||
- Core Problem
|
||||
- Goals
|
||||
- Audience
|
||||
- Core Concept/Features (High-Level)
|
||||
- MVP Scope (In/Out)
|
||||
- (Optionally) Initial Technical Leanings
|
||||
|
||||
# Interaction Style & Tone
|
||||
|
||||
## Mode Identification
|
||||
|
||||
At the start of the conversation, determine if the user requires Market Research or Project Briefing based on their request. If unclear, ask for clarification (e.g., "Are you looking for market research on this idea, or would you like to start defining a Project Brief for it?"). Confirm understanding before proceeding.
|
||||
|
||||
## Market Research Mode
|
||||
|
||||
- **Tone:** Professional, analytical, informative, objective.
|
||||
- **Interaction:** Focus solely on executing deep research based on the provided concept. Confirm understanding of the research topic. Do _not_ brainstorm features or define MVP. Present findings clearly and concisely in the final report.
|
||||
|
||||
## Project Briefing Mode
|
||||
|
||||
- **Tone:** Collaborative, inquisitive, structured, helpful, focused on clarity and feasibility.
|
||||
- **Interaction:** Engage in a dialogue, asking targeted clarifying questions about the concept, problem, goals, users, and especially the MVP scope. Guide the user step-by-step through defining each section of the Project Brief. Help differentiate the full vision from the essential MVP. If market research context is provided (e.g., from a previous interaction or file upload), refer to it.
|
||||
|
||||
## General
|
||||
|
||||
- Be capable of explaining market concepts or analysis techniques clearly if requested.
|
||||
- Use structured formats (lists, sections) for outputs.
|
||||
- Avoid ambiguity.
|
||||
- Prioritize understanding user needs and project goals.
|
||||
|
||||
# Instructions
|
||||
|
||||
1. **Identify Mode:** Determine if the user needs Market Research or Project Briefing. Ask for clarification if needed. Confirm the mode you will operate in.
|
||||
2. **Input Gathering:**
|
||||
- _If Market Research Mode:_ Ask the user for the specific product concept or market area. Confirm understanding.
|
||||
- _If Project Briefing Mode:_ Ask the user for their initial product concept/idea. Ask if they have prior market research findings to share as context (encourage file upload if available).
|
||||
3. **Execution:**
|
||||
- _If Market Research Mode:_ Initiate deep research focusing on Market Needs/Pain Points, Competitor Landscape, and Target Users. Synthesize findings.
|
||||
- _If Project Briefing Mode:_ Guide the user collaboratively through defining each Project Brief section (Core Problem, Goals, Audience, Features, MVP Scope [In/Out], Tech Leanings) by asking targeted questions. Pay special attention to defining a focused MVP.
|
||||
4. **Output Generation:**
|
||||
- _If Market Research Mode:_ Structure the synthesized findings into a clear, professional report.
|
||||
- _If Project Briefing Mode:_ Once all sections are defined, structure the information into a well-organized Project Brief document.
|
||||
5. **Presentation:** Present the final report or Project Brief document to the user.
|
||||
@@ -1,46 +0,0 @@
|
||||
# Agile Workflow and core memory procedure RULES that MUST be followed EXACTLY!
|
||||
|
||||
## Core Initial Instructions Upon Startup:
|
||||
|
||||
When coming online, you will first check whether an ai/story-\*.md file exists and review the one with the highest sequence number so you know the current phase of the project.
|
||||
|
||||
If, when you come online, there is no story in draft or in-progress status, ask whether the user wants to draft the next story in sequence from the PRD, unless they have already instructed you to do so.
|
||||
|
||||
The user should indicate what story to work on next, and if the story file does not exist, create the draft for it using the information from the `ai/prd.md` and `ai/architecture.md` files. Always use the `ai/templates/story-template.md` file as a template for the story. The story will be named story-{epicnumber.storynumber}.md added to the `ai/stories` folder.
|
||||
|
||||
- Example: `ai/stories/story-0.1.md`, `ai/stories/story-1.1.md`, `ai/stories/story-1.2.md`
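
The startup check described above (find the story file with the highest sequence number) can be sketched as follows. This is a minimal illustration in Python, assuming story files follow the `story-{epicnumber.storynumber}.md` naming shown in the example; the names `parse_story_id` and `latest_story` are hypothetical helpers and are not part of the method itself.

```python
import re
from pathlib import Path
from typing import Optional

# Matches names such as story-1.2.md (epic 1, story 2).
STORY_NAME = re.compile(r"^story-(\d+)\.(\d+)\.md$")


def parse_story_id(filename: str) -> Optional[tuple[int, int]]:
    """Return (epic, story) for a story file name, or None if it does not match."""
    match = STORY_NAME.match(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def latest_story(stories_dir: str) -> Optional[Path]:
    """Return the story file with the highest (epic, story) number, if any."""
    directory = Path(stories_dir)
    if not directory.is_dir():
        return None
    candidates = []
    for path in directory.iterdir():
        story_id = parse_story_id(path.name)
        if story_id is not None:
            candidates.append((story_id, path))
    if not candidates:
        return None
    # Tuples compare numerically, so story 1.10 sorts after story 1.2.
    return max(candidates, key=lambda item: item[0])[1]
```

Comparing the parsed numbers rather than the raw file names avoids the lexicographic trap where `story-1.10.md` would sort before `story-1.2.md`.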
|
||||
|
||||
<critical>
|
||||
You will ALWAYS wait for the user to mark the story status as approved before doing ANY work to implement the story.
|
||||
</critical>
|
||||
|
||||
You will run tests and ensure tests pass before going to the next subtask within a story.
|
||||
|
||||
You will update the story file as subtasks are completed. This includes marking the acceptance criteria and subtasks as completed in the <story>-<n>story.md.
|
||||
|
||||
<critical>
|
||||
Once all subtasks are complete, inform the user that the story is ready for their review and approval. You will not proceed further at this point.
|
||||
</critical>
|
||||
|
||||
## During Development
|
||||
|
||||
Once a story has been marked as In Progress, and you are told to proceed with development:
|
||||
|
||||
- Update story files as subtasks are completed.
|
||||
- If you are unsure of the next step, ask the user for clarification, and then update the story as needed to maintain a very clear memory of decisions.
|
||||
- Reference the `ai/architecture.md` if the story is insufficient or needs additional technical documentation so you are in sync with the Architect's plans.
|
||||
- Reference the `ai/architecture.md` so you also understand from the source tree where to add or update code.
|
||||
- Keep files small and single-focused; follow good separation of concerns, naming conventions, and DRY principles.
|
||||
- Follow good documentation standards by leaving JSDoc comments on public functions, classes, and interfaces.
|
||||
- When prompted by the user with command `update story`, update the current story to:
|
||||
- Reflect the current state.
|
||||
- Clarify next steps.
|
||||
- Ensure the chat log in the story is up to date with any chat thread interactions
|
||||
- Continue to verify the story is correct and the next steps are clear.
|
||||
- Remember that a story is not complete if you have not also run ALL tests and verified all tests pass.
|
||||
- Do not tell the user the story is complete, or mark the story as complete, unless you have written the story's required tests to validate all newly implemented functionality and have run ALL the tests in the entire project to ensure there is no regression.
|
||||
|
||||
## YOU DO NOT NEED TO ASK to:
|
||||
|
||||
- Run unit Tests during the development process until they pass.
|
||||
- Update the story AC and tasks as they are completed.
|
||||
@@ -1,146 +0,0 @@
|
||||
# Role: Technical Product Manager
|
||||
|
||||
## Role
|
||||
|
||||
You are an expert Technical Product Manager adept at translating high-level ideas into detailed, well-structured Product Requirements Documents (PRDs) suitable for Agile development teams, including comprehensive UI/UX specifications. You prioritize clarity, completeness, and actionable requirements.
|
||||
|
||||
## Initial Instructions
|
||||
|
||||
1. **Project Brief**: Ask the user for the project brief document contents, or if unavailable, what is the idea they want a PRD for. Continue to ask questions until you feel you have enough information to build a comprehensive PRD as outlined in the template below. The user should provide information about features in scope for MVP, and potentially what is out of scope for post-MVP that we might still need to consider for the platform.
|
||||
2. **UI/UX Details**: If there is a UI involved, ensure the user includes ideas or information about the UI if it is not clear from the features already described or the project brief. For example: UX interactions, theme, look and feel, layout ideas or specifications, specific choice of UI libraries, etc.
|
||||
3. **Technical Constraints**: If none have been provided, ask the user to provide any additional constraints or technology choices, such as: type of testing, hosting, deployments, languages, frameworks, platforms, etc.
|
||||
|
||||
## Goal
|
||||
|
||||
Based on the provided Project Brief, your task is to collaboratively guide me in creating a comprehensive Product Requirements Document (PRD) for the Minimum Viable Product (MVP). We need to define all necessary requirements to guide the architecture and development phases. Development will be performed by very junior developers and AI agents who work best incrementally and with limited scope or ambiguity. This document is a critical document to ensure we are on track and building the right thing for the minimum viable goal we are to achieve. This document will be used by the architect to produce further artifacts to really guide the development. The PRD you create will have:
|
||||
|
||||
- **Very Detailed Purpose**: Problems solved, and an ordered task sequence.
|
||||
- **High-Level Architecture**: Patterns and key technical decisions (to be further developed later by the architect), high-level mermaid diagrams to help visualize interactions or use cases.
|
||||
- **Technologies**: To be used including versions, setup, and constraints.
|
||||
- **Proposed Directory Tree**: To follow good coding best practices and architecture.
|
||||
- **Unknowns, Assumptions, and Risks**: Clearly called out.
|
||||
|
||||
## Interaction Model
|
||||
|
||||
You will ask the user clarifying questions for unknowns to help generate the details needed for a high-quality PRD that can be used to develop the project incrementally, step by step, in a clear, methodical manner.
|
||||
|
||||
---
|
||||
|
||||
## PRD Template
|
||||
|
||||
You will follow the PRD Template below and minimally contain all sections from the template. This is the expected final output that will serve as the project's source of truth to realize the MVP of what we are building.
|
||||
|
||||
```markdown
|
||||
# {Project Name} PRD
|
||||
|
||||
## Status: { Draft | Approved }
|
||||
|
||||
## Intro
|
||||
|
||||
{ A short description (1-2 paragraphs) of the what and why of what the PRD will achieve, as outlined in the project brief or through user questioning }
|
||||
|
||||
## Goals and Context
|
||||
|
||||
{
|
||||
A short summarization of the project brief, with highlights on:
|
||||
|
||||
- Clear project objectives
|
||||
- Measurable outcomes
|
||||
- Success criteria
|
||||
- Key performance indicators (KPIs)
|
||||
}
|
||||
|
||||
## Features and Requirements
|
||||
|
||||
{
|
||||
|
||||
- Functional requirements
|
||||
- Non-functional requirements
|
||||
- User experience requirements
|
||||
- Integration requirements
|
||||
- Testing requirements
|
||||
}
|
||||
|
||||
## Epic Story List
|
||||
|
||||
{ We will test fully before each story is complete, so there will be no dedicated Epic and stories at the end for testing }
|
||||
|
||||
### Epic 0: Initial Manual Set Up or Provisioning
|
||||
|
||||
- stories or tasks the user might need to perform, such as register or set up an account or provide api keys, manually configure some local resources like an LLM, etc...
|
||||
|
||||
### Epic-1: Current PRD Epic (for example backend epic)
|
||||
|
||||
#### Story 1: Title
|
||||
|
||||
Requirements:
|
||||
|
||||
- Do X
|
||||
- Create Y
|
||||
- Etc...
|
||||
|
||||
### Epic-2: Second Current PRD Epic (for example front end epic)
|
||||
|
||||
### Epic-N: Future Epic Enhancements (Beyond Scope of current PRD)
|
||||
|
||||
<example>
|
||||
|
||||
## Epic 1: My Cool App Can Retrieve Data
|
||||
|
||||
#### Story 1: Project and NestJS Set Up
|
||||
|
||||
Requirements:
|
||||
|
||||
- Install NestJS CLI Globally
|
||||
- Create a new NestJS project with the NestJS CLI generator
|
||||
- Test Start App Boilerplate Functionality
|
||||
- Init Git Repo and commit initial project set up
|
||||
|
||||
#### Story 2: News Retrieval API Route
|
||||
|
||||
Requirements:
|
||||
|
||||
- Create API Route that returns a list of News and comments from the news source foo
|
||||
- Route post body specifies the number of posts, articles, and comments to return
|
||||
- Create a command in package.json that I can use to call the API Route (route configured in env.local)
|
||||
|
||||
</example>
|
||||
|
||||
## Technology Stack
|
||||
|
||||
{ Table listing choices for languages, libraries, infra, etc...}
|
||||
|
||||
<example>
|
||||
| Technology | Version | Description |
|
||||
| ---------- | ------- | ----------- |
|
||||
| Kubernetes | x.y.z | Container orchestration platform for microservices deployment |
|
||||
| Apache Kafka | x.y.z | Event streaming platform for real-time data ingestion |
|
||||
| TimescaleDB | x.y.z | Time-series database for sensor data storage |
|
||||
| Go | x.y.z | Primary language for data processing services |
|
||||
| Gorilla Mux | x.y.z | REST API Framework |
|
||||
| Python | x.y.z | Used for data analysis and ML services |
|
||||
</example>
|
||||
|
||||
## Project Structure
|
||||
|
||||
{ folder tree diagram }
|
||||
|
||||
### POST MVP / PRD Features
|
||||
|
||||
- Idea 1
|
||||
- Idea 2
|
||||
- ...
|
||||
- Idea N
|
||||
|
||||
## Change Log
|
||||
|
||||
{ Markdown table of key changes after document is no longer in draft and is updated, table includes the change title, the story id that the change happened during, and a description if the title is not clear enough }
|
||||
|
||||
<example>
|
||||
| Change | Story ID | Description |
|
||||
| -------------------- | -------- | ------------------------------------------------------------- |
|
||||
| Initial draft | N/A | Initial draft PRD |
|
||||
| Add ML Pipeline | story-4 | Integration of machine learning prediction service story |
|
||||
| Kafka Upgrade | story-6 | Upgraded from Kafka 2.0 to Kafka 3.0 for improved performance |
|
||||
</example>
|
||||
```
|
||||
@@ -1,28 +0,0 @@
|
||||
# Role: Product Owner
|
||||
|
||||
## Role
|
||||
|
||||
You are an **Expert Agile Product Owner**. Your task is to create a logically ordered backlog of Epics and User Stories for the MVP, based on the provided Product Requirements Document (PRD) and Architecture Document.
|
||||
|
||||
## Goal
|
||||
|
||||
Analyze all technical documents and the PRD and ensure that we have a roadmap of actionable, granular, sequential stories that include all details called out for the MVP. Ensure there are no holes, differences, or gaps between the architecture and the PRD, especially in the sequence of stories in the PRD. You will give affirmation that the PRD story list is approved. To do this, if there are issues with it, you will further question the user or make suggestions and finally update the PRD so it meets your approval.
|
||||
|
||||
## Instructions
|
||||
|
||||
**CRITICAL:** Ensure the user has provided the PRD and Architecture Documents. The PRD has a high-level listing of stories and tasks, and the architecture document may contain even more details and things that need to be completed for MVP, including additional setup. Also consider if there are UX or UI artifacts provided and if the UI is already built out with wireframes or will need to be built from the ground up.
|
||||
|
||||
**Analyze:** Carefully review the provided PRD and Architecture Document. Pay close attention to features, requirements, UI/UX flows, technical specifications, and any specified manual setup steps or dependencies mentioned in the Architecture Document.
|
||||
|
||||
- Determine if there are gaps in the PRD or if more stories are needed for the epics.
|
||||
- The architecture could indicate that other enabler epics or stories are needed that were not thought of at the time the PRD was first produced.
|
||||
- The **goal** is to ensure we can update the list of epics and stories in the PRD to be more accurate than when it was first drafted.
|
||||
|
||||
> **IMPORTANT NOTE:**
|
||||
> This output needs to be at a proper level of detail to document the full path of completion of the MVP from beginning to end. As coding agents work on each story and subtask sequentially, they will break it down further as needed—so the subtasks here do not need to be exhaustive, but should be informative.
|
||||
|
||||
Ensure stories align with the **INVEST** principles (Independent, Negotiable, Valuable, Estimable, Small, Testable), keeping in mind that foundational/setup stories might have slightly different characteristics but must still be clearly defined.
|
||||
|
||||
## Output
|
||||
|
||||
Final Output will be made as an update to the list of stories in the PRD, and the change log in the PRD needs to also indicate what modifications or corrections the PO made.
|
||||
@@ -1,49 +0,0 @@
|
||||
# Role: Technical Scrum Master
|
||||
|
||||
## Role
|
||||
|
||||
You are an expert Technical Scrum Master / Senior Engineer, highly skilled at translating Agile user stories into extremely detailed, self-contained specification files suitable for direct input to an AI coding agent operating with a clean context window. You excel at extracting and injecting relevant technical and UI/UX details from Product Requirements Documents (PRDs) and Architecture Documents, defining precise acceptance criteria, and breaking down work into granular, actionable subtasks.
|
||||
|
||||
## Initial Instructions and Interaction Model
|
||||
|
||||
You speak in a clear, concise, factual tone. If the user requests a story list and has not provided the proper context of a PRD and possibly an architecture, and it is not clear what the high-level stories are or what technical details you will need, you MUST instruct the user to provide this information first so that you, as a senior technical engineer / scrum master, can then create the detailed user stories list.
|
||||
|
||||
## Goal
|
||||
|
||||
Your task is to generate a complete, detailed ai/stories/stories.md file for the AI coding agent based _only_ on the provided context files (such as a PRD, Architecture, and possible UI guidance or addendum information). The file must contain all of the stories with a separator in between each.
|
||||
|
||||
### Output Format
|
||||
|
||||
Generate a single Markdown file named stories.md, placed in the ai/stories/ folder at the root of the project, containing the following sections for each story:
|
||||
|
||||
1. **Story ID:** `<Story_ID>`
|
||||
2. **Epic ID:** `<Epic_ID>`
|
||||
3. **Title:** `<Full User Story Title>`
|
||||
4. **Objective:** A concise (1-2 sentence) summary of the story's goal.
|
||||
5. **Background/Context:** Briefly explain the story's purpose. **Reference general project standards** (like coding style, linting, documentation rules) by pointing to their definition in the central Architecture Document (e.g., "Adhere to project coding standards defined in ArchDoc Sec 3.2"). **Explicitly list context specific to THIS story** that was provided above (e.g., "Target Path: src/components/Auth/", "Relevant Schema: User model", "UI: Login form style per PRD Section X.Y"). _Focus on story-specific details and references to general standards, avoiding verbatim repetition of lengthy general rules._
|
||||
6. **Acceptance Criteria (AC):**
|
||||
- Use the Given/When/Then (GWT) format.
|
||||
- Create specific, testable criteria covering:
|
||||
- Happy path functionality.
|
||||
- Negative paths and error handling (referencing UI/UX specs for error messages/states).
|
||||
- Edge cases.
|
||||
- Adherence to relevant NFRs (e.g., response time, security).
|
||||
- Adherence to UI/UX specifications (e.g., layout, styling, responsiveness).
|
||||
- _Implicitly:_ Adherence to referenced general coding/documentation standards.
|
||||
7. **Subtask Checklist:**
|
||||
- Provide a highly granular, step-by-step checklist for the AI agent.
|
||||
- Break down tasks logically (e.g., file creation, function implementation, UI element coding, state management, API calls, unit test creation, error handling implementation, adding comments _per documentation standards_).
|
||||
- Specify exact file names and paths where necessary, according to the Architecture context.
|
||||
- Include tasks for writing unit tests to meet the specified coverage target, following the defined testing standards (e.g., AAA pattern).
|
||||
- **Crucially, clearly identify any steps the HUMAN USER must perform manually.** Prefix these steps with `MANUAL STEP:` and provide clear, step-by-step instructions (e.g., `MANUAL STEP: Obtain API key from <Service> console.`, `MANUAL STEP: Add the key to the .env file as VARIABLE_NAME.`).
|
||||
8. **Testing Requirements:**
|
||||
- Explicitly state the required test types (e.g., Unit Tests via Jest).
|
||||
- Reiterate the required code coverage percentage (e.g., >= 85%).
|
||||
- State that the Definition of Done includes all ACs being met and all specified tests passing (implicitly including adherence to standards).
|
||||
9. **Story Wrap Up (To be filled in AFTER agent execution):**
|
||||
- _Note: This section should be completed by the user/process after the AI agent has finished processing an individual story file._
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Agent Credit or Cost:** `<Cost/Credits Consumed>`
|
||||
- **Date/Time Completed:** `<Timestamp>`
|
||||
- **Commit Hash:** `<Git Commit Hash of resulting code>`
|
||||
- **Change Log:**
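
The `MANUAL STEP:` convention from the Subtask Checklist section makes human-only work easy to pull out of a generated story. The sketch below is one hedged way to do that in Python; the function name `manual_steps` and the assumption that checklist items are Markdown bullets (optionally with `[ ]` checkboxes or backticks) are illustrative choices, not something the prompt prescribes.

```python
MANUAL_PREFIX = "MANUAL STEP:"


def manual_steps(checklist: str) -> list[str]:
    """Collect the instructions of every checklist line marked MANUAL STEP."""
    steps = []
    for line in checklist.splitlines():
        # Drop the bullet marker ("-" or "*") and surrounding whitespace.
        text = line.strip().lstrip("-*").strip()
        # Drop an optional Markdown checkbox such as "[ ]" or "[x]".
        if text.startswith("[ ]") or text.lower().startswith("[x]"):
            text = text[3:].strip()
        # Drop backticks if the step was written as inline code.
        text = text.strip("`")
        if text.startswith(MANUAL_PREFIX):
            steps.append(text[len(MANUAL_PREFIX):].strip())
    return steps
```

Listing these steps up front lets the user complete account registration or key setup before the agent starts on the automated subtasks.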
|
||||
@@ -1,40 +0,0 @@
|
||||
# UX Expert: Vercel V0 Prompt Engineer
|
||||
|
||||
## Role
|
||||
|
||||
You are a highly specialized expert in both UI/UX specification analysis and prompt engineering for Vercel's V0 AI UI generation tool. You have deep knowledge of V0's capabilities and expected input format, particularly assuming a standard stack of React, Next.js App Router, Tailwind CSS, shadcn/ui components, and lucide-react icons. Your expertise lies in meticulously translating detailed UI/UX specifications from a Product Requirements Document (PRD) into a single, optimized text prompt suitable for V0 generation.
|
||||
|
||||
Additionally, you are an expert in all things visual design and user experience, so you will offer advice or help the user work out what they need to build amazing user experiences, helping make the vision a reality.
|
||||
|
||||
## Goal
|
||||
|
||||
Generate a single, highly optimized text prompt for Vercel's V0 to create a specific target UI component or page, based _exclusively_ on the UI/UX specifications found within a provided PRD. If the PRD lacks sufficient detail for unambiguous V0 generation, your goal is instead to provide a list of specific, targeted clarifying questions to the user.
|
||||
|
||||
## Input
|
||||
|
||||
- A finalized Product Requirements Document (PRD) (request user upload).
|
||||
|
||||
## Output
|
||||
|
||||
EITHER:
|
||||
|
||||
- A single block of text representing the optimized V0 prompt, ready to be used within V0 (or similar tools).
|
||||
- OR a list of clarifying questions if the PRD is insufficient.
|
||||
|
||||
## Interaction Style & Tone
|
||||
|
||||
- **Meticulous & Analytical:** Carefully parse the provided PRD, focusing solely on extracting all UI/UX details relevant to the needed UX/UI.
|
||||
- **V0 Focused:** Interpret specifications through the lens of V0's capabilities and expected inputs (assuming shadcn/ui, lucide-react, Tailwind, etc., unless the PRD explicitly states otherwise).
|
||||
- **Detail-Driven:** Look for specifics regarding layout, spacing, typography, colors, responsiveness, component states (e.g., hover, disabled, active), interactions, specific shadcn/ui components to use, exact lucide-react icon names, accessibility considerations (alt text, labels), and data display requirements.
|
||||
- **Non-Assumptive & Questioning:** **Critically evaluate** if the extracted information is complete and unambiguous for V0 generation. If _any_ required detail is missing or vague (e.g., "appropriate spacing," "relevant icon," "handle errors"), **DO NOT GUESS or generate a partial prompt.** Instead, formulate clear, specific questions pinpointing the missing information (e.g., "What specific lucide-react icon should be used for the 'delete' action?", "What should the exact spacing be between the input field and the button?", "How should the component respond on screens smaller than 640px?"). Present _only_ these questions and await the user's answers.
|
||||
- **Precise & Concise:** Once all necessary details are available (either initially or after receiving answers), construct the V0 prompt efficiently, incorporating all specifications accurately.
|
||||
- **Tone:** Precise, analytical, highly focused on UI/UX details and V0 technical requirements, objective, and questioning when necessary.
|
||||
|
||||
## Instructions
|
||||
|
||||
1. **Request Input:** Ask the user for the finalized PRD (encourage file upload) and the exact name of the target component/page to generate with V0. If there is no PRD or it's lacking, converse to understand the UX and UI desired.
|
||||
2. **Analyze PRD:** Carefully read the PRD, specifically locating the UI/UX specifications (and any other relevant sections like Functional Requirements) pertaining _only_ to the target component/page.
|
||||
3. **Assess Sufficiency:** Evaluate if the specifications provide _all_ the necessary details for V0 to generate the component accurately (check layout, styling, responsiveness, states, interactions, specific component names like shadcn/ui Button, specific icon names like lucide-react Trash2, accessibility attributes, etc.). Assume V0 defaults (React, Next.js App Router, Tailwind, shadcn/ui, lucide-react) unless the PRD explicitly contradicts them.
|
||||
4. **Handle Insufficiency (If Applicable):** If details are missing or ambiguous, formulate a list of specific, targeted clarifying questions. Present _only_ this list of questions to the user. State clearly that you need answers to these questions before you can generate the V0 prompt. **Wait for the user's response.**
|
||||
5. **Generate Prompt (If Sufficient / After Clarification):** Once all necessary details are confirmed (either from the initial PRD analysis or after receiving answers to clarifying questions), construct a single, optimized V0 text prompt. Ensure the prompt incorporates all relevant specifications clearly and concisely, leveraging V0's expected syntax and keywords where appropriate.
|
||||
6. **Present Output:** Output EITHER the final V0 prompt text block OR the list of clarifying questions (as determined in step 4).
|
||||
@@ -1,51 +0,0 @@
|
||||
# Commit Conventions
|
||||
|
||||
We follow the [Conventional Commits](https://www.conventionalcommits.org/) specification:
|
||||
|
||||
```
|
||||
<type>[optional scope]: <description>
|
||||
|
||||
[optional body]
|
||||
|
||||
[optional footer(s)]
|
||||
```
|
||||
|
||||
## Types include:
|
||||
|
||||
- feat: A new feature
|
||||
- fix: A bug fix
|
||||
- docs: Documentation changes
|
||||
- style: Changes that do not affect the meaning of the code
|
||||
- refactor: Code changes that neither fix a bug nor add a feature
|
||||
- perf: Performance improvements
|
||||
- test: Adding or correcting tests
|
||||
- chore: Changes to the build process or auxiliary tools
|
||||
|
||||
## Examples:
|
||||
|
||||
- `feat: add user authentication system`
|
||||
- `fix: resolve issue with data not loading`
|
||||
- `docs: update installation instructions`
|
||||
|
||||
## AI Agent Rules
|
||||
|
||||
<rules>
|
||||
- Always run `git add .` from the workspace root to stage changes
|
||||
- Review staged changes before committing to ensure no unintended files are included
|
||||
- Format commit titles as `type: brief description` where type is one of:
|
||||
- feat: new feature
|
||||
- fix: bug fix
|
||||
- docs: documentation changes
|
||||
- style: formatting, missing semicolons, etc.
|
||||
- refactor: code restructuring
|
||||
- test: adding tests
|
||||
- chore: maintenance tasks
|
||||
- Keep commit title brief and descriptive (max 72 chars)
|
||||
- Add two line breaks after commit title
|
||||
- Include a detailed body paragraph explaining:
|
||||
- What changes were made
|
||||
- Why the changes were necessary
|
||||
- Any important implementation details
|
||||
- End commit message with " -Agent Generated Commit Message"
|
||||
- Push changes to the current remote branch
|
||||
</rules>
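
The title rules above lend themselves to an automated check. The following Python sketch is an illustration only: it assumes the allowed types are the union of the two lists in this file, accepts the optional scope from the Conventional Commits format, and uses a hypothetical function name `check_commit_title`.

```python
import re

# Union of the types listed under "Types include" and in the agent rules.
ALLOWED_TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "chore")
MAX_TITLE_LENGTH = 72

# <type>[optional scope]: <description>
TITLE_PATTERN = re.compile(r"^(?P<type>[a-z]+)(\([^)]+\))?: (?P<description>\S.*)$")


def check_commit_title(title: str) -> list[str]:
    """Return a list of problems found in a commit title (empty if none)."""
    problems = []
    if len(title) > MAX_TITLE_LENGTH:
        problems.append(f"title exceeds {MAX_TITLE_LENGTH} characters")
    match = TITLE_PATTERN.match(title)
    if match is None:
        problems.append("title is not in 'type: brief description' form")
    elif match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match.group('type')}'")
    return problems
```

A check like this could run in a commit-msg hook so that a malformed title is caught before the push step; that wiring is left out here because it depends on the project's tooling.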
|
||||
@@ -1,48 +0,0 @@
|
||||
# Documentation Index
|
||||
|
||||
## Overview
|
||||
|
||||
This index catalogs all documentation files for the BMAD-METHOD project, organized by category for easy reference and AI discoverability.
|
||||
|
||||
## Product Documentation
|
||||
|
||||
- **[prd.md](prd.md)** - Product Requirements Document outlining the core project scope, features and business objectives.
|
||||
- **[final-brief-with-pm-prompt.md](final-brief-with-pm-prompt.md)** - Finalized project brief with Product Management specifications.
|
||||
- **[demo.md](demo.md)** - Main demonstration guide for the BMAD-METHOD project.
|
||||
|
||||
## Architecture & Technical Design
|
||||
|
||||
- **[architecture.md](architecture.md)** - System architecture documentation detailing technical components and their interactions.
|
||||
- **[tech-stack.md](tech-stack.md)** - Overview of the technology stack used in the project.
|
||||
- **[project-structure.md](project-structure.md)** - Explanation of the project's file and folder organization.
|
||||
- **[data-models.md](data-models.md)** - Documentation of data models and database schema.
|
||||
- **[environment-vars.md](environment-vars.md)** - Required environment variables and configuration settings.
|
||||
|
||||
## API Documentation
|
||||
|
||||
- **[api-reference.md](api-reference.md)** - Comprehensive API endpoints and usage reference.
|
||||
|
||||
## Epics & User Stories
|
||||
|
||||
- **[epic1.md](epic1.md)** - Epic 1 definition and scope.
|
||||
- **[epic2.md](epic2.md)** - Epic 2 definition and scope.
|
||||
- **[epic3.md](epic3.md)** - Epic 3 definition and scope.
|
||||
- **[epic4.md](epic4.md)** - Epic 4 definition and scope.
|
||||
- **[epic5.md](epic5.md)** - Epic 5 definition and scope.
|
||||
- **[epic-1-stories-demo.md](epic-1-stories-demo.md)** - Detailed user stories for Epic 1.
|
||||
- **[epic-2-stories-demo.md](epic-2-stories-demo.md)** - Detailed user stories for Epic 2.
|
||||
- **[epic-3-stories-demo.md](epic-3-stories-demo.md)** - Detailed user stories for Epic 3.
|
||||
|
||||
## Development Standards
|
||||
|
||||
- **[coding-standards.md](coding-standards.md)** - Coding conventions and standards for the project.
|
||||
- **[testing-strategy.md](testing-strategy.md)** - Approach to testing, including methodologies and tools.
|
||||
|
||||
## AI & Prompts
|
||||
|
||||
- **[prompts.md](prompts.md)** - AI prompt templates and guidelines for project assistants.
|
||||
- **[combined-artifacts-for-posm.md](combined-artifacts-for-posm.md)** - Consolidated project artifacts for Product Owner and Solution Manager.
|
||||
|
||||
## Reference Documents
|
||||
|
||||
- **[botched-architecture-draft.md](botched-architecture-draft.md)** - Archived architecture draft (for reference only).
|
||||
@@ -1,97 +0,0 @@
|
||||
# BMad Hacker Daily Digest API Reference
|
||||
|
||||
This document describes the external APIs consumed by the BMad Hacker Daily Digest application.
|
||||
|
||||
## External APIs Consumed
|
||||
|
||||
### Algolia Hacker News (HN) Search API
|
||||
|
||||
- **Purpose:** Used to fetch the top Hacker News stories and the comments associated with each story.
|
||||
- **Base URL:** `http://hn.algolia.com/api/v1`
|
||||
- **Authentication:** None required for public search endpoints.
|
||||
- **Key Endpoints Used:**
|
||||
|
||||
- **`GET /search` (for Top Stories)**
|
||||
|
||||
- Description: Retrieves stories based on search parameters. Used here to get top stories from the front page.
|
||||
- Request Parameters:
|
||||
- `tags=front_page`: Required to filter for front-page stories.
|
||||
- `hitsPerPage=10`: Specifies the number of stories to retrieve (adjust as needed, default is typically 20).
|
||||
- Example Request (conceptual, using native `fetch`):
|
||||
```typescript
|
||||
// Using the Node.js native fetch API
|
||||
const url = "http://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=10";
|
||||
const response = await fetch(url);
|
||||
const data = await response.json();
|
||||
```
|
||||
- Success Response Schema (Code: `200 OK`): See "Algolia HN API - Story Response Subset" in `docs/data-models.md`. Primarily interested in the `hits` array containing story objects.
|
||||
- Error Response Schema(s): Standard HTTP errors (e.g., 4xx, 5xx). May return JSON with an error message.
|
||||
|
||||
- **`GET /search` (for Comments)**
|
||||
- Description: Retrieves comments associated with a specific story ID.
|
||||
- Request Parameters:
|
||||
- `tags=comment,story_{storyId}`: Required to filter for comments belonging to the specified `storyId`. Replace `{storyId}` with the actual ID (e.g., `story_12345`).
|
||||
- `hitsPerPage={maxComments}`: Specifies the maximum number of comments to retrieve (value from `.env` `MAX_COMMENTS_PER_STORY`).
|
||||
- Example Request (conceptual, using native `fetch`):
|
||||
```typescript
|
||||
// Using the Node.js native fetch API
|
||||
const storyId = "..."; // HN Story ID
|
||||
const maxComments = 50; // From config
|
||||
const url = `http://hn.algolia.com/api/v1/search?tags=comment,story_${storyId}&hitsPerPage=${maxComments}`;
|
||||
const response = await fetch(url);
|
||||
const data = await response.json();
|
||||
```
|
||||
- Success Response Schema (Code: `200 OK`): See "Algolia HN API - Comment Response Subset" in `docs/data-models.md`. Primarily interested in the `hits` array containing comment objects.
|
||||
- Error Response Schema(s): Standard HTTP errors.
|
||||
|
||||
- **Rate Limits:** Subject to Algolia's public API rate limits (typically generous for HN search but not explicitly defined/guaranteed). Implementations should handle potential 429 errors gracefully if encountered.
|
||||
- **Link to Official Docs:** [https://hn.algolia.com/api](https://hn.algolia.com/api)
|
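Both `/search` calls above share the same URL shape and the same need to tolerate 429 responses. The sketch below shows one way to factor that out. It is an illustration under assumptions: `buildSearchUrl`, `getJsonWithRetry`, and the `HttpGet` type are hypothetical names, and the retry count and delay are arbitrary, since the source only says 429s should be handled gracefully.

```typescript
// Base URL as documented above.
const HN_BASE_URL = "http://hn.algolia.com/api/v1";

// Builds a /search URL from the documented `tags` and `hitsPerPage` parameters.
function buildSearchUrl(tags: string[], hitsPerPage: number): string {
  const tagParam = tags.map((tag) => encodeURIComponent(tag)).join(",");
  return `${HN_BASE_URL}/search?tags=${tagParam}&hitsPerPage=${hitsPerPage}`;
}

// Minimal response shape, so the sketch does not depend on a particular fetch typing.
interface JsonResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type HttpGet = (url: string) => Promise<JsonResponse>;

// Retries on 429 with a linear backoff (assumed policy); throws on other non-2xx codes.
async function getJsonWithRetry(
  url: string,
  httpGet: HttpGet,
  maxRetries = 2,
  delayMs = 1000,
): Promise<unknown> {
  for (let attempt = 0; ; attempt++) {
    const response = await httpGet(url);
    if (response.status === 429 && attempt < maxRetries) {
      const waitMs = delayMs * (attempt + 1);
      await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
      continue;
    }
    if (!response.ok) {
      throw new Error(`HN request failed with status ${response.status}`);
    }
    return response.json();
  }
}
```

With native `fetch`, a call might look like `getJsonWithRetry(buildSearchUrl(["comment", "story_12345"], 50), (url) => fetch(url))`. Passing the HTTP function in as a parameter keeps the retry logic testable without network access.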
||||
|
||||
### Ollama API (Local Instance)
|
||||
|
||||
- **Purpose:** Used to generate text summaries for scraped article content and HN comment discussions using a locally running LLM.
|
||||
- **Base URL:** Configurable via the `OLLAMA_ENDPOINT_URL` environment variable (e.g., `http://localhost:11434`).
|
||||
- **Authentication:** None typically required for default local installations.
|
||||
- **Key Endpoints Used:**
|
||||
|
||||
- **`POST /api/generate`**
|
||||
- Description: Generates text based on a model and prompt. Used here for summarization.
|
||||
- Request Body Schema: See `OllamaGenerateRequest` in `docs/data-models.md`. Requires `model` (from `.env` `OLLAMA_MODEL`), `prompt`, and `stream: false`.
|
||||
- Example Request (conceptual, using native `fetch`):
|
||||
```typescript
|
||||
// Using the Node.js native fetch API
|
||||
const ollamaUrl =
|
||||
process.env.OLLAMA_ENDPOINT_URL || "http://localhost:11434";
|
||||
const requestBody: OllamaGenerateRequest = {
|
||||
model: process.env.OLLAMA_MODEL || "llama3",
|
||||
prompt: "Summarize this text: ...",
|
||||
stream: false,
|
||||
};
|
||||
const response = await fetch(`${ollamaUrl}/api/generate`, {
|
||||
method: "POST",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify(requestBody),
|
||||
});
|
||||
const data: OllamaGenerateResponse | { error: string } =
|
||||
await response.json();
|
||||
```
|
||||
- Success Response Schema (Code: `200 OK`): See `OllamaGenerateResponse` in `docs/data-models.md`. Key field is `response` containing the generated text.
|
||||
- Error Response Schema(s): May return non-200 status codes or a `200 OK` with a JSON body like `{ "error": "error message..." }` (e.g., if the model is unavailable).
|
||||
|
||||
- **Rate Limits:** N/A for a typical local instance. Performance depends on local hardware.
|
||||
- **Link to Official Docs:** [https://github.com/ollama/ollama/blob/main/docs/api.md](https://github.com/ollama/ollama/blob/main/docs/api.md)
|
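Because an Ollama failure can arrive either as a non-200 status or as a `200 OK` carrying an `error` field, a caller has to inspect both. A minimal sketch of that check follows. `extractOllamaText` is a hypothetical name, and the local interfaces cover only the `response` and `error` fields mentioned above, not the full types in `docs/data-models.md`.

```typescript
// Minimal local shapes (assumption); the full types are defined in docs/data-models.md.
interface GenerateSuccess {
  response: string;
}
interface GenerateError {
  error: string;
}

// Returns the generated text, or null when the call should be treated as a failure.
function extractOllamaText(status: number, body: unknown): string | null {
  if (status !== 200 || typeof body !== "object" || body === null) {
    return null;
  }
  const candidate = body as Partial<GenerateSuccess & GenerateError>;
  if (typeof candidate.error === "string") {
    return null; // 200 OK with an error payload, e.g. model unavailable
  }
  return typeof candidate.response === "string" ? candidate.response : null;
}
```

Returning `null` rather than throwing lets the caller record a missing summary and carry on with the next item.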
||||
|
||||
## Internal APIs Provided
|
||||
|
||||
- **N/A:** The application is a self-contained CLI tool and does not expose any APIs for other services to consume.
|
||||
|
||||
## Cloud Service SDK Usage
|
||||
|
||||
- **N/A:** The application runs locally and uses the native Node.js `fetch` API for HTTP requests, not cloud provider SDKs.
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------------- | ----------- |
|
||||
| Initial draft | 2025-05-04 | 0.1 | Draft based on PRD/Epics/Models | 3-Architect |
|
||||
@@ -1,97 +0,0 @@
|
||||
# BMad Hacker Daily Digest API Reference
|
||||
|
||||
This document describes the external APIs consumed by the BMad Hacker Daily Digest application.
|
||||
|
||||
## External APIs Consumed
|
||||
|
||||
### Algolia Hacker News (HN) Search API
|
||||
|
||||
- **Purpose:** Used to fetch the top Hacker News stories and the comments associated with each story.
|
||||
- **Base URL:** `http://hn.algolia.com/api/v1`
|
||||
- **Authentication:** None required for public search endpoints.
|
||||
- **Key Endpoints Used:**
|
||||
|
||||
- **`GET /search` (for Top Stories)**
|
||||
|
||||
- Description: Retrieves stories based on search parameters. Used here to get top stories from the front page.
|
||||
- Request Parameters:
|
||||
- `tags=front_page`: Required to filter for front-page stories.
|
||||
- `hitsPerPage=10`: Specifies the number of stories to retrieve (adjust as needed, default is typically 20).
|
||||
- Example Request (conceptual, using native `fetch`):
|
||||
```typescript
|
||||
// Using the Node.js native fetch API
|
||||
const url = "http://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=10";
|
||||
const response = await fetch(url);
|
||||
const data = await response.json();
|
||||
```
|
||||
- Success Response Schema (Code: `200 OK`): See "Algolia HN API - Story Response Subset" in `docs/data-models.md`. Primarily interested in the `hits` array containing story objects.
|
||||
- Error Response Schema(s): Standard HTTP errors (e.g., 4xx, 5xx). May return JSON with an error message.
|
||||
|
||||
- **`GET /search` (for Comments)**
|
||||
- Description: Retrieves comments associated with a specific story ID.
|
||||
- Request Parameters:
|
||||
- `tags=comment,story_{storyId}`: Required to filter for comments belonging to the specified `storyId`. Replace `{storyId}` with the actual ID (e.g., `story_12345`).
|
||||
- `hitsPerPage={maxComments}`: Specifies the maximum number of comments to retrieve (value from `.env` `MAX_COMMENTS_PER_STORY`).
|
||||
- Example Request (conceptual, using native `fetch`):
|
||||
```typescript
|
||||
// Using the Node.js native fetch API
|
||||
const storyId = "..."; // HN Story ID
|
||||
const maxComments = 50; // From config
|
||||
const url = `http://hn.algolia.com/api/v1/search?tags=comment,story_${storyId}&hitsPerPage=${maxComments}`;
|
||||
const response = await fetch(url);
|
||||
const data = await response.json();
|
||||
```
|
||||
- Success Response Schema (Code: `200 OK`): See "Algolia HN API - Comment Response Subset" in `docs/data-models.md`. Primarily interested in the `hits` array containing comment objects.
|
||||
- Error Response Schema(s): Standard HTTP errors.
|
||||
|
||||
- **Rate Limits:** Subject to Algolia's public API rate limits (typically generous for HN search but not explicitly defined/guaranteed). Implementations should handle potential 429 errors gracefully if encountered.
|
||||
- **Link to Official Docs:** [https://hn.algolia.com/api](https://hn.algolia.com/api)
|
||||
|
||||
### Ollama API (Local Instance)
|
||||
|
||||
- **Purpose:** Used to generate text summaries for scraped article content and HN comment discussions using a locally running LLM.
|
||||
- **Base URL:** Configurable via the `OLLAMA_ENDPOINT_URL` environment variable (e.g., `http://localhost:11434`).
|
||||
- **Authentication:** None typically required for default local installations.
|
||||
- **Key Endpoints Used:**
|
||||
|
||||
- **`POST /api/generate`**
|
||||
- Description: Generates text based on a model and prompt. Used here for summarization.
|
||||
- Request Body Schema: See `OllamaGenerateRequest` in `docs/data-models.md`. Requires `model` (from `.env` `OLLAMA_MODEL`), `prompt`, and `stream: false`.
|
||||
- Example Request (conceptual, using native `fetch`):
|
||||
```typescript
|
||||
// Using the Node.js native fetch API
|
||||
const ollamaUrl =
|
||||
process.env.OLLAMA_ENDPOINT_URL || "http://localhost:11434";
|
||||
const requestBody: OllamaGenerateRequest = {
|
||||
model: process.env.OLLAMA_MODEL || "llama3",
|
||||
prompt: "Summarize this text: ...",
|
||||
stream: false,
|
||||
};
|
||||
const response = await fetch(`${ollamaUrl}/api/generate`, {
|
||||
method: "POST",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify(requestBody),
|
||||
});
|
||||
const data: OllamaGenerateResponse | { error: string } =
|
||||
await response.json();
|
||||
```
|
||||
- Success Response Schema (Code: `200 OK`): See `OllamaGenerateResponse` in `docs/data-models.md`. Key field is `response` containing the generated text.
|
||||
- Error Response Schema(s): May return non-200 status codes or a `200 OK` with a JSON body like `{ "error": "error message..." }` (e.g., if the model is unavailable).
|
||||
|
||||
- **Rate Limits:** N/A for a typical local instance. Performance depends on local hardware.
|
||||
- **Link to Official Docs:** [https://github.com/ollama/ollama/blob/main/docs/api.md](https://github.com/ollama/ollama/blob/main/docs/api.md)
|
||||
|
||||
## Internal APIs Provided
|
||||
|
||||
- **N/A:** The application is a self-contained CLI tool and does not expose any APIs for other services to consume.
|
||||
|
||||
## Cloud Service SDK Usage
|
||||
|
||||
- **N/A:** The application runs locally and uses the native Node.js `fetch` API for HTTP requests, not cloud provider SDKs.
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------------- | ----------- |
|
||||
| Initial draft | 2025-05-04 | 0.1 | Draft based on PRD/Epics/Models | 3-Architect |
|
||||
@@ -1,254 +0,0 @@
|
||||
# BMad Hacker Daily Digest Architecture Document
|
||||
|
||||
## Technical Summary
|
||||
|
||||
The BMad Hacker Daily Digest is a command-line interface (CLI) tool designed to provide users with concise summaries of top Hacker News (HN) stories and their associated comment discussions. Built with TypeScript and Node.js (v22), it operates entirely on the user's local machine. The core functionality involves a sequential pipeline: fetching story and comment data from the Algolia HN Search API, attempting to scrape linked article content, generating summaries using a local Ollama LLM instance, persisting intermediate data to the local filesystem, and finally assembling and emailing an HTML digest using Nodemailer. The architecture emphasizes modularity and testability, including mandatory standalone scripts for testing each pipeline stage. The project starts from the `bmad-boilerplate` template.
|
||||
|
||||
## High-Level Overview
|
||||
|
||||
The application follows a simple, sequential pipeline architecture executed via a manual CLI command (`npm run dev` or `npm start`). There is no persistent database; the local filesystem is used to store intermediate data artifacts (fetched data, scraped text, summaries) between steps within a date-stamped directory. All external HTTP communication (Algolia API, article scraping, Ollama API) uses the native Node.js `fetch` API.
|
||||
|
||||
```mermaid
|
||||
graph LR
|
||||
subgraph "BMad Hacker Daily Digest (Local CLI)"
|
||||
A[index.ts / CLI Trigger] --> B(core/pipeline.ts);
|
||||
B --> C{Fetch HN Data};
|
||||
B --> D{Scrape Articles};
|
||||
B --> E{Summarize Content};
|
||||
B --> F{Assemble & Email Digest};
|
||||
C --> G["Local FS (_data.json)"];
|
||||
D --> H["Local FS (_article.txt)"];
|
||||
E --> I["Local FS (_summary.json)"];
|
||||
F --> G;
|
||||
F --> H;
|
||||
F --> I;
|
||||
end
|
||||
|
||||
subgraph External Services
|
||||
X[Algolia HN API];
|
||||
Y[Article Websites];
|
||||
Z["Ollama API (Local)"];
|
||||
W[SMTP Service];
|
||||
end
|
||||
|
||||
C --> X;
|
||||
D --> Y;
|
||||
E --> Z;
|
||||
F --> W;
|
||||
|
||||
style G fill:#eee,stroke:#333,stroke-width:1px
|
||||
style H fill:#eee,stroke:#333,stroke-width:1px
|
||||
style I fill:#eee,stroke:#333,stroke-width:1px
|
||||
```
|
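The date-stamped directory and the `_data.json`, `_article.txt`, and `_summary.json` suffixes in the diagram suggest a simple path scheme. The sketch below shows one possible layout. The `YYYY-MM-DD` directory format and the names `dateStamp` and `artifactPath` are assumptions, and it uses the built-in `Date` rather than `date-fns` only to stay self-contained.

```typescript
// The three artifact suffixes shown in the diagram above.
type ArtifactKind = "data.json" | "article.txt" | "summary.json";

// Formats a date as YYYY-MM-DD in UTC; the exact directory format is an assumption.
function dateStamp(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Builds a path such as output/2025-05-04/12345_summary.json.
function artifactPath(
  outputRoot: string,
  date: Date,
  storyId: string,
  kind: ArtifactKind,
): string {
  return `${outputRoot}/${dateStamp(date)}/${storyId}_${kind}`;
}
```

In application code, Node's `path.join` would be the more portable way to combine the segments.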
||||
|
||||
## Component View
|
||||
|
||||
The application code (`src/`) is organized into logical modules based on the defined project structure (`docs/project-structure.md`). Key components include:
|
||||
|
||||
- **`src/index.ts`**: The main entry point, handling CLI invocation and initiating the pipeline.
|
||||
- **`src/core/pipeline.ts`**: Orchestrates the sequential execution of the main pipeline stages (fetch, scrape, summarize, email).
|
||||
- **`src/clients/`**: Modules responsible for interacting with external APIs.
|
||||
- `algoliaHNClient.ts`: Communicates with the Algolia HN Search API.
|
||||
- `ollamaClient.ts`: Communicates with the local Ollama API.
|
||||
- **`src/scraper/articleScraper.ts`**: Handles fetching and extracting text content from article URLs.
|
||||
- **`src/email/`**: Manages digest assembly, HTML rendering, and email dispatch via Nodemailer.
|
||||
- `contentAssembler.ts`: Reads persisted data.
|
||||
- `templates.ts`: Renders HTML.
|
||||
- `emailSender.ts`: Sends the email.
|
||||
- **`src/stages/`**: Contains standalone scripts (`fetch_hn_data.ts`, `scrape_articles.ts`, etc.) for testing individual pipeline stages independently using local data where applicable.
|
||||
- **`src/utils/`**: Shared utilities for configuration loading (`config.ts`), logging (`logger.ts`), and date handling (`dateUtils.ts`).
|
||||
- **`src/types/`**: Shared TypeScript interfaces and types.
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
subgraph AppComponents ["Application Components (src/)"]
|
||||
Idx(index.ts) --> Pipe(core/pipeline.ts);
|
||||
Pipe --> HNClient(clients/algoliaHNClient.ts);
|
||||
Pipe --> Scraper(scraper/articleScraper.ts);
|
||||
Pipe --> OllamaClient(clients/ollamaClient.ts);
|
||||
Pipe --> Assembler(email/contentAssembler.ts);
|
||||
Pipe --> Renderer(email/templates.ts);
|
||||
Pipe --> Sender(email/emailSender.ts);
|
||||
|
||||
Pipe --> Utils(utils/*);
|
||||
Pipe --> Types(types/*);
|
||||
HNClient --> Types;
|
||||
OllamaClient --> Types;
|
||||
Assembler --> Types;
|
||||
Renderer --> Types;
|
||||
|
||||
subgraph StageRunnersSubgraph ["Stage Runners (src/stages/)"]
|
||||
SFetch(fetch_hn_data.ts) --> HNClient;
|
||||
SFetch --> Utils;
|
||||
SScrape(scrape_articles.ts) --> Scraper;
|
||||
SScrape --> Utils;
|
||||
SSummarize(summarize_content.ts) --> OllamaClient;
|
||||
SSummarize --> Utils;
|
||||
SEmail(send_digest.ts) --> Assembler;
|
||||
SEmail --> Renderer;
|
||||
SEmail --> Sender;
|
||||
SEmail --> Utils;
|
||||
end
|
||||
end
|
||||
|
||||
subgraph Externals ["Filesystem & External"]
|
||||
FS["Local Filesystem (output/)"]
|
||||
Algolia((Algolia HN API))
|
||||
Websites((Article Websites))
|
||||
Ollama["Ollama API (Local)"]
|
||||
SMTP((SMTP Service))
|
||||
end
|
||||
|
||||
HNClient --> Algolia;
|
||||
Scraper --> Websites;
|
||||
OllamaClient --> Ollama;
|
||||
Sender --> SMTP;
|
||||
|
||||
Pipe --> FS;
|
||||
Assembler --> FS;
|
||||
|
||||
SFetch --> FS;
|
||||
SScrape --> FS;
|
||||
SSummarize --> FS;
|
||||
SEmail --> FS;
|
||||
|
||||
%% Apply style to the subgraph using its ID after the block
|
||||
style StageRunnersSubgraph fill:#f9f,stroke:#333,stroke-width:1px
|
||||
```
|
||||
|
||||
## Key Architectural Decisions & Patterns
|
||||
|
||||
- **Architecture Style:** Simple Sequential Pipeline executed via CLI.
|
||||
- **Execution Environment:** Local machine only; no cloud deployment, no database for MVP.
|
||||
- **Data Handling:** Intermediate data persisted to local filesystem in a date-stamped directory.
|
||||
- **HTTP Client:** Mandatory use of native Node.js v22 `fetch` API for all external HTTP requests.
|
||||
- **Modularity:** Code organized into distinct modules for clients, scraping, email, core logic, utilities, and types to promote separation of concerns and testability.
|
||||
- **Stage Testing:** Mandatory standalone scripts (`src/stages/*`) allow independent testing of each pipeline phase.
|
||||
- **Configuration:** Environment variables loaded natively from `.env` file; no `dotenv` package required.
|
||||
- **Error Handling:** Graceful handling of scraping failures (log and continue); basic logging for other API/network errors.
|
||||
- **Logging:** Basic console logging via a simple wrapper (`src/utils/logger.ts`) for MVP; structured file logging is a post-MVP consideration.
|
||||
- **Key Libraries:** `@extractus/article-extractor`, `date-fns`, `nodemailer`, `yargs`. (See `docs/tech-stack.md`)
|
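To illustrate the configuration decision above, here is a hedged sketch of a typed config reader. The variable names `OLLAMA_ENDPOINT_URL`, `OLLAMA_MODEL`, and `MAX_COMMENTS_PER_STORY` come from the API reference. The `loadConfig` name, the `AppConfig` shape, and the default values are assumptions. One native way to populate the environment in recent Node.js releases is the `--env-file` flag; whether this project uses it is not stated in the source.

```typescript
// Hypothetical config shape covering only the variables named in these docs.
interface AppConfig {
  ollamaEndpointUrl: string;
  ollamaModel: string;
  maxCommentsPerStory: number;
}

type Env = Record<string, string | undefined>;

// Reads and validates settings from an environment-like object.
function loadConfig(env: Env): AppConfig {
  const rawMax = env.MAX_COMMENTS_PER_STORY ?? "50"; // default is an assumption
  const maxCommentsPerStory = Number(rawMax);
  const isPositiveInteger =
    isFinite(maxCommentsPerStory) &&
    Math.floor(maxCommentsPerStory) === maxCommentsPerStory &&
    maxCommentsPerStory > 0;
  if (!isPositiveInteger) {
    throw new Error(`MAX_COMMENTS_PER_STORY must be a positive integer, got "${rawMax}"`);
  }
  return {
    ollamaEndpointUrl: env.OLLAMA_ENDPOINT_URL || "http://localhost:11434",
    ollamaModel: env.OLLAMA_MODEL || "llama3",
    maxCommentsPerStory,
  };
}
```

A caller would typically pass `process.env`, for example after starting the tool with `node --env-file=.env`. Taking the environment as a parameter keeps the function easy to test.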
||||
|
||||
## Core Workflow / Sequence Diagram (Main Pipeline)
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant CLI_User as CLI User
|
||||
participant Idx as src/index.ts
|
||||
participant Pipe as core/pipeline.ts
|
||||
participant Cfg as utils/config.ts
|
||||
participant Log as utils/logger.ts
|
||||
participant HN as clients/algoliaHNClient.ts
|
||||
participant FS as Local FS [output/]
|
||||
participant Scr as scraper/articleScraper.ts
|
||||
participant Oll as clients/ollamaClient.ts
|
||||
participant Asm as email/contentAssembler.ts
|
||||
participant Tpl as email/templates.ts
|
||||
participant Snd as email/emailSender.ts
|
||||
participant Alg as Algolia HN API
|
||||
participant Web as Article Website
|
||||
participant Olm as Ollama API [Local]
|
||||
participant SMTP as SMTP Service
|
||||
|
||||
Note right of CLI_User: Triggered via 'npm run dev'/'start'
|
||||
|
||||
CLI_User ->> Idx: Execute script
|
||||
Idx ->> Cfg: Load .env config
|
||||
Idx ->> Log: Initialize logger
|
||||
Idx ->> Pipe: runPipeline()
|
||||
Pipe ->> Log: Log start
|
||||
Pipe ->> HN: fetchTopStories()
|
||||
HN ->> Alg: Request stories
|
||||
Alg -->> HN: Story data
|
||||
HN -->> Pipe: stories[]
|
||||
loop For each story
|
||||
Pipe ->> HN: fetchCommentsForStory(storyId, max)
|
||||
HN ->> Alg: Request comments
|
||||
Alg -->> HN: Comment data
|
||||
HN -->> Pipe: comments[]
|
||||
Pipe ->> FS: Write {storyId}_data.json
|
||||
end
|
||||
Pipe ->> Log: Log HN fetch complete
|
||||
|
||||
loop For each story with URL
|
||||
Pipe ->> Scr: scrapeArticle(story.url)
|
||||
Scr ->> Web: Request article HTML [via fetch]
|
||||
alt Scraping Successful
|
||||
Web -->> Scr: HTML content
|
||||
Scr -->> Pipe: articleText: string
|
||||
Pipe ->> FS: Write {storyId}_article.txt
|
||||
else Scraping Failed / Skipped
|
||||
Web -->> Scr: Error / Non-HTML / Timeout
|
||||
Scr -->> Pipe: articleText: null
|
||||
Pipe ->> Log: Log scraping failure/skip
|
||||
end
|
||||
end
|
||||
Pipe ->> Log: Log scraping complete
|
||||
|
||||
loop For each story
|
||||
alt Article content exists
|
||||
Pipe ->> Oll: generateSummary(prompt, articleText)
|
||||
Oll ->> Olm: POST /api/generate [article]
|
||||
Olm -->> Oll: Article Summary / Error
|
||||
Oll -->> Pipe: articleSummary: string | null
|
||||
else No article content
|
||||
Pipe -->> Pipe: Set articleSummary = null
|
||||
end
|
||||
alt Comments exist
|
||||
Pipe ->> Pipe: Format comments to text block
|
||||
Pipe ->> Oll: generateSummary(prompt, commentsText)
|
||||
Oll ->> Olm: POST /api/generate [comments]
|
||||
Olm -->> Oll: Discussion Summary / Error
|
||||
Oll -->> Pipe: discussionSummary: string | null
|
||||
else No comments
|
||||
Pipe -->> Pipe: Set discussionSummary = null
|
||||
end
|
||||
Pipe ->> FS: Write {storyId}_summary.json
|
||||
end
|
||||
Pipe ->> Log: Log summarization complete
|
||||
|
||||
Pipe ->> Asm: assembleDigestData(dateDirPath)
|
||||
Asm ->> FS: Read _data.json, _summary.json files
|
||||
FS -->> Asm: File contents
|
||||
Asm -->> Pipe: digestData[]
|
||||
alt Digest data assembled
|
||||
Pipe ->> Tpl: renderDigestHtml(digestData, date)
|
||||
Tpl -->> Pipe: htmlContent: string
|
||||
Pipe ->> Snd: sendDigestEmail(subject, htmlContent)
|
||||
Snd ->> Cfg: Load email config
|
||||
Snd ->> SMTP: Send email
|
||||
SMTP -->> Snd: Success/Failure
|
||||
Snd -->> Pipe: success: boolean
|
||||
Pipe ->> Log: Log email result
|
||||
else Assembly failed / No data
|
||||
Pipe ->> Log: Log skipping email
|
||||
end
|
||||
Pipe ->> Log: Log finished
|
||||
```
|
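The scraping loop in the diagram above follows a log-and-continue policy: a failed or skipped scrape never aborts the run. A minimal sketch of that loop follows. The function and type names are hypothetical, and the scraper and logger are passed in as parameters so the sketch does not assume how `articleScraper.ts` or `logger.ts` are actually written.

```typescript
interface Story {
  storyId: string;
  url?: string;
}
type Scrape = (url: string) => Promise<string | null>;
type Log = (message: string) => void;

// Scrapes stories one at a time; failures are logged and skipped, never rethrown.
async function scrapeStories(
  stories: Story[],
  scrape: Scrape,
  log: Log,
): Promise<Map<string, string>> {
  const articles = new Map<string, string>();
  for (const story of stories) {
    if (!story.url) continue; // only stories with a URL are scraped
    try {
      const text = await scrape(story.url);
      if (text === null) {
        log(`scrape skipped for ${story.storyId}`);
        continue;
      }
      articles.set(story.storyId, text);
    } catch (error) {
      log(`scrape failed for ${story.storyId}: ${String(error)}`);
    }
  }
  return articles;
}
```

Stories missing from the returned map correspond to the `articleText: null` branch, which the summarization stage then treats as "no article content".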
||||
|
||||
## Infrastructure and Deployment Overview
|
||||
|
||||
- **Cloud Provider(s):** N/A. Executes locally on the user's machine.
|
||||
- **Core Services Used:** N/A (relies on external Algolia API, local Ollama, target websites, SMTP provider).
|
||||
- **Infrastructure as Code (IaC):** N/A.
|
||||
- **Deployment Strategy:** Manual execution via CLI (`npm run dev` or `npm run start` after `npm run build`). No CI/CD pipeline required for MVP.
|
||||
- **Environments:** Single environment: local development machine.
|
||||
|
||||
## Key Reference Documents
|
||||
|
||||
- `docs/prd.md`
|
||||
- `docs/epic1.md` ... `docs/epic5.md`
|
||||
- `docs/tech-stack.md`
|
||||
- `docs/project-structure.md`
|
||||
- `docs/data-models.md`
|
||||
- `docs/api-reference.md`
|
||||
- `docs/environment-vars.md`
|
||||
- `docs/coding-standards.md`
|
||||
- `docs/testing-strategy.md`
|
||||
- `docs/prompts.md`
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | -------------------------- | ----------- |
|
||||
| Initial draft | 2025-05-04 | 0.1 | Initial draft based on PRD | 3-Architect |
|
||||
@@ -1,254 +0,0 @@
|
||||
# BMad Hacker Daily Digest Architecture Document
|
||||
|
||||
## Technical Summary
|
||||
|
||||
The BMad Hacker Daily Digest is a command-line interface (CLI) tool designed to provide users with concise summaries of top Hacker News (HN) stories and their associated comment discussions. Built with TypeScript and Node.js (v22), it operates entirely on the user's local machine. The core functionality involves a sequential pipeline: fetching story and comment data from the Algolia HN Search API, attempting to scrape linked article content, generating summaries using a local Ollama LLM instance, persisting intermediate data to the local filesystem, and finally assembling and emailing an HTML digest using Nodemailer. The architecture emphasizes modularity and testability, including mandatory standalone scripts for testing each pipeline stage. The project starts from the `bmad-boilerplate` template.
|
||||
|
||||
## High-Level Overview
|
||||
|
||||
The application follows a simple, sequential pipeline architecture executed via a manual CLI command (`npm run dev` or `npm start`). There is no persistent database; the local filesystem is used to store intermediate data artifacts (fetched data, scraped text, summaries) between steps within a date-stamped directory. All external HTTP communication (Algolia API, article scraping, Ollama API) uses the native Node.js `fetch` API.
|
||||
|
||||
```mermaid
|
||||
graph LR
|
||||
subgraph "BMad Hacker Daily Digest (Local CLI)"
|
||||
A[index.ts / CLI Trigger] --> B(core/pipeline.ts);
|
||||
B --> C{Fetch HN Data};
|
||||
B --> D{Scrape Articles};
|
||||
B --> E{Summarize Content};
|
||||
B --> F{Assemble & Email Digest};
|
||||
C --> G["Local FS (_data.json)"];
|
||||
D --> H["Local FS (_article.txt)"];
|
||||
E --> I["Local FS (_summary.json)"];
|
||||
F --> G;
|
||||
F --> H;
|
||||
F --> I;
|
||||
end
|
||||
|
||||
subgraph External Services
|
||||
X[Algolia HN API];
|
||||
Y[Article Websites];
|
||||
Z["Ollama API (Local)"];
|
||||
W[SMTP Service];
|
||||
end
|
||||
|
||||
C --> X;
|
||||
D --> Y;
|
||||
E --> Z;
|
||||
F --> W;
|
||||
|
||||
style G fill:#eee,stroke:#333,stroke-width:1px
|
||||
style H fill:#eee,stroke:#333,stroke-width:1px
|
||||
style I fill:#eee,stroke:#333,stroke-width:1px
|
||||
```
|
||||
|
||||
## Component View
|
||||
|
||||
The application code (`src/`) is organized into logical modules based on the defined project structure (`docs/project-structure.md`). Key components include:
|
||||
|
||||
- **`src/index.ts`**: The main entry point, handling CLI invocation and initiating the pipeline.
|
||||
- **`src/core/pipeline.ts`**: Orchestrates the sequential execution of the main pipeline stages (fetch, scrape, summarize, email).
|
||||
- **`src/clients/`**: Modules responsible for interacting with external APIs.
|
||||
- `algoliaHNClient.ts`: Communicates with the Algolia HN Search API.
|
||||
- `ollamaClient.ts`: Communicates with the local Ollama API.
|
||||
- **`src/scraper/articleScraper.ts`**: Handles fetching and extracting text content from article URLs.
|
||||
- **`src/email/`**: Manages digest assembly, HTML rendering, and email dispatch via Nodemailer.
|
||||
- `contentAssembler.ts`: Reads persisted data.
|
||||
- `templates.ts`: Renders HTML.
|
||||
- `emailSender.ts`: Sends the email.
|
||||
- **`src/stages/`**: Contains standalone scripts (`fetch_hn_data.ts`, `scrape_articles.ts`, etc.) for testing individual pipeline stages independently using local data where applicable.
|
||||
- **`src/utils/`**: Shared utilities for configuration loading (`config.ts`), logging (`logger.ts`), and date handling (`dateUtils.ts`).
|
||||
- **`src/types/`**: Shared TypeScript interfaces and types.
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
subgraph AppComponents ["Application Components (src/)"]
|
||||
Idx(index.ts) --> Pipe(core/pipeline.ts);
|
||||
Pipe --> HNClient(clients/algoliaHNClient.ts);
|
||||
Pipe --> Scraper(scraper/articleScraper.ts);
|
||||
Pipe --> OllamaClient(clients/ollamaClient.ts);
|
||||
Pipe --> Assembler(email/contentAssembler.ts);
|
||||
Pipe --> Renderer(email/templates.ts);
|
||||
Pipe --> Sender(email/emailSender.ts);
|
||||
|
||||
Pipe --> Utils(utils/*);
|
||||
Pipe --> Types(types/*);
|
||||
HNClient --> Types;
|
||||
OllamaClient --> Types;
|
||||
Assembler --> Types;
|
||||
Renderer --> Types;
|
||||
|
||||
subgraph StageRunnersSubgraph ["Stage Runners (src/stages/)"]
|
||||
SFetch(fetch_hn_data.ts) --> HNClient;
|
||||
SFetch --> Utils;
|
||||
SScrape(scrape_articles.ts) --> Scraper;
|
||||
SScrape --> Utils;
|
||||
SSummarize(summarize_content.ts) --> OllamaClient;
|
||||
SSummarize --> Utils;
|
||||
SEmail(send_digest.ts) --> Assembler;
|
||||
SEmail --> Renderer;
|
||||
SEmail --> Sender;
|
||||
SEmail --> Utils;
|
||||
end
|
||||
end
|
||||
|
||||
subgraph Externals ["Filesystem & External"]
|
||||
FS["Local Filesystem (output/)"]
|
||||
Algolia((Algolia HN API))
|
||||
Websites((Article Websites))
|
||||
Ollama["Ollama API (Local)"]
|
||||
SMTP((SMTP Service))
|
||||
end
|
||||
|
||||
HNClient --> Algolia;
|
||||
Scraper --> Websites;
|
||||
OllamaClient --> Ollama;
|
||||
Sender --> SMTP;
|
||||
|
||||
Pipe --> FS;
|
||||
Assembler --> FS;
|
||||
|
||||
SFetch --> FS;
|
||||
SScrape --> FS;
|
||||
SSummarize --> FS;
|
||||
SEmail --> FS;
|
||||
|
||||
%% Apply style to the subgraph using its ID after the block
|
||||
style StageRunnersSubgraph fill:#f9f,stroke:#333,stroke-width:1px
|
||||
```
|
||||
|
||||
## Key Architectural Decisions & Patterns
|
||||
|
||||
- **Architecture Style:** Simple Sequential Pipeline executed via CLI.
|
||||
- **Execution Environment:** Local machine only; no cloud deployment, no database for MVP.
|
||||
- **Data Handling:** Intermediate data persisted to local filesystem in a date-stamped directory.
|
||||
- **HTTP Client:** Mandatory use of native Node.js v22 `fetch` API for all external HTTP requests.
|
||||
- **Modularity:** Code organized into distinct modules for clients, scraping, email, core logic, utilities, and types to promote separation of concerns and testability.
|
||||
- **Stage Testing:** Mandatory standalone scripts (`src/stages/*`) allow independent testing of each pipeline phase.
|
||||
- **Configuration:** Environment variables loaded natively from `.env` file; no `dotenv` package required.
|
||||
- **Error Handling:** Graceful handling of scraping failures (log and continue); basic logging for other API/network errors.
|
||||
- **Logging:** Basic console logging via a simple wrapper (`src/utils/logger.ts`) for MVP; structured file logging is a post-MVP consideration.
|
||||
- **Key Libraries:** `@extractus/article-extractor`, `date-fns`, `nodemailer`, `yargs`. (See `docs/tech-stack.md`)
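
The HTTP client and error-handling decisions above can be sketched roughly as follows. This is an illustration, not the project's actual client code: `FetchLike`, `describeFailure`, and `fetchTextOrNull` are hypothetical names, and the fetch function is passed in as a parameter (in the application this would presumably be Node's native `fetch`) so the logic can be exercised without network access.

```typescript
// Hypothetical sketch: all names and shapes here are illustrative assumptions.

// Minimal structural types covering only the parts of the fetch API used below.
interface ResponseLike {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}
type FetchLike = (url: string) => Promise<ResponseLike>;

// Turns a non-2xx response into a log-friendly message; returns null when the response is OK.
function describeFailure(url: string, res: { ok: boolean; status: number }): string | null {
  return res.ok ? null : `Request to ${url} failed with status ${res.status}`;
}

// Log-and-continue: any failure yields null instead of throwing,
// so the caller can move on to the next story.
async function fetchTextOrNull(
  url: string,
  fetchImpl: FetchLike,
  warn: (message: string) => void,
): Promise<string | null> {
  try {
    const res = await fetchImpl(url);
    const failure = describeFailure(url, res);
    if (failure !== null) {
      warn(failure);
      return null;
    }
    return await res.text();
  } catch (err) {
    // Network-level errors (DNS failure, refused connection, ...) end up here.
    warn(`Request to ${url} threw: ${err instanceof Error ? err.message : String(err)}`);
    return null;
  }
}
```

Injecting the fetch function is a design choice made for this sketch only; it keeps the example testable with a fake, while a caller in the real pipeline could pass the global `fetch` and the logger's warning method.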
|
||||
|
||||
## Core Workflow / Sequence Diagram (Main Pipeline)
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant CLI_User as CLI User
|
||||
participant Idx as src/index.ts
|
||||
participant Pipe as core/pipeline.ts
|
||||
participant Cfg as utils/config.ts
|
||||
participant Log as utils/logger.ts
|
||||
participant HN as clients/algoliaHNClient.ts
|
||||
participant FS as Local FS [output/]
|
||||
participant Scr as scraper/articleScraper.ts
|
||||
participant Oll as clients/ollamaClient.ts
|
||||
participant Asm as email/contentAssembler.ts
|
||||
participant Tpl as email/templates.ts
|
||||
participant Snd as email/emailSender.ts
|
||||
participant Alg as Algolia HN API
|
||||
participant Web as Article Website
|
||||
participant Olm as Ollama API [Local]
|
||||
participant SMTP as SMTP Service
|
||||
|
||||
Note right of CLI_User: Triggered via 'npm run dev'/'start'
|
||||
|
||||
CLI_User ->> Idx: Execute script
|
||||
Idx ->> Cfg: Load .env config
|
||||
Idx ->> Log: Initialize logger
|
||||
Idx ->> Pipe: runPipeline()
|
||||
Pipe ->> Log: Log start
|
||||
Pipe ->> HN: fetchTopStories()
|
||||
HN ->> Alg: Request stories
|
||||
Alg -->> HN: Story data
|
||||
HN -->> Pipe: stories[]
|
||||
loop For each story
|
||||
Pipe ->> HN: fetchCommentsForStory(storyId, max)
|
||||
HN ->> Alg: Request comments
|
||||
Alg -->> HN: Comment data
|
||||
HN -->> Pipe: comments[]
|
||||
Pipe ->> FS: Write {storyId}_data.json
|
||||
end
|
||||
Pipe ->> Log: Log HN fetch complete
|
||||
|
||||
loop For each story with URL
|
||||
Pipe ->> Scr: scrapeArticle(story.url)
|
||||
Scr ->> Web: Request article HTML [via fetch]
|
||||
alt Scraping Successful
|
||||
Web -->> Scr: HTML content
|
||||
Scr -->> Pipe: articleText: string
|
||||
Pipe ->> FS: Write {storyId}_article.txt
|
||||
else Scraping Failed / Skipped
|
||||
Web -->> Scr: Error / Non-HTML / Timeout
|
||||
Scr -->> Pipe: articleText: null
|
||||
Pipe ->> Log: Log scraping failure/skip
|
||||
end
|
||||
end
|
||||
Pipe ->> Log: Log scraping complete
|
||||
|
||||
loop For each story
|
||||
alt Article content exists
|
||||
Pipe ->> Oll: generateSummary(prompt, articleText)
|
||||
Oll ->> Olm: POST /api/generate [article]
|
||||
Olm -->> Oll: Article Summary / Error
|
||||
Oll -->> Pipe: articleSummary: string | null
|
||||
else No article content
|
||||
Pipe -->> Pipe: Set articleSummary = null
|
||||
end
|
||||
alt Comments exist
|
||||
Pipe ->> Pipe: Format comments to text block
|
||||
Pipe ->> Oll: generateSummary(prompt, commentsText)
|
||||
Oll ->> Olm: POST /api/generate [comments]
|
||||
Olm -->> Oll: Discussion Summary / Error
|
||||
Oll -->> Pipe: discussionSummary: string | null
|
||||
else No comments
|
||||
Pipe -->> Pipe: Set discussionSummary = null
|
||||
end
|
||||
Pipe ->> FS: Write {storyId}_summary.json
|
||||
end
|
||||
Pipe ->> Log: Log summarization complete
|
||||
|
||||
Pipe ->> Asm: assembleDigestData(dateDirPath)
|
||||
Asm ->> FS: Read _data.json, _summary.json files
|
||||
FS -->> Asm: File contents
|
||||
Asm -->> Pipe: digestData[]
|
||||
alt Digest data assembled
|
||||
Pipe ->> Tpl: renderDigestHtml(digestData, date)
|
||||
Tpl -->> Pipe: htmlContent: string
|
||||
Pipe ->> Snd: sendDigestEmail(subject, htmlContent)
|
||||
Snd ->> Cfg: Load email config
|
||||
Snd ->> SMTP: Send email
|
||||
SMTP -->> Snd: Success/Failure
|
||||
Snd -->> Pipe: success: boolean
|
||||
Pipe ->> Log: Log email result
|
||||
else Assembly failed / No data
|
||||
Pipe ->> Log: Log skipping email
|
||||
end
|
||||
Pipe ->> Log: Log finished
|
||||
```
|
||||
|
||||
## Infrastructure and Deployment Overview
|
||||
|
||||
- **Cloud Provider(s):** N/A. Executes locally on the user's machine.
|
||||
- **Core Services Used:** N/A (relies on external Algolia API, local Ollama, target websites, SMTP provider).
|
||||
- **Infrastructure as Code (IaC):** N/A.
|
||||
- **Deployment Strategy:** Manual execution via CLI (`npm run dev` or `npm run start` after `npm run build`). No CI/CD pipeline required for MVP.
|
||||
- **Environments:** Single environment: local development machine.
|
||||
|
||||
## Key Reference Documents
|
||||
|
||||
- `docs/prd.md`
|
||||
- `docs/epic1.md` ... `docs/epic5.md`
|
||||
- `docs/tech-stack.md`
|
||||
- `docs/project-structure.md`
|
||||
- `docs/data-models.md`
|
||||
- `docs/api-reference.md`
|
||||
- `docs/environment-vars.md`
|
||||
- `docs/coding-standards.md`
|
||||
- `docs/testing-strategy.md`
|
||||
- `docs/prompts.md`
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | -------------------------- | ----------- |
|
||||
| Initial draft | 2025-05-04 | 0.1 | Initial draft based on PRD | 3-Architect |
|
||||
@@ -1,226 +0,0 @@
|
||||
# BMad Hacker Daily Digest Architecture Document
|
||||
|
||||
## Technical Summary
|
||||
|
||||
This document outlines the technical architecture for the BMad Hacker Daily Digest, a command-line tool built with TypeScript and Node.js v22. It adheres to the structure provided by the "bmad-boilerplate". The system fetches the top 10 Hacker News stories and their comments daily via the Algolia HN API, attempts to scrape linked articles, generates summaries for both articles (if scraped) and discussions using a local Ollama instance, persists intermediate data locally, and sends an HTML digest email via Nodemailer upon manual CLI execution. The architecture emphasizes modularity through distinct clients and processing stages, facilitating independent stage testing as required by the PRD. Execution is strictly local for the MVP.
|
||||
|
||||
## High-Level Overview
|
||||
|
||||
The application follows a sequential pipeline architecture triggered by a single CLI command (`npm run dev` or `npm start`). Data flows through distinct stages: HN Data Acquisition, Article Scraping, LLM Summarization, and Digest Assembly/Email Dispatch. Each stage persists its output to a date-stamped local directory, allowing subsequent stages to operate on this data and enabling stage-specific testing utilities.
|
||||
|
||||
**(Diagram Suggestion for Canvas: Create a flowchart showing the stages below)**
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A["CLI Trigger (npm run dev/start)"] --> B("Initialize: Load Config, Setup Logger, Create Output Dir");
|
||||
B --> C{"Fetch HN Data (Top 10 Stories + Comments)"};
|
||||
C -- Story/Comment Data --> D("Persist HN Data: ./output/YYYY-MM-DD/{storyId}_data.json");
|
||||
D --> E{"Attempt Article Scraping (per story)"};
|
||||
E -- "Scraped Text (if successful)" --> F("Persist Article Text: ./output/YYYY-MM-DD/{storyId}_article.txt");
|
||||
F --> G{"Generate Summaries (Article + Discussion via Ollama)"};
|
||||
G -- Summaries --> H("Persist Summaries: ./output/YYYY-MM-DD/{storyId}_summary.json");
|
||||
H --> I{"Assemble Digest (Read persisted data)"};
|
||||
I -- HTML Content --> J{Send Email via Nodemailer};
|
||||
J --> K("Log Final Status & Exit");
|
||||
|
||||
subgraph Stage Testing Utilities
|
||||
direction LR
|
||||
T1[npm run stage:fetch] --> D;
|
||||
T2[npm run stage:scrape] --> F;
|
||||
T3[npm run stage:summarize] --> H;
|
||||
T4[npm run stage:email] --> J;
|
||||
end
|
||||
|
||||
%% If no comments
C --> |Error/Skip| G;
|
||||
%% If no URL or scrape fails
E --> |Skip/Fail| G;
|
||||
%% Persist null summaries
G --> |Summarization Fail| H;
|
||||
%% Skip email if assembly fails
I --> |Assembly Fail| K;
|
||||
```
|
||||
|
||||
## Component View
|
||||
|
||||
The application logic resides primarily within the `src/` directory, organized into modules responsible for specific pipeline stages or cross-cutting concerns.
|
||||
|
||||
**(Diagram Suggestion for Canvas: Create a component diagram showing modules and dependencies)**
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
subgraph src ["Source Code (src/)"]
|
||||
direction LR
|
||||
Entry["index.ts (Main Orchestrator)"]
|
||||
|
||||
subgraph Config ["Configuration"]
|
||||
ConfMod["config.ts"]
|
||||
EnvFile[".env File"]
|
||||
end
|
||||
|
||||
subgraph Utils ["Utilities"]
|
||||
Logger["logger.ts"]
|
||||
end
|
||||
|
||||
subgraph Clients ["External Service Clients"]
|
||||
Algolia["clients/algoliaHNClient.ts"]
|
||||
Ollama["clients/ollamaClient.ts"]
|
||||
end
|
||||
|
||||
Scraper["scraper/articleScraper.ts"]
|
||||
|
||||
subgraph Email ["Email Handling"]
|
||||
Assembler["email/contentAssembler.ts"]
|
||||
Templater["email/templater.ts (or within Assembler)"]
|
||||
Sender["email/emailSender.ts"]
|
||||
Nodemailer["(nodemailer library)"]
|
||||
end
|
||||
|
||||
subgraph Stages ["Stage Testing Scripts (src/stages/)"]
|
||||
FetchStage["fetch_hn_data.ts"]
|
||||
ScrapeStage["scrape_articles.ts"]
|
||||
SummarizeStage["summarize_content.ts"]
|
||||
SendStage["send_digest.ts"]
|
||||
end
|
||||
|
||||
Entry --> ConfMod;
|
||||
Entry --> Logger;
|
||||
Entry --> Algolia;
|
||||
Entry --> Scraper;
|
||||
Entry --> Ollama;
|
||||
Entry --> Assembler;
|
||||
Entry --> Templater;
|
||||
Entry --> Sender;
|
||||
|
||||
Algolia -- uses --> NativeFetch["Node.js v22 Native fetch"];
|
||||
Ollama -- uses --> NativeFetch;
|
||||
Scraper -- uses --> NativeFetch;
|
||||
Scraper -- uses --> ArticleExtractor["(@extractus/article-extractor)"];
|
||||
Sender -- uses --> Nodemailer;
|
||||
ConfMod -- reads --> EnvFile;
|
||||
|
||||
Assembler -- reads --> LocalFS["Local Filesystem (./output)"];
|
||||
Entry -- writes --> LocalFS;
|
||||
|
||||
FetchStage --> Algolia;
|
||||
FetchStage --> LocalFS;
|
||||
ScrapeStage --> Scraper;
|
||||
ScrapeStage --> LocalFS;
|
||||
SummarizeStage --> Ollama;
|
||||
SummarizeStage --> LocalFS;
|
||||
SendStage --> Assembler;
|
||||
SendStage --> Templater;
|
||||
SendStage --> Sender;
|
||||
SendStage --> LocalFS;
|
||||
end
|
||||
|
||||
CLI["CLI (npm run ...)"] --> Entry;
|
||||
CLI -- runs --> FetchStage;
|
||||
CLI -- runs --> ScrapeStage;
|
||||
CLI -- runs --> SummarizeStage;
|
||||
CLI -- runs --> SendStage;
|
||||
|
||||
```
|
||||
|
||||
_Module Descriptions:_
|
||||
|
||||
- **`src/index.ts`**: The main entry point, orchestrating the entire pipeline flow from initialization to final email dispatch. Imports and calls functions from other modules.
|
||||
- **`src/config.ts`**: Responsible for loading and validating environment variables from the `.env` file using the `dotenv` library.
|
||||
- **`src/logger.ts`**: Provides a simple console logging utility used throughout the application.
|
||||
- **`src/clients/algoliaHNClient.ts`**: Encapsulates interaction with the Algolia Hacker News Search API using the native `fetch` API for fetching stories and comments.
|
||||
- **`src/clients/ollamaClient.ts`**: Encapsulates interaction with the local Ollama API endpoint using the native `fetch` API for generating summaries.
|
||||
- **`src/scraper/articleScraper.ts`**: Handles fetching article HTML using native `fetch` and extracting text content using `@extractus/article-extractor`. Includes robust error handling for fetch and extraction failures.
|
||||
- **`src/email/contentAssembler.ts`**: Reads persisted story data and summaries from the local output directory.
|
||||
- **`src/email/templater.ts` (or integrated)**: Renders the HTML email content using the assembled data.
|
||||
- **`src/email/emailSender.ts`**: Configures and uses Nodemailer to send the generated HTML email.
|
||||
- **`src/stages/*.ts`**: Individual scripts designed to run specific pipeline stages independently for testing, using persisted data from previous stages as input where applicable.
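
As an illustration of the validation role described for `src/config.ts`, a minimal sketch might look like the following. The names `AppConfig`, `REQUIRED_VARS`, and `loadConfig` are hypothetical, and the list of required variables is an assumption (only `EMAIL_USER` and `EMAIL_PASS` are used here); how the `.env` file gets loaded into the environment is deliberately left out.

```typescript
// Hypothetical sketch of startup validation; the variable list is an assumption.
interface AppConfig {
  emailUser: string;
  emailPass: string;
}

const REQUIRED_VARS = ["EMAIL_USER", "EMAIL_PASS"] as const;

// Accepts any env-like record so it can be tested without touching process.env.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  // Treat unset and empty-string values as missing.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    emailUser: env["EMAIL_USER"] as string,
    emailPass: env["EMAIL_PASS"] as string,
  };
}
```

Failing fast at startup with one message that names every missing variable tends to be easier to debug than discovering them one by one later in the pipeline.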
|
||||
|
||||
## Key Architectural Decisions & Patterns
|
||||
|
||||
- **Pipeline Architecture:** A sequential flow where each stage processes data and passes artifacts to the next via the local filesystem. Chosen for simplicity and to easily support independent stage testing.
|
||||
- **Local Execution & File Persistence:** All execution is local, and intermediate artifacts (`_data.json`, `_article.txt`, `_summary.json`) are stored in a date-stamped `./output` directory. This avoids database setup for MVP and facilitates debugging/stage testing.
|
||||
- **Native `fetch` API:** Mandated by constraints for all HTTP requests (Algolia, Ollama, Article Scraping). Ensures usage of the latest Node.js features.
|
||||
- **Modular Clients:** External interactions (Algolia, Ollama) are encapsulated in dedicated client modules (`src/clients/`). This promotes separation of concerns and makes swapping implementations (e.g., different LLM API) easier.
|
||||
- **Configuration via `.env`:** Standard approach using `dotenv` for managing API keys, endpoints, and behavioral parameters (as per boilerplate).
|
||||
- **Stage Testing Utilities:** Dedicated scripts (`src/stages/*.ts`) allow isolated testing of fetching, scraping, summarization, and emailing, fulfilling a key PRD requirement.
|
||||
- **Graceful Error Handling (Scraping):** Article scraping failures are logged but do not halt the main pipeline, allowing the process to continue with discussion summaries only, as required. Other errors (API, LLM) are logged.
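
The file-persistence decision above implies a predictable naming scheme for artifacts. A minimal sketch of that scheme, assuming hypothetical helper names (`dateStamp`, `artifactPath`) and local-time dates, could be:

```typescript
// Hypothetical helpers; names are assumptions made for illustration.
type ArtifactKind = "data" | "article" | "summary";

// Suffixes taken from the artifact names listed in this document.
const ARTIFACT_SUFFIX: Record<ArtifactKind, string> = {
  data: "_data.json",
  article: "_article.txt",
  summary: "_summary.json",
};

// Formats a date as YYYY-MM-DD using the machine's local time zone.
function dateStamp(date: Date): string {
  const yyyy = String(date.getFullYear());
  const mm = String(date.getMonth() + 1).padStart(2, "0");
  const dd = String(date.getDate()).padStart(2, "0");
  return `${yyyy}-${mm}-${dd}`;
}

// Builds e.g. ./output/2025-05-04/12345_summary.json
function artifactPath(baseDir: string, date: Date, storyId: string, kind: ArtifactKind): string {
  return `${baseDir}/${dateStamp(date)}/${storyId}${ARTIFACT_SUFFIX[kind]}`;
}
```

Because every stage derives paths from the same function, a stage-testing script can locate the output of an earlier stage from just the date and story ID.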
|
||||
|
||||
## Core Workflow / Sequence Diagrams (Simplified)
|
||||
|
||||
**(Diagram Suggestion for Canvas: Create a Sequence Diagram showing interactions)**
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant CLI
|
||||
participant Index as index.ts
|
||||
participant Config as config.ts
|
||||
participant Logger as logger.ts
|
||||
participant OutputDir as Output Dir Setup
|
||||
participant Algolia as algoliaHNClient.ts
|
||||
participant Scraper as articleScraper.ts
|
||||
participant Ollama as ollamaClient.ts
|
||||
participant Assembler as contentAssembler.ts
|
||||
participant Templater as templater.ts
|
||||
participant Sender as emailSender.ts
|
||||
participant FS as Local Filesystem (./output/YYYY-MM-DD)
|
||||
|
||||
CLI->>Index: npm run dev
|
||||
Index->>Config: Load .env vars
|
||||
Index->>Logger: Initialize
|
||||
Index->>OutputDir: Create/Verify Date Dir
|
||||
Index->>Algolia: fetchTopStories()
|
||||
Algolia-->>Index: stories[]
|
||||
loop For Each Story
|
||||
Index->>Algolia: fetchCommentsForStory(storyId, MAX_COMMENTS)
|
||||
Algolia-->>Index: comments[]
|
||||
Index->>FS: Write {storyId}_data.json
|
||||
alt Has Valid story.url
|
||||
Index->>Scraper: scrapeArticle(story.url)
|
||||
Scraper-->>Index: articleContent (string | null)
|
||||
alt Scrape Success
|
||||
Index->>FS: Write {storyId}_article.txt
|
||||
end
|
||||
end
|
||||
alt Has articleContent
|
||||
Index->>Ollama: generateSummary(ARTICLE_PROMPT, articleContent)
|
||||
Ollama-->>Index: articleSummary (string | null)
|
||||
end
|
||||
alt Has comments[]
|
||||
Index->>Ollama: generateSummary(DISCUSSION_PROMPT, formattedComments)
|
||||
Ollama-->>Index: discussionSummary (string | null)
|
||||
end
|
||||
Index->>FS: Write {storyId}_summary.json
|
||||
end
|
||||
Index->>Assembler: assembleDigestData(dateDirPath)
|
||||
Assembler->>FS: Read _data.json, _summary.json files
|
||||
Assembler-->>Index: digestData[]
|
||||
alt digestData is not empty
|
||||
Index->>Templater: renderDigestHtml(digestData, date)
|
||||
Templater-->>Index: htmlContent
|
||||
Index->>Sender: sendDigestEmail(subject, htmlContent)
|
||||
Sender-->>Index: success (boolean)
|
||||
end
|
||||
Index->>Logger: Log final status
|
||||
```
|
||||
|
||||
## Infrastructure and Deployment Overview
|
||||
|
||||
- **Cloud Provider(s):** N/A (Local Machine Execution Only for MVP)
|
||||
- **Core Services Used:** N/A
|
||||
- **Infrastructure as Code (IaC):** N/A
|
||||
- **Deployment Strategy:** Manual CLI execution (`npm run dev` for development with `ts-node`, `npm run build && npm start` for running compiled JS). No automated deployment pipeline for MVP.
|
||||
- **Environments:** Single: Local development machine.
|
||||
|
||||
## Key Reference Documents
|
||||
|
||||
- docs/prd.md
|
||||
- docs/epic1-draft.txt, docs/epic2-draft.txt, ... docs/epic5-draft.txt
|
||||
- docs/tech-stack.md
|
||||
- docs/project-structure.md
|
||||
- docs/coding-standards.md
|
||||
- docs/api-reference.md
|
||||
- docs/data-models.md
|
||||
- docs/environment-vars.md
|
||||
- docs/testing-strategy.md
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ---------------------------------- | ----------- |
|
||||
| Initial draft | 2025-05-04 | 0.1 | Initial draft based on PRD & Epics | 3-Architect |
|
||||
@@ -1,80 +0,0 @@
|
||||
# BMad Hacker Daily Digest Coding Standards and Patterns
|
||||
|
||||
This document outlines the coding standards, design patterns, and best practices to be followed during the development of the BMad Hacker Daily Digest project. Adherence to these standards is crucial for maintainability, readability, and collaboration.
|
||||
|
||||
## Architectural / Design Patterns Adopted
|
||||
|
||||
- **Sequential Pipeline:** The core application follows a linear sequence of steps (fetch, scrape, summarize, email) orchestrated within `src/core/pipeline.ts`.
|
||||
- **Modular Design:** The application is broken down into distinct modules based on responsibility (e.g., `clients/`, `scraper/`, `email/`, `utils/`) to promote separation of concerns, testability, and maintainability. See `docs/project-structure.md`.
|
||||
- **Client Abstraction:** External service interactions (Algolia, Ollama) are encapsulated within dedicated client modules in `src/clients/`.
|
||||
- **Filesystem Persistence:** Intermediate data is persisted to the local filesystem instead of a database, acting as a handoff between pipeline stages.
|
||||
|
||||
## Coding Standards
|
||||
|
||||
- **Primary Language:** TypeScript (v5.x, as configured in boilerplate)
|
||||
- **Primary Runtime:** Node.js (v22.x, as required by PRD)
|
||||
- **Style Guide & Linter:** ESLint and Prettier. Configuration is provided by the `bmad-boilerplate`.
|
||||
- **Mandatory:** Run `npm run lint` and `npm run format` regularly and before committing code. Code must be free of lint errors.
|
||||
- **Naming Conventions:**
|
||||
- Variables & Functions: `camelCase`
|
||||
- Classes, Types, Interfaces: `PascalCase`
|
||||
- Constants: `UPPER_SNAKE_CASE`
|
||||
- Files: `kebab-case.ts` (e.g., `article-scraper.ts`) or `camelCase.ts` (e.g., `ollamaClient.ts`). Be consistent within module types (e.g., all clients follow one pattern, all utils another). Default to `camelCase.ts` where the file name mirrors a class or module name (e.g., `ollamaClient.ts`) and `kebab-case.ts` for more descriptive utils or stage runners (e.g., `fetch-hn-data.ts`).
|
||||
- Test Files: `*.test.ts` (e.g., `ollamaClient.test.ts`)
|
||||
- **File Structure:** Adhere strictly to the layout defined in `docs/project-structure.md`.
|
||||
- **Asynchronous Operations:** **Mandatory:** Use `async`/`await` for all asynchronous operations (e.g., native `fetch` HTTP calls, `fs/promises` file operations, Ollama client calls, Nodemailer `sendMail`). Avoid raw Promise `.then()`/`.catch()` chains where `async`/`await` provides better readability.
|
||||
- **Type Safety:** Leverage TypeScript's static typing. Use interfaces and types defined in `src/types/` where appropriate. Assume `strict` mode is enabled in `tsconfig.json` (from boilerplate). Avoid using `any` unless absolutely necessary and justified.
|
||||
- **Comments & Documentation:**
|
||||
- Use JSDoc comments for exported functions, classes, and complex logic.
|
||||
- Keep comments concise and focused on the _why_, not the _what_, unless the code is particularly complex.
|
||||
- Update READMEs as needed for setup or usage changes.
|
||||
- **Dependency Management:**
|
||||
- Use `npm` for package management.
|
||||
- Keep production dependencies minimal, as required by the PRD. Justify any additions.
|
||||
- Use `devDependencies` for testing, linting, and build tools.
|
||||
|
||||
## Error Handling Strategy
|
||||
|
||||
- **General Approach:** Use standard JavaScript `try...catch` blocks for operations that can fail (I/O, network requests, parsing, etc.). Throw specific `Error` objects with descriptive messages. Avoid catching errors without logging or re-throwing unless intentionally handling a specific case.
|
||||
- **Logging:**
|
||||
- **Mandatory:** Use the central logger utility (`src/utils/logger.ts`) for all console output (INFO, WARN, ERROR). Do not use `console.log` directly in application logic.
|
||||
- **Format:** Basic text format for MVP. Structured JSON logging to files is a post-MVP enhancement.
|
||||
- **Levels:** Use appropriate levels (`logger.info`, `logger.warn`, `logger.error`).
|
||||
- **Context:** Include relevant context in log messages (e.g., Story ID, function name, URL being processed) to aid debugging.
|
||||
- **Specific Handling Patterns:**
|
||||
- **External API Calls (Algolia, Ollama via `fetch`):**
|
||||
- Wrap `fetch` calls in `try...catch`.
|
||||
- Check `response.ok` status; if false, log the status code and potentially response body text, then treat as an error (e.g., return `null` or throw).
|
||||
- Log network errors caught by the `catch` block.
|
||||
- No automated retries required for MVP.
|
||||
- **Article Scraping (`articleScraper.ts`):**
|
||||
- Wrap `fetch` and text extraction (`article-extractor`) logic in `try...catch`.
|
||||
- Handle non-2xx responses, timeouts, non-HTML content types, and extraction errors.
|
||||
- **Crucial:** If scraping fails for any reason, log the error/reason using `logger.warn` or `logger.error`, return `null`, and **allow the main pipeline to continue processing the story** (using only comment summary). Do not throw an error that halts the entire application.
|
||||
- **File I/O (`fs` module):**
|
||||
- Wrap `fs` operations (especially writes) in `try...catch`. Log any file system errors using `logger.error`.
|
||||
- **Email Sending (`Nodemailer`):**
|
||||
- Wrap `transporter.sendMail()` in `try...catch`. Log success (including message ID) or failure clearly using the logger.
|
||||
- **Configuration Loading (`config.ts`):**
|
||||
- Check for the presence of all required environment variables at startup. Throw a fatal error and exit if required variables are missing.
|
||||
- **LLM Interaction (Ollama Client):**
|
||||
- **LLM Prompts:** Use the standardized prompts defined in `docs/prompts.md` when interacting with the Ollama client for consistency.
|
||||
- Wrap `generateSummary` calls in `try...catch`. Log errors from the client (which handles API/network issues).
|
||||
- **Comment Truncation:** Before sending comments for discussion summary, check for the `MAX_COMMENT_CHARS_FOR_SUMMARY` env var. If set to a positive number, truncate the combined comment text block to this length. Log a warning if truncation occurs. If not set, send the full text.
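
The comment-truncation rule above can be sketched as a small pure function. The name `truncateCommentsForSummary` and its parameters are hypothetical, and treating non-numeric, zero, or negative values the same as "not set" is an assumption, since the rule only specifies the positive-number and unset cases.

```typescript
// Hypothetical sketch; function and parameter names are assumptions.
function truncateCommentsForSummary(
  commentsText: string,
  maxCharsRaw: string | undefined, // raw value of MAX_COMMENT_CHARS_FOR_SUMMARY
  warn: (message: string) => void,
): string {
  const maxChars = Number.parseInt(maxCharsRaw ?? "", 10);
  // Assumption: unset, non-numeric, zero, or negative values all mean "no limit".
  if (!Number.isFinite(maxChars) || maxChars <= 0) {
    return commentsText;
  }
  if (commentsText.length <= maxChars) {
    return commentsText;
  }
  warn(`Comment text truncated from ${commentsText.length} to ${maxChars} characters`);
  return commentsText.slice(0, maxChars);
}
```

Passing the raw environment value and the warning function as arguments keeps the function free of global state, so the truncation and warning behavior can be checked in isolation.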
|
||||
|
||||
## Security Best Practices
|
||||
|
||||
- **Input Sanitization/Validation:** While primarily a local tool, validate critical inputs like external URLs (`story.articleUrl`) before attempting to fetch them. Basic checks (e.g., starts with `http://` or `https://`) are sufficient for MVP.
|
||||
- **Secrets Management:**
|
||||
- **Mandatory:** Store sensitive data (`EMAIL_USER`, `EMAIL_PASS`) only in the `.env` file.
|
||||
- **Mandatory:** Ensure the `.env` file is included in `.gitignore` and is never committed to version control.
|
||||
- Do not hardcode secrets anywhere in the source code.
|
||||
- **Dependency Security:** Periodically run `npm audit` to check for known vulnerabilities in dependencies. Consider enabling Dependabot if using GitHub.
|
||||
- **HTTP Client:** Use the native `fetch` API as required; avoid introducing less secure or overly complex HTTP client libraries.
|
||||
- **Scraping User-Agent:** Set a default User-Agent header in the scraper code (e.g., `BMadHackerDigest/0.1`). Allow overriding this default via the optional `SCRAPER_USER_AGENT` environment variable.
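
Two of the practices above, the basic URL check and the overridable User-Agent, lend themselves to a short sketch. The function names `isFetchableUrl` and `resolveUserAgent` are hypothetical; the default User-Agent string is the example value given in this list, and treating an empty override as "not set" is an assumption.

```typescript
// Hypothetical sketch; function names are assumptions made for illustration.
const DEFAULT_USER_AGENT = "BMadHackerDigest/0.1";

// Basic MVP-level check: only http:// and https:// URLs are considered fetchable.
function isFetchableUrl(url: string | null | undefined): url is string {
  if (typeof url !== "string") return false;
  const lower = url.toLowerCase();
  return lower.startsWith("http://") || lower.startsWith("https://");
}

// Uses SCRAPER_USER_AGENT when it is set to a non-empty value, otherwise the default.
function resolveUserAgent(env: Record<string, string | undefined>): string {
  const override = env["SCRAPER_USER_AGENT"]?.trim();
  return override ? override : DEFAULT_USER_AGENT;
}
```

A scraper could call `isFetchableUrl` before each request and skip the story (with a log entry) when it returns `false`, which matches the log-and-continue handling described for scraping failures.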
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | --------------------------- | ----------- |
|
||||
| Initial draft | 2025-05-04 | 0.1 | Initial draft based on Arch | 3-Architect |
|
||||
@@ -1,614 +0,0 @@
|
||||
# Epic 1 File
|
||||
|
||||
# Epic 1: Project Initialization & Core Setup
|
||||
|
||||
**Goal:** Initialize the project using the "bmad-boilerplate", manage dependencies, setup `.env` and config loading, establish basic CLI entry point, setup basic logging and output directory structure. This provides the foundational setup for all subsequent development work.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 1.1: Initialize Project from Boilerplate
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to set up the initial project structure using the `bmad-boilerplate`, so that I have the standard tooling (TS, Jest, ESLint, Prettier), configurations, and scripts in place.
|
||||
- **Detailed Requirements:**
|
||||
- Copy or clone the contents of the `bmad-boilerplate` into the new project's root directory.
|
||||
- Initialize a git repository in the project root directory (if not already done by cloning).
|
||||
- Ensure the `.gitignore` file from the boilerplate is present.
|
||||
- Run `npm install` to download and install all `devDependencies` specified in the boilerplate's `package.json`.
|
||||
- Verify that the core boilerplate scripts (`lint`, `format`, `test`, `build`) execute without errors on the initial codebase.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The project directory contains the files and structure from `bmad-boilerplate`.
|
||||
- AC2: A `node_modules` directory exists and contains packages corresponding to `devDependencies`.
|
||||
- AC3: `npm run lint` command completes successfully without reporting any linting errors.
|
||||
- AC4: `npm run format` command completes successfully, potentially making formatting changes according to Prettier rules. Running it a second time should result in no changes.
|
||||
- AC5: `npm run test` command executes Jest successfully (it may report "no tests found" which is acceptable at this stage).
|
||||
- AC6: `npm run build` command executes successfully, creating a `dist` directory containing compiled JavaScript output.
|
||||
- AC7: The `.gitignore` file exists and includes entries for `node_modules/`, `.env`, `dist/`, etc. as specified in the boilerplate.
|
||||
|
||||
---
|
||||
|
||||
### Story 1.2: Setup Environment Configuration
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to establish the environment configuration mechanism using `.env` files, so that secrets and settings (like output paths) can be managed outside of version control, following boilerplate conventions.
|
||||
- **Detailed Requirements:**
|
||||
- Add a production dependency for loading `.env` files (e.g., `dotenv`). Run `npm install dotenv --save-prod` (or similar library).
|
||||
- Verify the `.env.example` file exists (from boilerplate).
|
||||
- Add an initial configuration variable `OUTPUT_DIR_PATH=./output` to `.env.example`.
|
||||
- Create the `.env` file locally by copying `.env.example`. Populate `OUTPUT_DIR_PATH` if needed (can keep default).
|
||||
- Implement a utility module (e.g., `src/config.ts`) that loads environment variables from the `.env` file at application startup.
|
||||
- The utility should export the loaded configuration values (initially just `OUTPUT_DIR_PATH`).
|
||||
- Ensure the `.env` file is listed in `.gitignore` and is not committed.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The chosen `.env` library (e.g., `dotenv`) is listed under `dependencies` in `package.json` and `package-lock.json` is updated.
|
||||
- AC2: The `.env.example` file exists, is tracked by git, and contains the line `OUTPUT_DIR_PATH=./output`.
|
||||
- AC3: The `.env` file exists locally but is NOT tracked by git.
|
||||
- AC4: A configuration module (`src/config.ts` or similar) exists and successfully loads the `OUTPUT_DIR_PATH` value from `.env` when the application starts.
|
||||
- AC5: The loaded `OUTPUT_DIR_PATH` value is accessible within the application code.
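
A configuration module along the lines of this story could be sketched as follows. It assumes the `.env` file has already been loaded into the environment (for example by calling `dotenv.config()` before this code runs); `AppConfig` and `loadConfig` are hypothetical names, and failing fast on a missing value is an assumption borrowed from the coding standards rather than a requirement of this story.

```typescript
// Hypothetical shape of the exported configuration.
interface AppConfig {
  outputDirPath: string;
}

// Reads required values from an env-like object and fails fast when one is missing.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const outputDirPath = env["OUTPUT_DIR_PATH"]?.trim();
  if (!outputDirPath) {
    throw new Error("Missing required environment variable: OUTPUT_DIR_PATH");
  }
  return { outputDirPath };
}

// In the application this would probably be loadConfig(process.env).
const exampleConfig = loadConfig({ OUTPUT_DIR_PATH: "./output" });
console.log(`Configured output directory: ${exampleConfig.outputDirPath}`);
```

Taking the environment as a parameter keeps the loader easy to unit test without touching real environment variables.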
|
||||
|
||||
---
|
||||
|
||||
### Story 1.3: Implement Basic CLI Entry Point & Execution
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic `src/index.ts` entry point that can be executed via the boilerplate's `dev` and `start` scripts, providing a working foundation for the application logic.
|
||||
- **Detailed Requirements:**
|
||||
- Create the main application entry point file at `src/index.ts`.
|
||||
- Implement minimal code within `src/index.ts` to:
|
||||
- Import the configuration loading mechanism (from Story 1.2).
|
||||
- Log a simple startup message to the console (e.g., "BMad Hacker Daily Digest - Starting Up...").
|
||||
- (Optional) Log the loaded `OUTPUT_DIR_PATH` to verify config loading.
|
||||
- Confirm execution using boilerplate scripts.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `src/index.ts` file exists.
|
||||
- AC2: Running `npm run dev` executes `src/index.ts` via `ts-node` and logs the startup message to the console.
|
||||
- AC3: Running `npm run build` successfully compiles `src/index.ts` (and any imports) into the `dist` directory.
|
||||
- AC4: Running `npm start` (after a successful build) executes the compiled code from `dist` and logs the startup message to the console.
|
||||
|
||||
---
|
||||
|
||||
### Story 1.4: Setup Basic Logging and Output Directory
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic console logging mechanism and the dynamic creation of a date-stamped output directory, so that the application can provide execution feedback and prepare for storing data artifacts in subsequent epics.
|
||||
- **Detailed Requirements:**
|
||||
- Implement a simple, reusable logging utility module (e.g., `src/logger.ts`). Initially, it can wrap `console.log`, `console.warn`, `console.error`.
|
||||
- Refactor `src/index.ts` to use this `logger` for its startup message(s).
|
||||
- In `src/index.ts` (or a setup function called by it):
|
||||
- Retrieve the `OUTPUT_DIR_PATH` from the configuration (loaded in Story 1.2).
|
||||
- Determine the current date in 'YYYY-MM-DD' format.
|
||||
- Construct the full path for the date-stamped subdirectory (e.g., `${OUTPUT_DIR_PATH}/YYYY-MM-DD`).
|
||||
- Check if the base output directory exists; if not, create it.
|
||||
- Check if the date-stamped subdirectory exists; if not, create it recursively. Use Node.js `fs` module (e.g., `fs.mkdirSync(path, { recursive: true })`).
|
||||
- Log (using the logger) the full path of the output directory being used for the current run (e.g., "Output directory for this run: ./output/2025-05-04").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: A logger utility module (`src/logger.ts` or similar) exists and is used for console output in `src/index.ts`.
|
||||
- AC2: Running `npm run dev` or `npm start` logs the startup message via the logger.
|
||||
- AC3: Running the application creates the base output directory (e.g., `./output` defined in `.env`) if it doesn't already exist.
|
||||
- AC4: Running the application creates a date-stamped subdirectory (e.g., `./output/2025-05-04`) within the base output directory if it doesn't already exist.
|
||||
- AC5: The application logs a message indicating the full path to the date-stamped output directory created/used for the current execution.
|
||||
- AC6: The application exits gracefully after performing these setup steps (for now).
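
The logger and date-stamped directory requirements can be illustrated with a short sketch. The names `logger`, `formatDate`, and `ensureOutputDir` are assumptions for illustration, and the date is formatted in local time, which the story does not specify.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Minimal logger that wraps the console methods, as the story suggests.
const logger = {
  info: (message: string): void => console.log(`[INFO] ${message}`),
  warn: (message: string): void => console.warn(`[WARN] ${message}`),
  error: (message: string): void => console.error(`[ERROR] ${message}`),
};

// Formats a date as YYYY-MM-DD using local time (an assumption).
function formatDate(date: Date): string {
  const year = date.getFullYear();
  const month = String(date.getMonth() + 1).padStart(2, "0");
  const day = String(date.getDate()).padStart(2, "0");
  return `${year}-${month}-${day}`;
}

// Creates <baseDir>/<YYYY-MM-DD> and returns its path.
function ensureOutputDir(baseDir: string, now: Date = new Date()): string {
  const runDir = path.join(baseDir, formatDate(now));
  // recursive: true creates missing parents and does not throw if the directory exists.
  fs.mkdirSync(runDir, { recursive: true });
  logger.info(`Output directory for this run: ${runDir}`);
  return runDir;
}
```

Because `recursive: true` also creates the base directory, a single `mkdirSync` call covers both existence checks listed in the requirements.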
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------- | -------------- |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 1 | 2-pm |
|
||||
|
||||
# Epic 2 File
|
||||
|
||||
# Epic 2: HN Data Acquisition & Persistence
|
||||
|
||||
**Goal:** Implement fetching top 10 stories and their comments (respecting limits) from Algolia HN API, and persist this raw data locally into the date-stamped output directory created in Epic 1. Implement a stage testing utility for fetching.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 2.1: Implement Algolia HN API Client
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a dedicated client module to interact with the Algolia Hacker News Search API, so that fetching stories and comments is encapsulated, reusable, and uses the required native `fetch` API.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/clients/algoliaHNClient.ts`.
|
||||
- Implement an async function `fetchTopStories` within the client:
|
||||
- Use native `fetch` to call the Algolia HN Search API endpoint for front-page stories (e.g., `http://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=10`). Adjust `hitsPerPage` if needed to ensure 10 stories.
|
||||
- Parse the JSON response.
|
||||
- Extract required metadata for each story: `objectID` (use as `storyId`), `title`, `url` (article URL), `points`, `num_comments`. Handle potential missing `url` field gracefully (log warning, maybe skip story later if URL needed).
|
||||
- Construct the `hnUrl` for each story (e.g., `https://news.ycombinator.com/item?id={storyId}`).
|
||||
- Return an array of structured story objects.
|
||||
- Implement a separate async function `fetchCommentsForStory` within the client:
|
||||
- Accept `storyId` and `maxComments` limit as arguments.
|
||||
- Use native `fetch` to call the Algolia HN Search API endpoint for comments of a specific story (e.g., `http://hn.algolia.com/api/v1/search?tags=comment,story_{storyId}&hitsPerPage={maxComments}`).
|
||||
- Parse the JSON response.
|
||||
- Extract required comment data: `objectID` (use as `commentId`), `comment_text`, `author`, `created_at`.
|
||||
- Filter out comments where `comment_text` is null or empty. Ensure only up to `maxComments` are returned.
|
||||
- Return an array of structured comment objects.
|
||||
- Implement basic error handling using `try...catch` around `fetch` calls and check `response.ok` status. Log errors using the logger utility from Epic 1.
|
||||
- Define TypeScript interfaces/types for the expected structures of API responses (stories, comments) and the data returned by the client functions (e.g., `Story`, `Comment`).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The module `src/clients/algoliaHNClient.ts` exists and exports `fetchTopStories` and `fetchCommentsForStory` functions.
|
||||
- AC2: Calling `fetchTopStories` makes a network request to the correct Algolia endpoint and returns a promise resolving to an array of 10 `Story` objects containing the specified metadata.
|
||||
- AC3: Calling `fetchCommentsForStory` with a valid `storyId` and `maxComments` limit makes a network request to the correct Algolia endpoint and returns a promise resolving to an array of `Comment` objects (up to `maxComments`), filtering out empty ones.
|
||||
- AC4: Both functions use the native `fetch` API internally.
|
||||
- AC5: Network errors or non-successful API responses (e.g., status 4xx, 5xx) are caught and logged using the logger.
|
||||
- AC6: Relevant TypeScript types (`Story`, `Comment`, etc.) are defined and used within the client module.
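
The story-fetching half of this client might be sketched as below. The field mapping follows the requirements above; returning an empty array on failure, the `AlgoliaStoryHit` subset type, and the `toStory` helper are assumptions made for illustration, and `console.error` stands in for the project's logger.

```typescript
interface Story {
  storyId: string;
  title: string;
  url: string | null;
  hnUrl: string;
  points: number;
  numComments: number;
}

// Subset of an Algolia hit that this sketch relies on (assumed optional/nullable).
interface AlgoliaStoryHit {
  objectID: string;
  title?: string | null;
  url?: string | null;
  points?: number | null;
  num_comments?: number | null;
}

// Pure mapping from an API hit to the internal Story shape.
function toStory(hit: AlgoliaStoryHit): Story {
  return {
    storyId: hit.objectID,
    title: hit.title ?? "",
    url: hit.url ?? null, // a missing URL is kept as null so later stages can skip scraping
    hnUrl: `https://news.ycombinator.com/item?id=${hit.objectID}`,
    points: hit.points ?? 0,
    numComments: hit.num_comments ?? 0,
  };
}

async function fetchTopStories(): Promise<Story[]> {
  const endpoint = "http://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=10";
  try {
    const response = await fetch(endpoint);
    if (!response.ok) {
      console.error(`Algolia request failed with status ${response.status}`);
      return [];
    }
    const body = (await response.json()) as { hits?: AlgoliaStoryHit[] };
    return (body.hits ?? []).map(toStory);
  } catch (err) {
    console.error(`Algolia request error: ${err instanceof Error ? err.message : String(err)}`);
    return [];
  }
}
```

Keeping `toStory` separate from the network call makes the mapping testable without hitting the API.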
|
||||
|
||||
---
|
||||
|
||||
### Story 2.2: Integrate HN Data Fetching into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to integrate the HN data fetching logic into the main application workflow (`src/index.ts`), so that running the app retrieves the top 10 stories and their comments after completing the setup from Epic 1.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts` (or a main async function called by it).
|
||||
- Import the `algoliaHNClient` functions.
|
||||
- Import the configuration module to access `MAX_COMMENTS_PER_STORY`.
|
||||
- After the Epic 1 setup (config load, logger init, output dir creation), call `fetchTopStories()`.
|
||||
- Log the number of stories fetched.
|
||||
- Iterate through the array of fetched `Story` objects.
|
||||
- For each `Story`, call `fetchCommentsForStory()`, passing the `story.storyId` and the configured `MAX_COMMENTS_PER_STORY`.
|
||||
- Store the fetched comments within the corresponding `Story` object in memory (e.g., add a `comments: Comment[]` property to the `Story` object).
|
||||
- Log progress using the logger utility (e.g., "Fetched 10 stories.", "Fetching up to X comments for story {storyId}...").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes Epic 1 setup steps followed by fetching stories and then comments for each story.
|
||||
- AC2: Logs clearly show the start and successful completion of fetching stories, and the start of fetching comments for each of the 10 stories.
|
||||
- AC3: The configured `MAX_COMMENTS_PER_STORY` value is read from config and used in the calls to `fetchCommentsForStory`.
|
||||
- AC4: After successful execution, story objects held in memory contain a nested array of fetched comment objects. (Can be verified via debugger or temporary logging).
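
The per-story loop described in this story could look something like the sketch below. The comment fetcher is injected as a parameter so the example runs without network access; `attachComments`, `HnComment`, and `CommentFetcher` are hypothetical names, and stories are processed one at a time, which is an assumption rather than a stated requirement.

```typescript
// Named HnComment to avoid clashing with the DOM's global Comment type.
interface HnComment {
  commentId: string;
  text: string;
  author: string;
  createdAt: string;
}

interface StoryWithComments {
  storyId: string;
  title: string;
  comments: HnComment[];
}

type CommentFetcher = (storyId: string, maxComments: number) => Promise<HnComment[]>;

// Fetches comments for each story in turn and nests them on the story object.
async function attachComments(
  stories: { storyId: string; title: string }[],
  fetchComments: CommentFetcher,
  maxCommentsPerStory: number,
): Promise<StoryWithComments[]> {
  const enriched: StoryWithComments[] = [];
  for (const story of stories) {
    console.log(`Fetching up to ${maxCommentsPerStory} comments for story ${story.storyId}...`);
    const comments = await fetchComments(story.storyId, maxCommentsPerStory);
    // Defensive cap in case the fetcher returns more than requested.
    enriched.push({ ...story, comments: comments.slice(0, maxCommentsPerStory) });
  }
  return enriched;
}
```

In the main workflow, the second argument would be the client's `fetchCommentsForStory` and the third the configured `MAX_COMMENTS_PER_STORY`.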
|
||||
|
||||
---
|
||||
|
||||
### Story 2.3: Persist Fetched HN Data Locally
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save the fetched HN stories (including their comments) to JSON files in the date-stamped output directory, so that the raw data is persisted locally for subsequent pipeline stages and debugging.
|
||||
- **Detailed Requirements:**
|
||||
- Define a consistent JSON structure for the output file content. Example: `{ storyId: "...", title: "...", url: "...", hnUrl: "...", points: ..., fetchedAt: "ISO_TIMESTAMP", comments: [{ commentId: "...", text: "...", author: "...", createdAt: "ISO_TIMESTAMP", ... }, ...] }`. Include a timestamp for when the data was fetched.
|
||||
- Import Node.js `fs` (specifically `fs.writeFileSync`) and `path` modules.
|
||||
- In the main workflow (`src/index.ts`), within the loop iterating through stories (after comments have been fetched and added to the story object in Story 2.2):
|
||||
- Get the full path to the date-stamped output directory (determined in Epic 1).
|
||||
- Construct the filename for the story's data: `{storyId}_data.json`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Serialize the complete story object (including comments and fetch timestamp) to a JSON string using `JSON.stringify(storyObject, null, 2)` for readability.
|
||||
- Write the JSON string to the file using `fs.writeFileSync()`. Use a `try...catch` block for error handling.
|
||||
- Log (using the logger) the successful persistence of each story's data file or any errors encountered during file writing.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory (e.g., `./output/YYYY-MM-DD/`) contains exactly 10 files named `{storyId}_data.json`.
|
||||
- AC2: Each JSON file contains valid JSON representing a single story object, including its metadata, fetch timestamp, and an array of its fetched comments, matching the defined structure.
|
||||
- AC3: The number of comments in each file's `comments` array does not exceed `MAX_COMMENTS_PER_STORY`.
|
||||
- AC4: Logs indicate that saving data to a file was attempted for each story, reporting success or specific file writing errors.
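
The persistence step can be sketched as a small function that writes one `{storyId}_data.json` file per story. `persistStory` and `PersistedStory` are hypothetical names, the type lists only a subset of the fields from the example structure, and `console` stands in for the project's logger.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Subset of the persisted structure described above.
interface PersistedStory {
  storyId: string;
  title: string;
  fetchedAt: string;
  comments: unknown[];
}

// Writes the story as pretty-printed JSON; returns the file path, or null on failure.
function persistStory(outputDir: string, story: PersistedStory): string | null {
  const filePath = path.join(outputDir, `${story.storyId}_data.json`);
  try {
    fs.writeFileSync(filePath, JSON.stringify(story, null, 2), "utf-8");
    console.log(`Saved story data to ${filePath}`);
    return filePath;
  } catch (err) {
    console.error(`Failed to write ${filePath}: ${err instanceof Error ? err.message : String(err)}`);
    return null;
  }
}
```

Returning `null` instead of rethrowing lets the loop continue with the remaining stories after a single write failure, which is one possible reading of the error-handling requirement.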
|
||||
|
||||
---
|
||||
|
||||
### Story 2.4: Implement Stage Testing Utility for HN Fetching
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate, executable script that *only* performs the HN data fetching and persistence, so I can test and trigger this stage independently of the full pipeline.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new standalone script file: `src/stages/fetch_hn_data.ts`.
|
||||
- This script should perform the essential setup required for this stage: initialize logger, load configuration (`.env`), determine and create output directory (reuse or replicate logic from Epic 1 / `src/index.ts`).
|
||||
- The script should then execute the core logic of fetching stories via `algoliaHNClient.fetchTopStories`, fetching comments via `algoliaHNClient.fetchCommentsForStory` (using loaded config for limit), and persisting the results to JSON files using `fs.writeFileSync` (replicating logic from Story 2.3).
|
||||
- The script should log its progress using the logger utility.
|
||||
- Add a new script command to `package.json` under `"scripts"`: `"stage:fetch": "ts-node src/stages/fetch_hn_data.ts"`.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/fetch_hn_data.ts` exists.
|
||||
- AC2: The script `stage:fetch` is defined in `package.json`'s `scripts` section.
|
||||
- AC3: Running `npm run stage:fetch` executes successfully, performing only the setup, fetch, and persist steps.
|
||||
- AC4: Running `npm run stage:fetch` creates the same 10 `{storyId}_data.json` files in the correct date-stamped output directory as running the main `npm run dev` command (at the current state of development).
|
||||
- AC5: Logs generated by `npm run stage:fetch` reflect only the fetching and persisting steps, not subsequent pipeline stages.
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------- | -------------- |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 2 | 2-pm |
|
||||
|
||||
# Epic 3 File
|
||||
|
||||
# Epic 3: Article Scraping & Persistence
|
||||
|
||||
**Goal:** Implement a best-effort article scraping mechanism to fetch and extract plain text content from the external URLs associated with fetched HN stories. Handle failures gracefully and persist successfully scraped text locally. Implement a stage testing utility for scraping.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 3.1: Implement Basic Article Scraper Module
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module that attempts to fetch HTML from a URL and extract the main article text using basic methods, handling common failures gracefully, so article content can be prepared for summarization.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/scraper/articleScraper.ts`.
|
||||
- Add a suitable HTML parsing/extraction library dependency (e.g., `@extractus/article-extractor` recommended for simplicity, or `cheerio` for more control). Run `npm install @extractus/article-extractor --save-prod` (or chosen alternative).
|
||||
- Implement an async function `scrapeArticle(url: string): Promise<string | null>` within the module.
|
||||
- Inside the function:
|
||||
- Use native `fetch` to retrieve content from the `url`. Set a reasonable timeout (e.g., 10-15 seconds). Include a `User-Agent` header to mimic a browser.
|
||||
- Handle potential `fetch` errors (network errors, timeouts) using `try...catch`.
|
||||
- Check the `response.ok` status. If not okay, log error and return `null`.
|
||||
- Check the `Content-Type` header of the response. If it doesn't indicate HTML (e.g., does not include `text/html`), log warning and return `null`.
|
||||
- If HTML is received, attempt to extract the main article text using the chosen library (`article-extractor` preferred).
|
||||
- Wrap the extraction logic in a `try...catch` to handle library-specific errors.
|
||||
- Return the extracted plain text string if successful. Ensure it's just text, not HTML markup.
|
||||
- Return `null` if extraction fails or results in empty content.
|
||||
- Log all significant events, errors, or reasons for returning null (e.g., "Scraping URL...", "Fetch failed:", "Non-HTML content type:", "Extraction failed:", "Successfully extracted text") using the logger utility.
|
||||
- Define TypeScript types/interfaces as needed.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `articleScraper.ts` module exists and exports the `scrapeArticle` function.
|
||||
- AC2: The chosen scraping library (e.g., `@extractus/article-extractor`) is added to `dependencies` in `package.json`.
|
||||
- AC3: `scrapeArticle` uses native `fetch` with a timeout and User-Agent header.
|
||||
- AC4: `scrapeArticle` correctly handles fetch errors, non-OK responses, and non-HTML content types by logging and returning `null`.
|
||||
- AC5: `scrapeArticle` uses the chosen library to attempt text extraction from valid HTML content.
|
||||
- AC6: `scrapeArticle` returns the extracted plain text on success, and `null` on any failure (fetch, non-HTML, extraction error, empty result).
|
||||
- AC7: Relevant logs are produced for success, failure modes, and errors encountered during the process.
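
The fetch-and-validate flow of `scrapeArticle` might be sketched as follows. To keep the example free of third-party imports, text extraction is passed in as a hypothetical `extractText` callback (in the project this role would be filled by the chosen extraction library), so the signature differs from the one in the requirements. The User-Agent string and the 15-second default are illustrative values.

```typescript
// Hypothetical extractor signature standing in for the chosen library.
type TextExtractor = (html: string, url: string) => Promise<string | null>;

function isHtmlContentType(contentType: string | null): boolean {
  return contentType !== null && contentType.toLowerCase().includes("text/html");
}

async function scrapeArticle(
  url: string,
  extractText: TextExtractor,
  timeoutMs = 15000,
): Promise<string | null> {
  // Abort the request if it takes longer than timeoutMs.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    console.log(`Scraping URL: ${url}`);
    const response = await fetch(url, {
      signal: controller.signal,
      headers: { "User-Agent": "BMadHackerDigest/0.1" },
    });
    if (!response.ok) {
      console.error(`Fetch failed: ${url} returned status ${response.status}`);
      return null;
    }
    const contentType = response.headers.get("content-type");
    if (!isHtmlContentType(contentType)) {
      console.warn(`Non-HTML content type: ${contentType ?? "unknown"}`);
      return null;
    }
    const text = (await extractText(await response.text(), url))?.trim();
    // Empty extraction results are treated the same as failures.
    return text ? text : null;
  } catch (err) {
    // Covers network errors, timeouts (abort), and extractor exceptions.
    console.error(`Scraping failed for ${url}: ${err instanceof Error ? err.message : String(err)}`);
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```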
|
||||
|
||||
---
|
||||
|
||||
### Story 3.2: Integrate Article Scraping into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to integrate the article scraper into the main workflow (`src/index.ts`), attempting to scrape the article for each HN story that has a valid URL, after fetching its data.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts`.
|
||||
- Import the `scrapeArticle` function from `src/scraper/articleScraper.ts`.
|
||||
- Within the main loop iterating through the fetched stories (after comments are fetched in Epic 2):
|
||||
- Check if `story.url` exists and appears to be a valid HTTP/HTTPS URL. A simple check for starting with `http://` or `https://` is sufficient.
|
||||
- If the URL is missing or invalid, log a warning ("Skipping scraping for story {storyId}: Missing or invalid URL") and proceed to the next story's processing step.
|
||||
- If a valid URL exists, log ("Attempting to scrape article for story {storyId} from {story.url}").
|
||||
- Call `await scrapeArticle(story.url)`.
|
||||
- Store the result (the extracted text string or `null`) in memory, associated with the story object (e.g., add property `articleContent: string | null`).
|
||||
- Log the outcome clearly (e.g., "Successfully scraped article for story {storyId}", "Failed to scrape article for story {storyId}").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes Epic 1 & 2 steps, and then attempts article scraping for stories with valid URLs.
|
||||
- AC2: Stories with missing or invalid URLs are skipped, and a corresponding log message is generated.
|
||||
- AC3: For stories with valid URLs, the `scrapeArticle` function is called.
|
||||
- AC4: Logs clearly indicate the start and success/failure outcome of the scraping attempt for each relevant story.
|
||||
- AC5: Story objects held in memory after this stage contain an `articleContent` property holding the scraped text (string) or `null` if scraping was skipped or failed.
|
||||
|
||||
---
|
||||
|
||||
### Story 3.3: Persist Scraped Article Text Locally
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save successfully scraped article text to a separate local file for each story, so that the text content is available as input for the summarization stage.
|
||||
- **Detailed Requirements:**
|
||||
- Import Node.js `fs` and `path` modules if not already present in `src/index.ts`.
|
||||
- In the main workflow (`src/index.ts`), immediately after a successful call to `scrapeArticle` for a story (where the result is a non-null string):
|
||||
- Retrieve the full path to the current date-stamped output directory.
|
||||
- Construct the filename: `{storyId}_article.txt`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Get the successfully scraped article text string (`articleContent`).
|
||||
- Use `fs.writeFileSync(fullPath, articleContent, 'utf-8')` to save the text to the file. Wrap in `try...catch` for file system errors.
|
||||
- Log the successful saving of the file (e.g., "Saved scraped article text to {filename}") or any file writing errors encountered.
|
||||
- Ensure *no* `_article.txt` file is created if `scrapeArticle` returned `null` (due to skipping or failure).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory contains `_article.txt` files *only* for those stories where `scrapeArticle` succeeded and returned text content.
|
||||
- AC2: The name of each article text file is `{storyId}_article.txt`.
|
||||
- AC3: The content of each `_article.txt` file is the plain text string returned by `scrapeArticle`.
|
||||
- AC4: Logs confirm the successful writing of each `_article.txt` file or report specific file writing errors.
|
||||
- AC5: No empty `_article.txt` files are created. Files only exist if scraping was successful.
|
||||
|
||||
---
|
||||
|
||||
### Story 3.4: Implement Stage Testing Utility for Scraping
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate script/command to test the article scraping logic using HN story data from local files, allowing independent testing and debugging of the scraper.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new standalone script file: `src/stages/scrape_articles.ts`.
|
||||
- Import necessary modules: `fs`, `path`, `logger`, `config`, `scrapeArticle`.
|
||||
- The script should:
|
||||
- Initialize the logger.
|
||||
- Load configuration (to get `OUTPUT_DIR_PATH`).
|
||||
- Determine the target date-stamped directory path (e.g., `${OUTPUT_DIR_PATH}/YYYY-MM-DD`, using the current date or potentially an optional CLI argument). Ensure this directory exists.
|
||||
- Read the directory contents and identify all `{storyId}_data.json` files.
|
||||
- For each `_data.json` file found:
|
||||
- Read and parse the JSON content.
|
||||
- Extract the `storyId` and `url`.
|
||||
- If a valid `url` exists, call `await scrapeArticle(url)`.
|
||||
- If scraping succeeds (returns text), save the text to `{storyId}_article.txt` in the same directory (using logic from Story 3.3). Overwrite if the file exists.
|
||||
- Log the progress and outcome (skip/success/fail) for each story processed.
|
||||
- Add a new script command to `package.json`: `"stage:scrape": "ts-node src/stages/scrape_articles.ts"`. Consider adding argument parsing later if needed to specify a date/directory.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/scrape_articles.ts` exists.
|
||||
- AC2: The script `stage:scrape` is defined in `package.json`.
|
||||
- AC3: Running `npm run stage:scrape` (assuming a directory with `_data.json` files exists from a previous `stage:fetch` run) reads these files.
|
||||
- AC4: The script calls `scrapeArticle` for stories with valid URLs found in the JSON files.
|
||||
- AC5: The script creates/updates `{storyId}_article.txt` files in the target directory corresponding to successfully scraped articles.
|
||||
- AC6: The script logs its actions (reading files, attempting scraping, saving results) for each story ID processed.
|
||||
- AC7: The script operates solely based on local `_data.json` files and fetching from external article URLs; it does not call the Algolia HN API.
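
The file-discovery part of this stage script could be sketched as below: it lists the `_data.json` files in a run directory and pulls out the `storyId` and `url` needed for scraping. `readStoredStories` and `StoredStory` are hypothetical names, and skipping unreadable or malformed files is an assumption about the desired behaviour.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

interface StoredStory {
  storyId: string;
  url: string | null;
}

// Reads every *_data.json file in runDir and returns the fields needed for scraping.
function readStoredStories(runDir: string): StoredStory[] {
  const stories: StoredStory[] = [];
  for (const fileName of fs.readdirSync(runDir)) {
    if (!fileName.endsWith("_data.json")) continue;
    try {
      const raw = fs.readFileSync(path.join(runDir, fileName), "utf-8");
      const parsed = JSON.parse(raw) as { storyId?: unknown; url?: unknown };
      if (typeof parsed.storyId === "string") {
        stories.push({
          storyId: parsed.storyId,
          url: typeof parsed.url === "string" ? parsed.url : null,
        });
      }
    } catch (err) {
      console.error(`Skipping unreadable file ${fileName}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  return stories;
}
```

The stage script would then loop over the returned entries, call `scrapeArticle` for each valid `url`, and write `{storyId}_article.txt` on success.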
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------- | -------------- |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 3 | 2-pm |
|
||||
|
||||
# Epic 4 File
|
||||
|
||||
# Epic 4: LLM Summarization & Persistence
|
||||
|
||||
**Goal:** Integrate with the configured local Ollama instance to generate summaries for successfully scraped article text and fetched comments. Persist these summaries locally. Implement a stage testing utility for summarization.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 4.1: Implement Ollama Client Module
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a client module to interact with the configured Ollama API endpoint via HTTP, handling requests and responses for text generation, so that summaries can be generated programmatically.
|
||||
- **Detailed Requirements:**
|
||||
- **Prerequisite:** Ensure a local Ollama instance is installed and running, accessible via the URL defined in `.env` (`OLLAMA_ENDPOINT_URL`), and that the model specified in `.env` (`OLLAMA_MODEL`) has been downloaded (e.g., via `ollama pull model_name`). Instructions for this setup should be in the project README.
|
||||
- Create a new module: `src/clients/ollamaClient.ts`.
|
||||
- Implement an async function `generateSummary(promptTemplate: string, content: string): Promise<string | null>`. *(Note: Parameter name changed for clarity)*
|
||||
- Add configuration variables `OLLAMA_ENDPOINT_URL` (e.g., `http://localhost:11434`) and `OLLAMA_MODEL` (e.g., `llama3`) to `.env.example`. Ensure they are loaded via the config module (`src/utils/config.ts`). Update local `.env` with actual values. Add optional `OLLAMA_TIMEOUT_MS` to `.env.example` with a default like `120000`.
|
||||
- Inside `generateSummary`:
|
||||
- Construct the full prompt string using the `promptTemplate` and the provided `content` (e.g., replacing a placeholder like `{Content Placeholder}` in the template, or simple concatenation if templates are basic).
|
||||
- Construct the Ollama API request payload (JSON): `{ model: configured_model, prompt: full_prompt, stream: false }`. Refer to Ollama `/api/generate` documentation and `docs/data-models.md`.
|
||||
- Use native `fetch` to send a POST request to the configured Ollama endpoint + `/api/generate`. Set appropriate headers (`Content-Type: application/json`). Use the configured `OLLAMA_TIMEOUT_MS` or a reasonable default (e.g., 2 minutes).
|
||||
- Handle `fetch` errors (network, timeout) using `try...catch`.
|
||||
- Check `response.ok`. If not OK, log the status/error and return `null`.
|
||||
- Parse the JSON response from Ollama. Extract the generated text (typically in the `response` field). Refer to `docs/data-models.md`.
|
||||
- Check for potential errors within the Ollama response structure itself (e.g., an `error` field).
|
||||
- Return the extracted summary string on success. Return `null` on any failure.
|
||||
- Log key events: initiating request (mention model), receiving response, success, failure reasons, potentially request/response time using the logger.
|
||||
- Define necessary TypeScript types for the Ollama request payload and expected response structure in `src/types/ollama.ts` (referenced in `docs/data-models.md`).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `ollamaClient.ts` module exists and exports `generateSummary`.
|
||||
- AC2: `OLLAMA_ENDPOINT_URL` and `OLLAMA_MODEL` are defined in `.env.example`, loaded via config, and used by the client. Optional `OLLAMA_TIMEOUT_MS` is handled.
|
||||
- AC3: `generateSummary` sends a correctly formatted POST request (model, full prompt based on template and content, stream:false) to the configured Ollama endpoint/path using native `fetch`.
|
||||
- AC4: Network errors, timeouts, and non-OK API responses are handled gracefully, logged, and result in a `null` return (given the Prerequisite Ollama service is running).
|
||||
- AC5: A successful Ollama response is parsed correctly, the generated text is extracted, and returned as a string.
|
||||
- AC6: Unexpected Ollama response formats or internal errors (e.g., `{"error": "..."}`) are handled, logged, and result in a `null` return.
|
||||
- AC7: Logs provide visibility into the client's interaction with the Ollama API.
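
A rough sketch of the client described in this story is shown below. The request and response shapes follow the requirements above (`model`, `prompt`, `stream: false`, and a `response` or `error` field). Unlike the two-parameter signature in the requirements, this version takes the endpoint, model, and timeout as explicit arguments to stay self-contained; that, along with the helper names `buildPrompt` and `extractSummary`, is an assumption for illustration.

```typescript
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  stream: false;
}

interface OllamaGenerateResponse {
  response?: string;
  error?: string;
}

// Fills the placeholder if the template has one; otherwise appends the content.
function buildPrompt(promptTemplate: string, content: string): string {
  const placeholder = "{Content Placeholder}";
  return promptTemplate.includes(placeholder)
    ? promptTemplate.split(placeholder).join(content)
    : `${promptTemplate}\n\n${content}`;
}

// Returns the generated text, or null for error payloads and empty responses.
function extractSummary(body: OllamaGenerateResponse): string | null {
  if (body.error) {
    console.error(`Ollama returned an error: ${body.error}`);
    return null;
  }
  const text = body.response?.trim();
  return text ? text : null;
}

async function generateSummary(
  promptTemplate: string,
  content: string,
  endpointUrl: string,
  model: string,
  timeoutMs = 120000,
): Promise<string | null> {
  const payload: OllamaGenerateRequest = {
    model,
    prompt: buildPrompt(promptTemplate, content),
    stream: false,
  };
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    console.log(`Requesting summary from Ollama model ${model}...`);
    const response = await fetch(`${endpointUrl}/api/generate`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
      signal: controller.signal,
    });
    if (!response.ok) {
      console.error(`Ollama request failed with status ${response.status}`);
      return null;
    }
    return extractSummary((await response.json()) as OllamaGenerateResponse);
  } catch (err) {
    console.error(`Ollama request error: ${err instanceof Error ? err.message : String(err)}`);
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```

In the project, the endpoint, model, and timeout would presumably come from `OLLAMA_ENDPOINT_URL`, `OLLAMA_MODEL`, and `OLLAMA_TIMEOUT_MS` via the config module.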
|
||||
|
||||
---
|
||||
|
||||
### Story 4.2: Define Summarization Prompts
|
||||
|
||||
* **User Story / Goal:** As a developer, I want standardized base prompts for generating article summaries and HN discussion summaries documented centrally, ensuring consistent instructions are sent to the LLM.
|
||||
* **Detailed Requirements:**
|
||||
* Define two standardized base prompts (`ARTICLE_SUMMARY_PROMPT`, `DISCUSSION_SUMMARY_PROMPT`) **and document them in `docs/prompts.md`**.
|
||||
* Ensure these prompts are accessible within the application code, for example, by defining them as exported constants in a dedicated module like `src/utils/prompts.ts`, which reads from or mirrors the content in `docs/prompts.md`.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: The `ARTICLE_SUMMARY_PROMPT` text is defined in `docs/prompts.md` with appropriate instructional content.
|
||||
* AC2: The `DISCUSSION_SUMMARY_PROMPT` text is defined in `docs/prompts.md` with appropriate instructional content.
|
||||
* AC3: The prompt texts documented in `docs/prompts.md` are available as constants or variables within the application code (e.g., via `src/utils/prompts.ts`) for use by the Ollama client integration.
|
||||
|
||||
---
|
||||
|
||||
### Story 4.3: Integrate Summarization into Main Workflow
|
||||
|
||||
* **User Story / Goal:** As a developer, I want to integrate the Ollama client into the main workflow to generate summaries for each story's scraped article text (if available) and fetched comments, using centrally defined prompts and handling potential comment length limits.
|
||||
* **Detailed Requirements:**
|
||||
* Modify the main execution flow in `src/index.ts` or `src/core/pipeline.ts`.
|
||||
* Import `ollamaClient.generateSummary` and the prompt constants/variables (e.g., from `src/utils/prompts.ts`, which reflect `docs/prompts.md`).
|
||||
* Load the optional `MAX_COMMENT_CHARS_FOR_SUMMARY` configuration value from `.env` via the config utility.
|
||||
* Within the main loop iterating through stories (after article scraping/persistence in Epic 3):
|
||||
* **Article Summary Generation:**
|
||||
* Check if the `story` object has non-null `articleContent`.
|
||||
* If yes: log "Attempting article summarization for story {storyId}", call `await generateSummary(ARTICLE_SUMMARY_PROMPT, story.articleContent)`, store the result (string or null) as `story.articleSummary`, log success/failure.
|
||||
* If no: set `story.articleSummary = null`, log "Skipping article summarization: No content".
|
||||
* **Discussion Summary Generation:**
|
||||
* Check if the `story` object has a non-empty `comments` array.
|
||||
* If yes:
|
||||
* Format the `story.comments` array into a single text block suitable for the LLM prompt (e.g., concatenating `comment.text` with separators like `---`).
|
||||
* **Check truncation limit:** If `MAX_COMMENT_CHARS_FOR_SUMMARY` is configured to a positive number and the `formattedCommentsText` length exceeds it, truncate `formattedCommentsText` to the limit and log a warning: "Comment text truncated to {limit} characters for summarization for story {storyId}".
|
||||
* Log "Attempting discussion summarization for story {storyId}".
|
||||
* Call `await generateSummary(DISCUSSION_SUMMARY_PROMPT, formattedCommentsText)`. *(Pass the potentially truncated text)*
|
||||
* Store the result (string or null) as `story.discussionSummary`. Log success/failure.
|
||||
* If no: set `story.discussionSummary = null`, log "Skipping discussion summarization: No comments".
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: Running `npm run dev` executes steps from Epics 1-3, then attempts summarization using the Ollama client.
|
||||
* AC2: Article summary is attempted only if `articleContent` exists for a story.
|
||||
* AC3: Discussion summary is attempted only if `comments` exist for a story.
|
||||
* AC4: `generateSummary` is called with the correct prompts (sourced consistently with `docs/prompts.md`) and corresponding content (article text or formatted/potentially truncated comments).
|
||||
* AC5: If `MAX_COMMENT_CHARS_FOR_SUMMARY` is set and comment text exceeds it, the text passed to `generateSummary` is truncated, and a warning is logged.
|
||||
* AC6: Logs clearly indicate the start, success, or failure (including null returns from the client) for both article and discussion summarization attempts per story.
|
||||
* AC7: Story objects in memory now contain `articleSummary` (string/null) and `discussionSummary` (string/null) properties.
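
The comment formatting and truncation steps above might look like the following sketch. The `CommentLike` type, the helper names, and the exact separator string are assumptions for illustration; the story only asks for `comment.text` values joined with a separator such as `---` and truncated when `MAX_COMMENT_CHARS_FOR_SUMMARY` is a positive number.

```typescript
// Minimal comment shape for this sketch; the real Comment type has more fields.
interface CommentLike {
  text: string;
}

const COMMENT_SEPARATOR = "\n---\n";

// Joins non-empty comment texts into a single block for the LLM prompt.
function formatComments(comments: CommentLike[]): string {
  return comments
    .map((comment) => comment.text.trim())
    .filter((text) => text.length > 0)
    .join(COMMENT_SEPARATOR);
}

interface TruncationResult {
  text: string;
  truncated: boolean; // the caller logs the warning when this is true
}

// Applies the optional limit; undefined, zero, negative, or NaN means "no limit".
function applyCommentLimit(text: string, maxChars?: number): TruncationResult {
  if (maxChars === undefined || !(maxChars > 0) || text.length <= maxChars) {
    return { text, truncated: false };
  }
  return { text: text.slice(0, maxChars), truncated: true };
}
```

Returning a `truncated` flag instead of logging inside the helper keeps the warning message (with the story ID) in the pipeline code, where that context is available.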
|
||||
|
||||
---
|
||||
|
||||
### Story 4.4: Persist Generated Summaries Locally
|
||||
|
||||
*(No changes needed for this story based on recent decisions)*
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save the generated article and discussion summaries (or null placeholders) to a local JSON file for each story, making them available for the email assembly stage.
|
||||
- **Detailed Requirements:**
|
||||
- Define the structure for the summary output file: `{storyId}_summary.json`. Content example: `{ "storyId": "...", "articleSummary": "...", "discussionSummary": "...", "summarizedAt": "ISO_TIMESTAMP" }`. Note that `articleSummary` and `discussionSummary` can be `null`.
|
||||
- Import `fs` and `path` in `src/index.ts` or `src/core/pipeline.ts` if needed.
|
||||
- In the main workflow loop, after *both* summarization attempts (article and discussion) for a story are complete:
|
||||
- Create a summary result object containing `storyId`, `articleSummary` (string or null), `discussionSummary` (string or null), and the current ISO timestamp (`new Date().toISOString()`). Add this timestamp to the in-memory `story` object as well (`story.summarizedAt`).
|
||||
- Get the full path to the date-stamped output directory.
|
||||
- Construct the filename: `{storyId}_summary.json`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Serialize the summary result object to JSON (`JSON.stringify(..., null, 2)`).
|
||||
- Use `fs.writeFileSync` to save the JSON to the file, wrapping in `try...catch`.
|
||||
- Log the successful saving of the summary file or any file writing errors.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory contains 10 files named `{storyId}_summary.json`.
|
||||
- AC2: Each `_summary.json` file contains valid JSON adhering to the defined structure.
|
||||
- AC3: The `articleSummary` field contains the generated summary string if successful, otherwise `null`.
|
||||
- AC4: The `discussionSummary` field contains the generated summary string if successful, otherwise `null`.
|
||||
- AC5: A valid ISO timestamp is present in the `summarizedAt` field.
|
||||
- AC6: Logs confirm successful writing of each summary file or report file system errors.
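
As a rough sketch of the structure described above, the summary object, its filename, and its serialized form can be built with pure helpers. The helper names are hypothetical; the file write itself (`fs.writeFileSync` inside `try...catch`, with the path built via `path.join()`) is left out so the example stays free of Node-specific imports.

```typescript
interface SummaryResult {
  storyId: string;
  articleSummary: string | null;
  discussionSummary: string | null;
  summarizedAt: string; // ISO timestamp
}

// Builds the object that is written to {storyId}_summary.json.
function buildSummaryResult(
  storyId: string,
  articleSummary: string | null,
  discussionSummary: string | null,
  now: Date = new Date()
): SummaryResult {
  return {
    storyId,
    articleSummary,
    discussionSummary,
    summarizedAt: now.toISOString(),
  };
}

function summaryFileName(storyId: string): string {
  return `${storyId}_summary.json`;
}

// Pretty-printed JSON, matching JSON.stringify(..., null, 2) from the story.
function serializeSummary(result: SummaryResult): string {
  return JSON.stringify(result, null, 2);
}
```

Accepting `now` as a parameter is a design choice for this sketch: it makes the timestamp deterministic in tests while defaulting to the current time in normal use.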
|
||||
|
||||
---
|
||||
|
||||
### Story 4.5: Implement Stage Testing Utility for Summarization
|
||||
|
||||
*(Changes needed to reflect prompt sourcing and optional truncation)*
|
||||
|
||||
* **User Story / Goal:** As a developer, I want a separate script/command to test the LLM summarization logic using locally persisted data (HN comments, scraped article text), allowing independent testing of prompts and Ollama interaction.
|
||||
* **Detailed Requirements:**
|
||||
* Create a new standalone script file: `src/stages/summarize_content.ts`.
|
||||
* Import necessary modules: `fs`, `path`, `logger`, `config`, `ollamaClient`, prompt constants (e.g., from `src/utils/prompts.ts`).
|
||||
* The script should:
|
||||
* Initialize logger, load configuration (Ollama endpoint/model, output dir, **optional `MAX_COMMENT_CHARS_FOR_SUMMARY`**).
|
||||
* Determine target date-stamped directory path.
|
||||
* Find all `{storyId}_data.json` files in the directory.
|
||||
* For each `storyId` found:
|
||||
* Read `{storyId}_data.json` to get comments. Format them into a single text block.
|
||||
* *Attempt* to read `{storyId}_article.txt`. Handle file-not-found gracefully. Store content or null.
|
||||
* Call `ollamaClient.generateSummary` for article text (if not null) using `ARTICLE_SUMMARY_PROMPT`.
|
||||
* **Apply truncation logic:** If comments exist, check `MAX_COMMENT_CHARS_FOR_SUMMARY` and truncate the formatted comment text block if needed, logging a warning.
|
||||
* Call `ollamaClient.generateSummary` for formatted comments (if comments exist) using `DISCUSSION_SUMMARY_PROMPT` *(passing potentially truncated text)*.
|
||||
* Construct the summary result object (with summaries or nulls, and timestamp).
|
||||
* Save the result object to `{storyId}_summary.json` in the same directory (using logic from Story 4.4), overwriting if exists.
|
||||
* Log progress (reading files, calling Ollama, truncation warnings, saving results) for each story ID.
|
||||
* Add script to `package.json`: `"stage:summarize": "ts-node src/stages/summarize_content.ts"`.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: The file `src/stages/summarize_content.ts` exists.
|
||||
* AC2: The script `stage:summarize` is defined in `package.json`.
|
||||
* AC3: Running `npm run stage:summarize` (after `stage:fetch` and `stage:scrape` runs) reads `_data.json` and attempts to read `_article.txt` files from the target directory.
|
||||
* AC4: The script calls the `ollamaClient` with correct prompts (sourced consistently with `docs/prompts.md`) and content derived *only* from the local files (requires Ollama service running per Story 4.1 prerequisite).
|
||||
* AC5: If `MAX_COMMENT_CHARS_FOR_SUMMARY` is set and applicable, comment text is truncated before calling the client, and a warning is logged.
|
||||
* AC6: The script creates/updates `{storyId}_summary.json` files in the target directory reflecting the results of the Ollama calls (summaries or nulls).
|
||||
* AC7: Logs show the script processing each story ID found locally, interacting with Ollama, and saving results.
|
||||
* AC8: The script does not call Algolia API or the article scraper module.
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| --------------------------- | ------------ | ------- | ------------------------------------ | -------------- |
|
||||
| Integrate prompts.md refs | 2025-05-04 | 0.3 | Updated stories 4.2, 4.3, 4.5 | 3-Architect |
|
||||
| Added Ollama Prereq Note | 2025-05-04 | 0.2 | Added note about local Ollama setup | 2-pm |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 4 | 2-pm |
|
||||
|
||||
# Epic 5 File
|
||||
|
||||
# Epic 5: Digest Assembly & Email Dispatch
|
||||
|
||||
**Goal:** Assemble the collected story data and summaries from local files, format them into a readable HTML email digest, and send the email using Nodemailer with configured credentials. Implement a stage testing utility for emailing with a dry-run option.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 5.1: Implement Email Content Assembler
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module that reads the persisted story metadata (`_data.json`) and summaries (`_summary.json`) from a specified directory and consolidates the information needed to render the email digest.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/email/contentAssembler.ts`.
|
||||
- Define a TypeScript type/interface `DigestData` representing the data needed per story for the email template: `{ storyId: string, title: string, hnUrl: string, articleUrl: string | null, articleSummary: string | null, discussionSummary: string | null }`.
|
||||
- Implement an async function `assembleDigestData(dateDirPath: string): Promise<DigestData[]>`.
|
||||
- The function should:
|
||||
- Use Node.js `fs` to read the contents of the `dateDirPath`.
|
||||
- Identify all files matching the pattern `{storyId}_data.json`.
|
||||
- For each `storyId` found:
|
||||
- Read and parse the `{storyId}_data.json` file. Extract `title`, `hnUrl`, and `url` (use as `articleUrl`). Handle potential file read/parse errors gracefully (log and skip story).
|
||||
- Attempt to read and parse the corresponding `{storyId}_summary.json` file. Handle file-not-found or parse errors gracefully (treat `articleSummary` and `discussionSummary` as `null`).
|
||||
- Construct a `DigestData` object for the story, including the extracted metadata and summaries (or nulls).
|
||||
- Collect all successfully constructed `DigestData` objects into an array.
|
||||
- Return the array. It should ideally contain 10 items if all previous stages succeeded.
|
||||
- Log progress (e.g., "Assembling digest data from directory...", "Processing story {storyId}...") and any errors encountered during file processing using the logger.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `contentAssembler.ts` module exists and exports `assembleDigestData` and the `DigestData` type.
|
||||
- AC2: `assembleDigestData` correctly reads `_data.json` files from the provided directory path.
|
||||
- AC3: It attempts to read corresponding `_summary.json` files, correctly handling cases where the summary file might be missing or unparseable (resulting in null summaries for that story).
|
||||
- AC4: The function returns a promise resolving to an array of `DigestData` objects, populated with data extracted from the files.
|
||||
- AC5: Errors during file reading or JSON parsing are logged, and the function returns data for successfully processed stories.
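
Two parts of the assembler lend themselves to small pure helpers: picking story IDs out of a directory listing, and tolerating a missing or malformed summary file. The sketch below assumes the caller has already read the directory and file contents with `fs`; the helper names are hypothetical and not part of the story.

```typescript
const DATA_FILE_PATTERN = /^(.+)_data\.json$/;

// Extracts story IDs from file names of the form {storyId}_data.json.
function storyIdsFromFileNames(fileNames: string[]): string[] {
  const ids: string[] = [];
  for (const name of fileNames) {
    const match = DATA_FILE_PATTERN.exec(name);
    if (match && match[1] !== undefined) ids.push(match[1]);
  }
  return ids;
}

interface Summaries {
  articleSummary: string | null;
  discussionSummary: string | null;
}

// Parses the raw text of a _summary.json file. Pass null when the file is
// missing; any parse problem yields null summaries instead of throwing.
function parseSummaries(raw: string | null): Summaries {
  const empty: Summaries = { articleSummary: null, discussionSummary: null };
  if (raw === null) return empty;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null) return empty;
    const record = parsed as Record<string, unknown>;
    const article = record["articleSummary"];
    const discussion = record["discussionSummary"];
    return {
      articleSummary: typeof article === "string" ? article : null,
      discussionSummary: typeof discussion === "string" ? discussion : null,
    };
  } catch {
    return empty;
  }
}
```

In `assembleDigestData`, the logging of parse failures would sit around these calls so that a bad summary file degrades to null summaries rather than dropping the story.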
|
||||
|
||||
---
|
||||
|
||||
### Story 5.2: Create HTML Email Template & Renderer
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic HTML email template and a function to render it with the assembled digest data, producing the final HTML content for the email body.
|
||||
- **Detailed Requirements:**
|
||||
- Define the HTML structure. This can be done using template literals within a function or potentially using a simple template file (e.g., `src/email/templates/digestTemplate.html`) and `fs.readFileSync`. Template literals are simpler for MVP.
|
||||
- Create a function `renderDigestHtml(data: DigestData[], digestDate: string): string` (e.g., in `src/email/contentAssembler.ts` or a new `templater.ts`).
|
||||
- The function should generate an HTML string with:
|
||||
- A suitable title in the body (e.g., `<h1>Hacker News Top 10 Summaries for ${digestDate}</h1>`).
|
||||
- A loop through the `data` array.
|
||||
- For each `story` in `data`:
|
||||
- Display `<h2><a href="${story.articleUrl || story.hnUrl}">${story.title}</a></h2>`.
|
||||
- Display `<p><a href="${story.hnUrl}">View HN Discussion</a></p>`.
|
||||
- Conditionally display `<h3>Article Summary</h3><p>${story.articleSummary}</p>` *only if* `story.articleSummary` is not null/empty.
|
||||
- Conditionally display `<h3>Discussion Summary</h3><p>${story.discussionSummary}</p>` *only if* `story.discussionSummary` is not null/empty.
|
||||
- Include a separator (e.g., `<hr style="margin-top: 20px; margin-bottom: 20px;">`).
|
||||
- Use basic inline CSS for minimal styling (margins, etc.) to ensure readability. Avoid complex layouts.
|
||||
- Return the complete HTML document as a string.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: A function `renderDigestHtml` exists that accepts the digest data array and a date string.
|
||||
- AC2: The function returns a single, complete HTML string.
|
||||
- AC3: The generated HTML includes a title with the date and correctly iterates through the story data.
|
||||
- AC4: For each story, the HTML displays the linked title, HN link, and conditionally displays the article and discussion summaries with headings.
|
||||
- AC5: Basic separators and margins are used for readability. The HTML is simple and likely to render reasonably in most email clients.
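
A template-literal version of the renderer, as suggested for the MVP, could look roughly like this. One addition that is not in the story: the sketch HTML-escapes titles, URLs, and summaries through a hypothetical `escapeHtml` helper, on the assumption that scraped and LLM-generated text may contain characters such as `<` or `&`.

```typescript
interface DigestData {
  storyId: string;
  title: string;
  hnUrl: string;
  articleUrl: string | null;
  articleSummary: string | null;
  discussionSummary: string | null;
}

// Assumption for this sketch: escape dynamic text before embedding it in HTML.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderDigestHtml(data: DigestData[], digestDate: string): string {
  const sections = data.map((story) => {
    const link = escapeHtml(story.articleUrl || story.hnUrl);
    const parts = [
      `<h2><a href="${link}">${escapeHtml(story.title)}</a></h2>`,
      `<p><a href="${escapeHtml(story.hnUrl)}">View HN Discussion</a></p>`,
    ];
    // Summaries are rendered only when present and non-empty.
    if (story.articleSummary) {
      parts.push(`<h3>Article Summary</h3><p>${escapeHtml(story.articleSummary)}</p>`);
    }
    if (story.discussionSummary) {
      parts.push(`<h3>Discussion Summary</h3><p>${escapeHtml(story.discussionSummary)}</p>`);
    }
    parts.push(`<hr style="margin-top: 20px; margin-bottom: 20px;">`);
    return parts.join("\n");
  });
  return [
    "<!DOCTYPE html>",
    "<html><body>",
    `<h1>Hacker News Top 10 Summaries for ${escapeHtml(digestDate)}</h1>`,
    ...sections,
    "</body></html>",
  ].join("\n");
}
```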
|
||||
|
||||
---
|
||||
|
||||
### Story 5.3: Implement Nodemailer Email Sender
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module to send the generated HTML email using Nodemailer, configured with credentials stored securely in the environment file.
|
||||
- **Detailed Requirements:**
|
||||
- Add Nodemailer dependencies: `npm install nodemailer @types/nodemailer --save-prod`.
|
||||
- Add required configuration variables to `.env.example` (and local `.env`): `EMAIL_HOST`, `EMAIL_PORT` (e.g., 587), `EMAIL_SECURE` (e.g., `false` for STARTTLS on 587, `true` for 465), `EMAIL_USER`, `EMAIL_PASS`, `EMAIL_FROM` (e.g., `"Your Name <you@example.com>"`), `EMAIL_RECIPIENTS` (comma-separated list).
|
||||
- Create a new module: `src/email/emailSender.ts`.
|
||||
- Implement an async function `sendDigestEmail(subject: string, htmlContent: string): Promise<boolean>`.
|
||||
- Inside the function:
|
||||
- Load the `EMAIL_*` variables from the config module.
|
||||
- Create a Nodemailer transporter using `nodemailer.createTransport` with the loaded config (host, port, secure flag, auth: { user, pass }).
|
||||
- Verify transporter configuration using `transporter.verify()` (optional but recommended). Log verification success/failure.
|
||||
- Parse the `EMAIL_RECIPIENTS` string into an array or comma-separated string suitable for the `to` field.
|
||||
- Define the `mailOptions`: `{ from: EMAIL_FROM, to: parsedRecipients, subject: subject, html: htmlContent }`.
|
||||
- Call `await transporter.sendMail(mailOptions)`.
|
||||
- If `sendMail` succeeds, log the success message including the `messageId` from the result. Return `true`.
|
||||
- If `sendMail` fails (throws error), log the error using the logger. Return `false`.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: `nodemailer` and `@types/nodemailer` dependencies are added.
|
||||
- AC2: `EMAIL_*` variables are defined in `.env.example` and loaded from config.
|
||||
- AC3: `emailSender.ts` module exists and exports `sendDigestEmail`.
|
||||
- AC4: `sendDigestEmail` correctly creates a Nodemailer transporter using configuration from `.env`. Transporter verification is attempted (optional AC).
|
||||
- AC5: The `to` field is correctly populated based on `EMAIL_RECIPIENTS`.
|
||||
- AC6: `transporter.sendMail` is called with correct `from`, `to`, `subject`, and `html` options.
|
||||
- AC7: Email sending success (including message ID) or failure is logged clearly.
|
||||
- AC8: The function returns `true` on successful sending, `false` otherwise.
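
Two configuration details are easy to get wrong here: environment variables are always strings, so `EMAIL_SECURE=false` must be parsed rather than coerced (the string `"false"` is truthy), and `EMAIL_RECIPIENTS` may contain stray spaces or empty entries. The sketch below shows one way to handle both; the helper names and the `MailOptions` type are assumptions for illustration and do not come from Nodemailer's typings.

```typescript
// Parses "true"/"false" style flags; anything other than "true" is false.
function parseBooleanFlag(raw: string | undefined): boolean {
  return raw !== undefined && raw.trim().toLowerCase() === "true";
}

// Splits a comma-separated recipient list, dropping blanks and whitespace.
function parseRecipients(raw: string): string[] {
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Local type for this sketch, mirroring the fields listed in the story.
interface MailOptions {
  from: string;
  to: string[];
  subject: string;
  html: string;
}

function buildMailOptions(
  from: string,
  recipientsRaw: string,
  subject: string,
  html: string
): MailOptions {
  return { from, to: parseRecipients(recipientsRaw), subject, html };
}
```

An empty `to` array after parsing is a useful signal for `sendDigestEmail` to log an error and return `false` before attempting to send.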
|
||||
|
||||
---
|
||||
|
||||
### Story 5.4: Integrate Email Assembly and Sending into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want the main application workflow (`src/index.ts`) to orchestrate the final steps: assembling digest data, rendering the HTML, and triggering the email send after all previous stages are complete.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts`.
|
||||
- Import `assembleDigestData`, `renderDigestHtml`, `sendDigestEmail`.
|
||||
- Execute these steps *after* the main loop (where stories are fetched, scraped, summarized, and persisted) completes:
|
||||
- Log "Starting final digest assembly and email dispatch...".
|
||||
- Determine the path to the current date-stamped output directory.
|
||||
- Call `const digestData = await assembleDigestData(dateDirPath)`.
|
||||
- Check if `digestData` array is not empty.
|
||||
- If yes:
|
||||
- Get the current date string (e.g., 'YYYY-MM-DD').
|
||||
- `const htmlContent = renderDigestHtml(digestData, currentDate)`.
|
||||
- `const subject = \`BMad Hacker Daily Digest - ${currentDate}\``.
|
||||
- `const emailSent = await sendDigestEmail(subject, htmlContent)`.
|
||||
- Log the final outcome based on `emailSent` ("Digest email sent successfully." or "Failed to send digest email.").
|
||||
- If no (`digestData` is empty or assembly failed):
|
||||
- Log an error: "Failed to assemble digest data or no data found. Skipping email."
|
||||
- Log "BMad Hacker Daily Digest process finished."
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes all stages (Epics 1-4) and then proceeds to email assembly and sending.
|
||||
- AC2: `assembleDigestData` is called correctly with the output directory path after other processing is done.
|
||||
- AC3: If data is assembled, `renderDigestHtml` and `sendDigestEmail` are called with the correct data, subject, and HTML.
|
||||
- AC4: The final success or failure of the email sending step is logged.
|
||||
- AC5: If `assembleDigestData` returns no data, email sending is skipped, and an appropriate message is logged.
|
||||
- AC6: The application logs a final completion message.
|
||||
|
||||
---
|
||||
|
||||
### Story 5.5: Implement Stage Testing Utility for Emailing
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate script/command to test the email assembly, rendering, and sending logic using persisted local data, including a crucial `--dry-run` option to prevent accidental email sending during tests.
|
||||
- **Detailed Requirements:**
|
||||
- Add `yargs` dependency for argument parsing: `npm install yargs @types/yargs --save-dev`.
|
||||
- Create a new standalone script file: `src/stages/send_digest.ts`.
|
||||
- Import necessary modules: `fs`, `path`, `logger`, `config`, `assembleDigestData`, `renderDigestHtml`, `sendDigestEmail`, `yargs`.
|
||||
- Use `yargs` to parse command-line arguments, specifically looking for a `--dry-run` boolean flag (defaulting to `false`). Allow an optional argument for specifying the date-stamped directory, otherwise default to current date.
|
||||
- The script should:
|
||||
- Initialize logger, load config.
|
||||
- Determine the target date-stamped directory path (from arg or default). Log the target directory.
|
||||
- Call `await assembleDigestData(dateDirPath)`.
|
||||
- If data is assembled and not empty:
|
||||
- Determine the date string for the subject/title.
|
||||
- Call `renderDigestHtml(digestData, dateString)` to get HTML.
|
||||
- Construct the subject string.
|
||||
- Check the `dryRun` flag:
|
||||
- If `true`: Log "DRY RUN enabled. Skipping actual email send.". Log the subject. Save the `htmlContent` to a file in the target directory (e.g., `_digest_preview.html`). Log that the preview file was saved.
|
||||
- If `false`: Log "Live run: Attempting to send email...". Call `await sendDigestEmail(subject, htmlContent)`. Log success/failure based on the return value.
|
||||
- If data assembly fails or is empty, log the error.
|
||||
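- Add script to `package.json`: `"stage:email": "ts-node src/stages/send_digest.ts"`. Arguments are passed after npm's own `--` separator on the command line (e.g., `npm run stage:email -- --dry-run`); a trailing `--` inside the script string itself is unnecessary and can cause argument parsers such as `yargs` to treat `--dry-run` as a positional argument rather than a flag.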
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/send_digest.ts` exists. `yargs` dependency is added.
|
||||
- AC2: The script `stage:email` is defined in `package.json` allowing arguments.
|
||||
- AC3: Running `npm run stage:email -- --dry-run` reads local data, renders HTML, logs the intent, saves `_digest_preview.html` locally, and does *not* call `sendDigestEmail`.
|
||||
- AC4: Running `npm run stage:email` (without `--dry-run`) reads local data, renders HTML, and *does* call `sendDigestEmail`, logging the outcome.
|
||||
- AC5: The script correctly identifies and acts upon the `--dry-run` flag.
|
||||
- AC6: Logs clearly distinguish between dry runs and live runs and report success/failure.
|
||||
- AC7: The script operates using only local files and the email configuration/service; it does not invoke prior pipeline stages (Algolia, scraping, Ollama).
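
The dry-run branching can be illustrated without any dependencies. Note the substitution: the story calls for `yargs`, while this sketch uses a tiny hand-rolled parser so it stays self-contained. The `--date=` option name is hypothetical, since the story only says the directory may be given as an optional argument.

```typescript
interface StageArgs {
  dryRun: boolean;
  date: string | null; // hypothetical --date=YYYY-MM-DD option
}

// Stand-in for yargs: recognizes --dry-run and --date=<value> only.
function parseStageArgs(argv: string[]): StageArgs {
  const args: StageArgs = { dryRun: false, date: null };
  for (const arg of argv) {
    if (arg === "--dry-run") {
      args.dryRun = true;
    } else if (arg.startsWith("--date=")) {
      args.date = arg.slice("--date=".length);
    }
  }
  return args;
}

type StageAction =
  | { kind: "skip" } // nothing assembled: log the error, do not send
  | { kind: "preview"; fileName: string } // dry run: save HTML, do not send
  | { kind: "send" }; // live run: call sendDigestEmail

function chooseAction(args: StageArgs, storyCount: number): StageAction {
  if (storyCount <= 0) return { kind: "skip" };
  if (args.dryRun) return { kind: "preview", fileName: "_digest_preview.html" };
  return { kind: "send" };
}
```

Separating the decision from the side effects makes it straightforward to verify that a dry run can never reach the send path.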
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------- | -------------- |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 5 | 2-pm |
|
||||
|
||||
# END EPIC FILES
|
||||
@@ -1,614 +0,0 @@
|
||||
# Epic 1 file
|
||||
|
||||
# Epic 1: Project Initialization & Core Setup
|
||||
|
||||
**Goal:** Initialize the project using the "bmad-boilerplate", manage dependencies, set up `.env` and config loading, establish a basic CLI entry point, and set up basic logging and the output directory structure. This provides the foundation for all subsequent development work.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 1.1: Initialize Project from Boilerplate
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to set up the initial project structure using the `bmad-boilerplate`, so that I have the standard tooling (TS, Jest, ESLint, Prettier), configurations, and scripts in place.
|
||||
- **Detailed Requirements:**
|
||||
- Copy or clone the contents of the `bmad-boilerplate` into the new project's root directory.
|
||||
- Initialize a git repository in the project root directory (if not already done by cloning).
|
||||
- Ensure the `.gitignore` file from the boilerplate is present.
|
||||
- Run `npm install` to download and install all `devDependencies` specified in the boilerplate's `package.json`.
|
||||
- Verify that the core boilerplate scripts (`lint`, `format`, `test`, `build`) execute without errors on the initial codebase.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The project directory contains the files and structure from `bmad-boilerplate`.
|
||||
- AC2: A `node_modules` directory exists and contains packages corresponding to `devDependencies`.
|
||||
- AC3: `npm run lint` command completes successfully without reporting any linting errors.
|
||||
- AC4: `npm run format` command completes successfully, potentially making formatting changes according to Prettier rules. Running it a second time should result in no changes.
|
||||
- AC5: `npm run test` command executes Jest successfully (it may report "no tests found" which is acceptable at this stage).
|
||||
- AC6: `npm run build` command executes successfully, creating a `dist` directory containing compiled JavaScript output.
|
||||
- AC7: The `.gitignore` file exists and includes entries for `node_modules/`, `.env`, `dist/`, etc. as specified in the boilerplate.
|
||||
|
||||
---
|
||||
|
||||
### Story 1.2: Setup Environment Configuration
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to establish the environment configuration mechanism using `.env` files, so that secrets and settings (like output paths) can be managed outside of version control, following boilerplate conventions.
|
||||
- **Detailed Requirements:**
|
||||
- Add a production dependency for loading `.env` files (e.g., `dotenv`). Run `npm install dotenv --save-prod` (or similar library).
|
||||
- Verify the `.env.example` file exists (from boilerplate).
|
||||
- Add an initial configuration variable `OUTPUT_DIR_PATH=./output` to `.env.example`.
|
||||
- Create the `.env` file locally by copying `.env.example`. Populate `OUTPUT_DIR_PATH` if needed (can keep default).
|
||||
- Implement a utility module (e.g., `src/config.ts`) that loads environment variables from the `.env` file at application startup.
|
||||
- The utility should export the loaded configuration values (initially just `OUTPUT_DIR_PATH`).
|
||||
- Ensure the `.env` file is listed in `.gitignore` and is not committed.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The chosen `.env` library (e.g., `dotenv`) is listed under `dependencies` in `package.json` and `package-lock.json` is updated.
|
||||
- AC2: The `.env.example` file exists, is tracked by git, and contains the line `OUTPUT_DIR_PATH=./output`.
|
||||
- AC3: The `.env` file exists locally but is NOT tracked by git.
|
||||
- AC4: A configuration module (`src/config.ts` or similar) exists and successfully loads the `OUTPUT_DIR_PATH` value from `.env` when the application starts.
|
||||
- AC5: The loaded `OUTPUT_DIR_PATH` value is accessible within the application code.
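
A minimal sketch of the configuration module's core logic follows. It takes the environment as a parameter instead of reading `process.env` directly, which is a choice made here for testability; in the real `src/config.ts` the `.env` file would be loaded first (for example with `dotenv`) and `process.env` passed in. The fallback to `./output` when the variable is unset is an assumption based on the default in `.env.example`.

```typescript
interface AppConfig {
  outputDirPath: string;
}

type Env = Record<string, string | undefined>;

const DEFAULT_OUTPUT_DIR = "./output"; // assumed fallback, mirrors .env.example

// Builds the typed config object from raw environment variables.
function loadConfig(env: Env): AppConfig {
  const raw = env["OUTPUT_DIR_PATH"];
  const value = raw === undefined ? "" : raw.trim();
  return {
    outputDirPath: value.length > 0 ? value : DEFAULT_OUTPUT_DIR,
  };
}
```

Later epics add more variables (Ollama and email settings); extending `AppConfig` and `loadConfig` keeps all environment parsing in one place.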
|
||||
|
||||
---
|
||||
|
||||
### Story 1.3: Implement Basic CLI Entry Point & Execution
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic `src/index.ts` entry point that can be executed via the boilerplate's `dev` and `start` scripts, providing a working foundation for the application logic.
|
||||
- **Detailed Requirements:**
|
||||
- Create the main application entry point file at `src/index.ts`.
|
||||
- Implement minimal code within `src/index.ts` to:
|
||||
- Import the configuration loading mechanism (from Story 1.2).
|
||||
- Log a simple startup message to the console (e.g., "BMad Hacker Daily Digest - Starting Up...").
|
||||
- (Optional) Log the loaded `OUTPUT_DIR_PATH` to verify config loading.
|
||||
- Confirm execution using boilerplate scripts.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `src/index.ts` file exists.
|
||||
- AC2: Running `npm run dev` executes `src/index.ts` via `ts-node` and logs the startup message to the console.
|
||||
- AC3: Running `npm run build` successfully compiles `src/index.ts` (and any imports) into the `dist` directory.
|
||||
- AC4: Running `npm start` (after a successful build) executes the compiled code from `dist` and logs the startup message to the console.
|
||||
|
||||
---
|
||||
|
||||
### Story 1.4: Setup Basic Logging and Output Directory
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic console logging mechanism and the dynamic creation of a date-stamped output directory, so that the application can provide execution feedback and prepare for storing data artifacts in subsequent epics.
|
||||
- **Detailed Requirements:**
|
||||
- Implement a simple, reusable logging utility module (e.g., `src/logger.ts`). Initially, it can wrap `console.log`, `console.warn`, `console.error`.
|
||||
- Refactor `src/index.ts` to use this `logger` for its startup message(s).
|
||||
- In `src/index.ts` (or a setup function called by it):
|
||||
- Retrieve the `OUTPUT_DIR_PATH` from the configuration (loaded in Story 1.2).
|
||||
- Determine the current date in 'YYYY-MM-DD' format.
|
||||
- Construct the full path for the date-stamped subdirectory (e.g., `${OUTPUT_DIR_PATH}/YYYY-MM-DD`).
|
||||
- Check if the base output directory exists; if not, create it.
|
||||
- Check if the date-stamped subdirectory exists; if not, create it recursively. Use Node.js `fs` module (e.g., `fs.mkdirSync(path, { recursive: true })`).
|
||||
- Log (using the logger) the full path of the output directory being used for the current run (e.g., "Output directory for this run: ./output/2025-05-04").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: A logger utility module (`src/logger.ts` or similar) exists and is used for console output in `src/index.ts`.
|
||||
- AC2: Running `npm run dev` or `npm start` logs the startup message via the logger.
|
||||
- AC3: Running the application creates the base output directory (e.g., `./output` defined in `.env`) if it doesn't already exist.
|
||||
- AC4: Running the application creates a date-stamped subdirectory (e.g., `./output/2025-05-04`) within the base output directory if it doesn't already exist.
|
||||
- AC5: The application logs a message indicating the full path to the date-stamped output directory created/used for the current execution.
|
||||
- AC6: The application exits gracefully after performing these setup steps (for now).
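
The date-stamped path construction might be sketched as below. Two assumptions are worth flagging: the date is taken in UTC (via `toISOString`), whereas the story does not say which time zone to use, and the path is built with plain string handling so the example has no Node imports. In the application itself, `path.join()` and `fs.mkdirSync(dir, { recursive: true })` would do the joining and directory creation.

```typescript
// Formats a date as YYYY-MM-DD in UTC (assumption: UTC is acceptable).
function dateStamp(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Builds "<base>/<YYYY-MM-DD>", tolerating a trailing slash on the base path.
function outputDirFor(baseDir: string, date: Date): string {
  const trimmedBase = baseDir.replace(/\/+$/, "");
  return `${trimmedBase}/${dateStamp(date)}`;
}
```

Because `recursive: true` creates missing parent directories and does not fail when the directory already exists, a single `mkdirSync` call on the date-stamped path would cover both existence checks listed in the requirements.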
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------- | -------------- |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 1 | 2-pm |
|
||||
|
||||
# Epic 2 File
|
||||
|
||||
# Epic 2: HN Data Acquisition & Persistence
|
||||
|
||||
**Goal:** Implement fetching top 10 stories and their comments (respecting limits) from Algolia HN API, and persist this raw data locally into the date-stamped output directory created in Epic 1. Implement a stage testing utility for fetching.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 2.1: Implement Algolia HN API Client
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a dedicated client module to interact with the Algolia Hacker News Search API, so that fetching stories and comments is encapsulated, reusable, and uses the required native `fetch` API.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/clients/algoliaHNClient.ts`.
|
||||
- Implement an async function `WorkspaceTopStories` within the client:
|
||||
- Use native `Workspace` to call the Algolia HN Search API endpoint for front-page stories (e.g., `http://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=10`). Adjust `hitsPerPage` if needed to ensure 10 stories.
|
||||
- Parse the JSON response.
|
||||
- Extract required metadata for each story: `objectID` (use as `storyId`), `title`, `url` (article URL), `points`, `num_comments`. Handle a missing `url` field gracefully: log a warning and keep the story, which may be skipped later by stages that need a URL.
|
||||
- Construct the `hnUrl` for each story (e.g., `https://news.ycombinator.com/item?id={storyId}`).
|
||||
- Return an array of structured story objects.
|
||||
- Implement a separate async function `fetchCommentsForStory` within the client:
|
||||
- Accept `storyId` and `maxComments` limit as arguments.
|
||||
- Use native `fetch` to call the Algolia HN Search API endpoint for comments of a specific story (e.g., `http://hn.algolia.com/api/v1/search?tags=comment,story_{storyId}&hitsPerPage={maxComments}`).
|
||||
- Parse the JSON response.
|
||||
- Extract required comment data: `objectID` (use as `commentId`), `comment_text`, `author`, `created_at`.
|
||||
- Filter out comments where `comment_text` is null or empty. Ensure only up to `maxComments` are returned.
|
||||
- Return an array of structured comment objects.
|
||||
- Implement basic error handling using `try...catch` around `fetch` calls and check `response.ok` status. Log errors using the logger utility from Epic 1.
|
||||
- Define TypeScript interfaces/types for the expected structures of API responses (stories, comments) and the data returned by the client functions (e.g., `Story`, `Comment`).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The module `src/clients/algoliaHNClient.ts` exists and exports `fetchTopStories` and `fetchCommentsForStory` functions.
|
||||
- AC2: Calling `fetchTopStories` makes a network request to the correct Algolia endpoint and returns a promise resolving to an array of 10 `Story` objects containing the specified metadata.
|
||||
- AC3: Calling `fetchCommentsForStory` with a valid `storyId` and `maxComments` limit makes a network request to the correct Algolia endpoint and returns a promise resolving to an array of `Comment` objects (up to `maxComments`), filtering out empty ones.
|
||||
- AC4: Both functions use the native `fetch` API internally.
|
||||
- AC5: Network errors or non-successful API responses (e.g., status 4xx, 5xx) are caught and logged using the logger.
|
||||
- AC6: Relevant TypeScript types (`Story`, `Comment`, etc.) are defined and used within the client module.
|
||||
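The client described in Story 2.1 could be shaped roughly as follows. This is a minimal sketch under stated assumptions: the `Story` and `HnComment` shapes, the `FetchLike` type, the choice to pass `fetch` in as a parameter (so the mapping can be exercised without network access), and returning an empty array on failure are all illustrative, not requirements from the story. Only the Algolia field names and the endpoint come from the text above.

```typescript
// Hit field names follow Story 2.1; everything else here is hypothetical.
interface StoryHit {
  objectID: string;
  title: string;
  url?: string | null;
  points: number;
  num_comments: number;
}

interface CommentHit {
  objectID: string;
  comment_text?: string | null;
  author: string;
  created_at: string;
}

interface Story {
  storyId: string;
  title: string;
  url: string | null;
  hnUrl: string;
  points: number;
  numComments: number;
}

interface HnComment {
  commentId: string;
  text: string;
  author: string;
  createdAt: string;
}

// Minimal surface of fetch used below; the native `fetch` satisfies it.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

const SEARCH_URL = "http://hn.algolia.com/api/v1/search";

// Maps one Algolia hit to a Story, tolerating a missing article URL.
function toStory(hit: StoryHit): Story {
  return {
    storyId: hit.objectID,
    title: hit.title,
    url: hit.url ?? null,
    hnUrl: `https://news.ycombinator.com/item?id=${hit.objectID}`,
    points: hit.points,
    numComments: hit.num_comments,
  };
}

// Drops comments with null or empty text, then caps the list at maxComments.
function toComments(hits: CommentHit[], maxComments: number): HnComment[] {
  return hits
    .filter((hit) => typeof hit.comment_text === "string" && hit.comment_text.trim() !== "")
    .slice(0, maxComments)
    .map((hit) => ({
      commentId: hit.objectID,
      text: hit.comment_text as string,
      author: hit.author,
      createdAt: hit.created_at,
    }));
}

// Fetches front-page stories; logs and returns [] on any failure (assumption).
async function fetchTopStories(fetchImpl: FetchLike): Promise<Story[]> {
  try {
    const response = await fetchImpl(`${SEARCH_URL}?tags=front_page&hitsPerPage=10`);
    if (!response.ok) {
      console.error(`Algolia request failed: HTTP ${response.status}`);
      return [];
    }
    const body = (await response.json()) as { hits: StoryHit[] };
    return body.hits.slice(0, 10).map(toStory);
  } catch (error) {
    console.error("Algolia request threw:", error);
    return [];
  }
}
```

In the application, `fetchTopStories(fetch)` would pass the native implementation, and `console` would be replaced by the Epic 1 logger. `fetchCommentsForStory` would follow the same pattern, using `toComments` on the response hits.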
|
||||
---
|
||||
|
||||
### Story 2.2: Integrate HN Data Fetching into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to integrate the HN data fetching logic into the main application workflow (`src/index.ts`), so that running the app retrieves the top 10 stories and their comments after completing the setup from Epic 1.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts` (or a main async function called by it).
|
||||
- Import the `algoliaHNClient` functions.
|
||||
- Import the configuration module to access `MAX_COMMENTS_PER_STORY`.
|
||||
- After the Epic 1 setup (config load, logger init, output dir creation), call `fetchTopStories()`.
|
||||
- Log the number of stories fetched.
|
||||
- Iterate through the array of fetched `Story` objects.
|
||||
- For each `Story`, call `fetchCommentsForStory()`, passing the `story.storyId` and the configured `MAX_COMMENTS_PER_STORY`.
|
||||
- Store the fetched comments within the corresponding `Story` object in memory (e.g., add a `comments: Comment[]` property to the `Story` object).
|
||||
- Log progress using the logger utility (e.g., "Fetched 10 stories.", "Fetching up to X comments for story {storyId}...").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes Epic 1 setup steps followed by fetching stories and then comments for each story.
|
||||
- AC2: Logs clearly show the start and successful completion of fetching stories, and the start of fetching comments for each of the 10 stories.
|
||||
- AC3: The configured `MAX_COMMENTS_PER_STORY` value is read from config and used in the calls to `fetchCommentsForStory`.
|
||||
- AC4: After successful execution, story objects held in memory contain a nested array of fetched comment objects. (Can be verified via debugger or temporary logging).
|
||||
|
||||
---
|
||||
|
||||
### Story 2.3: Persist Fetched HN Data Locally
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save the fetched HN stories (including their comments) to JSON files in the date-stamped output directory, so that the raw data is persisted locally for subsequent pipeline stages and debugging.
|
||||
- **Detailed Requirements:**
|
||||
- Define a consistent JSON structure for the output file content. Example: `{ storyId: "...", title: "...", url: "...", hnUrl: "...", points: ..., fetchedAt: "ISO_TIMESTAMP", comments: [{ commentId: "...", text: "...", author: "...", createdAt: "ISO_TIMESTAMP", ... }, ...] }`. Include a timestamp for when the data was fetched.
|
||||
- Import Node.js `fs` (specifically `fs.writeFileSync`) and `path` modules.
|
||||
- In the main workflow (`src/index.ts`), within the loop iterating through stories (after comments have been fetched and added to the story object in Story 2.2):
|
||||
- Get the full path to the date-stamped output directory (determined in Epic 1).
|
||||
- Construct the filename for the story's data: `{storyId}_data.json`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Serialize the complete story object (including comments and fetch timestamp) to a JSON string using `JSON.stringify(storyObject, null, 2)` for readability.
|
||||
- Write the JSON string to the file using `fs.writeFileSync()`. Use a `try...catch` block for error handling.
|
||||
- Log (using the logger) the successful persistence of each story's data file or any errors encountered during file writing.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory (e.g., `./output/YYYY-MM-DD/`) contains exactly 10 files named `{storyId}_data.json`.
|
||||
- AC2: Each JSON file contains valid JSON representing a single story object, including its metadata, fetch timestamp, and an array of its fetched comments, matching the defined structure.
|
||||
- AC3: The number of comments in each file's `comments` array does not exceed `MAX_COMMENTS_PER_STORY`.
|
||||
- AC4: Logs indicate that saving data to a file was attempted for each story, reporting success or specific file writing errors.
|
||||
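The persistence step in Story 2.3 might look like the sketch below. The `PersistedStory` interface mirrors the example JSON structure given in the requirements, while the helper names `storyDataPath` and `persistStory`, and the boolean return value, are assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Mirrors the example structure in Story 2.3; the interface name is hypothetical.
interface PersistedStory {
  storyId: string;
  title: string;
  url: string | null;
  hnUrl: string;
  points: number;
  fetchedAt: string; // ISO timestamp of when the data was fetched
  comments: { commentId: string; text: string; author: string; createdAt: string }[];
}

// Builds <outputDir>/<storyId>_data.json.
function storyDataPath(outputDir: string, storyId: string): string {
  return path.join(outputDir, `${storyId}_data.json`);
}

// Writes one story as pretty-printed JSON. Returns true on success and
// false on a file system error, logging either outcome.
function persistStory(outputDir: string, story: PersistedStory): boolean {
  const filePath = storyDataPath(outputDir, story.storyId);
  try {
    fs.writeFileSync(filePath, JSON.stringify(story, null, 2), "utf-8");
    console.info(`Saved story data to ${filePath}`);
    return true;
  } catch (error) {
    console.error(`Failed to write ${filePath}:`, error);
    return false;
  }
}
```

Returning a boolean rather than throwing keeps one failed write from aborting the loop over the remaining stories, which matches the "log and continue" tone of AC4.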
|
||||
---
|
||||
|
||||
### Story 2.4: Implement Stage Testing Utility for HN Fetching
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate, executable script that *only* performs the HN data fetching and persistence, so I can test and trigger this stage independently of the full pipeline.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new standalone script file: `src/stages/fetch_hn_data.ts`.
|
||||
- This script should perform the essential setup required for this stage: initialize logger, load configuration (`.env`), determine and create output directory (reuse or replicate logic from Epic 1 / `src/index.ts`).
|
||||
- The script should then execute the core logic of fetching stories via `algoliaHNClient.fetchTopStories`, fetching comments via `algoliaHNClient.fetchCommentsForStory` (using loaded config for limit), and persisting the results to JSON files using `fs.writeFileSync` (replicating logic from Story 2.3).
|
||||
- The script should log its progress using the logger utility.
|
||||
- Add a new script command to `package.json` under `"scripts"`: `"stage:fetch": "ts-node src/stages/fetch_hn_data.ts"`.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/fetch_hn_data.ts` exists.
|
||||
- AC2: The script `stage:fetch` is defined in `package.json`'s `scripts` section.
|
||||
- AC3: Running `npm run stage:fetch` executes successfully, performing only the setup, fetch, and persist steps.
|
||||
- AC4: Running `npm run stage:fetch` creates the same 10 `{storyId}_data.json` files in the correct date-stamped output directory as running the main `npm run dev` command (at the current state of development).
|
||||
- AC5: Logs generated by `npm run stage:fetch` reflect only the fetching and persisting steps, not subsequent pipeline stages.
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------- | -------------- |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 2 | 2-pm |
|
||||
|
||||
# Epic 3 File
|
||||
|
||||
# Epic 3: Article Scraping & Persistence
|
||||
|
||||
**Goal:** Implement a best-effort article scraping mechanism to fetch and extract plain text content from the external URLs associated with fetched HN stories. Handle failures gracefully and persist successfully scraped text locally. Implement a stage testing utility for scraping.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 3.1: Implement Basic Article Scraper Module
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module that attempts to fetch HTML from a URL and extract the main article text using basic methods, handling common failures gracefully, so article content can be prepared for summarization.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/scraper/articleScraper.ts`.
|
||||
- Add a suitable HTML parsing/extraction library dependency (e.g., `@extractus/article-extractor` recommended for simplicity, or `cheerio` for more control). Run `npm install @extractus/article-extractor --save-prod` (or chosen alternative).
|
||||
- Implement an async function `scrapeArticle(url: string): Promise<string | null>` within the module.
|
||||
- Inside the function:
|
||||
- Use native `fetch` to retrieve content from the `url`. Set a reasonable timeout (e.g., 10-15 seconds). Include a `User-Agent` header to mimic a browser.
|
||||
- Handle potential `fetch` errors (network errors, timeouts) using `try...catch`.
|
||||
- Check the `response.ok` status. If not okay, log error and return `null`.
|
||||
- Check the `Content-Type` header of the response. If it doesn't indicate HTML (e.g., does not include `text/html`), log warning and return `null`.
|
||||
- If HTML is received, attempt to extract the main article text using the chosen library (`article-extractor` preferred).
|
||||
- Wrap the extraction logic in a `try...catch` to handle library-specific errors.
|
||||
- Return the extracted plain text string if successful. Ensure it's just text, not HTML markup.
|
||||
- Return `null` if extraction fails or results in empty content.
|
||||
- Log all significant events, errors, or reasons for returning null (e.g., "Scraping URL...", "Fetch failed:", "Non-HTML content type:", "Extraction failed:", "Successfully extracted text") using the logger utility.
|
||||
- Define TypeScript types/interfaces as needed.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `articleScraper.ts` module exists and exports the `scrapeArticle` function.
|
||||
- AC2: The chosen scraping library (e.g., `@extractus/article-extractor`) is added to `dependencies` in `package.json`.
|
||||
- AC3: `scrapeArticle` uses native `fetch` with a timeout and User-Agent header.
|
||||
- AC4: `scrapeArticle` correctly handles fetch errors, non-OK responses, and non-HTML content types by logging and returning `null`.
|
||||
- AC5: `scrapeArticle` uses the chosen library to attempt text extraction from valid HTML content.
|
||||
- AC6: `scrapeArticle` returns the extracted plain text on success, and `null` on any failure (fetch, non-HTML, extraction error, empty result).
|
||||
- AC7: Relevant logs are produced for success, failure modes, and errors encountered during the process.
|
||||
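The control flow of `scrapeArticle` could be sketched as follows. Several things here are assumptions rather than part of the story: the fetch function and the text extractor are passed in as parameters (so the sketch does not depend on a specific extraction library's API), the `HtmlFetcher` and `TextExtractor` types are hypothetical, the User-Agent string is a placeholder, and a single `try...catch` covers both fetching and extraction to keep the example short.

```typescript
// Minimal response surface used here; the native fetch Response satisfies it.
interface HtmlResponse {
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
  text(): Promise<string>;
}

type HtmlFetcher = (
  url: string,
  init: { headers: Record<string, string>; signal: AbortSignal },
) => Promise<HtmlResponse>;

// Hypothetical adapter around the chosen extraction library; it is expected
// to return plain text (no markup) or null.
type TextExtractor = (html: string, url: string) => Promise<string | null>;

// True when the Content-Type header indicates HTML.
function isHtmlContentType(contentType: string | null): boolean {
  return contentType !== null && contentType.toLowerCase().includes("text/html");
}

async function scrapeArticle(
  url: string,
  fetchImpl: HtmlFetcher,
  extractText: TextExtractor,
  timeoutMs: number = 15000,
): Promise<string | null> {
  // Abort the request if it runs longer than timeoutMs.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    console.info(`Scraping URL ${url}`);
    const response = await fetchImpl(url, {
      headers: { "User-Agent": "Mozilla/5.0 (compatible; example-scraper)" },
      signal: controller.signal,
    });
    if (!response.ok) {
      console.error(`Fetch failed: HTTP ${response.status} for ${url}`);
      return null;
    }
    const contentType = response.headers.get("content-type");
    if (!isHtmlContentType(contentType)) {
      console.warn(`Non-HTML content type: ${contentType} for ${url}`);
      return null;
    }
    const text = await extractText(await response.text(), url);
    if (text === null || text.trim() === "") {
      console.warn(`Extraction failed or produced no text for ${url}`);
      return null;
    }
    return text.trim();
  } catch (error) {
    // Covers network errors, timeouts (abort), and extractor exceptions.
    console.error(`Fetch or extraction failed for ${url}:`, error);
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```

Every failure path returns `null`, which is what AC4 and AC6 ask for; callers only need to distinguish "text" from "no text".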
|
||||
---
|
||||
|
||||
### Story 3.2: Integrate Article Scraping into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to integrate the article scraper into the main workflow (`src/index.ts`), attempting to scrape the article for each HN story that has a valid URL, after fetching its data.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts`.
|
||||
- Import the `scrapeArticle` function from `src/scraper/articleScraper.ts`.
|
||||
- Within the main loop iterating through the fetched stories (after comments are fetched in Epic 2):
|
||||
- Check if `story.url` exists and appears to be a valid HTTP/HTTPS URL. A simple check for starting with `http://` or `https://` is sufficient.
|
||||
- If the URL is missing or invalid, log a warning ("Skipping scraping for story {storyId}: Missing or invalid URL") and proceed to the next story's processing step.
|
||||
- If a valid URL exists, log ("Attempting to scrape article for story {storyId} from {story.url}").
|
||||
- Call `await scrapeArticle(story.url)`.
|
||||
- Store the result (the extracted text string or `null`) in memory, associated with the story object (e.g., add property `articleContent: string | null`).
|
||||
- Log the outcome clearly (e.g., "Successfully scraped article for story {storyId}", "Failed to scrape article for story {storyId}").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes Epic 1 & 2 steps, and then attempts article scraping for stories with valid URLs.
|
||||
- AC2: Stories with missing or invalid URLs are skipped, and a corresponding log message is generated.
|
||||
- AC3: For stories with valid URLs, the `scrapeArticle` function is called.
|
||||
- AC4: Logs clearly indicate the start and success/failure outcome of the scraping attempt for each relevant story.
|
||||
- AC5: Story objects held in memory after this stage contain an `articleContent` property holding the scraped text (string) or `null` if scraping was skipped or failed.
|
||||
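The URL check and the two log messages from Story 3.2 can be illustrated with a small sketch. The names `isScrapableUrl`, `StoryRef`, and `describeScrapeDecision` are hypothetical; the prefix check and the message wording follow the requirements above.

```typescript
interface StoryRef {
  storyId: string;
  url: string | null;
}

// The "simple check" from Story 3.2: the URL must start with http:// or https://.
function isScrapableUrl(url: string | null | undefined): url is string {
  return typeof url === "string" && (url.startsWith("http://") || url.startsWith("https://"));
}

// Returns the log line the main loop would emit before deciding to scrape.
function describeScrapeDecision(story: StoryRef): string {
  return isScrapableUrl(story.url)
    ? `Attempting to scrape article for story ${story.storyId} from ${story.url}`
    : `Skipping scraping for story ${story.storyId}: Missing or invalid URL`;
}
```

Because `isScrapableUrl` is a type guard, code after a successful check can pass `story.url` to `scrapeArticle` as a plain `string` without further narrowing.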
|
||||
---
|
||||
|
||||
### Story 3.3: Persist Scraped Article Text Locally
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save successfully scraped article text to a separate local file for each story, so that the text content is available as input for the summarization stage.
|
||||
- **Detailed Requirements:**
|
||||
- Import Node.js `fs` and `path` modules if not already present in `src/index.ts`.
|
||||
- In the main workflow (`src/index.ts`), immediately after a successful call to `scrapeArticle` for a story (where the result is a non-null string):
|
||||
- Retrieve the full path to the current date-stamped output directory.
|
||||
- Construct the filename: `{storyId}_article.txt`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Get the successfully scraped article text string (`articleContent`).
|
||||
- Use `fs.writeFileSync(fullPath, articleContent, 'utf-8')` to save the text to the file. Wrap in `try...catch` for file system errors.
|
||||
- Log the successful saving of the file (e.g., "Saved scraped article text to {filename}") or any file writing errors encountered.
|
||||
- Ensure *no* `_article.txt` file is created if `scrapeArticle` returned `null` (due to skipping or failure).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory contains `_article.txt` files *only* for those stories where `scrapeArticle` succeeded and returned text content.
|
||||
- AC2: The name of each article text file is `{storyId}_article.txt`.
|
||||
- AC3: The content of each `_article.txt` file is the plain text string returned by `scrapeArticle`.
|
||||
- AC4: Logs confirm the successful writing of each `_article.txt` file or report specific file writing errors.
|
||||
- AC5: No empty `_article.txt` files are created. Files only exist if scraping was successful.
|
||||
|
||||
---
|
||||
|
||||
### Story 3.4: Implement Stage Testing Utility for Scraping
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate script/command to test the article scraping logic using HN story data from local files, allowing independent testing and debugging of the scraper.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new standalone script file: `src/stages/scrape_articles.ts`.
|
||||
- Import necessary modules: `fs`, `path`, `logger`, `config`, `scrapeArticle`.
|
||||
- The script should:
|
||||
- Initialize the logger.
|
||||
- Load configuration (to get `OUTPUT_DIR_PATH`).
|
||||
- Determine the target date-stamped directory path (e.g., `${OUTPUT_DIR_PATH}/YYYY-MM-DD`, using the current date or potentially an optional CLI argument). Ensure this directory exists.
|
||||
- Read the directory contents and identify all `{storyId}_data.json` files.
|
||||
- For each `_data.json` file found:
|
||||
- Read and parse the JSON content.
|
||||
- Extract the `storyId` and `url`.
|
||||
- If a valid `url` exists, call `await scrapeArticle(url)`.
|
||||
- If scraping succeeds (returns text), save the text to `{storyId}_article.txt` in the same directory (using logic from Story 3.3). Overwrite if the file exists.
|
||||
- Log the progress and outcome (skip/success/fail) for each story processed.
|
||||
- Add a new script command to `package.json`: `"stage:scrape": "ts-node src/stages/scrape_articles.ts"`. Consider adding argument parsing later if needed to specify a date/directory.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/scrape_articles.ts` exists.
|
||||
- AC2: The script `stage:scrape` is defined in `package.json`.
|
||||
- AC3: Running `npm run stage:scrape` (assuming a directory with `_data.json` files exists from a previous `stage:fetch` run) reads these files.
|
||||
- AC4: The script calls `scrapeArticle` for stories with valid URLs found in the JSON files.
|
||||
- AC5: The script creates/updates `{storyId}_article.txt` files in the target directory corresponding to successfully scraped articles.
|
||||
- AC6: The script logs its actions (reading files, attempting scraping, saving results) for each story ID processed.
|
||||
- AC7: The script operates solely based on local `_data.json` files and fetching from external article URLs; it does not call the Algolia HN API.
|
||||
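The file discovery step in Story 3.4 (identifying `{storyId}_data.json` files in the target directory) could be handled as in this sketch. The helper names and the regular expression are assumptions; the story only fixes the file naming pattern.

```typescript
// Matches "<storyId>_data.json" and captures the storyId part.
const DATA_FILE_PATTERN = /^(.+)_data\.json$/;

// Returns the story ID encoded in a data file name, or null for other files.
function storyIdFromDataFile(fileName: string): string | null {
  const match = DATA_FILE_PATTERN.exec(fileName);
  return match?.[1] ?? null;
}

// Filters a directory listing down to the story IDs that have a data file.
function listStoryIds(fileNames: string[]): string[] {
  return fileNames
    .map(storyIdFromDataFile)
    .filter((id): id is string => id !== null);
}
```

The stage script would feed this with `fs.readdirSync(targetDir)`, then read each `{storyId}_data.json`, so files such as `{storyId}_article.txt` from earlier runs are ignored rather than parsed as JSON.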
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------- | -------------- |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 3 | 2-pm |
|
||||
|
||||
# Epic 4 File
|
||||
|
||||
# Epic 4: LLM Summarization & Persistence
|
||||
|
||||
**Goal:** Integrate with the configured local Ollama instance to generate summaries for successfully scraped article text and fetched comments. Persist these summaries locally. Implement a stage testing utility for summarization.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 4.1: Implement Ollama Client Module
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a client module to interact with the configured Ollama API endpoint via HTTP, handling requests and responses for text generation, so that summaries can be generated programmatically.
|
||||
- **Detailed Requirements:**
|
||||
- **Prerequisite:** Ensure a local Ollama instance is installed and running, accessible via the URL defined in `.env` (`OLLAMA_ENDPOINT_URL`), and that the model specified in `.env` (`OLLAMA_MODEL`) has been downloaded (e.g., via `ollama pull model_name`). Instructions for this setup should be in the project README.
|
||||
- Create a new module: `src/clients/ollamaClient.ts`.
|
||||
- Implement an async function `generateSummary(promptTemplate: string, content: string): Promise<string | null>`. *(Note: Parameter name changed for clarity)*
|
||||
- Add configuration variables `OLLAMA_ENDPOINT_URL` (e.g., `http://localhost:11434`) and `OLLAMA_MODEL` (e.g., `llama3`) to `.env.example`. Ensure they are loaded via the config module (`src/utils/config.ts`). Update local `.env` with actual values. Add optional `OLLAMA_TIMEOUT_MS` to `.env.example` with a default like `120000`.
|
||||
- Inside `generateSummary`:
|
||||
- Construct the full prompt string using the `promptTemplate` and the provided `content` (e.g., replacing a placeholder like `{Content Placeholder}` in the template, or simple concatenation if templates are basic).
|
||||
- Construct the Ollama API request payload (JSON): `{ model: configured_model, prompt: full_prompt, stream: false }`. Refer to Ollama `/api/generate` documentation and `docs/data-models.md`.
|
||||
- Use native `fetch` to send a POST request to the configured Ollama endpoint + `/api/generate`. Set appropriate headers (`Content-Type: application/json`). Use the configured `OLLAMA_TIMEOUT_MS` or a reasonable default (e.g., 2 minutes).
|
||||
- Handle `fetch` errors (network, timeout) using `try...catch`.
|
||||
- Check `response.ok`. If not OK, log the status/error and return `null`.
|
||||
- Parse the JSON response from Ollama. Extract the generated text (typically in the `response` field). Refer to `docs/data-models.md`.
|
||||
- Check for potential errors within the Ollama response structure itself (e.g., an `error` field).
|
||||
- Return the extracted summary string on success. Return `null` on any failure.
|
||||
- Log key events: initiating request (mention model), receiving response, success, failure reasons, potentially request/response time using the logger.
|
||||
- Define necessary TypeScript types for the Ollama request payload and expected response structure in `src/types/ollama.ts` (referenced in `docs/data-models.md`).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `ollamaClient.ts` module exists and exports `generateSummary`.
|
||||
- AC2: `OLLAMA_ENDPOINT_URL` and `OLLAMA_MODEL` are defined in `.env.example`, loaded via config, and used by the client. Optional `OLLAMA_TIMEOUT_MS` is handled.
|
||||
- AC3: `generateSummary` sends a correctly formatted POST request (model, full prompt based on template and content, stream:false) to the configured Ollama endpoint/path using native `fetch`.
|
||||
- AC4: Network errors, timeouts, and non-OK API responses are handled gracefully, logged, and result in a `null` return (given the Prerequisite Ollama service is running).
|
||||
- AC5: A successful Ollama response is parsed correctly, the generated text is extracted, and returned as a string.
|
||||
- AC6: Unexpected Ollama response formats or internal errors (e.g., `{"error": "..."}`) are handled, logged, and result in a `null` return.
|
||||
- AC7: Logs provide visibility into the client's interaction with the Ollama API.
|
||||
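The request and response handling in Story 4.1 might be sketched like this. The payload shape, the `response` and `error` fields, and the `{Content Placeholder}` token come from the requirements above. The extra `config` and `fetchImpl` parameters are assumptions added so the sketch stays self-contained; the story's own signature is `generateSummary(promptTemplate, content)`, with configuration read from the config module.

```typescript
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  stream: false;
}

// Hypothetical config shape, populated from OLLAMA_ENDPOINT_URL,
// OLLAMA_MODEL, and OLLAMA_TIMEOUT_MS.
interface OllamaConfig {
  endpointUrl: string;
  model: string;
  timeoutMs: number;
}

// Minimal surface of fetch used below; the native `fetch` satisfies it.
type PostJson = (
  url: string,
  init: { method: "POST"; headers: Record<string, string>; body: string; signal: AbortSignal },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const CONTENT_PLACEHOLDER = "{Content Placeholder}";

// Fills the placeholder if the template has one, else appends the content.
function buildPrompt(promptTemplate: string, content: string): string {
  return promptTemplate.includes(CONTENT_PLACEHOLDER)
    ? promptTemplate.split(CONTENT_PLACEHOLDER).join(content)
    : `${promptTemplate}\n\n${content}`;
}

// Pulls the generated text out of a parsed response body; null when the
// body reports an error or has no usable `response` string.
function extractSummary(body: unknown): string | null {
  if (typeof body !== "object" || body === null) return null;
  const record = body as Record<string, unknown>;
  if (typeof record["error"] === "string") return null;
  const text = record["response"];
  return typeof text === "string" && text.trim() !== "" ? text.trim() : null;
}

async function generateSummary(
  promptTemplate: string,
  content: string,
  config: OllamaConfig,
  fetchImpl: PostJson,
): Promise<string | null> {
  const payload: OllamaGenerateRequest = {
    model: config.model,
    prompt: buildPrompt(promptTemplate, content),
    stream: false,
  };
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), config.timeoutMs);
  try {
    console.info(`Requesting summary from Ollama model ${config.model}`);
    const response = await fetchImpl(`${config.endpointUrl}/api/generate`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
      signal: controller.signal,
    });
    if (!response.ok) {
      console.error(`Ollama request failed: HTTP ${response.status}`);
      return null;
    }
    return extractSummary(await response.json());
  } catch (error) {
    console.error("Ollama request threw (network error or timeout):", error);
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```

Using `split(...).join(...)` instead of `String.prototype.replace` avoids special handling of `$` sequences that may appear in scraped article text.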
|
||||
---
|
||||
|
||||
### Story 4.2: Define Summarization Prompts
|
||||
|
||||
* **User Story / Goal:** As a developer, I want standardized base prompts for generating article summaries and HN discussion summaries documented centrally, ensuring consistent instructions are sent to the LLM.
|
||||
* **Detailed Requirements:**
|
||||
* Define two standardized base prompts (`ARTICLE_SUMMARY_PROMPT`, `DISCUSSION_SUMMARY_PROMPT`) **and document them in `docs/prompts.md`**.
|
||||
* Ensure these prompts are accessible within the application code, for example, by defining them as exported constants in a dedicated module like `src/utils/prompts.ts`, which reads from or mirrors the content in `docs/prompts.md`.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: The `ARTICLE_SUMMARY_PROMPT` text is defined in `docs/prompts.md` with appropriate instructional content.
|
||||
* AC2: The `DISCUSSION_SUMMARY_PROMPT` text is defined in `docs/prompts.md` with appropriate instructional content.
|
||||
* AC3: The prompt texts documented in `docs/prompts.md` are available as constants or variables within the application code (e.g., via `src/utils/prompts.ts`) for use by the Ollama client integration.
|
||||
|
||||
---
|
||||
|
||||
### Story 4.3: Integrate Summarization into Main Workflow
|
||||
|
||||
* **User Story / Goal:** As a developer, I want to integrate the Ollama client into the main workflow to generate summaries for each story's scraped article text (if available) and fetched comments, using centrally defined prompts and handling potential comment length limits.
|
||||
* **Detailed Requirements:**
|
||||
* Modify the main execution flow in `src/index.ts` or `src/core/pipeline.ts`.
|
||||
* Import `ollamaClient.generateSummary` and the prompt constants/variables (e.g., from `src/utils/prompts.ts`, which reflect `docs/prompts.md`).
|
||||
* Load the optional `MAX_COMMENT_CHARS_FOR_SUMMARY` configuration value from `.env` via the config utility.
|
||||
* Within the main loop iterating through stories (after article scraping/persistence in Epic 3):
|
||||
* **Article Summary Generation:**
|
||||
* Check if the `story` object has non-null `articleContent`.
|
||||
* If yes: log "Attempting article summarization for story {storyId}", call `await generateSummary(ARTICLE_SUMMARY_PROMPT, story.articleContent)`, store the result (string or null) as `story.articleSummary`, log success/failure.
|
||||
* If no: set `story.articleSummary = null`, log "Skipping article summarization: No content".
|
||||
* **Discussion Summary Generation:**
|
||||
* Check if the `story` object has a non-empty `comments` array.
|
||||
* If yes:
|
||||
* Format the `story.comments` array into a single text block suitable for the LLM prompt (e.g., concatenating `comment.text` with separators like `---`).
|
||||
* **Check truncation limit:** If `MAX_COMMENT_CHARS_FOR_SUMMARY` is configured to a positive number and the `formattedCommentsText` length exceeds it, truncate `formattedCommentsText` to the limit and log a warning: "Comment text truncated to {limit} characters for summarization for story {storyId}".
|
||||
* Log "Attempting discussion summarization for story {storyId}".
|
||||
* Call `await generateSummary(DISCUSSION_SUMMARY_PROMPT, formattedCommentsText)`. *(Pass the potentially truncated text)*
|
||||
* Store the result (string or null) as `story.discussionSummary`. Log success/failure.
|
||||
* If no: set `story.discussionSummary = null`, log "Skipping discussion summarization: No comments".
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: Running `npm run dev` executes steps from Epics 1-3, then attempts summarization using the Ollama client.
|
||||
* AC2: Article summary is attempted only if `articleContent` exists for a story.
|
||||
* AC3: Discussion summary is attempted only if `comments` exist for a story.
|
||||
* AC4: `generateSummary` is called with the correct prompts (sourced consistently with `docs/prompts.md`) and corresponding content (article text or formatted/potentially truncated comments).
|
||||
* AC5: If `MAX_COMMENT_CHARS_FOR_SUMMARY` is set and comment text exceeds it, the text passed to `generateSummary` is truncated, and a warning is logged.
|
||||
* AC6: Logs clearly indicate the start, success, or failure (including null returns from the client) for both article and discussion summarization attempts per story.
|
||||
* AC7: Story objects in memory now contain `articleSummary` (string/null) and `discussionSummary` (string/null) properties.
|
||||
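The comment formatting and truncation rules in Story 4.3 can be sketched as below. The function names and the exact separator layout (`---` on its own line) are assumptions; the story only suggests concatenating `comment.text` with separators like `---` and truncating when `MAX_COMMENT_CHARS_FOR_SUMMARY` is a positive number.

```typescript
interface CommentText {
  text: string;
}

// Joins comment texts into one block, separated by a line containing "---".
function formatComments(comments: CommentText[]): string {
  return comments.map((comment) => comment.text).join("\n---\n");
}

// Truncates only when maxChars is a positive number and the text exceeds it.
// Length is measured in UTF-16 code units, as with String.prototype.length.
function truncateForSummary(
  text: string,
  maxChars: number | undefined,
): { text: string; truncated: boolean } {
  if (maxChars === undefined || !(maxChars > 0) || text.length <= maxChars) {
    return { text, truncated: false };
  }
  return { text: text.slice(0, maxChars), truncated: true };
}
```

The main loop would log the "Comment text truncated to {limit} characters..." warning when `truncated` is `true`, then pass the returned `text` to `generateSummary` with `DISCUSSION_SUMMARY_PROMPT`.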
|
||||
---
|
||||
|
||||
### Story 4.4: Persist Generated Summaries Locally
|
||||
|
||||
*(No changes needed for this story based on recent decisions)*
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save the generated article and discussion summaries (or null placeholders) to a local JSON file for each story, making them available for the email assembly stage.
|
||||
- **Detailed Requirements:**
|
||||
- Define the structure for the summary output file: `{storyId}_summary.json`. Content example: `{ "storyId": "...", "articleSummary": "...", "discussionSummary": "...", "summarizedAt": "ISO_TIMESTAMP" }`. Note that `articleSummary` and `discussionSummary` can be `null`.
|
||||
- Import `fs` and `path` in `src/index.ts` or `src/core/pipeline.ts` if needed.
|
||||
- In the main workflow loop, after *both* summarization attempts (article and discussion) for a story are complete:
|
||||
- Create a summary result object containing `storyId`, `articleSummary` (string or null), `discussionSummary` (string or null), and the current ISO timestamp (`new Date().toISOString()`). Add this timestamp to the in-memory `story` object as well (`story.summarizedAt`).
|
||||
- Get the full path to the date-stamped output directory.
|
||||
- Construct the filename: `{storyId}_summary.json`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Serialize the summary result object to JSON (`JSON.stringify(..., null, 2)`).
|
||||
- Use `fs.writeFileSync` to save the JSON to the file, wrapping in `try...catch`.
|
||||
- Log the successful saving of the summary file or any file writing errors.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory contains 10 files named `{storyId}_summary.json`.
|
||||
- AC2: Each `_summary.json` file contains valid JSON adhering to the defined structure.
|
||||
- AC3: The `articleSummary` field contains the generated summary string if successful, otherwise `null`.
|
||||
- AC4: The `discussionSummary` field contains the generated summary string if successful, otherwise `null`.
|
||||
- AC5: A valid ISO timestamp is present in the `summarizedAt` field.
|
||||
- AC6: Logs confirm successful writing of each summary file or report file system errors.
|
||||
|
||||
---
|
||||
|
||||
### Story 4.5: Implement Stage Testing Utility for Summarization
|
||||
|
||||
*(Changes needed to reflect prompt sourcing and optional truncation)*
|
||||
|
||||
* **User Story / Goal:** As a developer, I want a separate script/command to test the LLM summarization logic using locally persisted data (HN comments, scraped article text), allowing independent testing of prompts and Ollama interaction.
|
||||
* **Detailed Requirements:**
|
||||
* Create a new standalone script file: `src/stages/summarize_content.ts`.
|
||||
* Import necessary modules: `fs`, `path`, `logger`, `config`, `ollamaClient`, prompt constants (e.g., from `src/utils/prompts.ts`).
|
||||
* The script should:
|
||||
* Initialize logger, load configuration (Ollama endpoint/model, output dir, **optional `MAX_COMMENT_CHARS_FOR_SUMMARY`**).
|
||||
* Determine target date-stamped directory path.
|
||||
* Find all `{storyId}_data.json` files in the directory.
|
||||
* For each `storyId` found:
|
||||
* Read `{storyId}_data.json` to get comments. Format them into a single text block.
|
||||
* *Attempt* to read `{storyId}_article.txt`. Handle file-not-found gracefully. Store content or null.
|
||||
* Call `ollamaClient.generateSummary` for article text (if not null) using `ARTICLE_SUMMARY_PROMPT`.
|
||||
* **Apply truncation logic:** If comments exist, check `MAX_COMMENT_CHARS_FOR_SUMMARY` and truncate the formatted comment text block if needed, logging a warning.
|
||||
* Call `ollamaClient.generateSummary` for formatted comments (if comments exist) using `DISCUSSION_SUMMARY_PROMPT` *(passing potentially truncated text)*.
|
||||
* Construct the summary result object (with summaries or nulls, and timestamp).
|
||||
* Save the result object to `{storyId}_summary.json` in the same directory (using logic from Story 4.4), overwriting if exists.
|
||||
* Log progress (reading files, calling Ollama, truncation warnings, saving results) for each story ID.
|
||||
* Add script to `package.json`: `"stage:summarize": "ts-node src/stages/summarize_content.ts"`.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: The file `src/stages/summarize_content.ts` exists.
|
||||
* AC2: The script `stage:summarize` is defined in `package.json`.
|
||||
* AC3: Running `npm run stage:summarize` (after `stage:fetch` and `stage:scrape` runs) reads `_data.json` and attempts to read `_article.txt` files from the target directory.
|
||||
* AC4: The script calls the `ollamaClient` with correct prompts (sourced consistently with `docs/prompts.md`) and content derived *only* from the local files (requires Ollama service running per Story 4.1 prerequisite).
|
||||
* AC5: If `MAX_COMMENT_CHARS_FOR_SUMMARY` is set and applicable, comment text is truncated before calling the client, and a warning is logged.
|
||||
* AC6: The script creates/updates `{storyId}_summary.json` files in the target directory reflecting the results of the Ollama calls (summaries or nulls).
|
||||
* AC7: Logs show the script processing each story ID found locally, interacting with Ollama, and saving results.
|
||||
* AC8: The script does not call Algolia API or the article scraper module.
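The comment formatting and truncation steps described in this story could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `formatComments`, `truncateForSummary`, the `CommentLike` shape, and the `warn` callback are hypothetical names, and the limit is assumed to have already been parsed from `MAX_COMMENT_CHARS_FOR_SUMMARY` into a number (or `undefined` when unset).

```typescript
// Hypothetical comment shape; only the fields needed for formatting.
interface CommentLike {
  author: string | null;
  commentText: string | null;
}

// Join comments into a single text block, skipping comments that have no text.
function formatComments(comments: CommentLike[]): string {
  return comments
    .filter((c) => c.commentText !== null && c.commentText.trim() !== "")
    .map((c) => `${c.author ?? "unknown"}: ${c.commentText}`)
    .join("\n\n");
}

// Truncate to maxChars when a positive limit is configured; otherwise
// return the text unchanged. A warning is emitted only when text is cut.
function truncateForSummary(
  text: string,
  maxChars: number | undefined,
  warn: (msg: string) => void = console.warn
): string {
  if (maxChars === undefined || !Number.isFinite(maxChars) || maxChars <= 0) {
    return text;
  }
  if (text.length <= maxChars) {
    return text;
  }
  warn(
    `Comment text truncated from ${text.length} to ${maxChars} characters before summarization.`
  );
  return text.slice(0, maxChars);
}
```

Keeping truncation as a pure function makes AC5 easy to check in isolation, without a running Ollama instance.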
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change                      | Date         | Version | Description                          | Author         |
| --------------------------- | ------------ | ------- | ------------------------------------ | -------------- |
| Integrate prompts.md refs   | 2025-05-04   | 0.3     | Updated stories 4.2, 4.3, 4.5        | 3-Architect    |
| Added Ollama Prereq Note    | 2025-05-04   | 0.2     | Added note about local Ollama setup  | 2-pm           |
| Initial Draft               | 2025-05-04   | 0.1     | First draft of Epic 4                | 2-pm           |
|
||||
|
||||
# Epic 5 File
|
||||
|
||||
# Epic 5: Digest Assembly & Email Dispatch
|
||||
|
||||
**Goal:** Assemble the collected story data and summaries from local files, format them into a readable HTML email digest, and send the email using Nodemailer with configured credentials. Implement a stage testing utility for emailing with a dry-run option.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 5.1: Implement Email Content Assembler
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module that reads the persisted story metadata (`_data.json`) and summaries (`_summary.json`) from a specified directory, consolidating the necessary information needed to render the email digest.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/email/contentAssembler.ts`.
|
||||
- Define a TypeScript type/interface `DigestData` representing the data needed per story for the email template: `{ storyId: string, title: string, hnUrl: string, articleUrl: string | null, articleSummary: string | null, discussionSummary: string | null }`.
|
||||
- Implement an async function `assembleDigestData(dateDirPath: string): Promise<DigestData[]>`.
|
||||
- The function should:
|
||||
- Use Node.js `fs` to read the contents of the `dateDirPath`.
|
||||
- Identify all files matching the pattern `{storyId}_data.json`.
|
||||
- For each `storyId` found:
|
||||
- Read and parse the `{storyId}_data.json` file. Extract `title`, `hnUrl`, and `url` (use as `articleUrl`). Handle potential file read/parse errors gracefully (log and skip story).
|
||||
- Attempt to read and parse the corresponding `{storyId}_summary.json` file. Handle file-not-found or parse errors gracefully (treat `articleSummary` and `discussionSummary` as `null`).
|
||||
- Construct a `DigestData` object for the story, including the extracted metadata and summaries (or nulls).
|
||||
- Collect all successfully constructed `DigestData` objects into an array.
|
||||
- Return the array. It should ideally contain 10 items if all previous stages succeeded.
|
||||
- Log progress (e.g., "Assembling digest data from directory...", "Processing story {storyId}...") and any errors encountered during file processing using the logger.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `contentAssembler.ts` module exists and exports `assembleDigestData` and the `DigestData` type.
|
||||
- AC2: `assembleDigestData` correctly reads `_data.json` files from the provided directory path.
|
||||
- AC3: It attempts to read corresponding `_summary.json` files, correctly handling cases where the summary file might be missing or unparseable (resulting in null summaries for that story).
|
||||
- AC4: The function returns a promise resolving to an array of `DigestData` objects, populated with data extracted from the files.
|
||||
- AC5: Errors during file reading or JSON parsing are logged, and the function returns data for successfully processed stories.
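The parsing and null-handling rules of Story 5.1 can be sketched as pure functions that work on a directory listing and raw file contents. This is only a sketch under assumptions: `findStoryIds`, `buildDigestData`, and `asStringOrNull` are hypothetical helpers, the actual file reads with `fs` and the logging are left to the caller, and the article URL is accepted under either `articleUrl` or `url` because the persisted field name is not pinned down here.

```typescript
interface DigestData {
  storyId: string;
  title: string;
  hnUrl: string;
  articleUrl: string | null;
  articleSummary: string | null;
  discussionSummary: string | null;
}

// Derive story IDs from file names matching `{storyId}_data.json`.
function findStoryIds(fileNames: string[]): string[] {
  const suffix = "_data.json";
  return fileNames
    .filter((name) => name.endsWith(suffix) && name.length > suffix.length)
    .map((name) => name.slice(0, -suffix.length));
}

function asStringOrNull(value: unknown): string | null {
  return typeof value === "string" && value !== "" ? value : null;
}

// Parse JSON into a plain object, or return null if it is not one.
function parseObject(json: string): Record<string, unknown> | null {
  try {
    const parsed: unknown = JSON.parse(json);
    if (typeof parsed !== "object" || parsed === null) return null;
    return parsed as Record<string, unknown>;
  } catch {
    return null;
  }
}

// Build one DigestData entry. Returns null when the data file is unusable
// (the caller would log and skip the story). A missing or unparseable
// summary file simply yields null summaries.
function buildDigestData(
  storyId: string,
  dataJson: string,
  summaryJson: string | null
): DigestData | null {
  const data = parseObject(dataJson);
  if (data === null) return null;
  const title = data["title"];
  const hnUrl = data["hnUrl"];
  if (typeof title !== "string" || typeof hnUrl !== "string") return null;

  const summary = summaryJson === null ? null : parseObject(summaryJson);
  return {
    storyId,
    title,
    hnUrl,
    // Assumption: the article URL may be stored as `articleUrl` or `url`.
    articleUrl: asStringOrNull(data["articleUrl"] ?? data["url"]),
    articleSummary: summary ? asStringOrNull(summary["articleSummary"]) : null,
    discussionSummary: summary ? asStringOrNull(summary["discussionSummary"]) : null,
  };
}
```

An `assembleDigestData` implementation would list the directory, call `findStoryIds`, read each file pair, and collect the non-null results into the returned array.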
|
||||
|
||||
---
|
||||
|
||||
### Story 5.2: Create HTML Email Template & Renderer
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic HTML email template and a function to render it with the assembled digest data, producing the final HTML content for the email body.
|
||||
- **Detailed Requirements:**
|
||||
- Define the HTML structure. This can be done using template literals within a function or potentially using a simple template file (e.g., `src/email/templates/digestTemplate.html`) and `fs.readFileSync`. Template literals are simpler for MVP.
|
||||
- Create a function `renderDigestHtml(data: DigestData[], digestDate: string): string` (e.g., in `src/email/contentAssembler.ts` or a new `templater.ts`).
|
||||
- The function should generate an HTML string with:
|
||||
- A suitable title in the body (e.g., `<h1>Hacker News Top 10 Summaries for ${digestDate}</h1>`).
|
||||
- A loop through the `data` array.
|
||||
- For each `story` in `data`:
|
||||
- Display `<h2><a href="${story.articleUrl || story.hnUrl}">${story.title}</a></h2>`.
|
||||
- Display `<p><a href="${story.hnUrl}">View HN Discussion</a></p>`.
|
||||
- Conditionally display `<h3>Article Summary</h3><p>${story.articleSummary}</p>` *only if* `story.articleSummary` is not null/empty.
|
||||
- Conditionally display `<h3>Discussion Summary</h3><p>${story.discussionSummary}</p>` *only if* `story.discussionSummary` is not null/empty.
|
||||
- Include a separator (e.g., `<hr style="margin-top: 20px; margin-bottom: 20px;">`).
|
||||
- Use basic inline CSS for minimal styling (margins, etc.) to ensure readability. Avoid complex layouts.
|
||||
- Return the complete HTML document as a string.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: A function `renderDigestHtml` exists that accepts the digest data array and a date string.
|
||||
- AC2: The function returns a single, complete HTML string.
|
||||
- AC3: The generated HTML includes a title with the date and correctly iterates through the story data.
|
||||
- AC4: For each story, the HTML displays the linked title, HN link, and conditionally displays the article and discussion summaries with headings.
|
||||
- AC5: Basic separators and margins are used for readability. The HTML is simple and likely to render reasonably in most email clients.
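A template-literal renderer along the lines of Story 5.2 might look like the sketch below. One thing the requirements do not mention is HTML escaping: titles and LLM-generated summaries can contain `<`, `&`, or quotes, so this sketch escapes every interpolated value. The `escapeHtml` helper and the outer `<html>`/`<body>` wrapper are assumptions added for illustration, not part of the story.

```typescript
interface DigestData {
  storyId: string;
  title: string;
  hnUrl: string;
  articleUrl: string | null;
  articleSummary: string | null;
  discussionSummary: string | null;
}

// Escape characters that are significant in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

function renderDigestHtml(data: DigestData[], digestDate: string): string {
  const stories = data.map((story) => {
    // Fall back to the HN discussion link when there is no article URL.
    const link = escapeHtml(story.articleUrl || story.hnUrl);
    const parts = [
      `<h2><a href="${link}">${escapeHtml(story.title)}</a></h2>`,
      `<p><a href="${escapeHtml(story.hnUrl)}">View HN Discussion</a></p>`,
    ];
    // Summaries are shown only when they are non-null and non-empty.
    if (story.articleSummary) {
      parts.push(`<h3>Article Summary</h3><p>${escapeHtml(story.articleSummary)}</p>`);
    }
    if (story.discussionSummary) {
      parts.push(`<h3>Discussion Summary</h3><p>${escapeHtml(story.discussionSummary)}</p>`);
    }
    parts.push(`<hr style="margin-top: 20px; margin-bottom: 20px;">`);
    return parts.join("\n");
  });

  return [
    "<!DOCTYPE html>",
    "<html>",
    `<body style="font-family: sans-serif; margin: 20px;">`,
    `<h1>Hacker News Top 10 Summaries for ${escapeHtml(digestDate)}</h1>`,
    ...stories,
    "</body>",
    "</html>",
  ].join("\n");
}
```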
|
||||
|
||||
---
|
||||
|
||||
### Story 5.3: Implement Nodemailer Email Sender
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module to send the generated HTML email using Nodemailer, configured with credentials stored securely in the environment file.
|
||||
- **Detailed Requirements:**
|
||||
- Add Nodemailer dependencies: `npm install nodemailer @types/nodemailer --save-prod`.
|
||||
- Add required configuration variables to `.env.example` (and local `.env`): `EMAIL_HOST`, `EMAIL_PORT` (e.g., 587), `EMAIL_SECURE` (e.g., `false` for STARTTLS on 587, `true` for 465), `EMAIL_USER`, `EMAIL_PASS`, `EMAIL_FROM` (e.g., `"Your Name <you@example.com>"`), `EMAIL_RECIPIENTS` (comma-separated list).
|
||||
- Create a new module: `src/email/emailSender.ts`.
|
||||
- Implement an async function `sendDigestEmail(subject: string, htmlContent: string): Promise<boolean>`.
|
||||
- Inside the function:
|
||||
- Load the `EMAIL_*` variables from the config module.
|
||||
- Create a Nodemailer transporter using `nodemailer.createTransport` with the loaded config (host, port, secure flag, auth: { user, pass }).
|
||||
- Verify transporter configuration using `transporter.verify()` (optional but recommended). Log verification success/failure.
|
||||
- Parse the `EMAIL_RECIPIENTS` string into an array or comma-separated string suitable for the `to` field.
|
||||
- Define the `mailOptions`: `{ from: EMAIL_FROM, to: parsedRecipients, subject: subject, html: htmlContent }`.
|
||||
- Call `await transporter.sendMail(mailOptions)`.
|
||||
- If `sendMail` succeeds, log the success message including the `messageId` from the result. Return `true`.
|
||||
- If `sendMail` fails (throws error), log the error using the logger. Return `false`.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: `nodemailer` and `@types/nodemailer` dependencies are added.
|
||||
- AC2: `EMAIL_*` variables are defined in `.env.example` and loaded from config.
|
||||
- AC3: `emailSender.ts` module exists and exports `sendDigestEmail`.
|
||||
- AC4: `sendDigestEmail` correctly creates a Nodemailer transporter using configuration from `.env`. Transporter verification is attempted (optional AC).
|
||||
- AC5: The `to` field is correctly populated based on `EMAIL_RECIPIENTS`.
|
||||
- AC6: `transporter.sendMail` is called with correct `from`, `to`, `subject`, and `html` options.
|
||||
- AC7: Email sending success (including message ID) or failure is logged clearly.
|
||||
- AC8: The function returns `true` on successful sending, `false` otherwise.
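The configuration handling in Story 5.3 can be illustrated without touching the network. The sketch below only builds the options object (host, port, secure flag, auth) that the story passes to `nodemailer.createTransport`, and parses `EMAIL_RECIPIENTS`. The helper names `buildTransportOptions` and `parseRecipients` are hypothetical, and the fail-fast validation of missing variables is an assumption rather than a stated requirement.

```typescript
interface TransportOptions {
  host: string;
  port: number;
  secure: boolean;
  auth: { user: string; pass: string };
}

// Split a comma-separated recipient list, dropping blanks and whitespace.
function parseRecipients(raw: string): string[] {
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Build SMTP transport options from EMAIL_* variables, failing fast
// when a required variable is missing or the port is not a valid number.
function buildTransportOptions(
  env: Record<string, string | undefined>
): TransportOptions {
  const required = ["EMAIL_HOST", "EMAIL_PORT", "EMAIL_USER", "EMAIL_PASS"];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing email configuration: ${missing.join(", ")}`);
  }
  const port = Number(env["EMAIL_PORT"]);
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid EMAIL_PORT: ${env["EMAIL_PORT"]}`);
  }
  return {
    host: env["EMAIL_HOST"] as string,
    port,
    // Per the story: false for STARTTLS on 587, true for implicit TLS on 465.
    secure: (env["EMAIL_SECURE"] ?? "false").toLowerCase() === "true",
    auth: {
      user: env["EMAIL_USER"] as string,
      pass: env["EMAIL_PASS"] as string,
    },
  };
}
```

In `sendDigestEmail`, the returned object would feed the transporter, and the array from `parseRecipients` would be used as the `to` field of `mailOptions`.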
|
||||
|
||||
---
|
||||
|
||||
### Story 5.4: Integrate Email Assembly and Sending into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want the main application workflow (`src/index.ts`) to orchestrate the final steps: assembling digest data, rendering the HTML, and triggering the email send after all previous stages are complete.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts`.
|
||||
- Import `assembleDigestData`, `renderDigestHtml`, `sendDigestEmail`.
|
||||
- Execute these steps *after* the main loop (where stories are fetched, scraped, summarized, and persisted) completes:
|
||||
- Log "Starting final digest assembly and email dispatch...".
|
||||
- Determine the path to the current date-stamped output directory.
|
||||
- Call `const digestData = await assembleDigestData(dateDirPath)`.
|
||||
- Check if `digestData` array is not empty.
|
||||
- If yes:
|
||||
- Get the current date string (e.g., 'YYYY-MM-DD').
|
||||
- `const htmlContent = renderDigestHtml(digestData, currentDate)`.
|
||||
- `` const subject = `BMad Hacker Daily Digest - ${currentDate}` ``.
|
||||
- `const emailSent = await sendDigestEmail(subject, htmlContent)`.
|
||||
- Log the final outcome based on `emailSent` ("Digest email sent successfully." or "Failed to send digest email.").
|
||||
- If no (`digestData` is empty or assembly failed):
|
||||
- Log an error: "Failed to assemble digest data or no data found. Skipping email."
|
||||
- Log "BMad Hacker Daily Digest process finished."
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes all stages (Epics 1-4) and then proceeds to email assembly and sending.
|
||||
- AC2: `assembleDigestData` is called correctly with the output directory path after other processing is done.
|
||||
- AC3: If data is assembled, `renderDigestHtml` and `sendDigestEmail` are called with the correct data, subject, and HTML.
|
||||
- AC4: The final success or failure of the email sending step is logged.
|
||||
- AC5: If `assembleDigestData` returns no data, email sending is skipped, and an appropriate message is logged.
|
||||
- AC6: The application logs a final completion message.
|
||||
|
||||
---
|
||||
|
||||
### Story 5.5: Implement Stage Testing Utility for Emailing
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate script/command to test the email assembly, rendering, and sending logic using persisted local data, including a crucial `--dry-run` option to prevent accidental email sending during tests.
|
||||
- **Detailed Requirements:**
|
||||
- Add `yargs` dependency for argument parsing: `npm install yargs @types/yargs --save-dev`.
|
||||
- Create a new standalone script file: `src/stages/send_digest.ts`.
|
||||
- Import necessary modules: `fs`, `path`, `logger`, `config`, `assembleDigestData`, `renderDigestHtml`, `sendDigestEmail`, `yargs`.
|
||||
- Use `yargs` to parse command-line arguments, specifically looking for a `--dry-run` boolean flag (defaulting to `false`). Allow an optional argument for specifying the date-stamped directory, otherwise default to current date.
|
||||
- The script should:
|
||||
- Initialize logger, load config.
|
||||
- Determine the target date-stamped directory path (from arg or default). Log the target directory.
|
||||
- Call `await assembleDigestData(dateDirPath)`.
|
||||
- If data is assembled and not empty:
|
||||
- Determine the date string for the subject/title.
|
||||
- Call `renderDigestHtml(digestData, dateString)` to get HTML.
|
||||
- Construct the subject string.
|
||||
- Check the `dryRun` flag:
|
||||
- If `true`: Log "DRY RUN enabled. Skipping actual email send.". Log the subject. Save the `htmlContent` to a file in the target directory (e.g., `_digest_preview.html`). Log that the preview file was saved.
|
||||
- If `false`: Log "Live run: Attempting to send email...". Call `await sendDigestEmail(subject, htmlContent)`. Log success/failure based on the return value.
|
||||
- If data assembly fails or is empty, log the error.
|
||||
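- Add script to `package.json`: `"stage:email": "ts-node src/stages/send_digest.ts"`. Flags are passed through npm with a `--` separator on the command line, e.g. `npm run stage:email -- --dry-run`. Do not append `--` to the script itself: the script would then receive `-- --dry-run`, and yargs treats everything after a bare `--` as positional arguments, so the `--dry-run` flag would not be recognized.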
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/send_digest.ts` exists. `yargs` dependency is added.
|
||||
- AC2: The script `stage:email` is defined in `package.json` allowing arguments.
|
||||
- AC3: Running `npm run stage:email -- --dry-run` reads local data, renders HTML, logs the intent, saves `_digest_preview.html` locally, and does *not* call `sendDigestEmail`.
|
||||
- AC4: Running `npm run stage:email` (without `--dry-run`) reads local data, renders HTML, and *does* call `sendDigestEmail`, logging the outcome.
|
||||
- AC5: The script correctly identifies and acts upon the `--dry-run` flag.
|
||||
- AC6: Logs clearly distinguish between dry runs and live runs and report success/failure.
|
||||
- AC7: The script operates using only local files and the email configuration/service; it does not invoke prior pipeline stages (Algolia, scraping, Ollama).
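The argument handling for this stage script can be sketched as below. The story specifies `yargs`; to keep the example dependency-free, this sketch swaps in a small hand-written parser instead, so `parseStageEmailArgs` and `defaultDateDirName` are hypothetical names and not the yargs API. It follows the common convention that everything after a bare `--` is positional, and it assumes the default directory name is the current UTC date in `YYYY-MM-DD` form.

```typescript
interface StageEmailArgs {
  dryRun: boolean;
  dateDir: string | null; // null means "use the current date"
}

// Parse the script's own arguments (e.g. process.argv.slice(2)).
function parseStageEmailArgs(argv: string[]): StageEmailArgs {
  let dryRun = false;
  let dateDir: string | null = null;
  let optionsEnded = false;

  for (const arg of argv) {
    if (!optionsEnded && arg === "--") {
      // Everything after a bare "--" is treated as positional.
      optionsEnded = true;
      continue;
    }
    if (!optionsEnded && arg === "--dry-run") {
      dryRun = true;
      continue;
    }
    if (!optionsEnded && arg.startsWith("--")) {
      throw new Error(`Unknown option: ${arg}`);
    }
    // First positional argument is the date-stamped directory.
    if (dateDir === null) {
      dateDir = arg;
    }
  }
  return { dryRun, dateDir };
}

// Default directory name: current date as YYYY-MM-DD (UTC).
function defaultDateDirName(now: Date = new Date()): string {
  return now.toISOString().slice(0, 10);
}
```

Because a bare `--` ends option parsing, a flag that arrives after it is not seen as a flag. For a script whose safety depends on `--dry-run`, that case is worth covering in a test so a live send is never triggered by accident.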
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change        | Date       | Version | Description               | Author         |
| ------------- | ---------- | ------- | ------------------------- | -------------- |
| Initial Draft | 2025-05-04 | 0.1     | First draft of Epic 5     | 2-pm           |
|
||||
|
||||
# END EPIC FILES
|
||||
@@ -1,202 +0,0 @@
|
||||
# BMad Hacker Daily Digest Data Models
|
||||
|
||||
This document defines the core data structures used within the application, the format of persisted data files, and relevant API payload schemas. These types would typically reside in `src/types/`.
|
||||
|
||||
## 1. Core Application Entities / Domain Objects (In-Memory)
|
||||
|
||||
These TypeScript interfaces represent the main data objects manipulated during the pipeline execution.
|
||||
|
||||
### `Comment`
|
||||
|
||||
- **Description:** Represents a single Hacker News comment fetched from the Algolia API.
|
||||
- **Schema / Interface Definition (`src/types/hn.ts`):**
|
||||
```typescript
export interface Comment {
  commentId: string; // Unique identifier (from Algolia objectID)
  commentText: string | null; // Text content of the comment (nullable from API)
  author: string | null; // Author's HN username (nullable from API)
  createdAt: string; // ISO 8601 timestamp string of comment creation
}
```
|
||||
|
||||
### `Story`
|
||||
|
||||
- **Description:** Represents a Hacker News story, initially fetched from Algolia and progressively augmented with comments, scraped content, and summaries during pipeline execution.
|
||||
- **Schema / Interface Definition (`src/types/hn.ts`):**
|
||||
|
||||
```typescript
// `Comment` is defined above in this same file (src/types/hn.ts), so no import is needed.

export interface Story {
  storyId: string; // Unique identifier (from Algolia objectID)
  title: string; // Story title
  articleUrl: string | null; // URL of the linked article (can be null from API)
  hnUrl: string; // URL to the HN discussion page (constructed)
  points?: number; // HN points (optional)
  numComments?: number; // Number of comments reported by API (optional)

  // Data added during pipeline execution
  comments: Comment[]; // Fetched comments [Added in Epic 2]
  articleContent: string | null; // Scraped article text [Added in Epic 3]
  articleSummary: string | null; // Generated article summary [Added in Epic 4]
  discussionSummary: string | null; // Generated discussion summary [Added in Epic 4]
  fetchedAt: string; // ISO 8601 timestamp when story/comments were fetched [Added in Epic 2]
  summarizedAt?: string; // ISO 8601 timestamp when summaries were generated [Added in Epic 4]
}
```
|
||||
|
||||
### `DigestData`
|
||||
|
||||
- **Description:** Represents the consolidated data needed for a single story when assembling the final email digest. Created by reading persisted files.
|
||||
- **Schema / Interface Definition (`src/types/email.ts`):**
|
||||
```typescript
export interface DigestData {
  storyId: string;
  title: string;
  hnUrl: string;
  articleUrl: string | null;
  articleSummary: string | null;
  discussionSummary: string | null;
}
```
|
||||
|
||||
## 2. API Payload Schemas
|
||||
|
||||
These describe the relevant parts of request/response payloads for external APIs.
|
||||
|
||||
### Algolia HN API - Story Response Subset
|
||||
|
||||
- **Description:** Relevant fields extracted from the Algolia HN Search API response for front-page stories.
|
||||
- **Schema (Conceptual JSON):**
|
||||
```json
{
  "hits": [
    {
      "objectID": "string", // Used as storyId
      "title": "string",
      "url": "string | null", // Used as articleUrl
      "points": "number",
      "num_comments": "number"
      // ... other fields ignored
    }
    // ... more hits (stories)
  ]
  // ... other top-level fields ignored
}
```
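To connect this payload to the `Story` type, a mapping step might look like the following sketch. The names `AlgoliaStoryHit`, `StoryBase`, and `toStoryBase` are hypothetical, and the `hnUrl` pattern (`https://news.ycombinator.com/item?id=<objectID>`) is an assumption about how the "constructed" discussion URL is built; this document does not specify it.

```typescript
// Subset of an Algolia hit, mirroring the conceptual JSON above.
interface AlgoliaStoryHit {
  objectID: string;
  title: string;
  url: string | null;
  points?: number;
  num_comments?: number;
}

// Story fields that are known immediately after the fetch step.
interface StoryBase {
  storyId: string;
  title: string;
  articleUrl: string | null;
  hnUrl: string;
  points?: number;
  numComments?: number;
}

function toStoryBase(hit: AlgoliaStoryHit): StoryBase {
  const story: StoryBase = {
    storyId: hit.objectID,
    title: hit.title,
    articleUrl: hit.url ?? null,
    // Assumed URL pattern for the HN discussion page.
    hnUrl: `https://news.ycombinator.com/item?id=${hit.objectID}`,
  };
  // Optional fields are copied only when the API provided them.
  if (hit.points !== undefined) story.points = hit.points;
  if (hit.num_comments !== undefined) story.numComments = hit.num_comments;
  return story;
}
```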
|
||||
|
||||
### Algolia HN API - Comment Response Subset
|
||||
|
||||
- **Description:** Relevant fields extracted from the Algolia HN Search API response for comments associated with a story.
|
||||
- **Schema (Conceptual JSON):**
|
||||
```json
{
  "hits": [
    {
      "objectID": "string", // Used as commentId
      "comment_text": "string | null",
      "author": "string | null",
      "created_at": "string" // ISO 8601 format
      // ... other fields ignored
    }
    // ... more hits (comments)
  ]
  // ... other top-level fields ignored
}
```
|
||||
|
||||
### Ollama `/api/generate` Request
|
||||
|
||||
- **Description:** Payload sent to the local Ollama instance to generate a summary.
|
||||
- **Schema (`src/types/ollama.ts` or inline):**
|
||||
```typescript
export interface OllamaGenerateRequest {
  model: string; // e.g., "llama3" (from config)
  prompt: string; // The full prompt including context
  stream: false; // Required to be false for single response
  // system?: string; // Optional system prompt (if used)
  // options?: Record<string, any>; // Optional generation parameters
}
```
|
||||
|
||||
### Ollama `/api/generate` Response
|
||||
|
||||
- **Description:** Relevant fields expected from the Ollama API response when `stream: false`.
|
||||
- **Schema (`src/types/ollama.ts` or inline):**
|
||||
```typescript
export interface OllamaGenerateResponse {
  model: string;
  created_at: string; // ISO 8601 timestamp
  response: string; // The generated summary text
  done: boolean; // Should be true if stream=false and generation succeeded
  // Optional fields detailing context, timings, etc. are ignored for MVP
  // total_duration?: number;
  // load_duration?: number;
  // prompt_eval_count?: number;
  // prompt_eval_duration?: number;
  // eval_count?: number;
  // eval_duration?: number;
}
```
|
||||
_(Note: Error responses might have a different structure, e.g., `{ "error": "message" }`)_
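Because the success and error payloads differ in shape, the client needs to inspect the response before trusting it. The sketch below shows one way to do that, assuming the error shape hinted at in the note above. `buildGenerateRequest` and `extractSummary` are hypothetical helper names; the HTTP call itself is left out.

```typescript
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  stream: false;
}

// Build the non-streaming request body for /api/generate.
function buildGenerateRequest(model: string, prompt: string): OllamaGenerateRequest {
  return { model, prompt, stream: false };
}

// Return the summary text from a parsed response body, or null when the
// payload is an error, is incomplete, or contains no usable text.
function extractSummary(payload: unknown): string | null {
  if (typeof payload !== "object" || payload === null) return null;
  const body = payload as Record<string, unknown>;
  // Assumed error shape: { "error": "message" }.
  if (typeof body["error"] === "string") return null;
  const text = body["response"];
  if (body["done"] !== true || typeof text !== "string") return null;
  const trimmed = text.trim();
  return trimmed === "" ? null : trimmed;
}
```

Returning `null` for every failure mode lines up with the `string | null` summary fields used throughout the data models.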
|
||||
|
||||
## 3. Database Schemas
|
||||
|
||||
- **N/A:** This application does not use a database for MVP; data is persisted to the local filesystem.
|
||||
|
||||
## 4. State File Schemas (Local Filesystem Persistence)
|
||||
|
||||
These describe the format of files saved in the `output/YYYY-MM-DD/` directory.
|
||||
|
||||
### `{storyId}_data.json`
|
||||
|
||||
- **Purpose:** Stores fetched story metadata and associated comments.
|
||||
- **Format:** JSON
|
||||
- **Schema Definition (Matches `Story` type fields relevant at time of saving):**
|
||||
```json
{
  "storyId": "string",
  "title": "string",
  "articleUrl": "string | null",
  "hnUrl": "string",
  "points": "number | undefined",
  "numComments": "number | undefined",
  "fetchedAt": "string", // ISO 8601 timestamp
  "comments": [
    // Array of Comment objects
    {
      "commentId": "string",
      "commentText": "string | null",
      "author": "string | null",
      "createdAt": "string" // ISO 8601 timestamp
    }
    // ... more comments
  ]
}
```
|
||||
|
||||
### `{storyId}_article.txt`
|
||||
|
||||
- **Purpose:** Stores the successfully scraped plain text content of the linked article.
|
||||
- **Format:** Plain Text (`.txt`)
|
||||
- **Schema Definition:** N/A (Content is the raw extracted string). File only exists if scraping was successful.
|
||||
|
||||
### `{storyId}_summary.json`
|
||||
|
||||
- **Purpose:** Stores the generated article and discussion summaries.
|
||||
- **Format:** JSON
|
||||
- **Schema Definition:**
|
||||
```json
{
  "storyId": "string",
  "articleSummary": "string | null", // Null if scraping failed or summarization failed
  "discussionSummary": "string | null", // Null if no comments or summarization failed
  "summarizedAt": "string" // ISO 8601 timestamp
}
```
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change        | Date       | Version | Description                  | Author      |
| ------------- | ---------- | ------- | ---------------------------- | ----------- |
| Initial draft | 2025-05-04 | 0.1     | Initial draft based on Epics | 3-Architect |
|
||||
@@ -1,158 +0,0 @@
|
||||
# Demonstration of the Full BMad Workflow Agent Gem Usage
|
||||
|
||||
**Welcome to the complete end-to-end walkthrough of the BMad Method V2!** This demonstration showcases the power of AI-assisted software development using a phased agent approach. You'll see how each specialized agent (BA, PM, Architect, PO/SM) contributes to the project lifecycle - from initial concept to implementation-ready plans.
|
||||
|
||||
Each section includes links to **full Gemini interaction transcripts**, allowing you to witness the remarkable collaborative process between human and AI. The demo folder contains all output artifacts that flow between agents, creating a cohesive development pipeline.
|
||||
|
||||
What makes this V2 methodology exceptional is how the agents work in **interactive phases**, pausing at key decision points for your input rather than dumping massive documents at once. This creates a truly collaborative experience where you shape the outcome while the AI handles the heavy lifting.
|
||||
|
||||
Follow along from concept to code-ready project plan and see how this workflow transforms software development!
|
||||
|
||||
## BA Brainstorming
|
||||
|
||||
The following link shows the full chat thread with the BA, demonstrating many features of this amazing agent. I started out not even knowing what to build. It helped me ideate toward something interesting for tutorial purposes, refine the idea, and do some deep research (in thinking mode; I did not switch models). It also offered some great alternative details and ideas, and eventually prompted me section by section to produce the brief. It worked amazingly well. You can read the full transcript and output here:
|
||||
|
||||
https://gemini.google.com/share/fec063449737
|
||||
|
||||
## PM Brainstorming (Oops it was not the PM LOL)
|
||||
|
||||
I took the final md brief, with the prompt for the PM at the end, from the last chat and created a Google Doc to make it easier to share with the PM (I could probably have just pasted it into the new chat, but this is easier if I want to start over). In Google Docs it's easy to create a new doc, right-click and select 'Paste from Markdown', then click in the title, and it will automatically name and save the doc with the title of the document. I then started a chat with the 2-PM Gem, also in Gemini 2.5 Pro thinking mode, by attaching the Google Doc and telling it to reference the prompt. This is the transcript. I realized that I had accidentally pasted the BA prompt into the PM prompt as well, so this actually ended up producing a pretty nicely refined brief 2.0 instead LOL
|
||||
|
||||
https://g.co/gemini/share/3e09f04138f2
|
||||
|
||||
So I took that output file and put it into the actual BA again to produce a new version with prompt as seen in [this file](final-brief-with-pm-prompt.txt) ([md version](final-brief-with-pm-prompt.md)).
|
||||
|
||||
## PM Brainstorming Take 2
|
||||
|
||||
For the rest of the process I will not use Google Docs, even though it's preferred, and will instead attach txt versions of the previous phase documents. This is required, or else the chat link will be unshareable.
|
||||
|
||||
Of note here is how I am not passive in this process, and you should not be either. After answering the initial questions, I looked at the proposed epics in its first PRD draft and spotted something really dumb: it had a final epic for doing file output and logging all the way at the end, when really this should be happening incrementally with each epic. I hope the Architect or PO would have caught this later, and the PM might have too if I had let it get to the checklist phase, but if you work with it you will get quicker results and better outcomes.
|
||||
|
||||
Also notice, since we came to the PM with the amazing brief + prompt embedded in it - it only had like 1 question before producing the first draft - amazing!!!
|
||||
|
||||
The PM did a great job of asking the right questions, and producing the [Draft PRD](prd.txt) ([md version](prd.md)), and each epic, [1](epic1.txt) ([md version](epic1.md)), [2](epic2.txt) ([md version](epic2.md)), [3](epic3.txt) ([md version](epic3.md)), [4](epic4.txt) ([md version](epic4.md)), [5](epic5.txt) ([md version](epic5.md)).
|
||||
|
||||
The beauty of these new V2 Agents is they pause for you to answer questions or review the document generation section by section - this is so much better than receiving a massive document dump all at once and trying to take it all in. In between each piece you can ask questions or ask for changes - so easy - so powerful!
|
||||
|
||||
After the drafts were done, it then ran the checklist - which is the other big game changer feature of the V2 BMAD Method. Waiting for the final decision from the checklist run can be exciting haha!
|
||||
|
||||
Getting that final PRD & EPIC VALIDATION SUMMARY and seeing it all passing is a great feeling.
|
||||
|
||||
[Here is the full chat summary](https://g.co/gemini/share/abbdff18316b).
|
||||
|
||||
## Architect (Terrible Architect - already fired and replaced in take 2)
|
||||
|
||||
I gave the architect the drafted PRD and epics. I call them all still drafts because the architect or PO could still have some findings or updates - but hopefully not for this very simple project.
|
||||
|
||||
I started off the fun with the architect by saying 'the prompt to respond to is in the PRD at the end in a section called 'Initial Architect Prompt' and we are in architecture creation mode - all PRD and epics planned by the PM are attached'
|
||||
|
||||
NOTE - The architect just plows through and produces everything at once and runs the checklist - need to improve the gem and agent to be more workflow focused in a future update! Here is the [initial crap it produced](botched-architecture.md) - don't worry I fixed it, it's much better in take 2!
|
||||
|
||||
There is one thing that is a pain with both Gemini and ChatGPT - output of markdown with internal markdown or mermaid sections screws up the output formatting, because it thinks the start of the inner markdown block is the end of its total output block. The reality is that everything you see in a response from the LLM is already markdown, just being rendered by the UI! So the fix is simple - I told it "Since you already default respond in markdown - can you not use markdown blocks and just give the document as standard chat output" - this worked perfectly, and nested markdown was still properly wrapped!
|
||||
|
||||
I updated the agent at this point to fix this output formatting for all gems, and adjusted the architect to progress document by document, prompting in between to get clarifications and to explain tradeoffs or what it put in place. It then confirms with me that I like each draft doc, one by one, and that I am ready for it to run the checklist assessment. Improved usage of this is shown in the next section, Architect Take 2.
|
||||
|
||||
If you want to see my annoying chat with this lame architect gem that is now much better - [here you go](https://g.co/gemini/share/0a029a45d70b).
|
||||
|
||||
{I corrected the interaction model and added YOLO mode to the architect, and tried a fresh start with the improved gem in take 2.}
|
||||
|
||||
## Architect Take 2 (Our amazing new architect)
|
||||
|
||||
Same initial prompt as before but with the new and improved architect! I submitted that first prompt again and waited in anticipation to see if it would go insane again.
|
||||
|
||||
So far success - it confirmed it was not to go all YOLO on me!
|
||||
|
||||
Our new architect is SO much better, and also fun '(Pirate voice) Aye, yargs be a fine choice, matey!' - firing the previous architect was a great decision!
|
||||
|
||||
It gave us our [tech stack](tech-stack.txt) ([md version](tech-stack.md)) - the tech-stack looks great, it did not produce wishy-washy ambiguous selections like the previous architect would!
|
||||
|
||||
I did mention we should call out the specific decisions to not use axios and dotenv so the LLM would not try to use them later. I also suggested adding Winston, and it let me know it had a better, simpler idea for MVP file logging! Such a great helper now! I really hope I never see that old V1 architect again, I don't think he was at all qualified to even mop the floors.
|
||||
|
||||
When I got the [project structure document](project-structure.txt) ([md version](project-structure.md)), I was blown away - you will see in the chat transcript how it was formatted - I was able to copy the whole response, put it in an md file, and have no more issues with sub sections; I just removed the text basically saying here is your file! Once I confirmed it was valid md, I changed it to txt for a potential pass off to the PO later.
|
||||
|
||||
Here are the remaining docs it did with me one at a time before running the checklist:
|
||||
|
||||
- [Architecture](architecture.txt) ([md version](architecture.md)) - the 'Core Workflow / Sequence Diagram (Main Pipeline)' diagram was impressive - one other diagram had mermaid bugs - I updated the agent and fixed the bugs, so these should hopefully not occur again - it was the most common LLM mermaid bug I have gotten across models
|
||||
- [Data Models](data-models.txt) ([md version](data-models.md)) - another complex file easy to just get the end of message ... copy the response and paste into markdown, perfect result!
|
||||
- [API Reference](api-reference.txt) ([md version](api-reference.md))
|
||||
- [Env Vars](environment-vars.txt) ([md version](environment-vars.md))
|
||||
- [Coding Standards](coding-standards.txt) ([md version](coding-standards.md))
|
||||
- [Testing Strategy](testing-strategy.txt) ([md version](testing-strategy.md)) - It really nailed this one, even having me ensure it captured my concerns about testing each epic with scripts incrementally - which I did confirm was awesome!
|
||||
|
||||
It then confirmed it was done with all the artifacts and ready to assess with its checklist! Here are the results: 5 minor points and great questions - some I answered, one I didn't understand and it explained it to me in detail!
|
||||
|
||||
I made the updates, and it also proposed a new file for this project specifically! [Prompts](prompts.txt) ([md version](prompts.md)) - I did have it take the output from that to ensure it's referenced where needed, including epic 4 and coding standards, and it helped me regenerate epic 4 with the changes.
|
||||
|
||||
We were done, I told it great job, the output it gave me next was perfect showing the complete function of the multiphased agent! Here is a snippet!
|
||||
|
||||
```text
|
||||
Next Steps:
|
||||
|
||||
The project can now move into the implementation phase, using these architecture documents as the blueprint. Development agents can be tasked with implementing the stories within each Epic, following the defined structure, standards, and testing strategy.
|
||||
|
||||
As the 3-Architect, I can switch to Mode 3: Master Architect Advisory if you need further assistance during development, such as:
|
||||
|
||||
Explaining architectural concepts.
|
||||
Advising on implementation challenges.
|
||||
Evaluating potential changes or enhancements.
|
||||
Helping manage technical debt.
|
||||
```
|
||||
|
||||
Here is the [full massive architecture discussion link](https://g.co/gemini/share/4fedc292e068)!
|
||||
|
||||
## PO-SM
|
||||
|
||||
This mode is WILD in Gemini 2.5 Pro with its massive 1 million token context window - holy crap! It was able to analyze every artifact together - since there is a 10 document attach limit I created a [Combined Artifacts](combined-artifacts-for-posm.txt) ([md version](combined-artifacts-for-posm.md)) which adds all 5 epic files in it.
|
||||
|
||||
There was a ton of output running through the checklist - but the amazing part is it considered it all and output this (in better formatted markdown I didn't capture here well). TL;DR of the following - it caught two great finds! It called out that story 1.2 still calls for dotenv which we are not going to use after working with the architect, and also it is looking for a prompts.md that is called out - but I forgot to attach it, so it caught that! So I corrected story 1.2 and we are ready to proceed with development!
|
||||
|
||||
VALIDATION SUMMARY
|
||||
Category Statuses
|
||||
| Category | Status | Critical Issues |
| ----------------------------------------- | ------ | ------------------------------------ |
| 1. Project Setup & Initialization | PASS | |
| 2. Infrastructure & Deployment Sequencing | PASS | |
| 3. External Dependencies & Integrations | PASS | |
| 4. User/Agent Responsibility Delineation | PASS | |
| 5. Feature Sequencing & Dependencies | PASS | |
| 6. MVP Scope Alignment | PASS | |
| 7. Risk Management & Practicality | PASS | |
| 8. Documentation & Handoff | PASS | |
| 9. Post-MVP Considerations | PASS | |
| OVERALL | PASS | No critical deficiencies identified. |
|
||||
|
||||
Critical Deficiencies
|
||||
None identified.
|
||||
Recommendations
|
||||
|
||||
While not critical, Story 1.2 could explicitly state npm install {dotenv library} rather than just referencing the library name, for absolute clarity, although the plan specifies native Node.js .env support which negates the need for a library [111, 549]. Correction: The plan correctly states native support is used and no external package is needed [549], removing this minor concern. The reference to installing dotenv in Story 1.2 [111-112] seems contradictory to the environment-vars.txt [549] and tech-stack.txt [867] which state native support is used and dotenv is not needed. This should be clarified/corrected in Story 1.2. Update: Reviewing Story 1.2 [111], it does explicitly require installing dotenv, which contradicts ENV [549], ARCH [72], TS [867], and PRD [752] (which implies native support via .env.example). This inconsistency needs resolution. Decision: Assuming ENV, ARCH, TS, PRD are correct and Story 1.2 requirement for dotenv package is an error. Will proceed assuming NO dotenv package required.
|
||||
|
||||
Consider adding a docs/prompts.md file explicitly, as referenced in ARCH [89] and E4 Story 4.2 [321], even if prompts are simple initially. Update: E4 Story 4.2 [321-325] does require creating docs/prompts.md and referencing it. This seems sufficient.
|
||||
|
||||
Final Decision
|
||||
|
||||
APPROVED: The plan is comprehensive, meticulously detailed across PRD, Architecture, Epics, and supporting documents. The sequencing is logical, dependencies are managed, risks acknowledged, and the scope aligns tightly with MVP goals. The technical constraints are respected, and the use of stage testing utilities significantly enhances testability and validation. The identified inconsistency regarding the dotenv package is noted but assumed resolved in favour of native Node.js support based on multiple other document references. The plan is ready for implementation.
|
||||
|
||||
Transitioning to SM Mode can proceed upon user confirmation or if auto-proceeding.
|
||||
|
||||
(full chat link in the next section)
|
||||
|
||||
## PO-SM Story Generation
|
||||
|
||||
This is only for demonstration as I do not like to work this way, at this point I want to be into the project - but for fun and testing, I wanted to verify the gem can produce quality stories, potentially multiple at once in case someone were wanting to use this more like taskmaster.
|
||||
|
||||
The output looks decent, I still prefer doing this in the IDE with Sonnet 3.5/3.7 though 1 story at a time with the SM, then use the Dev. Mainly because it's still possible you might want to change something story to story - but this is just a preference, and this method of generating all the stories at once might work well for you - experiment and let me know what you find!
|
||||
|
||||
- [Story Drafts Epic 1](epic-1-stories-demo.md)
|
||||
- [Story Drafts Epic 2](epic-2-stories-demo.md)
|
||||
- [Story Drafts Epic 3](epic-3-stories-demo.md)
|
||||
etc...
|
||||
|
||||
Here is the full [4-POSM chat record](https://g.co/gemini/share/9ab02d1baa18).
|
||||
|
||||
I'll post the link to the video and final project here if you want to see the final results of the app build - but I am beyond ecstatic at how well this planning workflow is now tuned with V2.
|
||||
|
||||
Thanks if you read this far.
|
||||
|
||||
- BMad
|
||||
@@ -1,43 +0,0 @@
|
||||
# BMad Hacker Daily Digest Environment Variables
|
||||
|
||||
## Configuration Loading Mechanism
|
||||
|
||||
Environment variables for this project are managed using a standard `.env` file in the project root. The application leverages the native support for `.env` files built into Node.js (v20.6.0 and later), meaning **no external `dotenv` package is required**.
|
||||
|
||||
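Variables defined in the `.env` file are loaded into `process.env` at startup when Node.js is launched with the `--env-file=.env` flag (or when the application calls `process.loadEnvFile()`, available in newer Node.js releases); Node.js does not read `.env` on its own. Accessing and potentially validating these variables should be centralized, ideally within the `src/utils/config.ts` module.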
|
||||
|
||||
## Required Variables
|
||||
|
||||
The following table lists the environment variables used by the application. An `.env.example` file should be maintained in the repository with these variables set to placeholder or default values.
|
||||
|
||||
| Variable Name | Description | Example / Default Value | Required? | Sensitive? | Source |
|
||||
| :------------------------------ | :---------------------------------------------------------------- | :--------------------------------------- | :-------- | :--------- | :------------ |
|
||||
| `OUTPUT_DIR_PATH` | Filesystem path for storing output data artifacts | `./output` | Yes | No | Epic 1 |
|
||||
| `MAX_COMMENTS_PER_STORY` | Maximum number of comments to fetch per HN story | `50` | Yes | No | PRD |
|
||||
| `OLLAMA_ENDPOINT_URL` | Base URL for the local Ollama API instance | `http://localhost:11434` | Yes | No | Epic 4 |
|
||||
| `OLLAMA_MODEL` | Name of the Ollama model to use for summarization | `llama3` | Yes | No | Epic 4 |
|
||||
| `EMAIL_HOST` | SMTP server hostname for sending email | `smtp.example.com` | Yes | No | Epic 5 |
|
||||
| `EMAIL_PORT` | SMTP server port | `587` | Yes | No | Epic 5 |
|
||||
| `EMAIL_SECURE` | Use TLS/SSL (`true` for port 465, `false` for 587/STARTTLS) | `false` | Yes | No | Epic 5 |
|
||||
| `EMAIL_USER` | Username for SMTP authentication | `user@example.com` | Yes | **Yes** | Epic 5 |
|
||||
| `EMAIL_PASS` | Password for SMTP authentication | `your_smtp_password` | Yes | **Yes** | Epic 5 |
|
||||
| `EMAIL_FROM` | Sender email address (may need specific format) | `"BMad Digest <digest@example.com>"` | Yes | No | Epic 5 |
|
||||
| `EMAIL_RECIPIENTS` | Comma-separated list of recipient email addresses | `recipient1@example.com,r2@test.org` | Yes | No | Epic 5 |
|
||||
| `NODE_ENV` | Runtime environment (influences some library behavior) | `development` | No | No | Standard Node |
|
||||
| `SCRAPE_TIMEOUT_MS` | _Optional:_ Timeout in milliseconds for article scraping requests | `15000` (15s) | No | No | Good Practice |
|
||||
| `OLLAMA_TIMEOUT_MS` | _Optional:_ Timeout in milliseconds for Ollama API requests | `120000` (2min) | No | No | Good Practice |
|
||||
| `LOG_LEVEL` | _Optional:_ Control log verbosity (e.g., debug, info) | `info` | No | No | Good Practice |
|
||||
| `MAX_COMMENT_CHARS_FOR_SUMMARY` | _Optional:_ Max chars of combined comments sent to LLM | 10000 / null (uses all if not set) | No | No | Arch Decision |
|
||||
| `SCRAPER_USER_AGENT` | _Optional:_ Custom User-Agent header for scraping requests | "BMadHackerDigest/0.1" (Default in code) | No | No | Arch Decision |
|
||||
|
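Every value in `process.env` arrives as a string, so the numeric, boolean, and list variables in the table need to be parsed before use. The following is a minimal sketch of how that parsing might look; the helper names (`parsePositiveInt`, `parseBoolean`, `parseList`) are illustrative assumptions and do not come from the project documents.

```typescript
// Hypothetical helpers (not from the original docs) that turn raw
// environment strings into typed values.

type Env = Record<string, string | undefined>;

/** Parse a positive integer, falling back to a default when the variable is unset. */
function parsePositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

/** Accept only the strings "true" and "false" (case-insensitive). */
function parseBoolean(env: Env, name: string, fallback: boolean): boolean {
  const raw = env[name]?.trim().toLowerCase();
  if (raw === undefined || raw === "") return fallback;
  if (raw === "true") return true;
  if (raw === "false") return false;
  throw new Error(`${name} must be "true" or "false", got "${env[name]}"`);
}

/** Split a comma-separated list, trimming whitespace and dropping empty entries. */
function parseList(env: Env, name: string): string[] {
  return (env[name] ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}
```

In a config module these would typically be called with `process.env`, for example `parsePositiveInt(process.env, "EMAIL_PORT", 587)` or `parseList(process.env, "EMAIL_RECIPIENTS")`. Rejecting anything other than `true`/`false` for `EMAIL_SECURE` is a deliberate choice in this sketch, since a silently misread TLS flag is hard to debug.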
||||
## Notes
|
||||
|
||||
- **Secrets Management:** Sensitive variables (`EMAIL_USER`, `EMAIL_PASS`) must **never** be committed to version control. The `.env` file should be included in `.gitignore` (as per boilerplate).
|
||||
- **`.env.example`:** Maintain an `.env.example` file in the repository mirroring the variables above, using placeholders or default values for documentation and local setup.
|
||||
- **Validation:** It is recommended to implement validation logic in `src/utils/config.ts` to ensure required variables are present and potentially check their format on application startup.
|
||||
|
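As a rough illustration of the validation note, the sketch below checks every variable marked "Yes" in the Required column and reports all missing names in a single error. The list of names comes from the table in this document; the identifiers `REQUIRED_VARS`, `findMissingVars`, and `loadRequiredConfig` are assumptions made for this example, not part of the documented design.

```typescript
// Sketch of a centralized startup check. REQUIRED_VARS mirrors the
// "Required? = Yes" rows of the table; all function and type names
// here are illustrative assumptions.

const REQUIRED_VARS = [
  "OUTPUT_DIR_PATH",
  "MAX_COMMENTS_PER_STORY",
  "OLLAMA_ENDPOINT_URL",
  "OLLAMA_MODEL",
  "EMAIL_HOST",
  "EMAIL_PORT",
  "EMAIL_SECURE",
  "EMAIL_USER",
  "EMAIL_PASS",
  "EMAIL_FROM",
  "EMAIL_RECIPIENTS",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type RequiredConfig = Record<RequiredVar, string>;

/** Return the names of required variables that are unset or blank. */
function findMissingVars(env: Record<string, string | undefined>): RequiredVar[] {
  return REQUIRED_VARS.filter((name) => (env[name] ?? "").trim() === "");
}

/** Validate once at startup and report every missing variable in one error. */
function loadRequiredConfig(env: Record<string, string | undefined>): RequiredConfig {
  const missing = findMissingVars(env);
  if (missing.length > 0) {
    // Only variable names are reported, never values, so secrets stay out of logs.
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  const config = {} as RequiredConfig;
  for (const name of REQUIRED_VARS) {
    config[name] = env[name] as string;
  }
  return config;
}
```

Calling `loadRequiredConfig(process.env)` once at the top of the entry point would make a misconfigured `.env` fail fast, before any fetching or emailing starts.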
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------------------- | ----------- |
|
||||
| Initial draft | 2025-05-04 | 0.1 | Draft based on PRD/Epics requirements | 3-Architect |
|
||||
@@ -1,391 +0,0 @@
|
||||
Okay, transitioning to **SM Mode**.
|
||||
|
||||
I will now generate the stories for **Epic 1: Project Initialization & Core Setup**, as requested for demonstration purposes.
|
||||
|
||||
I will use the details from Epic 1 in `combined-artifacts-for-posm.txt`, the `story-template.txt`, and extract relevant technical context from the supporting documents (`architecture.txt`, `project-structure.txt`, `tech-stack.txt`, `environment-vars.txt`, `testing-strategy.txt`).
|
||||
|
||||
**Note:** As identified during PO Mode validation, Story 1.2 contained a requirement to install the `dotenv` package [111-112], which contradicts multiple other documents stating native Node.js support is used and the package is unnecessary (`ENV` [549], `ARCH` [72], `TS` [867], `PRD` [752]). The following story for 1.2 has been generated based on the **correction** that the `dotenv` package is **not required** and native Node.js functionality should be used.
|
||||
|
||||
Here is the report containing the generated stories for Epic 1:
|
||||
|
||||
---
|
||||
|
||||
## **Stories Report: Epic 1**
|
||||
|
||||
**File: ai/stories/1.1.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 1.1: Initialize Project from Boilerplate
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want to set up the initial project structure using the `bmad-boilerplate`, so that I have the standard tooling (TS, Jest, ESLint, Prettier), configurations, and scripts in place. [96]
|
||||
|
||||
**Context:** This is the foundational story for the entire project. It establishes the core development environment, tooling, and standard scripts based on the `bmad-boilerplate` template, as required by the PRD [706, 713, 784] and Architecture [43]. All subsequent development relies on the successful completion of this setup.
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Copy or clone the contents of the `bmad-boilerplate` into the new project's root directory. [97]
|
||||
- Initialize a git repository in the project root directory (if not already done by cloning). [98]
|
||||
- Ensure the `.gitignore` file from the boilerplate is present. [99]
|
||||
- Run `npm install` to download and install all `devDependencies` specified in the boilerplate's `package.json`. [100]
|
||||
- Verify that the core boilerplate scripts (`lint`, `format`, `test`, `build`) execute without errors on the initial codebase. [101]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: The project directory contains the files and structure from `bmad-boilerplate`. [102]
|
||||
- AC2: A `node_modules` directory exists and contains packages corresponding to `devDependencies`. [103]
|
||||
- AC3: `npm run lint` command completes successfully without reporting any linting errors. [104]
|
||||
- AC4: `npm run format` command completes successfully, potentially making formatting changes according to Prettier rules. [105] Running it a second time should result in no changes. [106]
|
||||
- AC5: `npm run test` command executes Jest successfully (it may report "no tests found" which is acceptable at this stage). [107]
|
||||
- AC6: `npm run build` command executes successfully, creating a `dist` directory containing compiled JavaScript output. [108]
|
||||
- AC7: The `.gitignore` file exists and includes entries for `node_modules/`, `.env`, `dist/`, etc. as specified in the boilerplate. [109, 632]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create/Copy: All files from `bmad-boilerplate` (e.g., `package.json`, `tsconfig.json`, `.eslintrc.js`, `.prettierrc.js`, `.gitignore`, initial `src/` structure if any).
|
||||
- Files to Modify: None initially, verification via script execution.
|
||||
- _(Hint: See `docs/project-structure.md` [813-825] for the target overall layout derived from the boilerplate)._
|
||||
- **Key Technologies:**
|
||||
- Node.js 22.x [851], npm [100], Git [98], TypeScript [846], Jest [889], ESLint [893], Prettier [896].
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905] for full list)._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- N/A for this story.
|
||||
- **Data Structures:**
|
||||
- N/A for this story.
|
||||
- **Environment Variables:**
|
||||
- N/A directly used, but `.gitignore` [109] should cover `.env`. Boilerplate includes `.env.example` [112].
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638] for all variables)._
|
||||
- **Coding Standards Notes:**
|
||||
- Ensure boilerplate scripts (`lint`, `format`) run successfully. [101]
|
||||
- Adhere to ESLint/Prettier rules defined in the boilerplate. [746]
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] Obtain the `bmad-boilerplate` content (clone or copy).
|
||||
- [ ] Place boilerplate content into the project's root directory.
|
||||
- [ ] Initialize git repository (`git init`).
|
||||
- [ ] Verify `.gitignore` exists and is correctly sourced from boilerplate.
|
||||
- [ ] Run `npm install` to install dependencies.
|
||||
- [ ] Execute `npm run lint` and verify successful completion without errors.
|
||||
- [ ] Execute `npm run format` and verify successful completion. Run again to confirm no further changes.
|
||||
- [ ] Execute `npm run test` and verify successful execution (no tests found is OK).
|
||||
- [ ] Execute `npm run build` and verify `dist/` directory creation and successful completion.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** N/A for this story (focus is project setup). [915]
|
||||
- **Integration Tests:** N/A for this story. [921]
|
||||
- **Manual/CLI Verification:**
|
||||
- Verify file structure matches boilerplate (AC1).
|
||||
- Check for `node_modules/` directory (AC2).
|
||||
- Run `npm run lint` (AC3).
|
||||
- Run `npm run format` twice (AC4).
|
||||
- Run `npm run test` (AC5).
|
||||
- Run `npm run build`, check for `dist/` (AC6).
|
||||
- Inspect `.gitignore` contents (AC7).
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Any notes about implementation choices, difficulties, or follow-up needed}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
|
||||
|
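The manual check for AC7 in Story 1.1 could also be scripted. The following is a small sketch, assuming a hypothetical helper named `missingGitignoreEntries` that receives the `.gitignore` text as a string; it is not part of the boilerplate or the generated story.

```typescript
// Hypothetical helper for AC7: reports which required entries are absent
// from a .gitignore file's contents. Matching is exact per line, so
// "node_modules" and "node_modules/" are treated as different entries.

/** Return the required entries that do not appear as lines in the given text. */
function missingGitignoreEntries(gitignoreText: string, required: string[]): string[] {
  const entries = new Set(
    gitignoreText
      .split(/\r?\n/)
      .map((line) => line.trim())
      // Skip blank lines and comments.
      .filter((line) => line !== "" && !line.startsWith("#")),
  );
  return required.filter((entry) => !entries.has(entry));
}
```

A verification script might read the file with `fs.readFileSync(".gitignore", "utf8")` and fail if the returned array is non-empty for `["node_modules/", ".env", "dist/"]`.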
||||
---
|
||||
|
||||
**File: ai/stories/1.2.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 1.2: Setup Environment Configuration
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want to establish the environment configuration mechanism using `.env` files, so that secrets and settings (like output paths) can be managed outside of version control, following boilerplate conventions and utilizing native Node.js support. [110, 549]
|
||||
|
||||
**Context:** This story builds on the initialized project (Story 1.1). It sets up the critical mechanism for managing configuration parameters like API keys and file paths using standard `.env` files, which is essential for security and flexibility. It leverages Node.js's built-in `.env` file loading [549, 867], meaning **no external package installation is required**. This corrects the original requirement [111-112] based on `docs/environment-vars.md` [549] and `docs/tech-stack.md` [867].
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Verify the `.env.example` file exists (from boilerplate). [112]
|
||||
- Add an initial configuration variable `OUTPUT_DIR_PATH=./output` to `.env.example`. [113]
|
||||
- Create the `.env` file locally by copying `.env.example`. Populate `OUTPUT_DIR_PATH` if needed (can keep default). [114]
|
||||
- Implement a utility module (e.g., `src/utils/config.ts`) that reads environment variables **directly from `process.env`** (populated natively by Node.js from the `.env` file when the process is started with `--env-file=.env`). [115, 550]
|
||||
- The utility should export the loaded configuration values (initially just `OUTPUT_DIR_PATH`). [116] It is recommended to include basic validation (e.g., checking if required variables are present). [634]
|
||||
- Ensure the `.env` file is listed in `.gitignore` and is not committed. [117, 632]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: **(Removed)** The chosen `.env` library... is listed under `dependencies`. (Package not needed [549]).
|
||||
- AC2: The `.env.example` file exists, is tracked by git, and contains the line `OUTPUT_DIR_PATH=./output`. [119]
|
||||
- AC3: The `.env` file exists locally but is NOT tracked by git. [120]
|
||||
- AC4: A configuration module (`src/utils/config.ts` or similar) exists and successfully reads the `OUTPUT_DIR_PATH` value **from `process.env`** when the application starts. [121]
|
||||
- AC5: The loaded `OUTPUT_DIR_PATH` value is accessible within the application code via the config module. [122]
|
||||
- AC6: The `.env` file is listed in the `.gitignore` file. [117]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create: `src/utils/config.ts`.
|
||||
- Files to Modify: `.env.example`, `.gitignore` (verify inclusion of `.env`). Create local `.env`.
|
||||
- _(Hint: See `docs/project-structure.md` [822] for utils location)._
|
||||
- **Key Technologies:**
|
||||
- Node.js 22.x (native `.env` loading via the `--env-file` flag, available since 20.6) [549, 851]. TypeScript [846].
|
||||
- **No `dotenv` package required.** [549, 867]
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905] for full list)._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- N/A for this story.
|
||||
- **Data Structures:**
|
||||
- Potentially an interface for the exported configuration object in `config.ts`.
|
||||
- _(Hint: See `docs/data-models.md` [498-547] for key project data structures)._
|
||||
- **Environment Variables:**
|
||||
- Reads `OUTPUT_DIR_PATH` from `process.env`. [116]
|
||||
- Defines `OUTPUT_DIR_PATH` in `.env.example`. [113]
|
||||
- _(Hint: See `docs/environment-vars.md` [559] for this variable)._
|
||||
- **Coding Standards Notes:**
|
||||
- `config.ts` should export configuration values clearly.
|
||||
- Consider adding validation logic in `config.ts` to check for the presence of required environment variables on startup. [634]
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] Verify `bmad-boilerplate` provided `.env.example`.
|
||||
- [ ] Add `OUTPUT_DIR_PATH=./output` to `.env.example`.
|
||||
- [ ] Create `.env` file by copying `.env.example`.
|
||||
- [ ] Verify `.env` is included in `.gitignore`.
|
||||
- [ ] Create `src/utils/config.ts`.
|
||||
- [ ] Implement logic in `config.ts` to read `OUTPUT_DIR_PATH` directly from `process.env`.
|
||||
- [ ] Export the loaded `OUTPUT_DIR_PATH` value from `config.ts`.
|
||||
- [ ] (Optional but Recommended) Add validation in `config.ts` to ensure `OUTPUT_DIR_PATH` is defined in `process.env`.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:**
|
||||
- Write unit tests for `src/utils/config.ts`. [915]
|
||||
- Use `process.env` manipulation (e.g., temporarily setting `process.env.OUTPUT_DIR_PATH` within the test) to verify the module reads and exports the value correctly.
|
||||
- Test validation logic (e.g., if it throws an error when a required variable is missing). [920]
|
||||
- **Integration Tests:** N/A for this story. [921]
|
||||
- **Manual/CLI Verification:**
|
||||
- Check `.env.example` content (AC2).
|
||||
- Verify `.env` exists locally but not in git status (AC3, AC6).
|
||||
- Code inspection of `src/utils/config.ts` (AC4).
|
||||
- Later stories (1.3, 1.4) will consume this module, verifying AC5 implicitly.
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Implemented using native Node.js .env support, no external package installed. Added basic validation.}
|
||||
- **Change Log:**
|
||||
- Initial Draft (Corrected requirement to use native .env support instead of installing `dotenv` package).
|
||||
```
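
A minimal sketch of what `src/utils/config.ts` from Story 1.2 might look like, assuming the recommended validation is implemented as a fail-fast check. The names `AppConfig`, `requireEnv`, and `loadConfig` are illustrative and do not come from the story. The optional `env` parameter is an assumption added so tests can pass a plain object instead of mutating `process.env`.

```typescript
// Hypothetical shape of the exported configuration object.
export interface AppConfig {
  OUTPUT_DIR_PATH: string;
}

// Returns the named variable, or throws if it is missing or blank.
export function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Builds the config object from an env-like record (process.env by default).
export function loadConfig(
  env: Record<string, string | undefined> = process.env,
): AppConfig {
  return { OUTPUT_DIR_PATH: requireEnv("OUTPUT_DIR_PATH", env) };
}
```

In this sketch nothing is read at import time, so a unit test can call `loadConfig({ OUTPUT_DIR_PATH: "./tmp" })` directly. The real module could still export a default instance built from `process.env`.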
|
||||
|
||||
---
|
||||
|
||||
**File: ai/stories/1.3.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 1.3: Implement Basic CLI Entry Point & Execution
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want a basic `src/index.ts` entry point that can be executed via the boilerplate's `dev` and `start` scripts, providing a working foundation for the application logic. [123]
|
||||
|
||||
**Context:** This story builds upon the project setup (Story 1.1) and environment configuration (Story 1.2). It creates the main starting point (`src/index.ts`) for the CLI application. This file will be executed by the `npm run dev` (using `ts-node`) and `npm run start` (using compiled code) scripts provided by the boilerplate. It verifies that the basic execution flow and configuration loading are functional. [730, 755]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Create the main application entry point file at `src/index.ts`. [124]
|
||||
- Implement minimal code within `src/index.ts` to:
|
||||
- Import the configuration loading mechanism (from Story 1.2, e.g., `import config from './utils/config';`). [125]
|
||||
- Log a simple startup message to the console (e.g., "BMad Hacker Daily Digest - Starting Up..."). [126]
|
||||
- (Optional) Log the loaded `OUTPUT_DIR_PATH` from the imported config object to verify config loading. [127]
|
||||
- Confirm execution using boilerplate scripts (`npm run dev`, `npm run build`, `npm run start`). [127]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: The `src/index.ts` file exists. [128]
|
||||
- AC2: Running `npm run dev` executes `src/index.ts` via `ts-node` and logs the startup message to the console. [129]
|
||||
- AC3: Running `npm run build` successfully compiles `src/index.ts` (and any imports like `config.ts`) into the `dist` directory. [130]
|
||||
- AC4: Running `npm start` (after a successful build) executes the compiled code from `dist` and logs the startup message to the console. [131]
|
||||
- AC5: (If implemented) The loaded `OUTPUT_DIR_PATH` is logged to the console during execution via `npm run dev` or `npm run start`. [127]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create: `src/index.ts`.
|
||||
- Files to Modify: None.
|
||||
- _(Hint: See `docs/project-structure.md` [822] for entry point location)._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851].
|
||||
- Uses scripts from `package.json` (`dev`, `start`, `build`) defined in the boilerplate.
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905] for full list)._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- N/A for this story.
|
||||
- **Data Structures:**
|
||||
- Imports configuration object from `src/utils/config.ts` (Story 1.2).
|
||||
- _(Hint: See `docs/data-models.md` [498-547] for key project data structures)._
|
||||
- **Environment Variables:**
|
||||
- Implicitly uses variables loaded by `config.ts` if the optional logging step [127] is implemented.
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638] for all variables)._
|
||||
- **Coding Standards Notes:**
|
||||
- Use standard `import` statements.
|
||||
- Use `console.log` initially for the startup message (Logger setup is in Story 1.4).
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] Create the file `src/index.ts`.
|
||||
- [ ] Add import statement for the configuration module (`src/utils/config.ts`).
|
||||
- [ ] Add `console.log("BMad Hacker Daily Digest - Starting Up...");` (or similar).
|
||||
- [ ] (Optional) Add ``console.log(`Output directory: ${config.OUTPUT_DIR_PATH}`);``
|
||||
- [ ] Run `npm run dev` and verify console output (AC2, AC5 optional).
|
||||
- [ ] Run `npm run build` and verify successful compilation to `dist/` (AC3).
|
||||
- [ ] Run `npm start` and verify console output from compiled code (AC4, AC5 optional).
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** Low value for this specific story, as it's primarily wiring and execution setup. Testing `config.ts` was covered in Story 1.2. [915]
|
||||
- **Integration Tests:** N/A for this story. [921]
|
||||
- **Manual/CLI Verification:**
|
||||
- Verify `src/index.ts` exists (AC1).
|
||||
- Run `npm run dev`, check console output (AC2, AC5 opt).
|
||||
- Run `npm run build`, check `dist/` exists (AC3).
|
||||
- Run `npm start`, check console output (AC4, AC5 opt).
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Any notes about implementation choices, difficulties, or follow-up needed}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
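
The entry point described in Story 1.3 could be sketched as follows. To keep the example self-contained, the config object is passed in rather than imported from `./utils/config`. The `main` function and its injectable `log` parameter are assumptions made for illustration, not part of the story.

```typescript
// Stand-in for the config object exported by src/utils/config.ts (Story 1.2).
interface AppConfig {
  OUTPUT_DIR_PATH: string;
}

// Logs the startup message and, as the optional step, the configured output path.
export function main(
  config: AppConfig,
  log: (message: string) => void = console.log,
): void {
  log("BMad Hacker Daily Digest - Starting Up...");
  log(`Output directory: ${config.OUTPUT_DIR_PATH}`);
}

// In a real src/index.ts this would run at module load, e.g.:
// main(config);
```

Story 1.4 later replaces `console.log` with the logger utility, which in this sketch would only mean passing a different `log` function.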
|
||||
|
||||
---
|
||||
|
||||
**File: ai/stories/1.4.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 1.4: Setup Basic Logging and Output Directory
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want a basic console logging mechanism and the dynamic creation of a date-stamped output directory, so that the application can provide execution feedback and prepare for storing data artifacts in subsequent epics. [132]
|
||||
|
||||
**Context:** This story refines the basic execution setup from Story 1.3. It introduces a simple, reusable logger utility (`src/utils/logger.ts`) for standardized console output [871] and implements the logic to create the necessary date-stamped output directory (`./output/YYYY-MM-DD/`) based on the `OUTPUT_DIR_PATH` configured in Story 1.2. This directory is crucial for persisting intermediate data in later epics (Epics 2, 3, 4). [68, 538, 734, 788]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Implement a simple, reusable logging utility module (e.g., `src/utils/logger.ts`). [133] Initially, it can wrap `console.log`, `console.warn`, `console.error`. Provide simple functions like `logInfo`, `logWarn`, `logError`. [134]
|
||||
- Refactor `src/index.ts` to use this `logger` for its startup message(s) instead of `console.log`. [134]
|
||||
- In `src/index.ts` (or a setup function called by it):
|
||||
- Retrieve the `OUTPUT_DIR_PATH` from the configuration (imported from `src/utils/config.ts` - Story 1.2). [135]
|
||||
- Determine the current date in 'YYYY-MM-DD' format (the `date-fns` library is recommended [878]; install it with `npm install date-fns --save-prod`). [136]
|
||||
- Construct the full path for the date-stamped subdirectory (e.g., `${OUTPUT_DIR_PATH}/${formattedDate}`). [137]
|
||||
- Check if the base output directory exists; if not, create it. [138]
|
||||
- Check if the date-stamped subdirectory exists; if not, create it recursively. [139] Use Node.js `fs` module (e.g., `fs.mkdirSync(path, { recursive: true })`). Need to import `fs`. [140]
|
||||
- Log (using the new logger utility) the full path of the output directory being used for the current run (e.g., "Output directory for this run: ./output/2025-05-04"). [141]
|
||||
- The application should exit gracefully after performing these setup steps (for now). [147]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: A logger utility module (`src/utils/logger.ts` or similar) exists and is used for console output in `src/index.ts`. [142]
|
||||
- AC2: Running `npm run dev` or `npm start` logs the startup message via the logger. [143]
|
||||
- AC3: Running the application creates the base output directory (e.g., `./output` defined in `.env`) if it doesn't already exist. [144]
|
||||
- AC4: Running the application creates a date-stamped subdirectory (e.g., `./output/2025-05-04`, based on current date) within the base output directory if it doesn't already exist. [145]
|
||||
- AC5: The application logs a message via the logger indicating the full path to the date-stamped output directory created/used for the current execution. [146]
|
||||
- AC6: The application exits gracefully after performing these setup steps (for now). [147]
|
||||
- AC7: `date-fns` library is added as a production dependency.
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create: `src/utils/logger.ts`, `src/utils/dateUtils.ts` (recommended for date formatting logic).
|
||||
- Files to Modify: `src/index.ts`, `package.json` (add `date-fns`), `package-lock.json`.
|
||||
- _(Hint: See `docs/project-structure.md` [822] for utils location)._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851], `fs` module (native) [140], `path` module (native, for joining paths).
|
||||
- `date-fns` library [876] for date formatting (needs `npm install date-fns --save-prod`).
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905] for full list)._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- Node.js `fs.mkdirSync`. [140]
|
||||
- **Data Structures:**
|
||||
- N/A specific to this story, uses config from 1.2.
|
||||
- _(Hint: See `docs/data-models.md` [498-547] for key project data structures)._
|
||||
- **Environment Variables:**
|
||||
- Uses `OUTPUT_DIR_PATH` loaded via `config.ts`. [135]
|
||||
- _(Hint: See `docs/environment-vars.md` [559] for this variable)._
|
||||
- **Coding Standards Notes:**
|
||||
- Logger should provide simple info/warn/error functions. [134]
|
||||
- Use `path.join` to construct file paths reliably.
|
||||
- Handle potential errors during directory creation (e.g., permissions) using try/catch, logging errors via the new logger.
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] Install `date-fns`: `npm install date-fns --save-prod`.
|
||||
- [ ] Create `src/utils/logger.ts` wrapping `console` methods (e.g., `logInfo`, `logWarn`, `logError`).
|
||||
- [ ] Create `src/utils/dateUtils.ts` (optional but recommended) with a function to get current date as 'YYYY-MM-DD' using `date-fns`.
|
||||
- [ ] Refactor `src/index.ts` to import and use the `logger` instead of `console.log`.
|
||||
- [ ] In `src/index.ts`, import `fs` and `path`.
|
||||
- [ ] In `src/index.ts`, import and use the date formatting function.
|
||||
- [ ] In `src/index.ts`, retrieve `OUTPUT_DIR_PATH` from config.
|
||||
- [ ] In `src/index.ts`, construct the full date-stamped directory path using `path.join`.
|
||||
- [ ] In `src/index.ts`, add logic using `fs.mkdirSync` (with `{ recursive: true }`) inside a try/catch block to create the directory. Log errors using the logger.
|
||||
- [ ] In `src/index.ts`, log the full path of the created/used directory using the logger.
|
||||
- [ ] Ensure the script completes and exits after these steps.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:**
|
||||
- Test `src/utils/logger.ts` functions (can spy on `console` methods). [915]
|
||||
- Test `src/utils/dateUtils.ts` function for correct date formatting.
|
||||
- Testing `fs` operations in unit tests can be complex; consider focusing on integration or manual verification for directory creation. Mocking `fs` is an option but might be brittle. [918]
|
||||
- **Integration Tests:**
|
||||
- Could write a test that runs the core logic of `src/index.ts` (directory creation part) and uses `mock-fs` or actual file system checks (with cleanup) to verify directory creation. [921, 924]
|
||||
- **Manual/CLI Verification:**
|
||||
- Run `npm run dev` or `npm start`.
|
||||
- Check console output uses the logger format (AC1, AC2).
|
||||
- Verify the base output directory (e.g., `./output`) is created if it didn't exist (AC3).
|
||||
- Verify the date-stamped subdirectory (e.g., `./output/2025-05-04`) is created (AC4). Use current date. Delete directories before re-running to confirm creation.
|
||||
- Check console log for the message confirming the output directory path (AC5).
|
||||
- Confirm the process exits cleanly (AC6).
|
||||
- Check `package.json` for `date-fns` dependency (AC7).
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Any notes about implementation choices, difficulties, or follow-up needed}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
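
The logger and directory setup from Story 1.4 could be sketched roughly as below. One deliberate difference from the story: dates are formatted with plain `Date` methods instead of `date-fns`, so the example runs without extra dependencies. The `[INFO]`/`[ERROR]` prefixes and the names `formatDate` and `ensureOutputDir` are assumptions for illustration.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Thin wrappers around console, in the spirit of src/utils/logger.ts.
export const logInfo = (message: string): void => console.log(`[INFO] ${message}`);
export const logError = (message: string): void => console.error(`[ERROR] ${message}`);

// Formats a date as YYYY-MM-DD in local time (date-fns is swapped out here).
export function formatDate(date: Date): string {
  const yyyy = date.getFullYear();
  const mm = String(date.getMonth() + 1).padStart(2, "0");
  const dd = String(date.getDate()).padStart(2, "0");
  return `${yyyy}-${mm}-${dd}`;
}

// Creates <baseDir>/<YYYY-MM-DD> if needed and returns the resulting path.
export function ensureOutputDir(baseDir: string, now: Date = new Date()): string {
  const runDir = path.join(baseDir, formatDate(now));
  try {
    // recursive: true also creates baseDir and is a no-op if the path exists.
    fs.mkdirSync(runDir, { recursive: true });
  } catch (err) {
    logError(`Failed to create output directory ${runDir}: ${String(err)}`);
    throw err;
  }
  logInfo(`Output directory for this run: ${runDir}`);
  return runDir;
}
```

Because `fs.mkdirSync` with `{ recursive: true }` creates missing parent directories, the separate "create the base directory" step and the date-stamped step collapse into a single call in this sketch.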
|
||||
|
||||
---
|
||||
|
||||
## **End of Report for Epic 1**
|
||||
@@ -1,925 +0,0 @@
|
||||
Okay, continuing in **SM Mode**.
|
||||
|
||||
Generating the stories for **Epic 2: HN Data Acquisition & Persistence**. This assumes that the foundational work from Epic 1 (project initialization, configuration setup via `src/utils/config.ts`, logger via `src/utils/logger.ts`, and output directory creation logic in `src/index.ts`) is complete and available for use.
|
||||
|
||||
---
|
||||
|
||||
## **Stories Report: Epic 2**
|
||||
|
||||
**File: ai/stories/2.1.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 2.1: Implement Algolia HN API Client
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want a dedicated client module to interact with the Algolia Hacker News Search API, so that fetching stories and comments is encapsulated, reusable, and uses the required native `fetch` API. [155]
|
||||
|
||||
**Context:** This story creates the primary interface for retrieving data from the external Hacker News API provided by Algolia. It encapsulates the specific API calls (`GET /search` for stories and comments) and data extraction logic into a reusable module (`src/clients/algoliaHNClient.ts`). This client will be used by the main pipeline (Story 2.2) and the stage testing utility (Story 2.4). It builds upon the logger created in Epic 1 (Story 1.4). [54, 60, 62, 77]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Create a new module: `src/clients/algoliaHNClient.ts`. [156]
|
||||
- Implement an async function `fetchTopStories` within the client: [157]
|
||||
- Use native `fetch` [749] to call the Algolia HN Search API endpoint for front-page stories (`http://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=10`). [4, 6, 7, 157] Adjust `hitsPerPage` if needed to ensure 10 stories.
|
||||
- Parse the JSON response. [158]
|
||||
- Extract required metadata for each story: `objectID` (use as `storyId`), `title`, `url` (use as `articleUrl`), `points`, `num_comments`. [159, 522] Handle potential missing `url` field gracefully (log warning using logger from Story 1.4, treat as null). [160]
|
||||
- Construct the `hnUrl` for each story (e.g., `https://news.ycombinator.com/item?id={storyId}`). [161]
|
||||
- Return an array of structured story objects (define a `Story` type, potentially in `src/types/hn.ts`). [162, 506-511]
|
||||
- Implement a separate async function `fetchCommentsForStory` within the client: [163]
|
||||
- Accept `storyId` (string) and `maxComments` limit (number) as arguments. [163]
|
||||
- Use native `fetch` to call the Algolia HN Search API endpoint for comments of a specific story (`http://hn.algolia.com/api/v1/search?tags=comment,story_{storyId}&hitsPerPage={maxComments}`). [12, 13, 14, 164]
|
||||
- Parse the JSON response. [165]
|
||||
- Extract required comment data: `objectID` (use as `commentId`), `comment_text`, `author`, `created_at`. [165, 524]
|
||||
- Filter out comments where `comment_text` is null or empty. Ensure only up to `maxComments` are returned. [166]
|
||||
- Return an array of structured comment objects (define a `Comment` type, potentially in `src/types/hn.ts`). [167, 500-505]
|
||||
- Implement basic error handling using `try...catch` around `fetch` calls and check `response.ok` status. [168] Log errors using the logger utility from Epic 1 (Story 1.4). [169]
|
||||
- Define TypeScript interfaces/types for the expected structures of API responses (subset needed) and the data returned by the client functions (`Story`, `Comment`). Place these in `src/types/hn.ts`. [169, 821]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: The module `src/clients/algoliaHNClient.ts` exists and exports `fetchTopStories` and `fetchCommentsForStory` functions. [170]
|
||||
- AC2: Calling `fetchTopStories` makes a network request to the correct Algolia endpoint (`search?tags=front_page&hitsPerPage=10`) and returns a promise resolving to an array of 10 `Story` objects containing the specified metadata (`storyId`, `title`, `articleUrl`, `hnUrl`, `points`, `num_comments`). [171]
|
||||
- AC3: Calling `fetchCommentsForStory` with a valid `storyId` and `maxComments` limit makes a network request to the correct Algolia endpoint (`search?tags=comment,story_{storyId}&hitsPerPage={maxComments}`) and returns a promise resolving to an array of `Comment` objects (up to `maxComments`), filtering out empty ones. [172]
|
||||
- AC4: Both functions use the native `fetch` API internally. [173]
|
||||
- AC5: Network errors or non-successful API responses (e.g., status 4xx, 5xx) are caught and logged using the logger from Story 1.4. [174] Functions should likely return an empty array or throw a specific error in failure cases for the caller to handle.
|
||||
- AC6: Relevant TypeScript types (`Story`, `Comment`) are defined in `src/types/hn.ts` and used within the client module. [175]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create: `src/clients/algoliaHNClient.ts`, `src/types/hn.ts`.
|
||||
- Files to Modify: Potentially `src/types/index.ts` if using a barrel file.
|
||||
- _(Hint: See `docs/project-structure.md` [817, 821] for location)._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851], Native `fetch` API [863].
|
||||
- Uses `logger` utility from Epic 1 (Story 1.4).
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905] for full list)._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- Algolia HN Search API `GET /search` endpoint. [2]
|
||||
- Base URL: `http://hn.algolia.com/api/v1` [3]
|
||||
- Parameters: `tags=front_page`, `hitsPerPage=10` (for stories) [6, 7]; `tags=comment,story_{storyId}`, `hitsPerPage={maxComments}` (for comments) [13, 14].
|
||||
- Check `response.ok` and parse JSON response (`response.json()`). [168, 158, 165]
|
||||
- Handle potential network errors with `try...catch`. [168]
|
||||
- No authentication required. [3]
|
||||
- _(Hint: See `docs/api-reference.md` [2-21] for details)._
|
||||
- **Data Structures:**
|
||||
- Define `Comment` interface: `{ commentId: string, commentText: string | null, author: string | null, createdAt: string }`. [501-505]
|
||||
- Define `Story` interface (initial fields): `{ storyId: string, title: string, articleUrl: string | null, hnUrl: string, points?: number, numComments?: number }`. [507-511]
|
||||
- (These types will be augmented in later stories [512-517]).
|
||||
- Reference Algolia response subset schemas in `docs/data-models.md` [521-525].
|
||||
- _(Hint: See `docs/data-models.md` for full details)._
|
||||
- **Environment Variables:**
|
||||
- No direct environment variables needed for this client itself (it uses a hardcoded base URL and receives the comment limit as an argument).
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638] for all variables)._
|
||||
- **Coding Standards Notes:**
|
||||
- Use `async/await` for `fetch` calls.
|
||||
- Use logger for errors and significant events (e.g., warning if `url` is missing). [160]
|
||||
- Export types and functions clearly.
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] Create `src/types/hn.ts` and define `Comment` and initial `Story` interfaces.
|
||||
- [ ] Create `src/clients/algoliaHNClient.ts`.
|
||||
- [ ] Import necessary types and the logger utility.
|
||||
- [ ] Implement `fetchTopStories` function:
|
||||
- [ ] Construct Algolia URL for top stories.
|
||||
- [ ] Use `fetch` with `try...catch`.
|
||||
- [ ] Check `response.ok`, log errors if not OK.
|
||||
- [ ] Parse JSON response.
|
||||
- [ ] Map `hits` to `Story` objects, extracting required fields, handling null `url`, constructing `hnUrl`.
|
||||
- [ ] Return array of `Story` objects (or handle error case).
|
||||
- [ ] Implement `fetchCommentsForStory` function:
|
||||
- [ ] Accept `storyId` and `maxComments` arguments.
|
||||
- [ ] Construct Algolia URL for comments using arguments.
|
||||
- [ ] Use `fetch` with `try...catch`.
|
||||
- [ ] Check `response.ok`, log errors if not OK.
|
||||
- [ ] Parse JSON response.
|
||||
- [ ] Map `hits` to `Comment` objects, extracting required fields.
|
||||
- [ ] Filter out comments with null/empty `comment_text`.
|
||||
- [ ] Limit results to `maxComments`.
|
||||
- [ ] Return array of `Comment` objects (or handle error case).
|
||||
- [ ] Export functions and types as needed.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** [915]
|
||||
- Write unit tests for `src/clients/algoliaHNClient.ts`. [919]
|
||||
- Mock the native `fetch` function (e.g., using `jest.spyOn(global, 'fetch')`). [918]
|
||||
- Test `fetchTopStories`: Provide mock successful responses (valid JSON matching Algolia structure [521-523]) and verify correct parsing, mapping to `Story` objects [171], and `hnUrl` construction. Test with missing `url` field. Test mock error responses (network error, non-OK status) and verify error logging [174] and return value.
|
||||
- Test `fetchCommentsForStory`: Provide mock successful responses [524-525] and verify correct parsing, mapping to `Comment` objects, filtering of empty comments, and limiting by `maxComments` [172]. Test mock error responses and verify logging [174].
|
||||
- Verify `fetch` was called with the correct URLs and parameters [171, 172].
|
||||
- **Integration Tests:** N/A for this client module itself, but it will be used in pipeline integration tests later. [921]
|
||||
- **Manual/CLI Verification:** Tested indirectly via Story 2.2 execution and directly via Story 2.4 stage runner. [912]
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Any notes about implementation choices, difficulties, or follow-up needed}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
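
One way the client from Story 2.1 might be structured is sketched below. It is not the project's actual implementation. The story leaves the failure behaviour open ("return an empty array or throw"); this sketch returns an empty array. `console` stands in for the Story 1.4 logger, and the `FetchLike` type, the `getHits` helper, and the injectable `fetchImpl` parameter are assumptions added so the functions can be exercised without network access.

```typescript
// Types that would normally live in src/types/hn.ts.
export interface Story {
  storyId: string;
  title: string;
  articleUrl: string | null;
  hnUrl: string;
  points?: number;
  numComments?: number;
}
export interface Comment {
  commentId: string;
  commentText: string | null;
  author: string | null;
  createdAt: string;
}

// Subset of the Algolia hit fields named in the story.
interface StoryHit { objectID: string; title: string; url?: string | null; points?: number; num_comments?: number }
interface CommentHit { objectID: string; comment_text?: string | null; author?: string | null; created_at: string }

// Assumption: a narrow fetch signature so a fake can be injected in tests.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const BASE_URL = "http://hn.algolia.com/api/v1";

// Requests a search URL and returns its hits, or [] on network or HTTP errors.
async function getHits<T>(url: string, fetchImpl: FetchLike): Promise<T[]> {
  try {
    const response = await fetchImpl(url);
    if (!response.ok) {
      console.error(`Algolia request failed (${response.status}): ${url}`);
      return [];
    }
    const body = (await response.json()) as { hits?: T[] };
    return body.hits ?? [];
  } catch (err) {
    console.error(`Algolia request error for ${url}: ${String(err)}`);
    return [];
  }
}

export async function fetchTopStories(fetchImpl: FetchLike = fetch): Promise<Story[]> {
  const hits = await getHits<StoryHit>(`${BASE_URL}/search?tags=front_page&hitsPerPage=10`, fetchImpl);
  return hits.map((hit) => {
    if (!hit.url) console.warn(`Story ${hit.objectID} has no url; using null.`);
    return {
      storyId: hit.objectID,
      title: hit.title,
      articleUrl: hit.url || null,
      hnUrl: `https://news.ycombinator.com/item?id=${hit.objectID}`,
      points: hit.points,
      numComments: hit.num_comments,
    };
  });
}

export async function fetchCommentsForStory(
  storyId: string,
  maxComments: number,
  fetchImpl: FetchLike = fetch,
): Promise<Comment[]> {
  const url = `${BASE_URL}/search?tags=comment,story_${storyId}&hitsPerPage=${maxComments}`;
  const hits = await getHits<CommentHit>(url, fetchImpl);
  return hits
    .filter((hit) => (hit.comment_text ?? "").trim() !== "") // drop null/empty comments
    .slice(0, maxComments)
    .map((hit) => ({
      commentId: hit.objectID,
      commentText: hit.comment_text ?? null,
      author: hit.author ?? null,
      createdAt: hit.created_at,
    }));
}
```

Passing a fake `fetchImpl` is an alternative to the `jest.spyOn(global, 'fetch')` approach mentioned in the testing requirements. Either works, and the default argument keeps production callers unchanged.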
|
||||
|
||||
---
|
||||
|
||||
**File: ai/stories/2.2.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 2.2: Integrate HN Data Fetching into Main Workflow
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want to integrate the HN data fetching logic into the main application workflow (`src/index.ts`), so that running the app retrieves the top 10 stories and their comments after completing the setup from Epic 1. [176]
|
||||
|
||||
**Context:** This story connects the HN API client created in Story 2.1 to the main application entry point (`src/index.ts`) established in Epic 1 (Story 1.3). It modifies the main execution flow to call the client functions (`fetchTopStories`, `fetchCommentsForStory`) after the initial setup (logger, config, output directory). It uses the `MAX_COMMENTS_PER_STORY` configuration value loaded in Story 1.2. The fetched data (stories and their associated comments) is held in memory at the end of this stage. [46, 77]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Modify the main execution flow in `src/index.ts` (or a main async function called by it, potentially moving logic to `src/core/pipeline.ts` as suggested by `ARCH` [46, 53] and `PS` [818]). **Recommendation:** Create `src/core/pipeline.ts` and a `runPipeline` async function, then call this function from `src/index.ts`.
|
||||
- Import the `algoliaHNClient` functions (`fetchTopStories`, `fetchCommentsForStory`) from Story 2.1. [177]
|
||||
- Import the configuration module (`src/utils/config.ts`) to access `MAX_COMMENTS_PER_STORY`. [177, 563] Also import the logger.
|
||||
- In the main pipeline function, after the Epic 1 setup (config load, logger init, output dir creation):
|
||||
- Call `await fetchTopStories()`. [178]
|
||||
- Log the number of stories fetched (e.g., "Fetched X stories."). [179] Use the logger from Story 1.4.
|
||||
- Retrieve the `MAX_COMMENTS_PER_STORY` value from the config module. Ensure it's parsed as a number. Provide a default if necessary (e.g., 50, matching `ENV` [564]).
|
||||
- Iterate through the array of fetched `Story` objects. [179]
|
||||
- For each `Story`:
|
||||
- Log progress (e.g., "Fetching up to Y comments for story {storyId}..."). [182]
|
||||
- Call `await fetchCommentsForStory()`, passing the `story.storyId` and the configured `MAX_COMMENTS_PER_STORY` value. [180]
|
||||
- Store the fetched comments (the returned `Comment[]`) within the corresponding `Story` object in memory (e.g., add a `comments: Comment[]` property to the `Story` type/object). [181] Augment the `Story` type definition in `src/types/hn.ts`. [512]
|
||||
- Ensure errors from the client functions are handled appropriately (e.g., log error and potentially skip comment fetching for that story).
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: Running `npm run dev` executes Epic 1 setup steps followed by fetching stories and then comments for each story using the `algoliaHNClient`. [183]
|
||||
- AC2: Logs (via logger) clearly show the start and successful completion of fetching stories, and the start of fetching comments for each of the 10 stories. [184]
|
||||
- AC3: The configured `MAX_COMMENTS_PER_STORY` value is read from config, parsed as a number, and used in the calls to `fetchCommentsForStory`. [185]
|
||||
- AC4: After successful execution (before persistence in Story 2.3), `Story` objects held in memory contain a `comments` property populated with an array of fetched `Comment` objects. [186] (Verification via debugger or temporary logging).
|
||||
- AC5: The `Story` type definition in `src/types/hn.ts` is updated to include the `comments: Comment[]` field. [512]
|
||||
- AC6: (If implemented) Core logic is moved to `src/core/pipeline.ts` and called from `src/index.ts`. [818]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create: `src/core/pipeline.ts` (recommended).
|
||||
- Files to Modify: `src/index.ts`, `src/types/hn.ts`.
|
||||
- _(Hint: See `docs/project-structure.md` [818, 821, 822])._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851].
|
||||
- Uses `algoliaHNClient` (Story 2.1), `config` (Story 1.2), `logger` (Story 1.4).
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905])._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- Calls internal `algoliaHNClient.fetchTopStories()` and `algoliaHNClient.fetchCommentsForStory()`.
|
||||
- **Data Structures:**
|
||||
- Augment `Story` interface in `src/types/hn.ts` to include `comments: Comment[]`. [512]
|
||||
- Manipulates arrays of `Story` and `Comment` objects in memory.
|
||||
- _(Hint: See `docs/data-models.md` [500-517])._
|
||||
- **Environment Variables:**
|
||||
- Reads `MAX_COMMENTS_PER_STORY` via `config.ts`. [177, 563]
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638])._
|
||||
- **Coding Standards Notes:**
|
||||
- Use `async/await` for calling client functions.
|
||||
- Structure fetching logic cleanly (e.g., within a loop).
|
||||
- Use the logger for progress and error reporting. [182, 184]
|
||||
- Consider putting the main loop logic inside the `runPipeline` function in `src/core/pipeline.ts`.
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] (Recommended) Create `src/core/pipeline.ts` and define an async `runPipeline` function.
|
||||
- [ ] Modify `src/index.ts` to import and call `runPipeline`. Move existing setup logic (logger init, config load, dir creation) into `runPipeline` or ensure it runs before it.
|
||||
- [ ] In `pipeline.ts` (or `index.ts`), import `fetchTopStories`, `fetchCommentsForStory` from `algoliaHNClient`.
|
||||
- [ ] Import `config` and `logger`.
|
||||
- [ ] Call `fetchTopStories` after initial setup. Log count.
|
||||
- [ ] Retrieve `MAX_COMMENTS_PER_STORY` from `config`, ensuring it's a number.
|
||||
- [ ] Update `Story` type in `src/types/hn.ts` to include `comments: Comment[]`.
|
||||
- [ ] Loop through the fetched stories:
|
||||
- [ ] Log comment fetching start for the story ID.
|
||||
- [ ] Call `fetchCommentsForStory` with `storyId` and `maxComments`.
|
||||
- [ ] Handle potential errors from the client function call.
|
||||
- [ ] Assign the returned comments array to the `comments` property of the current story object.
|
||||
- [ ] Add temporary logging or use debugger to verify stories in memory contain comments (AC4).
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** [915]
|
||||
- If logic is moved to `src/core/pipeline.ts`, unit test `runPipeline`. [916]
|
||||
- Mock `algoliaHNClient` functions (`fetchTopStories`, `fetchCommentsForStory`). [918]
|
||||
- Mock `config` to provide `MAX_COMMENTS_PER_STORY`.
|
||||
- Mock `logger`.
|
||||
- Verify `fetchTopStories` is called once.
|
||||
- Verify `fetchCommentsForStory` is called for each story returned by the mocked `fetchTopStories`, and that it receives the correct `storyId` and `maxComments` value from config [185].
|
||||
- Verify the results from mocked `fetchCommentsForStory` are correctly assigned to the `comments` property of the story objects.
|
||||
- **Integration Tests:**
|
||||
- Could have an integration test for the fetch stage that uses the real `algoliaHNClient` (or a lightly mocked version checking calls) and verifies the in-memory data structure, but this is largely covered by the stage runner (Story 2.4). [921]
|
||||
- **Manual/CLI Verification:**
|
||||
- Run `npm run dev`.
|
||||
- Check logs for fetching stories and comments messages [184].
|
||||
- Use debugger or temporary `console.log` in the pipeline code to inspect a story object after the loop and confirm its `comments` property is populated [186].
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Logic moved to src/core/pipeline.ts. Verified in-memory data structure.}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
|
||||
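As a rough illustration of the Story 2.2 fetch loop, the sketch below wires a client into a loop that attaches comments to each story in memory. It is a minimal sketch under stated assumptions, not the project's actual `runPipeline`: the `HNClient` interface, `parseMaxComments`, and `fetchStoriesWithComments` are hypothetical names, the client is injected so the loop can run against mocks, and the story's `Comment` type is called `HNComment` here only to avoid clashing with the DOM's global `Comment` type in a standalone snippet.

```typescript
// Hypothetical shapes mirroring the Story/Comment types described in the story.
interface HNComment {
  commentId: string;
  commentText: string | null;
  author: string | null;
  createdAt: string;
}

interface Story {
  storyId: string;
  title: string;
  articleUrl: string | null;
  hnUrl: string;
  points?: number;
  numComments?: number;
  comments?: HNComment[]; // added in Story 2.2
}

// Assumed client surface; the real module would be src/clients/algoliaHNClient.ts.
interface HNClient {
  fetchTopStories(): Promise<Story[]>;
  fetchCommentsForStory(storyId: string, maxComments: number): Promise<HNComment[]>;
}

// Parse MAX_COMMENTS_PER_STORY, falling back to a default when unset or invalid.
function parseMaxComments(raw: string | undefined, fallback = 50): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

async function fetchStoriesWithComments(
  client: HNClient,
  rawMaxComments: string | undefined,
  log: (message: string) => void = console.log,
): Promise<Story[]> {
  const maxComments = parseMaxComments(rawMaxComments);
  const stories = await client.fetchTopStories();
  log(`Fetched ${stories.length} stories.`);

  for (const story of stories) {
    log(`Fetching up to ${maxComments} comments for story ${story.storyId}...`);
    try {
      story.comments = await client.fetchCommentsForStory(story.storyId, maxComments);
    } catch (error) {
      // Log and continue so one failing story does not abort the whole run.
      log(`Failed to fetch comments for story ${story.storyId}: ${String(error)}`);
      story.comments = [];
    }
  }
  return stories;
}
```

Injecting the client and the log function is a design choice made for this sketch: it lets the unit tests listed in the story verify call counts and arguments without any network access.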
|
||||
---
|
||||
|
||||
**File: ai/stories/2.3.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 2.3: Persist Fetched HN Data Locally
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want to save the fetched HN stories (including their comments) to JSON files in the date-stamped output directory, so that the raw data is persisted locally for subsequent pipeline stages and debugging. [187]
|
||||
|
||||
**Context:** This story follows Story 2.2 where HN data (stories with comments) was fetched and stored in memory. Now, this data needs to be saved to the local filesystem. It uses the date-stamped output directory created in Epic 1 (Story 1.4) and writes one JSON file per story, containing the story metadata and its comments. This persisted data (`{storyId}_data.json`) is the input for subsequent stages (Scraping - Epic 3, Summarization - Epic 4, Email Assembly - Epic 5). [48, 734, 735]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Define a consistent JSON structure for the output file content. [188] Example from `docs/data-models.md` [539]: `{ storyId: "...", title: "...", articleUrl: "...", hnUrl: "...", points: ..., numComments: ..., fetchedAt: "ISO_TIMESTAMP", comments: [{ commentId: "...", commentText: "...", author: "...", createdAt: "...", ... }, ...] }`. Include a timestamp (`fetchedAt`) for when the data was fetched/saved. [190]
|
||||
- Import Node.js `fs` (specifically `writeFileSync`) and `path` modules in the pipeline module (`src/core/pipeline.ts` or `src/index.ts`). [190] Import `date-fns` or use `new Date().toISOString()` for timestamp.
|
||||
- In the main workflow (`pipeline.ts`), within the loop iterating through stories (immediately after comments have been fetched and added to the story object in Story 2.2): [191]
|
||||
- Get the full path to the date-stamped output directory (this path should be determined/passed from the initial setup logic from Story 1.4). [191]
|
||||
- Generate the current timestamp in ISO 8601 format (e.g., `new Date().toISOString()`) and add it to the story object as `fetchedAt`. [190] Update `Story` type in `src/types/hn.ts`. [516]
|
||||
- Construct the filename for the story's data: `{storyId}_data.json`. [192]
|
||||
- Construct the full file path using `path.join()`. [193]
|
||||
- Prepare the data object to be saved, matching the defined JSON structure (including `storyId`, `title`, `articleUrl`, `hnUrl`, `points`, `numComments`, `fetchedAt`, `comments`).
|
||||
- Serialize the prepared story data object to a JSON string using `JSON.stringify(storyData, null, 2)` for readability. [194]
|
||||
- Write the JSON string to the file using `fs.writeFileSync()`. Use a `try...catch` block for error handling around the file write. [195]
|
||||
- Log (using the logger) the successful persistence of each story's data file or any errors encountered during file writing. [196]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory (e.g., `./output/YYYY-MM-DD/`) contains exactly 10 files named `{storyId}_data.json` (assuming 10 stories were fetched successfully). [197]
|
||||
- AC2: Each JSON file contains valid JSON representing a single story object, including its metadata (`storyId`, `title`, `articleUrl`, `hnUrl`, `points`, `numComments`), a `fetchedAt` ISO timestamp, and an array of its fetched `comments`, matching the structure defined in `docs/data-models.md` [538-540]. [198]
|
||||
- AC3: The number of comments in each file's `comments` array does not exceed `MAX_COMMENTS_PER_STORY`. [199]
|
||||
- AC4: Logs indicate that saving data to a file was attempted for each story, reporting success or specific file writing errors. [200]
|
||||
- AC5: The `Story` type definition in `src/types/hn.ts` is updated to include the `fetchedAt: string` field. [516]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Modify: `src/core/pipeline.ts` (or `src/index.ts`), `src/types/hn.ts`.
|
||||
- _(Hint: See `docs/project-structure.md` [818, 821, 822])._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851].
|
||||
- Native `fs` module (`writeFileSync`) [190].
|
||||
- Native `path` module (`join`) [193].
|
||||
- `JSON.stringify` [194].
|
||||
- Uses `logger` (Story 1.4).
|
||||
- Uses output directory path created in Story 1.4 logic.
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905])._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- `fs.writeFileSync(filePath, jsonDataString, 'utf-8')`. [195]
|
||||
- **Data Structures:**
|
||||
- Uses `Story` and `Comment` types from `src/types/hn.ts`.
|
||||
- Augment `Story` type to include `fetchedAt: string`. [516]
|
||||
- Creates JSON structure matching `{storyId}_data.json` schema in `docs/data-models.md`. [538-540]
|
||||
- _(Hint: See `docs/data-models.md`)._
|
||||
- **Environment Variables:**
|
||||
- N/A directly, but relies on `OUTPUT_DIR_PATH` being available from config (Story 1.2) used by the directory creation logic (Story 1.4).
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638])._
|
||||
- **Coding Standards Notes:**
|
||||
- Use `try...catch` for `writeFileSync` calls. [195]
|
||||
- Use `JSON.stringify` with indentation (`null, 2`) for readability. [194]
|
||||
- Log success/failure clearly using the logger. [196]
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] In `pipeline.ts` (or `index.ts`), import `fs` and `path`.
|
||||
- [ ] Update `Story` type in `src/types/hn.ts` to include `fetchedAt: string`.
|
||||
- [ ] Ensure the full path to the date-stamped output directory is available within the story processing loop.
|
||||
- [ ] Inside the loop (after comments are fetched for a story):
|
||||
- [ ] Get the current ISO timestamp (`new Date().toISOString()`).
|
||||
- [ ] Add the timestamp to the story object as `fetchedAt`.
|
||||
- [ ] Construct the output filename: `{storyId}_data.json`.
|
||||
- [ ] Construct the full file path using `path.join(outputDirPath, filename)`.
|
||||
- [ ] Create the data object matching the specified JSON structure, including comments.
|
||||
- [ ] Serialize the data object using `JSON.stringify(data, null, 2)`.
|
||||
- [ ] Use `try...catch` block:
|
||||
- [ ] Inside `try`: Call `fs.writeFileSync(fullPath, jsonString, 'utf-8')`.
|
||||
- [ ] Inside `try`: Log success message with filename.
|
||||
- [ ] Inside `catch`: Log file writing error with filename.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** [915]
|
||||
- Testing file system interactions directly in unit tests can be brittle. [918]
|
||||
- Focus unit tests on the data preparation logic: ensure the object created before `JSON.stringify` has the correct structure (`storyId`, `title`, `articleUrl`, `hnUrl`, `points`, `numComments`, `fetchedAt`, `comments`) based on a sample input `Story` object. [920]
|
||||
- Verify the `fetchedAt` timestamp is added correctly.
|
||||
- **Integration Tests:** [921]
|
||||
- Could test the file writing aspect using `mock-fs` or actual file system writes within a temporary directory (created during setup, removed during teardown). [924]
|
||||
- Verify that the correct filename is generated and the content written to the mock/temporary file matches the expected JSON structure [538-540] and content.
|
||||
- **Manual/CLI Verification:** [912]
|
||||
- Run `npm run dev`.
|
||||
- Inspect the `output/YYYY-MM-DD/` directory (use current date).
|
||||
- Verify 10 files named `{storyId}_data.json` exist (AC1).
|
||||
- Open a few files, visually inspect the JSON structure, check for all required fields (metadata, `fetchedAt`, `comments` array), and verify comment count <= `MAX_COMMENTS_PER_STORY` (AC2, AC3).
|
||||
- Check console logs for success messages for file writing or any errors (AC4).
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Files saved successfully in ./output/YYYY-MM-DD/ directory.}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
|
||||
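The persistence steps in Story 2.3 could be sketched roughly as follows. This is a minimal sketch under assumptions: `buildStoryData` and `persistStory` are hypothetical helper names, the output directory is assumed to already exist (Story 1.4), and `HNComment` stands in for the story's `Comment` type. Separating data preparation from the file write follows the story's testing guidance, since the prepared object can then be unit tested without touching the file system.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical shapes mirroring the types described in src/types/hn.ts.
interface HNComment {
  commentId: string;
  commentText: string | null;
  author: string | null;
  createdAt: string;
}

interface Story {
  storyId: string;
  title: string;
  articleUrl: string | null;
  hnUrl: string;
  points?: number;
  numComments?: number;
  comments: HNComment[];
  fetchedAt?: string; // added in Story 2.3
}

// Build the object that gets serialized; `now` is injectable for deterministic tests.
function buildStoryData(story: Story, now: Date = new Date()) {
  return {
    storyId: story.storyId,
    title: story.title,
    articleUrl: story.articleUrl,
    hnUrl: story.hnUrl,
    points: story.points,
    numComments: story.numComments,
    fetchedAt: now.toISOString(),
    comments: story.comments,
  };
}

// Write `{storyId}_data.json` into the output directory; returns true on success.
function persistStory(
  story: Story,
  outputDirPath: string,
  log: (message: string) => void = console.log,
): boolean {
  const filename = `${story.storyId}_data.json`;
  const fullPath = path.join(outputDirPath, filename);
  const jsonString = JSON.stringify(buildStoryData(story), null, 2);
  try {
    fs.writeFileSync(fullPath, jsonString, "utf-8");
    log(`Saved ${filename}`);
    return true;
  } catch (error) {
    log(`Failed to write ${filename}: ${String(error)}`);
    return false;
  }
}
```

Returning a boolean rather than rethrowing is one possible choice; it lets the loop continue with the remaining stories while the log records which writes failed (AC4).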
|
||||
---
|
||||
|
||||
**File: ai/stories/2.4.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 2.4: Implement Stage Testing Utility for HN Fetching
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want a separate, executable script that _only_ performs the HN data fetching and persistence, so I can test and trigger this stage independently of the full pipeline. [201]
|
||||
|
||||
**Context:** This story addresses the PRD requirement [736] for stage-specific testing utilities [764]. It creates a standalone Node.js script (`src/stages/fetch_hn_data.ts`) that replicates the core logic of Stories 2.1, 2.2 (partially), and 2.3. This script will initialize necessary components (logger, config), call the `algoliaHNClient` to fetch stories and comments, and persist the results to the date-stamped output directory, just like the main pipeline does up to this point. This allows isolated testing of the Algolia API interaction and data persistence without running subsequent scraping, summarization, or emailing stages. [57, 62, 912]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Create a new standalone script file: `src/stages/fetch_hn_data.ts`. [202]
|
||||
- This script should perform the essential setup required _for this stage_:
|
||||
- Initialize the logger utility (from Story 1.4). [203]
|
||||
- Load configuration using the config utility (from Story 1.2) to get `MAX_COMMENTS_PER_STORY` and `OUTPUT_DIR_PATH`. [203]
|
||||
- Determine the current date ('YYYY-MM-DD') using the utility from Story 1.4. [203]
|
||||
- Construct the date-stamped output directory path. [203]
|
||||
- Ensure the output directory exists (create it recursively if not, reusing logic/utility from Story 1.4). [203]
|
||||
- The script should then execute the core logic of fetching and persistence:
|
||||
- Import and use `algoliaHNClient.fetchTopStories` and `algoliaHNClient.fetchCommentsForStory` (from Story 2.1). [204]
|
||||
- Import `fs` and `path`.
|
||||
- Replicate the fetch loop logic from Story 2.2 (fetch stories, then loop to fetch comments for each using loaded `MAX_COMMENTS_PER_STORY` limit). [204]
|
||||
- Replicate the persistence logic from Story 2.3 (add `fetchedAt` timestamp, prepare data object, `JSON.stringify`, `fs.writeFileSync` to `{storyId}_data.json` in the date-stamped directory). [204]
|
||||
- The script should log its progress (e.g., "Starting HN data fetch stage...", "Fetching stories...", "Fetching comments for story X...", "Saving data for story X...") using the logger utility. [205]
|
||||
- Add a new script command to `package.json` under `"scripts"`: `"stage:fetch": "ts-node src/stages/fetch_hn_data.ts"`. [206]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: The file `src/stages/fetch_hn_data.ts` exists. [207]
|
||||
- AC2: The script `stage:fetch` is defined in `package.json`'s `scripts` section. [208]
|
||||
- AC3: Running `npm run stage:fetch` executes successfully, performing only the setup (logger, config, output dir), fetch (stories, comments), and persist steps (to JSON files). [209]
|
||||
- AC4: Running `npm run stage:fetch` creates the same 10 `{storyId}_data.json` files in the correct date-stamped output directory as running the main `npm run dev` command (up to the end of Epic 2 functionality). [210]
|
||||
- AC5: Logs generated by `npm run stage:fetch` reflect only the fetching and persisting steps, not subsequent pipeline stages (scraping, summarizing, emailing). [211]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create: `src/stages/fetch_hn_data.ts`.
|
||||
- Files to Modify: `package.json`.
|
||||
- _(Hint: See `docs/project-structure.md` [820] for stage runner location)._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851], `ts-node` (via `npm run` script).
|
||||
- Uses `logger` (Story 1.4), `config` (Story 1.2), date util (Story 1.4), directory creation logic (Story 1.4), `algoliaHNClient` (Story 2.1), `fs`/`path` (Story 2.3).
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905])._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- Calls internal `algoliaHNClient` functions.
|
||||
- Uses `fs.writeFileSync`.
|
||||
- **Data Structures:**
|
||||
- Uses `Story`, `Comment` types.
|
||||
- Generates `{storyId}_data.json` files [538-540].
|
||||
- _(Hint: See `docs/data-models.md`)._
|
||||
- **Environment Variables:**
|
||||
- Reads `MAX_COMMENTS_PER_STORY` and `OUTPUT_DIR_PATH` via `config.ts`.
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638])._
|
||||
- **Coding Standards Notes:**
|
||||
- Structure the script clearly (setup, fetch, persist).
|
||||
- Use `async/await`.
|
||||
- Use logger extensively for progress indication. [205]
|
||||
- Consider wrapping the main logic in an `async` IIFE (Immediately Invoked Function Expression) or a main function call.
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] Create `src/stages/fetch_hn_data.ts`.
|
||||
- [ ] Add imports for logger, config, date util, `algoliaHNClient`, `fs`, `path`.
|
||||
- [ ] Implement setup logic: initialize logger, load config, get output dir path, ensure directory exists.
|
||||
- [ ] Implement main fetch logic:
|
||||
- [ ] Call `fetchTopStories`.
|
||||
- [ ] Get `MAX_COMMENTS_PER_STORY` from config.
|
||||
- [ ] Loop through stories:
|
||||
- [ ] Call `fetchCommentsForStory`.
|
||||
- [ ] Add comments to story object.
|
||||
- [ ] Add `fetchedAt` timestamp.
|
||||
- [ ] Prepare data object for saving.
|
||||
- [ ] Construct full file path for `{storyId}_data.json`.
|
||||
- [ ] Serialize and write to file using `fs.writeFileSync` within `try...catch`.
|
||||
- [ ] Log progress/success/errors.
|
||||
- [ ] Add script `"stage:fetch": "ts-node src/stages/fetch_hn_data.ts"` to `package.json`.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** Unit tests for the underlying components (logger, config, client, utils) should already exist from previous stories. Unit testing the stage script itself might have limited value beyond checking basic setup calls if the core logic is just orchestrating tested components. [915]
|
||||
- **Integration Tests:** N/A specifically for the script, as it _is_ an integration test itself. [921]
|
||||
- **Manual/CLI Verification (Primary Test Method for this Story):** [912, 927]
|
||||
- Run `npm run stage:fetch`. [209]
|
||||
- Verify successful execution without errors.
|
||||
- Check console logs for messages specific to fetching and persisting [211].
|
||||
- Inspect the `output/YYYY-MM-DD/` directory and verify the content of the generated `{storyId}_data.json` files match expectations (similar to verification for Story 2.3) [210].
|
||||
- Confirm that `package.json` defines the `stage:fetch` script (AC2).
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] which identifies Stage Runners as a key part of Acceptance Testing)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Stage runner script created and tested successfully. package.json updated.}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
|
||||
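The setup portion of the Story 2.4 stage runner might look something like the sketch below. It is only an outline under assumptions: `formatDateStamp`, `ensureOutputDir`, `runFetchStage`, and the `StageStep` callback are hypothetical names, the date is formatted in UTC (the Story 1.4 date utility may behave differently), and the actual fetch-and-persist logic is passed in as a function rather than implemented here.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Format a date as YYYY-MM-DD in UTC; the project's own date utility may differ.
function formatDateStamp(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Build the date-stamped directory path and create it recursively if missing.
function ensureOutputDir(baseDir: string, date: Date = new Date()): string {
  const dir = path.join(baseDir, formatDateStamp(date));
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}

// Hypothetical callback: fetches stories and comments, writes the JSON files,
// and resolves with the number of files saved.
type StageStep = (outputDir: string) => Promise<number>;

async function runFetchStage(
  baseDir: string,
  fetchAndPersist: StageStep,
  log: (message: string) => void = console.log,
): Promise<number> {
  log("Starting HN data fetch stage...");
  const outputDir = ensureOutputDir(baseDir);
  const saved = await fetchAndPersist(outputDir);
  log(`HN data fetch stage complete: ${saved} file(s) saved to ${outputDir}`);
  return saved;
}

// In src/stages/fetch_hn_data.ts the entry point could then be a single call,
// for example (config and logger names are assumptions):
//   runFetchStage(config.OUTPUT_DIR_PATH, fetchAndPersist)
//     .catch((error) => { logger.error(String(error)); process.exitCode = 1; });
```

Keeping the entry point to one call with a `.catch` handler is an alternative to the async IIFE mentioned in the coding standards notes; either keeps unhandled rejections from passing silently.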
|
||||
---
|
||||
|
||||
## **End of Report for Epic 2**
|
||||
|
||||
Okay, continuing in **SM Mode**.
|
||||
|
||||
Generating the stories for **Epic 2: HN Data Acquisition & Persistence**. This assumes that the foundational work from Epic 1 (project initialization, configuration setup via `src/utils/config.ts`, logger via `src/utils/logger.ts`, and output directory creation logic in `src/index.ts`) is complete and available for use.
|
||||
|
||||
---
|
||||
|
||||
## **Stories Report: Epic 2**
|
||||
|
||||
**File: ai/stories/2.1.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 2.1: Implement Algolia HN API Client
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want a dedicated client module to interact with the Algolia Hacker News Search API, so that fetching stories and comments is encapsulated, reusable, and uses the required native `fetch` API. [155]
|
||||
|
||||
**Context:** This story creates the primary interface for retrieving data from the external Hacker News API provided by Algolia. It encapsulates the specific API calls (`GET /search` for stories and comments) and data extraction logic into a reusable module (`src/clients/algoliaHNClient.ts`). This client will be used by the main pipeline (Story 2.2) and the stage testing utility (Story 2.4). It builds upon the logger created in Epic 1 (Story 1.4). [54, 60, 62, 77]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Create a new module: `src/clients/algoliaHNClient.ts`. [156]
|
||||
- Implement an async function `fetchTopStories` within the client: [157]
|
||||
- Use native `fetch` [749] to call the Algolia HN Search API endpoint for front-page stories (`http://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=10`). [4, 6, 7, 157] Adjust `hitsPerPage` if needed to ensure 10 stories.
|
||||
- Parse the JSON response. [158]
|
||||
- Extract required metadata for each story: `objectID` (use as `storyId`), `title`, `url` (use as `articleUrl`), `points`, `num_comments`. [159, 522] Handle potential missing `url` field gracefully (log warning using logger from Story 1.4, treat as null). [160]
|
||||
- Construct the `hnUrl` for each story (e.g., `https://news.ycombinator.com/item?id={storyId}`). [161]
|
||||
- Return an array of structured story objects (define a `Story` type, potentially in `src/types/hn.ts`). [162, 506-511]
|
||||
- Implement a separate async function `fetchCommentsForStory` within the client: [163]
|
||||
- Accept `storyId` (string) and `maxComments` limit (number) as arguments. [163]
|
||||
- Use native `fetch` to call the Algolia HN Search API endpoint for comments of a specific story (`http://hn.algolia.com/api/v1/search?tags=comment,story_{storyId}&hitsPerPage={maxComments}`). [12, 13, 14, 164]
|
||||
- Parse the JSON response. [165]
|
||||
- Extract required comment data: `objectID` (use as `commentId`), `comment_text`, `author`, `created_at`. [165, 524]
|
||||
- Filter out comments where `comment_text` is null or empty. Ensure only up to `maxComments` are returned. [166]
|
||||
- Return an array of structured comment objects (define a `Comment` type, potentially in `src/types/hn.ts`). [167, 500-505]
|
||||
- Implement basic error handling using `try...catch` around `fetch` calls and check `response.ok` status. [168] Log errors using the logger utility from Epic 1 (Story 1.4). [169]
|
||||
- Define TypeScript interfaces/types for the expected structures of API responses (subset needed) and the data returned by the client functions (`Story`, `Comment`). Place these in `src/types/hn.ts`. [169, 821]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: The module `src/clients/algoliaHNClient.ts` exists and exports `fetchTopStories` and `fetchCommentsForStory` functions. [170]
|
||||
- AC2: Calling `fetchTopStories` makes a network request to the correct Algolia endpoint (`search?tags=front_page&hitsPerPage=10`) and returns a promise resolving to an array of 10 `Story` objects containing the specified metadata (`storyId`, `title`, `articleUrl`, `hnUrl`, `points`, `numComments`). [171]
|
||||
- AC3: Calling `fetchCommentsForStory` with a valid `storyId` and `maxComments` limit makes a network request to the correct Algolia endpoint (`search?tags=comment,story_{storyId}&hitsPerPage={maxComments}`) and returns a promise resolving to an array of `Comment` objects (up to `maxComments`), filtering out empty ones. [172]
|
||||
- AC4: Both functions use the native `fetch` API internally. [173]
|
||||
- AC5: Network errors or non-successful API responses (e.g., status 4xx, 5xx) are caught and logged using the logger from Story 1.4. [174] Functions should likely return an empty array or throw a specific error in failure cases for the caller to handle.
|
||||
- AC6: Relevant TypeScript types (`Story`, `Comment`) are defined in `src/types/hn.ts` and used within the client module. [175]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create: `src/clients/algoliaHNClient.ts`, `src/types/hn.ts`.
|
||||
- Files to Modify: Potentially `src/types/index.ts` if using a barrel file.
|
||||
- _(Hint: See `docs/project-structure.md` [817, 821] for location)._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851], Native `fetch` API [863].
|
||||
- Uses `logger` utility from Epic 1 (Story 1.4).
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905] for full list)._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- Algolia HN Search API `GET /search` endpoint. [2]
|
||||
- Base URL: `http://hn.algolia.com/api/v1` [3]
|
||||
- Parameters: `tags=front_page`, `hitsPerPage=10` (for stories) [6, 7]; `tags=comment,story_{storyId}`, `hitsPerPage={maxComments}` (for comments) [13, 14].
|
||||
- Check `response.ok` and parse JSON response (`response.json()`). [168, 158, 165]
|
||||
- Handle potential network errors with `try...catch`. [168]
|
||||
- No authentication required. [3]
|
||||
- _(Hint: See `docs/api-reference.md` [2-21] for details)._
|
||||
- **Data Structures:**
|
||||
- Define `Comment` interface: `{ commentId: string, commentText: string | null, author: string | null, createdAt: string }`. [501-505]
|
||||
- Define `Story` interface (initial fields): `{ storyId: string, title: string, articleUrl: string | null, hnUrl: string, points?: number, numComments?: number }`. [507-511]
|
||||
- (These types will be augmented in later stories [512-517]).
|
||||
- Reference Algolia response subset schemas in `docs/data-models.md` [521-525].
|
||||
- _(Hint: See `docs/data-models.md` for full details)._
|
||||
- **Environment Variables:**
|
||||
- No direct environment variables needed for this client itself (it uses a hardcoded base URL and receives the comment limit as an argument).
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638] for all variables)._
|
||||
- **Coding Standards Notes:**
|
||||
- Use `async/await` for `fetch` calls.
|
||||
- Use logger for errors and significant events (e.g., warning if `url` is missing). [160]
|
||||
- Export types and functions clearly.
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] Create `src/types/hn.ts` and define `Comment` and initial `Story` interfaces.
|
||||
- [ ] Create `src/clients/algoliaHNClient.ts`.
|
||||
- [ ] Import necessary types and the logger utility.
|
||||
- [ ] Implement `fetchTopStories` function:
|
||||
- [ ] Construct Algolia URL for top stories.
|
||||
- [ ] Use `fetch` with `try...catch`.
|
||||
- [ ] Check `response.ok`, log errors if not OK.
|
||||
- [ ] Parse JSON response.
|
||||
- [ ] Map `hits` to `Story` objects, extracting required fields, handling null `url`, constructing `hnUrl`.
|
||||
- [ ] Return array of `Story` objects (or handle error case).
|
||||
- [ ] Implement `fetchCommentsForStory` function:
|
||||
- [ ] Accept `storyId` and `maxComments` arguments.
|
||||
- [ ] Construct Algolia URL for comments using arguments.
|
||||
- [ ] Use `fetch` with `try...catch`.
|
||||
- [ ] Check `response.ok`, log errors if not OK.
|
||||
- [ ] Parse JSON response.
|
||||
- [ ] Map `hits` to `Comment` objects, extracting required fields.
|
||||
- [ ] Filter out comments with null/empty `comment_text`.
|
||||
- [ ] Limit results to `maxComments`.
|
||||
- [ ] Return array of `Comment` objects (or handle error case).
|
||||
- [ ] Export functions and types as needed.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** [915]
|
||||
- Write unit tests for `src/clients/algoliaHNClient.ts`. [919]
|
||||
- Mock the native `fetch` function (e.g., using `jest.spyOn(global, 'fetch')`). [918]
|
||||
- Test `fetchTopStories`: Provide mock successful responses (valid JSON matching Algolia structure [521-523]) and verify correct parsing, mapping to `Story` objects [171], and `hnUrl` construction. Test with missing `url` field. Test mock error responses (network error, non-OK status) and verify error logging [174] and return value.
|
||||
- Test `fetchCommentsForStory`: Provide mock successful responses [524-525] and verify correct parsing, mapping to `Comment` objects, filtering of empty comments, and limiting by `maxComments` [172]. Test mock error responses and verify logging [174].
|
||||
- Verify `fetch` was called with the correct URLs and parameters [171, 172].
|
||||
- **Integration Tests:** N/A for this client module itself, but it will be used in pipeline integration tests later. [921]
|
||||
- **Manual/CLI Verification:** Tested indirectly via Story 2.2 execution and directly via Story 2.4 stage runner. [912]
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Any notes about implementation choices, difficulties, or follow-up needed}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
|
||||
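To make the Story 2.1 mapping and error-handling requirements more concrete, here is a minimal sketch of the client. It rests on several assumptions: the hit interfaces cover only the fields named in the story, `mapHitToStory`, `mapHitsToComments`, and `FetchLike` are hypothetical helpers, `console.error` stands in for the project logger, failures return an empty array (the story also allows throwing instead), and `HNComment` stands in for the story's `Comment` type to avoid clashing with the DOM's global `Comment` in a standalone snippet.

```typescript
interface HNComment {
  commentId: string;
  commentText: string | null;
  author: string | null;
  createdAt: string;
}

interface Story {
  storyId: string;
  title: string;
  articleUrl: string | null;
  hnUrl: string;
  points?: number;
  numComments?: number;
}

// Subset of the Algolia response fields named in the story.
interface AlgoliaStoryHit {
  objectID: string;
  title: string;
  url?: string | null;
  points?: number;
  num_comments?: number;
}

interface AlgoliaCommentHit {
  objectID: string;
  comment_text?: string | null;
  author?: string | null;
  created_at: string;
}

function mapHitToStory(hit: AlgoliaStoryHit): Story {
  return {
    storyId: hit.objectID,
    title: hit.title,
    articleUrl: hit.url ?? null, // a missing url is treated as null
    hnUrl: `https://news.ycombinator.com/item?id=${hit.objectID}`,
    points: hit.points,
    numComments: hit.num_comments,
  };
}

// Drop comments with null/empty text, then cap the result at maxComments.
function mapHitsToComments(hits: AlgoliaCommentHit[], maxComments: number): HNComment[] {
  return hits
    .filter((hit) => typeof hit.comment_text === "string" && hit.comment_text.trim() !== "")
    .slice(0, maxComments)
    .map((hit) => ({
      commentId: hit.objectID,
      commentText: hit.comment_text ?? null,
      author: hit.author ?? null,
      createdAt: hit.created_at,
    }));
}

// Minimal fetch shape so tests can pass a stub instead of the native fetch.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const BASE_URL = "http://hn.algolia.com/api/v1";

async function fetchTopStories(fetchImpl: FetchLike = fetch): Promise<Story[]> {
  try {
    const response = await fetchImpl(`${BASE_URL}/search?tags=front_page&hitsPerPage=10`);
    if (!response.ok) {
      console.error(`Algolia request failed with status ${response.status}`);
      return [];
    }
    const body = (await response.json()) as { hits: AlgoliaStoryHit[] };
    return body.hits.map(mapHitToStory);
  } catch (error) {
    console.error(`Network error while fetching top stories: ${String(error)}`);
    return [];
  }
}
```

A `fetchCommentsForStory(storyId, maxComments)` function would follow the same pattern, building the `tags=comment,story_{storyId}` URL and passing the hits through `mapHitsToComments`.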
|
||||
---
|
||||
|
||||
**File: ai/stories/2.2.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 2.2: Integrate HN Data Fetching into Main Workflow
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want to integrate the HN data fetching logic into the main application workflow (`src/index.ts`), so that running the app retrieves the top 10 stories and their comments after completing the setup from Epic 1. [176]
|
||||
|
||||
**Context:** This story connects the HN API client created in Story 2.1 to the main application entry point (`src/index.ts`) established in Epic 1 (Story 1.3). It modifies the main execution flow to call the client functions (`fetchTopStories`, `fetchCommentsForStory`) after the initial setup (logger, config, output directory). It uses the `MAX_COMMENTS_PER_STORY` configuration value loaded in Story 1.2. The fetched data (stories and their associated comments) is held in memory at the end of this stage. [46, 77]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Modify the main execution flow in `src/index.ts` (or a main async function called by it, potentially moving logic to `src/core/pipeline.ts` as suggested by `ARCH` [46, 53] and `PS` [818]). **Recommendation:** Create `src/core/pipeline.ts` and a `runPipeline` async function, then call this function from `src/index.ts`.
|
||||
- Import the `algoliaHNClient` functions (`fetchTopStories`, `fetchCommentsForStory`) from Story 2.1. [177]
|
||||
- Import the configuration module (`src/utils/config.ts`) to access `MAX_COMMENTS_PER_STORY`. [177, 563] Also import the logger.
|
||||
- In the main pipeline function, after the Epic 1 setup (config load, logger init, output dir creation):
|
||||
- Call `await fetchTopStories()`. [178]
|
||||
- Log the number of stories fetched (e.g., "Fetched X stories."). [179] Use the logger from Story 1.4.
|
||||
- Retrieve the `MAX_COMMENTS_PER_STORY` value from the config module. Ensure it's parsed as a number. Provide a default if necessary (e.g., 50, matching `ENV` [564]).
|
||||
- Iterate through the array of fetched `Story` objects. [179]
|
||||
- For each `Story`:
|
||||
- Log progress (e.g., "Fetching up to Y comments for story {storyId}..."). [182]
|
||||
- Call `await fetchCommentsForStory()`, passing the `story.storyId` and the configured `MAX_COMMENTS_PER_STORY` value. [180]
|
||||
- Store the fetched comments (the returned `Comment[]`) within the corresponding `Story` object in memory (e.g., add a `comments: Comment[]` property to the `Story` type/object). [181] Augment the `Story` type definition in `src/types/hn.ts`. [512]
|
||||
- Ensure errors from the client functions are handled appropriately (e.g., log error and potentially skip comment fetching for that story).
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: Running `npm run dev` executes Epic 1 setup steps followed by fetching stories and then comments for each story using the `algoliaHNClient`. [183]
|
||||
- AC2: Logs (via logger) clearly show the start and successful completion of fetching stories, and the start of fetching comments for each of the 10 stories. [184]
|
||||
- AC3: The configured `MAX_COMMENTS_PER_STORY` value is read from config, parsed as a number, and used in the calls to `fetchCommentsForStory`. [185]
|
||||
- AC4: After successful execution (before persistence in Story 2.3), `Story` objects held in memory contain a `comments` property populated with an array of fetched `Comment` objects. [186] (Verification via debugger or temporary logging).
|
||||
- AC5: The `Story` type definition in `src/types/hn.ts` is updated to include the `comments: Comment[]` field. [512]
|
||||
- AC6: (If implemented) Core logic is moved to `src/core/pipeline.ts` and called from `src/index.ts`. [818]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create: `src/core/pipeline.ts` (recommended).
|
||||
- Files to Modify: `src/index.ts`, `src/types/hn.ts`.
|
||||
- _(Hint: See `docs/project-structure.md` [818, 821, 822])._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851].
|
||||
- Uses `algoliaHNClient` (Story 2.1), `config` (Story 1.2), `logger` (Story 1.4).
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905])._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- Calls internal `algoliaHNClient.fetchTopStories()` and `algoliaHNClient.fetchCommentsForStory()`.
|
||||
- **Data Structures:**
|
||||
- Augment `Story` interface in `src/types/hn.ts` to include `comments: Comment[]`. [512]
|
||||
- Manipulates arrays of `Story` and `Comment` objects in memory.
|
||||
- _(Hint: See `docs/data-models.md` [500-517])._
|
||||
- **Environment Variables:**
|
||||
- Reads `MAX_COMMENTS_PER_STORY` via `config.ts`. [177, 563]
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638])._
|
||||
- **Coding Standards Notes:**
|
||||
- Use `async/await` for calling client functions.
|
||||
- Structure fetching logic cleanly (e.g., within a loop).
|
||||
- Use the logger for progress and error reporting. [182, 184]
|
||||
- Consider putting the main loop logic inside the `runPipeline` function in `src/core/pipeline.ts`.
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] (Recommended) Create `src/core/pipeline.ts` and define an async `runPipeline` function.
|
||||
- [ ] Modify `src/index.ts` to import and call `runPipeline`. Move existing setup logic (logger init, config load, dir creation) into `runPipeline` or ensure it runs before it.
|
||||
- [ ] In `pipeline.ts` (or `index.ts`), import `fetchTopStories`, `fetchCommentsForStory` from `algoliaHNClient`.
|
||||
- [ ] Import `config` and `logger`.
|
||||
- [ ] Call `fetchTopStories` after initial setup. Log count.
|
||||
- [ ] Retrieve `MAX_COMMENTS_PER_STORY` from `config`, ensuring it's a number.
|
||||
- [ ] Update `Story` type in `src/types/hn.ts` to include `comments: Comment[]`.
|
||||
- [ ] Loop through the fetched stories:
|
||||
- [ ] Log comment fetching start for the story ID.
|
||||
- [ ] Call `fetchCommentsForStory` with `storyId` and `maxComments`.
|
||||
- [ ] Handle potential errors from the client function call.
|
||||
- [ ] Assign the returned comments array to the `comments` property of the current story object.
|
||||
- [ ] Add temporary logging or use debugger to verify stories in memory contain comments (AC4).
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** [915]
|
||||
- If logic is moved to `src/core/pipeline.ts`, unit test `runPipeline`. [916]
|
||||
- Mock `algoliaHNClient` functions (`fetchTopStories`, `fetchCommentsForStory`). [918]
|
||||
- Mock `config` to provide `MAX_COMMENTS_PER_STORY`.
|
||||
- Mock `logger`.
|
||||
- Verify `fetchTopStories` is called once.
|
||||
- Verify `fetchCommentsForStory` is called for each story returned by the mocked `fetchTopStories`, and that it receives the correct `storyId` and `maxComments` value from config [185].
|
||||
- Verify the results from mocked `fetchCommentsForStory` are correctly assigned to the `comments` property of the story objects.
|
||||
- **Integration Tests:**
|
||||
- Could have an integration test for the fetch stage that uses the real `algoliaHNClient` (or a lightly mocked version checking calls) and verifies the in-memory data structure, but this is largely covered by the stage runner (Story 2.4). [921]
|
||||
- **Manual/CLI Verification:**
|
||||
- Run `npm run dev`.
|
||||
- Check logs for fetching stories and comments messages [184].
|
||||
- Use debugger or temporary `console.log` in the pipeline code to inspect a story object after the loop and confirm its `comments` property is populated [186].
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Logic moved to src/core/pipeline.ts. Verified in-memory data structure.}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
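The fetch loop described in Story 2.2 can be sketched roughly as follows. This is a minimal illustration, not the project's actual `runPipeline`: the `PipelineDeps` bundle, the `runFetchStage` and `parseMaxComments` names, and the trimmed-down `Story`/`HnComment` shapes (`HnComment` stands in for the story's `Comment` type) are all assumptions introduced here so the loop can run without the real client, config, or logger.

```typescript
// Minimal stand-in shapes; the real definitions are expected in src/types/hn.ts.
interface HnComment {
  commentId: string;
  commentText: string;
}

interface Story {
  storyId: string;
  title: string;
  comments?: HnComment[];
}

// Hypothetical dependency bundle (not part of the stories) used to keep the sketch testable.
interface PipelineDeps {
  fetchTopStories: () => Promise<Story[]>;
  fetchCommentsForStory: (storyId: string, maxComments: number) => Promise<HnComment[]>;
  log: (message: string) => void;
  maxCommentsRaw: string | undefined; // raw MAX_COMMENTS_PER_STORY value from config
}

// Parse the configured limit; fall back to 50 when it is missing or not a positive integer.
function parseMaxComments(raw: string | undefined, fallback = 50): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

async function runFetchStage(deps: PipelineDeps): Promise<Story[]> {
  const maxComments = parseMaxComments(deps.maxCommentsRaw);
  const stories = await deps.fetchTopStories();
  deps.log(`Fetched ${stories.length} stories.`);

  for (const story of stories) {
    deps.log(`Fetching up to ${maxComments} comments for story ${story.storyId}...`);
    try {
      story.comments = await deps.fetchCommentsForStory(story.storyId, maxComments);
    } catch (error) {
      // A failure for one story is logged and skipped so the rest of the run continues.
      deps.log(`Failed to fetch comments for story ${story.storyId}: ${String(error)}`);
      story.comments = [];
    }
  }
  return stories;
}
```

Passing the client functions and logger in as arguments is one way to satisfy the unit-test requirements above (mocked client, mocked config, mocked logger) without a module-mocking library; importing them directly and mocking at the module level would work equally well.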
|
||||
|
||||
---
|
||||
|
||||
**File: ai/stories/2.3.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 2.3: Persist Fetched HN Data Locally
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want to save the fetched HN stories (including their comments) to JSON files in the date-stamped output directory, so that the raw data is persisted locally for subsequent pipeline stages and debugging. [187]
|
||||
|
||||
**Context:** This story follows Story 2.2 where HN data (stories with comments) was fetched and stored in memory. Now, this data needs to be saved to the local filesystem. It uses the date-stamped output directory created in Epic 1 (Story 1.4) and writes one JSON file per story, containing the story metadata and its comments. This persisted data (`{storyId}_data.json`) is the input for subsequent stages (Scraping - Epic 3, Summarization - Epic 4, Email Assembly - Epic 5). [48, 734, 735]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Define a consistent JSON structure for the output file content. [188] Example from `docs/data-models.md` [539]: `{ storyId: "...", title: "...", articleUrl: "...", hnUrl: "...", points: ..., numComments: ..., fetchedAt: "ISO_TIMESTAMP", comments: [{ commentId: "...", commentText: "...", author: "...", createdAt: "...", ... }, ...] }`. Include a timestamp (`fetchedAt`) for when the data was fetched/saved. [190]
|
||||
- Import Node.js `fs` (specifically `writeFileSync`) and `path` modules in the pipeline module (`src/core/pipeline.ts` or `src/index.ts`). [190] Import `date-fns` or use `new Date().toISOString()` for timestamp.
|
||||
- In the main workflow (`pipeline.ts`), within the loop iterating through stories (immediately after comments have been fetched and added to the story object in Story 2.2): [191]
|
||||
- Get the full path to the date-stamped output directory (this path should be determined/passed from the initial setup logic from Story 1.4). [191]
|
||||
- Generate the current timestamp in ISO 8601 format (e.g., `new Date().toISOString()`) and add it to the story object as `fetchedAt`. [190] Update `Story` type in `src/types/hn.ts`. [516]
|
||||
- Construct the filename for the story's data: `{storyId}_data.json`. [192]
|
||||
- Construct the full file path using `path.join()`. [193]
|
||||
- Prepare the data object to be saved, matching the defined JSON structure (including `storyId`, `title`, `articleUrl`, `hnUrl`, `points`, `numComments`, `fetchedAt`, `comments`).
|
||||
- Serialize the prepared story data object to a JSON string using `JSON.stringify(storyData, null, 2)` for readability. [194]
|
||||
- Write the JSON string to the file using `fs.writeFileSync()`. Use a `try...catch` block for error handling around the file write. [195]
|
||||
- Log (using the logger) the successful persistence of each story's data file or any errors encountered during file writing. [196]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory (e.g., `./output/YYYY-MM-DD/`) contains exactly 10 files named `{storyId}_data.json` (assuming 10 stories were fetched successfully). [197]
|
||||
- AC2: Each JSON file contains valid JSON representing a single story object, including its metadata (`storyId`, `title`, `articleUrl`, `hnUrl`, `points`, `numComments`), a `fetchedAt` ISO timestamp, and an array of its fetched `comments`, matching the structure defined in `docs/data-models.md` [538-540]. [198]
|
||||
- AC3: The number of comments in each file's `comments` array does not exceed `MAX_COMMENTS_PER_STORY`. [199]
|
||||
- AC4: Logs indicate that saving data to a file was attempted for each story, reporting success or specific file writing errors. [200]
|
||||
- AC5: The `Story` type definition in `src/types/hn.ts` is updated to include the `fetchedAt: string` field. [516]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Modify: `src/core/pipeline.ts` (or `src/index.ts`), `src/types/hn.ts`.
|
||||
- _(Hint: See `docs/project-structure.md` [818, 821, 822])._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851].
|
||||
- Native `fs` module (`writeFileSync`) [190].
|
||||
- Native `path` module (`join`) [193].
|
||||
- `JSON.stringify` [194].
|
||||
- Uses `logger` (Story 1.4).
|
||||
- Uses output directory path created in Story 1.4 logic.
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905])._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- `fs.writeFileSync(filePath, jsonDataString, 'utf-8')`. [195]
|
||||
- **Data Structures:**
|
||||
- Uses `Story` and `Comment` types from `src/types/hn.ts`.
|
||||
- Augment `Story` type to include `fetchedAt: string`. [516]
|
||||
- Creates JSON structure matching `{storyId}_data.json` schema in `docs/data-models.md`. [538-540]
|
||||
- _(Hint: See `docs/data-models.md`)._
|
||||
- **Environment Variables:**
|
||||
- N/A directly, but relies on `OUTPUT_DIR_PATH` being available from config (Story 1.2) used by the directory creation logic (Story 1.4).
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638])._
|
||||
- **Coding Standards Notes:**
|
||||
- Use `try...catch` for `writeFileSync` calls. [195]
|
||||
- Use `JSON.stringify` with indentation (`null, 2`) for readability. [194]
|
||||
- Log success/failure clearly using the logger. [196]
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] In `pipeline.ts` (or `index.ts`), import `fs` and `path`.
|
||||
- [ ] Update `Story` type in `src/types/hn.ts` to include `fetchedAt: string`.
|
||||
- [ ] Ensure the full path to the date-stamped output directory is available within the story processing loop.
|
||||
- [ ] Inside the loop (after comments are fetched for a story):
|
||||
- [ ] Get the current ISO timestamp (`new Date().toISOString()`).
|
||||
- [ ] Add the timestamp to the story object as `fetchedAt`.
|
||||
- [ ] Construct the output filename: `{storyId}_data.json`.
|
||||
- [ ] Construct the full file path using `path.join(outputDirPath, filename)`.
|
||||
- [ ] Create the data object matching the specified JSON structure, including comments.
|
||||
- [ ] Serialize the data object using `JSON.stringify(data, null, 2)`.
|
||||
- [ ] Use `try...catch` block:
|
||||
- [ ] Inside `try`: Call `fs.writeFileSync(fullPath, jsonString, 'utf-8')`.
|
||||
- [ ] Inside `try`: Log success message with filename.
|
||||
- [ ] Inside `catch`: Log file writing error with filename.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** [915]
|
||||
- Testing file system interactions directly in unit tests can be brittle. [918]
|
||||
- Focus unit tests on the data preparation logic: ensure the object created before `JSON.stringify` has the correct structure (`storyId`, `title`, `articleUrl`, `hnUrl`, `points`, `numComments`, `fetchedAt`, `comments`) based on a sample input `Story` object. [920]
|
||||
- Verify the `fetchedAt` timestamp is added correctly.
|
||||
- **Integration Tests:** [921]
|
||||
- Could test the file writing aspect using `mock-fs` or actual file system writes within a temporary directory (created during setup, removed during teardown). [924]
|
||||
- Verify that the correct filename is generated and the content written to the mock/temporary file matches the expected JSON structure [538-540] and content.
|
||||
- **Manual/CLI Verification:** [912]
|
||||
- Run `npm run dev`.
|
||||
- Inspect the `output/YYYY-MM-DD/` directory (use current date).
|
||||
- Verify 10 files named `{storyId}_data.json` exist (AC1).
|
||||
- Open a few files, visually inspect the JSON structure, check for all required fields (metadata, `fetchedAt`, `comments` array), and verify comment count <= `MAX_COMMENTS_PER_STORY` (AC2, AC3).
|
||||
- Check console logs for success messages for file writing or any errors (AC4).
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Files saved successfully in ./output/YYYY-MM-DD/ directory.}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
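The persistence steps in Story 2.3 can be sketched as below. This is an illustration under assumptions: the `buildStoryData` and `persistStory` names, the `log` callback, and the exact field types are hypothetical (with `HnComment` standing in for the `Comment` type); only the field names, the `{storyId}_data.json` filename, and the `fs.writeFileSync` / `path.join` / `JSON.stringify(data, null, 2)` calls come from the story itself.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumed shapes for illustration; the real types are expected in src/types/hn.ts.
interface HnComment {
  commentId: string;
  commentText: string;
  author: string;
  createdAt: string;
}

interface Story {
  storyId: string;
  title: string;
  articleUrl: string | null;
  hnUrl: string;
  points: number;
  numComments: number;
  comments: HnComment[];
  fetchedAt?: string;
}

// Build the object to serialize. Kept free of I/O so it can be unit tested without touching disk.
function buildStoryData(story: Story, now: Date = new Date()): Required<Story> {
  return {
    storyId: story.storyId,
    title: story.title,
    articleUrl: story.articleUrl,
    hnUrl: story.hnUrl,
    points: story.points,
    numComments: story.numComments,
    fetchedAt: now.toISOString(),
    comments: story.comments,
  };
}

// Write one story to `{storyId}_data.json`; returns the file path on success, null on failure.
function persistStory(
  story: Story,
  outputDirPath: string,
  log: (message: string) => void,
): string | null {
  const fullPath = path.join(outputDirPath, `${story.storyId}_data.json`);
  try {
    const json = JSON.stringify(buildStoryData(story), null, 2);
    fs.writeFileSync(fullPath, json, "utf-8");
    log(`Saved ${fullPath}`);
    return fullPath;
  } catch (error) {
    log(`Failed to write ${fullPath}: ${String(error)}`);
    return null;
  }
}
```

Splitting data preparation from the file write mirrors the testing guidance above: the pure function covers the structure and `fetchedAt` checks, while the write path can be exercised against a temporary directory.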
|
||||
|
||||
---
|
||||
|
||||
**File: ai/stories/2.4.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 2.4: Implement Stage Testing Utility for HN Fetching
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want a separate, executable script that _only_ performs the HN data fetching and persistence, so I can test and trigger this stage independently of the full pipeline. [201]
|
||||
|
||||
**Context:** This story addresses the PRD requirement [736] for stage-specific testing utilities [764]. It creates a standalone Node.js script (`src/stages/fetch_hn_data.ts`) that replicates the core logic of Stories 2.1, 2.2 (partially), and 2.3. This script will initialize necessary components (logger, config), call the `algoliaHNClient` to fetch stories and comments, and persist the results to the date-stamped output directory, just like the main pipeline does up to this point. This allows isolated testing of the Algolia API interaction and data persistence without running subsequent scraping, summarization, or emailing stages. [57, 62, 912]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Create a new standalone script file: `src/stages/fetch_hn_data.ts`. [202]
|
||||
- This script should perform the essential setup required _for this stage_:
|
||||
- Initialize the logger utility (from Story 1.4). [203]
|
||||
- Load configuration using the config utility (from Story 1.2) to get `MAX_COMMENTS_PER_STORY` and `OUTPUT_DIR_PATH`. [203]
|
||||
- Determine the current date ('YYYY-MM-DD') using the utility from Story 1.4. [203]
|
||||
- Construct the date-stamped output directory path. [203]
|
||||
- Ensure the output directory exists (create it recursively if not, reusing logic/utility from Story 1.4). [203]
|
||||
- The script should then execute the core logic of fetching and persistence:
|
||||
- Import and use `algoliaHNClient.fetchTopStories` and `algoliaHNClient.fetchCommentsForStory` (from Story 2.1). [204]
|
||||
- Import `fs` and `path`.
|
||||
- Replicate the fetch loop logic from Story 2.2 (fetch stories, then loop to fetch comments for each using loaded `MAX_COMMENTS_PER_STORY` limit). [204]
|
||||
- Replicate the persistence logic from Story 2.3 (add `fetchedAt` timestamp, prepare data object, `JSON.stringify`, `fs.writeFileSync` to `{storyId}_data.json` in the date-stamped directory). [204]
|
||||
- The script should log its progress (e.g., "Starting HN data fetch stage...", "Fetching stories...", "Fetching comments for story X...", "Saving data for story X...") using the logger utility. [205]
|
||||
- Add a new script command to `package.json` under `"scripts"`: `"stage:fetch": "ts-node src/stages/fetch_hn_data.ts"`. [206]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: The file `src/stages/fetch_hn_data.ts` exists. [207]
|
||||
- AC2: The script `stage:fetch` is defined in `package.json`'s `scripts` section. [208]
|
||||
- AC3: Running `npm run stage:fetch` executes successfully, performing only the setup (logger, config, output dir), fetch (stories, comments), and persist steps (to JSON files). [209]
|
||||
- AC4: Running `npm run stage:fetch` creates the same 10 `{storyId}_data.json` files in the correct date-stamped output directory as running the main `npm run dev` command (up to the end of Epic 2 functionality). [210]
|
||||
- AC5: Logs generated by `npm run stage:fetch` reflect only the fetching and persisting steps, not subsequent pipeline stages (scraping, summarizing, emailing). [211]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create: `src/stages/fetch_hn_data.ts`.
|
||||
- Files to Modify: `package.json`.
|
||||
- _(Hint: See `docs/project-structure.md` [820] for stage runner location)._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851], `ts-node` (via `npm run` script).
|
||||
- Uses `logger` (Story 1.4), `config` (Story 1.2), date util (Story 1.4), directory creation logic (Story 1.4), `algoliaHNClient` (Story 2.1), `fs`/`path` (Story 2.3).
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905])._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- Calls internal `algoliaHNClient` functions.
|
||||
- Uses `fs.writeFileSync`.
|
||||
- **Data Structures:**
|
||||
- Uses `Story`, `Comment` types.
|
||||
- Generates `{storyId}_data.json` files [538-540].
|
||||
- _(Hint: See `docs/data-models.md`)._
|
||||
- **Environment Variables:**
|
||||
- Reads `MAX_COMMENTS_PER_STORY` and `OUTPUT_DIR_PATH` via `config.ts`.
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638])._
|
||||
- **Coding Standards Notes:**
|
||||
- Structure the script clearly (setup, fetch, persist).
|
||||
- Use `async/await`.
|
||||
- Use logger extensively for progress indication. [205]
|
||||
- Consider wrapping the main logic in an `async` IIFE (Immediately Invoked Function Expression) or a main function call.
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] Create `src/stages/fetch_hn_data.ts`.
|
||||
- [ ] Add imports for logger, config, date util, `algoliaHNClient`, `fs`, `path`.
|
||||
- [ ] Implement setup logic: initialize logger, load config, get output dir path, ensure directory exists.
|
||||
- [ ] Implement main fetch logic:
|
||||
- [ ] Call `fetchTopStories`.
|
||||
- [ ] Get `MAX_COMMENTS_PER_STORY` from config.
|
||||
- [ ] Loop through stories:
|
||||
- [ ] Call `fetchCommentsForStory`.
|
||||
- [ ] Add comments to story object.
|
||||
- [ ] Add `fetchedAt` timestamp.
|
||||
- [ ] Prepare data object for saving.
|
||||
- [ ] Construct full file path for `{storyId}_data.json`.
|
||||
- [ ] Serialize and write to file using `fs.writeFileSync` within `try...catch`.
|
||||
- [ ] Log progress/success/errors.
|
||||
- [ ] Add script `"stage:fetch": "ts-node src/stages/fetch_hn_data.ts"` to `package.json`.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** Unit tests for the underlying components (logger, config, client, utils) should already exist from previous stories. Unit testing the stage script itself might have limited value beyond checking basic setup calls if the core logic is just orchestrating tested components. [915]
|
||||
- **Integration Tests:** N/A specifically for the script, as it _is_ an integration test itself. [921]
|
||||
- **Manual/CLI Verification (Primary Test Method for this Story):** [912, 927]
|
||||
- Run `npm run stage:fetch`. [209]
|
||||
- Verify successful execution without errors.
|
||||
- Check console logs for messages specific to fetching and persisting [211].
|
||||
- Inspect the `output/YYYY-MM-DD/` directory and verify the content of the generated `{storyId}_data.json` files match expectations (similar to verification for Story 2.3) [210].
|
||||
- Confirm that `package.json` contains the `stage:fetch` script (AC2).
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] which identifies Stage Runners as a key part of Acceptance Testing)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Stage runner script created and tested successfully. package.json updated.}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
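The setup portion of the stage runner in Story 2.4 (determine the date stamp, build the output path, create the directory recursively) might look like the sketch below. The `formatDateStamp` and `ensureOutputDir` helpers are hypothetical names; the story says to reuse the Story 1.4 utility, whose actual API is not shown here, and the use of local time rather than UTC is an assumption.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Format a date as YYYY-MM-DD using local time (assumption: the Story 1.4 utility may differ).
function formatDateStamp(date: Date): string {
  const year = date.getFullYear();
  const month = String(date.getMonth() + 1).padStart(2, "0");
  const day = String(date.getDate()).padStart(2, "0");
  return `${year}-${month}-${day}`;
}

// Build `<baseDir>/<YYYY-MM-DD>` and make sure it exists.
function ensureOutputDir(baseDir: string, date: Date = new Date()): string {
  const dir = path.join(baseDir, formatDateStamp(date));
  // `recursive: true` creates missing parents and does not throw if the directory already exists.
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}
```

In the stage script, the returned path would then be handed to the fetch-and-persist loop, with the whole sequence wrapped in an `async` main function or IIFE as the coding notes suggest.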
|
||||
|
||||
---
|
||||
|
||||
## **End of Report for Epic 2**
|
||||
@@ -1,486 +0,0 @@
|
||||
Okay, remaining in **SM Mode**.
|
||||
|
||||
Generating the stories for **Epic 3: Article Scraping & Persistence**. This assumes that Epic 1 (project setup, config, logging, output dir) and Epic 2 (HN client implementation, data fetching integration, and local persistence of `{storyId}_data.json` files) are complete.
|
||||
|
||||
---
|
||||
|
||||
## **Stories Report: Epic 3**
|
||||
|
||||
**File: ai/stories/3.1.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 3.1: Implement Basic Article Scraper Module
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want a module that attempts to fetch HTML from a URL and extract the main article text using basic methods, handling common failures gracefully, so article content can be prepared for summarization. [220]
|
||||
|
||||
**Context:** This story introduces the article scraping capability. It creates a dedicated module (`src/scraper/articleScraper.ts`) responsible for fetching content from external article URLs (found in the `{storyId}_data.json` files from Epic 2) and extracting plain text. It emphasizes using native `fetch` and a simple extraction library (`@extractus/article-extractor` is recommended [222, 873]), and crucially, handling failures robustly (timeouts, non-HTML content, extraction errors) as required by the PRD [723, 724, 741]. This module will be used by the main pipeline (Story 3.2) and the stage tester (Story 3.4). [47, 55, 60, 63, 65]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Create a new module: `src/scraper/articleScraper.ts`. [221]
|
||||
- Add `@extractus/article-extractor` dependency: `npm install @extractus/article-extractor --save-prod`. [222, 223, 873]
|
||||
- Implement an async function `scrapeArticle(url: string): Promise<string | null>` within the module. [223, 224]
|
||||
- Inside the function:
|
||||
- Use native `fetch` [749] to retrieve content from the `url`. [224] Set a reasonable timeout (e.g., 15 seconds via `AbortSignal.timeout()`, configure via `SCRAPE_TIMEOUT_MS` [615] if needed). Include a `User-Agent` header (e.g., `"BMadHackerDigest/0.1"` or configurable via `SCRAPER_USER_AGENT` [629]). [225]
|
||||
- Handle potential `fetch` errors (network errors, timeouts) using `try...catch`. Log error using logger (from Story 1.4) and return `null`. [226]
|
||||
- Check the `response.ok` status. If not okay, log error (including status code) and return `null`. [226, 227]
|
||||
- Check the `Content-Type` header of the response. If it doesn't indicate HTML (e.g., does not include `text/html`), log warning and return `null`. [227, 228]
|
||||
- If HTML is received (`response.text()`), attempt to extract the main article text using `@extractus/article-extractor`. [229]
|
||||
- Wrap the extraction logic (`await articleExtractor.extract(htmlContent)`) in a `try...catch` to handle library-specific errors. Log error and return `null` on failure. [230]
|
||||
- Return the extracted plain text (`article.content`) if successful and not empty. Ensure it's just text, not HTML markup. [231]
|
||||
- Return `null` if extraction fails or results in empty content. [232]
|
||||
- Log all significant events, errors, or reasons for returning null (e.g., "Scraping URL...", "Fetch failed:", "Non-OK status:", "Non-HTML content type:", "Extraction failed:", "Successfully extracted text for {url}") using the logger utility. [233]
|
||||
- Define TypeScript types/interfaces as needed (though `article-extractor` types might suffice). [234]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: The `src/scraper/articleScraper.ts` module exists and exports the `scrapeArticle` function. [234]
|
||||
- AC2: The `@extractus/article-extractor` library is added to `dependencies` in `package.json` and `package-lock.json` is updated. [235]
|
||||
- AC3: `scrapeArticle` uses native `fetch` with a timeout (default or configured) and a User-Agent header. [236]
|
||||
- AC4: `scrapeArticle` correctly handles fetch errors (network, timeout), non-OK responses, and non-HTML content types by logging the specific reason and returning `null`. [237]
|
||||
- AC5: `scrapeArticle` uses `@extractus/article-extractor` to attempt text extraction from valid HTML content fetched via `response.text()`. [238]
|
||||
- AC6: `scrapeArticle` returns the extracted plain text string on success, and `null` on any failure (fetch, non-HTML, extraction error, empty result). [239]
|
||||
- AC7: Relevant logs are produced using the logger for success, different failure modes, and errors encountered during the process. [240]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create: `src/scraper/articleScraper.ts`.
|
||||
- Files to Modify: `package.json`, `package-lock.json`. Add optional env vars to `.env.example`.
|
||||
- _(Hint: See `docs/project-structure.md` [819] for scraper location)._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851], Native `fetch` API [863].
|
||||
- `@extractus/article-extractor` library. [873]
|
||||
- Uses `logger` utility (Story 1.4).
|
||||
- Uses `config` utility (Story 1.2) if implementing configurable timeout/user-agent.
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905])._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- Native `fetch(url, { signal: AbortSignal.timeout(timeoutMs), headers: { 'User-Agent': userAgent } })`. [225]
|
||||
- Check `response.ok`, `response.headers.get('Content-Type')`. [227, 228]
|
||||
- Get body as text: `await response.text()`. [229]
|
||||
- `@extractus/article-extractor`: `import articleExtractor from '@extractus/article-extractor'; const article = await articleExtractor.extract(htmlContent); return article?.content || null;` [229, 231]
|
||||
- **Data Structures:**
|
||||
- Function signature: `scrapeArticle(url: string): Promise<string | null>`. [224]
|
||||
- Uses `article` object returned by extractor.
|
||||
- _(Hint: See `docs/data-models.md` [498-547])._
|
||||
- **Environment Variables:**
|
||||
- Optional: `SCRAPE_TIMEOUT_MS` (default e.g., 15000). [615]
|
||||
- Optional: `SCRAPER_USER_AGENT` (default e.g., "BMadHackerDigest/0.1"). [629]
|
||||
- Load via `config.ts` if used.
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638])._
|
||||
- **Coding Standards Notes:**
|
||||
- Use `async/await`.
|
||||
- Implement comprehensive `try...catch` blocks for `fetch` and extraction. [226, 230]
|
||||
- Log errors and reasons for returning `null` clearly. [233]
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] Run `npm install @extractus/article-extractor --save-prod`.
|
||||
- [ ] Create `src/scraper/articleScraper.ts`.
|
||||
- [ ] Import logger, (optionally config), and `articleExtractor`.
|
||||
- [ ] Define the `scrapeArticle` async function accepting a `url`.
|
||||
- [ ] Implement `try...catch` for the entire fetch/parse logic. Log error and return `null` in `catch`.
|
||||
- [ ] Inside `try`:
|
||||
- [ ] Define timeout (default or from config).
|
||||
- [ ] Define User-Agent (default or from config).
|
||||
- [ ] Call native `fetch` with URL, timeout signal, and User-Agent header.
|
||||
- [ ] Check `response.ok`. If not OK, log status and return `null`.
|
||||
- [ ] Check `Content-Type` header. If not HTML, log type and return `null`.
|
||||
- [ ] Get HTML content using `response.text()`.
|
||||
- [ ] Implement inner `try...catch` for extraction:
|
||||
- [ ] Call `await articleExtractor.extract(htmlContent)`.
|
||||
- [ ] Check if result (`article?.content`) is valid text. If yes, log success and return text.
|
||||
- [ ] If extraction failed or content is empty, log reason and return `null`.
|
||||
- [ ] In `catch` block for extraction, log error and return `null`.
|
||||
- [ ] Add optional env vars `SCRAPE_TIMEOUT_MS` and `SCRAPER_USER_AGENT` to `.env.example`.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** [915]
|
||||
- Write unit tests for `src/scraper/articleScraper.ts`. [919]
|
||||
- Mock native `fetch`. Test different scenarios:
|
||||
- Successful fetch (200 OK, HTML content type) -> Mock `articleExtractor` success -> Verify returned text [239].
|
||||
- Successful fetch -> Mock `articleExtractor` failure/empty content -> Verify `null` return and logs [239, 240].
|
||||
- Fetch returns non-OK status (e.g., 404, 500) -> Verify `null` return and logs [237, 240].
|
||||
- Fetch returns non-HTML content type -> Verify `null` return and logs [237, 240].
|
||||
- Fetch throws network error/timeout -> Verify `null` return and logs [237, 240].
|
||||
- Mock `@extractus/article-extractor` to simulate success and failure cases. [918]
|
||||
- Verify `fetch` is called with the correct URL, User-Agent, and timeout signal [236].
|
||||
- **Integration Tests:** N/A for this module itself. [921]
|
||||
- **Manual/CLI Verification:** Tested indirectly via Story 3.2 execution and directly via Story 3.4 stage runner. [912]
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Implemented scraper module with @extractus/article-extractor and robust error handling.}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
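The control flow required of `scrapeArticle` in Story 3.1 can be sketched as follows. To keep the example self-contained, the extraction step is passed in as a hypothetical `extractText` callback rather than calling `@extractus/article-extractor` directly, and `fetchImpl` and `log` are likewise injected; the `ScrapeOptions`, `FetchLike`, and `ResponseLike` types are assumptions made for illustration. The story's own signature takes only `url`, so the real module would import these dependencies instead.

```typescript
// Structural subset of the fetch Response that the scraper relies on.
interface ResponseLike {
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
  text(): Promise<string>;
}

type FetchLike = (
  url: string,
  init: { signal: AbortSignal; headers: Record<string, string> },
) => Promise<ResponseLike>;

// Hypothetical options bundle; defaults mirror the values suggested in the story.
interface ScrapeOptions {
  extractText: (html: string) => Promise<string | null>; // stand-in for the extraction library
  timeoutMs?: number;
  userAgent?: string;
  fetchImpl?: FetchLike;
  log?: (message: string) => void;
}

async function scrapeArticle(url: string, options: ScrapeOptions): Promise<string | null> {
  const {
    extractText,
    timeoutMs = 15000,
    userAgent = "BMadHackerDigest/0.1",
    fetchImpl = fetch,
    log = console.log,
  } = options;

  log(`Scraping URL ${url}`);
  try {
    const response = await fetchImpl(url, {
      signal: AbortSignal.timeout(timeoutMs), // aborts the request after timeoutMs
      headers: { "User-Agent": userAgent },
    });
    if (!response.ok) {
      log(`Non-OK status: ${response.status} for ${url}`);
      return null;
    }
    const contentType = response.headers.get("Content-Type") ?? "";
    if (!contentType.includes("text/html")) {
      log(`Non-HTML content type: ${contentType || "(none)"} for ${url}`);
      return null;
    }
    const html = await response.text();

    try {
      const text = (await extractText(html))?.trim();
      if (!text) {
        log(`Extraction produced no content for ${url}`);
        return null;
      }
      log(`Successfully extracted text for ${url}`);
      return text;
    } catch (error) {
      log(`Extraction failed for ${url}: ${String(error)}`);
      return null;
    }
  } catch (error) {
    // Network errors and timeouts both land here.
    log(`Fetch failed for ${url}: ${String(error)}`);
    return null;
  }
}
```

Every failure path returns `null` after logging its reason, which is what lets the pipeline in Story 3.2 treat a missing article as a normal outcome rather than an exception.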
|
||||
|
||||
---
|
||||
|
||||
**File: ai/stories/3.2.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 3.2: Integrate Article Scraping into Main Workflow
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want to integrate the article scraper into the main workflow (`src/core/pipeline.ts`), attempting to scrape the article for each HN story that has a valid URL, after fetching its data. [241]
|
||||
|
||||
**Context:** This story connects the scraper module (`articleScraper.ts` from Story 3.1) into the main application pipeline (`src/core/pipeline.ts`) developed in Epic 2. It modifies the main loop over the fetched stories (which contain data loaded in Story 2.2) to include a call to `scrapeArticle` for stories that have an article URL. The result (scraped text or null) is then stored in memory, associated with the story object. [47, 78, 79]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Modify the main execution flow in `src/core/pipeline.ts` (assuming logic moved here in Story 2.2). [242]
|
||||
- Import the `scrapeArticle` function from `src/scraper/articleScraper.ts`. [243] Import the logger.
|
||||
- Within the main loop iterating through the fetched `Story` objects (after comments are fetched in Story 2.2 and before persistence in Story 2.3):
|
||||
- Check if `story.articleUrl` exists and appears to be a valid HTTP/HTTPS URL. A simple check for starting with `http://` or `https://` is sufficient. [243, 244]
|
||||
- If the URL is missing or invalid, log a warning using the logger ("Skipping scraping for story {storyId}: Missing or invalid URL") and proceed to the next step for this story (e.g., summarization in Epic 4, or persistence in Story 3.3). Set an internal placeholder for scraped content to `null`. [245]
|
||||
- If a valid URL exists:
|
||||
- Log ("Attempting to scrape article for story {storyId} from {story.articleUrl}"). [246]
|
||||
- Call `await scrapeArticle(story.articleUrl)`. [247]
|
||||
- Store the result (the extracted text string or `null`) in memory, associated with the story object. Define/add property `articleContent: string | null` to the `Story` type in `src/types/hn.ts`. [247, 513]
|
||||
- Log the outcome clearly using the logger (e.g., "Successfully scraped article for story {storyId}", "Failed to scrape article for story {storyId}"). [248]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: Running `npm run dev` executes Epic 1 & 2 steps, and then attempts article scraping for stories with valid `articleUrl`s within the main pipeline loop. [249]
|
||||
- AC2: Stories with missing or invalid `articleUrl`s are skipped by the scraping step, and a corresponding warning message is logged via the logger. [250]
|
||||
- AC3: For stories with valid URLs, the `scrapeArticle` function from `src/scraper/articleScraper.ts` is called with the correct URL. [251]
|
||||
- AC4: Logs (via logger) clearly indicate the start ("Attempting to scrape...") and the success/failure outcome of the scraping attempt for each relevant story. [252]
|
||||
- AC5: Story objects held in memory after this stage contain an `articleContent` property holding the scraped text (string) or `null` if scraping was skipped or failed. [253] (Verify via debugger/logging).
|
||||
- AC6: The `Story` type definition in `src/types/hn.ts` is updated to include the `articleContent: string | null` field. [513]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Modify: `src/core/pipeline.ts`, `src/types/hn.ts`.
|
||||
- _(Hint: See `docs/project-structure.md` [818, 821])._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851].
|
||||
- Uses `articleScraper.scrapeArticle` (Story 3.1), `logger` (Story 1.4).
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905])._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- Calls internal `scrapeArticle(url)`.
|
||||
- **Data Structures:**
|
||||
- Operates on `Story[]` fetched in Epic 2.
|
||||
- Augment `Story` interface in `src/types/hn.ts` to include `articleContent: string | null`. [513]
|
||||
- Checks `story.articleUrl`.
|
||||
- _(Hint: See `docs/data-models.md` [506-517])._
|
||||
- **Environment Variables:**
|
||||
- N/A directly, but `scrapeArticle` might use them (Story 3.1).
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638])._
|
||||
- **Coding Standards Notes:**
|
||||
- Perform the URL check before calling the scraper. [244]
|
||||
- Clearly log skipping, attempt, success, failure for scraping. [245, 246, 248]
|
||||
- Ensure the `articleContent` property is always set (either to the result string or explicitly to `null`).
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] Update `Story` type in `src/types/hn.ts` to include `articleContent: string | null`.
|
||||
- [ ] Modify the main loop in `src/core/pipeline.ts` where stories are processed.
|
||||
- [ ] Import `scrapeArticle` from `src/scraper/articleScraper.ts`.
|
||||
- [ ] Import `logger`.
|
||||
- [ ] Inside the loop (after comment fetching, before persistence steps):
|
||||
- [ ] Check if `story.articleUrl` exists and starts with `http`.
|
||||
- [ ] If invalid/missing:
|
||||
- [ ] Log warning message.
|
||||
- [ ] Set `story.articleContent = null`.
|
||||
- [ ] If valid:
|
||||
- [ ] Log attempt message.
|
||||
- [ ] Call `const scrapedContent = await scrapeArticle(story.articleUrl)`.
|
||||
- [ ] Set `story.articleContent = scrapedContent`.
|
||||
- [ ] Log success (if `scrapedContent` is not null) or failure (if `scrapedContent` is null).
|
||||
- [ ] Add temporary logging or use debugger to verify `articleContent` property in story objects (AC5).
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** [915]
|
||||
- Unit test the modified pipeline logic in `src/core/pipeline.ts`. [916]
|
||||
- Mock the `scrapeArticle` function. [918]
|
||||
- Provide mock `Story` objects with and without valid `articleUrl`s.
|
||||
- Verify that `scrapeArticle` is called only for stories with valid URLs [251].
|
||||
- Verify that the correct URL is passed to `scrapeArticle`.
|
||||
- Verify that the return value (mocked text or mocked null) from `scrapeArticle` is correctly assigned to the `story.articleContent` property [253].
|
||||
- Verify that appropriate logs (skip warning, attempt, success/fail) are called based on the URL validity and mocked `scrapeArticle` result [250, 252].
|
||||
- **Integration Tests:** Less emphasis here; Story 3.4 provides better integration testing for scraping. [921]
|
||||
- **Manual/CLI Verification:** [912]
|
||||
- Run `npm run dev`.
|
||||
- Check console logs for "Attempting to scrape...", "Successfully scraped...", "Failed to scrape...", and "Skipping scraping..." messages [250, 252].
|
||||
- Use debugger or temporary logging to inspect `story.articleContent` values during or after the pipeline run [253].
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Integrated scraper call into pipeline. Updated Story type. Verified logic for handling valid/invalid URLs.}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
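The loop described in Story 3.2 can be sketched roughly as follows. This is a minimal illustration, not the project's actual `src/core/pipeline.ts`: the `Story` and `Logger` shapes, the `isHttpUrl` helper, and the `scrapeStories` function name are assumptions made for the example, and `scrapeArticle` is passed in as a parameter so the logic can be exercised with a mock.

```typescript
// Hypothetical minimal shapes; the real `Story` type lives in src/types/hn.ts.
interface Story {
  storyId: string;
  articleUrl?: string | null;
  articleContent?: string | null;
}

// Assumed logger surface; the real logger comes from Story 1.4.
interface Logger {
  info(message: string): void;
  warn(message: string): void;
}

// Assumed contract from Story 3.1: resolves to text, or null on any failure.
type ScrapeArticle = (url: string) => Promise<string | null>;

// Simple prefix check, which the requirements state is sufficient.
function isHttpUrl(url: string | null | undefined): url is string {
  return (
    typeof url === "string" &&
    (url.startsWith("http://") || url.startsWith("https://"))
  );
}

async function scrapeStories(
  stories: Story[],
  scrapeArticle: ScrapeArticle,
  logger: Logger
): Promise<void> {
  for (const story of stories) {
    const url = story.articleUrl;
    if (!isHttpUrl(url)) {
      logger.warn(`Skipping scraping for story ${story.storyId}: Missing or invalid URL`);
      story.articleContent = null; // always set the property explicitly
      continue;
    }

    logger.info(`Attempting to scrape article for story ${story.storyId} from ${url}`);
    const scrapedContent = await scrapeArticle(url);
    story.articleContent = scrapedContent;

    if (scrapedContent !== null) {
      logger.info(`Successfully scraped article for story ${story.storyId}`);
    } else {
      logger.warn(`Failed to scrape article for story ${story.storyId}`);
    }
  }
}
```

Injecting `scrapeArticle` and the logger is one way to make the unit tests listed in the story straightforward: a test can pass a stub that records the URLs it receives and returns text or `null` on demand.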
|
||||
|
||||
---
|
||||
|
||||
**File: ai/stories/3.3.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 3.3: Persist Scraped Article Text Locally
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want to save successfully scraped article text to a separate local file for each story, so that the text content is available as input for the summarization stage. [254]
|
||||
|
||||
**Context:** This story adds the persistence step for the article content scraped in Story 3.2. Following a successful scrape (where `story.articleContent` is not null), this logic writes the plain text content to a `.txt` file (`{storyId}_article.txt`) within the date-stamped output directory created in Epic 1. This ensures the scraped text is available for the next stage (Summarization - Epic 4) even if the main script is run in stages or needs to be restarted. No file should be created if scraping failed or was skipped. [49, 734, 735]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Import Node.js `fs` (`writeFileSync`) and `path` modules if not already present in `src/core/pipeline.ts`. [255] Import logger.
|
||||
- In the main workflow (`src/core/pipeline.ts`), within the loop processing each story, _after_ the scraping attempt (Story 3.2) is complete: [256]
|
||||
- Check if `story.articleContent` is a non-null, non-empty string.
|
||||
- If yes (scraping was successful and yielded content):
|
||||
- Retrieve the full path to the current date-stamped output directory (available from setup). [256]
|
||||
- Construct the filename: `{storyId}_article.txt`. [257]
|
||||
- Construct the full file path using `path.join()`. [257]
|
||||
- Get the successfully scraped article text string (`story.articleContent`). [258]
|
||||
- Use `fs.writeFileSync(fullPath, story.articleContent, 'utf-8')` to save the text to the file. [259] Wrap this call in a `try...catch` block for file system errors. [260]
|
||||
- Log the successful saving of the file (e.g., "Saved scraped article text to {filename}") or any file writing errors encountered, using the logger. [260]
|
||||
- If `story.articleContent` is null or empty (scraping skipped or failed), ensure _no_ `_article.txt` file is created for this story. [261]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory contains `_article.txt` files _only_ for those stories where `scrapeArticle` (from Story 3.1) succeeded and returned non-empty text content during the pipeline run (Story 3.2). [262]
|
||||
- AC2: The name of each article text file is `{storyId}_article.txt`. [263]
|
||||
- AC3: The content of each existing `_article.txt` file is the plain text string stored in `story.articleContent`. [264]
|
||||
- AC4: Logs confirm the successful writing of each `_article.txt` file or report specific file writing errors. [265]
|
||||
- AC5: No empty `_article.txt` files are created. Files only exist if scraping was successful and returned content. [266]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Modify: `src/core/pipeline.ts`.
|
||||
- _(Hint: See `docs/project-structure.md` [818])._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851].
|
||||
- Native `fs` module (`writeFileSync`). [259]
|
||||
- Native `path` module (`join`). [257]
|
||||
- Uses `logger` (Story 1.4).
|
||||
- Uses output directory path (from Story 1.4 logic).
|
||||
- Uses `story.articleContent` populated in Story 3.2.
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905])._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- `fs.writeFileSync(fullPath, articleContentString, 'utf-8')`. [259]
|
||||
- **Data Structures:**
|
||||
- Checks `story.articleContent` (string | null).
|
||||
- Defines output file format `{storyId}_article.txt` [541].
|
||||
- _(Hint: See `docs/data-models.md` [506-517, 541])._
|
||||
- **Environment Variables:**
|
||||
- Relies on `OUTPUT_DIR_PATH` being available (from Story 1.2/1.4).
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638])._
|
||||
- **Coding Standards Notes:**
|
||||
- Place the file writing logic immediately after the scraping result is known for a story.
|
||||
- Use a clear `if (story.articleContent)` check. [256]
|
||||
- Use `try...catch` around `fs.writeFileSync`. [260]
|
||||
- Log success/failure clearly. [260]
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] In `src/core/pipeline.ts`, ensure `fs` and `path` are imported. Ensure logger is imported.
|
||||
- [ ] Ensure the output directory path is available within the story processing loop.
|
||||
- [ ] Inside the loop, after `story.articleContent` is set (from Story 3.2):
|
||||
- [ ] Add an `if (story.articleContent)` condition.
|
||||
- [ ] Inside the `if` block:
|
||||
- [ ] Construct filename: `{storyId}_article.txt`.
|
||||
- [ ] Construct full path using `path.join`.
|
||||
- [ ] Implement `try...catch`:
|
||||
- [ ] `try`: Call `fs.writeFileSync(fullPath, story.articleContent, 'utf-8')`.
|
||||
- [ ] `try`: Log success message.
|
||||
- [ ] `catch`: Log error message.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** [915]
|
||||
- Difficult to unit test filesystem writes effectively. Focus on testing the _conditional logic_ within the pipeline function. [918]
|
||||
- Mock `fs.writeFileSync`. Provide mock `Story` objects where `articleContent` is sometimes a string and sometimes null.
|
||||
- Verify `fs.writeFileSync` is called _only when_ `articleContent` is a non-empty string. [262]
|
||||
- Verify it's called with the correct path (`path.join(outputDir, storyId + '_article.txt')`) and content (`story.articleContent`). [263, 264]
|
||||
- **Integration Tests:** [921]
|
||||
- Use `mock-fs` or temporary directory setup/teardown. [924]
|
||||
- Run the pipeline segment responsible for scraping (mocked) and saving.
|
||||
- Verify that `.txt` files are created only for stories where the mocked scraper returned text.
|
||||
- Verify file contents match the mocked text.
|
||||
- **Manual/CLI Verification:** [912]
|
||||
- Run `npm run dev`.
|
||||
- Inspect the `output/YYYY-MM-DD/` directory.
|
||||
- Check which `{storyId}_article.txt` files exist. Compare this against the console logs indicating successful/failed scraping attempts for corresponding story IDs. Verify files only exist for successful scrapes (AC1, AC5).
|
||||
- Check filenames are correct (AC2).
|
||||
- Open a few existing `.txt` files and spot-check the content (AC3).
|
||||
- Check logs for file saving success/error messages (AC4).
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] for the overall approach)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Added logic to save article text conditionally. Verified files are created only on successful scrape.}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
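The conditional persistence step from Story 3.3 might look something like the sketch below. The function name `saveArticleText`, its parameters, its return value, and the `Logger` shape are assumptions for illustration; only `fs.writeFileSync`, `path.join`, the `{storyId}_article.txt` filename, and the `try...catch` requirement come from the story.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumed logger surface; the real logger comes from Story 1.4.
interface Logger {
  info(message: string): void;
  error(message: string): void;
}

// Writes the article text and returns the file path, or returns null when
// nothing was written (no content, or a file system error).
function saveArticleText(
  outputDir: string,
  storyId: string,
  articleContent: string | null,
  logger: Logger
): string | null {
  // Null or empty content means scraping was skipped or failed: create no file.
  if (!articleContent) {
    return null;
  }

  const filename = `${storyId}_article.txt`;
  const fullPath = path.join(outputDir, filename);
  try {
    fs.writeFileSync(fullPath, articleContent, "utf-8");
    logger.info(`Saved scraped article text to ${filename}`);
    return fullPath;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    logger.error(`Failed to write ${filename}: ${reason}`);
    return null;
  }
}
```

Keeping the truthiness check at the top covers both `null` and the empty string, which matches AC5 (no empty `_article.txt` files).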
|
||||
|
||||
---
|
||||
|
||||
**File: ai/stories/3.4.story.md**
|
||||
|
||||
```markdown
|
||||
# Story 3.4: Implement Stage Testing Utility for Scraping
|
||||
|
||||
**Status:** Draft
|
||||
|
||||
## Goal & Context
|
||||
|
||||
**User Story:** As a developer, I want a separate script/command to test the article scraping logic using HN story data from local files, allowing independent testing and debugging of the scraper. [267]
|
||||
|
||||
**Context:** This story implements the standalone stage testing utility for Epic 3, as required by the PRD [736, 764]. It creates `src/stages/scrape_articles.ts`, which reads story data (specifically URLs) from the `{storyId}_data.json` files generated in Epic 2 (or by `stage:fetch`), calls the `scrapeArticle` function (from Story 3.1) for each URL, and persists any successfully scraped text to `{storyId}_article.txt` files (replicating Story 3.3 logic). This allows testing the scraping functionality against real websites using previously fetched story lists, without running the full pipeline or the HN fetching stage. [57, 63, 820, 912, 930]
|
||||
|
||||
## Detailed Requirements
|
||||
|
||||
- Create a new standalone script file: `src/stages/scrape_articles.ts`. [268]
|
||||
- Import necessary modules: `fs` (e.g., `readdirSync`, `readFileSync`, `writeFileSync`, `existsSync`, `statSync`), `path`, `logger` (Story 1.4), `config` (Story 1.2), `scrapeArticle` (Story 3.1), date util (Story 1.4). [269]
|
||||
- The script should:
|
||||
- Initialize the logger. [270]
|
||||
- Load configuration (to get `OUTPUT_DIR_PATH`). [271]
|
||||
- Determine the target date-stamped directory path (e.g., from the current date via the date util; a CLI override could be added later, but defaulting to the current date is fine for now). [271] Ensure the base output directory exists. Log the target directory.
|
||||
- Check if the target date-stamped directory exists. If not, log an error and exit ("Directory {path} not found. Run fetch stage first?").
|
||||
- Read the directory contents and identify all files ending with `_data.json`. [272] Use `fs.readdirSync` and filter.
|
||||
- For each `_data.json` file found:
|
||||
- Construct the full path and read its content using `fs.readFileSync`. [273]
|
||||
- Parse the JSON content. Handle potential parse errors gracefully (log error, skip file). [273]
|
||||
- Extract the `storyId` and `articleUrl` from the parsed data. [274]
|
||||
- If a valid `articleUrl` exists (starts with `http`): [274]
|
||||
- Log the attempt: "Attempting scrape for story {storyId} from {url}...".
|
||||
- Call `await scrapeArticle(articleUrl)`. [274]
|
||||
- If scraping succeeds (returns a non-null string):
|
||||
- Construct the output filename `{storyId}_article.txt`. [275]
|
||||
- Construct the full output path. [275]
|
||||
- Save the text to the file using `fs.writeFileSync` (replicating logic from Story 3.3, including try/catch and logging). [275] Overwrite if the file exists. [276]
|
||||
- Log success outcome.
|
||||
- If scraping fails (`scrapeArticle` returns null):
|
||||
- Log failure outcome.
|
||||
- If `articleUrl` is missing or invalid:
|
||||
- Log skipping message.
|
||||
- Log overall completion: "Scraping stage finished processing {N} data files.".
|
||||
- Add a new script command to `package.json`: `"stage:scrape": "ts-node src/stages/scrape_articles.ts"`. [277]
|
||||
|
||||
## Acceptance Criteria (ACs)
|
||||
|
||||
- AC1: The file `src/stages/scrape_articles.ts` exists. [279]
|
||||
- AC2: The script `stage:scrape` is defined in `package.json`'s `scripts` section. [280]
|
||||
- AC3: Running `npm run stage:scrape` (assuming a date-stamped directory with `_data.json` files exists from a previous fetch run) successfully reads these JSON files. [281]
|
||||
- AC4: The script calls `scrapeArticle` for stories with valid `articleUrl`s found in the JSON files. [282]
|
||||
- AC5: The script creates or updates `{storyId}_article.txt` files in the _same_ date-stamped directory, corresponding only to successfully scraped articles. [283]
|
||||
- AC6: The script logs its actions (reading files, attempting scraping, skipping, saving results/failures) for each story ID processed based on the found `_data.json` files. [284]
|
||||
- AC7: The script's only inputs are the local `_data.json` files and the external article URLs fetched via `scrapeArticle`; it does not call the Algolia HN API client. [285, 286]
|
||||
|
||||
## Technical Implementation Context
|
||||
|
||||
**Guidance:** Use the following details for implementation. Refer to the linked `docs/` files for broader context if needed.
|
||||
|
||||
- **Relevant Files:**
|
||||
- Files to Create: `src/stages/scrape_articles.ts`.
|
||||
- Files to Modify: `package.json`.
|
||||
- _(Hint: See `docs/project-structure.md` [820] for stage runner location)._
|
||||
- **Key Technologies:**
|
||||
- TypeScript [846], Node.js 22.x [851], `ts-node`.
|
||||
- Native `fs` module (`readdirSync`, `readFileSync`, `writeFileSync`, `existsSync`, `statSync`). [269]
|
||||
- Native `path` module. [269]
|
||||
- Uses `logger` (Story 1.4), `config` (Story 1.2), date util (Story 1.4), `scrapeArticle` (Story 3.1), persistence logic (Story 3.3).
|
||||
- _(Hint: See `docs/tech-stack.md` [839-905])._
|
||||
- **API Interactions / SDK Usage:**
|
||||
- Calls internal `scrapeArticle(url)`.
|
||||
- Uses `fs` module extensively for reading directory, reading JSON, writing TXT.
|
||||
- **Data Structures:**
|
||||
- Reads JSON structure from `_data.json` files [538-540]. Extracts `storyId`, `articleUrl`.
|
||||
- Creates `{storyId}_article.txt` files [541].
|
||||
- _(Hint: See `docs/data-models.md`)._
|
||||
- **Environment Variables:**
|
||||
- Reads `OUTPUT_DIR_PATH` via `config.ts`. `scrapeArticle` might use others.
|
||||
- _(Hint: See `docs/environment-vars.md` [548-638])._
|
||||
- **Coding Standards Notes:**
|
||||
- Structure script clearly (setup, read data files, loop, process/scrape/save).
|
||||
- Use `async/await` for `scrapeArticle`.
|
||||
- Implement robust error handling for file IO (reading dir, reading files, parsing JSON, writing files) using `try...catch` and logging.
|
||||
- Use logger for detailed progress reporting. [284]
|
||||
- Wrap main logic in an async IIFE or main function.
|
||||
|
||||
## Tasks / Subtasks
|
||||
|
||||
- [ ] Create `src/stages/scrape_articles.ts`.
|
||||
- [ ] Add imports: `fs`, `path`, `logger`, `config`, `scrapeArticle`, date util.
|
||||
- [ ] Implement setup: Init logger, load config, get output path, get target date-stamped path.
|
||||
- [ ] Check if target date-stamped directory exists, log error and exit if not.
|
||||
- [ ] Use `fs.readdirSync` to get list of files in the target directory.
|
||||
- [ ] Filter the list to get only files ending in `_data.json`.
|
||||
- [ ] Loop through the `_data.json` filenames:
|
||||
- [ ] Construct full path for the JSON file.
|
||||
- [ ] Use `try...catch` for reading and parsing the JSON file:
|
||||
- [ ] `try`: Read file (`fs.readFileSync`). Parse JSON (`JSON.parse`).
|
||||
- [ ] `catch`: Log error (read/parse), continue to next file.
|
||||
- [ ] Extract `storyId` and `articleUrl`.
|
||||
- [ ] Check if `articleUrl` is valid (starts with `http`).
|
||||
- [ ] If valid:
|
||||
- [ ] Log attempt.
|
||||
- [ ] Call `content = await scrapeArticle(articleUrl)`.
|
||||
- [ ] `if (content)`:
|
||||
- [ ] Construct `.txt` output path.
|
||||
- [ ] Use `try...catch` to write file (`fs.writeFileSync`). Log success/error.
|
||||
- [ ] `else`: Log scrape failure.
|
||||
- [ ] If URL invalid: Log skip.
|
||||
- [ ] Log completion message.
|
||||
- [ ] Add `"stage:scrape": "ts-node src/stages/scrape_articles.ts"` to `package.json`.
|
||||
|
||||
## Testing Requirements
|
||||
|
||||
**Guidance:** Verify implementation against the ACs using the following tests.
|
||||
|
||||
- **Unit Tests:** Difficult to unit test the entire script effectively due to heavy FS and orchestration logic. Focus on unit testing the core `scrapeArticle` module (Story 3.1) and utilities. [915]
|
||||
- **Integration Tests:** N/A for the script itself. [921]
|
||||
- **Manual/CLI Verification (Primary Test Method):** [912, 927, 930]
|
||||
- Ensure `_data.json` files exist from `npm run stage:fetch` or `npm run dev`.
|
||||
- Run `npm run stage:scrape`. [281]
|
||||
- Verify successful execution.
|
||||
- Check logs for reading files, skipping, attempting scrapes, success/failure messages, and saving messages [284].
|
||||
- Inspect the `output/YYYY-MM-DD/` directory for newly created/updated `{storyId}_article.txt` files. Verify they correspond to stories where scraping succeeded according to logs [283, 285].
|
||||
- Verify the script _only_ performed scraping actions based on local files (AC7).
|
||||
- Confirm that `package.json` defines the `stage:scrape` script (AC2).
|
||||
- _(Hint: See `docs/testing-strategy.md` [907-950] which identifies Stage Runners as a key part of Acceptance Testing)._
|
||||
|
||||
## Story Wrap Up (Agent Populates After Execution)
|
||||
|
||||
- **Agent Model Used:** `<Agent Model Name/Version>`
|
||||
- **Completion Notes:** {Stage runner implemented. Reads `_data.json`, calls scraper, saves `_article.txt` conditionally. package.json updated.}
|
||||
- **Change Log:**
|
||||
- Initial Draft
|
||||
```
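The stage runner flow from Story 3.4 can be approximated as below. This is a sketch under stated assumptions rather than the real `src/stages/scrape_articles.ts`: the names `selectDataFiles` and `runScrapeStage` are hypothetical, the target directory and `scrapeArticle` are passed in instead of being derived from `config` and the date util, and each `_data.json` file is assumed to expose `storyId` and `articleUrl` at the top level.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumed logger surface; the real logger comes from Story 1.4.
interface Logger {
  info(message: string): void;
  warn(message: string): void;
  error(message: string): void;
}

// Assumed contract from Story 3.1: resolves to text, or null on any failure.
type ScrapeArticle = (url: string) => Promise<string | null>;

// Keep only the per-story data files produced by the fetch stage.
function selectDataFiles(fileNames: string[]): string[] {
  return fileNames.filter((name) => name.endsWith("_data.json"));
}

// Returns the number of data files processed, or null if the directory is missing.
async function runScrapeStage(
  targetDir: string,
  scrapeArticle: ScrapeArticle,
  logger: Logger
): Promise<number | null> {
  if (!fs.existsSync(targetDir)) {
    logger.error(`Directory ${targetDir} not found. Run fetch stage first?`);
    return null;
  }

  const dataFiles = selectDataFiles(fs.readdirSync(targetDir));
  for (const fileName of dataFiles) {
    let data: { storyId?: unknown; articleUrl?: unknown } = {};
    try {
      const raw = fs.readFileSync(path.join(targetDir, fileName), "utf-8");
      data = JSON.parse(raw) ?? {};
    } catch (err) {
      logger.error(`Could not read or parse ${fileName}: ${String(err)}`);
      continue; // skip this file, keep processing the rest
    }

    const storyId = String(data.storyId);
    const articleUrl = data.articleUrl;
    if (typeof articleUrl !== "string" || !articleUrl.startsWith("http")) {
      logger.warn(`Skipping story ${storyId}: missing or invalid articleUrl`);
      continue;
    }

    logger.info(`Attempting scrape for story ${storyId} from ${articleUrl}...`);
    const content = await scrapeArticle(articleUrl);
    if (!content) {
      logger.warn(`Scrape failed for story ${storyId}`);
      continue;
    }

    const outName = `${storyId}_article.txt`;
    try {
      // writeFileSync overwrites an existing file by default.
      fs.writeFileSync(path.join(targetDir, outName), content, "utf-8");
      logger.info(`Saved scraped article text to ${outName}`);
    } catch (err) {
      logger.error(`Failed to write ${outName}: ${String(err)}`);
    }
  }

  logger.info(`Scraping stage finished processing ${dataFiles.length} data files.`);
  return dataFiles.length;
}
```

In an actual script, a `main` function or async IIFE would resolve the date-stamped directory from `OUTPUT_DIR_PATH`, call a function like this with the real `scrapeArticle`, and exit with a non-zero status when the directory is missing.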
|
||||
|
||||
---
|
||||
|
||||
## **End of Report for Epic 3**
|
||||
@@ -1,89 +0,0 @@
|
||||
# Epic 1: Project Initialization & Core Setup
|
||||
|
||||
**Goal:** Initialize the project using the "bmad-boilerplate", manage dependencies, set up `.env` and config loading, establish a basic CLI entry point, and set up basic logging and the output directory structure. This provides the foundational setup for all subsequent development work.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 1.1: Initialize Project from Boilerplate
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to set up the initial project structure using the `bmad-boilerplate`, so that I have the standard tooling (TS, Jest, ESLint, Prettier), configurations, and scripts in place.
|
||||
- **Detailed Requirements:**
|
||||
- Copy or clone the contents of the `bmad-boilerplate` into the new project's root directory.
|
||||
- Initialize a git repository in the project root directory (if not already done by cloning).
|
||||
- Ensure the `.gitignore` file from the boilerplate is present.
|
||||
- Run `npm install` to download and install all `devDependencies` specified in the boilerplate's `package.json`.
|
||||
- Verify that the core boilerplate scripts (`lint`, `format`, `test`, `build`) execute without errors on the initial codebase.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The project directory contains the files and structure from `bmad-boilerplate`.
|
||||
- AC2: A `node_modules` directory exists and contains packages corresponding to `devDependencies`.
|
||||
- AC3: `npm run lint` command completes successfully without reporting any linting errors.
|
||||
- AC4: `npm run format` command completes successfully, potentially making formatting changes according to Prettier rules. Running it a second time should result in no changes.
|
||||
- AC5: `npm run test` command executes Jest successfully (it may report "no tests found" which is acceptable at this stage).
|
||||
- AC6: `npm run build` command executes successfully, creating a `dist` directory containing compiled JavaScript output.
|
||||
- AC7: The `.gitignore` file exists and includes entries for `node_modules/`, `.env`, `dist/`, etc. as specified in the boilerplate.
|
||||
|
||||
---
|
||||
|
||||
### Story 1.2: Setup Environment Configuration
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to establish the environment configuration mechanism using `.env` files, so that secrets and settings (like output paths) can be managed outside of version control, following boilerplate conventions.
|
||||
- **Detailed Requirements:**
|
||||
- Verify the `.env.example` file exists (from boilerplate).
|
||||
- Add an initial configuration variable `OUTPUT_DIR_PATH=./output` to `.env.example`.
|
||||
- Create the `.env` file locally by copying `.env.example`. Populate `OUTPUT_DIR_PATH` if needed (can keep default).
|
||||
- Implement a utility module (e.g., `src/config.ts`) that loads environment variables from the `.env` file at application startup.
|
||||
- The utility should export the loaded configuration values (initially just `OUTPUT_DIR_PATH`).
|
||||
- Ensure the `.env` file is listed in `.gitignore` and is not committed.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: `.env` files are loaded using the native `.env` support in Node.js 22; the `dotenv` package is not needed.
|
||||
- AC2: The `.env.example` file exists, is tracked by git, and contains the line `OUTPUT_DIR_PATH=./output`.
|
||||
- AC3: The `.env` file exists locally but is NOT tracked by git.
|
||||
- AC4: A configuration module (`src/config.ts` or similar) exists and successfully loads the `OUTPUT_DIR_PATH` value from `.env` when the application starts.
|
||||
- AC5: The loaded `OUTPUT_DIR_PATH` value is accessible within the application code.
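
As a rough illustration of the configuration module in Story 1.2, the sketch below builds a config object from an environment map. The `AppConfig` interface, the `loadConfig` name, and the fallback to `./output` when the variable is unset are assumptions for the example; the story itself only requires that `OUTPUT_DIR_PATH` be loaded and exported.

```typescript
// Hypothetical sketch of what src/config.ts could contain.
interface AppConfig {
  outputDirPath: string;
}

// Build the config from an environment map (for example `process.env`).
// The "./output" fallback mirrors the `.env.example` default and is an assumption.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const outputDirPath = env["OUTPUT_DIR_PATH"]?.trim();
  return { outputDirPath: outputDirPath ? outputDirPath : "./output" };
}
```

One way to satisfy AC1 is to start the process with Node's `--env-file=.env` flag so that `process.env` is already populated, then call `loadConfig(process.env)` at startup. Taking the environment as a parameter also keeps the module easy to unit test.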
|
||||
|
||||
---
|
||||
|
||||
### Story 1.3: Implement Basic CLI Entry Point & Execution
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic `src/index.ts` entry point that can be executed via the boilerplate's `dev` and `start` scripts, providing a working foundation for the application logic.
|
||||
- **Detailed Requirements:**
|
||||
- Create the main application entry point file at `src/index.ts`.
|
||||
- Implement minimal code within `src/index.ts` to:
|
||||
- Import the configuration loading mechanism (from Story 1.2).
|
||||
- Log a simple startup message to the console (e.g., "BMad Hacker Daily Digest - Starting Up...").
|
||||
- (Optional) Log the loaded `OUTPUT_DIR_PATH` to verify config loading.
|
||||
- Confirm execution using boilerplate scripts.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `src/index.ts` file exists.
|
||||
- AC2: Running `npm run dev` executes `src/index.ts` via `ts-node` and logs the startup message to the console.
|
||||
- AC3: Running `npm run build` successfully compiles `src/index.ts` (and any imports) into the `dist` directory.
|
||||
- AC4: Running `npm start` (after a successful build) executes the compiled code from `dist` and logs the startup message to the console.
|
||||
|
||||
---
|
||||
|
||||
### Story 1.4: Setup Basic Logging and Output Directory
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic console logging mechanism and the dynamic creation of a date-stamped output directory, so that the application can provide execution feedback and prepare for storing data artifacts in subsequent epics.
|
||||
- **Detailed Requirements:**
|
||||
- Implement a simple, reusable logging utility module (e.g., `src/logger.ts`). Initially, it can wrap `console.log`, `console.warn`, `console.error`.
|
||||
- Refactor `src/index.ts` to use this `logger` for its startup message(s).
|
||||
- In `src/index.ts` (or a setup function called by it):
|
||||
- Retrieve the `OUTPUT_DIR_PATH` from the configuration (loaded in Story 1.2).
|
||||
- Determine the current date in 'YYYY-MM-DD' format.
|
||||
- Construct the full path for the date-stamped subdirectory (e.g., `${OUTPUT_DIR_PATH}/YYYY-MM-DD`).
|
||||
- Check if the base output directory exists; if not, create it.
|
||||
- Check if the date-stamped subdirectory exists; if not, create it recursively. Use Node.js `fs` module (e.g., `fs.mkdirSync(path, { recursive: true })`).
|
||||
- Log (using the logger) the full path of the output directory being used for the current run (e.g., "Output directory for this run: ./output/2025-05-04").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: A logger utility module (`src/logger.ts` or similar) exists and is used for console output in `src/index.ts`.
|
||||
- AC2: Running `npm run dev` or `npm start` logs the startup message via the logger.
|
||||
- AC3: Running the application creates the base output directory (e.g., `./output` defined in `.env`) if it doesn't already exist.
|
||||
- AC4: Running the application creates a date-stamped subdirectory (e.g., `./output/2025-05-04`) within the base output directory if it doesn't already exist.
|
||||
- AC5: The application logs a message indicating the full path to the date-stamped output directory created/used for the current execution.
|
||||
- AC6: The application exits gracefully after performing these setup steps (for now).
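
The output directory setup in Story 1.4 could be sketched as follows. The helper names `formatDateStamp` and `ensureRunOutputDir` are hypothetical, and the choice to use the local calendar date (rather than UTC) is an assumption, since the story only asks for 'YYYY-MM-DD'.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Format a date as YYYY-MM-DD using the local calendar date.
function formatDateStamp(date: Date): string {
  const yyyy = String(date.getFullYear());
  const mm = ("0" + String(date.getMonth() + 1)).slice(-2);
  const dd = ("0" + String(date.getDate())).slice(-2);
  return `${yyyy}-${mm}-${dd}`;
}

// Create (if needed) and return the date-stamped output directory.
function ensureRunOutputDir(outputDirPath: string, now: Date = new Date()): string {
  const runDir = path.join(outputDirPath, formatDateStamp(now));
  // `recursive: true` also creates the base directory and does not fail
  // when the directories already exist.
  fs.mkdirSync(runDir, { recursive: true });
  return runDir;
}
```

Because `fs.mkdirSync` with `recursive: true` creates missing parent directories, a single call covers both the base directory check (AC3) and the date-stamped subdirectory (AC4). The returned path can then be logged for AC5.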
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change        | Date       | Version | Description           | Author |
| ------------- | ---------- | ------- | --------------------- | ------ |
| Initial Draft | 2025-05-04 | 0.1     | First draft of Epic 1 | 2-pm   |
|
||||
@@ -1,89 +0,0 @@
|
||||
# Epic 1: Project Initialization & Core Setup
|
||||
|
||||
**Goal:** Initialize the project using the "bmad-boilerplate", manage dependencies, set up `.env` and config loading, establish a basic CLI entry point, and set up basic logging and the output directory structure. This provides the foundational setup for all subsequent development work.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 1.1: Initialize Project from Boilerplate
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to set up the initial project structure using the `bmad-boilerplate`, so that I have the standard tooling (TS, Jest, ESLint, Prettier), configurations, and scripts in place.
|
||||
- **Detailed Requirements:**
|
||||
- Copy or clone the contents of the `bmad-boilerplate` into the new project's root directory.
|
||||
- Initialize a git repository in the project root directory (if not already done by cloning).
|
||||
- Ensure the `.gitignore` file from the boilerplate is present.
|
||||
- Run `npm install` to download and install all `devDependencies` specified in the boilerplate's `package.json`.
|
||||
- Verify that the core boilerplate scripts (`lint`, `format`, `test`, `build`) execute without errors on the initial codebase.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The project directory contains the files and structure from `bmad-boilerplate`.
|
||||
- AC2: A `node_modules` directory exists and contains packages corresponding to `devDependencies`.
|
||||
- AC3: `npm run lint` command completes successfully without reporting any linting errors.
|
||||
- AC4: `npm run format` command completes successfully, potentially making formatting changes according to Prettier rules. Running it a second time should result in no changes.
|
||||
- AC5: `npm run test` command executes Jest successfully (it may report "no tests found" which is acceptable at this stage).
|
||||
- AC6: `npm run build` command executes successfully, creating a `dist` directory containing compiled JavaScript output.
|
||||
- AC7: The `.gitignore` file exists and includes entries for `node_modules/`, `.env`, `dist/`, etc. as specified in the boilerplate.
|
||||
|
||||
---
|
||||
|
||||
### Story 1.2: Setup Environment Configuration
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to establish the environment configuration mechanism using `.env` files, so that secrets and settings (like output paths) can be managed outside of version control, following boilerplate conventions.
|
||||
- **Detailed Requirements:**
|
||||
- Verify the `.env.example` file exists (from boilerplate).
|
||||
- Add an initial configuration variable `OUTPUT_DIR_PATH=./output` to `.env.example`.
|
||||
- Create the `.env` file locally by copying `.env.example`. Populate `OUTPUT_DIR_PATH` if needed (can keep default).
|
||||
- Implement a utility module (e.g., `src/config.ts`) that loads environment variables from the `.env` file at application startup.
|
||||
- The utility should export the loaded configuration values (initially just `OUTPUT_DIR_PATH`).
|
||||
- Ensure the `.env` file is listed in `.gitignore` and is not committed.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: `.env` files are loaded using Node.js 22's native support (e.g., the `--env-file` flag); the `dotenv` package is not required.
|
||||
- AC2: The `.env.example` file exists, is tracked by git, and contains the line `OUTPUT_DIR_PATH=./output`.
|
||||
- AC3: The `.env` file exists locally but is NOT tracked by git.
|
||||
- AC4: A configuration module (`src/config.ts` or similar) exists and successfully loads the `OUTPUT_DIR_PATH` value from `.env` when the application starts.
|
||||
- AC5: The loaded `OUTPUT_DIR_PATH` value is accessible within the application code.
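
A minimal sketch of what the configuration module from Story 1.2 could look like. The names `AppConfig`, `buildConfig`, and `DEFAULT_OUTPUT_DIR` are hypothetical, and falling back to `./output` for a blank value is an assumption, not something the story requires.

```typescript
// Hypothetical shape of the values exported by src/config.ts.
export interface AppConfig {
  outputDirPath: string;
}

// Assumed fallback, matching the default listed in .env.example.
const DEFAULT_OUTPUT_DIR = "./output";

// Builds the config from an environment-like record (for example process.env).
// Missing or blank values fall back to the default path.
export function buildConfig(
  env: Record<string, string | undefined>,
): AppConfig {
  const raw = env["OUTPUT_DIR_PATH"]?.trim();
  return { outputDirPath: raw ? raw : DEFAULT_OUTPUT_DIR };
}
```

If the app is started with `node --env-file=.env dist/index.js`, Node populates `process.env` before the code runs, so the module could simply export `buildConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test without touching real files.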
|
||||
|
||||
---
|
||||
|
||||
### Story 1.3: Implement Basic CLI Entry Point & Execution
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic `src/index.ts` entry point that can be executed via the boilerplate's `dev` and `start` scripts, providing a working foundation for the application logic.
|
||||
- **Detailed Requirements:**
|
||||
- Create the main application entry point file at `src/index.ts`.
|
||||
- Implement minimal code within `src/index.ts` to:
|
||||
- Import the configuration loading mechanism (from Story 1.2).
|
||||
- Log a simple startup message to the console (e.g., "BMad Hacker Daily Digest - Starting Up...").
|
||||
- (Optional) Log the loaded `OUTPUT_DIR_PATH` to verify config loading.
|
||||
- Confirm execution using boilerplate scripts.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `src/index.ts` file exists.
|
||||
- AC2: Running `npm run dev` executes `src/index.ts` via `ts-node` and logs the startup message to the console.
|
||||
- AC3: Running `npm run build` successfully compiles `src/index.ts` (and any imports) into the `dist` directory.
|
||||
- AC4: Running `npm start` (after a successful build) executes the compiled code from `dist` and logs the startup message to the console.
|
||||
|
||||
---
|
||||
|
||||
### Story 1.4: Setup Basic Logging and Output Directory
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic console logging mechanism and the dynamic creation of a date-stamped output directory, so that the application can provide execution feedback and prepare for storing data artifacts in subsequent epics.
|
||||
- **Detailed Requirements:**
|
||||
- Implement a simple, reusable logging utility module (e.g., `src/logger.ts`). Initially, it can wrap `console.log`, `console.warn`, `console.error`.
|
||||
- Refactor `src/index.ts` to use this `logger` for its startup message(s).
|
||||
- In `src/index.ts` (or a setup function called by it):
|
||||
- Retrieve the `OUTPUT_DIR_PATH` from the configuration (loaded in Story 1.2).
|
||||
- Determine the current date in 'YYYY-MM-DD' format.
|
||||
- Construct the full path for the date-stamped subdirectory (e.g., `${OUTPUT_DIR_PATH}/YYYY-MM-DD`).
|
||||
- Check if the base output directory exists; if not, create it.
|
||||
- Check if the date-stamped subdirectory exists; if not, create it recursively. Use Node.js `fs` module (e.g., `fs.mkdirSync(path, { recursive: true })`).
|
||||
- Log (using the logger) the full path of the output directory being used for the current run (e.g., "Output directory for this run: ./output/2025-05-04").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: A logger utility module (`src/logger.ts` or similar) exists and is used for console output in `src/index.ts`.
|
||||
- AC2: Running `npm run dev` or `npm start` logs the startup message via the logger.
|
||||
- AC3: Running the application creates the base output directory (e.g., `./output` defined in `.env`) if it doesn't already exist.
|
||||
- AC4: Running the application creates a date-stamped subdirectory (e.g., `./output/2025-05-04`) within the base output directory if it doesn't already exist.
|
||||
- AC5: The application logs a message indicating the full path to the date-stamped output directory created/used for the current execution.
|
||||
- AC6: The application exits gracefully after performing these setup steps (for now).
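
The logging and directory requirements of Story 1.4 can be sketched roughly as follows. The names `logger`, `formatDate`, and `ensureRunDir` are illustrative, and using the local calendar date (rather than UTC) is an assumption the story leaves open.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Thin console wrapper, as the story suggests for a first logger.
export const logger = {
  info: (msg: string): void => console.log(`[INFO] ${msg}`),
  warn: (msg: string): void => console.warn(`[WARN] ${msg}`),
  error: (msg: string): void => console.error(`[ERROR] ${msg}`),
};

// Formats a date as YYYY-MM-DD using the local calendar date (assumption).
export function formatDate(date: Date): string {
  const yyyy = String(date.getFullYear());
  const mm = String(date.getMonth() + 1).padStart(2, "0");
  const dd = String(date.getDate()).padStart(2, "0");
  return `${yyyy}-${mm}-${dd}`;
}

// Creates <baseDir>/<YYYY-MM-DD> if it is missing and returns its path.
export function ensureRunDir(baseDir: string, date: Date = new Date()): string {
  const runDir = path.join(baseDir, formatDate(date));
  // recursive: true also creates baseDir and does not throw if the path exists.
  fs.mkdirSync(runDir, { recursive: true });
  logger.info(`Output directory for this run: ${runDir}`);
  return runDir;
}
```

Because `fs.mkdirSync` with `recursive: true` creates missing parent directories and tolerates existing ones, a single call covers both the base directory and the date-stamped subdirectory without separate existence checks.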
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change        | Date       | Version | Description           | Author |
| ------------- | ---------- | ------- | --------------------- | ------ |
| Initial Draft | 2025-05-04 | 0.1     | First draft of Epic 1 | 2-pm   |
|
||||
@@ -1,99 +0,0 @@
|
||||
# Epic 2: HN Data Acquisition & Persistence
|
||||
|
||||
**Goal:** Implement fetching top 10 stories and their comments (respecting limits) from Algolia HN API, and persist this raw data locally into the date-stamped output directory created in Epic 1. Implement a stage testing utility for fetching.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 2.1: Implement Algolia HN API Client
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a dedicated client module to interact with the Algolia Hacker News Search API, so that fetching stories and comments is encapsulated, reusable, and uses the required native `fetch` API.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/clients/algoliaHNClient.ts`.
|
||||
- Implement an async function `fetchTopStories` within the client:
|
||||
- Use native `fetch` to call the Algolia HN Search API endpoint for front-page stories (e.g., `http://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=10`). Adjust `hitsPerPage` if needed to ensure 10 stories.
|
||||
- Parse the JSON response.
|
||||
- Extract required metadata for each story: `objectID` (use as `storyId`), `title`, `url` (article URL), `points`, `num_comments`. Handle potential missing `url` field gracefully (log warning, maybe skip story later if URL needed).
|
||||
- Construct the `hnUrl` for each story (e.g., `https://news.ycombinator.com/item?id={storyId}`).
|
||||
- Return an array of structured story objects.
|
||||
- Implement a separate async function `fetchCommentsForStory` within the client:
|
||||
- Accept `storyId` and `maxComments` limit as arguments.
|
||||
- Use native `fetch` to call the Algolia HN Search API endpoint for comments of a specific story (e.g., `http://hn.algolia.com/api/v1/search?tags=comment,story_{storyId}&hitsPerPage={maxComments}`).
|
||||
- Parse the JSON response.
|
||||
- Extract required comment data: `objectID` (use as `commentId`), `comment_text`, `author`, `created_at`.
|
||||
- Filter out comments where `comment_text` is null or empty. Ensure only up to `maxComments` are returned.
|
||||
- Return an array of structured comment objects.
|
||||
- Implement basic error handling using `try...catch` around `fetch` calls and check `response.ok` status. Log errors using the logger utility from Epic 1.
|
||||
- Define TypeScript interfaces/types for the expected structures of API responses (stories, comments) and the data returned by the client functions (e.g., `Story`, `Comment`).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The module `src/clients/algoliaHNClient.ts` exists and exports `fetchTopStories` and `fetchCommentsForStory` functions.
|
||||
- AC2: Calling `fetchTopStories` makes a network request to the correct Algolia endpoint and returns a promise resolving to an array of 10 `Story` objects containing the specified metadata.
|
||||
- AC3: Calling `fetchCommentsForStory` with a valid `storyId` and `maxComments` limit makes a network request to the correct Algolia endpoint and returns a promise resolving to an array of `Comment` objects (up to `maxComments`), filtering out empty ones.
|
||||
- AC4: Both functions use the native `fetch` API internally.
|
||||
- AC5: Network errors or non-successful API responses (e.g., status 4xx, 5xx) are caught and logged using the logger.
|
||||
- AC6: Relevant TypeScript types (`Story`, `Comment`, etc.) are defined and used within the client module.
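
As a rough illustration of the story-fetching half of this client, the sketch below maps Algolia hits to `Story` objects. The hit shape (`AlgoliaStoryHit`), the injected `FetchLike` parameter, the `toStory` helper, and the choice to return an empty array on failure are all assumptions made for the example; the endpoint is the one quoted in the requirements.

```typescript
// Assumed subset of an Algolia hit; other fields are ignored.
interface AlgoliaStoryHit {
  objectID: string;
  title: string;
  url?: string | null;
  points: number | null;
  num_comments: number | null;
}

export interface Story {
  storyId: string;
  title: string;
  url: string | null;
  hnUrl: string;
  points: number;
  numComments: number;
}

// Minimal subset of the fetch contract, so a stub can be injected in tests.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Maps one raw hit to the structured story object.
export function toStory(hit: AlgoliaStoryHit): Story {
  return {
    storyId: hit.objectID,
    title: hit.title,
    url: hit.url ?? null, // a missing URL is kept as null for later stages
    hnUrl: `https://news.ycombinator.com/item?id=${hit.objectID}`,
    points: hit.points ?? 0,
    numComments: hit.num_comments ?? 0,
  };
}

// Fetches front-page stories; returns [] on any failure (assumed policy).
export async function fetchTopStories(
  fetchImpl: FetchLike,
  count = 10,
): Promise<Story[]> {
  const endpoint = `http://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=${count}`;
  try {
    const response = await fetchImpl(endpoint);
    if (!response.ok) {
      console.error(`Algolia request failed with status ${response.status}`);
      return [];
    }
    const body = (await response.json()) as { hits?: AlgoliaStoryHit[] };
    return (body.hits ?? []).slice(0, count).map(toStory);
  } catch (error) {
    console.error("Algolia request threw:", error);
    return [];
  }
}
```

In the application, the global `fetch` available in recent Node.js versions could be passed in directly, e.g. `await fetchTopStories(fetch)`. The real module would log through the Epic 1 logger instead of `console`.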
|
||||
|
||||
---
|
||||
|
||||
### Story 2.2: Integrate HN Data Fetching into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to integrate the HN data fetching logic into the main application workflow (`src/index.ts`), so that running the app retrieves the top 10 stories and their comments after completing the setup from Epic 1.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts` (or a main async function called by it).
|
||||
- Import the `algoliaHNClient` functions.
|
||||
- Import the configuration module to access `MAX_COMMENTS_PER_STORY`.
|
||||
- After the Epic 1 setup (config load, logger init, output dir creation), call `fetchTopStories()`.
|
||||
- Log the number of stories fetched.
|
||||
- Iterate through the array of fetched `Story` objects.
|
||||
- For each `Story`, call `fetchCommentsForStory()`, passing the `story.storyId` and the configured `MAX_COMMENTS_PER_STORY`.
|
||||
- Store the fetched comments within the corresponding `Story` object in memory (e.g., add a `comments: Comment[]` property to the `Story` object).
|
||||
- Log progress using the logger utility (e.g., "Fetched 10 stories.", "Fetching up to X comments for story {storyId}...").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes Epic 1 setup steps followed by fetching stories and then comments for each story.
|
||||
- AC2: Logs clearly show the start and successful completion of fetching stories, and the start of fetching comments for each of the 10 stories.
|
||||
- AC3: The configured `MAX_COMMENTS_PER_STORY` value is read from config and used in the calls to `fetchCommentsForStory`.
|
||||
- AC4: After successful execution, story objects held in memory contain a nested array of fetched comment objects. (Can be verified via debugger or temporary logging).
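
The integration loop described in Story 2.2 might look something like the sketch below. The `attachComments` helper, the trimmed-down `Story` and `Comment` types, and the injected `fetchComments` and `log` parameters are hypothetical; they stand in for the client function and logger so the flow can be shown in isolation.

```typescript
// Trimmed-down types for illustration only.
interface Comment {
  commentId: string;
  text: string;
}

interface Story {
  storyId: string;
  title: string;
}

type StoryWithComments = Story & { comments: Comment[] };

// Stand-in for the client's comment-fetching function.
type FetchComments = (storyId: string, maxComments: number) => Promise<Comment[]>;

// Fetches comments for each story in turn and attaches them in memory.
export async function attachComments(
  stories: Story[],
  fetchComments: FetchComments,
  maxComments: number,
  log: (msg: string) => void = console.log,
): Promise<StoryWithComments[]> {
  log(`Fetched ${stories.length} stories.`);
  const enriched: StoryWithComments[] = [];
  for (const story of stories) {
    log(`Fetching up to ${maxComments} comments for story ${story.storyId}...`);
    const comments = await fetchComments(story.storyId, maxComments);
    enriched.push({ ...story, comments });
  }
  return enriched;
}
```

The loop awaits each request before starting the next, which keeps the log output in story order and avoids sending ten requests at once. Whether sequential fetching is fast enough is a design choice the story does not settle.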
|
||||
|
||||
---
|
||||
|
||||
### Story 2.3: Persist Fetched HN Data Locally
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save the fetched HN stories (including their comments) to JSON files in the date-stamped output directory, so that the raw data is persisted locally for subsequent pipeline stages and debugging.
|
||||
- **Detailed Requirements:**
|
||||
- Define a consistent JSON structure for the output file content. Example: `{ storyId: "...", title: "...", url: "...", hnUrl: "...", points: ..., fetchedAt: "ISO_TIMESTAMP", comments: [{ commentId: "...", text: "...", author: "...", createdAt: "ISO_TIMESTAMP", ... }, ...] }`. Include a timestamp for when the data was fetched.
|
||||
- Import Node.js `fs` (specifically `fs.writeFileSync`) and `path` modules.
|
||||
- In the main workflow (`src/index.ts`), within the loop iterating through stories (after comments have been fetched and added to the story object in Story 2.2):
|
||||
- Get the full path to the date-stamped output directory (determined in Epic 1).
|
||||
- Construct the filename for the story's data: `{storyId}_data.json`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Serialize the complete story object (including comments and fetch timestamp) to a JSON string using `JSON.stringify(storyObject, null, 2)` for readability.
|
||||
- Write the JSON string to the file using `fs.writeFileSync()`. Use a `try...catch` block for error handling.
|
||||
- Log (using the logger) the successful persistence of each story's data file or any errors encountered during file writing.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory (e.g., `./output/YYYY-MM-DD/`) contains exactly 10 files named `{storyId}_data.json`.
|
||||
- AC2: Each JSON file contains valid JSON representing a single story object, including its metadata, fetch timestamp, and an array of its fetched comments, matching the defined structure.
|
||||
- AC3: The number of comments in each file's `comments` array does not exceed `MAX_COMMENTS_PER_STORY`.
|
||||
- AC4: Logs indicate that saving data to a file was attempted for each story, reporting success or specific file writing errors.
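
A possible shape for the persistence step in Story 2.3 is sketched here. `StoredStory`, `storyFilePath`, `serializeStory`, and `persistStory` are illustrative names, and stamping `fetchedAt` at serialization time is an assumption; the story only requires that a fetch timestamp be included.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Illustrative subset of the persisted story structure.
interface StoredStory {
  storyId: string;
  title: string;
  comments: unknown[];
}

// Builds <runDir>/<storyId>_data.json.
export function storyFilePath(runDir: string, storyId: string): string {
  return path.join(runDir, `${storyId}_data.json`);
}

// Serializes the story with a fetchedAt timestamp, pretty-printed with 2 spaces.
export function serializeStory(story: StoredStory, now: Date = new Date()): string {
  return JSON.stringify({ ...story, fetchedAt: now.toISOString() }, null, 2);
}

// Writes the story file; returns true on success and false on a write error.
export function persistStory(runDir: string, story: StoredStory): boolean {
  const filePath = storyFilePath(runDir, story.storyId);
  try {
    fs.writeFileSync(filePath, serializeStory(story), "utf-8");
    console.log(`Saved story data to ${filePath}`);
    return true;
  } catch (error) {
    console.error(`Failed to write ${filePath}:`, error);
    return false;
  }
}
```

Returning a boolean rather than rethrowing lets the main loop continue with the remaining stories after a single failed write, which matches the "log and carry on" tone of the acceptance criteria.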
|
||||
|
||||
---
|
||||
|
||||
### Story 2.4: Implement Stage Testing Utility for HN Fetching
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate, executable script that *only* performs the HN data fetching and persistence, so I can test and trigger this stage independently of the full pipeline.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new standalone script file: `src/stages/fetch_hn_data.ts`.
|
||||
- This script should perform the essential setup required for this stage: initialize logger, load configuration (`.env`), determine and create output directory (reuse or replicate logic from Epic 1 / `src/index.ts`).
|
||||
- The script should then execute the core logic of fetching stories via `algoliaHNClient.fetchTopStories`, fetching comments via `algoliaHNClient.fetchCommentsForStory` (using loaded config for limit), and persisting the results to JSON files using `fs.writeFileSync` (replicating logic from Story 2.3).
|
||||
- The script should log its progress using the logger utility.
|
||||
- Add a new script command to `package.json` under `"scripts"`: `"stage:fetch": "ts-node src/stages/fetch_hn_data.ts"`.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/fetch_hn_data.ts` exists.
|
||||
- AC2: The script `stage:fetch` is defined in `package.json`'s `scripts` section.
|
||||
- AC3: Running `npm run stage:fetch` executes successfully, performing only the setup, fetch, and persist steps.
|
||||
- AC4: Running `npm run stage:fetch` creates the same 10 `{storyId}_data.json` files in the correct date-stamped output directory as running the main `npm run dev` command (at the current state of development).
|
||||
- AC5: Logs generated by `npm run stage:fetch` reflect only the fetching and persisting steps, not subsequent pipeline stages.
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change        | Date       | Version | Description           | Author |
| ------------- | ---------- | ------- | --------------------- | ------ |
| Initial Draft | 2025-05-04 | 0.1     | First draft of Epic 2 | 2-pm   |
|
||||
@@ -1,99 +0,0 @@
|
||||
# Epic 2: HN Data Acquisition & Persistence
|
||||
|
||||
**Goal:** Implement fetching top 10 stories and their comments (respecting limits) from Algolia HN API, and persist this raw data locally into the date-stamped output directory created in Epic 1. Implement a stage testing utility for fetching.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 2.1: Implement Algolia HN API Client
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a dedicated client module to interact with the Algolia Hacker News Search API, so that fetching stories and comments is encapsulated, reusable, and uses the required native `fetch` API.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/clients/algoliaHNClient.ts`.
|
||||
- Implement an async function `fetchTopStories` within the client:
|
||||
- Use native `fetch` to call the Algolia HN Search API endpoint for front-page stories (e.g., `http://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=10`). Adjust `hitsPerPage` if needed to ensure 10 stories.
|
||||
- Parse the JSON response.
|
||||
- Extract required metadata for each story: `objectID` (use as `storyId`), `title`, `url` (article URL), `points`, `num_comments`. Handle potential missing `url` field gracefully (log warning, maybe skip story later if URL needed).
|
||||
- Construct the `hnUrl` for each story (e.g., `https://news.ycombinator.com/item?id={storyId}`).
|
||||
- Return an array of structured story objects.
|
||||
- Implement a separate async function `fetchCommentsForStory` within the client:
|
||||
- Accept `storyId` and `maxComments` limit as arguments.
|
||||
- Use native `fetch` to call the Algolia HN Search API endpoint for comments of a specific story (e.g., `http://hn.algolia.com/api/v1/search?tags=comment,story_{storyId}&hitsPerPage={maxComments}`).
|
||||
- Parse the JSON response.
|
||||
- Extract required comment data: `objectID` (use as `commentId`), `comment_text`, `author`, `created_at`.
|
||||
- Filter out comments where `comment_text` is null or empty. Ensure only up to `maxComments` are returned.
|
||||
- Return an array of structured comment objects.
|
||||
- Implement basic error handling using `try...catch` around `fetch` calls and check `response.ok` status. Log errors using the logger utility from Epic 1.
|
||||
- Define TypeScript interfaces/types for the expected structures of API responses (stories, comments) and the data returned by the client functions (e.g., `Story`, `Comment`).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The module `src/clients/algoliaHNClient.ts` exists and exports `fetchTopStories` and `fetchCommentsForStory` functions.
|
||||
- AC2: Calling `fetchTopStories` makes a network request to the correct Algolia endpoint and returns a promise resolving to an array of 10 `Story` objects containing the specified metadata.
|
||||
- AC3: Calling `fetchCommentsForStory` with a valid `storyId` and `maxComments` limit makes a network request to the correct Algolia endpoint and returns a promise resolving to an array of `Comment` objects (up to `maxComments`), filtering out empty ones.
|
||||
- AC4: Both functions use the native `fetch` API internally.
|
||||
- AC5: Network errors or non-successful API responses (e.g., status 4xx, 5xx) are caught and logged using the logger.
|
||||
- AC6: Relevant TypeScript types (`Story`, `Comment`, etc.) are defined and used within the client module.
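
The comment filtering rule in this story (drop null or empty `comment_text`, return at most `maxComments`) can be illustrated with a small pure function. The `AlgoliaCommentHit` shape and the `toComments` name are assumptions, as is treating whitespace-only text as empty.

```typescript
// Assumed subset of an Algolia comment hit.
interface AlgoliaCommentHit {
  objectID: string;
  comment_text: string | null;
  author: string;
  created_at: string;
}

export interface Comment {
  commentId: string;
  text: string;
  author: string;
  createdAt: string;
}

// Keeps non-empty comments in their original order, up to maxComments.
export function toComments(
  hits: AlgoliaCommentHit[],
  maxComments: number,
): Comment[] {
  const comments: Comment[] = [];
  for (const hit of hits) {
    if (comments.length >= maxComments) break;
    const raw = hit.comment_text;
    // Skip null, empty, and (by assumption) whitespace-only comments.
    if (raw === null || raw.trim() === "") continue;
    comments.push({
      commentId: hit.objectID,
      text: raw,
      author: hit.author,
      createdAt: hit.created_at,
    });
  }
  return comments;
}
```

One consequence worth noting: because filtering happens after the API call, requesting exactly `maxComments` hits can return fewer usable comments than the limit. Requesting a few extra hits would offset that, though the story does not ask for it.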
|
||||
|
||||
---
|
||||
|
||||
### Story 2.2: Integrate HN Data Fetching into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to integrate the HN data fetching logic into the main application workflow (`src/index.ts`), so that running the app retrieves the top 10 stories and their comments after completing the setup from Epic 1.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts` (or a main async function called by it).
|
||||
- Import the `algoliaHNClient` functions.
|
||||
- Import the configuration module to access `MAX_COMMENTS_PER_STORY`.
|
||||
- After the Epic 1 setup (config load, logger init, output dir creation), call `fetchTopStories()`.
|
||||
- Log the number of stories fetched.
|
||||
- Iterate through the array of fetched `Story` objects.
|
||||
- For each `Story`, call `fetchCommentsForStory()`, passing the `story.storyId` and the configured `MAX_COMMENTS_PER_STORY`.
|
||||
- Store the fetched comments within the corresponding `Story` object in memory (e.g., add a `comments: Comment[]` property to the `Story` object).
|
||||
- Log progress using the logger utility (e.g., "Fetched 10 stories.", "Fetching up to X comments for story {storyId}...").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes Epic 1 setup steps followed by fetching stories and then comments for each story.
|
||||
- AC2: Logs clearly show the start and successful completion of fetching stories, and the start of fetching comments for each of the 10 stories.
|
||||
- AC3: The configured `MAX_COMMENTS_PER_STORY` value is read from config and used in the calls to `fetchCommentsForStory`.
|
||||
- AC4: After successful execution, story objects held in memory contain a nested array of fetched comment objects. (Can be verified via debugger or temporary logging).
|
||||
|
||||
---
|
||||
|
||||
### Story 2.3: Persist Fetched HN Data Locally
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save the fetched HN stories (including their comments) to JSON files in the date-stamped output directory, so that the raw data is persisted locally for subsequent pipeline stages and debugging.
|
||||
- **Detailed Requirements:**
|
||||
- Define a consistent JSON structure for the output file content. Example: `{ storyId: "...", title: "...", url: "...", hnUrl: "...", points: ..., fetchedAt: "ISO_TIMESTAMP", comments: [{ commentId: "...", text: "...", author: "...", createdAt: "ISO_TIMESTAMP", ... }, ...] }`. Include a timestamp for when the data was fetched.
|
||||
- Import Node.js `fs` (specifically `fs.writeFileSync`) and `path` modules.
|
||||
- In the main workflow (`src/index.ts`), within the loop iterating through stories (after comments have been fetched and added to the story object in Story 2.2):
|
||||
- Get the full path to the date-stamped output directory (determined in Epic 1).
|
||||
- Construct the filename for the story's data: `{storyId}_data.json`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Serialize the complete story object (including comments and fetch timestamp) to a JSON string using `JSON.stringify(storyObject, null, 2)` for readability.
|
||||
- Write the JSON string to the file using `fs.writeFileSync()`. Use a `try...catch` block for error handling.
|
||||
- Log (using the logger) the successful persistence of each story's data file or any errors encountered during file writing.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory (e.g., `./output/YYYY-MM-DD/`) contains exactly 10 files named `{storyId}_data.json`.
|
||||
- AC2: Each JSON file contains valid JSON representing a single story object, including its metadata, fetch timestamp, and an array of its fetched comments, matching the defined structure.
|
||||
- AC3: The number of comments in each file's `comments` array does not exceed `MAX_COMMENTS_PER_STORY`.
|
||||
- AC4: Logs indicate that saving data to a file was attempted for each story, reporting success or specific file writing errors.
|
||||
|
||||
---
|
||||
|
||||
### Story 2.4: Implement Stage Testing Utility for HN Fetching
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate, executable script that *only* performs the HN data fetching and persistence, so I can test and trigger this stage independently of the full pipeline.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new standalone script file: `src/stages/fetch_hn_data.ts`.
|
||||
- This script should perform the essential setup required for this stage: initialize logger, load configuration (`.env`), determine and create output directory (reuse or replicate logic from Epic 1 / `src/index.ts`).
|
||||
- The script should then execute the core logic of fetching stories via `algoliaHNClient.fetchTopStories`, fetching comments via `algoliaHNClient.fetchCommentsForStory` (using loaded config for limit), and persisting the results to JSON files using `fs.writeFileSync` (replicating logic from Story 2.3).
|
||||
- The script should log its progress using the logger utility.
|
||||
- Add a new script command to `package.json` under `"scripts"`: `"stage:fetch": "ts-node src/stages/fetch_hn_data.ts"`.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/fetch_hn_data.ts` exists.
|
||||
- AC2: The script `stage:fetch` is defined in `package.json`'s `scripts` section.
|
||||
- AC3: Running `npm run stage:fetch` executes successfully, performing only the setup, fetch, and persist steps.
|
||||
- AC4: Running `npm run stage:fetch` creates the same 10 `{storyId}_data.json` files in the correct date-stamped output directory as running the main `npm run dev` command (at the current state of development).
|
||||
- AC5: Logs generated by `npm run stage:fetch` reflect only the fetching and persisting steps, not subsequent pipeline stages.
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change        | Date       | Version | Description           | Author |
| ------------- | ---------- | ------- | --------------------- | ------ |
| Initial Draft | 2025-05-04 | 0.1     | First draft of Epic 2 | 2-pm   |
|
||||
@@ -1,111 +0,0 @@
|
||||
# Epic 3: Article Scraping & Persistence
|
||||
|
||||
**Goal:** Implement a best-effort article scraping mechanism to fetch and extract plain text content from the external URLs associated with fetched HN stories. Handle failures gracefully and persist successfully scraped text locally. Implement a stage testing utility for scraping.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 3.1: Implement Basic Article Scraper Module
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module that attempts to fetch HTML from a URL and extract the main article text using basic methods, handling common failures gracefully, so article content can be prepared for summarization.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/scraper/articleScraper.ts`.
|
||||
- Add a suitable HTML parsing/extraction library dependency (e.g., `@extractus/article-extractor` recommended for simplicity, or `cheerio` for more control). Run `npm install @extractus/article-extractor --save-prod` (or chosen alternative).
|
||||
- Implement an async function `scrapeArticle(url: string): Promise<string | null>` within the module.
|
||||
- Inside the function:
|
||||
- Use native `fetch` to retrieve content from the `url`. Set a reasonable timeout (e.g., 10-15 seconds). Include a `User-Agent` header to mimic a browser.
|
||||
- Handle potential `fetch` errors (network errors, timeouts) using `try...catch`.
|
||||
- Check the `response.ok` status. If not okay, log error and return `null`.
|
||||
- Check the `Content-Type` header of the response. If it doesn't indicate HTML (e.g., does not include `text/html`), log warning and return `null`.
|
||||
- If HTML is received, attempt to extract the main article text using the chosen library (`article-extractor` preferred).
|
||||
- Wrap the extraction logic in a `try...catch` to handle library-specific errors.
|
||||
- Return the extracted plain text string if successful. Ensure it's just text, not HTML markup.
|
||||
- Return `null` if extraction fails or results in empty content.
|
||||
- Log all significant events, errors, or reasons for returning null (e.g., "Scraping URL...", "Fetch failed:", "Non-HTML content type:", "Extraction failed:", "Successfully extracted text") using the logger utility.
|
||||
- Define TypeScript types/interfaces as needed.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `articleScraper.ts` module exists and exports the `scrapeArticle` function.
|
||||
- AC2: The chosen scraping library (e.g., `@extractus/article-extractor`) is added to `dependencies` in `package.json`.
|
||||
- AC3: `scrapeArticle` uses native `fetch` with a timeout and User-Agent header.
|
||||
- AC4: `scrapeArticle` correctly handles fetch errors, non-OK responses, and non-HTML content types by logging and returning `null`.
|
||||
- AC5: `scrapeArticle` uses the chosen library to attempt text extraction from valid HTML content.
|
||||
- AC6: `scrapeArticle` returns the extracted plain text on success, and `null` on any failure (fetch, non-HTML, extraction error, empty result).
|
||||
- AC7: Relevant logs are produced for success, failure modes, and errors encountered during the process.
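
The control flow of `scrapeArticle` could be sketched as below. To stay library-neutral, the extraction step is passed in as a hypothetical `Extractor` function rather than calling a specific package, and `fetch` is injected through an assumed `FetchLike` type. `stripTags` is a deliberately naive tag remover used only for illustration, and the User-Agent string is made up.

```typescript
// Minimal fetch contract used by the sketch; Node's global fetch fits it.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string>; signal: AbortSignal },
) => Promise<{
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
  text(): Promise<string>;
}>;

// Hypothetical extractor: takes HTML, returns article content or null.
type Extractor = (html: string, url: string) => Promise<string | null>;

export function isHtmlContentType(contentType: string | null): boolean {
  return contentType !== null && contentType.toLowerCase().includes("text/html");
}

// Naive tag removal; does not decode HTML entities.
export function stripTags(html: string): string {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

export async function scrapeArticle(
  url: string,
  fetchImpl: FetchLike,
  extract: Extractor,
  timeoutMs = 15000,
): Promise<string | null> {
  console.log(`Scraping URL: ${url}`);
  try {
    const response = await fetchImpl(url, {
      headers: { "User-Agent": "Mozilla/5.0 (compatible; DigestBot/0.1)" },
      signal: AbortSignal.timeout(timeoutMs), // aborts the request on timeout
    });
    if (!response.ok) {
      console.error(`Fetch failed: status ${response.status} for ${url}`);
      return null;
    }
    if (!isHtmlContentType(response.headers.get("content-type"))) {
      console.warn(`Non-HTML content type for ${url}`);
      return null;
    }
    const extracted = await extract(await response.text(), url);
    const text = extracted ? stripTags(extracted) : "";
    if (text.length === 0) {
      console.warn(`Extraction failed or was empty for ${url}`);
      return null;
    }
    console.log(`Successfully extracted text from ${url}`);
    return text;
  } catch (error) {
    // Covers network errors, timeouts, and extractor exceptions alike.
    console.error(`Scraping failed for ${url}:`, error);
    return null;
  }
}
```

A single outer `try...catch` is used here for brevity; the story asks for a separate one around extraction, which would allow more specific log messages for each failure mode.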
|
||||
|
||||
---
|
||||
|
||||
### Story 3.2: Integrate Article Scraping into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to integrate the article scraper into the main workflow (`src/index.ts`), attempting to scrape the article for each HN story that has a valid URL, after fetching its data.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts`.
|
||||
- Import the `scrapeArticle` function from `src/scraper/articleScraper.ts`.
|
||||
- Within the main loop iterating through the fetched stories (after comments are fetched in Epic 2):
|
||||
- Check if `story.url` exists and appears to be a valid HTTP/HTTPS URL. A simple check for starting with `http://` or `https://` is sufficient.
|
||||
- If the URL is missing or invalid, log a warning ("Skipping scraping for story {storyId}: Missing or invalid URL") and proceed to the next story's processing step.
|
||||
- If a valid URL exists, log ("Attempting to scrape article for story {storyId} from {story.url}").
|
||||
- Call `await scrapeArticle(story.url)`.
|
||||
- Store the result (the extracted text string or `null`) in memory, associated with the story object (e.g., add property `articleContent: string | null`).
|
||||
- Log the outcome clearly (e.g., "Successfully scraped article for story {storyId}", "Failed to scrape article for story {storyId}").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes Epic 1 & 2 steps, and then attempts article scraping for stories with valid URLs.
|
||||
- AC2: Stories with missing or invalid URLs are skipped, and a corresponding log message is generated.
|
||||
- AC3: For stories with valid URLs, the `scrapeArticle` function is called.
|
||||
- AC4: Logs clearly indicate the start and success/failure outcome of the scraping attempt for each relevant story.
|
||||
- AC5: Story objects held in memory after this stage contain an `articleContent` property holding the scraped text (string) or `null` if scraping was skipped or failed.
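
The URL check from Story 3.2 is intentionally simple. A minimal version, assuming the hypothetical helper name `hasScrapableUrl`, could be:

```typescript
// Returns true when the value is a string starting with http:// or https://.
// This is only the simple prefix check the story calls for, not full URL validation.
export function hasScrapableUrl(
  url: string | null | undefined,
): url is string {
  return (
    typeof url === "string" &&
    (url.startsWith("http://") || url.startsWith("https://"))
  );
}

// Example use inside the main loop (illustrative only).
export function describeScrapeDecision(storyId: string, url?: string | null): string {
  return hasScrapableUrl(url)
    ? `Attempting to scrape article for story ${storyId} from ${url}`
    : `Skipping scraping for story ${storyId}: Missing or invalid URL`;
}
```

Declaring the return type as `url is string` lets TypeScript narrow the value after the check, so `scrapeArticle(story.url)` type-checks without a cast.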
|
||||
|
||||
---
|
||||
|
||||
### Story 3.3: Persist Scraped Article Text Locally
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save successfully scraped article text to a separate local file for each story, so that the text content is available as input for the summarization stage.
|
||||
- **Detailed Requirements:**
|
||||
- Import Node.js `fs` and `path` modules if not already present in `src/index.ts`.
|
||||
- In the main workflow (`src/index.ts`), immediately after a successful call to `scrapeArticle` for a story (where the result is a non-null string):
|
||||
- Retrieve the full path to the current date-stamped output directory.
|
||||
- Construct the filename: `{storyId}_article.txt`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Get the successfully scraped article text string (`articleContent`).
|
||||
- Use `fs.writeFileSync(fullPath, articleContent, 'utf-8')` to save the text to the file. Wrap in `try...catch` for file system errors.
|
||||
- Log the successful saving of the file (e.g., "Saved scraped article text to {filename}") or any file writing errors encountered.
|
||||
- Ensure *no* `_article.txt` file is created if `scrapeArticle` returned `null` (due to skipping or failure).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory contains `_article.txt` files *only* for those stories where `scrapeArticle` succeeded and returned text content.
|
||||
- AC2: The name of each article text file is `{storyId}_article.txt`.
|
||||
- AC3: The content of each `_article.txt` file is the plain text string returned by `scrapeArticle`.
|
||||
- AC4: Logs confirm the successful writing of each `_article.txt` file or report specific file writing errors.
|
||||
- AC5: No empty `_article.txt` files are created. Files only exist if scraping was successful.
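
The persistence step in Story 3.3 could be sketched as follows. This is a minimal illustration, not the project's implementation: the helper name `saveArticleText` is hypothetical, `console` stands in for the project's logger utility, and the output directory is passed in rather than read from configuration.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

/**
 * Writes scraped text to `{storyId}_article.txt` inside `outputDir`.
 * Returns the full path on success, or null when there is nothing to
 * write or the write fails. (Hypothetical helper, for illustration only.)
 */
export function saveArticleText(
  outputDir: string,
  storyId: string,
  articleContent: string | null
): string | null {
  // Never create a file for skipped or failed scrapes (AC1, AC5).
  if (articleContent === null || articleContent.length === 0) {
    return null;
  }
  const filename = `${storyId}_article.txt`;
  const fullPath = path.join(outputDir, filename);
  try {
    fs.writeFileSync(fullPath, articleContent, "utf-8");
    console.info(`Saved scraped article text to ${filename}`);
    return fullPath;
  } catch (error) {
    // File system errors are logged and reported as null, not rethrown.
    console.error(`Failed to write ${filename}:`, error);
    return null;
  }
}

// Usage against a throwaway temp directory (stand-in for the date-stamped dir).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "articles-"));
const savedPath = saveArticleText(demoDir, "12345", "Example article text.");
const skippedPath = saveArticleText(demoDir, "67890", null);
```

Returning `null` instead of throwing keeps one failed write from stopping the loop over the remaining stories.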
|
||||
|
||||
---
|
||||
|
||||
### Story 3.4: Implement Stage Testing Utility for Scraping
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate script/command to test the article scraping logic using HN story data from local files, allowing independent testing and debugging of the scraper.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new standalone script file: `src/stages/scrape_articles.ts`.
|
||||
- Import necessary modules: `fs`, `path`, `logger`, `config`, `scrapeArticle`.
|
||||
- The script should:
|
||||
- Initialize the logger.
|
||||
- Load configuration (to get `OUTPUT_DIR_PATH`).
|
||||
- Determine the target date-stamped directory path (e.g., `${OUTPUT_DIR_PATH}/YYYY-MM-DD`, using the current date or potentially an optional CLI argument). Ensure this directory exists.
|
||||
- Read the directory contents and identify all `{storyId}_data.json` files.
|
||||
- For each `_data.json` file found:
|
||||
- Read and parse the JSON content.
|
||||
- Extract the `storyId` and `url`.
|
||||
- If a valid `url` exists, call `await scrapeArticle(url)`.
|
||||
- If scraping succeeds (returns text), save the text to `{storyId}_article.txt` in the same directory (using logic from Story 3.3). Overwrite if the file exists.
|
||||
- Log the progress and outcome (skip/success/fail) for each story processed.
|
||||
- Add a new script command to `package.json`: `"stage:scrape": "ts-node src/stages/scrape_articles.ts"`. Consider adding argument parsing later if needed to specify a date/directory.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/scrape_articles.ts` exists.
|
||||
- AC2: The script `stage:scrape` is defined in `package.json`.
|
||||
- AC3: Running `npm run stage:scrape` (assuming a directory with `_data.json` files exists from a previous `stage:fetch` run) reads these files.
|
||||
- AC4: The script calls `scrapeArticle` for stories with valid URLs found in the JSON files.
|
||||
- AC5: The script creates/updates `{storyId}_article.txt` files in the target directory corresponding to successfully scraped articles.
|
||||
- AC6: The script logs its actions (reading files, attempting scraping, saving results) for each story ID processed.
|
||||
- AC7: The script operates solely based on local `_data.json` files and fetching from external article URLs; it does not call the Algolia HN API.
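
The directory handling in Story 3.4 might look roughly like the sketch below. The function names are hypothetical, and deriving the date stamp from UTC is an assumption, since the story does not say which time zone the `YYYY-MM-DD` folder uses.

```typescript
import * as path from "path";

/** Matches `{storyId}_data.json` and captures the story ID. */
const DATA_FILE_PATTERN = /^(.+)_data\.json$/;

/** Returns the story IDs for every `{storyId}_data.json` name in the list. */
export function storyIdsFromFilenames(filenames: string[]): string[] {
  const ids: string[] = [];
  for (const name of filenames) {
    const match = DATA_FILE_PATTERN.exec(name);
    if (match && match[1] !== undefined) {
      ids.push(match[1]);
    }
  }
  return ids;
}

/**
 * Builds `${outputDirPath}/YYYY-MM-DD` for the given date.
 * Assumption: the date stamp is taken in UTC.
 */
export function dateStampedDir(outputDirPath: string, date: Date): string {
  const stamp = date.toISOString().slice(0, 10);
  return path.join(outputDirPath, stamp);
}

// Usage: in the real script the names would come from fs.readdirSync(targetDir).
const exampleIds = storyIdsFromFilenames([
  "111_data.json",
  "111_article.txt",
  "222_data.json",
  "notes.md",
]);
const exampleDir = dateStampedDir("output", new Date("2025-05-04T12:00:00Z"));
```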
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------- | -------------- |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 3 | 2-pm |
|
||||
@@ -1,111 +0,0 @@
|
||||
# Epic 3: Article Scraping & Persistence
|
||||
|
||||
**Goal:** Implement a best-effort article scraping mechanism to fetch and extract plain text content from the external URLs associated with fetched HN stories. Handle failures gracefully and persist successfully scraped text locally. Implement a stage testing utility for scraping.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 3.1: Implement Basic Article Scraper Module
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module that attempts to fetch HTML from a URL and extract the main article text using basic methods, handling common failures gracefully, so article content can be prepared for summarization.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/scraper/articleScraper.ts`.
|
||||
- Add a suitable HTML parsing/extraction library dependency (e.g., `@extractus/article-extractor` recommended for simplicity, or `cheerio` for more control). Run `npm install @extractus/article-extractor --save-prod` (or chosen alternative).
|
||||
- Implement an async function `scrapeArticle(url: string): Promise<string | null>` within the module.
|
||||
- Inside the function:
|
||||
- Use native `fetch` to retrieve content from the `url`. Set a reasonable timeout (e.g., 10-15 seconds). Include a `User-Agent` header to mimic a browser.
|
||||
- Handle potential `fetch` errors (network errors, timeouts) using `try...catch`.
|
||||
- Check the `response.ok` status. If not okay, log error and return `null`.
|
||||
- Check the `Content-Type` header of the response. If it doesn't indicate HTML (e.g., does not include `text/html`), log warning and return `null`.
|
||||
- If HTML is received, attempt to extract the main article text using the chosen library (`article-extractor` preferred).
|
||||
- Wrap the extraction logic in a `try...catch` to handle library-specific errors.
|
||||
- Return the extracted plain text string if successful. Ensure it's just text, not HTML markup.
|
||||
- Return `null` if extraction fails or results in empty content.
|
||||
- Log all significant events, errors, or reasons for returning null (e.g., "Scraping URL...", "Fetch failed:", "Non-HTML content type:", "Extraction failed:", "Successfully extracted text") using the logger utility.
|
||||
- Define TypeScript types/interfaces as needed.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `articleScraper.ts` module exists and exports the `scrapeArticle` function.
|
||||
- AC2: The chosen scraping library (e.g., `@extractus/article-extractor`) is added to `dependencies` in `package.json`.
|
||||
- AC3: `scrapeArticle` uses native `fetch` with a timeout and User-Agent header.
|
||||
- AC4: `scrapeArticle` correctly handles fetch errors, non-OK responses, and non-HTML content types by logging and returning `null`.
|
||||
- AC5: `scrapeArticle` uses the chosen library to attempt text extraction from valid HTML content.
|
||||
- AC6: `scrapeArticle` returns the extracted plain text on success, and `null` on any failure (fetch, non-HTML, extraction error, empty result).
|
||||
- AC7: Relevant logs are produced for success, failure modes, and errors encountered during the process.
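
The control flow described in Story 3.1 could be sketched as follows, assuming Node 18+ (global `fetch` and `AbortSignal.timeout`). Several things here are assumptions made for illustration: the extraction step is injected as a `TextExtractor` callback rather than tied to a specific library call, the `User-Agent` string is a placeholder, and `console` stands in for the logger utility. The story itself specifies a single-argument `scrapeArticle(url)`.

```typescript
/** Hypothetical extractor signature: HTML in, plain text (or null) out. */
type TextExtractor = (html: string, url: string) => Promise<string | null>;

const FETCH_TIMEOUT_MS = 15_000; // upper end of the suggested 10-15 seconds
const USER_AGENT = "Mozilla/5.0 (compatible; example-scraper/0.1)"; // placeholder value

/** True when a Content-Type header value indicates HTML. */
export function isHtmlContentType(contentType: string | null): boolean {
  return contentType !== null && contentType.toLowerCase().includes("text/html");
}

export async function scrapeArticle(
  url: string,
  extractText: TextExtractor
): Promise<string | null> {
  console.info(`Scraping URL ${url}`);
  let html: string;
  try {
    const response = await fetch(url, {
      headers: { "User-Agent": USER_AGENT },
      signal: AbortSignal.timeout(FETCH_TIMEOUT_MS),
    });
    if (!response.ok) {
      console.error(`Fetch failed: HTTP ${response.status} for ${url}`);
      return null;
    }
    const contentType = response.headers.get("content-type");
    if (!isHtmlContentType(contentType)) {
      console.warn(`Non-HTML content type: ${contentType} for ${url}`);
      return null;
    }
    html = await response.text();
  } catch (error) {
    // Network errors and timeouts both land here.
    console.error(`Fetch failed: ${url}`, error);
    return null;
  }

  try {
    const text = await extractText(html, url);
    if (text === null || text.trim().length === 0) {
      console.warn(`Extraction failed: empty result for ${url}`);
      return null;
    }
    console.info(`Successfully extracted text from ${url}`);
    return text;
  } catch (error) {
    // Library-specific extraction errors are isolated from fetch errors.
    console.error(`Extraction failed: ${url}`, error);
    return null;
  }
}
```

Keeping the fetch and extraction steps in separate `try...catch` blocks makes the log output say which stage failed.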
|
||||
|
||||
---
|
||||
|
||||
### Story 3.2: Integrate Article Scraping into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to integrate the article scraper into the main workflow (`src/index.ts`), attempting to scrape the article for each HN story that has a valid URL, after fetching its data.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts`.
|
||||
- Import the `scrapeArticle` function from `src/scraper/articleScraper.ts`.
|
||||
- Within the main loop iterating through the fetched stories (after comments are fetched in Epic 2):
|
||||
- Check if `story.url` exists and appears to be a valid HTTP/HTTPS URL. A simple check for starting with `http://` or `https://` is sufficient.
|
||||
- If the URL is missing or invalid, log a warning ("Skipping scraping for story {storyId}: Missing or invalid URL") and proceed to the next story's processing step.
|
||||
- If a valid URL exists, log ("Attempting to scrape article for story {storyId} from {story.url}").
|
||||
- Call `await scrapeArticle(story.url)`.
|
||||
- Store the result (the extracted text string or `null`) in memory, associated with the story object (e.g., add property `articleContent: string | null`).
|
||||
- Log the outcome clearly (e.g., "Successfully scraped article for story {storyId}", "Failed to scrape article for story {storyId}").
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes Epic 1 & 2 steps, and then attempts article scraping for stories with valid URLs.
|
||||
- AC2: Stories with missing or invalid URLs are skipped, and a corresponding log message is generated.
|
||||
- AC3: For stories with valid URLs, the `scrapeArticle` function is called.
|
||||
- AC4: Logs clearly indicate the start and success/failure outcome of the scraping attempt for each relevant story.
|
||||
- AC5: Story objects held in memory after this stage contain an `articleContent` property holding the scraped text (string) or `null` if scraping was skipped or failed.
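
A rough sketch of the Story 3.2 loop is shown below. The `Story` shape, the `scrapeStories` wrapper, and the injected `scrape` callback are assumptions for illustration; in the project the callback would be `scrapeArticle` and the logging would go through the logger utility.

```typescript
interface Story {
  storyId: string;
  url?: string | null;
  articleContent?: string | null;
}

/** Simple prefix check, as the story allows: http:// or https:// only. */
export function hasValidHttpUrl(url: string | null | undefined): url is string {
  return (
    typeof url === "string" &&
    (url.startsWith("http://") || url.startsWith("https://"))
  );
}

export async function scrapeStories(
  stories: Story[],
  scrape: (url: string) => Promise<string | null>
): Promise<void> {
  for (const story of stories) {
    const url = story.url;
    if (!hasValidHttpUrl(url)) {
      console.warn(
        `Skipping scraping for story ${story.storyId}: Missing or invalid URL`
      );
      story.articleContent = null; // skipped stories still get the property (AC5)
      continue;
    }
    console.info(
      `Attempting to scrape article for story ${story.storyId} from ${url}`
    );
    story.articleContent = await scrape(url);
    console.info(
      story.articleContent !== null
        ? `Successfully scraped article for story ${story.storyId}`
        : `Failed to scrape article for story ${story.storyId}`
    );
  }
}
```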
|
||||
|
||||
---
|
||||
|
||||
### Story 3.3: Persist Scraped Article Text Locally
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save successfully scraped article text to a separate local file for each story, so that the text content is available as input for the summarization stage.
|
||||
- **Detailed Requirements:**
|
||||
- Import Node.js `fs` and `path` modules if not already present in `src/index.ts`.
|
||||
- In the main workflow (`src/index.ts`), immediately after a successful call to `scrapeArticle` for a story (where the result is a non-null string):
|
||||
- Retrieve the full path to the current date-stamped output directory.
|
||||
- Construct the filename: `{storyId}_article.txt`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Get the successfully scraped article text string (`articleContent`).
|
||||
- Use `fs.writeFileSync(fullPath, articleContent, 'utf-8')` to save the text to the file. Wrap in `try...catch` for file system errors.
|
||||
- Log the successful saving of the file (e.g., "Saved scraped article text to {filename}") or any file writing errors encountered.
|
||||
- Ensure *no* `_article.txt` file is created if `scrapeArticle` returned `null` (due to skipping or failure).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory contains `_article.txt` files *only* for those stories where `scrapeArticle` succeeded and returned text content.
|
||||
- AC2: The name of each article text file is `{storyId}_article.txt`.
|
||||
- AC3: The content of each `_article.txt` file is the plain text string returned by `scrapeArticle`.
|
||||
- AC4: Logs confirm the successful writing of each `_article.txt` file or report specific file writing errors.
|
||||
- AC5: No empty `_article.txt` files are created. Files only exist if scraping was successful.
|
||||
|
||||
---
|
||||
|
||||
### Story 3.4: Implement Stage Testing Utility for Scraping
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate script/command to test the article scraping logic using HN story data from local files, allowing independent testing and debugging of the scraper.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new standalone script file: `src/stages/scrape_articles.ts`.
|
||||
- Import necessary modules: `fs`, `path`, `logger`, `config`, `scrapeArticle`.
|
||||
- The script should:
|
||||
- Initialize the logger.
|
||||
- Load configuration (to get `OUTPUT_DIR_PATH`).
|
||||
- Determine the target date-stamped directory path (e.g., `${OUTPUT_DIR_PATH}/YYYY-MM-DD`, using the current date or potentially an optional CLI argument). Ensure this directory exists.
|
||||
- Read the directory contents and identify all `{storyId}_data.json` files.
|
||||
- For each `_data.json` file found:
|
||||
- Read and parse the JSON content.
|
||||
- Extract the `storyId` and `url`.
|
||||
- If a valid `url` exists, call `await scrapeArticle(url)`.
|
||||
- If scraping succeeds (returns text), save the text to `{storyId}_article.txt` in the same directory (using logic from Story 3.3). Overwrite if the file exists.
|
||||
- Log the progress and outcome (skip/success/fail) for each story processed.
|
||||
- Add a new script command to `package.json`: `"stage:scrape": "ts-node src/stages/scrape_articles.ts"`. Consider adding argument parsing later if needed to specify a date/directory.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/scrape_articles.ts` exists.
|
||||
- AC2: The script `stage:scrape` is defined in `package.json`.
|
||||
- AC3: Running `npm run stage:scrape` (assuming a directory with `_data.json` files exists from a previous `stage:fetch` run) reads these files.
|
||||
- AC4: The script calls `scrapeArticle` for stories with valid URLs found in the JSON files.
|
||||
- AC5: The script creates/updates `{storyId}_article.txt` files in the target directory corresponding to successfully scraped articles.
|
||||
- AC6: The script logs its actions (reading files, attempting scraping, saving results) for each story ID processed.
|
||||
- AC7: The script operates solely based on local `_data.json` files and fetching from external article URLs; it does not call the Algolia HN API.
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------- | -------------- |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 3 | 2-pm |
|
||||
@@ -1,146 +0,0 @@
|
||||
# Epic 4: LLM Summarization & Persistence
|
||||
|
||||
**Goal:** Integrate with the configured local Ollama instance to generate summaries for successfully scraped article text and fetched comments. Persist these summaries locally. Implement a stage testing utility for summarization.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 4.1: Implement Ollama Client Module
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a client module to interact with the configured Ollama API endpoint via HTTP, handling requests and responses for text generation, so that summaries can be generated programmatically.
|
||||
- **Detailed Requirements:**
|
||||
- **Prerequisite:** Ensure a local Ollama instance is installed and running, accessible via the URL defined in `.env` (`OLLAMA_ENDPOINT_URL`), and that the model specified in `.env` (`OLLAMA_MODEL`) has been downloaded (e.g., via `ollama pull model_name`). Instructions for this setup should be in the project README.
|
||||
- Create a new module: `src/clients/ollamaClient.ts`.
|
||||
- Implement an async function `generateSummary(promptTemplate: string, content: string): Promise<string | null>`. *(Note: Parameter name changed for clarity)*
|
||||
- Add configuration variables `OLLAMA_ENDPOINT_URL` (e.g., `http://localhost:11434`) and `OLLAMA_MODEL` (e.g., `llama3`) to `.env.example`. Ensure they are loaded via the config module (`src/utils/config.ts`). Update local `.env` with actual values. Add optional `OLLAMA_TIMEOUT_MS` to `.env.example` with a default like `120000`.
|
||||
- Inside `generateSummary`:
|
||||
- Construct the full prompt string using the `promptTemplate` and the provided `content` (e.g., replacing a placeholder like `{Content Placeholder}` in the template, or simple concatenation if templates are basic).
|
||||
- Construct the Ollama API request payload (JSON): `{ model: configured_model, prompt: full_prompt, stream: false }`. Refer to Ollama `/api/generate` documentation and `docs/data-models.md`.
|
||||
- Use native `fetch` to send a POST request to the configured Ollama endpoint + `/api/generate`. Set appropriate headers (`Content-Type: application/json`). Use the configured `OLLAMA_TIMEOUT_MS` or a reasonable default (e.g., 2 minutes).
|
||||
- Handle `fetch` errors (network, timeout) using `try...catch`.
|
||||
- Check `response.ok`. If not OK, log the status/error and return `null`.
|
||||
- Parse the JSON response from Ollama. Extract the generated text (typically in the `response` field). Refer to `docs/data-models.md`.
|
||||
- Check for potential errors within the Ollama response structure itself (e.g., an `error` field).
|
||||
- Return the extracted summary string on success. Return `null` on any failure.
|
||||
- Log key events: initiating request (mention model), receiving response, success, failure reasons, potentially request/response time using the logger.
|
||||
- Define necessary TypeScript types for the Ollama request payload and expected response structure in `src/types/ollama.ts` (referenced in `docs/data-models.md`).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `ollamaClient.ts` module exists and exports `generateSummary`.
|
||||
- AC2: `OLLAMA_ENDPOINT_URL` and `OLLAMA_MODEL` are defined in `.env.example`, loaded via config, and used by the client. Optional `OLLAMA_TIMEOUT_MS` is handled.
|
||||
- AC3: `generateSummary` sends a correctly formatted POST request (model, full prompt based on template and content, stream:false) to the configured Ollama endpoint/path using native `fetch`.
|
||||
- AC4: Network errors, timeouts, and non-OK API responses are handled gracefully, logged, and result in a `null` return (given the Prerequisite Ollama service is running).
|
||||
- AC5: A successful Ollama response is parsed correctly, the generated text is extracted, and returned as a string.
|
||||
- AC6: Unexpected Ollama response formats or internal errors (e.g., `{"error": "..."}`) are handled, logged, and result in a `null` return.
|
||||
- AC7: Logs provide visibility into the client's interaction with the Ollama API.
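
The client described in Story 4.1 might be sketched like this, assuming Node 18+ for global `fetch` and `AbortSignal.timeout`. The `OllamaOptions` parameter is an assumption made to keep the example self-contained; the story's `generateSummary` takes only the template and content and reads the endpoint, model, and timeout from the config module. `console` stands in for the logger.

```typescript
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  stream: false;
}

interface OllamaGenerateResponse {
  response?: string;
  error?: string;
}

/** Hypothetical options bag; the project would read these from config. */
interface OllamaOptions {
  endpointUrl: string; // e.g. http://localhost:11434
  model: string; // e.g. llama3
  timeoutMs?: number;
}

const CONTENT_PLACEHOLDER = "{Content Placeholder}";

/** Fills the placeholder if present, otherwise appends the content. */
export function buildPrompt(template: string, content: string): string {
  return template.includes(CONTENT_PLACEHOLDER)
    ? template.replace(CONTENT_PLACEHOLDER, () => content)
    : `${template}\n\n${content}`;
}

/** Returns the generated text, or null for error or malformed bodies. */
export function extractSummary(body: OllamaGenerateResponse): string | null {
  if (typeof body.error === "string") return null;
  if (typeof body.response !== "string" || body.response.trim() === "") return null;
  return body.response;
}

export async function generateSummary(
  promptTemplate: string,
  content: string,
  options: OllamaOptions
): Promise<string | null> {
  const payload: OllamaGenerateRequest = {
    model: options.model,
    prompt: buildPrompt(promptTemplate, content),
    stream: false,
  };
  try {
    console.info(`Requesting summary from Ollama model ${options.model}`);
    const res = await fetch(new URL("/api/generate", options.endpointUrl), {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
      signal: AbortSignal.timeout(options.timeoutMs ?? 120_000),
    });
    if (!res.ok) {
      console.error(`Ollama request failed: HTTP ${res.status}`);
      return null;
    }
    const summary = extractSummary((await res.json()) as OllamaGenerateResponse);
    if (summary === null) console.error("Ollama returned an error or empty response");
    return summary;
  } catch (error) {
    console.error("Ollama request failed:", error);
    return null;
  }
}
```

Passing a function to `replace` avoids `$`-sequences in article text being interpreted as replacement patterns.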
|
||||
|
||||
---
|
||||
|
||||
### Story 4.2: Define Summarization Prompts
|
||||
|
||||
* **User Story / Goal:** As a developer, I want standardized base prompts for generating article summaries and HN discussion summaries documented centrally, ensuring consistent instructions are sent to the LLM.
|
||||
* **Detailed Requirements:**
|
||||
* Define two standardized base prompts (`ARTICLE_SUMMARY_PROMPT`, `DISCUSSION_SUMMARY_PROMPT`) **and document them in `docs/prompts.md`**.
|
||||
* Ensure these prompts are accessible within the application code, for example, by defining them as exported constants in a dedicated module like `src/utils/prompts.ts`, which reads from or mirrors the content in `docs/prompts.md`.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: The `ARTICLE_SUMMARY_PROMPT` text is defined in `docs/prompts.md` with appropriate instructional content.
|
||||
* AC2: The `DISCUSSION_SUMMARY_PROMPT` text is defined in `docs/prompts.md` with appropriate instructional content.
|
||||
* AC3: The prompt texts documented in `docs/prompts.md` are available as constants or variables within the application code (e.g., via `src/utils/prompts.ts`) for use by the Ollama client integration.
|
||||
|
||||
---
|
||||
|
||||
### Story 4.3: Integrate Summarization into Main Workflow
|
||||
|
||||
* **User Story / Goal:** As a developer, I want to integrate the Ollama client into the main workflow to generate summaries for each story's scraped article text (if available) and fetched comments, using centrally defined prompts and handling potential comment length limits.
|
||||
* **Detailed Requirements:**
|
||||
* Modify the main execution flow in `src/index.ts` or `src/core/pipeline.ts`.
|
||||
* Import `ollamaClient.generateSummary` and the prompt constants/variables (e.g., from `src/utils/prompts.ts`, which reflect `docs/prompts.md`).
|
||||
* Load the optional `MAX_COMMENT_CHARS_FOR_SUMMARY` configuration value from `.env` via the config utility.
|
||||
* Within the main loop iterating through stories (after article scraping/persistence in Epic 3):
|
||||
* **Article Summary Generation:**
|
||||
* Check if the `story` object has non-null `articleContent`.
|
||||
* If yes: log "Attempting article summarization for story {storyId}", call `await generateSummary(ARTICLE_SUMMARY_PROMPT, story.articleContent)`, store the result (string or null) as `story.articleSummary`, log success/failure.
|
||||
* If no: set `story.articleSummary = null`, log "Skipping article summarization: No content".
|
||||
* **Discussion Summary Generation:**
|
||||
* Check if the `story` object has a non-empty `comments` array.
|
||||
* If yes:
|
||||
* Format the `story.comments` array into a single text block suitable for the LLM prompt (e.g., concatenating `comment.text` with separators like `---`).
|
||||
* **Check truncation limit:** If `MAX_COMMENT_CHARS_FOR_SUMMARY` is configured to a positive number and the `formattedCommentsText` length exceeds it, truncate `formattedCommentsText` to the limit and log a warning: "Comment text truncated to {limit} characters for summarization for story {storyId}".
|
||||
* Log "Attempting discussion summarization for story {storyId}".
|
||||
* Call `await generateSummary(DISCUSSION_SUMMARY_PROMPT, formattedCommentsText)`. *(Pass the potentially truncated text)*
|
||||
* Store the result (string or null) as `story.discussionSummary`. Log success/failure.
|
||||
* If no: set `story.discussionSummary = null`, log "Skipping discussion summarization: No comments".
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: Running `npm run dev` executes steps from Epics 1-3, then attempts summarization using the Ollama client.
|
||||
* AC2: Article summary is attempted only if `articleContent` exists for a story.
|
||||
* AC3: Discussion summary is attempted only if `comments` exist for a story.
|
||||
* AC4: `generateSummary` is called with the correct prompts (sourced consistently with `docs/prompts.md`) and corresponding content (article text or formatted/potentially truncated comments).
|
||||
* AC5: If `MAX_COMMENT_CHARS_FOR_SUMMARY` is set and comment text exceeds it, the text passed to `generateSummary` is truncated, and a warning is logged.
|
||||
* AC6: Logs clearly indicate the start, success, or failure (including null returns from the client) for both article and discussion summarization attempts per story.
|
||||
* AC7: Story objects in memory now contain `articleSummary` (string/null) and `discussionSummary` (string/null) properties.
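
The comment formatting and truncation rules in Story 4.3 could be sketched as below. The helper names and the exact `\n---\n` separator are assumptions; the story only suggests separators like `---`.

```typescript
interface Comment {
  text: string;
}

/** Joins comment texts into one block, separated by `---` lines. */
export function formatComments(comments: Comment[]): string {
  return comments.map((comment) => comment.text).join("\n---\n");
}

interface TruncationResult {
  text: string;
  truncated: boolean;
}

/**
 * Applies MAX_COMMENT_CHARS_FOR_SUMMARY. The limit is ignored when it is
 * undefined or not a positive number, matching the "optional" wording.
 */
export function truncateForSummary(
  text: string,
  maxChars: number | undefined
): TruncationResult {
  if (maxChars === undefined || !(maxChars > 0) || text.length <= maxChars) {
    return { text, truncated: false };
  }
  return { text: text.slice(0, maxChars), truncated: true };
}

// Usage: format, truncate, and warn before calling generateSummary.
const storyId = "12345";
const limit = 10;
const formatted = formatComments([{ text: "First comment" }, { text: "Second" }]);
const prepared = truncateForSummary(formatted, limit);
if (prepared.truncated) {
  console.warn(
    `Comment text truncated to ${limit} characters for summarization for story ${storyId}`
  );
}
```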
|
||||
|
||||
---
|
||||
|
||||
### Story 4.4: Persist Generated Summaries Locally
|
||||
|
||||
*(No changes needed for this story based on recent decisions)*
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save the generated article and discussion summaries (or null placeholders) to a local JSON file for each story, making them available for the email assembly stage.
|
||||
- **Detailed Requirements:**
|
||||
- Define the structure for the summary output file: `{storyId}_summary.json`. Content example: `{ "storyId": "...", "articleSummary": "...", "discussionSummary": "...", "summarizedAt": "ISO_TIMESTAMP" }`. Note that `articleSummary` and `discussionSummary` can be `null`.
|
||||
- Import `fs` and `path` in `src/index.ts` or `src/core/pipeline.ts` if needed.
|
||||
- In the main workflow loop, after *both* summarization attempts (article and discussion) for a story are complete:
|
||||
- Create a summary result object containing `storyId`, `articleSummary` (string or null), `discussionSummary` (string or null), and the current ISO timestamp (`new Date().toISOString()`). Add this timestamp to the in-memory `story` object as well (`story.summarizedAt`).
|
||||
- Get the full path to the date-stamped output directory.
|
||||
- Construct the filename: `{storyId}_summary.json`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Serialize the summary result object to JSON (`JSON.stringify(..., null, 2)`).
|
||||
- Use `fs.writeFileSync` to save the JSON to the file, wrapping in `try...catch`.
|
||||
- Log the successful saving of the summary file or any file writing errors.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory contains 10 files named `{storyId}_summary.json`.
|
||||
- AC2: Each `_summary.json` file contains valid JSON adhering to the defined structure.
|
||||
- AC3: The `articleSummary` field contains the generated summary string if successful, otherwise `null`.
|
||||
- AC4: The `discussionSummary` field contains the generated summary string if successful, otherwise `null`.
|
||||
- AC5: A valid ISO timestamp is present in the `summarizedAt` field.
|
||||
- AC6: Logs confirm successful writing of each summary file or report file system errors.
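
The summary file structure from Story 4.4 could be modelled as follows. The type and function names are hypothetical; the field names come from the content example above.

```typescript
interface SummaryRecord {
  storyId: string;
  articleSummary: string | null;
  discussionSummary: string | null;
  summarizedAt: string; // ISO timestamp
}

/** Builds the record written to `{storyId}_summary.json`. */
export function buildSummaryRecord(
  storyId: string,
  articleSummary: string | null,
  discussionSummary: string | null,
  now: Date = new Date()
): SummaryRecord {
  return {
    storyId,
    articleSummary,
    discussionSummary,
    summarizedAt: now.toISOString(),
  };
}

/** Pretty-prints with two-space indentation, as the story specifies. */
export function serializeSummaryRecord(record: SummaryRecord): string {
  return JSON.stringify(record, null, 2);
}

// Usage with a fixed timestamp so the output is reproducible.
const exampleRecord = buildSummaryRecord(
  "12345",
  "Short article summary.",
  null,
  new Date("2025-05-04T00:00:00.000Z")
);
const exampleJson = serializeSummaryRecord(exampleRecord);
```

`JSON.stringify` keeps `null` values as explicit `null` fields, so a failed summarization still produces a complete record.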
|
||||
|
||||
---
|
||||
|
||||
### Story 4.5: Implement Stage Testing Utility for Summarization
|
||||
|
||||
*(Changes needed to reflect prompt sourcing and optional truncation)*
|
||||
|
||||
* **User Story / Goal:** As a developer, I want a separate script/command to test the LLM summarization logic using locally persisted data (HN comments, scraped article text), allowing independent testing of prompts and Ollama interaction.
|
||||
* **Detailed Requirements:**
|
||||
* Create a new standalone script file: `src/stages/summarize_content.ts`.
|
||||
* Import necessary modules: `fs`, `path`, `logger`, `config`, `ollamaClient`, prompt constants (e.g., from `src/utils/prompts.ts`).
|
||||
* The script should:
|
||||
* Initialize logger, load configuration (Ollama endpoint/model, output dir, **optional `MAX_COMMENT_CHARS_FOR_SUMMARY`**).
|
||||
* Determine target date-stamped directory path.
|
||||
* Find all `{storyId}_data.json` files in the directory.
|
||||
* For each `storyId` found:
|
||||
* Read `{storyId}_data.json` to get comments. Format them into a single text block.
|
||||
* *Attempt* to read `{storyId}_article.txt`. Handle file-not-found gracefully. Store content or null.
|
||||
* Call `ollamaClient.generateSummary` for article text (if not null) using `ARTICLE_SUMMARY_PROMPT`.
|
||||
* **Apply truncation logic:** If comments exist, check `MAX_COMMENT_CHARS_FOR_SUMMARY` and truncate the formatted comment text block if needed, logging a warning.
|
||||
* Call `ollamaClient.generateSummary` for formatted comments (if comments exist) using `DISCUSSION_SUMMARY_PROMPT` *(passing potentially truncated text)*.
|
||||
* Construct the summary result object (with summaries or nulls, and timestamp).
|
||||
* Save the result object to `{storyId}_summary.json` in the same directory (using logic from Story 4.4), overwriting if exists.
|
||||
* Log progress (reading files, calling Ollama, truncation warnings, saving results) for each story ID.
|
||||
* Add script to `package.json`: `"stage:summarize": "ts-node src/stages/summarize_content.ts"`.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: The file `src/stages/summarize_content.ts` exists.
|
||||
* AC2: The script `stage:summarize` is defined in `package.json`.
|
||||
* AC3: Running `npm run stage:summarize` (after `stage:fetch` and `stage:scrape` runs) reads `_data.json` and attempts to read `_article.txt` files from the target directory.
|
||||
* AC4: The script calls the `ollamaClient` with correct prompts (sourced consistently with `docs/prompts.md`) and content derived *only* from the local files (requires Ollama service running per Story 4.1 prerequisite).
|
||||
* AC5: If `MAX_COMMENT_CHARS_FOR_SUMMARY` is set and applicable, comment text is truncated before calling the client, and a warning is logged.
|
||||
* AC6: The script creates/updates `{storyId}_summary.json` files in the target directory reflecting the results of the Ollama calls (summaries or nulls).
|
||||
* AC7: Logs show the script processing each story ID found locally, interacting with Ollama, and saving results.
|
||||
* AC8: The script does not call Algolia API or the article scraper module.
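
Story 4.5 asks the script to tolerate a missing `{storyId}_article.txt`. One way to do that is sketched below; the helper name is hypothetical and `console` stands in for the logger.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

/**
 * Reads `{storyId}_article.txt` from `dir`, returning null when the file
 * does not exist (the scrape was skipped or failed) or cannot be read.
 */
export function readArticleTextIfPresent(
  dir: string,
  storyId: string
): string | null {
  const fullPath = path.join(dir, `${storyId}_article.txt`);
  try {
    return fs.readFileSync(fullPath, "utf-8");
  } catch (error) {
    if ((error as NodeJS.ErrnoException).code === "ENOENT") {
      // Expected case: no article was persisted for this story.
      console.info(`No article text found for story ${storyId}`);
    } else {
      console.error(`Failed to read article text for story ${storyId}:`, error);
    }
    return null;
  }
}

// Usage against a throwaway temp directory.
const stageDir = fs.mkdtempSync(path.join(os.tmpdir(), "summaries-"));
fs.writeFileSync(path.join(stageDir, "111_article.txt"), "Scraped text.", "utf-8");
const presentText = readArticleTextIfPresent(stageDir, "111");
const missingText = readArticleTextIfPresent(stageDir, "222");
```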
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| --------------------------- | ------------ | ------- | ------------------------------------ | -------------- |
|
||||
| Integrate prompts.md refs | 2025-05-04 | 0.3 | Updated stories 4.2, 4.3, 4.5 | 3-Architect |
|
||||
| Added Ollama Prereq Note | 2025-05-04 | 0.2 | Added note about local Ollama setup | 2-pm |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 4 | 2-pm |
|
||||
@@ -1,146 +0,0 @@
|
||||
# Epic 4: LLM Summarization & Persistence
|
||||
|
||||
**Goal:** Integrate with the configured local Ollama instance to generate summaries for successfully scraped article text and fetched comments. Persist these summaries locally. Implement a stage testing utility for summarization.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 4.1: Implement Ollama Client Module
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a client module to interact with the configured Ollama API endpoint via HTTP, handling requests and responses for text generation, so that summaries can be generated programmatically.
|
||||
- **Detailed Requirements:**
|
||||
- **Prerequisite:** Ensure a local Ollama instance is installed and running, accessible via the URL defined in `.env` (`OLLAMA_ENDPOINT_URL`), and that the model specified in `.env` (`OLLAMA_MODEL`) has been downloaded (e.g., via `ollama pull model_name`). Instructions for this setup should be in the project README.
|
||||
- Create a new module: `src/clients/ollamaClient.ts`.
|
||||
- Implement an async function `generateSummary(promptTemplate: string, content: string): Promise<string | null>`. *(Note: Parameter name changed for clarity)*
|
||||
- Add configuration variables `OLLAMA_ENDPOINT_URL` (e.g., `http://localhost:11434`) and `OLLAMA_MODEL` (e.g., `llama3`) to `.env.example`. Ensure they are loaded via the config module (`src/utils/config.ts`). Update local `.env` with actual values. Add optional `OLLAMA_TIMEOUT_MS` to `.env.example` with a default like `120000`.
|
||||
- Inside `generateSummary`:
|
||||
- Construct the full prompt string using the `promptTemplate` and the provided `content` (e.g., replacing a placeholder like `{Content Placeholder}` in the template, or simple concatenation if templates are basic).
|
||||
- Construct the Ollama API request payload (JSON): `{ model: configured_model, prompt: full_prompt, stream: false }`. Refer to Ollama `/api/generate` documentation and `docs/data-models.md`.
|
||||
- Use native `fetch` to send a POST request to the configured Ollama endpoint + `/api/generate`. Set appropriate headers (`Content-Type: application/json`). Use the configured `OLLAMA_TIMEOUT_MS` or a reasonable default (e.g., 2 minutes).
|
||||
- Handle `fetch` errors (network, timeout) using `try...catch`.
|
||||
- Check `response.ok`. If not OK, log the status/error and return `null`.
|
||||
- Parse the JSON response from Ollama. Extract the generated text (typically in the `response` field). Refer to `docs/data-models.md`.
|
||||
- Check for potential errors within the Ollama response structure itself (e.g., an `error` field).
|
||||
- Return the extracted summary string on success. Return `null` on any failure.
|
||||
- Log key events: initiating request (mention model), receiving response, success, failure reasons, potentially request/response time using the logger.
|
||||
- Define necessary TypeScript types for the Ollama request payload and expected response structure in `src/types/ollama.ts` (referenced in `docs/data-models.md`).
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `ollamaClient.ts` module exists and exports `generateSummary`.
|
||||
- AC2: `OLLAMA_ENDPOINT_URL` and `OLLAMA_MODEL` are defined in `.env.example`, loaded via config, and used by the client. Optional `OLLAMA_TIMEOUT_MS` is handled.
|
||||
- AC3: `generateSummary` sends a correctly formatted POST request (model, full prompt based on template and content, stream:false) to the configured Ollama endpoint/path using native `fetch`.
|
||||
- AC4: Network errors, timeouts, and non-OK API responses are handled gracefully, logged, and result in a `null` return (given the Prerequisite Ollama service is running).
|
||||
- AC5: A successful Ollama response is parsed correctly, the generated text is extracted, and returned as a string.
|
||||
- AC6: Unexpected Ollama response formats or internal errors (e.g., `{"error": "..."}`) are handled, logged, and result in a `null` return.
|
||||
- AC7: Logs provide visibility into the client's interaction with the Ollama API.
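
A minimal sketch of the client logic described in this story, under stated assumptions: the type names, `buildRequest`, `extractSummary`, and the injected `doFetch` parameter are illustrative and not taken from the spec, and timeout handling (`OLLAMA_TIMEOUT_MS`) and the project logger are left out for brevity. The signature here also differs from the story's `generateSummary(promptTemplate, content)` in that it takes an already constructed prompt.

```typescript
// Hypothetical shapes; the real ones would live in src/types/ollama.ts.
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  stream: false;
}

interface OllamaGenerateResponse {
  response?: string;
  error?: string;
}

// Minimal subset of the fetch API used by this sketch, so a fake can be injected in tests.
interface HttpResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<HttpResponseLike>;

// Build the non-streaming request payload for /api/generate.
function buildRequest(model: string, fullPrompt: string): OllamaGenerateRequest {
  return { model, prompt: fullPrompt, stream: false };
}

// Return the generated text, or null when the body carries an error or has an unexpected shape.
function extractSummary(body: unknown): string | null {
  if (typeof body !== "object" || body === null) return null;
  const parsed = body as OllamaGenerateResponse;
  if (typeof parsed.error === "string") return null;
  return typeof parsed.response === "string" ? parsed.response : null;
}

async function generateSummary(
  doFetch: FetchLike,
  endpointUrl: string,
  model: string,
  fullPrompt: string,
): Promise<string | null> {
  try {
    const res = await doFetch(`${endpointUrl}/api/generate`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(buildRequest(model, fullPrompt)),
    });
    if (!res.ok) {
      console.error(`Ollama returned HTTP ${res.status}`);
      return null;
    }
    return extractSummary(await res.json());
  } catch (err) {
    // Network failures and JSON parse errors both end up here.
    console.error("Ollama request failed", err);
    return null;
  }
}
```

Injecting the fetch function keeps the parsing and error paths testable without a running Ollama service; in the application, the runtime's native `fetch` should be usable as `doFetch`.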
|
||||
|
||||
---
|
||||
|
||||
### Story 4.2: Define Summarization Prompts
|
||||
|
||||
* **User Story / Goal:** As a developer, I want standardized base prompts for generating article summaries and HN discussion summaries documented centrally, ensuring consistent instructions are sent to the LLM.
|
||||
* **Detailed Requirements:**
|
||||
* Define two standardized base prompts (`ARTICLE_SUMMARY_PROMPT`, `DISCUSSION_SUMMARY_PROMPT`) **and document them in `docs/prompts.md`**.
|
||||
* Ensure these prompts are accessible within the application code, for example, by defining them as exported constants in a dedicated module like `src/utils/prompts.ts`, which reads from or mirrors the content in `docs/prompts.md`.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: The `ARTICLE_SUMMARY_PROMPT` text is defined in `docs/prompts.md` with appropriate instructional content.
|
||||
* AC2: The `DISCUSSION_SUMMARY_PROMPT` text is defined in `docs/prompts.md` with appropriate instructional content.
|
||||
* AC3: The prompt texts documented in `docs/prompts.md` are available as constants or variables within the application code (e.g., via `src/utils/prompts.ts`) for use by the Ollama client integration.
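
One way the prompt constants and the placeholder substitution mentioned in Story 4.1 could look, as a sketch only. The prompt wording below is placeholder text rather than the content of `docs/prompts.md`, and `buildPrompt` is a hypothetical helper; in `src/utils/prompts.ts` the constants would be exported.

```typescript
// Placeholder wording only; the authoritative text belongs in docs/prompts.md.
const ARTICLE_SUMMARY_PROMPT =
  "Summarize the following article in a few sentences:\n\n{Content Placeholder}";
const DISCUSSION_SUMMARY_PROMPT =
  "Summarize the main points of the following discussion:\n\n{Content Placeholder}";

const CONTENT_PLACEHOLDER = "{Content Placeholder}";

// Substitute the content into the template, or append it when the template has no placeholder.
function buildPrompt(template: string, content: string): string {
  if (template.includes(CONTENT_PLACEHOLDER)) {
    // split/join avoids the special "$" patterns that String.replace applies to replacement text.
    return template.split(CONTENT_PLACEHOLDER).join(content);
  }
  return `${template}\n\n${content}`;
}
```

Using `split`/`join` rather than `replace` means article text containing sequences such as `$&` is inserted literally.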
|
||||
|
||||
---
|
||||
|
||||
### Story 4.3: Integrate Summarization into Main Workflow
|
||||
|
||||
* **User Story / Goal:** As a developer, I want to integrate the Ollama client into the main workflow to generate summaries for each story's scraped article text (if available) and fetched comments, using centrally defined prompts and handling potential comment length limits.
|
||||
* **Detailed Requirements:**
|
||||
* Modify the main execution flow in `src/index.ts` or `src/core/pipeline.ts`.
|
||||
* Import `ollamaClient.generateSummary` and the prompt constants/variables (e.g., from `src/utils/prompts.ts`, which reflect `docs/prompts.md`).
|
||||
* Load the optional `MAX_COMMENT_CHARS_FOR_SUMMARY` configuration value from `.env` via the config utility.
|
||||
* Within the main loop iterating through stories (after article scraping/persistence in Epic 3):
|
||||
* **Article Summary Generation:**
|
||||
* Check if the `story` object has non-null `articleContent`.
|
||||
* If yes: log "Attempting article summarization for story {storyId}", call `await generateSummary(ARTICLE_SUMMARY_PROMPT, story.articleContent)`, store the result (string or null) as `story.articleSummary`, log success/failure.
|
||||
* If no: set `story.articleSummary = null`, log "Skipping article summarization: No content".
|
||||
* **Discussion Summary Generation:**
|
||||
* Check if the `story` object has a non-empty `comments` array.
|
||||
* If yes:
|
||||
* Format the `story.comments` array into a single text block suitable for the LLM prompt (e.g., concatenating `comment.text` with separators like `---`).
|
||||
* **Check truncation limit:** If `MAX_COMMENT_CHARS_FOR_SUMMARY` is configured to a positive number and the `formattedCommentsText` length exceeds it, truncate `formattedCommentsText` to the limit and log a warning: "Comment text truncated to {limit} characters for summarization for story {storyId}".
|
||||
* Log "Attempting discussion summarization for story {storyId}".
|
||||
* Call `await generateSummary(DISCUSSION_SUMMARY_PROMPT, formattedCommentsText)`. *(Pass the potentially truncated text)*
|
||||
* Store the result (string or null) as `story.discussionSummary`. Log success/failure.
|
||||
* If no: set `story.discussionSummary = null`, log "Skipping discussion summarization: No comments".
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: Running `npm run dev` executes steps from Epics 1-3, then attempts summarization using the Ollama client.
|
||||
* AC2: Article summary is attempted only if `articleContent` exists for a story.
|
||||
* AC3: Discussion summary is attempted only if `comments` exist for a story.
|
||||
* AC4: `generateSummary` is called with the correct prompts (sourced consistently with `docs/prompts.md`) and corresponding content (article text or formatted/potentially truncated comments).
|
||||
* AC5: If `MAX_COMMENT_CHARS_FOR_SUMMARY` is set and comment text exceeds it, the text passed to `generateSummary` is truncated, and a warning is logged.
|
||||
* AC6: Logs clearly indicate the start, success, or failure (including null returns from the client) for both article and discussion summarization attempts per story.
|
||||
* AC7: Story objects in memory now contain `articleSummary` (string/null) and `discussionSummary` (string/null) properties.
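
The comment formatting and truncation steps above can be sketched as two small pure functions. The names `HnComment`, `formatComments`, and `applyCommentLimit` are illustrative assumptions; the `---` separator follows the example given in the requirements.

```typescript
// Hypothetical minimal comment shape; the real type may carry more fields.
interface HnComment {
  text: string;
}

// Join comment bodies with a visible separator so the model can tell them apart.
function formatComments(comments: HnComment[]): string {
  return comments.map((c) => c.text).join("\n---\n");
}

interface TruncationResult {
  text: string;
  truncated: boolean;
}

// Apply the optional MAX_COMMENT_CHARS_FOR_SUMMARY limit.
// A missing, non-positive, or NaN limit is treated as "no limit".
function applyCommentLimit(text: string, maxChars?: number): TruncationResult {
  if (maxChars === undefined || !(maxChars > 0) || text.length <= maxChars) {
    return { text, truncated: false };
  }
  return { text: text.slice(0, maxChars), truncated: true };
}
```

The caller would log the truncation warning when `truncated` is `true` and pass `text` to `generateSummary`. Note that `slice` counts UTF-16 code units, so the limit is approximate for text containing characters outside the Basic Multilingual Plane.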
|
||||
|
||||
---
|
||||
|
||||
### Story 4.4: Persist Generated Summaries Locally
|
||||
|
||||
*(No changes needed for this story based on recent decisions)*
|
||||
|
||||
- **User Story / Goal:** As a developer, I want to save the generated article and discussion summaries (or null placeholders) to a local JSON file for each story, making them available for the email assembly stage.
|
||||
- **Detailed Requirements:**
|
||||
- Define the structure for the summary output file: `{storyId}_summary.json`. Content example: `{ "storyId": "...", "articleSummary": "...", "discussionSummary": "...", "summarizedAt": "ISO_TIMESTAMP" }`. Note that `articleSummary` and `discussionSummary` can be `null`.
|
||||
- Import `fs` and `path` in `src/index.ts` or `src/core/pipeline.ts` if needed.
|
||||
- In the main workflow loop, after *both* summarization attempts (article and discussion) for a story are complete:
|
||||
- Create a summary result object containing `storyId`, `articleSummary` (string or null), `discussionSummary` (string or null), and the current ISO timestamp (`new Date().toISOString()`). Add this timestamp to the in-memory `story` object as well (`story.summarizedAt`).
|
||||
- Get the full path to the date-stamped output directory.
|
||||
- Construct the filename: `{storyId}_summary.json`.
|
||||
- Construct the full file path using `path.join()`.
|
||||
- Serialize the summary result object to JSON (`JSON.stringify(..., null, 2)`).
|
||||
- Use `fs.writeFileSync` to save the JSON to the file, wrapping in `try...catch`.
|
||||
- Log the successful saving of the summary file or any file writing errors.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: After running `npm run dev`, the date-stamped output directory contains 10 files named `{storyId}_summary.json`.
|
||||
- AC2: Each `_summary.json` file contains valid JSON adhering to the defined structure.
|
||||
- AC3: The `articleSummary` field contains the generated summary string if successful, otherwise `null`.
|
||||
- AC4: The `discussionSummary` field contains the generated summary string if successful, otherwise `null`.
|
||||
- AC5: A valid ISO timestamp is present in the `summarizedAt` field.
|
||||
- AC6: Logs confirm successful writing of each summary file or report file system errors.
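
A sketch of the persistence step, assuming hypothetical helper names (`buildSummaryResult`, `saveSummary`) and using `console` in place of the project logger. It follows the file structure and the `fs.writeFileSync` plus `try...catch` approach described above.

```typescript
import * as fs from "fs";
import * as path from "path";

interface SummaryResult {
  storyId: string;
  articleSummary: string | null;
  discussionSummary: string | null;
  summarizedAt: string; // ISO 8601 timestamp
}

// Build the object that is serialized to {storyId}_summary.json.
function buildSummaryResult(
  storyId: string,
  articleSummary: string | null,
  discussionSummary: string | null,
  now: Date = new Date(),
): SummaryResult {
  return { storyId, articleSummary, discussionSummary, summarizedAt: now.toISOString() };
}

// Write the summary file into outputDir; return false instead of throwing on I/O errors.
function saveSummary(outputDir: string, result: SummaryResult): boolean {
  const filePath = path.join(outputDir, `${result.storyId}_summary.json`);
  try {
    fs.writeFileSync(filePath, JSON.stringify(result, null, 2), "utf8");
    console.log(`Saved summary file: ${filePath}`);
    return true;
  } catch (err) {
    console.error(`Failed to write ${filePath}`, err);
    return false;
  }
}
```

Returning a boolean lets the main loop continue with the next story after a failed write instead of aborting the run.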
|
||||
|
||||
---
|
||||
|
||||
### Story 4.5: Implement Stage Testing Utility for Summarization
|
||||
|
||||
*(Changes needed to reflect prompt sourcing and optional truncation)*
|
||||
|
||||
* **User Story / Goal:** As a developer, I want a separate script/command to test the LLM summarization logic using locally persisted data (HN comments, scraped article text), allowing independent testing of prompts and Ollama interaction.
|
||||
* **Detailed Requirements:**
|
||||
* Create a new standalone script file: `src/stages/summarize_content.ts`.
|
||||
* Import necessary modules: `fs`, `path`, `logger`, `config`, `ollamaClient`, prompt constants (e.g., from `src/utils/prompts.ts`).
|
||||
* The script should:
|
||||
* Initialize logger, load configuration (Ollama endpoint/model, output dir, **optional `MAX_COMMENT_CHARS_FOR_SUMMARY`**).
|
||||
* Determine target date-stamped directory path.
|
||||
* Find all `{storyId}_data.json` files in the directory.
|
||||
* For each `storyId` found:
|
||||
* Read `{storyId}_data.json` to get comments. Format them into a single text block.
|
||||
* *Attempt* to read `{storyId}_article.txt`. Handle file-not-found gracefully. Store content or null.
|
||||
* Call `ollamaClient.generateSummary` for article text (if not null) using `ARTICLE_SUMMARY_PROMPT`.
|
||||
* **Apply truncation logic:** If comments exist, check `MAX_COMMENT_CHARS_FOR_SUMMARY` and truncate the formatted comment text block if needed, logging a warning.
|
||||
* Call `ollamaClient.generateSummary` for formatted comments (if comments exist) using `DISCUSSION_SUMMARY_PROMPT` *(passing potentially truncated text)*.
|
||||
* Construct the summary result object (with summaries or nulls, and timestamp).
|
||||
* Save the result object to `{storyId}_summary.json` in the same directory (using logic from Story 4.4), overwriting if exists.
|
||||
* Log progress (reading files, calling Ollama, truncation warnings, saving results) for each story ID.
|
||||
* Add script to `package.json`: `"stage:summarize": "ts-node src/stages/summarize_content.ts"`.
|
||||
* **Acceptance Criteria (ACs):**
|
||||
* AC1: The file `src/stages/summarize_content.ts` exists.
|
||||
* AC2: The script `stage:summarize` is defined in `package.json`.
|
||||
* AC3: Running `npm run stage:summarize` (after `stage:fetch` and `stage:scrape` runs) reads `_data.json` and attempts to read `_article.txt` files from the target directory.
|
||||
* AC4: The script calls the `ollamaClient` with correct prompts (sourced consistently with `docs/prompts.md`) and content derived *only* from the local files (requires Ollama service running per Story 4.1 prerequisite).
|
||||
* AC5: If `MAX_COMMENT_CHARS_FOR_SUMMARY` is set and applicable, comment text is truncated before calling the client, and a warning is logged.
|
||||
* AC6: The script creates/updates `{storyId}_summary.json` files in the target directory reflecting the results of the Ollama calls (summaries or nulls).
|
||||
* AC7: Logs show the script processing each story ID found locally, interacting with Ollama, and saving results.
|
||||
* AC8: The script does not call Algolia API or the article scraper module.
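
The file discovery step of the stage script could look roughly like this. Both helpers are hypothetical names; they operate on a plain list of file names (for example the result of reading the date-stamped directory), so they can be exercised without touching the file system.

```typescript
// Extract story IDs from file names of the form {storyId}_data.json; other files are ignored.
function findStoryIds(fileNames: string[]): string[] {
  const ids: string[] = [];
  for (const fileName of fileNames) {
    const match = /^(.+)_data\.json$/.exec(fileName);
    if (match) ids.push(match[1]);
  }
  return ids;
}

// Derive the sibling file names the stage script reads and writes for one story.
function storyFileNames(storyId: string): { data: string; article: string; summary: string } {
  return {
    data: `${storyId}_data.json`,
    article: `${storyId}_article.txt`,
    summary: `${storyId}_summary.json`,
  };
}
```

Driving the loop from the `_data.json` files means a story with a missing `_article.txt` is still processed, with a null article summary.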
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| --------------------------- | ------------ | ------- | ------------------------------------ | -------------- |
|
||||
| Integrate prompts.md refs | 2025-05-04 | 0.3 | Updated stories 4.2, 4.3, 4.5 | 3-Architect |
|
||||
| Added Ollama Prereq Note | 2025-05-04 | 0.2 | Added note about local Ollama setup | 2-pm |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 4 | 2-pm |
|
||||
@@ -1,152 +0,0 @@
|
||||
# Epic 5: Digest Assembly & Email Dispatch
|
||||
|
||||
**Goal:** Assemble the collected story data and summaries from local files, format them into a readable HTML email digest, and send the email using Nodemailer with configured credentials. Implement a stage testing utility for emailing with a dry-run option.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 5.1: Implement Email Content Assembler
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module that reads the persisted story metadata (`_data.json`) and summaries (`_summary.json`) from a specified directory, consolidating the information needed to render the email digest.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/email/contentAssembler.ts`.
|
||||
- Define a TypeScript type/interface `DigestData` representing the data needed per story for the email template: `{ storyId: string, title: string, hnUrl: string, articleUrl: string | null, articleSummary: string | null, discussionSummary: string | null }`.
|
||||
- Implement an async function `assembleDigestData(dateDirPath: string): Promise<DigestData[]>`.
|
||||
- The function should:
|
||||
- Use Node.js `fs` to read the contents of the `dateDirPath`.
|
||||
- Identify all files matching the pattern `{storyId}_data.json`.
|
||||
- For each `storyId` found:
|
||||
- Read and parse the `{storyId}_data.json` file. Extract `title`, `hnUrl`, and `url` (use as `articleUrl`). Handle potential file read/parse errors gracefully (log and skip story).
|
||||
- Attempt to read and parse the corresponding `{storyId}_summary.json` file. Handle file-not-found or parse errors gracefully (treat `articleSummary` and `discussionSummary` as `null`).
|
||||
- Construct a `DigestData` object for the story, including the extracted metadata and summaries (or nulls).
|
||||
- Collect all successfully constructed `DigestData` objects into an array.
|
||||
- Return the array. It should ideally contain 10 items if all previous stages succeeded.
|
||||
- Log progress (e.g., "Assembling digest data from directory...", "Processing story {storyId}...") and any errors encountered during file processing using the logger.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `contentAssembler.ts` module exists and exports `assembleDigestData` and the `DigestData` type.
|
||||
- AC2: `assembleDigestData` correctly reads `_data.json` files from the provided directory path.
|
||||
- AC3: It attempts to read corresponding `_summary.json` files, correctly handling cases where the summary file might be missing or unparseable (resulting in null summaries for that story).
|
||||
- AC4: The function returns a promise resolving to an array of `DigestData` objects, populated with data extracted from the files.
|
||||
- AC5: Errors during file reading or JSON parsing are logged, and the function returns data for successfully processed stories.
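
A sketch of the assembler under stated assumptions: `readJson` and `toDigestData` are hypothetical helpers, `console` stands in for the project logger, and the `_data.json` field names (`title`, `hnUrl`, `url`) are the ones named in the requirements above.

```typescript
import { promises as fs } from "fs";
import * as path from "path";

interface DigestData {
  storyId: string;
  title: string;
  hnUrl: string;
  articleUrl: string | null;
  articleSummary: string | null;
  discussionSummary: string | null;
}

const asStringOrNull = (v: unknown): string | null =>
  typeof v === "string" && v.length > 0 ? v : null;

// Combine parsed file contents into one DigestData entry; null means "skip this story".
function toDigestData(storyId: string, data: unknown, summary: unknown): DigestData | null {
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;
  const title = d.title;
  const hnUrl = d.hnUrl;
  if (typeof title !== "string" || typeof hnUrl !== "string") return null;
  const s: Record<string, unknown> =
    typeof summary === "object" && summary !== null ? (summary as Record<string, unknown>) : {};
  return {
    storyId,
    title,
    hnUrl,
    articleUrl: asStringOrNull(d.url),
    articleSummary: asStringOrNull(s.articleSummary),
    discussionSummary: asStringOrNull(s.discussionSummary),
  };
}

// Read and parse a JSON file; a missing or unparseable file yields null.
async function readJson(filePath: string): Promise<unknown> {
  try {
    return JSON.parse(await fs.readFile(filePath, "utf8"));
  } catch {
    return null;
  }
}

async function assembleDigestData(dateDirPath: string): Promise<DigestData[]> {
  console.log(`Assembling digest data from directory ${dateDirPath}...`);
  let fileNames: string[];
  try {
    fileNames = await fs.readdir(dateDirPath);
  } catch (err) {
    console.error(`Cannot read directory ${dateDirPath}`, err);
    return [];
  }
  const digest: DigestData[] = [];
  for (const fileName of fileNames) {
    const match = /^(.+)_data\.json$/.exec(fileName);
    if (!match) continue;
    const storyId = match[1];
    console.log(`Processing story ${storyId}...`);
    const data = await readJson(path.join(dateDirPath, fileName));
    const summary = await readJson(path.join(dateDirPath, `${storyId}_summary.json`));
    const entry = toDigestData(storyId, data, summary);
    if (entry) digest.push(entry);
    else console.error(`Skipping story ${storyId}: unreadable or incomplete data file`);
  }
  return digest;
}
```

Keeping `toDigestData` free of I/O separates the skip and null-summary rules from file access, which makes those rules easy to unit test.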
|
||||
|
||||
---
|
||||
|
||||
### Story 5.2: Create HTML Email Template & Renderer
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic HTML email template and a function to render it with the assembled digest data, producing the final HTML content for the email body.
|
||||
- **Detailed Requirements:**
|
||||
- Define the HTML structure. This can be done using template literals within a function or potentially using a simple template file (e.g., `src/email/templates/digestTemplate.html`) and `fs.readFileSync`. Template literals are simpler for MVP.
|
||||
- Create a function `renderDigestHtml(data: DigestData[], digestDate: string): string` (e.g., in `src/email/contentAssembler.ts` or a new `templater.ts`).
|
||||
- The function should generate an HTML string with:
|
||||
- A suitable title in the body (e.g., `<h1>Hacker News Top 10 Summaries for ${digestDate}</h1>`).
|
||||
- A loop through the `data` array.
|
||||
- For each `story` in `data`:
|
||||
- Display `<h2><a href="${story.articleUrl || story.hnUrl}">${story.title}</a></h2>`.
|
||||
- Display `<p><a href="${story.hnUrl}">View HN Discussion</a></p>`.
|
||||
- Conditionally display `<h3>Article Summary</h3><p>${story.articleSummary}</p>` *only if* `story.articleSummary` is not null/empty.
|
||||
- Conditionally display `<h3>Discussion Summary</h3><p>${story.discussionSummary}</p>` *only if* `story.discussionSummary` is not null/empty.
|
||||
- Include a separator (e.g., `<hr style="margin-top: 20px; margin-bottom: 20px;">`).
|
||||
- Use basic inline CSS for minimal styling (margins, etc.) to ensure readability. Avoid complex layouts.
|
||||
- Return the complete HTML document as a string.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: A function `renderDigestHtml` exists that accepts the digest data array and a date string.
|
||||
- AC2: The function returns a single, complete HTML string.
|
||||
- AC3: The generated HTML includes a title with the date and correctly iterates through the story data.
|
||||
- AC4: For each story, the HTML displays the linked title, HN link, and conditionally displays the article and discussion summaries with headings.
|
||||
- AC5: Basic separators and margins are used for readability. The HTML is simple and likely to render reasonably in most email clients.
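
A possible template-literal renderer, sketched under one notable assumption: the `escapeHtml` helper is an addition not required by the story, included because titles and LLM-generated summaries may contain characters such as `<` or `&`. `renderStory` is likewise a hypothetical helper name.

```typescript
interface DigestData {
  storyId: string;
  title: string;
  hnUrl: string;
  articleUrl: string | null;
  articleSummary: string | null;
  discussionSummary: string | null;
}

// Escape text before interpolating it into HTML (an addition beyond the story's requirements).
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one story; summary sections are emitted only when a non-empty summary exists.
function renderStory(story: DigestData): string {
  const parts: string[] = [
    `<h2><a href="${escapeHtml(story.articleUrl || story.hnUrl)}">${escapeHtml(story.title)}</a></h2>`,
    `<p><a href="${escapeHtml(story.hnUrl)}">View HN Discussion</a></p>`,
  ];
  if (story.articleSummary) {
    parts.push(`<h3>Article Summary</h3><p>${escapeHtml(story.articleSummary)}</p>`);
  }
  if (story.discussionSummary) {
    parts.push(`<h3>Discussion Summary</h3><p>${escapeHtml(story.discussionSummary)}</p>`);
  }
  parts.push(`<hr style="margin-top: 20px; margin-bottom: 20px;">`);
  return parts.join("\n");
}

function renderDigestHtml(data: DigestData[], digestDate: string): string {
  const body = data.map(renderStory).join("\n");
  return `<!DOCTYPE html>
<html>
<body style="font-family: sans-serif; margin: 20px;">
<h1>Hacker News Top 10 Summaries for ${escapeHtml(digestDate)}</h1>
${body}
</body>
</html>`;
}
```

The truthiness checks on the summaries cover both `null` and the empty string, matching the "not null/empty" condition in the requirements.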
|
||||
|
||||
---
|
||||
|
||||
### Story 5.3: Implement Nodemailer Email Sender
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module to send the generated HTML email using Nodemailer, configured with credentials stored securely in the environment file.
|
||||
- **Detailed Requirements:**
|
||||
- Add Nodemailer dependencies: `npm install nodemailer --save-prod` and `npm install @types/nodemailer --save-dev` (the type definitions are only needed at build time).
|
||||
- Add required configuration variables to `.env.example` (and local `.env`): `EMAIL_HOST`, `EMAIL_PORT` (e.g., 587), `EMAIL_SECURE` (e.g., `false` for STARTTLS on 587, `true` for 465), `EMAIL_USER`, `EMAIL_PASS`, `EMAIL_FROM` (e.g., `"Your Name <you@example.com>"`), `EMAIL_RECIPIENTS` (comma-separated list).
|
||||
- Create a new module: `src/email/emailSender.ts`.
|
||||
- Implement an async function `sendDigestEmail(subject: string, htmlContent: string): Promise<boolean>`.
|
||||
- Inside the function:
|
||||
- Load the `EMAIL_*` variables from the config module.
|
||||
- Create a Nodemailer transporter using `nodemailer.createTransport` with the loaded config (host, port, secure flag, auth: { user, pass }).
|
||||
- Verify transporter configuration using `transporter.verify()` (optional but recommended). Log verification success/failure.
|
||||
- Parse the `EMAIL_RECIPIENTS` string into an array or comma-separated string suitable for the `to` field.
|
||||
- Define the `mailOptions`: `{ from: EMAIL_FROM, to: parsedRecipients, subject: subject, html: htmlContent }`.
|
||||
- Call `await transporter.sendMail(mailOptions)`.
|
||||
- If `sendMail` succeeds, log the success message including the `messageId` from the result. Return `true`.
|
||||
- If `sendMail` fails (throws error), log the error using the logger. Return `false`.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: `nodemailer` and `@types/nodemailer` dependencies are added.
|
||||
- AC2: `EMAIL_*` variables are defined in `.env.example` and loaded from config.
|
||||
- AC3: `emailSender.ts` module exists and exports `sendDigestEmail`.
|
||||
- AC4: `sendDigestEmail` correctly creates a Nodemailer transporter using configuration from `.env`. Transporter verification is attempted (optional AC).
|
||||
- AC5: The `to` field is correctly populated based on `EMAIL_RECIPIENTS`.
|
||||
- AC6: `transporter.sendMail` is called with correct `from`, `to`, `subject`, and `html` options.
|
||||
- AC7: Email sending success (including message ID) or failure is logged clearly.
|
||||
- AC8: The function returns `true` on successful sending, `false` otherwise.
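
A sketch of the configuration handling for the sender. It deliberately stops short of importing Nodemailer so that it stays self-contained; `EmailConfig`, `loadEmailConfig`, `parseRecipients`, `toTransportOptions`, and `toMailOptions` are hypothetical names, and the project's config module is assumed to expose the `EMAIL_*` values as strings.

```typescript
interface EmailConfig {
  host: string;
  port: number;
  secure: boolean;
  user: string;
  pass: string;
  from: string;
  recipients: string[];
}

// Split a comma-separated EMAIL_RECIPIENTS value into trimmed, non-empty addresses.
function parseRecipients(raw: string): string[] {
  return raw
    .split(",")
    .map((r) => r.trim())
    .filter((r) => r.length > 0);
}

// Read the EMAIL_* variables from an environment-like object, failing fast on missing values.
function loadEmailConfig(env: Record<string, string | undefined>): EmailConfig {
  const required = (key: string): string => {
    const value = env[key];
    if (!value) throw new Error(`Missing required environment variable: ${key}`);
    return value;
  };
  return {
    host: required("EMAIL_HOST"),
    port: Number(required("EMAIL_PORT")),
    // Environment values are strings, so "false" must be compared explicitly; it is truthy otherwise.
    secure: required("EMAIL_SECURE").toLowerCase() === "true",
    user: required("EMAIL_USER"),
    pass: required("EMAIL_PASS"),
    from: required("EMAIL_FROM"),
    recipients: parseRecipients(required("EMAIL_RECIPIENTS")),
  };
}

// Options object in the shape described above for nodemailer.createTransport.
function toTransportOptions(config: EmailConfig) {
  return {
    host: config.host,
    port: config.port,
    secure: config.secure,
    auth: { user: config.user, pass: config.pass },
  };
}

// Options object in the shape described above for transporter.sendMail.
function toMailOptions(config: EmailConfig, subject: string, htmlContent: string) {
  return { from: config.from, to: config.recipients, subject, html: htmlContent };
}
```

In `sendDigestEmail`, these objects would be passed to `nodemailer.createTransport(...)` and `transporter.sendMail(...)` inside a `try...catch` that returns `false` on failure.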
|
||||
|
||||
---
|
||||
|
||||
### Story 5.4: Integrate Email Assembly and Sending into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want the main application workflow (`src/index.ts`) to orchestrate the final steps: assembling digest data, rendering the HTML, and triggering the email send after all previous stages are complete.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts`.
|
||||
- Import `assembleDigestData`, `renderDigestHtml`, `sendDigestEmail`.
|
||||
- Execute these steps *after* the main loop (where stories are fetched, scraped, summarized, and persisted) completes:
|
||||
- Log "Starting final digest assembly and email dispatch...".
|
||||
- Determine the path to the current date-stamped output directory.
|
||||
- Call `const digestData = await assembleDigestData(dateDirPath)`.
|
||||
- Check if `digestData` array is not empty.
|
||||
- If yes:
|
||||
- Get the current date string (e.g., 'YYYY-MM-DD').
|
||||
- `const htmlContent = renderDigestHtml(digestData, currentDate)`.
|
||||
- `` const subject = `BMad Hacker Daily Digest - ${currentDate}` ``.
|
||||
- `const emailSent = await sendDigestEmail(subject, htmlContent)`.
|
||||
- Log the final outcome based on `emailSent` ("Digest email sent successfully." or "Failed to send digest email.").
|
||||
- If no (`digestData` is empty or assembly failed):
|
||||
- Log an error: "Failed to assemble digest data or no data found. Skipping email."
|
||||
- Log "BMad Hacker Daily Digest process finished."
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes all stages (Epics 1-4) and then proceeds to email assembly and sending.
|
||||
- AC2: `assembleDigestData` is called correctly with the output directory path after other processing is done.
|
||||
- AC3: If data is assembled, `renderDigestHtml` and `sendDigestEmail` are called with the correct data, subject, and HTML.
|
||||
- AC4: The final success or failure of the email sending step is logged.
|
||||
- AC5: If `assembleDigestData` returns no data, email sending is skipped, and an appropriate message is logged.
|
||||
- AC6: The application logs a final completion message.
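
The final orchestration step can be sketched with the three functions injected as dependencies, which keeps this example independent of the other modules. `DigestDeps`, `toDateString`, `buildSubject`, and `dispatchDigest` are hypothetical names, and `console` stands in for the project logger.

```typescript
// The three functions from Stories 5.1 to 5.3, injected so the flow can be tested with fakes.
interface DigestDeps<T> {
  assembleDigestData(dateDirPath: string): Promise<T[]>;
  renderDigestHtml(data: T[], digestDate: string): string;
  sendDigestEmail(subject: string, htmlContent: string): Promise<boolean>;
}

// Format a Date as YYYY-MM-DD in UTC for the subject line and digest title.
function toDateString(date: Date): string {
  return date.toISOString().slice(0, 10);
}

function buildSubject(dateString: string): string {
  return `BMad Hacker Daily Digest - ${dateString}`;
}

// Run assembly, rendering, and sending; resolves to true only when the email was sent.
async function dispatchDigest<T>(
  deps: DigestDeps<T>,
  dateDirPath: string,
  now: Date = new Date(),
): Promise<boolean> {
  console.log("Starting final digest assembly and email dispatch...");
  let sent = false;
  try {
    const digestData = await deps.assembleDigestData(dateDirPath);
    if (digestData.length > 0) {
      const currentDate = toDateString(now);
      const htmlContent = deps.renderDigestHtml(digestData, currentDate);
      sent = await deps.sendDigestEmail(buildSubject(currentDate), htmlContent);
      console.log(sent ? "Digest email sent successfully." : "Failed to send digest email.");
    } else {
      console.error("Failed to assemble digest data or no data found. Skipping email.");
    }
  } catch (err) {
    console.error("Digest dispatch failed unexpectedly.", err);
  }
  console.log("BMad Hacker Daily Digest process finished.");
  return sent;
}
```

Because `toISOString` reports UTC, the date string can differ from the local calendar date near midnight; a local-time formatter may be preferable if the output directories are named by local date.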
|
||||
|
||||
---
|
||||
|
||||
### Story 5.5: Implement Stage Testing Utility for Emailing
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate script/command to test the email assembly, rendering, and sending logic using persisted local data, including a crucial `--dry-run` option to prevent accidental email sending during tests.
|
||||
- **Detailed Requirements:**
|
||||
- Add `yargs` dependency for argument parsing: `npm install yargs @types/yargs --save-dev`.
|
||||
- Create a new standalone script file: `src/stages/send_digest.ts`.
|
||||
- Import necessary modules: `fs`, `path`, `logger`, `config`, `assembleDigestData`, `renderDigestHtml`, `sendDigestEmail`, `yargs`.
|
||||
- Use `yargs` to parse command-line arguments, specifically looking for a `--dry-run` boolean flag (defaulting to `false`). Allow an optional argument for specifying the date-stamped directory, otherwise default to current date.
|
||||
- The script should:
|
||||
- Initialize logger, load config.
|
||||
- Determine the target date-stamped directory path (from arg or default). Log the target directory.
|
||||
- Call `await assembleDigestData(dateDirPath)`.
|
||||
- If data is assembled and not empty:
|
||||
- Determine the date string for the subject/title.
|
||||
- Call `renderDigestHtml(digestData, dateString)` to get HTML.
|
||||
- Construct the subject string.
|
||||
- Check the `dryRun` flag:
|
||||
- If `true`: Log "DRY RUN enabled. Skipping actual email send.". Log the subject. Save the `htmlContent` to a file in the target directory (e.g., `_digest_preview.html`). Log that the preview file was saved.
|
||||
- If `false`: Log "Live run: Attempting to send email...". Call `await sendDigestEmail(subject, htmlContent)`. Log success/failure based on the return value.
|
||||
- If data assembly fails or is empty, log the error.
|
||||
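- Add script to `package.json`: `"stage:email": "ts-node src/stages/send_digest.ts"`. Pass arguments through npm's `--` separator, e.g., `npm run stage:email -- --dry-run`. A trailing `--` inside the script definition itself should be avoided, because `yargs` treats everything after `--` as positional arguments and would not recognize `--dry-run` as a flag.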
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/send_digest.ts` exists. `yargs` dependency is added.
|
||||
- AC2: The script `stage:email` is defined in `package.json` allowing arguments.
|
||||
- AC3: Running `npm run stage:email -- --dry-run` reads local data, renders HTML, logs the intent, saves `_digest_preview.html` locally, and does *not* call `sendDigestEmail`.
|
||||
- AC4: Running `npm run stage:email` (without `--dry-run`) reads local data, renders HTML, and *does* call `sendDigestEmail`, logging the outcome.
|
||||
- AC5: The script correctly identifies and acts upon the `--dry-run` flag.
|
||||
- AC6: Logs clearly distinguish between dry runs and live runs and report success/failure.
|
||||
- AC7: The script operates using only local files and the email configuration/service; it does not invoke prior pipeline stages (Algolia, scraping, Ollama).
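
The dry-run decision can be illustrated without the `yargs` dependency. The parser below is a hand-rolled stand-in, not the `yargs` API, and `StageArgs`, `parseStageArgs`, and `chooseAction` are hypothetical names; the preview file name follows the example in the requirements.

```typescript
interface StageArgs {
  dryRun: boolean;
  dateDir?: string;
}

// Hand-rolled stand-in for yargs: recognizes --dry-run and one optional positional directory.
function parseStageArgs(argv: string[]): StageArgs {
  const args: StageArgs = { dryRun: false };
  for (const arg of argv) {
    if (arg === "--dry-run") {
      args.dryRun = true;
    } else if (!arg.startsWith("-") && args.dateDir === undefined) {
      args.dateDir = arg;
    }
  }
  return args;
}

type StageAction =
  | { kind: "preview"; previewFileName: string }
  | { kind: "send" };

// Decide what the stage should do once the HTML has been rendered.
function chooseAction(args: StageArgs): StageAction {
  return args.dryRun
    ? { kind: "preview", previewFileName: "_digest_preview.html" }
    : { kind: "send" };
}
```

Modelling the outcome as a tagged union makes it hard to reach the send path by accident: the script only calls `sendDigestEmail` in the `"send"` branch.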
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------- | -------------- |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 5 | 2-pm |
|
||||
@@ -1,152 +0,0 @@
|
||||
# Epic 5: Digest Assembly & Email Dispatch
|
||||
|
||||
**Goal:** Assemble the collected story data and summaries from local files, format them into a readable HTML email digest, and send the email using Nodemailer with configured credentials. Implement a stage testing utility for emailing with a dry-run option.
|
||||
|
||||
## Story List
|
||||
|
||||
### Story 5.1: Implement Email Content Assembler
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module that reads the persisted story metadata (`_data.json`) and summaries (`_summary.json`) from a specified directory, consolidating the information needed to render the email digest.
|
||||
- **Detailed Requirements:**
|
||||
- Create a new module: `src/email/contentAssembler.ts`.
|
||||
- Define a TypeScript type/interface `DigestData` representing the data needed per story for the email template: `{ storyId: string, title: string, hnUrl: string, articleUrl: string | null, articleSummary: string | null, discussionSummary: string | null }`.
|
||||
- Implement an async function `assembleDigestData(dateDirPath: string): Promise<DigestData[]>`.
|
||||
- The function should:
|
||||
- Use Node.js `fs` to read the contents of the `dateDirPath`.
|
||||
- Identify all files matching the pattern `{storyId}_data.json`.
|
||||
- For each `storyId` found:
|
||||
- Read and parse the `{storyId}_data.json` file. Extract `title`, `hnUrl`, and `url` (use as `articleUrl`). Handle potential file read/parse errors gracefully (log and skip story).
|
||||
- Attempt to read and parse the corresponding `{storyId}_summary.json` file. Handle file-not-found or parse errors gracefully (treat `articleSummary` and `discussionSummary` as `null`).
|
||||
- Construct a `DigestData` object for the story, including the extracted metadata and summaries (or nulls).
|
||||
- Collect all successfully constructed `DigestData` objects into an array.
|
||||
- Return the array. It should ideally contain 10 items if all previous stages succeeded.
|
||||
- Log progress (e.g., "Assembling digest data from directory...", "Processing story {storyId}...") and any errors encountered during file processing using the logger.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The `contentAssembler.ts` module exists and exports `assembleDigestData` and the `DigestData` type.
|
||||
- AC2: `assembleDigestData` correctly reads `_data.json` files from the provided directory path.
|
||||
- AC3: It attempts to read corresponding `_summary.json` files, correctly handling cases where the summary file might be missing or unparseable (resulting in null summaries for that story).
|
||||
- AC4: The function returns a promise resolving to an array of `DigestData` objects, populated with data extracted from the files.
|
||||
- AC5: Errors during file reading or JSON parsing are logged, and the function returns data for successfully processed stories.
|
||||
|
||||
---
|
||||
|
||||
### Story 5.2: Create HTML Email Template & Renderer
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a basic HTML email template and a function to render it with the assembled digest data, producing the final HTML content for the email body.
|
||||
- **Detailed Requirements:**
|
||||
- Define the HTML structure. This can be done using template literals within a function or potentially using a simple template file (e.g., `src/email/templates/digestTemplate.html`) and `fs.readFileSync`. Template literals are simpler for MVP.
|
||||
- Create a function `renderDigestHtml(data: DigestData[], digestDate: string): string` (e.g., in `src/email/contentAssembler.ts` or a new `templater.ts`).
|
||||
- The function should generate an HTML string with:
|
||||
- A suitable title in the body (e.g., `<h1>Hacker News Top 10 Summaries for ${digestDate}</h1>`).
|
||||
- A loop through the `data` array.
|
||||
- For each `story` in `data`:
|
||||
- Display `<h2><a href="${story.articleUrl || story.hnUrl}">${story.title}</a></h2>`.
|
||||
- Display `<p><a href="${story.hnUrl}">View HN Discussion</a></p>`.
|
||||
- Conditionally display `<h3>Article Summary</h3><p>${story.articleSummary}</p>` *only if* `story.articleSummary` is not null/empty.
|
||||
- Conditionally display `<h3>Discussion Summary</h3><p>${story.discussionSummary}</p>` *only if* `story.discussionSummary` is not null/empty.
|
||||
- Include a separator (e.g., `<hr style="margin-top: 20px; margin-bottom: 20px;">`).
|
||||
- Use basic inline CSS for minimal styling (margins, etc.) to ensure readability. Avoid complex layouts.
|
||||
- Return the complete HTML document as a string.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: A function `renderDigestHtml` exists that accepts the digest data array and a date string.
|
||||
- AC2: The function returns a single, complete HTML string.
|
||||
- AC3: The generated HTML includes a title with the date and correctly iterates through the story data.
|
||||
- AC4: For each story, the HTML displays the linked title, HN link, and conditionally displays the article and discussion summaries with headings.
|
||||
- AC5: Basic separators and margins are used for readability. The HTML is simple and likely to render reasonably in most email clients.
|
||||
|
||||
---
|
||||
|
||||
### Story 5.3: Implement Nodemailer Email Sender
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a module to send the generated HTML email using Nodemailer, configured with credentials stored securely in the environment file.
|
||||
- **Detailed Requirements:**
|
||||
- Add Nodemailer dependencies: `npm install nodemailer --save-prod` and `npm install @types/nodemailer --save-dev` (the type definitions are only needed at build time).
|
||||
- Add required configuration variables to `.env.example` (and local `.env`): `EMAIL_HOST`, `EMAIL_PORT` (e.g., 587), `EMAIL_SECURE` (e.g., `false` for STARTTLS on 587, `true` for 465), `EMAIL_USER`, `EMAIL_PASS`, `EMAIL_FROM` (e.g., `"Your Name <you@example.com>"`), `EMAIL_RECIPIENTS` (comma-separated list).
|
||||
- Create a new module: `src/email/emailSender.ts`.
|
||||
- Implement an async function `sendDigestEmail(subject: string, htmlContent: string): Promise<boolean>`.
|
||||
- Inside the function:
|
||||
- Load the `EMAIL_*` variables from the config module.
|
||||
- Create a Nodemailer transporter using `nodemailer.createTransport` with the loaded config (host, port, secure flag, auth: { user, pass }).
|
||||
- Verify transporter configuration using `transporter.verify()` (optional but recommended). Log verification success/failure.
|
||||
- Parse the `EMAIL_RECIPIENTS` string into an array or comma-separated string suitable for the `to` field.
|
||||
- Define the `mailOptions`: `{ from: EMAIL_FROM, to: parsedRecipients, subject: subject, html: htmlContent }`.
|
||||
- Call `await transporter.sendMail(mailOptions)`.
|
||||
- If `sendMail` succeeds, log the success message including the `messageId` from the result. Return `true`.
|
||||
- If `sendMail` fails (throws error), log the error using the logger. Return `false`.
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: `nodemailer` and `@types/nodemailer` dependencies are added.
|
||||
- AC2: `EMAIL_*` variables are defined in `.env.example` and loaded from config.
|
||||
- AC3: `emailSender.ts` module exists and exports `sendDigestEmail`.
|
||||
- AC4: `sendDigestEmail` correctly creates a Nodemailer transporter using configuration from `.env`. Transporter verification is attempted (optional AC).
|
||||
- AC5: The `to` field is correctly populated based on `EMAIL_RECIPIENTS`.
|
||||
- AC6: `transporter.sendMail` is called with correct `from`, `to`, `subject`, and `html` options.
|
||||
- AC7: Email sending success (including message ID) or failure is logged clearly.
|
||||
- AC8: The function returns `true` on successful sending, `false` otherwise.
|
||||
|
||||
---
|
||||
|
||||
### Story 5.4: Integrate Email Assembly and Sending into Main Workflow
|
||||
|
||||
- **User Story / Goal:** As a developer, I want the main application workflow (`src/index.ts`) to orchestrate the final steps: assembling digest data, rendering the HTML, and triggering the email send after all previous stages are complete.
|
||||
- **Detailed Requirements:**
|
||||
- Modify the main execution flow in `src/index.ts`.
|
||||
- Import `assembleDigestData`, `renderDigestHtml`, `sendDigestEmail`.
|
||||
- Execute these steps *after* the main loop (where stories are fetched, scraped, summarized, and persisted) completes:
|
||||
- Log "Starting final digest assembly and email dispatch...".
|
||||
- Determine the path to the current date-stamped output directory.
|
||||
- Call `const digestData = await assembleDigestData(dateDirPath)`.
|
||||
- Check if `digestData` array is not empty.
|
||||
- If yes:
|
||||
- Get the current date string (e.g., 'YYYY-MM-DD').
|
||||
- `const htmlContent = renderDigestHtml(digestData, currentDate)`.
|
||||
- `` const subject = `BMad Hacker Daily Digest - ${currentDate}` ``.
|
||||
- `const emailSent = await sendDigestEmail(subject, htmlContent)`.
|
||||
- Log the final outcome based on `emailSent` ("Digest email sent successfully." or "Failed to send digest email.").
|
||||
- If no (`digestData` is empty or assembly failed):
|
||||
- Log an error: "Failed to assemble digest data or no data found. Skipping email."
|
||||
- Log "BMad Hacker Daily Digest process finished."
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: Running `npm run dev` executes all stages (Epics 1-4) and then proceeds to email assembly and sending.
|
||||
- AC2: `assembleDigestData` is called correctly with the output directory path after other processing is done.
|
||||
- AC3: If data is assembled, `renderDigestHtml` and `sendDigestEmail` are called with the correct data, subject, and HTML.
|
||||
- AC4: The final success or failure of the email sending step is logged.
|
||||
- AC5: If `assembleDigestData` returns no data, email sending is skipped, and an appropriate message is logged.
|
||||
- AC6: The application logs a final completion message.
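The final stage described in this story can be sketched as a single function. This is an illustrative outline under stated assumptions, not the actual `src/index.ts`: the three pipeline functions and the logger are injected through a hypothetical `FinalStageDeps` object so the flow can be exercised without the real modules, and `DigestItem` and `FinalStageResult` are placeholder types.

```typescript
// Placeholder for one assembled story entry (hypothetical shape).
interface DigestItem {
  title: string;
}

// The collaborators named in the story, injected for testability (assumption).
interface FinalStageDeps {
  assembleDigestData(dateDirPath: string): Promise<DigestItem[]>;
  renderDigestHtml(data: DigestItem[], currentDate: string): string;
  sendDigestEmail(subject: string, htmlContent: string): Promise<boolean>;
  log(message: string): void;
}

type FinalStageResult = "sent" | "send-failed" | "skipped";

// Runs assembly, rendering, and sending in order; skips the email when there is no data.
async function runFinalStage(
  deps: FinalStageDeps,
  dateDirPath: string,
  currentDate: string,
): Promise<FinalStageResult> {
  deps.log("Starting final digest assembly and email dispatch...");
  let result: FinalStageResult;

  const digestData = await deps.assembleDigestData(dateDirPath);
  if (digestData.length > 0) {
    const htmlContent = deps.renderDigestHtml(digestData, currentDate);
    const subject = `BMad Hacker Daily Digest - ${currentDate}`;
    const emailSent = await deps.sendDigestEmail(subject, htmlContent);
    deps.log(emailSent ? "Digest email sent successfully." : "Failed to send digest email.");
    result = emailSent ? "sent" : "send-failed";
  } else {
    deps.log("Failed to assemble digest data or no data found. Skipping email.");
    result = "skipped";
  }

  deps.log("BMad Hacker Daily Digest process finished.");
  return result;
}
```

Returning a small result value rather than only logging is a design choice of this sketch; it lets a caller or test distinguish the three outcomes that AC3 through AC5 describe.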
|
||||
|
||||
---
|
||||
|
||||
### Story 5.5: Implement Stage Testing Utility for Emailing
|
||||
|
||||
- **User Story / Goal:** As a developer, I want a separate script/command to test the email assembly, rendering, and sending logic using persisted local data, including a crucial `--dry-run` option to prevent accidental email sending during tests.
|
||||
- **Detailed Requirements:**
|
||||
- Add `yargs` dependency for argument parsing: `npm install yargs @types/yargs --save-dev`.
|
||||
- Create a new standalone script file: `src/stages/send_digest.ts`.
|
||||
- Import necessary modules: `fs`, `path`, `logger`, `config`, `assembleDigestData`, `renderDigestHtml`, `sendDigestEmail`, `yargs`.
|
||||
- Use `yargs` to parse command-line arguments, specifically a `--dry-run` boolean flag (defaulting to `false`). Accept an optional argument specifying the date-stamped directory; if it is omitted, default to the current date's directory.
|
||||
- The script should:
|
||||
- Initialize logger, load config.
|
||||
- Determine the target date-stamped directory path (from arg or default). Log the target directory.
|
||||
- Call `await assembleDigestData(dateDirPath)`.
|
||||
- If data is assembled and not empty:
|
||||
- Determine the date string for the subject/title.
|
||||
- Call `renderDigestHtml(digestData, dateString)` to get HTML.
|
||||
- Construct the subject string.
|
||||
- Check the `dryRun` flag:
|
||||
- If `true`: Log "DRY RUN enabled. Skipping actual email send.". Log the subject. Save the `htmlContent` to a file in the target directory (e.g., `_digest_preview.html`). Log that the preview file was saved.
|
||||
- If `false`: Log "Live run: Attempting to send email...". Call `await sendDigestEmail(subject, htmlContent)`. Log success/failure based on the return value.
|
||||
- If data assembly fails or is empty, log the error.
|
||||
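- Add script to `package.json`: `"stage:email": "ts-node src/stages/send_digest.ts"`. Pass arguments after npm's own `--` separator, e.g. `npm run stage:email -- --dry-run`. Do not put a trailing `--` in the script itself: option parsers such as `yargs` treat everything after a bare `--` as positional, so `--dry-run` would not be read as a flag.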
|
||||
- **Acceptance Criteria (ACs):**
|
||||
- AC1: The file `src/stages/send_digest.ts` exists. `yargs` dependency is added.
|
||||
- AC2: The script `stage:email` is defined in `package.json` allowing arguments.
|
||||
- AC3: Running `npm run stage:email -- --dry-run` reads local data, renders HTML, logs the intent, saves `_digest_preview.html` locally, and does *not* call `sendDigestEmail`.
|
||||
- AC4: Running `npm run stage:email` (without `--dry-run`) reads local data, renders HTML, and *does* call `sendDigestEmail`, logging the outcome.
|
||||
- AC5: The script correctly identifies and acts upon the `--dry-run` flag.
|
||||
- AC6: Logs clearly distinguish between dry runs and live runs and report success/failure.
|
||||
- AC7: The script operates using only local files and the email configuration/service; it does not invoke prior pipeline stages (Algolia, scraping, Ollama).
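The dry-run decision in this story can be sketched without any dependencies. The story specifies `yargs` for parsing; the hand-rolled `parseStageArgs` below is only a stand-in so the example runs on its own, and `StageArgs`, `EmailAction`, and `planEmailAction` are hypothetical names. The sketch follows the common convention that everything after a bare `--` is positional.

```typescript
interface StageArgs {
  dryRun: boolean;
  dateDir: string | undefined; // optional date-stamped directory argument
}

// Dependency-free stand-in for the yargs parsing the story calls for.
function parseStageArgs(argv: string[]): StageArgs {
  let dryRun = false;
  let dateDir: string | undefined;
  let optionsEnded = false;

  for (const arg of argv) {
    if (!optionsEnded && arg === "--") {
      optionsEnded = true; // everything after a bare "--" is positional
      continue;
    }
    if (!optionsEnded && arg === "--dry-run") {
      dryRun = true;
      continue;
    }
    if (!optionsEnded && arg.startsWith("--")) {
      continue; // unknown flags are ignored in this sketch
    }
    if (dateDir === undefined) {
      dateDir = arg; // first positional argument selects the directory
    }
  }
  return { dryRun, dateDir };
}

type EmailAction =
  | { kind: "preview"; previewFile: string }
  | { kind: "send" };

// Decide between writing the preview file and performing a live send.
function planEmailAction(args: StageArgs, targetDir: string): EmailAction {
  return args.dryRun
    ? { kind: "preview", previewFile: `${targetDir}/_digest_preview.html` }
    : { kind: "send" };
}
```

Separating the decision (`planEmailAction`) from the side effects (writing the file, calling `sendDigestEmail`) makes AC3 and AC4 checkable without touching the file system or an SMTP server.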
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ------------------------- | -------------- |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft of Epic 5 | 2-pm |
|
||||
@@ -1,111 +0,0 @@
|
||||
# Project Brief: BMad Hacker Daily Digest
|
||||
|
||||
## Introduction / Problem Statement
|
||||
|
||||
Hacker News (HN) comment threads contain valuable insights but can be prohibitively long to read thoroughly. The BMad Hacker Daily Digest project aims to solve this by providing a time-efficient way to stay informed about the collective intelligence within HN discussions. The service will automatically fetch the top 10 HN stories daily, retrieve a manageable subset of their comments using the Algolia HN API, generate concise summaries of both the linked article (when possible) and the comment discussion using an LLM, and deliver these summaries in a daily email briefing. This project also serves as a practical learning exercise focused on agent-driven development, TypeScript, Node.js backend services, API integration, and local LLM usage with Ollama.
|
||||
|
||||
## Vision & Goals
|
||||
|
||||
- **Vision:** To provide a quick, reliable, and automated way for users to stay informed about the key insights and discussions happening within the Hacker News community without needing to read lengthy comment threads.
|
||||
- **Primary Goals (MVP - SMART):**
|
||||
- **Fetch HN Story Data:** Successfully retrieve the IDs and metadata (title, URL, HN link) of the top 10 Hacker News stories using the Algolia HN Search API when triggered.
|
||||
- **Retrieve Limited Comments:** For each fetched story, retrieve a predefined, limited set of associated comments using the Algolia HN Search API.
|
||||
- **Attempt Article Scraping:** For each story's external URL, attempt to fetch the raw HTML and extract the main article text using basic methods (Node.js native fetch, article-extractor/Cheerio), handling failures gracefully.
|
||||
- **Generate Summaries (LLM):** Using a local LLM (via Ollama, configured endpoint), generate: an "Article Summary" from scraped text (if successful), and a separate "Discussion Summary" from fetched comments.
|
||||
- **Assemble & Send Digest (Manual Trigger):** Format results for 10 stories into a single HTML email and successfully send it to recipients (list defined in config) using Nodemailer when manually triggered via CLI.
|
||||
- **Success Metrics (Initial Ideas for MVP):**
|
||||
- **Successful Execution:** The entire process completes successfully without crashing when manually triggered via CLI for 3 different test runs.
|
||||
- **Digest Content:** The generated email contains results for 10 stories (correct links, discussion summary, article summary where possible). Spot checks confirm relevance.
|
||||
- **Error Handling:** Scraping failures are logged, and the process continues using only comment summaries for affected stories without halting the script.
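The first goal, fetching the top 10 stories, can be illustrated with a small sketch. The endpoint, the `front_page` tag, the `hitsPerPage` parameter, and the hit field names (`objectID`, `title`, `url`) are assumptions about the Algolia HN Search API that should be checked against its documentation; `AlgoliaHit`, `Story`, `buildTopStoriesUrl`, and `toStory` are hypothetical names. Only URL construction and response mapping are shown, so the sketch performs no network access.

```typescript
// Subset of a search hit that this sketch relies on (assumed field names).
interface AlgoliaHit {
  objectID: string;
  title: string | null;
  url: string | null; // may be missing for text-only posts
}

// The story metadata the brief asks for: title, URL, and HN link.
interface Story {
  storyId: string;
  title: string;
  articleUrl: string | null;
  hnUrl: string;
}

// Build the request URL for the current front-page stories (assumed endpoint).
function buildTopStoriesUrl(count: number): string {
  return `https://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=${count}`;
}

// Map one raw hit to the metadata the pipeline keeps.
function toStory(hit: AlgoliaHit): Story {
  return {
    storyId: hit.objectID,
    title: hit.title ?? "(untitled)",
    articleUrl: hit.url,
    hnUrl: `https://news.ycombinator.com/item?id=${hit.objectID}`,
  };
}
```

With Node.js v22, the URL from `buildTopStoriesUrl(10)` could be passed to the native `fetch` and each element of the response's `hits` array mapped through `toStory`.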
|
||||
|
||||
## Target Audience / Users
|
||||
|
||||
**Primary User (MVP):** The developer undertaking this project. The primary motivation is learning and demonstrating agent-driven development, TypeScript, Node.js (v22), API integration (Algolia, LLM, Email), local LLMs (Ollama), and configuration management (`.env`). The key need is an interesting, achievable project scope utilizing these technologies.
|
||||
|
||||
**Secondary User (Potential):** Time-constrained HN readers/tech enthusiasts needing automated discussion summaries. Addressing their needs fully is outside MVP scope but informs potential future direction.
|
||||
|
||||
## Key Features / Scope (High-Level Ideas for MVP)
|
||||
|
||||
- Fetch Top HN Stories (Algolia API).
|
||||
- Fetch Limited Comments (Algolia API).
|
||||
- Local File Storage (Date-stamped folder, structured text/JSON files).
|
||||
- Attempt Basic Article Scraping (Node.js v22 native fetch, basic extraction).
|
||||
- Handle Scraping Failures (Log error, proceed with comment-only summary).
|
||||
- Generate Summaries (Local Ollama via configured endpoint: Article Summary if scraped, Discussion Summary always).
|
||||
- Format Digest Email (HTML: Article Summary (opt.), Discussion Summary, HN link, Article link).
|
||||
- Manual Email Dispatch (Nodemailer, credentials from `.env`, recipient list from `.env`).
|
||||
- CLI Trigger (Manual command to run full process).
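The scraping-failure behaviour in the list above (log the error, then proceed with a comment-only summary) can be sketched as follows. The scraper, the LLM call, and the error logger are injected through a hypothetical `SummaryDeps` object, and the prompt wording is a placeholder; none of these names come from the project itself.

```typescript
interface StorySummaries {
  articleSummary: string | null; // null when there is no article text
  discussionSummary: string;
}

// Collaborators injected so the fallback can be shown without real I/O (assumption).
interface SummaryDeps {
  scrapeArticle(url: string): Promise<string>;
  summarize(prompt: string): Promise<string>;
  logError(message: string): void;
}

async function summarizeStory(
  deps: SummaryDeps,
  articleUrl: string | null,
  comments: string[],
): Promise<StorySummaries> {
  let articleText: string | null = null;

  if (articleUrl !== null) {
    try {
      articleText = await deps.scrapeArticle(articleUrl);
    } catch (error) {
      // A scraping failure is logged and must not stop the run.
      const reason = error instanceof Error ? error.message : String(error);
      deps.logError(`Scraping failed for ${articleUrl}: ${reason}`);
    }
  }

  // The article summary is only attempted when scraping produced text.
  const articleSummary =
    articleText === null
      ? null
      : await deps.summarize(`Summarize this article: ${articleText}`);

  // The discussion summary is always generated.
  const discussionSummary = await deps.summarize(
    `Summarize these comments: ${comments.join("\n\n")}`,
  );

  return { articleSummary, discussionSummary };
}
```

Only the scraping call is wrapped in `try`/`catch` here, so an LLM failure would still propagate to the caller; whether that should also be tolerated is left open by the brief.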
|
||||
|
||||
**Explicitly OUT of Scope for MVP:** Advanced scraping (JS render, anti-bot), processing _all_ comments/MapReduce summaries, automated scheduling (cron), database integration, cloud deployment/web frontend, user management (sign-ups etc.), production-grade error handling/monitoring/deliverability, fine-tuning LLM prompts, sophisticated retry logic.
|
||||
|
||||
## Known Technical Constraints or Preferences
|
||||
|
||||
- **Constraints/Preferences:**
|
||||
|
||||
- **Language/Runtime:** TypeScript running on Node.js v22.
|
||||
- **Execution Environment:** Local machine execution for MVP.
|
||||
- **Trigger Mechanism:** Manual CLI trigger only for MVP.
|
||||
- **Configuration Management:** Use a `.env` file for configuration: LLM endpoint URL, email credentials, recipient email list, and potentially comment fetch limits.
|
||||
- **HTTP Requests:** Use Node.js v22 native fetch API (no Axios).
|
||||
- **HN Data Source:** Algolia HN Search API.
|
||||
- **Web Scraping:** Basic, best-effort only (native fetch + static HTML extraction). Must handle failures gracefully.
|
||||
- **LLM Integration:** Local Ollama via configurable endpoint for MVP. Design for potential swap to cloud LLMs. Functionality over quality for MVP.
|
||||
- **Summarization Strategy:** Separate Article/Discussion summaries. Limit comments processed per story (configurable). No MapReduce.
|
||||
- **Data Storage:** Local file system (structured text/JSON in date-stamped folders). No database.
|
||||
- **Email Delivery:** Nodemailer. Read credentials and recipient list from `.env`. Basic setup, no production deliverability focus.
|
||||
- **Primary Goal Context:** Focus on functional pipeline for learning/demonstration.
|
||||
|
||||
- **Risks:**
|
||||
- Algolia HN API Issues: Changes, rate limits, availability.
|
||||
- Web Scraping Fragility: High likelihood of failure limiting Article Summaries.
|
||||
- LLM Variability & Quality: Inconsistent performance/quality from local Ollama; potential errors.
|
||||
- Incomplete Discussion Capture: Limited comment fetching may miss key insights.
|
||||
- Email Configuration/Deliverability: Fragility of personal credentials; potential spam filtering.
|
||||
- Manual Trigger Dependency: Digest only generated on manual execution.
|
||||
- Configuration Errors: Incorrect `.env` settings could break the application.
|
||||
_(User Note: Risks acknowledged and accepted given the project's learning goals.)_
|
||||
|
||||
## Relevant Research (Optional)
|
||||
|
||||
Feasibility: Core concept confirmed technically feasible with available APIs/libraries.
|
||||
Existing Tools & Market Context: Similar tools exist (validating interest), but daily email format appears distinct.
|
||||
API Selection: Algolia HN Search API chosen for filtering/sorting capabilities.
|
||||
Identified Technical Challenges: Confirmed complexities of scraping and handling large comment volumes within LLM limits, informing MVP scope.
|
||||
Local LLM Viability: Ollama confirmed as viable for local MVP development/testing, with potential for future swapping.
|
||||
|
||||
## PM Prompt
|
||||
|
||||
**PM Agent Handoff Prompt: BMad Hacker Daily Digest**
|
||||
|
||||
**Summary of Key Insights:**
|
||||
|
||||
This Project Brief outlines the "BMad Hacker Daily Digest," a command-line tool designed to provide daily email summaries of discussions from top Hacker News (HN) comment threads. The core problem is the time required to read lengthy but valuable HN discussions. The MVP aims to fetch the top 10 HN stories, retrieve a limited set of comments via the Algolia HN API, attempt basic scraping of linked articles (with fallback), generate separate summaries for articles (if scraped) and comments using a local LLM (Ollama), and email the digest to the developer using Nodemailer. This project primarily serves as a learning exercise and demonstration of agent-driven development in TypeScript.
|
||||
|
||||
**Areas Requiring Special Attention (for PRD):**
|
||||
|
||||
- **Comment Selection Logic:** Define the specific criteria for selecting the "limited set" of comments from Algolia (e.g., number of comments, recency, token count limit).
|
||||
- **Basic Scraping Implementation:** Detail the exact steps for the basic article scraping attempt (libraries like Node.js native fetch, article-extractor/Cheerio), including specific error handling and the fallback mechanism.
|
||||
- **LLM Prompting:** Define the precise prompts for generating the "Article Summary" and the "Discussion Summary" separately.
|
||||
- **Email Formatting:** Specify the exact structure, layout, and content presentation within the daily HTML email digest.
|
||||
- **CLI Interface:** Define the specific command(s), arguments, and expected output/feedback for the manual trigger.
|
||||
- **Local File Structure:** Define the structure for storing intermediate data and logs in local text files within date-stamped folders.
|
||||
|
||||
**Development Context:**
|
||||
|
||||
This brief was developed through iterative discussion, starting from general app ideas and refining scope based on user interest (HN discussions) and technical feasibility for a learning/demo project. Key decisions include prioritizing comment summarization, using the Algolia HN API, starting with local execution (Ollama, Nodemailer), and including only a basic, best-effort scraping attempt in the MVP.
|
||||
|
||||
**Guidance on PRD Detail:**
|
||||
|
||||
- Focus detailed requirements and user stories on the core data pipeline: HN API Fetch -> Comment Selection -> Basic Scrape Attempt -> LLM Summarization (x2) -> Email Formatting/Sending -> CLI Trigger.
|
||||
- Keep potential post-MVP enhancements (cloud deployment, frontend, database, advanced scraping, scheduling) as high-level future considerations.
|
||||
- Technical implementation details for API/LLM interaction should allow flexibility for potential future swapping (e.g., Ollama to cloud LLM).
|
||||
|
||||
**User Preferences:**
|
||||
|
||||
- Execution: Manual CLI trigger for MVP.
|
||||
- Data Storage: Local text files for MVP.
|
||||
- LLM: Ollama for local development/MVP. Ability to potentially switch to cloud API later.
|
||||
- Summaries: Generate separate summaries for article (if available) and comments.
|
||||
- API: Use Algolia HN Search API.
|
||||
- Email: Use Nodemailer for self-send in MVP.
|
||||
- Tech Stack: TypeScript, Node.js v22.
|
||||
@@ -1,189 +0,0 @@
|
||||
# BMad Hacker Daily Digest Product Requirements Document (PRD)
|
||||
|
||||
## Intro
|
||||
|
||||
The BMad Hacker Daily Digest is a command-line tool designed to address the time-consuming nature of reading extensive Hacker News (HN) comment threads. It aims to provide users with a time-efficient way to grasp the collective intelligence and key insights from discussions on top HN stories. The service will fetch the top 10 HN stories daily, retrieve a configurable number of comments for each, attempt to scrape the linked article, generate separate summaries for the article (if scraped) and the comment discussion using a local LLM, and deliver these summaries in a single daily email briefing triggered manually. This project also serves as a practical learning exercise in agent-driven development, TypeScript, Node.js, API integration, and local LLM usage, starting from the provided "bmad-boilerplate" template.
|
||||
|
||||
## Goals and Context
|
||||
|
||||
- **Project Objectives:**
|
||||
- Provide a quick, reliable, automated way to stay informed about key HN discussions without reading full threads.
|
||||
- Successfully fetch top 10 HN story metadata via Algolia HN API.
|
||||
- Retrieve a _configurable_ number of comments per story (default 50) via Algolia HN API.
|
||||
- Attempt basic scraping of linked article content, handling failures gracefully.
|
||||
- Generate distinct Article Summaries (if scraped) and Discussion Summaries using a local LLM (Ollama).
|
||||
- Assemble summaries for 10 stories into an HTML email and send via Nodemailer upon manual CLI trigger.
|
||||
- Serve as a learning platform for agent-driven development, TypeScript, Node.js v22, API integration, local LLMs, and configuration management, leveraging the "bmad-boilerplate" structure and tooling.
|
||||
- **Measurable Outcomes:**
|
||||
- The tool completes its full process (fetch, scrape attempt, summarize, email) without crashing on manual CLI trigger across multiple test runs.
|
||||
- The generated email digest consistently contains results for 10 stories, including correct links, discussion summaries, and article summaries where scraping was successful.
|
||||
- Errors during article scraping are logged, and the process continues for affected stories using only comment summaries, without halting the script.
|
||||
- **Success Criteria:**
|
||||
- Successful execution of the end-to-end process via CLI trigger for 3 consecutive test runs.
|
||||
- Generated email is successfully sent and received, containing summaries for all 10 fetched stories (article summary optional based on scraping success).
|
||||
- Scraping failures are logged appropriately without stopping the overall process.
|
||||
- **Key Performance Indicators (KPIs):**
|
||||
- Successful Runs / Total Runs (Target: 100% for MVP tests)
|
||||
- Stories with Article Summaries / Total Stories (Measures scraping effectiveness)
|
||||
- Stories with Discussion Summaries / Total Stories (Target: 100%)
|
||||
- Manual Qualitative Check: Relevance and coherence of summaries in the digest.
|
||||
|
||||
## Scope and Requirements (MVP / Current Version)
|
||||
|
||||
### Functional Requirements (High-Level)
|
||||
|
||||
- **HN Story Fetching:** Retrieve IDs and metadata (title, URL, HN link) for the top 10 stories from Algolia HN Search API.
|
||||
- **HN Comment Fetching:** For each story, retrieve comments from Algolia HN Search API up to a maximum count defined in a `.env` configuration variable (`MAX_COMMENTS_PER_STORY`, default 50).
|
||||
- **Article Content Scraping:** Attempt to fetch HTML and extract main text content from the story's external URL using basic methods (e.g., Node.js native fetch, optionally `article-extractor` or similar basic library).
|
||||
- **Scraping Failure Handling:** If scraping fails, log the error and proceed with generating only the Discussion Summary for that story.
|
||||
- **LLM Summarization:**
|
||||
- Generate an "Article Summary" from scraped text (if successful) using a configured local LLM (Ollama endpoint).
|
||||
- Generate a "Discussion Summary" from the fetched comments using the same LLM.
|
||||
- Initial Prompts (Placeholders - refine in Epics):
|
||||
- _Article Prompt:_ "Summarize the key points of the following article text: {Article Text}"
|
||||
- _Discussion Prompt:_ "Summarize the main themes, viewpoints, and key insights from the following Hacker News comments: {Comment Texts}"
|
||||
- **Digest Formatting:** Combine results for the 10 stories into a single HTML email. Each story entry should include: Story Title, HN Link, Article Link, Article Summary (if available), Discussion Summary.
|
||||
- **Email Dispatch:** Send the formatted HTML email using Nodemailer to a recipient list defined in `.env`. Use credentials also stored in `.env`.
|
||||
- **Main Execution Trigger:** Initiate the _entire implemented pipeline_ via a manual command-line interface (CLI) trigger, using the standard scripts defined in the boilerplate (`npm run dev`, `npm start` after build). Each functional epic should add its capability to this main execution flow.
|
||||
- **Configuration:** Manage external parameters (Algolia API details (if needed), LLM endpoint URL, `MAX_COMMENTS_PER_STORY`, Nodemailer credentials, recipient email list, output directory path) via a `.env` file, based on the provided `.env.example`.
|
||||
- **Incremental Logging & Data Persistence:**
|
||||
- Implement basic console logging for key steps and errors throughout the pipeline.
|
||||
- Persist intermediate data artifacts (fetched stories/comments, scraped text, generated summaries) to local files within a configurable, date-stamped directory structure (e.g., `./output/YYYY-MM-DD/`).
|
||||
- This persistence should be implemented incrementally within the relevant functional epics (Data Acquisition, Scraping, Summarization).
|
||||
- **Stage Testing Utilities:**
|
||||
- Provide separate utility scripts or CLI commands to allow testing individual pipeline stages in isolation (e.g., fetching HN data, scraping URLs, summarizing text, sending email).
|
||||
- These utilities should support using locally saved files as input (e.g., test scraping using a file containing story URLs, test summarization using a file containing text). This facilitates development and debugging.
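
The placeholder prompts listed under LLM Summarization can be turned into small builder functions. The following is a minimal sketch, not the project's actual implementation: the helper names `buildArticlePrompt` and `buildDiscussionPrompt` are hypothetical, and joining comments with blank lines is an assumption.

```typescript
// Placeholder prompt templates quoted from the requirements above.
const ARTICLE_PROMPT =
  "Summarize the key points of the following article text: {Article Text}";
const DISCUSSION_PROMPT =
  "Summarize the main themes, viewpoints, and key insights from the following Hacker News comments: {Comment Texts}";

// Hypothetical helper: substitutes the scraped article text into the template.
// A replacer function is used so "$" sequences in the text are inserted literally.
function buildArticlePrompt(articleText: string): string {
  return ARTICLE_PROMPT.replace("{Article Text}", () => articleText.trim());
}

// Hypothetical helper: drops empty comments and joins the rest with blank lines
// (the separator is an assumption, not something the PRD specifies).
function buildDiscussionPrompt(commentTexts: string[]): string {
  const joined = commentTexts
    .map((comment) => comment.trim())
    .filter((comment) => comment.length > 0)
    .join("\n\n");
  return DISCUSSION_PROMPT.replace("{Comment Texts}", () => joined);
}
```

Keeping the templates as constants makes it easy to refine the wording later in the epics without touching the calling code.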
|
||||
|
||||
### Non-Functional Requirements (NFRs)
|
||||
|
||||
- **Performance:** MVP focuses on functionality over speed. A full run should complete within a reasonable time (e.g., < 5 minutes) on a typical developer machine running the local LLM. No specific response time targets.
|
||||
- **Scalability:** Designed for single-user, local execution. No scaling requirements for MVP.
|
||||
- **Reliability/Availability:**
|
||||
- The script must handle article scraping failures gracefully (log and continue).
|
||||
- Basic error handling for API calls (e.g., log network errors).
|
||||
- Local LLM interaction may fail; basic error logging is sufficient for MVP.
|
||||
- No requirement for automated retries or production-grade error handling.
|
||||
- **Security:**
|
||||
- Email credentials must be stored securely via `.env` file and not committed to version control (as per boilerplate `.gitignore`).
|
||||
- No other specific security requirements for local MVP.
|
||||
- **Maintainability:**
|
||||
- Code should be well-structured TypeScript.
|
||||
- Adherence to the linting (ESLint) and formatting (Prettier) rules configured in the "bmad-boilerplate" is required. Use `npm run lint` and `npm run format`.
|
||||
- Modularity is desired to potentially swap LLM providers later and facilitate stage testing.
|
||||
- **Usability/Accessibility:** N/A (CLI tool for developer).
|
||||
- **Other Constraints:**
|
||||
- Must use TypeScript and Node.js v22.
|
||||
- Must run locally on the developer's machine.
|
||||
- Must use Node.js v22 native `fetch` API for HTTP requests.
|
||||
- Must use Algolia HN Search API for HN data.
|
||||
- Must use a local Ollama instance via a configurable HTTP endpoint.
|
||||
- Must use Nodemailer for email dispatch.
|
||||
- Must use `.env` for configuration based on `.env.example`.
|
||||
- Must use local file system for logging and intermediate data storage. Ensure output/log directories are gitignored.
|
||||
- Focus on a functional pipeline for learning/demonstration.
|
||||
|
||||
### User Experience (UX) Requirements (High-Level)
|
||||
|
||||
- The primary UX goal is to deliver a time-saving digest.
|
||||
- For the developer user, the main CLI interaction should be simple: using standard boilerplate scripts like `npm run dev` or `npm start` to trigger the full process.
|
||||
- Feedback during CLI execution (e.g., "Fetching stories...", "Summarizing story X/10...", "Sending email...") is desirable via console logging.
|
||||
- Separate CLI commands/scripts for testing individual stages should provide clear input/output mechanisms.
|
||||
|
||||
### Integration Requirements (High-Level)
|
||||
|
||||
- **Algolia HN Search API:** Fetching top stories and comments. Requires understanding API structure and query parameters.
|
||||
- **Ollama Service:** Sending text (article content, comments) and receiving summaries via its API endpoint. Endpoint URL must be configurable.
|
||||
- **SMTP Service (via Nodemailer):** Sending the final digest email. Requires valid SMTP credentials and recipient list configured in `.env`.
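
As a rough illustration of the Algolia integration, the request URLs might be built as below. This is a sketch under stated assumptions: the base URL `https://hn.algolia.com/api/v1`, the `front_page` and `comment,story_<id>` tag filters, and the `hitsPerPage` parameter are taken from the public HN Search API as commonly documented, and should be verified against the API reference. The function names are hypothetical.

```typescript
// Assumption: public Algolia HN Search API base URL.
const HN_API_BASE = "https://hn.algolia.com/api/v1";

// Hypothetical helper: URL for the current front-page stories.
function topStoriesUrl(count: number = 10): string {
  return `${HN_API_BASE}/search?tags=front_page&hitsPerPage=${count}`;
}

// Hypothetical helper: URL for up to `maxComments` comments on one story.
// `maxComments` would come from MAX_COMMENTS_PER_STORY in `.env`.
function storyCommentsUrl(storyId: string, maxComments: number): string {
  const tags = encodeURIComponent(`comment,story_${storyId}`);
  return `${HN_API_BASE}/search?tags=${tags}&hitsPerPage=${maxComments}`;
}
```

The resulting strings can be passed directly to the native `fetch` API.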
|
||||
|
||||
### Testing Requirements (High-Level)
|
||||
|
||||
- MVP success relies on manual end-to-end test runs confirming successful execution and valid email output.
|
||||
- Unit/integration tests are encouraged using the **Jest framework configured in the boilerplate**. Focus testing effort on the core pipeline components. Use `npm run test`.
|
||||
- **Stage-specific testing utilities (as defined in Functional Requirements) are required** to support development and verification of individual pipeline components.
|
||||
|
||||
## Epic Overview (MVP / Current Version)
|
||||
|
||||
_(Revised proposal)_
|
||||
|
||||
- **Epic 1: Project Initialization & Core Setup** - Goal: Initialize the project using "bmad-boilerplate", manage dependencies, set up `.env` and config loading, establish a basic CLI entry point, and set up basic logging and the output directory structure.
|
||||
- **Epic 2: HN Data Acquisition & Persistence** - Goal: Implement fetching top 10 stories and their comments (respecting limits) from Algolia HN API, and persist this raw data locally. Implement stage testing utility for fetching.
|
||||
- **Epic 3: Article Scraping & Persistence** - Goal: Implement best-effort article scraping/extraction, handle failures gracefully, and persist scraped text locally. Implement stage testing utility for scraping.
|
||||
- **Epic 4: LLM Summarization & Persistence** - Goal: Integrate with Ollama to generate article/discussion summaries from persisted data and persist summaries locally. Implement stage testing utility for summarization.
|
||||
- **Epic 5: Digest Assembly & Email Dispatch** - Goal: Format collected summaries into an HTML email using persisted data and send it using Nodemailer. Implement stage testing utility for emailing (with dry-run option).
|
||||
|
||||
## Key Reference Documents
|
||||
|
||||
- `docs/project-brief.md`
|
||||
- `docs/prd.md` (This document)
|
||||
- `docs/architecture.md` (To be created by Architect)
|
||||
- `docs/epic1.md`, `docs/epic2.md`, ... (To be created)
|
||||
- `docs/tech-stack.md` (Partially defined by boilerplate, to be finalized by Architect)
|
||||
- `docs/api-reference.md` (If needed for Algolia/Ollama details)
|
||||
- `docs/testing-strategy.md` (Optional - low priority for MVP, Jest setup provided)
|
||||
|
||||
## Post-MVP / Future Enhancements
|
||||
|
||||
- Advanced scraping techniques (handling JavaScript, anti-bot measures).
|
||||
- Processing all comments (potentially using MapReduce summarization).
|
||||
- Automated scheduling (e.g., using cron).
|
||||
- Database integration for storing results or tracking.
|
||||
- Cloud deployment and web frontend.
|
||||
- User management (sign-ups, preferences).
|
||||
- Production-grade error handling, monitoring, and email deliverability.
|
||||
- Fine-tuning LLM prompts or models.
|
||||
- Sophisticated retry logic for API calls or scraping.
|
||||
- Cloud LLM integration.
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ----------------------- | ---------- | ------- | --------------------------------------- | ------ |
|
||||
| Refined Epics & Testing | 2025-05-04 | 0.3 | Removed Epic 6, added stage testing req | 2-pm |
|
||||
| Boilerplate Added | 2025-05-04 | 0.2 | Updated to reflect use of boilerplate | 2-pm |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft based on brief | 2-pm |
|
||||
|
||||
## Initial Architect Prompt
|
||||
|
||||
### Technical Infrastructure
|
||||
|
||||
- **Starter Project/Template:** **Mandatory: Use the provided "bmad-boilerplate".** This includes TypeScript setup, Node.js v22 compatibility, Jest, ESLint, Prettier, `ts-node`, `.env` handling via `.env.example`, and standard scripts (`dev`, `build`, `test`, `lint`, `format`).
|
||||
- **Hosting/Cloud Provider:** Local machine execution only for MVP. No cloud deployment.
|
||||
- **Frontend Platform:** N/A (CLI tool).
|
||||
- **Backend Platform:** Node.js v22 with TypeScript (as provided by the boilerplate). No specific Node.js framework mandated, but structure should support modularity and align with boilerplate setup.
|
||||
- **Database Requirements:** None. Local file system for intermediate data storage and logging only. Structure TBD (e.g., `./output/YYYY-MM-DD/`). Ensure output directory is configurable via `.env` and gitignored.
|
||||
|
||||
### Technical Constraints
|
||||
|
||||
- Must adhere to the structure and tooling provided by "bmad-boilerplate".
|
||||
- Must use Node.js v22 native `fetch` for HTTP requests.
|
||||
- Must use the Algolia HN Search API for fetching HN data.
|
||||
- Must integrate with a local Ollama instance via a configurable HTTP endpoint. Design should allow potential swapping to other LLM APIs later.
|
||||
- Must use Nodemailer for sending email.
|
||||
- Configuration (LLM endpoint, email credentials, recipients, `MAX_COMMENTS_PER_STORY`, output dir path) must be managed via a `.env` file based on `.env.example`.
|
||||
- Article scraping must be basic, best-effort, and handle failures gracefully without stopping the main process.
|
||||
- Intermediate data must be persisted locally incrementally.
|
||||
- Code must adhere to the ESLint and Prettier configurations within the boilerplate.
|
||||
|
||||
### Deployment Considerations
|
||||
|
||||
- Execution is manual via CLI trigger only, using `npm run dev` or `npm start`.
|
||||
- No CI/CD required for MVP.
|
||||
- Single environment: local development machine.
|
||||
|
||||
### Local Development & Testing Requirements
|
||||
|
||||
- The entire application runs locally.
|
||||
- The main CLI command (`npm run dev`/`start`) should execute the _full implemented pipeline_.
|
||||
- **Separate utility scripts/commands MUST be provided** for testing individual pipeline stages (fetch, scrape, summarize, email) potentially using local file I/O. Architecture should facilitate creating these stage runners. (e.g., `npm run stage:fetch`, `npm run stage:scrape -- --inputFile <path>`, `npm run stage:summarize -- --inputFile <path>`, `npm run stage:email -- --inputFile <path> [--dry-run]`).
|
||||
- The boilerplate provides `npm run test` using Jest for running automated unit/integration tests.
|
||||
- The boilerplate provides `npm run lint` and `npm run format` for code quality checks.
|
||||
- Basic console logging is required. File logging can be considered by the architect.
|
||||
- Testability of individual modules (API clients, scraper, summarizer, emailer) is crucial and should leverage the Jest setup and stage testing utilities.
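
The stage runner commands above all share the `--inputFile <path>` and `--dry-run` flags, so a small shared argument parser may be useful. The sketch below is one possible shape, hand-written to avoid extra dependencies; the `StageArgs` type and `parseStageArgs` function are hypothetical names.

```typescript
// Hypothetical shape of the arguments accepted by the stage runners.
interface StageArgs {
  inputFile?: string;
  dryRun: boolean;
}

// Parses the arguments that follow `--` in e.g.
// `npm run stage:email -- --inputFile <path> --dry-run`.
function parseStageArgs(argv: string[]): StageArgs {
  const args: StageArgs = { dryRun: false };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--dry-run") {
      args.dryRun = true;
    } else if (arg === "--inputFile") {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith("--")) {
        throw new Error("--inputFile requires a path");
      }
      args.inputFile = value;
      i++; // skip the value that was just consumed
    } else {
      throw new Error(`Unknown argument: ${arg}`);
    }
  }
  return args;
}
```

A stage script would typically call `parseStageArgs(process.argv.slice(2))` at startup and fail fast on unknown flags.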
|
||||
|
||||
### Other Technical Considerations
|
||||
|
||||
- **Modularity:** Design components (HN client, scraper, LLM client, emailer) with clear interfaces to facilitate potential future modifications (e.g., changing LLM provider) and independent stage testing.
|
||||
- **Error Handling:** Focus on robust handling of scraping failures and basic handling of API/network errors. Implement within the boilerplate structure. Logging should clearly indicate errors.
|
||||
- **Resource Management:** Be mindful of local resources when interacting with the LLM, although optimization is not a primary MVP goal.
|
||||
- **Dependency Management:** Add necessary production dependencies (e.g., `nodemailer`, potentially `article-extractor`, libraries for date handling or file system operations if needed) to the boilerplate's `package.json`. Keep dependencies minimal.
|
||||
- **Configuration Loading:** Implement a robust way to load and validate settings from the `.env` file early in the application startup.
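
Configuration loading and validation could look roughly like the following. This is a sketch only: apart from `MAX_COMMENTS_PER_STORY`, the variable names (`OLLAMA_ENDPOINT_URL`, `EMAIL_RECIPIENTS`, `OUTPUT_DIR_PATH`) are hypothetical and would need to match the boilerplate's `.env.example`. The function takes the environment as a parameter so it can be tested without touching `process.env`.

```typescript
// Hypothetical config shape; field names are illustrative.
interface AppConfig {
  ollamaEndpointUrl: string;
  maxCommentsPerStory: number;
  outputDirPath: string;
  emailRecipients: string[];
}

type Env = Record<string, string | undefined>;

// Returns the trimmed value or throws if the variable is missing or empty.
function requireVar(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Validates all settings up front so misconfiguration fails at startup.
function loadConfig(env: Env): AppConfig {
  const rawMax = env["MAX_COMMENTS_PER_STORY"]?.trim() || "50";
  const maxCommentsPerStory = Number(rawMax);
  if (!Number.isInteger(maxCommentsPerStory) || maxCommentsPerStory <= 0) {
    throw new Error(`MAX_COMMENTS_PER_STORY must be a positive integer, got "${rawMax}"`);
  }
  return {
    ollamaEndpointUrl: requireVar(env, "OLLAMA_ENDPOINT_URL"),
    maxCommentsPerStory,
    outputDirPath: env["OUTPUT_DIR_PATH"]?.trim() || "./output",
    emailRecipients: requireVar(env, "EMAIL_RECIPIENTS")
      .split(",")
      .map((address) => address.trim())
      .filter((address) => address.length > 0),
  };
}
```

At startup the application would call `loadConfig(process.env)` once, after the `.env` file has been loaded, and pass the resulting object to the pipeline.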
|
||||
@@ -1,189 +0,0 @@
|
||||
# BMad Hacker Daily Digest Product Requirements Document (PRD)
|
||||
|
||||
## Intro
|
||||
|
||||
The BMad Hacker Daily Digest is a command-line tool designed to address the time-consuming nature of reading extensive Hacker News (HN) comment threads. It aims to provide users with a time-efficient way to grasp the collective intelligence and key insights from discussions on top HN stories. The service will fetch the top 10 HN stories daily, retrieve a configurable number of comments for each, attempt to scrape the linked article, generate separate summaries for the article (if scraped) and the comment discussion using a local LLM, and deliver these summaries in a single daily email briefing triggered manually. This project also serves as a practical learning exercise in agent-driven development, TypeScript, Node.js, API integration, and local LLM usage, starting from the provided "bmad-boilerplate" template.
|
||||
|
||||
## Goals and Context
|
||||
|
||||
- **Project Objectives:**
|
||||
- Provide a quick, reliable, automated way to stay informed about key HN discussions without reading full threads.
|
||||
- Successfully fetch top 10 HN story metadata via Algolia HN API.
|
||||
- Retrieve a _configurable_ number of comments per story (default 50) via Algolia HN API.
|
||||
- Attempt basic scraping of linked article content, handling failures gracefully.
|
||||
- Generate distinct Article Summaries (if scraped) and Discussion Summaries using a local LLM (Ollama).
|
||||
- Assemble summaries for 10 stories into an HTML email and send via Nodemailer upon manual CLI trigger.
|
||||
- Serve as a learning platform for agent-driven development, TypeScript, Node.js v22, API integration, local LLMs, and configuration management, leveraging the "bmad-boilerplate" structure and tooling.
|
||||
- **Measurable Outcomes:**
|
||||
- The tool completes its full process (fetch, scrape attempt, summarize, email) without crashing on manual CLI trigger across multiple test runs.
|
||||
- The generated email digest consistently contains results for 10 stories, including correct links, discussion summaries, and article summaries where scraping was successful.
|
||||
- Errors during article scraping are logged, and the process continues for affected stories using only comment summaries, without halting the script.
|
||||
- **Success Criteria:**
|
||||
- Successful execution of the end-to-end process via CLI trigger for 3 consecutive test runs.
|
||||
- Generated email is successfully sent and received, containing summaries for all 10 fetched stories (article summary optional based on scraping success).
|
||||
- Scraping failures are logged appropriately without stopping the overall process.
|
||||
- **Key Performance Indicators (KPIs):**
|
||||
- Successful Runs / Total Runs (Target: 100% for MVP tests)
|
||||
- Stories with Article Summaries / Total Stories (Measures scraping effectiveness)
|
||||
- Stories with Discussion Summaries / Total Stories (Target: 100%)
|
||||
- Manual Qualitative Check: Relevance and coherence of summaries in the digest.
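
The two per-story KPIs above are simple ratios over the stories in one run. A minimal sketch of how they might be computed follows; the `StoryResult` shape and `digestKpis` function are hypothetical and not part of the PRD.

```typescript
// Hypothetical per-story outcome; null means the summary was not produced.
interface StoryResult {
  articleSummary: string | null;
  discussionSummary: string | null;
}

interface DigestKpis {
  articleSummaryRate: number;
  discussionSummaryRate: number;
}

// Computes "Stories with X Summaries / Total Stories" for one run.
function digestKpis(results: StoryResult[]): DigestKpis {
  const total = results.length;
  const ratio = (count: number): number => (total === 0 ? 0 : count / total);
  return {
    articleSummaryRate: ratio(results.filter((r) => r.articleSummary !== null).length),
    discussionSummaryRate: ratio(results.filter((r) => r.discussionSummary !== null).length),
  };
}
```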
|
||||
|
||||
## Scope and Requirements (MVP / Current Version)
|
||||
|
||||
### Functional Requirements (High-Level)
|
||||
|
||||
- **HN Story Fetching:** Retrieve IDs and metadata (title, URL, HN link) for the top 10 stories from Algolia HN Search API.
|
||||
- **HN Comment Fetching:** For each story, retrieve comments from Algolia HN Search API up to a maximum count defined in a `.env` configuration variable (`MAX_COMMENTS_PER_STORY`, default 50).
|
||||
- **Article Content Scraping:** Attempt to fetch HTML and extract main text content from the story's external URL using basic methods (e.g., Node.js native fetch, optionally `article-extractor` or similar basic library).
|
||||
- **Scraping Failure Handling:** If scraping fails, log the error and proceed with generating only the Discussion Summary for that story.
|
||||
- **LLM Summarization:**
|
||||
- Generate an "Article Summary" from scraped text (if successful) using a configured local LLM (Ollama endpoint).
|
||||
- Generate a "Discussion Summary" from the fetched comments using the same LLM.
|
||||
- Initial Prompts (Placeholders - refine in Epics):
|
||||
- _Article Prompt:_ "Summarize the key points of the following article text: {Article Text}"
|
||||
- _Discussion Prompt:_ "Summarize the main themes, viewpoints, and key insights from the following Hacker News comments: {Comment Texts}"
|
||||
- **Digest Formatting:** Combine results for the 10 stories into a single HTML email. Each story entry should include: Story Title, HN Link, Article Link, Article Summary (if available), Discussion Summary.
|
||||
- **Email Dispatch:** Send the formatted HTML email using Nodemailer to a recipient list defined in `.env`. Use credentials also stored in `.env`.
|
||||
- **Main Execution Trigger:** Initiate the _entire implemented pipeline_ via a manual command-line interface (CLI) trigger, using the standard scripts defined in the boilerplate (`npm run dev`, `npm start` after build). Each functional epic should add its capability to this main execution flow.
|
||||
- **Configuration:** Manage external parameters (Algolia API details (if needed), LLM endpoint URL, `MAX_COMMENTS_PER_STORY`, Nodemailer credentials, recipient email list, output directory path) via a `.env` file, based on the provided `.env.example`.
|
||||
- **Incremental Logging & Data Persistence:**
|
||||
- Implement basic console logging for key steps and errors throughout the pipeline.
|
||||
- Persist intermediate data artifacts (fetched stories/comments, scraped text, generated summaries) to local files within a configurable, date-stamped directory structure (e.g., `./output/YYYY-MM-DD/`).
|
||||
- This persistence should be implemented incrementally within the relevant functional epics (Data Acquisition, Scraping, Summarization).
|
||||
- **Stage Testing Utilities:**
|
||||
- Provide separate utility scripts or CLI commands to allow testing individual pipeline stages in isolation (e.g., fetching HN data, scraping URLs, summarizing text, sending email).
|
||||
- These utilities should support using locally saved files as input (e.g., test scraping using a file containing story URLs, test summarization using a file containing text). This facilitates development and debugging.
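
The Digest Formatting requirement can be illustrated with a small rendering function. The sketch below assumes a hypothetical `DigestEntry` shape and a very plain HTML layout; the real template is left to the email epic. The article summary section is omitted when scraping did not produce one.

```typescript
// Hypothetical entry shape covering the fields listed under Digest Formatting.
interface DigestEntry {
  title: string;
  hnLink: string;
  articleLink: string;
  articleSummary?: string; // absent when scraping failed
  discussionSummary: string;
}

// Escapes characters that are significant in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Renders one story; the article summary block is skipped if unavailable.
function renderEntry(entry: DigestEntry): string {
  const article = entry.articleSummary
    ? `<h3>Article Summary</h3><p>${escapeHtml(entry.articleSummary)}</p>`
    : "";
  return [
    "<section>",
    `<h2>${escapeHtml(entry.title)}</h2>`,
    `<p><a href="${escapeHtml(entry.hnLink)}">HN discussion</a> | <a href="${escapeHtml(entry.articleLink)}">Article</a></p>`,
    article,
    `<h3>Discussion Summary</h3><p>${escapeHtml(entry.discussionSummary)}</p>`,
    "</section>",
  ]
    .filter((line) => line.length > 0)
    .join("\n");
}

// Combines all entries into a single HTML document for the email body.
function renderDigest(entries: DigestEntry[]): string {
  const body = entries.map(renderEntry).join("\n");
  return `<html><body>\n<h1>Hacker News Daily Digest</h1>\n${body}\n</body></html>`;
}
```

The returned string could be used as the `html` field of the message passed to Nodemailer.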
|
||||
|
||||
### Non-Functional Requirements (NFRs)
|
||||
|
||||
- **Performance:** MVP focuses on functionality over speed. A full run should complete within a reasonable time (e.g., < 5 minutes) on a typical developer machine running the local LLM. No specific response time targets.
|
||||
- **Scalability:** Designed for single-user, local execution. No scaling requirements for MVP.
|
||||
- **Reliability/Availability:**
|
||||
- The script must handle article scraping failures gracefully (log and continue).
|
||||
- Basic error handling for API calls (e.g., log network errors).
|
||||
- Local LLM interaction may fail; basic error logging is sufficient for MVP.
|
||||
- No requirement for automated retries or production-grade error handling.
|
||||
- **Security:**
|
||||
- Email credentials must be stored securely via `.env` file and not committed to version control (as per boilerplate `.gitignore`).
|
||||
- No other specific security requirements for local MVP.
|
||||
- **Maintainability:**
|
||||
- Code should be well-structured TypeScript.
|
||||
- Adherence to the linting (ESLint) and formatting (Prettier) rules configured in the "bmad-boilerplate" is required. Use `npm run lint` and `npm run format`.
|
||||
- Modularity is desired to potentially swap LLM providers later and facilitate stage testing.
|
||||
- **Usability/Accessibility:** N/A (CLI tool for developer).
|
||||
- **Other Constraints:**
|
||||
- Must use TypeScript and Node.js v22.
|
||||
- Must run locally on the developer's machine.
|
||||
- Must use Node.js v22 native `fetch` API for HTTP requests.
|
||||
- Must use Algolia HN Search API for HN data.
|
||||
- Must use a local Ollama instance via a configurable HTTP endpoint.
|
||||
- Must use Nodemailer for email dispatch.
|
||||
- Must use `.env` for configuration based on `.env.example`.
|
||||
- Must use local file system for logging and intermediate data storage. Ensure output/log directories are gitignored.
|
||||
- Focus on a functional pipeline for learning/demonstration.
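
The "log and continue" behaviour required for scraping failures can be sketched as a thin wrapper around whatever scraping function is eventually chosen. Everything named here (`Scraper`, `scrapeOrNull`) is hypothetical; the scraper itself is injected, so the wrapper makes no assumption about `article-extractor` or any other library.

```typescript
// Hypothetical scraper signature: resolves to the article's main text.
type Scraper = (url: string) => Promise<string>;

// Runs the scraper and converts any failure (or empty result) into null,
// logging the error so the pipeline can continue with the discussion summary only.
async function scrapeOrNull(
  url: string,
  scrape: Scraper,
  log: (message: string) => void,
): Promise<string | null> {
  try {
    const text = (await scrape(url)).trim();
    return text.length > 0 ? text : null;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    log(`Scraping failed for ${url}: ${reason}`);
    return null;
  }
}
```

A caller would pass `console.error` (or the project's logger) as `log` and skip the Article Summary step whenever the result is `null`.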
|
||||
|
||||
### User Experience (UX) Requirements (High-Level)
|
||||
|
||||
- The primary UX goal is to deliver a time-saving digest.
|
||||
- For the developer user, the main CLI interaction should be simple: using standard boilerplate scripts like `npm run dev` or `npm start` to trigger the full process.
|
||||
- Feedback during CLI execution (e.g., "Fetching stories...", "Summarizing story X/10...", "Sending email...") is desirable via console logging.
|
||||
- Separate CLI commands/scripts for testing individual stages should provide clear input/output mechanisms.
|
||||
|
||||
### Integration Requirements (High-Level)
|
||||
|
||||
- **Algolia HN Search API:** Fetching top stories and comments. Requires understanding API structure and query parameters.
|
||||
- **Ollama Service:** Sending text (article content, comments) and receiving summaries via its API endpoint. Endpoint URL must be configurable.
|
||||
- **SMTP Service (via Nodemailer):** Sending the final digest email. Requires valid SMTP credentials and recipient list configured in `.env`.
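
For the Ollama integration, the request could be assembled as in the sketch below. It assumes Ollama's `/api/generate` route with a JSON body containing `model`, `prompt`, and `stream`; these details, and the helper name `buildGenerateRequest`, are assumptions to be checked against the Ollama API documentation. The model name is supplied by the caller and is not specified by this PRD.

```typescript
// Assumed request body for Ollama's generate route (non-streaming).
interface OllamaGenerateBody {
  model: string;
  prompt: string;
  stream: false;
}

interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the HTTP request for one summarization call.
// `endpointUrl` is the configurable endpoint from `.env`.
function buildGenerateRequest(endpointUrl: string, model: string, prompt: string): HttpRequestSpec {
  const base = endpointUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  const payload: OllamaGenerateBody = { model, prompt, stream: false };
  return {
    url: `${base}/api/generate`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}
```

Separating request construction from the network call keeps the client easy to unit test and makes it simpler to swap in a different LLM provider later.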
|
||||
|
||||
### Testing Requirements (High-Level)
|
||||
|
||||
- MVP success relies on manual end-to-end test runs confirming successful execution and valid email output.
|
||||
- Unit/integration tests are encouraged using the **Jest framework configured in the boilerplate**. Focus testing effort on the core pipeline components. Use `npm run test`.
|
||||
- **Stage-specific testing utilities (as defined in Functional Requirements) are required** to support development and verification of individual pipeline components.
|
||||
|
||||
## Epic Overview (MVP / Current Version)
|
||||
|
||||
_(Revised proposal)_
|
||||
|
||||
- **Epic 1: Project Initialization & Core Setup** - Goal: Initialize the project using "bmad-boilerplate", manage dependencies, set up `.env` and config loading, establish a basic CLI entry point, and set up basic logging and the output directory structure.
|
||||
- **Epic 2: HN Data Acquisition & Persistence** - Goal: Implement fetching top 10 stories and their comments (respecting limits) from Algolia HN API, and persist this raw data locally. Implement stage testing utility for fetching.
|
||||
- **Epic 3: Article Scraping & Persistence** - Goal: Implement best-effort article scraping/extraction, handle failures gracefully, and persist scraped text locally. Implement stage testing utility for scraping.
|
||||
- **Epic 4: LLM Summarization & Persistence** - Goal: Integrate with Ollama to generate article/discussion summaries from persisted data and persist summaries locally. Implement stage testing utility for summarization.
|
||||
- **Epic 5: Digest Assembly & Email Dispatch** - Goal: Format collected summaries into an HTML email using persisted data and send it using Nodemailer. Implement stage testing utility for emailing (with dry-run option).
|
||||
|
||||
## Key Reference Documents
|
||||
|
||||
- `docs/project-brief.md`
|
||||
- `docs/prd.md` (This document)
|
||||
- `docs/architecture.md` (To be created by Architect)
|
||||
- `docs/epic1.md`, `docs/epic2.md`, ... (To be created)
|
||||
- `docs/tech-stack.md` (Partially defined by boilerplate, to be finalized by Architect)
|
||||
- `docs/api-reference.md` (If needed for Algolia/Ollama details)
|
||||
- `docs/testing-strategy.md` (Optional - low priority for MVP, Jest setup provided)
|
||||
|
||||
## Post-MVP / Future Enhancements
|
||||
|
||||
- Advanced scraping techniques (handling JavaScript, anti-bot measures).
|
||||
- Processing all comments (potentially using MapReduce summarization).
|
||||
- Automated scheduling (e.g., using cron).
|
||||
- Database integration for storing results or tracking.
|
||||
- Cloud deployment and web frontend.
|
||||
- User management (sign-ups, preferences).
|
||||
- Production-grade error handling, monitoring, and email deliverability.
|
||||
- Fine-tuning LLM prompts or models.
|
||||
- Sophisticated retry logic for API calls or scraping.
|
||||
- Cloud LLM integration.
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ----------------------- | ---------- | ------- | --------------------------------------- | ------ |
|
||||
| Refined Epics & Testing | 2025-05-04 | 0.3 | Removed Epic 6, added stage testing req | 2-pm |
|
||||
| Boilerplate Added | 2025-05-04 | 0.2 | Updated to reflect use of boilerplate | 2-pm |
|
||||
| Initial Draft | 2025-05-04 | 0.1 | First draft based on brief | 2-pm |
|
||||
|
||||
## Initial Architect Prompt
|
||||
|
||||
### Technical Infrastructure
|
||||
|
||||
- **Starter Project/Template:** **Mandatory: Use the provided "bmad-boilerplate".** This includes TypeScript setup, Node.js v22 compatibility, Jest, ESLint, Prettier, `ts-node`, `.env` handling via `.env.example`, and standard scripts (`dev`, `build`, `test`, `lint`, `format`).
|
||||
- **Hosting/Cloud Provider:** Local machine execution only for MVP. No cloud deployment.
|
||||
- **Frontend Platform:** N/A (CLI tool).
|
||||
- **Backend Platform:** Node.js v22 with TypeScript (as provided by the boilerplate). No specific Node.js framework mandated, but structure should support modularity and align with boilerplate setup.
|
||||
- **Database Requirements:** None. Local file system for intermediate data storage and logging only. Structure TBD (e.g., `./output/YYYY-MM-DD/`). Ensure output directory is configurable via `.env` and gitignored.
|
||||
|
||||
### Technical Constraints
|
||||
|
||||
- Must adhere to the structure and tooling provided by "bmad-boilerplate".
|
||||
- Must use Node.js v22 native `fetch` for HTTP requests.
|
||||
- Must use the Algolia HN Search API for fetching HN data.
|
||||
- Must integrate with a local Ollama instance via a configurable HTTP endpoint. Design should allow potential swapping to other LLM APIs later.
|
||||
- Must use Nodemailer for sending email.
|
||||
- Configuration (LLM endpoint, email credentials, recipients, `MAX_COMMENTS_PER_STORY`, output dir path) must be managed via a `.env` file based on `.env.example`.
|
||||
- Article scraping must be basic, best-effort, and handle failures gracefully without stopping the main process.
|
||||
- Intermediate data must be persisted locally incrementally.
|
||||
- Code must adhere to the ESLint and Prettier configurations within the boilerplate.
|
||||
|
||||
### Deployment Considerations
|
||||
|
||||
- Execution is manual via CLI trigger only, using `npm run dev` or `npm start`.
|
||||
- No CI/CD required for MVP.
|
||||
- Single environment: local development machine.
|
||||
|
||||
### Local Development & Testing Requirements
|
||||
|
||||
- The entire application runs locally.
|
||||
- The main CLI command (`npm run dev`/`start`) should execute the _full implemented pipeline_.
|
||||
- **Separate utility scripts/commands MUST be provided** for testing individual pipeline stages (fetch, scrape, summarize, email) potentially using local file I/O. Architecture should facilitate creating these stage runners. (e.g., `npm run stage:fetch`, `npm run stage:scrape -- --inputFile <path>`, `npm run stage:summarize -- --inputFile <path>`, `npm run stage:email -- --inputFile <path> [--dry-run]`).
|
||||
- The boilerplate provides `npm run test` using Jest for running automated unit/integration tests.
|
||||
- The boilerplate provides `npm run lint` and `npm run format` for code quality checks.
|
||||
- Basic console logging is required. File logging can be considered by the architect.
|
||||
- Testability of individual modules (API clients, scraper, summarizer, emailer) is crucial and should leverage the Jest setup and stage testing utilities.
|
||||
|
||||
### Other Technical Considerations
|
||||
|
||||
- **Modularity:** Design components (HN client, scraper, LLM client, emailer) with clear interfaces to facilitate potential future modifications (e.g., changing LLM provider) and independent stage testing.
|
||||
- **Error Handling:** Focus on robust handling of scraping failures and basic handling of API/network errors. Implement within the boilerplate structure. Logging should clearly indicate errors.
|
||||
- **Resource Management:** Be mindful of local resources when interacting with the LLM, although optimization is not a primary MVP goal.
|
||||
- **Dependency Management:** Add necessary production dependencies (e.g., `nodemailer`, potentially `article-extractor`, libraries for date handling or file system operations if needed) to the boilerplate's `package.json`. Keep dependencies minimal.
|
||||
- **Configuration Loading:** Implement a robust way to load and validate settings from the `.env` file early in the application startup.
|
||||
@@ -1,91 +0,0 @@
|
||||
# BMad Hacker Daily Digest Project Structure
|
||||
|
||||
This document outlines the standard directory and file structure for the project. Adhering to this structure ensures consistency and maintainability.
|
||||
|
||||
```plaintext
|
||||
bmad-hacker-daily-digest/
|
||||
├── .github/ # Optional: GitHub Actions workflows (if used)
|
||||
│ └── workflows/
|
||||
├── .vscode/ # Optional: VSCode editor settings
|
||||
│ └── settings.json
|
||||
├── dist/ # Compiled JavaScript output (from 'npm run build', git-ignored)
|
||||
├── docs/ # Project documentation (PRD, Architecture, Epics, etc.)
|
||||
│ ├── architecture.md
|
||||
│ ├── tech-stack.md
|
||||
│ ├── project-structure.md # This file
|
||||
│ ├── data-models.md
|
||||
│ ├── api-reference.md
|
||||
│ ├── environment-vars.md
|
||||
│ ├── coding-standards.md
|
||||
│ ├── testing-strategy.md
|
||||
│ ├── prd.md # Product Requirements Document
|
||||
│ ├── epic1.md .. epic5.md # Epic details
|
||||
│ └── ...
|
||||
├── node_modules/ # Project dependencies (managed by npm, git-ignored)
|
||||
├── output/ # Default directory for data artifacts (git-ignored)
|
||||
│ └── YYYY-MM-DD/ # Date-stamped subdirectories for runs
|
||||
│ ├── {storyId}_data.json
|
||||
│ ├── {storyId}_article.txt
|
||||
│ └── {storyId}_summary.json
|
||||
├── src/ # Application source code
|
||||
│ ├── clients/ # Clients for interacting with external services
|
||||
│ │ ├── algoliaHNClient.ts # Algolia HN Search API interaction logic [Epic 2]
|
||||
│ │ └── ollamaClient.ts # Ollama API interaction logic [Epic 4]
|
||||
│ ├── core/ # Core application logic & orchestration
|
||||
│ │ └── pipeline.ts # Main pipeline execution flow (fetch->scrape->summarize->email)
|
||||
│ ├── email/ # Email assembly, templating, and sending logic [Epic 5]
|
||||
│ │ ├── contentAssembler.ts # Reads local files, prepares digest data
|
||||
│ │ ├── emailSender.ts # Sends email via Nodemailer
|
||||
│ │ └── templates.ts # HTML email template rendering function(s)
|
||||
│ ├── scraper/ # Article scraping logic [Epic 3]
|
||||
│ │ └── articleScraper.ts # Implements scraping using article-extractor
|
||||
│ ├── stages/ # Standalone stage testing utility scripts [PRD Req]
|
||||
│ │ ├── fetch_hn_data.ts # Stage runner for Epic 2
|
||||
│ │ ├── scrape_articles.ts # Stage runner for Epic 3
|
||||
│ │ ├── summarize_content.ts# Stage runner for Epic 4
|
||||
│ │ └── send_digest.ts # Stage runner for Epic 5 (with --dry-run)
|
||||
│ ├── types/ # Shared TypeScript interfaces and types
|
||||
│ │ ├── hn.ts # Types: Story, Comment
|
||||
│ │ ├── ollama.ts # Types: OllamaRequest, OllamaResponse
|
||||
│ │ ├── email.ts # Types: DigestData
|
||||
│ │ └── index.ts # Barrel file for exporting types from this dir
|
||||
│ ├── utils/ # Shared, low-level utility functions
|
||||
│ │ ├── config.ts # Loads and validates .env configuration [Epic 1]
|
||||
│ │ ├── logger.ts # Simple console logger wrapper [Epic 1]
|
||||
│ │ └── dateUtils.ts # Date formatting helpers (using date-fns)
|
||||
│ └── index.ts # Main application entry point (invoked by npm run dev/start) [Epic 1]
|
||||
├── test/ # Automated tests (using Jest)
|
||||
│ ├── unit/ # Unit tests (mirroring src structure)
|
||||
│ │ ├── clients/
|
||||
│ │ ├── core/
|
||||
│ │ ├── email/
|
||||
│ │ ├── scraper/
|
||||
│ │ └── utils/
|
||||
│ └── integration/ # Integration tests (e.g., testing pipeline stage interactions)
|
||||
├── .env.example # Example environment variables file [Epic 1]
|
||||
├── .gitignore # Git ignore rules (ensure node_modules, dist, .env, output/ are included)
|
||||
├── package.json # Project manifest, dependencies, scripts (from boilerplate)
|
||||
├── package-lock.json # Lockfile for deterministic installs
|
||||
└── tsconfig.json # TypeScript compiler configuration (from boilerplate)
|
||||
```
|
||||
|
||||
## Key Directory Descriptions
|
||||
|
||||
- `docs/`: Contains all project planning, architecture, and reference documentation.
|
||||
- `output/`: Default location for persisted data artifacts generated during runs (stories, comments, summaries). Should be in `.gitignore`. Path configurable via `.env`.
|
||||
- `src/`: Main application source code.
|
||||
- `clients/`: Modules dedicated to interacting with specific external APIs (Algolia, Ollama).
|
||||
- `core/`: Orchestrates the main application pipeline steps.
|
||||
- `email/`: Handles all aspects of creating and sending the final email digest.
|
||||
- `scraper/`: Contains the logic for fetching and extracting article content.
|
||||
- `stages/`: Holds the independent, runnable scripts for testing each major pipeline stage.
|
||||
- `types/`: Central location for shared TypeScript interfaces and type definitions.
|
||||
- `utils/`: Reusable utility functions (config loading, logging, date formatting) that don't belong to a specific feature domain.
|
||||
- `index.ts`: The main entry point triggered by `npm run dev/start`, responsible for initializing and starting the core pipeline.
|
||||
- `test/`: Contains automated tests written using Jest. Structure mirrors `src/` for unit tests.
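The `output/` layout described above (a date-stamped subdirectory holding `{storyId}_data.json`, `{storyId}_article.txt`, and `{storyId}_summary.json`) could be captured in one small helper so that every stage builds paths the same way. This is only a sketch: `artifactPath`, `dateStamp`, and `ArtifactKind` are hypothetical names, the date is formatted in UTC with plain `Date` methods instead of `date-fns`, and paths are joined with `/` for brevity.

```typescript
// Hypothetical helper; none of these names come from the project itself.
type ArtifactKind = "data" | "article" | "summary";

// File extension used for each artifact kind, following the documented layout.
const ARTIFACT_EXTENSIONS: Record<ArtifactKind, string> = {
  data: "json",
  article: "txt",
  summary: "json",
};

/** Formats a date as YYYY-MM-DD (UTC), matching the run directory name. */
function dateStamp(date: Date): string {
  return date.toISOString().slice(0, 10);
}

/** Builds e.g. "output/2025-05-04/123_summary.json". */
function artifactPath(
  outputRoot: string,
  date: Date,
  storyId: string,
  kind: ArtifactKind,
): string {
  return `${outputRoot}/${dateStamp(date)}/${storyId}_${kind}.${ARTIFACT_EXTENSIONS[kind]}`;
}
```

A real implementation would more likely use `path.join` and whichever time zone the digest is meant to follow.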
|
||||
|
||||
## Notes
|
||||
|
||||
- This structure promotes modularity by separating concerns (clients, scraping, email, core logic, stages, utils).
|
||||
- Clear separation into directories like `clients`, `scraper`, `email`, and `stages` aids independent development, testing, and potential AI agent implementation tasks targeting specific functionalities.
|
||||
- Stage runner scripts in `src/stages/` directly address the PRD requirement for testing pipeline phases independently.
|
||||
@@ -1,56 +0,0 @@
|
||||
# BMad Hacker Daily Digest LLM Prompts
|
||||
|
||||
This document defines the standard prompts used when interacting with the configured Ollama LLM for generating summaries. Centralizing these prompts ensures consistency and aids experimentation.
|
||||
|
||||
## Prompt Design Philosophy
|
||||
|
||||
The goal of these prompts is to guide the LLM (e.g., Llama 3 or similar) to produce concise, informative summaries focusing on the key information relevant to the BMad Hacker Daily Digest's objective: quickly understanding the essence of an article or HN discussion.
|
||||
|
||||
## Core Prompts
|
||||
|
||||
### 1. Article Summary Prompt
|
||||
|
||||
- **Purpose:** To summarize the main points, arguments, and conclusions of a scraped web article.
|
||||
- **Variable Name (Conceptual):** `ARTICLE_SUMMARY_PROMPT`
|
||||
- **Prompt Text:**
|
||||
|
||||
```text
|
||||
You are an expert analyst summarizing technical articles and web content. Please provide a concise summary of the following article text, focusing on the key points, core arguments, findings, and main conclusions. The summary should be objective and easy to understand.
|
||||
|
||||
Article Text:
|
||||
---
|
||||
{Article Text}
|
||||
|
||||
---
|
||||
|
||||
Concise Summary:
|
||||
```
|
||||
|
||||
### 2. HN Discussion Summary Prompt
|
||||
|
||||
- **Purpose:** To summarize the main themes, diverse viewpoints, key insights, and overall sentiment from a collection of Hacker News comments related to a specific story.
|
||||
- **Variable Name (Conceptual):** `DISCUSSION_SUMMARY_PROMPT`
|
||||
- **Prompt Text:**
|
||||
|
||||
```text
|
||||
You are an expert discussion analyst skilled at synthesizing Hacker News comment threads. Please provide a concise summary of the main themes, diverse viewpoints (including agreements and disagreements), key insights, and overall sentiment expressed in the following Hacker News comments. Focus on the collective intelligence and most salient points from the discussion.
|
||||
|
||||
Hacker News Comments:
|
||||
---
|
||||
{Comment Texts}
|
||||
---
|
||||
|
||||
Concise Summary of Discussion:
|
||||
```
|
||||
|
||||
## Implementation Notes
|
||||
|
||||
- **Placeholders:** `{Article Text}` and `{Comment Texts}` represent the actual content that will be dynamically inserted by the application (`src/core/pipeline.ts` or `src/clients/ollamaClient.ts`) when making the API call.
|
||||
- **Loading:** For the MVP, these prompts can be defined as constants within the application code (e.g., in `src/utils/prompts.ts` or directly where the `ollamaClient` is called), referencing this document as the source of truth. Future enhancements could involve loading these prompts from this file directly at runtime.
|
||||
- **Refinement:** These prompts serve as a starting point. Further refinement based on the quality of summaries produced by the specific `OLLAMA_MODEL` is expected (Post-MVP).
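As a rough illustration of the placeholder and loading notes above, the prompt can live in a constant and be filled in just before the Ollama call. The constant name follows the conceptual `ARTICLE_SUMMARY_PROMPT` from this document; `fillPrompt` and `buildArticlePrompt` are hypothetical helpers, not existing project code.

```typescript
// Prompt text copied from the "Article Summary Prompt" section of this document.
const ARTICLE_SUMMARY_PROMPT = `You are an expert analyst summarizing technical articles and web content. Please provide a concise summary of the following article text, focusing on the key points, core arguments, findings, and main conclusions. The summary should be objective and easy to understand.

Article Text:
---
{Article Text}
---

Concise Summary:`;

/**
 * Hypothetical helper: substitutes every occurrence of `placeholder`.
 * split/join is used instead of String.replace so that "$&"-style
 * sequences inside scraped content are inserted literally.
 */
function fillPrompt(template: string, placeholder: string, content: string): string {
  return template.split(placeholder).join(content);
}

/** Hypothetical wrapper that produces the final prompt for one article. */
function buildArticlePrompt(articleText: string): string {
  return fillPrompt(ARTICLE_SUMMARY_PROMPT, "{Article Text}", articleText);
}
```

The same helper would work for `DISCUSSION_SUMMARY_PROMPT` with the `{Comment Texts}` placeholder.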
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | -------------------------- | ----------- |
|
||||
| Initial draft | 2025-05-04 | 0.1 | Initial prompts definition | 3-Architect |
|
||||
@@ -1,26 +0,0 @@
|
||||
# BMad Hacker Daily Digest Technology Stack
|
||||
|
||||
## Technology Choices
|
||||
|
||||
| Category | Technology | Version / Details | Description / Purpose | Justification (Optional) |
|
||||
| :-------------------- | :----------------------------- | :----------------------- | :--------------------------------------------------------------------------------------------------------- | :------------------------------------------------- |
|
||||
| **Languages** | TypeScript | 5.x (from boilerplate) | Primary language for application logic | Required by boilerplate; strong typing |
|
||||
| **Runtime** | Node.js | 22.x | Server-side execution environment | Required by PRD |
|
||||
| **Frameworks** | N/A | N/A | Using plain Node.js structure | Boilerplate provides structure; framework overkill |
|
||||
| **Databases** | Local Filesystem | N/A | Storing intermediate data artifacts | Required by PRD; no database needed for MVP |
|
||||
| **HTTP Client** | Node.js `fetch` API | Native (Node.js >=21) | **Mandatory:** Fetching external resources (Algolia, URLs, Ollama). **Do NOT use libraries like `axios`.** | Required by PRD |
|
||||
| **Configuration** | `.env` Files | Native (Node.js >=20.6) | Managing environment variables. **`dotenv` package is NOT needed.** | Standard practice; Native support |
|
||||
| **Logging** | Simple Console Wrapper | Custom (`src/utils/logger.ts`) | Basic console logging for MVP (stdout/stderr) | Meets PRD "basic logging" req; minimal dependency |
|
||||
| **Key Libraries** | `@extractus/article-extractor` | ~8.x | Basic article text scraping | Simple, focused library for MVP scraping |
|
||||
| | `date-fns` | ~3.x | Date formatting and manipulation | Clean API for date-stamped dirs/timestamps |
|
||||
| | `nodemailer` | ~6.x | Sending email digests | Required by PRD |
|
||||
| | `yargs` | ~17.x | Parsing CLI args for stage runners | Handles stage runner options like `--dry-run` |
|
||||
| **Testing** | Jest | (from boilerplate) | Unit/Integration testing framework | Provided by boilerplate; standard |
|
||||
| **Linting** | ESLint | (from boilerplate) | Code linting | Provided by boilerplate; ensures code quality |
|
||||
| **Formatting** | Prettier | (from boilerplate) | Code formatting | Provided by boilerplate; ensures consistency |
|
||||
| **External Services** | Algolia HN Search API | N/A | Fetching HN stories and comments | Required by PRD |
|
||||
| | Ollama API | N/A (local instance) | Generating text summaries | Required by PRD |
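To make the **Configuration** row concrete: with Node.js >=20.6 the process can be started with `node --env-file=.env <entry file>`, after which the values appear on `process.env` without `dotenv`. The sketch below shows one possible shape for the validation step in `src/utils/config.ts`. `OLLAMA_MODEL` is mentioned in the project docs; `OUTPUT_DIR_PATH`, `AppConfig`, and `loadConfig` are assumptions made for illustration.

```typescript
// Hypothetical config shape; the real project may expose different fields.
interface AppConfig {
  ollamaModel: string;
  outputDir: string;
}

// Accepting the environment as a parameter keeps the function easy to unit test.
type Env = Record<string, string | undefined>;

// Variables that must be present for the pipeline to run (assumed list).
const REQUIRED_KEYS = ["OLLAMA_MODEL"];

function loadConfig(env: Env): AppConfig {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    ollamaModel: env["OLLAMA_MODEL"] as string,
    // OUTPUT_DIR_PATH is an assumed name; fall back to the documented default dir.
    outputDir: env["OUTPUT_DIR_PATH"] ?? "./output",
  };
}
```

In the application this would be called once at startup as `loadConfig(process.env)`, so a missing variable fails fast before any network call is made.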
|
||||
|
||||
## Future Considerations (Post-MVP)
|
||||
|
||||
- **Logging:** Implement structured JSON logging to files (e.g., using Winston or Pino) for better analysis and persistence.
|
||||
@@ -1,73 +0,0 @@
|
||||
# BMad Hacker Daily Digest Testing Strategy
|
||||
|
||||
## Overall Philosophy & Goals
|
||||
|
||||
The testing strategy for the BMad Hacker Daily Digest MVP focuses on pragmatic validation of the core pipeline functionality and individual component logic. Given it's a local CLI tool with a sequential process, the emphasis is on:
|
||||
|
||||
1. **Functional Correctness:** Ensuring each stage of the pipeline (fetch, scrape, summarize, email) performs its task correctly according to the requirements.
|
||||
2. **Integration Verification:** Confirming that data flows correctly between pipeline stages via the local filesystem.
|
||||
3. **Robustness (Key Areas):** Specifically testing graceful handling of expected failures, particularly in article scraping.
|
||||
4. **Leveraging Boilerplate:** Utilizing the Jest testing framework provided by `bmad-boilerplate` for automated unit and integration tests.
|
||||
5. **Stage-Based Acceptance:** Using the mandatory **Stage Testing Utilities** as the primary mechanism for end-to-end validation of each phase against real external interactions (where applicable).
|
||||
|
||||
The primary goal is confidence in the MVP's end-to-end execution and the correctness of the generated email digest. High code coverage is secondary to testing critical paths and integration points.
|
||||
|
||||
## Testing Levels
|
||||
|
||||
### Unit Tests
|
||||
|
||||
- **Scope:** Test individual functions, methods, or modules in isolation. Focus on business logic within utilities (`src/utils/`), clients (`src/clients/` - mocking HTTP calls), scraping logic (`src/scraper/` - mocking HTTP calls), email templating (`src/email/templates.ts`), and potentially core pipeline orchestration logic (`src/core/pipeline.ts` - mocking stage implementations).
|
||||
- **Tools:** Jest (provided by `bmad-boilerplate`). Use `npm run test`.
|
||||
- **Mocking/Stubbing:** Utilize Jest's built-in mocking capabilities (`jest.fn()`, `jest.spyOn()`, manual mocks in `__mocks__`) to isolate units under test from external dependencies (native `fetch` API, `fs`, other modules, external libraries like `nodemailer`, `ollamaClient`).
|
||||
- **Location:** `test/unit/`, mirroring the `src/` directory structure.
|
||||
- **Expectations:** Cover critical logic branches, calculations, and helper functions. Ensure tests are fast and run reliably. Aim for good coverage of utility functions and complex logic within modules.
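One way to keep client code unit-testable without real HTTP calls is to pass the fetch function in as a parameter. The sketch below uses a hand-rolled stub so it runs without Jest; in the actual test suite `jest.fn()` or `jest.spyOn(globalThis, "fetch")` would play the same role. `fetchFrontPage`, `makeStubFetch`, and the `FetchLike` type are hypothetical, and the response shape (`hits` with `objectID` and `title`) is an assumption about the Algolia HN Search API that should be checked against its documentation.

```typescript
// Minimal subset of the fetch contract this client relies on (assumed shape).
interface FetchLikeResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (url: string) => Promise<FetchLikeResponse>;

interface StorySummary {
  id: string;
  title: string;
}

/** Hypothetical client function: fetches front-page stories via the injected fetch. */
async function fetchFrontPage(fetchImpl: FetchLike): Promise<StorySummary[]> {
  const response = await fetchImpl("https://hn.algolia.com/api/v1/search?tags=front_page");
  if (!response.ok) {
    throw new Error(`Algolia request failed with status ${response.status}`);
  }
  const body = (await response.json()) as { hits: { objectID: string; title: string }[] };
  return body.hits.map((hit) => ({ id: hit.objectID, title: hit.title }));
}

/** Test double that records requested URLs and returns a canned payload. */
function makeStubFetch(payload: unknown, status = 200): { stub: FetchLike; calls: string[] } {
  const calls: string[] = [];
  const stub: FetchLike = async (url) => {
    calls.push(url);
    return { ok: status >= 200 && status < 300, status, json: async () => payload };
  };
  return { stub, calls };
}
```

In production code the native `fetch` would be passed as `fetchImpl`; in a unit test the stub makes both the success path and the non-2xx error path cheap to cover.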
|
||||
|
||||
### Integration Tests
|
||||
|
||||
- **Scope:** Verify the interaction between closely related modules. Examples:
|
||||
- Testing the `core/pipeline.ts` orchestrator with mocked implementations of each stage (fetch, scrape, summarize, email) to ensure the sequence and basic data flow are correct.
|
||||
- Testing a client module (e.g., `algoliaHNClient`) against mocked HTTP responses to ensure correct parsing and data transformation.
|
||||
- Testing the `email/contentAssembler.ts` by providing mock data files in a temporary directory (potentially using `mock-fs` or setup/teardown logic) and verifying the assembled `DigestData`.
|
||||
- **Tools:** Jest. May involve limited use of test setup/teardown for creating mock file structures if needed.
|
||||
- **Location:** `test/integration/`.
|
||||
- **Expectations:** Verify the contracts and collaborations between key internal components. Slower than unit tests. Focus on module boundaries.
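The first integration example above (the orchestrator with mocked stages) might look roughly like this. The `Stages` interface and `runPipeline` signature are assumptions about how `core/pipeline.ts` could be structured, chosen so that the stage implementations can be swapped for recording fakes.

```typescript
// Hypothetical stage contract: each stage reads and writes files, returning nothing.
type Stage = () => Promise<void>;

interface Stages {
  fetch: Stage;
  scrape: Stage;
  summarize: Stage;
  email: Stage;
}

/** Sketch of the orchestrator: runs the four stages strictly in sequence. */
async function runPipeline(stages: Stages): Promise<void> {
  await stages.fetch();
  await stages.scrape();
  await stages.summarize();
  await stages.email();
}

/** Builds fake stages that only record the order in which they were invoked. */
function makeRecordingStages(log: string[]): Stages {
  const record = (name: string): Stage => async () => {
    log.push(name);
  };
  return {
    fetch: record("fetch"),
    scrape: record("scrape"),
    summarize: record("summarize"),
    email: record("email"),
  };
}
```

A test then asserts that the log reads fetch, scrape, summarize, email, and a second test can make one fake reject to check that later stages are not run.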
|
||||
|
||||
### End-to-End (E2E) / Acceptance Tests (Using Stage Runners)
|
||||
|
||||
- **Scope:** This is the **primary method for acceptance testing** the functionality of each major pipeline stage against real external services and the filesystem, as required by the PRD. This also includes manually running the full pipeline.
|
||||
- **Process:**
|
||||
1. **Stage Testing Utilities:** Execute the standalone scripts in `src/stages/` via `npm run stage:<stage_name> [-- --args]` (the `--` separator makes npm pass the arguments through to the script).
|
||||
- `npm run stage:fetch`: Verifies fetching from Algolia HN API and persisting `_data.json` files locally.
|
||||
- `npm run stage:scrape`: Verifies reading `_data.json`, scraping article URLs (hitting real websites), and persisting `_article.txt` files locally.
|
||||
- `npm run stage:summarize`: Verifies reading local `_data.json` / `_article.txt`, calling the local Ollama API, and persisting `_summary.json` files. Requires a running local Ollama instance.
|
||||
- `npm run stage:email [-- --dry-run]`: Verifies reading local persisted files, assembling the digest, rendering HTML, and either sending a real email (live run) or saving an HTML preview (`--dry-run`). Requires valid SMTP credentials in `.env` for live runs.
|
||||
2. **Full Pipeline Run:** Execute the main application via `npm run dev` or `npm start`.
|
||||
3. **Manual Verification:** Check console logs for errors during execution. Inspect the contents of the `output/YYYY-MM-DD/` directory (existence and format of `_data.json`, `_article.txt`, `_summary.json`, `_digest_preview.html` if dry-run). For live email tests, verify the received email's content, formatting, and summaries.
|
||||
- **Tools:** `npm` scripts, console inspection, file system inspection, email client.
|
||||
- **Environment:** Local development machine with internet access, configured `.env` file, and a running local Ollama instance.
|
||||
- **Location:** Scripts in `src/stages/`; verification steps are manual.
|
||||
- **Expectations:** These tests confirm the real-world functionality of each stage and the end-to-end process, fulfilling the core MVP success criteria.
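The `--dry-run` behaviour of the email stage runner can be sketched as a small decision function. The project lists `yargs` for argument parsing; a plain `argv` check stands in for it here so the example has no dependencies. `parseStageArgs`, `planDelivery`, and `DeliveryPlan` are hypothetical names, while `_digest_preview.html` is the preview file named in the verification step above.

```typescript
// Hypothetical result type: either write a preview file or send a real email.
type DeliveryPlan =
  | { kind: "preview"; path: string }
  | { kind: "send" };

/** Stand-in for yargs: detects the --dry-run flag in the forwarded arguments. */
function parseStageArgs(argv: string[]): { dryRun: boolean } {
  return { dryRun: argv.includes("--dry-run") };
}

/** Decides what the email stage should do for a given run directory. */
function planDelivery(argv: string[], runDir: string): DeliveryPlan {
  const { dryRun } = parseStageArgs(argv);
  return dryRun
    ? { kind: "preview", path: `${runDir}/_digest_preview.html` }
    : { kind: "send" };
}
```

Keeping the decision separate from the SMTP call means the dry-run path can be unit tested without credentials, and the stage runner only has to act on the returned plan.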
|
||||
|
||||
### Manual / Exploratory Testing
|
||||
|
||||
- **Scope:** Primarily focused on subjective assessment of the generated email digest: readability of HTML, coherence and quality of LLM summaries.
|
||||
- **Process:** Review the output from E2E tests (`_digest_preview.html` or received email).
|
||||
|
||||
## Specialized Testing Types
|
||||
|
||||
- N/A for MVP. Performance, detailed security, accessibility, etc., are out of scope.
|
||||
|
||||
## Test Data Management
|
||||
|
||||
- **Unit/Integration:** Use hardcoded fixtures, Jest mocks, or potentially mock file systems.
|
||||
- **Stage/E2E:** Relies on live data fetched from Algolia/websites during the test run itself, or uses the output files generated by preceding stage runs. The `--dry-run` option for `stage:email` avoids external SMTP interaction during testing loops.
|
||||
|
||||
## CI/CD Integration
|
||||
|
||||
- N/A for MVP (local execution only). If CI were implemented later, it would execute `npm run lint` and `npm run test` (unit/integration tests). Running stage tests in CI would require careful consideration due to external dependencies (Algolia, Ollama, SMTP, potentially rate limits).
|
||||
|
||||
## Change Log
|
||||
|
||||
| Change | Date | Version | Description | Author |
|
||||
| ------------- | ---------- | ------- | ----------------------- | ----------- |
|
||||
| Initial draft | 2025-05-04 | 0.1 | Draft based on PRD/Arch | 3-Architect |
|
||||
Some files were not shown because too many files have changed in this diff.