v2 fully tested and production-ready; beta tag removed. Full demonstration with all artifacts and agent transcripts for a complete project, from ideation to the dev pickup point.
# BMad Hacker Daily Digest Architecture Document

## Technical Summary

The BMad Hacker Daily Digest is a command-line interface (CLI) tool that delivers concise summaries of top Hacker News (HN) stories and their associated comment discussions. Built with TypeScript and Node.js (v22), it operates entirely on the user's local machine. The core functionality is a sequential pipeline: fetching story and comment data from the Algolia HN Search API, attempting to scrape linked article content, generating summaries using a local Ollama LLM instance, persisting intermediate data to the local filesystem, and finally assembling and emailing an HTML digest using Nodemailer. The architecture emphasizes modularity and testability, including mandatory standalone scripts for testing each pipeline stage. The project starts from the `bmad-boilerplate` template.

## High-Level Overview

The application follows a simple, sequential pipeline architecture executed via a manual CLI command (`npm run dev` or `npm start`). There is no persistent database; the local filesystem stores intermediate data artifacts (fetched data, scraped text, summaries) between steps within a date-stamped directory. All external HTTP communication (Algolia API, article scraping, Ollama API) uses the native Node.js `fetch` API.

```mermaid
graph LR
    subgraph "BMad Hacker Daily Digest (Local CLI)"
        A[index.ts / CLI Trigger] --> B(core/pipeline.ts);
        B --> C{Fetch HN Data};
        B --> D{Scrape Articles};
        B --> E{Summarize Content};
        B --> F{Assemble & Email Digest};
        C --> G["Local FS (_data.json)"];
        D --> H["Local FS (_article.txt)"];
        E --> I["Local FS (_summary.json)"];
        F --> G;
        F --> H;
        F --> I;
    end

    subgraph External Services
        X[Algolia HN API];
        Y[Article Websites];
        Z["Ollama API (Local)"];
        W[SMTP Service];
    end

    C --> X;
    D --> Y;
    E --> Z;
    F --> W;

    style G fill:#eee,stroke:#333,stroke-width:1px
    style H fill:#eee,stroke:#333,stroke-width:1px
    style I fill:#eee,stroke:#333,stroke-width:1px
```
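
To make the orchestration concrete, here is a minimal sketch of the sequential flow. This is not the actual `core/pipeline.ts`; the stage function names and signatures are illustrative assumptions, and the `output/YYYY-MM-DD` layout is inferred from the date-stamped directory described above.

```typescript
// Minimal sketch of the sequential pipeline (assumed names, not the real module).
import { mkdir } from "node:fs/promises";
import path from "node:path";
import { format } from "date-fns";

// Stub stage signatures; per the component view, the real implementations
// live in src/clients, src/scraper, and src/email.
type Story = { storyId: string; title: string; url?: string };
declare function fetchTopStories(count: number): Promise<Story[]>;
declare function scrapeArticles(stories: Story[], dir: string): Promise<void>;
declare function summarizeContent(stories: Story[], dir: string): Promise<void>;
declare function assembleAndEmailDigest(dir: string): Promise<boolean>;

export async function runPipeline(): Promise<void> {
  // One date-stamped directory holds every intermediate artifact for the run.
  const dateDir = path.join("output", format(new Date(), "yyyy-MM-dd"));
  await mkdir(dateDir, { recursive: true });

  const stories = await fetchTopStories(10); // Stage 1: Algolia HN API
  await scrapeArticles(stories, dateDir);    // Stage 2: best-effort article text
  await summarizeContent(stories, dateDir);  // Stage 3: local Ollama LLM
  await assembleAndEmailDigest(dateDir);     // Stage 4: HTML digest via Nodemailer
}
```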

## Component View

The application code (`src/`) is organized into logical modules following the defined project structure (`docs/project-structure.md`). Key components include:

- **`src/index.ts`**: The main entry point, handling CLI invocation and initiating the pipeline.
- **`src/core/pipeline.ts`**: Orchestrates the sequential execution of the main pipeline stages (fetch, scrape, summarize, email).
- **`src/clients/`**: Modules responsible for interacting with external APIs.
  - `algoliaHNClient.ts`: Communicates with the Algolia HN Search API.
  - `ollamaClient.ts`: Communicates with the local Ollama API.
- **`src/scraper/articleScraper.ts`**: Handles fetching and extracting text content from article URLs (see the sketch after this list).
- **`src/email/`**: Manages digest assembly, HTML rendering, and email dispatch via Nodemailer.
  - `contentAssembler.ts`: Reads persisted data.
  - `templates.ts`: Renders HTML.
  - `emailSender.ts`: Sends the email.
- **`src/stages/`**: Contains standalone scripts (`fetch_hn_data.ts`, `scrape_articles.ts`, etc.) for testing individual pipeline stages independently, using local data where applicable.
- **`src/utils/`**: Shared utilities for configuration loading (`config.ts`), logging (`logger.ts`), and date handling (`dateUtils.ts`).
- **`src/types/`**: Shared TypeScript interfaces and types.
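
As an illustration of the scraper's log-and-continue contract, here is a minimal sketch of `articleScraper.ts` built on `@extractus/article-extractor` (one of the key libraries below). The function name and the crude tag-stripping are assumptions; the null-on-failure behavior matches the error-handling decision stated later in this document.

```typescript
import { extract } from "@extractus/article-extractor";

// Attempt to fetch and extract readable text from an article URL.
// Returns null on any failure so the pipeline can log and continue.
export async function scrapeArticle(url: string): Promise<string | null> {
  try {
    const article = await extract(url); // fetches and parses the URL internally
    if (!article?.content) return null;
    // article.content is HTML; crudely strip tags for plain text (illustrative only).
    return article.content.replace(/<[^>]+>/g, " ").trim();
  } catch {
    return null; // timeout, non-HTML response, network error, etc.
  }
}
```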

```mermaid
graph TD
    subgraph AppComponents ["Application Components (src/)"]
        Idx(index.ts) --> Pipe(core/pipeline.ts);
        Pipe --> HNClient(clients/algoliaHNClient.ts);
        Pipe --> Scraper(scraper/articleScraper.ts);
        Pipe --> OllamaClient(clients/ollamaClient.ts);
        Pipe --> Assembler(email/contentAssembler.ts);
        Pipe --> Renderer(email/templates.ts);
        Pipe --> Sender(email/emailSender.ts);

        Pipe --> Utils(utils/*);
        Pipe --> Types(types/*);
        HNClient --> Types;
        OllamaClient --> Types;
        Assembler --> Types;
        Renderer --> Types;

        subgraph StageRunnersSubgraph ["Stage Runners (src/stages/)"]
            SFetch(fetch_hn_data.ts) --> HNClient;
            SFetch --> Utils;
            SScrape(scrape_articles.ts) --> Scraper;
            SScrape --> Utils;
            SSummarize(summarize_content.ts) --> OllamaClient;
            SSummarize --> Utils;
            SEmail(send_digest.ts) --> Assembler;
            SEmail --> Renderer;
            SEmail --> Sender;
            SEmail --> Utils;
        end
    end

    subgraph Externals ["Filesystem & External"]
        FS["Local Filesystem (output/)"]
        Algolia((Algolia HN API))
        Websites((Article Websites))
        Ollama["Ollama API (Local)"]
        SMTP((SMTP Service))
    end

    HNClient --> Algolia;
    Scraper --> Websites;
    OllamaClient --> Ollama;
    Sender --> SMTP;

    Pipe --> FS;
    Assembler --> FS;

    SFetch --> FS;
    SScrape --> FS;
    SSummarize --> FS;
    SEmail --> FS;

    %% Apply style to the subgraph using its ID after the block
    style StageRunnersSubgraph fill:#f9f,stroke:#333,stroke-width:1px
```
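
For illustration, a minimal sketch of `clients/algoliaHNClient.ts` using the mandated native `fetch`. The endpoint, tags, and hit fields are the public Algolia HN Search API; the `Story`/`Comment` shapes and function signatures are assumptions based on this document and `docs/data-models.md`.

```typescript
// Sketch of the Algolia HN client on Node 22's built-in fetch (no extra deps).
export interface Story { storyId: string; title: string; url?: string; }
export interface Comment { author: string; text: string; }

const BASE = "https://hn.algolia.com/api/v1";

export async function fetchTopStories(count: number): Promise<Story[]> {
  const res = await fetch(`${BASE}/search?tags=front_page&hitsPerPage=${count}`);
  if (!res.ok) throw new Error(`Algolia request failed: ${res.status}`);
  const body = (await res.json()) as { hits: any[] };
  return body.hits.map((h) => ({
    storyId: h.objectID,
    title: h.title,
    url: h.url ?? undefined, // Ask HN / Show HN posts may have no external URL
  }));
}

export async function fetchCommentsForStory(storyId: string, max: number): Promise<Comment[]> {
  const res = await fetch(`${BASE}/search?tags=comment,story_${storyId}&hitsPerPage=${max}`);
  if (!res.ok) throw new Error(`Algolia request failed: ${res.status}`);
  const body = (await res.json()) as { hits: any[] };
  return body.hits
    .filter((h) => h.comment_text) // drop deleted/empty comments
    .map((h) => ({ author: h.author, text: h.comment_text }));
}
```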

## Key Architectural Decisions & Patterns

- **Architecture Style:** Simple sequential pipeline executed via CLI.
- **Execution Environment:** Local machine only; no cloud deployment and no database for the MVP.
- **Data Handling:** Intermediate data persisted to the local filesystem in a date-stamped directory.
- **HTTP Client:** Mandatory use of the native Node.js v22 `fetch` API for all external HTTP requests.
- **Modularity:** Code organized into distinct modules for clients, scraping, email, core logic, utilities, and types to promote separation of concerns and testability.
- **Stage Testing:** Mandatory standalone scripts (`src/stages/*`) allow independent testing of each pipeline phase.
- **Configuration:** Environment variables loaded natively from the `.env` file; no `dotenv` package required (see the sketch after this list).
- **Error Handling:** Graceful handling of scraping failures (log and continue); basic logging for other API/network errors.
- **Logging:** Basic console logging via a simple wrapper (`src/utils/logger.ts`) for the MVP; structured file logging is a post-MVP consideration.
- **Key Libraries:** `@extractus/article-extractor`, `date-fns`, `nodemailer`, `yargs`. (See `docs/tech-stack.md`.)
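
Native `.env` loading without `dotenv` works two ways on Node v22: launching with the `--env-file=.env` flag, or calling the built-in `process.loadEnvFile()`. Here is a minimal sketch of what `utils/config.ts` might look like; the variable names and defaults are illustrative assumptions (see `docs/environment-vars.md` for the real ones).

```typescript
// Sketch of utils/config.ts using Node 22's built-in .env support.
// Alternative: run `node --env-file=.env dist/index.js` and skip loadEnvFile.
export interface AppConfig {
  maxCommentsPerStory: number;
  ollamaEndpointUrl: string;
  emailRecipients: string[];
}

export function loadConfig(): AppConfig {
  process.loadEnvFile(".env"); // built into Node since 21.7; throws if the file is missing

  const required = (name: string): string => {
    const value = process.env[name];
    if (!value) throw new Error(`Missing required env var: ${name}`);
    return value;
  };

  return {
    maxCommentsPerStory: Number(process.env.MAX_COMMENTS_PER_STORY ?? "50"),
    ollamaEndpointUrl: process.env.OLLAMA_ENDPOINT_URL ?? "http://localhost:11434",
    emailRecipients: required("EMAIL_RECIPIENTS").split(","),
  };
}
```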

## Core Workflow / Sequence Diagram (Main Pipeline)

```mermaid
sequenceDiagram
    participant CLI_User as CLI User
    participant Idx as src/index.ts
    participant Pipe as core/pipeline.ts
    participant Cfg as utils/config.ts
    participant Log as utils/logger.ts
    participant HN as clients/algoliaHNClient.ts
    participant FS as Local FS [output/]
    participant Scr as scraper/articleScraper.ts
    participant Oll as clients/ollamaClient.ts
    participant Asm as email/contentAssembler.ts
    participant Tpl as email/templates.ts
    participant Snd as email/emailSender.ts
    participant Alg as Algolia HN API
    participant Web as Article Website
    participant Olm as Ollama API [Local]
    participant SMTP as SMTP Service

    Note right of CLI_User: Triggered via 'npm run dev'/'start'

    CLI_User ->> Idx: Execute script
    Idx ->> Cfg: Load .env config
    Idx ->> Log: Initialize logger
    Idx ->> Pipe: runPipeline()
    Pipe ->> Log: Log start
    Pipe ->> HN: fetchTopStories()
    HN ->> Alg: Request stories
    Alg -->> HN: Story data
    HN -->> Pipe: stories[]
    loop For each story
        Pipe ->> HN: fetchCommentsForStory(storyId, max)
        HN ->> Alg: Request comments
        Alg -->> HN: Comment data
        HN -->> Pipe: comments[]
        Pipe ->> FS: Write {storyId}_data.json
    end
    Pipe ->> Log: Log HN fetch complete

    loop For each story with URL
        Pipe ->> Scr: scrapeArticle(story.url)
        Scr ->> Web: Request article HTML [via fetch]
        alt Scraping Successful
            Web -->> Scr: HTML content
            Scr -->> Pipe: articleText: string
            Pipe ->> FS: Write {storyId}_article.txt
        else Scraping Failed / Skipped
            Web -->> Scr: Error / Non-HTML / Timeout
            Scr -->> Pipe: articleText: null
            Pipe ->> Log: Log scraping failure/skip
        end
    end
    Pipe ->> Log: Log scraping complete

    loop For each story
        alt Article content exists
            Pipe ->> Oll: generateSummary(prompt, articleText)
            Oll ->> Olm: POST /api/generate [article]
            Olm -->> Oll: Article Summary / Error
            Oll -->> Pipe: articleSummary: string | null
        else No article content
            Pipe -->> Pipe: Set articleSummary = null
        end
        alt Comments exist
            Pipe ->> Pipe: Format comments to text block
            Pipe ->> Oll: generateSummary(prompt, commentsText)
            Oll ->> Olm: POST /api/generate [comments]
            Olm -->> Oll: Discussion Summary / Error
            Oll -->> Pipe: discussionSummary: string | null
        else No comments
            Pipe -->> Pipe: Set discussionSummary = null
        end
        Pipe ->> FS: Write {storyId}_summary.json
    end
    Pipe ->> Log: Log summarization complete

    Pipe ->> Asm: assembleDigestData(dateDirPath)
    Asm ->> FS: Read _data.json, _summary.json files
    FS -->> Asm: File contents
    Asm -->> Pipe: digestData[]
    alt Digest data assembled
        Pipe ->> Tpl: renderDigestHtml(digestData, date)
        Tpl -->> Pipe: htmlContent: string
        Pipe ->> Snd: sendDigestEmail(subject, htmlContent)
        Snd ->> Cfg: Load email config
        Snd ->> SMTP: Send email
        SMTP -->> Snd: Success/Failure
        Snd -->> Pipe: success: boolean
        Pipe ->> Log: Log email result
    else Assembly failed / No data
        Pipe ->> Log: Log skipping email
    end
    Pipe ->> Log: Log finished
```
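
A minimal sketch of the `generateSummary` call shown in the diagram, using native `fetch` against Ollama's `/api/generate` endpoint. `POST /api/generate` with `{ model, prompt, stream }` and its `{ response }` payload are Ollama's actual API; the model name, env variables, and error handling here are illustrative assumptions (prompts live in `docs/prompts.md`).

```typescript
// Sketch of clients/ollamaClient.ts; returns null on failure so the
// pipeline records a missing summary and continues.
const OLLAMA_URL = process.env.OLLAMA_ENDPOINT_URL ?? "http://localhost:11434";
const MODEL = process.env.OLLAMA_MODEL ?? "llama3";

export async function generateSummary(prompt: string, content: string): Promise<string | null> {
  try {
    const res = await fetch(`${OLLAMA_URL}/api/generate`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: MODEL,
        prompt: `${prompt}\n\n${content}`,
        stream: false, // one JSON response instead of a token stream
      }),
    });
    if (!res.ok) return null;
    const body = (await res.json()) as { response: string };
    return body.response.trim();
  } catch {
    return null; // basic logging handled by the caller per the error-handling decision
  }
}
```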

## Infrastructure and Deployment Overview

- **Cloud Provider(s):** N/A. Executes locally on the user's machine.
- **Core Services Used:** N/A (relies on the external Algolia API, a local Ollama instance, target websites, and an SMTP provider).
- **Infrastructure as Code (IaC):** N/A.
- **Deployment Strategy:** Manual execution via CLI (`npm run dev`, or `npm run start` after `npm run build`). No CI/CD pipeline required for the MVP.
- **Environments:** Single environment: the local development machine.
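
To make the SMTP dependency concrete, here is a minimal sketch of `email/emailSender.ts`. `createTransport` and `sendMail` are Nodemailer's real API; the env variable names are assumptions (see `docs/environment-vars.md`).

```typescript
import nodemailer from "nodemailer";

// Sketch of the email sender; returns a boolean so the pipeline can log the result.
export async function sendDigestEmail(subject: string, htmlContent: string): Promise<boolean> {
  const transporter = nodemailer.createTransport({
    host: process.env.EMAIL_HOST,
    port: Number(process.env.EMAIL_PORT ?? "587"),
    auth: { user: process.env.EMAIL_USER, pass: process.env.EMAIL_PASS },
  });

  try {
    await transporter.sendMail({
      from: process.env.EMAIL_FROM,
      to: process.env.EMAIL_RECIPIENTS, // Nodemailer accepts a comma-separated list
      subject,
      html: htmlContent,
    });
    return true;
  } catch (err) {
    console.error("Failed to send digest email:", err);
    return false;
  }
}
```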

## Key Reference Documents

- `docs/prd.md`
- `docs/epic1.md` ... `docs/epic5.md`
- `docs/tech-stack.md`
- `docs/project-structure.md`
- `docs/data-models.md`
- `docs/api-reference.md`
- `docs/environment-vars.md`
- `docs/coding-standards.md`
- `docs/testing-strategy.md`
- `docs/prompts.md`

## Change Log

| Change        | Date       | Version | Description                | Author      |
| ------------- | ---------- | ------- | -------------------------- | ----------- |
| Initial draft | 2025-05-04 | 0.1     | Initial draft based on PRD | 3-Architect |