ros/BMAD-METHOD

Fork 0

Files

Brian Madison 3a840b6362 demo output artifacts

2025-05-21 20:24:31 -05:00

44 KiB

Raw Blame History

BMad Daily Digest Product Requirements Document (PRD)

Version: 0.1 Date: May 20, 2025 Author: JohnAI

1. Goal, Objective and Context

Overall Goal: To provide busy tech executives with a quick, daily audio digest of top Hacker News posts and discussions, enabling them to stay effortlessly informed.
Project Objective (MVP Focus): To successfully launch the "BMad Daily Digest" by:
- Automating the daily fetching of top 10 Hacker News posts (article metadata and comments via Algolia HN API) and scraping of linked article content.
- Processing this content into a structured format.
- Generating a 2-agent audio podcast using the play.ai PlayNote API.
- Delivering the podcast via a simple Next.js web application (polyrepo structure) with a list of episodes and detail pages including an audio player and links to source materials.
- Operating this process daily, aiming for delivery by a consistent morning hour.
- Adhering to a full TypeScript stack (Node.js 22 for backend), with a Next.js frontend, AWS Lambda backend, DynamoDB, S3, and AWS CDK for IaC, while aiming to stay within AWS free-tier limits where possible.
Context/Problem Solved: Busy tech executives lack the time to thoroughly read Hacker News daily but need to stay updated on key tech discussions, trends, and news for strategic insights. "BMad Daily Digest" solves this by offering a convenient, curated audio summary.

2. Functional Requirements (MVP)

FR1: Content Acquisition

The system must automatically fetch data for the top 10 (configurable) posts from Hacker News daily.
For each Hacker News post, the system must identify and retrieve:
- The URL of the linked article.
- Key metadata about the Hacker News post (e.g., title, HN link, score, author, HN Post ID).
The system must fetch comments for each identified Hacker News post using the Algolia HN Search API, with logic to handle new vs. repeat posts and scraping failures differently.
The system must attempt to scrape and extract the primary textual content from the linked article URL for each of the top posts (unless it's a repeat post where only new comments are needed).
- This process should aim to isolate the main article body.
- If scraping fails, a fallback using HN title, summary (if available), and increased comments must be used.

FR2: Content Processing and Formatting

The system must aggregate the extracted/fallback article content and selected comments for the top 10 posts.
The system must process and structure the aggregated text content into a single text file suitable for submission to the play.ai PlayNote API.
- The text file must begin with an introductory sentence: "It's a top 10 countdown for [Today'sDate]".
- Content must be structured sequentially (e.g., "Story 10 - [details]..."), with special phrasing for repeat posts or posts where article scraping failed.
- Article content may be truncated if MAX_ARTICLE_LENGTH is set, preserving intro/conclusion where possible.

FR3: Podcast Generation

The system must submit the formatted text content to the play.ai PlayNote API daily using specified voice and style parameters (configurable via environment variables).
The system must capture the jobId from Play.ai and use a polling mechanism (e.g., AWS Step Functions) to check for job completion status.
Upon successful completion, the system must retrieve the generated audio podcast file from the Play.ai-provided URL.
The system must store the generated audio file (e.g., on S3) and its associated metadata (including episode number, generated title, S3 location, original Play.ai URL, source HN posts, and processing status) in DynamoDB.

FR4: Web Application Interface (MVP)

The system must provide a web application (Next.js, "80s retro CRT terminal" theme with Tailwind CSS & shadcn/ui) with a List Page that:
- Displays a chronological list (newest first) of all generated "BMad Daily Digest" episodes, formatted as "Episode [EpisodeNumber]: [PublicationDate] - [PodcastTitle]".
- Allows users to navigate to a Detail Page for each episode.
The system must provide a web application with a Detail Page for each episode that:
- Displays the podcastGeneratedTitle, publicationDate, and episodeNumber.
- Includes an embedded HTML5 audio player for the podcast.
- Lists the individual Hacker News stories included, with direct links to the original source article and the Hacker News discussion page.
The system must provide a minimalist About Page.
The web application must be responsive.

FR5: Automation and Scheduling

The entire end-to-end backend process must be orchestrated (preferably via AWS Step Functions) and automated to run daily, triggered by Amazon EventBridge Scheduler (default 12:00 UTC, configurable).
For MVP, a re-run of the daily job for the same day must overwrite/start over previous data for that day.

3. Non-Functional Requirements (MVP)

a. Performance:

Podcast Generation Time: The daily process should complete in a timely manner (e.g., target by 8 AM CST/12:00-13:00 UTC, specific completion window TBD based on Play.ai processing).
Web Application Load Time: Pages on the Next.js app should aim for fast load times (e.g., target under 3-5 seconds).

b. Reliability / Availability:

Daily Process Success Rate: Target >95% success rate for automated podcast generation without manual intervention.
Web Application Uptime: Target 99.5%+ uptime.

c. Maintainability:

Code Quality: Code must be well-documented. Internal code comments should be used when logic isn't clear from names. All functions must have JSDoc-style outer comments. Adherence to defined coding standards (ESLint, Prettier).
Configuration Management: System configurations and secrets must be managed via environment variables (.env locally, Lambda environment variables when deployed), set manually for MVP.

d. Usability (Web Application):

The web application must be intuitive for busy tech executives.
The audio player must be simple and reliable.
Accessibility: Standard MVP considerations, with particular attention to contrast for the "glowing green on dark" theme, good keyboard navigation, and basic screen reader compatibility.

e. Security (MVP Focus):

API Key Management: Keys for Algolia, Play.ai, AWS must be stored securely (gitignored .env files locally, Lambda environment variables in AWS), not hardcoded.
Data Handling: Scraped content handled responsibly.

f. Cost Efficiency:

AWS service usage must aim to stay within free-tier limits where feasible. Play.ai usage is via existing user subscription.

4. User Interaction and Design Goals

a. Overall Vision & Experience:

Look and Feel: Dark mode UI, "glowing green ASCII/text on a black background" aesthetic (CRT terminal style), "80s retro everything" theme.
UI Component Toolkit: Tailwind CSS and shadcn/ui, customized for the theme. Initial structure/components kickstarted by an AI UI generation tool.
User Experience: Highly efficient, clear navigation, no clutter, prioritizing content readability for busy tech executives.

b. Key Interaction Paradigms (MVP):

View list of digests (reverse chronological), select one for details. No sorting/filtering on list page for MVP.

c. Core Screens/Views (MVP):

List Page: Episodes ("Episode [N]: [Date] - [PodcastTitle]").
Detail Page: Episode details, HTML5 audio player, list of source HN stories with links to articles and HN discussions.
About Page: Minimalist, explaining the service, consistent theme.

d. Accessibility Aspirations (MVP):

Standard considerations: good contrast (critical for theme), keyboard navigation, basic screen reader compatibility.

e. Branding Considerations (High-Level):

"80s retro everything" theme is central. Logo/typeface should complement this (e.g., pixel art, retro fonts).

f. Target Devices/Platforms:

Responsive web application, good UX on desktop and mobile.

5. Technical Assumptions

a. Core Technology Stack & Approach:

Full TypeScript Stack: TypeScript for frontend and backend.
Frontend: Next.js (React), Node.js 22. Styling: Tailwind CSS, shadcn/ui. Hosting: Static site on AWS S3 (via CloudFront).
Backend: Node.js 22, TypeScript. HTTP Client: axios. Compute: AWS Lambda. Database: AWS DynamoDB.
Infrastructure as Code (IaC): All AWS infrastructure via AWS CDK.
Key External Services/APIs: Algolia HN Search API (posts/comments), Play.ai PlayNote API (audio gen, user has subscription, polling for status), Custom scraping for articles (TypeScript with Cheerio, Readability.js, potentially Puppeteer/Playwright).
Automation: Daily trigger via Amazon EventBridge Scheduler. Orchestration via AWS Step Functions.
Configuration & Secrets: Environment variables (.env local & gitignored, Lambda env vars).
Coding Standards: JSDoc for functions, inline comments for clarity. ESLint, Prettier.

b. Repository Structure & Service Architecture:

Repository Structure: Polyrepo (separate Git repositories for bmad-daily-digest-frontend and bmad-daily-digest-backend).
High-Level Service Architecture: Backend is serverless functions (AWS Lambda) for distinct tasks, orchestrated by Step Functions. API layer via AWS API Gateway to expose backend to frontend.

6. Epic Overview

This section details the Epics and their User Stories for the MVP.

Epic 1: Backend Foundation, Tooling & "Hello World" API

Goal: To establish the core backend project infrastructure in its dedicated repository, including robust development tooling and initial AWS CDK setup for essential services. By the end of this epic:
1. A simple "hello world" API endpoint (AWS API Gateway + Lambda) must be deployed and testable via curl, returning a dynamic message.
2. The backend project must have ESLint, Prettier, Jest (unit testing), and esbuild (TypeScript bundling) configured and operational.
3. Basic unit tests must exist for the "hello world" Lambda function.
4. Code formatting and linting checks should be integrated into a pre-commit hook and/or a basic CI pipeline stub.
User Stories for Epic 1:

Story 1.1: Initialize Backend Project using TS-TEMPLATE-STARTER
- User Story Statement: As a Developer, I want to create the bmad-daily-digest-backend Git repository and initialize it using the existing TS-TEMPLATE-STARTER, ensuring all foundational tooling (TypeScript, Node.js 22, ESLint, Prettier, Jest, esbuild) is correctly configured and operational for this specific project, so that I have a high-quality, standardized development environment ready for application logic.
- Acceptance Criteria (ACs):
  1. A new, private Git repository named bmad-daily-digest-backend must be created on GitHub.
  2. The contents of the TS-TEMPLATE-STARTER project must be copied/cloned into this new repository.
  3. package.json must be updated (project name, version, description).
  4. Project dependencies must be installable.
  5. TypeScript setup (tsconfig.json) must be verified for Node.js 22, esbuild compatibility; project must compile.
  6. ESLint and Prettier configurations must be operational; lint/format scripts must execute successfully.
  7. Jest configuration must be operational; test scripts must execute successfully with any starter example tests.
  8. Irrelevant generic demo code from starter should be removed. index.ts/index.test.ts can remain as placeholders.
  9. A standard .gitignore and an updated project README.md must be present.
Story 1.2: Pre-commit Hook Implementation
- User Story Statement: As a Developer, I want pre-commit hooks automatically enforced in the bmad-daily-digest-backend repository, so that code quality standards (like linting and formatting) are checked and applied to staged files before any code is committed, thereby maintaining codebase consistency and reducing trivial errors.
- Acceptance Criteria (ACs):
  1. A pre-commit hook tool (e.g., Husky) must be installed and configured.
  2. A tool for running linters/formatters on staged files (e.g., lint-staged) must be installed and configured.
  3. Pre-commit hook must trigger lint-staged on staged .ts files.
  4. lint-staged must be configured to run ESLint (--fix) and Prettier (--write).
  5. Attempting to commit files with auto-fixable issues must result in fixes applied and successful commit.
  6. Attempting to commit files with non-auto-fixable linting errors must abort the commit with error messages.
  7. Committing clean files must proceed without issues.
Story 1.3: "Hello World" Lambda Function Implementation & Unit Tests
- User Story Statement: As a Developer, I need a simple "Hello World" AWS Lambda function implemented in TypeScript within the bmad-daily-digest-backend project. This function, when invoked, should return a dynamic greeting message including the current date and time, and it must be accompanied by comprehensive Jest unit tests, so that our basic serverless compute functionality, testing setup, and TypeScript bundling are validated.
- Acceptance Criteria (ACs):
  1. A src/handlers/helloWorldHandler.ts file (or similar) must contain the Lambda handler.
  2. Handler must be AWS Lambda compatible (event, context, Promise response).
  3. Successful execution must return JSON: statusCode: 200, body with message: "Hello from BMad Daily Digest Backend, today is [current_date] at [current_time].".
  4. Date and time in message must be dynamic.
  5. A corresponding Jest unit test file (e.g., src/handlers/helloWorldHandler.test.ts) must be created.
  6. Unit tests must verify: 200 status, valid JSON body, expected message field, "Hello from..." prefix, dynamic date/time portion (use mocked Date).
  7. All unit tests must pass.
  8. esbuild configuration must correctly bundle the handler.
Story 1.4: AWS CDK Setup for "Hello World" API (Lambda & API Gateway)
- User Story Statement: As a Developer, I want to define the necessary AWS infrastructure (Lambda function and API Gateway endpoint) for the "Hello World" service using AWS CDK (Cloud Development Kit) in TypeScript, so that the infrastructure is version-controlled, repeatable, and can be deployed programmatically.
- Acceptance Criteria (ACs):
  1. AWS CDK (v2) must be a development dependency.
  2. CDK app structure must be initialized (e.g., in cdk/ or infra/).
  3. A new CDK stack (e.g., BmadDailyDigestBackendStack) must be defined in TypeScript.
  4. CDK stack must define an AWS Lambda resource for the "Hello World" function (Node.js 22, bundled code reference, handler entry point, basic IAM role for CloudWatch logs, free-tier conscious settings).
  5. CDK stack must define an AWS API Gateway (HTTP API preferred) with a route (e.g., GET /hello) triggering the Lambda.
  6. CDK stack must be synthesizable (cdk synth) without errors.
  7. CDK code must adhere to project ESLint/Prettier standards.
  8. Mechanism for passing Lambda environment variables via CDK (if any needed for "Hello World") must be in place.
Story 1.5: "Hello World" API Deployment & Manual Invocation Test
- User Story Statement: As a Developer, I need to deploy the "Hello World" API (defined in AWS CDK) to an AWS environment and successfully invoke its endpoint using a tool like curl, so that I can verify the end-to-end deployment process and confirm the basic API is operational in the cloud.
- Acceptance Criteria (ACs):
  1. The AWS CDK stack for "Hello World" API must deploy successfully to a designated AWS account/region.
  2. The API Gateway endpoint URL for /hello must be retrievable post-deployment.
  3. A GET request to the deployed /hello endpoint must receive a response.
  4. HTTP response status must be 200 OK.
  5. Response body must be JSON containing the expected dynamic "Hello..." message.
  6. Basic Lambda invocation logs must be visible in AWS CloudWatch Logs.
Story 1.6: Basic CI/CD Pipeline Stub with Quality Gates
- User Story Statement: As a Developer, I need a basic Continuous Integration (CI) pipeline established for the bmad-daily-digest-backend repository, so that code quality checks (linting, formatting, unit tests) and the build process are automated upon code pushes and pull requests, ensuring early feedback and maintaining codebase health.
- Acceptance Criteria (ACs):
  1. A CI workflow file (e.g., GitHub Actions in .github/workflows/main.yml) must be created.
  2. Pipeline must trigger on pushes to main and PRs targeting main.
  3. Pipeline must include steps for: checkout, Node.js 22 setup, dependency install, ESLint check, Prettier format check, Jest unit tests, esbuild bundle.
  4. Pipeline must fail if any lint, format, test, or bundle step fails.
  5. A successful CI run on the main branch (with "Hello World" code) must be demonstrated.
  6. CI pipeline for MVP does not need to perform AWS deployment; focus is on quality gates.

Epic 2: Automated Content Ingestion & Podcast Generation Pipeline

Goal: To implement the complete automated daily workflow within the backend. This includes fetching Hacker News post data, scraping and extracting content from linked external articles, aggregating and formatting text, submitting it to Play.ai, managing job status via polling, and retrieving/storing the final audio file and associated metadata. This epic delivers the core value proposition of generating the daily audio content and making it ready for consumption via an API.

User Stories for Epic 2:

Story 2.1: AWS CDK Extension for Epic 2 Resources

User Story Statement: As a Developer, I need to extend the existing AWS CDK stack within the bmad-daily-digest-backend project to define and provision all new AWS resources required for the content ingestion and podcast generation pipeline—including a DynamoDB table for episode and processed post metadata, an S3 bucket for audio storage, and the AWS Step Functions state machine for orchestrating the Play.ai job status polling—so that all backend infrastructure for this epic is managed as code and ready for the application logic.
Acceptance Criteria (ACs):
1. The existing AWS CDK application (from Epic 1) must be extended with new resource definitions for Epic 2.
2. A DynamoDB table (e.g., BmadDailyDigestEpisodes) must be defined via CDK for episode metadata, with episodeId (String UUID) as PK, key attributes (publicationDate, episodeNumber, podcastGeneratedTitle, audioS3Key, audioS3Bucket, playAiJobId, playAiSourceAudioUrl, sourceHNPosts (List of objects: { hnPostId, title, originalArticleUrl, hnLink, isUpdateStatus, lastCommentFetchTimestamp, oldRank }), status, createdAt), and PAY_PER_REQUEST billing.
3. A DynamoDB table or GSI strategy must be defined via CDK to efficiently track processed Hacker News posts (hnPostId) and their lastCommentFetchTimestamp to support the "new comments only" feature.
4. An S3 bucket (e.g., bmad-daily-digest-audio-{unique-suffix}) must be defined via CDK for audio storage, with private access.
5. An AWS Step Functions state machine must be defined via CDK to manage the Play.ai job status polling workflow (details in Story 2.6).
6. Necessary IAM roles/permissions for Lambdas to interact with DynamoDB, S3, Step Functions, CloudWatch Logs must be defined via CDK.
7. The updated CDK stack must synthesize (cdk synth) and deploy (cdk deploy) successfully.
8. All new CDK code must adhere to project ESLint/Prettier standards.

Story 2.2: Fetch Top Hacker News Posts & Identify Repeats

User Story Statement: As the System, I need to reliably fetch the top N (configurable, e.g., 10) current Hacker News posts daily using the Algolia HN API, including their essential metadata. I also need to identify if each fetched post has been processed in a recent digest by checking against stored data, so that I have an accurate list of stories and their status (new or repeat) to begin generating the daily digest.
Acceptance Criteria (ACs):
1. A hackerNewsService.ts function must fetch top N HN posts (stories only) via axios from Algolia API (configurable HN_POSTS_COUNT).
2. Extracted metadata per post: Title, Article URL, HN Post URL, HN Post ID (objectID), Author, Points, Creation timestamp.
3. For each post, the function must query DynamoDB (see Story 2.1 AC#3) to determine its isUpdateStatus (true if recently processed and article scraped) and retrieve lastCommentFetchTimestamp and oldRank if available.
4. Function must return an array of HN post objects with metadata, isUpdateStatus, lastCommentFetchTimestamp, and oldRank.
5. Error handling for Algolia/DynamoDB calls must be implemented and logged.
6. Unit tests (Jest) must verify API calls, data extraction, repeat identification (mocked DDB), and error handling. All tests must pass.

Story 2.3: Article Content Scraping & Extraction (Conditional)

User Story Statement: As the System, for each Hacker News post identified as new (or for which article scraping previously failed), I need to robustly fetch its HTML content from the linked article URL and extract the primary textual content and title. If scraping fails, a fallback mechanism must be triggered using available HN metadata and signaling the need for increased comments.
Acceptance Criteria (ACs):
1. An articleScraperService.ts function must accept an article URL and isUpdateStatus.
2. If isUpdateStatus is true (article already scraped and good), scraping must be skipped, and a success status indicating no new scrape needed is returned (or pre-existing content reference).
3. If new scrape needed: use axios (timeout, User-Agent) to fetch HTML.
4. Use Mozilla Readability (JS port) or similar to extract main article text and title.
5. Return { success: true, title: string, content: string, ... } on success.
6. If scraping fails: log failure, return { success: false, error: string, fallbackNeeded: true }.
7. No "polite" inter-article scraping delays for MVP.
8. Unit tests (Jest) must mock axios, test successful extraction, skip logic, and failure/fallback scenarios. All tests must pass.

Story 2.4: Fetch Hacker News Comments (Conditional Logic)

User Story Statement: As the System, I need to fetch comments for each selected Hacker News post using the Algolia HN API, adjusting the strategy based on whether the post is new, a repeat (requiring only new comments), or if its article scraping failed (requiring more comments).
Acceptance Criteria (ACs):
1. hackerNewsService.ts must be extended to fetch comments for an HN Post ID, accepting isUpdateStatus, lastCommentFetchTimestamp, and articleScrapingFailed flags.
2. Use axios to call Algolia HN API item endpoint (/api/v1/items/[POST_ID]).
3. Comment Fetching Logic:
  - If articleScrapingFailed: Fetch up to 3x HN_COMMENTS_COUNT_PER_POST (configurable, e.g., ~150) available comments (top-level first, then immediate children, by Algolia sort order).
  - If isUpdateStatus (repeat post): Fetch all comments, then filter client-side for comments with created_at_i > lastCommentFetchTimestamp. Select up to HN_COMMENTS_COUNT_PER_POST of these new comments.
  - Else (new post, successful scrape): Fetch up to HN_COMMENTS_COUNT_PER_POST (configurable, e.g., ~50) available comments.
4. Extract plain text (HTML stripped), author, creation timestamp for selected comments.
5. Return array of comment objects; empty if none.
6. Error handling and logging for API calls.
7. Unit tests (Jest) must mock axios and verify all conditional fetching logic, comment selection/filtering, data extraction (HTML stripping), and error handling. All tests must pass.

Story 2.5: Content Aggregation and Formatting for Play.ai

User Story Statement: As the System, I need to aggregate the collected Hacker News post data (titles), associated article content (full, truncated, or fallback summary), and comments (new, updated, or extended sets) for all top stories, and format this combined text according to the specified structure for the play.ai PlayNote API, including special phrasing for different post types (new, update, scrape-failed), so that it's ready for podcast generation.
Acceptance Criteria (ACs):
1. A contentFormatterService.ts must be implemented.
2. It must accept an array of processed HN post objects (containing all necessary metadata, statuses, content, and comments).
3. Output must be a single string.
4. String must start: "It's a top 10 countdown for [Current Date]".
5. Posts must be sequenced in reverse rank order.
6. Formatting for standard new post: "Story [Rank] - [Article Title]. [Full/Truncated Article Text]. Comments Section. [Number] comments follow. Comment 1: [Text]. Comment 2: [Text]..."
7. Formatting for repeat post (update): "Story [Rank] (previously Rank [OldRank] yesterday) - [Article Title]. We're bringing you new comments on this popular story. Comments Section. [Number] new comments follow. Comment 1: [Text]..."
8. Formatting for scrape-failed post: "Story [Rank] - [Article Title]. We couldn't retrieve the full article, but here's a summary if available and the latest comments. [Optional HN Summary]. Comments Section. [Number] comments follow. Comment 1: [Text]..."
9. Article Truncation: If MAX_ARTICLE_LENGTH (env var) is set and article exceeds, truncate by attempting to preserve beginning and end (or simpler first X / last Y characters for MVP).
10. Graceful handling for missing parts (e.g., "Article content not available," "0 comments follow") must be implemented.
11. Unit tests (Jest) must verify all formatting variations, truncation, data merging, and error handling. All tests must pass.

Story 2.6 (REVISED): Implement Podcast Generation Status Polling via Play.ai API

User Story Statement: As the System, after submitting a podcast generation job to Play.ai and receiving a jobId, I need an AWS Step Function state machine to periodically poll the Play.ai API for the status of this specific job, continuing until the job is reported as "completed" or "failed" (or a configurable max duration/attempts limit is reached), so the system can reliably determine when the podcast audio is ready or if an error occurred.
Acceptance Criteria (ACs):
1. The AWS Step Function state machine (CDK defined in Story 2.1) must manage the polling workflow.
2. Input: jobId.
3. States: Invoke Poller Lambda (calls Play.ai status endpoint with axios), Wait (configurable POLLING_INTERVAL_MINUTES), Choice (evaluates status).
4. Loop if status is "processing".
5. Stop if "completed" or "failed".
6. Max polling duration/attempts (configurable env vars MAX_POLLING_DURATION_MINUTES, MAX_POLLING_ATTEMPTS) must be enforced, treating expiry as failure.
7. If "completed": extract audioUrl, trigger next step (Story 2.8 process) with data.
8. If "failed"/"timeout": log event, record failure, terminate.
9. Poller Lambda handles Play.ai API errors gracefully.
10. Unit tests for Poller Lambda; Step Function definition tested. All tests must pass.

Story 2.7: Submit Content to Play.ai PlayNote API & Initiate Podcast Generation

User Story Statement: As the System, I need to securely submit the aggregated and formatted text content (using sourceText) to the play.ai PlayNote API via an application/json request to initiate the podcast generation process, and I must capture the jobId returned by Play.ai, so that this jobId can be passed to the status polling mechanism (Step Function).
Acceptance Criteria (ACs):
1. A playAiService.ts function must handle submission.
2. Input: formatted text (from Story 2.5).
3. Use axios for POST to Play.ai endpoint (e.g., https://api.play.ai/api/v1/playnotes).
4. Request Content-Type: application/json.
5. JSON body: sourceText, and configurable title, voiceId1, name1, voiceId2, name2, styleGuidance (from env vars). Developer to confirm exact field names per Play.ai JSON API for text input.
6. Headers: Authorization: Bearer [TOKEN], X-USER-ID: [API_KEY] (from env vars PLAY_AI_BEARER_TOKEN, PLAY_AI_USER_ID).
7. No webHookUrl sent.
8. On success: extract jobId, log it, initiate polling Step Function (Story 2.6) with jobId.
9. Error handling for API submission.
10. Unit tests (Jest) mock axios, verify API call, auth, payload, jobId extraction, Step Function initiation, error handling. All tests must pass.

Story 2.8: Retrieve, Store Generated Podcast Audio & Persist Episode Metadata

User Story Statement: As the System, once the podcast generation status polling (Story 2.6) indicates a Play.ai job is "completed," I need to download the generated audio file from the provided audioUrl, store this file in our designated S3 bucket, and then save all relevant metadata for the episode (including the S3 audio location, episodeNumber, podcastGeneratedTitle, playAiSourceAudioUrl, and source information like isUpdateStatus for each HN story, and lastCommentFetchTimestamp for each HN post) into our DynamoDB table, so that the daily digest is fully processed, archived, and ready for access.
Acceptance Criteria (ACs):
1. A podcastStorageService.ts function must be triggered by Step Function (Story 2.6) on "completed" status, receiving audioUrl, Play.ai jobId, and original context (list of source HN posts with their processing statuses & metadata).
2. Use axios to download audio from audioUrl.
3. Upload audio to S3 bucket (from Story 2.1), using key (e.g., YYYY/MM/DD/episodeId.mp3).
4. Prepare episode metadata: episodeId (UUID), publicationDate (YYYY-MM-DD), episodeNumber (sequential logic), podcastGeneratedTitle (from Play.ai or constructed), audioS3Bucket, audioS3Key, playAiJobId, playAiSourceAudioUrl, sourceHNPosts (array of objects: { hnPostId, title, originalArticleUrl, hnLink, isUpdateStatus, lastCommentFetchTimestamp, oldRank }), status: "Published", createdAt.
5. The lastCommentFetchTimestamp for each processed hnPostId in sourceHNPosts must be set to the current time (or comment processing time) for future "new comments only" logic.
6. Save metadata to BmadDailyDigestEpisodes DynamoDB table.
7. Error handling for download, S3 upload, DDB write; failure sets episode status to "Failed".
8. Unit tests (Jest) mock axios, AWS SDK (S3, DynamoDB); verify data handling, storage, metadata, errors. All tests must pass.

Story 2.9: Daily Workflow Orchestration & Scheduling

User Story Statement: As the System Administrator, I need the entire daily backend workflow (Stories 2.2 through 2.8) to be fully orchestrated by the primary AWS Step Function state machine and automatically scheduled to run once per day using Amazon EventBridge Scheduler, ensuring it handles re-runs for the same day by overwriting/starting over (for MVP), so that "BMad Daily Digest" episodes are produced consistently and reliably.
Acceptance Criteria (ACs):
1. The primary AWS Step Function state machine must orchestrate the sequence: Call Lambdas/services for Fetch HN Posts (2.2), then for each post: Scrape Article (2.3) & Fetch Comments (2.4); then Aggregate & Format (2.5); then Submit to Play.ai (2.7 which returns jobId); then initiate Polling (2.6 using jobId); on "completed" polling, trigger Retrieve & Store Audio/Metadata (2.8).
2. State machine must manage data flow between steps.
3. Overall workflow error handling: critical step failure marks state machine execution as "Failed" and logs comprehensively. Individual steps use retries for transient errors.
4. Idempotency (MVP): Re-running for the same publicationDate must re-process and overwrite previous data for that date.
5. Amazon EventBridge Scheduler rule (CDK defined) must trigger the main Step Function daily at 12:00 UTC (default, configurable via DAILY_JOB_SCHEDULE_UTC_CRON).
6. Successful end-to-end run must be demonstrated.
7. Step Function execution history must provide a clear audit trail.
8. Unit tests for any new orchestrator-specific Lambda functions. All tests must pass.

Epic 3: Web Application MVP & Podcast Consumption

Goal: To set up the frontend project in its dedicated repository and develop and deploy the Next.js frontend application MVP, enabling users to consume the "BMad Daily Digest." This includes initial project setup (AI-assisted UI kickstart), pages for listing and detailing episodes, an about page, and deployment.

User Stories for Epic 3:

Story 3.1: Frontend Project Repository & Initial UI Setup (AI-Assisted)

User Story Statement: As a Developer, I need to establish the bmad-daily-digest-frontend Git repository with a new Next.js (TypeScript, Node.js 22) project. This foundational setup must include essential development tooling (ESLint, Prettier, Jest with React Testing Library, a basic CI stub), and integrate an initial UI structure and core presentational components—kickstarted by an AI UI generation tool—styled with Tailwind CSS and shadcn/ui to embody the "80s retro CRT terminal" aesthetic, so that a high-quality, styled, and standardized frontend development environment is ready for building application features.
Acceptance Criteria (ACs):
1. A new, private Git repository bmad-daily-digest-frontend must be created on GitHub.
2. A Next.js project (TypeScript, targeting Node.js 22) must be initialized.
3. Tooling (ESLint for Next.js/React/TS, Prettier, Jest with React Testing Library) must be installed and configured.
4. NPM scripts for lint, format, test, build must be in package.json.
5. Tailwind CSS and shadcn/ui must be installed and configured.
6. An initial UI structure (basic page layout, placeholder pages for List, Detail, About) and a few core presentational components (themed buttons, text display) must be generated using an agreed-upon AI UI generation tool, targeting the "80s retro CRT terminal" aesthetic.
7. The AI-generated UI code must be integrated into the Next.js project.
8. The application must build successfully with the initial UI.
9. A basic CI pipeline stub (GitHub Actions) for lint, format check, test, build must be created.
10. A standard Next.js .gitignore and an updated README.md must be present.

Story 3.2: Frontend API Service Layer for Backend Communication

User Story Statement: As a Frontend Developer, I need a dedicated and well-typed API service layer within the Next.js frontend application to manage all HTTP communication with the "BMad Daily Digest" backend API (for fetching episode lists and specific episode details), so that UI components can cleanly and securely consume backend data with robust error handling.
Acceptance Criteria (ACs):
1. A TypeScript module (e.g., src/services/apiClient.ts) must encapsulate backend API interactions.
2. Functions must exist for: fetching all podcast episodes (e.g., getEpisodes()) and fetching details for a single episode by episodeId (e.g., getEpisodeDetails(episodeId)).
3. axios must be used for HTTP requests.
4. Backend API base URL must be configurable via NEXT_PUBLIC_BACKEND_API_URL.
5. TypeScript interfaces (e.g., EpisodeListItem, EpisodeDetail, SourceHNStory) for API response data must be defined, matching data structure from backend (Story 2.8).
6. API functions must correctly parse JSON responses and transform data into defined frontend interfaces.
7. Error handling (network errors, non-2xx responses) must be implemented, providing clear error information.
8. Unit tests (Jest) must mock axios and verify API calls, data parsing/transformation, and error handling. All tests must pass.

Story 3.3: Episode List Page Implementation

User Story Statement: As a Busy Tech Executive, I want to view a responsive "Episode List Page" that clearly displays all available "BMad Daily Digest" episodes in reverse chronological order, showing the episode number, publication date, and podcast title for each, so that I can quickly find and select the latest or a specific past episode to listen to.
Acceptance Criteria (ACs):
1. A Next.js page component (e.g., src/app/page.tsx or src/app/episodes/page.tsx) must be created.
2. It must use the API service layer (Story 3.2) to fetch episodes.
3. A themed loading state must be shown during data fetching.
4. An error message must be shown if fetching fails.
5. A "No episodes available yet" message must be shown for an empty list.
6. Episodes must be listed in reverse chronological order (newest first), based on publicationDate or episodeNumber.
7. Each list item must display "Episode [EpisodeNumber]: [PublicationDate] - [PodcastGeneratedTitle]".
8. Each item must link to the Episode Detail Page for that episode using its episodeId.
9. Styling must adhere to the "80s retro CRT terminal" aesthetic.
10. The page must be responsive.
11. Unit/integration tests (Jest with RTL) must cover loading, error, no episodes, list rendering, data display, order, and navigation. All tests must pass.

Story 3.4: Episode Detail Page Implementation

User Story Statement: As a Busy Tech Executive, after selecting a podcast episode from the list, I want to be taken to a responsive "Episode Detail Page" where I can easily play the audio using a standard HTML5 player, see a clear breakdown of the Hacker News stories discussed in that episode, and have direct links to explore both the original articles and the Hacker News comment threads, so I can listen to the digest and dive deeper into topics of interest.
Acceptance Criteria (ACs):
1. A Next.js dynamic route page (e.g., src/app/episodes/[episodeId]/page.tsx) must be created.
2. It must accept episodeId from the URL.
3. It must use the API service layer (Story 3.2) to fetch details for the episodeId.
4. Loading and error states must be handled.
5. If data is found, it must display: podcastGeneratedTitle, publicationDate, episodeNumber.
6. An embedded HTML5 audio player (<audio controls>) must play the podcast. The src must be the publicly accessible URL for the audio file (e.g., a CloudFront URL pointing to the S3 object).
7. A list of included Hacker News stories (from the sourceHNPosts array in the episode data) must be displayed.
8. For each HN story in the list: its title, a link to the originalArticleUrl (opening in new tab), and a link to its hnLink (opening in new tab) must be displayed.
9. Styling must adhere to the "80s retro CRT terminal" aesthetic.
10. The page must be responsive.
11. Unit/integration tests (Jest with RTL) must cover loading, error, rendering of all details, player presence, and correct link generation. All tests must pass.

Story 3.5: "About" Page Implementation

User Story Statement: As a User, I want to access a minimalist, responsive, and consistently styled "About Page" that clearly explains what the "BMad Daily Digest" service is, its core purpose, and how it works at a high level, so that I can quickly understand the value and nature of the service.
Acceptance Criteria (ACs):
1. A Next.js page component (e.g., src/app/about/page.tsx) must be created.
2. It must display static informational content (placeholder text: "BMad Daily Digest provides a daily audio summary of top Hacker News discussions for busy tech professionals, generated using AI.").
3. Content must explain: What "BMad Daily Digest" is, its purpose, and a high-level overview of generation.
4. Styling must adhere to the "80s retro CRT terminal" aesthetic.
5. The page must be responsive.
6. A link to the "About Page" must be accessible (e.g., in site navigation/footer, specific location defined by Story 3.1's AI-generated layout).
7. Unit tests (Jest with RTL) must verify rendering of static content. All tests must pass.

Story 3.6: Frontend Deployment to S3 & CloudFront via CDK

User Story Statement: As a Developer, I need the Next.js frontend application to be configured for static export (or an equivalent static-first deployment model) and have its AWS infrastructure (S3 for hosting, CloudFront for CDN and HTTPS) defined and managed via AWS CDK. This setup should automate the deployment of the static site, making the "BMad Daily Digest" web application publicly accessible, performant, and cost-effective.
Acceptance Criteria (ACs):
1. The Next.js application must be configured for static export suitable for S3/CloudFront hosting.
2. AWS CDK scripts (managed in the bmad-daily-digest-backend repo's CDK app for unified IaC, unless Architect advises a separate frontend CDK app) must define the frontend S3 bucket and CloudFront distribution.
3. CDK stack must define: S3 bucket (static web hosting), CloudFront distribution (S3 origin, HTTPS via default CloudFront domain or ACM cert for custom domain if specified, caching behaviors, OAC/OAI).
4. A package.json build script must generate the static output (e.g., to out/).
5. The CDK deployment process (or related script) must include steps to build the Next.js app and sync static files to S3.
6. The application must be accessible via its CloudFront URL.
7. All MVP functionalities must be operational on the deployed site.
8. HTTPS must be enforced.
9. CDK code must meet project standards.

7. Key Reference Documents

{This section will be populated later with links to the final PRD, Architecture Document, UI/UX Specification, etc.}

8. Out of Scope Ideas Post MVP

Advanced Audio Player Functionality: Custom controls (skip +/- 15s), playback speed adjustment, remembering playback position.
User Accounts & Personalization: User creation, email subscription management, customizable podcast hosts.
Enhanced Content Delivery & Discovery: Daily email summary, full RSS feed, full podcast transcription on website, search functionality.
Expanded Content Sources: Beyond Hacker News.
Community & Feedback: In-app feedback mechanisms.

9. Change Log

Change	Date	Version	Description	Author
Initial PRD draft and MVP scope definition.	May 20, 2025	0.1	Created initial PRD based on Project Brief; defined Epics & detailed Stories.	John (PM) & User

44 KiB Raw Blame History