BMad Daily Digest Product Requirements Document (PRD)
Version: 0.2 | Date: May 20, 2025 | Author: BMad Project Team (John - PM, Fred - Architect, Sarah - PO, User)
1. Goal, Objective and Context
- Overall Goal: To provide busy tech executives with a quick, daily audio digest of top Hacker News posts and discussions, enabling them to stay effortlessly informed.
- Project Objective (MVP Focus): To successfully launch the "BMad Daily Digest" by:
- Automating the daily fetching of top 10 Hacker News posts (article metadata and comments via Algolia HN API) and scraping of linked article content.
- Processing this content into a structured format.
- Generating a 2-agent audio podcast using the play.ai PlayNote API.
- Delivering the podcast via a simple Next.js web application (polyrepo structure) with a list of episodes and detail pages including an audio player and links to source materials.
- Operating this process daily, aiming for delivery by a consistent morning hour.
- Adhering to a full TypeScript stack (Node.js 22 for backend), with a Next.js frontend, AWS Lambda backend, DynamoDB, S3, and AWS CDK for IaC, while aiming to stay within AWS free-tier limits where possible.
- Context/Problem Solved: Busy tech executives lack the time to thoroughly read Hacker News daily but need to stay updated on key tech discussions, trends, and news for strategic insights. "BMad Daily Digest" solves this by offering a convenient, curated audio summary.
2. Functional Requirements (MVP)
FR1: Content Acquisition
- The system must automatically fetch data for the top 10 (configurable) posts from Hacker News daily.
- For each Hacker News post, the system must identify and retrieve:
- The URL of the linked article.
- Key metadata about the Hacker News post (e.g., title, HN link, score, author, HN Post ID).
- The system must fetch comments for each identified Hacker News post using the Algolia HN Search API, with logic to handle new vs. repeat posts and scraping failures differently.
- The system must attempt to scrape and extract the primary textual content from the linked article URL for each of the top posts (unless it's a repeat post where only new comments are needed).
- This process should aim to isolate the main article body.
- If scraping fails, a fallback using HN title, summary (if available), and increased comments must be used.
FR2: Content Processing and Formatting
- The system must aggregate the extracted/fallback article content and selected comments for the top 10 posts.
- The system must process and structure the aggregated text content into a single text file suitable for submission to the play.ai PlayNote API.
- The text file must begin with an introductory sentence: "It's a top 10 countdown for [Today's Date]".
- Content must be structured sequentially (e.g., "Story 10 - [details]..."), with special phrasing for repeat posts or posts where article scraping failed.
- Article content may be truncated if `MAX_ARTICLE_LENGTH` (environment variable) is set, aiming to preserve the intro/conclusion where possible.
FR3: Podcast Generation
- The system must submit the formatted text content to the play.ai PlayNote API daily using specified voice and style parameters (configurable via environment variables).
- The system must capture the `jobId` from Play.ai and use a polling mechanism (e.g., AWS Step Functions) to check for job completion status.
- Upon successful completion, the system must retrieve the generated audio podcast file from the Play.ai-provided URL.
- The system must store the generated audio file (e.g., on S3) and its associated metadata (including episode number, generated title, S3 location, original Play.ai URL, source HN posts, and processing status) in DynamoDB.
FR4: Web Application Interface (MVP)
- The system must provide a web application (Next.js, "80s retro CRT terminal" theme with Tailwind CSS & shadcn/ui) with a List Page that:
- Displays a chronological list (newest first) of all generated "BMad Daily Digest" episodes, formatted as "Episode [EpisodeNumber]: [PublicationDate] - [PodcastTitle]".
- Allows users to navigate to a Detail Page for each episode.
- The system must provide a web application with a Detail Page for each episode that:
- Displays the `podcastGeneratedTitle`, `publicationDate`, and `episodeNumber`.
- Includes an embedded HTML5 audio player for the podcast.
- Lists the individual Hacker News stories included, with direct links to the original source article and the Hacker News discussion page.
- The system must provide a minimalist About Page.
- The web application must be responsive.
FR5: Automation and Scheduling
- The entire end-to-end backend process must be orchestrated (preferably via AWS Step Functions) and automated to run daily, triggered by Amazon EventBridge Scheduler (default 12:00 UTC, configurable).
- For MVP, a re-run of the daily job for the same day must overwrite/start over previous data for that day.
3. Non-Functional Requirements (MVP)
a. Performance:
- Podcast Generation Time: The daily process should complete in a timely manner (e.g., target by 8 AM CST/12:00-13:00 UTC, specific completion window TBD based on Play.ai processing).
- Web Application Load Time: Pages on the Next.js app should aim for fast load times (e.g., target under 3-5 seconds).
b. Reliability / Availability:
- Daily Process Success Rate: Target >95% success rate for automated podcast generation without manual intervention.
- Web Application Uptime: Target 99.5%+ uptime.
c. Maintainability:
- Code Quality: Code must be well-documented. Internal code comments should be used when logic isn't clear from names. All functions must have JSDoc-style outer comments. Adherence to defined coding standards (ESLint, Prettier).
- Configuration Management: System configurations and secrets must be managed via environment variables (`.env` locally, Lambda environment variables when deployed), set manually for MVP.
d. Usability (Web Application):
- The web application must be intuitive for busy tech executives.
- The audio player must be simple and reliable.
- Accessibility: Standard MVP considerations, with particular attention to contrast for the "glowing green on dark" theme, good keyboard navigation, and basic screen reader compatibility.
e. Security (MVP Focus):
- API Key Management: Keys for Algolia, Play.ai, and AWS must be stored securely (gitignored `.env` files locally, Lambda environment variables in AWS), not hardcoded.
- Data Handling: Scraped content handled responsibly.
f. Cost Efficiency:
- AWS service usage must aim to stay within free-tier limits where feasible. Play.ai usage is via existing user subscription.
4. User Interaction and Design Goals
a. Overall Vision & Experience:
- Look and Feel: Dark mode UI, "glowing green ASCII/text on a black background" aesthetic (CRT terminal style), "80s retro everything" theme.
- UI Component Toolkit: Tailwind CSS and shadcn/ui, customized for the theme. Initial structure/components kickstarted by an AI UI generation tool (using the `bmad-daily-digest-ui` V0 scaffold as a base).
- User Experience: Highly efficient, clear navigation, no clutter, prioritizing content readability for busy tech executives.
b. Key Interaction Paradigms (MVP):
- View list of digests (reverse chronological), select one for details. No sorting/filtering on list page for MVP.
c. Core Screens/Views (MVP):
- List Page: Episodes ("Episode [N]: [Date] - [PodcastTitle]").
- Detail Page: Episode details, HTML5 audio player, list of source HN stories with links to articles and HN discussions.
- About Page: Minimalist, explaining the service, consistent theme.
d. Accessibility Aspirations (MVP):
- Standard considerations: good contrast (critical for theme), keyboard navigation, basic screen reader compatibility.
e. Branding Considerations (High-Level):
- "80s retro everything" theme is central. Logo/typeface should complement this (e.g., pixel art, retro fonts).
f. Target Devices/Platforms:
- Responsive web application, good UX on desktop and mobile.
5. Technical Assumptions
a. Core Technology Stack & Approach:
- Full TypeScript Stack: TypeScript for frontend and backend.
- Frontend: Next.js (React), Node.js 22. Styling: Tailwind CSS, shadcn/ui. Hosting: Static site on AWS S3 (via CloudFront).
- Backend: Node.js 22, TypeScript. HTTP Client: `axios`. Compute: AWS Lambda. Database: AWS DynamoDB.
- Infrastructure as Code (IaC): All AWS infrastructure via AWS CDK.
- Key External Services/APIs: Algolia HN Search API (posts/comments), Play.ai PlayNote API (audio gen, user has subscription, polling for status), Custom scraping for articles (TypeScript with Cheerio, Readability.js, potentially Puppeteer/Playwright).
- Automation: Daily trigger via Amazon EventBridge Scheduler. Orchestration via AWS Step Functions.
- Configuration & Secrets: Environment variables (
.envlocal & gitignored, Lambda env vars). - Coding Standards: JSDoc for functions, inline comments for clarity. ESLint, Prettier.
b. Repository Structure & Service Architecture:
- Repository Structure: Polyrepo (separate Git repositories for `bmad-daily-digest-frontend` and `bmad-daily-digest-backend`).
- High-Level Service Architecture: Backend is serverless functions (AWS Lambda) for distinct tasks, orchestrated by Step Functions. API layer via AWS API Gateway to expose the backend to the frontend, secured with API Keys.
6. Epic Overview
This section details the Epics and their User Stories for the MVP. Architectural refinements have been incorporated.
Epic 1: Backend Foundation, Tooling & "Hello World" API
Goal: To establish the core backend project infrastructure in its dedicated repository, including robust development tooling and initial AWS CDK setup for essential services. By the end of this epic:
- A simple "hello world" API endpoint (AWS API Gateway + Lambda) must be deployed and testable via `curl`, returning a dynamic message.
- The backend project must have ESLint, Prettier, Jest (unit testing), and esbuild (TypeScript bundling) configured and operational.
- Basic unit tests must exist for the "hello world" Lambda function.
- Code formatting and linting checks should be integrated into a pre-commit hook and/or a basic CI pipeline stub.
- A simple "hello world" API endpoint (AWS API Gateway + Lambda) must be deployed and testable via
User Stories for Epic 1:
Story 1.1: Initialize Backend Project using TS-TEMPLATE-STARTER
- User Story Statement: As a Developer, I want to create the `bmad-daily-digest-backend` Git repository and initialize it using the existing `TS-TEMPLATE-STARTER`, ensuring all foundational tooling (TypeScript, Node.js 22, ESLint, Prettier, Jest, esbuild) is correctly configured and operational for this specific project, so that I have a high-quality, standardized development environment ready for application logic.
- Acceptance Criteria (ACs):
- A new, private Git repository named `bmad-daily-digest-backend` must be created on GitHub.
- The contents of the `TS-TEMPLATE-STARTER` project must be copied/cloned into this new repository.
- `package.json` must be updated (project name, version, description).
- Project dependencies must be installable.
- TypeScript setup (`tsconfig.json`) must be verified for Node.js 22 and esbuild compatibility; the project must compile.
- ESLint and Prettier configurations must be operational; lint/format scripts must execute successfully.
- Jest configuration must be operational; test scripts must execute successfully with any starter example tests.
- Irrelevant generic demo code from the starter should be removed; `index.ts`/`index.test.ts` can remain as placeholders.
- A standard `.gitignore` and an updated project `README.md` must be present.
Story 1.2: Pre-commit Hook Implementation
- User Story Statement: As a Developer, I want pre-commit hooks automatically enforced in the `bmad-daily-digest-backend` repository, so that code quality standards (like linting and formatting) are checked and applied to staged files before any code is committed, thereby maintaining codebase consistency and reducing trivial errors.
- Acceptance Criteria (ACs):
- A pre-commit hook tool (e.g., Husky) must be installed and configured.
- A tool for running linters/formatters on staged files (e.g., `lint-staged`) must be installed and configured.
- The pre-commit hook must trigger `lint-staged` on staged `.ts` files.
- `lint-staged` must be configured to run ESLint (`--fix`) and Prettier (`--write`).
- Attempting to commit files with auto-fixable issues must result in fixes being applied and a successful commit.
- Attempting to commit files with non-auto-fixable linting errors must abort the commit with error messages.
- Committing clean files must proceed without issues.
Story 1.3: "Hello World" Lambda Function Implementation & Unit Tests
- User Story Statement: As a Developer, I need a simple "Hello World" AWS Lambda function implemented in TypeScript within the `bmad-daily-digest-backend` project. This function, when invoked, should return a dynamic greeting message including the current date and time, and it must be accompanied by comprehensive Jest unit tests, so that our basic serverless compute functionality, testing setup, and TypeScript bundling are validated.
- Acceptance Criteria (ACs):
- A `src/features/publicApi/statusHandler.ts` file (or similar, according to the final backend structure) must contain the Lambda handler (sketched below).
- The handler must be AWS Lambda compatible (event, context, Promise response).
- Successful execution must return JSON: `statusCode: 200`, body with `message: "Hello from BMad Daily Digest Backend, today is [current_date] at [current_time]."`.
- The date and time in the message must be dynamic.
- A corresponding Jest unit test file (e.g., `src/features/publicApi/statusHandler.test.ts`) must be created.
- Unit tests must verify: 200 status, valid JSON body, the expected `message` field, the "Hello from..." prefix, and the dynamic date/time portion (using a mocked `Date`).
- All unit tests must pass.
- esbuild configuration must correctly bundle the handler.
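A minimal sketch of such a handler is shown below, assuming the `@types/aws-lambda` event types; the exact date/time formatting is an illustrative choice.

```typescript
// src/features/publicApi/statusHandler.ts - illustrative sketch of the "Hello World" handler.
import type { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";

export const handler = async (
  _event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  const now = new Date();
  // Dynamic date and time; ISO slices used here for brevity.
  const currentDate = now.toISOString().slice(0, 10);
  const currentTime = now.toISOString().slice(11, 19);

  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      message: `Hello from BMad Daily Digest Backend, today is ${currentDate} at ${currentTime}.`,
    }),
  };
};
```

In the accompanying Jest test, the dynamic portion can be pinned with `jest.useFakeTimers().setSystemTime(...)` before invoking the handler and asserting on the parsed body.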
Story 1.4: AWS CDK Setup for "Hello World" API (Lambda & API Gateway)
- User Story Statement: As a Developer, I want to define the necessary AWS infrastructure (Lambda function and API Gateway endpoint) for the "Hello World" service using AWS CDK (Cloud Development Kit) in TypeScript, so that the infrastructure is version-controlled, repeatable, and can be deployed programmatically.
- Acceptance Criteria (ACs):
- AWS CDK (v2) must be a development dependency.
- The CDK app structure must be initialized (e.g., in `cdk/`).
- A new CDK stack (e.g., `BmadDailyDigestBackendStack`) must be defined in TypeScript (see the sketch below).
- The CDK stack must define an AWS Lambda resource for the "Hello World" function (Node.js 22, bundled code reference, handler entry point, basic IAM role for CloudWatch logs, free-tier conscious settings).
- The CDK stack must define an AWS API Gateway (HTTP API preferred) with a route (e.g., `GET /status` or `GET /hello`) triggering the Lambda, secured with the "Frontend Read API Key".
- The CDK stack must be synthesizable (`cdk synth`) without errors.
- CDK code must adhere to project ESLint/Prettier standards.
- Mechanism for passing Lambda environment variables via CDK must be in place.
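The stack might be sketched as follows. Construct IDs and the entry path are illustrative assumptions, a REST API is shown because API keys are natively supported there (the PRD prefers an HTTP API), and `Runtime.NODEJS_22_X` requires an aws-cdk-lib version that ships the Node.js 22 runtime.

```typescript
// cdk/lib/bmad-daily-digest-backend-stack.ts - illustrative "Hello World" stack sketch.
import * as cdk from "aws-cdk-lib";
import { Construct } from "constructs";
import * as lambda from "aws-cdk-lib/aws-lambda";
import { NodejsFunction } from "aws-cdk-lib/aws-lambda-nodejs";
import * as apigateway from "aws-cdk-lib/aws-apigateway";

export class BmadDailyDigestBackendStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // "Hello World" Lambda, bundled from the statusHandler source by esbuild.
    const statusFn = new NodejsFunction(this, "StatusHandler", {
      entry: "src/features/publicApi/statusHandler.ts",
      handler: "handler",
      runtime: lambda.Runtime.NODEJS_22_X,
      memorySize: 128, // free-tier conscious
      timeout: cdk.Duration.seconds(10),
      environment: { LOG_LEVEL: "info" }, // example of env vars passed via CDK
    });

    // GET /status route that requires the Frontend Read API Key.
    const api = new apigateway.RestApi(this, "BmadDailyDigestApi");
    const status = api.root.addResource("status");
    status.addMethod("GET", new apigateway.LambdaIntegration(statusFn), {
      apiKeyRequired: true,
    });

    const apiKey = api.addApiKey("FrontendReadApiKey");
    const plan = api.addUsagePlan("FrontendReadPlan", {
      apiStages: [{ api, stage: api.deploymentStage }],
    });
    plan.addApiKey(apiKey);
  }
}
```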
Story 1.5: "Hello World" API Deployment & Manual Invocation Test
- User Story Statement: As a Developer, I need to deploy the "Hello World" API (defined in AWS CDK) to an AWS environment and successfully invoke its endpoint using a tool like `curl` (including the API Key), so that I can verify the end-to-end deployment process and confirm the basic API is operational in the cloud.
- Acceptance Criteria (ACs):
- The AWS CDK stack for "Hello World" API must deploy successfully to a designated AWS account/region.
- The API Gateway endpoint URL for the `/status` (or `/hello`) route must be retrievable post-deployment.
- A `GET` request to the deployed endpoint, including the correct `x-api-key` header, must receive a response.
- The HTTP response status must be 200 OK.
- Response body must be JSON containing the expected dynamic "Hello..." message.
- Basic Lambda invocation logs must be visible in AWS CloudWatch Logs.
Story 1.6: Basic CI/CD Pipeline Stub with Quality Gates
- User Story Statement: As a Developer, I need a basic Continuous Integration (CI) pipeline established for the `bmad-daily-digest-backend` repository, so that code quality checks (linting, formatting, unit tests) and the build process are automated upon code pushes and pull requests, ensuring early feedback and maintaining codebase health.
- Acceptance Criteria (ACs):
- A CI workflow file (e.g., GitHub Actions in `.github/workflows/main.yml`) must be created.
- The pipeline must trigger on pushes to `main` and PRs targeting `main`.
- The pipeline must include steps for: checkout, Node.js 22 setup, dependency install, ESLint check, Prettier format check, Jest unit tests, and esbuild bundle.
- The pipeline must fail if any lint, format, test, or bundle step fails.
- A successful CI run on the `main` branch must be demonstrated.
- The CI pipeline for MVP does not need to perform AWS deployment.
Epic 2: Automated Content Ingestion & Podcast Generation Pipeline
Goal: To implement the complete automated daily workflow within the backend. This includes fetching Hacker News post data, scraping and extracting content from linked external articles, aggregating and formatting text, submitting it to Play.ai, managing job status via polling, and retrieving/storing the final audio file and associated metadata. This epic delivers the core value proposition of generating the daily audio content and making it ready for consumption via an API.
User Stories for Epic 2:
Story 2.1: AWS CDK Extension for Epic 2 Resources
- User Story Statement: As a Developer, I need to extend the existing AWS CDK stack within the `bmad-daily-digest-backend` project to define and provision all new AWS resources required for the content ingestion and podcast generation pipeline, including the `BmadDailyDigestEpisodes` DynamoDB table (with GSI), the `HackerNewsPostProcessState` DynamoDB table, an S3 bucket for audio storage, and the AWS Step Functions state machine for orchestrating the Play.ai job status polling, so that all backend infrastructure for this epic is managed as code and ready for the application logic.
- Acceptance Criteria (ACs):
- The existing AWS CDK application (from Epic 1) must be extended.
- The `BmadDailyDigestEpisodes` DynamoDB table resource must be defined in CDK as specified in the System Architecture Document's "Data Models" section (with `episodeId` PK and key attributes like `publicationDate`, `episodeNumber`, `podcastGeneratedTitle`, `audioS3Key`, `audioS3Bucket`, `playAiJobId`, `playAiSourceAudioUrl`, a `sourceHNPosts` list, `status`, `createdAt`, `updatedAt`), including a GSI for chronological sorting (e.g., PK `status`, SK `publicationDate`), and PAY_PER_REQUEST billing (see the sketch below).
- The `HackerNewsPostProcessState` DynamoDB table resource must be defined in CDK as specified in the System Architecture Document's "Data Models" section (with `hnPostId` PK and attributes like `lastCommentFetchTimestamp`, `lastSuccessfullyScrapedTimestamp`, `lastKnownRank`), and PAY_PER_REQUEST billing.
- An S3 bucket resource (e.g., `bmad-daily-digest-audio-{unique-suffix}`) must be defined via CDK for audio storage, with private access by default.
- An AWS Step Functions state machine resource must be defined via CDK to manage the Play.ai job status polling workflow (as detailed in Story 2.6).
- Necessary IAM roles and permissions for Lambda functions within this epic to interact with DynamoDB, S3, Step Functions, and CloudWatch Logs must be defined via CDK, adhering to least privilege.
- The updated CDK stack must synthesize (`cdk synth`) and deploy (`cdk deploy`) successfully.
- All new CDK code must adhere to project ESLint/Prettier standards.
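A minimal CDK sketch of the two tables and the audio bucket follows, written as a helper invoked from the existing stack; construct IDs and the GSI name are illustrative.

```typescript
// cdk/lib/episode-pipeline-resources.ts - illustrative Epic 2 resource definitions.
import { Stack } from "aws-cdk-lib";
import * as dynamodb from "aws-cdk-lib/aws-dynamodb";
import * as s3 from "aws-cdk-lib/aws-s3";

export function addEpisodePipelineResources(stack: Stack): void {
  const episodesTable = new dynamodb.Table(stack, "BmadDailyDigestEpisodes", {
    partitionKey: { name: "episodeId", type: dynamodb.AttributeType.STRING },
    billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
  });
  // GSI for chronological listing: PK "status", SK "publicationDate".
  episodesTable.addGlobalSecondaryIndex({
    indexName: "byStatusAndPublicationDate",
    partitionKey: { name: "status", type: dynamodb.AttributeType.STRING },
    sortKey: { name: "publicationDate", type: dynamodb.AttributeType.STRING },
  });

  new dynamodb.Table(stack, "HackerNewsPostProcessState", {
    partitionKey: { name: "hnPostId", type: dynamodb.AttributeType.STRING },
    billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
  });

  new s3.Bucket(stack, "PodcastAudioBucket", {
    blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL, // private by default
    encryption: s3.BucketEncryption.S3_MANAGED,
  });
}
```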
Story 2.2: Fetch Top Hacker News Posts & Identify Repeats
- User Story Statement: As the System, I need to reliably fetch the top N (configurable, e.g., 10) current Hacker News posts daily using the Algolia HN API, including their essential metadata. I also need to identify if each fetched post has been processed in a recent digest by checking against the `HackerNewsPostProcessState` table, so that I have an accurate list of stories and their status (new or repeat) to begin generating the daily digest.
- Acceptance Criteria (ACs):
- A `hackerNewsService.ts` function (sketched below) must fetch the top N HN posts (stories only) via `axios` from the Algolia API (configurable `HN_POSTS_COUNT`).
- Extracted metadata per post: Title, Article URL, HN Post URL, HN Post ID (`objectID`), Author, Points, Creation timestamp.
- For each post, the function must query the `HackerNewsPostProcessState` DynamoDB table to determine its `isUpdateStatus` (true if `lastSuccessfullyScrapedTimestamp` and `lastCommentFetchTimestamp` indicate prior full processing) and retrieve `lastCommentFetchTimestamp` and `lastKnownRank` if available.
- The function must return an array of HN post objects with metadata, `isUpdateStatus`, `lastCommentFetchTimestamp`, and `lastKnownRank`.
- Error handling for Algolia/DynamoDB calls must be implemented and logged.
- Unit tests (Jest) must verify API calls, data extraction, repeat identification (mocked DDB), and error handling. All tests must pass.
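A sketch of the front-page fetch is shown below, assuming the Algolia endpoint `https://hn.algolia.com/api/v1/search?tags=front_page`; the type and function names are illustrative, and the `HackerNewsPostProcessState` lookup is omitted for brevity.

```typescript
// src/features/ingestion/hackerNewsService.ts - illustrative fetch of top front-page stories.
import axios from "axios";

export interface HnPostSummary {
  hnPostId: string;
  title: string;
  articleUrl: string | null;
  hnLink: string;
  author: string;
  points: number;
  createdAtISO: string;
}

const HN_POSTS_COUNT = Number(process.env.HN_POSTS_COUNT ?? "10");

export async function fetchTopHnPosts(): Promise<HnPostSummary[]> {
  // "front_page" returns the stories currently on the HN front page.
  const { data } = await axios.get("https://hn.algolia.com/api/v1/search", {
    params: { tags: "front_page", hitsPerPage: HN_POSTS_COUNT },
    timeout: 10_000,
  });

  return (data.hits as any[]).slice(0, HN_POSTS_COUNT).map((hit) => ({
    hnPostId: String(hit.objectID),
    title: hit.title,
    articleUrl: hit.url ?? null, // "Ask HN"-style posts may have no external URL
    hnLink: `https://news.ycombinator.com/item?id=${hit.objectID}`,
    author: hit.author,
    points: hit.points,
    createdAtISO: hit.created_at,
  }));
}
```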
Story 2.3: Article Content Scraping & Extraction (Conditional)
- User Story Statement: As the System, for each Hacker News post identified as new (or for which article scraping previously failed and is being retried), I need to robustly fetch its HTML content from the linked article URL and extract the primary textual content and title using libraries like Cheerio and Mozilla Readability. If scraping fails, a fallback mechanism must be triggered.
- Acceptance Criteria (ACs):
- An `articleScraperService.ts` function (sketched below) must accept an article URL and `isUpdateStatus`.
- If `isUpdateStatus` is true (the article was already successfully scraped in a previous run; full article text is not stored long term, so repeat posts simply skip the article), scraping must be skipped. For MVP, repeat posts only need fresh comments, per user feedback; this story covers new scrapes and retries of previously failed scrapes.
- If a new scrape is needed: use `axios` (timeout, User-Agent) to fetch the HTML.
- Use `Mozilla Readability` (JS port) and/or `Cheerio` to extract the main article text and title.
- Return `{ success: true, title: string, content: string }` on success.
- If scraping fails: log the failure and return `{ success: false, error: string, fallbackNeeded: true }`.
- No specific "polite" inter-article scraping delays for MVP.
- Unit tests (Jest) must mock `axios`, test successful extraction, skip logic for non-applicable cases, and failure/fallback scenarios. All tests must pass.
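A sketch of the scraper follows. It pairs `@mozilla/readability` with `jsdom` to supply the DOM that Readability expects (an implementation assumption; the PRD also lists Cheerio), and names and the return shape mirror the ACs above.

```typescript
// src/features/ingestion/articleScraperService.ts - illustrative scrape-and-extract sketch.
import axios from "axios";
import { JSDOM } from "jsdom";
import { Readability } from "@mozilla/readability";

export type ScrapeResult =
  | { success: true; title: string; content: string }
  | { success: false; error: string; fallbackNeeded: true };

export async function scrapeArticle(url: string, isUpdate: boolean): Promise<ScrapeResult | null> {
  if (isUpdate) return null; // repeat post: article is not re-scraped for MVP

  try {
    const { data: html } = await axios.get<string>(url, {
      timeout: 15_000,
      headers: { "User-Agent": "bmad-daily-digest-bot/0.1" },
    });

    // Readability needs a DOM document, so parse the fetched HTML with jsdom first.
    const dom = new JSDOM(html, { url });
    const parsed = new Readability(dom.window.document).parse();

    if (!parsed?.textContent?.trim()) {
      return { success: false, error: "No readable content extracted", fallbackNeeded: true };
    }
    return { success: true, title: parsed.title ?? "", content: parsed.textContent.trim() };
  } catch (err) {
    return { success: false, error: (err as Error).message, fallbackNeeded: true };
  }
}
```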
Story 2.4: Fetch Hacker News Comments (Conditional Logic)
- User Story Statement: As the System, I need to fetch comments for each selected Hacker News post using the Algolia HN API, adjusting the strategy to fetch up to N comments for new posts, only new comments since last fetch for repeat posts, or up to 3N comments if article scraping failed.
- Acceptance Criteria (ACs):
- `hackerNewsService.ts` must be extended to fetch comments for an HN Post ID, accepting `isUpdateStatus`, `lastCommentFetchTimestamp` (from `HackerNewsPostProcessState`), and `articleScrapingFailed` flags (see the sketch below).
- Use `axios` to call the Algolia HN API item endpoint.
- Comment fetching logic:
  - If `articleScrapingFailed`: fetch up to 3 * `HN_COMMENTS_COUNT_PER_POST` available comments.
  - If `isUpdateStatus`: fetch all comments, then filter client-side for comments with `created_at_i` > `lastCommentFetchTimestamp`. Select up to `HN_COMMENTS_COUNT_PER_POST` of these new comments.
  - Else (new post, successful scrape): fetch up to `HN_COMMENTS_COUNT_PER_POST`.
- For selected comments, extract plain text (HTML stripped), author, and creation timestamp.
- Return an array of comment objects; empty if none. An updated `lastCommentFetchTimestamp` (max `created_at_i` of fetched comments for this post) should be available for updating `HackerNewsPostProcessState`.
- Error handling and logging for API calls.
- Unit tests (Jest) must mock `axios` and verify all conditional fetching logic, comment selection/filtering, data extraction, and error handling. All tests must pass.
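The conditional selection logic might look like the sketch below, assuming the Algolia item endpoint `https://hn.algolia.com/api/v1/items/{id}`, which returns a nested `children` comment tree; helper names are illustrative.

```typescript
// Comment fetching sketch for hackerNewsService.ts (illustrative).
import axios from "axios";

interface AlgoliaItem {
  id: number;
  author: string | null;
  text: string | null; // comment HTML
  created_at_i: number;
  children: AlgoliaItem[];
}

export interface DigestComment {
  author: string;
  text: string;
  createdAtEpoch: number;
}

const PER_POST = Number(process.env.HN_COMMENTS_COUNT_PER_POST ?? "10");

const stripHtml = (html: string): string =>
  html.replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").trim();

function flatten(item: AlgoliaItem): AlgoliaItem[] {
  return item.children.flatMap((child) => [child, ...flatten(child)]);
}

export async function fetchComments(
  hnPostId: string,
  opts: { isUpdateStatus: boolean; lastCommentFetchTimestamp?: number; articleScrapingFailed: boolean }
): Promise<DigestComment[]> {
  const { data } = await axios.get<AlgoliaItem>(
    `https://hn.algolia.com/api/v1/items/${hnPostId}`,
    { timeout: 10_000 }
  );

  let comments = flatten(data).filter((c) => c.text && c.author);

  if (opts.isUpdateStatus && opts.lastCommentFetchTimestamp) {
    // Repeat post: keep only comments newer than the last fetch.
    comments = comments.filter((c) => c.created_at_i > opts.lastCommentFetchTimestamp!);
  }

  const limit = opts.articleScrapingFailed ? 3 * PER_POST : PER_POST;

  return comments.slice(0, limit).map((c) => ({
    author: c.author!,
    text: stripHtml(c.text!),
    createdAtEpoch: c.created_at_i,
  }));
}
```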
Story 2.5: Content Aggregation and Formatting for Play.ai
- User Story Statement: As the System, I need to aggregate the collected Hacker News post data (titles), associated article content (full, truncated, or fallback summary), and comments (new, updated, or extended sets) for all top stories, and format this combined text according to the specified structure for the play.ai PlayNote API, including special phrasing for different post types (new, update, scrape-failed) and configurable article truncation.
- Acceptance Criteria (ACs):
- A `contentFormatterService.ts` must be implemented (see the sketch below).
- Inputs: an array of processed HN post objects (with metadata, statuses, content, comments).
- Output: a single string.
- The string starts: "It's a top 10 countdown for [Current Date]".
- Posts are sequenced in reverse rank order.
- Formatting (new post): "Story [Rank] - [Article Title]. [Full/Truncated Article Text]. Comments Section. [Number] comments follow. Comment 1: [Text]..."
- Formatting (repeat post): "Story [Rank] (previously Rank [OldRank] yesterday) - [Article Title]. We're bringing you new comments on this popular story. Comments Section. [Number] new comments follow. Comment 1: [Text]..."
- Formatting (scrape-failed post): "Story [Rank] - [Article Title]. We couldn't retrieve the full article, but here's a summary if available and the latest comments. [Optional HN Summary]. Comments Section. [Number] comments follow. Comment 1: [Text]..."
- Article truncation: if `MAX_ARTICLE_LENGTH` (env var) is set and the article exceeds it, truncate while aiming to preserve the intro/conclusion.
- Graceful handling for missing parts.
- Unit tests (Jest) verify all formatting, truncation, data merging, and error handling. All tests must pass.
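A condensed sketch of the formatter follows; the input type mirrors the fields named above, and keeping the first and last portions of an over-long article is one possible reading of "preserve intro/conclusion".

```typescript
// src/features/formatting/contentFormatterService.ts - illustrative formatting sketch.
export interface ProcessedPost {
  rank: number;
  title: string;
  articleText?: string; // absent when scraping failed
  hnSummary?: string;   // optional fallback summary
  isUpdate: boolean;
  oldRank?: number;
  scrapeFailed: boolean;
  comments: { author: string; text: string }[];
}

const MAX_ARTICLE_LENGTH = process.env.MAX_ARTICLE_LENGTH
  ? Number(process.env.MAX_ARTICLE_LENGTH)
  : undefined;

function truncate(text: string): string {
  if (!MAX_ARTICLE_LENGTH || text.length <= MAX_ARTICLE_LENGTH) return text;
  // Keep the beginning and end so the intro and conclusion survive truncation.
  const half = Math.floor(MAX_ARTICLE_LENGTH / 2);
  return `${text.slice(0, half)} ... ${text.slice(-half)}`;
}

function formatComments(comments: ProcessedPost["comments"], label: string): string {
  const lines = comments.map((c, i) => `Comment ${i + 1}: ${c.text}`);
  return `Comments Section. ${comments.length} ${label} follow. ${lines.join(" ")}`;
}

export function formatDigest(posts: ProcessedPost[], dateLabel: string): string {
  const sections = [...posts]
    .sort((a, b) => b.rank - a.rank) // countdown: reverse rank order
    .map((post) => {
      if (post.scrapeFailed) {
        return `Story ${post.rank} - ${post.title}. We couldn't retrieve the full article, but here's a summary if available and the latest comments. ${post.hnSummary ?? ""} ${formatComments(post.comments, "comments")}`;
      }
      if (post.isUpdate) {
        return `Story ${post.rank} (previously Rank ${post.oldRank} yesterday) - ${post.title}. We're bringing you new comments on this popular story. ${formatComments(post.comments, "new comments")}`;
      }
      return `Story ${post.rank} - ${post.title}. ${truncate(post.articleText ?? "")} ${formatComments(post.comments, "comments")}`;
    });

  return `It's a top 10 countdown for ${dateLabel}. ${sections.join(" ")}`;
}
```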
Story 2.6 (REVISED): Implement Podcast Generation Status Polling via Play.ai API
- User Story Statement: As the System, after submitting a podcast generation job to Play.ai and receiving a `jobId`, I need an AWS Step Function state machine to periodically poll the Play.ai API for the status of this specific job, continuing until the job is reported as "completed" or "failed" (or a configurable max duration/attempts limit is reached), so the system can reliably determine when the podcast audio is ready or if an error occurred.
- Acceptance Criteria (ACs):
- The AWS Step Function state machine (CDK defined in Story 2.1) must manage the polling workflow.
- Input: `jobId`.
- States: Invoke Poller Lambda (calls the Play.ai status GET endpoint with `axios`), Wait (configurable `POLLING_INTERVAL_MINUTES`), Choice (evaluates status: "processing", "completed", "failed"). See the CDK sketch below.
- Loop if "processing". Stop if "completed" or "failed".
- Max polling duration/attempts (configurable env vars `MAX_POLLING_DURATION_MINUTES`, `MAX_POLLING_ATTEMPTS`) must be enforced, treating expiry as failure.
- If "completed": extract `audioUrl` and trigger the next step (Story 2.8 process) with the data.
- If "failed"/"timeout": log the event, record the failure (e.g., update episode status in DDB via a Lambda), and terminate.
- Poller Lambda handles API errors gracefully.
- Unit tests for Poller Lambda logic; Step Function definition tested (locally if possible, or via AWS console tests). All tests must pass.
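One way to express this polling loop with CDK's Step Functions constructs is sketched below; the poller and finalizer Lambdas are passed in, the hard-coded interval and timeout stand in for the configurable environment variables, and all names are illustrative.

```typescript
// cdk/lib/playai-polling.ts - illustrative Step Functions polling loop.
import { Duration } from "aws-cdk-lib";
import { Construct } from "constructs";
import * as sfn from "aws-cdk-lib/aws-stepfunctions";
import * as tasks from "aws-cdk-lib/aws-stepfunctions-tasks";
import * as lambda from "aws-cdk-lib/aws-lambda";

export function buildPollingStateMachine(
  scope: Construct,
  pollerFn: lambda.IFunction,   // calls the Play.ai status endpoint, returns { jobId, status, audioUrl? }
  finalizeFn: lambda.IFunction, // Story 2.8: download audio, store to S3, persist metadata
): sfn.StateMachine {
  const checkStatus = new tasks.LambdaInvoke(scope, "CheckPlayAiJobStatus", {
    lambdaFunction: pollerFn,
    outputPath: "$.Payload",
  });

  const wait = new sfn.Wait(scope, "WaitBeforeNextPoll", {
    time: sfn.WaitTime.duration(Duration.minutes(2)), // POLLING_INTERVAL_MINUTES
  });

  const storeEpisode = new tasks.LambdaInvoke(scope, "RetrieveAndStoreAudio", {
    lambdaFunction: finalizeFn,
  });

  const failed = new sfn.Fail(scope, "PodcastGenerationFailed");

  const decide = new sfn.Choice(scope, "JobComplete?")
    .when(sfn.Condition.stringEquals("$.status", "completed"), storeEpisode)
    .when(sfn.Condition.stringEquals("$.status", "failed"), failed)
    .otherwise(wait.next(checkStatus)); // still processing: wait, then poll again

  return new sfn.StateMachine(scope, "PlayAiPollingStateMachine", {
    definitionBody: sfn.DefinitionBody.fromChainable(checkStatus.next(decide)),
    timeout: Duration.minutes(60), // MAX_POLLING_DURATION_MINUTES guard
  });
}
```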
Story 2.7: Submit Content to Play.ai PlayNote API & Initiate Podcast Generation
- User Story Statement: As the System, I need to securely submit the aggregated and formatted text content (using `sourceText`) to the play.ai PlayNote API via an `application/json` request to initiate the podcast generation process, and I must capture the `jobId` returned by Play.ai, so that this `jobId` can be passed to the status polling mechanism (Step Function).
- Acceptance Criteria (ACs):
- A `playAiService.ts` function (sketched below) must handle submission.
- Input: the formatted text (from Story 2.5).
- Use `axios` for a `POST` to the Play.ai endpoint (e.g., `https://api.play.ai/api/v1/playnotes`).
- Request `Content-Type: application/json`.
- JSON body: `sourceText`, plus configurable `title`, `voiceId1`, `name1` (default "Angelo"), `voiceId2`, `name2` (default "Deedee"), and `styleGuidance` (default "podcast") from env vars.
- Headers: `Authorization: Bearer <PLAY_AI_BEARER_TOKEN>`, `X-USER-ID: <PLAY_AI_USER_ID>` (from env vars).
- No `webHookUrl` is sent.
- On success: extract the `jobId`, log it, and initiate the polling Step Function (Story 2.6) with the `jobId` and other context (like the internal `episodeId`).
- Error handling for API submission; log and flag failure.
- Unit tests (Jest) mock `axios` and verify the API call, auth, payload, `jobId` extraction, Step Function initiation (mocked), and error handling. All tests must pass.
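A sketch of the submission call follows, using the endpoint, body fields, and headers listed above; the assumption that the response carries the job identifier in an `id` field should be confirmed against the PlayNote API docs.

```typescript
// src/features/podcast/playAiService.ts - illustrative PlayNote submission sketch.
import axios from "axios";

export async function submitPlayNoteJob(sourceText: string, title: string): Promise<string> {
  const response = await axios.post(
    "https://api.play.ai/api/v1/playnotes",
    {
      sourceText,
      title,
      voiceId1: process.env.PLAY_AI_VOICE_ID_1,
      name1: process.env.PLAY_AI_NAME_1 ?? "Angelo",
      voiceId2: process.env.PLAY_AI_VOICE_ID_2,
      name2: process.env.PLAY_AI_NAME_2 ?? "Deedee",
      styleGuidance: process.env.PLAY_AI_STYLE_GUIDANCE ?? "podcast",
      // No webHookUrl: completion is detected by the polling Step Function instead.
    },
    {
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.PLAY_AI_BEARER_TOKEN}`,
        "X-USER-ID": process.env.PLAY_AI_USER_ID ?? "",
      },
      timeout: 30_000,
    },
  );

  // Assumption: the created PlayNote's identifier is returned in the response body.
  const jobId: string = response.data.id;
  if (!jobId) throw new Error("Play.ai response did not include a job id");
  return jobId;
}
```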
Story 2.8: Retrieve, Store Generated Podcast Audio & Persist Episode Metadata
- User Story Statement: As the System, once the podcast generation status polling (Story 2.6) indicates a Play.ai job is "completed," I need to download the generated audio file from the provided `audioUrl`, store this file in our designated S3 bucket, and then save/update all relevant metadata for the episode (including the S3 audio location, `episodeNumber`, `podcastGeneratedTitle`, `playAiSourceAudioUrl`, and source HN post information including their `lastCommentFetchTimestamp`) into our DynamoDB tables (`BmadDailyDigestEpisodes` and `HackerNewsPostProcessState`), so that the daily digest is fully processed, archived, and ready for access.
- Acceptance Criteria (ACs):
- A `podcastStorageService.ts` function (sketched below) must be triggered by the Step Function (Story 2.6) on "completed" status, receiving the `audioUrl`, Play.ai `jobId`, and original context (like the internal `episodeId` and the list of source HN posts with their metadata and processing status).
- Use `axios` to download the audio from `audioUrl`.
- Upload the audio to the S3 bucket (from Story 2.1), using a key such as `YYYY/MM/DD/episodeId.mp3`.
- Prepare `Episode` metadata for the `BmadDailyDigestEpisodes` table: `episodeId` (UUID), `publicationDate` (YYYY-MM-DD), `episodeNumber` (sequential logic, TBD), `podcastGeneratedTitle` (from Play.ai or constructed), `audioS3Bucket`, `audioS3Key`, `playAiJobId`, `playAiSourceAudioUrl`, `sourceHNPosts` (array of objects: `{ hnPostId, title, originalArticleUrl, hnLink, isUpdateStatus, oldRank, articleScrapingFailed }`), `status: "Published"`, `createdAt`, `updatedAt`.
- For each `hnPostId` in `sourceHNPosts`, update its corresponding item in the `HackerNewsPostProcessState` table with the `lastCommentFetchTimestamp` (current time or max comment time from this run), `lastProcessedDate` (current date), and `lastKnownRank`. If `articleScrapingFailed` was false for this run, update `lastSuccessfullyScrapedTimestamp`.
- Save the `Episode` metadata to the `BmadDailyDigestEpisodes` DynamoDB table.
- Error handling for the download, S3 upload, and DDB writes; failure should result in episode `status: "Failed"`.
- Unit tests (Jest) mock `axios` and the AWS SDK (S3, DynamoDB); verify data handling, storage, metadata construction for both tables, and errors. All tests must pass.
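The core download-and-store path might look like the sketch below, using AWS SDK v3 clients; bucket and table names come from assumed environment variables, and only a subset of the `Episode` metadata is shown.

```typescript
// src/features/podcast/podcastStorageService.ts - illustrative download/store sketch.
import axios from "axios";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb";
import { randomUUID } from "node:crypto";

const s3 = new S3Client({});
const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export async function storeCompletedEpisode(audioUrl: string, playAiJobId: string, title: string) {
  // 1. Download the generated audio as a binary buffer.
  const audio = await axios.get<ArrayBuffer>(audioUrl, { responseType: "arraybuffer" });

  // 2. Upload it to the private audio bucket under a date-based key.
  const episodeId = randomUUID();
  const publicationDate = new Date().toISOString().slice(0, 10); // YYYY-MM-DD
  const audioS3Key = `${publicationDate.replace(/-/g, "/")}/${episodeId}.mp3`;
  await s3.send(new PutObjectCommand({
    Bucket: process.env.AUDIO_BUCKET_NAME,
    Key: audioS3Key,
    Body: Buffer.from(audio.data),
    ContentType: "audio/mpeg",
  }));

  // 3. Persist episode metadata (subset shown; sourceHNPosts etc. added in the real service).
  await ddb.send(new PutCommand({
    TableName: process.env.EPISODES_TABLE_NAME,
    Item: {
      episodeId,
      publicationDate,
      podcastGeneratedTitle: title,
      audioS3Bucket: process.env.AUDIO_BUCKET_NAME,
      audioS3Key,
      playAiJobId,
      playAiSourceAudioUrl: audioUrl,
      status: "Published",
      createdAt: new Date().toISOString(),
      updatedAt: new Date().toISOString(),
    },
  }));

  return { episodeId, audioS3Key };
}
```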
Story 2.9: Daily Workflow Orchestration & Scheduling
- User Story Statement: As the System Administrator, I need the entire daily backend workflow (Stories 2.2 through 2.8) to be fully orchestrated by the primary AWS Step Function state machine and automatically scheduled to run once per day using Amazon EventBridge Scheduler, ensuring it handles re-runs for the same day by overwriting/starting over (for MVP), so that "BMad Daily Digest" episodes are produced consistently and reliably.
- Acceptance Criteria (ACs):
- The primary AWS Step Function state machine must orchestrate the sequence: Fetch HN Posts & Identify Repeats (2.2); for each post, conditionally Scrape Article (2.3) & Fetch Comments (2.4); then Aggregate & Format Content (2.5); then Submit to Play.ai & get `jobId` (2.7); then initiate/manage Polling (2.6 using `jobId`); on "completed" polling, trigger Retrieve & Store Audio/Metadata (2.8).
- The state machine must manage data flow (inputs/outputs) between steps correctly.
- Overall workflow error handling: critical step failure marks state machine execution as "Failed" and logs comprehensively. Steps use retries for transient errors.
- Idempotency (MVP): re-running for the same `publicationDate` must re-process and effectively overwrite previous data for that date.
- An Amazon EventBridge Scheduler rule (CDK defined) must trigger the main Step Function daily at 12:00 UTC (default, configurable via `DAILY_JOB_SCHEDULE_UTC_CRON`).
- A successful end-to-end run must be demonstrated (e.g., processing sample data through the pipeline).
- Step Function execution history must provide a clear audit trail of steps and data.
- Unit tests for any new orchestrator-specific Lambda functions (if any not covered). All tests must pass.
Epic 3: Web Application MVP & Podcast Consumption
Goal: To set up the frontend project in its dedicated repository and develop and deploy the Next.js frontend application MVP, enabling users to consume the "BMad Daily Digest." This includes initial project setup (AI-assisted UI kickstart from bmad-daily-digest-ui scaffold), pages for listing and detailing episodes, an about page, and deployment.
User Stories for Epic 3:
Story 3.1: Frontend Project Repository & Initial UI Setup (AI-Assisted)
- User Story Statement: As a Developer, I need to establish the `bmad-daily-digest-frontend` Git repository with a new Next.js (TypeScript, Node.js 22) project, using the provided `bmad-daily-digest-ui` V0 scaffold as the base. This setup must include all foundational tooling (ESLint, Prettier, Jest with React Testing Library, a basic CI stub) and an initial AWS CDK application structure, ensuring the "80s retro CRT terminal" aesthetic (with Tailwind CSS and shadcn/ui) is operational, so that a high-quality, styled, and standardized frontend development environment is ready.
- Acceptance Criteria (ACs):
- A new, private Git repository `bmad-daily-digest-frontend` must be created on GitHub.
- The `bmad-daily-digest-ui` V0 scaffold project files must be used as the initial codebase in this repository.
- `package.json` must be updated (project name, version, description).
- Project dependencies must be installable.
- TypeScript (`tsconfig.json`), Next.js (`next.config.mjs`), Tailwind (`tailwind.config.ts`), ESLint, Prettier, and Jest configurations from the scaffold must be verified and operational.
- The application must build successfully (`npm run build`) with the scaffolded UI.
- A basic CI pipeline stub (GitHub Actions) for lint, format check, test, and build must be created.
- A standard `.gitignore` and an updated `README.md` must be present.
- An initial AWS CDK application structure must be created within a `cdk/` directory in this repository, ready for defining frontend-specific infrastructure (S3, CloudFront in Story 3.6).
Story 3.2: Frontend API Service Layer for Backend Communication
- User Story Statement: As a Frontend Developer, I need a dedicated and well-typed API service layer (e.g., `lib/api-client.ts`) within the Next.js frontend application to manage all HTTP communication with the "BMad Daily Digest" backend API (for fetching episode lists and specific episode details), so that UI components can cleanly and securely consume backend data with robust error handling.
- Acceptance Criteria (ACs):
- A TypeScript module `lib/api-client.ts` (or similar) must encapsulate backend API interactions (sketched below).
- Functions must exist for: `getEpisodes(): Promise<EpisodeListItem[]>` and `getEpisodeDetails(episodeId: string): Promise<EpisodeDetail | null>`.
- `axios` (or native `fetch` with a wrapper, if preferred for the frontend) must be used for HTTP requests.
- The backend API base URL (`NEXT_PUBLIC_BACKEND_API_URL`) and Frontend Read API Key (`NEXT_PUBLIC_FRONTEND_API_KEY`) must be configurable via public environment variables and used in requests.
- TypeScript interfaces (`EpisodeListItem`, `EpisodeDetail`, `SourceHNPostDetail` from `lib/types.ts`) for API response data must be defined/used, matching the backend API.
- API functions must correctly parse JSON responses and transform data into the defined interfaces.
- Error handling (network errors, non-2xx responses from the backend) must be implemented, providing clear error information/objects.
- Unit tests (Jest) must mock the HTTP client and verify API calls, data parsing/transformation, and error handling. All tests must pass.
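A sketch of the client follows, using native `fetch` with the public environment variables named above; the `/episodes` and `/episodes/{episodeId}` route paths and the response shapes are assumptions to be confirmed against the backend API.

```typescript
// lib/api-client.ts - illustrative frontend API service layer.
import type { EpisodeDetail, EpisodeListItem } from "./types";

const BASE_URL = process.env.NEXT_PUBLIC_BACKEND_API_URL ?? "";
const API_KEY = process.env.NEXT_PUBLIC_FRONTEND_API_KEY ?? "";

async function request<T>(path: string): Promise<T> {
  const res = await fetch(`${BASE_URL}${path}`, {
    headers: { "x-api-key": API_KEY },
  });
  if (!res.ok) {
    throw new Error(`Backend API error ${res.status} for ${path}`);
  }
  return (await res.json()) as T;
}

export function getEpisodes(): Promise<EpisodeListItem[]> {
  return request<EpisodeListItem[]>("/episodes");
}

export async function getEpisodeDetails(episodeId: string): Promise<EpisodeDetail | null> {
  try {
    return await request<EpisodeDetail>(`/episodes/${episodeId}`);
  } catch {
    // The detail page treats a missing or failed episode lookup as "not found".
    return null;
  }
}
```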
Story 3.3: Episode List Page Implementation
- User Story Statement: As a Busy Tech Executive, I want to view a responsive "Episode List Page" (based on `app/(pages)/episodes/page.tsx` from the scaffold) that clearly displays all available "BMad Daily Digest" episodes in reverse chronological order, showing the episode number, publication date, and podcast title for each, using themed components like `episode-card.tsx`, so that I can quickly find and select an episode.
- Acceptance Criteria (ACs):
- The existing `app/(pages)/episodes/page.tsx` (or equivalent main list page from the scaffold) must be updated.
- It must use the API service layer (Story 3.2) to fetch episodes.
- A themed loading state (e.g., using `loading-state.tsx`) must be shown.
- A themed error message (e.g., using `error-state.tsx`) must be shown if fetching fails.
- A "No episodes available yet" message must be shown for an empty list.
- Episodes must be listed in reverse chronological order.
- Each list item, potentially using a modified `episode-card.tsx` component, must display "Episode [EpisodeNumber]: [PublicationDate] - [PodcastGeneratedTitle]".
- Each item must link to the Episode Detail Page for that episode using its `episodeId`.
- Styling must adhere to the "80s retro CRT terminal" aesthetic.
- The page must be responsive.
- Unit/integration tests (Jest with RTL) must cover all states, data display, order, and navigation. All tests must pass.
Story 3.4: Episode Detail Page Implementation
- User Story Statement: As a Busy Tech Executive, after selecting an episode, I want to navigate to a responsive "Episode Detail Page" (based on `app/(pages)/episodes/[episodeId]/page.tsx` from the scaffold) that features an embedded HTML5 audio player, displays the episode title/date/number and a list of the Hacker News stories covered (using components like `story-item.tsx`), and provides clear links to the original articles and HN discussions, so I can listen and explore sources.
- Acceptance Criteria (ACs):
- The dynamic route page `app/(pages)/episodes/[episodeId]/page.tsx` must be implemented.
- It must accept `episodeId` from the URL.
- It must use the API service layer (Story 3.2) to fetch episode details.
- Loading and error states must be handled and displayed with themed components.
- If data is found, it must display: `podcastGeneratedTitle`, `publicationDate`, `episodeNumber`.
- An embedded HTML5 audio player (`<audio controls>`) must play the podcast using the public `audioUrl` from the episode details.
- A list of the included Hacker News stories (from `sourceHNPosts`) must be displayed, potentially using a `story-item.tsx` component for each.
- For each HN story: its title, a link to `originalArticleUrl` (new tab), and a link to `hnLink` (new tab) must be displayed.
- Styling must adhere to the "80s retro CRT terminal" aesthetic.
- The page must be responsive.
- Unit/integration tests (Jest with RTL) must cover all states, rendering of details, the player, and links. All tests must pass.
Story 3.5: "About" Page Implementation
- User Story Statement: As a User, I want to access a minimalist, responsive "About Page" (based on `app/(pages)/about/page.tsx` from the scaffold) that clearly explains "BMad Daily Digest," its purpose, and how it works, styled consistently, so I can understand the service.
- Acceptance Criteria (ACs):
- The `app/(pages)/about/page.tsx` component must be implemented.
- It must display static informational content (Placeholder: "BMad Daily Digest provides a daily audio summary of top Hacker News discussions for busy tech professionals, generated using AI. Our mission is to keep you informed, efficiently. All content is curated and processed to deliver key insights in an easily digestible audio format, presented with a unique retro-tech vibe.").
- Styling must adhere to the "80s retro CRT terminal" aesthetic.
- The page must be responsive.
- A link to the "About Page" must be accessible from site navigation (e.g., via `header.tsx` or `footer.tsx`).
- Unit tests (Jest with RTL) for rendering the static content. All tests must pass.
Story 3.6: Frontend Deployment to S3 & CloudFront via CDK
- User Story Statement: As a Developer, I need the Next.js frontend application to be configured for static export (or an equivalent static-first deployment model) and have its AWS infrastructure (S3 for hosting, CloudFront for CDN and HTTPS) defined and managed via its own AWS CDK application within the frontend repository. This setup should automate the build and deployment of the static site, making the "BMad Daily Digest" web application publicly accessible, performant, and cost-effective.
- Acceptance Criteria (ACs):
- Next.js app must be configured for static export suitable for S3/CloudFront.
- The AWS CDK app within `bmad-daily-digest-frontend/cdk/` (from Story 3.1) must define the S3 bucket and CloudFront distribution (see the sketch below).
- The CDK stack must define: an S3 bucket (static web hosting) and a CloudFront distribution (S3 origin, HTTPS via the default CloudFront domain or an ACM cert for a custom domain if specified for MVP, caching, OAC/OAI).
- A `package.json` build script must generate the static output.
- The CDK deployment process (`cdk deploy`, run via CI or manually for MVP) must include steps/hooks to build the Next.js app and sync static files to S3.
- The application must be accessible via its CloudFront URL.
- All MVP functionalities must be operational on the deployed site.
- HTTPS must be enforced.
- CDK code must meet project standards.
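A minimal sketch of the hosting stack follows, assuming the Next.js static export lands in an `out/` directory and a recent aws-cdk-lib that ships `S3BucketOrigin`; construct IDs and paths are illustrative, and custom-domain/ACM wiring is omitted.

```typescript
// cdk/lib/bmad-daily-digest-frontend-stack.ts - illustrative static hosting sketch.
import * as cdk from "aws-cdk-lib";
import { Construct } from "constructs";
import * as s3 from "aws-cdk-lib/aws-s3";
import * as cloudfront from "aws-cdk-lib/aws-cloudfront";
import * as origins from "aws-cdk-lib/aws-cloudfront-origins";
import * as s3deploy from "aws-cdk-lib/aws-s3-deployment";

export class BmadDailyDigestFrontendStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Private bucket for the exported static site; CloudFront is the only reader.
    const siteBucket = new s3.Bucket(this, "SiteBucket", {
      blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
    });

    const distribution = new cloudfront.Distribution(this, "SiteDistribution", {
      defaultRootObject: "index.html",
      defaultBehavior: {
        origin: origins.S3BucketOrigin.withOriginAccessControl(siteBucket),
        viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
      },
    });

    // Sync the Next.js static export (out/) into the bucket and invalidate the CDN.
    new s3deploy.BucketDeployment(this, "DeploySite", {
      sources: [s3deploy.Source.asset("../out")],
      destinationBucket: siteBucket,
      distribution,
      distributionPaths: ["/*"],
    });

    new cdk.CfnOutput(this, "SiteUrl", { value: `https://${distribution.domainName}` });
  }
}
```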
7. Key Reference Documents
- Product Requirements Document (PRD) - BMad Daily Digest (This Document, v0.2)
- UI/UX Specification - BMad Daily Digest (v0.1)
- System Architecture Document - BMad Daily Digest (v0.1)
- Frontend Architecture Document - BMad Daily Digest (v0.1)
- Algolia Hacker News Search API Documentation (https://hn.algolia.com/api)
- Play.ai PlayNote API Documentation (https://docs.play.ai/api-reference/playnote/post)
8. Out of Scope Ideas Post MVP
- Advanced Audio Player Functionality (skip +/-, speed control, playback position memory).
- User Accounts & Personalization (account creation, email subscription management, customizable podcast hosts).
- Enhanced Content Delivery & Discovery (Daily Email Summary, Full RSS Feed, Full Podcast Transcription, Search Functionality).
- Expanded Content Sources (beyond Hacker News).
- Community & Feedback (In-app feedback mechanisms).
9. Change Log
| Change | Date | Version | Description | Author |
|---|---|---|---|---|
| Initial PRD draft and MVP scope definition. | May 20, 2025 | 0.1 | Created initial PRD based on Project Brief and discussions on goals, requirements, and Epics/Stories (shells). | John (PM) & User |
| Architectural refinements incorporated into Story ACs. | May 20, 2025 | 0.2 | Updated ACs for Stories 2.1 and 3.1 based on System Architecture Document feedback from Fred (Architect). | Sarah (PO) & User |