Files
BMAD-METHOD/BETA-V3/sample/prd-massive.md
Brian Madison 13c752e3b1 analyst and pm
2025-05-11 12:28:41 -05:00


BMad News DiCaster Product Requirements Document (PRD)

Goal, Objective and Context

BMad News DiCaster is a web application that generates daily podcasts and newsletters summarizing the top 10 Hacker News stories. The primary goal is to provide a way for individuals to efficiently keep up with Hacker News content. The application will be built using Next.js, Supabase, and Vercel.

Functional Requirements (MVP)

  • Content Sourcing:
    • Automated fetching of top Hacker News stories, configurable for time/frequency and triggerable manually via CLI.
      • Clarification: The fetching schedule should be configurable and ideally read from the database.
  • Content Scraping:
    • Scraping linked article content, attempting to retrieve up to MAX_NUMBER posts in order to produce NEWSLETTER_ITEM_COUNT summarized articles.
    • Scraped article content and retrieved comments should be saved in connection with the HN post.
      • Clarification: Scraper should retrieve up to MAX_NUMBER posts to ensure we can summarize NEWSLETTER_ITEM_COUNT articles. More advanced scraping to be considered post-MVP.
    • Error Handling: If scraping fails for an article, the system should proceed to the next article. If the required NEWSLETTER_ITEM_COUNT cannot be reached after scraping MAX_NUMBER posts, the system will use the available successful scrapes and include a summary of the comment thread for the articles that failed to scrape.
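The fallback rule above can be sketched as a small selection function. This is a minimal sketch: the `ScrapedItem` shape and its field names are illustrative assumptions, not the actual schema.

```typescript
// Hypothetical item shape; field names are illustrative, not the real schema.
interface ScrapedItem {
  hnPostId: string;
  articleText: string | null; // null when scraping this article failed
}

// Given up to MAX_NUMBER scrape attempts, choose NEWSLETTER_ITEM_COUNT items,
// preferring successful scrapes and falling back to comment-summary-only
// entries for articles that failed to scrape.
function selectNewsletterItems(
  attempts: ScrapedItem[],
  itemCount: number,
): { item: ScrapedItem; commentOnly: boolean }[] {
  const successes = attempts.filter((a) => a.articleText !== null);
  const failures = attempts.filter((a) => a.articleText === null);
  return [
    ...successes.map((item) => ({ item, commentOnly: false })),
    ...failures.map((item) => ({ item, commentOnly: true })),
  ].slice(0, itemCount);
}
```

Keeping this selection step pure (no I/O) makes the error-handling behavior easy to unit test independently of the scraper itself.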
  • Content Summarization:
    • LLM summarization of articles (approximately 2 paragraphs) and comments (approximately 2 paragraphs), with configurable local/remote LLM selection (URL, API key, model).
    • Summaries of articles and comments should be saved.
    • Prompts and newsletter templates should be stored in the database for easy updating.
    • A setting should define the maximum number of comments to pull and summarize.
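One way the configurable local/remote LLM selection could look in practice: a sketch that builds an OpenAI-compatible chat request from a config object. The `LlmConfig` shape is an assumption, as is the use of the OpenAI-compatible request format (which happens to cover both hosted providers and a local Ollama server).

```typescript
// Hypothetical config shape; values would come from environment variables or
// a settings row in Supabase.
interface LlmConfig {
  baseUrl: string; // e.g. "http://localhost:11434/v1" for local Ollama
  apiKey: string;
  model: string;
}

// Build the request for one summarization call; switching between local and
// remote LLMs then becomes a pure configuration change.
function buildSummarizationRequest(cfg: LlmConfig, prompt: string, text: string) {
  return {
    url: `${cfg.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${cfg.apiKey}`,
    },
    body: {
      model: cfg.model,
      messages: [
        { role: "system", content: prompt }, // prompt text loaded from the DB
        { role: "user", content: text },
      ],
    },
  };
}
```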
  • Data Storage:
    • Storage of all data in Supabase (local and cloud-hosted), including:
      • HN posts and associated scraped article content and comments.
      • Summaries of articles and comments.
      • Webhook responses from Play.ai.
  • Audio Generation:
    • Integration with Play.ai PlayNote API, with voice, quality, and tone parameters to be determined during development.
    • Webhook response indicating generation completion should be saved.
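A minimal sketch of the completion webhook, in Next.js App Router style (e.g. a hypothetical `app/api/playai-webhook/route.ts`). The `audioUrl` payload field is an assumption and must be checked against Play.ai's actual PlayNote webhook schema.

```typescript
// Pure helper so payload parsing can be unit tested without an HTTP request.
// The `audioUrl` field name is an assumed example, not the confirmed schema.
export function extractAudioUrl(payload: unknown): string | null {
  if (typeof payload !== "object" || payload === null) return null;
  const url = (payload as Record<string, unknown>)["audioUrl"];
  return typeof url === "string" && url.startsWith("http") ? url : null;
}

// Next.js App Router webhook handler sketch.
export async function POST(req: Request): Promise<Response> {
  const payload = await req.json();
  const audioUrl = extractAudioUrl(payload);
  if (!audioUrl) return new Response("missing audio URL", { status: 400 });
  // TODO: persist the raw payload and audioUrl to Supabase, keyed to the
  // newsletter that requested this generation.
  return new Response("ok", { status: 200 });
}
```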
  • Content Generation Workflow:
    • Automated daily process with incremental saving of assets at each stage of the pipeline.
    • CLI tool for on-demand generation.
  • Web Interface:
    • Single unauthenticated page listing newsletter/podcast titles, date/time, and links to detail pages.
    • Detail page displaying the newsletter and embedded audio player.
  • Newsletter Content:
    • The newsletter should be visually appealing and include:
      • Article summaries.
      • Comment summaries.
      • Hacker News post title.
      • Hacker News post upvote count.
      • Hacker News post date.
      • Link to the Hacker News post.
      • Link to the article.
  • User Authentication:
    • Moved to Post-MVP.
  • User Profile:
    • Moved to Post-MVP.
  • Email Dispatch:
    • Automated daily email dispatch to a manually maintained list of subscribed users.
      • Clarification: User subscription management (add/remove) will be done directly by the admin in the database for the MVP.

Non-Functional Requirements (MVP)

  • Performance:
    • The system should efficiently generate and deliver daily summaries within a defined time window.
    • LLM processing time should be minimized to avoid delays.
    • The web interface should load quickly and provide a responsive user experience.
  • Scalability:
    • The system should be able to handle a growing number of users and summaries.
  • Reliability:
    • The daily content generation process should be reliable and fault-tolerant.
    • The system should handle potential issues with external APIs (Hacker News, LLM, Play.ai) gracefully.
  • Security:
    • Data should be stored securely in Supabase.
    • Appropriate security measures should be in place to protect against unauthorized access.
  • Development and Deployment:
    • The system should support both local development (with local Supabase and LLM) and remote deployment on Vercel.
    • The content generation process should be deployable as a pipeline of serverless functions on Vercel.
  • Logging and Monitoring:
    • The system should log errors and successful completion of pipeline stages.
    • Vercel's logging and monitoring capabilities should be utilized.
  • Error Handling:
    • If scraping fails for an article, the system should proceed to the next article.
    • If the required NEWSLETTER_ITEM_COUNT cannot be reached after scraping MAX_NUMBER posts, the system will use the available successful scrapes and include a summary of the comment thread for the articles that failed to scrape.

User Interaction and Design Goals

  • Overall UX Goals & Principles:
    • Target User Personas: Tech-savvy individuals interested in Hacker News.
    • Usability Goals:
      • Ease of finding daily summaries.
      • Efficient access to both text and audio versions.
      • Clear presentation of information.
    • Design Principles:
      • Clarity: Prioritize clear presentation of information.
      • Accessibility: Ensure content is accessible to all users.
      • Responsiveness: The interface should work well on various screen sizes.
      • Modern Aesthetic: Implement a synthwave-inspired, dark, glowing, and minimalist design.
  • Information Architecture (IA):
    • Two pages:
      • List Page: Displays a list of summaries with titles, dates, and links to detail pages.
      • Detail Page: Shows the full newsletter content and an embedded audio player.
  • User Flows:
    • View Summary List: User navigates to the list page and browses available summaries.
    • View Summary Detail: User clicks on a summary to view the detail page with the text and audio.
  • UI Elements:
    • List Page:
      • List of newsletter titles with dates and times.
      • Links to detail pages.
    • Detail Page:
      • Newsletter content (article and comment summaries, HN post details).
      • Embedded audio player.
  • Technology Stack:
    • shadcn/ui and Tailwind CSS will be used for UI development.
  • Design Considerations:
    • Visual appeal of the newsletter (as mentioned in functional requirements).
    • Clear display of HN post details (title, upvotes, date, links).
    • Mobile-friendly layout.
    • Synthwave-inspired, dark, glowing, and minimalist aesthetic.

Technical Assumptions

  • Core Stack: Next.js, Supabase, Vercel (using the starter template from https://vercel.com/templates/authentication/supabase and its current versions).
  • Hosting: Vercel Pro tier.
  • Content Fetching: the HN Algolia API (hn.algolia.com).
  • Content Scraping: Cheerio.
  • LLM:
    • Local LLMs (e.g., Ollama) for local development.
    • API-based LLMs (e.g., OpenAI, Anthropic) for production/local.
    • LLM configuration via API keys and URLs.
  • Audio Generation: Play.ai PlayNote API.
  • Local Development:
    • Local Supabase instance in Docker.
    • CLI for on-demand content generation.
  • Architecture:
    • Serverless functions on Vercel.
    • Use of facades for external library interactions to facilitate unit testing and library swapping.
    • Use of a factory pattern for scraper implementation to support adding new scrapers.
  • Data Persistence: All data stored in Supabase (local and cloud).
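The facade and factory assumptions above can be sketched as follows. The interface names and the `Article` shape are illustrative, not a committed API.

```typescript
// Illustrative types; names are assumptions, not the project's actual API.
interface Article {
  title: string;
  text: string;
}

// Facade: the pipeline only ever sees this interface, so Cheerio (or any
// future scraping library) stays swappable and easy to mock in unit tests.
interface ScraperFacade {
  canHandle(url: string): boolean;
  scrape(url: string): Promise<Article | null>;
}

const registry: ScraperFacade[] = [];

function registerScraper(scraper: ScraperFacade): void {
  registry.push(scraper);
}

// Factory: return the first registered scraper that claims the URL, so new
// site-specific scrapers can be added without touching the pipeline code.
function scraperFor(url: string): ScraperFacade | null {
  return registry.find((s) => s.canHandle(url)) ?? null;
}
```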

Testing Requirements

  • Unit Testing:
    • Individual components and functions should be unit tested to ensure they behave as expected.
    • This includes testing the scraper, LLM summarization logic, data storage interactions, etc.
    • Jest should be used as the unit testing framework.
  • Integration Testing:
    • Integration tests should verify the interactions between different components.
    • For example, testing the integration between the Hacker News data fetching and the article scraping, or the integration between the LLM summarization and the audio generation.
  • End-to-End (E2E) Testing:
    • E2E tests should simulate user flows and verify the overall functionality of the application.
    • This could include testing the content generation workflow from start to finish, or testing the display of summaries in the web interface.
    • A dedicated E2E framework such as Playwright or Cypress should be used for E2E testing; React Testing Library (RTL) is better suited to component-level tests and should be used for those.
  • API Testing:
    • The APIs used for fetching data, LLM interaction, and audio generation should be tested to ensure they are functioning correctly and returning the expected data.
  • Local Testing:
    • The CLI tool for on-demand content generation should be thoroughly tested in the local development environment.
    • Local testing should also include verifying the local Supabase and LLM integration.
  • Deployment Testing:
    • Testing in the Vercel environment should ensure that the application functions correctly after deployment.
    • This includes testing the serverless function pipeline, webhooks, and any Vercel-specific configurations.

Epic Overview (MVP / Current Version)

  • Epic 1: Project Setup and Initial UI
    • Goal: Deploy the starter template with an initial, generated UI and configure the project.
    • Story 1.1: As a developer, I want to set up the project using the Supabase starter template so that I have a foundation to build upon.
      • Acceptance Criteria:
        • The Supabase starter template is successfully initialized.
        • The project directory is structured as defined by the template.
        • The necessary Supabase client libraries are installed.
    • Story 1.2: As a developer, I want to configure the project's dependencies and environment variables so that I can run the application locally.
      • Acceptance Criteria:
        • All project dependencies are installed.
        • Environment variables are configured for local development.
        • The application can be run locally without errors.
    • Story 1.3: As a developer, I want to deploy the starter template to Vercel so that the application is accessible online.
      • Acceptance Criteria:
        • The project is successfully deployed to Vercel.
        • The deployed application is accessible via a Vercel-provided URL.
        • Environment variables are configured for the Vercel environment.
    • Story 1.4: As a developer, I want to set up CI/CD so that changes to the codebase are automatically deployed.
      • Acceptance Criteria:
        • A CI/CD pipeline is set up (e.g., using Vercel's Git integration).
        • Changes to the main branch trigger automatic deployment to Vercel.
        • The deployment process is automated.
    • Story 1.5: As a developer, I want to generate an initial UI with placeholder content for the list and detail pages using a UI generation tool, and style it.
      • Acceptance Criteria:
        • A UI generation tool (e.g., V0 or lovable.ai) is used to create the initial structure and styling of the web interface.
        • The generated UI includes placeholder content for the list page (titles, dates, links) and detail page (newsletter content, audio player).
        • The UI is styled using shadcn/ui and Tailwind CSS with a synthwave-inspired, dark, glowing, and minimalist aesthetic.
        • The UI is designed for a single large desktop layout.
  • Epic 2: Hacker News Content Retrieval and Scraping
    • Goal: Implement the functionality to fetch Hacker News stories and scrape the content from the linked websites.
    • Story 2.1: As a developer, I want to fetch the top Hacker News stories using the HN Algolia API so that I can retrieve the data needed for the newsletter.
      • Acceptance Criteria:
        • An HN Algolia API client is successfully integrated into the project.
        • The system can fetch the specified number of top Hacker News stories.
        • The fetched data includes the necessary fields (e.g., title, URL, HN post ID).
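A sketch of the fetch step against the public HN Algolia search API. The endpoint and the `front_page` tag are documented features of that API, but the exact response fields should be re-verified during implementation.

```typescript
// Build the HN Algolia search URL for the current front-page stories.
function topStoriesUrl(count: number): string {
  return `https://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=${count}`;
}

// Fetch and normalize the stories; hit fields (objectID, title, url, points)
// follow the Algolia API's documented response shape.
async function fetchTopStories(count: number) {
  const res = await fetch(topStoriesUrl(count));
  if (!res.ok) throw new Error(`HN Algolia request failed: ${res.status}`);
  const data = (await res.json()) as { hits: any[] };
  return data.hits.map((h) => ({
    hnPostId: h.objectID as string,
    title: h.title as string,
    url: h.url as string | null, // Ask HN / Show HN posts may have no URL
    points: h.points as number,
  }));
}
```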
    • Story 2.2: As a developer, I want to implement a scraper to extract article content from the URLs provided by Hacker News so that I can obtain the article text for summarization.
      • Acceptance Criteria:
        • A scraper is implemented using Cheerio.
        • The scraper can extract the main content from articles across different websites.
        • The scraper handles potential issues like missing content or different website structures gracefully (e.g., logs errors and continues).
    • Story 2.3: As a developer, I want to save the fetched Hacker News data and scraped article content so that it can be used in subsequent steps.
      • Acceptance Criteria:
        • The fetched Hacker News data is saved in the database, including relevant details.
        • The scraped article content is saved in the database, associated with the corresponding Hacker News post.
    • Story 2.4: As a developer, I want to configure the number of top Hacker News stories to fetch and the maximum number of articles to scrape so that these parameters can be adjusted as needed.
      • Acceptance Criteria:
        • Configuration options are implemented for:
          • The number of top Hacker News stories to fetch (NEWSLETTER_ITEM_COUNT).
          • The maximum number of articles to scrape (MAX_NUMBER).
        • These configuration options can be easily modified (e.g., via environment variables or a configuration file).
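One way to satisfy Story 2.4: read the tunable counts from environment variables with sane fallbacks. The default values below are assumptions, since the PRD does not fix them.

```typescript
// Parse an integer environment variable, falling back to a default when the
// variable is unset or malformed. The defaults here are assumptions.
function intFromEnv(
  name: string,
  fallback: number,
  env: Record<string, string | undefined> = process.env,
): number {
  const raw = env[name];
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

const NEWSLETTER_ITEM_COUNT = intFromEnv("NEWSLETTER_ITEM_COUNT", 10);
const MAX_NUMBER = intFromEnv("MAX_NUMBER", 20);
```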
  • Epic 3: LLM Summarization
    • Goal: Implement the LLM-powered summarization of articles and comments.
    • Story 3.1: As a developer, I want to integrate an LLM API for text summarization so that I can generate concise summaries of articles and comments.
      • Acceptance Criteria:
        • The chosen LLM API is successfully integrated into the project.
        • The system can send text to the LLM API and receive summaries.
    • Story 3.2: As a developer, I want to implement the logic to summarize article content so that I can provide users with a quick overview of the main points.
      • Acceptance Criteria:
        • The logic for summarizing article content is implemented.
        • The system can extract relevant text from the scraped article content and provide it to the LLM API.
        • The generated summaries are concise (approximately 2 paragraphs) and capture the main points of the article.
    • Story 3.3: As a developer, I want to implement the logic to summarize comments on Hacker News posts so that I can capture the main discussion points.
      • Acceptance Criteria:
        • The logic for summarizing Hacker News comments is implemented.
        • The system can retrieve comments associated with an HN post and provide them to the LLM API.
        • The generated summaries are concise (approximately 2 paragraphs) and capture the main discussion points.
    • Story 3.4: As a developer, I want to store the generated summaries in the database, associated with the corresponding articles and HN posts, so that they can be used in the newsletter.
      • Acceptance Criteria:
        • The generated article summaries are stored in the database, associated with the corresponding articles.
        • The generated comment summaries are stored in the database, associated with the corresponding HN posts.
    • Story 3.5: As a developer, I want to make the LLM API endpoint, model, and API key configurable so that I can easily switch between different LLM providers or models.
      • Acceptance Criteria:
        • The LLM API endpoint, model, and API key are configurable via environment variables or a configuration file.
        • The system can switch between different LLM providers or models by changing the configuration.
    • Story 3.6: As a developer, I want to store the summarization prompts in the database so that they can be easily updated without requiring code changes.
      • Acceptance Criteria:
        • The summarization prompts are stored in the database.
        • The system retrieves the prompts from the database and uses them when calling the LLM API.
        • The prompts can be updated in the database without requiring code changes or redeployment.
  • Epic 4: Web Interface Implementation
    • Goal: Implement the functionality of the web interface pages.
    • Story 4.1: As a developer, I want to make the list page display the actual data.
      • Acceptance Criteria:
        • The list page displays newsletter titles and dates/times from the database.
        • Each item in the list is a link to the corresponding detail page.
        • The list is sorted by date/time.
    • Story 4.2: As a developer, I want to make the detail page display the actual newsletter content and allow navigation to and from the list page.
      • Acceptance Criteria:
        • The detail page displays the full newsletter content from the database.
        • The newsletter content includes article summaries, comment summaries, and Hacker News post details.
        • Users can navigate to the detail page by clicking on an item in the list page.
        • The detail page includes a "back to list" navigation element.
    • Story 4.3: As a developer, I want to make the audio player on the detail page play the actual podcast.
      • Acceptance Criteria:
        • The audio player on the detail page plays the podcast associated with the displayed newsletter.
  • Epic 5: Email Dispatch
    • Goal: Implement the automated email dispatch of newsletters to subscribed users.
    • Story 5.1: As a user, I want to receive a daily newsletter email so that I can stay updated on the top Hacker News stories.
      • Acceptance Criteria:
        • The system sends a newsletter email.
        • The email includes the newsletter content (article and comment summaries, HN post details).
        • The email is formatted correctly and is visually appealing.
        • The email is sent to the list of emails maintained manually in the database.
    • Story 5.2: As a developer, I want to be able to manually trigger the newsletter email sending process via a command-line interface so that I can test and initiate the sending process on demand.
      • Acceptance Criteria:
        • A CLI command is available to trigger the newsletter email sending process.
        • The command can be executed in the local development environment.
        • Executing the command sends the newsletter email.
    • Story 5.3: As a developer, I want to automate the daily sending of the newsletter email so that it is sent out regularly without manual intervention.
      • Acceptance Criteria:
        • The sending of the newsletter email is automated (e.g., using Vercel's cron jobs or similar).
        • The email is sent out daily at a specified time.
        • Question: What specific cron job capabilities does Vercel Pro support?
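For reference, Vercel cron jobs are declared in `vercel.json`; a daily schedule for this epic might look like the fragment below. The route path is a hypothetical example, and the schedule granularity and job limits available on the Pro plan should be confirmed against Vercel's cron jobs documentation.

```json
{
  "crons": [
    { "path": "/api/send-newsletter", "schedule": "0 13 * * *" }
  ]
}
```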
  • Epic 6: Podcast Generation and UI Update
    • Goal: Implement podcast generation, update the newsletter with the audio link, and update the UI with the audio player.
    • Story 6.1: As a developer, I want to integrate the Play.ai PlayNote API to generate audio versions of the newsletters.
      • Acceptance Criteria:
        • The Play.ai PlayNote API is successfully integrated into the project.
        • The system can send the newsletter text to the Play.ai API and receive a confirmation that the request was accepted.
        • The system implements a webhook endpoint to receive the generated audio URL from Play.ai.
    • Story 6.2: As a developer, I want to store the generated podcast URLs in the database, associated with the corresponding newsletters, upon receiving the webhook notification.
      • Acceptance Criteria:
        • The system can receive the audio URL via the webhook.
        • The generated podcast URLs are stored in the database, associated with the corresponding newsletters.
    • Story 6.3: As a developer, I want to update the newsletter content to include a link to the audio version, and ensure that the email is not sent until the podcast link is available.
      • Acceptance Criteria:
        • The newsletter data in the database is updated to include the audio URL.
        • The newsletter email includes a link to the audio version.
        • The system ensures that the email is not sent until the podcast URL is successfully received from Play.ai and stored in the database.
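The "hold the email until the podcast is ready" rule above can be expressed as a small pure gate that the dispatch job checks before sending. Field names are illustrative assumptions, not the actual schema.

```typescript
// Illustrative newsletter shape; field names are assumptions.
interface Newsletter {
  htmlContent: string | null;
  podcastUrl: string | null; // set by the Play.ai webhook handler
}

// Return the reason dispatch is blocked, or null when the email may be sent.
function sendBlockReason(n: Newsletter): string | null {
  if (!n.htmlContent) return "newsletter content missing";
  if (!n.podcastUrl) return "podcast URL not yet received from Play.ai";
  return null;
}
```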
    • Story 6.4: As a developer, I want to embed an audio player in the UI so that users can listen to the podcast.
      • Acceptance Criteria:
        • An audio player is embedded in the detail page of the UI.
        • The audio player can play the audio file from the generated URL.

Key Reference Documents

{Will be populated at a later time}

Out of Scope Ideas (Post-MVP)

  • User Authentication
  • User Profiles
  • Advanced scraping
  • Admin Interface
  • Flexible Scheduling & Editions
  • User Customization
  • Expanded Content Sources

Change Log

| Change Date | Version | Description | Author |
| ----------- | ------- | ----------- | ------ |