Epic 2: HN Data Acquisition & Persistence

Goal: Implement fetching of the top 10 stories and their comments (respecting the configured limits) from the Algolia HN Search API, and persist this raw data locally into the date-stamped output directory created in Epic 1. Also implement a stage testing utility for the fetching step.

Story List

Story 2.1: Implement Algolia HN API Client

  • User Story / Goal: As a developer, I want a dedicated client module to interact with the Algolia Hacker News Search API, so that fetching stories and comments is encapsulated, reusable, and uses the required native fetch API.
  • Detailed Requirements:
    • Create a new module: src/clients/algoliaHNClient.ts.
    • Implement an async function fetchTopStories within the client:
      • Use the native fetch API to call the Algolia HN Search API endpoint for front-page stories (e.g., http://hn.algolia.com/api/v1/search?tags=front_page&hitsPerPage=10). Adjust hitsPerPage if needed to ensure 10 stories.
      • Parse the JSON response.
      • Extract required metadata for each story: objectID (use as storyId), title, url (article URL), points, num_comments. Handle a potentially missing url field gracefully (log a warning; the story may be skipped later if an article URL is required).
      • Construct the hnUrl for each story (e.g., https://news.ycombinator.com/item?id={storyId}).
      • Return an array of structured story objects.
    • Implement a separate async function fetchCommentsForStory within the client:
      • Accept storyId and maxComments limit as arguments.
      • Use the native fetch API to call the Algolia HN Search API endpoint for comments of a specific story (e.g., http://hn.algolia.com/api/v1/search?tags=comment,story_{storyId}&hitsPerPage={maxComments}).
      • Parse the JSON response.
      • Extract required comment data: objectID (use as commentId), comment_text, author, created_at.
      • Filter out comments where comment_text is null or empty. Ensure only up to maxComments are returned.
      • Return an array of structured comment objects.
    • Implement basic error handling using try...catch around fetch calls and check response.ok status. Log errors using the logger utility from Epic 1.
    • Define TypeScript interfaces/types for the expected structures of API responses (stories, comments) and the data returned by the client functions (e.g., Story, Comment). A sketch of the client module follows this story's ACs.
  • Acceptance Criteria (ACs):
    • AC1: The module src/clients/algoliaHNClient.ts exists and exports fetchTopStories and fetchCommentsForStory functions.
    • AC2: Calling fetchTopStories makes a network request to the correct Algolia endpoint and returns a promise resolving to an array of 10 Story objects containing the specified metadata.
    • AC3: Calling fetchCommentsForStory with a valid storyId and maxComments limit makes a network request to the correct Algolia endpoint and returns a promise resolving to an array of Comment objects (up to maxComments), filtering out empty ones.
    • AC4: Both functions use the native fetch API internally.
    • AC5: Network errors or non-successful API responses (e.g., status 4xx, 5xx) are caught and logged using the logger.
    • AC6: Relevant TypeScript types (Story, Comment, etc.) are defined and used within the client module.
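
The following is a minimal sketch of how src/clients/algoliaHNClient.ts could satisfy these requirements. It assumes Node 18+ (for native fetch) and a logger utility exported from src/utils/logger (the exact path and API are assumptions carried over from Epic 1); the Algolia hit field names (objectID, comment_text, created_at, etc.) follow the requirements above, while everything else is illustrative rather than prescriptive.

```typescript
// src/clients/algoliaHNClient.ts -- illustrative sketch, not a reference implementation
import { logger } from '../utils/logger'; // assumed logger utility from Epic 1

export interface Comment {
  commentId: string;
  text: string;
  author: string | null;
  createdAt: string;
}

export interface Story {
  storyId: string;
  title: string;
  url: string | null;   // article URL may be missing (e.g., Ask HN posts)
  hnUrl: string;
  points: number;
  numComments: number;
  comments?: Comment[]; // populated later in the main workflow (Story 2.2)
}

const API_BASE = 'http://hn.algolia.com/api/v1/search';

export async function fetchTopStories(): Promise<Story[]> {
  try {
    const response = await fetch(`${API_BASE}?tags=front_page&hitsPerPage=10`);
    if (!response.ok) {
      logger.error(`Algolia front-page request failed with status ${response.status}`);
      return [];
    }
    const body: any = await response.json();
    return body.hits.map((hit: any): Story => {
      if (!hit.url) {
        logger.warn(`Story ${hit.objectID} has no article URL`);
      }
      return {
        storyId: hit.objectID,
        title: hit.title,
        url: hit.url ?? null,
        hnUrl: `https://news.ycombinator.com/item?id=${hit.objectID}`,
        points: hit.points,
        numComments: hit.num_comments,
      };
    });
  } catch (err) {
    logger.error(`Network error fetching top stories: ${err}`);
    return [];
  }
}

export async function fetchCommentsForStory(storyId: string, maxComments: number): Promise<Comment[]> {
  try {
    const response = await fetch(`${API_BASE}?tags=comment,story_${storyId}&hitsPerPage=${maxComments}`);
    if (!response.ok) {
      logger.error(`Comment request for story ${storyId} failed with status ${response.status}`);
      return [];
    }
    const body: any = await response.json();
    return body.hits
      .filter((hit: any) => hit.comment_text) // drop null/empty comment bodies
      .slice(0, maxComments)                  // enforce the maxComments cap
      .map((hit: any): Comment => ({
        commentId: hit.objectID,
        text: hit.comment_text,
        author: hit.author ?? null,
        createdAt: hit.created_at,
      }));
  } catch (err) {
    logger.error(`Network error fetching comments for story ${storyId}: ${err}`);
    return [];
  }
}
```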

Story 2.2: Integrate HN Data Fetching into Main Workflow

  • User Story / Goal: As a developer, I want to integrate the HN data fetching logic into the main application workflow (src/index.ts), so that running the app retrieves the top 10 stories and their comments after completing the setup from Epic 1.
  • Detailed Requirements:
    • Modify the main execution flow in src/index.ts (or a main async function called by it).
    • Import the algoliaHNClient functions.
    • Import the configuration module to access MAX_COMMENTS_PER_STORY.
    • After the Epic 1 setup (config load, logger init, output dir creation), call fetchTopStories().
    • Log the number of stories fetched.
    • Iterate through the array of fetched Story objects.
    • For each Story, call fetchCommentsForStory(), passing the story.storyId and the configured MAX_COMMENTS_PER_STORY.
    • Store the fetched comments within the corresponding Story object in memory (e.g., add a comments: Comment[] property to the Story object).
    • Log progress using the logger utility (e.g., "Fetched 10 stories.", "Fetching up to X comments for story {storyId}..."). A sketch of this integration follows this story's ACs.
  • Acceptance Criteria (ACs):
    • AC1: Running npm run dev executes Epic 1 setup steps followed by fetching stories and then comments for each story.
    • AC2: Logs clearly show the start and successful completion of fetching stories, and the start of fetching comments for each of the 10 stories.
    • AC3: The configured MAX_COMMENTS_PER_STORY value is read from config and used in the calls to fetchCommentsForStory.
    • AC4: After successful execution, story objects held in memory contain a nested array of fetched comment objects. (Can be verified via debugger or temporary logging).
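
A minimal sketch of how this integration could look in src/index.ts, assuming the client sketched under Story 2.1 and a config module that exposes MAX_COMMENTS_PER_STORY; the config and logger import paths are assumptions, and the Epic 1 setup is elided.

```typescript
// src/index.ts -- fetch integration sketch (Epic 1 setup elided)
import { fetchTopStories, fetchCommentsForStory, Story } from './clients/algoliaHNClient';
import { config } from './config';       // assumed config module exposing MAX_COMMENTS_PER_STORY
import { logger } from './utils/logger'; // assumed logger utility from Epic 1

async function main(): Promise<void> {
  // ...Epic 1 setup: load config, initialise logger, create date-stamped output directory...

  const stories: Story[] = await fetchTopStories();
  logger.info(`Fetched ${stories.length} stories.`);

  for (const story of stories) {
    logger.info(`Fetching up to ${config.MAX_COMMENTS_PER_STORY} comments for story ${story.storyId}...`);
    // Attach the comments to the story object in memory for later pipeline stages.
    story.comments = await fetchCommentsForStory(story.storyId, config.MAX_COMMENTS_PER_STORY);
  }
}

main().catch((err) => {
  logger.error(`Unhandled error in main workflow: ${err}`);
  process.exit(1);
});
```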

Story 2.3: Persist Fetched HN Data Locally

  • User Story / Goal: As a developer, I want to save the fetched HN stories (including their comments) to JSON files in the date-stamped output directory, so that the raw data is persisted locally for subsequent pipeline stages and debugging.
  • Detailed Requirements:
    • Define a consistent JSON structure for the output file content. Example: { storyId: "...", title: "...", url: "...", hnUrl: "...", points: ..., fetchedAt: "ISO_TIMESTAMP", comments: [{ commentId: "...", text: "...", author: "...", createdAt: "ISO_TIMESTAMP", ... }, ...] }. Include a timestamp for when the data was fetched.
    • Import Node.js fs (specifically fs.writeFileSync) and path modules.
    • In the main workflow (src/index.ts), within the loop iterating through stories (after comments have been fetched and added to the story object in Story 2.2):
      • Get the full path to the date-stamped output directory (determined in Epic 1).
      • Construct the filename for the story's data: {storyId}_data.json.
      • Construct the full file path using path.join().
      • Serialize the complete story object (including comments and fetch timestamp) to a JSON string using JSON.stringify(storyObject, null, 2) for readability.
      • Write the JSON string to the file using fs.writeFileSync(). Use a try...catch block for error handling.
    • Log (using the logger) the successful persistence of each story's data file or any errors encountered during file writing. A sketch of this persistence step follows this story's ACs.
  • Acceptance Criteria (ACs):
    • AC1: After running npm run dev, the date-stamped output directory (e.g., ./output/YYYY-MM-DD/) contains exactly 10 files named {storyId}_data.json.
    • AC2: Each JSON file contains valid JSON representing a single story object, including its metadata, fetch timestamp, and an array of its fetched comments, matching the defined structure.
    • AC3: The number of comments in each file's comments array does not exceed MAX_COMMENTS_PER_STORY.
    • AC4: Logs indicate that saving data to a file was attempted for each story, reporting success or specific file writing errors.
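
One way the persistence step could look, as a sketch to be called inside the story loop from Story 2.2; the outputDir value is assumed to come from the Epic 1 setup, and the Story type and logger are the ones sketched under the earlier stories.

```typescript
// src/index.ts -- persistence step sketch (invoked per story, after comments are attached)
import * as fs from 'fs';
import * as path from 'path';
import { Story } from './clients/algoliaHNClient';
import { logger } from './utils/logger'; // assumed logger utility from Epic 1

function persistStory(outputDir: string, story: Story): void {
  // Add a fetch timestamp and serialise the full story (metadata + comments) as readable JSON.
  const record = { ...story, fetchedAt: new Date().toISOString() };
  const filePath = path.join(outputDir, `${story.storyId}_data.json`);
  try {
    fs.writeFileSync(filePath, JSON.stringify(record, null, 2));
    logger.info(`Saved story data to ${filePath}`);
  } catch (err) {
    logger.error(`Failed to write ${filePath}: ${err}`);
  }
}
```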

Story 2.4: Implement Stage Testing Utility for HN Fetching

  • User Story / Goal: As a developer, I want a separate, executable script that only performs the HN data fetching and persistence, so I can test and trigger this stage independently of the full pipeline.
  • Detailed Requirements:
    • Create a new standalone script file: src/stages/fetch_hn_data.ts.
    • This script should perform the essential setup required for this stage: initialize logger, load configuration (.env), determine and create output directory (reuse or replicate logic from Epic 1 / src/index.ts).
    • The script should then execute the core logic of fetching stories via algoliaHNClient.fetchTopStories, fetching comments via algoliaHNClient.fetchCommentsForStory (using the loaded config for the comment limit), and persisting the results to JSON files using fs.writeFileSync (replicating the logic from Story 2.3). A sketch of the stage script follows this story's ACs.
    • The script should log its progress using the logger utility.
    • Add a new script command to package.json under "scripts": "stage:fetch": "ts-node src/stages/fetch_hn_data.ts".
  • Acceptance Criteria (ACs):
    • AC1: The file src/stages/fetch_hn_data.ts exists.
    • AC2: The script stage:fetch is defined in package.json's scripts section.
    • AC3: Running npm run stage:fetch executes successfully, performing only the setup, fetch, and persist steps.
    • AC4: Running npm run stage:fetch creates the same 10 {storyId}_data.json files in the correct date-stamped output directory as running the main npm run dev command (at the current state of development).
    • AC5: Logs generated by npm run stage:fetch reflect only the fetching and persisting steps, not subsequent pipeline stages.
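
A sketch of what the standalone stage script could look like; loadConfig and createOutputDirectory are hypothetical helpers standing in for the Epic 1 setup logic, and the rest reuses the client and persistence logic sketched above.

```typescript
// src/stages/fetch_hn_data.ts -- standalone stage runner sketch
import * as fs from 'fs';
import * as path from 'path';
import { fetchTopStories, fetchCommentsForStory } from '../clients/algoliaHNClient';
import { logger } from '../utils/logger';                // assumed logger utility from Epic 1
import { loadConfig } from '../config';                  // hypothetical: loads .env, exposes MAX_COMMENTS_PER_STORY
import { createOutputDirectory } from '../utils/output'; // hypothetical: creates and returns ./output/YYYY-MM-DD

async function runFetchStage(): Promise<void> {
  const config = loadConfig();
  const outputDir = createOutputDirectory();

  const stories = await fetchTopStories();
  logger.info(`Fetched ${stories.length} stories.`);

  for (const story of stories) {
    story.comments = await fetchCommentsForStory(story.storyId, config.MAX_COMMENTS_PER_STORY);
    const filePath = path.join(outputDir, `${story.storyId}_data.json`);
    fs.writeFileSync(filePath, JSON.stringify({ ...story, fetchedAt: new Date().toISOString() }, null, 2));
    logger.info(`Persisted ${filePath}`);
  }
}

runFetchStage().catch((err) => {
  logger.error(`Stage fetch failed: ${err}`);
  process.exit(1);
});
```

With the package.json entry from the requirements above, npm run stage:fetch would run this script via ts-node, exercising only the fetch-and-persist stage.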

Change Log

Change          Date        Version  Description             Author
Initial Draft   2025-05-04  0.1      First draft of Epic 2   2-pm