BMAD-METHOD/bmad-core/tasks/shard-doc.md
titocr 3267144248 Clean up markdown nesting. (#252)
Co-authored-by: TC <>
2025-06-19 16:54:47 -05:00


Document Sharding Task

Purpose

  • Split a large document into multiple smaller documents based on level 2 sections
  • Create a folder structure to organize the sharded documents
  • Preserve the integrity of all content, including code blocks, diagrams, and markdown formatting

LLM: First, if the md-tree command is unavailable, suggest that the user install and use the @kayvan/markdown-tree-parser tool, which gives the best performance and the most reliable document sharding. Let the user know this saves the cost of having the LLM do the expensive sharding operation. Give instructions for NPM, NPX, and PNPM global installs.

Installation and Usage

  1. Install globally:

    npm install -g @kayvan/markdown-tree-parser
    
  2. Use the explode command:

    # For PRD
    md-tree explode docs/prd.md docs/prd
    
    # For Architecture
    md-tree explode docs/architecture.md docs/architecture
    
    # For any document
    md-tree explode [source-document] [destination-folder]
    
  3. What it does:

    • Automatically splits the document by level 2 sections
    • Creates properly named files
    • Adjusts heading levels appropriately
    • Handles all edge cases with code blocks and special markdown

If the user has @kayvan/markdown-tree-parser installed, use it and skip the manual process below.
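
As a minimal sketch of the availability check described above, the following Python snippet looks for `md-tree` on the PATH and builds the `explode` invocation shown earlier. The helper names `md_tree_available` and `explode_command` are hypothetical and not part of the tool.

```python
import shutil


def md_tree_available(command: str = "md-tree") -> bool:
    """Return True if the given command can be found on PATH."""
    return shutil.which(command) is not None


def explode_command(source: str, destination: str) -> list[str]:
    """Build the argument list for `md-tree explode <source> <destination>`."""
    return ["md-tree", "explode", source, destination]


if __name__ == "__main__":
    if md_tree_available():
        # The list could be passed to subprocess.run(...) to perform the split.
        print(" ".join(explode_command("docs/prd.md", "docs/prd")))
    else:
        print("md-tree not found; fall back to the manual method")
```

If the check fails, the manual method below applies.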


Manual Method (if @kayvan/markdown-tree-parser is not available)

LLM: Only proceed with the manual instructions below if the user cannot or does not want to use @kayvan/markdown-tree-parser.

Task Instructions

1. Identify Document and Target Location

  • Determine which document to shard (user-provided path)
  • Create a new folder under docs/ with the same name as the document (without extension)
  • Example: docs/prd.md → create folder docs/prd/
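
The folder naming rule above can be sketched in Python as follows. The function name `target_folder` and the `docs_root` parameter are illustrative assumptions, not part of the task definition.

```python
from pathlib import Path


def target_folder(document_path: str, docs_root: str = "docs") -> Path:
    """Return docs_root/<document name without extension>."""
    return Path(docs_root) / Path(document_path).stem


if __name__ == "__main__":
    folder = target_folder("docs/prd.md")
    print(folder.as_posix())
    # To actually create it: folder.mkdir(parents=True, exist_ok=True)
```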

2. Parse and Extract Sections

[[LLM: When sharding the document:

  1. Read the entire document content
  2. Identify all level 2 sections (## headings)
  3. For each level 2 section:
    • Extract the section heading and ALL content until the next level 2 section
    • Include all subsections, code blocks, diagrams, lists, tables, etc.
    • Be extremely careful with:
      • Fenced code blocks (```) - ensure you capture the full block including closing backticks
      • Mermaid diagrams - preserve the complete diagram syntax
      • Nested markdown elements
      • Multi-line content that might contain ## inside code blocks

CRITICAL: Use proper parsing that understands markdown context. A ## inside a code block is NOT a section header.]]
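
One possible way to make the split fence-aware is sketched below in Python. It is an illustration under simplifying assumptions, not the method used by md-tree: it tracks only fenced code blocks (backtick or tilde fences) and does not handle indented code blocks or HTML blocks. The function name `split_level2_sections` is hypothetical.

```python
import re

# An opening or closing fence: up to 3 spaces of indent, then 3+ backticks or tildes.
FENCE_RE = re.compile(r"^\s{0,3}(`{3,}|~{3,})")
# A level 2 heading: exactly two '#' followed by a space and some text.
H2_RE = re.compile(r"^## +\S")


def split_level2_sections(text: str) -> tuple[str, list[tuple[str, str]]]:
    """Return (preamble, [(title, section_text), ...]) split at level 2 headings."""
    preamble: list[str] = []
    sections: list[tuple[str, list[str]]] = []
    fence = None  # marker of the currently open fence, or None when outside a fence
    for line in text.splitlines(keepends=True):
        m = FENCE_RE.match(line)
        if m:
            marker = m.group(1)
            if fence is None:
                fence = marker  # opening fence
            elif (marker[0] == fence[0] and len(marker) >= len(fence)
                  and line.strip() == marker):
                fence = None  # closing fence: same character, long enough, no info string
        if fence is None and H2_RE.match(line):
            sections.append((line[2:].strip(), [line]))  # start a new section
        elif sections:
            sections[-1][1].append(line)  # content of the current section
        else:
            preamble.append(line)  # content before the first level 2 heading
    return "".join(preamble), [(title, "".join(body)) for title, body in sections]
```

Because every input line lands in exactly one bucket, concatenating the preamble and the section texts reproduces the input unchanged.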

3. Create Individual Files

For each extracted section:

  1. Generate filename: Convert the section heading to lowercase-dash-case

    • Remove special characters
    • Replace spaces with dashes
    • Example: "## Tech Stack" → tech-stack.md
  2. Adjust heading levels:

    • The level 2 heading becomes level 1 (# instead of ##)
    • All subsection levels decrease by 1:
      - ### → ##
      - #### → ###
      - ##### → ####
      - etc.
    
  3. Write content: Save the adjusted content to the new file
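
The filename and heading rules above could be sketched in Python like this. Both function names are hypothetical, and the exact set of characters treated as "special" is an assumption (anything other than ASCII letters, digits, spaces, and dashes is dropped).

```python
import re

FENCE_RE = re.compile(r"^\s{0,3}(`{3,}|~{3,})")
# Headings of level 2 to 6; level 1 cannot be promoted any further.
HEADING_RE = re.compile(r"^#{2,6}(?=\s)")


def section_filename(heading: str) -> str:
    """Convert a heading such as '## Tech Stack' to 'tech-stack.md'."""
    title = heading.lstrip("#").strip().lower()
    title = re.sub(r"[^a-z0-9\s-]", "", title)  # remove special characters
    return re.sub(r"[\s-]+", "-", title).strip("-") + ".md"


def promote_headings(section_text: str) -> str:
    """Decrease every heading level by 1, leaving fenced code blocks untouched."""
    out = []
    fence = None  # marker of the currently open fence, if any
    for line in section_text.splitlines(keepends=True):
        m = FENCE_RE.match(line)
        if m:
            marker = m.group(1)
            if fence is None:
                fence = marker
            elif (marker[0] == fence[0] and len(marker) >= len(fence)
                  and line.strip() == marker):
                fence = None
        elif fence is None and HEADING_RE.match(line):
            line = line[1:]  # drop one leading '#'
        out.append(line)
    return "".join(out)
```

Only the leading `#` characters of heading lines change; all other text, including lines inside fences that start with `##`, passes through as-is.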

4. Create Index File

Create an index.md file in the sharded folder that:

  1. Contains the original level 1 heading and any content before the first level 2 section
  2. Lists all the sharded files with links:

    # Original Document Title

    [Original introduction content if any]

    ## Sections

    - [Section Name 1](./section-name-1.md)
    - [Section Name 2](./section-name-2.md)
    - [Section Name 3](./section-name-3.md)
      ...
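
A small Python sketch of assembling such an index follows. The function name `build_index` and its argument shapes are assumptions for illustration; the preamble is the text before the first level 2 section, and each entry pairs a section title with its generated filename.

```python
def build_index(preamble: str, entries: list[tuple[str, str]]) -> str:
    """Return index.md content: the preamble followed by a linked section list."""
    lines = [preamble.rstrip("\n"), "", "## Sections", ""]
    lines += [f"- [{title}](./{filename})" for title, filename in entries]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_index("# PRD\n\nIntro.\n", [("Tech Stack", "tech-stack.md")]))
```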

5. Preserve Special Content

[[LLM: Pay special attention to preserving:

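  1. Code blocks: Must capture complete blocks, including the opening fence (with any language tag), the content, and the closing fence:

    content
    
  2. Mermaid diagrams: Preserve complete syntax, including the surrounding fence: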

    graph TD
    ...
    
  3. Tables: Maintain proper markdown table formatting

  4. Lists: Preserve indentation and nesting

  5. Inline code: Preserve backticks

  6. Links and references: Keep all markdown links intact

  7. Template markup: If documents contain {{placeholders}} or LLM instructions, preserve exactly]]

6. Validation

After sharding:

  1. Verify all sections were extracted
  2. Check that no content was lost
  3. Ensure heading levels were properly adjusted
  4. Confirm all files were created successfully
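
One way to check that no content was lost is a round-trip test: demote the headings in each shard again and compare the result with the original. The Python sketch below assumes the shards were produced by removing one `#` from every heading outside fenced code blocks, and that the original sections contained no level 1 headings of their own. All function names are hypothetical.

```python
import re

FENCE_RE = re.compile(r"^\s{0,3}(`{3,}|~{3,})")
# Headings of level 1 to 5 can be demoted by one level.
HEADING_RE = re.compile(r"^#{1,5}(?=\s)")


def demote_headings(shard_text: str) -> str:
    """Increase every heading level by 1, leaving fenced code blocks untouched."""
    out = []
    fence = None  # marker of the currently open fence, if any
    for line in shard_text.splitlines(keepends=True):
        m = FENCE_RE.match(line)
        if m:
            marker = m.group(1)
            if fence is None:
                fence = marker
            elif (marker[0] == fence[0] and len(marker) >= len(fence)
                  and line.strip() == marker):
                fence = None
        elif fence is None and HEADING_RE.match(line):
            line = "#" + line  # add one leading '#'
        out.append(line)
    return "".join(out)


def reconstruct(preamble: str, shards: list[str]) -> str:
    """Rebuild the source document from the preamble and the shards, in order."""
    return preamble + "".join(demote_headings(s) for s in shards)


def validate(original: str, preamble: str, shards: list[str]) -> bool:
    """Return True if the shards reproduce the original document exactly."""
    return reconstruct(preamble, shards) == original
```

A failed comparison points to a missing section, dropped content, or a heading that was adjusted incorrectly.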

7. Report Results

Provide a summary:


    Document sharded successfully:
    - Source: [original document path]
    - Destination: docs/[folder-name]/
    - Files created: [count]
    - Sections:
      - section-name-1.md: "Section Title 1"
      - section-name-2.md: "Section Title 2"
      ...
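
The summary could be generated with a short Python helper like the one below. The name `format_report` is hypothetical, and the count simply reflects the files passed in; whether index.md is included is left to the caller.

```python
def format_report(source: str, destination: str, files: list[tuple[str, str]]) -> str:
    """Format the summary from (filename, section title) pairs."""
    lines = [
        "Document sharded successfully:",
        f"- Source: {source}",
        f"- Destination: {destination}",
        f"- Files created: {len(files)}",
        "- Sections:",
    ]
    lines += [f'  - {name}: "{title}"' for name, title in files]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report("docs/prd.md", "docs/prd/", [("tech-stack.md", "Tech Stack")]))
```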

Important Notes

  • Never modify the actual content, only adjust heading levels
  • Preserve ALL formatting, including whitespace where significant
  • Handle edge cases like sections with code blocks containing ## symbols
  • Ensure the sharding is reversible (the original could be reconstructed from the shards)