Manifest Reference

Field reference for manifest.jsonl chunk and deletion records. For behavior and configuration, see the Pipelines overview. To inspect a manifest interactively, see scripts/inspect-manifest.sh.

Record types

There are two record types:

Chunk record — a content section, ready to embed or index.

Deletion record{"_deleted": true, "id": "..."} — a stale chunk ID to remove. Braised emits these for pages that were deleted or whose content changed since the last build.

Braised tracks source-file hashes in .braised/index-state.json. After the first build, only new or changed pages produce chunk records — the manifest is always incremental. On a clean first build every record is a chunk record. See Writing a Pipeline Script for how to handle deletion records and CI state.

Chunk record fields

{
  "id":            "/getting-started/install/#linux",
  "url":           "/getting-started/install/",
  "title":         "Installation Guide",
  "heading":       "Linux",
  "heading_level": 3,
  "breadcrumb":    ["Installation Guide", "Installation", "Linux"],
  "content":       "Installation Guide > Installation > Linux\n\nDownload the binary from the releases page...",
  "hash":          "a3f9c2b1"
}
Field Type Description
id string Chunk identifier. Format: /url#anchor for heading chunks, /url (no trailing slash) for page-root chunks. Note: page-root id drops the trailing slash that url carries. Renaming a heading changes its id — treat it as stable within a build, not across renames. Use as the vector DB document ID.
url string Output URL of the source page, with trailing slash.
title string Page title from frontmatter.
heading string Text of the nearest heading. Equals title for page-root chunks.
heading_level int Heading depth: 0 = page root (content before the first heading), 16 = H1–H6.
breadcrumb []string Ancestor heading labels from page title down to the current heading. Always starts with title.
content string Pre-baked breadcrumb context followed by the plain-text chunk body. Pass this directly to your embedding model.
hash string FNV-32a hex digest of content, zero-padded to 8 hex characters. Used by braised for incremental change detection.

The content field

content is plain text — no Markdown, no HTML. Inline formatting is stripped. Braised prepends a breadcrumb context line automatically:

Installation Guide > Installation > Linux

Download the binary from the releases page and place it somewhere on your PATH.
Verify the installation by running: braised --version

Do not strip the breadcrumb prefix — it gives the model the hierarchical context it needs for accurate retrieval.

Page-root chunks

Content that appears before the first heading in a page becomes a chunk with heading_level: 0. The heading field equals the page title and the id has no fragment. These are valid embedding targets — introductory prose often carries useful semantic signal.

What is excluded

  • Draft pages (draft: true in frontmatter) are excluded from the build entirely and produce no chunks.
  • Pages that fail to build (missing title, broken frontmatter) are skipped and produce no chunks.
  • Empty sections — a heading with no content below it produces no chunk.
  • Pages with llm_exclude: true in frontmatter are excluded from all agent outputs including the manifest.