Pipelines

Make your documentation searchable and AI-queryable without custom parsing or chunking work. Running braised build writes manifest.jsonl to the build output directory (default: dist/). The file is a stream of content chunks ready to embed into any vector store or search index. A pipeline script reads this file and ingests it into whatever system you use.

Common uses: site search backed by semantic similarity, an AI assistant or chatbot that answers questions about your docs (embed at build time, retrieve at query time), or a RAG pipeline that keeps an LLM grounded in your current content.

Braised has no opinion about which embedding model or vector store you use — OpenAI, Ollama, Chroma, pgvector, Pinecone, or anything else. The manifest is the contract; the rest is yours.

How the manifest works

The manifest is newline-delimited JSON (NDJSON). Each line is one of two record types:

Chunk records contain a section of content from a page as plain text, ready to embed. Braised tracks source-file hashes between builds in .braised/index-state.json and only emits chunk records for new or modified pages. On a 500-page site averaging about six chunks per page, a 2-page change produces ~12 chunk records instead of ~3,000.

Deletion records ({"_deleted": true, "id": "..."}) tell your pipeline to remove a stale chunk from the store. Braised emits these whenever a page is removed or its content changes — the old chunk IDs need to be retired so they stop appearing in search results or AI responses.

Your script must handle both. A script that only processes chunk records will accumulate stale content indefinitely — deleted pages and outdated sections stay in your vector store and get served to users long after the source was changed or removed. See Writing a Pipeline Script for the correct pattern.
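The two-record rule can be sketched in Python as follows. This is a minimal illustration, not the reference implementation from Writing a Pipeline Script. The function name apply_manifest is hypothetical, a plain dict stands in for a real vector store, and the sketch assumes chunk records carry an id field just as deletion records do (check the Manifest Reference for the actual field names).

```python
import json

def apply_manifest(lines, store):
    """Apply manifest records to `store`, a dict standing in for a vector store.

    Returns a tuple: (chunk records processed, deletion records processed).
    """
    upserted = deleted = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if record.get("_deleted") is True:
            # Deletion record: retire the stale chunk ID if the store has it.
            store.pop(record["id"], None)
            deleted += 1
        else:
            # Chunk record. ASSUMPTION: chunks are keyed by an "id" field.
            # A real pipeline would embed the chunk text before upserting.
            store[record["id"]] = record
            upserted += 1
    return upserted, deleted
```

Records are applied in file order here. Whether order matters depends on how chunk IDs are assigned, which this page does not specify.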

Build and pipeline flow

braised build
    │
    ├── dist/          ← HTML site, ready to serve
    │   ├── index.html
    │   ├── ...
    │   └── manifest.jsonl ← chunk + deletion records for your pipeline
    │
    └── runs: python3 scripts/ingest.py   ← your script, if configured

After each build, Braised runs the pipeline command if one is configured. The script reads manifest.jsonl, embeds or indexes the chunks, removes stale IDs, and exits. Braised waits for the script to finish and reports any non-zero exit code as a build warning.

Configuration

Add a pipeline block to braised.yaml:

pipeline:
  command: python3 scripts/ingest.py

The environment variable BRAISED_MANIFEST is set to the path of manifest.jsonl before the command runs. The script also inherits all environment variables from the braised process (e.g. PATH, HOME, and any variables you export before running braised build).

Pipeline output is suppressed on success. On failure (non-zero exit code), Braised prints the script's stdout and stderr in a framed error block alongside the build summary.
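A script entry point that follows these conventions might look like the sketch below. It reads the manifest path from BRAISED_MANIFEST, writes diagnostics to stderr, and returns a non-zero code on failure so that the output is surfaced in the build summary. The function name run and the record counting are hypothetical placeholders for real ingestion logic.

```python
import json
import os
import sys

def run(environ=os.environ):
    """Return a process exit code: 0 on success, 1 on failure."""
    manifest_path = environ.get("BRAISED_MANIFEST")
    if not manifest_path:
        print("BRAISED_MANIFEST is not set", file=sys.stderr)
        return 1
    try:
        with open(manifest_path, encoding="utf-8") as f:
            records = [json.loads(line) for line in f if line.strip()]
    except (OSError, json.JSONDecodeError) as exc:
        print(f"could not read manifest {manifest_path}: {exc}", file=sys.stderr)
        return 1
    # Placeholder for real work: embed chunks and delete stale IDs here.
    print(f"read {len(records)} records from {manifest_path}")
    return 0
```

To use this as scripts/ingest.py, end the file with sys.exit(run()) so the return value becomes the process exit code.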

For a webhook target instead of a script:

pipeline:
  http: https://your-ingest-endpoint.example.com/ingest

Braised sends the manifest content to that URL in a POST request with Content-Type: application/x-ndjson. The request times out after 30 seconds. Redirects are intentionally rejected as SSRF protection, so the URL you configure must be the final endpoint rather than one that forwards elsewhere.
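On the receiving side, an endpoint only needs to accept the POST body and split it into lines. The sketch below uses Python's standard-library http.server to show the shape of such a receiver. The handler class, the parse_ndjson helper, the port, and the 204 response are illustrative assumptions, not part of Braised, and the sketch assumes the request carries a Content-Length header. A production endpoint would also need authentication and a body-size limit.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def parse_ndjson(body: bytes):
    """Decode an NDJSON payload into a list of records, skipping blank lines."""
    return [json.loads(line) for line in body.split(b"\n") if line.strip()]

class IngestHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # ASSUMPTION: the sender provides Content-Length (no chunked encoding).
        length = int(self.headers.get("Content-Length", 0))
        try:
            records = parse_ndjson(self.rfile.read(length))
        except ValueError:  # covers invalid JSON and invalid UTF-8
            self.send_error(400, "body is not valid NDJSON")
            return
        deletions = [r for r in records if r.get("_deleted") is True]
        self.log_message("received %d records (%d deletions)",
                         len(records), len(deletions))
        # Answer directly instead of redirecting, since redirects are rejected.
        self.send_response(204)
        self.end_headers()

def serve(port=8000):
    """Block forever, serving the hypothetical ingest endpoint on localhost."""
    HTTPServer(("127.0.0.1", port), IngestHandler).serve_forever()
```

Calling serve() starts the receiver locally. Real ingestion would replace the log line with the same upsert and delete handling a pipeline script performs.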

Where to go next

Get something working: Writing a Pipeline Script — the complete ingestion loop with a working reference implementation.

Field details: Manifest Reference — every field, record type, and what gets excluded from the manifest.