Noukai Docs

Principles for token efficiency, naming, schema shape, and decomposition discipline.

These are the recurring "smell tests" for a Noukai pipeline. They apply across every structural pattern.

Pass distilled output forward, not raw input

Each block re-pays its input tokens on every call. If block A turns a 5,000-token article into a 200-token list of key points, block B should consume the 200 — not the 5,000 plus the 200. Decomposition only saves cost when each block works from the smaller representation.
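The arithmetic behind the example above can be sketched as follows. This is a back-of-the-envelope illustration only: it counts payload tokens and ignores each block's own prompt instructions, and the helper name total_input_tokens is hypothetical, not part of Noukai.

```python
def total_input_tokens(per_block_inputs):
    """Sum the input tokens paid across all blocks in one pipeline run."""
    return sum(per_block_inputs)


ARTICLE = 5_000    # raw article size, from the example above
KEY_POINTS = 200   # block A's distilled output, from the example above

# Block A always reads the article.
# Block B reads either the article plus the key points, or the key points alone.
raw_forwarding = total_input_tokens([ARTICLE, ARTICLE + KEY_POINTS])
distilled_forwarding = total_input_tokens([ARTICLE, KEY_POINTS])

print(raw_forwarding)        # 10200
print(distilled_forwarding)  # 5200
```

Under these assumptions, forwarding the raw article roughly doubles the input tokens for a two-block flow, and the gap widens with every additional downstream block.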

Use `code` for deterministic logic

If the transformation is "join these strings," "pull result.summary out of the previous block," "filter items where score > 0.7," that's a code block. Burning LLM tokens on string formatting is the most common avoidable cost in a Noukai flow.
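The three transformations named above are each a line or two of ordinary code. The sketch below uses plain Python with hypothetical function names and an assumed output shape (a dict with a nested result.summary, and items carrying a score); it is not Noukai's actual code-block API.

```python
def join_strings(parts, sep="\n"):
    """Join a list of strings with a separator."""
    return sep.join(parts)


def pull_summary(previous_output):
    """Pull result.summary out of the previous block's output."""
    return previous_output["result"]["summary"]


def filter_by_score(items, threshold=0.7):
    """Keep only the items whose score is strictly above the threshold."""
    return [item for item in items if item["score"] > threshold]


previous = {"result": {"summary": "Two outages, one root cause."}}
items = [{"id": "a", "score": 0.9}, {"id": "b", "score": 0.7}, {"id": "c", "score": 0.4}]

print(join_strings(["first", "second"], sep=", "))  # first, second
print(pull_summary(previous))                       # Two outages, one root cause.
print(filter_by_score(items))                       # [{'id': 'a', 'score': 0.9}]
```

None of these steps needs judgment, so none of them needs a model call.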

Use `passthrough` to route, not to think

passthrough is a structural anchor. It's the right tool when you want a single upstream node to feed a parallel container without an extra LLM call in between. It's the wrong tool if you actually wanted to transform the data — that's an llm or code block.

Define output schemas on every `llm` block consumed downstream

A schema is the contract that lets the next block trust the shape of its input. Without one, you're relying on the model to emit the same JSON keys every call. Set the schema; let the platform enforce it.
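To make "shape drift" concrete, here is a minimal sketch of the kind of check a schema buys you. The platform does the real enforcement; the SCHEMA format, the key names, and the shape_errors helper below are all hypothetical and exist only to show what a contract catches.

```python
# Hypothetical contract: key name -> expected Python type.
SCHEMA = {"title": str, "key_points": list}


def shape_errors(output, schema):
    """Return a list of human-readable mismatches between output and schema."""
    errors = []
    for key, expected in schema.items():
        if key not in output:
            errors.append(f"missing key: {key}")
        elif not isinstance(output[key], expected):
            got = type(output[key]).__name__
            errors.append(f"{key}: expected {expected.__name__}, got {got}")
    return errors


good = {"title": "Q3 review", "key_points": ["revenue up", "churn flat"]}
drifted = {"title": "Q3 review", "keypoints": ["revenue up"]}  # renamed key

print(shape_errors(good, SCHEMA))     # []
print(shape_errors(drifted, SCHEMA))  # ['missing key: key_points']
```

Without a contract, the renamed key in the second call would surface only as a confusing failure somewhere downstream.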

Prefer one block with structured JSON over two blocks for tightly coupled reasoning

If splitting two steps would force you to feed the same context into both blocks, keep them as one block with a richer output schema. Two coupled blocks mean two prompts, two model calls, and two chances for shape drift.

Use loops for "same prompt × N items"

If you find yourself adding a third copy of the same block, you want a loop. Loops scale; copy-pasted blocks don't.
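The difference between copy-pasted blocks and a loop can be sketched in plain Python. The describe function stands in for a single llm block and run_over_items for the loop container; both names are hypothetical, and the stand-in does no model call.

```python
def describe(word):
    """Stand-in for one llm block: same logic, applied to one item."""
    return {"word": word, "length": len(word)}


def run_over_items(items, run_block):
    """Apply the same block to every item, preserving input order."""
    return [run_block(item) for item in items]


# Copy-paste style: one hand-written call per item, fixed at three items.
copied = [describe("alpha"), describe("beta"), describe("gamma")]

# Loop style: the same block, any number of items.
looped = run_over_items(["alpha", "beta", "gamma"], describe)

print(looped == copied)  # True
```

The loop version keeps working when the fourth item arrives; the copied version needs another edit.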

Name blocks like keys, because they are

Downstream blocks reference upstream output by block name (or by the keys you set in output schemas). Treat block names as part of the data model: short, lowercase with underscores, descriptive of the output, not the action. extracted_entities is better than run_entity_extractor.

Pick the shape before writing prompts

Sketch the tree first — boxes and arrows. Once the topology is right, fill in prompts and schemas. Designing prompt-first leads to flows that are really one big prompt awkwardly broken across multiple blocks.

Anti-patterns

These are the failure modes we see most often. Each one is the result of skipping decomposition or trusting a template too much.

1. Default-to-single-block

Symptom: "This looks like one LLM call" — picked before any decomposition.

Why it's wrong: A single block hides both the cost ceiling and the reliability ceiling. An LLM doing three reasoning tasks in one call tends to hallucinate more often than three blocks doing one task each. The cost can also be higher, because a small distillation step would have shortened the next block's input.

Fix: Decompose first. Only land on single-block if decomposition confirms there is nothing to decompose.

2. Treat-recipes-as-templates

Symptom: "This matches recipe X" — picked without checking if the recipe's data shape matches the user's actual data shape.

Why it's wrong: Recipes are worked examples for a specific domain shape. The user's input data may not match the recipe's assumptions (e.g. recipe assumes one word; user has an array of words). Cargo-culting the recipe's block_template inherits its assumptions.

Fix: Use recipes as validation references AFTER decomposition. Compare your decomposition to the recipe's block_template — if they match, great; if they don't, your decomposition is right and the recipe doesn't apply.

3. Skip-the-array-case

Symptom: User says "words" or "tickets" (plural), and the pipeline is designed for one item.

Why it's wrong: Plural input is the most common cause of "this works for the demo but breaks in production." If the user's natural phrasing is plural, the pipeline should accept an array — and that means a loop, almost always.

Fix: Surface "is the input an array?" in clarification. Default to the array shape unless explicitly told otherwise. Use loop-over-items.
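Defaulting to the array shape can be as simple as normalizing the input before the loop sees it. A minimal sketch, assuming the input arrives as either a single value or a list; the helper name as_items is hypothetical.

```python
def as_items(value):
    """Normalize a single item, a list, or nothing into a list of items."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


print(as_items("ticket-1"))                # ['ticket-1']
print(as_items(["ticket-1", "ticket-2"]))  # ['ticket-1', 'ticket-2']
print(as_items(None))                      # []
```

With this in front of the loop, the single-item demo and the many-item production case run through the same path.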

4. Raw-input-forwarding

Symptom: Block N receives the user's full input that block 1 already processed.

Why it's wrong: It multiplies your token cost (every block pays for the full input again) and gives downstream blocks the opportunity to re-interpret context that block 1 already pinned down. The result is drift between blocks.

Fix: Pass distilled output forward, not raw input. Block 1's output should be the smallest payload that block 2 needs.

5. LLM-for-deterministic-assembly

Symptom: A final LLM block whose job is to slot fields into a JSON shape, order an array, or set a boolean flag.

Why it's wrong: Shape work carries no hallucination risk when done in code and a real one when done by an LLM. Code is also faster and costs no tokens per call.

Fix: End your pipeline with a code block whenever the final step is pure assembly. The LLM blocks above it produce structured intermediate outputs; the code block produces the final shape.
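A final assembly step of this kind might look like the sketch below. The field names (summary, findings, score, needs_review) and the 0.7 threshold are hypothetical; the point is that slotting fields, ordering an array, and setting a flag are all deterministic.

```python
def assemble_final(summary, findings, review_threshold=0.7):
    """Build the final payload from structured intermediate outputs."""
    # Order the array: highest score first.
    ordered = sorted(findings, key=lambda f: f["score"], reverse=True)
    return {
        "summary": summary,                   # slot a field into the shape
        "findings": ordered,                  # ordered array
        "needs_review": any(                  # boolean flag
            f["score"] > review_threshold for f in ordered
        ),
    }


final = assemble_final(
    "Login errors after deploy.",
    [{"id": "f1", "score": 0.4}, {"id": "f2", "score": 0.9}],
)
print([f["id"] for f in final["findings"]])  # ['f2', 'f1']
print(final["needs_review"])                 # True
```

The same inputs always yield the same output, which is exactly the property an LLM block cannot guarantee.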

6. Schema-discipline failures (both directions)

Symptoms:

Under (skipping schemas when needed): Block N+1's prompt parses block N's output as text instead of relying on a typed contract. Drift accumulates between blocks; downstream blocks defensively re-derive shape from prose.
Over (forcing schemas when not needed): A single-block free-form chat flow gets an elaborate output_schema that the runtime never validates against any downstream consumer. Or every loop iteration body carries a redundant per-iteration schema instead of inheriting from the loop's own input_schema.

Why it's wrong: Schemas are a tool with a specific purpose — runtime validation and downstream-block reliability. Either skipping them where they're needed or forcing them where they're not is the same underlying mistake: failure to reason through what the data contract actually requires.

Fix: For each block, ask two questions:

Does anything downstream depend on this block's output shape?
Is the output free text (to a user/UI) or structured (for code/downstream-LLM consumption)?

If something downstream depends on the shape and the output is structured, design the schema first, before the prompt. If neither holds, skip the schema. The data-contracts subsection of the design brief (propose-design-brief workflow step) is where you state this call per-block.
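The two questions reduce to a small decision function. This is a sketch with hypothetical names; the text above only specifies the "both" and "neither" outcomes, so the mixed case returning a per-block judgment call is an assumption of this example.

```python
def schema_decision(downstream_depends_on_shape, output_is_structured):
    """Map the two per-block questions to a schema decision."""
    if downstream_depends_on_shape and output_is_structured:
        return "design schema first"
    if not downstream_depends_on_shape and not output_is_structured:
        return "skip schema"
    # Assumption: mixed answers are not covered by the rule above.
    return "decide per block"


print(schema_decision(True, True))    # design schema first
print(schema_decision(False, False))  # skip schema
print(schema_decision(True, False))   # decide per block
```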

These anti-patterns map 1:1 to the decomposition_heuristics field returned by the get_pipeline_authoring_guide MCP tool. If your LLM client is connected to noukai-mcp, it already has these heuristics in scope before any pattern-matching happens.
