Noukai Docs

Before you start: Read the relevant pattern reference page so you know what to look for:

LangChain Patterns — LCEL, LangGraph, agents, RAG, legacy chains
OpenRouter Patterns — raw SDK chaining, tool loops, router patterns

Audit Protocol

Follow these steps in order. Present results to the user after each step.

Identify the Stack

Scan dependency files to determine which patterns are in use:

Python: Check requirements.txt, pyproject.toml, setup.py, Pipfile for langchain, langchain-core, langchain-openai, langgraph, openai
TypeScript/JavaScript: Check package.json for langchain, @langchain/core, @langchain/openai, @langchain/langgraph, openai
Other languages: Check for openai or openrouter SDK imports

Report which stack is detected. If both LangChain and raw SDK calls are present, note both.
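The dependency scan above can be sketched as a small helper. This is a minimal illustration, not part of Noukai: the function name `detect_stack` and the labels `"langchain"` and `"raw-sdk"` are assumptions chosen for this example, and the package names are the ones listed in the checklist.

```python
import re

# Package names taken from the checklist above.
LANGCHAIN_PACKAGES = {
    "langchain", "langchain-core", "langchain-openai", "langgraph",
    "@langchain/core", "@langchain/openai", "@langchain/langgraph",
}
RAW_SDK_PACKAGES = {"openai"}


def detect_stack(dependency_text: str) -> set[str]:
    """Return the stacks named in the text of one dependency file.

    The labels "langchain" and "raw-sdk" are hypothetical, for illustration.
    """
    # Split into whole package-like tokens so that "langchain-openai>=0.1"
    # yields "langchain-openai" and is not mistaken for the bare "openai" SDK.
    raw_tokens = re.findall(r"[@\w][\w./-]*", dependency_text.lower())
    tokens = {tok.rstrip("./-") for tok in raw_tokens}

    stacks = set()
    if tokens & LANGCHAIN_PACKAGES:
        stacks.add("langchain")
    if tokens & RAW_SDK_PACKAGES:
        stacks.add("raw-sdk")
    return stacks
```

Calling it on the contents of each dependency file and taking the union of the results covers the case where both LangChain and raw SDK calls are present.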

Find All LLM Call Sites

Search the codebase for LLM invocation patterns. Use the pattern reference pages for the specific imports and function signatures to grep for.

LangChain indicators:

# Imports to search for
from langchain              # legacy
from langchain_core         # current
from langchain_openai       # current
from langgraph              # graph workflows

Raw SDK indicators:

# Function calls to search for
client.chat.completions.create
openai.ChatCompletion.create    # legacy SDK

Collect every match with file path and line number.
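One way to collect matches with file path and line number is sketched below for Python sources. The helper names `scan_text` and `scan_tree` are hypothetical, and the raw SDK pattern matches `.chat.completions.create` on any receiver, on the assumption that the client variable is not always called `client`.

```python
import re
from pathlib import Path

# Regexes derived from the indicators listed above.
CALL_SITE_PATTERNS = [
    re.compile(r"^\s*(from|import)\s+(langchain|langchain_core|langchain_openai|langgraph)\b"),
    re.compile(r"\.chat\.completions\.create\b"),
    re.compile(r"\bopenai\.ChatCompletion\.create\b"),  # legacy SDK
]


def scan_text(path: str, text: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, stripped line) for every matching line."""
    matches = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in CALL_SITE_PATTERNS):
            matches.append((path, lineno, line.strip()))
    return matches


def scan_tree(root: str) -> list[tuple[str, int, str]]:
    """Scan every .py file under root (other extensions would need their own glob)."""
    results = []
    for file in sorted(Path(root).rglob("*.py")):
        text = file.read_text(encoding="utf-8", errors="ignore")
        results.extend(scan_text(str(file), text))
    return results
```

The output is only a starting list; each hit still needs to be read in context during the tracing step.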

Trace Each Chain

For each LLM call site found, trace the full chain by reading the surrounding code. Identify:

| Field | What to capture |
|-------|-----------------|
| Name | Function/class name, or infer from context (e.g. `classify_intent`) |
| Type | Sequential, parallel, branching, agent loop, RAG, or single call |
| LLM calls | Count of distinct LLM invocations per single user request |
| Input | What data enters the chain (function parameters, request body) |
| Output | What the chain returns (structured data, raw text) |
| Dependencies | Does this chain consume output from another chain? |
| Location | File path and line range |
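The fields above could be held in a simple record while tracing. The sketch below is one possible shape; the class name `ChainRecord` is hypothetical, and the example `input` and `output` values are invented for illustration (only the name, type, call count, and location mirror the sample plan later on this page).

```python
from dataclasses import dataclass, field


@dataclass
class ChainRecord:
    """One traced chain, with one attribute per field in the list above."""
    name: str        # function/class name, or inferred from context
    type: str        # sequential, parallel, branching, agent loop, RAG, single call
    llm_calls: str   # kept as text so counts like "1+3" can be recorded
    input: str       # what data enters the chain
    output: str      # what the chain returns
    location: str    # file path and line range
    dependencies: list[str] = field(default_factory=list)  # upstream chain names


record = ChainRecord(
    name="classify_and_respond",
    type="sequential",
    llm_calls="2",
    input="user message (hypothetical)",
    output="raw text reply (hypothetical)",
    location="src/chains.py:14-45",
)
```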

How to identify chain types:

Sequential: Output of call A feeds into call B. Look for: LCEL pipes (a | b | c), SequentialChain, multiple client.chat.completions.create calls where each response is formatted into the next prompt.
Parallel: Multiple independent calls that run concurrently. Look for: RunnableParallel, asyncio.gather, Promise.all.
Branching: A router call decides which downstream call to make. Look for: RunnableBranch, add_conditional_edges, if/else blocks after an LLM call.
Agent loop: An LLM calls tools in a loop until it has an answer. Look for: AgentExecutor, create_react_agent, while-loops with tool dispatch.
RAG: Retrieval followed by LLM generation. Look for: create_retrieval_chain, vector store queries followed by LLM calls.
Single call: One LLM invocation with no chaining. These are the simplest to migrate.
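The indicators above can be turned into a rough first-pass guess, sketched below. This is a crude heuristic under stated assumptions: the function name `guess_chain_type` is hypothetical, the check order (agent loop, RAG, branching, parallel, sequential) is an arbitrary choice, and the `" | "` test for LCEL pipes will also fire on unrelated uses of the pipe operator. The result should always be confirmed by reading the code.

```python
# (chain type, substrings to look for), checked in this order.
INDICATORS = [
    ("agent loop", ("AgentExecutor", "create_react_agent")),
    ("rag", ("create_retrieval_chain",)),
    ("branching", ("RunnableBranch", "add_conditional_edges")),
    ("parallel", ("RunnableParallel", "asyncio.gather", "Promise.all")),
    ("sequential", ("SequentialChain", " | ")),
]


def guess_chain_type(snippet: str) -> str:
    """Return the first chain type whose indicator appears in the snippet."""
    for chain_type, needles in INDICATORS:
        if any(needle in snippet for needle in needles):
            return chain_type
    # No chaining indicator found: treat as a single call.
    return "single call"
```

Hand-rolled patterns such as while-loops with tool dispatch or if/else blocks after an LLM call are not covered here and need manual inspection.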

Group into Candidate Flows

Map each chain (or group of related chains) to a proposed Noukai flow. Chains that share a single entry point or are always called together should be grouped into one flow.
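Grouping by shared entry point might look like the sketch below. The helper names and the entry-point names in the usage example are hypothetical; the slug rule (lowercase, non-alphanumeric runs collapsed to a hyphen) is an assumption that happens to reproduce slugs such as `classify-respond` from "Classify & Respond".

```python
import re
from collections import defaultdict


def slugify(name: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def group_by_entry_point(chains: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (chain_name, entry_point) pairs into candidate flows keyed by slug."""
    flows = defaultdict(list)
    for chain_name, entry_point in chains:
        flows[slugify(entry_point)].append(chain_name)
    return dict(flows)


candidate_flows = group_by_entry_point([
    ("classify", "handle_ticket"),   # hypothetical entry points
    ("respond", "handle_ticket"),
    ("summarize", "daily_digest"),
])
```

Chains that are always called together but from different entry points would still need to be merged by hand.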

For each proposed flow, determine:

Flow name and slug — descriptive, matching the user's domain language
Description — one sentence explaining what the flow does
Block count and types — how many blocks, each as llm, passthrough, or code
Topology — sequential (blocks in series), parallel (blocks in a v container), or mixed
Structural improvements — where Noukai's architecture offers concrete gains:
- Sequential calls that can become parallel blocks (latency reduction)
- Redundant context re-sends eliminated by Noukai's block-to-block data passing (fewer tokens; estimate a rough percentage only when the redundancy is clearly visible, e.g. "full chat history re-sent 3 times → eliminated, roughly 60% fewer tokens on context")
- Router prompts that become branching flow topology
- Hardcoded retry logic replaced by managed execution
- Scattered prompts centralized into versioned blocks
- Missing structured output enforcement replaced by Noukai output schemas
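The rough percentage for eliminated context re-sends is simple arithmetic, sketched below. The function name `context_savings` is hypothetical, and the calculation assumes only the shared context counts: per-call instructions and output tokens are ignored, which is why the estimate should be quoted as a ballpark figure.

```python
def context_savings(context_tokens: int, times_sent: int) -> float:
    """Fraction of context tokens saved when a context is sent once
    instead of times_sent times."""
    if context_tokens < 0 or times_sent < 1:
        raise ValueError("context_tokens must be >= 0 and times_sent >= 1")
    before = context_tokens * times_sent   # tokens spent today
    after = context_tokens                 # tokens spent if sent once
    return 0.0 if before == 0 else (before - after) / before


# A history sent 3 times saves two of the three sends, about two thirds
# of the context tokens; the context size itself cancels out.
three_sends = context_savings(2000, 3)
```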

Present the Migration Plan

Output the audit results as a table:

| # | Current Chain | Location | Type | LLM Calls | Proposed Flow | Slug | Blocks | Structural Improvement |
|---|--------------|----------|------|-----------|---------------|------|--------|----------------------|
| 1 | classify_and_respond | src/chains.py:14-45 | Sequential | 2 | Classify & Respond | classify-respond | 2 (llm, llm) | None — already linear |
| 2 | route_to_specialist | src/router.py:8-82 | Branching | 1+3 | Specialist Router | specialist-router | 4 (llm, 3× llm parallel) | Router prompt becomes branching topology; if all 3 specialists run per request they can parallelize (up to ~3× faster on the specialist stage) |
| 3 | summarize_and_extract | src/pipeline.py:20-55 | Sequential | 2 | Summarize + Extract | summarize-extract | 2 (llm, llm) | Full message re-sent to 2nd call → eliminated by block data passing, ~50% fewer tokens on 2nd block |

Then ask the user:

"Does this migration plan look right? Would you like to adjust any flow names, merge or split any chains, or skip any? I won't proceed until you approve."

Do not move to the Create & Test phase until the user approves.
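If the audit results are held as dictionaries, the migration plan table can be rendered with a helper along these lines. This is a sketch: `render_plan` is a hypothetical name, and the column list is copied from the table header above.

```python
COLUMNS = [
    "#", "Current Chain", "Location", "Type", "LLM Calls",
    "Proposed Flow", "Slug", "Blocks", "Structural Improvement",
]


def render_plan(rows: list[dict[str, str]]) -> str:
    """Render rows (keyed by column name) as a markdown table; '#' is filled in."""
    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "|" + "|".join("---" for _ in COLUMNS) + "|",
    ]
    for number, row in enumerate(rows, start=1):
        cells = [str(number)] + [row.get(col, "") for col in COLUMNS[1:]]
        # Escape pipes so a cell value cannot break the table layout.
        cells = [cell.replace("|", "\\|") for cell in cells]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


plan = render_plan([{
    "Current Chain": "classify_and_respond",
    "Location": "src/chains.py:14-45",
    "Type": "Sequential",
    "LLM Calls": "2",
    "Proposed Flow": "Classify & Respond",
    "Slug": "classify-respond",
    "Blocks": "2 (llm, llm)",
    "Structural Improvement": "None",
}])
```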

Design Output

After the user approves the migration plan, produce a design card for each flow:

### Flow: {name}
- **Slug:** {slug}
- **Description:** {description}

**Block tree:**
├── {block-1-name} (llm, {model})
│   Prompt: {first 50 chars of prompt}...
│   Output schema: {summary}
├── [parallel]
│   ├── {block-2a-name} (llm, {model})
│   └── {block-2b-name} (llm, {model})
└── {block-3-name} (passthrough)

**Input:** {input schema summary}
**Output:** {output schema summary}
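The block tree in the card above can be generated from a nested list, as in the sketch below. The `Block` class and `render_tree` function are hypothetical, a nested list stands in for a parallel group, and the block and model names are placeholders. The `Prompt:` and `Output schema:` detail lines are left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    kind: str        # "llm", "passthrough", or "code"
    model: str = ""  # only meaningful for llm blocks

    def label(self) -> str:
        detail = f"{self.kind}, {self.model}" if self.model else self.kind
        return f"{self.name} ({detail})"


def render_tree(nodes: list) -> str:
    """Render top-level nodes; a nested list is drawn as a [parallel] group."""
    lines = []
    for i, node in enumerate(nodes):
        last = i == len(nodes) - 1
        branch = "└── " if last else "├── "
        indent = "    " if last else "│   "
        if isinstance(node, list):
            lines.append(branch + "[parallel]")
            for j, child in enumerate(node):
                child_branch = "└── " if j == len(node) - 1 else "├── "
                lines.append(indent + child_branch + child.label())
        else:
            lines.append(branch + node.label())
    return "\n".join(lines)


tree = render_tree([
    Block("classify", "llm", "model-a"),  # placeholder names
    [Block("draft", "llm", "model-a"), Block("extract", "llm", "model-a")],
    Block("merge", "passthrough"),
])
```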

Ask the user if they want to save this design document. If yes, write it to a location of their choosing (suggest docs/migration-plan.md or similar).

Then proceed to Create & Test.

Audit & Design