OpenRouter & OpenAI SDK Patterns
Pattern catalog for auditing raw OpenAI/OpenRouter SDK chained LLM code — no framework.
This page is a reference for the Audit & Design phase. It covers chained LLM call patterns built with the raw OpenAI or OpenRouter SDK — no LangChain or other framework.
Imports to Search For
Python
TypeScript/JavaScript
Function Calls to Search For
Code Shapes
Single LLM Call (No Chaining)
The simplest pattern. Maps directly to a single LLM block in Noukai.
What to capture: The system message becomes the block's prompt. The model becomes the block's model config. The user message is the flow input.
Sequential Chain (Output Feeds Next Call)
The most common manual chaining pattern. Maps to a sequential Noukai flow.
What to capture: Each client.chat.completions.create call becomes a block. The data passed between calls (here, category) shows the block-to-block data flow. In Noukai, this becomes {{previous_output}} in the second block's prompt.
Multi-Turn Conversation Loop
A loop that accumulates message history. Maps to a single LLM block called repeatedly by the user's code (Noukai handles one turn at a time, the calling code manages history).
Token bloat alert: This pattern re-sends the entire conversation history on every call. As the conversation grows, token usage grows linearly. When migrating, the calling code should manage history and pass only relevant context to the Noukai flow — or use a separate "summarize history" flow to compress context.
Tool Calling / Function Calling Loop
An LLM calls tools in a loop until it has a final answer. This is the SDK equivalent of a LangChain agent.
Like LangChain agents, tool-calling loops are hard to migrate directly because Noukai flows are DAGs, not loops. Consider: (1) if the tools are simple data lookups, have the calling code run the tools and pass results as input to a single Noukai flow, (2) decompose into a classification block + specialized handler blocks, or (3) keep the tool loop as-is and only migrate the non-loop chains.
Router Pattern (Pick-Next-Prompt)
A first LLM call decides which prompt to use for the second call. Maps to a branching Noukai flow with a router block.
What to capture: The routing call becomes a router block. Each branch becomes a downstream block. In Noukai, this can be modeled as a sequential flow where the first block classifies and the second block uses {{previous_output}} to condition its response, or as a branching topology.
Fan-Out / Aggregate
Multiple independent LLM calls run concurrently, then results are combined. Maps to parallel blocks in a v container.
What to capture: Each concurrent call becomes a parallel block. The aggregation at the end is handled by Noukai's parallel container output merging — or by a final passthrough or code block if custom merging is needed.
Common Pain Points
These are the problems that make migration worthwhile. Reference them in the "Structural Improvement" column during audit.
| Problem | How to Spot It | Noukai Advantage |
|---|---|---|
| History bloat | messages array grows unbounded; full history re-sent every call | Block-to-block data passing — only relevant data flows forward |
| No caching | Same prompt + input produces same output but always re-calls the API | Versioned flows enable caching strategies at the infrastructure level |
| Sequential bottleneck | Independent calls run one after another (await in series) | Parallel containers execute independent blocks concurrently |
| Prompt sprawl | System prompts scattered across files, hardcoded in strings | Centralized, versioned prompts in the Noukai flow editor |
| No observability | No tracing, logging is ad-hoc print() or custom code | Built-in step-level tracing and SSE streaming |
| Ad-hoc retries | Manual try/except with time.sleep retry loops | Managed execution with retry policies |
| No versioning | Prompt changes = code changes = deploy cycle | Flow versioning — publish, rollback, A/B test without code changes |
| No structured output | Parsing LLM text output with regex or string splitting | Block output schemas enforce JSON structure |
Mapping Cheat Sheet
| Raw SDK Pattern | Noukai Equivalent |
|---|---|
Single completions.create | Single LLM block |
| Sequential calls (output → next input) | Sequential blocks, use {{previous_output}} |
asyncio.gather / Promise.all on independent calls | Blocks in a v (parallel) container |
| Router + if/else dispatch | Router block → conditional downstream blocks |
| Tool-calling while loop | Decompose into classification + handler blocks, or keep as-is |
| Conversation history array | Calling code manages history, passes relevant context as flow input |
JSON mode / response_format | Block output schema |
| System message | Block prompt text |
| Temperature / max_tokens | Block config via update_block_config |