Document Q&A with Citations
Extract relevant passages from a document, then answer using only those passages with paragraph-level citations.
The user asks a question against a document. The pipeline first extracts only the relevant passages, then answers from those passages alone, which keeps the answering block's context small and the citations grounded.
Tree
Blocks
extract-relevant-passages (llm)
Output schema:
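The schema itself is not reproduced here. As a minimal sketch, assuming the extraction block returns JSON with a `passages` array whose items carry a paragraph number and the passage text, the output could be parsed and validated as below. The field names `passages`, `paragraph`, and `text`, the 1-based numbering, and the `parse_passages` helper are illustrative assumptions, not the pipeline's confirmed schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    paragraph: int  # assumed: 1-based paragraph number in the source document
    text: str       # assumed: passage text copied from that paragraph


def parse_passages(raw_json: str) -> list[Passage]:
    """Parse the extraction block's JSON output (hypothetical shape) into Passage objects."""
    data = json.loads(raw_json)
    passages = []
    for item in data["passages"]:
        paragraph = item["paragraph"]
        text = item["text"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(paragraph, bool) or not isinstance(paragraph, int) or paragraph < 1:
            raise ValueError(f"invalid paragraph number: {paragraph!r}")
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"empty passage text for paragraph {paragraph}")
        passages.append(Passage(paragraph=paragraph, text=text))
    return passages
```

Validating the structure at this boundary means a malformed extraction fails before the answering block runs, rather than surfacing later as a citation that points nowhere.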
answer-with-citations (llm)
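One way to picture this block's input is a prompt assembled from the question and the extracted passages only, never the full document. The sketch below assumes passages are dicts with `paragraph` and `text` keys; the `build_answer_prompt` helper and the instruction wording are hypothetical, not the block's actual prompt.

```python
def build_answer_prompt(question: str, passages: list[dict]) -> str:
    """Assemble an answering prompt that contains only the extracted passages."""
    # Prefix each passage with its paragraph number so the model can cite by index.
    numbered = [f"[{p['paragraph']}] {p['text']}" for p in passages]
    return (
        "Answer the question using only the passages below. "
        "After each claim, cite the paragraph number in square brackets. "
        "If the passages do not contain the answer, say so.\n\n"
        "Passages:\n" + "\n".join(numbered) + f"\n\nQuestion: {question}"
    )
```

Because the paragraph numbers are already present in the prompt, the model only has to copy an index rather than describe a location in its own words.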
Multi-document variant
If the question runs against multiple documents, wrap the extraction in a loop:
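The original loop definition is not shown here. As a rough sketch of the idea, assuming each document is identified by a string ID and the extraction block can be called as a function, the loop could look like this. The `extract_across_documents` helper, the `document` field, and the `keyword_extract` stand-in (which replaces the LLM extraction block so the example runs on its own) are all illustrative assumptions.

```python
from typing import Callable

# Assumed signature: (question, document_text) -> list of {"paragraph": int, "text": str}
Extractor = Callable[[str, str], list[dict]]


def keyword_extract(question: str, document: str) -> list[dict]:
    """Stand-in for the LLM extraction block: keep paragraphs sharing a word with the question."""
    words = {w.lower().strip("?.,") for w in question.split() if len(w) > 3}
    hits = []
    for number, paragraph in enumerate(document.split("\n\n"), start=1):
        if words & {w.lower().strip("?.,") for w in paragraph.split()}:
            hits.append({"paragraph": number, "text": paragraph})
    return hits


def extract_across_documents(
    question: str, documents: dict[str, str], extract: Extractor
) -> list[dict]:
    """Run extraction once per document and tag each passage with its document ID."""
    combined = []
    for doc_id, text in documents.items():
        for passage in extract(question, text):
            combined.append({"document": doc_id, **passage})
    return combined
```

Tagging each passage with its document ID keeps citations unambiguous when two documents both have, say, a paragraph 3. The answering block still runs once, over the combined passage list.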
Why this shape?
- Two blocks, not one, because the answering block's job is grounded synthesis, and synthesis quality drops when the model also has to hunt through irrelevant text. The first block does the hunting; the second sees only the small set of relevant passages.
- The full document never reaches the answering block. This is the entire point of the decomposition, and it is also where the token savings come from. For example, a 50,000-token document might reduce to a 1,000-token passage list.
- Citations are constructed from the structured passage list, not generated freely. Because each passage carries its paragraph number, the answering block can cite by index instead of inventing locations.
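The last point can be checked mechanically. The sketch below assumes the answer marks citations as `[n]`, where `n` is a paragraph number from the passage list. Both that marker format and the `resolve_citations` helper are hypothetical. It resolves each marker against the extracted passages and rejects any citation that does not correspond to one.

```python
import re


def resolve_citations(answer: str, passages: list[dict]) -> list[dict]:
    """Map each [n] marker in the answer to the extracted passage with that paragraph number."""
    by_paragraph = {p["paragraph"]: p for p in passages}
    cited = []
    for match in re.finditer(r"\[(\d+)\]", answer):
        number = int(match.group(1))
        if number not in by_paragraph:
            # The model cited a location it was never shown.
            raise ValueError(f"answer cites paragraph {number}, which was not extracted")
        if by_paragraph[number] not in cited:
            cited.append(by_paragraph[number])
    return cited
```

Since every valid citation must match an entry in the structured list, an invented location fails the lookup instead of reaching the user.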