
Tutorials & Examples

Guided walkthroughs that show complete, runnable pipelines from install through to output. Each tutorial uses the Python SDK and includes the YAML definition, the code to run it, and realistic expected outputs.

Before you start

Install Trellis with the extras you need for these tutorials:

pip install "trellis[cli]"

Most tutorials call an LLM. Set at least one provider key before running them:

export OPENAI_API_KEY=sk-...        # OpenAI
export ANTHROPIC_API_KEY=sk-ant-... # Anthropic Claude
export OLLAMA_HOST=http://localhost:11434  # Ollama (local)

Trellis uses LiteLLM internally, so any provider LiteLLM supports will work. Pass the model string in LiteLLM's provider/model form, e.g. "openai/gpt-4o", "anthropic/claude-3-5-sonnet-20241022", or "ollama/llama3".
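The provider prefix is what routes the call, so every model string is just "provider/model" split on the first slash. A minimal sketch of that convention (the helper name is illustrative, not part of the Trellis SDK):

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split a LiteLLM-style "provider/model" string into its two parts."""
    provider, _, name = model.partition("/")  # split on the first "/" only
    return provider, name

print(split_model_string("anthropic/claude-3-5-sonnet-20241022"))
# ('anthropic', 'claude-3-5-sonnet-20241022')
```

Note that only the first slash is significant, so model names that themselves contain slashes still resolve to the right provider.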


Tutorials

Compile a Pipeline from a Prompt

Describe what you want in plain English and let the Trellis compiler generate a validated pipeline YAML. Covers CLI usage, Python SDK, model selection, repair attempts, and a full SEC extraction end-to-end example.

Tools: PipelineCompiler, trellis compile
Requires: LLM API key
Time: ~10 minutes


Web Search and Summarization

Search the web for a topic and distil the results into a structured summary using an LLM.

Tools: search_web → llm_job → store
Requires: LLM API key
Time: ~5 minutes


PDF Ingest, Page Selection, and Extraction

Load a PDF from disk or URL, select the pages that matter, extract structured content, and summarize.

Tools: ingest_document → select → extract_from_texts → llm_job
Requires: LLM API key
Time: ~10 minutes


BM25 Section Extraction

Extract structured fields from a specific section of a large document using keyword-based BM25 retrieval. Covers how chunk-mode select works, when to use it over page-mode, and how to build a BM25 index directly from the Python SDK.

Tools: ingest_document → select (BM25 mode) → extract_from_texts → export
Requires: LLM API key
Time: ~10 minutes
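The retrieval idea behind chunk-mode select can be sketched independently of the SDK. The following is an illustrative, self-contained Okapi BM25 scorer over pre-tokenized chunks, not Trellis code; the function name and defaults (k1=1.5, b=0.75) are assumptions for the sketch:

```python
import math
from collections import Counter

def bm25_scores(query: list[str], docs: list[list[str]],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n  # average document length
    df = Counter()                          # document frequency per term
    for d in docs:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

chunks = [["risk", "factors", "litigation"],
          ["revenue", "growth"],
          ["risk", "management"]]
print(bm25_scores(["risk", "factors"], chunks))
```

A chunk-mode select would then keep the top-scoring chunks rather than whole pages, which is why it suits pinpointing one section of a large filing.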


SEC Filing Field Extraction

Fetch a public 10-K or 10-Q from SEC EDGAR, ingest it, and extract typed financial fields against a declared schema. The full document-intelligence workflow end-to-end.

Tools: fetch_data → ingest_document → select → load_schema → extract_fields → export
Requires: LLM API key
Time: ~15 minutes


Exporting Results

Take any pipeline output and write it to JSON, Markdown, CSV, or XLSX with optional schema conformance checks.

Tools: export
Requires: Nothing (runs with mock data)
Time: ~5 minutes


The example pipelines below ship in examples/pipelines/ and can be run directly with trellis run:

File                          What it shows
single_mock.yaml              Minimal single-task pipeline
dependency_chain.yaml         Implicit task dependencies via {{task_id.output.field}}
fan_out.yaml                  Parallel fan-out with parallel_over and {{item}}
pipeline_inputs.yaml          Runtime inputs with {{pipeline.inputs.key}}
fetch_10k_parametrized.yaml   Typed params block with required and optional params
web_search_investor_day.yaml  Fan-out web search across multiple years
pdf_summarize.yaml            PDF → extract → LLM summarize
image_ocr_summarize.yaml      Scanned PDF with OCR → select → extract → summarize
extract_sec_field.yaml        Full SEC extraction: schema + manual + fetch + extract + export
bm25_field_extraction.yaml    BM25 section selection + extract_from_texts on a large HTML filing
meta_10k_workforce.yaml       Workforce facts from Meta's 2025 10-K using BM25 select
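The templating features listed above can be seen together in one place. The sketch below is a hypothetical pipeline, not a shipped example: task ids, input names, and output fields are illustrative assumptions, while the {{...}} forms are the ones named in the table.

```yaml
# Hypothetical sketch; task ids, inputs, and output fields are illustrative.
params:
  ticker:
    type: string
    required: true

tasks:
  - id: fetch
    tool: fetch_data
    inputs:
      query: "{{pipeline.inputs.ticker}} 10-K"   # runtime input templating

  - id: summarize
    tool: llm_job
    parallel_over: "{{fetch.output.documents}}"  # fan-out over fetched docs
    inputs:
      text: "{{item}}"                           # current fan-out element
      # referencing fetch.output creates an implicit dependency on `fetch`
```

Assuming this were saved under examples/pipelines/, it would be invoked the same way as the shipped files, via trellis run.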