PDF & Document Analysis

OpenClaw includes a dedicated PDF analysis tool that lets your agent read, interpret, and extract information from PDF documents. Whether you need to summarize a report, pull data from a table, or answer questions about a contract, the PDF tool handles it directly in your conversation.

How PDF analysis works

The PDF tool supports two processing modes depending on your configured model provider. When a model supports native PDF input, the file is sent directly for interpretation. When it does not, OpenClaw extracts text and images from the PDF and passes them as separate inputs.

Native PDF processing

Anthropic Claude and Google Gemini accept PDF files natively, so the model sees the full document structure: text, images, tables, headers, footers, and layout. Native processing generally gives the best results for complex documents with mixed content.

Fallback extraction

For providers without native PDF support, OpenClaw extracts text content and page images automatically. The extracted text is sent as structured input, and page images are processed through the vision model. This works well for most documents but may miss subtle layout cues.
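The choice between the two modes can be pictured with a small sketch. The provider identifiers, the set, and the function name here are assumptions made for illustration; they are not taken from OpenClaw's internals.

```python
# Providers the text names as accepting PDFs natively (Anthropic Claude,
# Google Gemini). The lowercase identifiers are hypothetical.
NATIVE_PDF_PROVIDERS = {"anthropic", "google"}


def choose_pdf_mode(provider: str) -> str:
    """Return "native" if the provider accepts PDF input directly, else "fallback"."""
    normalized = provider.strip().lower()
    return "native" if normalized in NATIVE_PDF_PROVIDERS else "fallback"


print(choose_pdf_mode("anthropic"))       # native: the file is sent as-is
print(choose_pdf_mode("other-provider"))  # fallback: text and page images are extracted
```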

Using the PDF tool

You can pass a single PDF or up to 10 files in one call. Each analysis requires a prompt that tells the model what to look for.

Single file analysis

Pass a local file path or URL to the pdf parameter. Include a prompt describing what you need extracted or analyzed.

Local paths: /home/user/documents/report.pdf
URLs: https://example.com/document.pdf
Combine with a clear prompt for best results
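As a rough illustration, the arguments for a single-file call might look like the following. The pdf parameter comes from the text above; the prompt key name, the helper function, and the dictionary shape are assumptions, not OpenClaw's confirmed call schema.

```python
def build_pdf_call(pdf: str, prompt: str) -> dict:
    """Assemble arguments for a single-file PDF analysis (hypothetical shape)."""
    if not pdf.strip():
        raise ValueError("pdf must be a local path or a URL")
    if not prompt.strip():
        raise ValueError("every analysis needs a prompt")
    return {"pdf": pdf, "prompt": prompt}


call = build_pdf_call(
    "/home/user/documents/report.pdf",
    "Summarize the key findings and list any stated risks.",
)
print(call)
```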

Batch analysis

Use the pdfs parameter with an array of up to 10 files. Batch analysis is useful for comparing documents, cross-referencing data, or processing a folder of invoices.
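A batch call can be sketched the same way, with a guard for the 10-file limit. The pdfs parameter and the limit come from the text; the prompt key, the helper name, and the file paths are hypothetical.

```python
MAX_BATCH_FILES = 10  # per-call limit stated in the text


def build_batch_call(pdfs: list[str], prompt: str) -> dict:
    """Assemble arguments for a multi-file PDF analysis (hypothetical shape)."""
    if not pdfs:
        raise ValueError("pdfs must contain at least one file")
    if len(pdfs) > MAX_BATCH_FILES:
        raise ValueError(f"at most {MAX_BATCH_FILES} files per call, got {len(pdfs)}")
    if not prompt.strip():
        raise ValueError("every analysis needs a prompt")
    return {"pdfs": list(pdfs), "prompt": prompt}


batch = build_batch_call(
    ["/home/user/invoices/january.pdf", "/home/user/invoices/february.pdf"],
    "List the invoice total from each file and flag any that differ.",
)
print(len(batch["pdfs"]))
```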

Page selection

For large documents, use the pages parameter to analyze specific pages only. Accepted formats:

"1-5" for a range
"1,3,7" for individual pages
"1,3,5-7" for mixed ranges

Page selection reduces token usage and processing time significantly for documents with hundreds of pages.
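To make the accepted formats concrete, here is a minimal sketch of how a pages string could be expanded into page numbers. It only illustrates the syntax; the function is hypothetical and is not OpenClaw's own parser.

```python
def parse_pages(spec: str) -> list[int]:
    """Expand a pages string such as "1,3,5-7" into a sorted list of page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)  # int("") raises ValueError for empty parts
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_pages("1-5"))      # [1, 2, 3, 4, 5]
print(parse_pages("1,3,7"))    # [1, 3, 7]
print(parse_pages("1,3,5-7"))  # [1, 3, 5, 6, 7]
```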

Practical use cases

Report summarization

Pass a quarterly report or research paper and ask for a structured summary. The model can extract key findings, financial data, and recommendations across all pages.

Contract review

Upload contracts or legal documents and ask about specific clauses, obligations, or risks. Native PDF processing preserves document structure, making it easier to reference specific sections.

Data extraction

Extract structured data from forms, invoices, or tables. With native PDF providers, table structure is preserved. In fallback mode, describe the table format in your prompt to improve accuracy.

Multi-document comparison

Use batch analysis to compare versions of the same document, check for discrepancies between contracts, or verify that updated terms match expectations.

Configuration and limits

File size and count

Parameter	Default or limit	Notes
maxBytesMb	20 (MB)	Maximum size of a single file
pdfs (batch)	10 files	Maximum number of files per call
pages	All pages	Optional page range filter
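A pre-flight size check along these lines can catch oversized files before a call is made. This is a sketch under assumptions: the text does not say whether a megabyte means 1024 * 1024 bytes or 1,000,000 bytes, so the binary interpretation and the function name are illustrative choices.

```python
DEFAULT_MAX_BYTES_MB = 20  # default from the table above


def exceeds_size_limit(size_bytes: int, max_bytes_mb: float = DEFAULT_MAX_BYTES_MB) -> bool:
    """Return True if a file of size_bytes is larger than the configured limit.

    Assumes 1 MB = 1024 * 1024 bytes; the actual unit is not stated in the text.
    """
    return size_bytes > max_bytes_mb * 1024 * 1024


print(exceeds_size_limit(5 * 1024 * 1024))   # False
print(exceeds_size_limit(25 * 1024 * 1024))  # True
```

For a file on disk, the size can be read with os.path.getsize(path) and passed in as size_bytes.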

Model selection

For best results with PDFs containing images, tables, or complex layouts, use a model with native PDF support (Claude or Gemini). For text-heavy documents, any model works well through the fallback extractor.

You can override the default model with the model parameter in the PDF tool call. This lets you route PDF analysis to a capable model without changing your global configuration.
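A per-call override might be expressed as one extra key on the call arguments. The model parameter is named in the text; the dictionary shape, the helper, and the placeholder model identifier are assumptions for illustration only.

```python
def with_model_override(call: dict, model: str) -> dict:
    """Return a copy of the call arguments with a per-call model set."""
    if not model.strip():
        raise ValueError("model must be a non-empty identifier")
    return {**call, "model": model}


base_call = {
    "pdf": "https://example.com/document.pdf",
    "prompt": "Extract every table and describe its columns.",
}
# "example-provider/example-model" is a placeholder, not a real model name.
routed_call = with_model_override(base_call, "example-provider/example-model")
print(routed_call["model"])
```

Returning a copy leaves the original arguments untouched, which mirrors the point that the override applies to one call and not to the global configuration.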

Tips for better results

Be specific in your prompt: Instead of "analyze this PDF", ask "extract the revenue figures from the Q3 financial table on page 4"
Use page selection: For large documents, narrow the scope to relevant pages
Verify extracted data: Always cross-check critical numbers and facts from the output
Combine with other tools: Use browser automation to download PDFs from portals, then analyze them with the PDF tool
Chunk very large documents: If a document exceeds size limits, split analysis across multiple calls with different page ranges
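The chunking tip can be sketched as a helper that produces one pages string per call. The function name and the chunk size are illustrative choices, not part of OpenClaw.

```python
def page_chunks(total_pages: int, chunk_size: int) -> list[str]:
    """Split pages 1..total_pages into consecutive ranges in the pages format."""
    if total_pages < 1 or chunk_size < 1:
        raise ValueError("total_pages and chunk_size must be positive")
    chunks = []
    for start in range(1, total_pages + 1, chunk_size):
        end = min(start + chunk_size - 1, total_pages)
        # A one-page chunk is written as a single page number, not "n-n".
        chunks.append(f"{start}-{end}" if end > start else str(start))
    return chunks


print(page_chunks(120, 50))  # ['1-50', '51-100', '101-120']
```

Each string can then be used as the pages value of a separate call against the same file.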

Troubleshooting

Common issues

File too large: Increase maxBytesMb (if your provider accepts larger uploads) or split the document
Poor extraction quality: Switch to a native PDF model (Claude or Gemini)
Missing images: Ensure the model has vision capabilities enabled
Timeout on large files: Use page selection to process smaller chunks
Encoding errors: Verify the PDF is not password-protected or corrupted

Security notes

PDF files are processed by the model provider and may be stored temporarily according to that provider's data policies
Do not upload documents containing secrets, credentials, or sensitive personal data to public model endpoints
For confidential documents, use a local model or a provider with explicit data handling guarantees

FAQ

Which models support native PDF analysis?

Anthropic Claude and Google Gemini models support native PDF processing, meaning they can interpret layout, images, and tables directly. Other providers fall back to text and image extraction.

What is the maximum file size for PDF analysis?

The default limit is 20 MB per file. You can adjust this with the maxBytesMb parameter if your provider supports larger uploads.

Can I analyze multiple PDFs at once?

Yes. Use the pdfs parameter (array) to pass up to 10 PDF files in a single analysis call. Each file is processed independently.

Does PDF analysis work with scanned documents?

For scanned documents, OpenClaw extracts images from the PDF pages and passes them to the vision model. OCR quality depends on the underlying model's capabilities.

Can I limit analysis to specific pages?

Yes. Use the pages parameter with ranges like "1-5" or "1,3,5-7" to analyze only the pages you need. This saves tokens and processing time.