Get started in five steps
Convert any document into high-quality, LLM-ready data.
Step 1: Install the Ingestor SDK
Our SDK is currently available for Python and Node, with REST support coming soon.
Step 2: Initialize the client
Go to your dashboard to copy your API key.
Step 3: Call the document parsing API
We process your documents on a distributed, highly concurrent system. Each document passes through a multi-stage pipeline in which every stage builds on the last: splitting files, analyzing layout, extracting content, removing noise, formatting content, classifying, enriching content, and so on. The example below shows how to parse a loan application scan that may include multiple documents, such as the form, ID, and supporting materials.
Step 4: Poll for job completion
Because jobs run through a multi-step pipeline, they may take several minutes to complete. Additional processing_options can further increase this time.
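Since a job can take minutes, a polling loop with a fixed interval and an overall timeout is one reasonable way to wait for it. The sketch below is not the Ingestor SDK's API: `fetch_status` is a hypothetical callable you would wrap around whatever status call the SDK provides, and the status strings "completed" and "failed" are assumptions made for illustration.

```python
import time
from typing import Callable

# Assumed terminal states; check the API reference for the real values.
TERMINAL_STATES = {"completed", "failed"}


def poll_until_done(
    fetch_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status() until it returns a terminal state or time runs out.

    fetch_status is a hypothetical wrapper around the SDK's job-status call.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

If you enable additional processing_options, consider raising `timeout_s` accordingly, since those options can lengthen processing time.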
Step 5: Extract key-value fields (optional)
Use our SDK to extract specific key-value pairs from the parsed document response. You have two options:
- Use our hosted endpoint (default): We’ll handle the extraction and bill you for usage.
- Bring your own OpenAI API key: Pass your own openai_api_key and we won’t charge you for extraction; OpenAI will bill you directly.
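The choice between the two options can be made in one place in your own code. The helper below is a minimal sketch under stated assumptions: only the `openai_api_key` parameter name comes from this guide, while `build_extraction_options`, the `fields` key, and the dictionary shape are hypothetical and may not match what the SDK expects.

```python
from typing import Optional


def build_extraction_options(
    fields: list[str],
    openai_api_key: Optional[str] = None,
) -> dict:
    """Build a hypothetical options dict for the extraction call.

    Without a key, the options imply the hosted endpoint (billed for usage).
    With a key, openai_api_key is included so OpenAI bills you directly.
    """
    options: dict = {"fields": list(fields)}  # "fields" is an assumed name
    if openai_api_key:
        options["openai_api_key"] = openai_api_key
    return options
```

Keeping the key optional means the hosted endpoint remains the default, which matches the behavior described above.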
Next steps
Now that you’ve parsed your first document, explore these key concepts:
Parse content
Learn about different document parsing functions.
Extract structured outputs
Learn best practices to ensure reliable structured outputs.
Agentic chunking
Learn how to get contextually rich chunks so your agent never misses critical context.
API reference
Explore endpoints, schemas, and examples to integrate programmatically.
Need help? Join our Discord.