Important Considerations (Supported Features)
The library supports:
- These APIs: ADE Parse, ADE Parse Jobs, ADE Split, and ADE Extract
- Setting the environment parameter
- The legacy API
Install the Library
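A typical install command; the PyPI package name here is an assumption, so confirm it against the official install docs:

```shell
pip install landingai-ade
```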
Set the API Key as an Environment Variable
To use the library, first generate an API key. Save the key to a .zshrc file or another secure location on your computer. Then export the key as an environment variable.
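For example, in your shell profile; the variable name below is an assumption, so use the exact name given in the API Key docs:

```shell
# Add to ~/.zshrc (variable name is an assumption; check the API Key docs)
export VISION_AGENT_API_KEY="your-api-key-here"
```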
For more information about API keys and alternate methods for setting the API key, go to API Key.
Use with EU Endpoints
By default, the library uses the US endpoints. If your API key is from the EU endpoint, set the environment parameter to eu when initializing the client.
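A minimal sketch, assuming the client class is `LandingAIADE` imported from `landingai_ade` and that `environment` is a constructor argument:

```python
from landingai_ade import LandingAIADE  # import path is an assumption

# Route requests to the EU endpoint instead of the default US endpoints
client = LandingAIADE(environment="eu")
```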
For more information about using the API in the EU, go to European Union (EU).
Parse: Getting Started
The parse function converts documents into structured Markdown with chunk and grounding metadata. Use these examples as guides to get started with parsing with the library.
Parse Local Files
Use the document parameter to parse files from your filesystem. Pass the file path as a Path object.
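A sketch of parsing a local file; the import path and the assumption that parse is a client method are mine, and the file path is illustrative:

```python
from pathlib import Path

from landingai_ade import LandingAIADE  # import path is an assumption

client = LandingAIADE()

# Parse a file from the local filesystem via the `document` parameter
response = client.parse(document=Path("documents/invoice.pdf"))
print(response.markdown)
```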
Parse Remote URLs
Use the document_url parameter to parse files from remote URLs (http, https, ftp, ftps).
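A sketch of parsing a remote file, under the same assumptions about the import path and client method:

```python
from landingai_ade import LandingAIADE  # import path is an assumption

client = LandingAIADE()

# Parse a file hosted at a remote URL via the `document_url` parameter
response = client.parse(document_url="https://example.com/sample.pdf")
print(response.markdown)
```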
Set Parameters
The parse function accepts optional parameters to customize parsing behavior. To see all available parameters, go to ADE Parse API.
Pass these parameters directly to the parse() function.
Parse Jobs
The parse_jobs function enables you to asynchronously parse documents that are up to 1,000 pages or 1 GB.
For more information about parse jobs, go to Parse Large Files (Parse Jobs).
Here is the basic workflow for working with parse jobs:
- Start a parse job.
- Copy the job_id in the response.
- Get the results from the parsing job with the job_id.
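The workflow above might be sketched as follows; the import path and the create/get method names are placeholders of mine, so check the Parse Jobs reference for the actual calls:

```python
from pathlib import Path

from landingai_ade import LandingAIADE  # import path is an assumption

client = LandingAIADE()

# 1. Start a parse job for a large document
job = client.parse_jobs.create(document=Path("large_report.pdf"))  # placeholder method name

# 2. Copy the job_id from the response
job_id = job.job_id

# 3. Get the results from the parse job with the job_id
result = client.parse_jobs.get(job_id)  # placeholder method name
```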
List Parse Jobs
To list all async parse jobs associated with your API key, use the library's job-listing call.
Parse Output
The parse function returns a ParseResponse object with the following fields:
- chunks: List of Chunk objects, one for each parsed region
- markdown: Complete Markdown representation of the document
- metadata: Processing information (credit usage, duration, filename, job ID, page count, version)
- splits: List of Split objects organizing chunks by page or section
- grounding: Dictionary mapping chunk IDs to detailed grounding information
Common Use Cases for ParseResponse Fields
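One common use is collecting every text chunk. The stand-in object below only mimics the documented field names (the Chunk attribute names chunk_type and markdown are assumptions); a real ParseResponse comes from parse:

```python
from types import SimpleNamespace

# Stand-in for a ParseResponse; field names mirror the documented list,
# and the Chunk attribute names (chunk_type, markdown) are assumptions
response = SimpleNamespace(
    chunks=[
        SimpleNamespace(chunk_type="text", markdown="Hello world"),
        SimpleNamespace(chunk_type="table", markdown="| a | b |"),
    ]
)

# Collect the Markdown of every text chunk
text_chunks = [c.markdown for c in response.chunks if c.chunk_type == "text"]
print(text_chunks)
```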
Access all text chunks, for example by iterating over the chunks field.
Split: Getting Started
The split function classifies and separates a parsed document into multiple sub-documents based on Split Rules you define. Use these examples as guides to get started with splitting with the library.
Pass Markdown Content
The library supports a few methods for passing the Markdown content for splitting:
- Split data directly from the parse response
- Split data from a local Markdown file
- Split data from a Markdown file at a remote URL: markdown_url="https://example.com/file.md"
Each Split Type you define has the following fields:
- name: The Split Type name (required)
- description: Additional context about what this Split Type represents (optional)
- identifier: A field that makes each instance unique, used to create separate splits (optional)
Split from Parse Response
After parsing a document, you can pass the Markdown string directly from the ParseResponse to the split function without saving it to a file.
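A sketch of that flow; the import path, the assumption that split is a client method, and the split_rules parameter name are all mine:

```python
from pathlib import Path

from landingai_ade import LandingAIADE  # import path is an assumption

client = LandingAIADE()
parse_response = client.parse(document=Path("statements.pdf"))

# Feed the Markdown straight from the ParseResponse into split
split_response = client.split(
    markdown=parse_response.markdown,
    split_rules=...,  # placeholder: your Split Rules per the ADE Split API
)
```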
Split from Markdown Files
If you already have a Markdown file (from a previous parsing operation), you can split it directly. Use the markdown parameter for local Markdown files or markdown_url for remote Markdown files.
Set Parameters
The split function accepts optional parameters to customize split behavior. To see all available parameters, go to ADE Split API.
Split Output
The split function returns a SplitResponse object with the following fields:
- splits: List of Split objects, each containing:
  - classification: The Split Type name assigned to this sub-document
  - identifier: The unique identifier value (or None if no identifier was specified)
  - pages: List of zero-indexed page numbers that belong to this split
  - markdowns: List of Markdown content strings, one for each page
- metadata: Processing information (credit usage, duration, filename, job ID, page count, version)
Common Use Cases for SplitResponse Fields
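One common use is grouping sub-documents by Split Type. The stand-in object below only mimics the documented field names; a real SplitResponse comes from split:

```python
from types import SimpleNamespace

# Stand-in for a SplitResponse; field names mirror the documented list
split_response = SimpleNamespace(
    splits=[
        SimpleNamespace(classification="invoice", identifier="INV-1", pages=[0, 1]),
        SimpleNamespace(classification="receipt", identifier=None, pages=[2]),
        SimpleNamespace(classification="invoice", identifier="INV-2", pages=[3]),
    ]
)

# Group sub-documents by their Split Type classification
by_type = {}
for split in split_response.splits:
    by_type.setdefault(split.classification, []).append(split)

print({name: len(items) for name, items in by_type.items()})
```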
Access all splits by classification, for example by grouping the splits list on each split's classification field.
Extract: Getting Started
The extract function extracts structured data from Markdown content using extraction schemas. Use these examples as guides to get started with extracting with the library.
Pass Markdown Content
The library supports a few methods for passing the Markdown content for extraction:
- Extract data directly from the parse response
- Extract data from a local Markdown file
- Extract data from a Markdown file at a remote URL: markdown_url="https://example.com/file.md"
Extract from Parse Response
After parsing a document, you can pass the Markdown string directly from the ParseResponse to the extract function without saving it to a file.
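A sketch of that flow; the import path, the assumption that extract is a client method, and the schema parameter name are all mine:

```python
from pathlib import Path

from landingai_ade import LandingAIADE  # import path is an assumption

client = LandingAIADE()
parse_response = client.parse(document=Path("pay_stub.pdf"))

# Feed the Markdown straight from the ParseResponse into extract
extract_response = client.extract(
    markdown=parse_response.markdown,
    schema=...,  # placeholder: your extraction schema; parameter name is an assumption
)
```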
Extract from Markdown Files
If you already have a Markdown file (from a previous parsing operation), you can extract data directly from it. Use the markdown parameter for local Markdown files or markdown_url for remote Markdown files.
Extraction with Pydantic
Use Pydantic models to define your extraction schema in a type-safe way. The library provides a helper function to convert Pydantic models to JSON schemas.
Extraction with JSON Schema (Inline)
Define your extraction schema directly as a JSON string in your script.
Extraction with JSON Schema File
Load your extraction schema from a separate JSON file for better organization and reusability. For example, you might keep a pay_stub_schema.json file alongside your script.
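The original schema file is not reproduced here; a hypothetical pay-stub schema of the same shape, written out and loaded back with only the standard library, might look like this:

```python
import json
from pathlib import Path

# A hypothetical pay-stub schema (field names are illustrative only)
schema = {
    "type": "object",
    "properties": {
        "employee_name": {"type": "string"},
        "pay_period": {"type": "string"},
        "net_pay": {"type": "number"},
    },
    "required": ["employee_name", "net_pay"],
}

# Save it alongside your script for reuse...
Path("pay_stub_schema.json").write_text(json.dumps(schema, indent=2))

# ...then load it when calling extract
loaded = json.loads(Path("pay_stub_schema.json").read_text())
print(loaded["required"])
```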
Extract Nested Subfields
Define nested Pydantic models to extract hierarchical data from documents. This approach organizes related information under meaningful section names. Define the models with nested fields before the main extraction schema; otherwise, you may get an error that the classes with the nested fields are not defined. For example, to extract data from the Patient Details and Emergency Contact Information sections in this Medical Form, define separate models for each section, then combine them in a main model.
Extract Variable-Length Data with List Objects
Use the Python List type inside a Pydantic BaseModel to extract repeatable data structures when you don't know how many items will appear. Common examples include line items in invoices, transaction records, or contact information for multiple people.
For example, to extract variable-length wire instructions and line items from this Wire Transfer Form, use List[DescriptionItem] for line items and List[WireInstruction] for wire transfer details.
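A sketch of such models; DescriptionItem and WireInstruction come from the example above, while the fields inside each model are illustrative stand-ins for the real form's fields:

```python
from typing import List, Optional

from pydantic import BaseModel


class DescriptionItem(BaseModel):
    # Illustrative fields for one line item
    description: str
    amount: Optional[float] = None


class WireInstruction(BaseModel):
    # Illustrative fields for one set of wire transfer details
    bank_name: str
    account_number: str


class WireTransferForm(BaseModel):
    # List fields capture a variable number of repeated items
    line_items: List[DescriptionItem]
    wire_instructions: List[WireInstruction]
```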
Extraction Output
The extract function returns an ExtractResponse object with the following fields:
- extraction: The extracted key-value pairs as defined by your schema
- extraction_metadata: Metadata showing which chunks were referenced for each extracted field
- metadata: Processing information including credit usage, duration, filename, job ID, version, and schema validation errors
Linking Extracted Data to Document Locations
Use the reference IDs from extraction_metadata to find the exact location where data was extracted in the source document. This is useful for visual validation, quality assurance, or building confidence scores.
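A sketch of the lookup using plain dictionaries as stand-ins; the exact shapes of the extraction_metadata entries and the grounding records are assumptions based on the field descriptions above:

```python
# Stand-ins for real response data (shapes are assumptions)
extraction = {"net_pay": 2450.0}
extraction_metadata = {"net_pay": {"references": ["chunk-123"]}}
grounding = {
    "chunk-123": {"page": 0, "box": [0.1, 0.2, 0.4, 0.25]},
}

# Resolve each extracted field to its source location(s) in the document
locations = {
    field: [grounding[chunk_id] for chunk_id in meta["references"]]
    for field, meta in extraction_metadata.items()
}

for field, locs in locations.items():
    for loc in locs:
        print(field, "found on page", loc["page"], "at box", loc["box"])
```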
Sample Scripts for Common Use Cases
Parse a Directory of Documents
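A minimal sketch, assuming the client class is LandingAIADE and parse is a client method; the directory path and glob pattern are illustrative:

```python
from pathlib import Path

from landingai_ade import LandingAIADE  # import path is an assumption

client = LandingAIADE()

# Parse every PDF in a directory and collect the Markdown per file
results = {}
for pdf in sorted(Path("documents").glob("*.pdf")):
    response = client.parse(document=pdf)
    results[pdf.name] = response.markdown
```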
Async Parse: Processing Multiple Documents Concurrently
Use AsyncLandingAIADE when you need to process many lightweight documents (such as invoices, receipts, or forms) efficiently. This async client allows you to send multiple parse requests concurrently using Python's asyncio, which significantly reduces total processing time compared to sequential requests.
The async approach lets you send multiple requests in parallel. While one document is being processed, another request can be sent. The API server handles the actual document processing in the background.
To avoid exceeding the pages per hour limits and receiving 429 errors, use a client-side rate limiter like aiolimiter to control concurrency.
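The pattern looks roughly like this; a stub coroutine stands in for the real AsyncLandingAIADE call, and an asyncio.Semaphore plays the role a rate limiter such as aiolimiter would:

```python
import asyncio

MAX_CONCURRENT = 3  # keep concurrency below your pages-per-hour budget


async def parse_one(sem: asyncio.Semaphore, name: str) -> str:
    # In real code this would be: await async_client.parse(document=...)
    async with sem:
        await asyncio.sleep(0.01)  # simulated network call
        return f"parsed:{name}"


async def parse_all(names: list[str]) -> list[str]:
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    # gather runs the coroutines concurrently and preserves input order
    return await asyncio.gather(*(parse_one(sem, n) for n in names))


results = asyncio.run(parse_all(["a.pdf", "b.pdf", "c.pdf", "d.pdf"]))
print(results)
```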
Save Parsed Output
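A sketch using stand-in data and only the standard library; in a real script the markdown string and chunk list would come from a ParseResponse:

```python
import json
from pathlib import Path

# Stand-in data; in real code these come from a ParseResponse
markdown = "# Invoice\n\nTotal: $42"
chunks = [{"chunk_id": "c1", "markdown": "# Invoice"}]

out = Path("output")
out.mkdir(exist_ok=True)

# Markdown for human review, JSON for downstream processing
(out / "document.md").write_text(markdown)
(out / "document.json").write_text(json.dumps({"chunks": chunks}, indent=2))

print(sorted(p.name for p in out.iterdir()))
```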
Use this script to save the parsed output to JSON and Markdown files to use for downstream processing.
Visualize Parsed Chunks: Draw Bounding Boxes
Use this script to visualize parsed chunks by drawing color-coded bounding boxes on your document. Each chunk type uses a distinct color, making it easy to see how the document was parsed. The script identifies chunk types and table cells. For PDFs, the script creates a separate annotated PNG for each page (page_1_annotated.png, page_2_annotated.png). For image files, the script creates a single page_annotated.png.
The image below shows an example output with bounding boxes drawn on the first page of a PDF:

Save Parsed Chunks as Images
Use this script to extract and save each parsed chunk as a separate PNG. This is useful for building datasets, analyzing chunk quality, or processing individual document regions.
- TIMESTAMP is the time and date the document was parsed (format: YYYYMMDD_HHMMSS)
- page_0 is the zero-indexed page number
- ChunkType is the chunk type
- CHUNK_ID is the unique chunk identifier (UUID format)

