Decoupled Parse and Extraction APIs
In our original launch of ADE, field extraction was part of the parsing function: every time you wanted to run extraction, you had to run parsing, even if you had already parsed the document. In September 2025, we introduced two new endpoints that separate these functions: ADE Parse and ADE Extract. These APIs let you decouple parsing and extraction workflows for greater flexibility. You can parse the document once with the ADE Parse API, and then use the ADE Extract API to run field extraction on that output multiple times. This is helpful if you want to experiment with different extraction schemas or you have multiple extraction tasks.
Availability
The ADE Parse and ADE Extract APIs are available:
- in the Playground
- by calling the ADE Parse and ADE Extract endpoints directly
- when using the library
Process Overview
- Create an extraction schema. The easiest way is to use the Schema Wizard in the Playground to generate a schema.
- Run the ADE Parse API. This returns the parsed content as Markdown.
- Run a script to save the returned Markdown text as a Markdown file.
- Run the ADE Extract API on the Markdown output from the ADE Parse API.
- If needed, connect the extracted fields to their original locations in the document.
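The benefit of this flow is that the parse step runs once while extraction can run many times. Here is a minimal sketch of that pattern. The `parse` and `extract` callables are hypothetical stand-ins for whatever code you use to call the two APIs; they are not functions from any library.

```python
from typing import Any, Callable, Dict


def run_extractions(
    parse: Callable[[str], str],
    extract: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    document_path: str,
    schemas: Dict[str, Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Parse a document once, then run one extraction per schema.

    `parse` and `extract` are hypothetical wrappers that you supply:
    `parse` takes a file path and returns Markdown, and `extract` takes
    Markdown plus a schema and returns the extracted fields.
    """
    markdown = parse(document_path)  # the only parse call
    # Reuse the same Markdown for every schema.
    return {name: extract(markdown, schema) for name, schema in schemas.items()}
```

Passing the callables in keeps the sketch independent of how you make the HTTP requests, and makes it easy to try several schemas against one parsed document.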
Use ADE Parse to Parse Documents
See the full API reference here. Use the ADE Parse API to parse data from documents.
Rotation detection can be enabled upon request. To request this feature, contact support@landing.ai.
Specify Documents to Parse
The ADE Parse API offers two parameters for specifying the document you want to parse:
- document: Specify the actual file you want to parse.
- document_url: Include the URL to the file that you want to parse.
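As an illustration, the helper below builds the form field that identifies the document. It is a sketch under assumptions: the function name is hypothetical, and the rule that exactly one of the two parameters is supplied is our assumption rather than something stated here. Check the API reference for how the API treats requests that set both.

```python
from pathlib import Path
from typing import Dict, Optional, Union


def parse_document_field(
    path: Optional[str] = None, url: Optional[str] = None
) -> Dict[str, Union[bytes, str]]:
    """Return the single form field that tells the API which document to parse."""
    # Assumption: a request carries either `document` or `document_url`, not both.
    if (path is None) == (url is None):
        raise ValueError("Provide exactly one of path or url")
    if url is not None:
        return {"document_url": url}
    # `document` carries the file content itself.
    return {"document": Path(path).read_bytes()}
```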
Set Up Splits for Parsing
By default, the full document is parsed when you call the ADE Parse API. However, you can set the split parameter to page to parse each page of the document separately. When this is selected, the splits object in the API output contains a set of data for each page.
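If you parse with split set to page, you may want one Markdown file per page. The sketch below assumes that splits is a list and that each entry carries its page content under a markdown key; neither detail is confirmed in this article, so verify the field names against the API reference before relying on them.

```python
from pathlib import Path
from typing import Any, Dict, List


def save_page_splits(response: Dict[str, Any], out_dir: str) -> List[Path]:
    """Write one Markdown file per entry in response["splits"]."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for index, split in enumerate(response.get("splits", [])):
        # Assumption: each split entry has a "markdown" string.
        target = out / f"page_{index}.md"
        target.write_text(split["markdown"], encoding="utf-8")
        written.append(target)
    return written
```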
Parsed Output
When you run ADE Parse, the API returns a response in JSON.
Differences Between the Parsed Output from Legacy API & New API
If you’ve been calling the legacy API endpoint (https://api.va.landing.ai/v1/tools/agentic-document-analysis), you will notice that the output for the new API is different. If you’re switching from that endpoint to the new endpoint, you may need to update any scripts you have that interact with the parsed output. Here are some key ways in which the output is different:
- The output doesn’t include any extraction data, because the ADE Parse API doesn’t perform extraction.
- The output is not wrapped in a data object.
- Each chunks object now has a markdown attribute.
- The chunk type is defined in the type attribute. (The legacy endpoint defines this in chunk_type.)
- The chunk ID is defined in the id attribute. (The legacy endpoint defines this in chunk_id.)
- The coordinates of each chunk’s bounding box are now spelled out in the attributes left, top, right, and bottom. (The legacy API abbreviates the coordinates: l, t, r, b.)
- The output includes a splits object that shows how the document was split during the parsing process.
- The output includes a metadata object that includes important information about the parsing process.
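If you have scripts written against the legacy output, one option is to rename the legacy attribute names to the new ones before the rest of your code runs. The sketch below covers only the renames listed above (chunk_type, chunk_id, and the l, t, r, b coordinates). The helper names are hypothetical, and where the bounding box sits inside a chunk is left to the caller because this article does not specify it.

```python
from typing import Any, Dict

# Legacy name -> new name, taken from the list of differences above.
BOX_KEYS = {"l": "left", "t": "top", "r": "right", "b": "bottom"}
CHUNK_KEYS = {"chunk_type": "type", "chunk_id": "id"}


def rename_keys(obj: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of obj with mapped keys renamed and all other keys kept."""
    return {mapping.get(key, key): value for key, value in obj.items()}


def normalize_box(box: Dict[str, Any]) -> Dict[str, Any]:
    """Rename l/t/r/b to left/top/right/bottom."""
    return rename_keys(box, BOX_KEYS)


def normalize_chunk(chunk: Dict[str, Any]) -> Dict[str, Any]:
    """Rename chunk_type/chunk_id to type/id; nested values are left untouched."""
    return rename_keys(chunk, CHUNK_KEYS)
```

Output that already uses the new names passes through unchanged, so the same helpers can sit in front of code that handles both formats during a migration.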
Use ADE Extract to Extract Fields from Markdown
See the full API reference here. Use the ADE Extract API to extract data from the Markdown output created by the ADE Parse API.
Specify Documents to Run Extraction On
The ADE Extract API offers two parameters for specifying the Markdown you want to run extraction on:
- markdown: Specify the actual Markdown file you want to run extraction on.
- markdown_url: Include the URL to the Markdown file you want to run extraction on.
Set the Extraction Schema
Set the extraction schema in the schema parameter. This must be a valid JSON schema. To learn more about extraction schemas and how to create them, go to Overview: Extract Data.
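Before sending a request, it can help to confirm that the schema is well-formed JSON. The sketch below pairs the markdown and schema parameters in one dictionary of form fields. The function name is hypothetical, and sending the schema as a JSON string is our assumption. The check covers JSON syntax only; it does not verify that the schema follows the JSON Schema rules.

```python
import json
from typing import Any, Dict, Union


def extract_form_fields(
    markdown: str, schema: Union[str, Dict[str, Any]]
) -> Dict[str, str]:
    """Return form fields for an extraction request: Markdown text plus schema text."""
    if isinstance(schema, str):
        # json.JSONDecodeError (a ValueError) is raised if the text is not valid JSON.
        schema = json.loads(schema)
    if not isinstance(schema, dict):
        raise ValueError("schema must be a JSON object")
    # Assumption: the schema parameter is sent as a JSON string.
    return {"markdown": markdown, "schema": json.dumps(schema)}
```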
Extracted Output
When you run ADE Extract, the API returns a response in JSON.
Differences Between the Extracted Output from Legacy API & New API
If you’ve been calling the legacy API endpoint (https://api.va.landing.ai/v1/tools/agentic-document-analysis), you will notice that the output for the new API is different. If you’re switching from that endpoint to the new endpoint, you may need to update any scripts you have that interact with the extraction output. Here are the key ways in which the output is different:
- The output doesn’t include confidence scores.
- The output doesn’t contain the coordinates of the bounding boxes for each chunk. Instead, it contains a unique ID (id) for the chunk that an extracted key-value pair is from. If you need to locate the source of a key-value pair, you can create a script that connects the id to the bounding box coordinates from the ADE Parse output. To get a sample script that does this, go to End-to-End Workflow: Parse, Extract, and Visually Ground Extracted Fields.
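The lookup described in the last item can be sketched as a dictionary keyed by chunk id. This is not the sample script from the tutorial. The grounding key and the field_chunk_ids mapping (field name to a list of chunk ids) are assumptions made for illustration, so adapt them to the actual shape of your parse and extract responses.

```python
from typing import Any, Dict, List


def boxes_by_chunk_id(chunks: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Index location data from the parse output by chunk id."""
    # Assumption: each chunk stores its location under a "grounding" key.
    return {chunk["id"]: chunk.get("grounding") for chunk in chunks}


def locate_fields(
    field_chunk_ids: Dict[str, List[str]], chunks: List[Dict[str, Any]]
) -> Dict[str, List[Any]]:
    """Map each extracted field to the location data of its source chunks."""
    lookup = boxes_by_chunk_id(chunks)
    return {
        field: [lookup[chunk_id] for chunk_id in ids if chunk_id in lookup]
        for field, ids in field_chunk_ids.items()
    }
```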
End-to-End Workflow: Parse and Extract
This tutorial walks you through how to parse a document with the ADE Parse API and then extract a subset of fields from it using the ADE Extract API. We provide a separate script for each endpoint, so you can choose to skip the extraction steps if you don’t need them.
Scenario and materials:
- Parse this PDF: Wire Transfer Form
- Extract these fields: Bank Account and Bank Account Number
- JSON extraction schema: Schema for Wire Transfer
1. Parse and Save Content as a Markdown File
First, run the script below to parse the document and save the response to a Markdown file (similar to Markdown for Wire Transfer).
Each chunk in the parsed output has an id. For example, the first chunk is the text “# WIRE TRANSFER FORM”. The id for that chunk is 33335548-e7c3-40bd-898e-4f23d6c99d34.
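The save step on its own can be sketched as follows. This is an illustration rather than the tutorial's script: it starts from the JSON text of a parse response that you have already received, and it assumes the Markdown sits under a top-level markdown key. The function name is hypothetical.

```python
import json
from pathlib import Path


def save_markdown(response_text: str, out_path: str) -> Path:
    """Write the Markdown from a parse response (JSON text) to a .md file."""
    response = json.loads(response_text)
    # Assumption: the parsed content is stored under a top-level "markdown" key.
    target = Path(out_path)
    target.write_text(response["markdown"], encoding="utf-8")
    return target
```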
2. Create a JSON Extraction Schema
As a reminder, we want to extract these fields from the Wire Transfer form: Bank Account and Bank Account Number. To do this, create a JSON extraction schema that identifies these fields. We will use this JSON file when we run the ADE Extract API in the next step. We’ve created the JSON schema below for you to use. You can also download this schema here: Schema for Wire Transfer.
To learn more about extraction schemas and how to create them, go to Overview: Extract Data.
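For orientation, a JSON schema for these two fields might look roughly like the one built below. This is a hypothetical illustration, not the downloadable Schema for Wire Transfer: the property names and descriptions are our own, so use the provided schema when following the tutorial.

```python
import json

# Hypothetical schema for illustration; property names and descriptions are assumptions.
wire_transfer_schema = {
    "type": "object",
    "properties": {
        "bank_account": {
            "type": "string",
            "description": "The bank account listed on the wire transfer form.",
        },
        "bank_account_number": {
            "type": "string",
            "description": "The bank account number listed on the wire transfer form.",
        },
    },
    "required": ["bank_account", "bank_account_number"],
}

# Serialize to JSON text, ready to save to a file or pass as the schema parameter.
schema_json = json.dumps(wire_transfer_schema, indent=2)
```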
3. Use the Extraction Schema to Extract Data from the Markdown File
Now that we have the parsed output in a Markdown file and a JSON extraction schema, we’re ready to extract these fields: Bank Account and Bank Account Number. To do this, run the script below.
End-to-End Workflow: Parse, Extract, and Visually Ground Extracted Fields
This tutorial walks you through how to parse a document, extract a subset of fields, and then connect the fields back to their original locations in the document. We provide a single script for the full workflow. Running this script saves images of the locations of the fields as PNGs.
Scenario and materials:
- Parse this PDF: Wire Transfer Form
- Extract these fields: Bank Account and Bank Account Number
- JSON extraction schema: Schema for Wire Transfer
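One piece of the visual grounding step is turning a chunk's bounding box into pixel coordinates so the region can be cropped or outlined on a page image. The sketch below assumes the left, top, right, and bottom values are fractions of the page width and height (between 0 and 1). That is an assumption on our part; if your output already uses pixel values, skip the scaling. The function name is hypothetical.

```python
from typing import Dict, Tuple


def box_to_pixels(
    box: Dict[str, float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Scale a box to a (left, top, right, bottom) pixel rectangle.

    Assumption: box values are fractions of the page size, from 0 to 1.
    """
    return (
        round(box["left"] * width),
        round(box["top"] * height),
        round(box["right"] * width),
        round(box["bottom"] * height),
    )
```

The returned tuple follows the (left, top, right, bottom) order that many imaging libraries accept for crop and rectangle operations.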

