December 4, 2025: Document Splitting Is Now in Preview

Split Documents with the New Split API (Preview)

We’re releasing a Preview of the Split API, which classifies and separates a parsed document into multiple sub-documents based on Split Rules you define. This is useful when you receive batched documents containing multiple document types or multiple instances of the same document type. For example, a financial institution processing KYC documentation might receive a single PDF containing bank statements, utility bills, and identification documents for a customer. The API can automatically classify and separate each document type, enabling downstream processing systems to route each document appropriately. Get the full details in Split.
The Split API is in Preview. This feature is still in development and may not return accurate results. Do not use this feature in production environments.

How It Works

  1. Parse your document using the ADE Parse API to generate Markdown output
  2. Define Split Rules that describe the document types or sections you want to identify
  3. Call the Split API with the parsed Markdown and your Split Rules
  4. The API returns each classified sub-document with its full Markdown content
For the complete workflow, go to Process Overview.

When to Use Split

Use the Split API when you need to:
  • Separate batched documents containing multiple document types (invoices, receipts, contracts)
  • Split documents with repeated sections by unique identifiers (multiple pay stubs by date)
  • Organize multi-section documents into logical parts (academic articles with body, references, supplemental materials)
  • Route different document types to appropriate downstream systems
For more use cases, go to Example Use Cases.

How to Use Split

The Split API is available through multiple interfaces:
  • Playground: Interactively create and test Split Rules
  • API: Integrate directly into your applications
  • Python Library: Integrate into Python-based applications with our Python library
  • TypeScript Library: Integrate into TypeScript-based applications with our TypeScript library
December 4, 2025: Revamped Playground

Revamped Playground

We’ve launched a complete redesign of our Playground! The updated Playground now guides you through each step of document processing: Parsing, Splitting, and Extraction. Click a tile to get started. You can now see all the files you’ve processed on your Playground homepage, including which tools you’ve run on each file (parse, extract, split). We’ve also made it easier to get help by adding Product Update and Resources panels.
November 27, 2025: Updates to Parse Jobs

The Parse Jobs API Supports up to 6,000-Page Documents

The ADE Parse Jobs API now supports documents up to 6,000 pages long. Previously, the limit was 1,000 pages. For more information, go to Rate Limits for ADE Parse Jobs.

Improved Support for Partial Content with the Parse Jobs API

We’ve improved how the ADE Parse Jobs API handles partially parsed documents. Previously, if any pages failed to process, the job would fail with status failed. Now, the API processes all pages in the document. If some pages fail, the job completes with status completed, and the successfully processed pages are returned in the results. The failed_pages array in the metadata lists which pages failed, and the failure_reason field provides details about the failures. For more information, go to Troubleshoot Parsing.
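Because a job can now finish with status completed even when some pages failed, client code may want to check failed_pages explicitly. The following is a minimal sketch of that check, not the official client. It assumes the job result is a dictionary with a top-level status key and a metadata object that holds both failed_pages and failure_reason; the exact response layout, the summarize_parse_job helper, and the sample payload are hypothetical.

```python
from typing import Any


def summarize_parse_job(result: dict[str, Any]) -> dict[str, Any]:
    """Summarize a finished parse job and flag partially parsed documents.

    Assumes (hypothetically) that `result` has a "status" key and a
    "metadata" dict containing "failed_pages" and "failure_reason".
    """
    metadata = result.get("metadata") or {}
    failed_pages = list(metadata.get("failed_pages") or [])
    status = result.get("status")
    return {
        "status": status,
        # A job is "partial" when it completed but some pages failed.
        "is_partial": status == "completed" and bool(failed_pages),
        "failed_pages": failed_pages,
        "failure_reason": metadata.get("failure_reason"),
    }


# Hypothetical payload for illustration only.
example_result = {
    "status": "completed",
    "metadata": {"failed_pages": [3, 7], "failure_reason": "sample reason text"},
}
summary = summarize_parse_job(example_result)
if summary["is_partial"]:
    print(f"Pages {summary['failed_pages']} failed: {summary['failure_reason']}")
```

Treating a completed job with a non-empty failed_pages array as a distinct state lets a pipeline keep the successfully parsed pages while retrying or flagging the rest.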

The Parse Jobs API Supports Additional Storage Providers for ZDR

When calling the ADE Parse Jobs API with zero data retention (ZDR) enabled, you must include the output_save_url parameter. This parameter specifies the URL where parsed results are saved, ensuring that LandingAI does not store the document content. We have now tested and confirmed support for Amazon S3, Azure Blob Storage, and Google Cloud Storage. Other storage providers that support PUT or CREATE operations via public or presigned URLs may also work. For detailed information, go to Requirements for ZDR.
November 17, 2025: Credit Rounding Updated

Credit Rounding Updated

Credit usage for the API is now rounded up to the nearest tenth of a credit instead of the nearest whole credit. For example, if a calculation results in 1.67 credits, the cost is now rounded up to 1.7 credits (previously, it would have been rounded up to 2 credits). For more information, go to Pricing & Billing.
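The rounding rule can be illustrated with a short sketch. This is not the billing implementation; the round_up_to_tenth helper is a hypothetical name, and Decimal is used only to avoid floating-point surprises when rounding up.

```python
from decimal import Decimal, ROUND_CEILING


def round_up_to_tenth(credits: str) -> Decimal:
    """Round a credit amount up to the nearest tenth of a credit."""
    return Decimal(credits).quantize(Decimal("0.1"), rounding=ROUND_CEILING)


print(round_up_to_tenth("1.67"))  # 1.7
```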
November 12, 2025: DPT-2 mini Launch

DPT-2 mini Preview

We’ve released a preview of DPT-2 mini, a lightweight parsing model optimized for simple, digitally native documents. DPT-2 mini consumes fewer credits than other parsing models, making it a cost-effective option for straightforward document processing.
DPT-2 mini is in Preview. This model is still in development and may not return accurate results. Do not use this model in production environments.

When to Use DPT-2 mini

DPT-2 mini supports:
  • Digitally native documents, such as PDFs created from digital files.
  • English text.
  • Layout detection and document structure identification.
  • Simple tables.
  • All chunk types, including paragraphs, figures, and more. The model transcribes any text present in image-based chunk types but does not generate descriptions (captions) for visual elements.
For complete information about capabilities, limitations, and availability, go to DPT-2 mini.

Credit Consumption

DPT-2 mini consumes 1.5 credits per page, compared to 3 credits per page for other parsing models. If ZDR is enabled, parsing consumes an additional 1 credit per page. Usage is rounded up to the nearest whole credit. For pricing details, go to Pricing & Billing.
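As a rough illustration of these rates, the sketch below estimates the credits for one document. It assumes the round-up is applied to the total for the document rather than per page, which the text above does not specify; estimate_parse_credits is a hypothetical helper, not part of any library.

```python
import math


def estimate_parse_credits(pages: int, use_dpt2_mini: bool, zdr: bool = False) -> int:
    """Estimate credits for parsing one document.

    Rates come from the release note: 1.5 credits/page for DPT-2 mini,
    3 credits/page for other parsing models, plus 1 credit/page with ZDR.
    Assumption: rounding up applies to the document total.
    """
    per_page = 1.5 if use_dpt2_mini else 3.0
    if zdr:
        per_page += 1.0
    return math.ceil(pages * per_page)


print(estimate_parse_credits(3, use_dpt2_mini=True))             # 5  (4.5 rounded up)
print(estimate_parse_credits(3, use_dpt2_mini=True, zdr=True))   # 8  (7.5 rounded up)
print(estimate_parse_credits(3, use_dpt2_mini=False))            # 9
```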
November 10, 2025: DPT-2 Is Now Generally Available, Spreadsheet Support, and Credit Rounding

DPT-2 Is Now Generally Available

DPT-2, the latest series of parsing models for Agentic Document Extraction, is now generally available (GA). As part of going GA, we’re releasing this new snapshot: dpt-2-20251103. This updated version offers improvements to table parsing, figure captioning, and chunk detection. For more information about parsing models, go to Document Pre-Trained Transformers (Parsing Models).

How This Affects Your API Calls

The new snapshot dpt-2-20251103 is now the default model. If you call the ADE Parse or ADE Parse Jobs API without specifying a model parameter (or if you use dpt-2-latest), your API calls will automatically use this latest snapshot. Your parsing results may change with this update due to improvements in table parsing, figure captioning, and chunk detection.

Choose Your Approach

You can choose between two approaches when setting the model parameter:
Get automatic improvements:
  • Omit the model parameter, or set it to dpt-2-latest
  • Your API calls will automatically use the latest snapshot
  • You’ll receive parsing improvements as new snapshots are released
  • Parsing results may change when new versions are released
curl -X POST 'https://api.va.landing.ai/v1/ade/parse' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'document=@document.pdf' \
  -F 'model=dpt-2-latest'
Maintain consistent results:
  • Set the model parameter to a specific snapshot (like dpt-2-20251103)
  • Your parsing results will remain consistent over time
  • You won’t automatically receive improvements from new snapshots
curl -X POST 'https://api.va.landing.ai/v1/ade/parse' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'document=@document.pdf' \
  -F 'model=dpt-2-20251103'
For more information about model versioning and when to use each approach, go to Model Versions and Snapshots.

Spreadsheet Support

ADE can now parse the following file types:
  • CSV (comma-separated values)
  • XLSX (Microsoft Excel)
For more information, go to Supported File Types.

Credits Are Now Rounded Up

Credit usage for the API is now rounded up to the nearest whole credit. For more information, go to Pricing & Billing.
October 31, 2025: Annual Subscriptions
LandingAI now offers annual subscriptions for Agentic Document Extraction. Annual subscriptions offer 10% more credits per dollar when compared to monthly subscriptions. Explore our pricing plans here: https://va.landing.ai/plan. You can now upgrade and downgrade your subscription plans directly in the Playground. For more information, go to Pricing & Billing.
October 30, 2025: New Extraction Model with Improved Accuracy

New Extraction Model: extract-20251024

We’ve released a new extraction model, extract-20251024, which offers improved extraction capabilities. An extraction model powers the field extraction capabilities of the API. It analyzes your Markdown content and extracts structured data according to your JSON schema. The new model is now available for testing. It will become the default model soon.

What’s New

Model extract-20251024 provides:
  • Better support for field extraction for large arrays. For example, the API can better extract data from multi-page tables.
  • More deterministic outputs.
  • Consistent handling of missing fields (returns null for all missing values).
  • Improved accuracy for complex fields.
  • Enhanced support for large documents. The model can reliably process 20+ pages of Markdown content.

Test the New Model

You can test how your JSON schema performs with the new extraction model by specifying the model parameter in your API calls.
Example:
curl -X POST 'https://api.va.landing.ai/v1/ade/extract' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'schema={"type": "object", "properties": {"field1": {"type": "string"}, "field2": {"type": "string"}}, "required": ["field1", "field2"]}' \
  -F 'markdown=@markdown.md' \
  -F 'model=extract-20251024'
For details on how to specify a model, go to Extraction Model Versions.

Prepare Your JSON Schema

Model extract-20251024 has different JSON schema requirements than the previous model (extract-20250930). Review your schemas to ensure compatibility before the new model becomes the default.
Key changes to be aware of:
  • Limited keyword support (only specific JSON Schema keywords are supported).
  • Use the nullable keyword instead of type arrays with null.
  • Only string enums are supported.
  • Complex schemas may fall back to the previous model.
For a complete migration checklist and detailed guidance, go to Migrate to extract-20251024.
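One of the changes above, replacing type arrays that include null with the nullable keyword, can be sketched as a small schema rewrite. This is an illustration under assumptions, not the official migration tool: it assumes the expected form is a single type plus "nullable": true (as in OpenAPI 3.0-style schemas), and use_nullable_keyword is a hypothetical helper. Check the migration guide for the exact form the model accepts.

```python
from typing import Any


def use_nullable_keyword(schema: Any) -> Any:
    """Return a copy of `schema` where ["T", "null"] type arrays become
    {"type": "T", "nullable": True}. Other content is copied unchanged."""
    if isinstance(schema, list):
        return [use_nullable_keyword(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    converted = {key: use_nullable_keyword(value) for key, value in schema.items()}
    type_value = schema.get("type")
    if isinstance(type_value, list) and "null" in type_value:
        remaining = [t for t in type_value if t != "null"]
        # Only rewrite the simple case of exactly one non-null type.
        if len(remaining) == 1:
            converted["type"] = remaining[0]
            converted["nullable"] = True
    return converted


old_schema = {
    "type": "object",
    "properties": {
        "total": {"type": ["number", "null"]},
        "name": {"type": "string"},
    },
    "required": ["name"],
}
new_schema = use_nullable_keyword(old_schema)
print(new_schema["properties"]["total"])  # {'type': 'number', 'nullable': True}
```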
October 27, 2025: New File Types, Increased Page Limits & EU Subscriptions

Support for Text Documents & Presentations

ADE can now parse the following file types:
  • DOC (Word)
  • DOCX (Word)
  • PPT (PowerPoint)
  • PPTX (PowerPoint)
  • ODT (OpenDocument Text)
  • ODP (OpenDocument Presentation)
For more information, go to Supported File Types.

Document Length Limit Increased to 100 Pages

ADE now supports documents with up to 100 pages, both in the Playground and via the API. Need to parse longer documents? Use the ADE Parse Jobs API to parse documents that are up to 1,000 pages or 1 GB. For more information, go to Rate Limits.

Subscriptions Now Available for EU Users

The EU-hosted version of ADE now offers monthly subscription plans. To see the plans and upgrade, go to the EU Plans page. The credit-based monthly subscription plans are designed to deliver more value and features to your team. All EU users start on our pay-as-you-go plan that comes with free credits to help you get started! Once you’re ready for production, upgrade to a monthly subscription plan to get access to these features:
  • More credits per dollar
  • One-click Zero Data Retention (ZDR)
  • Organization management
  • Role-based access control (RBAC)
  • API key management
October 10, 2025: New APIs for Parsing Large Documents

New APIs for Parsing Large Documents

We have released new APIs that allow you to create parsing jobs. These APIs allow you to process large documents without blocking other operations, improving performance and user experience. To learn more about this workflow, go to Parse Large Files (Parse Jobs).

API Reference

To learn more, go to the reference pages for new APIs:
September 30, 2025: DPT Models

Document Pre-Trained Transformers: You Can Now Pick a Parsing Model

In this release, we’re previewing a concept called Document Pre-Trained Transformers (DPT). A DPT is the model that powers the parsing capabilities of the ADE Parsing APIs. The DPT identifies document layouts and chunks, then generates descriptive explanations (captions) for those chunks. The API initially launched with a single DPT model. Because there was only one DPT, it was not surfaced to users. We are now introducing DPT-2, which offers:
  • Improved performance for complex tables
  • Support for new chunk types (including barcodes and ID cards)
  • More precise captioning for figures
With multiple DPT models available, you can now select a DPT both in the Playground and when calling the API directly. For more information about models and how to use them, go to Document Pre-Trained Transformers (Parsing Models).

ADE Parse and ADE Extract Are Now Generally Available

The ADE Parse and ADE Extract APIs are now Generally Available (GA). We recommend using these endpoints moving forward.

New Python Library

We’ve launched a new Python library to support the ADE APIs. Key benefits:
  • Support for the ADE Parse and ADE Extract APIs.
  • Support for setting the .
  • The library is automatically generated from our API specification, ensuring you have access to the latest endpoints and parameters.
  • The library is lightweight, which makes it suitable for resource-constrained environments like AWS Lambda functions.

agentic-doc Library Transitioned to Legacy Status

The agentic-doc Python library has been transitioned to legacy status. Migrate to the new library, which is now the recommended Python library for ADE. For more information, go to Legacy Library: agentic-doc.

The tools/agentic-document-analysis Endpoint Is Now Legacy

This endpoint has been transitioned to legacy status: https://api.va.landing.ai/v1/tools/agentic-document-analysis. Migrate to the new ADE Parse and ADE Extract APIs.
September 23, 2025: Updated Overage Structure for New Subscriptions

Updated Overage Structure for New Subscriptions

If you enroll in an ADE subscription plan (a Team, Visionary, or Enterprise plan) on or after September 23, 2025, you can use an unlimited number of overage credits. If your overage usage reaches 100% of your monthly credit allocation, you will receive an immediate invoice for those overage charges. For example, if your plan includes 55,000 credits per month and you use 55,001 overage credits, you will be billed immediately for the overage amount. Credits used beyond your monthly allocation are billed at $0.01/credit.

Enrollments Before September 23, 2025

If you enrolled in a subscription plan before September 23, 2025, you can only use up to an additional 100% of your monthly credit allocation in overages. Credits used beyond your monthly allocation are billed at $0.01/credit.
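The two overage structures can be compared with a small sketch. It is an illustration of the arithmetic only, not the billing system; the function names are hypothetical, and the cap for earlier enrollments is modeled simply as a limit on usable overage credits.

```python
from decimal import Decimal

OVERAGE_RATE_USD = Decimal("0.01")  # per credit, from the release note


def overage_credits(monthly_allocation: int, credits_used: int, capped: bool) -> int:
    """Credits used beyond the monthly allocation.

    capped=True models plans enrolled before September 23, 2025, where
    overages are limited to an additional 100% of the allocation.
    """
    overage = max(0, credits_used - monthly_allocation)
    return min(overage, monthly_allocation) if capped else overage


def overage_charge_usd(overage: int) -> Decimal:
    """Overage charge in US dollars at $0.01 per credit."""
    return overage * OVERAGE_RATE_USD


def triggers_immediate_invoice(monthly_allocation: int, overage: int) -> bool:
    """For enrollments on or after September 23, 2025: overage usage that
    reaches 100% of the monthly allocation is invoiced immediately."""
    return overage >= monthly_allocation


used = overage_credits(55_000, 110_001, capped=False)
print(used, overage_charge_usd(used), triggers_immediate_invoice(55_000, used))
```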

Pricing & Plans

To get detailed information about pricing and overages, go to Pricing.
September 12, 2025: Separate APIs for Parsing & Extraction

Separate APIs for Parsing & Extraction

In our original launch of ADE, the field extraction function was part of the parsing function; every time you wanted to run extraction, you had to run parsing, even if you had already parsed the document. We are now introducing a Preview of two new endpoints that separate these functions: ADE Parse and ADE Extract. These APIs allow you to decouple parsing and extraction workflows for greater flexibility. You can now parse the document once with the ADE Parse API, and then use the ADE Extract API to run field extraction on that output multiple times. This is helpful if you want to experiment with different extraction schemas or you have multiple extraction tasks. To get detailed information about how to use these new APIs, go to Separate APIs: Parse & Extract.
August 29, 2025: Rotation Detection

Rotation Detection

You can now turn on rotation detection. When rotation detection is enabled, ADE detects if pages are rotated and automatically corrects text and table chunks for better extraction accuracy. To learn how to enable this when using the agentic-doc library, go to Pass Settings with ParseConfig. To learn how to enable this when calling the API directly, go to API Reference.
August 20, 2025: Launch - Subscriptions & New Features

Monthly Subscriptions

We’re excited to announce a major update to Agentic Document Extraction! We’ve just launched credit-based monthly subscription plans designed to deliver more value and features to your team. Learn more about available plans in Pricing. All users start on our pay-as-you-go plan that comes with free credits to help you get started! Once you’re ready for production, upgrade to a monthly subscription plan to get access to these new features:
  • More credits per dollar
  • One-click Zero Data Retention (ZDR)
  • Organization management
  • Role-based access control (RBAC)
  • API key management

One-Click Zero Data Retention (ZDR)

Users on the Team, Visionary, and Enterprise plans can turn on zero data retention (ZDR) directly in the user interface! This ensures that your documents are processed in-memory and are never stored at rest on LandingAI systems or by our sub-processors. To learn more, go to Zero Data Retention (ZDR) Option Overview.

Organization Management

Upgrading to a Team, Visionary, or Enterprise plan automatically creates an organization. An organization contains all of the credits, members, API keys, and settings for the plan. To learn more, go to Organizations & Members.

Member Management: Role-Based Access Control (RBAC)

Users on the Team, Visionary, and Enterprise plans can invite multiple users to their organization. These plans offer granular member controls, including the ability to:
  • invite members
  • assign roles to members that determine what functions they can perform
  • change member roles
  • revoke invitations
  • remove members
To learn more, go to Organizations & Members.

API Key Management

Users on the Team, Visionary, and Enterprise plans can create multiple API keys for their organization. These plans offer granular API key controls, including the ability to:
  • create API keys
  • revoke API keys
To learn more, go to API Key.
July 21, 2025: Confidence Score

Confidence Score for Schema-Based Extraction

The field extraction results now include a confidence score for each extracted field. This score indicates how certain ADE is about the accuracy of the extracted data. For detailed information about how to get the confidence score, go to Confidence Scores.
July 17, 2025: European Union Availability

Agentic Document Extraction Now Available in Europe

Agentic Document Extraction is now available in Europe. To learn more, go to European Union (EU). Agentic Document Extraction in the EU provides:
  • Data residency: All data is stored and processed within the EU
  • GDPR compliance: Coming soon; learn more at our Security and Data page
  • Regional performance: Reduced latency for European users
July 9, 2025: agentic-doc v0.3.0

Manage Settings with ParseConfig

The agentic-doc library v0.3.0 introduces the ParseConfig class for the parse function. This allows you to pass multiple settings (like api_key, include_marginalia, and extraction_model) in a single ParseConfig object. For detailed information, go to Pass Settings with ParseConfig.

Upcoming Deprecation: Settings Class

Setting values directly on agentic_doc.config.settings will be deprecated in a future release. Configure settings with ParseConfig instead.
June 6, 2025: agentic-doc v0.2.4

Load Bytes

In addition to supporting PDFs and images, the parse function now supports raw bytes from PDF and image files. For more information, go to Sample Script: Parse Files from Bytes.
May 29, 2025: agentic-doc v0.2.3

Consolidated Parsing Function

We released agentic-doc library v0.2.3, which includes a new parsing function: parse. The parse function allows you to parse multiple documents and supports loading documents from Amazon S3 buckets, Google Drive, and other locations by using the connectors module. To use the new parse function and the connectors module, upgrade the library to v0.2.3. The original parsing functions will continue to work, but we recommend using parse for new projects.
May 20, 2025: agentic-doc v0.2.1

Consolidated Chunk Types

We released agentic-doc library v0.2.1, which includes consolidated chunk types. The library now has the following chunk types: table, figure, marginalia, and text.
These chunk types were consolidated into marginalia:
  • page_header
  • page_footer
  • page_number
These chunk types were consolidated into text:
  • title
  • form
  • key_value
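For code that still handles the older chunk type names, the consolidation above can be expressed as a lookup table. This is a minimal sketch based only on the mapping listed here; normalize_chunk_type is a hypothetical helper and not part of the agentic-doc library.

```python
# Deprecated chunk type -> consolidated chunk type, per the lists above.
CONSOLIDATED_CHUNK_TYPES = {
    "page_header": "marginalia",
    "page_footer": "marginalia",
    "page_number": "marginalia",
    "title": "text",
    "form": "text",
    "key_value": "text",
}


def normalize_chunk_type(chunk_type: str) -> str:
    """Map a deprecated chunk type to its consolidated type.

    Types that were not consolidated (table, figure, marginalia, text)
    are returned unchanged.
    """
    return CONSOLIDATED_CHUNK_TYPES.get(chunk_type, chunk_type)


print(normalize_chunk_type("page_footer"))  # marginalia
print(normalize_chunk_type("table"))        # table
```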

Action Required When Using Library

If you use the library and your scripts or workflows use any of the deprecated chunk types, update your code to use the new types. How the library handles the deprecated chunk types depends on the version you’re using:
  • Upgrade to v0.2.1 to use the new chunk types.
  • If using v0.0.13 to v0.1.3, the marginalia type doesn’t exist and will fall back to page_header.
  • If using v0.0.12 or earlier, the code will NOT work after May 22.

Action Required When Calling the API Directly

If you call the API directly and your scripts or workflows use any of the deprecated chunk types, update your code to use the new types. We are making these same changes (consolidating the chunk types) to the API on Thursday, May 22. Starting May 22, the API will stop using the deprecated types in the response. If your code uses the deprecated chunk types, the code will no longer work.
May 14, 2025

Improved Accuracy

Agentic Document Extraction now delivers higher accuracy when extracting data from complex tables and multi-column layouts.

Increased Processing Speed

Agentic Document Extraction is now significantly faster than before, so you can process thousands of pages per minute.

Process Longer Pages

We’ve increased our page limits so that you can process longer documents. For more information, go to Rate Limits.

Zero Data Retention

Users on the Custom plan can enable a zero data retention policy, which ensures all data is deleted immediately after processing and supports strict privacy and compliance requirements. For more information, contact us.

Consolidated Chunk Types

We consolidated these chunk types into page_header:
  • page_header
  • page_footer
  • page_number
We consolidated these chunk types into form:
  • form
  • key_value
For more information, go to Chunk Types.