Document Pre-Trained Transformers (Parsing Models)

Parsing Models Overview

A Document Pre-Trained Transformer (DPT) is the model that powers the parsing capabilities of the ADE Parsing APIs. A DPT identifies document layouts and chunks, then generates descriptive explanations (captions) for those chunks. You can select a DPT in the Playground or when calling the API directly.

Important Considerations

The ability to select a DPT is available:

in the Playground
when calling the API
when using the library

The ability to select a DPT is not available in the agentic-doc library.

Model Versions and Snapshots

The following table lists the available model values for the ADE Parse API and the ADE Async Parse API:

Model Value	Description
dpt-1	Use the latest snapshot of DPT-1.
dpt-1-latest	Use the latest snapshot of DPT-1.
dpt-1-20250615	Use the snapshot of DPT-1 released on June 15, 2025.
dpt-2	Use the latest snapshot of DPT-2.
dpt-2-latest	Use the latest snapshot of DPT-2.
dpt-2-20250919	Use the snapshot of DPT-2 released on September 19, 2025.
dpt-2-20251103	Use the snapshot of DPT-2 released on November 3, 2025.
dpt-2-mini	Use the latest snapshot of DPT-2 mini.
dpt-2-mini-20251003	Use the snapshot of DPT-2 mini released on October 3, 2025.
dpt-2-mini-latest	Use the latest snapshot of DPT-2 mini.

Why Model Versioning Matters

When integrating the API, you have two options for specifying the model:

Use a general model name (like dpt-2 or dpt-2-latest) to always get the newest version. This automatically gives you improvements and updates, but parsing results may change when new model versions are released.
Use a specific snapshot (like dpt-2-20250919) to pin to an exact model version. This ensures consistent parsing results over time, but you won't receive improvements.

If you use only a general model name like dpt-2 in production, your application may produce different results when we release model updates. Consider whether you need consistent results or prefer to receive the latest improvements.
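One way to act on this trade-off is to pin a snapshot in production and follow the latest snapshot everywhere else. The sketch below is only an illustration: the model values come from the table above, while the choose_model helper and the APP_ENV variable name are hypothetical and not part of the API or the library.

```python
import os

# Model values taken from the table of model values in this document.
FLOATING_MODEL = "dpt-2-latest"   # follows new snapshots as they are released
PINNED_MODEL = "dpt-2-20250919"   # frozen snapshot, results stay consistent


def choose_model(environment: str) -> str:
    """Return a pinned snapshot for production and the floating name elsewhere.

    Hypothetical helper for illustration only.
    """
    if environment.strip().lower() == "production":
        return PINNED_MODEL
    return FLOATING_MODEL


if __name__ == "__main__":
    # APP_ENV is an assumed variable name used only for this sketch.
    env = os.environ.get("APP_ENV", "development")
    print(choose_model(env))
```

The returned string can then be passed as the model parameter. Moving production to a newer snapshot becomes a deliberate change to PINNED_MODEL rather than something that happens when a new snapshot is released.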

Understanding Snapshots and -latest

Snapshots are frozen versions of a model released on specific dates. Each snapshot maintains the same parsing behavior indefinitely, making your results predictable. The -latest suffix always points to the most recent snapshot of that model.
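The naming scheme can be modeled locally to see which snapshot a general name refers to. This is a rough sketch under stated assumptions: the snapshot list is copied from the table in this document and will go stale as new snapshots are released, and parse_model_value and resolve_snapshot are hypothetical helpers. How the service itself resolves names is not described here.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# Snapshot values copied from the table in this document (assumed complete
# only as of that table; newer snapshots may exist).
KNOWN_SNAPSHOTS = [
    "dpt-1-20250615",
    "dpt-2-20250919",
    "dpt-2-20251103",
    "dpt-2-mini-20251003",
]

# A snapshot value is a model family followed by a YYYYMMDD date stamp.
_SNAPSHOT_RE = re.compile(r"^(?P<family>.+)-(?P<stamp>\d{8})$")


class ModelValue(NamedTuple):
    family: str               # e.g. "dpt-2" or "dpt-2-mini"
    snapshot: Optional[date]  # None means "follow the latest snapshot"


def parse_model_value(value: str) -> ModelValue:
    """Split a model value into its family and optional snapshot date."""
    match = _SNAPSHOT_RE.match(value)
    if match:
        stamp = datetime.strptime(match.group("stamp"), "%Y%m%d").date()
        return ModelValue(match.group("family"), stamp)
    # "dpt-2" and "dpt-2-latest" both mean the latest snapshot of dpt-2.
    family = value[: -len("-latest")] if value.endswith("-latest") else value
    return ModelValue(family, None)


def resolve_snapshot(value: str) -> str:
    """Return the snapshot a model value refers to, based on KNOWN_SNAPSHOTS."""
    parsed = parse_model_value(value)
    if parsed.snapshot is not None:
        return value  # already pinned to a snapshot
    candidates = [
        s for s in KNOWN_SNAPSHOTS if parse_model_value(s).family == parsed.family
    ]
    if not candidates:
        raise ValueError(f"no known snapshot for model family {parsed.family!r}")
    # The most recent release date wins.
    return max(candidates, key=lambda s: parse_model_value(s).snapshot)
```

With the list above, resolve_snapshot("dpt-2") and resolve_snapshot("dpt-2-latest") both return "dpt-2-20251103", while "dpt-2-20250919" is returned unchanged because it is already pinned.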

DPT-1

DPT-1 is the original DPT for ADE. It offers the basic ability to parse documents.

DPT-1 Availability

DPT-1 can be used in these API endpoints:

The legacy API
ADE Parse Jobs

Additionally, DPT-1 is the only model that the legacy API can use.

DPT-2

DPT-2 was introduced in September 2025. It builds upon DPT-1 and offers these advanced features:

Agentic Table Captioning: DPT-2 can parse large, complex, no-gridline, and merged-cell tables with high fidelity. Every cell is preserved, aligned, and made accessible, which enables cell-level grounding so you know exactly where each value came from.
Refined Figure Captioning: Logos, seals, and small figures are now identified precisely and concisely, eliminating the noise of verbose descriptions.
Smarter Layout Detection: Fewer chunks are missed, even in messy scans. DPT-2 can even detect stamps inside tables and process them separately, which is critical for compliance workflows.
Expanded Chunk Ontology: Beyond text, tables, and figures, DPT-2 now recognizes attestation (signatures, stamps, seals), ID cards, logos, barcodes, and QR codes, so that all document elements are classified consistently. To learn more, go to Chunk Types.

DPT-2 Availability

DPT-2 can be used in these API endpoints:

ADE Parse Jobs

DPT-2 mini

DPT-2 mini is a lightweight model optimized for simple, digitally native documents. It provides cost-effective parsing for straightforward document structures.

DPT-2 mini is in Preview. This model is still in development and may not return accurate results. Do not use this model in production environments.

Supported Features

DPT-2 mini supports:

Digitally native documents, such as PDFs created from digital files.
English text.
Layout detection and document structure identification.
Simple tables.
All chunk types, including paragraphs, figures, and more. The model transcribes any text present in image-based chunk types but does not generate descriptions (captions) for visual elements.

Ideal Document Types

DPT-2 mini is ideal for digitally native English documents with straightforward layouts, such as:

Business correspondence (letters, memos, emails)
Simple reports and documentation
Basic forms with key-value pairs
Invoices with simple tables
Digital contracts

Limitations and When to Use DPT-2 Instead

Use DPT-2 instead of DPT-2 mini if your documents contain any of the following features or if your use case requires image descriptions. DPT-2 mini does not support:

Scanned documents or handwritten content.
Non-English languages.
Complex tables with multi-level headers, merged cells, or nested structures.
Checkboxes.
Very small fonts.
Full visual element analysis. The following image-based chunk types are identified, but the output will not contain a description (caption) or analysis of the chunk: figure, logo, card, attestation, and scan_code. For example, the model cannot identify if a signature field is signed or not.
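The limitations above can be turned into a simple pre-flight check. The following is a minimal sketch, assuming you already know these traits for each document; DocumentTraits and recommend_model are hypothetical names for illustration and are not part of the API or the library.

```python
from dataclasses import dataclass


@dataclass
class DocumentTraits:
    """Traits that DPT-2 mini does not support, per the list above."""
    scanned_or_handwritten: bool = False
    non_english: bool = False
    complex_tables: bool = False        # multi-level headers, merged cells, nesting
    has_checkboxes: bool = False
    very_small_fonts: bool = False
    needs_image_descriptions: bool = False  # captions or analysis of visual chunks


def recommend_model(traits: DocumentTraits) -> str:
    """Return "dpt-2" if any unsupported trait is present, else "dpt-2-mini"."""
    needs_dpt2 = any(
        [
            traits.scanned_or_handwritten,
            traits.non_english,
            traits.complex_tables,
            traits.has_checkboxes,
            traits.very_small_fonts,
            traits.needs_image_descriptions,
        ]
    )
    return "dpt-2" if needs_dpt2 else "dpt-2-mini"
```

For example, recommend_model(DocumentTraits(has_checkboxes=True)) returns "dpt-2", while a document with none of these traits maps to "dpt-2-mini". Because DPT-2 mini is in Preview, a production workflow would use DPT-2 regardless of the result.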

DPT-2 mini Availability

DPT-2 mini can be used in these API endpoints:

ADE Parse Jobs

Set the Model in the API

When calling the ADE Parse or ADE Parse Jobs endpoint, you can set the model using the model parameter. If you omit the model parameter, the API uses the latest snapshot of the dpt-2 model. For example, run the command below to use the latest snapshot of DPT-2.

curl -X POST 'https://api.va.landing.ai/v1/ade/parse' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'document=@/path/to/document.pdf' \
  -F 'model=dpt-2-latest'

Set the Model with the Library

When using the library, you can set the model using the model parameter in the parse() function. If you omit the model parameter, the library uses the latest snapshot of the dpt-2 model. For example, use this code to parse a document with the latest snapshot of DPT-2:

from pathlib import Path
from landingai_ade import LandingAIADE

client = LandingAIADE()

response = client.parse(
    document=Path("/path/to/document.pdf"),
    model="dpt-2-latest"
)

Set the Model in the Playground

To toggle between different models in the Playground:

Load a document into the Playground.
Ensure the Parse tab is open.
Select the model you want to use from the top right corner.
