Extraction Model Versions

An extraction model powers the field extraction capabilities of the API. It analyzes your Markdown content and extracts structured data according to your JSON schema. You can specify a model when calling the API directly or when using the library.

If you don’t specify a model, extract-20250930 is used by default. The newer model extract-20251024 is available for testing and will become the default soon.

Different model versions have different capabilities and JSON schema requirements. For information about creating JSON schemas for extraction, go to Extraction Schema (JSON).

Model Versions

The following table lists the available model values for the API:

Model Values	Description
`extract-20250930`	Use the extraction model snapshot released on September 30, 2025.
`extract-20251024`	Use the extraction model snapshot released on October 24, 2025. For more information, go to extract-20251024.
`extract-latest`	Use the latest extraction model snapshot.

Why Model Versioning Matters

When integrating the API, you have two options for specifying the model:

Use extract-latest to always get the newest version. This automatically gives you improvements and updates, but extraction results may change when new model versions are released.
Use a specific version (like extract-20251024) to pin to an exact model version. This ensures consistent extraction results over time, but you won’t receive improvements.
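The trade-off between the two options can be made explicit in configuration. The following is a minimal sketch, not part of the API or the library: it reads a hypothetical EXTRACT_MODEL environment variable so that a deployment can pin a snapshot, and falls back to extract-latest when the variable is unset. The helper name resolve_model is also an assumption made for illustration.

```python
import os

# Model values listed in the table above.
KNOWN_MODELS = ("extract-20250930", "extract-20251024", "extract-latest")


def resolve_model(env=None):
    """Return the model value to send with an extraction request.

    EXTRACT_MODEL is a hypothetical variable name chosen for this sketch.
    When it is unset, the floating "extract-latest" value is used.
    """
    env = os.environ if env is None else env
    model = env.get("EXTRACT_MODEL", "extract-latest")
    if model not in KNOWN_MODELS:
        raise ValueError(f"Unknown extraction model: {model!r}")
    return model


# Pinned deployment: results stay consistent until the pin is changed.
print(resolve_model({"EXTRACT_MODEL": "extract-20251024"}))
# Unpinned deployment: picks up new snapshots as they are released.
print(resolve_model({}))
```

Rejecting unknown values early keeps a typo in configuration from reaching the API as an invalid model parameter.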

extract-20251024

Model extract-20251024 offers improved extraction capabilities, including:

Better support for field extraction for large arrays. For example, the API can better extract data from multi-page tables.
More deterministic outputs.
Consistent handling of missing fields (returns null for all missing values).
Improved accuracy for complex fields.
Enhanced support for large documents. The model can reliably process 20+ pages of Markdown content.

Extraction model extract-20251024 has different JSON schema requirements than the previous model. Learn about all schema requirements in Extraction Schema (JSON).
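Because this model returns null for every missing value, downstream code may want to know which fields came back empty before using them. The sketch below walks an extraction result that has already been parsed into Python dicts and lists and reports the path of each None value. The function name and the sample data are hypothetical; the actual shape of your result depends on your schema.

```python
def find_null_fields(value, prefix=""):
    """Return the paths of every field whose extracted value is None."""
    paths = []
    if value is None:
        paths.append(prefix or "<root>")
    elif isinstance(value, dict):
        for key, item in value.items():
            child = f"{prefix}.{key}" if prefix else key
            paths.extend(find_null_fields(item, child))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            paths.extend(find_null_fields(item, f"{prefix}[{index}]"))
    return paths


# Hypothetical extraction result with two missing values.
extraction = {
    "invoice_number": "INV-001",
    "due_date": None,
    "line_items": [{"sku": "A1", "price": None}],
}
print(find_null_fields(extraction))  # ['due_date', 'line_items[0].price']
```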

Migrate to extract-20251024

If you are migrating from extract-20250930 to extract-20251024, follow this checklist to prepare your JSON schema:

Review keyword usage: Model extract-20251024 only supports specific JSON Schema keywords. Review your schema and remove or replace any unsupported keywords. For details, go to Keyword Support.
Update nullable fields: Model extract-20251024 uses the nullable keyword instead of type arrays with null. Update your schema accordingly. For details, go to Nullable Fields.
Update enum data types: Model extract-20251024 only supports string enums. If your schema uses enums with other data types, the extraction request will fail. For details, go to Restrict Values with Enum.
Simplify complex schemas: If the API determines that your JSON schema is too complex, it will fall back to extract-20250930. If a fallback occurs, you can check the metadata.fallback_model_version field in the API response to see which model was used. For guidance on reducing complexity, go to Reduce JSON Schema Complexity. For information about the response structure, go to JSON Response for Extraction.
Update code to handle null values: Model extract-20251024 returns null for missing fields, even if they’re marked as required. Ensure your downstream code handles null values appropriately. For details, go to Missing Fields.
Understand partial results: If extracted data doesn’t match your schema, the API returns a 206 status with partial results. For details, go to Schema Validation.
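Two of the checklist items, nullable fields and enum data types, lend themselves to a small pre-flight script. The sketch below is an illustration under assumptions, not an official migration tool: it assumes the nullable keyword takes the OpenAPI-style form "nullable": true next to a single type (confirm the exact form in Nullable Fields), and both function names are hypothetical.

```python
import copy


def migrate_nullable(schema):
    """Return a copy of the schema with ["T", "null"] types rewritten.

    Assumption: the target form is {"type": "T", "nullable": True}.
    Type arrays with more than one non-null type are left unchanged.
    """
    schema = copy.deepcopy(schema)
    _migrate_node(schema)
    return schema


def _migrate_node(node):
    if isinstance(node, list):
        for item in node:
            _migrate_node(item)
        return
    if not isinstance(node, dict):
        return
    node_type = node.get("type")
    if isinstance(node_type, list) and "null" in node_type:
        remaining = [t for t in node_type if t != "null"]
        if len(remaining) == 1:
            node["type"] = remaining[0]
            node["nullable"] = True
    for value in node.values():
        _migrate_node(value)


def find_non_string_enums(node, path="$"):
    """Return the paths of enum keywords that contain non-string values."""
    problems = []
    if isinstance(node, dict):
        enum = node.get("enum")
        if isinstance(enum, list) and any(not isinstance(v, str) for v in enum):
            problems.append(path)
        for key, value in node.items():
            problems.extend(find_non_string_enums(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(find_non_string_enums(item, f"{path}[{index}]"))
    return problems


old_schema = {
    "type": "object",
    "properties": {
        "total": {"type": ["number", "null"]},
        "status": {"type": "integer", "enum": [1, 2]},
    },
}
new_schema = migrate_nullable(old_schema)
print(new_schema["properties"]["total"])  # {'type': 'number', 'nullable': True}
print(find_non_string_enums(new_schema))  # ['$.properties.status']
```

The enum check only reports locations. Converting the values to strings is left to you, since the right replacement depends on how your downstream code interprets them.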

Set the Model in the API

When calling the endpoint, you can set the model using the model parameter. If you omit the model parameter, the API uses the default model, which is currently extract-20250930. This example shows how to specify a model:

curl -X POST 'https://api.va.landing.ai/v1/ade/extract' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'schema={"type": "object", "properties": {"field1": {"type": "string"}, "field2": {"type": "string"}}, "required": ["field1", "field2"]}' \
  -F 'markdown=@/path/to/output.md' \
  -F 'model=extract-latest'
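After a call like this returns, two signals described in the migration checklist are worth checking: a 206 status, which indicates partial results, and the metadata.fallback_model_version field, which shows the model used when a fallback occurred. The following sketch assumes you already have the HTTP status code and the JSON body parsed into a dict. The function name and the keys of the returned summary are hypothetical, and the sketch tolerates the field being absent or null because this page does not say which applies when no fallback occurs.

```python
def describe_extraction_response(status_code, body):
    """Summarize partial-result and fallback signals from a response.

    `body` is the parsed JSON response as a dict. Only the 206 status and
    the metadata.fallback_model_version field are taken from this page.
    """
    metadata = body.get("metadata") or {}
    return {
        "partial": status_code == 206,
        "fallback_model": metadata.get("fallback_model_version"),
    }


# Hypothetical response: the schema was too complex, so a fallback occurred.
summary = describe_extraction_response(
    200, {"metadata": {"fallback_model_version": "extract-20250930"}}
)
if summary["fallback_model"]:
    print(f"Fell back to {summary['fallback_model']}")
if summary["partial"]:
    print("Partial results returned; extracted data did not match the schema")
```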

Set the Model with the Library

When using the library, you can set the model using the model parameter in the extract() function. If you omit the model parameter, the default extraction model is used. For example, use this code to extract fields with the latest extraction model:

import json
from pathlib import Path
from landingai_ade import LandingAIADE

# Define your extraction schema
schema_dict = {
    "type": "object",
    "properties": {
        "field1": {"type": "string"},
        "field2": {"type": "string"}
    },
    "required": ["field1", "field2"]
}

client = LandingAIADE()
schema_json = json.dumps(schema_dict)

response = client.extract(
    schema=schema_json,
    markdown=Path("/path/to/output.md"),
    model="extract-latest"
)
