POST /v1/tools/agentic-document-analysis
cURL
curl -X POST 'https://api.va.landing.ai/v1/tools/agentic-document-analysis' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'pdf=@document.pdf' \
  -F 'include_marginalia=true' \
  -F 'include_metadata_in_markdown=true' \
  -F 'enable_rotation_detection=false' \
  -F 'fields_schema={"type": "object", "properties": {"field1": {"type": "string"}, "field2": {"type": "string"}}, "required": ["field1", "field2"]}'

# To upload an image instead of a PDF, replace the pdf parameter:
# -F 'image=@document.png'
{
  "data": {
    "markdown": "<string>",
    "extracted_schema": {},
    "extraction_metadata": {},
    "chunks": [
      {
        "text": "<string>",
        "grounding": [
          {
            "box": {
              "l": 123,
              "t": 123,
              "r": 123,
              "b": 123
            },
            "page": 123
          }
        ],
        "chunk_type": "form",
        "chunk_id": "<string>",
        "rotation_angle": 0
      }
    ]
  },
  "errors": [
    {
      "page_num": 123,
      "error": "<string>",
      "error_code": 123
    }
  ],
  "extraction_error": "<string>",
  "metadata": {}
}

Authorizations

Authorization
string
header
required

Your unique API key for authentication, sent as a Bearer token in the Authorization header.

Get your API key here: https://va.landing.ai/settings/api-key.

If using the EU endpoint, get your API key here: https://va.eu-west-1.landing.ai/settings/api-key.

Query Parameters

pages
string | null

Comma-separated list of pages to process. Page numbering starts at 0. For example, to process the first 3 pages, use '0,1,2'.
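As a minimal sketch, the pages value can be built from a collection of zero-based page indices. The helper name pages_param is hypothetical, and sorting and de-duplicating the indices is a convenience assumed here, not a documented requirement of the API.

```python
def pages_param(indices):
    """Format zero-based page indices as a comma-separated `pages` value."""
    # Sorting and de-duplicating is an assumption made for tidiness.
    cleaned = sorted(set(indices))
    if any(i < 0 for i in cleaned):
        raise ValueError("page indices start at 0")
    return ",".join(str(i) for i in cleaned)


# The first three pages of a document.
first_three = pages_param(range(3))
```

The resulting string would be passed as the pages query parameter on the request URL.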

Body

multipart/form-data
pdf
file | null

A PDF file to be analyzed (50 pages max). Either this parameter or the image parameter must be provided.

image
file | null

An image of the document to be analyzed (50MB max). The image must be a valid image file (PNG, JPEG, etc.). Either this parameter or the pdf parameter must be provided.
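Because either pdf or image must be provided, a client has to decide which form field carries the upload. The sketch below picks the field name from the file extension. The helper name upload_field is hypothetical, and treating every non-.pdf file as an image is an assumption; the API itself decides whether a file is valid.

```python
from pathlib import Path


def upload_field(path):
    """Return the multipart field name ('pdf' or 'image') for a local file."""
    # Assumption: anything that is not a .pdf is sent as an image.
    return "pdf" if Path(path).suffix.lower() == ".pdf" else "image"


field_for_pdf = upload_field("document.pdf")
field_for_png = upload_field("document.png")
```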

include_marginalia
boolean
default:true

Whether to include marginalia (headers, footers, notes in margins, etc.) in the response.

include_metadata_in_markdown
boolean
default:true

Whether to include metadata in the Markdown output.

fields_schema
string | null

JSON schema describing the structured fields to extract from the document, passed as a string. If provided, the response includes an extracted_schema object with the extracted data and an extraction_metadata object with the visual grounding metadata. The schema must be a valid JSON object and is validated before the document is processed.
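Since fields_schema is sent as a string, one way to avoid quoting mistakes is to build the schema as a dictionary and serialize it. This sketch mirrors the shape used in the cURL example; the helper name build_fields_schema is hypothetical, and typing every field as a string is an assumption made for brevity.

```python
import json


def build_fields_schema(field_names):
    """Serialize a flat JSON schema in which every field is a required string."""
    names = list(field_names)
    schema = {
        "type": "object",
        "properties": {name: {"type": "string"} for name in names},
        "required": names,
    }
    return json.dumps(schema)


fields_schema = build_fields_schema(["field1", "field2"])
```

The returned string would be sent as the fields_schema form field alongside the uploaded file.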

enable_rotation_detection
boolean | null

Enables automatic rotation detection and correction for document pages. When enabled, the system detects rotated pages and automatically corrects text and table chunks to improve extraction accuracy.
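The cURL example passes the boolean options as the lowercase strings true and false. A minimal sketch of producing those form values from Python booleans follows. The helper name form_flags is hypothetical, and omitting a field when its value is None is an assumption based on enable_rotation_detection being nullable.

```python
def form_flags(include_marginalia=True,
               include_metadata_in_markdown=True,
               enable_rotation_detection=None):
    """Map boolean options to lowercase form strings, skipping unset ones."""
    flags = {
        "include_marginalia": include_marginalia,
        "include_metadata_in_markdown": include_metadata_in_markdown,
        "enable_rotation_detection": enable_rotation_detection,
    }
    # str(True).lower() gives "true", matching the values in the cURL example.
    return {name: str(value).lower()
            for name, value in flags.items() if value is not None}


flags = form_flags(enable_rotation_detection=False)
```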

Response

Successful Response

data
object
required
errors
AgenticDocAnalysisPageError · object[]
extraction_error
string | null
metadata
object | null
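A minimal sketch of reading a response, assuming it has already been parsed into a Python dict with the shape of the example response above. The helper names boxes_by_page and failed_pages are hypothetical, and the sample values are made up for illustration only.

```python
from collections import defaultdict


def boxes_by_page(response):
    """Group (chunk_id, box) pairs by the page each grounding entry refers to."""
    pages = defaultdict(list)
    data = response.get("data") or {}
    for chunk in data.get("chunks") or []:
        for ground in chunk.get("grounding") or []:
            pages[ground["page"]].append((chunk["chunk_id"], ground["box"]))
    return dict(pages)


def failed_pages(response):
    """Map page_num to its error message for every per-page error reported."""
    return {err["page_num"]: err["error"] for err in response.get("errors") or []}


# Illustrative response; every value here is a placeholder.
sample = {
    "data": {
        "markdown": "<string>",
        "chunks": [
            {
                "text": "<string>",
                "grounding": [{"box": {"l": 1, "t": 2, "r": 3, "b": 4}, "page": 0}],
                "chunk_type": "form",
                "chunk_id": "c1",
                "rotation_angle": 0,
            }
        ],
    },
    "errors": [{"page_num": 1, "error": "<string>", "error_code": 123}],
    "extraction_error": None,
    "metadata": {},
}

boxes = boxes_by_page(sample)
errors = failed_pages(sample)
```

Since errors, extraction_error, and metadata are not marked as required, the helpers treat missing or null values as empty.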