Use the Build Extract Schema API to programmatically generate a JSON extraction schema from the Markdown output of document parsing. The API analyzes the Markdown content and returns a schema you can pass directly to the extraction endpoint.
Each call to the API consumes credits. See Build Extract Schema API pricing.

When to Use the Schema Builder API

The API is useful when you want to automate schema creation or refinement as part of a larger pipeline, without using the Playground schema wizard. Use the API to:
  • Build a master schema from multiple documents to handle field and layout variation across document types.
  • Detect schema drift by passing updated documents alongside an existing schema to surface new or changed fields before they reach your pipeline.
For schema format requirements and supported field types, see Extraction Schema (JSON).

API Reference

See the full API reference for complete details.
Endpoint: POST https://api.va.landing.ai/v1/ade/extract/build-schema

Request Parameters

At least one of markdowns, markdown_urls, or prompt must be provided.
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | No | The extraction model to use. Use extract-latest for the latest version. |
| markdowns | file or string | No | One or more Markdown files or inline Markdown strings to analyze. Provide multiple Markdown files for better schema coverage. |
| markdown_urls | array of strings | No | URLs to Markdown files to analyze. |
| prompt | string | No | Instructions for how to generate or modify the schema. |
| schema | string | No | An existing JSON schema to refine or iterate on. |
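
The "at least one of" rule can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name build_form_fields is hypothetical, and repeating the markdown_urls field once per URL is an assumption about how the array is encoded in form data.

```python
from __future__ import annotations

from typing import Optional, Sequence


def build_form_fields(
    markdowns: Sequence[str] = (),
    markdown_urls: Sequence[str] = (),
    prompt: Optional[str] = None,
    schema: Optional[str] = None,
    model: Optional[str] = None,
) -> list[tuple[str, str]]:
    """Return (name, value) form fields for a build-schema request.

    Raises ValueError when none of markdowns, markdown_urls, or prompt
    is set, mirroring the rule stated for the request parameters.
    """
    if not (markdowns or markdown_urls or prompt):
        raise ValueError(
            "Provide at least one of markdowns, markdown_urls, or prompt"
        )

    fields: list[tuple[str, str]] = []
    if model:
        fields.append(("model", model))
    # One "markdowns" field per inline Markdown string.
    fields.extend(("markdowns", text) for text in markdowns)
    # ASSUMPTION: each URL is sent as its own repeated "markdown_urls" field.
    fields.extend(("markdown_urls", url) for url in markdown_urls)
    if prompt:
        fields.append(("prompt", prompt))
    if schema:
        fields.append(("schema", schema))
    return fields
```

Passing only schema raises ValueError, because a schema alone gives the API nothing to generate or refine against.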

Response

The response contains:
  • extraction_schema (string): The generated JSON schema, returned as a string.
  • metadata: Includes job_id, duration_ms, credit_usage, and version.
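
Because extraction_schema is returned as a string rather than a nested object, it needs a second JSON decode before you can inspect its fields. A minimal sketch, assuming the response body is a JSON object with the two keys listed above; the function name and the sample values are placeholders made up for illustration, not real API output.

```python
from __future__ import annotations

import json
from typing import Any


def parse_build_schema_response(body: str) -> tuple[dict[str, Any], dict[str, Any]]:
    """Split a response body into (schema as dict, metadata dict)."""
    payload = json.loads(body)
    # extraction_schema is itself a JSON string, so decode it again.
    schema = json.loads(payload["extraction_schema"])
    metadata = payload.get("metadata", {})
    return schema, metadata


# Illustrative body only: every value here is a placeholder.
sample_body = json.dumps({
    "extraction_schema": json.dumps(
        {"type": "object", "properties": {"vendor": {"type": "string"}}}
    ),
    "metadata": {
        "job_id": "example-job",
        "duration_ms": 0,
        "credit_usage": 0,
        "version": "example",
    },
})

schema, metadata = parse_build_schema_response(sample_body)
print(sorted(schema["properties"]))  # ['vendor']
```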

Workflows

Generate a Master Schema from Markdown Files

Pass one or more Markdown files to generate a schema based on the content. The API identifies the fields present in the Markdown and returns an extraction schema.
curl -X POST 'https://api.va.landing.ai/v1/ade/extract/build-schema' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'markdowns=@markdown.md' \
  -F 'model=extract-latest'
To build a master schema that covers multiple document types, pass multiple Markdown files that represent the range of layouts you expect to process. The API generates a single schema that handles field and layout variation across all of them:
curl -X POST 'https://api.va.landing.ai/v1/ade/extract/build-schema' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'markdowns=@markdown_1.md' \
  -F 'markdowns=@markdown_2.md' \
  -F 'model=extract-latest'
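
The multi-file curl call above could be reproduced from Python roughly as follows. This is a sketch using only the standard library, assuming the endpoint accepts multipart/form-data as the -F flags imply. The helpers encode_multipart and build_request are hypothetical names, the text/markdown part type is an assumption, and the request is built but not sent.

```python
import urllib.request
import uuid

ENDPOINT = "https://api.va.landing.ai/v1/ade/extract/build-schema"


def encode_multipart(fields, files):
    """Encode text fields and in-memory files as multipart/form-data.

    fields: list of (name, value) string pairs.
    files: list of (name, filename, content_bytes) triples.
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields:
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]
    for name, filename, content in files:
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"'.encode(),
            b"Content-Type: text/markdown",  # assumed part type
            b"",
            content,
        ]
    lines += [f"--{boundary}--".encode(), b""]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def build_request(api_key, markdown_files, model="extract-latest"):
    """Build (but do not send) a POST with one markdowns part per file."""
    files = [("markdowns", name, data) for name, data in markdown_files]
    body, content_type = encode_multipart([("model", model)], files)
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )


request = build_request(
    "YOUR_API_KEY",
    [("markdown_1.md", b"# Invoice A\n"), ("markdown_2.md", b"# Invoice B\n")],
)
# To send it, pass the request to urllib.request.urlopen(request).
```

Repeating the markdowns part once per file mirrors the repeated -F 'markdowns=@...' flags in the curl example.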

Generate a Schema from a Prompt

Use the prompt parameter to specify which fields to extract. This is useful when you only need a subset of the fields in the Markdown file, or when you want to shape the field names and structure.
curl -X POST 'https://api.va.landing.ai/v1/ade/extract/build-schema' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'markdowns=@markdown.md' \
  -F 'model=extract-latest' \
  -F 'prompt=Extract the vendor name, invoice date, and total amount due'
You can also use prompt without any Markdown input to generate a schema based on instructions alone:
curl -X POST 'https://api.va.landing.ai/v1/ade/extract/build-schema' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'model=extract-latest' \
  -F 'prompt=Create a schema for extracting patient name, date of birth, and insurance provider from medical intake forms'

Detect Schema Drift and Refine an Existing Schema

Pass an existing schema in the schema parameter to refine it. This is useful for schema drift detection: if a new document type enters your pipeline (for example, invoices from a new vendor that uses a different layout and field names), you can pass the new Markdown alongside your current schema. The API surfaces new or changed fields so you can update the schema before it affects your pipeline.
To refine a schema based on a Markdown file:
curl -X POST 'https://api.va.landing.ai/v1/ade/extract/build-schema' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'markdowns=@markdown.md' \
  -F 'model=extract-latest' \
  -F 'schema={"type":"object","properties":{"vendor":{"type":"string"},"total":{"type":"number"}}}'
To update a schema based on a prompt:
curl -X POST 'https://api.va.landing.ai/v1/ade/extract/build-schema' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'model=extract-latest' \
  -F 'schema={"type":"object","properties":{"vendor":{"type":"string"},"total":{"type":"number"}}}' \
  -F 'prompt=Add a field for the invoice number and make the total field return a string with the currency symbol'
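
One way to turn a refined schema into a drift report is to compare its top-level properties with those of the schema you sent. The following is a minimal sketch, not something the API does for you: diff_schema_fields is a hypothetical helper, it only inspects the top-level properties object, and the refined string is a hand-written stand-in for a returned extraction_schema value.

```python
import json


def diff_schema_fields(current_schema: str, refined_schema: str) -> dict:
    """Compare top-level properties of two JSON schema strings."""
    old = json.loads(current_schema).get("properties", {})
    new = json.loads(refined_schema).get("properties", {})
    return {
        # Fields present only in the refined schema.
        "added": sorted(new.keys() - old.keys()),
        # Fields present only in the current schema.
        "removed": sorted(old.keys() - new.keys()),
        # Fields in both whose definitions differ.
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


current = (
    '{"type":"object","properties":'
    '{"vendor":{"type":"string"},"total":{"type":"number"}}}'
)
# Hypothetical refined schema, standing in for a returned extraction_schema.
refined = (
    '{"type":"object","properties":'
    '{"vendor":{"type":"string"},"total":{"type":"string"},'
    '"invoice_number":{"type":"string"}}}'
)

drift = diff_schema_fields(current, refined)
print(drift)
# {'added': ['invoice_number'], 'removed': [], 'changed': ['total']}
```

A non-empty added or changed list is a signal to review the refined schema before swapping it into the pipeline. Nested objects would need a recursive comparison.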