# Parse & Extract

> Parse a document and extract specific fields from it using the Python or TypeScript library.

export const splitJSON = 'split rules';

export const split = 'ADE Split';

export const adeTypeScriptLibrary = 'ade-typescript';

export const adePythonLibrary = 'ade-python';

export const dpt2mini = 'DPT-2 mini';

export const dpt2 = 'DPT-2';

export const dpt1 = 'DPT-1';

export const dpt = 'Document Pre-Trained Transformer';

export const companyName = 'LandingAI';

export const buildExtract = 'ADE Build Extract Schema';

export const extract = 'ADE Extract';

export const parse = 'ADE Parse';

export const ade = 'Agentic Document Extraction';

## Overview

This tutorial shows how to parse a document with the {parse} API and then extract specific fields from the parsed Markdown with the {extract} API. The examples use the [{adePythonLibrary} library](./ade-python) and the [{adeTypeScriptLibrary} library](./ade-typescript).

In this tutorial, you will:

* Parse this PDF: <a href="/examples/wire-transfer.pdf" download="wire-transfer.pdf">Wire Transfer Form</a>
* Extract these fields: **Bank Name** and **Total Invoice Amount**

<Info>
  These examples require the [Python](./ade-python) or [TypeScript](./ade-typescript) client library. Before running a script, install the library and any required dependencies, and set your API key in the `VISION_AGENT_API_KEY` environment variable.
</Info>

<Info>
  The scripts have been tested with PDF and PNG files and may work with other file types supported by {ade}.
</Info>
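
Before running the tutorial script, it can help to confirm that both prerequisites are in place. The following is a minimal sketch, not part of the library: `preflight` is a hypothetical helper name, and it assumes only what the script in this tutorial shows, namely that the client reads the `VISION_AGENT_API_KEY` environment variable and that the Python module is imported as `landingai_ade`.

```python
import importlib.util
import os


def preflight(env_var: str = "VISION_AGENT_API_KEY",
              module: str = "landingai_ade") -> list[str]:
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration; it is not part of the client library.
    """
    problems = []
    # The tutorial script expects the API key in this environment variable.
    if not os.environ.get(env_var):
        problems.append(f"Environment variable {env_var} is not set.")
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module) is None:
        problems.append(f"Python module {module!r} is not installed.")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print(issue)
    if not issues:
        print("Environment looks ready.")
```

An empty list only means the variable is set and the module can be found; it does not verify that the key itself is valid.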

## 1. Download the Document to Process

Download the <a href="/examples/wire-transfer.pdf" download="wire-transfer.pdf">Wire Transfer Form</a> and save it to a local directory.

## 2. Create the Script

Copy the script for your language and save it as `parse-extract.py` or `parse-extract.ts` in the same directory as the PDF.

<CodeGroup>
  ```python Python [expandable] theme={null}
  import json
  from pathlib import Path
  from landingai_ade import LandingAIADE

  # Initialize client (uses VISION_AGENT_API_KEY environment variable)
  client = LandingAIADE()

  # Define the extraction schema
  schema = json.dumps({
      "type": "object",
      "properties": {
          "bank_name": {
              "description": "The official name of the bank where the account is held.",
              "x-alternativeNames": ["Name of Bank", "Financial Institution", "Bank"],
              "type": "string"
          },
          "total_invoice_amount": {
              "description": "The total monetary amount of the invoice, including all charges and taxes.",
              "x-alternativeNames": ["Grand Total", "Amount Due", "Invoice Total"],
              "type": "number"
          }
      }
  })

  # Parse the document
  # save_to is optional, but saves the full parse response, which is useful for
  # keeping a record and for other downstream processing tasks
  parse_response = client.parse(
      document=Path('wire-transfer.pdf'),
      model='dpt-2-latest',
      save_to='output'
  )

  # Extract fields from the parsed output
  extract_response = client.extract(
      schema=schema,
      markdown=parse_response.markdown,
      model='extract-latest'
  )

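  # Save the extract results to a JSON file
  # Create the output folder first in case it does not already exist
  Path('output').mkdir(parents=True, exist_ok=True)
  with open('output/wire-transfer_extract_output.json', 'w') as f:
      json.dump(extract_response.to_dict(), f, indent=2)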
  ```

  ```typescript TypeScript [expandable] theme={null}
  import LandingAIADE, { toFile } from "landingai-ade";
  import fs from "fs";

  // Initialize client (uses VISION_AGENT_API_KEY environment variable)
  const client = new LandingAIADE();

  // Define the extraction schema
  const schema = JSON.stringify({
    type: "object",
    properties: {
      bank_name: {
        description: "The official name of the bank where the account is held.",
        "x-alternativeNames": ["Name of Bank", "Financial Institution", "Bank"],
        type: "string"
      },
      total_invoice_amount: {
        description: "The total monetary amount of the invoice, including all charges and taxes.",
        "x-alternativeNames": ["Grand Total", "Amount Due", "Invoice Total"],
        type: "number"
      }
    }
  });

  // Parse the document
  // saveTo is optional, but saves the full parse response, which is useful for
  // keeping a record and for other downstream processing tasks
  const parseResponse = await client.parse({
    document: fs.createReadStream("wire-transfer.pdf"),
    model: "dpt-2-latest",
    saveTo: "output"
  });

  // Extract fields from the parsed output
  const extractResponse = await client.extract({
    schema: schema,
    markdown: await toFile(Buffer.from(parseResponse.markdown), "document.md"),
    model: "extract-latest"
  });

  // Save the extract results to a JSON file
  fs.mkdirSync("output", { recursive: true });
  fs.writeFileSync(
    "output/wire-transfer_extract_output.json",
    JSON.stringify(extractResponse, null, 2)
  );
  ```
</CodeGroup>

## 3. Run the Script

Run the script from the directory that contains the script and the PDF:

<CodeGroup>
  ```bash Run Python theme={null}
  python parse-extract.py
  ```

  ```bash Run TypeScript theme={null}
  npx tsx parse-extract.ts
  ```
</CodeGroup>

## 4. View Extraction Output

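The script saves its results to an `output` folder in the same directory. Open `wire-transfer_extract_output.json` to view the extracted fields (`extraction`), the references for each field (`extraction_metadata`), and job details such as credit usage and duration (`metadata`).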

```json [expandable] theme={null}
{
  "extraction": {
    "bank_name": "JPMorgan Chase Bank, N.A.",
    "total_invoice_amount": 15750.0
  },
  "extraction_metadata": {
    "bank_name": {
      "references": [
        "4f64f8d9-ff3a-4c47-aeb5-2ab6eaa9ce7a"
      ],
      "value": "JPMorgan Chase Bank, N.A."
    },
    "total_invoice_amount": {
      "references": [
        "deeb001e-6b3e-4c4e-96b1-6f321521ad4f",
        "0-h"
      ],
      "value": 15750.0
    }
  },
  "metadata": {
    "credit_usage": 0.5396,
    "duration_ms": 11536,
    "filename": "upload.md",
    "job_id": "bec005b58d144096b0525af3aa6ed12d",
    "org_id": null,
    "version": "extract-20260314",
    "fallback_model_version": null,
    "schema_violation_error": null,
    "warnings": []
  }
}
```
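
To inspect the saved file programmatically, a short script can load it and print each field with its value and references. This is a minimal sketch that assumes the JSON has the same top-level keys as the sample above (`extraction`, `extraction_metadata`, `metadata`); `summarize_extraction` is a hypothetical helper name, not a library function.

```python
import json
from pathlib import Path


def summarize_extraction(result: dict) -> list[str]:
    """Build one line per extracted field: name, value, and reference IDs."""
    lines = []
    metadata = result.get("extraction_metadata", {})
    for field, value in result.get("extraction", {}).items():
        # Each metadata entry is assumed to carry a "references" list,
        # as in the sample output above.
        refs = metadata.get(field, {}).get("references", [])
        ref_text = ", ".join(map(str, refs)) or "none"
        lines.append(f"{field}: {value!r} (references: {ref_text})")
    return lines


if __name__ == "__main__":
    output_file = Path("output/wire-transfer_extract_output.json")
    if output_file.exists():
        result = json.loads(output_file.read_text())
        print("\n".join(summarize_extraction(result)))
        print("credits used:", result.get("metadata", {}).get("credit_usage"))
    else:
        print(f"{output_file} not found; run the tutorial script first.")
```

The helper uses `.get` with defaults throughout, so a response that lacks one of the sections produces fewer lines rather than an exception.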

## Next Steps

Now that you have a working script, you can:

* Replace `wire-transfer.pdf` with any document you want to parse and extract from.
* Modify the `schema` definition to extract different fields. For guidance, see [Extraction Schema (JSON)](./ade-extract-schema-json).
* Use the Playground to build and test a schema before adding it to your code. See [Schema Wizard](./ade-extract-playground).
* Link extracted fields back to their locations in the original document. See [Link Extracted Data to Document Locations](./ade-extract-grounding-sample).
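
The first two items can be sketched in Python as follows. This is an illustration under stated assumptions, not the library's API: `build_schema` and `extract_output_path` are hypothetical helpers, and `beneficiary_name` is a made-up field that may not match anything in your document. The two base properties are copied from the tutorial schema, and the output file name follows the pattern the tutorial script uses.

```python
import json
from pathlib import Path

# Properties copied from the tutorial schema.
BASE_PROPERTIES = {
    "bank_name": {
        "description": "The official name of the bank where the account is held.",
        "x-alternativeNames": ["Name of Bank", "Financial Institution", "Bank"],
        "type": "string",
    },
    "total_invoice_amount": {
        "description": "The total monetary amount of the invoice, including all charges and taxes.",
        "x-alternativeNames": ["Grand Total", "Amount Due", "Invoice Total"],
        "type": "number",
    },
}


def build_schema(extra_properties: dict) -> str:
    """Merge extra field definitions into the tutorial schema and serialize it."""
    properties = {**BASE_PROPERTIES, **extra_properties}
    return json.dumps({"type": "object", "properties": properties})


def extract_output_path(document: Path, output_dir: Path = Path("output")) -> Path:
    """Return <output_dir>/<document stem>_extract_output.json."""
    return output_dir / f"{document.stem}_extract_output.json"


# Hypothetical extra field; adjust the name and description to your document.
schema = build_schema({
    "beneficiary_name": {
        "description": "The name of the person or company receiving the transfer.",
        "type": "string",
    }
})

if __name__ == "__main__":
    print(schema)
    print(extract_output_path(Path("wire-transfer.pdf")))
```

The resulting `schema` string can be passed to `client.extract` in place of the one in the tutorial script, and `extract_output_path` replaces the hard-coded output file name when you switch to a different input document.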