The library is a lightweight TypeScript library you can use for parsing documents, splitting documents into sub-documents, and extracting data. The library is automatically generated from our API specification, ensuring you have access to the latest endpoints and parameters.

Important Considerations (Supported Features)

The library does not support:
  • The legacy API

Install the Library

npm install landingai-ade

Set the API Key as an Environment Variable

To use the library, first generate an API key. Save the key to a .zshrc file or another secure location on your computer. Then export the key as an environment variable.
export VISION_AGENT_API_KEY=<your-api-key>
When initializing the client, the library automatically reads from the VISION_AGENT_API_KEY environment variable:
import LandingAIADE from "landingai-ade";

const client = new LandingAIADE();
Alternatively, you can explicitly pass the API key when initializing the client:
import LandingAIADE from "landingai-ade";

const client = new LandingAIADE({
  apiKey: process.env.VISION_AGENT_API_KEY
});
For more information about API keys and alternate methods for setting the API key, go to API Key.

Use with EU Endpoints

By default, the library uses the US endpoints. If your API key is from the EU endpoint, set the environment parameter to eu when initializing the client.
import LandingAIADE from "landingai-ade";

const client = new LandingAIADE({
  environment: "eu",
});

// ... rest of your code
For more information about using the library in the EU, go to European Union (EU).

Parse: Getting Started

The parse method converts documents into structured Markdown with chunk and grounding metadata. Use these examples as guides to get started with parsing with the library.

Parse Local Files

Use the document parameter to parse files from your filesystem. Pass the file as a read stream using fs.createReadStream().
import LandingAIADE from "landingai-ade";
import fs from "fs";

const client = new LandingAIADE();

// Replace with your file path
const response = await client.parse({
  document: fs.createReadStream("/path/to/file/document"),
  model: "dpt-2-latest"
});

console.log(response.chunks);

// Save Markdown output (useful if you plan to run extract on the Markdown)
fs.writeFileSync("output.md", response.markdown, "utf-8");

Parse Remote URLs

Use the document parameter with fetch() to parse files from remote URLs (http, https).
import LandingAIADE from "landingai-ade";
import fs from "fs";

const client = new LandingAIADE();

// Parse a remote file
const response = await client.parse({
  document: await fetch("https://example.com/document.pdf"),
  model: "dpt-2-latest"
});

console.log(response.chunks);

// Save Markdown output (useful if you plan to run extract on the Markdown)
fs.writeFileSync("output.md", response.markdown, "utf-8");

Set Parameters

The parse method accepts optional parameters to customize parsing behavior. To see all available parameters, go to ADE Parse API. Pass these parameters directly to the parse() method.
import LandingAIADE from "landingai-ade";
import fs from "fs";

const client = new LandingAIADE();

const response = await client.parse({
  document: fs.createReadStream("/path/to/document.pdf"),
  model: "dpt-2-latest",
  split: "page"
});

Parse Jobs

The parseJobs resource enables you to asynchronously parse documents that are up to 1,000 pages or 1 GB. For more information about parse jobs, go to Parse Large Files (Parse Jobs). Here is the basic workflow for working with parse jobs:
  1. Start a parse job.
  2. Copy the job_id in the response.
  3. Get the results from the parsing job with the job_id.
This script contains the full workflow:
import LandingAIADE from "landingai-ade";
import fs from "fs";

const client = new LandingAIADE();

// Step 1: Create a parse job
const job = await client.parseJobs.create({
  document: fs.createReadStream("/path/to/file/document"),
  model: "dpt-2-latest"
});

const jobId = job.job_id;
console.log(`Job ${jobId} created.`);

// Step 2: Get the parsing results
while (true) {
  const response = await client.parseJobs.get(jobId);
  if (response.status === "completed") {
    console.log(`Job ${jobId} completed.`);
    break;
  }
  if (response.status === "failed") {
    throw new Error(`Job ${jobId} failed.`);
  }
  console.log(`Job ${jobId}: ${response.status} (${(response.progress * 100).toFixed(0)}% complete)`);
  await new Promise(resolve => setTimeout(resolve, 5000));
}

// Step 3: Access the parsed data
const response = await client.parseJobs.get(jobId);
console.log("Global Markdown:", response.data.markdown.substring(0, 200) + "...");
console.log(`Number of chunks: ${response.data.chunks.length}`);

// Save Markdown output (useful if you plan to run extract on the Markdown)
fs.writeFileSync("output.md", response.data.markdown, "utf-8");
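The polling loop above can be factored into a reusable helper. This is a sketch: `pollUntilDone` and the `JobStatus` type are illustrative names, not part of the library, and the terminal states assumed here ("completed", "failed") mirror the statuses used in the example above.

```typescript
// Illustrative polling helper: repeatedly calls `getStatus` until it reports
// a terminal state or the timeout elapses. The status values are assumptions
// based on the statuses shown in the workflow above.
type JobStatus = "pending" | "processing" | "completed" | "failed";

async function pollUntilDone(
  getStatus: () => Promise<JobStatus>,
  { intervalMs = 5000, timeoutMs = 10 * 60 * 1000 } = {}
): Promise<JobStatus> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const status = await getStatus();
    if (status === "completed" || status === "failed") return status;
    // Wait before polling again
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for job to finish");
}
```

With the client from the script above, you could call it as `await pollUntilDone(async () => (await client.parseJobs.get(jobId)).status)`.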

List Parse Jobs

To list all async parse jobs associated with your API key, run this code:
import LandingAIADE from "landingai-ade";

const client = new LandingAIADE();

// List all jobs
const response = await client.parseJobs.list();
for (const job of response.jobs) {
  console.log(`Job ${job.job_id}: ${job.status}`);
}
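If you have many jobs, a small tally by status can make the listing easier to scan. This is a sketch: `countByStatus` is an illustrative helper, and the job object shape only assumes the `job_id` and `status` fields used in the loop above.

```typescript
// Illustrative helper: count jobs by status. Only relies on the `status`
// field shown in the listing example above.
function countByStatus(jobs: Array<{ job_id: string; status: string }>): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const job of jobs) {
    counts[job.status] = (counts[job.status] ?? 0) + 1;
  }
  return counts;
}
```

For example, `console.log(countByStatus(response.jobs))` prints a status summary instead of one line per job.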

Work with Parse Response Data

Access all text chunks:
for (const chunk of response.chunks) {
  if (chunk.type === 'text') {
    console.log(`Chunk ${chunk.id}: ${chunk.markdown}`);
  }
}
Filter chunks by page:
const page0Chunks = response.chunks.filter(chunk => chunk.grounding.page === 0);
console.log(page0Chunks);
Get chunk locations:
for (const chunk of response.chunks) {
  const box = chunk.grounding.box;
  console.log(`Chunk at page ${chunk.grounding.page}: (${box.left}, ${box.top}, ${box.right}, ${box.bottom})`);
}
Identify the chunk type for each chunk:
for (const [chunkId, grounding] of Object.entries(response.grounding)) {
  console.log(`Chunk ${chunkId} has type: ${grounding.type}`);
}
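If you filter by page repeatedly, it can help to index the chunks once. This is a sketch: `groupChunksByPage` is an illustrative helper, and the chunk shape only assumes the `grounding.page` field used in the snippets above.

```typescript
// Illustrative helper: index chunks by page number for fast per-page lookups.
interface ChunkLike {
  id: string;
  grounding: { page: number };
}

function groupChunksByPage<T extends ChunkLike>(chunks: T[]): Map<number, T[]> {
  const byPage = new Map<number, T[]>();
  for (const chunk of chunks) {
    const bucket = byPage.get(chunk.grounding.page) ?? [];
    bucket.push(chunk);
    byPage.set(chunk.grounding.page, bucket);
  }
  return byPage;
}
```

After `const byPage = groupChunksByPage(response.chunks)`, `byPage.get(0)` returns the chunks on the first page without rescanning the whole array.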

Split: Getting Started

The split method classifies and separates a parsed document into multiple sub-documents based on Split Rules you define. Use these examples as guides to get started with splitting.

Pass Markdown Content

The library supports a few methods for passing the Markdown content for splitting:
  • Split data directly from the parse response
  • Split data from a local Markdown file
  • Split data from a Markdown file at a remote URL: markdown: await fetch("https://example.com/file.md")
Define Split Rules

Split Rules define how the API classifies and separates your document. Each Split Rule consists of:
  • name: The Split Type name (required)
  • description: Additional context about what this Split Type represents (optional)
  • identifier: A field that makes each instance unique, used to create separate splits (optional)
For more information about Split Rules, see Split Rules.

Split from Parse Response

After parsing a document, you can pass the Markdown string directly from the ParseResponse to the split method without saving it to a file.
import LandingAIADE from "landingai-ade";
import fs from "fs";

const client = new LandingAIADE();

// Parse the document
const parseResponse = await client.parse({
  document: fs.createReadStream("/path/to/document.pdf"),
  model: "dpt-2-latest"
});

// Define Split Rules
const splitClass = [
  {
    name: "Bank Statement",
    description: "Document from a bank that summarizes all account activity over a period of time."
  },
  {
    name: "Pay Stub",
    description: "Document that details an employee's earnings, deductions, and net pay for a specific pay period.",
    identifier: "Pay Stub Date"
  }
];

// Split using the Markdown string from parse response
const splitResponse = await client.split({
  split_class: splitClass,
  markdown: parseResponse.markdown,  // Pass Markdown string directly
  model: "split-latest"
});

// Access the splits
for (const split of splitResponse.splits) {
  console.log(`Classification: ${split.classification}`);
  console.log(`Identifier: ${split.identifier}`);
  console.log(`Pages: ${split.pages}`);
}

Split from Markdown Files

If you already have a Markdown file (from a previous parsing operation), you can split it directly. Use the markdown parameter with fs.createReadStream() for local Markdown files or with fetch() for remote Markdown files.
import LandingAIADE from "landingai-ade";
import fs from "fs";

const client = new LandingAIADE();

// Define Split Rules
const splitClass = [
  {
    name: "Invoice",
    description: "A document requesting payment for goods or services.",
    identifier: "Invoice Number"
  },
  {
    name: "Receipt",
    description: "A document acknowledging that payment has been received."
  }
];

// Split from a local Markdown file
const splitResponse = await client.split({
  split_class: splitClass,
  markdown: fs.createReadStream("/path/to/parsed_output.md"),
  model: "split-latest"
});

// Or split from a remote Markdown file
const remoteSplitResponse = await client.split({
  split_class: splitClass,
  markdown: await fetch("https://example.com/document.md"),
  model: "split-latest"
});

// Access the splits
for (const split of splitResponse.splits) {
  console.log(`Classification: ${split.classification}`);
  if (split.identifier) {
    console.log(`Identifier: ${split.identifier}`);
  }
  console.log(`Number of pages: ${split.pages.length}`);
  console.log(`Markdown content: ${split.markdowns[0].substring(0, 100)}...`);
}

Set Parameters

The split method accepts optional parameters to customize split behavior. To see all available parameters, go to ADE Split API.
import LandingAIADE from "landingai-ade";
import fs from "fs";

const client = new LandingAIADE();

const splitResponse = await client.split({
  split_class: [
    { name: "Section A", description: "Introduction section" },
    { name: "Section B", description: "Main content section" }
  ],
  markdown: fs.createReadStream("/path/to/parsed_output.md"),
  model: "split-20251105"  // Use a specific model version
});

Split Output

The split method returns a SplitResponse object with the following fields:
  • splits: Array of Split objects, each containing:
    • classification: The Split Type name assigned to this sub-document
    • identifier: The unique identifier value (or null if no identifier was specified)
    • pages: Array of zero-indexed page numbers that belong to this split
    • markdowns: Array of Markdown content strings, one for each page
  • metadata: Processing information (credit usage, duration, filename, job ID, page count, version)
For detailed information about the response structure, see JSON Response for Splitting.
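A common next step is saving each split to its own file. This is a sketch: `splitFilename` is an illustrative helper, and it only relies on the `classification` and `identifier` fields described above.

```typescript
// Illustrative helper: build a filesystem-safe filename for a split from its
// classification and optional identifier, falling back to the split's index.
function splitFilename(
  split: { classification: string; identifier: string | null },
  index: number
): string {
  const base = split.identifier
    ? `${split.classification}_${split.identifier}`
    : `${split.classification}_${index}`;
  // Replace anything that isn't a lowercase letter or digit with underscores
  return base.toLowerCase().replace(/[^a-z0-9]+/g, "_") + ".md";
}
```

You could then write each split with something like `fs.writeFileSync(splitFilename(split, i), split.markdowns.join("\n\n"))`.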

Work with Split Response Data

Access all splits by classification:
for (const split of splitResponse.splits) {
  console.log(`Split Type: ${split.classification}`);
  console.log(`Pages included: ${split.pages}`);
}
Filter splits by classification:
const invoices = splitResponse.splits.filter(split => split.classification === "Invoice");
console.log(`Found ${invoices.length} invoices`);
Access Markdown content for each split:
for (const split of splitResponse.splits) {
  console.log(`Classification: ${split.classification}`);
  for (let i = 0; i < split.markdowns.length; i++) {
    console.log(`  Page ${split.pages[i]} Markdown: ${split.markdowns[i].substring(0, 100)}...`);
  }
}
Group splits by identifier:
const splitsByIdentifier = new Map<string, Array<typeof splitResponse.splits[0]>>();

for (const split of splitResponse.splits) {
  if (split.identifier) {
    const existing = splitsByIdentifier.get(split.identifier) || [];
    existing.push(split);
    splitsByIdentifier.set(split.identifier, existing);
  }
}

for (const [identifier, splits] of splitsByIdentifier.entries()) {
  console.log(`Identifier '${identifier}': ${splits.length} split(s)`);
}

Extract: Getting Started

The extract method extracts structured data from Markdown content using extraction schemas. Use these examples as guides to get started with extracting.

Pass Markdown Content

The library supports a few methods for passing the Markdown content for extraction:
  • Extract data directly from the parse response
  • Extract data from a local Markdown file
  • Extract data from a Markdown file at a remote URL: markdown: await fetch("https://example.com/file.md")
Pass the Extraction Schema

The library supports a few methods for passing the extraction schema:
  • A Zod schema converted to JSON Schema
  • An inline JSON Schema defined in your script
  • A JSON Schema loaded from a file

Extract from Parse Response

After parsing a document, you can pass the markdown string directly from the ParseResponse to the extract method without saving it to a file.
import LandingAIADE, { toFile } from "landingai-ade";
import fs from "fs";

// Define your extraction schema
const schemaDict = {
  type: "object",
  properties: {
    employee_name: {
      type: "string",
      description: "The employee's full name"
    }
  }
};

const client = new LandingAIADE();
const schemaJson = JSON.stringify(schemaDict);

// Parse the document
const parseResponse = await client.parse({
  document: fs.createReadStream("/path/to/document.pdf"),
  model: "dpt-2-latest"
});

// Extract data using the markdown string from parse response
const extractResponse = await client.extract({
  schema: schemaJson,
  markdown: await toFile(Buffer.from(parseResponse.markdown), "document.md"),
  model: "extract-latest"
});

// Access the extracted data
console.log(extractResponse.extraction);

Extract from Markdown Files

If you already have a Markdown file (from a previous parsing operation), you can extract data directly from it. Use the markdown parameter with fs.createReadStream() for local Markdown files or with fetch() for remote Markdown files.
import LandingAIADE from "landingai-ade";
import fs from "fs";

// Define your extraction schema
const schemaDict = {
  type: "object",
  properties: {
    employee_name: {
      type: "string",
      description: "The employee's full name"
    },
    employee_ssn: {
      type: "string",
      description: "The employee's Social Security Number"
    },
    gross_pay: {
      type: "number",
      description: "The gross pay amount"
    }
  },
  required: ["employee_name", "employee_ssn", "gross_pay"]
};

const client = new LandingAIADE();
const schemaJson = JSON.stringify(schemaDict);

// Extract from a local markdown file
const extractResponse = await client.extract({
  schema: schemaJson,
  markdown: fs.createReadStream("/path/to/output.md"),
  model: "extract-latest"
});

// Or extract from a remote markdown file
const remoteExtractResponse = await client.extract({
  schema: schemaJson,
  markdown: await fetch("https://example.com/document.md"),
  model: "extract-latest"
});

// Access the extracted data
console.log(extractResponse.extraction);

Extraction with Zod

Use Zod schemas to define your extraction schema in a type-safe way. Zod provides TypeScript type inference and runtime validation for your extracted data. To use Zod with the library, install zod:
npm install zod
After installing zod, run extraction with the library:
import LandingAIADE, { toFile } from "landingai-ade";
import fs from "fs";
import { z } from "zod";

// Define your extraction schema as a Zod schema
const PayStubSchema = z.object({
  employee_name: z.string().describe("The employee's full name"),
  employee_ssn: z.string().describe("The employee's Social Security Number"),
  gross_pay: z.number().describe("The gross pay amount")
});

// Extract TypeScript type from schema
type PayStubData = z.infer<typeof PayStubSchema>;

// Initialize the client
const client = new LandingAIADE();

// First, parse the document to get markdown
const parseResponse = await client.parse({
  document: fs.createReadStream("/path/to/pay-stub.pdf"),
  model: "dpt-2-latest"
});

// Convert Zod schema to JSON schema
const schema = JSON.stringify(z.toJSONSchema(PayStubSchema));

// Extract structured data using the schema
const extractResponse = await client.extract({
  schema: schema,
  markdown: await toFile(Buffer.from(parseResponse.markdown), "document.md"),
  model: "extract-latest"
});

// Access the extracted data with type safety
const data = extractResponse.extraction as PayStubData;
console.log(data);

// Access extraction metadata to see which chunks were referenced
console.log(extractResponse.extraction_metadata);

Extraction with JSON Schema (Inline)

Define your extraction schema directly as a JSON string in your script.
import LandingAIADE, { toFile } from "landingai-ade";
import fs from "fs";

// Define your extraction schema as an object
const schemaDict = {
  type: "object",
  properties: {
    employee_name: {
      type: "string",
      description: "The employee's full name"
    },
    employee_ssn: {
      type: "string",
      description: "The employee's Social Security Number"
    },
    gross_pay: {
      type: "number",
      description: "The gross pay amount"
    }
  },
  required: ["employee_name", "employee_ssn", "gross_pay"]
};

// Initialize the client
const client = new LandingAIADE();

// First, parse the document to get markdown
const parseResponse = await client.parse({
  document: fs.createReadStream("/path/to/pay-stub.pdf"),
  model: "dpt-2-latest"
});

// Convert schema object to JSON string
const schemaJson = JSON.stringify(schemaDict);

// Extract structured data using the schema
const extractResponse = await client.extract({
  schema: schemaJson,
  markdown: await toFile(Buffer.from(parseResponse.markdown), "document.md"),
  model: "extract-latest"
});

// Access the extracted data
console.log(extractResponse.extraction);

// Access extraction metadata to see which chunks were referenced
console.log(extractResponse.extraction_metadata);

Extraction with JSON Schema File

Load your extraction schema from a separate JSON file for better organization and reusability. For example, here is the pay_stub_schema.json file:
{
  "type": "object",
  "properties": {
    "employee_name": {
      "type": "string",
      "description": "The employee's full name"
    },
    "employee_ssn": {
      "type": "string",
      "description": "The employee's Social Security Number"
    },
    "gross_pay": {
      "type": "number",
      "description": "The gross pay amount"
    }
  },
  "required": ["employee_name", "employee_ssn", "gross_pay"]
}
You can pass the JSON file defined above in the following script:
import LandingAIADE, { toFile } from "landingai-ade";
import fs from "fs";

// Initialize the client
const client = new LandingAIADE();

// First, parse the document to get markdown
const parseResponse = await client.parse({
  document: fs.createReadStream("/path/to/pay-stub.pdf"),
  model: "dpt-2-latest"
});

// Load schema from JSON file
const schemaJson = fs.readFileSync("pay_stub_schema.json", "utf-8");

// Extract structured data using the schema
const extractResponse = await client.extract({
  schema: schemaJson,
  markdown: await toFile(Buffer.from(parseResponse.markdown), "document.md"),
  model: "extract-latest"
});

// Access the extracted data
console.log(extractResponse.extraction);

// Access extraction metadata to see which chunks were referenced
console.log(extractResponse.extraction_metadata);

Extract Nested Subfields

Define nested Zod schemas to extract hierarchical data from documents. This approach organizes related information under meaningful section names. The schemas with nested fields must be defined before the main extraction schema. Otherwise you may get an error that the schemas with the nested fields are not defined. For example, to extract data from the Patient Details and Emergency Contact Information sections in this Medical Form, define separate schemas for each section, then combine them in a main schema.
import LandingAIADE, { toFile } from "landingai-ade";
import fs from "fs";
import { z } from "zod";

// Define a nested schema for patient-specific information
const PatientDetailsSchema = z.object({
  patient_name: z.string().describe("Full name of the patient."),
  date: z.string().describe("Date the patient information form was filled out.")
});

// Define a nested schema for emergency contact details
const EmergencyContactInformationSchema = z.object({
  emergency_contact_name: z.string().describe("Full name of the emergency contact person."),
  relationship_to_patient: z.string().describe("Relationship of the emergency contact to the patient."),
  primary_phone_number: z.string().describe("Primary phone number of the emergency contact."),
  secondary_phone_number: z.string().describe("Secondary phone number of the emergency contact."),
  address: z.string().describe("Full address of the emergency contact.")
});

// Define the main extraction schema that combines all the nested schemas
const PatientAndEmergencyContactInformationSchema = z.object({
  patient_details: PatientDetailsSchema.describe("Information about the patient as provided in the form."),
  emergency_contact_information: EmergencyContactInformationSchema.describe("Details of the emergency contact person for the patient.")
});

// Extract TypeScript type from schema
type PatientAndEmergencyContactInformation = z.infer<typeof PatientAndEmergencyContactInformationSchema>;

// Initialize the client
const client = new LandingAIADE();

// Parse the document to get markdown
const parseResponse = await client.parse({
  document: fs.createReadStream("/path/to/medical-form.pdf"),
  model: "dpt-2-latest"
});

// Convert Zod schema to JSON schema
const schema = JSON.stringify(z.toJSONSchema(PatientAndEmergencyContactInformationSchema));

// Extract structured data using the schema
const extractResponse = await client.extract({
  schema: schema,
  markdown: await toFile(Buffer.from(parseResponse.markdown), "document.md"),
  model: "extract-latest"
});

// Display the extracted structured data
console.log(extractResponse.extraction);

Extract Variable-Length Data with List Objects

Use Zod’s z.array() to extract repeatable data structures when you don’t know how many items will appear. Common examples include line items in invoices, transaction records, or contact information for multiple people. For example, to extract variable-length wire instructions and line items from this Wire Transfer Form, use z.array(DescriptionItemSchema) for line items and z.array(WireInstructionSchema) for wire transfer details.
import LandingAIADE, { toFile } from "landingai-ade";
import fs from "fs";
import { z } from "zod";

// Nested schemas for array fields
const DescriptionItemSchema = z.object({
  description: z.string().describe("Invoice or Bill Description"),
  amount: z.number().describe("Invoice or Bill Amount")
});

const WireInstructionSchema = z.object({
  bank_name: z.string().describe("Bank name"),
  bank_address: z.string().describe("Bank address"),
  bank_account_no: z.string().describe("Bank account number"),
  swift_code: z.string().describe("SWIFT code"),
  aba_routing: z.string().describe("ABA routing number"),
  ach_routing: z.string().describe("ACH routing number")
});

// Invoice schema containing array fields
const InvoiceSchema = z.object({
  description_or_particular: z.array(DescriptionItemSchema).describe("List of invoice line items (description and amount)"),
  wire_instructions: z.array(WireInstructionSchema).describe("Wire transfer instructions")
});

// Main extraction schema
const ExtractedInvoiceFieldsSchema = z.object({
  invoice: InvoiceSchema.describe("Invoice list-type fields")
});

// Extract TypeScript type from schema
type ExtractedInvoiceFields = z.infer<typeof ExtractedInvoiceFieldsSchema>;

// Initialize the client
const client = new LandingAIADE();

// Parse the document to get markdown
const parseResponse = await client.parse({
  document: fs.createReadStream("/path/to/wire-transfer.pdf"),
  model: "dpt-2-latest"
});

// Convert Zod schema to JSON schema
const schema = JSON.stringify(z.toJSONSchema(ExtractedInvoiceFieldsSchema));

// Extract structured data using the schema
const extractResponse = await client.extract({
  schema: schema,
  markdown: await toFile(Buffer.from(parseResponse.markdown), "document.md"),
  model: "extract-latest"
});

// Display the extracted data
console.log(extractResponse.extraction);
Link Extracted Data to Source Locations

Use the reference IDs from extraction_metadata to find the exact location where data was extracted in the source document. This is useful for visual validation, quality assurance, or building confidence scores.
import LandingAIADE, { toFile } from "landingai-ade";
import fs from "fs";
import { z } from "zod";

// Define extraction schema
const PayStubDataSchema = z.object({
  employee_name: z.string().describe("The employee's full name"),
  gross_pay: z.number().describe("The gross pay amount")
});

const client = new LandingAIADE();

// Parse the document
const parseResponse = await client.parse({
  document: fs.createReadStream("/path/to/pay-stub.pdf"),
  model: "dpt-2-latest"
});

// Extract data
const schema = JSON.stringify(z.toJSONSchema(PayStubDataSchema));
const extractResponse = await client.extract({
  schema: schema,
  markdown: await toFile(Buffer.from(parseResponse.markdown), "document.md"),
  model: "extract-latest"
});

// Link extracted field to its source location
const chunkId = extractResponse.extraction_metadata["employee_name"]["references"][0];
const grounding = parseResponse.grounding[chunkId];

console.log(`Employee name: ${extractResponse.extraction["employee_name"]}`);
console.log(`Found on page ${grounding.page}`);
console.log(`Location: (${grounding.box.left.toFixed(3)}, ${grounding.box.top.toFixed(3)}, ${grounding.box.right.toFixed(3)}, ${grounding.box.bottom.toFixed(3)})`);
console.log(`Chunk type: ${grounding.type}`);
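To highlight a grounded field on a rendered page image, the box must be scaled to pixels. This is a sketch: `boxToPixels` is an illustrative helper, and it assumes the box coordinates are normalized to the 0–1 range, as the toFixed(3) values printed above suggest.

```typescript
// Illustrative helper: convert a normalized bounding box (assumed 0–1
// coordinates) into pixel coordinates for a page rendered at a given size.
interface Box {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

function boxToPixels(box: Box, pageWidth: number, pageHeight: number) {
  return {
    x: Math.round(box.left * pageWidth),
    y: Math.round(box.top * pageHeight),
    width: Math.round((box.right - box.left) * pageWidth),
    height: Math.round((box.bottom - box.top) * pageHeight),
  };
}
```

For example, `boxToPixels(grounding.box, 1000, 2000)` gives the rectangle to draw on a 1000×2000 rendering of that page.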

Sample Scripts for Common Use Cases

Parse a Directory of Documents

import LandingAIADE from "landingai-ade";
import fs from "fs";
import path from "path";

const client = new LandingAIADE();

// Replace with your folder path containing documents to parse
const dataFolder = "data/";

const files = fs.readdirSync(dataFolder);

for (const filename of files) {
  const filepath = path.join(dataFolder, filename);
  const fileExt = path.extname(filepath).toLowerCase();

  // Adjust file types as needed for your use case
  if (['.pdf', '.png', '.jpg', '.jpeg'].includes(fileExt)) {
    console.log(`Processing: ${filename}`);

    const response = await client.parse({
      document: fs.createReadStream(filepath),
      model: "dpt-2-latest"
    });
    console.log(response.chunks);

    // Save Markdown output (useful if you plan to run extract on the Markdown)
    const outputMd = path.basename(filepath, fileExt) + ".md";
    fs.writeFileSync(outputMd, response.markdown, "utf-8");
  }
}

Async Parse: Processing Multiple Documents Concurrently

Use concurrent parsing when you need to process many lightweight documents (such as invoices, receipts, or forms) efficiently. The TypeScript library’s methods are already asynchronous, allowing you to send multiple parse requests concurrently using JavaScript’s Promise.all() or Promise.allSettled(). This significantly reduces total processing time compared to sequential requests.

The concurrent approach lets you send multiple requests in parallel: while one document is being processed, another request can be sent, and the API server handles the actual document processing in the background. To avoid exceeding the pages-per-hour limits and receiving 429 errors, use a concurrency control library like p-limit to limit the number of simultaneous requests.

Basic Concurrent Parsing

import LandingAIADE from "landingai-ade";
import fs from "fs";
import path from "path";

const client = new LandingAIADE();

async function parseMultipleDocuments() {
  // Replace with your folder path containing documents to parse
  const dataFolder = "data/";
  const files = fs.readdirSync(dataFolder);

  // Filter for file types (adjust as needed for your use case)
  const documentFiles = files.filter(filename => {
    const fileExt = path.extname(filename).toLowerCase();
    return ['.pdf', '.png', '.jpg', '.jpeg'].includes(fileExt);
  });

  // Parse all documents concurrently
  const parsePromises = documentFiles.map(filename => {
    const filepath = path.join(dataFolder, filename);
    return client.parse({
      document: fs.createReadStream(filepath),
      model: "dpt-2-latest"
    }).then(response => ({
      filename,
      response
    })).catch(error => ({
      filename,
      error: error.message
    }));
  });

  // Wait for all parsing to complete
  const results = await Promise.all(parsePromises);

  // Process results
  for (const result of results) {
    if ('error' in result) {
      console.error(`Failed to parse ${result.filename}: ${result.error}`);
    } else {
      console.log(`Successfully parsed ${result.filename}`);

      // Save Markdown output
      const outputMd = path.basename(result.filename, path.extname(result.filename)) + ".md";
      fs.writeFileSync(outputMd, result.response.markdown, "utf-8");
    }
  }
}

parseMultipleDocuments();

Concurrent Parsing with Rate Limiting

To control concurrency and avoid rate limits, use the p-limit library:
npm install p-limit
import LandingAIADE from "landingai-ade";
import fs from "fs";
import path from "path";
import pLimit from "p-limit";

const client = new LandingAIADE();

// Adjust concurrency limit as needed to avoid rate limits
const limit = pLimit(5);

async function parseMultipleDocumentsWithRateLimit() {
  // Replace with your folder path containing documents to parse
  const dataFolder = "data/";
  const files = fs.readdirSync(dataFolder);

  // Filter for file types (adjust as needed for your use case)
  const documentFiles = files.filter(filename => {
    const fileExt = path.extname(filename).toLowerCase();
    return ['.pdf', '.png', '.jpg', '.jpeg'].includes(fileExt);
  });

  // Parse all documents with concurrency control
  const parsePromises = documentFiles.map(filename => {
    // Wrap each parse call with the rate limiter
    return limit(async () => {
      const filepath = path.join(dataFolder, filename);
      console.log(`Parsing ${filename}...`);

      try {
        const response = await client.parse({
          document: fs.createReadStream(filepath),
          model: "dpt-2-latest"
        });

        // Save Markdown output
        const outputMd = path.basename(filename, path.extname(filename)) + ".md";
        fs.writeFileSync(outputMd, response.markdown, "utf-8");

        console.log(`Successfully parsed ${filename}`);
        return { filename, success: true };
      } catch (error) {
        console.error(`Failed to parse ${filename}: ${error.message}`);
        return { filename, success: false, error: error.message };
      }
    });
  });

  // Wait for all parsing to complete
  const results = await Promise.all(parsePromises);

  // Summary
  const successful = results.filter(r => r.success).length;
  const failed = results.filter(r => !r.success).length;
  console.log(`\nCompleted: ${successful} successful, ${failed} failed`);
}

parseMultipleDocumentsWithRateLimit();

Save Parsed Output

Use this script to save the parsed output to JSON and Markdown files for downstream processing.
import LandingAIADE from "landingai-ade";
import fs from "fs";
import path from "path";

const client = new LandingAIADE();

async function saveParsedOutput() {
  // Replace with your file path
  const response = await client.parse({
    document: fs.createReadStream("/path/to/file/document"),
    model: "dpt-2-latest"
  });

  // Create output directory if it doesn't exist
  const outputDir = "ade_results";
  if (!fs.existsSync(outputDir)) {
    fs.mkdirSync(outputDir, { recursive: true });
  }

  // Save the response to JSON
  const outputFile = path.join(outputDir, "parse_results.json");
  fs.writeFileSync(outputFile, JSON.stringify(response, null, 2), "utf-8");

  // Save Markdown output (useful if you plan to run extract on the Markdown)
  const markdownFile = path.join(outputDir, "output.md");
  fs.writeFileSync(markdownFile, response.markdown, "utf-8");

  console.log(`Results saved to: ${outputFile}`);
  console.log(`Markdown saved to: ${markdownFile}`);
}

saveParsedOutput();
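If you plan to consume the saved results later, you can reload them from disk instead of re-parsing the document. A minimal sketch, assuming the ade_results/parse_results.json file written by the script above (loadParseResults is a hypothetical helper, not part of the library):

```typescript
import fs from "fs";
import path from "path";

// Hypothetical helper: reload parse results previously saved to disk.
// Assumes a parse_results.json file in the given output directory.
function loadParseResults(outputDir: string): { markdown: string } {
  const raw = fs.readFileSync(
    path.join(outputDir, "parse_results.json"),
    "utf-8"
  );
  return JSON.parse(raw);
}
```

You can then pass the reloaded markdown to downstream steps, such as extraction, without calling parse again.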

Visualize Parsed Chunks: Draw Bounding Boxes

Use this script to visualize parsed chunks by drawing color-coded bounding boxes on your document. Each chunk type uses a distinct color, making it easy to see how the document was parsed. The script identifies chunk types and table cells. For PDFs, the script creates a separate annotated PNG for each page (page_1_annotated.png, page_2_annotated.png, and so on). For image files, the script creates a single page_annotated.png. The image below shows an example output with bounding boxes drawn on the first page of a PDF:

First, install the required dependencies:
npm install canvas pdf-to-img
import LandingAIADE from "landingai-ade";
import fs from "fs";
import path from "path";
import { createCanvas, loadImage } from "canvas";
import { pdf } from "pdf-to-img";

// Define colors for each chunk type
const CHUNK_TYPE_COLORS: Record<string, [number, number, number]> = {
  chunkText: [40, 167, 69],        // Green
  chunkTable: [0, 123, 255],       // Blue
  chunkMarginalia: [111, 66, 193], // Purple
  chunkFigure: [255, 0, 255],      // Magenta
  chunkLogo: [144, 238, 144],      // Light green
  chunkCard: [255, 165, 0],        // Orange
  chunkAttestation: [0, 255, 255], // Cyan
  chunkScanCode: [255, 193, 7],    // Yellow
  chunkForm: [220, 20, 60],        // Red
  tableCell: [173, 216, 230],      // Light blue
  table: [70, 130, 180],           // Steel blue
};

function rgbToString(rgb: [number, number, number]): string {
  return `rgb(${rgb[0]}, ${rgb[1]}, ${rgb[2]})`;
}

async function drawAnnotations(
  imageBuffer: Buffer,
  groundings: any,
  pageNum: number
): Promise<Buffer> {
  // Load the rendered PDF page image
  const image = await loadImage(imageBuffer);
  const canvas = createCanvas(image.width, image.height);
  const ctx = canvas.getContext("2d");

  // Draw the original PDF page
  ctx.drawImage(image, 0, 0);

  const imgWidth = canvas.width;
  const imgHeight = canvas.height;

  // Draw bounding boxes for each grounding
  for (const [gid, grounding] of Object.entries(groundings)) {
    const g = grounding as any;

    // Check if grounding belongs to this page
    if (g.page !== pageNum) {
      continue;
    }

    const box = g.box;

    // Convert normalized coordinates to pixel coordinates
    const x1 = Math.floor(box.left * imgWidth);
    const y1 = Math.floor(box.top * imgHeight);
    const x2 = Math.floor(box.right * imgWidth);
    const y2 = Math.floor(box.bottom * imgHeight);

    // Get color for this chunk type (default to gray)
    const color = CHUNK_TYPE_COLORS[g.type] || [128, 128, 128];
    const colorString = rgbToString(color);

    // Draw bounding box
    ctx.strokeStyle = colorString;
    ctx.lineWidth = 3;
    ctx.strokeRect(x1, y1, x2 - x1, y2 - y1);

    // Draw label background and text
    const label = `${g.type}:${gid.substring(0, 8)}`;
    const labelY = Math.max(0, y1 - 20);

    ctx.fillStyle = colorString;
    ctx.fillRect(x1, labelY, label.length * 8, 20);

    ctx.fillStyle = "white";
    ctx.font = "12px sans-serif";
    ctx.fillText(label, x1 + 2, labelY + 14);
  }

  return canvas.toBuffer("image/png");
}

async function drawBoundingBoxes(parseResponse: any, documentPath: string) {
  const fileExtension = path.extname(documentPath).toLowerCase();

  if (fileExtension === ".pdf") {
    // Convert PDF to images using pdf-to-img
    const document = await pdf(documentPath, { scale: 2.0 });

    let pageNum = 0;
    for await (const page of document) {
      console.log(`Processing page ${pageNum + 1}...`);

      // Draw annotations on the rendered PDF page
      const annotatedBuffer = await drawAnnotations(
        page,
        parseResponse.grounding,
        pageNum
      );

      // Save annotated image
      const outputPath = `page_${pageNum + 1}_annotated.png`;
      fs.writeFileSync(outputPath, annotatedBuffer);
      console.log(`Annotated image saved to: ${outputPath}`);

      pageNum++;
    }
  } else {
    // Load image file directly
    const image = await loadImage(documentPath);
    const canvas = createCanvas(image.width, image.height);
    const ctx = canvas.getContext("2d");

    // Draw the image
    ctx.drawImage(image, 0, 0);

    const imgWidth = canvas.width;
    const imgHeight = canvas.height;

    // Draw bounding boxes for page 0
    for (const [gid, grounding] of Object.entries(parseResponse.grounding)) {
      const g = grounding as any;

      if (g.page !== 0) continue;

      const box = g.box;
      const x1 = Math.floor(box.left * imgWidth);
      const y1 = Math.floor(box.top * imgHeight);
      const x2 = Math.floor(box.right * imgWidth);
      const y2 = Math.floor(box.bottom * imgHeight);

      const color = CHUNK_TYPE_COLORS[g.type] || [128, 128, 128];
      const colorString = rgbToString(color);

      ctx.strokeStyle = colorString;
      ctx.lineWidth = 3;
      ctx.strokeRect(x1, y1, x2 - x1, y2 - y1);

      const label = `${g.type}:${gid.substring(0, 8)}`;
      const labelY = Math.max(0, y1 - 20);

      ctx.fillStyle = colorString;
      ctx.fillRect(x1, labelY, label.length * 8, 20);

      ctx.fillStyle = "white";
      ctx.font = "12px sans-serif";
      ctx.fillText(label, x1 + 2, labelY + 14);
    }

    // Save annotated image
    const buffer = canvas.toBuffer("image/png");
    fs.writeFileSync("page_annotated.png", buffer);
    console.log("Annotated image saved to: page_annotated.png");
  }
}

// Main execution
const client = new LandingAIADE();

async function visualizeChunks() {
  // Replace with your file path
  const documentPath = "/path/to/file/document";

  // Parse the document
  console.log("Parsing document...");
  const parseResponse = await client.parse({
    document: fs.createReadStream(documentPath),
    model: "dpt-2-latest"
  });
  console.log("Parsing complete!");

  // Draw bounding boxes and create annotated images
  await drawBoundingBoxes(parseResponse, documentPath);
}

visualizeChunks();
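The grounding boxes returned by parse use normalized [0, 1] coordinates, so both branches of the script above convert them to pixels the same way before drawing. That conversion can be isolated as a small sketch (toPixelRect is a hypothetical helper mirroring the in-script math, not a library function):

```typescript
// A grounding box in normalized [0, 1] coordinates, as returned by parse.
type Box = { left: number; top: number; right: number; bottom: number };

// Convert a normalized box to a pixel-space rectangle for a rendered page.
function toPixelRect(box: Box, imgWidth: number, imgHeight: number) {
  const x1 = Math.floor(box.left * imgWidth);
  const y1 = Math.floor(box.top * imgHeight);
  return {
    x: x1,
    y: y1,
    width: Math.floor(box.right * imgWidth) - x1,
    height: Math.floor(box.bottom * imgHeight) - y1,
  };
}
```

Because the coordinates are normalized, the same box maps correctly onto any render scale, which is why the scripts can render PDF pages at scale: 2.0 without adjusting the groundings.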

Save Parsed Chunks as Images

Use this script to extract and save each parsed chunk as a separate PNG. This is useful for building datasets, analyzing chunk quality, or processing individual document regions.

First, install the required dependencies:
npm install canvas pdf-to-img
import LandingAIADE from "landingai-ade";
import fs from "fs";
import path from "path";
import { createCanvas, loadImage } from "canvas";
import { pdf } from "pdf-to-img";

async function saveChunksAsImages(
  parseResponse: any,
  documentPath: string,
  outputBaseDir: string = "groundings"
) {
  // Create timestamped output directory
  const timestamp = new Date().toISOString().replace(/[:.]/g, "-").slice(0, -5);
  const documentName = path.basename(documentPath, path.extname(documentPath));
  const outputDir = path.join(outputBaseDir, `${documentName}_${timestamp}`);

  async function savePageChunks(
    imageBuffer: Buffer,
    chunks: any[],
    pageNum: number
  ) {
    // Load the page image
    const image = await loadImage(imageBuffer);
    const imgWidth = image.width;
    const imgHeight = image.height;

    // Create page-specific directory
    const pageDir = path.join(outputDir, `page_${pageNum}`);
    if (!fs.existsSync(pageDir)) {
      fs.mkdirSync(pageDir, { recursive: true });
    }

    // Process each chunk
    for (const chunk of chunks) {
      // Check if chunk belongs to this page
      if (chunk.grounding.page !== pageNum) {
        continue;
      }

      const box = chunk.grounding.box;

      // Convert normalized coordinates to pixel coordinates
      const x1 = Math.floor(box.left * imgWidth);
      const y1 = Math.floor(box.top * imgHeight);
      const x2 = Math.floor(box.right * imgWidth);
      const y2 = Math.floor(box.bottom * imgHeight);

      // Calculate crop dimensions
      const width = x2 - x1;
      const height = y2 - y1;

      // Create canvas for cropped chunk
      const canvas = createCanvas(width, height);
      const ctx = canvas.getContext("2d");

      // Draw the cropped region
      ctx.drawImage(image, x1, y1, width, height, 0, 0, width, height);

      // Save with descriptive filename
      const filename = `${chunk.type}.${chunk.id}.png`;
      const outputPath = path.join(pageDir, filename);
      const buffer = canvas.toBuffer("image/png");
      fs.writeFileSync(outputPath, buffer);

      console.log(`Saved chunk: ${outputPath}`);
    }
  }

  const fileExtension = path.extname(documentPath).toLowerCase();

  if (fileExtension === ".pdf") {
    // Convert PDF to images
    const document = await pdf(documentPath, { scale: 2.0 });

    let pageNum = 0;
    for await (const page of document) {
      console.log(`Processing page ${pageNum}...`);
      await savePageChunks(page, parseResponse.chunks, pageNum);
      pageNum++;
    }
  } else {
    // Load image file directly
    const imageBuffer = fs.readFileSync(documentPath);
    await savePageChunks(imageBuffer, parseResponse.chunks, 0);
  }

  console.log(`\nAll chunks saved to: ${outputDir}`);
  return outputDir;
}

// Main execution
const client = new LandingAIADE();

async function extractChunks() {
  // Replace with your file path
  const documentPath = "/path/to/file/document";

  // Parse the document
  console.log("Parsing document...");
  const parseResponse = await client.parse({
    document: fs.createReadStream(documentPath),
    model: "dpt-2-latest"
  });
  console.log("Parsing complete!");

  // Save each chunk as a separate image
  await saveChunksAsImages(parseResponse, documentPath);
}

extractChunks();
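Because each chunk image is saved with the pattern type.id.png, the chunk type and ID can be recovered from the filename later, for example when grouping a saved dataset by chunk type. A minimal sketch (parseChunkFilename is a hypothetical helper, not part of the library):

```typescript
// Hypothetical helper: recover chunk type and ID from a saved filename
// of the form `<type>.<id>.png`, matching saveChunksAsImages above.
function parseChunkFilename(
  name: string
): { type: string; id: string } | null {
  const match = name.match(/^([^.]+)\.(.+)\.png$/);
  return match ? { type: match[1], id: match[2] } : null;
}
```

Filenames that do not match the pattern return null, so non-chunk files in the output directory can be skipped safely.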