
Prerequisites

In LandingAI:
  1. If you haven’t already, create an account.
  2. Get your API key.
In Snowsight:
  1. Install the app.
  2. Enter your API key in the app.
  3. If you want to parse documents staged in Snowflake, grant the app access to the stages that have the files you want to parse.

Set Up the Session

Before running a parse or extract procedure, run the command below so that your session uses the Agentic Document Extraction application and its procedures. Replace the APP_NAME placeholder with the name of your instance of Agentic Document Extraction.
USE "APP_NAME";

Parse

To parse documents, use the api.parse procedure. It sends a document from a Snowflake stage or a publicly accessible URL to the LandingAI-hosted service and saves the parsed content to an output table (db.parse_output by default). The procedure runs the ADE Parse Jobs API.

Optional Parameters

The api.parse procedure accepts optional named parameters. The examples in this section use two of them: output_table and model.

Specify a Custom Output Table

You can specify a custom table name for storing parse results instead of using the default parse_output:
CALL api.parse(
    file_path => 'https://va.landing.ai/pdfs/invoice_1.pdf',
    output_table => 'my_custom_results'
);

SELECT * FROM db.my_custom_results;
If the table doesn’t exist, the procedure creates it automatically with the following columns:
  • DOC_ID
  • SOURCE_URL
  • FILENAME
  • PAGE_COUNT
  • MODEL_VERSION
  • PARSED_AT
  • STATUS_CODE
  • MARKDOWN
  • CHUNKS
  • SPLITS
  • GROUNDING
  • METADATA
  • ERROR
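
Because the column names are listed above, a narrower query is often easier to read than SELECT *. The sketch below assumes the custom table from the previous example and that PARSED_AT sorts chronologically. This guide does not state the column types, so treat the query as a starting point rather than a reference:

```sql
-- Show the most recently parsed documents first.
-- Assumption: PARSED_AT holds a timestamp (its type is not documented here).
SELECT DOC_ID, FILENAME, PAGE_COUNT, STATUS_CODE, ERROR
FROM db.my_custom_results
ORDER BY PARSED_AT DESC
LIMIT 10;
```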

Example with Optional Parameters

CALL api.parse(
    file_path => 'https://va.landing.ai/pdfs/invoice_1.pdf',
    model => 'dpt-2-mini-latest',
    output_table => 'invoice_results'
);

SELECT * FROM db.invoice_results;

Parse Return Object

The api.parse procedure returns an OBJECT with the following fields:
  • message: Success or error message
  • output_table: Name of the table where results were saved (such as “db.parse_output”)
  • doc_id: Unique document identifier for the parsed document
  • status_code: HTTP status code for the request
This return object is useful when chaining parse and extract operations: you can capture the result in a variable and pass it to api.extract.
Example of capturing the return object:
DECLARE
    result OBJECT;
BEGIN
    CALL api.parse(
        'https://va.landing.ai/pdfs/invoice_1.pdf'
    ) INTO :result;

    RETURN result;
END;
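
To use a single field from the return object, one option is Snowflake's built-in GET function. The following is a minimal sketch, assuming the field is named doc_id exactly as listed above (object keys are case-sensitive). The variable name parsed_doc_id is hypothetical:

```sql
DECLARE
    result OBJECT;
    parsed_doc_id STRING;  -- hypothetical variable name, for illustration only
BEGIN
    CALL api.parse(
        'https://va.landing.ai/pdfs/invoice_1.pdf'
    ) INTO :result;

    -- Pull one field out of the returned OBJECT and cast it to a string.
    -- Assumption: the key is spelled 'doc_id', as in the field list above.
    parsed_doc_id := GET(result, 'doc_id')::STRING;

    RETURN parsed_doc_id;
END;
```

The same pattern should apply to the other fields (message, output_table, status_code), since GET returns NULL rather than failing when a key is absent.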

Use build_scoped_file_url for Staged Files

You can use the build_scoped_file_url() function to reference files in Snowflake stages.
CALL api.parse(
    file_path => build_scoped_file_url('@your_db.your_schema.your_stage', '/sample_image.png')
);

SELECT * FROM db.parse_output;

Sample Scenarios

This section provides examples of how to run the api.parse procedure in different scenarios.

Parse a File at a Publicly Accessible URL

Run the command below to parse a single file at a publicly accessible URL. We’ve provided a sample file to help you get started. Replace this placeholder with your information: APP_NAME.
USE "APP_NAME";

CALL api.parse(
    'https://va.landing.ai/pdfs/invoice_1.pdf'
);

SELECT * FROM db.parse_output;

Parse a Staged File

Before parsing staged files, you must grant the application access to your stage. For more information, go to Grant Access to Stages.
Run the command below to parse a single file in a Snowflake stage. Replace these placeholders with your information: APP_NAME, your_db, your_schema, your_stage, and path/to/file.pdf.
USE "APP_NAME";

CALL api.parse(
    '@your_db.your_schema.your_stage/path/to/file.pdf'
);

SELECT * FROM db.parse_output;
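
If you are unsure of the exact file path, listing the stage contents first can help. This sketch uses Snowflake's standard LIST command with the same placeholder stage name as above, and assumes your role can read the stage:

```sql
-- List the files in the stage to confirm the path before calling api.parse.
LIST @your_db.your_schema.your_stage;
```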

Sample Script: Parse a Staged File

Let’s say you have the following setup:
  • APP_NAME: AGENTIC_DOCUMENT_EXTRACTION__APP
  • Database: DEMO_DB
  • Schema: DEMO_SCHEMA
  • Stage: DEMO_STAGE
  • PDF: statement-jane-harper.pdf
First, grant the application access to the stage:
GRANT USAGE ON DATABASE DEMO_DB TO APPLICATION "AGENTIC_DOCUMENT_EXTRACTION__APP";
GRANT USAGE ON SCHEMA DEMO_DB.DEMO_SCHEMA TO APPLICATION "AGENTIC_DOCUMENT_EXTRACTION__APP";
GRANT READ, WRITE ON STAGE DEMO_DB.DEMO_SCHEMA.DEMO_STAGE TO APPLICATION "AGENTIC_DOCUMENT_EXTRACTION__APP";
Then, parse the PDF:
USE "AGENTIC_DOCUMENT_EXTRACTION__APP";

CALL api.parse(
    '@DEMO_DB.DEMO_SCHEMA.DEMO_STAGE/statement-jane-harper.pdf'
);

SELECT * FROM db.parse_output;

Parse Multiple Staged Files

Before parsing staged files, you must grant the application access to your stage. For more information, go to Grant Access to Stages.
One way to process multiple documents is to call the api.parse procedure for each file. The procedure saves the results to the db.parse_output table, where you can query all parsed documents. Here’s an example of processing multiple files and then viewing the results. Replace these placeholders with your information: APP_NAME, your_db, your_schema, and your_stage.
USE "APP_NAME";

-- Parse multiple files
CALL api.parse('@your_db.your_schema.your_stage/file1.pdf');
CALL api.parse('@your_db.your_schema.your_stage/file2.pdf');
CALL api.parse('@your_db.your_schema.your_stage/file3.pdf');

-- View all parsed results
SELECT * FROM db.parse_output;
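
For stages with many files, writing one CALL per file gets tedious. The Snowflake Scripting block below is a sketch of one alternative, not a documented feature of the app: it loops over the stage's directory table and calls api.parse for each file. It assumes the stage has a directory table enabled and refreshed, and that your role can query it. The names files, f, and staged_path are hypothetical:

```sql
DECLARE
    staged_path STRING;
    -- Assumption: the stage was created with DIRECTORY = (ENABLE = TRUE).
    files CURSOR FOR
        SELECT RELATIVE_PATH
        FROM DIRECTORY(@your_db.your_schema.your_stage);
BEGIN
    FOR f IN files DO
        -- Build the same '@db.schema.stage/path' form used in the calls above.
        staged_path := '@your_db.your_schema.your_stage/' || f.RELATIVE_PATH;
        CALL api.parse(:staged_path);
    END FOR;

    RETURN 'Finished parsing staged files';
END;
```

This processes files one at a time, so it behaves like the explicit sequence of CALL statements above. It does not stop or retry on a failed document; check the output table afterward.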

Sample Script: Parse Multiple Staged Files

Let’s say you have the following setup:
  • APP_NAME: AGENTIC_DOCUMENT_EXTRACTION__APP
  • Database: DEMO_DB
  • Schema: DEMO_SCHEMA
  • Stage: DEMO_STAGE (contains PDFs and images)
The DEMO_STAGE contains the following files:
  • statement-george-mathew.png
  • statement-jane-harper.pdf
  • statement-john-doe.png
  • statement-john-smith.png
First, grant the application access to the stage:
GRANT USAGE ON DATABASE DEMO_DB TO APPLICATION "AGENTIC_DOCUMENT_EXTRACTION__APP";
GRANT USAGE ON SCHEMA DEMO_DB.DEMO_SCHEMA TO APPLICATION "AGENTIC_DOCUMENT_EXTRACTION__APP";
GRANT READ, WRITE ON STAGE DEMO_DB.DEMO_SCHEMA.DEMO_STAGE TO APPLICATION "AGENTIC_DOCUMENT_EXTRACTION__APP";
Then, parse the documents:
USE "AGENTIC_DOCUMENT_EXTRACTION__APP";

-- Parse each document
CALL api.parse('@DEMO_DB.DEMO_SCHEMA.DEMO_STAGE/statement-george-mathew.png');
CALL api.parse('@DEMO_DB.DEMO_SCHEMA.DEMO_STAGE/statement-jane-harper.pdf');
CALL api.parse('@DEMO_DB.DEMO_SCHEMA.DEMO_STAGE/statement-john-doe.png');
CALL api.parse('@DEMO_DB.DEMO_SCHEMA.DEMO_STAGE/statement-john-smith.png');

-- View all parsed results
SELECT * FROM db.parse_output;
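
After parsing several documents, a quick summary can show whether any of them failed. The sketch below groups the output rows by the STATUS_CODE column listed in the table schema. How failures are represented (for example, whether ERROR is empty on success) is not stated in this guide, so inspect the rows for any unexpected status:

```sql
-- Count parsed documents per status code.
SELECT STATUS_CODE, COUNT(*) AS DOC_COUNT
FROM db.parse_output
GROUP BY STATUS_CODE
ORDER BY STATUS_CODE;
```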

Remove Parse Output Tables

Each time you run the api.parse procedure, the results are saved to an output table (db.parse_output by default). Over time, you may want to remove these tables to start fresh or clean up old results, especially if you have specified custom output table names and accumulated multiple tables.
To remove parse output tables:
  1. Go to Catalog > Apps > Agentic Document Extraction - App.
  2. Click Settings.
  3. Navigate to SQL Execution.
  4. Enter the following SQL command in the SQL field:
    DROP TABLE IF EXISTS APP_NAME.DB.TABLE_NAME;
    
  5. Replace these placeholders with your information:
    • APP_NAME: The name of your app instance
    • DB: The schema inside the app that holds the output tables (use db)
    • TABLE_NAME: The name of the table you want to remove (such as parse_output or your custom table name)
  6. Click Run Query.
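
To see which output tables exist before dropping any, a listing command like the one below may help. This is a sketch rather than a documented step: it assumes the SQL Execution field (or your role) is permitted to run SHOW commands against the app. Replace APP_NAME and DB as described above:

```sql
-- List the tables that currently exist alongside parse_output.
SHOW TABLES IN SCHEMA APP_NAME.DB;
```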