Quickstart: Index Documents
- Grant permissions to the stage with the files you want to index.
- If you plan on using Local Processing to parse the documents, grant permissions to access Cortex.
- Follow the instructions in Index Documents.
Index Documents
After you have granted the app permission to access the stage with the files you want to index (and, if using Local Processing, granted the app permission to use Cortex), you can index documents.
- Open Snowsight.
- Go to Catalog > Apps > LandingAI Agentic Document Extraction.
- Click Index Documents.
- Enter the location of the files you want to index in the Stage Name field. Use this format: `<YOUR_DB>.<YOUR_SCHEMA>.<YOUR_STAGE>`.
- If you want to filter the documents included in indexing, enter a Regex Filter. This field accepts Python-compatible regular expressions that match specific filenames. For example, `^ABC.*` matches files starting with “ABC”, and `.*\.csv` matches files ending with “.csv”. For supported patterns, go to the Snowflake documentation.
- Enter the number of Parallel Jobs.
- Select the Processing Method:
  - Cloud: Uses Cloud Processing.
  - Snowflake Cortex: Uses Local Processing.
- Click Index Documents.
- An indexing job is created and displays. You might need to refresh the page to see the updated status.
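To preview which files a Regex Filter would select before starting a job, the two example patterns can be tried locally with Python's `re` module. This is a minimal sketch: the filenames and the `filter_filenames` helper are hypothetical, and it assumes the pattern must match the whole filename (`re.fullmatch`), which may differ from how the app applies the filter.

```python
import re

# Example patterns from the Regex Filter description.
PATTERNS = {
    "starts with ABC": r"^ABC.*",
    "ends with .csv": r".*\.csv",
}

# Hypothetical filenames, for illustration only.
FILENAMES = ["ABC_invoice.pdf", "report.csv", "notes_ABC.txt", "report.csv.bak"]


def filter_filenames(pattern: str, filenames: list[str]) -> list[str]:
    """Return the filenames that the pattern matches in full.

    Assumption: the whole filename must match (re.fullmatch); the app
    itself may apply the pattern differently.
    """
    compiled = re.compile(pattern)
    return [name for name in filenames if compiled.fullmatch(name)]


for label, pattern in PATTERNS.items():
    print(label, "->", filter_filenames(pattern, FILENAMES))
```

Under the full-match assumption, `report.csv.bak` is excluded by the `.csv` pattern because the filename does not end with “.csv”.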
Indexed Data
The indexing process creates a table that contains key file information and the extracted data. The table is identified as `APP_NAME.app_data.indexed_documents`, where:
- `APP_NAME` is the name of your app
- `app_data` is the schema name
- `indexed_documents` is the table name
Access Indexed Data
Use the following script to see all indexed data. Replace this placeholder with your information: `APP_NAME`.
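A minimal sketch of such a query, wrapped in Python so that the placeholder substitution is explicit. The function name and the `MY_ADE_APP` app name are hypothetical. The sketch assumes the app name is an unquoted Snowflake identifier. The printed SQL can be pasted into a Snowsight worksheet.

```python
import re

# Unquoted Snowflake identifiers start with a letter or underscore and
# contain only letters, digits, underscores, and dollar signs.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def all_indexed_documents_query(app_name: str) -> str:
    """Build a query that returns every row of the indexed documents table."""
    if not _IDENTIFIER.fullmatch(app_name):
        raise ValueError(f"not a valid unquoted identifier: {app_name!r}")
    return f"SELECT * FROM {app_name}.app_data.indexed_documents;"


# MY_ADE_APP is a placeholder; replace it with the name of your app.
print(all_indexed_documents_query("MY_ADE_APP"))
```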
All Indexed Documents Are Stored in One Table
The results from all document indexing jobs are stored in this table: `APP_NAME.app_data.indexed_documents`.
To view results from specific indexing jobs, filter your queries using the `stage_name` and `file_path` columns.
For example, use the following script to see indexed data from files on a specific stage. Replace these placeholders with your information: `APP_NAME` and `your_stage`.
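A minimal sketch of a stage-filtered query, again built as a Python string. The function name and the example values are hypothetical. The sketch assumes the `stage_name` column stores the stage exactly as it was entered in the Stage Name field, so check a few rows of the table before relying on an exact match.

```python
def indexed_documents_for_stage_query(app_name: str, stage: str) -> str:
    """Build a query that returns indexed rows for one stage.

    Assumption: stage_name holds the value entered in the Stage Name field.
    """
    # Double any single quotes so the value is a valid SQL string literal.
    escaped_stage = stage.replace("'", "''")
    return (
        f"SELECT * FROM {app_name}.app_data.indexed_documents "
        f"WHERE stage_name = '{escaped_stage}';"
    )


# Placeholder values; replace them with your app name and stage.
print(indexed_documents_for_stage_query("MY_ADE_APP", "MY_DB.MY_SCHEMA.MY_STAGE"))
```

A condition on `file_path` can be appended in the same way to narrow the results to individual files.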