The easiest way to parse documents is to use our Python library. In this quickstart, we’ll use the library to extract data from a PDF in a local directory.

1. Set the API Key as an Environment Variable

Get your API key and set it as an environment variable (or put it in a .env file):

export VISION_AGENT_API_KEY=<your-api-key>
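
Before moving on, it can help to confirm that Python actually sees the variable. The following is a minimal sketch, not part of the library: `get_api_key` is a hypothetical helper that only inspects the process environment, so a key stored solely in a .env file will not show up in this check.

```python
import os


def get_api_key(env=os.environ):
    """Return the API key from the given environment mapping, or raise a clear error.

    Hypothetical helper for illustration; it is not part of agentic-doc.
    """
    key = env.get("VISION_AGENT_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "VISION_AGENT_API_KEY is not set in the environment; "
            "export it before running the scripts below."
        )
    return key


# Report the outcome without ever printing the key itself.
try:
    get_api_key()
    print("API key found in the environment.")
except RuntimeError as err:
    print(err)
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary.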

2. Install the Library

pip install agentic-doc

3. Extract Data from a Local File and Return Results as Objects

Run this script to parse a file in a local directory and return the results as Markdown and JSON objects.

from agentic_doc.parse import parse

# Parse a local file; parse() returns a list, so the first result is result[0]
result = parse("path/to/file.pdf")

# Print the extracted data as Markdown
print(result[0].markdown)

# Print the extracted data as structured chunks of content in a JSON schema
print(result[0].chunks)

The script parses the document and prints the Markdown and JSON outputs to the console. Because the extracted data is returned as objects, you can write scripts that take that output and immediately process it. For example, you could create a web app that extracts structured data from a PDF and immediately renders it in the UI.
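
As one illustration of processing the returned objects, the sketch below writes a document's Markdown to a file. It relies only on the `markdown` attribute shown above; `write_markdown` is a hypothetical helper, and a stand-in object replaces the real parse result so the example runs without an API call.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def write_markdown(doc, out_path):
    """Write the document's Markdown to out_path and return the path.

    Hypothetical helper: `doc` is any object with a `markdown` string attribute.
    """
    path = Path(out_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(doc.markdown, encoding="utf-8")
    return path


# Stand-in for one parsed document (made-up content).
# With the real library, pass result[0] from parse("path/to/file.pdf") instead.
fake_doc = SimpleNamespace(markdown="# Invoice\n\nTotal: 42.00", chunks=[])

out_dir = tempfile.mkdtemp()
saved = write_markdown(fake_doc, Path(out_dir) / "output" / "file.md")
print(saved.read_text(encoding="utf-8"))
```

The same pattern applies to any downstream step, such as rendering the Markdown in a UI, since the helper only needs an object that exposes `markdown`.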

4. Extract Data from a Local File and Save Results

In the previous example, you parsed a file and printed the results to the console. Now, run this script to parse the same file and save the results as a JSON file in the directory you specify.

from agentic_doc.parse import parse

# Parse a local PDF and save results to directory
result = parse("path/to/file.pdf", result_save_dir="path/to/save/results")

# Print the file path to the JSON file
print(f"Final result: {result[0].result_path}")

The API parses the document and saves the results in the directory you specified. Because the extracted data is saved, you can later audit it or build an app that references it. For example, you could build out a document processing system that parses documents nightly and saves the output as JSON files for auditors to inspect the next day.
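
A later audit step might read the saved files back in. The sketch below uses only the standard library and makes no assumption about the layout of the JSON that the library writes; `load_saved_results` is a hypothetical helper, and the demo file is a placeholder rather than real output.

```python
import json
import tempfile
from pathlib import Path


def load_saved_results(save_dir):
    """Return {file name: parsed JSON} for every .json file in save_dir.

    Hypothetical helper for illustration; it is not part of agentic-doc.
    """
    results = {}
    for json_path in sorted(Path(save_dir).glob("*.json")):
        with json_path.open(encoding="utf-8") as f:
            results[json_path.name] = json.load(f)
    return results


# Demo with a placeholder file. In practice, point save_dir at the
# result_save_dir you passed to parse().
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "example.json").write_text(
    json.dumps({"placeholder": True}), encoding="utf-8"
)

loaded = load_saved_results(demo_dir)
for name, data in loaded.items():
    # Show only the top-level type, since the real schema is not assumed here.
    print(name, type(data).__name__)
```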

5. Next Steps

Now that you know how to parse documents, learn about the additional parameters in Parsing Basics so that you can build out custom scripts for your use case.