Overview
When you run the API, the response includes an extraction_metadata field with reference IDs that connect each extracted value back to its location in the original document. This tutorial shows you how to use those references.
This tutorial uses the library and library.
In this tutorial, we will:
- Parse this PDF: Pay Stub
- Extract these fields: Employee Name and Gross Pay
- Save a PNG of each extracted field’s location
- Save bounding box coordinates for each extracted field to a JSON file
These examples require the Python or TypeScript client library. Before running a script, set your API key and install the library and any required dependencies.
The Python script has been tested with PDF and PNG files and may work with other file types supported by . The TypeScript script is written specifically for PDF files. If you need to export crops from other file types, use it as a reference and adapt the image export logic.
1. Download the Document to Process
Download the Pay Stub and save it to a local directory.
2. Create the Script
Copy the script for your language and save it as grounding.py or grounding.ts in the same directory as the PDF.
3. Run the Script
Run the script from the same directory.
4. View Output
The script saves the following files to the output folder:
| File | Description |
|---|---|
| pay-stub_parse_output.json | Full parse response, including all chunks and grounding data. |
| pay-stub_extract_output.json | Extraction results, including extracted values and reference IDs. |
| pay-stub_grounding_output.json | Extracted values and bounding box coordinates for each field. |
| employee_name.png | Cropped image of the chunk where the employee name was found. |
| gross_pay.png | Cropped image of the chunk where the gross pay was found. |
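To illustrate how the grounding file relates to the parse and extract outputs, here is a minimal sketch that joins each field's reference IDs to the parsed chunks. The dictionary shapes are assumptions made for illustration (a chunks list with id, page, and box keys, an extraction mapping, and an extraction_metadata mapping whose entries hold a references list), and the sample data is hypothetical. Check the JSON files the script writes for the actual layout.

```python
import json


def build_grounding(parse_output, extract_output):
    """Join each extracted field to the chunks its reference IDs point at."""
    # Index chunks by ID for direct lookup (the "id" key is an assumption).
    chunks_by_id = {chunk["id"]: chunk for chunk in parse_output.get("chunks", [])}

    values = extract_output.get("extraction", {})
    metadata = extract_output.get("extraction_metadata", {})

    grounding = {}
    for field, meta in metadata.items():
        locations = []
        for ref_id in meta.get("references", []):
            chunk = chunks_by_id.get(ref_id)
            if chunk is None:
                continue  # the reference does not match any parsed chunk
            locations.append(
                {"chunk_id": ref_id, "page": chunk["page"], "box": chunk["box"]}
            )
        grounding[field] = {"value": values.get(field), "locations": locations}
    return grounding


# Hypothetical sample data, not real API output.
sample_parse = {
    "chunks": [
        {"id": "chunk-1", "page": 0,
         "box": {"left": 0.10, "top": 0.12, "right": 0.45, "bottom": 0.16}},
        {"id": "chunk-2", "page": 0,
         "box": {"left": 0.60, "top": 0.40, "right": 0.85, "bottom": 0.44}},
    ]
}
sample_extract = {
    "extraction": {"employee_name": "Jane Doe", "gross_pay": 1234.56},
    "extraction_metadata": {
        "employee_name": {"references": ["chunk-1"]},
        "gross_pay": {"references": ["chunk-2"]},
    },
}

grounding = build_grounding(sample_parse, sample_extract)
print(json.dumps(grounding, indent=2))
```

Indexing the chunks by ID first keeps the join simple, and a reference ID that matches no chunk is skipped rather than raising an error.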
Chunk Coordinates
Each entry in pay-stub_grounding_output.json includes the page number and bounding box coordinates. Coordinates are normalized values between 0 and 1, relative to the page dimensions.
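Because the coordinates are normalized, they need to be scaled by the rendered page size before you can crop or draw on an image. The sketch below shows that arithmetic. The key names left, top, right, and bottom and the page size of 1700 x 2200 pixels are assumptions for illustration. Use the key names found in your own output file and the dimensions of your rendered page.

```python
def to_pixel_box(box, page_width, page_height):
    """Scale a normalized box (values from 0 to 1) to integer pixel coordinates."""
    left = round(box["left"] * page_width)
    top = round(box["top"] * page_height)
    right = round(box["right"] * page_width)
    bottom = round(box["bottom"] * page_height)
    return left, top, right, bottom


# Hypothetical normalized box on a page rendered at 1700 x 2200 pixels.
box = {"left": 0.1, "top": 0.25, "right": 0.5, "bottom": 0.3}
pixel_box = to_pixel_box(box, 1700, 2200)
print(pixel_box)  # (170, 550, 850, 660)
```

The resulting tuple follows the (left, top, right, bottom) order that many imaging libraries expect for a crop rectangle.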
Next Steps
Now that you have a working script, you can:
- Replace pay-stub.pdf with any document you want to parse and extract from.
- Modify the schema dictionary to extract different fields. For guidance, see Extraction Schema (JSON).
- Use the Playground to build and test a schema before adding it to your code. See Schema Wizard.
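As a rough idea of what modifying the schema dictionary might look like, the sketch below defines a JSON Schema style dictionary with the two fields from this tutorial plus one extra field. The field keys employee_name and gross_pay are inferred from the output file names, pay_date is a hypothetical addition, and the exact set of supported keywords is an assumption. Consult Extraction Schema (JSON) for what the API accepts.

```python
import json

# A sketch of an extraction schema written as a Python dictionary.
schema = {
    "type": "object",
    "properties": {
        "employee_name": {
            "type": "string",
            "description": "Full name of the employee",
        },
        "gross_pay": {
            "type": "number",
            "description": "Gross pay for the pay period",
        },
        # Hypothetical extra field, added to show how the schema can grow.
        "pay_date": {
            "type": "string",
            "description": "Date the payment was issued",
        },
    },
}

# Serialize to JSON to confirm the dictionary is valid JSON before using it.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

Adding a field under properties with a clear description is usually the only change needed to request another value.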

