What Is a “Grounding”?

When the agentic_doc library parses a document, it breaks the content into chunks: discrete elements extracted from the document, such as blocks of text or tables.

Each chunk includes a grounding, which represents the location of the chunk in the document. The grounding includes:

  • the page number that the chunk is on
  • the relative coordinates of the bounding box of the chunk

For example, below is the JSON output for a text chunk. In the grounding object, a page value of 0 indicates that the text is on the first page (page numbers start at 0), and the box object gives the relative coordinates of the bounding box.

{
  "text": "## INSURANCE COMPANY",
  "grounding": [
    {
      "box": {
        "l": 0.35,
        "t": 0.22619999999999998,
        "r": 0.565,
        "b": 0.24033749999999998
      },
      "page": 0
    }
  ],
  "chunk_type": "text",
  "chunk_id": "9475461e-0686-4b16-b503-ccec7d7f115c"
}
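
Because the box coordinates are relative, they need to be scaled by the page size before they can be used to crop or draw on a rendered page. The sketch below shows one way to do that with the chunk JSON from the example above. It assumes that l, t, r, and b stand for left, top, right, and bottom, that each value is a fraction of the page width or height measured from the top-left corner, and that the page is 1000 by 2000 pixels. The box_to_pixels helper and the page size are hypothetical and are not part of the library.

```python
import json

# Chunk JSON copied from the example above.
chunk_json = """
{
  "text": "## INSURANCE COMPANY",
  "grounding": [
    {
      "box": {
        "l": 0.35,
        "t": 0.22619999999999998,
        "r": 0.565,
        "b": 0.24033749999999998
      },
      "page": 0
    }
  ],
  "chunk_type": "text",
  "chunk_id": "9475461e-0686-4b16-b503-ccec7d7f115c"
}
"""


def box_to_pixels(box, page_width, page_height):
    """Scale a relative box to pixel coordinates.

    Assumption: values are fractions of the page size with a top-left origin.
    """
    left = round(box["l"] * page_width)
    top = round(box["t"] * page_height)
    right = round(box["r"] * page_width)
    bottom = round(box["b"] * page_height)
    return left, top, right, bottom


chunk = json.loads(chunk_json)

# Hypothetical page size in pixels; replace with the size of your rendered page.
PAGE_WIDTH, PAGE_HEIGHT = 1000, 2000

for grounding in chunk["grounding"]:
    pixels = box_to_pixels(grounding["box"], PAGE_WIDTH, PAGE_HEIGHT)
    print(f"page {grounding['page']}: {pixels}")
```

The resulting tuple is in (left, top, right, bottom) order, which is the order most image libraries expect for a crop rectangle.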

Save Groundings as Images

When using any of the parsing functions from the library, you can pass the optional grounding_save_dir parameter to save each grounding as an image in a directory you specify.

Here’s an example of how to use the grounding_save_dir parameter with the parse_documents parsing function:

from agentic_doc.parse import parse_documents

# Parse a document from a URL & save groundings
results = parse_documents(
    ["https://www.rbcroyalbank.com/banking-services/_assets-custom/pdf/eStatement.pdf"],
    grounding_save_dir="./grounding"
)

# Print the path to each saved grounding
for chunk in results[0].chunks:
    for grounding in chunk.grounding:
        if grounding.image_path:
            print(f"Grounding saved to: {grounding.image_path}")

File Path and File Name Conventions for Saved Groundings

Images are saved with this structure:

path/to/save/groundings/
└── document_TIMESTAMP/
    └── page_0/
        └── ChunkType.TYPE_CHUNK_ID_Y.png

Where:

  • TIMESTAMP is the time and date the document was parsed
  • page_0 is the folder for the page, named with the page number (starting at 0)
  • TYPE is the chunk type
  • CHUNK_ID is the chunk ID
  • Y is the index of the grounding (in case a chunk spans multiple regions)
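
If you need to map a saved image back to its chunk, the naming convention above can be split apart again. The following is a minimal sketch and not a library function: parse_grounding_filename is a hypothetical helper, and it assumes that the chunk ID contains no underscores (as with the UUID in the JSON example) and that the parent folder is named page_N. The example path is illustrative only; TIMESTAMP is left as a placeholder.

```python
from pathlib import Path


def parse_grounding_filename(path):
    """Return (page, chunk_type, chunk_id, index) for a saved grounding image.

    Hypothetical helper that follows the naming convention described above.
    """
    p = Path(path)
    # Split from the right so a chunk type that contains underscores stays intact.
    prefix, chunk_id, index = p.stem.rsplit("_", 2)
    # Drop the leading "ChunkType." from the prefix to get the chunk type.
    _, _, chunk_type = prefix.partition(".")
    # The parent folder is named page_N, where N is the page number.
    page = int(p.parent.name.split("_", 1)[1])
    return page, chunk_type, chunk_id, int(index)


# Illustrative path built from the convention above.
example = (
    "grounding/document_TIMESTAMP/page_0/"
    "ChunkType.text_9475461e-0686-4b16-b503-ccec7d7f115c_0.png"
)
print(parse_grounding_filename(example))
```

Splitting from the right keeps the parsing independent of how the chunk type itself is spelled, at the cost of relying on the chunk ID being underscore-free.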