When parsing a document with the library, you can use the visualization utility to create annotated images that show where each chunk of content was extracted from the document.

Visualizing the results is useful for:
Verifying the accuracy of the extraction
Debugging extraction issues
For example, here is a side-by-side comparison of a document (left) and the saved visualization of that document (right).
Sample Script: Save Visualization with Default Settings
Run the following script to create visualizations using the default settings from the viz_parsed_document function.
from agentic_doc.parse import parse
from agentic_doc.utils import viz_parsed_document

# Define the document path and output directory
doc_path = "path/to/document.pdf"
output_dir = "path/to/save/visualizations"

# Parse the document (returns a list of parsed documents)
results = parse(doc_path)
parsed_doc = results[0]  # Get the first parsed document

# Create visualizations with default settings
images = viz_parsed_document(
    doc_path,
    parsed_doc,
    output_dir=output_dir
)
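After the script runs, it can help to confirm that files actually landed in the output directory. The following is a minimal sketch using only the Python standard library. The helper name `list_saved_visualizations` and the list of image extensions are assumptions made for illustration; they are not part of the library, and the file format the library writes is not stated here.

```python
from pathlib import Path
from typing import List, Sequence


def list_saved_visualizations(
    output_dir: str,
    extensions: Sequence[str] = (".png", ".jpg", ".jpeg"),  # assumed formats
) -> List[Path]:
    """Return the image files found directly inside output_dir, sorted by path.

    Hypothetical helper: it only inspects the file system and does not
    call the library. Returns an empty list if the directory is missing.
    """
    directory = Path(output_dir)
    if not directory.is_dir():
        return []
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        path
        for path in directory.iterdir()
        if path.is_file() and path.suffix.lower() in wanted
    )
```

Calling `list_saved_visualizations(output_dir)` with the same `output_dir` used in the sample script returns the matching files, which you can then open and compare against the source document.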
Sample Script: Save Visualization with Custom Settings
You can customize the visualization settings by using the VisualizationConfig class. For example, you can change the colors of the bounding boxes around each chunk type.
from agentic_doc.parse import parse
from agentic_doc.utils import viz_parsed_document
from agentic_doc.common import ChunkType
from agentic_doc.config import VisualizationConfig

# Define the document path and output directory
doc_path = "path/to/document.pdf"
output_dir = "path/to/save/visualizations"

# Parse the document (returns a list of parsed documents)
results = parse(doc_path)
parsed_doc = results[0]  # Get the first parsed document

viz_config = VisualizationConfig(
    thickness=2,  # Thicker bounding boxes
    text_bg_opacity=0.8,  # More opaque text background
    font_scale=0.7,  # Larger text
    # Custom colors for different chunk types.
    # The color names below match the tuples when read in (B, G, R) order.
    color_map={
        ChunkType.marginalia: (0, 255, 0),  # Green for marginalia
        ChunkType.table: (0, 0, 255),  # Red for tables
        ChunkType.figure: (255, 165, 0),  # Light blue for figures
        ChunkType.text: (255, 0, 0),  # Blue for regular text
    }
)

images = viz_parsed_document(
    doc_path,
    parsed_doc,
    output_dir=output_dir,
    viz_config=viz_config
)
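The color comments in the sample only line up with their tuples if the values are read as (B, G, R) rather than (R, G, B): for instance, (0, 0, 255) is labeled red. If you usually think in RGB, a small conversion helper can prevent swapped colors. This is a sketch under that assumption about channel order; `rgb_to_bgr` is a hypothetical helper, not a library function.

```python
from typing import Tuple

Color = Tuple[int, int, int]


def rgb_to_bgr(rgb: Color) -> Color:
    """Reverse an (R, G, B) tuple into (B, G, R) order.

    Hypothetical helper; assumes the color_map values are interpreted
    as BGR, which is what the comments in the sample script imply.
    """
    r, g, b = rgb
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError(f"channel value out of range 0-255: {channel}")
    return (b, g, r)


# Familiar RGB values converted to the tuples used in the sample script.
GREEN = rgb_to_bgr((0, 255, 0))         # (0, 255, 0), unchanged by the swap
RED = rgb_to_bgr((255, 0, 0))           # (0, 0, 255)
LIGHT_BLUE = rgb_to_bgr((0, 165, 255))  # (255, 165, 0)
BLUE = rgb_to_bgr((0, 0, 255))          # (255, 0, 0)
```

These constants could then be used as the values of `color_map`, for example `ChunkType.table: RED`. If the rendered boxes come out with red and blue swapped, the channel-order assumption does not hold and the conversion should be dropped.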