The Agentic Document Extraction API endpoint imposes rate limits per API key. The Agentic Document Extraction library automatically retries requests that fail with a rate limit error or another intermittent HTTP error.

You can customize the retry options with the Configuration Options in the library.

Error Handling

The Agentic Document Extraction library implements a retry mechanism for handling API failures:

  • Retries are performed for these HTTP status codes: 408, 429, 502, 503, 504.
  • Exponential backoff with jitter is used for retry wait time.
  • The initial retry wait time is 1 second and increases exponentially with each attempt.
  • Retries stop after max_retries attempts. Once that limit is exceeded, the library raises an exception and the request fails.
  • Retry wait time is capped at max_retry_wait_time seconds.
  • Retries include a random jitter of up to 10 seconds to distribute requests and prevent the thundering herd problem.
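
The retry rules above can be sketched as follows. This is a minimal illustration, not the library's actual implementation: the function names, the doubling factor, and the choice to apply the cap after adding jitter are all assumptions made for the example.

```python
import random

# Status codes the documentation lists as retryable.
RETRYABLE_STATUS_CODES = {408, 429, 502, 503, 504}


def is_retryable(status_code):
    """Return True if a response with this HTTP status should be retried."""
    return status_code in RETRYABLE_STATUS_CODES


def retry_wait_time(attempt, max_retry_wait_time=60.0,
                    initial_wait=1.0, max_jitter=10.0, rng=random):
    """Seconds to wait before retry number `attempt` (1-based).

    Assumptions (hypothetical, for illustration only):
    - the wait doubles on every attempt, starting at `initial_wait`;
    - up to `max_jitter` seconds of random jitter is added;
    - the cap is applied last, so the result never exceeds
      `max_retry_wait_time`.
    """
    backoff = initial_wait * 2 ** (attempt - 1)
    jitter = rng.uniform(0.0, max_jitter)
    return min(backoff + jitter, max_retry_wait_time)


if __name__ == "__main__":
    for attempt in range(1, 8):
        print(attempt, round(retry_wait_time(attempt), 2))
```

With the default cap of 60 seconds, the wait in this sketch reaches the cap from the seventh attempt onward, so later retries are spaced at a steady 60 seconds.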

Configuration Options

The Agentic Document Extraction library uses a Settings object to manage configuration. You can customize these settings through environment variables or a .env file.

Below is an example .env file that customizes the configurations:

# Number of files to process in parallel, defaults to 4
BATCH_SIZE=4
# Number of threads used to process parts of each file in parallel, defaults to 5.
MAX_WORKERS=2
# Maximum number of retry attempts for failed intermittent requests, defaults to 100
MAX_RETRIES=80
# Maximum wait time in seconds for each retry, defaults to 60
MAX_RETRY_WAIT_TIME=30
# Logging style for retry, defaults to log_msg
RETRY_LOGGING_STYLE=log_msg
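
To make the relationship between these variables and their defaults concrete, here is a standard-library sketch that reads the same variable names from an environment mapping. The `RetrySettings` class and `load_settings` function are hypothetical names for this example and are not the library's Settings object; only the variable names and default values come from the example above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrySettings:
    """Hypothetical container mirroring the documented defaults."""
    batch_size: int = 4
    max_workers: int = 5
    max_retries: int = 100
    max_retry_wait_time: int = 60
    retry_logging_style: str = "log_msg"


def load_settings(env=None):
    """Build settings from an environment mapping, using defaults for unset keys."""
    env = os.environ if env is None else env
    d = RetrySettings()
    return RetrySettings(
        batch_size=int(env.get("BATCH_SIZE", d.batch_size)),
        max_workers=int(env.get("MAX_WORKERS", d.max_workers)),
        max_retries=int(env.get("MAX_RETRIES", d.max_retries)),
        max_retry_wait_time=int(env.get("MAX_RETRY_WAIT_TIME", d.max_retry_wait_time)),
        retry_logging_style=env.get("RETRY_LOGGING_STYLE", d.retry_logging_style),
    )


if __name__ == "__main__":
    # Values taken from the example .env file; unset keys keep their defaults.
    print(load_settings({"MAX_WORKERS": "2", "MAX_RETRIES": "80",
                         "MAX_RETRY_WAIT_TIME": "30"}))
```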

Max Parallelism

The maximum number of parallel requests is the product BATCH_SIZE × MAX_WORKERS.

NOTE: The maximum parallelism allowed by the Agentic Document Extraction library is 100.

Increasing MAX_WORKERS can speed up the processing of large individual files, while increasing BATCH_SIZE improves throughput when processing multiple files.

NOTE: Your job’s maximum processing throughput may be limited by your API rate limit. If your rate limit isn’t high enough, you may encounter rate limit errors, which the library will automatically handle through retries.

The optimal values for MAX_WORKERS and BATCH_SIZE depend on your API rate limit and the latency of each REST API call. For example, suppose your account has a rate limit of 5 requests per minute, each REST API call takes approximately 60 seconds to complete, and you're processing a single large file. At most 5 requests can be in flight at once without exceeding the rate limit, so MAX_WORKERS should be set to 5 and BATCH_SIZE to 1.
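
The arithmetic behind that example can be sketched as below. The helper names are hypothetical and the formula is a rough rule of thumb (requests in flight ≈ request rate × latency), not something the library computes for you; the cap of 100 comes from the note above.

```python
import math

# Maximum parallelism allowed by the library, per the note above.
LIBRARY_MAX_PARALLELISM = 100


def total_parallelism(batch_size, max_workers):
    """Parallel requests implied by the two settings."""
    return batch_size * max_workers


def sustainable_parallelism(rate_limit_per_minute, latency_seconds):
    """Rough number of requests that can be in flight without exceeding the rate limit.

    Rule-of-thumb assumption: in-flight requests = rate (per second) x latency.
    """
    in_flight = math.floor(rate_limit_per_minute * latency_seconds / 60)
    return max(1, min(in_flight, LIBRARY_MAX_PARALLELISM))


if __name__ == "__main__":
    target = sustainable_parallelism(rate_limit_per_minute=5, latency_seconds=60)
    # Single large file: put all of the parallelism into MAX_WORKERS.
    print("MAX_WORKERS =", target, "BATCH_SIZE = 1")
    print("total parallelism:", total_parallelism(1, target))
```

If the product of your chosen BATCH_SIZE and MAX_WORKERS is well above this estimate, expect more rate limit errors and retries rather than more throughput.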

You can find your REST API latency in the logs. If you want to increase your rate limit, schedule a time to meet with us here.

Set RETRY_LOGGING_STYLE

The RETRY_LOGGING_STYLE setting controls how the Agentic Document Extraction library logs retry attempts.

  • log_msg: Log each retry attempt as a separate log message. This is the default setting.
  • inline_block: Print a yellow progress block ('█') on the same line. Each block represents one retry attempt. Choose this if you don't want verbose retry log messages but still want to track how many retries have been made.
  • none: Do not log the retry attempts.