max_retries
attempts. Exceeding the limit raises an exception and results in a failure for this request.
max_retry_wait_time
seconds.
Settings
object to manage configuration. You can customize these settings through either environment variables or a .env file:
Below is an example .env file that customizes the configurations:
The number of parallel requests is BATCH_SIZE × MAX_WORKERS.
NOTE: The maximum parallelism allowed by the library is 100.
Specifically, increasing MAX_WORKERS can speed up the processing of large individual files, while increasing BATCH_SIZE improves throughput when processing multiple files.
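To make the relationship between the two settings concrete, here is a minimal Python sketch. It assumes the parallelism is the product BATCH_SIZE × MAX_WORKERS and that the limit of 100 applies to that product. The helper name effective_parallelism is hypothetical and not part of the library, and raising an error on an over-limit configuration is a choice made for this sketch, not documented library behavior.

```python
MAX_PARALLELISM = 100  # limit stated in the note above


def effective_parallelism(batch_size: int, max_workers: int) -> int:
    """Return batch_size * max_workers, rejecting values above the limit.

    Hypothetical helper for checking a configuration before running a job.
    """
    if batch_size < 1 or max_workers < 1:
        raise ValueError("batch_size and max_workers must be at least 1")
    parallelism = batch_size * max_workers
    if parallelism > MAX_PARALLELISM:
        raise ValueError(
            f"{batch_size} x {max_workers} = {parallelism} exceeds "
            f"the maximum parallelism of {MAX_PARALLELISM}"
        )
    return parallelism


if __name__ == "__main__":
    # 4 files in parallel, each split across 5 workers
    print(effective_parallelism(batch_size=4, max_workers=5))
```

Under these assumptions, a configuration such as BATCH_SIZE=20 with MAX_WORKERS=10 would be rejected, because the product is 200.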
NOTE: Your job’s maximum processing throughput may be limited by your API rate limit. If your rate limit isn’t high enough, you may encounter rate limit errors, which the library handles automatically through retries.
The optimal values for MAX_WORKERS and BATCH_SIZE depend on your API rate limit and the latency of each REST API call. For example, if your account has a rate limit of 5 requests per minute, each REST API call takes approximately 60 seconds to complete, and you’re processing a single large file, then MAX_WORKERS should be set to 5 and BATCH_SIZE to 1.
You can find your REST API latency in the logs. If you want to increase your rate limit, schedule a time to meet with us here.
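The arithmetic behind the example above can be sketched in Python as a rough starting point. It uses the steady-state estimate that the number of requests in flight is roughly the request rate multiplied by the latency. The function name suggest_max_workers is hypothetical, and the cap of 100 reflects the maximum parallelism noted earlier; actual behavior depends on how your latency varies, so treat the result as an initial value to tune.

```python
import math

MAX_PARALLELISM = 100  # maximum parallelism noted earlier in this section


def suggest_max_workers(rate_limit_per_minute: float, latency_seconds: float) -> int:
    """Estimate MAX_WORKERS for a single large file (BATCH_SIZE = 1).

    Hypothetical helper: in-flight requests ~= rate (per second) * latency.
    """
    if rate_limit_per_minute <= 0 or latency_seconds <= 0:
        raise ValueError("rate limit and latency must be positive")
    # Multiply before dividing to keep the common integer cases exact.
    in_flight = rate_limit_per_minute * latency_seconds / 60
    return max(1, min(MAX_PARALLELISM, math.floor(in_flight)))


if __name__ == "__main__":
    # The example from the text: 5 requests per minute, about 60 s per call
    print(suggest_max_workers(rate_limit_per_minute=5, latency_seconds=60))
```

With the numbers from the text (5 requests per minute, 60 seconds per call), the estimate is 5 workers, matching the recommendation above.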
RETRY_LOGGING_STYLE
The RETRY_LOGGING_STYLE setting controls how the library logs retry attempts.
log_msg: Log each retry attempt as a separate log message. This is the default setting.
inline_block: Print a yellow progress block ('█') on the same line. Each block represents one retry attempt. Choose this if you don’t want to see the verbose retry log messages but still want to track the number of retries that have been made.
none: Do not log the retry attempts.
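To illustrate how the three styles differ in practice, here is a small Python sketch that mimics the behavior described above. It is not the library's implementation: the function report_retry, the logger name, the warning log level, and the ANSI escape code used for yellow are all assumptions made for this example.

```python
import logging
import sys

logger = logging.getLogger("retry_demo")  # hypothetical logger name


def report_retry(style: str, attempt: int, stream=sys.stdout) -> None:
    """Report one retry attempt according to a RETRY_LOGGING_STYLE value."""
    if style == "log_msg":
        # One separate log message per attempt (log level is an assumption).
        logger.warning("Retry attempt %d", attempt)
    elif style == "inline_block":
        # Yellow block with no newline, so blocks accumulate on one line.
        stream.write("\033[33m█\033[0m")
        stream.flush()
    elif style == "none":
        return
    else:
        raise ValueError(f"unknown retry logging style: {style!r}")


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    for attempt in range(1, 4):
        report_retry("inline_block", attempt)
    print()  # end the line of blocks
```

Running the sketch with inline_block prints three blocks on a single line, one per simulated retry, whereas log_msg would emit three separate messages.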