Troubleshoot: Extraction
This section describes error messages you may encounter during extraction and how to resolve them.
Error: Only JSON schema version 2020-12 is supported. Invalid JSON Schema: ‘unknown_type’ is not valid under any of the given schemas
This error occurs when an unsupported data type is specified for a field in the extraction schema. To see a list of supported data types, go to Supported Data Types.
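As a rough illustration, the sketch below walks a schema (written as a Python dictionary) and reports any type name outside an assumed set. The SUPPORTED_TYPES set uses the standard JSON Schema type names and is an assumption; the authoritative list is on the Supported Data Types page. The helper find_unsupported_types and its path format are hypothetical and not part of any library.

```python
# Assumption: standard JSON Schema type names. The service's real list is on
# the Supported Data Types page and may differ.
SUPPORTED_TYPES = {"string", "number", "integer", "boolean", "object", "array", "null"}


def find_unsupported_types(schema, path="root"):
    """Return (path, type_name) pairs for type names outside SUPPORTED_TYPES."""
    problems = []
    declared = schema.get("type")
    names = declared if isinstance(declared, list) else [declared]
    for name in names:
        if name is not None and name not in SUPPORTED_TYPES:
            problems.append((path, name))
    # Recurse into object fields and array item definitions.
    for key, child in schema.get("properties", {}).items():
        problems += find_unsupported_types(child, f"{path}.{key}")
    if isinstance(schema.get("items"), dict):
        problems += find_unsupported_types(schema["items"], f"{path}[]")
    return problems


schema = {
    "type": "object",
    "properties": {
        "invoice_id": {"type": "string"},
        "total": {"type": "unknown_type"},  # this field would trigger the error
    },
}
print(find_unsupported_types(schema))
```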
Error: Top-level schema must be of type ‘object’
This error occurs when the top-level element of the schema is not of type object. The top-level element of the schema must be object.
Correct:
Incorrect:
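A minimal sketch of both cases, with the schemas written as Python dictionaries. The field names and the top_level_is_object helper are hypothetical and only illustrate the rule.

```python
# Correct: the root of the schema is an object, and the array is one of its fields.
correct_schema = {
    "type": "object",
    "properties": {
        "line_items": {"type": "array", "items": {"type": "string"}},
    },
}

# Incorrect: the root is an array, which triggers the error.
incorrect_schema = {
    "type": "array",
    "items": {"type": "string"},
}


def top_level_is_object(schema):
    """Return True when the root schema declares type 'object'."""
    return schema.get("type") == "object"


print(top_level_is_object(correct_schema))
print(top_level_is_object(incorrect_schema))
```

If your data is naturally a list, one way to satisfy the rule is to wrap the array in a single field of a root object, as in correct_schema.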
Error: Schema depth exceeds 5 at PATH
This error occurs when the schema has more than five nested levels. The extraction schema supports up to five nested levels.
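To check a schema before you submit it, you could measure its nesting with a sketch like the one below. How the service counts levels is not specified here, so this example assumes the root is level 0 and each step into properties or items adds one level. The schema_depth and nest helpers are hypothetical.

```python
MAX_DEPTH = 5  # limit stated in this section


def schema_depth(schema, level=0):
    """Return the deepest level reached. Assumption: the root is level 0 and
    each step into 'properties' or 'items' adds one level."""
    deepest = level
    for child in schema.get("properties", {}).values():
        deepest = max(deepest, schema_depth(child, level + 1))
    items = schema.get("items")
    if isinstance(items, dict):
        deepest = max(deepest, schema_depth(items, level + 1))
    return deepest


def nest(levels):
    """Build a test schema with `levels` objects nested inside each other."""
    schema = {"type": "string"}
    for _ in range(levels):
        schema = {"type": "object", "properties": {"child": schema}}
    return schema


for levels in (5, 6):
    depth = schema_depth(nest(levels))
    print(levels, depth, "ok" if depth <= MAX_DEPTH else "too deep")
```

If a schema is too deep, flattening nested objects into fields of a parent object is one way to reduce the number of levels.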
Error: Type list definition at PATH cannot contain ‘object’ or ‘array’. Please use ‘anyOf’ instead.
This error occurs when you define a JSON schema field with multiple allowed types (called a “type array”) that includes object or array as one of the options. Agentic Document Extraction does not allow type arrays to contain these complex data types because they can create validation conflicts and ambiguous schema definitions.
A type array allows a field to accept multiple data types. For example, "type": ["string", "number"] means the field can contain either a text value or a numeric value. However, when you include “object” or “array” in a type array, the schema becomes difficult to validate consistently.
Common scenarios that trigger this error:
"type": ["number", "object"]
"type": ["string", "array"]
"type": ["object", "array"]
Solution:
To fix this issue, replace the type array with an anyOf construct. The anyOf keyword provides a clearer and more flexible way to specify that a field can match any one of several schema definitions. This approach eliminates the validation ambiguity that occurs with complex types in type arrays.
Correct:
Incorrect:
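A minimal sketch of the rewrite, with each field definition written as a Python dictionary. The field contents and the has_complex_type_array helper are hypothetical. Note that the object branch under anyOf still defines its own properties.

```python
# Incorrect: a type array that includes "object".
incorrect_field = {"type": ["number", "object"]}

# Correct: each alternative is a separate schema under anyOf.
correct_field = {
    "anyOf": [
        {"type": "number"},
        {
            "type": "object",
            "properties": {"amount": {"type": "number"}},
        },
    ]
}

COMPLEX_TYPES = {"object", "array"}


def has_complex_type_array(field):
    """Return True when 'type' is a list that contains 'object' or 'array'."""
    declared = field.get("type")
    return isinstance(declared, list) and bool(COMPLEX_TYPES & set(declared))


print(has_complex_type_array(incorrect_field))
print(has_complex_type_array(correct_field))
print(has_complex_type_array({"type": ["string", "number"]}))
```

Type arrays made up only of simple types, such as the last case, are not flagged by this check.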
Error: Keyword ‘KEY’ is not supported
This error occurs when a prohibited keyword is included in the schema. The extraction schema does not support these keywords:
allOf
not
dependentRequired
dependentSchemas
if
then
else
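The list above can be turned into a quick pre-submission check. The sketch below is an illustration only: find_unsupported_keywords is a hypothetical helper, its path format is made up, and it only descends into properties, items, and anyOf.

```python
# Keywords listed in this section as unsupported.
UNSUPPORTED_KEYWORDS = {
    "allOf", "not", "dependentRequired", "dependentSchemas", "if", "then", "else",
}


def find_unsupported_keywords(schema, path="root"):
    """Return (path, keyword) pairs for each unsupported keyword in the schema."""
    found = []
    if not isinstance(schema, dict):
        return found
    for key, value in schema.items():
        if key in UNSUPPORTED_KEYWORDS:
            found.append((path, key))
        elif key == "properties" and isinstance(value, dict):
            # Names under 'properties' are field names, not keywords,
            # so only their schemas are inspected.
            for name, child in value.items():
                found += find_unsupported_keywords(child, f"{path}.{name}")
        elif key == "items":
            found += find_unsupported_keywords(value, f"{path}[]")
        elif key == "anyOf" and isinstance(value, list):
            for index, child in enumerate(value):
                found += find_unsupported_keywords(child, f"{path}.anyOf[{index}]")
    return found


schema = {
    "type": "object",
    "properties": {
        "status": {"allOf": [{"type": "string"}]},  # unsupported keyword
        "if": {"type": "boolean"},  # a field that happens to be named "if"
    },
}
print(find_unsupported_keywords(schema))
```

The check treats names under properties as field names, so a field called "if" is not reported.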
Error: ‘properties’ must be defined for object at root
This error occurs when fields (properties) are not defined for object types in the extraction schema.
To fix this issue, define the properties field for all object types in the schema.
Correct:
Incorrect:
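A minimal sketch of both cases, with the schemas written as Python dictionaries. The field names and the objects_missing_properties helper are hypothetical. The helper reports an object only when the properties key is absent.

```python
# Correct: every object, including the nested one, defines its properties.
correct_schema = {
    "type": "object",
    "properties": {
        "vendor": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
        },
    },
}

# Incorrect: the root object has no properties.
incorrect_schema = {"type": "object"}


def objects_missing_properties(schema, path="root"):
    """Return the paths of object schemas that lack a 'properties' key."""
    missing = []
    if schema.get("type") == "object" and "properties" not in schema:
        missing.append(path)
    for name, child in schema.get("properties", {}).items():
        missing += objects_missing_properties(child, f"{path}.{name}")
    if isinstance(schema.get("items"), dict):
        missing += objects_missing_properties(schema["items"], f"{path}[]")
    return missing


print(objects_missing_properties(correct_schema))
print(objects_missing_properties(incorrect_schema))
```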
Error: ‘items’ must be defined for array at PATH
This error occurs when an array is missing the items definition.
To fix this issue, define the fields in the array with items.
Correct:
Incorrect:
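A minimal sketch of both cases, with the schemas written as Python dictionaries. The field names and the arrays_missing_items helper are hypothetical, and its path format is only illustrative.

```python
# Correct: the array declares what each element looks like.
correct_schema = {
    "type": "object",
    "properties": {
        "tags": {"type": "array", "items": {"type": "string"}},
    },
}

# Incorrect: the array has no items definition.
incorrect_schema = {
    "type": "object",
    "properties": {
        "tags": {"type": "array"},
    },
}


def arrays_missing_items(schema, path="root"):
    """Return the paths of array schemas that lack an 'items' key."""
    missing = []
    if schema.get("type") == "array" and "items" not in schema:
        missing.append(path)
    for name, child in schema.get("properties", {}).items():
        missing += arrays_missing_items(child, f"{path}.{name}")
    if isinstance(schema.get("items"), dict):
        missing += arrays_missing_items(schema["items"], f"{path}[]")
    return missing


print(arrays_missing_items(correct_schema))
print(arrays_missing_items(incorrect_schema))
```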