Define your own schemas
Note: This feature is currently available by request and only in a pipeline without Human-In-The-Loop!
Through the Annotation Schemas API, you can receive the annotations of your documents with the fields and structure you need. The only requirement is that the schema is defined as a JSON specification.
You can:
- Use one of the existing public schemas available in our API
- Define your own schema(s) from scratch, or use one of our public schemas as a base, make your changes, and create a new one just for you
Each schema has its own identifier, which you need when uploading your document.
When you upload your document with a schema identifier, our system temporarily persists your document along with the corresponding schema during the annotation process. So, if you change your schema while a document is being annotated, your changes will only apply to documents uploaded after those schema changes have been made.
How to do it?
1. Choose or define the schema in which you want to receive your document annotation
   - Start by getting the list of available schemas (only the public ones if you haven't created any yet).
   - Choose your schema, or create a new one. In either case, take note of the schema identifier.
2. Upload your document file, setting your schema identifier. Note that the upload endpoint is asynchronous: you submit the file and immediately receive a response with the id of your document, but the document is queued for processing. You can then poll our service (step 3) until your document is annotated, which is when the state is READY.
   - Authenticate your API.
   - Use the Annotate endpoint to submit your file, setting the following mandatory body params:
     - files - your file (Ex. mycv.pdf)
     - documentClass - the schema identifier that you noted in step 1 (Ex. a5b9f0e4-7986-4dc9-9b01-e28fd8dea575)
     - pipeline - LLM02 (this is the name of the pipeline that uses this feature)
   - Optionally, you can also set the callback params to notify your API when the document is READY.
   - Take note of the id; this is the document identifier.
3. Get your document annotation
   - Use the GET https://assetgw.docdigitizer.com/api/v5/documents/{docid} endpoint, replacing {docid} with the document identifier returned by the upload. Authenticate your API again if your token has expired.
   - Wait until the response has state=READY. When your document is ready, the annotations field should be filled in according to your JSON schema specification.
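The upload and polling steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an official client: the helper names (build_upload_fields, fetch_document, wait_until_ready) are hypothetical, the authentication headers are left to the caller because their format is not described here, and the response is assumed to be a JSON object with a top-level state field. Only the GET URL, the body param names, and the READY state come from the text.

```python
import json
import time
import urllib.request
from typing import Callable, Dict

# GET endpoint quoted in step 3; {docid} is appended per request.
DOCUMENTS_URL = "https://assetgw.docdigitizer.com/api/v5/documents"


def build_upload_fields(schema_id: str, pipeline: str = "LLM02") -> Dict[str, str]:
    """Return the non-file body params for the Annotate endpoint.

    The file itself goes in the `files` param of the same request.
    """
    if not schema_id:
        raise ValueError("schema_id is required")
    return {"documentClass": schema_id, "pipeline": pipeline}


def fetch_document(doc_id: str, auth_headers: Dict[str, str]) -> dict:
    """GET one document record.

    Assumption: the caller supplies whatever auth headers the API expects,
    and the response body is a JSON object.
    """
    request = urllib.request.Request(f"{DOCUMENTS_URL}/{doc_id}", headers=auth_headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_until_ready(
    doc_id: str,
    fetch: Callable[[str], dict],
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll `fetch(doc_id)` until the document reports state == "READY".

    Assumption: the state is exposed as a top-level "state" key.
    Raises TimeoutError if READY is not seen within max_attempts polls.
    """
    for attempt in range(max_attempts):
        document = fetch(doc_id)
        if document.get("state") == "READY":
            return document
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait before the next poll
    raise TimeoutError(f"document {doc_id} not READY after {max_attempts} polls")
```

With real credentials, the call might look like wait_until_ready(doc_id, lambda d: fetch_document(d, headers)), where headers is whatever your authentication step produces. Passing the fetch function in keeps the polling loop independent of the HTTP details; if you set the callback params on upload instead, polling is not needed.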
Please contact your sales support if you need to have access to the Annotation Schemas API.