Define your own schemas

Through the Annotation Schemas API, you can receive the annotations of your documents with exactly the fields and structure you need. The only requirement is that the schema is defined as a JSON specification.

You can:

Use one of the existing public schemas available in our API
Define your own schema(s) from scratch, or use one of our public schemas as a base, make your changes, and create a new one just for you

Each schema has its own identifier, which you need when uploading your document.

When you upload a document with a schema identifier, our system temporarily persists the document along with the corresponding schema for the duration of the annotation process. So, if you change your schema while a document is being annotated, your changes apply only to documents uploaded after those schema changes have been made.

How to do it?

Authenticate by calling the Auth service with your API Key. You need the returned accessToken to set the Authorization = Bearer {{accessToken}} header in all service calls.
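As a minimal sketch of how the header can be attached, the Python helper below builds a request that carries the token without sending it. The helper name authorized_request is hypothetical, and the Auth service call itself is not shown because its URL and response format are not described here.

```python
import urllib.request


def authorized_request(url: str, access_token: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request carrying the Bearer token header.

    `access_token` is the accessToken value returned by the Auth service.
    """
    return urllib.request.Request(
        url,
        method=method,
        headers={"Authorization": f"Bearer {access_token}"},
    )
```

The returned object can be passed to urllib.request.urlopen once you are ready to make the call.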
Choose or define the schema in which you want to receive your document annotation
1. Start by getting the list of available schemas (only the public ones if you have not created any yet)
2. Choose a schema or create a new one. In either case, take note of the schema identifier.
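Picking the identifier out of the schema list could look like the sketch below. The shape of the list response is not documented here, so the "id" and "name" keys and the "CV" schema name are assumptions made purely for illustration; adjust them to whatever the schema list actually returns.

```python
from typing import Optional


def find_schema_id(schemas: list[dict], name: str) -> Optional[str]:
    """Return the identifier of the first schema whose name matches.

    Assumes each entry is a dict with hypothetical "id" and "name" keys.
    """
    for schema in schemas:
        if schema.get("name") == name:
            return schema.get("id")
    return None


# Hypothetical list of schemas, as it might look after decoding the response.
available = [
    {"id": "a5b9f0e4-7986-4dc9-9b01-e28fd8dea575", "name": "CV"},
]
schema_id = find_schema_id(available, "CV")
```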
Upload your document file, setting your schema identifier. Note that the upload endpoint is asynchronous: you submit the file and receive an immediate response with the id of your document, but the document is queued for processing. You can then poll our service (point 3, "Get your document annotation") until your document is annotated, which is when the state is READY.
1. Authenticate your API if you have not done so already.
2. Use the Annotate endpoint to submit your file, setting the following mandatory body params:
  1. files - your file (Ex. mycv.pdf)
  2. documentClass - the schema identifier that you noted in step 1 (Ex. a5b9f0e4-7986-4dc9-9b01-e28fd8dea575)
  3. pipeline - LLM02 (this is the name of the pipeline that uses this feature)
3. Optionally, you can also set the callback params so that your API is notified when the document is READY.
4. Take note of the id; this is the document identifier.
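The upload request can be sketched with the Python standard library as follows. This assumes the body params are sent as multipart/form-data with a POST, which is not stated above. The Annotate endpoint URL is also not given here, so it is passed in as a parameter. The function names are hypothetical.

```python
import uuid
import urllib.request


def encode_multipart(fields: dict[str, str], file_field: str,
                     filename: str, content: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode()
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode()
        + content + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_annotate_request(annotate_url: str, access_token: str, schema_id: str,
                           filename: str, content: bytes) -> urllib.request.Request:
    """Build (but do not send) the upload request with the mandatory params."""
    fields = {
        "documentClass": schema_id,  # schema identifier noted earlier
        "pipeline": "LLM02",         # pipeline name required for this feature
    }
    body, content_type = encode_multipart(fields, "files", filename, content)
    return urllib.request.Request(
        annotate_url,
        data=body,
        method="POST",  # assumption: the upload is a POST
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": content_type,
        },
    )
```

Sending the request with urllib.request.urlopen should return the immediate response that contains the document id. The callback params are omitted because their names are not listed here.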
Get your document annotation
1. Call GET https://assetgw.docdigitizer.com/api/v5/documents/{docid}, replacing {docid} with the document identifier returned by the upload. Authenticate your API again if your token has expired.
2. Wait until the response has state=READY. When your document is ready, the annotations should be filled in according to your JSON schema specification.
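The polling described above can be sketched as a simple loop. This assumes the endpoint returns a JSON object with a top-level "state" field. The function names, the polling interval, and the attempt limit are illustrative choices, not values prescribed by the service.

```python
import json
import time
import urllib.request
from typing import Callable

DOCUMENT_URL = "https://assetgw.docdigitizer.com/api/v5/documents/{docid}"


def fetch_document(docid: str, access_token: str) -> dict:
    """GET the document and decode the JSON response (assumed format)."""
    req = urllib.request.Request(
        DOCUMENT_URL.format(docid=docid),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_until_ready(docid: str, fetch: Callable[[str], dict],
                     interval_s: float = 5.0, max_attempts: int = 60,
                     sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call `fetch` repeatedly until the document state is READY."""
    for _ in range(max_attempts):
        doc = fetch(docid)
        if doc.get("state") == "READY":
            return doc
        sleep(interval_s)  # wait before asking again
    raise TimeoutError(f"document {docid} not READY after {max_attempts} attempts")
```

A call such as wait_until_ready(docid, lambda d: fetch_document(d, access_token)) ties the two together. Keeping fetch as a parameter makes it easy to refresh an expired token or to test the loop without network access. If you set the callback params at upload time, polling may not be needed at all.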

Please contact your sales support if you need access to the Annotation Schemas API.