Using DocDigitizer PowerCapture API
DocDigitizer PowerCapture API - Main Services
The extraction process is asynchronous and, by default, includes “Human Validation”.
For this reason, the extracted data is not available immediately. It is delivered within the SLA established in your contract (for Trial/Demo accounts, the default SLA is 30 minutes).


HTTP Headers & Authorization
To authorize a request you need an API key (API_KEY).
The API key must be sent with every request, in the Authorization header field.
The header field Content-Type = "multipart/form-data" is needed only for the submit services ("Submit a file" and "Submit and annotate the given file"), because these requests carry files (it describes how the request body is encoded).
headers = {
'Authorization': 'API_KEY ' + APIKEY,
'Content-Type': 'multipart/form-data'
}
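As a minimal sketch of the rule above, the helper below builds the header set for a request. The function name `build_headers` and the `is_submit` flag are illustrative assumptions, not part of the API.

```python
def build_headers(api_key: str, is_submit: bool = False) -> dict:
    """Return the HTTP headers for a PowerCapture request.

    Hypothetical helper: the name and the is_submit flag are assumptions.
    """
    # Every request carries the API key in the Authorization field.
    headers = {"Authorization": "API_KEY " + api_key}
    if is_submit:
        # Only the submit services send files, so only they set this field.
        headers["Content-Type"] = "multipart/form-data"
    return headers
```

For example, `build_headers(APIKEY)` would serve the GET services, and `build_headers(APIKEY, is_submit=True)` the submit services.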
* Submit a file
Purpose: submit a file to be processed. The response carries the document_id that identifies the submitted file.
Method: POST
Body Parameters:
callback_url: URL of the Callback service (status notification)
files: file to submit for data extraction
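The two body parameters above could be encoded by hand roughly as follows. This is only a sketch using the Python standard library: the helper name `encode_multipart` and the per-file `application/octet-stream` type are assumptions, and most HTTP clients build this body for you. Note that a multipart body is delimited by a boundary string, which is normally announced in the Content-Type header as well.

```python
import uuid


def encode_multipart(callback_url: str, filename: str, content: bytes):
    """Encode callback_url and files as a multipart/form-data body.

    Hypothetical helper, not part of the API. Returns (content_type, body).
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    file_header = 'Content-Disposition: form-data; name="files"; filename="%s"' % filename
    parts = [
        # Text field: callback_url
        delimiter,
        b'Content-Disposition: form-data; name="callback_url"',
        b"",
        callback_url.encode("utf-8"),
        # File field: files (the octet-stream type is an assumption)
        delimiter,
        file_header.encode("utf-8"),
        b"Content-Type: application/octet-stream",
        b"",
        content,
        # Closing delimiter
        delimiter + b"--",
        b"",
    ]
    body = b"\r\n".join(parts)
    content_type = "multipart/form-data; boundary=" + boundary
    return content_type, body
```

The returned `content_type` would then be used as the Content-Type header value of the POST request, with `body` as its payload.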
When sending the request, in case of Accepted (Status=202), you will receive a response similar to the following:
{
"detail": "Accepted",
"status": 202,
"task": {
"created": "2021-11-02T12:14:26.564283",
"id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"links": {
"self": "/api/v1/tasks/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"collection": "/api/v1/tasks"
},
"status": "PENDING"
}
}
To get the ID of the document submitted to the DocDigitizer platform, check the headers of the response.
Look for the header field X-Document-Location.
The document ID is the GUID at the end of the URL:
"X-Document-Location": "http://api.docdigitizer.com/api/v1/documents/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
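Reading the document ID from that header can be sketched as below. The function name `document_id_from_headers` is illustrative, and the sketch assumes the response headers are available as a plain dictionary.

```python
from urllib.parse import urlparse


def document_id_from_headers(headers: dict) -> str:
    """Return the GUID at the end of the X-Document-Location URL.

    Hypothetical helper. A plain dict is case-sensitive, so the key must
    match exactly; HTTP client libraries usually offer a case-insensitive
    header mapping instead.
    """
    location = headers["X-Document-Location"]
    path = urlparse(location).path
    # The document ID is the last path segment of the URL.
    return path.rstrip("/").rsplit("/", 1)[-1]
```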
* Submit and annotate the given file
Purpose: submit a file to be processed, indicating the class of the document. The response carries the document_id that identifies the submitted file.
Method: POST
Path Parameters:
document_class: optional parameter indicating the type of document being submitted. Possible values: "financial-document", "invoice", "pay-slip", "bank-statement", "citizen-id-card", ...
Body Parameters:
callback_url: URL of the Callback service (status notification)
files: file to submit for data extraction
When sending the request, in case of Accepted (Status=202), you will receive a response similar to the following:
{
"detail": "Accepted",
"status": 202,
"task": {
"created": "2021-11-02T12:14:26.564283",
"id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"links": {
"self": "/api/v1/tasks/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"collection": "/api/v1/tasks"
},
"status": "PENDING"
}
}
To get the ID of the document submitted to the DocDigitizer platform, check the headers of the response.
Look for the header field X-Document-Location.
The document ID is the GUID at the end of the URL:
"X-Document-Location": "http://api.docdigitizer.com/api/v1/documents/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
* Get details of the document
Purpose: get the data extracted from the given document
Method: GET
Path params:
document_id: document ID (received in the response header of "Submit a file" or "Submit and annotate the given file")
Call this service when you receive a “Notification” with status=annotated (passing the corresponding document_id).
If you have no callback defined, check the field reviewed=true in the response to see whether processing of the corresponding document_id has finished.
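Without a callback, the check on `reviewed` can be wrapped in a simple polling loop. The sketch below is an assumption-laden illustration: `wait_until_reviewed`, the polling interval, and the attempt limit are all hypothetical, and `fetch_details` stands for whatever function you write to perform the GET and return the parsed JSON body as a dictionary.

```python
import time


def wait_until_reviewed(fetch_details, document_id, interval_s=60.0,
                        max_attempts=30, sleep=time.sleep):
    """Poll the document details until the response has reviewed=true.

    fetch_details is a caller-supplied (hypothetical) function that takes a
    document_id and returns the details response as a dict.
    Returns that dict, or None if the attempts run out first.
    """
    for attempt in range(max_attempts):
        details = fetch_details(document_id)
        if details.get("reviewed") is True:
            return details
        if attempt != max_attempts - 1:
            sleep(interval_s)  # wait before asking again
    return None
```

Because results follow the contract SLA rather than arriving immediately, a generous interval is usually more appropriate than tight polling.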
* View document's rejection
Purpose: get information about the rejection of the given document
Method: GET
Path params:
document_id: document ID (received in the response header of "Submit a file" or "Submit and annotate the given file")
Call this service when you receive a “Notification” with status=rejected (passing the corresponding document_id).
If you have no callback defined, invoke this service only when the response of "Get details of the document" has the field rejected=true.
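The choice between the two GET services can be summarized in code. This is a sketch only: the function names and the returned labels are hypothetical, and the notification is reduced to its status value because its full payload is not described here.

```python
def service_for_notification(status: str):
    """Pick the service to call after a callback notification.

    Hypothetical helper; returns a label, not a real endpoint.
    """
    if status == "annotated":
        return "get_details"      # Get details of the document
    if status == "rejected":
        return "view_rejection"   # View document's rejection
    return None                   # any other status: nothing to fetch


def service_after_details(details: dict):
    """Without a callback: inspect a details response (as a dict)."""
    if details.get("rejected") is True:
        return "view_rejection"
    return None
```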
* List the organization's documents ordered by creation time (descending) or review time (descending)
Purpose: get the extracted data for a list of documents
Method: GET
Parameters:
Request (Best practices)
Check the pagination logic described here
- Use the default value for the number of documents retrieved (maxRetrieve)
- If you only want the reviewed documents (excluding the rejected ones), pass the parameter reviewed with the value “only”
- Use the parameter reviewed_after (returns documents reviewed after the given date)
- Save the review_date value of the first document received, to be used in the next request (except when requesting the next page, where the query parameter values of the original request must be kept, except cursorId)
- Pass the parameter assets=1 if you want the response to include the endpoint to the original document. Otherwise pass assets=0. This link (document) has a short lifetime (around 72 hours)
Response
- In the response, check the field rejected: if its value is true, the document was rejected
- In these cases, call "View document's rejection" to obtain details about the rejection
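The request best practices above can be sketched as a small query-string builder. The parameter names `reviewed`, `reviewed_after`, `assets`, and `cursorId` come from the text; the function name, its arguments, and the date value format are assumptions (the date is passed through as given, since its expected format is not stated here). `maxRetrieve` is deliberately left out so that the default applies.

```python
from urllib.parse import urlencode


def list_documents_query(reviewed_after=None, only_reviewed=True,
                         include_assets=False, cursor_id=None) -> str:
    """Build the query string for the list service (hypothetical helper)."""
    params = {}
    if only_reviewed:
        # Reviewed documents only, excluding the rejected ones.
        params["reviewed"] = "only"
    if reviewed_after is not None:
        # Typically the review_date saved from the previous request.
        params["reviewed_after"] = reviewed_after
    # assets=1 adds the (short-lived) link to the original document.
    params["assets"] = "1" if include_assets else "0"
    if cursor_id is not None:
        # Only this value changes when asking for the next page.
        params["cursorId"] = cursor_id
    return urlencode(params)
```

When requesting the next page, call the builder with the same arguments as the original request and change only `cursor_id`.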
This service can also be used to check whether there is new information:
- Call the service with the same parameters/values you would use for the GET, but use the HEAD method instead
- The HEAD method is identical to GET except that the server does not return a message body in the response
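A HEAD request for the list service could be prepared as in the sketch below, using only the Python standard library. The function name is hypothetical and the URL must be supplied by the caller, since the endpoint is not given here. How the response signals new information is not described in this section, so the sketch stops at building the request.

```python
import urllib.request


def build_head_request(url: str, api_key: str) -> urllib.request.Request:
    """Prepare a HEAD request with the same URL and query as the GET.

    Hypothetical helper; the request is built but not sent.
    """
    return urllib.request.Request(
        url,
        headers={"Authorization": "API_KEY " + api_key},
        method="HEAD",  # same as GET, but the response has no body
    )
```

Passing the result to `urllib.request.urlopen` would send it; only the status and headers of the response carry information.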
For detailed information about the DocDigitizer PowerCapture API, please consult the API Reference v1.0