Using DocDigitizer PowerCapture API

DocDigitizer PowerCapture API - Main Services

The extraction process is asynchronous and, by default, includes “Human Validation”.
For this reason, the extracted data will not be available immediately; it is delivered within the SLA established in your contract (for Trial/Demo accounts, the default SLA is 30 minutes).

HTTP Headers & Authorization
To authorize a request, you need an API key.
The API key must be sent in every request, in the Authorization header.
The Content-Type header with the value "multipart/form-data" is needed only for the "Submit document" service, because that service sends files in the body of the request.

headers = {
  'Authorization': 'API_KEY ' + APIKEY,
  'Content-Type': 'multipart/form-data'
}
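As a minimal sketch, the two header variants described above could be produced by one small Python helper. The function name build_headers and the for_upload flag are illustrative assumptions, not part of the API. Some HTTP clients set the multipart Content-Type (including its boundary) on their own when files are attached, so check how your client handles this header.

```python
def build_headers(api_key: str, for_upload: bool = False) -> dict:
    """Return the HTTP headers for a PowerCapture request.

    Hypothetical helper: its name and the for_upload flag are assumptions.
    """
    # Every request carries the API key in the Authorization header.
    headers = {"Authorization": "API_KEY " + api_key}
    if for_upload:
        # Only the file submission service sends files in the request body.
        headers["Content-Type"] = "multipart/form-data"
    return headers
```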

* Submit a file
Purpose: submit a file to be processed. The response includes the document_id that identifies the submitted file.
Method: POST
Body Parameters:

When the request is accepted (status 202), you will receive a response similar to the following:

{
    "detail": "Accepted",
    "status": 202,
    "task": {
        "created": "2021-11-02T12:14:26.564283",
        "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
        "links": {
            "self": "/api/v1/tasks/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
            "collection": "/api/v1/tasks"
        },
        "status": "PENDING"
    }
}

To get the ID of the document submitted to the DocDigitizer platform, check the headers of the response.
Look for the X-Document-Location header.
The document ID is the GUID at the end of its URL.

"X-Document-Location": "http://api.docdigitizer.com/api/v1/documents/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
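Extracting the GUID from that header can be sketched in Python as follows. The function name is hypothetical, and the headers are assumed to be available as a plain dict keyed exactly by "X-Document-Location" (real HTTP clients usually expose a case-insensitive mapping instead).

```python
from urllib.parse import urlparse


def document_id_from_headers(headers: dict) -> str:
    """Return the document ID found at the end of the X-Document-Location URL."""
    location = headers["X-Document-Location"]
    # Keep only the path part of the URL, then take its last segment.
    path = urlparse(location).path
    return path.rstrip("/").rsplit("/", 1)[-1]
```

For example, a header value ending in /documents/0a1b2c3d-0000-1111-2222-333344445555 yields the ID 0a1b2c3d-0000-1111-2222-333344445555.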

* Submit and annotate the given file
Purpose: submit a file to be processed, indicating the class of the document. The response includes the document_id that identifies the submitted file.
Method: POST
Path Parameters:

  • document_class : optional parameter indicating the type of document being submitted
    possible values: "financial-document", "invoice", "pay-slip", "bank-statement", "citizen-id-card", ...

Body Parameters:

  • callback_url : URL of the callback service (status notification)
  • files : file to submit for data extraction

When the request is accepted (status 202), you will receive a response similar to the following:

{
    "detail": "Accepted",
    "status": 202,
    "task": {
        "created": "2021-11-02T12:14:26.564283",
        "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
        "links": {
            "self": "/api/v1/tasks/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
            "collection": "/api/v1/tasks"
        },
        "status": "PENDING"
    }
}

To get the ID of the document submitted to the DocDigitizer platform, check the headers of the response.
Look for the X-Document-Location header.
The document ID is the GUID at the end of its URL.

"X-Document-Location": "http://api.docdigitizer.com/api/v1/documents/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

* Get details of the document
Purpose: get the data extracted from the given document
Method: GET
Path params:

  • document_id : document ID (received in the response headers of "Submit a file" or "Submit and annotate the given file")
    Call this service when you receive a “Notification” with status=annotated (passing the corresponding document_id).
    If you have no callback defined, check whether processing of the document_id has finished by looking for reviewed=true in the response.
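Without a callback, the check on the reviewed field might look like the sketch below. It assumes the response body has already been parsed from JSON into a dict and that reviewed is a top-level boolean; the exact position of the field in the response is an assumption, and the function name is hypothetical.

```python
def is_extraction_finished(details: dict) -> bool:
    """Return True once the document details report reviewed=true."""
    # A missing or false "reviewed" field is treated as "not finished yet".
    return details.get("reviewed") is True
```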

* View document's rejection.
Purpose: get information about the rejection of the given document
Method: GET
Path params:

  • document_id : document ID (received in the response headers of "Submit a file" or "Submit and annotate the given file")
    Call this service when you receive a “Notification” with status=rejected (passing the corresponding document_id).
    If you have no callback defined, invoke this service only when the response of the previous service ("Get details of the document") has the field rejected=true.
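The two notification cases (status=annotated and status=rejected) can be summarized as a small lookup. This is only an illustrative sketch: the mapping and the function name are assumptions, and the service names are the headings used on this page rather than endpoint paths.

```python
from typing import Optional

# Notification status -> service to call next (names as used on this page).
SERVICE_BY_STATUS = {
    "annotated": "Get details of the document",
    "rejected": "View document's rejection",
}


def service_for_notification(status: str) -> Optional[str]:
    """Return the service to call for a notification status, or None if none applies."""
    return SERVICE_BY_STATUS.get(status)
```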

* List the organization's documents ordered by creation time (descending) or review time (descending)
Purpose: get the extracted data for a list of documents
Method: GET
Parameters:
Request (best practices)
Check the pagination logic described here

  • Use the default value for the number of documents retrieved (maxRetrieve)
  • If you only want the reviewed documents (excluding the rejected ones), pass the parameter “reviewed” with the value “only”
  • Use the parameter reviewed_after (returns the documents reviewed after the given date)
  • Save the review_date of the first document received, to be used in the next request (except when requesting the next page, where the query parameter values of the original request must be kept, except cursorId)
  • Pass the parameter assets=1 if you want the response to include the endpoint of the original document; otherwise pass assets=0
    * This link (document) has a short lifetime (around 72 hours)

Response
  • In the response, check the field rejected: the value true means that the document was rejected.
    • In these cases, call "View document's rejection" to obtain details about the rejection
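A minimal sketch of how the query parameters for this service could be assembled in Python. The parameter names reviewed, reviewed_after, assets and cursorId come from the text above; the function names, their arguments, and the date string format are assumptions. maxRetrieve is deliberately left out so that the default applies.

```python
from typing import Optional


def build_list_params(reviewed_after: Optional[str] = None,
                      only_reviewed: bool = True,
                      include_assets: bool = False) -> dict:
    """Build the query parameters for the first page of a listing request."""
    params = {"assets": 1 if include_assets else 0}
    if only_reviewed:
        # Return reviewed documents only, excluding rejected ones.
        params["reviewed"] = "only"
    if reviewed_after is not None:
        # Typically the review_date saved from the previous request.
        params["reviewed_after"] = reviewed_after
    return params


def next_page_params(original: dict, cursor_id: str) -> dict:
    """Keep the original query values and change only cursorId."""
    return {**original, "cursorId": cursor_id}
```

Keeping next_page_params separate makes it harder to change reviewed_after by accident while paging through one result set.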

This service can also be used to check whether there is new information:

  • Call the service with the same parameters and values you would use for the GET, but with the HEAD method instead
    • The HEAD method is identical to GET except that the server does not return a message body in the response
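Such a HEAD request could be prepared with Python's standard library as sketched below. The list_url argument is a placeholder for the listing endpoint, which is not given on this page, and the function name is hypothetical. How the response signals new information is not specified here, so the sketch only builds the request.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_head_request(list_url: str, params: dict, api_key: str) -> Request:
    """Prepare a HEAD request with the same query parameters as the GET."""
    url = list_url + "?" + urlencode(params)
    return Request(
        url,
        headers={"Authorization": "API_KEY " + api_key},
        method="HEAD",  # same as GET, but the server returns no body
    )
```

The prepared request can then be sent with urllib.request.urlopen, and the status and headers of the response inspected.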

For detailed information about the DocDigitizer PowerCapture API, please consult the API Reference v1.0.

