HTTP-Based, RESTful, with SSL

We attempt to follow the principles of Representational State Transfer (REST). This means the DocDigitizer PowerCapture API stores neither state nor sessions. Most endpoints use the JSON data format for requests and responses.

We leverage the verbosity of the HTTP protocol. Methods that retrieve data require a GET request; methods that send data might require a PUT or a POST.
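
As a rough illustration of the method convention above, the sketch below builds (but does not send) one GET and one POST request with a JSON body using Python's standard library. The base URL, the paths, and the payload are hypothetical placeholders, not real PowerCapture endpoints; take the actual ones from the API reference.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the real one from the API reference.
BASE_URL = "https://api.example.com"


def build_request(method, path, payload=None):
    """Build (but do not send) a request that exchanges JSON with the API."""
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        # Requests that send data carry a JSON-encoded body.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Retrieving data uses GET; sending data uses POST (or PUT).
# Both paths and the payload are made up for illustration.
get_req = build_request("GET", "/documents/123")
post_req = build_request("POST", "/documents", {"name": "receipt.pdf"})
```

Because the API keeps no session, each request built this way has to carry everything the server needs to process it.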

All communication with the API must be made over SSL/TLS. This is extremely important.

By design:
We do not accept any insecure HTTP communication, either inbound or outbound.
We do not accept cryptographic protocols that are known to be vulnerable or to use weak ciphers. Specifically, we reject all versions of SSL (Secure Sockets Layer) and TLS (Transport Layer Security) versions 1.0 and 1.1.
We do not accept weak ciphers on TLS version 1.2. More specifically, we only accept the following ciphers:
Enabled features
We plan to decommission TLS 1.2 and accept only connections using TLS version 1.3 (or above). Unfortunately, we have not been able to take that step yet because of legacy systems at some of our customers and in a few of the cloud services that we use. We apologize for that.
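
On the client side, the protocol rules above can be mirrored by refusing anything older than TLS 1.2 before a connection is even attempted. The following is a minimal sketch using Python's standard `ssl` module; the function name is our own, and the sketch does not restrict the cipher list, since the accepted ciphers are not reproduced here.

```python
import ssl


def make_tls_context():
    """Create a client TLS context that refuses SSL, TLS 1.0 and TLS 1.1."""
    # The default context verifies the server certificate and hostname.
    context = ssl.create_default_context()
    # Reject every protocol version below TLS 1.2; TLS 1.3 is still
    # negotiated when both sides support it.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls_context()
```

The resulting context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=context)` or `http.client.HTTPSConnection(host, context=context)`.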


As mentioned in the Guidelines, the Document Classification and Data Extraction tasks can be performed by humans and by Machine Learning.

It is possible to configure some fields or document classifications to be handled by Machine Learning only.
In these cases, it is important to take into account the returned confidence value, which indicates how accurate the Machine Learning is when extracting a certain field (e.g. Social Security Number) or classifying a certain type of document (e.g. Citizen Card).

"features": {
    "5fc854bd-9608-4012-beec-16c52947d4ef": {
        "uuid": "5fc854bd-9608-4012-beec-16c52947d4ef",
        "name": "atm_receipt",
        "label": "ATM Receipt",
        "createdAt": "2021-12-09T16:02:34.173",
        "origin": 20,
        "originText": "MachineLearning",
        "confidence": 0.9975,
        "fields": {
            "66e356c3-df7f-4eb1-add9-b682fa9e2bd7": {
                "uuid": "66e356c3-df7f-4eb1-add9-b682fa9e2bd7",
                "name": "bank_card_last_digits",
                "label": "Últimos 4 dígitos de multibanco/cartão",
                "dataType": 10,
                "dataTypeText": "String",
                "createdAt": "2021-12-09T16:02:34.173",
                "confidence": 0.9975,
                "fieldValue": "6758",
                "origin": 20,
                "originText": "MachineLearning",
                "annotationInformation": null
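
One way to act on these confidence values is to flag Machine Learning results that fall below a chosen threshold, for example to route them to manual review. The sketch below assumes the "features" object has already been parsed into a Python dict with the keys shown in the excerpt above. The helper names and the 0.95 threshold are illustrative assumptions, not values recommended by the API.

```python
# Illustrative threshold (an assumption); tune it to your own accuracy needs.
CONFIDENCE_THRESHOLD = 0.95


def is_low_confidence(item, threshold):
    """True if the item came from Machine Learning with confidence below threshold."""
    confidence = item.get("confidence")
    return (
        item.get("originText") == "MachineLearning"
        and confidence is not None
        and confidence < threshold
    )


def low_confidence_items(features, threshold=CONFIDENCE_THRESHOLD):
    """Return (kind, name, confidence) tuples for low-confidence ML results."""
    flagged = []
    for feature in features.values():
        # Document classification confidence.
        if is_low_confidence(feature, threshold):
            flagged.append(("document", feature["name"], feature["confidence"]))
        # Field extraction confidence; "fields" may be absent or null.
        for field in (feature.get("fields") or {}).values():
            if is_low_confidence(field, threshold):
                flagged.append(("field", field["name"], field["confidence"]))
    return flagged


# Trimmed-down copy of the excerpt above, keeping only the keys used here.
sample = {
    "5fc854bd-9608-4012-beec-16c52947d4ef": {
        "name": "atm_receipt",
        "originText": "MachineLearning",
        "confidence": 0.9975,
        "fields": {
            "66e356c3-df7f-4eb1-add9-b682fa9e2bd7": {
                "name": "bank_card_last_digits",
                "originText": "MachineLearning",
                "confidence": 0.9975,
            }
        },
    }
}

flagged_default = low_confidence_items(sample)
flagged_strict = low_confidence_items(sample, threshold=0.999)
```

With the default threshold nothing in the sample is flagged, because both confidence values are 0.9975. Raising the threshold to 0.999 flags both the document classification and the extracted field.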

DocDigitizer PowerCapture Machine Learning automates training, generating different machine learning models in order to obtain the highest degree of confidence (accuracy of the extracted data).

To better understand how DocDigitizer PowerCapture achieves 100% accuracy through AI/ML + Human in the Loop, please see Get Started with DocDigitizer PowerCapture.