DocDigitizer PowerCapture General Rules

What kind of alphabets are supported by DocDigitizer?

DocDigitizer PowerCapture supports the following alphabets:

Latin

Does DocDigitizer support handwritten documents?

Yes. DocDigitizer extracts readable handwritten characters in the supported alphabets.

Does DocDigitizer PowerCapture extract information exactly as it appears in the document?

No. To facilitate integration, DocDigitizer PowerCapture normalizes some fields based on their content and format. Examples of format normalization include dates, values, and codes.

Example 1: Aug 15, 2019, will return 15/08/2019
Example 2: CHE-114.431.157 will return CHE114431157
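
The two format normalizations above can be illustrated with a short Python sketch. This is not DocDigitizer's implementation; the function names normalize_date and normalize_code are hypothetical, and the sketch assumes English month abbreviations and that a code is normalized by removing every non-alphanumeric character.

```python
import re
from datetime import datetime


def normalize_date(raw: str) -> str:
    """Reformat a date such as 'Aug 15, 2019' as DD/MM/YYYY (assumed input format)."""
    # %b parses abbreviated month names in the current locale (English assumed here).
    parsed = datetime.strptime(raw.strip(), "%b %d, %Y")
    return parsed.strftime("%d/%m/%Y")


def normalize_code(raw: str) -> str:
    """Remove separators such as hyphens, dots, and spaces from a code (assumed rule)."""
    return re.sub(r"[^A-Za-z0-9]", "", raw)


print(normalize_date("Aug 15, 2019"))     # 15/08/2019
print(normalize_code("CHE-114.431.157"))  # CHE114431157
```

An integration that compares extracted values with the original document may need to apply an equivalent normalization on its own side before comparing.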

An example of content normalization is a field where DocDigitizer PowerCapture uses third-party data to ensure consistency.

Example 3: ABC-CBA & CIA may be returned as ABC.CBA & Company, because the latter form comes from a trusted external database.

For documents of the subtype credit note, we also normalize values from negative to positive.
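
As a rough illustration of the credit note rule, the sketch below takes the absolute value of an amount when the subtype is a credit note. The function name normalize_amount and the subtype label "credit_note" are hypothetical and are not taken from the DocDigitizer API.

```python
from decimal import Decimal


def normalize_amount(raw: str, subtype: str) -> Decimal:
    """Return the amount, made positive for credit notes (assumed rule)."""
    amount = Decimal(raw)
    # "credit_note" is a hypothetical label for the credit note subtype.
    if subtype == "credit_note":
        return abs(amount)
    return amount


print(normalize_amount("-120.50", "credit_note"))  # 120.50
```

Consumers that expect negative totals on credit notes would therefore need to reapply the sign based on the document subtype.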

Does DocDigitizer PowerCapture perform any validation over the extracted data?

Yes. For specific field types, we ensure that the format is consistent. A good example is dates: DocDigitizer does not return invalid dates; they are returned as null.

Example 1: If the date is “01-01-0000”, we return the field empty.
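
The date check described above could look like the following Python sketch. It is only an approximation under stated assumptions: the function name validate_date is hypothetical, the input is assumed to be in DD-MM-YYYY form, and the DD/MM/YYYY output format is borrowed from the normalization example rather than confirmed for this case.

```python
import re
from datetime import date
from typing import Optional


def validate_date(raw: str) -> Optional[str]:
    """Return the date as DD/MM/YYYY, or None if it is not a real calendar date."""
    match = re.fullmatch(r"(\d{2})-(\d{2})-(\d{4})", raw.strip())
    if match is None:
        return None
    day, month, year = (int(part) for part in match.groups())
    try:
        # date() raises ValueError for impossible dates, including year 0.
        date(year, month, day)
    except ValueError:
        return None
    return f"{day:02d}/{month:02d}/{year:04d}"


print(validate_date("01-01-0000"))  # None
print(validate_date("15-08-2019"))  # 15/08/2019
```

Returning None rather than the raw text keeps an invalid value from being passed downstream as if it were a usable date.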

Does DocDigitizer PowerCapture apply mathematical calculations to the fields’ output?

No. DocDigitizer PowerCapture only extracts data that is present in the document.

Does DocDigitizer PowerCapture handle multi-page documents?

Yes. DocDigitizer PowerCapture can extract data from all pages of a submitted document. It is also possible to configure which pages contain relevant data to be extracted (v2 only).

Does DocDigitizer process different documents in the same file (PDF)?

No. We process the first supported document and all pages of that document, and ignore any other documents in the file.