Getting Started with DocDigitizer General Rules

Does DocDigitizer extract information exactly how it is represented in the document?

No. To facilitate integration, DocDigitizer normalizes some fields based on their content and format. Format normalization applies, for example, to dates, values, and codes.

Example 1: for “Aug 15,2019” we return “15/08/2019”.
Example 2: for “CHE-114.431.157” we return “CHE114431157”.
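
The two format normalizations above can be illustrated with a minimal Python sketch. This is not DocDigitizer's actual implementation: the function names `normalize_date` and `normalize_code`, the accepted input pattern, and the rule of stripping every non-alphanumeric character are assumptions made only to reproduce Examples 1 and 2.

```python
import re
from typing import Optional

# English month abbreviations; assumed here so the sketch does not depend on locale.
MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}

def normalize_date(text: str) -> Optional[str]:
    """Turn a 'Mon D,YYYY' style date into DD/MM/YYYY (hypothetical helper)."""
    match = re.fullmatch(
        r"\s*([A-Za-z]{3})[a-z]*\.?\s+(\d{1,2}),\s*(\d{4})\s*", text
    )
    if match is None:
        return None
    month = MONTHS.get(match.group(1).lower())
    if month is None:
        return None
    day, year = int(match.group(2)), int(match.group(3))
    return f"{day:02d}/{month:02d}/{year:04d}"

def normalize_code(text: str) -> str:
    """Drop separators such as '-' and '.' from a code (hypothetical helper)."""
    return re.sub(r"[^A-Za-z0-9]", "", text)

print(normalize_date("Aug 15,2019"))      # 15/08/2019
print(normalize_code("CHE-114.431.157"))  # CHE114431157
```

On the integration side, the practical consequence is that consumers should compare against the normalized form rather than the text as printed on the document.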

Content normalization applies to fields where DocDigitizer uses third-party databases to ensure consistency.

Example 3: for “ABC-CBA & CIA” we may return “ABC.CBA & Company”, where the returned value is the content of a trusted external database.

Also, for documents of the subtype credit note, we normalize the values from negative to positive.
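
The credit note rule can be sketched as follows. The function name `normalize_amount` and the subtype identifier `"credit_note"` are hypothetical and chosen only for illustration; the real field names and subtype values may differ.

```python
from decimal import Decimal

def normalize_amount(value: Decimal, subtype: str) -> Decimal:
    """Return a positive amount for credit notes; leave other subtypes untouched."""
    # "credit_note" is an assumed identifier for the credit note subtype.
    if subtype == "credit_note":
        return abs(value)
    return value

print(normalize_amount(Decimal("-120.50"), "credit_note"))  # 120.50
```

A consumer that needs signed totals would therefore have to apply the sign itself, based on the document subtype.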

Does DocDigitizer perform any validation over the extracted data?

Yes. For specific field types, we ensure that the format is consistent. Dates are a good example: DocDigitizer does not return invalid dates; the field is returned as null instead.

Example 1: if the date is “01-01-0000”, we return the field empty.
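
A minimal sketch of this kind of date validation is shown below. It assumes a DD-MM-YYYY input and uses Python's `datetime.date`, which rejects year 0 and impossible days; the function name `validate_date` is hypothetical, and the actual validation rules DocDigitizer applies are not documented here.

```python
from datetime import date
from typing import Optional

def validate_date(text: str) -> Optional[str]:
    """Return the date unchanged if it is a real calendar date, else None."""
    parts = text.split("-")
    if len(parts) != 3:
        return None
    try:
        day, month, year = (int(part) for part in parts)
        date(year, month, day)  # raises ValueError for year 0, month 13, Feb 31, ...
    except ValueError:
        return None
    return text

print(validate_date("01-01-0000"))  # None
print(validate_date("15-08-2019"))  # 15-08-2019
```

Integrations should therefore treat a date field as optional and handle the missing value explicitly.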

Does DocDigitizer apply mathematical calculations to the output fields?

No. DocDigitizer only extracts data that is present in the document.