Your First API Call
Let's extract data from a PDF document using the DocDigitizer Sync API.
Prerequisites
Before making your first API call, ensure you have:
- API credentials: request access if you don't have them yet
- A PDF document to process
Quick Start
Step 1: Health Check
First, verify the API is available:
curl https://apix.docdigitizer.com/sync
Response:
I am alive
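The same check can be scripted, for example as a startup probe before submitting documents. The following is a minimal Python sketch using only the standard library. The helper names `looks_alive` and `check_health`, the 5-second timeout, and the assumption that a healthy service answers HTTP 200 with exactly the text shown above are illustrative choices, not documented API behavior.

```python
import urllib.request

HEALTH_URL = "https://apix.docdigitizer.com/sync"


def looks_alive(status_code: int, body: str) -> bool:
    """Return True when a response matches the health check shown above.

    Assumption: a healthy service answers HTTP 200 with the text "I am alive".
    """
    return status_code == 200 and body.strip() == "I am alive"


def check_health(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Send a GET request to the endpoint and report whether it looks healthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return looks_alive(resp.status, body)
    except OSError:
        # Covers DNS failures, refused connections, timeouts, and HTTP errors.
        return False
```

Calling `check_health()` returns `False` rather than raising when the service cannot be reached, which keeps the probe easy to use in a retry loop.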
Step 2: Process a Document
Upload a PDF for processing:
curl -X POST https://apix.docdigitizer.com/sync \
  -H "X-API-Key: your-api-key-here" \
  -F "files=@invoice.pdf" \
  -F "id=$(uuidgen)" \
  -F "contextID=$(uuidgen)"
Step 3: Receive Results
The API responds synchronously, returning the extracted data in the same HTTP response:
{
  "StateText": "COMPLETED",
  "TraceId": "ABC1234",
  "Pipeline": "MainPipelineWithOCR",
  "NumberPages": 2,
  "Output": {
    "extractions": [
      {
        "documentType": "Invoice",
        "confidence": 0.95,
        "countryCode": "PT",
        "pageRange": {
          "start": 1,
          "end": 2
        },
        "extraction": {
          "invoiceNumber": "INV-2024-001",
          "invoiceDate": "2024-01-15",
          "vendorName": "Acme Corp",
          "vendorNIF": "123456789",
          "totalAmount": 1250.00,
          "currency": "EUR",
          "lineItems": [
            {
              "description": "Product A",
              "quantity": 2,
              "unitPrice": 500.00,
              "total": 1000.00
            }
          ]
        }
      }
    ]
  },
  "Timers": {
    "DocIngester": {
      "total": 2345.67
    }
  }
}
Request Parameters
| Parameter | Required | Description |
|---|---|---|
| files | Yes | The PDF file to process |
| id | Yes | Unique document ID (UUID), used for tracking |
| contextID | Yes | Context ID (UUID), used to group related documents |
| pipelineIdentifier | No | Specific pipeline to use |
| requestToken | No | Custom trace token |
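To make the table concrete, here is a small Python sketch that assembles the non-file form fields. The helper name `build_form_fields` is hypothetical, and reusing one `contextID` across a batch is only an illustration of "group related documents", not confirmed API semantics.

```python
import uuid
from typing import Dict, Optional


def build_form_fields(
    context_id: Optional[str] = None,
    pipeline_identifier: Optional[str] = None,
    request_token: Optional[str] = None,
) -> Dict[str, str]:
    """Build the form fields that accompany the uploaded file."""
    fields = {
        "id": str(uuid.uuid4()),  # required: a fresh UUID per document
        "contextID": context_id or str(uuid.uuid4()),  # required
    }
    # Validate early so a malformed context ID fails before the upload.
    uuid.UUID(fields["contextID"])
    # Optional parameters are only sent when the caller provides them.
    if pipeline_identifier is not None:
        fields["pipelineIdentifier"] = pipeline_identifier
    if request_token is not None:
        fields["requestToken"] = request_token
    return fields


# Two documents that belong together share one context ID (assumed usage).
batch_context = str(uuid.uuid4())
first = build_form_fields(context_id=batch_context)
second = build_form_fields(context_id=batch_context)
```

The returned dictionary can be passed as the `data=` argument of `requests.post`, as in the Python example under Code Examples.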
Understanding the Response
StateText
| Value | Meaning |
|---|---|
| COMPLETED | Document processed successfully |
| ERROR | Processing failed; check Messages |
TraceId
A unique 7-character identifier for this request. Provide this when contacting support.
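One way to act on StateText and TraceId together is to turn any non-COMPLETED result into an exception that carries the trace ID, so it ends up in logs and support tickets. This is a sketch only: `ProcessingError` and `require_completed` are hypothetical names, and the shape of the `Messages` field is not specified here, so it is passed through untouched.

```python
class ProcessingError(Exception):
    """Raised when the API reports a non-COMPLETED state (hypothetical helper)."""

    def __init__(self, state, trace_id, messages):
        super().__init__(f"state={state} trace_id={trace_id} messages={messages}")
        self.state = state
        self.trace_id = trace_id
        self.messages = messages


def require_completed(result: dict) -> dict:
    """Return the result unchanged if COMPLETED, otherwise raise ProcessingError."""
    state = result.get("StateText")
    if state == "COMPLETED":
        return result
    # .get() tolerates responses where TraceId or Messages is absent.
    raise ProcessingError(state, result.get("TraceId"), result.get("Messages"))
```

Callers can then wrap `require_completed(response.json())` in a `try`/`except ProcessingError` block and log `exc.trace_id` before escalating.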
Output.extractions
Array of extracted documents. A multi-page PDF might contain multiple documents (e.g., several invoices).
Each extraction includes:
- documentType: Detected document type (Invoice, Receipt, etc.)
- confidence: Classification confidence (0-1)
- countryCode: Detected country
- pageRange: Which pages this extraction covers
- extraction: The actual extracted fields
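The fields listed above can be walked as in the following Python sketch, which summarizes each extraction and flags low-confidence classifications for manual review. The helper name `summarize_extractions` and the 0.8 threshold are arbitrary illustrations, and the page range is assumed to be 1-based and inclusive, as the sample response suggests.

```python
from typing import Dict, List


def summarize_extractions(result: dict, min_confidence: float = 0.8) -> List[Dict]:
    """Summarize each extraction and flag low-confidence classifications."""
    summaries = []
    for item in result.get("Output", {}).get("extractions", []):
        pages = item["pageRange"]
        summaries.append({
            "type": item["documentType"],
            "country": item["countryCode"],
            # Assumption: start and end are 1-based and inclusive.
            "pages": list(range(pages["start"], pages["end"] + 1)),
            "needs_review": item["confidence"] < min_confidence,
            "fields": item["extraction"],
        })
    return summaries


# Trimmed copy of the sample response from Step 3.
sample = {
    "StateText": "COMPLETED",
    "Output": {"extractions": [{
        "documentType": "Invoice",
        "confidence": 0.95,
        "countryCode": "PT",
        "pageRange": {"start": 1, "end": 2},
        "extraction": {"invoiceNumber": "INV-2024-001", "totalAmount": 1250.00},
    }]},
}

for summary in summarize_extractions(sample):
    print(summary["type"], summary["pages"], summary["needs_review"])
# prints: Invoice [1, 2] False
```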
Timers
Processing time breakdown in milliseconds. Useful for performance monitoring.
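For monitoring, the nested Timers object can be flattened into metric-style names. The sketch below assumes that timers may nest to any depth and that every leaf is a number of milliseconds; only the `DocIngester.total` entry is taken from the sample response, and `flatten_timers` is a hypothetical helper.

```python
from typing import Dict


def flatten_timers(timers: dict, prefix: str = "") -> Dict[str, float]:
    """Flatten nested timer objects into dotted names mapped to milliseconds."""
    flat = {}
    for name, value in timers.items():
        key = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            # Recurse into nested components, extending the dotted name.
            flat.update(flatten_timers(value, key))
        else:
            flat[key] = float(value)
    return flat


timers = {"DocIngester": {"total": 2345.67}}  # from the sample response
for name, millis in flatten_timers(timers).items():
    print(f"{name}: {millis / 1000:.2f} s")
# prints: DocIngester.total: 2.35 s
```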
Code Examples
Python
import requests
import uuid

API_KEY = "your-api-key-here"
BASE_URL = "https://apix.docdigitizer.com/sync"

# Generate unique IDs
document_id = str(uuid.uuid4())
context_id = str(uuid.uuid4())

# Upload document
with open("invoice.pdf", "rb") as f:
    response = requests.post(
        BASE_URL,
        headers={"X-API-Key": API_KEY},
        files={"files": f},
        data={
            "id": document_id,
            "contextID": context_id
        }
    )

result = response.json()

if result["StateText"] == "COMPLETED":
    for extraction in result["Output"]["extractions"]:
        print(f"Document Type: {extraction['documentType']}")
        print(f"Confidence: {extraction['confidence']}")
        print(f"Extracted Data: {extraction['extraction']}")
else:
    print(f"Error: {result['Messages']}")
JavaScript
const FormData = require('form-data');
const fs = require('fs');
const axios = require('axios');
const { v4: uuidv4 } = require('uuid');

const API_KEY = 'your-api-key-here';
const BASE_URL = 'https://apix.docdigitizer.com/sync';

async function processDocument(filePath) {
  const formData = new FormData();
  formData.append('files', fs.createReadStream(filePath));
  formData.append('id', uuidv4());
  formData.append('contextID', uuidv4());

  const response = await axios.post(BASE_URL, formData, {
    headers: {
      'X-API-Key': API_KEY,
      ...formData.getHeaders()
    }
  });

  const result = response.data;

  if (result.StateText === 'COMPLETED') {
    result.Output.extractions.forEach(extraction => {
      console.log('Document Type:', extraction.documentType);
      console.log('Confidence:', extraction.confidence);
      console.log('Extracted Data:', extraction.extraction);
    });
  } else {
    console.error('Error:', result.Messages);
  }
}

processDocument('invoice.pdf');
C#
using System;
using System.IO;
using System.Net.Http;
using System.Text.Json;

var apiKey = "your-api-key-here";
var baseUrl = "https://apix.docdigitizer.com/sync";

using var client = new HttpClient();
client.DefaultRequestHeaders.Add("X-API-Key", apiKey);

// Build the multipart form: the PDF plus the two required UUID fields
using var content = new MultipartFormDataContent();
content.Add(new StreamContent(File.OpenRead("invoice.pdf")), "files", "invoice.pdf");
content.Add(new StringContent(Guid.NewGuid().ToString()), "id");
content.Add(new StringContent(Guid.NewGuid().ToString()), "contextID");

// Upload the document and parse the JSON response
var response = await client.PostAsync(baseUrl, content);
var json = await response.Content.ReadAsStringAsync();
var result = JsonSerializer.Deserialize<JsonElement>(json);

if (result.GetProperty("StateText").GetString() == "COMPLETED")
{
    var extractions = result.GetProperty("Output").GetProperty("extractions");
    foreach (var extraction in extractions.EnumerateArray())
    {
        Console.WriteLine($"Document Type: {extraction.GetProperty("documentType")}");
        Console.WriteLine($"Confidence: {extraction.GetProperty("confidence")}");
    }
}
