Instructions for AI Agents
This documentation is optimized for consumption by AI agents including Claude, GPT, Cursor, GitHub Copilot, and others.
Before Making Any API Calls
Prerequisites Checklist
Before your user can make API calls, verify these requirements:
| Requirement | How to Check | If Missing |
|---|---|---|
| API Credentials | Ask user if they have received API credentials from DocDigitizer | Direct to: https://www.docdigitizer.com/contact/ to request access |
Note: DocDigitizer is currently in early access. Self-service signup is coming soon. For now, API credentials are provided upon request.
Decision Tree
START
|
+-- Does user have API credentials from DocDigitizer?
|   +-- NO --> Direct to https://www.docdigitizer.com/contact/ to request access
|   +-- YES --> Continue
|
+-- Is the API key for the correct product?
|   +-- Document Extraction: Use for sync API endpoints
|   +-- Schema Management: Use for ontology API endpoints
|
+-- READY --> Proceed with API calls
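The decision tree above can be sketched as a small function an agent might run before its first call. The function name, the product labels used as dictionary keys, and the returned strings are illustrative assumptions, not part of the DocDigitizer API.

```python
# Hypothetical helper mirroring the decision tree; names are illustrative only.
CONTACT_URL = "https://www.docdigitizer.com/contact/"

# Product label -> API family the key is meant for (taken from the tree above).
PRODUCT_ENDPOINTS = {
    "Document Extraction": "sync API endpoints",
    "Schema Management": "ontology API endpoints",
}


def next_step(has_credentials: bool, product: str) -> str:
    """Return the next action to take before making any API call."""
    if not has_credentials:
        return f"Direct user to {CONTACT_URL} to request access"
    if product not in PRODUCT_ENDPOINTS:
        return "Ask user which product the API key was issued for"
    return f"READY: use the key for {PRODUCT_ENDPOINTS[product]}"


print(next_step(False, "Document Extraction"))
print(next_step(True, "Schema Management"))
```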
Authentication
All API requests require the X-API-Key header:
X-API-Key: {api_key}
Important:
- Document Extraction and Schema Management may use separate API keys
- API keys are product-specific
- Never log or expose API keys in responses to users
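One way to follow these rules is to keep one key per product outside the code and to scrub the key from any text shown to the user. The sketch below assumes hypothetical environment variable names (`DOCDIGITIZER_EXTRACTION_KEY`, `DOCDIGITIZER_SCHEMA_KEY`); only the `X-API-Key` header name comes from this page.

```python
import os

# Hypothetical variable names; one per product because keys are product-specific.
ENV_VARS = {
    "Document Extraction": "DOCDIGITIZER_EXTRACTION_KEY",
    "Schema Management": "DOCDIGITIZER_SCHEMA_KEY",
}


def load_api_key(product: str, env=os.environ) -> str:
    """Look up the key for one product and fail loudly if it is missing."""
    name = ENV_VARS[product]
    key = env.get(name, "")
    if not key:
        raise RuntimeError(f"Set {name} before calling the {product} API")
    return key


def auth_headers(api_key: str) -> dict:
    """Build the header that every request requires."""
    return {"X-API-Key": api_key}


def redact(text: str, api_key: str) -> str:
    """Remove the key from text before it is logged or shown to a user."""
    return text.replace(api_key, "[REDACTED]")
```

For example, `redact(debug_message, key)` can be applied to any debug output before it is surfaced in a response.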
Base URLs
| Product | Base URL | Description |
|---|---|---|
| Document Extraction | https://apix.docdigitizer.com/sync | Upload and process PDFs |
| Schema Management | https://api.docdigitizer.com/registry | Manage extraction schemas |
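The table above can be captured as a lookup so that endpoint paths from the quick-reference tables are always joined to the right base URL. The helper name `endpoint_url` is an assumption; treating the path `/` as the bare base URL follows the Process Document request shown on this page.

```python
# Base URLs copied from the table above.
BASE_URLS = {
    "Document Extraction": "https://apix.docdigitizer.com/sync",
    "Schema Management": "https://api.docdigitizer.com/registry",
}


def endpoint_url(product: str, path: str = "/") -> str:
    """Join a product's base URL with an endpoint path such as '/health'."""
    base = BASE_URLS[product].rstrip("/")
    suffix = path.strip("/")
    # The path "/" maps to the base URL itself, with no trailing slash.
    return f"{base}/{suffix}" if suffix else base


print(endpoint_url("Document Extraction"))
print(endpoint_url("Schema Management", "/schemas/find-best"))
```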
Quick Reference: Document Extraction (Sync API)
Common Operations
| Task | Method | Endpoint | Description |
|---|---|---|---|
| Check health | GET | / | Verify API is running |
| Process document | POST | / | Upload PDF for extraction |
Process Document Request
POST https://apix.docdigitizer.com/sync
Content-Type: multipart/form-data
X-API-Key: {api_key}
files: {PDF binary}
id: {UUID} - unique document identifier
contextID: {UUID} - context for grouping documents
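The two UUID fields can be generated per upload. The sketch below assumes that reusing one `contextID` across several uploads is how documents are grouped, which is inferred from the field description above rather than confirmed; `form_fields` is a hypothetical helper name.

```python
import uuid


def form_fields(context_id=None) -> dict:
    """Build the non-file form fields for one document upload."""
    return {
        # Every document gets its own fresh identifier.
        "id": str(uuid.uuid4()),
        # Reuse the caller's context if given, otherwise start a new one.
        "contextID": context_id or str(uuid.uuid4()),
    }


# Two documents that belong together share one context (assumed grouping behavior).
batch_context = str(uuid.uuid4())
first = form_fields(batch_context)
second = form_fields(batch_context)
```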
Process Document Response (Success)
{
  "StateText": "COMPLETED",
  "TraceId": "ABC1234",
  "NumberPages": 2,
  "Output": {
    "extractions": [
      {
        "documentType": "Invoice",
        "confidence": 0.95,
        "extraction": {
          "invoiceNumber": "INV-2024-001",
          "totalAmount": 1250.00
        }
      }
    ]
  }
}
Process Document Response (Error)
{
  "StateText": "ERROR",
  "TraceId": "XYZ7890",
  "Messages": ["Error description"]
}
Quick Reference: Schema Management (Ontology API)
Common Operations
| Task | Method | Endpoint | Description |
|---|---|---|---|
| Check health | GET | /health | Verify API and database status |
| Get reference data | GET | /reference-data | List all doc types and countries |
| List doc types | GET | /doc-types | List active document types |
| List countries | GET | /countries | List active countries |
| Find best schema | POST | /schemas/find-best | Get matching extraction schema |
Find Best Schema Request
POST https://api.docdigitizer.com/registry/schemas/find-best
Content-Type: application/json
X-API-Key: {api_key}
{
  "docTypeCode": "Invoice",
  "countryCode": "PT"
}
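The request above could be assembled in Python roughly as follows. This is a sketch using only the standard library; it builds the request without sending it, because the response shape for this endpoint is not described on this page. The function name and the placeholder key are assumptions.

```python
import json
import urllib.request

FIND_BEST_URL = "https://api.docdigitizer.com/registry/schemas/find-best"


def build_find_best_request(api_key: str, doc_type_code: str, country_code: str):
    """Prepare (but do not send) a Find Best Schema request."""
    body = json.dumps(
        {"docTypeCode": doc_type_code, "countryCode": country_code}
    ).encode("utf-8")
    return urllib.request.Request(
        FIND_BEST_URL,
        data=body,
        method="POST",
        # urllib normalizes header capitalization; HTTP header names are case-insensitive.
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )


# Placeholder key for illustration; use the Schema Management key in practice.
request = build_find_best_request("your-schema-management-key", "Invoice", "PT")
# To send it: urllib.request.urlopen(request)
```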
Error Handling
All errors follow this format:
{
  "StateText": "ERROR",
  "TraceId": "unique-trace-id",
  "Messages": ["Error description"]
}
When encountering errors:
- Check the StateText field; if it is "ERROR", handle the error
- Use the Messages array to explain the issue to the user
- Provide the TraceId if the user needs to contact support
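The three steps above can be sketched as a single function that either returns the extractions or raises an exception carrying the messages and trace ID. The exception class and function name are hypothetical; the field names come from the response examples on this page.

```python
class DocDigitizerError(Exception):
    """Hypothetical exception type carrying the documented error fields."""

    def __init__(self, messages, trace_id):
        super().__init__("; ".join(messages) or "Unknown error")
        self.messages = messages
        self.trace_id = trace_id


def extractions_or_raise(result: dict) -> list:
    """Return the extractions list, or raise if StateText is "ERROR"."""
    if result.get("StateText") == "ERROR":
        # Messages explains the problem; TraceId is what support will ask for.
        raise DocDigitizerError(result.get("Messages", []), result.get("TraceId"))
    return result.get("Output", {}).get("extractions", [])
```

A caller can catch `DocDigitizerError`, show `str(error)` to the user, and quote `error.trace_id` when contacting support.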
Common Errors
| Status | Error Type | Likely Cause | Resolution |
|---|---|---|---|
| 400 | Bad Request | Invalid parameters or file | Check request format and file type |
| 401 | Unauthorized | Missing or invalid API key | Verify API key is correct |
| 403 | Forbidden | Insufficient permissions | Check subscription/plan |
| 404 | Not Found | Resource doesn't exist | Verify ID is correct |
| 500 | Server Error | Processing failed | Retry, contact support if persistent |
| 503 | Service Unavailable | Service down | Wait and retry |
| 504 | Timeout | Processing too long | Retry, consider smaller files |
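The resolutions in the table suggest that only 500, 503, and 504 are worth retrying. A minimal sketch of that policy follows; the exponential backoff schedule, the attempt count, and the helper names are assumptions, since this page does not specify retry timing.

```python
# Status codes whose resolution in the table above includes "retry".
RETRYABLE_STATUSES = {500, 503, 504}


def is_retryable(status: int) -> bool:
    """True when the table recommends retrying this status code."""
    return status in RETRYABLE_STATUSES


def backoff_delays(max_attempts: int = 3, base_seconds: float = 1.0) -> list:
    """Assumed schedule: wait base_seconds, then double before each further retry."""
    return [base_seconds * 2 ** attempt for attempt in range(max_attempts)]


print(is_retryable(401))
print(backoff_delays())
```

Client errors such as 400, 401, 403, and 404 are not retried here because the table points to fixing the request, key, plan, or ID instead.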
Human Setup Required
These actions CANNOT be performed via API and require human intervention:
- Request API Access - User must contact DocDigitizer at https://www.docdigitizer.com/contact/
- Receive Credentials - DocDigitizer team will provision and securely deliver API keys
Coming Soon: Self-service portal for account creation, API key management, and subscription selection. For now, all access is granted upon request.
When a user needs to obtain access, direct them to the contact form with a description of their use case.
Supported Document Types
The Sync API can process and extract data from:
| Document Type | Description | Common Fields Extracted |
|---|---|---|
| Invoice | Commercial invoices | Invoice number, date, vendor, amounts, line items |
| Receipt | POS receipts | Date, merchant, total, items |
| Contract | Legal agreements | Parties, dates, terms |
| CV/Resume | Employment documents | Name, experience, skills, education |
| ID Document | Identity documents | Name, DOB, ID number, expiration |
| Bank Statement | Financial records | Account info, transactions, balances |
Rate Limits
Rate limits are not currently enforced. For high-volume use cases, contact [email protected].
Code Examples
Python - Process Document
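import os
import uuid

import requests

# Read the Document Extraction key from the environment (variable name is an example).
API_KEY = os.environ["DOCDIGITIZER_API_KEY"]

# Open the PDF in a context manager so the file handle is always closed.
with open("document.pdf", "rb") as pdf:
    response = requests.post(
        "https://apix.docdigitizer.com/sync",
        headers={"X-API-Key": API_KEY},
        files={"files": pdf},
        data={
            "id": str(uuid.uuid4()),         # unique document identifier
            "contextID": str(uuid.uuid4()),  # context for grouping documents
        },
    )

result = response.json()
if result["StateText"] == "COMPLETED":
    for extraction in result["Output"]["extractions"]:
        print(f"Type: {extraction['documentType']}")
        print(f"Data: {extraction['extraction']}")
else:
    # On failure, surface the messages and the TraceId for support.
    print("Error:", result.get("Messages"), "TraceId:", result.get("TraceId"))
JavaScript - Process Document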
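const FormData = require('form-data');
const fs = require('fs');
const axios = require('axios');
const { v4: uuidv4 } = require('uuid');

// Read the Document Extraction key from the environment (variable name is an example).
const API_KEY = process.env.DOCDIGITIZER_API_KEY;

// CommonJS modules do not support top-level await, so wrap the call in an async function.
async function processDocument() {
  const formData = new FormData();
  formData.append('files', fs.createReadStream('document.pdf'));
  formData.append('id', uuidv4());
  formData.append('contextID', uuidv4());

  const response = await axios.post(
    'https://apix.docdigitizer.com/sync',
    formData,
    { headers: { 'X-API-Key': API_KEY, ...formData.getHeaders() } }
  );

  if (response.data.StateText === 'COMPLETED') {
    response.data.Output.extractions.forEach(e => {
      console.log('Type:', e.documentType);
      console.log('Data:', e.extraction);
    });
  }
}

processDocument().catch(err => console.error('Request failed:', err.message));
Support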
- Documentation: https://developers.docdigitizer.com/
- Request Access: https://www.docdigitizer.com/contact/
- Email: [email protected]
- Include TraceId when reporting issues
