Schema Management Quickstart
Learn how to manage extraction schemas using the DocDigitizer Ontology API.
Overview
The Schema Management API allows you to:
- Browse available document types and countries
- Find the best schema for a document type/country combination
- Create custom extraction schemas
- Manage schema versions and lifecycle
Prerequisites
- DocDigitizer account with API key
- API key with Schema Management permissions
Understanding Schemas
What is a Schema?
A schema defines which fields to extract from a document type. For example, an Invoice schema might specify:
{
"type": "object",
"properties": {
"invoiceNumber": { "type": "string" },
"invoiceDate": { "type": "string", "format": "date" },
"totalAmount": { "type": "number" },
"vendorName": { "type": "string" },
"lineItems": {
"type": "array",
"items": {
"type": "object",
"properties": {
"description": { "type": "string" },
"quantity": { "type": "number" },
"unitPrice": { "type": "number" }
}
}
}
}
}
Schema Selection
When processing a document, the system selects the best schema based on:
- Customer schemas - Your private schemas (if any)
- Public schemas with country match - e.g., "Invoice + Portugal"
- Generic public schemas - e.g., "Invoice" without country
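The selection rules above can be sketched client-side as follows. This is an illustration only: it assumes the three sources are tried in the order listed and that the candidate list already contains only your own private schemas. The `pick_schema` helper is hypothetical, and the service's actual selection logic may differ.

```python
from typing import Optional


def pick_schema(schemas: list, doc_type: str, country: Optional[str]) -> Optional[dict]:
    """Return the first schema matching the listed precedence, or None.

    Hypothetical helper: illustrates the order described above, not the
    service's real implementation.
    """
    candidates = [s for s in schemas if s.get("docTypeCode") == doc_type]

    # 1. Customer (private) schemas come first.
    for s in candidates:
        if s.get("visibility") == "private":
            return s
    # 2. Public schemas that match the requested country.
    for s in candidates:
        if s.get("visibility") == "public" and country and s.get("countryCode") == country:
            return s
    # 3. Generic public schemas with no country.
    for s in candidates:
        if s.get("visibility") == "public" and not s.get("countryCode"):
            return s
    return None
```

In practice the find-best endpoint performs this selection for you; the sketch is only meant to make the precedence concrete.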
Quick Start
Get Reference Data
List all available document types and countries:
curl https://api.docdigitizer.com/registry/reference-data \
-H "X-API-Key: your-api-key"
Response:
{
"docTypes": [
{ "code": "Invoice", "name": "Commercial Invoice", "isActive": true },
{ "code": "Receipt", "name": "Point-of-Sale Receipt", "isActive": true },
{ "code": "Contract", "name": "Legal Contract", "isActive": true }
],
"countries": [
{ "code": "PT", "name": "Portugal", "isActive": true },
{ "code": "US", "name": "United States", "isActive": true },
{ "code": "GB", "name": "United Kingdom", "isActive": true }
]
}
Find Best Schema
Get the appropriate schema for a document type:
curl -X POST https://api.docdigitizer.com/registry/schemas/find-best \
-H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{
"docTypeCode": "Invoice",
"countryCode": "PT"
}'
Response:
{
"schema": {
"publicId": "sch_abc123xyz789",
"publicVersionId": "schv_def456uvw012",
"name": "Invoice Portugal",
"version": 2,
"status": "active",
"docTypeCode": "Invoice",
"countryCode": "PT",
"content": {
"type": "object",
"properties": {
"invoiceNumber": { "type": "string" },
"nif": { "type": "string" },
"totalAmount": { "type": "number" }
}
}
},
"matchType": "exact"
}
Match Types
| Match Type | Description |
|---|---|
| exact | Found schema matching both docType and country |
| fallback | Found generic schema (docType only, no country) |
| null | No matching schema found |
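A caller will usually want to branch on the match type. The sketch below shows one way to do that on an already-parsed find-best response. It assumes that a missing match arrives as a JSON null for `schema` or `matchType`; the `describe_match` helper is hypothetical.

```python
def describe_match(result: dict) -> str:
    """Summarise a find-best response using the match types in the table above."""
    schema = result.get("schema")
    match_type = result.get("matchType")

    # Assumption: "no match" is signalled by a null schema or null matchType.
    if schema is None or match_type is None:
        return "no schema found"
    if match_type == "exact":
        return f"exact match: {schema['name']} (v{schema['version']})"
    if match_type == "fallback":
        return f"generic fallback: {schema['name']} (v{schema['version']})"
    # Unknown values are reported rather than guessed at.
    return f"unrecognised match type: {match_type}"
```

A fallback match is a useful signal that a country-specific schema may be worth creating, since the generic schema will not include region-specific fields.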
Schema Lifecycle
Schemas go through a lifecycle:
DRAFT --> ACTIVE --> DEPRECATED
| Status | Description |
|---|---|
| draft | In development, can be modified freely |
| active | Published and in use |
| deprecated | Outdated, use newer version |
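The lifecycle can be modelled as a small state machine. This is a minimal sketch that assumes transitions only move forward along the arrow shown above; whether the API permits any other transition (for example, draft straight to deprecated) is not stated in this guide. `ALLOWED_TRANSITIONS` and `can_transition` are illustrative names.

```python
# Forward-only transitions, as drawn in the lifecycle diagram above.
ALLOWED_TRANSITIONS = {
    "draft": {"active"},
    "active": {"deprecated"},
    "deprecated": set(),
}


def can_transition(current: str, target: str) -> bool:
    """Return True if moving from `current` to `target` follows the diagram."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

A check like this can catch an invalid activate or deprecate call before it is sent.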
Creating Custom Schemas
Step 1: Create a Draft Schema
curl -X POST https://api.docdigitizer.com/registry/admin/schemas \
-H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{
"name": "Custom Invoice PT",
"description": "Custom Portuguese invoice schema",
"docTypeCode": "Invoice",
"countryCode": "PT",
"visibility": "private",
"customerId": "your-customer-uuid",
"content": {
"type": "object",
"properties": {
"invoiceNumber": { "type": "string" },
"nif": { "type": "string" },
"customField": { "type": "string" }
}
}
}'
Step 2: Test the Schema
Use the schema in document processing to verify it works correctly.
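One way to check a test run is to compare the extracted result against the draft schema locally. The sketch below is a deliberately small checker covering only the JSON Schema types used in this guide. It does not call the DocDigitizer API, the shape of the extraction result is an assumption, and it treats every listed property as expected, which is stricter than standard JSON Schema (where properties are optional unless listed under `required`).

```python
# Map JSON Schema type names to Python types (only the types used in this guide).
_TYPES = {
    "string": str,
    "number": (int, float),
    "object": dict,
    "array": list,
}


def check_against_schema(value, schema: dict, path: str = "$") -> list:
    """Return a list of problems; an empty list means the value fits the schema."""
    problems = []
    expected = _TYPES.get(schema.get("type"))
    # bool is a subclass of int in Python, so reject it explicitly.
    if expected and (not isinstance(value, expected) or isinstance(value, bool)):
        return [f"{path}: expected {schema['type']}, got {type(value).__name__}"]

    if isinstance(value, dict):
        for name, sub in schema.get("properties", {}).items():
            if name not in value:
                problems.append(f"{path}.{name}: missing")
            else:
                problems += check_against_schema(value[name], sub, f"{path}.{name}")
    elif isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            problems += check_against_schema(item, schema["items"], f"{path}[{i}]")
    return problems
```

For full validation, a dedicated JSON Schema validator is a better fit; this version only aims to flag missing or mistyped fields quickly while a schema is still a draft.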
Step 3: Activate the Schema
Replace sch_abc123 with the publicId returned when the schema was created:
curl -X POST https://api.docdigitizer.com/registry/admin/schemas/sch_abc123/activate \
-H "X-API-Key: your-api-key"
Schema Visibility
| Visibility | Who Can Use | Requirements |
|---|---|---|
| public | Everyone | Requires docTypeCode AND countryCode |
| community | Logged-in users | Future feature |
| private | Only owner | Requires customerId |
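The requirements in the table can be checked before a create request is sent. The following is a minimal sketch of such a pre-flight check; `visibility_errors` is a hypothetical helper, and the server may enforce additional rules not listed here.

```python
def visibility_errors(payload: dict) -> list:
    """Return the visibility requirements from the table that `payload` violates."""
    visibility = payload.get("visibility")
    errors = []
    if visibility == "public":
        for field in ("docTypeCode", "countryCode"):
            if not payload.get(field):
                errors.append(f"public schemas require {field}")
    elif visibility == "private":
        if not payload.get("customerId"):
            errors.append("private schemas require customerId")
    elif visibility == "community":
        errors.append("community visibility is listed as a future feature")
    else:
        errors.append(f"unknown visibility: {visibility!r}")
    return errors
```

An empty list means the payload satisfies the documented requirements for its visibility level.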
Versioning
When you update an active schema, a new version is created:
# Update an active schema - creates version 2
curl -X PATCH https://api.docdigitizer.com/registry/admin/schemas/sch_abc123 \
-H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{
"content": {
"type": "object",
"properties": {
"invoiceNumber": { "type": "string" },
"nif": { "type": "string" },
"newField": { "type": "string" }
}
}
}'
Get All Versions
curl https://api.docdigitizer.com/registry/admin/schemas/sch_abc123/versions \
-H "X-API-Key: your-api-key"
Code Examples
Python - Find and Use Schema
import requests

API_KEY = "your-api-key"
BASE_URL = "https://api.docdigitizer.com/registry"

# Find the best schema for a Portuguese invoice
response = requests.post(
    f"{BASE_URL}/schemas/find-best",
    headers={
        "X-API-Key": API_KEY,
        "Content-Type": "application/json"
    },
    json={
        "docTypeCode": "Invoice",
        "countryCode": "PT"
    }
)
response.raise_for_status()  # stop early on HTTP errors (4xx/5xx)
result = response.json()

# "schema" is empty when no matching schema was found
if result.get("schema"):
    schema = result["schema"]
    print(f"Found schema: {schema['name']} (v{schema['version']})")
    print(f"Match type: {result['matchType']}")
    print(f"Fields: {list(schema['content']['properties'].keys())}")
else:
    print("No matching schema found")
JavaScript - Create Custom Schema
const axios = require('axios');

const API_KEY = 'your-api-key';
const BASE_URL = 'https://api.docdigitizer.com/registry';

async function createSchema() {
  const response = await axios.post(
    `${BASE_URL}/admin/schemas`,
    {
      name: 'Custom Invoice Schema',
      docTypeCode: 'Invoice',
      countryCode: 'US',
      visibility: 'private',
      customerId: 'your-customer-uuid',
      content: {
        type: 'object',
        properties: {
          invoiceNumber: { type: 'string' },
          ein: { type: 'string' }, // US-specific field
          totalAmount: { type: 'number' }
        }
      }
    },
    {
      headers: {
        'X-API-Key': API_KEY,
        'Content-Type': 'application/json'
      }
    }
  );
  console.log('Created schema:', response.data.publicId);
  return response.data;
}
Best Practices
- Start with public schemas - Check if a suitable public schema exists before creating custom ones
- Use country-specific schemas - They include region-specific fields (NIF for Portugal, EIN for US, etc.)
- Test before activating - Thoroughly test draft schemas before activation
- Version carefully - Each update to an active schema creates a new version
- Deprecate gracefully - When replacing a schema, deprecate the old one rather than deleting
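To support the "version carefully" advice, it can help to see exactly which fields an update would change before sending it. The sketch below compares the top-level properties of two schema contents. `diff_fields` is a hypothetical helper that runs entirely on local data and does not look inside nested objects or arrays.

```python
def diff_fields(old_content: dict, new_content: dict) -> dict:
    """Compare the top-level properties of two schema contents."""
    old = old_content.get("properties", {})
    new = new_content.get("properties", {})
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        # Present in both, but with a different definition (e.g. a changed type).
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

Removed or changed fields are the ones most likely to affect consumers of the previous version, so they deserve a closer look before the update is applied.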
API Endpoints Reference
| Endpoint | Method | Description |
|---|---|---|
| /reference-data | GET | Get all doc types and countries |
| /doc-types | GET | List active document types |
| /countries | GET | List active countries |
| /schemas/find-best | POST | Find best matching schema |
| /admin/schemas | GET | List schemas (with filters) |
| /admin/schemas | POST | Create new schema |
| /admin/schemas/{id} | GET | Get schema by ID |
| /admin/schemas/{id} | PATCH | Update schema |
| /admin/schemas/{id} | DELETE | Delete draft schema |
| /admin/schemas/{id}/activate | POST | Activate schema |
| /admin/schemas/{id}/deprecate | POST | Deprecate schema |
| /admin/schemas/{id}/versions | GET | Get all versions |
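The paths in the table are relative to the base URL used in the examples in this guide (`https://api.docdigitizer.com/registry`). As a sketch of how a path maps to a request, the snippet below builds, without sending, a call to the deprecate endpoint using only the Python standard library. `build_request` and `deprecate_request` are illustrative helpers, and the response format of the deprecate endpoint is not covered in this guide.

```python
import urllib.request

BASE_URL = "https://api.docdigitizer.com/registry"


def build_request(method: str, path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request for one of the endpoints listed above."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        method=method,
        headers={"X-API-Key": api_key},
    )


def deprecate_request(schema_id: str, api_key: str) -> urllib.request.Request:
    # POST /admin/schemas/{id}/deprecate, per the table above.
    return build_request("POST", f"/admin/schemas/{schema_id}/deprecate", api_key)
```

To send the request, pass the returned object to `urllib.request.urlopen`. Keeping construction separate from sending makes the URL and method easy to inspect or test.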
