Batch Upload
Process content in bulk for governance, compliance, or performance optimization. Upload historical records or offload real-time extraction to an async pipeline.
When to Use Batch
Process historical content that predates your DeepaData integration. With permission, existing records become governed artifacts with full audit trails.
- Years of interaction history → governed artifacts
- EU AI Act compliance for historical data
- Populate Observe dashboards from day one
Offload extraction from the real-time request path. Upload content in batches for better performance, lower latency, and energy efficiency.
- Non-blocking extraction for high-volume apps
- Process during off-peak hours
- Reduce API call overhead with bulk operations
Key Benefits
Don't wait to accumulate governed data. Process your historical corpus and have audit-ready records immediately.
Historical extractions feed into Observe metrics. See drift, escalation patterns, and emotional exposure from your full history.
Retrospective pathway marks artifacts as backfill with explicit permission. Audit-ready governance for historical data.
Permission Model
Batch processing uses the retrospective issuance pathway. This requires explicit permission from either the data subject or organizational authority.
Retrospective pathway requirements
- Content must be voluntarily expressed text (interpreted from meaning, not derived from behavioral signals)
- Subject must have consented to the original collection
- Reprocessing must be permitted under original consent or new explicit permission
- Artifacts are marked with pathway: "retrospective"
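The requirements above can be expressed as a client-side pre-flight check run before submitting a batch. A minimal sketch in Python — the record field names and the helper itself are illustrative bookkeeping on your side, not part of the DeepaData API:

```python
def check_retrospective_eligibility(record: dict) -> list[str]:
    """Return the reasons a record fails the retrospective pathway
    requirements; an empty list means it is eligible.
    Field names here are illustrative, not API parameters."""
    problems = []
    if not record.get("voluntarily_expressed"):
        problems.append("content must be voluntarily expressed text")
    if not record.get("original_consent"):
        problems.append("subject must have consented to the original collection")
    if not (record.get("reprocessing_permitted") or record.get("new_explicit_permission")):
        problems.append("reprocessing needs original-consent coverage or new explicit permission")
    return problems

# Eligible: all three conditions hold.
ok = {"voluntarily_expressed": True, "original_consent": True, "reprocessing_permitted": True}
# Ineligible: no consent basis covers reprocessing.
bad = {"voluntarily_expressed": True, "original_consent": True}
```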
Batch Operations
Batch Upload supports two operations, corresponding to the two capture modes.
extract — Full 96-field EDM artifact extraction. ~15 seconds per record. Creates artifacts that can be sealed into .ddna envelopes.
Max content: 50,000 characters per record
observe — Lightweight salience capture. ~5 seconds per record. Creates Salience Records for trigger/escalation analysis. Requires a subject_id.
Max content: 20,000 characters per record
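The per-operation limits can be enforced client-side before upload, so oversized or incomplete records fail fast instead of surfacing as batch errors. A sketch (limits are the documented values; the helper name is illustrative):

```python
# Documented per-record content caps for each batch operation.
MAX_CONTENT = {"extract": 50_000, "observe": 20_000}

def validate_record(operation: str, record: dict) -> None:
    """Raise ValueError if a record violates the batch limits."""
    limit = MAX_CONTENT[operation]
    if len(record.get("content", "")) > limit:
        raise ValueError(f"content exceeds {limit} characters for {operation}")
    # observe records feed Salience Records, which are keyed by subject.
    if operation == "observe" and "subject_id" not in record:
        raise ValueError("observe records require a subject_id")
```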
API Usage
Upload returns immediately with a job_id. Poll the status endpoint to track progress.
# Upload batch
curl -X POST https://www.deepadata.com/api/v1/batch/upload \
-H "Authorization: Bearer dda_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"operation": "extract",
"name": "Q1 2025 therapy sessions",
"records": [
{ "content": "First session transcript..." },
{ "content": "Second session transcript..." }
]
}'
# Response includes job_id
# { "data": { "job_id": "job-01HZ...", "status": "pending" } }
# Poll for status
curl "https://www.deepadata.com/api/v1/batch/status?job_id=job-01HZ..." \
-H "Authorization: Bearer dda_live_YOUR_KEY"
Input Formats
Batch Upload accepts JSON or CSV. CSV uploads pass the operation as a query parameter.
JSON format
{
"operation": "observe",
"name": "Historical chats",
"records": [
{
"content": "User message content...",
"subject_id": "user-123"
},
{
"content": "Another message...",
"subject_id": "user-456"
}
]
}
CSV format
content,subject_id
"First passage text...",user-123
"Second passage text...",user-456
"Third passage text...",user-789
Upload with Content-Type: text/csv and the ?operation=observe query parameter.
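A CSV payload like the one above can be generated with Python's standard csv module, which handles quoting for commas and quotes inside passages automatically. The rows here are illustrative:

```python
import csv
import io

rows = [
    {"content": "First passage text...", "subject_id": "user-123"},
    {"content": "Second passage text...", "subject_id": "user-456"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["content", "subject_id"])
writer.writeheader()
writer.writerows(rows)
payload = buf.getvalue()
# POST `payload` as the request body with Content-Type: text/csv
# and the ?operation=observe query parameter.
```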
Limits & Timing
| Limit | Value |
|---|---|
| Max records per batch | 1,000 |
| Max content length (extract) | 50,000 characters |
| Max content length (observe) | 20,000 characters |
| Processing time (extract) | ~15 seconds/record |
| Processing time (observe) | ~5 seconds/record |
| Typical batch completion | 30-120 seconds |
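Given these timings, a client can poll the status endpoint every few seconds until the job reaches a terminal state. A sketch with the HTTP call injected as a function so the loop itself is testable — the "completed" status comes from the responses shown on this page, while the "failed" status name, poll interval, and timeout are assumptions:

```python
import time

# "completed" is documented below; "failed" is an assumed terminal state.
TERMINAL = {"completed", "failed"}

def wait_for_job(fetch_status, job_id: str, interval: float = 10.0,
                 timeout: float = 600.0) -> dict:
    """Poll fetch_status(job_id) until the job is terminal or timeout expires.

    fetch_status is expected to return the `data` object from
    /v1/batch/status (injected here so the loop needs no network access).
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_status(job_id)
        if data["status"] in TERMINAL:
            return data
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {data['status']} after {timeout}s")
        time.sleep(interval)
```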
Status Response
{
"success": true,
"data": {
"job_id": "job-01HZ3GKWP7XTJY9QN4RD",
"name": "Q1 2025 therapy sessions",
"operation": "extract",
"status": "completed",
"total_records": 150,
"processed_records": 150,
"failed_records": 2,
"progress_percent": 100,
"created_at": "2026-02-24T10:30:00.000Z",
"completed_at": "2026-02-24T10:45:30.000Z"
}
}
Enterprise Workflow
For large-scale historical ingestion, we recommend a phased approach.
Sample extraction
Run a batch of 100 representative records. Review extraction quality and identify any content patterns that need preprocessing.
Permission audit
Verify consent basis for historical data. Document the permission model for retrospective processing.
Staged ingestion
Process in batches of 500-1000 records. Monitor job status and error rates. Address failures before proceeding.
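Staged ingestion reduces to slicing the corpus into fixed-size batches, where the 1,000-record cap comes from the limits table above. A minimal helper (the function name is illustrative):

```python
def chunk_records(records, batch_size=1000):
    """Yield successive batches of at most batch_size records.

    batch_size must respect the documented 1,000-record-per-batch cap.
    """
    if not 1 <= batch_size <= 1000:
        raise ValueError("batch_size must be between 1 and the 1,000-record cap")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]
```

Upload each yielded batch, wait for its job to complete, and check failed_records before submitting the next one.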
Seal high-value records
After extraction, seal records that require long-term retention via /v1/issue with pathway: "retrospective".
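The sealing step amounts to an issue request tagged with the retrospective pathway. A sketch of building that request body — only the pathway value is documented here; the artifact_id field and helper are assumptions, not confirmed /v1/issue parameters:

```python
import json

def seal_request(artifact_id: str) -> str:
    """Build a /v1/issue request body for retrospective sealing.

    Only `pathway` is documented on this page; `artifact_id` is an
    assumed field name for the record being sealed.
    """
    return json.dumps({
        "artifact_id": artifact_id,
        "pathway": "retrospective",  # marks the artifact as backfill
    })
```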