Integration Partner Program

Financial Document Intelligence.
Turn Documents into Data. Keep the Source Safe.

The infrastructure layer for fund software. We ingest PDFs into a high-speed queryable index for your API, while securing original assets in an immutable, audit-ready vault.

White-Label Ready · Multi-Schema Projection · Immutable Audit Trail

Hot Index: Fast Queries
Cold Vault: Immutable
Schema Projection
Audit Trail

How It Works

1. Ingest & Vault

Securely upload financial documents. We immediately lock the original file in an Immutable Cold Vault (WORM storage) for compliance, while simultaneously processing the content into our "Hot Index."

Status: Encrypted & Locked
Retention: 7 Years (Configurable)
Access: Audit-Only
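The write-once behavior described above can be illustrated with a small in-memory model. This is only a sketch of WORM semantics under stated assumptions: `ColdVault`, `VaultEntry`, and `add_years` are hypothetical names, not part of any SDK, and real object-locked storage enforces these rules server-side.

```python
from dataclasses import dataclass
from datetime import date


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


@dataclass(frozen=True)
class VaultEntry:
    """Hypothetical record for one vaulted original."""
    content: bytes
    retain_until: date


class ColdVault:
    """Toy write-once store; a stand-in for object-locked storage."""

    def __init__(self, retention_years: int = 7):
        self.retention_years = retention_years
        self._entries = {}

    def put(self, doc_id: str, content: bytes, today: date) -> VaultEntry:
        # Write once: an existing entry can never be replaced.
        if doc_id in self._entries:
            raise PermissionError(f"{doc_id} is locked and cannot be overwritten")
        entry = VaultEntry(content, add_years(today, self.retention_years))
        self._entries[doc_id] = entry
        return entry

    def delete(self, doc_id: str, today: date) -> None:
        # Deletion is refused until the retention period has elapsed.
        if today < self._entries[doc_id].retain_until:
            raise PermissionError(f"{doc_id} is under retention")
        del self._entries[doc_id]


vault = ColdVault()
entry = vault.put("doc_1", b"%PDF-1.7 ...", today=date(2024, 11, 1))
print(entry.retain_until)  # 2031-11-01
```

With the default 7-year retention, a second `put` for the same ID or an early `delete` raises `PermissionError`.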

2. The "Hot Index" (Rich Extraction)

We convert the document into a Spatial Database, mapping every text block, table, and value to its exact location (bounding box). This "Digital Twin" lives in high-speed storage, ready for any query.

Processing: OCR + Layout Analysis
Output: Rich Intermediate JSON
Latency: <100ms for subsequent queries
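To make the idea of a spatial index concrete, here is a minimal sketch of elements that carry a bounding box and a region query over them. The `Element` shape, the coordinate convention, and `query_region` are assumptions for illustration; the actual intermediate JSON format is not documented here, and a production index would use a real spatial structure rather than a linear scan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One extracted item and where it sits on the page (hypothetical shape)."""
    kind: str       # "text", "table", or "value"
    content: str
    page: int
    bbox: tuple     # (x0, y0, x1, y1) in page units


def overlaps(a: tuple, b: tuple) -> bool:
    """True when two (x0, y0, x1, y1) boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def query_region(index: list, page: int, region: tuple) -> list:
    """Return every element on `page` whose box intersects `region`."""
    return [e for e in index if e.page == page and overlaps(e.bbox, region)]


index = [
    Element("text", "Total Amount", 1, (50, 100, 150, 115)),
    Element("value", "$4,500,000.00", 1, (160, 100, 260, 115)),
    Element("value", "2024-11-30", 2, (160, 100, 230, 115)),
]

hits = query_region(index, page=1, region=(155, 90, 300, 120))
print([e.content for e in hits])  # ['$4,500,000.00']
```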

3. Multi-Schema Projection

Don't re-process files when requirements change. Project different schemas onto the same document index to retrieve exactly the data needed for specific workflows.

multi-schema-projection.py
import docintel

client = docintel.Client(api_key="di_...")

# --- Step 1: Ingest Once ---
# Document is vaulted & indexed. Returns a handle.
doc = client.ingest("./capital_call_Q3.pdf")

# --- Step 2: The Accounting Workflow ---
# Project a schema for general ledger entry
ledger_data = doc.project(schema="accounting_v1")
print(ledger_data.total_amount)
# Output: 4,500,000.00

# --- Step 3: The Compliance Workflow ---
# Later, project a different schema on the SAME doc
# No re-upload or re-processing cost.
audit_data = doc.project(schema="compliance_v1")
print(audit_data.bank_name)
# Output: "Silicon Valley Bank"
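Conceptually, projection is a selection over an index that already exists, which is why no re-processing is needed. The sketch below shows that idea with plain dictionaries. `HOT_INDEX`, `SCHEMAS`, and the field lists assigned to each schema are illustrative assumptions, not the real schema definitions.

```python
# Hypothetical stored index for one document: field name -> extracted value.
HOT_INDEX = {
    "total_amount": "4,500,000.00",
    "due_date": "2024-11-30",
    "bank_name": "Silicon Valley Bank",
    "account_no": "**** 8829",
}

# Hypothetical schema definitions: each lists the fields one workflow needs.
SCHEMAS = {
    "accounting_v1": ["total_amount", "due_date"],
    "compliance_v1": ["bank_name", "account_no"],
}


def project(index: dict, schema: str) -> dict:
    """Select only the fields named by `schema` from an existing index."""
    return {field: index[field] for field in SCHEMAS[schema]}


print(project(HOT_INDEX, "accounting_v1"))
print(project(HOT_INDEX, "compliance_v1"))
```

Adding a new workflow in this model means adding an entry to `SCHEMAS`; the stored index is untouched.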

Embed Into Your Platform

White-label REST endpoints or event-driven webhooks. Your brand, our infrastructure.

⚡ REST Endpoint

Instant custom endpoint ready to use

POST /extract/your-schema

🔒 Secure Webhooks (Production-Ready)

Event-driven with HMAC signature verification

✓ Signature SDKs included

Embedded Infrastructure

White-Label Integration Options

Offer document extraction under your brand with instant REST endpoints or secure event-driven webhooks. Complete integration infrastructure that your customers expect.
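As a rough illustration of calling a per-schema endpoint such as `POST /extract/your-schema`, the sketch below builds the request with the standard library without sending it. Everything beyond the path is an assumption: the host `api.example.com` is a placeholder, and the bearer-token header and the `document_url` body field are hypothetical, since the request format is not specified here.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not a real endpoint


def build_extract_request(schema: str, document_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /extract/<schema>."""
    # Assumed body shape: a JSON object pointing at the document to process.
    body = json.dumps({"document_url": document_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/extract/{schema}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


req = build_extract_request(
    "your-schema", "https://example.com/capital_call_Q3.pdf", "di_..."
)
print(req.get_method(), req.full_url)
# POST https://api.example.com/extract/your-schema
```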

🔒 Enterprise Webhook Security

Event-driven webhooks with HMAC SHA-256 signature verification and replay attack prevention. SDKs included for Python and TypeScript to verify signatures with one line of code.

✓ HMAC Signature Verification: Every webhook is signed with your secret key
✓ Replay Attack Prevention: Timestamp validation prevents old events from being reused
✓ SDK Included: Python and TypeScript helpers for signature checking
✓ Event Types & Retries: Automatic retries with exponential backoff
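For readers who want to see what the signature and replay checks listed above involve, here is a minimal standard-library sketch. It is not the SDK's implementation: the signed message layout (`<timestamp>.<payload>`), the hex encoding, and the 300-second tolerance are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional

TOLERANCE_SECONDS = 300  # assumed replay window; the real value is not stated


def sign(payload: bytes, timestamp: str, secret: str) -> str:
    """Hex HMAC-SHA256 over "<timestamp>.<payload>" (assumed layout)."""
    message = timestamp.encode() + b"." + payload
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


def verify(payload: bytes, signature: str, timestamp: str, secret: str,
           now: Optional[float] = None) -> bool:
    """Return True only for a fresh event with a matching signature."""
    now = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    # Replay prevention: reject events outside the tolerance window.
    if abs(now - sent_at) > TOLERANCE_SECONDS:
        return False
    expected = sign(payload, timestamp, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


body = b'{"id": "evt_abc123", "type": "extraction.completed"}'
sig = sign(body, "1699564800", "whsec_test")
print(verify(body, sig, "1699564800", "whsec_test", now=1699564810))  # True
print(verify(body, sig, "1699564800", "whsec_test", now=1699574800))  # False
```

Binding the timestamp into the signed message matters: otherwise an attacker could replay an old body with a fresh timestamp header.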

webhook-verify.py
from docintel import Webhook

# Verify webhook signature
signature = request.headers["X-DocIntell-Signature"]
timestamp = request.headers["X-DocIntell-Timestamp"]

event = Webhook.verify(
    payload=request.body,
    signature=signature,
    timestamp=timestamp,
    secret="whsec_...",
)

# Safely process verified event
if event.type == "extraction.completed":
    data = event.data  # Verified payload
Webhook Payload Structure:
→ event.id: evt_abc123
→ event.type: extraction.completed
→ event.data: {your schema}
→ event.created: 1699564800
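Because deliveries are retried, a receiver may plausibly see the same event more than once, so handlers are usually written to be idempotent. The sketch below de-duplicates on `event.id` using the payload fields shown above. The in-memory set and the `handle_event` function are illustrative assumptions; a real deployment would keep processed IDs in durable storage.

```python
processed_ids = set()   # stand-in for durable storage shared by all workers
completed = []          # collects payloads that were acted on


def handle_event(event: dict) -> bool:
    """Handle a verified event once; return False for a repeated event id."""
    if event["id"] in processed_ids:
        return False
    if event["type"] == "extraction.completed":
        completed.append(event["data"])
    processed_ids.add(event["id"])
    return True


event = {
    "id": "evt_abc123",
    "type": "extraction.completed",
    "data": {"total_amount": "4,500,000.00"},
    "created": 1699564800,
}
print(handle_event(event))  # True
print(handle_event(event))  # False
```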
Product Experience

The Hot Index: Queryable Document Intelligence

Navigate complex financial documents with ease. Our Hot Index structures every text block, table, and entity from the source document into a clean, queryable view with verifiable links back to the original. Schema versioning means your data models can evolve without breaking existing integrations; backward compatibility is handled automatically.

DocIntel

Documents

Q3 Capital Call - Global Tech II.pdf: Ready, just now
Quarterly Report - Fund IV.xlsx: Ready, 2h ago
Distribution Notice - Oct 2024.pdf: Processing, 5m ago
Capital Account Statement.pdf: Ready, 1d ago

Document Details (ID: doc_8x92...)

Fund Name: Global Tech Opportunities Fund II
Document Type: Capital Call
Period: Q3 2024
Currency: USD

Call Details (Structured Object)

Total Amount: $4,500,000.00
Due Date: 2024-11-30
Bank Name: Silicon Valley Bank
Account No.: **** 8829

Allocations (Array [3])

Purpose | Amount | %
Investment - AI Core | $3,000,000.00 | 66.7%
Management Fees | $1,000,000.00 | 22.2%
Partnership Expenses | $500,000.00 | 11.1%

Security & Compliance

The Speed of a Database. The Safety of a Vault.

We architected DocIntell to solve the tension between data utility and data residency. Your customers get instant access to parsed data, backed by an immutable chain of custody.

🛡️ Strict Tenant Isolation

Enterprise-grade logical isolation ensures customer data never commingles. Every API request is scoped to a strict Tenant Context, enforced by Row-Level Security (RLS).
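The effect of tenant scoping can be pictured with a small application-level analogue. This is not Row-Level Security itself, which is enforced inside the database; it is only a sketch of the guarantee that every read is filtered by the caller's tenant. `TenantContext` and the row layout are hypothetical.

```python
ROWS = [
    {"tenant_id": "tenant_a", "doc_id": "doc_1", "name": "Q3 Capital Call.pdf"},
    {"tenant_id": "tenant_b", "doc_id": "doc_2", "name": "Distribution Notice.pdf"},
]


class TenantContext:
    """Every read goes through a filter on the caller's tenant ID."""

    def __init__(self, tenant_id: str, rows: list):
        self._tenant_id = tenant_id
        self._rows = rows

    def documents(self) -> list:
        return [r for r in self._rows if r["tenant_id"] == self._tenant_id]

    def get(self, doc_id: str) -> dict:
        # Another tenant's document looks exactly like a missing one.
        for row in self.documents():
            if row["doc_id"] == doc_id:
                return row
        raise LookupError(f"{doc_id} not found for this tenant")


ctx = TenantContext("tenant_a", ROWS)
print([r["name"] for r in ctx.documents()])  # ['Q3 Capital Call.pdf']
```

Asking `ctx.get("doc_2")` raises `LookupError`, so a caller cannot distinguish "belongs to someone else" from "does not exist".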

🔒 The Immutable Vault

Original documents are stored in an Object-Locked (WORM) bucket. Even if you delete the data from the API, the source file remains tamper-proof for audit requirements.

✅ Verifiable Audit Trails

Because we link every extracted data point back to the source PDF, you can build "Click-to-Verify" UIs. Show your users exactly where the number $4.5M came from.
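A "Click-to-Verify" UI mostly needs one conversion: from a stored source location to a highlight rectangle on the rendered page. The sketch below assumes each field carries a page number and a bounding box normalized to the 0..1 range. `SourceLink`, `highlight_rect`, and the normalized convention are hypothetical; the actual coordinate format may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceLink:
    """Where a value was found in the source PDF (assumed shape)."""
    page: int
    bbox: tuple  # (x0, y0, x1, y1), each normalized to 0..1


def highlight_rect(link: SourceLink, page_width_px: int, page_height_px: int) -> dict:
    """Convert a normalized box into pixel coordinates for an overlay."""
    x0, y0, x1, y1 = link.bbox
    return {
        "page": link.page,
        "left": round(x0 * page_width_px),
        "top": round(y0 * page_height_px),
        "width": round((x1 - x0) * page_width_px),
        "height": round((y1 - y0) * page_height_px),
    }


total_amount_source = SourceLink(page=1, bbox=(0.25, 0.5, 0.75, 0.55))
print(highlight_rect(total_amount_source, page_width_px=800, page_height_px=1000))
# {'page': 1, 'left': 200, 'top': 500, 'width': 400, 'height': 50}
```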

Security Status: All Systems Secure

Tenant Isolation (RLS): Active
Hot Index Encryption: AES-256 Enabled
Cold Vault (WORM): Object-Locked
Audit Trail Links: Verified
Data Residency: US-East-1
Last Audit: Just now
Tenant ID: tenant_8f92x...

Infrastructure Layer

Surgical Precision, Not Data Dumps

Unlike traditional OCR APIs that return gigabytes of unstructured coordinates, DocIntell extracts the full document context once into the Hot Index, then lets you project exactly the fields you need. A 50-page report becomes a 2KB response instead of a 45MB payload.

✓ Confidence-Based Extraction: Auditable confidence scores for every field
✓ Multi-Schema Projection: Query different data views from a single extraction
✓ Enterprise-Grade Tenant Isolation: Secure data separation for white-label deployments
✓ Async Processing with Webhooks: HMAC-verified event notifications
✓ Structured & Unstructured Data: Extract tables, text, and metadata in one pass

extract.py
import docintel

client = docintel.Client(api_key="di_...")

# Async extraction - returns immediately
operation = client.extract("./capital_call_q3.pdf")

# Wait for completion (or use webhooks)
extraction = operation.wait()

# Project schema on the result
ledger = extraction.project("accounting_v1")
print(ledger.total_amount)
# Output: 4500000.00
multi-schema.py
# Extract once, project multiple schemas
extraction = client.extract("./doc.pdf").wait()

# Accounting needs top-line numbers
accounting = extraction.project("accounting_v1")
print(accounting.total_amount)
# $4,500,000.00

# Compliance needs bank details
compliance = extraction.project("compliance_v1")
print(compliance.bank_name)
# "Silicon Valley Bank"

# No re-processing required
efficiency-comparison.py
# ❌ Traditional OCR: Dump everything
raw_ocr = competitor.extract("50_page_report.pdf")
# → 45MB response with every bounding box
# → Parse on your server, filter client-side
# → Slow, expensive, wasteful

# ✅ DocIntell: Extract once, project what you need
doc = client.ingest("50_page_report.pdf")
# → Full extraction stored in Hot Index (not transmitted)

# Get only what you need for each workflow:
summary = doc.project("executive_summary")
# → 2KB response ⚡
tables = doc.project("financial_tables")
# → 15KB response ⚡

# 20x-2000x smaller payloads, instant queries
confidence-scores.py
# Every field includes a confidence score
extraction = client.extract("./doc.pdf").wait()
data = extraction.project("accounting_v1")

# Route low-confidence to human review
# (review_queue is your own application's queue, not part of the SDK)
for field, value in data.fields.items():
    if value.confidence < 0.85:
        review_queue.add(field, value)

# Example:
# total_amount: 0.98 ✓
# due_date: 0.76 → review queue

Power Your Platform with Document Intelligence

Private Beta Open Now.

We are onboarding select integration partners. Skip the waitlist to get early API access and white-glove implementation support.