Integration Partner Program

Financial Document Intelligence.
Turn Documents into Data. Keep the Source Safe.

The infrastructure layer for fund software. We ingest PDFs into a high-speed queryable index for your API, while securing original assets in an immutable, audit-ready vault.

White-Label Ready · Multi-Schema Projection · Immutable Audit Trail


Hot Index: Fast Queries

Cold Vault: Immutable

Schema Projection

Audit Trail


How It Works

1. Ingest & Vault

Securely upload financial documents. We immediately lock the original file in an Immutable Cold Vault (WORM storage) for compliance, while simultaneously processing the content into our "Hot Index."

Status: Encrypted & Locked

Retention: 7 Years (Configurable)

Access: Audit-Only
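One way to picture the vault step is a record that stores a SHA-256 fingerprint of the original bytes together with a retain-until date derived from the retention period. The sketch below is illustrative only: `vault_record` and its field names are hypothetical and are not part of the DocIntell API.

```python
import hashlib
from datetime import date


def vault_record(original: bytes, ingested_on: date, retention_years: int = 7) -> dict:
    """Build a hypothetical vault entry for an original document."""
    # The digest lets an auditor check that the stored bytes never changed.
    digest = hashlib.sha256(original).hexdigest()
    target_year = ingested_on.year + retention_years
    try:
        retain_until = ingested_on.replace(year=target_year)
    except ValueError:
        # Feb 29 ingested, target year is not a leap year: fall back to Feb 28.
        retain_until = ingested_on.replace(year=target_year, day=28)
    return {
        "sha256": digest,
        "retain_until": retain_until.isoformat(),
        "access": "audit-only",
    }


record = vault_record(b"%PDF-1.7 example bytes", date(2024, 10, 1))
print(record["retain_until"])
```

Changing `retention_years` models the configurable retention window; the fingerprint stays the same for the same bytes.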

2. The "Hot Index" (Rich Extraction)

We convert the document into a Spatial Database, mapping every text block, table, and value to its exact location (bounding box). This "Digital Twin" lives in high-speed storage, ready for any query.

Processing: OCR + Layout Analysis

Output: Rich Intermediate JSON

Latency: <100ms for subsequent queries
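To make the idea of a spatial index concrete, here is a minimal sketch of text spans tagged with a page number and bounding box, plus a region query. The `Span` type, the coordinate layout (x0, y0, x1, y1 in page points), and `spans_in_region` are assumptions for illustration, not the actual Hot Index format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    text: str
    page: int
    # Bounding box as (x0, y0, x1, y1) in page points; this layout is assumed.
    bbox: tuple


def spans_in_region(spans, page, region):
    """Return spans on `page` whose box lies fully inside `region`."""
    rx0, ry0, rx1, ry1 = region
    return [
        s for s in spans
        if s.page == page
        and s.bbox[0] >= rx0 and s.bbox[1] >= ry0
        and s.bbox[2] <= rx1 and s.bbox[3] <= ry1
    ]


index = [
    Span("Total Amount", 1, (72.0, 200.0, 160.0, 214.0)),
    Span("$4,500,000.00", 1, (300.0, 200.0, 390.0, 214.0)),
    Span("Due Date", 2, (72.0, 100.0, 130.0, 114.0)),
]

# Everything in the right-hand part of page 1.
right_column = spans_in_region(index, page=1, region=(250.0, 0.0, 612.0, 792.0))
print([s.text for s in right_column])
```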

3. Multi-Schema Projection

Don't re-process files when requirements change. Project different schemas onto the same document index to retrieve exactly the data needed for specific workflows.

multi-schema-projection.py

import docintel

client = docintel.Client(api_key="di_...")

# --- Step 1: Ingest Once ---
# Document is vaulted & indexed. Returns a handle.
doc = client.ingest("./capital_call_Q3.pdf")

# --- Step 2: The Accounting Workflow ---
# Project a schema for general ledger entry
ledger_data = doc.project(schema="accounting_v1")
print(ledger_data.total_amount)
# Output: 4,500,000.00

# --- Step 3: The Compliance Workflow ---
# Later, project a different schema on the SAME doc
# No re-upload or re-processing cost.
audit_data = doc.project(schema="compliance_v1")
print(audit_data.bank_name)
# Output: "Silicon Valley Bank"
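Conceptually, projection can be thought of as selecting a named subset of fields from an index that was built once. The following sketch shows that idea with plain dictionaries; the `hot_index` contents, the `SCHEMAS` definitions, and the `project` helper are hypothetical stand-ins, and the real schemas may define different fields.

```python
# Hypothetical index: every extracted field, keyed by name.
hot_index = {
    "fund_name": "Global Tech Opportunities Fund II",
    "total_amount": "4500000.00",
    "due_date": "2024-11-30",
    "bank_name": "Silicon Valley Bank",
}

# Hypothetical schema definitions: each schema lists the fields it exposes.
SCHEMAS = {
    "accounting_v1": ["fund_name", "total_amount", "due_date"],
    "compliance_v1": ["fund_name", "bank_name"],
}


def project(index: dict, schema: str) -> dict:
    """Select only the fields a schema names; the index itself is untouched."""
    return {field: index[field] for field in SCHEMAS[schema]}


ledger = project(hot_index, "accounting_v1")
audit = project(hot_index, "compliance_v1")
print(ledger)
print(audit)
```

Because both calls read the same index, adding a third schema later means adding one entry to `SCHEMAS`, with no new extraction pass.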

Embed Into Your Platform

White-label REST endpoints or event-driven webhooks. Your brand, our infrastructure.

⚡

REST Endpoint

Instant custom endpoint ready to use

POST /extract/your-schema

🔒

Production-Ready

Secure Webhooks

Event-driven with HMAC signature verification

✓ Signature SDKs included

Embedded Infrastructure

White-Label Integration Options

Offer document extraction under your brand with instant REST endpoints or secure event-driven webhooks. Complete integration infrastructure that your customers expect.

🔒 Enterprise Webhook Security

Event-driven webhooks with HMAC SHA-256 signature verification and replay attack prevention. SDKs included for Python and TypeScript to verify signatures with one line of code.

✓ HMAC Signature Verification: Every webhook signed with your secret key

✓ Replay Attack Prevention: Timestamp validation prevents old events from being reused

✓ SDK Included: Python and TypeScript helpers for signature checking

✓ Event Types & Retries: Automatic retries with exponential backoff
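For readers unfamiliar with exponential backoff, the sketch below shows how a retry schedule of that kind is typically computed. The base delay, growth factor, and cap are assumed values chosen for illustration; the actual retry schedule is not stated here.

```python
def backoff_delays(attempts: int, base: float = 1.0, factor: float = 2.0, cap: float = 60.0) -> list:
    """Delay in seconds before each retry: base * factor**n, limited to `cap`."""
    return [min(cap, base * factor ** n) for n in range(attempts)]


delays = backoff_delays(8)
print(delays)
```

Your endpoint should therefore expect the same event more than once and treat `event.id` as a deduplication key.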

webhook-verify.py

from docintel import Webhook

# Verify webhook signature
signature = request.headers["X-DocIntell-Signature"]
timestamp = request.headers["X-DocIntell-Timestamp"]

event = Webhook.verify(
    payload=request.body,
    signature=signature,
    timestamp=timestamp,
    secret="whsec_...",
)

# Safely process verified event
if event.type == "extraction.completed":
    data = event.data  # Verified payload

Webhook Payload Structure:

→ event.id: evt_abc123

→ event.type: extraction.completed

→ event.data: {your schema}

→ event.created: 1699564800
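If you prefer to see what a check of this kind involves, here is a standard-library sketch of HMAC-SHA256 verification with a timestamp window. The signed message format (`timestamp.payload`), the hex encoding, and the 300-second tolerance are assumptions; the SDK's `Webhook.verify` may differ in all three.

```python
import hashlib
import hmac
import time
from typing import Optional

TOLERANCE_SECONDS = 300  # assumed replay window


def sign(payload: bytes, timestamp: str, secret: str) -> str:
    """HMAC-SHA256 over 'timestamp.payload' (an assumed signing scheme)."""
    message = timestamp.encode() + b"." + payload
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


def verify(payload: bytes, signature: str, timestamp: str, secret: str,
           now: Optional[float] = None) -> bool:
    """Return True only for a fresh event with a matching signature."""
    current = time.time() if now is None else now
    # Reject stale events first so a captured request cannot be replayed later.
    if abs(current - int(timestamp)) > TOLERANCE_SECONDS:
        return False
    expected = sign(payload, timestamp, secret)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected, signature)


body = b'{"id": "evt_abc123", "type": "extraction.completed"}'
ts = "1699564800"
sig = sign(body, ts, "whsec_test")
print(verify(body, sig, ts, "whsec_test", now=1699564810))
```

Including the timestamp in the signed message matters: without it, an attacker could replay an old body with a fresh timestamp header.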

Product Experience

The Hot Index: Queryable Document Intelligence

Navigate complex financial documents with ease. Our Hot Index structures every text, table, and entity from the source document into a clean, queryable view with verifiable links back to the original. Schema versioning means your data models can evolve without breaking existing integrations — backward compatibility handled automatically.

Documents

Q3 Capital Call - Global Tech II.pdf (Ready, just now)

Quarterly Report - Fund IV.xlsx (Ready, 2h ago)

Distribution Notice - Oct 2024.pdf (Processing, 5m ago)

Capital Account Statement.pdf (Ready, 1d ago)

Document Details (ID: doc_8x92...)

Fund Name: Global Tech Opportunities Fund II

Document Type: Capital Call

Period: Q3 2024

Currency: USD

Call Details (Structured Object)

Total Amount: $4,500,000.00

Due Date: 2024-11-30

Bank Name: Silicon Valley Bank

Account No.: **** 8829

Allocations (Array [3])

Purpose	Amount	%
Investment - AI Core	$3,000,000.00	66.7%
Management Fees	$1,000,000.00	22.2%
Partnership Expenses	$500,000.00	11.1%
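Structured output like this is easy to sanity-check downstream. The sketch below recomputes the percentage column from the amounts shown and confirms the allocations add up to the call total; the variable and function names are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

total = Decimal("4500000.00")
allocations = {
    "Investment - AI Core": Decimal("3000000.00"),
    "Management Fees": Decimal("1000000.00"),
    "Partnership Expenses": Decimal("500000.00"),
}


def percent_of_total(amount: Decimal, whole: Decimal) -> Decimal:
    """Share of the total as a percentage, rounded to one decimal place."""
    return (amount / whole * 100).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


shares = {purpose: percent_of_total(amount, total) for purpose, amount in allocations.items()}
balanced = sum(allocations.values()) == total

for purpose, share in shares.items():
    print(f"{purpose}: {share}%")
print("Allocations match total:", balanced)
```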

Security & Compliance

The Speed of a Database. The Safety of a Vault.

We architected DocIntell to solve the tension between data utility and data safety. Your customers get instant access to parsed data, backed by an immutable chain of custody.

🛡️

Strict Tenant Isolation

Enterprise-grade logical isolation ensures customer data never commingles. Every API request is scoped to a strict Tenant Context, enforced by Row-Level Security (RLS).

🔒

The Immutable Vault

Original documents are stored in an Object-Locked (WORM) bucket. Even if you delete the data from the API, the source file remains tamper-proof for audit requirements.

✅

Verifiable Audit Trails

Because we link every extracted data point back to the source PDF, you can build "Click-to-Verify" UIs. Show your users exactly where the number $4.5M came from.
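A "Click-to-Verify" highlight usually comes down to scaling a stored bounding box onto the rendered page image. The sketch below assumes a provenance record with a page number and a box in page points; the record shape and the `to_pixels` helper are hypothetical, not the API's actual response format.

```python
def to_pixels(bbox, page_size, image_size):
    """Scale an (x0, y0, x1, y1) box from page units to rendered-image pixels."""
    sx = image_size[0] / page_size[0]
    sy = image_size[1] / page_size[1]
    x0, y0, x1, y1 = bbox
    return (round(x0 * sx), round(y0 * sy), round(x1 * sx), round(y1 * sy))


# Hypothetical provenance record for one extracted value.
field = {
    "name": "total_amount",
    "value": "4500000.00",
    "page": 1,
    "bbox": (300.0, 200.0, 390.0, 214.0),
}

# US Letter page (612 x 792 points) rendered at 1224 x 1584 pixels.
highlight = to_pixels(field["bbox"], (612.0, 792.0), (1224, 1584))
print(highlight)
```

The resulting rectangle can be drawn over the page image so users see exactly where the value was read from.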

Security Status: All Systems Secure

Tenant Isolation (RLS): Active

Hot Index Encryption: AES-256 Enabled

Cold Vault (WORM): Object-Locked

Audit Trail Links: Verified

Data Residency: US-East-1

Last Audit: Just now
Tenant ID: tenant_8f92x...

Infrastructure Layer

Surgical Precision, Not Data Dumps

Unlike traditional OCR APIs that return megabytes of unstructured coordinates, DocIntell extracts the full document context once into the Hot Index, then lets you project exactly the fields you need. A 50-page report becomes a 2KB response instead of a 45MB payload.

✓ Confidence-Based Extraction: Auditable confidence scores for every field

✓ Multi-Schema Projection: Query different data views from a single extraction

✓ Enterprise-Grade Tenant Isolation: Secure data separation for white-label deployments

✓ Async Processing with Webhooks: HMAC-verified event notifications

✓ Structured & Unstructured Data: Extract tables, text, and metadata in one pass

extract.py

import docintel

client = docintel.Client(api_key="di_...")

# Async extraction - returns immediately
operation = client.extract("./capital_call_q3.pdf")

# Wait for completion (or use webhooks)
extraction = operation.wait()

# Project schema on the result
ledger = extraction.project("accounting_v1")
print(ledger.total_amount)
# Output: 4500000.00

multi-schema.py

# Extract once, project multiple schemas
extraction = client.extract("./doc.pdf").wait()

# Accounting needs top-line numbers
accounting = extraction.project("accounting_v1")
print(accounting.total_amount)
# $4,500,000.00

# Compliance needs bank details
compliance = extraction.project("compliance_v1")
print(compliance.bank_name)
# "Silicon Valley Bank"

# No re-processing required

efficiency-comparison.py

# ❌ Traditional OCR: Dump everything
raw_ocr = competitor.extract("50_page_report.pdf")
# → 45MB response with every bounding box
# → Parse on your server, filter client-side
# → Slow, expensive, wasteful

# ✅ DocIntell: Extract once, project what you need
doc = client.ingest("50_page_report.pdf")
# → Full extraction stored in Hot Index (not transmitted)

# Get only what you need for each workflow:
summary = doc.project("executive_summary")
# → 2KB response ⚡
tables = doc.project("financial_tables")
# → 15KB response ⚡

# Payloads roughly 3,000x to 22,500x smaller in this example, instant queries
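Taking the example sizes above at face value, the payload reduction can be worked out directly. The sketch assumes decimal units (1 KB = 1,000 bytes, 1 MB = 1,000,000 bytes); the sizes themselves are the illustrative figures from the example, not measurements.

```python
KB = 1_000
MB = 1_000_000

raw_ocr_bytes = 45 * MB  # full OCR dump from the example
projections = {
    "executive_summary": 2 * KB,
    "financial_tables": 15 * KB,
}

# How many times smaller each projected response is than the raw dump.
reduction = {name: raw_ocr_bytes // size for name, size in projections.items()}
print(reduction)
```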

confidence-scores.py

# Every field includes a confidence score
extraction = client.extract("./doc.pdf").wait()
data = extraction.project("accounting_v1")

# Route low-confidence to human review
for field, value in data.fields.items():
    if value.confidence < 0.85:
        review_queue.add(field, value)

# Example:
# total_amount: 0.98 ✓
# due_date: 0.76 → review queue
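The routing logic can be tried without an API key. This self-contained sketch applies the same 0.85 threshold to plain tuples; the field data and the `route` helper are made up for illustration, and the SDK's field objects may look different.

```python
REVIEW_THRESHOLD = 0.85

# Hypothetical projected fields: name -> (value, confidence).
fields = {
    "total_amount": ("4500000.00", 0.98),
    "due_date": ("2024-11-30", 0.76),
}


def route(projected: dict, threshold: float = REVIEW_THRESHOLD):
    """Split fields into auto-accepted values and a human-review queue."""
    accepted, review = {}, []
    for name, (value, confidence) in projected.items():
        if confidence < threshold:
            review.append((name, value, confidence))
        else:
            accepted[name] = value
    return accepted, review


accepted, review_queue = route(fields)
print("accepted:", accepted)
print("needs review:", review_queue)
```

Raising the threshold sends more fields to review, which trades reviewer time for fewer unverified values in your ledger.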
