InSight DXP's capabilities
- Intelligent ingestion & OCR – High‑speed intake of paper and digital documents, barcode/QC checks, support for over 100 languages and large file sizes.
- GenAI‑powered document understanding – Zero‑shot prompts interpret unseen formats, summarize context and extract intent.
- Zero‑shot classification – Use layout, phrases and logos to classify hundreds of document types without retraining.
- Advanced table & narrative extraction – Extract nested headers, merged cells and derived tables while preserving multi‑page continuity.
- Forms and contract parsing – Locate parties, clauses, dates, terms and obligations, and normalize them to your schema.
- Document splitting – Large language model (LLM)‑driven separation turns multi‑document files into clean, analysis‑ready units.
- Text extraction – Pull structured fields and line items across multi‑page documents.
- Signature detection – Identify signatures and other non‑text elements for downstream validation.
- Automated machine learning – AutoML trains models for high‑volume, repeatable formats.
- Template‑based model extraction – Build a fixed‑layout template from a single sample for government and industry forms.
- Automated PII redaction – Mask sensitive data using pattern‑based and AI‑based rules to meet GDPR and HIPAA compliance requirements.
- Data capture annotation – Tag and annotate captured data for further processing or model training.
- Human‑in‑the‑loop (HITL) feedback agent – Composite confidence scores route only low‑confidence fields to reviewers, and the agent learns from their corrections.
- Data validation & enrichment – Normalize formats, compose names and addresses, verify against reference data, and emit retrieval-augmented generation (RAG)‑ready outputs.
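
The HITL routing described in the list above can be illustrated with a minimal sketch. It is not InSight DXP's actual API or scoring method: the field structure, the two confidence signals, the product used to combine them, and the 0.85 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold; a real deployment would likely tune this per field type.
REVIEW_THRESHOLD = 0.85


@dataclass
class ExtractedField:
    name: str
    value: str
    ocr_confidence: float    # assumed signal: character-recognition confidence, 0..1
    model_confidence: float  # assumed signal: extraction-model confidence, 0..1


def composite_confidence(field: ExtractedField) -> float:
    """Combine the two assumed signals; a product penalizes any single weak signal."""
    return field.ocr_confidence * field.model_confidence


def route_fields(fields, threshold=REVIEW_THRESHOLD):
    """Split fields into auto-accepted and needs-review lists."""
    accepted, review = [], []
    for f in fields:
        if composite_confidence(f) >= threshold:
            accepted.append(f)
        else:
            review.append(f)
    return accepted, review


fields = [
    ExtractedField("invoice_number", "INV-1042", 0.99, 0.98),
    ExtractedField("total_amount", "1,250.00", 0.70, 0.95),
]
accepted, review = route_fields(fields)
print("auto-accepted:", [f.name for f in accepted])
print("sent to review:", [f.name for f in review])
```

In this sketch only `total_amount` falls below the threshold, so a reviewer would see one field rather than the whole document. Other ways of combining signals (a minimum, or a weighted average) would fit the same routing function.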