How to Extract Specific Fields from Any Document: Photo, Scan, or PDF
The question is almost never "can a computer read this document." OCR has been reading documents reliably for decades. The question that actually matters — and that most tools still don't answer well — is: can it give me only the fields I need, structured the way I need them, regardless of how the document is formatted? That's a different problem, and it requires a different approach.
Key Takeaways
- 99% accuracy on a single invoice template sounds impressive — but the real bottleneck isn't accuracy on one layout, it's whether the tool works on the next layout without you telling it where every field moved.
- Type "Due Date" and the AI finds "Payment Due," "Pay By," or even "Net 30" (and calculates the actual date) — because it understands meaning, not just text matching.
- Turn 50 invoices from 15 different suppliers — PDFs, scans, and phone photos mixed together — into one clean Excel file in under 10 minutes with ImageToTable.ai, no per-supplier configuration required.
OCR vs. Field Extraction: What's the Actual Difference
OCR — optical character recognition — converts an image of text into machine-readable characters. Give it a photo of a receipt and it returns a string of text that mirrors what's printed on the paper. That output is genuinely useful: it's searchable, copyable, and can be fed into other systems. But it has no awareness of what the text means or which parts you care about.
Specific field extraction starts from a different premise. Instead of asking "what text is in this document," you ask "what values does this document have for the fields I've defined." The output isn't a transcription — it's a structured dataset where your column names are the headers and the document's content fills the rows.
| Approach | Input | Output | Who defines structure |
|---|---|---|---|
| Basic OCR | Image / PDF | Raw text string | Nobody — flat text dump |
| PDF-to-Excel converter | PDF | Table mirroring original layout | The document itself |
| Template-based extractor | PDF / image | Preset fields (Invoice #, Date, Total…) | The software vendor |
| Custom field extraction | Any format | Your columns, filled from any layout | You |
The column names you type become the headers of your output table. The AI's job is to locate the corresponding value in each document — regardless of where it appears, what the document calls it, or what format the file is in.
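To make the contrast concrete, here is a minimal Python sketch of the two output shapes. The receipt text, the column names, and the values are made up for illustration, and the row is filled in by hand rather than produced by any tool. Nothing here reflects ImageToTable.ai's actual interface.

```python
import csv
import io

# What basic OCR returns: one flat string mirroring the page (illustrative text).
ocr_text = "ACME Supplies\nInvoice No. 1042\nPay By: 2024-03-15\nTotal 118.00"

# What field extraction returns: one row keyed by the columns *you* defined.
# Hypothetical columns and values, written by hand for this sketch.
columns = ["Vendor Name", "Invoice #", "Due Date", "Total"]
row = {
    "Vendor Name": "ACME Supplies",
    "Invoice #": "1042",
    "Due Date": "2024-03-15",
    "Total": "118.00",
}


def to_csv(columns, rows):
    """Render rows as CSV text with the user's column names as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


table = to_csv(columns, [row])
print(table)
```

The OCR string still has to be searched and parsed by whoever receives it. The row is already shaped like a spreadsheet line, with the header taken directly from the column list.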
What Counts as "Any Document"
The approach works across a wider range of input types than most document tools support, because the underlying technology is a large vision model rather than a format parser. It reads images the way a human does, by understanding content rather than decoding file structure.
Input formats that work
Digital PDFs
Text-layer PDFs from any software — accounting systems, ERP exports, Word-to-PDF, etc.
Scanned documents
Office scanner output, scanned archives, faxed documents saved as PDF or image.
Photos of documents
Phone photos of receipts, invoices, forms, whiteboards, printed tables. Reasonable lighting required.
Screenshots
Screenshots of web pages, dashboards, system interfaces, payment confirmations, order summaries.
Handwritten documents
Handwritten forms, field notes, signed paper documents. Accuracy varies with penmanship and scan quality.
Document types that work
How Column Naming Works in Practice
The column names you provide act as semantic instructions to the AI. You don't need to know where on the page a field appears, what the document labels it, or whether the information is explicit or implied. Plain language field names are enough.
A few examples of how the AI interprets column names:
- A due-date column (for example "Due Date") finds the payment due date whether the invoice says "Due Date," "Payment Due," or "Pay By," or states it implicitly as "Net 30 from invoice date" (in which case it calculates the date).
- An auto-renewal column identifies whether a contract renews automatically, regardless of whether that's in a "Renewal" clause, a "Term" section, or buried in clause 12.4 on page 18.
- A patient-name column locates the patient's name on a lab report or discharge summary, even when surrounded by other names (doctor, facility, referring physician).
- A stamp or seal column detects whether an official seal or chop is visible on the document and returns Yes/No without requiring the stamp to contain specific text.
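The implicit "Net 30" case comes down to simple date arithmetic once the invoice date and the term are known. A minimal sketch, assuming the term counts calendar days from the invoice date with no adjustment for weekends or holidays (the function name is hypothetical, and this is not a description of how the tool computes it internally):

```python
from datetime import date, timedelta


def due_date_from_net_terms(invoice_date: date, net_days: int) -> date:
    """Return the due date implied by 'Net N' terms counted from the invoice date."""
    return invoice_date + timedelta(days=net_days)


# Example: an invoice dated 1 March 2024 with "Net 30" terms.
due = due_date_from_net_terms(date(2024, 3, 1), 30)
print(due.isoformat())  # 2024-03-31
```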
If a field isn't present in a document, the cell is left blank. The AI doesn't substitute a related field or fabricate a value. An empty cell is accurate information — it tells you the document doesn't contain that data.
Single Document vs. Batch: Same Columns, Many Files
The same column-name approach works whether you're processing one document or three hundred. In batch mode, you upload all files together, define your columns once, and receive a single Excel file where each row is one document and each column is one of your specified fields.
This is where the cross-format flexibility becomes practically significant. A real-world batch rarely consists of identical documents. A month's worth of vendor invoices includes digital PDFs from large suppliers, scanned paper invoices from smaller ones, and photos of receipts from field staff. A patient data collection round includes printed lab reports, hand-filled intake forms, and exported system screenshots. Uploading them together and getting a consistent table out the other side is the point.
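The shape of the batch output, one row per document with blank cells where a field is absent, can be sketched with Python's standard `csv` module. The columns and the per-document values below are hypothetical stand-ins for extraction results, not real output.

```python
import csv
import io

# Hypothetical columns, defined once for the whole batch.
columns = ["Vendor Name", "Invoice #", "PO Reference", "Total"]

# Hypothetical per-document results. A missing key means the field
# was not present in that document.
extracted = [
    {"Vendor Name": "ACME Supplies", "Invoice #": "1042",
     "PO Reference": "PO-77", "Total": "118.00"},
    {"Vendor Name": "Corner Cafe", "Total": "12.50"},  # receipt: no invoice # or PO
]


def build_table(columns, documents):
    """Return CSV text with one row per document and blanks for absent fields."""
    buffer = io.StringIO()
    # restval="" leaves the cell empty for any column a document lacks.
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(documents)
    return buffer.getvalue()


batch_csv = build_table(columns, extracted)
print(batch_csv)
```

The receipt row comes out as `Corner Cafe,,,12.50`: the two fields it lacks stay empty instead of being filled with a guess.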
Processing speed: 5–10 seconds per page. A batch of 50 single-page documents finishes in under 10 minutes. Multi-page documents (contracts, reports) take proportionally longer based on page count.
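The batch figure follows directly from the per-page range. A quick estimate, assuming pages are processed one after another at 5 to 10 seconds each (the function name and the sequential-processing assumption are ours, not stated by the tool):

```python
def batch_time_range_minutes(pages: int, min_sec: float = 5.0, max_sec: float = 10.0):
    """Return (best, worst) estimated minutes for a batch of `pages` pages."""
    return (pages * min_sec / 60, pages * max_sec / 60)


best, worst = batch_time_range_minutes(50)
print(f"{best:.1f} to {worst:.1f} minutes")  # 4.2 to 8.3 minutes
```

For multi-page documents, pass the total page count across all files rather than the number of files.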
Common Use Cases by Document Type
The following are the most common scenarios where custom field extraction replaces manual data entry. Each links to a more detailed workflow guide for that document type.
Invoices & Receipts
Extract Vendor Name, Invoice #, Date, PO Reference, Tax, Total — from any supplier format, any layout. One row per invoice.
Contracts & Agreements
Pull Parties, Contract Value, Effective Date, Expiry, Auto-Renewal terms, Governing Law from a batch of vendor agreements.
Vendor Quotes & RFQs
Turn supplier quote PDFs into a comparison table — Unit Price, MOQ, Lead Time, Payment Terms — across all vendors in one batch.
Batch AP / Expense Processing
Process 40–200 invoices or expense receipts in one run. One spreadsheet, one row per document, ready to paste into your AP tracker.
Direct to Google Sheets
Use the Google Sheets sidebar add-on to upload, specify fields, and append extracted data directly — without downloading a file.
Handwritten Forms & Checklists
Extract fields from handwritten intake forms, inspection checklists, and paper surveys — including checkboxes and signature detection.
Accuracy and Honest Limitations
For printed text in clear documents (digital PDFs, good-quality scans, well-lit photos), recognition accuracy can reach 99%. That covers most professional document workflows. A few scenarios are worth understanding before you rely on this in production:
High accuracy
- Digital PDFs from any source
- Office scans at 300 DPI or better
- Phone photos in good lighting, minimal blur
- Screenshots at standard resolution
- Multi-language documents — any language
- Clearly stated fields with standard meaning
Reduced accuracy or out of scope
- Dense handwriting or poor penmanship
- Low-resolution scans, heavy shadows, extreme skew
- Fields defined by complex cross-references or schedules
- Legal interpretation of ambiguous clause language
- Reasoning tasks ("is this a good deal?")
Frequently Asked Questions
Do I need to configure anything before uploading a new document type?
No. There are no templates to configure and no document-type-specific setup. Upload any document, type your column names in plain language, and the AI handles the rest. The same interface works for invoices, contracts, forms, and photos of handwritten notes.
Can I mix different document types in the same batch?
Yes. A single batch can contain invoices alongside receipts, or contracts alongside quote PDFs. Each file produces one row. If a column doesn't apply to a particular document type, that cell is blank. Mixed-format batches (PDFs, scans, images) also work in the same upload.
What file formats are accepted?
PDF, JPG, PNG, WebP, and AVIF. This covers digital PDFs, scanned documents saved in any of those image formats, phone photos, and screenshots. There's no need to convert files to a specific format before uploading.
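If a folder holds a mix of files, a small pre-upload check can separate the ones in an accepted format from the rest. This is a sketch based only on file extensions. The `.jpeg` entry is an assumption (it is the common alternate spelling of `.jpg`), and the function name is hypothetical.

```python
from pathlib import Path

# Extensions for the formats listed above. ".jpeg" is an assumption,
# included as the usual alternate spelling of ".jpg".
ACCEPTED = {".pdf", ".jpg", ".jpeg", ".png", ".webp", ".avif"}


def split_by_accepted(filenames):
    """Partition file names into (accepted, rejected) by case-insensitive extension."""
    accepted, rejected = [], []
    for name in filenames:
        target = accepted if Path(name).suffix.lower() in ACCEPTED else rejected
        target.append(name)
    return accepted, rejected


ok, skipped = split_by_accepted(
    ["inv_001.PDF", "receipt.jpg", "scan.tiff", "notes.docx"]
)
print(ok)       # ['inv_001.PDF', 'receipt.jpg']
print(skipped)  # ['scan.tiff', 'notes.docx']
```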
How specific can my column names be?
Quite specific. "Unit Price for Item A (per kg, excluding VAT)" is a valid column name and the AI will attempt to match that level of specificity. More specific column names generally produce cleaner output because the AI has a clearer target. Ambiguous names like "Amount" may capture different things across different document types — more specific names like "Invoice Total (incl. tax)" are better.
What output formats are available?
Excel (XLSX), CSV, and JSON. For most spreadsheet workflows, XLSX is the default. CSV works for importing into databases or other systems. JSON is available for developers integrating extraction into automated pipelines.
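For the database route, a CSV export can be loaded with nothing but Python's standard library. The sketch below assumes the export has a header row made of your column names. The sample data, table name, and function name are hypothetical, and the real export layout may differ.

```python
import csv
import io
import sqlite3

# Stand-in for a downloaded CSV export (assumed layout: header row, then one
# row per document, with blank cells for fields that were not found).
export = (
    "Vendor Name,Invoice #,Total\n"
    "ACME Supplies,1042,118.00\n"
    "Corner Cafe,,12.50\n"
)


def load_csv_into_sqlite(csv_text, table="invoices"):
    """Create an in-memory SQLite table from CSV text and return the connection."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    placeholders = ", ".join("?" for _ in header)
    # Store blank cells as NULL so "not in the document" stays queryable.
    rows = [[cell or None for cell in row] for row in reader]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return conn


conn = load_csv_into_sqlite(export)
missing = conn.execute(
    'SELECT "Vendor Name" FROM invoices WHERE "Invoice #" IS NULL'
).fetchall()
print(missing)  # [('Corner Cafe',)]
```

Mapping blank cells to NULL keeps the "field not present" signal intact, so a query can list the documents that lack a given field.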
Is there a way to collect documents from other people before extracting?
Yes. You can generate a Collection Link — a shareable URL — and send it to field staff, clients, or team members. They open the link, enter a short verification code, and upload files directly. No account needed on their side. Files land in your processing queue automatically. Useful for gathering expense receipts, inspection photos, or client documents before running a batch extraction.
Any document. Your columns. Structured output.
Upload a photo, scan, or PDF — type the fields you need — download a clean Excel file. No setup, no templates, no format restrictions.