Idea Gallery

Real-world document types mapped to extraction patterns. Find your use case and jump to the right tutorial.

How to Use This Page

  1. Find a document type similar to yours
  2. Note which patterns apply
  3. Follow the linked tutorials

Government & Public Records

Document Type | Primary Pattern | Secondary Pattern | Key Features
Inspection Reports | One Page = One Row | Messy Tables | Repeating forms, violation tables
FOIA Responses | Finding Sections | OCR Then Navigate | Often scanned, redacted sections
Campaign Finance | One Page = One Row | Messy Tables | Donor/expenditure tables
Police Incident Logs | Messy Tables | Multipage Content | Multi-line entries, continuation rows
Budget Documents | Multipage Content | Label-Value Extraction | Tables spanning pages
Permit Applications | Label-Value Extraction | OCR Then Navigate | Form fields, sometimes handwritten
Court Filings | Finding Sections | Label-Value Extraction | Case metadata, structured sections

Financial Documents

Document Type | Primary Pattern | Secondary Pattern | Key Features
Invoices | Label-Value Extraction | Messy Tables | Header fields + line items
Expense Reports | One Page = One Row | Label-Value Extraction | Receipts, grouped expenses
Bank Statements | Messy Tables | Multipage Content | Transaction tables
Tax Forms | Label-Value Extraction | OCR Then Navigate | Fixed layouts, OCR for handwritten entries
Annual Reports | Finding Sections | Label-Value Extraction | Narrative + financial tables
SEC Filings (10-K) | Finding Sections | Multipage Content | Long documents, nested tables

Legal Documents

Document Type | Primary Pattern | Secondary Pattern | Key Features
Contracts | Finding Sections | Label-Value Extraction | Clause extraction, party names
Real Estate Deeds | Label-Value Extraction | OCR Then Navigate | Often scanned, property details
NDAs | Finding Sections | - | Standard clauses
Court Orders | Finding Sections | Label-Value Extraction | Case number, parties, rulings
Patent Documents | Finding Sections | Multipage Content | Claims, descriptions

Healthcare & Insurance

Document Type | Primary Pattern | Secondary Pattern | Key Features
Insurance Policies | Finding Sections | Label-Value Extraction | Coverage details, exclusions
Explanation of Benefits | Label-Value Extraction | Messy Tables | Procedure codes, amounts
Medical Intake Forms | One Page = One Row | OCR Then Navigate | Checkboxes, handwritten
Lab Reports | Label-Value Extraction | Messy Tables | Test results, reference ranges

Academic & Research

Document Type | Primary Pattern | Secondary Pattern | Key Features
Research Papers | Finding Sections | Messy Tables | Abstract, methods, results
Syllabi | Finding Sections | Label-Value Extraction | Schedule tables, grading
Transcripts | Messy Tables | One Page = One Row | Course listings
Grant Applications | Finding Sections | Label-Value Extraction | Budget tables, milestones

HR & Administrative

Document Type | Primary Pattern | Secondary Pattern | Key Features
Resumes/CVs | Finding Sections | - | Work history, education
Job Applications | Label-Value Extraction | One Page = One Row | Form fields
W-2 / Tax Documents | Label-Value Extraction | - | Fixed box positions
Employee Reviews | Finding Sections | Label-Value Extraction | Ratings, comments

Pattern Selection Guide

Start Here Based on Your Goal

"I need one row per page/document" → One Page = One Row

"I need to find a label and get the value next to it" → Label-Value Extraction

"My table has problems (multi-line, merged cells, etc.)" → Messy Tables

"The content continues across multiple pages" → Multipage Content

"I need to extract a specific section" → Finding Sections

"The PDF is scanned/image-based" → OCR Then Navigate
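
The guide above is effectively a lookup from goal to pattern. A minimal sketch of that lookup in Python follows; the goal keys (`one_row_per_page`, `scanned`, and so on) and the `suggest_patterns` helper are hypothetical names chosen for this example, not part of Natural PDF.

```python
# Goal key -> pattern name, copied from the guide above.
# The keys are illustrative shorthand, not library identifiers.
PATTERN_FOR_GOAL = {
    "one_row_per_page": "One Page = One Row",
    "label_and_value": "Label-Value Extraction",
    "problem_table": "Messy Tables",
    "spans_pages": "Multipage Content",
    "specific_section": "Finding Sections",
    "scanned": "OCR Then Navigate",
}


def suggest_patterns(goals):
    """Return the pattern names for the given goal keys, in guide order."""
    unknown = set(goals) - PATTERN_FOR_GOAL.keys()
    if unknown:
        raise KeyError(f"unknown goals: {sorted(unknown)}")
    # Dicts preserve insertion order, so results follow the guide's ordering.
    return [name for key, name in PATTERN_FOR_GOAL.items() if key in goals]
```

For a scanned form where labelled fields are needed, `suggest_patterns({"scanned", "label_and_value"})` would list both Label-Value Extraction and OCR Then Navigate. The order returned is simply the order of the guide, not an order of execution.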

Combining Patterns

Most real extractions combine patterns:

  1. Invoice Processing
     - Label-Value Extraction for header fields
     - Messy Tables for line items

  2. Batch Form Processing
     - One Page = One Row for the loop
     - Label-Value Extraction for each field
     - OCR Then Navigate if scanned

  3. Report Analysis
     - Finding Sections to locate content
     - Multipage Content for long sections
     - Messy Tables for data tables within
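
As a rough illustration of the Invoice Processing combination, the sketch below applies Label-Value Extraction to the header fields and Messy Tables handling to the line items. It works on plain text that has already been pulled out of a PDF and does not use the Natural PDF API. The sample invoice, the function names (`extract_label_values`, `extract_line_items`, `process_invoice`), and the assumed row layout (description, quantity, amount) are all hypothetical.

```python
import re

# Hypothetical text of one invoice page, as already extracted from a PDF.
SAMPLE_INVOICE = """\
Invoice Number: INV-001
Date: 2024-01-15
Description Qty Amount
Widget 2 10.00
(blue, large)
Gadget 1 7.50
"""

# Assumed row layout: free-text description, integer quantity, two-decimal amount.
ROW = re.compile(r"^(?P<description>.+?)\s+(?P<qty>\d+)\s+(?P<amount>\d+\.\d{2})$")


def extract_label_values(lines, labels):
    """Label-Value Extraction: map each label to the text after 'Label:'."""
    values = {}
    for line in lines:
        for label in labels:
            match = re.match(rf"\s*{re.escape(label)}\s*:\s*(.+)", line)
            if match:
                values[label] = match.group(1).strip()
    return values


def extract_line_items(lines):
    """Messy Tables: parse rows, folding continuation lines into the previous row."""
    items = []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        match = ROW.match(text)
        if match:
            items.append({
                "description": match["description"],
                "qty": int(match["qty"]),
                "amount": float(match["amount"]),
            })
        elif items:
            # No quantity/amount on this line: treat it as a wrapped description.
            items[-1]["description"] += " " + text
    return items


def process_invoice(text):
    """Split the page at the table header, then apply one pattern to each part."""
    lines = text.splitlines()
    # Assumes the table header line starts with "Description"; raises StopIteration otherwise.
    header_index = next(i for i, line in enumerate(lines) if line.startswith("Description"))
    return {
        "fields": extract_label_values(lines[:header_index], ["Invoice Number", "Date"]),
        "items": extract_line_items(lines[header_index + 1:]),
    }
```

Calling `process_invoice(SAMPLE_INVOICE)` returns the two header fields plus two line items, with the wrapped "(blue, large)" line merged into the Widget row. Keeping the two patterns in separate functions means either one could be swapped for a library-based version without touching the other.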

Not Finding Your Use Case?

These patterns cover most extraction needs. If your document doesn't fit:

  1. Check if it's a layout issue - Try layout analysis
  2. Check if it needs AI - Consider document Q&A
  3. Check the tutorials - Browse the getting started guide

Contributing

Have a document type that worked well with Natural PDF? Consider contributing an example to help others with similar documents.