Idea Gallery
Real-world document types mapped to extraction patterns. Find your use case and jump to the right tutorial.
How to Use This Page
- Find a document type similar to yours
- Note which patterns apply
- Follow the linked tutorials
Government & Public Records
| Document Type | Primary Pattern | Secondary Pattern | Key Features |
|---|---|---|---|
| Inspection Reports | One Page = One Row | Messy Tables | Repeating forms, violation tables |
| FOIA Responses | Finding Sections | OCR Then Navigate | Often scanned, redacted sections |
| Campaign Finance | One Page = One Row | Messy Tables | Donor/expenditure tables |
| Police Incident Logs | Messy Tables | Multipage Content | Multi-line entries, continuation rows |
| Budget Documents | Multipage Content | Label-Value Extraction | Tables spanning pages |
| Permit Applications | Label-Value Extraction | OCR Then Navigate | Form fields, sometimes handwritten |
| Court Filings | Finding Sections | Label-Value Extraction | Case metadata, structured sections |
Financial Documents
| Document Type | Primary Pattern | Secondary Pattern | Key Features |
|---|---|---|---|
| Invoices | Label-Value Extraction | Messy Tables | Header fields + line items |
| Expense Reports | One Page = One Row | Label-Value Extraction | Receipts, grouped expenses |
| Bank Statements | Messy Tables | Multipage Content | Transaction tables |
| Tax Forms | Label-Value Extraction | OCR Then Navigate | Fixed layouts, OCR for handwritten entries |
| Annual Reports | Finding Sections | Label-Value Extraction | Narrative + financial tables |
| SEC Filings (10-K) | Finding Sections | Multipage Content | Long documents, nested tables |
Legal Documents
| Document Type | Primary Pattern | Secondary Pattern | Key Features |
|---|---|---|---|
| Contracts | Finding Sections | Label-Value Extraction | Clause extraction, party names |
| Real Estate Deeds | Label-Value Extraction | OCR Then Navigate | Often scanned, property details |
| NDAs | Finding Sections | - | Standard clauses |
| Court Orders | Finding Sections | Label-Value Extraction | Case number, parties, rulings |
| Patent Documents | Finding Sections | Multipage Content | Claims, descriptions |
Healthcare & Insurance
| Document Type | Primary Pattern | Secondary Pattern | Key Features |
|---|---|---|---|
| Insurance Policies | Finding Sections | Label-Value Extraction | Coverage details, exclusions |
| Explanation of Benefits | Label-Value Extraction | Messy Tables | Procedure codes, amounts |
| Medical Intake Forms | One Page = One Row | OCR Then Navigate | Checkboxes, handwritten entries |
| Lab Reports | Label-Value Extraction | Messy Tables | Test results, reference ranges |
Academic & Research
| Document Type | Primary Pattern | Secondary Pattern | Key Features |
|---|---|---|---|
| Research Papers | Finding Sections | Messy Tables | Abstract, methods, results |
| Syllabi | Finding Sections | Label-Value Extraction | Schedule tables, grading |
| Transcripts | Messy Tables | One Page = One Row | Course listings |
| Grant Applications | Finding Sections | Label-Value Extraction | Budget tables, milestones |
HR & Administrative
| Document Type | Primary Pattern | Secondary Pattern | Key Features |
|---|---|---|---|
| Resumes/CVs | Finding Sections | - | Work history, education |
| Job Applications | Label-Value Extraction | One Page = One Row | Form fields |
| W-2 / Tax Documents | Label-Value Extraction | - | Fixed box positions |
| Employee Reviews | Finding Sections | Label-Value Extraction | Ratings, comments |
Pattern Selection Guide
Start Here Based on Your Goal
"I need one row per page/document" → One Page = One Row
"I need to find a label and get the value next to it" → Label-Value Extraction
"My table has problems (multi-line, merged cells, etc.)" → Messy Tables
"The content continues across multiple pages" → Multipage Content
"I need to extract a specific section" → Finding Sections
"The PDF is scanned/image-based" → OCR Then Navigate
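The mapping above can be sketched as a small lookup. This is an illustrative helper only: the goal keys and the `pick_patterns` function are hypothetical names, not part of Natural PDF, and placing OCR first is an assumption based on the pattern's name ("OCR Then Navigate").

```python
# Hypothetical goal keys; the pattern names on the right come from this page.
GOAL_TO_PATTERN = {
    "one_row_per_page": "One Page = One Row",
    "label_value": "Label-Value Extraction",
    "messy_table": "Messy Tables",
    "spans_pages": "Multipage Content",
    "specific_section": "Finding Sections",
    "scanned": "OCR Then Navigate",
}


def pick_patterns(*goals):
    """Return the patterns for the given goals, listing OCR first if the PDF is scanned."""
    unknown = [g for g in goals if g not in GOAL_TO_PATTERN]
    if unknown:
        raise KeyError(f"Unknown goal(s): {unknown}")
    # Drop duplicate goals while keeping the caller's order.
    unique = list(dict.fromkeys(goals))
    # Assumption: OCR runs before the text-based patterns, so move it to the front.
    # sorted() is stable, so the remaining goals keep their relative order.
    ordered = sorted(unique, key=lambda g: g != "scanned")
    return [GOAL_TO_PATTERN[g] for g in ordered]
```

For example, `pick_patterns("label_value", "scanned")` returns `["OCR Then Navigate", "Label-Value Extraction"]`, which matches the primary and secondary patterns listed for scanned forms such as Permit Applications, in reverse order.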
Combining Patterns
Most real extractions combine patterns:
- Invoice Processing
  - Label-Value Extraction for header fields
  - Messy Tables for line items
- Batch Form Processing
  - One Page = One Row for the loop
  - Label-Value Extraction for each field
  - OCR Then Navigate if scanned
- Report Analysis
  - Finding Sections to locate content
  - Multipage Content for long sections
  - Messy Tables for data tables within
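As a rough illustration of the Batch Form Processing combination, the sketch below loops over pages (One Page = One Row) and pulls each field by its label (Label-Value Extraction). It works on plain strings that stand in for already-extracted page text and does not use the Natural PDF API. The sample pages, field names, and helper functions are all hypothetical.

```python
import re

# Hypothetical page text standing in for what a PDF library would return per page.
PAGES = [
    "Site: Main St Cafe\nDate: 2024-01-05\nViolation Count: 3",
    "Site: Harbor Grill\nDate: 2024-01-09\nViolation Count: 0",
]

# Hypothetical labels to look for on every page.
FIELDS = ["Site", "Date", "Violation Count"]


def extract_value(text, label):
    """Return the text after 'label:' on the same line, or None if the label is absent."""
    pattern = rf"^{re.escape(label)}:[ \t]*(.*)$"
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


def pages_to_rows(pages, fields):
    """Build one dict per page, with one key per requested field."""
    return [{field: extract_value(page, field) for field in fields} for page in pages]


rows = pages_to_rows(PAGES, FIELDS)
for row in rows:
    print(row)
```

With a scanned PDF, an OCR step would have to produce the page text before this loop runs. The loop itself would stay the same.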
Not Finding Your Use Case?
These patterns cover most extraction needs. If your document doesn't fit:
- Check if it's a layout issue: try layout analysis
- Check if it needs AI: consider document Q&A
- Check the tutorials: browse the getting started guide
Contributing
Have a document type that worked well with Natural PDF? Consider contributing an example to help others with similar documents.