Tips & Tricks¶
Get the most out of Kaizen OCR & PDF with these expert tips and best practices.
OCR Best Practices¶
Choosing the Right OCR Engine¶
| Scenario | Recommended Engine |
|---|---|
| Simple text documents | Tesseract OCR (fast) |
| Tables and structured layouts | Paddle AI OCR |
| Multi-language documents | Paddle AI OCR |
| Low-quality or noisy images | Paddle AI OCR + Preprocessing |
| High-volume batch processing | Tesseract OCR (faster per page) |
| Handwritten text | Paddle AI OCR |
Image Quality Tips¶
Resolution Matters
- Minimum: 150 DPI for readable results
- Recommended: 300 DPI for best accuracy
- Photos: Make sure the text is in sharp focus and evenly lit
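To relate the DPI guidance to actual image sizes, a small sketch of the arithmetic may help: pixels = inches × DPI. The helper names (`pixels_for`, `effective_dpi`, `rate_dpi`) are hypothetical and not part of Kaizen OCR & PDF. The 150 and 300 DPI cut-offs come from the list above.

```python
def pixels_for(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a page scanned at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def effective_dpi(pixel_width: int, physical_width_in: float) -> float:
    """DPI implied by an image's pixel width and the page's physical width."""
    return pixel_width / physical_width_in


def rate_dpi(dpi: float) -> str:
    """Classify a DPI value against the 150 / 300 DPI guidance."""
    if dpi < 150:
        return "below minimum"
    if dpi < 300:
        return "readable"
    return "recommended"


# US Letter (8.5 x 11 in) scanned at 300 DPI
print(pixels_for(8.5, 11, 300))             # (2550, 3300)
# A 1275-pixel-wide image of the same page works out to 150 DPI
print(rate_dpi(effective_dpi(1275, 8.5)))   # readable
```

If an image's effective DPI falls in the "below minimum" range, rescanning is likely to help more than any preprocessing.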
Preprocessing for Better Results
Before running OCR on difficult images, use these preprocessing steps in order:
- Crop & Deskew -- Straighten the image and remove borders
- Grayscale -- Remove color distractions
- Denoising -- Clean up image noise
- Sharpening -- Enhance text edges
- Thresholding -- Convert to high-contrast black & white
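To make two of these steps concrete, here is a minimal pure-Python sketch of grayscale conversion and thresholding on a tiny image stored as nested lists. It is an illustration only: how Kaizen OCR & PDF implements preprocessing internally is not documented here, and Otsu's method is just one common way to pick a threshold automatically. All function names are hypothetical. Cropping, deskewing, denoising, and sharpening are left out for brevity.

```python
def to_grayscale(rgb_rows):
    """Convert rows of (r, g, b) tuples to rows of 0-255 gray levels (BT.601 luma weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in rgb_rows
    ]


def otsu_threshold(gray_rows):
    """Pick the gray level that best separates dark from light pixels (Otsu's method)."""
    hist = [0] * 256
    for row in gray_rows:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_level, best_variance = 0, -1.0
    dark_count, dark_sum = 0, 0
    for level in range(256):
        dark_count += hist[level]
        dark_sum += level * hist[level]
        light_count = total - dark_count
        if dark_count == 0 or light_count == 0:
            continue  # one class is empty, nothing to separate
        mean_dark = dark_sum / dark_count
        mean_light = (sum_all - dark_sum) / light_count
        # Between-class variance (up to a constant factor)
        variance = dark_count * light_count * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance
    return best_level


def binarize(gray_rows, threshold):
    """Map pixels above the threshold to white (255) and the rest to black (0)."""
    return [[255 if value > threshold else 0 for value in row] for row in gray_rows]


# Tiny 2 x 3 "scan": dark text pixels on a light background
image = [
    [(20, 20, 20), (30, 30, 30), (220, 220, 220)],
    [(25, 25, 25), (210, 210, 210), (230, 230, 230)],
]
gray = to_grayscale(image)
level = otsu_threshold(gray)
print(level)                  # 30
print(binarize(gray, level))  # [[0, 0, 255], [0, 255, 255]]
```

Thresholding comes last in the list because it discards gray-level detail. Denoising and sharpening work best while that detail is still available.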
Scanning Tips¶
- Place documents flat on the scanner
- Use a dark background behind single sheets to avoid bleed-through
- Scan at 300 DPI or higher for OCR processing (150 DPI is the bare minimum for readable results)
- Save scans as PNG or TIFF for lossless quality (avoid JPEG for text documents)
PDF Tips¶
When to Use Each PDF Feature¶
| Task | Feature to Use |
|---|---|
| Extract text from a scanned PDF | Easy OCR or Advanced OCR |
| Convert PDF to Word | Convert Documents |
| Pull photos/graphics from a PDF | Extract Images from PDF |
| Protect a PDF before sharing | PDF Add Password |
| Remove password for easier access | PDF Remove Password |
Batch Processing¶
- Add multiple files at once to process them as a batch
- Use Easy OCR for quick batch extraction
- Use Export All to save all results in one go
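When a folder mixes OCR inputs with unrelated files, it can help to list the candidates before adding them. The sketch below is a hypothetical helper, not part of Kaizen OCR & PDF. The extension set is an assumption, so check it against the formats your version accepts.

```python
from pathlib import Path

# Assumed input types; adjust to the formats the application actually accepts.
SUPPORTED = {".pdf", ".png", ".tif", ".tiff", ".jpg", ".jpeg"}


def collect_inputs(folder):
    """Return supported files in `folder`, sorted by name for a predictable batch order."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED
    )
```

Calling `collect_inputs("scans")` on a folder containing `a.png`, `b.txt`, and `c.PDF` returns the paths for `a.png` and `c.PDF`, which can then be dragged into the application as one batch.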
PDF Security¶
Encryption Strength
- Standard encryption is sufficient for most documents
- Use Strong encryption for highly sensitive financial, legal, or medical documents
- Always keep a backup of the original unprotected file
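The backup advice can be automated with a few lines of Python run before protecting a file. This is a minimal sketch using only the standard library. `backup_original` and the backup folder are hypothetical names, and the copy is made outside the application.

```python
import shutil
from pathlib import Path


def backup_original(pdf_path, backup_dir):
    """Copy the unprotected PDF into backup_dir and return the path of the copy."""
    source = Path(pdf_path)
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    if target.exists():
        # Refuse to overwrite an earlier backup silently
        raise FileExistsError(f"Backup already exists: {target}")
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target
```

Store the backup folder somewhere at least as well protected as the documents themselves, since the copies are unencrypted.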
Workflow Recommendations¶
For Scanned Documents¶
Scan at 300 DPI → Crop & Deskew → Advanced OCR (with preprocessing) → Export
For Photographed Documents¶
Take photo (good lighting, steady hand) → Crop & Deskew → Easy OCR → Copy Text
For Document Conversion¶
Add source file → Select output format → Start → Done
For Secure PDF Sharing¶
Open PDF → Add Password (Strong) → Share protected file → Recipient uses Remove Password with the shared password
Keyboard Shortcuts¶
| Shortcut | Action |
|---|---|
| F6 | Take a screenshot for OCR |
| Drag & Drop | Add files to any feature quickly |
Performance Optimization¶
- Close unnecessary applications when processing large batches
- Reduce image resolution to 300 DPI if the original is much higher (e.g., 600 DPI or more)
- Process in smaller batches if the application becomes slow
- Use Tesseract OCR when speed is more important than maximum accuracy
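Two of these tips come down to simple arithmetic: scaling an oversized scan to 300 DPI, and splitting a long file list into smaller batches. The sketch below uses hypothetical helper names, and the batch size of 3 is only for illustration.

```python
def downscale_factor(source_dpi, target_dpi=300):
    """Scale factor that brings an image down to target_dpi; never upscales."""
    return min(1.0, target_dpi / source_dpi)


def scaled_size(width_px, height_px, source_dpi, target_dpi=300):
    """Pixel dimensions after downscaling from source_dpi to target_dpi."""
    factor = downscale_factor(source_dpi, target_dpi)
    return round(width_px * factor), round(height_px * factor)


def chunks(items, size):
    """Yield successive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# A US Letter page scanned at 600 DPI, reduced to 300 DPI
print(scaled_size(5100, 6600, 600))       # (2550, 3300)
# Seven files split into batches of three
print(list(chunks(list(range(7)), 3)))    # [[0, 1, 2], [3, 4, 5], [6]]
```

Halving the DPI cuts the pixel count to a quarter, which is why reducing 600 DPI scans tends to speed up large batches noticeably.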
Getting the Best OCR Results¶
- Start with Easy OCR -- it works well for most documents
- Switch to Advanced OCR only when Easy OCR doesn't give satisfactory results
- Use Paddle AI for structured content (tables, forms, invoices)
- Use Tesseract for simple text-heavy documents (letters, articles, books)
- Enable preprocessing for scanned or photographed documents
- Use Crop & Deskew first for tilted or misaligned images
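The checklist above can be read as a small decision procedure. The sketch below encodes it as a hypothetical function, `recommend_steps`. It assumes that the engine and preprocessing choices are made within Advanced OCR, which this page implies but does not state outright.

```python
def recommend_steps(structured=False, scanned_or_photo=False, tilted=False, easy_ocr_ok=True):
    """Turn the checklist into an ordered list of steps to try."""
    steps = []
    if tilted:
        # Straighten tilted or misaligned images before any OCR
        steps.append("Crop & Deskew")
    if easy_ocr_ok:
        # Easy OCR is the default starting point
        steps.append("Easy OCR")
        return steps
    # Easy OCR was not good enough: choose an engine and preprocessing setting
    engine = "Paddle AI OCR" if structured else "Tesseract OCR"
    preprocessing = "on" if scanned_or_photo else "off"
    steps.append(f"Advanced OCR ({engine}, preprocessing {preprocessing})")
    return steps


print(recommend_steps(tilted=True))
# ['Crop & Deskew', 'Easy OCR']
print(recommend_steps(structured=True, scanned_or_photo=True, easy_ocr_ok=False))
# ['Advanced OCR (Paddle AI OCR, preprocessing on)']
```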