
Tips & Tricks

Get the most out of Kaizen OCR & PDF with these expert tips and best practices.

OCR Best Practices

Choosing the Right OCR Engine

Scenario | Recommended Engine
Simple text documents | Tesseract OCR (fast)
Tables and structured layouts | Paddle AI OCR
Multi-language documents | Paddle AI OCR
Low-quality or noisy images | Paddle AI OCR + Preprocessing
High-volume batch processing | Tesseract OCR (faster per page)
Handwritten text | Paddle AI OCR
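
The recommendations above amount to a simple lookup. Here is a minimal sketch in Python, assuming hypothetical scenario keys and a hypothetical helper named `recommend_engine`. Neither is part of Kaizen OCR & PDF; they only restate the table in code form.

```python
# Hypothetical lookup mirroring the recommendation table. The keys and the
# function name are illustrative and are not part of Kaizen OCR & PDF.
RECOMMENDED_ENGINE = {
    "simple_text": "Tesseract OCR",
    "tables": "Paddle AI OCR",
    "multi_language": "Paddle AI OCR",
    "low_quality": "Paddle AI OCR + Preprocessing",
    "high_volume": "Tesseract OCR",
    "handwriting": "Paddle AI OCR",
}


def recommend_engine(scenario: str) -> str:
    """Return the engine the table recommends for a scenario key."""
    try:
        return RECOMMENDED_ENGINE[scenario]
    except KeyError:
        raise ValueError(f"unknown scenario: {scenario!r}") from None


print(recommend_engine("tables"))  # Paddle AI OCR
```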

Image Quality Tips

Resolution Matters

  • Minimum: 150 DPI for readable results
  • Recommended: 300 DPI for best accuracy
  • Photos: Ensure the text is in sharp focus with good lighting
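
To check whether an existing image meets these targets, divide its pixel width by the physical width of the page in inches. The sketch below is illustrative only; the function names are assumptions, not features of the application.

```python
def effective_dpi(pixels: int, inches: float) -> float:
    """Dots per inch along one side: pixel count divided by physical length."""
    if inches <= 0:
        raise ValueError("inches must be positive")
    return pixels / inches


def rate_resolution(dpi: float) -> str:
    """Classify a DPI value against the 150 / 300 DPI guidance."""
    if dpi < 150:
        return "too low"
    if dpi < 300:
        return "readable"
    return "recommended"


# A US Letter page is 8.5 in wide, so a 2550-pixel-wide scan is 300 DPI.
print(rate_resolution(effective_dpi(2550, 8.5)))  # recommended
```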

Preprocessing for Better Results

Before running OCR on difficult images, use these preprocessing steps in order:

  1. Crop & Deskew -- Straighten the image and remove borders
  2. Grayscale -- Remove color distractions
  3. Denoising -- Clean up image noise
  4. Sharpening -- Enhance text edges
  5. Thresholding -- Convert to high-contrast black & white
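
To make two of these steps concrete, the following sketch applies grayscale conversion and thresholding to a tiny image held as plain Python lists. It is an illustration under stated assumptions and not how the application implements them: the luminance weights are the common BT.601 values, the cutoff of 128 is an arbitrary fixed choice, and cropping, deskewing, denoising, and sharpening are omitted.

```python
def to_grayscale(rgb_rows):
    """Convert rows of (r, g, b) pixels to 0-255 luminance values (BT.601 weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_rows
    ]


def threshold(gray_rows, cutoff=128):
    """Map each pixel to 0 (black) or 255 (white) around a fixed cutoff."""
    return [[255 if value >= cutoff else 0 for value in row] for row in gray_rows]


# A tiny 2x2 "image": dark ink on a light, slightly tinted page.
image = [
    [(20, 20, 30), (240, 235, 220)],
    [(250, 250, 250), (60, 50, 40)],
]
binary = threshold(to_grayscale(image))
print(binary)  # [[0, 255], [255, 0]]
```

Grayscale runs before thresholding because the cutoff compares a single brightness value per pixel, which is the same order as the list above.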

Scanning Tips

  • Place documents flat on the scanner
  • Use a dark background behind single sheets to avoid bleed-through
  • Scan at 300 DPI for OCR processing (150 DPI is the minimum for readable results)
  • Save scans as PNG or TIFF for lossless quality (avoid JPEG for text documents)

PDF Tips

When to Use Each PDF Feature

Task | Feature to Use
Extract text from a scanned PDF | Easy OCR or Advanced OCR
Convert PDF to Word | Convert Documents
Pull photos/graphics from a PDF | Extract Images from PDF
Protect a PDF before sharing | PDF Add Password
Remove password for easier access | PDF Remove Password

Batch Processing

  • You can add multiple files at once for batch processing
  • Use Easy OCR for quick batch extraction
  • Use Export All to save all results in one go

PDF Security

Encryption Strength

  • Standard encryption is sufficient for most documents
  • Use Strong encryption for highly sensitive financial, legal, or medical documents
  • Always keep a backup of the original unprotected file

Workflow Recommendations

For Scanned Documents

Scan at 300 DPI → Crop & Deskew → Advanced OCR (with preprocessing) → Export

For Photographed Documents

Take photo (good lighting, steady hand) → Crop & Deskew → Easy OCR → Copy Text

For Document Conversion

Add source file → Select output format → Start → Done

For Secure PDF Sharing

Open PDF → Add Password (Strong) → Share protected file → Recipient uses Remove Password

Keyboard Shortcuts

Shortcut | Action
F6 | Take a screenshot for OCR
Drag & Drop | Add files to any feature quickly

Performance Optimization

  • Close unnecessary applications when processing large batches
  • Reduce image resolution to 300 DPI if the original is much higher (e.g., 600+ DPI)
  • Process in smaller batches if the application becomes slow
  • Use Tesseract OCR when speed is more important than maximum accuracy
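
Two of these tips are simple arithmetic: the pixel size an image should have after reducing it to 300 DPI, and how to split a long file list into smaller batches. The sketch below shows both. The helper names and file names are hypothetical, and the resampling itself would be done in an image editor or scanner software.

```python
def downscale_size(width, height, source_dpi, target_dpi=300):
    """Pixel size after resampling from source_dpi to target_dpi.

    Images already at or below the target are left unchanged.
    """
    if source_dpi <= target_dpi:
        return width, height
    scale = target_dpi / source_dpi
    return round(width * scale), round(height * scale)


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# A 600 DPI US Letter scan (5100 x 6600 px) halves in each dimension.
print(downscale_size(5100, 6600, 600))  # (2550, 3300)

# Hypothetical file names, processed five at a time.
pages = [f"page_{n:03d}.png" for n in range(1, 13)]
print([len(batch) for batch in batches(pages, 5)])  # [5, 5, 2]
```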

Getting the Best OCR Results

  1. Start with Easy OCR -- It works great for most documents
  2. Switch to Advanced OCR only when Easy OCR doesn't give satisfactory results
  3. Use Paddle AI for structured content (tables, forms, invoices)
  4. Use Tesseract for simple text-heavy documents (letters, articles, books)
  5. Enable preprocessing for scanned or photographed documents
  6. Use Crop & Deskew first for tilted or misaligned images