ocr_document_scanner

Python-based optical character recognition (OCR) script for automatically processing documents and saving them to SMB folders.

Install Guide:

  1. Install Tesseract OCR for your operating system (https://tesseract-ocr.github.io/tessdoc/Installation.html)

  2. Install the language packs, or copy them manually into C:\Program Files\Tesseract-OCR\tessdata (download from https://github.com/tesseract-ocr/tessdata_fast)
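
To check the two steps above, a small script can ask Tesseract which language packs it sees. This is a minimal sketch and not part of `ocr_scan.py`: the helper names `parse_langs` and `installed_languages` are hypothetical, and it assumes the `tesseract` executable is on `PATH`.

```python
import shutil
import subprocess


def parse_langs(output: str) -> list[str]:
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first line is a header such as "List of available languages (2):".
    if lines and lines[0].lower().startswith("list of"):
        lines = lines[1:]
    return lines


def installed_languages(binary: str = "tesseract") -> list[str]:
    """Return the language packs Tesseract reports; raise if the binary is missing."""
    exe = shutil.which(binary)
    if exe is None:
        raise FileNotFoundError(f"{binary!r} was not found on PATH")
    result = subprocess.run(
        [exe, "--list-langs"], capture_output=True, text=True, check=True
    )
    # Assumption: some older Tesseract builds write the list to stderr,
    # so fall back to it when stdout is empty.
    return parse_langs(result.stdout or result.stderr)
```

Calling `installed_languages()` after installation should return codes such as `eng` or `deu`, depending on which `.traineddata` files were copied into `tessdata`. If a code you expect is missing, the corresponding file is probably not in the directory Tesseract reads.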