hpr3998 :: Using open source OCR to digitize my mom's book
How I used open source tools such as gphoto2 and the OCR software tesseract to digitize pages
Hosted by Deltaray on Wednesday, 2023-11-29. Flagged as Clean and released under a CC-BY-SA license.
ocr, opensource, grep, scripts, programming.
The show is available on the Internet Archive at: https://archive.org/details/hpr3998
Duration: 00:30:47
To improve the speed of my workflow, I wrote a Bash script that uses the
open source programs gphoto2, tesseract, grep and ImageMagick to digitize
my mom's 338-page book. Here is the link to the script:
https://github.com/deltaray/ocr-script
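
For readers who want a feel for how these four tools can fit together, here is a rough sketch of a per-page capture and OCR step. It is not the script from the repository above. The function names, the page-NNN file naming, the grayscale conversion, the DRY_RUN switch and the grep sanity check are all assumptions made up for illustration. It also assumes a camera that gphoto2 can trigger over USB and that the English language data for tesseract is installed.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; NOT the script from the linked repository.
set -euo pipefail

# Hypothetical switch: set DRY_RUN=1 to print commands instead of running them.
DRY_RUN="${DRY_RUN:-0}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Zero-padded base name so files sort in page order (hypothetical scheme).
page_base() {
    printf 'page-%03d' "$1"
}

# Photograph, clean up and OCR a single page, given its page number.
digitize_page() {
    local base
    base="$(page_base "$1")"

    # 1. Trigger the tethered camera and download the photo.
    run gphoto2 --capture-image-and-download --force-overwrite \
        --filename "${base}.jpg"

    # 2. Convert to grayscale with ImageMagick (assumed preprocessing step).
    run convert "${base}.jpg" -colorspace Gray "${base}.png"

    # 3. OCR the page; tesseract writes its result to ${base}.txt.
    run tesseract "${base}.png" "${base}" -l eng

    # 4. Sanity check with grep: warn if no word of 3+ letters was recognized.
    if [ "$DRY_RUN" != "1" ] && ! grep -q '[[:alpha:]]\{3,\}' "${base}.txt"; then
        echo "warning: no text recognized on page $1" >&2
    fi
}
```

To try it without a camera attached, source the file and run `DRY_RUN=1 digitize_page 12`, which only prints the three commands it would execute. For a real run, a loop over page numbers that waits for a key press between pages (time to turn the page) and then calls `digitize_page` with the current number would cover a whole book.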