Hacker Public Radio

Your ideas, projects, opinions - podcasted.

New episodes every weekday Monday through Friday.
hpr3998 :: Using open source OCR to digitize my mom's book

How I used open source tools such as gphoto2 and the OCR software tesseract to digitize pages

Hosted by Deltaray on Wednesday, 2023-11-29. The show is flagged as Clean and is released under a CC-BY-SA license.
Tags: ocr, opensource, grep, scripts, programming.
The show is available on the Internet Archive at: https://archive.org/details/hpr3998

Duration: 00:30:47

To speed up my workflow, I wrote a bash script that uses the open source programs gphoto2, tesseract, grep, and ImageMagick to digitize my mom's 338-page book. Here is the link to the script: https://github.com/deltaray/ocr-script
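
For readers who want a feel for how such a pipeline fits together, here is a rough sketch of the per-page steps. It is not the script linked above, and it is written in Python rather than bash so that the commands can be inspected without a camera attached. The function name `page_commands`, the `scans` directory, the file naming scheme, the grayscale cleanup step, and the `eng` language code are all assumptions made for illustration. The show notes do not say how grep is used, so it is left out here.

```python
import shlex
from pathlib import Path


def page_commands(page: int, workdir: str = "scans", lang: str = "eng") -> list[list[str]]:
    """Build (but do not run) the commands for digitizing one page.

    Hypothetical helper: the file layout and the cleanup step are assumptions,
    not taken from the linked script.
    """
    base = Path(workdir) / f"page-{page:03d}"
    raw = f"{base}.jpg"
    cleaned = f"{base}-clean.png"
    return [
        # Trigger the tethered camera and download the photo under a predictable name.
        ["gphoto2", "--capture-image-and-download", "--filename", raw, "--force-overwrite"],
        # Assumed cleanup: convert the photo to grayscale with ImageMagick.
        ["convert", raw, "-colorspace", "Gray", cleaned],
        # Run OCR; tesseract appends ".txt" to the output base name itself.
        ["tesseract", cleaned, str(base), "-l", lang],
    ]


if __name__ == "__main__":
    # Print the plan for the first three pages as copy-pasteable shell lines.
    for page in range(1, 4):
        for cmd in page_commands(page):
            print(shlex.join(cmd))
```

To actually run the steps, each command list could be passed to `subprocess.run(cmd, check=True)`, which stops the loop as soon as a capture or OCR step fails.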


Comments

Comment #1 posted on 2023-11-29 20:39:52 by brian-in-ohio

good show

Enjoyed every minute of this show. It's something I've wanted to try, now I think I will. Nice little rant at the end, hit the nail on the head. Keep the shows coming.

Comment #2 posted on 2023-12-03 13:19:03 by Deltaray

Thanks

Thanks, I appreciate that feedback and good luck with your endeavors.
