
Hacker Public Radio

Your ideas, projects, opinions - podcasted.

New episodes Monday through Friday.

hpr3315 :: tesseract optical character recognition

How to use this amazing tool


Hosted by Ken Fallon on 2021-04-16. This episode is flagged as Clean and is released under a CC-BY-SA license.
Tesseract, OCR, optical character recognition.


Duration: 00:02:08


Tesseract (software)

From Wikipedia, the free encyclopedia

Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License. Originally developed by Hewlett-Packard as proprietary software in the 1980s, it was released as open source in 2005 and development has been sponsored by Google since 2006.
In 2006, Tesseract was considered one of the most accurate open-source OCR engines then available.

$ tesseract -l eng english-page.jpg english
$ tesseract -l nld dutch-page.jpg dutch
$ ls
dutch.txt english.txt 
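
The two commands above each convert a single image. As a minimal sketch of how the same invocation might be applied to a whole directory of scans, the script below loops over every `.jpg` file and passes tesseract an output base name derived from the image name. The helper names `out_base` and `ocr_dir` and the `./scans` directory are hypothetical, chosen for illustration only; the sketch also assumes the language data for the `-l` code you pass is installed.

```shell
#!/bin/sh
# Sketch: OCR every JPEG in a directory with one language model.
# out_base and ocr_dir are hypothetical helper names, not part of tesseract.

# Strip the directory and the .jpg extension: "scans/page-1.jpg" -> "page-1"
out_base() {
    basename "$1" .jpg
}

# Usage: ocr_dir LANG DIR
# Writes DIR/<name>.txt next to each DIR/<name>.jpg.
ocr_dir() {
    lang=$1
    dir=$2
    for img in "$dir"/*.jpg; do
        # Skip the unexpanded pattern when no JPEG files are present.
        [ -e "$img" ] || continue
        # tesseract appends ".txt" to the output base by itself.
        tesseract -l "$lang" "$img" "$dir/$(out_base "$img")"
    done
}
```

After sourcing the file, a call such as `ocr_dir eng ./scans` would run one tesseract command per image. Running `tesseract --list-langs` shows which language codes are available on your system.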



Comment #1 posted on 2022-02-13 14:56:47 by Ken Fallon

Yet another one

Load memory ....
