OCR apps that actually get it done?

I have been testing some of the OCR apps (or apps that include such an option) in the F-Droid repository, and honestly, none of them gives proper results.

Even when black text on white paper is the only thing the app has to transcribe, there are always mistakes. Sure, if the text I photograph is in English the result is slightly better, but the whole idea is to photograph text in another language (German, French, etc.) and translate it into English. The Translator app by David Ventura (using Mozilla models for local translation) gets the translation part done (not great, but pretty good for a fully local, on-the-fly option). But the OCR is still wrong most of the time.

I have downloaded the models and addons marked as “best” and tried setting the accuracy higher, but… it doesn’t seem to help.

I am open to using an AI model if it can be run locally (on my GNU/Linux laptop I have a semi-working solution, but I wanted something for the smartphone as well). Any other apps that could be helpful? Thanks!

Did you try

Hi!

Yes, I did try that one, with the “best” setting, and it doesn’t quite give high-quality results…

I also tried this one:

With this addon:

As well as this one:

Which works well enough for translation, but OCR is kinda weak.

So… what other alternatives are there? :frowning:

There are good OCR models, but for some unknown reason all of these apps use Tesseract. :person_shrugging: You can use rust-paddle-ocr (GitHub: zibo-chen/rust-paddle-ocr) in Termux. It is a high-performance OCR library powered by PaddleOCR v4/v5 with an MNN backend. It supports 10+ languages (Chinese, English, Japanese, Korean, Arabic, Cyrillic, Thai, etc.) and provides a Rust crate, a C API, and CLI tools.

Thanks. I will take a look.

I’m currently preparing an experimental RC build of MakeACopy with optional offline PaddleOCR V5 support (GitHub-only paddle flavor for now).

Tesseract remains the default OCR engine, but the early PaddleOCR results look promising so far.

I’d be happy to hear feedback once the RC is available.

Hey everyone!

Thanks for the replies!

I would prefer a native app rather than a Termux solution, and searching for apps that use other models didn’t turn up much.

I am happy to hear MakeACopy will include that option soon. @egdels do let us know when it’s available and I will provide feedback. Thanks!

@GNUser

First experimental RC is ready.

Please try one of the paddle-flavored release APKs from GitHub and enable Paddle OCR in the OCR settings screen.

This is still experimental, so feedback about OCR quality, crashes, slow devices, wrong recognition results or confusing UI behavior would be very helpful.

I’m still unsure how I should publish this on F-Droid and Google Play, since the app becomes significantly larger with PaddleOCR included.

Thanks! Will do!

The APKs on GitHub are F-Droid compliant, yes? No proprietary blobs? I know some apps are made in two “versions”, one for F-Droid and another for the Play Store, which is why I ask.

Yes - the regular Google Play and F-Droid versions are currently identical.

The PaddleOCR variant is experimental for now and is only published on GitHub, mainly because of the significantly larger APK size (the models are included for fully offline use; MakeACopy will always remain offline-first).

It is still fully open source. I extended the existing ONNX Runtime build script, which is already used for the standard version as well.

Hi,

So, I did a few tests. Hum… I think Paddle is doing better with punctuation and such. However, it “makes up” some letters here and there as well. I would say… if I had to pick one, I would go with Paddle, but it still seems to have a hard time extracting the text perfectly.

I would maybe suggest having the option to improve line breaks? Like… either disable those entirely, or (if possible) try a “best case” approach. If there is no punctuation at the end of a line, assume that the next line is a part of the same text block, with larger white spaces being considered as “breaks” themselves.
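To make the idea a bit more concrete, here is a rough Python sketch of that heuristic. It is only an illustration, not how MakeACopy works: the function name `merge_ocr_lines` and the `SENTENCE_END` set are made up for this example, and it assumes the OCR output is plain text where a "larger white space" shows up as one or more blank lines.

```python
# Characters treated as "this line really ends here" (an assumption;
# a real implementation would probably need a per-language list).
SENTENCE_END = (".", "!", "?", ":", ";", "…")


def merge_ocr_lines(text: str) -> str:
    """Join OCR lines that look like they belong to the same text block.

    Hypothetical helper: a line without end punctuation is merged with the
    next line; blank lines are kept as block breaks (collapsed to one).
    """
    out = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # Blank line = larger white space = block break.
            # Skip leading and repeated blank lines.
            if out and out[-1] != "":
                out.append("")
            continue
        if out and out[-1] != "" and not out[-1].endswith(SENTENCE_END):
            # Previous line has no end punctuation: same text block.
            out[-1] = out[-1] + " " + line
        else:
            # Start of a block, or previous line ended a sentence.
            out.append(line)
    # Drop a trailing block break, if any.
    while out and out[-1] == "":
        out.pop()
    return "\n".join(out)


if __name__ == "__main__":
    sample = "Der schnelle braune\nFuchs springt.\nNeue Zeile\n\n\nZweiter Block ohne\nPunkt"
    print(merge_ocr_lines(sample))
```

This version deliberately keeps the line break after a line that ends with punctuation, and it does not try to rejoin words hyphenated across lines. Both would be reasonable things to put behind a setting.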

I think the app has great potential, and Paddle as an alternative is very welcome. One question, though: is it using the “best” addon thing? Or is there no difference for Paddle?

Thanks!

Hi Mike. Thanks a lot for your feedback.

That is also my impression so far. Paddle is usually more accurate and also faster. In some cases it even recognizes handwriting surprisingly well.

I currently use the mobile versions of the Paddle models. The larger ones would probably be even more accurate.

I have already been working on a layout analysis system for quite some time, but it is currently hidden behind a feature flag because it is not good enough yet.

For Paddle, the “best” addons do not matter. Tesseract is currently only used as a fallback; there is no combined post-processing or result merging between both OCR engines at the moment.

For comparison: the small Tesseract models together are around 70 MB, while the Paddle models are around 80 MB. So the app would only become slightly larger if I removed Tesseract completely.

Unlike Tesseract, a single Paddle model can cover multiple languages. So when adding more languages, the overall app size could actually become smaller in comparison.

Totally agree. OCR quality can vary a lot depending on the font, lighting, and language. I’ve had the best results when the app also lets you crop and increase contrast before scanning - that alone can make a huge difference.
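For anyone curious what "crop and increase contrast" boils down to, here is a tiny pure-Python sketch. It is a toy under stated assumptions, not what any of the apps above do: the image is assumed to be a grayscale grid (a list of rows with values 0 to 255), and the helper names `crop` and `stretch_contrast` are made up for this example. A real app would use an image library instead.

```python
def crop(pixels, left, top, right, bottom):
    """Return the sub-image with rows top..bottom-1 and columns left..right-1."""
    return [row[left:right] for row in pixels[top:bottom]]


def stretch_contrast(pixels):
    """Linearly rescale gray values so the darkest becomes 0 and the brightest 255."""
    flat = [v for row in pixels for v in row]
    if not flat:
        return []
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Flat image: nothing to stretch, return a copy unchanged.
        return [row[:] for row in pixels]
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in pixels]


if __name__ == "__main__":
    # A washed-out "text" region surrounded by a dark border.
    image = [
        [0, 0, 0, 0],
        [0, 100, 150, 0],
        [0, 120, 140, 0],
    ]
    region = crop(image, 1, 1, 3, 3)
    print(stretch_contrast(region))
```

Cropping first matters here: with the dark border still in the image, the darkest value is already 0, so the stretch would barely change the low-contrast text region.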