OCR apps that actually get it done?

I have been testing some of the OCR apps (or apps that include such an option) in the F-Droid repository, and honestly, none of them gives proper results.

Even when black text on white paper is the only thing it has to transcribe, there are always mistakes. Sure, if the text I photograph is in English the result is slightly better, but the whole idea is to photograph text in another language (German, French, etc.) and be able to translate it into English. The Translator app by David Ventura (using Mozilla models for local translation) gets the translation done (not great, but pretty good for a fully local, on-the-fly option). But the OCR is still wrong most of the time.

I have downloaded the models and addons marked as “best” and tried setting the accuracy higher, but… it does not seem to help.

I am open to using an AI model if it can be run locally (on my GNU/Linux laptop I have a semi-working solution, but I wanted something for the smartphone as well). Any other apps that could be helpful? Thanks!

Did you try

Hi!

Yes, I did try that one, with the “best” setting, and it doesn’t quite give high-quality results…

I also tried this one:

With this addon:

As well as this one:

Which works well enough for translation, but OCR is kinda weak.

So… what other alternatives are there? :frowning:

There are good OCR models, but for some unknown reason all of these apps use Tesseract. :person_shrugging: You can use zibo-chen/rust-paddle-ocr from GitHub in Termux. It is a high-performance OCR library powered by PaddleOCR v4/v5 with an MNN backend, and it provides a Rust crate, a C API, and CLI tools. Its description lists support for Chinese, English, Japanese, Korean, Arabic, Cyrillic, Thai, and other languages.

Thanks. I will take a look.

I’m currently preparing an experimental RC build of MakeACopy with optional offline PaddleOCR V5 support (GitHub-only paddle flavor for now).

Tesseract remains the default OCR engine, but the early PaddleOCR results look promising so far.

I’d be happy to hear feedback once the RC is available.

Hey everyone!

Thanks for the replies!

I would prefer a native app rather than a Termux solution, and searching for apps that use other models didn’t turn up good results.

I am happy to hear MakeACopy will include that option soon. @egdels do let us know when it’s available and I will provide feedback. Thanks!

@GNUser

First experimental RC is ready.

Please try one of the paddle-flavored release APKs from GitHub and enable Paddle OCR in the OCR settings screen.

This is still experimental, so feedback about OCR quality, crashes, slow devices, wrong recognition results or confusing UI behavior would be very helpful.

I’m still unsure how I should publish this on F-Droid and Google Play, since the app becomes significantly larger with PaddleOCR included.

Thanks! Will do!

The APKs on GitHub are F-Droid compliant, yes? No proprietary blobs? I know some apps are made in two “versions”, one for F-Droid and another for the Play Store, which is why I ask.

Yes - the regular Google Play and F-Droid versions are currently identical.

The PaddleOCR variant is experimental for now and only published on GitHub, mainly because of the significantly larger APK size: the models are included for fully offline use, and MakeACopy will always remain offline-first.

It is still fully open source. I extended the existing ONNX Runtime build script, which is already used for the standard version as well.

Hi,

So, I did a few tests. Hum… I think Paddle is doing better with punctuation and such. However, it “makes up” some letters here and there as well. I would say… if I had to pick one, I would go with Paddle, but it still seems to have a hard time extracting the text perfectly.

I would maybe suggest having the option to improve line breaks? Like… either disable those entirely, or (if possible) try a “best case” approach. If there is no punctuation at the end of a line, assume that the next line is a part of the same text block, with larger white spaces being considered as “breaks” themselves.
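To make the idea a bit more concrete, here is a rough sketch of that heuristic in Python. It is only an illustration, not how MakeACopy works: the input format (a list of `(text, gap)` pairs, with `gap` being the vertical whitespace before the line in multiples of a typical line height), the name `merge_lines`, and the `gap_threshold` value are all assumptions on my side.

```python
# Characters that, at the end of a line, are taken to close a text block.
SENTENCE_END = (".", "!", "?", ":", ";", "\u2026")


def merge_lines(lines, gap_threshold=1.5):
    """Join OCR lines into text blocks.

    `lines` is a list of (text, gap) pairs. `gap` is the vertical
    whitespace before the line, in multiples of a typical line height
    (a hypothetical input format chosen for this sketch).
    """
    blocks = []
    current = ""
    for text, gap in lines:
        text = text.strip()
        if not text:
            continue  # skip empty OCR lines
        starts_new = (
            not current                        # very first line
            or gap > gap_threshold             # large whitespace counts as a break
            or current.endswith(SENTENCE_END)  # previous line ended with punctuation
        )
        if starts_new:
            if current:
                blocks.append(current)
            current = text
        else:
            # No punctuation and no large gap: treat as the same text block.
            current = current + " " + text
    if current:
        blocks.append(current)
    return blocks


sample = [
    ("The quick brown fox", 0.0),
    ("jumps over the lazy dog.", 1.0),
    ("A new paragraph starts", 1.0),
    ("here", 1.0),
    ("Heading after a gap", 3.0),
]
blocks = merge_lines(sample)
```

One known weakness: a sentence that happens to end exactly at the end of a line inside a paragraph would still be split off, so the gap size probably needs to carry more weight than the punctuation check in practice.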

I think the app has great potential and Paddle as an alternative is very welcome, I ask though, is it using the “best” addon thing? Or for Paddle there is no difference?

Thanks!

Hi Mike. Thanks a lot for your feedback.

That is also my impression so far. Paddle is usually more accurate and also faster. In some cases it even recognizes handwriting surprisingly well.

I currently use the mobile versions of the Paddle models. The larger ones would probably be even more accurate.

I have already been working on a layout analysis system for quite some time, but it is currently hidden behind a feature flag because it is not good enough yet.

For Paddle, the “best” addons do not matter. Tesseract is currently only used as a fallback; there is no combined post-processing or result merging between both OCR engines at the moment.
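Roughly, "fallback without merging" can be pictured like the Python sketch below. This is a simplified illustration and not the app's actual code: the function name `recognize`, the engine callables, and especially the condition for falling back (the primary engine raising an error or returning no text) are assumptions made for the example.

```python
def recognize(image, primary, fallback):
    """Run the primary OCR engine; use the fallback only if it yields nothing.

    `primary` and `fallback` are callables taking an image and returning a
    string (hypothetical stand-ins for the Paddle and Tesseract engines).
    The results are never merged: exactly one engine's output is returned.
    """
    try:
        text = primary(image) or ""
    except Exception:
        # Assumption: any failure of the primary engine triggers the fallback.
        text = ""
    if text.strip():
        return text, "primary"
    return fallback(image), "fallback"


# Stub engines, only to show the control flow.
def paddle_stub(image):
    return "recognized by primary"


def broken_stub(image):
    raise RuntimeError("model not loaded")


def tesseract_stub(image):
    return "recognized by fallback"


result_ok = recognize(None, paddle_stub, tesseract_stub)
result_failed = recognize(None, broken_stub, tesseract_stub)
```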

For comparison: the small Tesseract models together are around 70 MB, while the Paddle models are around 80 MB. So the app would only become slightly larger if I removed Tesseract completely.

Unlike Tesseract, a single Paddle model can cover multiple languages. So when adding more languages, the overall app size could actually become smaller in comparison.

Totally agree. OCR quality can vary a lot depending on the font, lighting, and language. I’ve had the best results when the app also lets you crop and increase contrast before scanning - that alone can make a huge difference.

Syndra03, people come to forums to read posts from humans, not AI. (It’s not just this post; all of their posts are like this.)

I am currently unsure whether the Paddle version should eventually replace the current app or whether it should stay a separate app/variant.

What would you prefer?

Looking at your previous reply, I suppose using Paddle gives more flexibility and gives the same results as using the addon “Best”, which if we sum it all up, would make for a larger app.

Now, I will be honest, it’s easy for me these days to “not care” about the size of an app, sure. But I remember when I was using my previous device, memory was a concern, yes, so taking size into account is a good thing, I agree.

If it’s doable for you to keep two versions, one for Paddle and one for Tesseract, I say go for it. Otherwise, just have the option for the user to download whichever model they want (even both if the user wants to try both).

I guess at this point I personally would prefer to have a single better app, with the features you described that are being worked on, if that was a choice to be made. In the end the app will only be as good as the models themselves, so… it will really depend on “which model we expect to improve more in the next 6 months? 12 months?”

I am sorry, my answer is not a clear answer, just some thoughts. Personally, I still have all the apps on my device and will keep testing when I have the time, but I still hope one of them (yours, hopefully) gets to the point where I can trust it to be usable in any situation I need (honestly, I rarely need it, but I still want to have my device ready for everything ahah).

Thank you.

I currently tend towards replacing the existing OCR implementation with PaddleOCR instead of maintaining two separate long-term variants.

I will first roll this out on Google Play and continue collecting real-world feedback before making further decisions for F-Droid.
