I have been testing some of the OCR apps (or apps that include such an option) in the F-Droid repository, and honestly, none of them gives proper results.
Even when black words on white paper are the only thing the app has to transcribe, there are always mistakes. Sure, if the text I photograph is in English the result is slightly better, but the whole idea is to photograph text in another language (German, French, etc.) and be able to translate it into English. The Translator app by David Ventura (using Mozilla models for local translation) gets the translation done (not great, but pretty good for a fully local, on-the-fly option). But the OCR is still wrong most of the time.
I have downloaded the models and addons marked as “best” and tried setting accuracy higher but… Seems not to work.
I am open to using an AI model if it can be run locally (on my GNU/Linux laptop I have a semi-working solution, but I wanted something for the smartphone as well). Are there any other apps that could be helpful? Thanks!
Please try one of the paddle-flavored release APKs from GitHub and enable Paddle OCR in the OCR settings screen.
This is still experimental, so feedback about OCR quality, crashes, slow devices, wrong recognition results or confusing UI behavior would be very helpful.
I’m still unsure how I should publish this on F-Droid and Google Play, since the app becomes significantly larger with PaddleOCR included.
The APKs on GitHub are F-Droid compliant, yes? No proprietary blobs? I know some apps are made in two “versions”, one for F-Droid and another for the Play Store, which is why I ask.
Yes - the regular Google Play and F-Droid versions are currently identical.
The PaddleOCR variant is experimental for now and published only on GitHub, mainly because of the significantly larger APK size: the models are bundled for fully offline use, and MakeACopy will always remain offline-first.
It is still fully open source. I extended the existing ONNX Runtime build script, which is already used for the standard version as well.
So, I did a few tests. Hum… I think Paddle is doing better with punctuation and such. However, it “makes up” some letters here and there as well. I would say… if I had to pick one, I would go with Paddle, but it still seems to have a hard time extracting the text perfectly.
I would maybe suggest having the option to improve line breaks? Like… either disable those entirely, or (if possible) try a “best case” approach. If there is no punctuation at the end of a line, assume that the next line is a part of the same text block, with larger white spaces being considered as “breaks” themselves.
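A minimal sketch of that “best case” heuristic, written in Python purely for illustration (it is not code from the app). It assumes the OCR output is plain text with one recognized line per text line, and it treats an empty line as the “larger white space” mentioned above. The function name `merge_lines` and the `SENTENCE_END` set are made up for this example.

```python
# Characters treated as "end of sentence" in this sketch (an assumption;
# a real implementation would likely need a language-aware list).
SENTENCE_END = (".", "!", "?", ":", ";", "…")


def merge_lines(ocr_text: str) -> str:
    """Join OCR lines that appear to belong to the same text block.

    A line that does not end in punctuation is joined with the next line.
    An empty line (standing in for a larger vertical gap) always breaks.
    """
    blocks = []
    current = []
    for raw in ocr_text.splitlines():
        line = raw.strip()
        if not line:
            # Empty line: close the current block, if any.
            if current:
                blocks.append(" ".join(current))
                current = []
            continue
        current.append(line)
        if line.endswith(SENTENCE_END):
            # Punctuation at the end of the line: keep the break here.
            blocks.append(" ".join(current))
            current = []
    if current:
        blocks.append(" ".join(current))
    return "\n".join(blocks)
```

With this rule, a line such as “This is a line” followed by “that continues here.” becomes one line, while a heading separated by an empty line stays on its own. Words hyphenated across a line break are not handled here.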
I think the app has great potential, and Paddle as an alternative is very welcome. I have to ask, though: is it using the “best” addon thing? Or does that make no difference for Paddle?
That is also my impression so far. Paddle is usually more accurate and also faster. In some cases it even recognizes handwriting surprisingly well.
I currently use the mobile versions of the Paddle models. The larger ones would probably be even more accurate.
I have been working on a layout analysis system for quite some time, but it is currently hidden behind a feature flag because it is not good enough yet.
For Paddle, the “best” addons do not matter. Tesseract is currently only used as a fallback; there is no combined post-processing or result merging between the two OCR engines at the moment.
For comparison: the small Tesseract models together are around 70 MB, while the Paddle models are around 80 MB. So the app would only become slightly larger if I removed Tesseract completely.
Unlike Tesseract, a single Paddle model can cover multiple languages. So when adding more languages, the overall app size could actually become smaller in comparison.
Totally agree. OCR quality can vary a lot depending on the font, lighting, and language. I’ve had the best results when the app also lets you crop and increase contrast before scanning - that alone can make a huge difference.
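To give a rough idea of what “crop and increase contrast” means in practice, here is a small dependency-free Python sketch. It assumes the image is already a grayscale grid of values from 0 to 255, stored as a list of rows; `crop` and `stretch_contrast` are illustrative names, not functions from any of the apps mentioned. The contrast step is a simple linear stretch, which is only one of several possible approaches.

```python
def crop(pixels, top, left, height, width):
    """Return the sub-rectangle of a grayscale image (a list of rows)."""
    return [row[left:left + width] for row in pixels[top:top + height]]


def stretch_contrast(pixels):
    """Linearly rescale values so the darkest pixel becomes 0 and the
    brightest becomes 255. Integer arithmetic keeps results exact."""
    flat = [value for row in pixels for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Completely flat image: nothing to stretch, return a copy.
        return [row[:] for row in pixels]
    span = hi - lo
    return [[(value - lo) * 255 // span for value in row] for row in pixels]


# Usage: cut away the border, then spread the remaining values
# over the full 0-255 range before handing the image to OCR.
page = [
    [10, 10, 10, 10],
    [10, 100, 150, 10],
    [10, 125, 150, 10],
]
text_area = stretch_contrast(crop(page, top=1, left=1, height=2, width=2))
```

Cropping first matters here: the stretch is based on the darkest and brightest pixels, so a dark border left in the frame would reduce how much the text itself gets enhanced.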
Looking at your previous reply, I suppose Paddle gives more flexibility and about the same results as the “best” addons, which, if we sum it all up, would make for a larger app.
Now, I will be honest, it’s easy for me these days to “not care” about the size of an app, sure. But I remember that with my previous device memory was a concern, yes, so taking size into account is a good thing, I agree.
If it’s doable for you to keep two versions, one for Paddle and one for Tesseract, I say go for it. Otherwise, just have the option for the user to download whichever model they want (even both if the user wants to try both).
I guess at this point I personally would prefer to have a single better app, with the features you described that are being worked on, if that was a choice to be made. In the end the app will only be as good as the models themselves, so… it will really depend on “which model we expect to improve more in the next 6 months? 12 months?”
I am sorry, this is not a clear answer, just some thoughts. Personally, I still have all the apps on my device and will keep testing when I have the time, but I still hope that one of them (yours, hopefully) gets to the point where I can trust it in any situation I need (honestly, I rarely need it, but I still want to have my device ready for everything, haha).