Inquiry regarding Reproducible Builds data for Master's Thesis

deeeenn · January 6, 2026, 2:04pm

Good morning,

I am Denise, an intern at Télécom-Paris. I am currently writing my thesis on Reproducible Builds, working with a research group that has previously assessed reproducibility in ecosystems such as Nix (see their paper here).

Our goal is to perform a similar analysis on the F-Droid ecosystem to assess the reproducibility of Android applications over time. Over the past two months, I have studied the F-Droid documentation and fdroidserver code, performed test builds, and analyzed your archives to extract reproducibility trends.

First, thank you for the amount of data you provide; it has been interesting to explore. Based on my analysis of ~77k app versions present in the index files of your APIs, I have encountered some anomalies that I hope you can clarify:

Missing Data: I could not find reproducibility logs for approximately 43,000 versions. Are these missing due to build failures on your end, or is there another reason? (e.g. org.b3log.siyuan has reproducibility checks for only 4 of the 58 versions in the index, and none for the recent ones)
“Ignored” Status: Roughly 10% of the logs I found are marked as “ignored.” Could you clarify what this status implies? Specifically, I noticed the reason “manually ignored by sudo” and would like to understand the context behind it. (e.g. org.breezyweather:60011 is ignored with the message “Manually ignored by sudo: “/sys/devices/system/cpu/””)
Ignored Visibility: On the reproducibility status page, not every test marked as ignored is shown, even though I could retrieve its log (e.g. org.breezyweather:60012). Why are some of them not displayed?
2021/2022 Trends: I noticed a significant decrease in missing data during 2021 and 2022, but a corresponding spike in “ignored” statuses. Was there a specific infrastructure or policy change during this period?
Timestamp Discrepancies: When comparing the verification timestamp against the package's “added” timestamp, some versions show a “negative delay” (up to 2 years). How is it possible for a test to be recorded so long before the package was published? (e.g. com.termux.gui:7 appears to have been added in 2025 but tested in 2023)
Double Tests: Packages published before 2021 appear to have two different reproducibility tests. Was this due to the introduction of diffoscope in 2021? Did you rebuild the whole repo to provide diffoscope output? There are also packages published after 2021 with multiple tests (e.g. S.N.A.K.E:1000001 has a gap of a couple of hours, eu.mirkodi.swatchbeatclock:15 has a two-day gap). In these cases, why were there multiple tests?
Failed Logs: Do you archive logs for failed builds? I would like to compare them against my own failed reproduction attempts.
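
To make the “negative delay” in the Timestamp Discrepancies question concrete, here is a minimal sketch of how such a delay can be computed. It assumes the “added” timestamp is a Unix epoch value in milliseconds and the verification timestamp is in seconds; the function name, parameter names, and sample dates are hypothetical and do not reflect the actual index or log schema.

```python
from datetime import datetime, timezone


def verification_delay_days(added_ms: int, verified_s: int) -> float:
    """Return days from publication to verification.

    A negative result means the test was recorded before the
    package's "added" timestamp.
    """
    # Assumption: "added" is epoch milliseconds, verification is epoch seconds.
    added = datetime.fromtimestamp(added_ms / 1000, tz=timezone.utc)
    verified = datetime.fromtimestamp(verified_s, tz=timezone.utc)
    return (verified - added).total_seconds() / 86400


# Hypothetical values: added in 2025, verified in 2023.
added_ms = int(datetime(2025, 1, 15, tzinfo=timezone.utc).timestamp() * 1000)
verified_s = int(datetime(2023, 1, 15, tzinfo=timezone.utc).timestamp())
delay = verification_delay_days(added_ms, verified_s)
print(f"delay: {delay:.0f} days")
```

If the two timestamps were in different units or time zones than assumed here, part of the discrepancy could come from the comparison itself rather than from the data.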

Finally, regarding infrastructure: my work involves setting up a rebuild environment using an OpenStack backend. I have modified fdroidserver to support OpenStack, and I am developing an orchestrator for parallel builds. I would be happy to hear any suggestions you might have on the process and to answer any questions.

Thank you for your time.

system · March 7, 2026, 2:05pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.