Troubleshooting f-droid.org slowness periods

The website goes through periods of slowness, so we need to map them out to figure out how best to handle them. The good news is that we have solid mirrors, with more coming online. Before we push more users to mirrors, we have to make sure that doing so will actually improve things.

There is a nice ping monitor that I think @krombel set up:
https://monitor.msg-net.de/d/kXzI4Jliks/worldping-endpoint-f-droid-org

@Bubu @krombel nice service! That should be helpful. Can it ping per IP? f-droid.org is usually mapped to two IPs on two different servers via round-robin DNS. Also, given that we’ve been seeing successful sessions that take ~8s to connect, it would be worthwhile to increase the ping time error level in that monitor app.
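
A rough Python sketch of what "ping per IP" could look like with round-robin DNS: resolve every address behind the name, then time a plain TCP connect to each one separately. The helper names, the port, and the 10 s default are assumptions for illustration, not anything the monitor actually runs.

```python
import socket
import time
from typing import List


def resolve_ips(host: str, port: int = 443) -> List[str]:
    """Return every distinct IP address the resolver reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    ips: List[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        if sockaddr[0] not in ips:  # keep resolver order, drop duplicates
            ips.append(sockaddr[0])
    return ips


def time_tcp_connect(ip: str, port: int = 443, timeout_s: float = 10.0) -> float:
    """Seconds needed to open a TCP connection to one specific IP.

    Raises OSError (which includes timeouts) if the connection fails.
    """
    start = time.monotonic()
    with socket.create_connection((ip, port), timeout=timeout_s):
        return time.monotonic() - start


def report(host: str, port: int = 443) -> None:
    """Print one connect time per IP so a single slow server stands out."""
    for ip in resolve_ips(host, port):
        try:
            print(f"{host} {ip}: {time_tcp_connect(ip, port):.3f}s")
        except OSError as error:
            print(f"{host} {ip}: failed ({error})")
```

Calling `report("f-droid.org")` would print one line per address. Note that this only times the TCP handshake, not the TLS handshake on top of it.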

I added another endpoint so you are able to view them separately.
While doing so, I was also able to increase the timeout to 10s.

Looks like HTTPS is kinda down?!

Turns out I’m working on making the client app aggressively switch to mirrors. There will be an alpha soon, within a week if all goes well.

So I was thinking about setting the default timeout for installs to something like 1-5 seconds for TLS connects and reads. This would make fdroidclient rapidly switch to mirrors when f-droid.org is slow.

I tested this by setting the default mirror timeout to 1ms, and it went smoothly through all the failures until it hit the second tier timeout value. Then it successfully downloaded.

The tricky case is someone on slow internet, where connecting to a TLS site takes slightly longer than the default timeout. There would be quite a bit of churn on every connection, since the client would try every mirror first, waiting the default timeout for each one, then cancel and move on. So someone on slow internet would have a delay of (number of mirrors) × (default timeout) added to every connection.
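
To put numbers on that churn, here is a small simulation sketch. It assumes the tiered behaviour works the way the 1ms test suggests: every mirror is tried with the first timeout, then every mirror again with the next, larger one. The function name and the example numbers are made up for illustration; this is not the fdroidclient code.

```python
from typing import Optional, Sequence, Tuple


def simulate_connect(
    connect_times_s: Sequence[float],
    timeout_tiers_s: Sequence[float],
) -> Tuple[Optional[int], float]:
    """Return (index of the mirror that connected, total seconds spent).

    Each tier tries every mirror in order with that tier's timeout. A mirror
    that needs longer than the timeout costs the full timeout and is skipped.
    Returns (None, total) if no mirror connects in any tier.
    """
    elapsed = 0.0
    for timeout in timeout_tiers_s:
        for index, needed in enumerate(connect_times_s):
            if needed <= timeout:
                return index, elapsed + needed
            elapsed += timeout
    return None, elapsed


# Hypothetical slow link: five mirrors, each needing 1.2 s to connect.
slow_link = [1.2] * 5

for tiers in ([1.0, 10.0], [10.0]):
    mirror, waited = simulate_connect(slow_link, tiers)
    print(f"tiers={tiers}: mirror #{mirror} after {waited:.1f}s")
```

With a 1 s first tier, the slow user pays five timeouts before the second tier lets the first mirror connect. With a single 10 s timeout, the same user connects on the first attempt.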

I also checked the mirrors and some sites around the world to get a rough idea of what a normal connect time for TLS is. Connecting to a site on the other side of the world can easily take 1-2 seconds even on a fast connection:

$ for f in f-droid.org fdroid.tetaneutral.net mirror.cyberbits.eu bubu1.eu ftp.fau.de mirror.jarsilio.com www.baidu.com cafebazaar.ir cnnic.com.cn www.ecured.cu; do echo $f; tlsping ${f}:443;done
f-droid.org
tlsping: TLS connection to server f-droid.org:443 (10 connections)
tlsping: min/avg/max/stddev = 4.05s/4.36s/4.69s/186.88ms
fdroid.tetaneutral.net
tlsping: TLS connection to server fdroid.tetaneutral.net:443 (10 connections)
tlsping: min/avg/max/stddev = 273.39ms/278.35ms/284.09ms/3.41ms
mirror.cyberbits.eu
tlsping: TLS connection to server mirror.cyberbits.eu:443 (10 connections)
tlsping: min/avg/max/stddev = 322.27ms/517.76ms/673.33ms/120.69ms
bubu1.eu
tlsping: TLS connection to server bubu1.eu:443 (10 connections)
tlsping: min/avg/max/stddev = 134.85ms/138.98ms/141.55ms/2.54ms
ftp.fau.de
tlsping: TLS connection to server ftp.fau.de:443 (10 connections)
tlsping: min/avg/max/stddev = 184.73ms/188.55ms/190.24ms/1.74ms
mirror.jarsilio.com
tlsping: TLS connection to server mirror.jarsilio.com:443 (10 connections)
tlsping: min/avg/max/stddev = 371.07ms/452.02ms/505.62ms/39.30ms
www.baidu.com
tlsping: TLS connection to server www.baidu.com:443 (10 connections)
tlsping: min/avg/max/stddev = 621.47ms/634.71ms/641.88ms/6.81ms
cafebazaar.ir
tlsping: TLS connection to server cafebazaar.ir:443 (10 connections)
tlsping: min/avg/max/stddev = 348.31ms/358.49ms/366.57ms/5.80ms
cnnic.com.cn
tlsping: TLS connection to server cnnic.com.cn:443 (10 connections)
tlsping: min/avg/max/stddev = 1.21s/1.23s/1.24s/8.94ms
www.ecured.cu
tlsping: TLS connection to server www.ecured.cu:443 (10 connections)
tlsping: min/avg/max/stddev = 665.08ms/668.58ms/671.53ms/2.65ms

I also looked around a little bit for data on the normal latency of mobile networks. It seems that 3500ms is pretty common for 3G connections:

So based on this, I think the default timeout should remain 10 seconds.
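
To make that concrete, here is a small sketch that parses a few of the tlsping summary lines from above and lists which hosts would have exceeded a given timeout. It assumes the durations only ever use `ms` or `s` as units, which is all that appears in this output; the function names are illustrative.

```python
import re
from typing import Dict, List


def to_seconds(value: str) -> float:
    """Convert a duration such as '273.39ms' or '4.05s' to seconds."""
    match = re.fullmatch(r"([0-9.]+)(ms|s)", value)
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    number, unit = match.groups()
    return float(number) / 1000 if unit == "ms" else float(number)


def parse_summary(line: str) -> Dict[str, float]:
    """Parse a 'min/avg/max/stddev = ...' summary line into seconds."""
    _, _, values = line.partition("=")
    parts = [to_seconds(v) for v in values.strip().split("/")]
    return dict(zip(("min", "avg", "max", "stddev"), parts))


def too_slow(summaries: Dict[str, str], timeout_s: float) -> List[str]:
    """Hosts whose slowest measured TLS connect exceeded timeout_s."""
    return [
        host
        for host, line in summaries.items()
        if parse_summary(line)["max"] > timeout_s
    ]


# Summary lines copied from the tlsping run above.
SUMMARIES = {
    "f-droid.org": "tlsping: min/avg/max/stddev = 4.05s/4.36s/4.69s/186.88ms",
    "fdroid.tetaneutral.net": "tlsping: min/avg/max/stddev = 273.39ms/278.35ms/284.09ms/3.41ms",
    "www.baidu.com": "tlsping: min/avg/max/stddev = 621.47ms/634.71ms/641.88ms/6.81ms",
    "cnnic.com.cn": "tlsping: min/avg/max/stddev = 1.21s/1.23s/1.24s/8.94ms",
}

for timeout_s in (1.0, 5.0, 10.0):
    print(f"{timeout_s:>4}s timeout, too slow: {too_slow(SUMMARIES, timeout_s)}")
```

Even on this fast connection, a 1 s timeout would have cut off f-droid.org and cnnic.com.cn, and that is before adding mobile network latency.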

So it will wait 10 seconds before using the next one? Ain’t nobody got time for that :confused:

It’s not waiting for 10 seconds, it’s a timeout. It waits for a connection for up to 10 seconds. A timeout is then treated as an error, so the client switches to the next mirror. It will then stick with that mirror until it errors out or F-Droid is restarted. I’m adding a new feature where it will remember the current mirror across restarts, with an Expert pref to toggle it on until it proves stable.
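
Roughly, the behaviour described here can be sketched like this in Python. This is only an illustration of the idea (timeout counts as an error, switch to the next mirror, stick with it until it fails), not the actual fdroidclient code; the class name, the `download` callback, and the example URLs are all hypothetical.

```python
from typing import Callable, List


class MirrorSelector:
    """Stick with one mirror until it fails, then move on to the next."""

    def __init__(self, mirrors: List[str], timeout_s: float = 10.0) -> None:
        if not mirrors:
            raise ValueError("at least one mirror is required")
        self.mirrors = list(mirrors)
        self.timeout_s = timeout_s
        self.current = 0  # index of the mirror currently in use

    def fetch(self, path: str, download: Callable[[str, str, float], bytes]) -> bytes:
        """Try the current mirror first; on any error, rotate and retry.

        Gives up once every mirror has failed for this request.
        """
        last_error = None
        for _ in range(len(self.mirrors)):
            mirror = self.mirrors[self.current]
            try:
                return download(mirror, path, self.timeout_s)
            except OSError as error:  # TimeoutError is a subclass of OSError
                last_error = error
                self.current = (self.current + 1) % len(self.mirrors)
        raise ConnectionError(f"all mirrors failed for {path}") from last_error


def demo_download(mirror: str, path: str, timeout_s: float) -> bytes:
    """Stand-in for a real HTTP download: the first mirror always times out."""
    if "slow" in mirror:
        raise TimeoutError(f"no connection within {timeout_s}s")
    return f"{mirror}/{path}".encode()


selector = MirrorSelector(
    ["https://slow.example/fdroid/repo", "https://fast.example/fdroid/repo"]
)
print(selector.fetch("example.apk", demo_download))  # fails over once
print(selector.fetch("example.apk", demo_download))  # goes straight to the working mirror
```

The 10 s cost is only paid on the request that hits the slow mirror. After that, `current` points at the working mirror, so later requests skip the slow one.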

Then I’d use a random mirror from the start, as we know that for a good number of people it’ll otherwise take 10 seconds till anything happens.

Sticking to one repo until it fails would have the benefit that we can assume there won’t be inconsistencies between the index and the available apps.

So do not require an index update on each app start, only on a failure or a mirror switch, to be absolutely secure.

But I would pick a random mirror in case the “stick mirror” is unset.
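
As a sketch of that suggestion in Python: use the remembered mirror if there is one, otherwise pick at random. The function name, the `sticky` parameter, and the example URLs are hypothetical and only there to illustrate the idea.

```python
import random
from typing import Optional, Sequence


def choose_start_mirror(
    mirrors: Sequence[str],
    sticky: Optional[str] = None,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick the mirror to start with.

    Uses the remembered ("sticky") mirror when it is set and still in the
    list; otherwise falls back to a uniformly random choice.
    """
    if not mirrors:
        raise ValueError("at least one mirror is required")
    if sticky is not None and sticky in mirrors:
        return sticky
    chooser = rng if rng is not None else random
    return chooser.choice(list(mirrors))


MIRRORS = [
    "https://one.example/fdroid/repo",
    "https://two.example/fdroid/repo",
    "https://three.example/fdroid/repo",
]

print(choose_start_mirror(MIRRORS))                     # random pick
print(choose_start_mirror(MIRRORS, sticky=MIRRORS[1]))  # remembered mirror wins
```

A random start also spreads first connections across the mirrors instead of sending everyone to the same host first.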

Just for usability and simplicity, is it possible for F-Droid to use a CDN instead of complex interactions between the main server and the mirrors, plus keeping track of what is up and what is down? What would be the estimated cost to F-Droid of using a decent CDN?

@boomhauer A key part of the idea of the architecture of F-Droid repos (signed metadata, flat files, open mirror subscription, etc) is that anyone can set up a mirror, and it is easy for anyone to use that mirror. So please try setting up a CDN, and then you can answer those questions.

Debian has proven this architecture, and there are now Debian mirrors on multiple CDNs, servers, etc.

I tried adding mirrors from the F-Droid Mirror Monitor, but a number of them were added as separate repos, not as mirrors. Any ideas on how to fix this? :thinking::confused:

I’ve also experienced that and reported it on GitLab. Please add more information to this issue:

Yeah, it’s bad that this happens; the repo-adding UX needs work. The way to make a mirror repo URL work is to include the fingerprint in the URL, e.g. add ?fingerprint=0000... to the end.
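
If you want to build such URLs in a script, a minimal Python sketch could look like this. The mirror URL and the fingerprint value below are placeholders, not real ones; use the fingerprint of the repo you are mirroring.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_fingerprint(repo_url: str, fingerprint: str) -> str:
    """Return repo_url with a fingerprint query parameter set.

    Any existing fingerprint parameter is replaced; other query
    parameters are kept.
    """
    parts = urlsplit(repo_url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "fingerprint"]
    query.append(("fingerprint", fingerprint))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder values for illustration only.
print(with_fingerprint("https://mirror.example.org/fdroid/repo", "PLACEHOLDER_FINGERPRINT"))
```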

I noticed the website was super slow yesterday. Is there a shortage of server capacity? Do we know what causes it (slow disks, TLS connection overhead, …)?

I don’t see any info about mirroring on the f-droid website, but in this thread mirrors are mentioned. I’m guessing those are official mirrors, not contributors setting them up? Because I could maybe host a small mirror (though the docs and fdroid server readme do not mention requirements).

Those mirrors are for the actual apps and the index; the website is separate.

The upcoming v1.6 release of F-Droid client will offload a lot of the load to the existing mirrors. As for adding new mirrors, the EU is well covered, and there is one large US mirror. We need mirrors in other places.

My guess is that the servers are slow because of sheer load spikes. Offloading app downloads to mirrors should help a lot.

Anyone can help by testing the v1.6 alpha releases.

Better to have enough capacity than to be geographically distributed enough, though :). I can’t offer a lot, but if 40 Mbps peak (10 Mbps steady) from western Europe would help, let me know.

I just installed 1.6, and it seems to work really well. Of course the slowness wasn’t always present, so I can’t say for sure whether this resolved anything, but so far so good!

@hans You keep saying more mirrors, yet most posts here are about the website, and I have the same issue: it either takes 30s or more to load the site, or it fails to load at all.

And it’s not like the index is published that often…