Slow webserver response

Wow, that’s a stunning improvement. Unfortunately, I think that change comes from something besides our efforts. Someone mentioned that Aurora Store recently switched to using mirror sites. I’m guessing that happened around then. But then again, that change looks too sudden to have happened via software updates. So maybe it was something else.

As part of fixing the thousand paper cuts of inefficiency on the website, I'm pushing two changes to the .htaccess: one makes the HEAD cache-check responses ~1/3 the size, and the other sets permanent caching for the extracted icons.
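For the second change, the icon half could look roughly like this in an .htaccess placed in the icons directory (a sketch only, assuming mod_headers is available; the actual change may differ):

    # Sketch: extracted icons never change once published, so they can be
    # cached "permanently" by browsers and intermediate caches.
    <IfModule mod_headers.c>
        Header set Cache-Control "max-age=31536000, public, immutable"
    </IfModule>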

Sorry for the very slow response, I expected an email notification, but it seems it never came.

Aurora Droid maybe, Store is about a different “store” :stuck_out_tongue:

Anyway, lately the site loads almost instantly most of the time; I'm impressed, given the old situation.

good to hear :slight_smile: There are still key optimizations to be done to make
the site more resilient. For example, if the site used the Jekyll
plugin to add a hash to the file names for all the JS, CSS, PNG files,
then those files could be set to cache forever, and there would only be
a cache check if those files changed. That means for most page loads,
there would only be a cache check on the initial index.html load, then
the rest would load straight from the local cache.
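To make that concrete, once the file names carry a content hash, a rule like the following could mark them cacheable forever (a hypothetical .htaccess sketch; the hash pattern and extensions are assumptions, not what the site uses):

    # Sketch: fingerprinted assets (e.g. main-3f8a2c1b.css) never change under
    # the same name, so they can be served with a far-future, immutable cache.
    <IfModule mod_headers.c>
        <FilesMatch "-[0-9a-f]{8,}\.(css|js|png)$">
            Header set Cache-Control "max-age=31536000, public, immutable"
        </FilesMatch>
    </IfModule>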

Yes I did, but that was around 2-10-2019, two weeks later, and as you said, the change would not be that sudden. We did not rule out a DDoS; perhaps there was an attack that suddenly stopped?

That raises the question again: how do we monitor the servers and debug issues like this?

We don’t know if that will be a major improvement or not. Maybe the actual apk packages are 99% of the data and we are focussing on the 1%. We don’t know.

We do know that cache headers can improve performance and reduce traffic, at least between the origin server and the caches, so I suggest we focus on applying cache headers on all content and then ask the server maintainer to look at the logs and tell us what load remains.

If we cache website assets for only 1 hour with the immutable flag, the origin server only sees (number of caches) / (caching time) requests for each object, i.e. with three caching frontends that is 3 requests per hour. Extending the caching time to a year or so may therefore not be a massive improvement. It is mostly the immutable flag that prevents a possibly huge number of "304 Not Modified" responses.
Further downstream are the client web browsers accessing the website and the F-droid client app. We do not currently know what the traffic ratio is between those two.
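To illustrate what those "304 Not Modified" responses are: they answer conditional revalidation requests like the one sketched below, which the immutable flag lets browsers skip entirely while the object is still fresh (the URL is the same favicon used in the example further down; actual behaviour depends on the server):

    # Hypothetical revalidation check: while the cached copy is still current,
    # the server can answer with a tiny "304 Not Modified" instead of the body.
    $ curl -IsS -H "If-Modified-Since: $(date -u '+%a, %d %b %Y %T GMT')" \
        "https://f-droid.org/assets/favicon-16x16.png" | head -n 1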

I notice many objects have the cache header now, did that change on 19-9?
Some missing are:
*.apk
*.apk.asc
*.tar.gz

The apk is the most important I think.

Example:
$ curl -IsS "https://f-droid.org/assets/favicon-16x16.png" | grep -i cache
Cache-Control: max-age=43200, public, immutable
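The same kind of check can be run against the file types listed above; the package path here is only a placeholder, not a real file name:

    $ # <package>.apk is a placeholder for any real file under /repo/
    $ curl -IsS "https://f-droid.org/repo/<package>.apk" | grep -i cache
    # per the list above, no Cache-Control header is expected in the output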

This would also improve performance: "Repo Details does not show all mirrors, its a scroll within a scroll" (F-Droid Client issue #1865, point 3).

Mirrors are only used by the client app for downloading APK files. That leaves a lot of other data like index, icons, screenshots, etc. going through f-droid.org even if the user disabled the main mirror.

In case F-Droid is experiencing slowness or downtime for whatever reason, people cannot just disable the main mirror or rely on the mirror logic to select another one for all requests. The only way to get around that is to completely remove the F-Droid repo and re-add it from a mirror link.

Ok, the traffic drop-off is a mystery; perhaps it was a DDoS then.
Caching was rolled out at different times around then.

My guess is that those files are rarely downloaded, especially in a
quantity that would affect the load. Plus they are security-sensitive,
so I’d rather be sure people are getting the correct file than save a
little by caching these. fdroidclient already has its own caching
logic for those.

fdroidclient will automatically try a mirror if it cannot download the
index file from the main site. That’s also a security-sensitive area,
the index is the key to getting updates, so reliability and timeliness
need to come before caching for the index.

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.

Since there is already lots of great info in this thread, I wanted to reopen it. We now have a host that can give us instances in Amsterdam, Hong Kong, and Miami as well as an Anycast IP. So we just need a reproducible design for a caching frontend server. @Joost you have that running already, can you publish the config for that? Ideally that would be put into ansible, since the rest of our servers are in ansible.

Here’s the host:
https://eclips.is/

Hi @hans, this config file is from https://fdroidmirror.net/ and it includes a cached copy of the F-Droid website. The comments should explain things, but just ask if it is not clear.

I have been tweaking the ssl parameters to get the best combination of performance, security and compatibility. The site is using my own acme.sh certificate manager to get both RSA and EC certificates from letsencrypt. I think it is an improvement over the current fdroid ssl configuration. You may run the site through ssllabs to check.
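For readers who don't open the attached file, the core of such a caching frontend boils down to something like this (an Apache-flavoured sketch with made-up paths; the actual fdroidmirror.net config, including the TLS and certificate handling described above, may look quite different):

    # Sketch of a caching reverse proxy in front of f-droid.org
    # (mod_proxy, mod_ssl and mod_cache_disk; TLS/certificate setup omitted).
    SSLProxyEngine on
    ProxyPass        / https://f-droid.org/
    ProxyPassReverse / https://f-droid.org/
    CacheEnable disk /
    CacheRoot /var/cache/apache2/mod_cache_disk
    CacheDefaultExpire 3600
    CacheIgnoreNoLastMod On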

That organisation is an extension of the US government. Consider all traffic and everything on their servers compromised.

From their own site: "Partners: Open Technology Fund
→ Parent organisation: U.S. Agency for Global Media
→ Predecessor: United States Information Agency
→ Superseding agencies: United States Department of State, U.S. Agency for Global Media."

They attract “Liberation technology developers”. I had never heard of that neoconservative slang, but “Liberation technology” appears to be the art of toppling foreign governments and instigating wars all over the planet using digital means.

It is true that eclips.is is US GOVT funded, so if you consider anything
funded by the US Government or Open Tech Fund (OTF) to be suspect, then
you’ll need to also mistrust:

  • SELinux
  • the NIST/MITRE CVE vulnerability reporting system
  • Signal (OTF-funded)
  • Tor Project (OTF, State DRL, DARPA, NSF, etc)
  • F-Droid (OTF, ISC)
  • and many more

It is important to remember that the US Government is vast (~5 million
people). And we’re fully aware of the risks of people with money asking
for backdoors. That’s why we only develop open source software, and we
are public about the funding we accept, and what the goals of that
funding are. See the https://f-droid.org and
https://guardianproject.info blogs for more details. F-Droid is now
starting to gather this info in https://opencollective.com/f-droid

I should add, the DARPA/OTF/USAGM/DRL/etc funding definitely comes out
of interventionist US foreign policy. It's clear to see that it's
related. But I still think that organizations can accept that money and
still be widely trusted. For example, I’ve also received grant money
from the City of Vienna, which is very neutral, being the capital of a
neutral country.

letsencrypt was also OTF-funded:

It is important to be skeptical about funding sources, they are not
neutral. They have a point of view. Keeping things open and public
means people can review the work, but then of course someone actually
has to review the work and point out problems. So I welcome your
skepticism :slight_smile:

Nice, I was wondering about the Xi'an Jiaotong-Liverpool University venture located in China; maybe they could be interested in hosting F-Droid?

They are running a free(?) service: http://sanddroid.xjtu.edu.cn/#home

Although I have no idea how a US technology ban could impact open source licenses.

Do you have any monitoring set up for CPU (core utilization, interrupts, softirqs, etc), disk I/O (read/write MB/s, read/write IOPS, backlog, utilization time, etc), networking (used sockets, sent/received packets, etc) and your web server (number of workers, number of idle workers, requests per seconds, KiB/s, etc)? That would be really helpful.
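Even before a proper monitoring stack is in place, a few standard commands give a rough picture of most of those numbers (this assumes the usual Debian tools procps, sysstat and iproute2, plus mod_status for the Apache figures):

    $ vmstat 5                                      # CPU utilization, interrupts, context switches
    $ iostat -dxm 5                                 # per-disk MB/s, IOPS, utilization
    $ ss -s                                         # sockets in use
    $ curl -s "http://localhost/server-status?auto" # Apache workers and requests/s (needs mod_status)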

Like I asked on IRC: what's net.core.somaxconn set to (check with sudo sysctl net.core.somaxconn)? If it's some low value like the previous default 128, could you try increasing it to 4096 (which is also the default since Linux kernel 5.4; you could do that with sudo sysctl -w net.core.somaxconn=4096, and to keep it after reboots, by adding net.core.somaxconn = 4096 to /etc/sysctl.conf)? Of course you then also have to increase ListenBackLog in your Apache config to 4096 (from the default 511) as well.
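Put together, that comes down to (same values as above; a drop-in file under /etc/sysctl.d/ works as well as /etc/sysctl.conf):

    $ sudo sysctl net.core.somaxconn                    # check the current value
    $ sudo sysctl -w net.core.somaxconn=4096            # raise it for the running kernel
    $ echo 'net.core.somaxconn = 4096' | sudo tee -a /etc/sysctl.conf   # keep it after reboots
    # and in the Apache config, raise the matching limit:
    #   ListenBackLog 4096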

As far as I know, there is no performance monitoring in place on those servers. They are maintained by the founder, who maintains a lot of things and does not have so much time these days, so we're trying to move things off of his plate. Troubleshooting the existing setup means using external monitoring. I'm happy to set up 3 VMs and an Anycast IP for someone to build an ansible-based caching frontend setup for f-droid.org. Once we have that, we can migrate to it for production, and we'll be able to get detailed troubleshooting information from those boxes.

@uniqx and I were just discussing fdroid-frontend-deployment; very cool to see that @Bubu already has a prototype up and running (https://fdroid-frontend1.bubu1.eu/). I think we should take this approach to production. The open issues we could think of are:

  • change the security headers to be passed through from the webserver so we have a single place to maintain them (see the sketch after this list)
  • maintain the webserver hostname in an ansible encrypted vault variable using the GPG-based setup we have in fdroid-website-server, fdroid-deployserver, etc.
  • intrusion detection monitoring and fail2ban
  • disabling sshd password auth, allowing only ssh key auth
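For the security-headers item, the idea is that the origin webserver sets them once and the frontends simply pass them through; roughly like this (header names and values here are only examples, not the current f-droid.org policy):

    # Sketch: set security headers in one place on the origin (mod_headers);
    # the caching frontends forward responses, and the headers, unchanged.
    Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains"
    Header always set X-Content-Type-Options "nosniff"
    Header always set X-Frame-Options "DENY"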

In a related effort, I recently set up staging.f-droid.org to be built with the F-Droid / fdroid-website-server repo on GitLab, so that is something we can also discuss migrating to. I think that will be a bit more complicated, so something to do later.

Oops, and also performance monitoring, like @wb9688 suggests. I think that could go to monitor.f-droid.org.