After discussions with the Nextcloud team and guys at TU Berlin, the below could be officially supported with some small changes. See the updates noted against the challenges. A rewrite or additional post will be coming soon to address and test the changes.
Nextcloud works really well as a standalone, single-server deployment. They additionally have some great recommendations for larger deployments supporting thousands of users and terabytes of data:
Up to 100,000 users and 1PB of data
What wasn’t so apparent until last week, however, is how someone might deploy Nextcloud across multiple datacentres (or locations) in a distributed manner wherein each Node can act as the “master” at any point in time; federation is obviously a big feature in Nextcloud and works very well for connecting systems and building a trusted network of nodes, but that doesn’t do an awful lot for those wanting the type of enterprise deployment pictured above, without having all of the infrastructure on one network.
Now that Global Scale has been announced this will likely be the way forward when it’s ready, however given I’d already started a proof of concept (PoC) before NC12 was officially made available, I kept working away at it regardless – more for my own amusement than anything else for reasons explained further down.
The theory was as follows:
If I’m at home, the server in the office is the best place to connect to since it’s on the LAN and performance will be excellent. In this case, a DNS override will point the FQDN for the Global Load Balancer (GLB) to a local HAProxy server that in turn points to the LAN Nextcloud instance unless it’s down, where the local HAProxy will divert to the GLB as normal.
If I leave the house and want to access my files, I’ll browse to the FQDN of the Nextcloud cluster (which points to the GLB) and the GLB will then forward my request seamlessly to the Nextcloud instance that responds the fastest.
In the graphic above, I opted for a future-proof infrastructure design that would:
No matter which Nextcloud instance I eventually land on, I wanted to be able to add/remove/edit files as normal and have all other instances sync the changes immediately. Further still, I wanted all nodes to be identical, that is if any file changes in the Nextcloud web directory on the server, it should be synced.
The answer to that is pretty much “why not?”. This isn’t a serious implementation and I had no intention of scaling the solution beyond this PoC. GlusterFS or any one of the alternative distributed filesystems are designed for what I’m doing here, but I wanted to get some hands-on time with SyncThing and didn’t see a reason I couldn’t implement it in this fashion. I talk about it more here with Resilio Sync, but swapped Resilio with SyncThing as Resilio is proprietary.
All recommended deployments suggest using Galera (MySQL or MariaDB) in a Master-Slave (-Slave-Slave, etc) configuration with a single centralised Master for writes and distributed slaves for quick reads. In my case, that could mean having a Master in Wales for which the nodes in France, Holland, Finland, Canada or anywhere else would have to connect to in order to write, introducing latency.
To get around this, I opted for Galera Master-Master. This setup isn’t supported by Nextcloud I later found out (by reading the docs, no less, oops) but at the time it appeared to work fine – I uploaded 10,000 files in reasonably quick succession with no database or Nextcloud errors reported, however that may well have simply been luck.
Update: Galera Master-Master is supported, however a load-balancer supporting split read/write must be used in front of it. The master should be persistent and only failover to another Galera server in case of failure. ProxySQL (https://www.proxysql.com/) offers this functionality and is FOSS.
At a minimum I’d have to replicate
data for the individual nodes to sync data and retain a consistent user experience; enabling an app on one node wouldn’t be automatically enabled on the other node otherwise, as it’d first need to be downloaded from the Nextcloud store. I didn’t want to think of the issues this would cause for the database.
At this point I also considered the impact of upgrades, as when one Nextcloud instance successfully upgrades and makes changes to the database, the other nodes will want to do the same as part of the upgrade process. So rather than just syncing the
themes folders individually, I opted to replicate the entire
/nextcloud webroot folder between nodes on a 5-second sync schedule (that is, there will be a maximum of 5 sec before data uploaded to one node is replicated to the others).
In testing this setup in several containers on the home server, the additional load this put on the machine was enormous during an upgrade of Nextcloud; a hex-core with 32GB RAM and SSD-backed storage ended up with a load average nearing 30, far more than the normal 0.4 it typically runs at; not so much a problem with a distributed service but where traffic is monitored with some providers, the constant sync could have an impact on transfer caps.
Load and data transfer aside, the tests were successful; I updated Nextcloud from 11.0.3 to 12.0.0 and watched it almost immediately start replicating the changing data as the upgrade took place – it was beautiful.
This was naturally the 2nd attempt, as first I’d forgotten to leave the Nextcloud service in maintenance mode until all sync had ceased, and on accessing one of the nodes before it had completed, things started going wrong and the nodes fell out of sync. Keeping maintenance mode enabled until it was 100% synced across all nodes then worked every attempt (where an attempt involved restoring the database and falling back to snapshots from 11.0.3).
Update: SyncThing is a nice PoC, but if you’re seriously planning to run a distributed setup I’d strongly recommend Gluster or another distributed storage solution.
HAProxy wasn’t failing over fast enough. If a node went down (Apache stopped, MySQL stopped or server shut down) and the page refreshed anywhere up to 2-3 seconds later HAProxy may not have downed the node quite yet and throws an error.
Getting past that didn’t take long fortunately, and I ended up making the following changes to the HAProxy config:
server web1 10.10.20.1:80 <strong>check fall 1 rise 2</strong>
This tells HAProxy to check the servers in its configuration, but more importantly drop them out of circulation after only one failure to connect when doing a check, and require two successful checks to bring a server back in. After this it was much faster and knowing I had enough nodes to handle this type of configuration I wasn’t concerned about HAProxy potentially dropping a few of them out in quick succession if required.
Redis doesn’t really do clustering, and authentication options are limited.
This was pretty much a stopper for distributed session storage. Redis docs and a lot of Googling led me to the conclusion I can’t cluster Redis. Therefore each node would have to report session information on all Redis nodes. That’s not a dealbreaker, but worse, either Redis remains completely open on the internet for the nodes to connect to, or if authentication is used, it’s sent in plaintext. Not good. Redis suggests connecting in via VPN or tunneling, but that feels like it defeats the purpose of this exercise and requires a lot more configuration. Perhaps Memcached could solve the problem, or figuring out a way of replicating the session files with plain-old PHP session handling. I didn’t look much further into it.
Update: Speaking to TUBerlin, they got around this with IP-whitelisting between Redis nodes. Ultimately Redis can be left “open” and without authentication, however on the server/network level, only whitelisted IPs may successfully communicate with the Redis nodes. Within the Nextcloud configuration, stipulating all Redis nodes is the only way of achieving replication (as Nextcloud will write to them all).
SyncThing comes with user-based Systemd service files, and after a while of trying to make them succumb to my will for a custom user and home directory (because the web user doesn’t always have a “
home“, despite pointing to
/var/www/ SyncThing kept dying on me when I’d made the changes)
Being all data is presented, manipulated and used by
www-data for Nextcloud, I needed to ensure SyncThing ran as
www-data in order to retain permissions and not run into issues trying to manage data in a non-user directory (
/var/www/html/nextcloud/data). Because of this, I edited the default SyncThing service files to create some that aren’t user-based with a custom home directory, as follows:
So in order to confirm it was all working as it should be I did the following:
Here I dropped a Galera node, checked the state of Galera, brought it back in and checked again. Exciting.
And here’s a snippet of the load test at work on HAProxy (web nodes only):
I don’t know why I recorded that rather than screen capture, but c’est la vie.
During the load test Nextcloud remained responsive, load on the server went off the charts but that was fine as it too remained responsive. I could continue to upload, edit, download and use Nextcloud, so I was happy with it.
The next step would have been to start replicating the home server setup on remote servers, I’d considered a couple of containers with ElasticHosts, one or two LightSail servers from Amazon and perhaps a VPS with OVH. However this didn’t happen due to at this point finding out Galera isn’t supported, and Redis was going to cause me problems.
In conclusion, this type of deployment for Nextcloud seems to work, but isn’t feasible. Redis is a bit of a stopper and an alternative would need to be found, and Galera master-master is a major issue and a bit of an inconvenience in master-slave. Here’re the details I published over on the Nextcloud forums:
/data, however what happens if an admin adds an app on one node? It doesn’t show up on the others. Same for themes.
With the Redis exception I basically built an unsupported, but successful, distributed Nextcloud solution that synced well, maintained high availability at all times and only really suffered the odd discrepancy due to session storage not working properly.
SyncThing proved its worth to me, so I’ll definitely be looking more into that at some point soon. In the meantime, this experiment is over and all servers have been shut down:
If you have suggestions for another master-master database solution that could work*, and a session storage option that will either a) cluster or b) support authentication that isn’t completely plaintext, let me know!
*Keep in mind:
A multi-master setup with Galera cluster is not supported, because we require
READ-COMMITTEDas transaction isolation level. Galera doesn’t support this with a master-master replication which will lead to deadlocks during uploads of multiple files into one directory for example.
Have you attempted this kind of implementation with Nextcloud? Do you have any tips? Were you more or less successful than my attempt? Let me know in the comments, @jasonbayton on twitter or @bayton.org on Facebook.