Yesterday I was doing some cleanup on my personal server infrastructure on DigitalOcean and noticed that I had a webserver I didn't really use. It had been set up early on for some loadbalancing testing and now served a few very light sites. That role has since been taken over by my pi-cluster, so there wasn't any need for it anymore. So I selected it for deletion, DigitalOcean asked me to confirm that I wanted to delete the ocean-web-3 server, and I happily agreed. At which point all my sites went down, even those that hadn't been on that webserver.
I quickly noticed that I had accidentally swapped the name of this server with the loadbalancer's, and instead of deleting a spare webserver I had deleted my loadbalancer. So even though I had servers ready to serve requests, nobody could get to them. That wasn't the best, but I learned a few lessons from it that I'll share here.
It brought a lot of positives
Deleting this server cost me some time, but it helped me better understand security, what can go wrong and how to recover. When dealing with destructive actions at work, we are suitably paranoid. That is a great thing, but "unfortunately" it reduces my exposure to problems, which means I don't have a lot of personal experience to base stability development on, and that makes it easier to miss something. Especially the human psychology aspects: I thought I had deleted the right server, and I double-checked that the selected server had the right name, yet it was still wrong. I should also note that the hostname on each server was correct; it was exclusively in the DigitalOcean web interface that the naming was incorrect.
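Since the name was the thing that lied to me, one possible safeguard is to check a server by its IP address instead. The snippet below is only a rough sketch of that idea, not something I actually had in place. The record layout is modelled on how I understand droplet data to look (a `networks` entry with `v4` addresses), and the name `ocean-lb-1` and both IP addresses are made up for illustration.

```python
def public_ips(droplet):
    """Return the public IPv4 addresses listed in a droplet record."""
    return {
        net["ip_address"]
        for net in droplet.get("networks", {}).get("v4", [])
        if net.get("type") == "public"
    }


def safe_to_delete(droplet, live_ips):
    """Refuse deletion when the droplet owns an IP that live sites resolve to."""
    return public_ips(droplet).isdisjoint(live_ips)


# Hypothetical records: the panel labels are swapped, as in the incident.
mislabelled_lb = {
    "name": "ocean-web-3",
    "networks": {"v4": [{"ip_address": "203.0.113.10", "type": "public"}]},
}
spare_web = {
    "name": "ocean-lb-1",
    "networks": {"v4": [{"ip_address": "203.0.113.20", "type": "public"}]},
}
live_ips = {"203.0.113.10"}  # assumed: the address the sites' A records point at

print(safe_to_delete(mislabelled_lb, live_ips))
print(safe_to_delete(spare_web, live_ips))
```

The point of the design is that the check ignores the label entirely, so a swapped name cannot fool it.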
Monitoring is nice
Servers went down and I got notifications from UptimeRobot quickly afterwards. Pretty nice. Having external validation of your sites and their performance is a neat thing that brings problems to your attention much faster than you might notice them yourself.
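To give an idea of what such an external check boils down to, here is a minimal sketch in Python. It is not how UptimeRobot works internally, just the basic idea of requesting a page from the outside and looking at the answer. The function name and the `opener` parameter are my own inventions; the latter is only there so the check can be exercised without a network.

```python
import urllib.error
import urllib.request


def check_site(url, opener=urllib.request.urlopen, timeout=10):
    """Return (is_up, detail) for one URL; `opener` is injectable for testing."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except (urllib.error.URLError, OSError) as exc:
        # Covers refused connections, timeouts, DNS failures and HTTP errors.
        return False, str(exc)
    return 200 <= status < 400, f"HTTP {status}"


def check_all(urls, opener=urllib.request.urlopen):
    """Check every URL and return a dict of url -> (is_up, detail)."""
    return {url: check_site(url, opener=opener) for url in urls}
```

Calling `check_all` with a list of your own site URLs from a machine outside your infrastructure, on a schedule, gets you the basics. A hosted service adds the notifications and the history on top.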
Being able to spin up a complete new instance automatically is soooo nice
The loadbalancer was dead and I had to build a new one. Luckily, it was built using an Ansible script. So it was a simple matter of running that script on a fresh new instance, and all required libraries were installed and all configuration was set up exactly as it was on the previous host. Super brilliant, I didn't have to try and remember what my HAProxy configuration was or anything like that. The latest version was simply ported over.
Especially if everything was there
So I just said that it was super nice that all the libraries and configuration got ported over and it was ready to go. But there was one exception: I didn't have my SSL certificates. They were stored solely on the server and not backed up anywhere else. Fortunately, I'm running Let's Encrypt and I was able to quickly acquire new certificates without too much trouble. They do have a rate limit of about 5 sites every week, so if I had had just a single site more I wouldn't have been able to acquire a certificate for it. That would have meant it would be essentially down for a week, as I have set up HSTS and so cannot serve any of my sites over non-SSL to anyone who has visited in the past.
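Backing the certificates up is not hard, so here is a small sketch of what that could look like. It assumes the certificates live in a single directory; `/etc/letsencrypt` is the default for the certbot client, but that path and the function name are assumptions on my part, so adjust them to your own setup.

```python
import tarfile
import time
from pathlib import Path


def backup_directory(source, dest_dir):
    """Write a timestamped .tar.gz of `source` into `dest_dir` and return its path."""
    source = Path(source)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths relative, so the archive restores cleanly elsewhere.
        tar.add(source, arcname=source.name)
    return archive
```

Something like `backup_directory("/etc/letsencrypt", "/root/backups")` would do for a start, but the archive then has to be copied off the server, otherwise it disappears together with the instance. It also contains private keys, so it should be stored somewhere suitably protected.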
DNS should be easily handled
It used to be that all my sites had an A record pointing to the loadbalancer. So after bringing up a new instance I had to reconfigure the DNS for all sites. That is both laborious and annoying, plus, since there is caching, it would take an hour for the change to take effect.
Not anymore though. DigitalOcean has something called floating IPs, which are similar to Amazon's Elastic IPs. Essentially you send your traffic to this IP and they, behind the scenes, forward that traffic to your instance of choice. This is pretty neat, as you can perform that change just once and it propagates instantaneously. Now all my sites point to this floating IP, so should this happen again, or should I just want to switch to another instance, I can do that super easily. Oh, and floating IPs are free, which just makes it an even better idea to use them.
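The switch can also be scripted. The sketch below builds the API request that, as far as I understand the DigitalOcean v2 API, assigns a floating IP to a droplet. The endpoint path and body follow my reading of their documentation (newer docs seem to call the feature reserved IPs), so verify against the current reference before relying on it. The function name, the IP, the droplet ID and the token are placeholders.

```python
import json
import urllib.request

API_BASE = "https://api.digitalocean.com/v2"


def build_assign_request(floating_ip, droplet_id, token):
    """Build (but do not send) the request that points a floating IP at a droplet."""
    body = json.dumps({"type": "assign", "droplet_id": droplet_id}).encode()
    return urllib.request.Request(
        f"{API_BASE}/floating_ips/{floating_ip}/actions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only; nothing is sent here.
request = build_assign_request("203.0.113.50", 123456, "YOUR_API_TOKEN")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually perform the switch. I kept building and sending separate so the request can be inspected first, which seems fitting given how this whole story started.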