Blog

How to choose the right backup strategy

We hear a lot about backups being neglected, and wrote an article some time ago about why backups are so important. Yet what can be equally infuriating is a backup policy that turns out to be less useful than it seemed when the time comes to actually use it. This article touches on the pros and cons of different approaches. In particular, we consider the speed of restoring a backup, compression to save space and encryption to protect the backed-up data.

Speed to restore

If you have a lot of important data – whole server backups, VM snapshots or lots of static content such as videos and images – speed is important. It is not unknown for a restore to take several days in these cases, due to a sub-optimal backup choice. If you have a lot of data, your backup strategy must account for the speed and cost of restoring everything in a worst-case scenario. An excellent article suggests why Amazon Glacier is the wrong choice for this.

If you have double-digit gigabytes of data to restore, I strongly recommend against heavy compression or encryption for your backups. If the data is very sensitive, consider backing up to an encrypted disk rather than encrypting the backups directly, as it is often faster. Compression and deduplication (i.e. storing incremental changes over time) can also degrade the speed of your backups, sometimes to unusable levels. Finally, consider the geographic distance between the backup storage and your normal server – it should be great enough that the backup survives a disaster, yet small enough that data doesn’t have to travel far.
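To put rough numbers on restore speed, here is a minimal sketch that estimates how long a full restore might take over a network link. The function name, the default 70% efficiency factor and the example figures are illustrative assumptions, not measurements; real restores also depend on disk speed, decompression and decryption.

```python
def restore_hours(data_gb: float, link_mbit_s: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to pull data_gb gigabytes over a link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually achieved (protocol overhead, contention and so on).
    """
    if data_gb < 0 or link_mbit_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("sizes and speeds must be positive, efficiency in (0, 1]")
    megabits = data_gb * 8 * 1000          # decimal units: 1 GB = 8,000 megabits
    seconds = megabits / (link_mbit_s * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical example: 2 TB of data over a 100 Mbit/s link
    print(f"{restore_hours(2000, 100):.1f} hours")
```

Under these assumed figures, 2 TB over a 100 Mbit/s link comes to roughly 63 hours, which is how a restore ends up taking days before any decompression or decryption overhead is counted.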

Compression, deduplication and encryption

Compression means rewriting files in a more space-efficient way. It has a lot in common with deduplication, which stores multiple versions of files by recording only the changes over time. Both are perfect for recovering from accidental deletions, as it’s easy to restore a handful of files from months or years ago. In these cases, the speed doesn’t matter much because, without compression and deduplication, you probably would not have been able to store the files for so long anyway.
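As a rough illustration of how the two ideas fit together, the sketch below keeps each distinct file content once (keyed by its SHA-256 hash) and compresses it with zlib. It is a toy, in-memory model with hypothetical class and method names; real backup tools usually deduplicate at block level and work quite differently in detail.

```python
import hashlib
import zlib


class DedupStore:
    """Toy backup store: identical content is stored once, compressed."""

    def __init__(self) -> None:
        self.blobs = {}      # SHA-256 hex digest -> compressed content
        self.snapshots = {}  # snapshot name -> {file path: digest}

    def backup(self, name: str, files: dict) -> None:
        """Record a snapshot; only content not seen before uses new space."""
        manifest = {}
        for path, content in files.items():
            digest = hashlib.sha256(content).hexdigest()
            if digest not in self.blobs:
                self.blobs[digest] = zlib.compress(content)
            manifest[path] = digest
        self.snapshots[name] = manifest

    def restore(self, name: str, path: str) -> bytes:
        """Return one file from a named snapshot."""
        return zlib.decompress(self.blobs[self.snapshots[name][path]])
```

Restoring a single old file is one lookup and one decompression, which is why this approach suits accidental deletions. A full restore has to repeat that for every file, and that is where the speed penalty discussed earlier comes from.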

Encryption is a more complex subject. Platforms that claim to offer encryption often offer encryption that can be decrypted by someone else. This sort of encryption offers negligible security benefits in my opinion. Good encryption needs to be end-to-end, meaning that there is only one feasible way to decrypt the data at any stage – with the relevant key. The downside is that strong, end-to-end encryption can be slow, and combining it with compression can make restores very slow indeed. It also places additional load on your servers, potentially increasing hosting costs.

For sensitive data, such as e-commerce databases, separate the sensitive data and store it somewhere away from prying eyes. Tarsnap is an excellent choice for this sort of sensitive data. Although the pricing seems fuzzy, at least one of our clients stores data at Tarsnap at a surprisingly low cost.

RAID and Snapshots

One client this week found out the hard way that neither RAID nor snapshots were of any use when the hosting provider had a major incident. The incident corrupted his data, the snapshot was then overwritten with the corrupt data, and eventually the provider told him that snapshots shouldn’t be used as backups. Right.

Snapshots are actually great when they are done properly, with a few stored off-site. They are fast to restore and generally very reliable. However, they only apply to backups being made by a hosting company on a virtual machine that they sell. It is not feasible to back up an entire running system consistently. Therefore, if you rely on your hosting company for backups, you should have a second backup, in case your hosting company gets it wrong – as they often do.

Backup Service Providers

With our managed hosting and managed cloud, we take care of the complexities of backups. Other managed service providers generally do the same. I would still urge clients to keep their own backups, particularly for the cases of a dispute between the provider and the client, but, generally, someone else is worrying about it.

For others, on unmanaged services, consider Tarsnap for your most sensitive data. If you use a control panel, such as cPanel, there is already a great backup tool built in that is specifically designed to work with your server and just needs some space. The space should be somewhere else, off your server, and you might be surprised at how cost-effective that backup space can be.

Get help implementing backups

Contact us to get help designing and implementing a backup strategy that really works.


How to Choose a CDN

There has been a surge in the number of CDN services and the use of CDN services in recent months. In this article, we will consider why a certain provider might be a good fit.

Whole site CDN / static website

Whereas CDNs are conventionally used to host only static assets (CSS, JavaScript, fonts and images), some informational websites, where every user sees the same page, can be served entirely from a CDN. In this case, you point your domain at the CDN service and the CDN acts as a reverse proxy for your website, much like Cloudflare and Incapsula, but without the bells and whistles. The CDN caches the whole website in different locations, meaning that your website load time is excellent. When you update content on the website, such as publishing a new article, you purge the CDN cache so that the new content is fetched.

This type of CDN is known as a Static Site CDN, and tends to include a number of features to make it easy to host static content in different locations. If your website is based on JavaScript, you can also consider pre-rendering it to improve speed and help Google to find your content.

Whereas static sites, that is, sites with just HTML content, were popular in the ’90s, they are less common today. Therefore, you may need software to generate one for you, using your existing website. Such static site generators are, incidentally, also surging in popularity. Specialist CDNs for static sites, such as Netlify, include built-in tools, although plenty of open source tools are available for different programming languages. Plugins are also available for WordPress.

You can, generally, use almost any CDN for your static site. However, you should consider the reach of the CDN (as always, they should be present where your users are). The features offered by the CDN are also especially important, as, for example, you may want to use your own SSL certificate and control cache expiry times. Aside from Netlify, we have found KeyCDN to be an excellent choice to maximise the use of modern software innovations as well as an extensive network and reasonable prices.

Network coverage

Your CDN should, first and foremost, be present where your users are. Cheap CDNs, like MaxCDN, generally cover locations where hosting is cheaper, to keep costs low. If all of your users are based in the EU and US, this may be ideal, but bear in mind that performance for users (and potentially search rankings) outside these areas – primarily Australia and Asia – will be limited. Consider whether, for a small percentage cost difference, it is worth using a CDN with better coverage.

Equally, many of the “big name” CDNs, like Edgecast and Akamai, have points of presence (POPs) in a very large number of locations. The benefit is touted as the content being as close as possible to the user. This is, potentially, correct. However, bear in mind that if the POP does not have your content in its cache, which is more likely on a larger network, it has to fetch it, usually from your server, before delivering it. This is far from ideal. To counterbalance this, some such CDNs offer their own storage facility, which should be considered alongside the CDN offer.
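The trade-off between proximity and cache hits can be sketched with simple arithmetic. The function below is an illustrative model only: the name, the latency figures and the hit ratios are all made-up assumptions, and real CDN behaviour is more complicated.

```python
def expected_latency_ms(hit_ratio: float, edge_ms: float, origin_fetch_ms: float) -> float:
    """Average response time when a cache miss adds a fetch from the origin.

    hit_ratio       -- fraction of requests served from the POP's cache (0 to 1)
    edge_ms         -- time for the user to reach the nearest POP
    origin_fetch_ms -- extra time the POP needs to fetch from your server on a miss
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return edge_ms + (1.0 - hit_ratio) * origin_fetch_ms


if __name__ == "__main__":
    # Hypothetical comparison: fewer POPs with a warm cache vs. many POPs with a colder one
    print(expected_latency_ms(0.95, 20, 300))  # smaller network, further away
    print(expected_latency_ms(0.60, 10, 300))  # larger network, closer POPs
```

With these invented figures, the smaller network averages about 35 ms and the larger one about 130 ms, despite its POPs being closer. The point is only that hit ratio can outweigh proximity, not that either figure is typical.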

Neither Akamai nor Edgecast (now Verizon) sells directly to small businesses. Akamai’s CDN is available via Rackspace and Edgecast is available via Speedy Rails. These resellers offer access to the same network, although the level of control is sometimes limited and, in particular, there may be significant additional costs for certain features that are free with smaller CDNs.

Features

Features should be considered carefully. There are numerous software innovations that, alongside good network coverage, are a significant benefit:

  1. HTTP/2
    To vastly simplify this evolution in Internet standards, if your server or CDN supports HTTP/2, all modern web browsers, including mobile, will be able to download more than one resource using the same connection. Since there is some overhead at both ends for each connection, HTTP/2 can make a surprising difference to the speed of your website.
  2. Compression
    Conventionally, we hear that HTTP compression, such as gzip, makes websites faster. In reality, one of the reasons why this is not always the case, is that the server has to compress the content before sending it. If the compressed content is not cached, which it usually isn’t, and the website is busy, it can add noticeable CPU overhead to the server, slowing down the website overall. A CDN that implements compression for you is giving you access to compression for nothing.
  3. Minification
    CSS, HTML and JavaScript can be minified, that is, most of the white space can be removed without impacting the functionality of the files. Consider this carefully, as the implementation of a given minification algorithm may break your specific content. CSS is generally safe; JavaScript and HTML can be less safe. Much like compression, offloading this to the CDN saves your server some of its resources.
  4. Expiry Times
    Some CDNs let you specify your own expiry times, and specify headers too. Fine-grained control over the amount of time that browsers hold on to CDN-delivered content, and how long the CDN itself holds on to the content, can be extremely useful to ensure that your website is as fast as it can be, without impacting functionality.
  5. HTTPS
    Whereas most CDNs let you use HTTPS, some are stuck in the dark ages, when there was a cost associated with it. These are generally the larger CDN networks and also Fastly. For these, you may not be able to use HTTPS with your own domain, due to certificate limitations, or may not be able to use an EV SSL certificate, if you would like to, due to the cost involved in setting it up and maintaining it. This is likely to change in the near future, as standards such as SNI are implemented more widely.
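To illustrate the expiry times item in the list above, here is a minimal sketch that builds a Cache-Control header with separate lifetimes for browsers (max-age) and shared caches such as a CDN (s-maxage). The lifetimes and the function name are hypothetical, and whether a given CDN honours s-maxage from your origin or expects its own setting is an assumption to check against that CDN's documentation.

```python
import os

# Hypothetical lifetimes in seconds: (browser, CDN). Tune these for your own site.
LIFETIMES = {
    ".css": (86_400, 604_800),   # 1 day in browsers, 7 days on the CDN
    ".js": (86_400, 604_800),
    ".html": (0, 300),           # browsers revalidate, CDN keeps pages for 5 minutes
}
DEFAULT_LIFETIME = (3_600, 86_400)


def cache_control(path: str) -> str:
    """Return a Cache-Control value chosen by file extension."""
    extension = os.path.splitext(path)[1].lower()
    browser_seconds, cdn_seconds = LIFETIMES.get(extension, DEFAULT_LIFETIME)
    return f"public, max-age={browser_seconds}, s-maxage={cdn_seconds}"
```

Keeping HTML short-lived while letting assets live longer means a newly published article appears quickly without stylesheets and scripts being downloaded again on every visit.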


Free SSL Certificates with Let’s Encrypt

One of the most popular articles on our blog is an article dispelling common myths around SSL certificates. Let’s Encrypt gained a lot of attention after the article was written. Let’s Encrypt is a radically new way of getting an SSL certificate that is primarily popular because it’s free. The motivation is supposedly that, in 2015, there is really no reason not to use HTTPS. The overhead was considerable in the ’90s but today, in most circumstances, there is no perceivable difference. With recent revelations about ISP and government spying, other concerns about privacy and a spate of large hacks, there is potentially a lot to be gained. Google is also prioritising websites using HTTPS in its rankings and, more recently, in Chrome.

The cost to issue SSL certificates is generally very low so it is possible to offer them for nothing and rely on advertising or sponsorship. Let’s Encrypt is sponsored by several recognised names including Mozilla, Cisco and Facebook. It has also recently left beta, meaning that they believe that it’s stable enough for normal use, which has also been our experience.

Let’s Encrypt vs. the competition

Most SSL issuers have a set way of issuing certificates. For certificates other than EV (extended validation, “green bar”), you receive an email at a recognised email address on the domain that you want to secure. This proves that you control the domain and allows them to fulfil their obligations. Comodo has also added some other options such as creating a DNS record or uploading a file to your webspace. In all cases, when you’ve done this, you get the certificate, for at least one year.

The certificates aren’t usually recalled except under extreme circumstances (e.g. if you bought from a reseller who didn’t pay their bill) although they can, hypothetically, be recalled in most cases using OCSP, a modern standard, also aggressively supported by Google, that allows the issuer to revoke it in a way that is obvious to most users.

To use Let’s Encrypt, you or your hosting company or server administrator must install some software on the server that automates the Comodo-style file upload process. Every so often, it speaks to Let’s Encrypt to get a new validation file and places it in your web root. Let’s Encrypt sees it and issues a new certificate that is good for another (in our experience) 90 days. The impressive feat here is that once the software is installed, it’s all automatic. The downside is that the software must be installed and maintained, although it is fairly easy.
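The file placement step described above can be sketched as follows. This is not a working Let’s Encrypt client: it only shows where the ACME HTTP challenge expects the file to live (under .well-known/acme-challenge/ in the web root). The function name is hypothetical, and the token and file content are assumed to have been obtained from Let’s Encrypt already by real client software such as Certbot.

```python
import string
from pathlib import Path

CHALLENGE_DIR = Path(".well-known") / "acme-challenge"
ALLOWED = set(string.ascii_letters + string.digits + "-_")


def place_challenge(web_root: str, token: str, content: str) -> Path:
    """Write the validation file for one challenge and return its path."""
    # Accept only ASCII letters, digits, '-' and '_' so the token
    # cannot point outside the challenge directory.
    if not token or not set(token) <= ALLOWED:
        raise ValueError("unexpected characters in token")
    target_dir = Path(web_root) / CHALLENGE_DIR
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / token
    target.write_text(content)
    return target
```

In practice the client software also talks to Let’s Encrypt, removes the file afterwards and installs the new certificate, which is why an existing client is preferable to writing your own.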

Those who use cPanel can install a plugin (that works well in our experience) to add and remove certificates to/from websites without dealing with an issuer or reseller. A web hosting provider with hundreds of accounts can save their users a lot of expense and time with this plugin.

Why you might not use Let’s Encrypt

There are relatively few reasons why one should not use it. The security is the same, it costs nothing and the issuance process is usually easier. If your hosting company doesn’t allow you to install the software or doesn’t support it themselves, for now, you’ll have to buy certificates the old way. Equally, if you have a lot of subdomains to secure or want a “green bar”, you might still opt for a wildcard or EV certificate since, at the time of writing, Let’s Encrypt don’t issue them. I’ve yet to see any impartial numbers on how EV certificates and site seals impact sales – I would guess the impact is minimal in percentage terms. However, for a busy e-commerce website, it is probably worth the £30-100 to buy an EV certificate even if the percentage effect on conversions is very small.

Overall, the author is glad that the days of expensive SSL certificates are coming to a close. It was an industry that we could do without and Let’s Encrypt have had a substantial impact on the way SSL certificates will be issued from now on.


Open source SSH gateway

If, like us, you manage a lot of servers, it can be difficult to organise them. Some time ago, we explored a few options and settled on Ezeelogin. Ezeelogin is one of very few products that provides an easy to use SSH gateway. In our case, however, the problem we needed to solve was very simple: to allow trusted staff members easy access to an evolving catalogue of servers.

Today, we are open sourcing the custom built SSH gateway that we use in-house. It uses a MySQL database to store server groups, servers and users (there is no web interface for this yet but phpMyAdmin would do everything necessary). When a user logs in, it looks up which groups that user can see. The user selects a group and it displays all of the servers in that group. The user then selects a server and it logs them in.

Authentication between the user and gateway is handled in the normal way – they have their own system user and can use a password or a key – whatever is set in sshd_config. Authentication between the gateway and the server is with an SSH keypair that is shared by all users but can only be seen by root (on a properly configured server).

To reiterate, whereas some products have focused very heavily on security, we have a number of security policies within our company that reduce the need for the gateway to be anything more than a tool for convenience. That said, I recommend the following steps to secure the gateway:

  1. Encrypt the gateway’s disk – or use an encrypted volume to store the SSH keypair
  2. Host the gateway on your office server if you have one
  3. Turn off password authentication on the gateway in sshd_config

 

Download

Download it on GitHub

Installation

  1. Import the MySQL database
  2. Edit the database as required using phpMyAdmin, the command line or anything else
  3. Upload gwshell to /sbin and make it executable (“chmod +x /sbin/gwshell”)
  4. Optionally upload gwuseradd to /sbin and make it executable (“chmod +x /sbin/gwuseradd”) – this is a convenience tool for the admin to add users
  5. Generate an SSH key on the gateway server in /etc/sshgateway/id_rsa and add the public key to the servers you want to control
  6. Disable SELinux
  7. Allow sudo access without a password for members of the wheel group (see /etc/sudoers)

When new users are added, they must have access to the wheel group and therefore, sudo. This is to access the shared key. You can of course copy keys, use individual keys or set up password authentication if you wish – but users should not be able to access the normal shell anyway. Each user’s shell (in /etc/passwd) should be /sbin/gwshell, which means that they see the gateway rather than a normal shell when they log in. All of this is handled in gwuseradd if you choose to use it: “gwuseradd [username]”.

Usernames in /etc/passwd need to match usernames in the database. If there is no match, the user won’t be able to access any servers. The groups field in the “users” table is comma-separated so, if you want to give a user access to groups 1 and 3, enter “1,3”.
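The lookup described above (match the username, split the comma-separated groups, list the servers in each group) can be sketched like this. It is not the gateway’s actual code: the table and column names are hypothetical, and an in-memory SQLite database stands in for MySQL so that the example runs on its own.

```python
import sqlite3


def demo_db() -> sqlite3.Connection:
    """Build a tiny stand-in database (hypothetical schema, not the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (username TEXT PRIMARY KEY, group_ids TEXT);
        CREATE TABLE servers (hostname TEXT, group_id INTEGER);
        INSERT INTO users VALUES ('alice', '1,3');
        INSERT INTO servers VALUES ('web1', 1), ('db1', 2), ('mail1', 3);
    """)
    return conn


def servers_for(conn: sqlite3.Connection, username: str) -> dict:
    """Map each group the user may see to the hostnames in that group."""
    row = conn.execute(
        "SELECT group_ids FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return {}  # no matching username: the user sees no servers
    group_ids = [int(part) for part in row[0].split(",") if part.strip()]
    result = {}
    for group_id in group_ids:
        rows = conn.execute(
            "SELECT hostname FROM servers WHERE group_id = ? ORDER BY hostname",
            (group_id,),
        )
        result[group_id] = [hostname for (hostname,) in rows]
    return result
```

Here a user whose groups field is “1,3” sees the servers in groups 1 and 3 but not the one in group 2, and a system user with no matching database row sees nothing.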

Support, licence etc.

The software is MIT licensed. We hope it is as convenient for you as it is for us. You may use it free of charge, at entirely your own risk. Because it is free, if you need support from us, obviously, it will be chargeable. It is pretty easy to use, though.
