First and foremost, Christmas is not here yet. We have two more months before we have to sit down with our most annoying relatives. Phew.
But, if you’re in retail, you’ll know how important the next two months are. The next two months pay for the rest of the year. If they go well, you’ll be in great stead for 2018. If they go badly, you might get your P45 in your Christmas card.
If you have physical stores, you might be advertising temporary vacancies to help cope with the rush. Good thinking, Batman.
Your online store might not have the capacity it needs.
There are four factors that influence the capacity of your online store. If you use Google Analytics, the first two are really easy to find, so you should definitely look them up now.
Log in and select your store and go to Audience > Overview.
Under Overview, open the metric drop-down. You'll probably see Sessions selected; choose Pageviews.
On the right-hand side, choose Hourly.
Then hover over the graph to see the number of pageviews per hour.
Select your busiest period. This might be your Black Friday or Christmas sale from last year. Find the busiest hour or the busiest day.
To make the maths easy, let's say there were 36,000 pageviews in that hour. There are 3,600 seconds in an hour, so divide that number by 3,600 to get your pageviews per second. In our example, that's 10.
We then need to adjust this number for your projections this year. If you’re spending more on advertising this year, you might want to add 20%. If you’re spending less, you might want to subtract 20%. It’s better to overshoot than to undershoot, though, and it’s worth adding a buffer in case your estimate is a bit out.
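As a back-of-the-envelope sketch, the whole calculation looks like this in Python. The traffic figure and percentages are the hypothetical ones from above; substitute your own:

```python
# Hypothetical figures - substitute your own from Google Analytics
peak_pageviews_per_hour = 36_000
growth_adjustment = 0.20   # e.g. +20% if you're spending more on advertising
safety_buffer = 0.25       # headroom in case the estimate is a bit out

base_rate = peak_pageviews_per_hour / 3600   # pageviews per second
target = base_rate * (1 + growth_adjustment) * (1 + safety_buffer)
print(f"base: {base_rate:.1f} pv/s, target capacity: {target:.1f} pv/s")
```

The buffer and growth figures are guesses you should tune to your own plans; the point is simply to write the assumptions down rather than pick a server size on gut feel.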
The number you arrived at is the number of pageviews per second that your store and hosting need to be able to accommodate so your Christmas card contains an awful joke and not a P45 (I’m not selling this well, am I?)
Armed with your pageviews per second, your developer and hosting company (or Coherent… just sayin’) will be able to run tests to determine if you’re already in good stead or if you need to make changes so your online store doesn’t fall flat when you most need it up and running.
Cloud is increasingly popular due to its low upfront costs and perceived reliability. I wrote an article a few weeks ago about why it can be a sensible choice but needs to be considered carefully.
One of the main drawbacks of cloud (VPS / VDS / VM / cloud server / IaaS / whatever you want to call it – it's generally the same thing) is CPU power. The physical servers that host virtual servers are very easy to add disks and memory to, and since the incorporation of solid state disks into cloud offers, disk IO is no longer a primary concern either. However, the ratio of CPU resources to memory and disk resources can be rather limiting. Consider the following hypothetical but realistic scenario, supported by anecdotal evidence and server suppliers' hardware specifications:
2x Xeon E5-2620V3 (12 cores, 24 threads)
12 x 512GB SSD in RAID10 = 3TB usable space
Let's package that into VMs by dividing it up: each VM ends up with access to just 0.375 of one CPU thread!
As you can see, the disk and memory seem plausible but the CPU seems dismal. Of course, this will vary from provider to provider – and you won’t see “0.375 of 1 CPU thread” but more likely something like “shared CPU”.
It can get worse, however: many providers oversubscribe the resources on the physical servers – i.e., they sell more resources than they have, knowing that, generally, the physical server will cope.
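To make the arithmetic concrete, here is a sketch assuming, purely hypothetically, that the host above is carved into 64 equal VMs and oversold by 50%. Neither number comes from any particular provider:

```python
# Host specs from the scenario above; the VM count and oversubscription
# factor are assumptions for illustration only
cpu_threads = 24           # 2x Xeon E5-2620V3 (12 cores, 24 threads)
disk_gb_usable = 3 * 1024  # 12x 512GB SSD in RAID10 = 3TB usable
vms_sold = 64              # assumed number of equal VMs carved from the host
oversubscription = 1.5     # assumed: provider sells 50% more than exists

threads_per_vm = cpu_threads / vms_sold
disk_per_vm = disk_gb_usable / vms_sold
print(f"CPU per VM: {threads_per_vm} of one thread")  # 0.375
print(f"disk per VM: {disk_per_vm:.0f} GB")           # 48 GB
print(f"effective CPU if oversold: {threads_per_vm / oversubscription:.2f} of one thread")
```

The disk and memory per VM look perfectly saleable; the CPU fraction is the number that quietly shrinks as the provider packs more VMs onto the box.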
What does this mean for the customer? It means you need to think carefully about CPU: it's the most expensive aspect of cloud, the most necessary for many common cloud applications, the hardest to scale – and often the hardest to get solid information on. It reinforces the argument in my earlier post that cloud is great for small deployments but needs to be weighed very carefully against dedicated options for larger ones.
The technically interesting bit
A grand challenge in cloud is not just gathering generalised, historical performance data but knowing how your application will perform when you deploy it. There is no definitive solution to this challenge yet; however, there are ways of getting some idea of how an application is performing now. Utilities like top, which systems administrators are familiar with, provide some insight and have changed in recent years to accommodate the prevalence of shared resources.
Systems administrators will probably know the first five columns of the bottom row very well – they are the most commonly used and, generally, are very useful indicators of how a system is performing. They rely on what is called tick-based CPU accounting, where every CPU tick (the atomic unit of CPU time) is categorised based on how the kernel decides it should be used, and a quick calculation is made to arrive at percentages. The kernel uses a scheduler to manage CPU resources efficiently when there are potentially far more applications running than CPU threads available, by switching them in and out.
You may be interested to know that the unit of time between CPU scheduler decisions is, in technical parlance, called a jiffy, and that a well-known alternative to the mainline Linux scheduler (the Completely Fair Scheduler) is the Brain Fuck Scheduler, written by an anaesthetist turned computer scientist who liked to argue on forums.
Anyway, the very last column, “st”, stands for “steal” time. This is supposed to account for CPU time that is “stolen” by the virtualisation software on the physical server and is therefore completely unavailable to the kernel on the virtual server. The virtual server isn’t really aware of why it isn’t available but in a virtual system, it’s safe to assume that it’s probably the result of noisy neighbours and high contention.
Steal time is not easy to measure, as virtual servers depend on the hypervisor for resources that would otherwise be physical – the obvious ones like storage and network devices, but also the simple act of keeping time, which is an integral part of the system. Thus, the number you see under the "st" column depends not only on contention but also on the virtualisation technology. Literature suggests that IBM's virtualisation technology allows CPU time to be accounted for extremely accurately, followed by Xen, followed by others that are less effective. In our Proxmox setup, no steal time was reported even under very high load and oversubscription.
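As a rough illustration of tick-based accounting, the sketch below converts a "cpu" line in the format Linux exposes in /proc/stat into percentages, including the "st" column. The sample line is made up; on a real system you would take the difference between two readings rather than the since-boot totals:

```python
def cpu_percentages(statline: str) -> dict:
    """Convert one 'cpu' line (in /proc/stat field order) into percentage shares.

    Field order: user, nice, system, idle, iowait, irq, softirq, steal.
    """
    fields = ["us", "ni", "sy", "id", "wa", "hi", "si", "st"]
    ticks = [int(v) for v in statline.split()[1:1 + len(fields)]]
    total = sum(ticks)
    return {name: 100 * t / total for name, t in zip(fields, ticks)}

# A made-up sample; on a real system, read /proc/stat twice and use the deltas
sample = "cpu  400 0 100 1400 50 0 0 50"
pct = cpu_percentages(sample)
print(f"steal: {pct['st']:.1f}%")  # 50 of 2000 ticks = 2.5%
```

A persistently non-zero steal percentage on a virtual server is a reasonable hint of noisy neighbours or oversubscription, with the caveat above that some virtualisation stacks under-report it.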
We often work with clients on speeding up their websites. Great results are achievable, but focus needs to be put on the right areas, as there are myriad reasons why a website can be slow. Recently, we worked with a client to speed up a customised WordPress website by implementing heavy caching with Nginx and extensive tuning for HTTP2 and Google PageSpeed. The result is that real visitors at a significant geographical distance see the page in less than 0.6s, and Google PageSpeed Insights and GTMetrix are both very happy. However, the extensive work needed to win every possible millisecond for as many visitors as possible highlighted several important points.
Myth #1: Small images should be combined into sprites
A few years ago this was absolutely true. Today, almost every major browser supports HTTP2. This means that, between a server configured to use HTTP2 and a browser with HTTP2 support, several files can be downloaded simultaneously over one connection – and this is often faster than downloading one large combined file.
Myth #2: Load x first, then y, then z
Myth #3: You need to spend more on hosting
I am disappointed that website speed is so closely associated with server size and hosting costs. We have successfully moved high-traffic websites to smaller servers and made them faster at the same time. Part of it is the configuration of the server, which should aim to make efficiency savings, and part is the software itself. If your software – WordPress, Magento or anything else – is badly coded (think about plugins in particular: the ones you downloaded without paying much attention ;)) then the additional hosting cost to achieve a better speed can be very high. It is better to deal with the problems in the code so that the server doesn't have to do as much work.
Myth #4: You need a CDN
I am generally in favour of using a CDN. However, it is not the secret sauce it is sometimes made out to be. A good rule of thumb: if your website is fast for local visitors but slow for international visitors (and you have enough international visitors for this to be a problem), then you should consider a CDN. Be aware that CDNs introduce more complexity: they need to cache your content, so your website needs to tell the CDN when content has changed and to reference the CDN on every page. A CDN is relatively unlikely to improve the speed of a low-traffic website, or one where the geographic distance between the server and the users is already fairly small.
Myth #6: Google PageSpeed Insights will tell me how fast my website is
Tools like Google PageSpeed Insights, Pingdom Tools and GTMetrix need to be interpreted properly. I despise the fact that they make it so easy for website owners to see what appear to be problems. PageSpeed takes little account of HTTP2, if any. There is debate as to whether Google rankings correlate with this tool's scores, but as far as real visitors are concerned, I wouldn't think about it much. GTMetrix is a very advanced tool that takes a lot of different factors into account – it is a very good tool. However, despite its bells and whistles, it too needs to be interpreted properly. Your score on GTMetrix is not always representative of how users see your website.
On a side note, the Qualys SSL Labs test also needs to be interpreted properly. It gives you some (actually very good) information about the security of your HTTPS configuration. However, what it neglects to mention is that not having a top score is not necessarily a problem. Why? Because in some cases, using only the latest ciphers will cause problems for users in locations where, for example, Windows XP is still widely used (there are some!). Equally, a very strong HTTPS setup can hurt the performance of your website, as more CPU power is needed to handle the cryptography.
Myth #7: Always enable Gzip compression
Following on from my aside about HTTPS encryption using CPU, gzip compression uses CPU power too. If your website is busy and your server's bottleneck is the CPU, enabling gzip will take CPU resources away from the server's normal activities and may well make it slower. If you simply want to tick the "gzip?" box on online tools – which I discourage until there is substantial evidence that Google rankings depend on it – enable it at level 1, the lowest level. Level 1 is substantially better than nothing for page size and, relatively speaking, not that different from any other level. It does, however, use much, much less CPU time to compress and decompress.
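You can get a feel for this trade-off with a quick sketch using Python's standard gzip module. The sample data is artificial and deliberately repetitive, so real pages will compress differently, but the shape of the result – level 1 gets most of the saving for a fraction of the time – tends to hold:

```python
import gzip
import time

# Repetitive sample data standing in for an HTML page
data = b"<div class='product'><span>Example item</span></div>\n" * 2000

for level in (1, 6, 9):
    start = time.perf_counter()
    out = gzip.compress(data, compresslevel=level)
    ms = (time.perf_counter() - start) * 1000
    print(f"level {level}: {len(out) / len(data):.1%} of original in {ms:.2f} ms")
```

Run it against a dump of your own pages before deciding; if level 1 already shrinks your HTML by an order of magnitude, the extra CPU spent at level 9 buys very little.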
We hear a lot about backups being neglected, and we wrote an article some time ago about why backups are so important. Yet what can be equally infuriating is a backup policy that is not as useful as it might seem when the time comes to actually use it. This article touches on the pros and cons of different approaches. In particular, we consider the speed of restoring a backup, compression to save space and encryption to protect the backed-up data.
Speed to restore
If you have a lot of important data – whole-server backups, VM snapshots or lots of static content such as videos and images – speed is important. It is not unknown for a restore to take several days in these cases, due to a sub-optimal backup choice. If you have a lot of data, one consideration in your backup strategy must be the speed and cost of restoring everything in a worst-case scenario. An excellent article explains why Amazon Glacier is the wrong choice for this.
If you have double-digit gigabytes of data to restore, I strongly recommend against heavy compression or encryption for your backups. If the data is very sensitive, consider backing up to an encrypted disk rather than encrypting the backups directly, as it is often faster. Compression and deduplication (i.e. storing incremental changes over time) can also slow your restores to sometimes unusable levels. Finally, consider the geographic distance between the backup storage and your normal server – it should be great enough that the backups survive a disaster, yet small enough that data doesn't have to travel far.
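A quick, hypothetical sketch of why restore speed matters: pure transfer time for a given data size and sustained throughput, ignoring decompression and decryption entirely (the throughput figures are examples, not benchmarks):

```python
def restore_hours(data_gb: float, throughput_mbps: float) -> float:
    """Hours to transfer data_gb gigabytes at a sustained throughput in megabits/s."""
    seconds = data_gb * 8 * 1000 / throughput_mbps
    return seconds / 3600

# Hypothetical worst-case restores
for gb, mbps in [(50, 100), (2000, 100), (2000, 20)]:
    print(f"{gb} GB at {mbps} Mbit/s: {restore_hours(gb, mbps):.1f} hours")
```

Two terabytes over a 20 Mbit/s link is over nine days of transfer alone – before any decryption or decompression – which is exactly how "we have backups" turns into a week of downtime.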
Compression, deduplication and encryption
Compression means rewriting files in a more space-efficient way, and it has a lot in common with deduplication, which avoids storing multiple full copies of files by recording only the changes over time. Compression and deduplication are perfect for recovering from accidental deletions, as it's easy to restore a handful of files from months or years ago. In these cases, speed doesn't matter much because, without compression and deduplication, you probably could not have afforded to keep the files for so long anyway.
Encryption is a more complex subject. Platforms that claim to offer encryption often offer encryption that can be decrypted by someone else; this sort of encryption offers negligible security benefits in my opinion. Good encryption needs to be end-to-end, meaning there is only one feasible way to decrypt the data at any stage – with the relevant key. The downside is that strong, end-to-end encryption can be slow, and combining it with compression can make restores very slow indeed. It also places additional stress on your servers, potentially increasing hosting costs.
For sensitive data, such as e-commerce databases, separate the sensitive data and store it somewhere away from prying eyes. Tarsnap is an excellent choice for this sort of sensitive data. Although the pricing seems fuzzy, at least one of our clients stores data at Tarsnap at a surprisingly low cost.
RAID and Snapshots
One client this week found out the hard way that neither RAID nor snapshots were of any use when the hosting provider had a major incident: the incident corrupted his data, the snapshot was overwritten with the corrupt data, and eventually the provider told him that snapshots shouldn't be used as backups… Right.
Snapshots are actually great when they are done properly – with a few stored off-site. They are fast to restore and generally very reliable. However, they generally only apply to backups made by a hosting company of a virtual machine that they sell; from inside a running system, it is not feasible to back everything up consistently. Therefore, if you rely on your hosting company for backups, you should keep a second backup of your own, in case your hosting company gets it wrong – as they often do.
Backup Service Providers
With our managed hosting and managed cloud, we take care of the complexities of backups. Other managed service providers generally do the same. I would still urge clients to keep their own backups, particularly for the cases of a dispute between the provider and the client, but, generally, someone else is worrying about it.
For others, on unmanaged services, consider Tarsnap for your most sensitive data. If you use a control panel such as cPanel, there is already a great backup tool built in that is specifically designed to work with your server and just needs some space. That space should be somewhere off your server, and you might be surprised at how cost-effective backup space can be.
Get help implementing backups
Contact us to get help designing and implementing a backup strategy that really works
There has been a surge in both the number and the use of CDN services in recent months. In this article, we will consider what makes a particular provider a good fit.
Whole site CDN / static website
Whereas static sites, that is, sites with just HTML content, were popular in the ’90s, they are less common today. Therefore, you may need software to generate one for you, using your existing website. Such static site generators are, incidentally, also surging in popularity. Specialist CDNs for static sites, such as Netlify, include built-in tools, although plenty of open source tools are available for different programming languages. Plugins are also available for WordPress.
You can, generally, use almost any CDN for your static site. However, you should consider the reach of the CDN (as always, they should be present where your users are). The features offered by the CDN are also especially important, as, for example, you may want to use your own SSL certificate and control cache expiry times. Aside from Netlify, we have found KeyCDN to be an excellent choice to maximise the use of modern software innovations as well as an extensive network and reasonable prices.
Your CDN should, first and foremost, be present where your users are. Cheap CDNs, like MaxCDN, generally cover locations where hosting is cheaper, to keep costs low. If all of your users are based in the EU and US, this may be ideal, but bear in mind that coverage for users (and potentially search rankings) outside these areas – primarily Australia and Asia – will be limited. Consider whether, for a small percentage cost difference, it is worth using a CDN with better coverage.
Equally, many of the "big name" CDNs, like Edgecast and Akamai, have points of presence in a very large number of locations. The touted benefit is that content sits as close as possible to the user. This is, potentially, correct. However, bear in mind that if a POP does not have your content in its cache – which is more likely on a larger network – it has to fetch it, usually from your server, before delivering it. This is far from ideal. To counterbalance this, some such CDNs offer their own storage facility, which should be considered alongside the CDN offer.
Neither Akamai nor Edgecast (now Verizon) sells directly to small businesses. Akamai's CDN is available via Rackspace and Edgecast via Speedy Rails. These resellers offer access to the same networks, although the level of control is sometimes limited and, in particular, there may be significant additional costs for certain features that are free with smaller CDNs.
Features should be considered carefully. There are numerous software innovations that, alongside good network coverage, bring significant benefits.
One of the most popular articles on our blog is an article dispelling common myths around SSL certificates. Let's Encrypt gained a lot of attention after that article was written. Let's Encrypt is a radically new way of getting an SSL certificate that is primarily popular because it's free. The motivation is, supposedly, that in 2015 there is really no reason not to use HTTPS. The overhead was considerable in the '90s but today, in most circumstances, there is no perceivable difference. With recent revelations about ISP and government spying, other concerns about privacy and a spate of large hacks, there is potentially a lot to be gained. Google is also prioritising websites that use HTTPS in its rankings and, more recently, in Chrome.
The cost of issuing SSL certificates is generally very low, so it is possible to offer them for nothing and rely on advertising or sponsorship. Let's Encrypt is sponsored by several recognised names, including Mozilla, Cisco and Facebook. It has also recently left beta, meaning they believe it's stable enough for normal use – which has been our experience too.
Let’s Encrypt vs. the competition
Most SSL issuers have a set way of issuing certificates. For certificates other than EV (extended validation, “green bar”), you receive an email at a recognised email address on the domain that you want to secure. This proves that you control the domain and allows them to fulfil their obligations. Comodo has also added some other options such as creating a DNS record or uploading a file to your webspace. In all cases, when you’ve done this, you get the certificate, for at least one year.
The certificates aren't usually revoked except under extreme circumstances (e.g. if you bought from a reseller who didn't pay their bill), although they can, hypothetically, be revoked in most cases using OCSP – a modern standard, also aggressively supported by Google, that allows the issuer to revoke a certificate in a way that is obvious to most users' browsers.
To use Let's Encrypt, you or your hosting company or server administrator must install some software on the server that automates a process similar to Comodo's file-upload validation. Every so often, it talks to Let's Encrypt to get a new challenge file and places it under your web root. Let's Encrypt sees the file and issues a new certificate that is valid for another 90 days. The impressive feat here is that once the software is installed, it's all automatic. The downside is that the software must be installed and maintained, although that is fairly easy.
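Conceptually, the validation step boils down to placing a file where the issuer can fetch it over HTTP. A minimal sketch of that idea, using made-up token values and a temporary directory standing in for the real web root (a real ACME client receives the token and key authorisation from the Let's Encrypt server):

```python
import tempfile
from pathlib import Path

def place_challenge(webroot: Path, token: str, key_auth: str) -> Path:
    """Write an ACME HTTP-01 style response file where the CA expects to fetch it."""
    path = webroot / ".well-known" / "acme-challenge" / token
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(key_auth)
    return path

# Demo with a temporary directory; token and key authorisation are hypothetical
webroot = Path(tempfile.mkdtemp())
placed = place_challenge(webroot, "hypothetical-token", "hypothetical-token.thumbprint")
print(placed.read_text())  # the CA fetches this file over HTTP to verify control
```

Everything the automation software does beyond this – key generation, renewal timers, reloading the web server – is plumbing around that simple proof of control.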
Those who use cPanel can easily install a plugin (one that works well in our experience) to add and remove certificates to and from websites without dealing with an issuer or reseller. A web hosting provider with hundreds of accounts can save its users a lot of expense and time with this plugin.
Why you might not use Let’s Encrypt
There are relatively few reasons not to use it. The security is the same, it costs nothing and the issuance process is usually easier. If your hosting company doesn't allow you to install the software and doesn't support it themselves, you'll have to buy certificates the old way for now. Equally, if you have a lot of subdomains to secure or want a "green bar", you might still opt for a wildcard or EV certificate, since Let's Encrypt doesn't issue them. I've yet to see any impartial numbers on how EV certificates and site seals affect sales – I would guess the impact is very small in percentage terms. However, for a busy e-commerce website, it is probably worth the £30–100 to buy an EV certificate even if the percentage effect on conversions is very small.
Overall, I am glad that the days of expensive SSL certificates are coming to a close. It was an industry we could do without, and Let's Encrypt has had a substantial impact on the way SSL certificates will be issued from now on.