Emerging Technology 2014 Q1

When we're not building applications in Ruby on Rails, or helping large companies improve their technology processes and tooling, we're busy evaluating new ways of solving problems. We're not big enough to evaluate everything - so we focus on technologies which help companies grow online, those which improve processes for web companies, and their underlying languages. This is my view of the technologies you should consider.

Techniques

Although most technology companies have cracked automated deployment, a lot of larger, older companies still miss this technique. Simply put - there shouldn't be a delay between the creation of code and realising the value of that code. Two techniques I think are worth exploring are blue-green deployment and immutable infrastructure.

Blue-green deployment is a technique where the production servers are mirrored. Blue is the live environment and green is the new one. When the green environment is ready and tested, a simple router or networking change switches them over. A roll-back is as simple as switching back. In the cloud this could be implemented simply with an elastic IP or a load balancer. On Amazon Web Services, Elastic Beanstalk can be configured to do this. The key in any deployment configuration is having a simple switch to move from blue to green, and back again if required.
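To make the "simple switch" concrete, here is a minimal sketch in Python. The BlueGreenRouter class, its method names and the addresses are all hypothetical stand-ins for whatever load balancer or elastic IP you actually use; it only illustrates the idea that going live and rolling back are the same cheap operation.

```python
from dataclasses import dataclass, field


@dataclass
class BlueGreenRouter:
    """Hypothetical stand-in for a load balancer or elastic IP."""

    environments: dict          # colour -> address of that environment
    live: str = "blue"          # which colour currently receives traffic
    history: list = field(default_factory=list)

    def idle(self) -> str:
        """Return the colour that is NOT receiving traffic."""
        return "green" if self.live == "blue" else "blue"

    def switch(self) -> str:
        """Point traffic at the idle environment and remember the old one."""
        self.history.append(self.live)
        self.live = self.idle()
        return self.live

    def rollback(self) -> str:
        """Undo the most recent switch."""
        if not self.history:
            raise RuntimeError("nothing to roll back to")
        self.live = self.history.pop()
        return self.live

    def route(self) -> str:
        """Address that incoming requests are sent to right now."""
        return self.environments[self.live]


if __name__ == "__main__":
    router = BlueGreenRouter({"blue": "10.0.0.1", "green": "10.0.0.2"})
    router.switch()             # green has been tested, so go live
    print(router.route())
    router.rollback()           # something is wrong, so switch back
    print(router.route())
```

Nothing is deployed or torn down at switch time; both environments already exist, which is what makes the roll-back trivial.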

Immutable infrastructure is another deployment technique where, once live, a server cannot be changed. Chad Fowler goes as far as to say throw away the SSH keys! This aims to remove the problem of not knowing the state of the server. Contrast this with some environments we've worked in. After a period of time there can be parts whose purpose no one quite remembers; or, after upgrade upon upgrade and patch upon patch, parts which are no longer necessary. By creating an immutable server, through automation, you know the state of that server will be the same as when it was created. Need an upgrade to the application? Build a completely new server and throw away the old one.
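The same idea can be modelled in a few lines of Python. This is only an analogy, not a provisioning tool: the Server class, build_server and upgrade are hypothetical names, and a frozen dataclass stands in for a machine nobody can log in to and change.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Server:
    """A server whose state is fixed at build time (hypothetical model)."""

    image: str
    app_version: str


def build_server(image: str, app_version: str) -> Server:
    """Stand-in for an automated build; the result is never modified."""
    return Server(image=image, app_version=app_version)


def upgrade(fleet: list, new_version: str) -> list:
    """Build a brand-new fleet; the old servers are discarded, not patched."""
    return [build_server(server.image, new_version) for server in fleet]


if __name__ == "__main__":
    old_fleet = [build_server("base-image", "1.0")]
    new_fleet = upgrade(old_fleet, "1.1")

    try:
        old_fleet[0].app_version = "1.1"   # patching in place is refused
    except FrozenInstanceError:
        print("servers cannot be changed once built")

    print(new_fleet[0])
```

Because every server comes from the same build function, its state is whatever the build produced and nothing else.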

A common adage goes: what gets measured, improves. A recent infrastructure technique is to focus on mean time to recovery (MTTR), in place of measuring mean time between failures (MTBF) or side-measurements like test coverage. The thinking is interesting: design systems in a way that is conducive to recovery, and hence build in resilience, rather than relying on traditional techniques like extensive testing. This isn't as crazy as it sounds - cloud hosting (especially in its early days) can have downtime. Older companies traditionally have deployment windows of days - when downtime is frequent anyway, a focus on recovery is an enabler of faster production releases. Netflix has taken this a step further with its Chaos Monkey, a service which randomly knocks out production services, forcing developers to build easily recoverable infrastructure.
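Measuring recovery time needs nothing more than a log of when each incident started and when service was restored. A minimal sketch, assuming incidents are recorded as (failed_at, recovered_at) pairs; the function name and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents: list) -> timedelta:
    """Average outage length over (failed_at, recovered_at) datetime pairs."""
    if not incidents:
        raise ValueError("no incidents recorded")
    total = sum(
        (recovered_at - failed_at for failed_at, recovered_at in incidents),
        timedelta(),
    )
    return total / len(incidents)


if __name__ == "__main__":
    # Illustrative data only: one 10 minute outage and one 30 minute outage.
    incidents = [
        (datetime(2014, 1, 6, 9, 0), datetime(2014, 1, 6, 9, 10)),
        (datetime(2014, 2, 3, 14, 0), datetime(2014, 2, 3, 14, 30)),
    ]
    print(mean_time_to_recovery(incidents))
```

Tracking this number over time shows whether changes to the system are making recovery faster, which is the thing this technique tries to improve.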

Platforms

Elasticsearch isn't really an emerging technology. I first encountered it at notonthehighstreet.com as a solution to our ageing search daemon. It provides advanced and fuzzy querying of datasets out of the box, along with real-time search. Elasticsearch comes with a much nicer API than Apache Solr, sitting firmly in the modern camp of developer productivity first. Paired with tooling like Kibana, which allows for data visualisation, or Logstash, a tool for parsing and indexing logs, we predict Elasticsearch will become a core component of any mature infrastructure.
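To give a flavour of that API, here is a sketch that builds the JSON body of a fuzzy match query in Python. The helper name, the "title" field and the search text are assumptions for illustration, and the body would still need to be sent to a search endpoint on your own cluster; check the query DSL documentation for the version you run.

```python
import json


def fuzzy_search_body(field: str, text: str, fuzziness: str = "AUTO", size: int = 10) -> dict:
    """Build a match query that tolerates small misspellings in `text`."""
    return {
        "size": size,
        "query": {
            "match": {
                field: {
                    "query": text,
                    "fuzziness": fuzziness,
                }
            }
        },
    }


if __name__ == "__main__":
    # "personalisd" is deliberately misspelt to show why fuzziness matters.
    body = fuzzy_search_body("title", "personalisd mug")
    print(json.dumps(body, indent=2))
```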

Riak is showing promise as an alternative key/value store, with binary storage and interesting query types like MapReduce. Its core design principle is fault tolerance, without the bloat and complexity of a solution like Dynamo. It works in a masterless way - with consistent hashing and replicas, plus handoff and rebalancing to heal the cluster when nodes go offline. Riak should be on every developer's radar if you have data spanning multiple machines and need availability over consistency.
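Consistent hashing with replicas is easier to picture with a toy version. The sketch below is a generic hash ring in Python, not Riak's actual implementation: the HashRing class, the node names and the vnodes and replicas defaults are all assumptions. It shows that every node can work out which machines hold a key without asking a master, and that losing a node only moves the keys that node held.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with replicas (illustrative only)."""

    def __init__(self, nodes, replicas: int = 3, vnodes: int = 64):
        self.replicas = replicas    # copies kept of each key
        self.vnodes = vnodes        # ring positions per physical node
        self._ring = []             # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)

    def add(self, node: str) -> None:
        """Place the node at several positions to spread load evenly."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        """Drop a node, for example when it goes offline."""
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def preference_list(self, key: str) -> list:
        """Walk clockwise from the key's position, collecting distinct nodes."""
        if not self._ring:
            return []
        start = bisect.bisect(self._ring, (self._hash(key), ""))
        owners = []
        for offset in range(len(self._ring)):
            _, node = self._ring[(start + offset) % len(self._ring)]
            if node not in owners:
                owners.append(node)
                if len(owners) == self.replicas:
                    break
        return owners


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
    owners = ring.preference_list("user:42")
    print(owners)
    ring.remove(owners[0])                  # the first owner goes offline
    print(ring.preference_list("user:42"))  # the remaining owners move up
```

When the first owner disappears, the surviving replicas are still in the list and the next node around the ring takes over the vacant slot.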

Libraries and Languages

We spend our time in data, and new ways to visualise or represent data are critical to effective communication. D3.js is a JavaScript library I've been watching for some time. It was born out of tools used to visualise data for the web. Historically these were in languages like Java. D3.js's initial novelty comes from its use of JavaScript - a language specifically designed to manipulate information in the browser - which also explains its high adoption. It uses elements within an HTML page and outputs SVG - a much more dynamic and natural approach than most data visualisation, which is rendered as a static JPEG image.