Tools That Made Our Microservices Easier

by Paul Hallett on Tuesday, 24 Nov 2015

Over the past year the codebase powering lyst.com has grown exponentially. (Coincidentally, so has the number of occupied desks in the office.) We started to experience issues getting new features and services built fast, even with our nice development pipeline.

Some of the problems we started to face included:

  • Slow tests because our entire test suite is run for each pull request. We have 3274 Python tests and 622 JavaScript tests that take a combined 31 minutes to run.
  • Merge conflict issues when people were working on similar parts of the architecture.
  • Django migration conflicts when two or more people were building new migration files at the same time.

It was killing our productivity. The biggest complaint we had at the monthly team retrospectives was the inability to move fast. The team structure had started shifting from skill-focussed (backend, frontend) to feature-focussed (Search team, Shopping team) to support this. We needed to change how we wrote software to match this shift too. Following one of our values (Best Idea Wins), we decided to do some research and figure out the best way to solve this problem.

This eventually brought us to where we are now:

  • Tooling that allows any developer to build and deploy their own service.
  • Templating that gives each service logging, performance monitoring, and continuous integration support.

We have been able to deploy twelve services to production in the last six months using this new infrastructure. Another big advantage is that our development environment now represents production much more closely, instead of a setup that was slightly different from production. Let me share with you how we did this.

Buzzwords And New Tools

Of all the buzzwords in the software industry right now, microservices might be one of the few that actually means something.

Figure: Microservices vs a monolith, AKA many little applications vs a single big application.

When we looked at solutions to our problem, microservices, or “service-oriented architecture”, was the obvious choice.

We knew about the argument for only using microservices if you really need to, and we decided early on to make sure services were isolated applications that weren’t dependent on our main application database. Instead they would provide specific functionality, such as computing related products or detecting duplicate images. Building Microservices by Sam Newman was really useful in helping us make these decisions.

Around the same time we began our experimentation, a new tool called Empire was released. Empire provides a PaaS on top of Amazon EC2 Container Service, with an API similar to Heroku’s. This was exactly what we were looking for as we already ran everything on EC2. Merges into the master branch automatically trigger a new container build and a push of the container.

Creating a new application on Empire is quite simple:

# See all your apps
$ emp apps
search                Oct 16 12:33
autosuggest           Sep 16 17:04
checkout              Jun 17 15:00

# Create a new application called newapp
$ emp create newapp
Created newapp

# Scale the application
$ emp scale -a newapp web=3
Scaled newapp to web=3:1X.

# Deploy the latest master to production
$ emp deploy -a newapp lyst/newapp:latest
Status: Created new release v2 for newapp

Many more commands for Empire can be found in their getting started guide.
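
If you create services often, the create, scale, and deploy steps above are easy to script. The sketch below only builds the argument lists for the three emp subcommands shown in the session above; it does not run them. The helper name bootstrap_commands and the org/app:latest image naming are assumptions for illustration, based on that example session rather than on Empire's documentation.

```python
def bootstrap_commands(app, web_processes=3, org="lyst"):
    """Return the emp invocations used to create, scale and deploy an app.

    Hypothetical helper: it mirrors the shell session above and nothing more.
    """
    image = "{}/{}:latest".format(org, app)
    return [
        ["emp", "create", app],
        ["emp", "scale", "-a", app, "web={}".format(web_processes)],
        ["emp", "deploy", "-a", app, image],
    ]


if __name__ == "__main__":
    for command in bootstrap_commands("newapp"):
        # Print rather than execute. Passing each list to
        # subprocess.check_call would run it against a real Empire setup.
        print(" ".join(command))
```

Returning lists instead of shell strings keeps the commands safe to hand to subprocess without any quoting concerns.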

Empire helped us solve the deployment and platform problem. Now we needed to make it easy for our engineers to start new projects and format them correctly.

Cookiecutter All The Things!

After deploying one or two services we quickly realised we needed to make sure the projects were consistent. This would make it easier for engineers to be productive without having to worry about setting up things like performance monitoring or error logging. It also meant engineers could switch teams and instantly know how a project was organised. The best way to do this is to provide a project template with everything set up already.

Cookiecutter is a tool for creating reusable project templates. A cookiecutter project directory looks like this:

project-template
|
|- {{cookiecutter.project_name}}
|   |
|   |- project files...
|- cookiecutter.json
|- README.md

The important file in this directory is cookiecutter.json, which should contain a JSON dictionary of defaults:

{
    "project_name": "my_project",
    "human_name": "My Lyst Project",
    "team": "Services team"
}

These defaults can then be placed throughout the template. The template files exist in the {{cookiecutter.project_name}} folder.

For example, in our template’s urls.py file, we can link the correct view function like so:

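# urls.py
from django.conf.urls import url

urlpatterns = [
    # The dotted path is filled in by Cookiecutter when the project is generated.
    url(r'^$', '{{cookiecutter.project_name}}.views.index'),
]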

Cookiecutter will replace these values and everything will match up because we’ve structured the template correctly.
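
To make the substitution concrete, here is a minimal sketch of the idea using only the Python standard library. Cookiecutter itself renders templates with Jinja2, which does far more than this; the render function and its regular expression are a simplified stand-in for illustration, not Cookiecutter's implementation.

```python
import json
import re

# Matches placeholders such as {{cookiecutter.project_name}}
# (optional spaces inside the braces are allowed).
PLACEHOLDER = re.compile(r"\{\{\s*cookiecutter\.(\w+)\s*\}\}")


def render(template_text, context):
    """Replace every {{cookiecutter.<key>}} with the matching context value."""
    def substitute(match):
        key = match.group(1)
        return str(context[key])  # raises KeyError for unknown keys
    return PLACEHOLDER.sub(substitute, template_text)


# Defaults as they appear in the cookiecutter.json example.
defaults = json.loads("""
{
    "project_name": "my_project",
    "human_name": "My Lyst Project",
    "team": "Services team"
}
""")

# An answer given at the prompt overrides the default.
context = dict(defaults, project_name="new_service")

line = "url(r'^$', '{{cookiecutter.project_name}}.views.index'),"
print(render(line, context))
# prints: url(r'^$', 'new_service.views.index'),
```

The same substitution applies to file and directory names, which is why the template folder itself is called {{cookiecutter.project_name}}.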

Once we’ve got the template how we like it, we can publish it to GitHub and use Cookiecutter to build a new project for us:

$ pip install cookiecutter
...
$ cookiecutter git@github.com:lyst/python-microservice.git
project_name (default is "my_project")? new_service

And then voila! We’ve generated a new project. This is probably all familiar to those who’ve used Cookiecutter before, but how have we made this more suitable for us at Lyst?

Nearly every service uses two templates: a Service template and a Client template. The Service template comes with all of the following:

  • Django
  • Django REST Framework
  • Sentry error logging
  • New Relic performance monitoring
  • CircleCI test suite integration
  • Codecov code coverage reporting
  • Docker tooling for building containers automatically

Our services use HTTP APIs from Django REST Framework and follow our best practices in order to provide their functionality. So that other engineers can continue to write Python and not worry about the HTTP layer, we provide a Client template that can integrate with the Service template. The Client template knows how to build itself and distribute a new wheel to our internal PyPI. It also has mocking set up by default so we don’t have to make calls to the real service when we’re testing.
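
As a rough illustration of what such mocking can look like, a test can swap the client's network-facing method for a stub so that no request leaves the test process. The RelatedProductsClient class, its _get method, and the URL are hypothetical names for this sketch and are not taken from the Client template itself.

```python
import json
from unittest import mock
from urllib.request import urlopen


class RelatedProductsClient:
    """Hypothetical client wrapping a service's HTTP API."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def _get(self, path):
        # The only place that touches the network.
        with urlopen(self.base_url + path, timeout=2) as response:
            return json.loads(response.read().decode("utf-8"))

    def related(self, product_id):
        path = "/products/{}/related/".format(product_id)
        return self._get(path)["results"]


def test_related_uses_stubbed_response():
    client = RelatedProductsClient("http://related.example.invalid")
    # Patch the network-facing method so the real service is never called.
    with mock.patch.object(
        RelatedProductsClient, "_get", return_value={"results": [7, 9]}
    ) as fake_get:
        assert client.related(42) == [7, 9]
    fake_get.assert_called_once_with("/products/42/related/")


if __name__ == "__main__":
    test_related_uses_stubbed_response()
    print("ok")
```

Keeping all network access in one small method is what makes the stub a one-line patch.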

Once the Client wheel has been added to PyPI, we just need to add it to the requirements file of the project that wants to use the service:

# requirements.txt
...
my_service_client==1.1
...

A big advantage of the client template is that it allows us to handle graceful failure in the client instead of in the application using the client. If the client can’t communicate with the service for some reason, we can handle it and send back a default response.
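
A minimal sketch of that pattern follows, assuming the client talks to the service over HTTP using the standard library. The class name, the fetch hook, the URL layout, and the empty-list default are hypothetical choices for illustration, not the contents of the Client template.

```python
import json
import logging
from urllib.error import URLError
from urllib.request import urlopen

logger = logging.getLogger(__name__)


def http_fetch(url, timeout=2):
    """Fetch a URL and decode its JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


class DuplicateImageClient:
    """Hypothetical client that degrades to a default answer on failure."""

    DEFAULT = []  # what callers get when the service is unreachable

    def __init__(self, base_url, fetch=http_fetch):
        self.base_url = base_url.rstrip("/")
        self.fetch = fetch  # injectable so tests need no network

    def duplicates(self, image_id):
        url = "{}/images/{}/duplicates/".format(self.base_url, image_id)
        try:
            return self.fetch(url)["results"]
        except (URLError, OSError, ValueError, KeyError) as exc:
            # Connection problems, timeouts, bad JSON or a payload without
            # "results" all fall back to the default instead of raising.
            logger.warning("duplicate-image service unavailable: %s", exc)
            return list(self.DEFAULT)


def broken_fetch(url):
    """Simulate an unreachable service."""
    raise URLError("connection refused")


client = DuplicateImageClient("http://images.example.invalid", fetch=broken_fetch)
print(client.duplicates(123))
# prints: []
```

Because the fallback lives in the client, every application that installs the wheel gets the same behaviour without writing its own try/except around each call.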

Performance Monitoring and Error Logging

The service template already has the settings configured to handle performance monitoring with New Relic and error logging with Sentry. Once a service has been deployed we just have to set some environment variables and tell Sentry and New Relic to listen for the new service.
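
The settings side of this might look something like the sketch below. The environment variable names (SENTRY_DSN, NEW_RELIC_LICENSE_KEY, NEW_RELIC_APP_NAME) and the monitoring_settings helper are assumptions for illustration; the names in the actual template may differ, and the wiring into the Sentry and New Relic agents is left out.

```python
import os


def monitoring_settings(environ=os.environ, default_app_name="my_project"):
    """Build monitoring-related settings from environment variables.

    A missing variable disables the matching integration rather than
    crashing at import time, so the service still runs locally.
    """
    sentry_dsn = environ.get("SENTRY_DSN", "")
    new_relic_key = environ.get("NEW_RELIC_LICENSE_KEY", "")
    return {
        "SENTRY_DSN": sentry_dsn,
        "SENTRY_ENABLED": bool(sentry_dsn),
        "NEW_RELIC_LICENSE_KEY": new_relic_key,
        "NEW_RELIC_APP_NAME": environ.get("NEW_RELIC_APP_NAME", default_app_name),
        "NEW_RELIC_ENABLED": bool(new_relic_key),
    }


# In a Django settings module the result could be merged into the module
# namespace, for example: globals().update(monitoring_settings())
example = monitoring_settings({"SENTRY_DSN": "https://key@sentry.example.invalid/1"})
print(example["SENTRY_ENABLED"], example["NEW_RELIC_ENABLED"])
# prints: True False
```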

How We Work Now

Empire and Cookiecutter have helped us to give each engineer in our team the tools to be productive and efficient at building new services. We are still learning a lot about how best to manage these services but we’ve seen positive things from it so far.

Search on Lyst runs on three separate services. Using the new workflow explained above, they were deployed to production in just a few months. Another service we’re building at the moment has gone live in the space of a few weeks.

The speed with which we can build new features and fix bugs is also staggering. The old monolith project took upwards of 30 minutes to run its entire test suite. Our new services can be shipped in about 10 minutes. That includes a full build on the pull request, and a full build on the master branch after it has been merged.

What We Want To Do Next

As I said before, we’re still learning a lot. Our process isn’t perfect yet but we’ve got to the point where we’re productive.

We’re now looking at ways to make it easier to provision tools like task queues, databases, and Elasticsearch for our services. Right now that’s still a manual process.

We’ve also found that our original templates went out of date fast. This meant that we had some inconsistency between the original services. Whilst that’s not always a problem, we have had to fix the same bug across each service a few times. Solution: always make sure your templates are up to date!

On the product side, that is, teams building customer-facing components of our architecture, we’ve considered moving our data from the main codebase into its own service. This has always been seen as a hard task due to technical debt. However, someone has recently suggested something different: moving the Web UI components into their own service, keeping the core infrastructure as the original codebase. This would be much easier for us, but still troublesome. These types of decisions are some that you’re likely to encounter. For now, we’ll wait to cross that bridge until we come to it.

If you’ve enjoyed this post and want to work on services with us, then check out our job positions and follow us on Twitter.