
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools



Software architecture workshops in Australia during August

Coding the Architecture - Simon Brown - Mon, 06/22/2015 - 08:02

This is a quick update on my upcoming trip to Australia ... in conjunction with the lovely folks behind the YOW! conference, we've scheduled two public software architecture sketching workshops as follows.

This workshop addresses one of the major problems I've witnessed during my career in software development; namely that people find it really hard to describe, communicate and visualise the design of their software. Sure, we have UML, but very few people use it and most instead resort to an ad hoc "boxes and lines" notation. This is fine, but the resulting diagrams (as illustrated below) need to make sense, and in my experience they rarely do.

Some typical software architecture sketches

My workshop explores this problem from a number of different perspectives, not least by giving you an opportunity to practice these skills yourself. My Voxxed article titled Simple Sketches for Diagramming Your Software Architecture provides an introduction to this topic and the C4 model that I use. I've been running this workshop in various formats for nearly 10 years now and the techniques we'll cover have proven invaluable for software development teams around the world in the following situations:

  • Doing (just enough) up front design on whiteboards.
  • Communicating software design retrospectively before architecture refactoring.
  • Explaining how the software works when somebody new joins the team.
  • Documenting/recording the software architecture in something like a software architecture document or software guidebook.
"Really surprising training! I expected some typical spoken training about someones revolutionary method and I found a hands-on training that used the "do-yourself-and-fail" approach to learn the lesson, and taught us a really valuable, reasonable and simple method as an approach to start the architecture of a system. Highly recommended!" (an attendee from the software architecture sketching workshop at GOTO Amsterdam 2014)

Attendees will also receive a copy of my Software Architecture for Developers ebook and a one-year subscription to Structurizr. Please do feel free to e-mail me with any questions. See you in Australia!

Categories: Architecture

Leadership Skills for Making Things Happen

"A leader is one who knows the way, goes the way, and shows the way." -- John C. Maxwell

How many people do you know that talk a good talk, but don’t walk the walk?

Or, how many people do you know who have a bunch of ideas that you know will never see the light of day?  They can pontificate all day long, but the idea of turning those ideas into work that could be done is foreign to them.

Or, how many people do you know who can plan all day long, but whose plan is nothing more than a list of things that will never happen?  Worse, maybe they turn it into a team sport, and everybody participates in the planning process of all the outcomes, ideas and work that will never happen. (And, who exactly wants to be accountable for that?)

It doesn’t need to be this way.

A lot of people have Hidden Strengths they can develop into Learned Strengths.  And one of the most important buckets of strengths is Leading Implementation.

Leading Implementation is a set of leadership skills for making things happen.

It includes the following leadership skills:

  1. Coaching and Mentoring
  2. Customer Focus
  3. Delegation
  4. Effectiveness
  5. Monitoring Performance
  6. Planning and Organizing
  7. Thoroughness

Let’s say you want to work on these leadership skills.  The first thing you need to know is that these are not elusive skills reserved exclusively for the elite.

No, these are commonly Hidden Strengths that you and others around you already have, and they just need to be developed.

If you don’t think you are good at any of these, then before you rule yourself out and scratch them off your list, you need to ask yourself some key reflective questions:

  1. Do you know what good actually looks like?  Who are your role models?  What do they do differently from you, and is it really magic, or do they simply use behaviors and techniques that you could learn, too?
  2. How much have you actually practiced?   Have you really spent any sort of time working at the particular skill in question?
  3. How did you create an effective feedback loop?  So many people rapidly improve when they figure out how to create an effective learning loop and an effective feedback loop.
  4. Who did you learn from?  Are you expecting yourself to just naturally be skilled?  Really?  What if you found a good mentor or coach, one that could help you create an effective learning loop and feedback loop, so you can improve and actually chart and evaluate your progress?
  5. Do you have a realistic bar?  It’s easy to fall into the trap of “all or nothing.”   What if instead of focusing on perfection, you focused on progress?   Could a little improvement in a few of these areas, change your game in a way that helps you operate at a higher level?

I’ve seen far too many starving artists and unproductive artists, as well as mad scientists, who had brilliant ideas that they couldn’t turn into reality.  While some were lucky enough to pair with the right partners and bring their ideas to life, I’ve also seen another pattern: productive artists.

They develop some of the basic leadership skills in themselves to improve their ability to execute.

Not only are they more effective on the job, but they are happier with their ability to express their ideas and turn their ideas into action.

Even better, when they partner with somebody who has strong execution, they amplify their impact even more because they have a better understanding and appreciation of what it takes to execute ideas.

Like talk, ideas are cheap.

The market rewards execution.

Categories: Architecture, Programming

How to deploy composite Docker applications with Consul key values to CoreOS

Xebia Blog - Fri, 06/19/2015 - 16:34

Most examples of the deployment of Docker applications to CoreOS use a single Docker application. But as soon as you have an application that consists of more than one unit, the number of commands you have to type soon becomes annoying. At Xebia we have a best practice that says "Three strikes and you automate", mandating that the third time you do something similar, you automate. In this blog I share the manual page of a utility called fleetappctl that allows you to perform rolling upgrades and deploy Consul key-value pairs of composite applications to CoreOS, and I show three examples of its usage.

fleetappctl is a utility that allows you to manage a set of CoreOS fleet unit files as a single application. You can start, stop and deploy the application. fleetappctl is idempotent and does rolling upgrades on template files with multiple instances running. It can substitute placeholders at deployment time and it is able to deploy Consul key-value pairs as part of your application. Using fleetappctl you have everything you need to create a self-contained deployment unit of your composite application and put it under version control.

The command line options to fleetappctl are shown below:

fleetappctl [-d deployment-descriptor-file]
            [-e placeholder-value-file]
            (generate | list | start | stop | destroy)
option -d

The deployment descriptor file describes all the fleet unit files and Consul key-value pair files that make up the application. All the files referenced in the deployment-descriptor may have placeholders for deployment time values. These placeholders are enclosed in  double curly brackets {{ }}.
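To give an idea, a deployment descriptor might look roughly like the sketch below. This structure is an assumption, reconstructed from the xml ed command used later in this post (which addresses fleet.UnitConfigurationFile and startUnit elements); the actual schema of the generated deployit-manifest.xml may differ:

<deployit-manifest>
  <fleet.UnitConfigurationFile name="app-hellodb@">
    <startUnit>true</startUnit>
  </fleet.UnitConfigurationFile>
  <fleet.UnitConfigurationFile name="mnt-data">
    <startUnit>true</startUnit>
  </fleet.UnitConfigurationFile>
  <consul.KeyValuePairs name="keys.consul"/>
</deployit-manifest>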

option -e

The file contains the values for the placeholders to be used on deployment of the application. The file has a simple format:
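For example (these keys match the dev.env files used in the examples later in this post; the values shown here are illustrative):

release=V1
message=Hi guys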


start

starts all units in the order in which they appear in the deployment descriptor. If you have a template unit file, you can specify the number of instances you want to start. Start is idempotent, so you may call start multiple times. Start will bring the deployment in line with your descriptor.

If the unit file has changed with respect to the deployed unit file, the corresponding instances will be stopped and restarted with the new unit file. If you have a template file, the instances of the template file will be upgraded one by one.

Any consul key value pairs as defined by the consul.KeyValuePairs entries are created in Consul. Existing values are not overwritten.


generate

generates a deployment descriptor (deployit-manifest.xml) based upon all the unit files found in your directory. If a file is a fleet unit template file, the number of instances to start is set to 2, to support rolling upgrades.


stop

stops all units in reverse order of their appearance in the deployment descriptor.


destroy

destroys all units in reverse order of their appearance in the deployment descriptor.


list

lists the runtime status of the units that appear in the deployment descriptor.

Install fleetappctl

To install the fleetappctl utility, type the following commands:

curl -q -L | tar -xzf -
cd fleetappctl-0.25
brew install xmlstarlet
brew install fleetctl
Start the platform

If you do not have the platform running, start it first.

cd ..
git clone
cd coreos-container-platform-as-a-service
git checkout 029d3dd8e54a5d0b4c085a192c0ba98e7fc2838d
cd vagrant
vagrant up
Example - Three component web application

The first example is a three component application. It consists of a mount, a Redis database service and a web application. We generate the deployment descriptor, indicate we do not want to start the mount, start the application and then modify the web application unit file to change the service name into 'helloworld'. We perform a rolling upgrade by issuing start again. Finally we list, stop and destroy the application.

cd ../fleet-units/app
# generate a deployment descriptor
fleetappctl generate

# do not start mount explicitly
xml ed -u '//fleet.UnitConfigurationFile[@name="mnt-data"]/startUnit' \
       -v false deployit-manifest.xml > deployit-manifest.xml.new
mv deployit-manifest.xml{.new,}

# start the app
fleetappctl start 

# Check it is working

# Change the service name of the application in the unit file
sed -i -e 's/SERVICE_NAME=hellodb/SERVICE_NAME=helloworld/' app-hellodb@.service

# do a rolling upgrade
fleetappctl start 

# Check it is now accessible on the new service name

# Show all units of this app
fleetappctl list

# Stop all units of this app
fleetappctl stop
fleetappctl list

# Restart it again
fleetappctl start

# Destroy it
fleetappctl destroy
Example - placeholder references

This example shows the use of a placeholder reference in the unit file of the paas-monitor application. The application takes two optional environment variables, RELEASE and MESSAGE, that allow you to configure the resulting responses. The variable RELEASE is configured in the Docker run command in the fleet unit file through a placeholder. The actual value for the current deployment is taken from a placeholder value file.

cd ../fleetappctl-0.25/examples/paas-monitor
#check out the placeholder reference
grep '{{' paas-monitor@.service

ExecStart=/bin/sh -c "/usr/bin/docker run --rm --name %p-%i \
 --env RELEASE={{release}} \
# checkout our placeholder values
cat dev.env
# start the app
fleetappctl -e dev.env start

# show current release in status

# start is idempotent (ie. nothing happens)
fleetappctl -e dev.env start

# update the placeholder value and see a rolling upgrade in the works
echo 'release=V3' > dev.env
fleetappctl -e dev.env start

fleetappctl destroy
Example - Env Consul Key Value Pair deployments

The final example shows the use of a Consul key-value pair, the use of placeholders and envconsul to dynamically update the environment variables of a running instance. The environment variables RELEASE and MESSAGE are taken from the keys under /paas-monitor in Consul. In turn, the initial values of these keys are loaded on first deployment and set using values from the placeholder file.

cd ../fleetappctl-0.25/examples/envconsul

#check out the Consul Key Value pairs, and notice the reference to placeholder values
cat keys.consul

# checkout our placeholder values
cat dev.env
message=Hi guys
# start the app
fleetappctl -e dev.env start

# show current release and message in status

# Change the message in Consul
fleetctl ssh paas-monitor@1 \
    curl -X PUT \
    -d \'hello Consul\' \

# checkout the changed message

# start does not change the values..
fleetappctl -e dev.env start

CoreOS provides all the basic functionality for a Container Platform as a Service. With the utility fleetappctl it becomes easy to start, stop and upgrade composite applications. The script complements fleetctl and does not break other ways of deploying your applications to CoreOS.

The source code, manual page and documentation of fleetappctl can be found on


Diff'ing software architecture diagrams again

Coding the Architecture - Simon Brown - Fri, 06/19/2015 - 12:42

In Diff'ing software architecture diagrams, I showed that creating a software architecture model with a textual format provides you with the ability to version control and diff different versions of the model. As a follow-up, somebody asked me whether Structurizr provides a way to recreate what Robert Annett originally posted in Diagrams for System Evolution. In other words, can the colours of the lines be changed? As you can see from the images below, the answer is yes.

Before and after

To do this, you simply add some tags to the relationships and add the appropriate styles to the view configuration. Structurizr will even auto-generate a diagram key for you.
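As a rough sketch of how tags and styles combine in the structurizr-java client (the element names are illustrative, borrowed from Robert's furniture company example, and the fluent API calls are quoted from memory, so they may differ between versions):

import com.structurizr.Workspace;
import com.structurizr.model.Model;
import com.structurizr.model.SoftwareSystem;
import com.structurizr.view.Styles;

public class DiffColours {

    public static void main(String[] args) {
        Workspace workspace = new Workspace("Furniture Company", "Colour-coded system evolution");
        Model model = workspace.getModel();
        SoftwareSystem orderbook = model.addSoftwareSystem("Sales Orderbook", "Books orders.");
        SoftwareSystem warehouse = model.addSoftwareSystem("Warehouse Management", "Fulfils orders.");

        // Tag each relationship according to how it changes between versions.
        orderbook.uses(warehouse, "Pushes orders and product details to").addTags("Removed");

        // Diff-like colours for the tags; the diagram key is generated from these styles.
        Styles styles = workspace.getViews().getConfiguration().getStyles();
        styles.addRelationshipStyle("Removed").color("#ff0000");
        styles.addRelationshipStyle("Added").color("#00aa00");
        styles.addRelationshipStyle("Changed").color("#0000ff");
    }
}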

Diagram key

And yes, you can do the same with elements too. As this illustrates, the choice of approach is yours.

Categories: Architecture

Startup Thinking

“Startups don't win by attacking. They win by transcending.  There are exceptions of course, but usually the way to win is to race ahead, not to stop and fight.” -- Paul Graham

A startup is the largest group of people you can convince to build a different future.

Whether you launch a startup inside a big company or launch a startup as a new entity, there are a few things that determine the strength of the startup: a sense of mission, space to think, new thinking, and the ability to do work.

The more clarity you have around Startup Thinking, the more effective you can be, whether you are starting startups inside or outside of a big company.

In the book, Zero to One: Notes on Startups, or How to Build the Future, Peter Thiel shares his thoughts about Startup Thinking.

Startups are Bound Together by a Sense of Mission

It’s the mission.  A startup has an advantage when there is a sense of mission that everybody lives and breathes.  The mission shapes the attitudes and the actions that drive towards meaningful outcomes.

Via Zero to One: Notes on Startups, or How to Build the Future:

“New technology tends to come from new ventures--startups.  From the Founding Fathers in politics to the Royal Society in science to Fairchild Semiconductor's ‘traitorous eight’ in business, small groups of people bound together by a sense of mission have changed the world for the better.  The easiest explanation for this is negative: it's hard to develop new things in big organizations, and it's even harder to do it by yourself.  Bureaucratic hierarchies move slowly, and entrenched interests shy away from risk.” 

Signaling Work is Not the Same as Doing Work

One strength of a startup is the ability to actually do work.  With other people.  Rather than just talk about it, plan for it, and signal about it, a startup can actually make things happen.

Via Zero to One: Notes on Startups, or How to Build the Future:

“In the most dysfunctional organizations, signaling that work is being done becomes a better strategy for career advancement than actually doing work (if this describes your company, you should quit now).  At the other extreme, a lone genius might create a classic work of art or literature, but he could never create an entire industry.  Startups operate on the principle that you need to work with other people to get stuff done, but you also need to stay small enough so that you actually can.”

New Thinking is a Startup’s Strength

The strength of a startup is new thinking.  New thinking is even more valuable than agility.  Startups provide the space to think.

Via Zero to One: Notes on Startups, or How to Build the Future:

“Positively defined, a startup is the largest group of people you can convince of a plan to build a different future.  A new company's most important strength is new thinking: even more important than nimbleness, small size affords space to think.  This book is about the questions you must ask and answer to succeed in the business of doing new things: what follows is not a manual or a record of knowledge but an exercise in thinking.  Because that is what a startup has to do: question received ideas and rethink business from scratch.”

Do you have stinking thinking, or do you have a beautiful mind?

New thinking will take you places.

You Might Also Like

How To Get Innovation to Succeed Instead of Fail

Management Innovation is at the Top of the Innovation Stack

The Innovation Revolution

The New Competitive Landscape

The New Realities that Call for New Organizational and Management Capabilities

Categories: Architecture, Programming

Diff'ing software architecture diagrams

Coding the Architecture - Simon Brown - Wed, 06/17/2015 - 14:02

Robert Annett wrote a post titled Diagrams for System Evolution where he describes a simple approach to showing how to visually describe changes to a software architecture. In essence, in order to show how a system is to change, he'll draw different versions of the same diagram and use colour-coding to highlight the elements/relationships that will be added, removed or modified.

I've typically used a similar approach for describing as-is and to-be architectures in the past too. It's a technique that works well. Although you can version control diagrams, it's still tricky to diff them using a tool. One solution that addresses this problem is to not create diagrams, but instead create a textual description of your software architecture model that is then subsequently rendered with some tooling. You could do this with an architecture description language (such as Darwin) although I would much rather use my regular programming language instead.

Creating a software architecture model as code

This is exactly what Structurizr is designed to do. I've recreated Robert's diagrams with Structurizr as follows.
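To give a flavour of what a model described as Java code can look like, here is a minimal sketch using the structurizr-java client. The element names are illustrative and the API calls are quoted from memory, so they may differ from the client version available at the time:

import com.structurizr.Workspace;
import com.structurizr.model.Model;
import com.structurizr.model.Person;
import com.structurizr.model.SoftwareSystem;
import com.structurizr.view.SystemContextView;

public class FurnitureCompanyModel {

    public static void main(String[] args) {
        Workspace workspace = new Workspace("Furniture Company", "System evolution example");
        Model model = workspace.getModel();

        // The elements of the model.
        Person designer = model.addPerson("Designer", "Designs new products.");
        SoftwareSystem catalogue = model.addSoftwareSystem("Product Catalogue", "Records the details of new products.");
        SoftwareSystem orderbook = model.addSoftwareSystem("Sales Orderbook", "Books orders.");

        // The relationships between them; this is the part a diff makes obvious.
        designer.uses(catalogue, "Records new products in");
        catalogue.uses(orderbook, "Pushes product updates to");

        // A diagram definition, rendered by the tooling.
        SystemContextView view = workspace.getViews().createSystemContextView(
                orderbook, "Context", "System context diagram for the Sales Orderbook.");
        view.addAllElements();
    }
}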



And since the diagrams were created by a model described as Java code, that description can be diff'ed using your regular toolchain.

The diff between the two model versions

Code provides opportunities

This perhaps isn't as obvious as Robert's visual approach, and I would likely still highlight the actual differences on diagrams using notation as Robert did too. Creating a textual description of a software architecture model does provide some interesting opportunities though.

Categories: Architecture

How to Configure Alerts to Prevent Too Many Server Notifications

This is a guest post by Ashish Mohindroo, who leads Product for Happy Apps, a new uptime and performance monitoring system.

It isn't uncommon for system administrators to receive a stream of alarms for situations that either can wait to be addressed, will remedy themselves, or simply weren't problems to begin with. The other side of the equation is missing the reports that indicate a problem that has to be addressed right away. Options for customizing server notifications let you decide the conditions that trigger alerts, set the level of alerts, and choose the recipients based on each alert's importance.

The only thing worse than receiving too many notifications is not receiving the one alert that would keep a small glitch from becoming a big problem. Preventing over-notification requires fine-tuning system alerts so that the right people find out about problems and potential problems at the right time. Here's a three-step approach to customizing alerts:

Categories: Architecture

boot2docker on xhyve

Xebia Blog - Mon, 06/15/2015 - 09:00

xhyve is a new hypervisor in the vein of KVM on Linux and bhyve on BSD. It’s actually a port of BSD’s bhyve to OS X, and more similar to KVM than to VirtualBox in that it’s minimal and command line only, which makes it a good fit for an always-running virtual machine like boot2docker on OS X.

This post documents the steps to get boot2docker running within xhyve and contains some quick benchmarks to compare xhyve’s performance with Virtualbox.

Read the full blogpost on Simon van der Veldt's website

Using UIPageViewControllers with Segues

Xebia Blog - Mon, 06/15/2015 - 07:30

I've always wondered how to configure a UIPageViewController much like you configure a UITabBarController inside a Storyboard. It would be nice to create standard embed segues to the view controllers that make up the pages. Unfortunately, such a thing is not possible and currently you can't do a whole lot in a Storyboard with page view controllers.

So I thought I'd create a custom way of connecting pages through segues. This post explains how.

Without segues

First let's create an example without using segues and then later we'll try to modify it to use segues.

(Screenshot: the Storyboard with the two scenes)

In the above Storyboard we have two scenes: one for the page view controller and another for the individual pages, the content view controller. The page view controller will have 4 pages that each display their page number. It's about as simple as a page view controller can get.

Below is the code of our simple content view controller:

class MyContentViewController: UIViewController {

  @IBOutlet weak var pageNumberLabel: UILabel!

  var pageNumber: Int!

  override func viewDidLoad() {
    super.viewDidLoad()
    pageNumberLabel.text = "Page \(pageNumber)"
  }
}

That means that our page view controller just needs to create an instance of MyContentViewController for each page and set its pageNumber. And since there is no segue between the two scenes, the only way to create an instance of MyContentViewController is programmatically, with UIStoryboard.instantiateViewControllerWithIdentifier(_:). Of course, for that to work we need to give the content view controller scene an identifier. We'll choose MyContentViewController to match the name of the class.

Our page view controller will look like this:

class MyPageViewController: UIPageViewController {

  let numberOfPages = 4

  override func viewDidLoad() {
    super.viewDidLoad()

    setViewControllers([createViewController(1)],
      direction: .Forward,
      animated: false,
      completion: nil)

    dataSource = self
  }

  func createViewController(pageNumber: Int) -> UIViewController {
    let contentViewController =
      storyboard?.instantiateViewControllerWithIdentifier("MyContentViewController")
      as! MyContentViewController
    contentViewController.pageNumber = pageNumber
    return contentViewController
  }
}

extension MyPageViewController: UIPageViewControllerDataSource {

  func pageViewController(pageViewController: UIPageViewController, viewControllerAfterViewController viewController: UIViewController) -> UIViewController? {
    return createViewController(
      mod((viewController as! MyContentViewController).pageNumber,
          numberOfPages) + 1)
  }

  func pageViewController(pageViewController: UIPageViewController, viewControllerBeforeViewController viewController: UIViewController) -> UIViewController? {
    return createViewController(
      mod((viewController as! MyContentViewController).pageNumber - 2,
          numberOfPages) + 1)
  }
}

func mod(x: Int, m: Int) -> Int {
  let r = x % m
  return r < 0 ? r + m : r
}
Here we created the createViewController(_:) method that has the page number as argument and creates the instance of MyContentViewController and sets the page number. This method is called from viewDidLoad() to set the initial page and from the two UIPageViewControllerDataSource methods that we're implementing here to get to the next and previous page.

The custom mod(_:_:) function is used to get continuous page navigation, where the user can go from the last page to the first and vice versa (the built-in % operator does not do the true modulus operation that we need here).

Using segues

The above sample is pretty simple. So how can we change it to use a segue? First of all we need to create a segue between the two scenes. Since we're not doing anything standard here, it will have to be a custom segue. Now we need a way to get an instance of our content view controller through the segue, which we can use from createViewController(_:). The only method we can use to do anything with the segue is UIViewController.performSegueWithIdentifier(_:sender:). We know that calling that method will create an instance of our content view controller, which is the destination of the segue, but we then need a way to get this instance back into our createViewController(_:) method. The only place we can reference the new content view controller instance is from within the custom segue. From its init method we can assign it to a variable which createViewController(_:) can also access. That looks something like the following. First we create the variable:

  var nextViewController: MyContentViewController?

Next we create a new custom segue class that assigns the destination view controller (the new MyContentViewController) to this variable.

public class PageSegue: UIStoryboardSegue {

  public override init!(identifier: String?, source: UIViewController, destination: UIViewController) {
    super.init(identifier: identifier, source: source, destination: destination)
    if let pageViewController = source as? MyPageViewController {
      pageViewController.nextViewController =
        destinationViewController as? MyContentViewController
    }
  }

  public override func perform() {}
}
Since we're only interested in getting the reference to the created view controller we don't need to do anything extra in the perform() method. And the page view controller itself will handle displaying the pages so our segue implementation remains pretty simple.

Now we can change our createViewController(_:) method:

func createViewController(pageNumber: Int) -> UIViewController {
  performSegueWithIdentifier("Page", sender: self)
  nextViewController!.pageNumber = pageNumber
  return nextViewController!
}
The method looks a bit odd since we're never assigning nextViewController anywhere in this view controller. We're also relying on the fact that the segue is performed synchronously by the performSegueWithIdentifier call; otherwise this wouldn't work.

Now we can create the segue in our Storyboard. We need to give it the same identifier as we used above and set the Segue Class to PageSegue.

(Screenshot: the custom segue configuration in the Storyboard)

Generic class

Now we can finally create segues to visualise the relationship between page view controller and content view controller. But let's see if we can write a generic class that has most of the logic which we can reuse for each UIPageViewController.

We'll create a class called SeguePageViewController which will be the super class for our MyPageViewController. We'll move the PageSegue to the same source file and refactor some code to make it more generic:

public class SeguePageViewController: UIPageViewController {

    var pageSegueIdentifier = "Page"
    var nextViewController: UIViewController?

    public func createViewController(sender: AnyObject?) -> UIViewController {
        performSegueWithIdentifier(pageSegueIdentifier, sender: sender)
        return nextViewController!
    }
}

public class PageSegue: UIStoryboardSegue {

    public override init!(identifier: String?, source: UIViewController, destination: UIViewController) {
        super.init(identifier: identifier, source: source, destination: destination)
        if let pageViewController = source as? SeguePageViewController {
            pageViewController.nextViewController = destinationViewController as? UIViewController
        }
    }

    public override func perform() {}
}
As you can see, we moved the nextViewController variable and createViewController(_:) to this class and use UIViewController instead of our concrete MyContentViewController class. We also introduced a new variable pageSegueIdentifier to be able to change the identifier of the segue.

The only thing missing now is setting the pageNumber of our MyContentViewController. Since we just made things generic we can't set it from here, so what's the best way to deal with that? You might have noticed that the createViewController(_:) method now has a sender: AnyObject? as argument, which in our case is still the page number. And we know another method that receives this sender object: prepareForSegue(_:sender:). All we need to do is implement this in our MyPageViewController class.

override func prepareForSegue(segue: UIStoryboardSegue, sender: AnyObject?) {
  if segue.identifier == pageSegueIdentifier {
    let vc = segue.destinationViewController as! MyContentViewController
    vc.pageNumber = sender as! Int
  }
}
If you're surprised to see the sender argument being used this way, you might want to read my previous post, Understanding the ‘sender’ in segues and use it to pass on data to another view controller.


Whether or not you'd want to use this approach of using segues for page view controllers might be a matter of personal preference. It doesn't give any huge benefits but it does give you a visual indication of what's happening in the Storyboard. It doesn't really save you any code since the amount of code in MyPageViewController remained about the same. We just replaced the createViewController(_:) with the prepareForSegue(_:sender:) method.

It would be nice if Apple offered a better solution that depends less on the data source of a page view controller and lets you define the paging logic directly in the Storyboard.

3 easy ways of improving your Kanban system

Xebia Blog - Sun, 06/14/2015 - 21:10

You are working in a Kanban team and after a few months it doesn’t seem to be as easy and effective as it used to be. Here are three tips on how to get that energy back up.

1. Recheck and refresh all policies

New team members haven’t gone through the same process as the other team members. This means that they probably won’t pick up on the subtleties of the board and the policies that go with it. Over time, team agreements that were very explicit (remember the Kanban practice: “Make Policies Explicit”?) aren’t explicit anymore. For example: a blue stickie that used to have a class of service associated with it becomes a stickie for the person that works on the project. Unclear policies lead to loss of focus, confusion and lack of flow. Check whether all policies are still valid, clear and understood by everybody. If not, get rid of them or improve them!

2. Check if WIP limits are understood and adhered to

When a team first starts using Kanban, Work in Progress (WIP) limits are not completely optimized for flow. They usually start out higher to minimize resistance from the team and stakeholders, as tighter WIP limits require a higher maturity of the organization. The team should then adjust (lower) the limits to optimize flow. A lack of focus on improving flow can mean that the team never adjusts its WIP limits.

Reevaluate the current WIP limits, adjust them for bottlenecks, lower them where possible and discuss next steps. This will return the focus and help the team to “Start Finishing”.

3. Do a complete overhaul of the board

It sounds counterintuitive, but sometimes it helps to take the board and do a complete run-through of the steps to design a Kanban system. Analyze up- and downstream stakeholders and model the steps the work flows through. After that, update the legend and the policies for classes of service.

Even if the board looks almost the same, the discussion and analysis might lead to subtle changes that make a world of difference. And it’s good for team spirit!


I hope these tips can help you. Please share any that you have in the comments!

ATDD with Lego Mindstorms EV3

Xebia Blog - Fri, 06/12/2015 - 20:09

We have been building automated acceptance tests using web browsers, mobile devices and web services for several years now. Last week Erik Zeedijk and I came up with the (crazy) idea to implement features for a robot using ATDD. In this blog we will explain how we used ATDD while experimenting with Lego Mindstorms EV3.

What are we building?

When you practice ATDD it is really important to have a common understanding about what you need to build and test. So we had several conversations about requirements and specifications.

We wanted to build a robot that could help us clean up post-its from tables or floors after brainstorms and other sticky note related sessions. For our first iteration we decided to experiment with moving the robot and using the color scanner. The robot should search for yellow post-it notes all by itself. No manual interactions because we hate manual work. After the conversation we wrote down the following acceptance criteria to scope our development activities:

  • The robot needs to move around on a white square board
  • The robot needs to turn away from the edges of the square board
  • The robot needs to stop at a yellow post-it

We used Cucumber to write down the features and acceptance test scenarios like:

Feature: Move robot
Scenario: Robot turns when a purple line is detected
When the robot is moving and encounters a purple line
Then the robot turns left

Scenario: Robot stops moving when yellow post-it detected
When the robot is moving and encounters yellow post-it
Then the robot stops

Just enough information to start rocking with the robot! Ready? Yes! Tests? Yes! Let's go!

How did we build it?

First we played around with the Lego Mindstorms EV3 programming software (4GL tool). We could easily build the application by clicking our way through the conditions, loops and color detections.

But we couldn't find a way to hook up our Acceptance Tests. So we decided to look for programming languages. The first hit was leJOS. It comes with custom firmware which you install on an SD card. This firmware allows you to run Java on the Lego EV3 brick.

Installing the firmware on the EV3 Brick

The EV3 main computer “Brick” is actually pretty powerful.

It has a 300MHz ARM processor with 64MB RAM and takes micro SD cards for storage.
It also has Bluetooth and Wi-Fi, making it possible to wirelessly connect to it and load programs on it.

There are several custom firmwares you can run on the EV3 Brick. The leJOS firmware allows us to do just that: we can program in Java, with access to a specialized API created for the EV3 Brick. Install it by following these steps:

  1. Download the latest firmware on sourceforge:
  2. Get a blank SD card (2GB – 32GB, SDXC not supported) and format this card to FAT32.
  3. Unzip the ‘’ file from the leJOS home directory to the root directory of the SD card.
  4. Download Java for Lego Mindstorms EV3 from the Oracle website:
  5. Copy the downloaded ‘Oracle JRE.tar.gz’ file to the root of the SD card.
  6. Safely remove the SD card from your computer, insert it into the EV3 Brick and boot the EV3. The first boot will take approximately 8 minutes while it creates the needed partitions for the EV3 Brick to work.
ATDD with Cucumber

Cucumber helped us create scenarios in a human-readable language. We created the Step Definitions to glue all the automation together and used remote access to the EV3 via Java RMI.

We also wrote some support code to help us make the automation code readable and maintainable.

Below you can find an example of the StepDefinition of a Given statement of a scenario.

@Given("^the robot is driving$")
public void the_robot_is_driving() throws Throwable {
    assertThat(Ev3BrickIO.leftMotor.isMoving(), is(true));
    assertThat(Ev3BrickIO.rightMotor.isMoving(), is(true));
}
Behavior Programming

A robot needs to perform a series of functions. In our case we want to let the robot drive around until he finds a yellow post-it, after which he stops. The obvious way of programming these functions is to make use of several if-then-else statements and loops. This works fine at first; however, as soon as the patterns become more complex, the code becomes complex as well, since adding new actions can influence other actions.

A solution to this problem is Behavior Programming. The concept of this is based around the fact that a robot can only do one action at a time depending on the state he is in. These behaviors are written as separate independent classes controlled by an overall structure so that behavior can be changed or added without touching other behaviors.

The following is important in this concept:

  1. One behavior is active at a time and then controls the robot.
  2. Each behavior is given a priority. For example; the turn behavior has priority over the drive behavior so that the robot will turn once the color sensor indicates he reaches the edge of the field.
  3. Each behavior is a self contained, independent object.
  4. The behaviors can decide if they should take control of the robot.
  5. Once a behavior becomes active, it has a higher priority than any of the other behaviors.
Behaviour API and Coding

The Behavior API is fairly easy to understand, as it consists of only one interface and one class. The build-up of the individual behavior classes is defined by the overall Behavior interface, and the individual behavior classes are controlled by an Arbitrator class that decides which behavior should be the active one. A separate class should be created for each behavior you want the robot to have.
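For reference, here is what that interface boils down to; this is a sketch of the leJOS subsumption API from memory, so consult the lejos.robotics.subsumption Javadoc for the authoritative version:

public interface Behavior {
    // Return true when this behavior wants to take control of the robot.
    boolean takeControl();
    // The behavior's work; it should return shortly after suppress() is called.
    void action();
    // Called by the Arbitrator when a higher-priority behavior takes over.
    void suppress();
}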

leJOS Behaviour Pattern


When an Arbitrator object is instantiated, it is given an array of Behavior objects. Once it has these, the start() method is called and it begins arbitrating; deciding which behavior will become active.

In the code example below the Arbitrator is being created: the first two lines create instances of our behaviors. The third line places them into an array, with the lowest priority behavior taking the lowest array index. The fourth line creates the Arbitrator, and the fifth line starts the arbitration process. Once you create more behavior classes, these can simply be added to the array in order to be executed by the Arbitrator.

DriveForward driveForward = new DriveForward();
Behavior stopOnYellow = new StopOnYellow();
behaviorList = new Behavior[]{driveForward, stopOnYellow};
arbitrator = new Arbitrator(behaviorList, true);
arbitrator.start();

We will now take a detailed look at one of the behavior classes: DriveForward. Each of the standard methods is defined. First we create the action() method, which contains what we want the robot to perform when the behavior becomes active; in this case, moving the left and the right motors forward.

public void action() {
    try {
        suppressed = false;
        // Drive both motors forward until this behavior is suppressed.
        Ev3BrickIO.leftMotor.forward();
        Ev3BrickIO.rightMotor.forward();
        while (!suppressed) {
            Thread.yield();
        }
    } catch (RemoteException e) {
        e.printStackTrace();
    }
}
The suppress() method will stop the action once this is called.

public void suppress() {
    suppressed = true;
}

The takeControl() method tells the Arbitrator which behavior should become active. In order to keep the robot moving after having performed other actions, we decided to create a global variable to keep track of whether the robot is running. While this is the case, we simply return 'true' from the takeControl() method.

public boolean takeControl() {
    return Ev3BrickIO.running;
}

You can find our leJOS EV3 Automation code here.

Test Automation Day

Our robot has the potential to do much more and we have not yet implemented the feature of picking up our post-its! Next week we'll continue working with the Robot during the Test Automation Day.
Come and visit our booth, where you can help us enable the Lego Mindstorms EV3 Robot to clean up our sticky mess at the office!

Feel free to use our Test Automation Day €50 discount on the ticket price by using the following code: TAD15-XB50

We hope to see you there!

Unit and integration are ambiguous names for tests

Coding the Architecture - Simon Brown - Fri, 06/12/2015 - 17:10

I've blogged about architecturally-aligned testing before, which essentially says that our automated tests should be aligned with the constructs that we use to describe a software system. I want to expand upon this topic and clarify a few things, but first...


In a nutshell, I think the terms "unit test" and "integration test" are ambiguous and open to interpretation. We all tend to have our own definitions, but I've seen those definitions vary wildly, which again highlights that our industry often lacks a shared vocabulary. What is a "unit"? How big is a "unit"? And what does an "integration test" test the integration of? Are we integrating components? Or are we integrating our system with the database? The Wikipedia page for unit testing talks about individual classes or methods ... and that's good, so why don't we use much more precise terminology that explicitly specifies the scope of the test?

Align the names of the tests with the things they are testing

I won't pretend to have all of the answers here, but aligning the names of the tests with the things they are testing seems to be a good starting point. And if you have a nicely defined way of describing your software system, the names for those tests fall into place really easily. Again, this comes back to creating that shared vocabulary - a ubiquitous language for describing the structures at different levels of abstraction within a software system.

Here are some example test definitions for a software system written in Java or C#, based upon my C4 model (a software system is made up of containers, containers contain components, and components are made up of classes).

Class tests
  • What? Tests focused on individual classes, often by mocking out dependencies. These are typically referred to as “unit” tests.
  • Test scope? A single class - one method at a time, with multiple test cases to test each path through the method if needed.
  • Mocking, stubbing, etc? Yes; classes that represent service providers (time, random numbers, key generators, non-deterministic behaviour, etc) and interactions with “external” services via inter-process communication (e.g. databases, messaging systems, file systems, other infrastructure services, etc ... the adapters in a “ports and adapters” architecture).
  • Lines of production code tested per test case? Tens.
  • Speed? Very fast running; test cases typically execute against resources in memory.

Component tests
  • What? Tests focused on components/services through their public interface. These are typically referred to as “integration” tests and include interactions with “external” services (e.g. databases, file systems, etc). See also ComponentTest on Martin Fowler's bliki.
  • Test scope? A component or service - one operation at a time, with multiple test cases to test each path through the operation if needed.
  • Mocking, stubbing, etc? Yes; other components.
  • Lines of production code tested per test case? Hundreds.
  • Speed? Slightly slower; component tests may incur network hops to span containers (e.g. JVM to database).

System tests
  • What? UI, API, functional and acceptance tests (“end-to-end” tests; from the outside-in). See also BroadStackTest on Martin Fowler's bliki.
  • Test scope? A single scenario, use case, user story, feature, etc across a whole software system.
  • Mocking, stubbing, etc? Not usually, although perhaps other systems if necessary.
  • Lines of production code tested per test case? Thousands.
  • Speed? Slower still; a single test case may require an execution path through multiple containers (e.g. web browser to web server to database).
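As a rough sketch of how such names can look in practice (JUnit 4 here; PriceCalculator is a hypothetical class invented purely for illustration):

import org.junit.Test;
import static org.junit.Assert.assertEquals;

// Hypothetical class under test, included only to make the example self-contained.
class PriceCalculator {
    private final double taxRate;
    PriceCalculator(double taxRate) { this.taxRate = taxRate; }
    double total(double amount) { return amount * (1 + taxRate); }
}

// A "class test": its scope is a single class, and it runs entirely in memory.
public class PriceCalculatorTests {

    @Test
    public void total_includesTax() {
        PriceCalculator calculator = new PriceCalculator(0.2);
        assertEquals(120.0, calculator.total(100.0), 0.001);
    }
}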


This definition of the tests doesn't say anything about TDD, when you write your tests or how many tests you should write. That's all interesting, but it's independent of the topic we're discussing here. So then, do *you* have a good clear definition of what "unit testing" means? Do your colleagues at work agree? Can you say the same for "integration testing"? What do you think about aligning the names of the tests with the things they are testing? Thoughts?

Categories: Architecture

Software Engineering Radio - Episode 228

Coding the Architecture - Simon Brown - Fri, 06/12/2015 - 10:23

A quick note to say that my recent interview/chat with Sven Johann for Software Engineering Radio has now been published. As you might expect, it covers software architecture, diagramming, the use of UML, my C4 model, doing "just enough" and why the code should (but doesn't often) reflect the software architecture. You can listen to the podcast here. Enjoy!

Categories: Architecture

Diagrams for System Evolution

Coding the Architecture - Simon Brown - Thu, 06/11/2015 - 08:17

Every system has an architecture (whether deliberately designed or not) and most systems change and evolve during their existence - which includes their architecture. Sometimes this happens gradually and sometimes in large jumps. Let's have a look at a simple example.

Basic context diagram for a furniture company


The main elements are:

  • Sales Orderbook where sales staff book orders and designers occasionally view sales.
  • Product Catalogue where designers record the details of new products.
  • Warehouse Management application that takes orders and either delivers to customers from stock or forwards orders to manufacturing.

The data flow is generally one way, with each system pushing out data with little feedback, due to an old batch-style architecture. For example, the designer updates product information but all the updates for the day are pushed downstream at end-of-day. The Warehouse application only accepts a feed from the Sales Orderbook, which includes both orders and product details.

There are multiple issues with a system like this:

  1. Modifications to the products (new colours, sizes, materials etc) are not delivered to the manufacturer until after an order is placed. This means a long lead time for manufacturing and an unhappy customer.
  2. The sales orderbook receives a subset of the product information when updated. This means the information could become out-of-sync and not all of it is available.
  3. The warehouse system also relies on the sales orderbook for product information. This may be out of date or inaccurate.
  4. The designer does not have easy-to-access information about how product changes affect sales.

The users are unhappy with the system and want the processes to change as follows:

  1. The product designer wishes to see historic order information alongside the products in the catalog. Therefore the catalog should be able to retrieve data from the Sales Orderbook when required.
  2. Don't batch push products to the Orderbook and warehouse management. They should access the product catalog as a service when required.

System redesign

Therefore the system is re-architected as follows:

This is a significant modification to how the systems works:

  • The Sales Orderbook and Product Catalogue have a bi-directional relationship and request information from each other.
  • Only orders are sent from the Sales Orderbook to the Product Catalogue.
  • There is now a direct connection between the Warehouse Manager and the Product Catalogue.
  • The product catalogue now provides all features the Designer requires. They do not need to access any other interfaces into the system.

However, if you place the two diagrams next to each other they look similar and the scale of the change is not obvious.


When we write code, we would use a ‘diff’ tool to show us the modifications.


I often do something similar on my architecture diagrams. For example:

System before modification:


System after modification:

When placed side-by-side the differences are much more obvious:


I’ve used similar colours to those used by popular diff tools; with red for a removed relationship and green for a new relationship (I would use the same colours if components were added or removed). I’ve used blue to indicate where a relationship has changed significantly. Note that I’ve included a Key on these diagrams.

You may be thinking; why have I coloured before and after diagrams rather than having a single, coloured diagram? A single diagram can work if you are only adding and removing elements but if they are being modified it becomes very cluttered. As an example, consider the relationship between the Sales Orderbook and the Product Catalogue. Would the description on a combined diagram be "product updates", "exchange sales and product information", or both? How would I show which was before and after? Lastly, what would I do with the arrow on the connection? Colourise it differently from the line?


I find that having separate, coloured diagrams for the current and desired system architectures allows the changes to be seen much more clearly. Having spoken to several software architects, it appears that many use similar techniques, however there appears to be no standard way to do this. Therefore we should make sure that we follow standard advice such as using keys and labelling all elements and connections.

Categories: Architecture

Extracting software architecture from code

Coding the Architecture - Simon Brown - Mon, 06/08/2015 - 22:47

I ran my Extracting software architecture from code session at Skills Matter this evening and it was a lot of fun. The basic premise of the short workshop part was simple: "here's some code, now draw some software architecture diagrams to describe it". Some people did this individually and others worked in groups. It sounds easy, but you can see for yourself what happened.


There are certainly some common themes, but each diagram is different. Also, people's perception of what architectural information can be extracted from the code differed slightly too, but more on that topic another day. If you want to have a go yourself, the codebase I used was a slightly cutdown version* of the Spring PetClinic web application. The presentation parts of the session were videoed and I'm creating a 1-day version of this workshop that I'll be running at a conference or two in the Autumn.

This again raises some basic questions about the software development industry though. Why, in 2015, do we still not have a consistent way to do this? Why don't we have a shared vocabulary? When will we be able to genuinely call ourselves "software engineers"?

* The original version ships with three different database profile implementations, and I removed two of them for simplicity.

Categories: Architecture

Diagramming microservices with the C4 model

Coding the Architecture - Simon Brown - Mon, 06/08/2015 - 08:40

Here's a question I'm being asked more and more ... how do you diagram a microservices architecture with the C4 software architecture model?

It's actually quite straightforward providing that you have a defined view of what a microservice is. If a typical modular monolithic application is a container with a number of components running inside it, a microservice is simply a container with a much smaller number of components running inside it. The actual number of components will depend on your implementation strategy. It could range from the very simple (i.e. one, where a microservice is a container with a single component running inside) through to something like a mini-layered or hexagonal architecture. And by "container", I mean anything ranging from a traditional web server (e.g. Apache Tomcat, IIS, etc) through to something like a self-contained Spring Boot or Dropwizard application. In concrete terms then...

  • System context diagram: No changes ... you're still building a system with users (people) and other system dependencies.
  • Containers diagram: If each of your microservices can be deployed individually, then that should be reflected on the containers diagram. In other words, each microservice is represented by a separate container.
  • Component diagrams: Depending on the complexity of your microservices, I would question whether drawing a component diagram for every microservice is worth the effort. Of course, if each microservice is a mini-layered or hexagonal architecture then perhaps there's some value. I would certainly consider using something like Structurizr for doing this automatically from the code though.

So there you go, that's how I would approach diagramming a microservices architecture with the C4 model.

Categories: Architecture

Extracting software architecture from code

Coding the Architecture - Simon Brown - Tue, 06/02/2015 - 09:19

I'm running a short "In The Brain" session at Skills Matter in London next Monday evening, focussed around the topic of extracting the software architecture model from code.

It’s often said that the code is the true embodiment of the software architecture, yet my experience suggests that it’s difficult to actually extract this information from the code. Why isn’t the architecture in the code? Join me for this "In The Brain" session where we’ll look at a simple Java web application to see what information we can extract from the code and how to supplement it with information we can't. This session will be interactive, so bring a laptop.

This will be an interactive session so you will need to bring a laptop, or at least something that you can browse a GitHub repository with. We'll be looking at a Java web application, but the concepts are applicable to other programming languages and platforms. See you next week.

Categories: Architecture

Data’s hierarchy of needs


This post was originally published on the AppsFlyer blog.

A couple of weeks ago Nir Rubinshtein and I presented AppsFlyer’s data architecture in a meetup of Big Data & Data Science Israel. One of the concepts that I presented there, which is worth expanding upon, is “Data’s Hierarchy of Needs”:

  • Data should Exist
  • Data should be Accessible
  • Data should be Usable
  • Data should be Distilled
  • Data should be Presented

How can we make data “achieve its pinnacle of existence” and be acted upon? In other words, what are the areas that should be addressed when designing a data architecture if you want it to be complete and enable creating insights and value from the data you generate and collect?

If done properly, your users might just act upon the data you provide. This list might seem a little simplistic, but it is not a prescription of what to do; rather, it is a set of reminders of areas we need to cover and questions we need answered to properly create a data architecture.

Data Should Exist

Well, of course data should exist, and it probably does. You should ask yourself, however, whether the data that exists is the right data. Does the retention policy you have serve the business needs? Does the availability fit your needs? Do you have all the needed links (foreign keys) to other data so you’d be able to connect it later for analysis?

To make this more concrete, consider the following example: AppsFlyer accepts several types of events (launches, in-app events, etc.) which are tied to apps. Apps are connected to accounts (an account would have one or more applications; usually at least an iOS app and an Android one). If we saved the accounts as only the latest snapshot and an app changed ownership, the historical data before that change would be skewed. If we treat the accounts as a slowly changing dimension of the events, then we’d be able to handle the transition correctly. Note that we may still choose to provide the new owner the historic data, but now it is not the only option the system supports, and the decision can be based on the business needs.

Data Should Be Accessible

If data is written to disk it is accessible, programmatically at least; however, there can be many levels of accessibility, and we need to think about our end users' needs and the level of access they'd require. At AppsFlyer, the data existence (mentioned above) is handled by processing all the messages that go through our queues using Kafka, but that data is saved in sequence files and stored by event time. Most of our usage scenarios do have a time component, but they are primarily driven by app or account. Any processing that needs a specific account and would access the raw events would have to sift through tons of records (3.7+ billion a day at the time of this post) to find the few relevant ones. Thus, one basic move toward accessibility of data is to sort it by app, so that queries only need to access a small subset of the data and thus run much faster.

Then we need to consider the “hotness” of the data, i.e. what response times we need and for which types of data. For instance, aggregations such as retention reports need to be accessed online (so-called “sub-second” response), latest counts need near real-time, explorations of data for new patterns can take hours, etc. To enable support for these varied usage scenarios, we need to create multiple projections of our data, most likely using several different technologies. AppsFlyer stores raw data in sequence files, processed data in Parquet files (accessible via Apache Spark), aggregations and recent data in a columnar RDBMS, and near real-time data in memory.

The three different storage mechanisms I mentioned above (Parquet, columnar RDBMS and in-memory data grid) used at AppsFlyer all have SQL access; this is not by chance. While we (the industry) went through a short period of NoSQL, SQL or almost-SQL is becoming the norm again, even for semi-structured and poly-structured data. Providing an SQL interface to your data is another important aspect of data accessibility, as it allows expanding the user base for the data beyond R&D. Again, this is important not just for your relational data.
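For example, exposing the Parquet projection through SQL might look something like this sketch (table and column names are, again, illustrative):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sql-access").getOrCreate()

    # Register the processed Parquet data as a table so that analysts
    # outside R&D can query it with plain SQL instead of Spark code.
    spark.read.parquet("hdfs:///processed/events").createOrReplaceTempView("events")

    daily_launches = spark.sql("""
        SELECT app_id, COUNT(*) AS launches
        FROM events
        WHERE event_type = 'launch'
        GROUP BY app_id
    """)
    daily_launches.show()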

Data Should Be Usable

What’s the difference between accessible data and usable data? For one, there’s data cleansing. This is a no-brainer if you pull data from disparate systems, but it is also needed if your source is a single system. Data cleansing is what traditional ETL is all about, and the techniques still apply.
Another aspect of making data usable is enriching it, or connecting it to additional data. Enrichment can happen from internal sources, like linking CRM data to the account info. It can also be facilitated by external sources, as with getting the app category from the app store or the device screen size from a device database.
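A trivial Python sketch of this kind of enrichment (the device database and field names are made up for illustration):

    # Hypothetical device database: maps a device model to its attributes.
    DEVICE_DB = {
        "iPhone6,2": {"screen_inches": 4.0},
        "SM-G920F": {"screen_inches": 5.1},
    }

    def enrich(event):
        """Attach device attributes to a raw event; unknown devices pass through."""
        device = DEVICE_DB.get(event.get("device_model"), {})
        return {**event, **device}

    raw = {"app_id": "app-1", "event_type": "launch", "device_model": "SM-G920F"}
    print(enrich(raw))  # the event now carries screen_inches: 5.1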

Last but not least is considering the legal and privacy aspects of the data. Before allowing access to the data, you may need to mask sensitive information or remove privacy-related data (sometimes you shouldn’t even save it in the first place). At AppsFlyer we take this issue very seriously and make major efforts, when working with partners and clients, to make sure privacy-related data is handled correctly. In fact, we are also undergoing independent SOC auditing to make sure we comply with the highest standards.
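Masking can be as simple as replacing sensitive values with a one-way hash before the data is exposed. A minimal sketch (the field list is illustrative, not AppsFlyer's actual policy):

    import hashlib

    # Fields that should never leave the system in the clear.
    SENSITIVE_FIELDS = {"email", "device_id"}

    def mask(event):
        """Replace sensitive values with a truncated one-way hash."""
        masked = dict(event)
        for field in SENSITIVE_FIELDS & masked.keys():
            digest = hashlib.sha256(masked[field].encode("utf-8")).hexdigest()
            masked[field] = digest[:12]  # still joinable across events, not reversible
        return masked

    print(mask({"app_id": "app-1", "email": "user@example.com"}))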

To summarize: to make data usable, you have to make sure it is correct, connect it to other data, and make sure it complies with legal and privacy requirements.

Data Should Be Distilled

Distilling insights is the reason we perform all the previous steps. Data in itself is of little use if it doesn’t help us make better decisions. There are multiple types of insights you can generate here, beginning with the more traditional BI scenarios of slice-and-dice analytics, going through real-time aggregations and trend analysis, and ending with applying machine learning or “advanced analytics”. You can see one example of the type of insights that can be gleaned from our data by looking at the Gaming Advertising Performance Index we recently published.

Data Should Be Presented

This point ties in nicely with the Gaming Advertising Performance Index example provided above. Getting insights is an important step, but if you fail to present them in a coherent and cohesive manner, the actual value users will be able to derive from them is limited at best. Note that even if you use insights for making decisions (e.g. recommending a product to a user), you’d still need to present how well those decisions are doing.

There are many issues that need to be dealt with from a UX perspective, both in how users interact with the data and in how the data is presented. An example of the former is deciding on chart types for the data. A simple example of the latter: when presenting projected or inaccurate data, it should be clear to users that they are looking at approximations, to prevent support calls about numbers not adding up.

Making sure all the areas discussed above are covered and handled properly is a lot of work, but providing a solution that actually helps your users make better decisions is well worth it. The data’s hierarchy of needs is not a prescription for how to get there; it is merely a set of waypoints to help navigate toward this end goal. It helps me think holistically about AppsFlyer’s data needs, and I hope that, after reading this post, it will also help you.

For more information about our architecture, check out the presentation from the meetup:

Categories: Architecture

Do you have a shared vocabulary?

Coding the Architecture - Simon Brown - Mon, 05/18/2015 - 16:57

"This is a component of our system", says one developer, pointing to a box on a diagram labelled "Web Application". Next time you're sitting in an conversation about software design, listen out for how people use terms like "component", "module", "sub-system", etc. We can debate whether UML is a good notation to visually communicate the design of a software system, but we have a more fundamental problem in our industry. That problem is our vocabulary, or rather, the lack of it.


I've been running my software architecture sketching exercises in various formats for nearly ten years, and thousands of people have participated. The premise is very simple: here are some requirements; design a software solution and draw some pictures to illustrate the design. The range of diagrams I've seen, and still see, is astounding. The percentage of people who choose to use UML is tiny, with most people choosing to use an informal boxes-and-lines notation instead. With some simple guidance, the notational aspects of the diagrams are easy to clean up. There's a list of tips in my book that can be summarised with this slide from my workshop.

Some tips for effective sketches


What's more important though is the set of abstractions used. What exactly are people supposed to be drawing? How should they think about, describe and communicate the design of their software system? The primary aspect I'm interested in is the static structure. And I'm interested in the static structure from different levels of abstraction. Once this static structure is understood and in place, it's easy to supplement it with other views to illustrate runtime/behavioural characteristics, infrastructure, deployment models, etc.

In order to get to this point though, we need to agree upon some vocabulary. And this is the step that is usually missed during my workshops. Teams charge headlong into the exercise without having a shared understanding of the terms they are using. I've witnessed groups of people having design discussions using terms like "component" where they are clearly not talking about the same thing. Yet everybody in the group is oblivious to this. For me, the answer is simple. Each group needs to agree upon the vocabulary, terminology and abstractions they are going to use. The notation can then evolve.

My Simple Sketches for Diagramming Your Software Architecture article explains why I believe that abstractions are more important than notation. Maps are a great example of this. Two maps of the same location will show the same things, but they will often use different notations. The key to understanding these different maps is exactly that - a key tucked away in the corner of each map somewhere. I teach people my C4 model, based upon a simple set of abstractions (software systems, containers, components and classes), which can be used to describe the static structure of a software system from a number of different levels of abstraction. A common set of abstractions allows you to have better conversations and easily compare solutions. In my workshops, the notation people use to represent this static structure is their choice, with the caveat that it must be self-describing and understandable by other people without explanation.
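To make the relationship between these abstractions concrete, here is a small Python sketch of the C4 static-structure hierarchy; the system and element names are invented for illustration:

    # The C4 hierarchy: a software system contains containers, containers
    # contain components, and components are implemented by classes.
    software_system = {
        "name": "Internet Banking System",
        "containers": [
            {"name": "Web Application",
             "technology": "Java/Spring MVC",
             "components": [
                 {"name": "AccountsController", "classes": ["AccountsController"]},
                 {"name": "AccountsService",
                  "classes": ["AccountsService", "AccountRepository"]},
             ]},
            {"name": "Database", "technology": "Relational DB", "components": []},
        ],
    }

    # Each level corresponds to one diagram: system context, containers,
    # components and (optionally) classes.
    for container in software_system["containers"]:
        print(container["name"], "->",
              [c["name"] for c in container["components"]])

Whatever notation a team picks, agreeing that these are the levels being drawn is what makes two people's diagrams comparable.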

Next time you have a design discussion, especially if it's focussed around some squiggles on a whiteboard, stop for a second and take a step back to make sure that everybody has a shared understanding of the vocabulary, terminology and abstractions that are being used. If this isn't the case, take some time to agree upon them. You might be surprised by the outcome.

Categories: Architecture

Suprastructure – how come “Microservices” are getting small?


Now, I don’t want to get off on a rant here*, but it seems like “Microservices” are all the rage these days – at least judging from my Twitter, Feedly and Prismatic feeds. I already wrote that, in my opinion, “Microservices” is just a new name for SOA. I thought I’d give a couple of examples of what I mean.

I worked on systems that would pass for Microservices today years ago (as early as 2004/5). For instance, in 2007 I worked at a startup called xsights. We developed something like Google Goggles for brands (or a barcodeless barcode), so users could snap a picture of an ad/brochure etc. and get relevant content or perks in response (e.g. we ran campaigns in Germany with a book publisher where MMSing shots of newspaper ads or outdoor signage resulted in getting information and discounts on the advertised books). The architecture driving that was a set of small, focused, autonomous services. Each service encapsulated its own data store (if it had one), and services were replaceable (e.g. we had MMS, web, app and 3G video call gateways). We developed the infrastructure to support automatic service discovery, the ability to create ad-hoc long-running interactions (a.k.a. Sagas) that enabled different cooperations between the services (e.g. the flow for a 3G video call needed more services for fulfilment than an app one), etc. You can read a bit about it in the “putting it all together” chapter of my SOA Patterns book, or view the presentation I gave at QCon a few years back called “Building reliable systems from unreliable components” (see slides below); both elaborate some more on that system.

Another example is a naval command and control system I designed (along with Udi Dahan) back in 2004 for an unmanned surface vessel (like a drone, but on water). In that system we had services like “navigation”, which suggested navigation routes based on waypoints and data from other services (e.g. weather); a “protector” service that handled communications to and from the actual USVs; a “Common Operational Picture” (COP) service that aggregated target data from external services and sensors (e.g. the ones on the protectors); an “alerts” service where business rules could trigger various actions; etc. These services communicated using events and messages, with flows like: the protector publishes its current position; the COP publishes updated target positions (protector + other targets); the navigation service spots a potential interception problem and publishes that; the alerts service identifies that the potential problem exceeds a threshold and triggers an alert to users, who then initiate a request for alternate navigation plans; etc. Admittedly, some of these services could have been smaller and more focused, but they were still autonomous, with separate storage – and hey, that was 2004 :)

So, what changed in the last decade? For one, I guess that after years of “enterprisey” hype that ruined SOA’s name, the actual architectural style is finally getting some traction (even if it had to change its name for that to happen).

However, this post is not just a rant on Microservices…

The more interesting change is the shift in the role of infrastructure, from a set of libraries and tools embedded within the software we write to larger constructs running outside of the software, running and managing it – in other words, the emergence of “suprastructure” instead of infrastructure (infra = below, supra = above). It isn’t that infrastructure vanishes, but a lot of its functionality is “outsourced” to suprastructure. This is something that started a few years back with PaaS, but (IMHO) it has been getting more acceptance and use in the last couple of years, especially with the growing popularity of Docker (and, more importantly, its ecosystem).

If we consider, for example, the architecture of AppsFlyer, which I recently joined (you can listen to Nir Rubinshtein, our system architect, presenting it in Hebrew, or check out the slides on Speaker Deck or below, in English):

Instead of writing or using elaborate service hosts and application servers, you can host simple apps in Docker; run and schedule them with Mesos; get cluster and discovery services from Consul; recover from failure by re-reading logs from Kafka; etc. Back in the day we also had these capabilities, but we wrote tons of code to make them happen – tons of code that was specific to the solution and technology (and was prone to bugs and problems). For modern solutions, all these capabilities are available almost off the shelf, everywhere: on premises, on clouds, and even across clouds.
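As a small illustration of how little code this now takes, here is a sketch of registering a service with a local Consul agent over its HTTP API and then looking it up (the service name and port are invented):

    import requests

    # Register this instance with the local Consul agent; discovery is now
    # the suprastructure's job rather than custom code in every service.
    service = {"Name": "event-processor", "ID": "event-processor-1", "Port": 8080}
    requests.put("http://localhost:8500/v1/agent/service/register", json=service)

    # Any other service can now find us through Consul's catalog API
    # (or via DNS as event-processor.service.consul).
    instances = requests.get(
        "http://localhost:8500/v1/catalog/service/event-processor").json()
    print(instances)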

The importance of suprastructure with regard to “microservices” is that this “outsourcing” of functionality helps drive down the overhead and costs associated with making services small(er). In previous years, the threshold at which useful services degenerated into nanoservices (services too small to justify their overhead) was all too easy to cross. Today it is almost reversed: you spend the effort of setting up all this suprastructure once, and you begin to see the return only when you have enough services to make it worthwhile.

Another advantage of suprastructure is that it makes polyglot services easier – it is easier to write different services using different technologies. Instead of investing in a lot of technology-specific infrastructure, you can get the generic capabilities from the suprastructure and spend more time solving the business problems using the right tool for the job. It also makes it easier to change and evolve technologies over time – again saving the sunk costs of investing in elaborate infrastructure.


Of course, that’s just my opinion, I could be wrong…*

PS – we (AppsFlyer) are hiring: UI tech lead, data scientist, senior devs and more… :)

Building reliable systems from unreliable components

* with apologies to Dennis Miller

Categories: Architecture