
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools


Architecture

Distributed big balls of mud

Coding the Architecture - Simon Brown - Sun, 07/06/2014 - 10:27

If you want evidence that the software development industry is susceptible to fashion, just go and take a look at all of the hype around microservices. It's everywhere! For some people microservices is "the next big thing", whereas for others it's simply a lightweight evolution of the big SOAP service-oriented architectures that we saw 10 years ago "done right". I do like a lot of what the current microservice architectures are doing, but it's by no means a silver bullet. Okay, I know that sounds obvious, but I think many people are jumping on them for the wrong reason.

From monoliths to microservices

I often show this slide in my conference talks, and I've blogged about this before, but basically there are different ways to build software systems. On the one side we have traditional monolithic systems, where everything is bundled up inside a single deployable unit. This is probably where most of the industry is. Caveats apply, but monoliths can be built quickly and are easy to deploy; the trade-off is limited agility, because even tiny changes require a full redeployment. We also know that monoliths often end up looking like a big ball of mud because of the way that software often evolves over time. For example, many monolithic systems are built using a layered architecture, and it's relatively easy for layered architectures to be abused (e.g. skipping "around" a service to call the repository/data access layer directly).

On the other side we have service-based architectures, where a software system is made up of many separately deployable services. Again, caveats apply but, if done well, service-based architectures buy you a lot of flexibility and agility because each service can be developed, tested, deployed, scaled, upgraded and rewritten separately, especially if the services are decoupled via asynchronous messaging. The downside is increased complexity because your software system now has many more moving parts than a monolith. As Robert says, the complexity is still there, you're just moving it somewhere else.

There is, of course, a mid-ground here. We can build monolithic systems that are made up of in-process components, each of which has an explicit well-defined interface and set of responsibilities. This is old-school component-based design that talks about high cohesion and low coupling, but I usually sense some hesitation when I talk about it. And this seems odd to me. Before I explain why, let me quote something from a blog post that I read earlier this morning about the rationale behind a team adopting a microservices approach: "When we started building Karma, we decided to split the project into two main parts: the backend API, and the frontend application. The backend is responsible for handling orders from the store, usage accounting, user management, device management and so forth, while the frontend offers a dashboard for users which accesses this API. Along the way we noticed that if the whole backend API is monolithic it doesn't work very well because everything gets entangled."

The blog post also mentions scaling, versioning and multiple languages/frameworks as other reasons to choose microservices. Again, there are no silver bullets here, everything is a trade-off. Anyway, "everything getting entangled" is not a reason to switch from monoliths to microservices. If you're building a monolithic system and it's turning into a big ball of mud, perhaps you should consider whether you're taking enough care of your software architecture. Do you really understand what the core structural abstractions are in your software? Are their interfaces and responsibilities clear too? If not, why do you think moving to a microservices architecture will help? Sure, the physical separation of services will force you to not take some shortcuts, but you can achieve the same separation between components in a monolith. A little design thinking and an architecturally-evident coding style will help to achieve this without the baggage of going distributed.
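To make that mid-ground a little more concrete, here is a minimal, hypothetical sketch (all names invented for illustration) of what a component with an explicit interface inside a monolith might look like in Java:

// Hypothetical example: a single "orders" component inside a monolith.
// Other code depends only on the public interface; the implementation and
// the data access behind it are deliberately kept out of reach.
public interface OrdersComponent {
    Order findOrder(String orderId);
}

class Order {
    final String id;
    Order(String id) { this.id = id; }
}

interface OrdersRepository {
    Order load(String orderId);
}

class OrdersComponentImpl implements OrdersComponent {
    private final OrdersRepository repository; // hidden behind the component boundary

    OrdersComponentImpl(OrdersRepository repository) {
        this.repository = repository;
    }

    @Override
    public Order findOrder(String orderId) {
        return repository.load(orderId);
    }
}

The presentation layer only ever sees OrdersComponent, so "skipping around" the component to call the repository directly is no longer something that happens by accident; that is the same separation a microservice enforces, minus the network.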

Many of the teams I've spoken to are building monolithic systems and don't want to look at component-based design. The mid-ground seems to be a hard-sell. I ran a software architecture sketching workshop with a team earlier this year where we diagrammed one of their software systems. The diagram started as a strictly layered architecture (presentation, business services, data access) with all arrows pointing downwards and each layer only ever calling the layer directly beneath it. The code told a different story though and the eventual diagram didn't look so neat anymore. We discussed how adopting a package by component approach could fix some of these problems, but the response was, "meh, we like building software using layers".

It seems as if teams are jumping on microservices because they're sexy, but the design thinking and decomposition strategy required to create a good microservices architecture are the same as those needed to create a well structured monolith. If teams find it hard to create a well structured monolith, I don't rate their chances of creating a well structured microservices architecture. As Michael Feathers recently said, "There's a bit of overhead involved in implementing each microservice. If they ever become as easy to create as classes, people will have a freer hand to create trouble - hulking monoliths at a different scale." I agree. A world of distributed big balls of mud worries me.

Categories: Architecture

Create the smallest possible Docker container

Xebia Blog - Fri, 07/04/2014 - 21:59

When you are playing around with Docker, you quickly notice that you are downloading hundreds of megabytes as you use preconfigured containers. A simple Ubuntu container easily exceeds 200MB, and as software is installed on top of it, the size only increases. In some use cases you do not need everything that comes with Ubuntu. For example, if you want to run a simple web server written in Go, there is no need for any of that tooling at all.

I have been searching for the smallest possible container to start with and found this one:

docker pull scratch

The scratch image is perfect. Literally perfect! It is elegant, small and fast. It does not contain any bugs, security leaks, slow code or technical debt. And that is because it is basically empty. Except for a bit of metadata added by Docker. In fact, you could have created this scratch image yourself with this command as described in the Docker documentation:

tar cv --files-from /dev/null | docker import - scratch

 

So that is it, the smallest possible Docker image. End of blog post!

... or is there something more we can say about this? For example, how do you use the scratch base image? It turns out this brings some challenges of its own.

Creating content for the scratch image

What can we run on an empty base image? An executable without dependencies. Do you have executables without dependencies?

I used to write code in Python, Java and JavaScript. Each of these languages/platforms requires a runtime to be installed. Recently, I started looking into the Go (or GoLang if you prefer) platform, and it seems (spoiler alert) like Go binaries are statically linked. So I tried compiling a simple web server saying Hello World and running it within the scratch container. Here is the code for the Hello World web server:

package main

import (
	"fmt"
	"net/http"
)

func helloHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, "Hello World from Go in minimal Docker container")
}

func main() {
	http.HandleFunc("/", helloHandler)

	fmt.Println("Started, serving at 8080")
	err := http.ListenAndServe(":8080", nil)
	if err != nil {
		panic("ListenAndServe: " + err.Error())
	}
}

 

Obviously, I cannot compile my web server inside the scratch container, as there is no Go compiler in it. And as I am working on a Mac, I also cannot compile a Linux binary just like that. (Actually, it is possible to cross-compile Go sources for different platforms, but that is material for another blog post.)
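As a quick aside (and nothing the rest of this post depends on): with current Go toolchains, cross-compiling largely comes down to setting the target platform through environment variables before building, for example something like:

GOOS=linux GOARCH=amd64 go build github.com/adriaandejonge/helloworld

For now though, let's stay inside Docker.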

So I first need a Docker container with a Go compiler. Let's start simple:

docker run -ti google/golang /bin/bash

 

Inside this container, I can build the Go web server, which I have committed in a GitHub repository:

go get github.com/adriaandejonge/helloworld

 

The go get command is a variant of the go build command that allows fetching and building remote dependencies. You can start the resulting executable with:

$GOPATH/bin/helloworld

 

This works, but it is not what we want. We need the hello world web server to run inside a container based on the scratch image. So, in fact, we need a Dockerfile saying:

FROM scratch
ADD bin/helloworld /helloworld
CMD ["/helloworld"]

 

and then start that. Unfortunately, the way we started the google/golang container, there is no way to build this Dockerfile. So first, we need a way to access Docker from within the container.

Calling Docker from within Docker

When you use Docker, sooner or later you run into the need to control Docker from within Docker. There are multiple ways to accomplish this. You could use recursion and run Docker inside Docker, but that seems overly complex and again leads to large containers. You can also give the container access to the Docker server running on the host with a few additional command line options:

docker run -v /var/run/docker.sock:/var/run/docker.sock -v $(which docker):$(which docker) -ti google/golang /bin/bash

 

Before you continue, please rerun the Go compiler, as our previous compilation was lost when this new container was started:

go get github.com/adriaandejonge/helloworld

 

When starting the container, the -v flag creates a volume inside the Docker container and allows you to make a file from the host machine available inside it. /var/run/docker.sock is the Unix socket that provides access to the Docker server. The $(which docker) part is a clever way to provide the path of the docker executable inside the container without hardcoding it. However, be careful when you use this command on a Mac with boot2docker. If the docker executable is installed in a different location than inside boot2docker's virtual machine, this results in a mismatch: it is the executable inside the boot2docker virtual server that gets mounted into the container. So you may want to replace $(which docker) with the hardcoded path /usr/local/bin/docker. Similarly, if you run a different system, there is a chance that /var/run/docker.sock is in a different location and you need to adjust it accordingly.

Now you can use the Dockerfile inside the google/golang container in the $GOPATH directory, which points to /gopath in this example. Actually, I already checked this Dockerfile into GitHub. So you can copy it from the Go build directory to the desired location like this:

cp $GOPATH/src/github.com/adriaandejonge/helloworld/Dockerfile $GOPATH

 

You need to copy this as the compiled binary is now located in $GOPATH/bin and it is not possible to include files from parent directories when building a Dockerfile. So after copying, the next step is:

docker build -t adejonge/helloworld $GOPATH

 

And if all goes well, Docker responds with something like:

Successfully built 6ff3fd5a381d

 

Which allows you to run the container:

docker run -ti --name hellobroken adejonge/helloworld

 

But unfortunately, now Docker responds with:

2014/07/02 17:06:48 no such file or directory

 

So what is going on? We have a statically linked executable inside a scratch container. Did we make a mistake?

As it turns out, Go does not statically link all libraries by default. Under Linux, we can see the dynamically linked libraries of an executable with the ldd command:

ldd $GOPATH/bin/helloworld 

 

Which responds with:

linux-vdso.so.1 => (0x00007fff039fe000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f61df30f000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f61def84000)
/lib64/ld-linux-x86-64.so.2 (0x00007f61df530000)

 

So before we can run the Hello World webserver, we need to tell the Go compiler to actually do static linking.

Creating statically linked executables in Go

In order to create a statically linked executable, we need to tell Go to build without cgo, so that no dynamically linked C libraries are pulled in. The command to do so is:

CGO_ENABLED=0 go get -a -ldflags '-s' github.com/adriaandejonge/helloworld

 

Setting the CGO_ENABLED environment variable to 0 tells Go to build without cgo, so the standard library falls back to pure Go implementations that can be statically linked. The -a flag tells Go to rebuild all dependencies; otherwise you still end up with dynamically linked dependencies. And finally, the -ldflags '-s' flag is a nice extra: it reduces the file size of the resulting executable by roughly 50% by stripping debug information. You can use this flag regardless of whether cgo is enabled.

Just to be sure, rerun the ldd command.

ldd $GOPATH/bin/helloworld 

 

It should now respond with:

not a dynamic executable

 

You can also rerun the steps for creating the Docker container around the executable from scratch:

docker build -t adejonge/helloworld $GOPATH

 

And if all goes well, Docker responds with something like:

Successfully built 6ff3fd5a381d

 

Which allows you to run the container:

docker run -ti --name helloworld adejonge/helloworld

 

And this time it should respond with:

Started, serving at 8080

 

So far there have been many manual steps, with a lot of room for error. Let's exit the google/golang container and continue from the surrounding machine:

<Press Ctrl-C>
exit

 

You can check the existence or absence of containers and images with:

docker ps -a
docker images -a

 

And you can do some cleaning of Docker with:

docker rm -f helloworld
docker rmi -f adejonge/helloworld

 

Creating a Docker container that creates a Docker container

We can also record the steps we took so far in a Dockerfile and have Docker do the work for us:

FROM google/golang
RUN CGO_ENABLED=0 go get -a -ldflags '-s' github.com/adriaandejonge/helloworld
RUN cp /gopath/src/github.com/adriaandejonge/helloworld/Dockerfile /gopath
CMD docker build -t adejonge/helloworld /gopath

 

I checked this Dockerfile into a separate GitHub repository called adriaandejonge/hellobuild. It can be built with this command:

docker build -t adejonge/hellobuild github.com/adriaandejonge/hellobuild

 

Providing the  -t flag names the image as adejonge/hellobuild and implicitly tags it as latest. These names make it easier for you to remove the image later on. Next,  you can create a container from this image while providing the flags that you have seen earlier in this post:

docker run -v /var/run/docker.sock:/var/run/docker.sock -v $(which docker):$(which docker) -ti --name hellobuild adejonge/hellobuild

 

Providing the --name hellobuild flag makes it easier to remove the container after running. In fact, you can do so right away, because after running this command, you already created the adejonge/helloworld image:

docker rm -f hellobuild
docker rmi -f adejonge/hellobuild

 

And now you can start a new container named helloworld based on the adejonge/helloworld image as you have done before:

docker run -ti --name helloworld adejonge/helloworld

 

Because all these steps are run from the same command line, without opening a bash shell inside a Docker container, you can add these steps to a bash script and run it automatically. For your convenience, I have added these bash scripts to the hellobuild GitHub repository.
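As a rough sketch of what such a script looks like (the actual scripts in the hellobuild repository may differ), it simply chains together the commands used in this post:

#!/bin/bash
# Illustrative sketch only - the real scripts live in the hellobuild GitHub repository.
set -e

# Build the image that will in turn build the adejonge/helloworld image
docker build -t adejonge/hellobuild github.com/adriaandejonge/hellobuild

# Run it with access to the Docker socket and client so it can run docker build itself
docker run -v /var/run/docker.sock:/var/run/docker.sock \
  -v $(which docker):$(which docker) \
  -ti --name hellobuild adejonge/hellobuild

# Clean up the intermediate container and image
docker rm -f hellobuild
docker rmi -f adejonge/hellobuild

# Start the resulting minimal helloworld container
docker run -ti --name helloworld adejonge/helloworld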

Also, if you want to try the smallest possible Docker container running a Hello World web server without following all the steps described in this blog post, you can also use the pre-built image that I checked into the Docker Hub repository:

docker pull adejonge/helloworld

 

With docker images -a you can see that the size is 3.6MB. Of course, you can make it even smaller if you manage to create an executable that is smaller than the web server in Go that I wrote. In C or Assembly you may be able to do so. However, you can never make it smaller than the scratch image.

Stuff The Internet Says On Scalability For July 4th, 2014

Hey, it's HighScalability time:


Beauty is everywhere. Household dust magnified 22 million times.
  • Let's play a game of guess the company. They have: >100 billion searches per month; > 60 trillion known URLs; > 50 billion facts in knowledge graph; > 100 hours of video uploaded every minute; > 2 billion containers; > 6 trillion Cloud Datastore ops/month. Who is it? Why it's Google, of course. 
  • Billions of events every day: Twitter. One billion active users: Android.
  • Quotable quotes:
    • PeterGriffin: I don't know why the author called this "Multi-process architectures suck :(" when he really meant "I suck at multi-process architectures :("
    • @khrabrov: Experienced startup engineers are looking for a full-stack Business Guy to be CEO, COO, PM, marketer, account manager, HR, and receptionist.
    • @PatrickMcFadin: 30x perf over #hadoop by running #spark over #cassandra The crowd was stunned. 
    • @jcoglan: A programmer is someone who can simultaneously entertain the ideas that tight coupling is bad and fridges should be connected to the 'net
    • @BenedictEvans: Consumers spend more on apps (~$20bn run rate) than on recorded music ($17bn). 
    • @solarce: "You achieve nirvana when all failures are viewed as normal operations and not as apocalyptic events"
    • Rudiger Moller: Yup. As memory keeps getting cheaper, Java cannot profit except going off heap or use Azul Zing. Either improve concurrent GC or reduce the amount of references required to model data structures in Java.
    • @PatrickMcFadin: OH: "idompotency is better than beefalo"
  • I started listening to Songza about 6 weeks ago. Loved its emotional intelligence. And now I find Google went and acquired it. A coincidence? This is not a case of megalomania. It occurred to me that Google is in the perfect position to let some algorithms loose on its data to see if a service like Songza is gaining mind share. If you look at DNS access, G+, Gmail, Chrome, web trends, etc you have a pretty good proxy for actual usage data. In fact, your algorithms could just look at everything and identify acquisition targets by ranking what services are rising above the noise. And in double fact Google can probably estimate future growth trends better than Songza because they have historical data on many other services. 
  • Concurrency Improvements in HyperLevelDB. Taking single threaded code and making multithreaded is not for the faint of heart. Deadlocks await each new access pattern. By reducing time locks are held, using lock free data structures, and using fine grained locking HyperDex was able to reach 400K operations per second, better than LevelDB's 275K operations per second.
  • The Lambda Architecture has nothing to do with The Secret, in case you were wondering. To see why, Jay Kreps has an excellent article, Questioning the Lambda Architecture, based on his experiences at LinkedIn. The main objection is double processing, concluding: These days, my advice is to use a batch processing framework like MapReduce if you aren’t latency sensitive, and use a stream processing framework if you are, but not to try to do both at the same time unless you absolutely must. Great discussion in the comment section. For me it's as simple as never mix read and write streams. They have completely different purposes. More on Hacker News.
  • Videos from the Velocity Conference 2014 on YouTube.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so keep on going)...

Categories: Architecture

Dockerfiles as automated installation scripts

Xebia Blog - Thu, 07/03/2014 - 19:16

Dockerfiles are great and easily readable specifications for the installation and configuration of an application. They are terse, can be understood by anyone who understands UNIX commands, result in a testable product, and can easily be turned into an automated installation script using a little awk'ward magic - just in case you want to install the application in question the good old-fashioned way, without the Docker hassle :-)

In this case, we needed to experiment with the Codahale Metrics library and Zabbix. Instead of installing a complete Zabbix server, I googled for a Docker container and was pleased to find a ready-to-run Zabbix server configuration created by Bernardo Gomez Palacio. Unfortunately, the server stopped repeatedly after about 5 minutes due to the simplevisor's impression that it had been requested to stop. I could not figure out where this request was coming from, and as it was pretty persistent, I decided to install Zabbix on a virtual machine.

So I checked out the docker-zabbix GitHub project and found a ready-to-run Vagrant configuration to build the Zabbix Docker container itself (cool!). The Dockerfile contained easily readable instructions on how to install and configure Zabbix. But instead of copy-and-pasting the instructions to the command prompt, I cloned the project on the Vagrant box and created the following awk script in order to execute the instructions from the Dockerfile directly on the running system.

# Execute Dockerfile ADD instructions as copy commands on the local system
/^ADD/ {
    sub(/ADD/, "")
    cmd = "mkdir -p $(dirname " $2 ")"
    system(cmd)
    cmd = "cp " $0
    system(cmd)
}

# Execute Dockerfile RUN instructions directly as shell commands
/^RUN/ {
    sub(/RUN/, "")
    cmd = $0
    system(cmd)
}

After a few minutes, the machine was properly configured. I just needed to run the database initialisation script (/start.sh) and ensure that all the services were started on reboot.

 cd /etc/init.d
for i in zabbix* httpd mysqld snmp* ; do
     chkconfig $i on
     service $i start
done

Even if you do not use Docker in production, Dockerfiles are a great improvement in the way installation instructions are specified!

How architecture enables kick ass teams (1): replication considered harmful?

Xebia Blog - Thu, 07/03/2014 - 11:51

At Xebia we regularly have discussions about Agile Architecture. What is it? What does it take? How should you organise this? Is it technical or organisational? And many more questions… which I won’t be answering today. What I will do today is kick off a blog series covering subjects that are often part of these heated debates. In general, what we strive for with Agile Architecture is an architecture that enables the organisation to keep moving fast, without IT being a limiting factor for realising changes. As you read this series you’ll start noticing one theme coming back over and over again: autonomy. Sometimes we’ll be focussing on the architecture of systems, sometimes on the architecture of the organisation or teams, but autonomy is the overarching theme. And if you’re familiar with Conway's Law it should be no surprise that there is a strong correlation between team and system structure. Having a structure of teams that is completely different from your system landscape causes friction. We are convinced that striving for optimal team and system autonomy will lead to an organisation which is able to quickly adapt and respond to changes.

The first subject is replication of data. This is more a systems (landscape) issue and less of an organisational issue, and it is definitely not the only one; more posts will follow.

We all have to deal with situations where:

  • consumers of a data retrieval service (e.g. customer account details) require this service to be highly available, or
  • compute intensive analysis must be done using the data in a system, or
  • data owned by a system must be searched in a way that is not (efficiently) supported by that system

These situations all impact the autonomy of the system owning the data. Is the system able to provide its functionality at the required quality level, or do these external requirements lead to negative consequences for the quality of the service provided or its maintainability? Should these requirements be forced into the system, or is another approach more appropriate?

The examples above could all be solved by replicating the data into another system which is more suitable for meeting these requirements, but … replication of data is considered harmful by some. Is it really? Often-mentioned reasons not to replicate data are:

  • The replicated data will always be less accurate and timely than the original data
    True, and is this really a problem for the specific situation you’re dealing with? Sometimes you really need the latest version of a customer record, but in many situations it is no problem if the data is seconds, minutes or even hours old.
  • Business logic that interprets the data is implemented twice and needs to be maintained
    Yes, and you have to compare the costs of this against the benefits. As long as the benefits outweigh the costs, it is a good choice. You can even consider providing a library that is used in both systems.
  • System X is the authoritative source of the data and should be the only one that exposes it
    Agreed, and keeping that system as the authoritative source is good practice; it does not mean that there cannot be read-only access to the same (replicated) data in other systems.

As you can see it is never a black-and-white decision; you’ll have to make a balanced decision that weighs the benefits and costs of both alternatives. The autonomy gained, and the business benefits derived from it, can easily outweigh the extra development, hosting and maintenance costs of replicating data.

A few concrete examples from my own experience:

We had a situation where a CRM system instance owned data which was also required in a 24x7 emergency support process. The data was nicely exposed by a number of data retrieval services. At that organisation the CRM system deployment was such that most components were redundant, but during updates the system as a whole would still be down for several hours, which was not acceptable given that the data was required in a 24x7 emergency support process. Making the CRM system deployment upgradable without downtime was not possible, or only at significant cost.
In this situation, the costs of replicating the CRM system database to another datacenter using standard database features, and having the data retrieval services access either that replicated database or the original database (as fallback), were much lower than trying to make the CRM system itself highly available. The replicated database would remain accessible even while the CRM system got upgraded. Yes, we’re bypassing the CRM system business logic for interpreting the data, but in this situation the logic was so simple that the costs of reimplementing and maintaining it in a new lightweight service (separate from the CRM system) were negligible.

Another example is from a telecom provider that uses a chain of fulfilment systems in which it registers all network products sold to its customers (e.g. internet access, telephony, tv). Each product instance depends on instances registered in another system, and if you drill down deep enough you’ll reach the physical network hardware ports on which it runs. The systems that registered all products used a relational model, which was okay for registration. However, questions like “if this product instance breaks, which customers are impacted” were impossible to answer without overheating CPUs in those systems. By publishing all changes in the registrations to a separate system we could model the whole inventory of services as a network graph and easily do analysis on it without impacting the fulfilment systems. The fact that the data would be (at most) a few seconds old was absolutely no problem.

And a last example is that sometimes you want to do a full (phonetic) text search through a subset of your domain model. Relational data models quickly get you into an unmaintainable situation: your SQL queries will require many tables, lots of inefficient “LIKE ‘%gold%’” clauses, and developers will have a hard time understanding what a query was actually intended to do. Replicating the data to a search engine makes searching far easier and provides more possibilities for searches that are hard to realise in a relational database.

As you can see, replication of data can increase the autonomy of systems and teams, and thereby make your system or system landscape and organisation more agile. That is, you can realise new functionality faster and make it available to your users quicker, because the coupling with other systems or teams is reduced.

In a next blog we'll discuss another subject that impacts team or system autonomy.

Why does data need to have sex?

Data needs the ability to combine with other data in new ways to reach maximum value. So data needs to have the equivalent of sex.

That's why I used sex in the title of my previous article, Data Doesn't Need To Be Free, But It Does Need To Have Sex. So it wasn't some sort of click-bait title as some have suggested.

Sex is nature's way of bringing different data sets together, that is our genome, and creating something new that has a chance to survive and thrive in changing environments.

Currently data is cloistered behind Walled Gardens and thus has far less value than it could have. How do we coax data from behind these walls? With money. So that's where the bit about "data doesn't need to be free" comes from. How do we make money? Through markets. What do we have as a product to bring to market? Data. What do services need to keep producing data as a product? Money.

So it's a virtuous circle. Services generate data from their relationship with users. That data can be sold for the money services need to make a profit. Profit keeps the service that users like running. A running service  produces even more data to continue the cycle.

Why do we even care about data having a sex?

Historically one lens we can use to look at the world is to see everything in terms of how resources have been exploited over the ages. We can see the entire human diaspora as largely being determined by the search for and exploitation of different resource reservoirs.

We live near the sea for trade and access to fisheries. Early on we lived next to rivers for water, for food, for transportation, and later for power. People move to where there is lumber to harvest, gold to mine, coal to mine, iron to mine, land to grow food, steel to process, and so on. Then we build roads, rail roads, canals and ports to connect resource reservoirs to consumers.

In Nova Scotia, where I've been on vacation, a common pattern was for England and France to fight each other over land and resources. In the process they would build forts, import soldiers, build infrastructure, and make it relatively safe to trade. These forts became towns which then became economic hubs. We see these places as large cities now, like Halifax Nova Scotia, but it's the resources that came first.

When you visit coves along the coast of Nova Scotia they may tell you with interpretive signage, spaced out along a boardwalk, about the boom and bust cycles of different fish stocks as they were discovered, exploited, and eventually fished out.

In the early days in Nova Scotia great fortunes were made on cod. Then when cod was all fished out, other resource reservoirs like sardines, halibut, and lobster were exploited. Atlantic salmon was over fished. Production moved to the Pacific where salmon was once again over fished. Now a big product is scallops, and what were once trash fish, like redfish, are now the next big thing because that's what's left.

During these cycles great fortunes were made. But when a resource runs out people move on and find another. And when that runs out people move on and keep moving on until they find a place to make a living.

Places associated with old used up resources often just fade away. Ghosts of the original economic energy that created them. As a tourist I've noticed what is mined now as a resource is the history of the people and places that were created in the process of exploiting previous resources. We call it tourism.

Data is a resource reservoir like all the other resource reservoirs we've talked about, but data is not being treated like a resource. It's as if forts and boats and fishermen all congregated to catch cod, but then didn't sell the cod on an open market. If that were the case limited wealth would have been generated, but because all these goods went to market as part of a vast value chain, a decent living was made by a great many people.

If we can see data as a resource reservoir, as natural resources run out, we'll be able to switch to unnatural resources to continue the great cycle of resource exploitation.

Will this work? I don't know. It's just a thought that seems worth exploring.

Categories: Architecture

How combined Lean- and Agile practices will change the world as we know it

Xebia Blog - Tue, 07/01/2014 - 08:50

You might have attended our presentation about eXtreme Manufacturing this month, or Nalden's keynote last week at XebiCon 2014. There are a few epic takeaways and additions I would like to share with you in this blog post.

Epic TakeAway #1: The Learn, Unlearn and Relearn Cycle

As Nalden expressed in his inspiring keynote, one of the major things that makes him successful is being able to Learn, Unlearn and Relearn time and time again. In my opinion, this will be the key ability for every successful company in the near future. In fact, this is how nature evolves: in the end, only the species that are able to adapt to changing circumstances will survive and evolve. This mechanism is why, for example, most startups fail, but those that survive can be extremely disruptive for non-agile organizations. The best example of this is of course WhatsApp, beating up the telco industry by almost destroying their whole business model in only a few months. Learn more about disruptive innovation from one of my personal heroes, Harvard Professor Clayton Christensen.

Epic TakeAway #2: Unlearning Waterfall, Relearning Lean & Agile

Globally, Waterfall is still the dominant method in companies and universities. Waterfall has its origins more than 40 years ago, but times have changed. A lot. A new, successful and disruptive product can now be there in a matter of days instead of (many) years. Finally, things are changing. For example, the US Department of Defense has recently embraced Lean and Agile as mandatory practices, especially Scrum. Schools and universities are also increasingly adopting the Agile way of working. More on that later in this blog post.

Epic TakeAway #3: Combined Lean and Agile practices = XM

Lean practices arose in Japan in the 1980s, mainly in the manufacturing industry, with Toyota being the frontrunner. Agile practices like Scrum were first introduced in the 1990s by Ken Schwaber and Jeff Sutherland; these practices were mainly applied in the IT industry. Until recently, the manufacturing and IT worlds didn't really join forces to combine Lean and Agile practices. The WikiSpeed initiative of Joe Justice proved that combining these practices results in a hyper-productive environment, in which a 100 mile/gallon road-legal sports car could be developed in less than 3 months. Out of this success eXtreme Manufacturing (XM) arose: finally, a powerful combination of best practices from the manufacturing and IT worlds came together.

Epic TakeAway #4: Agile Mindset & Education

As Sir Ken Robinson and Dan Pink have described in their famous TED talks, the way most people are educated and rewarded is not suitable anymore for modern times and even conflicts with the way we are born. We learn by "failing", not by preventing it. Failing in its essence should stimulate creativity to do things better next time, not be punished. In the long run, failing (read: learning!) has more added value than short-term success, for example by chasing milestones blindly. EduScrum in the Netherlands stimulates schools and universities to apply Scrum in their daily classes in order to stimulate creativity, happiness, self-reliance and talent. The results at the schools joining this initiative are spectacular: happy students, fewer dropouts and significantly higher grades. For a prestigious project at the Delft University, Forze, the development of a hydrogen race car, the students are currently being trained and coached to apply Agile and Lean practices. These results are also more than promising: the Forze team is happier, more productive and more able to learn faster and better from setbacks. Actually, they are taking the first steps towards being anti-fragile. At the intercession of the Forze team members themselves, the current support by agile (Xebia) coaches is now planned to be extended to the flagship of the Delft University: the NUON solar team.

The Final Epic TakeAway

In my opinion, we have reached a tipping point in the way goals should be achieved. Organizations are massively abandoning Waterfall and embracing Agile practices, like Scrum. Adding Lean practices, as Joe Justice did in his WikiSpeed project, makes Agile and Lean extremely powerful. Yes, this will even make the world a much better place. We cannot prevent natural disasters with this, but we can be anti-fragile. We cannot prevent every epidemic, but we can respond in an XM fashion by developing a vaccine in days instead of years. This brings me, finally, to the missing statement of the current Agile Manifesto: we should Unlearn and Relearn before we Judge. Dare to dream like a little kid again. Unlearn your skepticism. Companies like Boeing, Lockheed Martin and John Deere already did. Adopting XM sped up their velocity in some cases by more than 7 times.

Complexity is Simple

Software Architecture Zen - Pete Cripp - Mon, 06/30/2014 - 20:18
I was taken with this cartoon and the comments put up by Hugh Macleod last week over at his gapingvoid.com blog so I hope he doesn’t mind me reproducing it here.

Read more...
Categories: Architecture

Data Doesn't Need to Be Free, But it Does Need to Have Sex

How do we pay for the services we want to create and use? That is the question. Systems like Twitter, Instagram, Pinterest and all the other services you love are not cheap to build at scale. "Grow now and figure out your business model later", as the VC funding disappears like hope, is not a sustainable strategy. If we want new services that stick around, we are going to have to figure out a way for them to make money.

I’m going to argue here that a business model that could make money for software companies, while benefiting users, is creating an open market for data. Yes, your data. For sale. On an open market. For anyone to buy. Privacy is dead. Isn’t it time we leverage the death of privacy for our own gain?

The idea is to create an ecosystem around the production, consumption, and exploitation of data so that all the players can get the energy they need to live and prosper.

The proposed model:

Categories: Architecture

Keeping a journal

Gridshore - Sun, 06/29/2014 - 23:34

Today I was reading the first part of a book I got as a gift from one of my customers. The book is called Show Your Work by Austin Kleon (Show Your Work! @ Amazon). The whole idea behind this book is that you must be open and share what you learn and the steps you took to learn it.

I think this fits me like a glove, but I can be more expressive. Therefore I have decided to do things differently. I want to start by writing smaller pieces about the things I want to do that day, or what I accomplished that day, and give some excerpts of things I am working on. Not real blog posts or tutorials, but more notes that I share with you. Since it is a Sunday, I only want to share the book I am reading.


The post Keeping a journal appeared first on Gridshore.

Categories: Architecture, Programming

Diagramming Spring MVC webapps

Coding the Architecture - Simon Brown - Sun, 06/29/2014 - 09:54

Following on from my previous post (Software architecture as code) where I demonstrated how to create a software architecture model as code, I decided to throw together a quick implementation of a Spring component finder that could be used to (mostly) automatically create a model of a Spring MVC web application. Spring has a bunch of annotations (e.g. @Controller, @Component, @Service and @Repository) and these can often be used to signify the major building blocks of a web application. To illustrate this, I took the Spring PetClinic application and produced some diagrams for it. First is a context diagram.

A context diagram for the Spring PetClinic application

Next up are the containers, which in this case are just a web server (e.g. Apache Tomcat) and a database (HSQLDB by default).

A container diagram for the Spring PetClinic application

And finally we have a diagram showing the components that make up the web application. These, and their dependencies, were found by scanning the compiled version of the application (I cloned the project from GitHub and ran the Maven build).

A component diagram for the Spring PetClinic web application

Here is the code that I used to generate the model behind the diagrams.
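(The code itself is embedded in the original post rather than reproduced here. Just to give a flavour of the general approach - this is an illustrative sketch, not Simon's actual implementation - a component finder built on Spring's own classpath scanning could look something like the following; the base package is the PetClinic default, everything else is an assumption.)

import java.util.Set;

import org.springframework.beans.factory.config.BeanDefinition;
import org.springframework.context.annotation.ClassPathScanningCandidateComponentProvider;
import org.springframework.core.type.filter.AnnotationTypeFilter;
import org.springframework.stereotype.Controller;
import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;

public class SpringComponentFinder {

    public static void main(String[] args) {
        // Scan the compiled classes for the Spring stereotype annotations.
        ClassPathScanningCandidateComponentProvider scanner =
                new ClassPathScanningCandidateComponentProvider(false);
        scanner.addIncludeFilter(new AnnotationTypeFilter(Controller.class));
        scanner.addIncludeFilter(new AnnotationTypeFilter(Service.class));
        scanner.addIncludeFilter(new AnnotationTypeFilter(Repository.class));

        Set<BeanDefinition> candidates =
                scanner.findCandidateComponents("org.springframework.samples.petclinic");

        // Each candidate becomes a "component" in the model; the dependencies between
        // them still need to be derived separately (e.g. from the bytecode).
        for (BeanDefinition candidate : candidates) {
            System.out.println(candidate.getBeanClassName());
        }
    }
}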

The resulting JSON representing the model was then copy-pasted across into my simple (and very much in progress) diagramming tool. Admittedly the diagrams are lacking some details (i.e. component responsibilities and arrow annotations, although those can be fixed), but this approach proves you can expend very little effort to get something that is relatively useful. As I've said before, it's all about getting the abstractions right.

Categories: Architecture

Data Science is the Art of Asking Better Questions

I heard a colleague make a great comment today …

“Data science is the art of asking better questions.

It’s not the art of finding a solution … the data keeps evolving.”

Categories: Architecture, Programming

Mocking a REST backend for your AngularJS / Grunt web application

Xebia Blog - Thu, 06/26/2014 - 17:15

Anyone who has ever developed a web application will know that a lot of time is spent in a browser checking that everything works well and looks good. And you want to make sure it looks good in all possible situations. For a single-page application, built with a framework such as AngularJS, that gets all its data from a REST backend, this means you should verify your front-end against different responses from your backend. For a small application with primarily GET requests to display data, you might get away with testing against your real (development) backend. But for large and complex applications, you need to mock your backend.

In this post I'll go in to detail how you can solve this by mocking GET requests for an AngularJS web application that's built using Grunt.

In our current project, we're building a new mobile front-end for an existing web application. Very convenient, since the backend already exists with all the REST services that we need. An even bigger convenience is that the team that built the existing web application also built an entire mock implementation of the backend. This mock implementation will give standard responses for every possible request. Great for our Protractor end-to-end tests! (Perhaps another post about that another day.) But this mock implementation is not so great for the non-standard scenarios. Think of error messages, incomplete data, large numbers or a strange currency. How can we make sure our UI displays these kinds of cases correctly? We usually cover all these cases in our unit tests, but sometimes you just want to see it right in front of you as well. So we started building a simple solution right inside our Grunt configuration.

To make this solution work, we need to make sure that all our REST requests go through the Grunt web server layer. Our web application is served by Grunt on localhost port 9000. This is the standard configuration that Yeoman generates (you really should use Yeoman to scaffold your project). Our development backend is also running on localhost, but on port 5000. In our web application we want to make all REST calls using the `/api` path so we need to rewrite all requests to http://localhost:9000/api to our backend: http://localhost:5000/api. We can do this by adding middleware in the connect:livereload configuration of our Gruntfile.

livereload: {
  options: {
    open: true,
    middleware: function (connect, options) {
      return [
        require('connect-modrewrite')(['^/api http://localhost:5000/api [P L]']),

        /* The lines below are generated by Yeoman */
        connect.static('.tmp'),
        connect().use(
          '/bower_components',
          connect.static('./bower_components')
        ),
        connect.static(appConfig.app)
      ];
    }
  }
},

Do the same for the connect:test section as well.

Since we're using 'connect-modrewrite' here, we'll have to add this to our project:

npm install connect-modrewrite --save-dev

With this configuration every request starting with http://localhost:9000/api will be passed on to http://localhost:5000/api, so we can just use /api in our AngularJS application. Now that we have this working, we can write some custom middleware to mock some of our requests.

Let's say we have a GET request /api/user returning some JSON data:

{"id": 1, "name":"Bob"}

Now we'd like to see what happens with our application in case the name is missing:

{"id": 1}

It would be nice if we could send a simple POST request to change the response of all subsequent calls. Something like this:

curl -X POST -d '{"id": 1}' http://localhost:9000/mock/api/user

We prefixed the path that we want to mock with /mock in order to know when we should start mocking something. Let's see how we can implement this. In the same Gruntfile that contains our middleware configuration we add a new function that will help us mock our requests.

var mocks = [];
function captureMock() {
  return function (req, res, next) {

    // match on POST requests starting with /mock
    if (req.method === 'POST' && req.url.indexOf('/mock') === 0) {

      // everything after /mock is the path that we need to mock
      var path = req.url.substring(5);

      var body = '';
      req.on('data', function (data) {
        body += data;
      });
      req.on('end', function () {

        mocks[path] = body;

        res.writeHead(200);
        res.end();
      });
    } else {
      next();
    }
  };
}

And we need to add the above function to our middleware configuration:

middleware: function (connect, options) {
  return [
    captureMock(),
    require('connect-modrewrite')(['^/api http://localhost:5000/api [P L]']),

    connect.static('.tmp'),
    connect().use(
      '/bower_components',
      connect.static('./bower_components')
    ),
    connect.static(appConfig.app)
  ];
}

Our function will be called for each incoming request. It treats each request starting with /mock as a request to define a mock response, and stores the body in the mocks variable with the path as key. So if we execute our curl POST request, we end up with something like this in our mocks array:

mocks['/api/user'] = '{"id": 1}';

Next we need to actually return this data for requests to http://localhost:9000/api/user. Let's make a new function for that.

function mock() {
  return function (req, res, next) {
    var mockedResponse = mocks[req.url];
    if (mockedResponse) {
      res.writeHead(200);
      res.write(mockedResponse);
      res.end();
    } else {
      next();
    }
  };
}

And also add it to our middleware.

  ...
  captureMock(),
  mock(),
  require('connect-modrewrite')(['^/api http://localhost:5000/api [P L]']),
  ...

Great, we now have a simple mocking solution in just a few lines of code that allows us to send simple POST requests to our server with the requests we want to mock. However, it can only send status codes of 200 and it cannot differentiate between different HTTP methods like GET, PUT, POST and DELETE. Let's change our functions a bit to support that functionality as well.

 var mocks = {
  GET: {},
  PUT: {},
  POST: {},
  PATCH: {},
  DELETE: {}
};

function captureMock() {
  return function (req, res, next) {
    if (req.method === 'POST' && req.url.indexOf('/mock') === 0) {
      var path = req.url.substring(5);

      var body = '';
      req.on('data', function (data) {
        body += data;
      });
      req.on('end', function () {

        var headers = {
          'Content-Type': req.headers['content-type']
        };
        for (var key in req.headers) {
          if (req.headers.hasOwnProperty(key)) {
            if (key.indexOf('mock-header-') === 0) {
              headers[key.substring(12)] = req.headers[key];
            }
          }
        }

        mocks[req.headers['mock-method'] || 'GET'][path] = {
          body: body,
          responseCode: req.headers['mock-response'] || 200,
          headers: headers
        };

        res.writeHead(200);
        res.end();
      });
    } else {
      next();
    }
  };
};

function mock() {
  return function (req, res, next) {
    var mockedResponse = mocks[req.method][req.url];
    if (mockedResponse) {
      res.writeHead(mockedResponse.responseCode, mockedResponse.headers);
      res.write(mockedResponse.body);
      res.end();
    } else {
      next();
    }
  };
}

We can now create more advanced mocks:

curl -X POST \
    -H "mock-method: DELETE" \
    -H "mock-response: 403" \
    -H "Content-type: application/json" \
    -H "mock-header-Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT" \
    -d '{"error": "Not authorized"}' http://localhost:9000/mock/api/user

curl -D - -X DELETE http://localhost:9000/api/user
HTTP/1.1 403 Forbidden
Content-Type: application/json
last-modified: Tue, 15 Nov 1994 12:45:26 GMT
Date: Wed, 18 Jun 2014 13:39:30 GMT
Connection: keep-alive
Transfer-Encoding: chunked

{"error": "Not authorized"}

Since we thought this would be useful for other developers, we decided to make all this available as an open-source library on GitHub and NPM.

To add this to your project, just install with npm:

npm install mock-rest-request --save-dev

And of course add it to your middleware configuration:

middleware: function (connect, options) {
  var mockRequests = require('mock-rest-request');
  return [
    mockRequests(),
    
    connect.static('.tmp'),
    connect().use(
      '/bower_components',
      connect.static('./bower_components')
    ),
    connect.static(appConfig.app)
  ];
}

The New Competitive Landscape

"All men can see these tactics whereby I conquer, but what none can see is the strategy out of which victory is evolved." -- Sun Tzu

If it feels like strategy cycles are shrinking, they are.

If it feels like competition is even more intense, it is.

If it feels like you are balancing between competing in the world and collaborating with the world, you are.

In the book, The Future of Management, Gary Hamel and Bill Breen share a great depiction of this new world of competition and the emerging business landscape.

Strategy Cycles are Shrinking

Strategy cycles are shrinking and innovation is the only effective response.

Via The Future of Management:

“In a world where strategy life cycles are shrinking, innovation is the only way a company can renew its lease on success.  It's also the only way it can survive in a world of bare-knuckle competition.”

Fortifications are Collapsing

What previously kept people out of the game, no longer works.

Via The Future of Management:

“In decades past, many companies were insulated from the fierce winds of Schumpeterian competition.  Regulatory barriers, patent protection, distribution monopolies, disempowered customers, proprietary standards, scale advantages, import protection, and capital hurdles were bulwarks that protected industry incumbents from the margin-crushing impact of Darwinian competition.  Today, many of the fortifications are collapsing.”

Upstarts No Longer Have to Build a Global Infrastructure to Reach a Worldwide Market

Any startup can reach the world, without having to build their own massive data center to do so.

Via The Future of Management:

“Deregulation and trade liberalization are reducing the barriers to entry in industries as diverse as banking, air transport, and telecommunications.  The power of the Web means upstarts no longer have to build a global infrastructure to reach a worldwide market.  This has allowed companies like Google, eBay, and My Space to scale their businesses freakishly fast.” 

The Disintegration of Large Companies and New Entrants Start Strong

There are global resource pools of top talent available to startups.

Via The Future of Management:

“The disintegration of large companies, via deverticalization and outsourcing has also helped new entrants.  In turning out more and more of their activities to third-party contractors, incumbents have created thousands of 'arms suppliers' that are willing to sell their services to anyone.  By tapping into this global supplier base of designers, brand consultants, and contract manufacturers, new entrants can emerge from the womb nearly full-grown.” 

Ultra-Low-Cost Competition and Less Ignorant Consumers

With smarter consumers and ultra-low-cost competition, it’s tough to compete.

Via The Future of Management:

“Incumbents must also contend with a growing horde of ultra-low-cost competitors - companies like Huawei, the Chinese telecom equipment maker that pays its engineers a starting salary of just $8,500 per year.  Not all cut-price competition comes from China and India.  Ikea, Zara, Ryanair, and AirAsia are just a few of the companies that have radically reinvented industry cost structures.  Web-empowered customers are also hammering down margins.  Before the Internet, most consumers couldn't be sure whether they were getting the best deal on their home mortgage, credit card debt, or auto loan.  This lack of enlightenment buttressed margins.  But consumers are becoming less ignorant by the day.  One U.K. Web site encourages customers to enter the details of their most-used credit cards, including current balances, and then shows them exactly how much they will save by switching to a card with better payment terms.  In addition, the Internet is zeroing-out transaction costs.  The commissions earned by market makers of all kinds -- dealers, brokers, and agents -- are falling off a cliff, or soon will be.”

Distribution Monopolies are Under Attack

You can build your own fan base and reach the world.

Via The Future of Management:

“Distribution monopolies -- another source of friction -- are under attack.  Unlike the publishers of newspapers and magazines, bloggers don't need a physical distribution network to reach their readers.  Similarly, new bands don't have to kiss up to record company reps when they can build a fan base via social networking sites like MySpace.”

Collapsing Entry Barriers and Customer Power Squeeze Margins

Customers have a lot more choice and power now.

Via The Future of Management:

“Collapsing entry barriers, hyper efficient competitors, customer power -- these forces will be squeezing margins for years to come.  In this harsh new world, every company will be faced with a stark choice: either set the fires of innovation ablaze, or be ready to scrape out a mean existence in a world where seabed labor costs are the only difference between making money and going bust.”

What’s the solution?

Innovation.

Innovation is the way to play, and it’s the way to stay in the game.

Innovation is how you reinvent your success, reimagine a new future, and change what you're capable of, to compete more effectively in today’s ever-changing world.

You Might Also Like

4 Stages of Market Maturity

Brand is the Ultimate Differentiator

High-Leverage Strategies for Innovation

If You Can Differentiate, You Have a Competitive Monopoly

Short-Burst Work

Categories: Architecture, Programming

The Secret of Scaling: You Can't Linearly Scale Effort with Capacity

The title is a paraphrase of something Raymond Blum, who leads a team of Site Reliability Engineers at Google, said in his talk How Google Backs Up the Internet. I thought it a powerful enough idea that it should be pulled out on its own:

Mr. Blum explained common backup strategies don’t work for Google for a very googly sounding reason: typically they scale effort with capacity.

If backing up twice as much data requires twice as much stuff to do it, where stuff is time, energy, space, etc., it won’t work; it doesn’t scale.

You have to find efficiencies so that capacity can scale faster than the effort needed to support that capacity.

A different plan is needed when making the jump from backing up one exabyte to backing up two exabytes.

When you hear the idea of not scaling effort with capacity, it sounds so obvious that it doesn't warrant much further thought. But it's actually a profound notion, worthy of better treatment than I'm giving it here.
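
As a toy illustration of the difference (a sketch of mine, not something from the talk), compare an effort model that grows linearly with capacity against one where automation keeps the residual human effort growing far more slowly:

// Toy sketch: effort that scales linearly with capacity vs. effort that grows
// much more slowly because routine work (scheduling, retries, verification)
// is automated. The functions are illustrative assumptions, not Google's model.
public class BackupEffort {

    // One unit of manual work per unit of capacity: doubling the data doubles the work.
    static double linearEffort(double capacityUnits) {
        return capacityUnits;
    }

    // Humans only handle the rare cases automation can't, so effort grows roughly
    // with the logarithm of capacity rather than with capacity itself.
    static double sublinearEffort(double capacityUnits) {
        return Math.log(capacityUnits) / Math.log(2);
    }

    public static void main(String[] args) {
        for (double capacity : new double[] { 1_000, 2_000, 1_000_000, 2_000_000 }) {
            System.out.printf("capacity=%,.0f  linear=%,.0f  sublinear=%.1f%n",
                    capacity, linearEffort(capacity), sublinearEffort(capacity));
        }
    }
}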

Categories: Architecture

How Agile accelerates your business

Xebia Blog - Wed, 06/25/2014 - 10:11

This drawing explains how agility accelerates your business. It is free to use and distribute. Should you have any questions regarding the subjects mentioned, feel free to get in touch.

Software architecture as code

Coding the Architecture - Simon Brown - Tue, 06/24/2014 - 21:22

If you've been following the blog, you will have seen a couple of posts recently about the alignment of software architecture and code. Software architecture vs code talks about the typical gap between how we think about the software architecture vs the code that we write, while An architecturally-evident coding style shows an example of how to ensure that the code does reflect those architectural concepts. The basic summary of the story so far is that things get much easier to understand if your architectural ideas map simply and explicitly into the code.

Regular readers will also know that I'm a big fan of using diagrams to visualise and communicate the architecture of a software system, and this "big picture" view of the world is often hard to see from the thousands of lines of code that make up our software systems. One of the things that I teach people during my sketching workshops is how to sketch out a software system using a small number of simple diagrams, each at a different level of abstraction. This is based upon my C4 model, which you can find an introduction to at Simple sketches for diagramming your software architecture. The feedback from people using this model has been great, and many have a follow-up question of "what tooling would you recommend?". My answer has typically been "Visio or OmniGraffle", but it's obvious that there's an opportunity here.

Representing the software architecture model in code

I've had a lot of different ideas over the past few months for how to create what is essentially a lightweight modelling tool, and for some reason all of these ideas came together last week while I was at the GOTO Amsterdam conference. I'm not sure why, but I had a number of conversations that inspired me in different ways, so I skipped one of the talks to throw some code together and test out some ideas. This is basically what I came up with...
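
The listing itself appears in the original post as an image. As a purely illustrative, self-contained sketch of the general idea (the tiny classes below are stand-ins written for this example, not the actual code or API), a model-as-code definition might look something like this:

import java.util.ArrayList;
import java.util.List;

// Purely illustrative stand-ins for a "software architecture model as code",
// not the classes from the original post.
public class ArchitectureModelSketch {

    static class Element {
        final String name;
        final String description;
        final List<String> relationships = new ArrayList<>();

        Element(String name, String description) {
            this.name = name;
            this.description = description;
        }

        void uses(Element destination, String description) {
            relationships.add(name + " -> " + destination.name + " : " + description);
        }
    }

    static class Person extends Element {
        Person(String name, String description) {
            super(name, description);
        }
    }

    static class Container extends Element {
        final String technology;

        Container(String name, String description, String technology) {
            super(name, description);
            this.technology = technology;
        }
    }

    static class SoftwareSystem extends Element {
        final List<Container> containers = new ArrayList<>();

        SoftwareSystem(String name, String description) {
            super(name, description);
        }

        Container addContainer(String name, String description, String technology) {
            Container container = new Container(name, description, technology);
            containers.add(container);
            return container;
        }
    }

    public static void main(String[] args) {
        // Context level: the software system plus the people and systems around it
        SoftwareSystem techTribes = new SoftwareSystem("techtribes.je", "Tech community portal");
        Person anonymousUser = new Person("Anonymous User", "Anybody on the web");
        anonymousUser.uses(techTribes, "Views people, tribes, content, events and jobs");

        // Container level: the separately deployable/runnable pieces inside the system
        Container webApplication = techTribes.addContainer("Web Application", "Serves the web UI", "Apache Tomcat");
        Container database = techTribes.addContainer("Relational Database", "Stores people and tribes", "MySQL");
        webApplication.uses(database, "Reads from and writes to (JDBC)");

        // Because the model is plain code, it can be versioned, constrained and exported
        anonymousUser.relationships.forEach(System.out::println);
        webApplication.relationships.forEach(System.out::println);
    }
}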

It's a description of the context and container levels of my C4 model for the techtribes.je system. Hopefully it doesn't need too much explanation if you're familiar with the model, although there are some ways in which the code can be made simpler and more fluent. Since this is code though, we can easily constrain the model and version it. This approach works well for the high-level architectural concepts because there are very few of them, plus it's hard to extract this information from the code. But I don't want to start crafting up a large amount of code to describe the components that reside in each container, particularly as there are potentially lots of them and I'm unsure of the exact relationships between them.

Scanning the codebase for components

If your code does reflect your architecture (i.e. you're using an architecturally-evident coding style), the obvious solution is to just scan the codebase for those components, and use those to automatically populate the model. How do we signify what a "component" is? In Java, we can use annotations...
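
As a minimal sketch of that approach (the annotation name and attributes are assumptions for illustration, not necessarily the ones used in the post), a runtime-retained marker annotation is all a classpath scanner needs:

// Component.java -- a hypothetical marker annotation for architecturally
// significant components (name and attributes are illustrative assumptions).
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME) // retained at runtime so a classpath scanner can see it
public @interface Component {
    String description() default "";
    String technology() default "";
}

// TweetComponent.java -- annotating a class makes its architectural role explicit in the code.
@Component(description = "Provides access to recent tweets", technology = "Spring")
public class TweetComponent {
    // ...
}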

Identifying those components is then a matter of scanning the source or the compiled bytecode. I've played around with this idea on and off for a few months, using a combination of Java annotations along with annotation processors and libraries including Scannotation, Javassist and JDepend. The Reflections library on Google Code makes this easy to do, and now I have a simple Java program that looks for my component annotation on classes in the classpath and automatically adds those to the model. As for the dependencies between components, again this is fairly straightforward to do with Reflections. I have a bunch of other annotations too, for example to represent dependencies between a component and a container or software system, but the principle is still the same - the architecturally significant elements and their dependencies can mostly be embedded in the code.
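
For example, a minimal scan with the Reflections library might look like this; the base package is a placeholder and @Component is the illustrative annotation sketched above:

import java.util.Set;
import org.reflections.Reflections;

// Minimal sketch of classpath scanning with the open source Reflections library.
public class ComponentScanner {
    public static void main(String[] args) {
        Reflections reflections = new Reflections("je.techtribes"); // hypothetical base package
        Set<Class<?>> componentTypes = reflections.getTypesAnnotatedWith(Component.class);
        for (Class<?> type : componentTypes) {
            Component metadata = type.getAnnotation(Component.class);
            System.out.println(type.getSimpleName() + " - " + metadata.description());
        }
    }
}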

Creating some views

The model itself is useful, but ideally I want to look at that model from different angles, much like the diagrams that I teach people to draw when they attend my sketching workshop. After a little thought about what this means and what each view is constrained to show, I created a simple domain model to represent the context, container and component views...
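
As a rough sketch of the shape such a domain model might take (the class names are assumptions, not the ones from the post), each view type simply constrains the level of abstraction it is allowed to show:

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: each view is restricted to a single level of abstraction.
abstract class View {
    final String title;
    final List<String> elements = new ArrayList<>(); // names of the elements included in the view

    View(String title) {
        this.title = title;
    }
}

// Shows a software system plus the people and other systems it interacts with.
class ContextView extends View {
    ContextView(String softwareSystemName) {
        super("Context: " + softwareSystemName);
    }
}

// Zooms into one software system to show its containers (web apps, databases, etc.).
class ContainerView extends View {
    ContainerView(String softwareSystemName) {
        super("Containers: " + softwareSystemName);
    }
}

// Zooms into one container to show the components inside it.
class ComponentView extends View {
    ComponentView(String containerName) {
        super("Components: " + containerName);
    }
}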

Again, this is all in code so it's quick to create, versionable and very customisable.

Exporting the model

Now that I have a model of my software system and a number of views that I'd like to see, I could do with drawing some pictures. I could create a diagramming tool in Java that reads the model directly, but perhaps a better approach is to serialize the object model out to an external format so that other tools can use it. And that's what I did, courtesy of the Jackson library. The resulting JSON file is over 600 lines long (you can see it here), but don't forget most of this has been generated automatically by Java code scanning for components and their dependencies.
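
The serialization step itself is close to a one-liner with Jackson. A minimal sketch, assuming "model" is whatever object graph holds the elements, relationships and views:

import com.fasterxml.jackson.databind.ObjectMapper;

// Minimal sketch of exporting the model as JSON with Jackson, so that other
// tools (e.g. a browser-based renderer) can consume it.
public class ModelExporter {
    public static String toJson(Object model) throws Exception {
        ObjectMapper objectMapper = new ObjectMapper();
        return objectMapper.writerWithDefaultPrettyPrinter().writeValueAsString(model);
    }
}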

Visualising the views

The last question is how to visualise the information contained in the model and there are a number of ways to do this. I'd really like somebody to build a Google Maps or Prezi-style diagramming tool where you can pinch-zoom in and out to see different views of the model, but my UI skills leave something to be desired in that area. For the meantime, I've thrown together a simple diagramming tool using HTML 5, CSS and JavaScript that takes a JSON string and visualises the views contained within it. My vision here is to create a lightweight model visualisation tool rather than a Visio clone where you have to draw everything yourself. I've deployed this app on Pivotal Web Services and you can try it for yourself. You'll have to drag the boxes around to lay out the elements and it's not very pretty, but the concept works. The screenshot that follows shows the techtribes.je context diagram.

A screenshot of a simple context diagram

Thoughts?

All of the C4 model Java code is open source and sitting on GitHub. This is only a few hours of work so far and there are no tests, so think of this as a prototype more than anything else at the moment. I really like the simplicity of capturing a software architecture model in code, and using an architecturally-evident coding style allows you to create large chunks of that model automatically. This also opens up the door to some other opportunities such as automated build plugins, lightweight documentation tooling, etc. Caveats apply with the applicability of this to all software systems, but I'm excited at the possibilities. Thoughts?

Categories: Architecture

Sponsored Post: Apple, Chartbeat, Monitis, Netflix, Salesforce, Blizzard Entertainment, Cloudant, CopperEgg, Logentries, Wargaming.net, PagerDuty, Gengo, ScaleOut Software, Couchbase, MongoDB, BlueStripe, AiScaler, Aerospike, LogicMonitor, AppDynamics, Ma

Who's Hiring?

  • Apple has multiple openings. Changing the world is all in a day's work at Apple. Imagine what you could do here.
    • Mobile Services Software Engineer. The Emerging Technologies/Mobile Services team is looking for a proactive and hardworking software engineer to join our team. The team is responsible for a variety of high quality and high performing mobile services and applications for internal use. Please apply here
    • Senior Software Engineer. Join Apple's Internet Applications Team, within the Information Systems and Technology group, as a Senior Software Engineer. Be involved in challenging and fast paced projects supporting Apple's business by delivering Java based IS Systems. Please apply here.
    • Sr Software Engineer. Join Apple's Internet Applications Team, within the Information Systems and Technology group, as a Senior Software Engineer. Be involved in challenging and fast paced projects supporting Apple's business by delivering Java based IS Systems. Please apply here.
    • Senior Security Engineer. You will be the ‘tip of the spear’ and will have direct impact on the Point-of-Sale system that powers Apple Retail globally. You will contribute to implementing standards and processes across multiple groups within the organization. You will also help lead the organization through a continuous process of learning and improving secure practices. Please apply here.
    • Quality Assurance Engineer - Mobile Platforms. Apple’s Mobile Services/Emerging Technology group is looking for a highly motivated, result-oriented Quality Assurance Engineer. You will be responsible for overseeing quality engineering of mobile server and client platforms and applications in a fast-paced dynamic environment. Your job is to exceed our business customer's aggressive quality expectations and take the QA team forward on a path of continuous improvement. Please apply here.

  • Chartbeat measures and monetizes attention on the web. Our traffic numbers are growing, and so is our list of product and feature ideas. That means we need you, and all your unparalleled backend engineer knowledge to help us scale, extend, and evolve our infrastructure to handle it all. If you have these chops: www.chartbeat.com/jobs/be, come join the team!

  • The Salesforce.com Core Application Performance team is seeking talented and experienced software engineers to focus on system reliability and performance, developing solutions for our multi-tenant, on-demand cloud computing system. Ideal candidate is an experienced Java developer, likes solving real-world performance and scalability challenges and building new monitoring and analysis solutions to make our site more reliable, scalable and responsive. Please apply here.

  • Sr. Software Engineer - Distributed Systems. Membership platform is at the heart of Netflix product, supporting functions like customer identity, personalized profiles, experimentation, and more. Are you someone who loves to dig into data structure optimization, parallel execution, smart throttling and graceful degradation, SYN and accept queue configuration, and the like? Is the availability vs consistency tradeoff in a distributed system too obvious to you? Do you have an opinion about asynchronous execution and distributed co-ordination? Come join us

  • Java Software Engineers of all levels, your time is now. Blizzard Entertainment is leveling up its Battle.net team, and we want to hear from experienced and enthusiastic engineers who want to join them on their quest to produce the most epic customer-facing site experiences possible. As a Battle.net engineer, you'll be responsible for creating new (and improving existing) applications in a high-load, high-availability environment. Please apply here.

  • Engine Programmer - C/C++. Wargaming|BigWorld is seeking Engine Programmers to join our team in Sydney, Australia. We offer a relocation package, Australian working visa & great salary + bonus. Your primary responsibility will be to work on our PC engine. Please apply here

  • Human Translation Platform Gengo Seeks Sr. DevOps Engineer. Build an infrastructure capable of handling billions of translation jobs, worked on by tens of thousands of qualified translators. If you love playing with Amazon’s AWS, understand the challenges behind release-engineering, and get a kick out of analyzing log data for performance bottlenecks, please apply here.

  • UI Engineer. AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data. AppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (all levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.
Fun and Informative Events
  • Your event here.
Cool Products and Services
  • Now track your log activities with Log Monitor and be on the safe side! Monitor any type of log file and proactively define potential issues that could hurt your business' performance. Detect your log changes for: Error messages, Server connection failures, DNS errors, Potential malicious activity, and much more. Improve your systems and behaviour with Log Monitor.

  • The NoSQL "Family Tree" from Cloudant explains the NoSQL product landscape using an infographic. The highlights: NoSQL arose from "Big Data" (before it was called "Big Data"); NoSQL is not "One Size Fits All"; Vendor-driven versus Community-driven NoSQL.  Create a free Cloudant account and start the NoSQL goodness

  • Finally, log management and analytics can be easy, accessible across your team, and provide deep insights into data that matters across the business - from development, to operations, to business analytics. Create your free Logentries account here.

  • CopperEgg. Simple, Affordable Cloud Monitoring. CopperEgg gives you instant visibility into all of your cloud-hosted servers and applications. Cloud monitoring has never been so easy: lightweight, elastic monitoring; root cause analysis; data visualization; smart alerts. Get Started Now.

  • PagerDuty helps operations and DevOps engineers resolve problems as quickly as possible. By aggregating errors from all your IT monitoring tools, and allowing easy on-call scheduling that ensures the right alerts reach the right people, PagerDuty increases uptime and reduces on-call burnout—so that you only wake up when you have to. Thousands of companies rely on PagerDuty, including Netflix, Etsy, Heroku, and Github.

  • Aerospike in-Memory NoSQL database is now Open Source. Read the news and see who scales with Aerospike. Check out the code on github!

  • consistent: to be, or not to be. That’s the question. Is data in MongoDB consistent? It depends. It’s a trade-off between consistency and performance. However, does performance have to be sacrificed to maintain consistency? more.

  • Do Continuous MapReduce on Live Data? ScaleOut Software's hServer was built to let you hold your daily business data in-memory, update it as it changes, and concurrently run continuous MapReduce tasks on it to analyze it in real-time. We call this "stateful" analysis. To learn more check out hServer.

  • LogicMonitor is the cloud-based IT performance monitoring solution that enables companies to easily and cost-effectively monitor their entire IT infrastructure stack – storage, servers, networks, applications, virtualization, and websites – from the cloud. No firewall changes needed - start monitoring in only 15 minutes utilizing customized dashboards, trending graphs & alerting.

  • BlueStripe FactFinder Express is the ultimate tool for server monitoring and solving performance problems. Monitor URL response times and see if the problem is the application, a back-end call, a disk, or OS resources.

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Categories: Architecture

Performance at Scale: SSDs, Silver Bullets, and Serialization

This is a guest post by Aaron Sullivan, Director & Principal Engineer at Rackspace.

We all love a silver bullet. Over the last few years, the outcomes I see with Rackspace customers who start using SSDs mostly fall under two scenarios. The first scenario is a silver bullet—adding SSDs creates near-miraculous performance improvements. The second scenario (the most common) is typically a case of the bullet being fired at the wrong target—the results fall well short of expectations.

With the second scenario, the file system, data stores, and processes frequently become destabilized. These demoralizing results, however, usually occur when customers are trying to speed up the wrong thing.

A common phenomenon at the heart of the disappointing SSD outcomes is serialization. Despite the fact that most servers have parallel processors (e.g. multicore, multi-socket), parallel memory systems (e.g. NUMA, multi-channel memory controllers), parallel storage systems (e.g. disk striping, NAND), and multithreaded software, transactions still must happen in a certain order. For some parts of your software and system design, processing goes step by step. Step 1. Then step 2. Then step 3. That’s serialization.

And just because some parts of your software or systems are inherently parallel doesn’t mean that those parts aren’t serialized behind other parts. Some systems may be capable of receiving and processing thousands of discrete requests simultaneously in one part, only to wait behind some other, serialized part. Software developers and systems architects have dealt with this in a variety of ways. Multi-tier web architecture was conceived, in part, to deal with this problem. More recently, database sharding also helps to address this problem. But making some parts of a system parallel doesn’t mean all parts are parallel. And some things, even after being explicitly enhanced (and marketed) for parallelism, still contain some elements of serialization.

How far back does this problem go? It has been with us in computing since the inception of parallel computing, going back at least as far as the 1960s(1). Over the last ten years, exceptional improvements have been made in parallel memory systems, distributed database and storage systems, multicore CPUs, GPUs, and so on. The improvements often follow after the introduction of a new innovation in hardware. So, with SSDs, we’re peering at the same basic problem through a new lens. And improvements haven’t just focused on improving the SSD, itself. Our whole conception of storage software stacks is changing, along with it. But, as you’ll see later, even if we made the whole storage stack thousands of times faster than it is today, serialization will still be a problem. We’re always finding ways to deal with the issue, but rarely can we make it go away.
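
A rough Amdahl's-law style calculation (my illustration, not something from the post) shows why: if even 5% of the work remains serialized, the overall speedup is capped at 20x, no matter how much faster the parallel part becomes.

// Illustration of the ceiling that serialization imposes (Amdahl's law):
// overall speedup = 1 / (serialFraction + (1 - serialFraction) / parallelSpeedup)
public class SerializationCeiling {

    static double overallSpeedup(double serialFraction, double parallelSpeedup) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / parallelSpeedup);
    }

    public static void main(String[] args) {
        double serialFraction = 0.05; // 5% of the work must happen step by step
        System.out.println(overallSpeedup(serialFraction, 10));        // ~6.9x
        System.out.println(overallSpeedup(serialFraction, 1_000));     // ~19.6x
        System.out.println(overallSpeedup(serialFraction, 1_000_000)); // ~20x -- the ceiling
    }
}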

Parallelization and Serialization
Categories: Architecture

How to verify Web Service State in a Protractor Test

Xebia Blog - Sat, 06/21/2014 - 08:24

Sometimes it can be useful to verify the state of a web service in an end-to-end test. In my case, I was testing a web application that was using a third-party Javascript plugin that logged page views to a Rest service. I wanted to have some tests to verify that all our web pages did include the plugin, and that it was communicating with the Rest service properly when a new page was opened.
Because the webpages were written with AngularJS, Protractor was our framework of choice for our end-to-end test suite. But how to verify web service state in Protractor?

My first draft of a Protractor test looked like this:

var businessMonitoring = require('../util/businessMonitoring.js');
var wizard = require('./../pageobjects/wizard.js');

describe('Business Monitoring', function() {
  it('should log the page name of every page view in the wizard', function() {
    wizard.open();
    expect(wizard.activeStepNumber.getText()).toBe('1');

    // We opened the first page of the wizard and we expect it to have been logged
    expect(businessMonitoring.getMonitoredPageName()).toBe('/wizard/introduction');

    wizard.nextButton.click();
    expect(wizard.completeStep.getAttribute('class')).toContain('active');
    // We have clicked the 'next' button so the 'completed' page has opened;
    // this should have been logged as well
    expect(businessMonitoring.getMonitoredPageName()).toBe('/wizard/completed');
  });
});

The next thing I had to write was the businessMonitoring.js script, which should somehow make contact with the Rest service to verify that the correct page name was logged.
First I needed a simple library to make HTTP requests. I found the 'request' npm package, which provides a simple API to make an HTTP request like this:

var request = require('request');

var executeRequest = function(method, url) {
  var defer = protractor.promise.defer();
  
  // method can be 'GET', 'POST' or 'PUT'
  request({uri: url, method: method, json: true}, function(error, response, body) {

    if (error || response.statusCode >= 400) {
      defer.reject({
        error : error,
        message : response
      });
    } else {
      defer.fulfill(body);
    }
  });

  // Return a promise so the caller can wait on it for the request to complete
  return defer.promise;
};

Then I completed the businessmonitoring.js script with a method that gets the last request from the Rest service, using the request plugin.
It looked like this:

var businessMonitoring = exports; 

<.. The request wrapper with the executeRequest method from above is included here; omitted for brevity ..>

businessMonitoring.getMonitoredPageName = function () {

    var defer = protractor.promise.defer();

    executeRequest('GET', 'lastRequest')  // Calls the method which was defined above
      .then(function success(data) {
        defer.fulfill(data.url);
      }, function error(e) {
        defer.reject('Error when calling BusinessMonitoring web service: ' + e);
      });

    return defer.promise;
 };

It just fires a GET request to the Rest service to see which page was logged. Because it is an asynchronous (Ajax) call, the result is not immediately available, so a promise is returned instead.
But when I plugged the script into my Protractor test, it didn't work.
I could see that the requests to the Rest service were done, but they were done immediately before any of my end-to-end tests were executed.
How come?

The reason is that Protractor uses the WebdriverJS framework to handle its control flow. Statements like expect(), which we use in our Protractor tests, don't execute their assertions immediately, but instead they put their assertions on a queue. WebdriverJS first fills the queue with all assertions and other statements from the test, and then it executes the commands on the queue. Click here for a more extensive explanation of the WebdriverJs control flow.

That means that all statements in Protractor tests need to return promises, otherwise they will execute immediately when Protractor is only building its test queue. And that's what happened with my first implementation of the businessMonitoring mock.
The solution is to let the getMonitoredPageName return its promise within another promise, like this:

var businessMonitoring = exports; 

businessMonitoring.getMonitoredPageName = function () {
  // Return a promise that will execute the rest call,
  // so that the call is only done when the controlflow queue is executed.
  var deferredExecutor = protractor.promise.defer();

  deferredExecutor.then(function() {
    var defer = protractor.promise.defer();

    executeRequest('GET', 'lastRequest')
      .then(function success(data) {
        defer.fulfill(data.url);
      }, function error(e) {
        defer.reject('Error when calling BusinessMonitoring mock: ' + e);
      });

    return defer.promise;
  });

  return deferredExecutor;
};

Protractor takes care of resolving all the promises, so the code in my Protractor test did not have to be changed.