
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools



Data’s hierarchy of needs


This post was originally published on the AppsFlyer blog.

A couple of weeks ago Nir Rubinshtein and I presented AppsFlyer’s data architecture at a meetup of Big Data & Data Science Israel. One of the concepts I presented there, and which is worth expanding upon, is “Data’s Hierarchy of Needs”:

  • Data should Exist
  • Data should be Accessible
  • Data should be Usable
  • Data should be Distilled
  • Data should be Presented

How can we make data “achieve its pinnacle of existence” and be acted upon? In other words, what are the areas that should be addressed when designing a data architecture if you want it to be complete and to enable creating insights and value from the data you generate and collect?

If done properly, your users might just act upon the data you provide. This list might seem a little simplistic, but it is not a prescription of what to do; rather, it is a set of reminders of areas we need to cover and questions we need answered to properly create a data architecture.

Data Should Exist

Well, of course data should exist, and it probably does. You should ask yourself, however, whether the data that exists is the right data. Does your retention policy serve the business needs? Does the availability fit your needs? Do you have all the needed links (foreign keys) to other data so you’d be able to connect it later for analysis?

To make this more concrete, consider the following example: AppsFlyer accepts several types of events (launches, in-app events, etc.) which are tied to apps. Apps are connected to accounts (an account would have one or more applications, usually at least an iOS app and an Android one). If we saved the accounts as only the latest snapshot and an app changed ownership, the historical data before that change would be skewed. If we treat the accounts as a slowly changing dimension of the events, we’d be able to handle the transition correctly. Note that we may still choose to provide the new owner with the historic data, but it is no longer the only option the system supports, and the decision can be based on the business needs.
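The slowly changing dimension idea can be sketched as follows (a toy illustration with made-up app and account names, not AppsFlyer's actual schema): each ownership row carries a validity interval, so an event resolves to whoever owned the app at event time.

```python
from datetime import date

# Type-2 "slowly changing dimension" rows for app ownership:
# each row records who owned the app during [valid_from, valid_to).
app_ownership = [
    {"app": "com.example.game", "account": "acct-A",
     "valid_from": date(2014, 1, 1), "valid_to": date(2015, 3, 1)},
    {"app": "com.example.game", "account": "acct-B",
     "valid_from": date(2015, 3, 1), "valid_to": date.max},
]

def owner_at(app, event_date):
    """Resolve the account that owned the app when the event happened."""
    for row in app_ownership:
        if row["app"] == app and row["valid_from"] <= event_date < row["valid_to"]:
            return row["account"]
    return None

# An event from before the ownership change is attributed to the old owner.
print(owner_at("com.example.game", date(2014, 6, 1)))  # acct-A
print(owner_at("com.example.game", date(2015, 6, 1)))  # acct-B
```

Keeping the full interval history, instead of only the latest snapshot, is what makes both attribution policies possible.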

Data Should Be Accessible

If data is written to disk it is at least accessible programmatically; however, there can be many levels of accessibility, and we need to think about our end users' needs and the level of access they’d require. At AppsFlyer, the data existence (mentioned above) is handled by processing all the messages that go through our Kafka queues, but that data is saved in sequence files and stored by event time. Most of our usage scenarios do have a time component, but they are primarily driven by app or account. Any processing that needs a specific account and accesses the raw events would have to sift through tons of records (3.7+ billion a day at the time of this post) to find the few relevant ones. Thus, one basic move toward accessibility of data is to sort it by app so that queries only need to access a small subset of the data and thus run much faster.
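The effect of re-sorting by app can be sketched in a few lines (a toy in-memory stand-in for the real re-projection job): once events are grouped by app, a per-app query reads only its own bucket instead of scanning the full time-ordered stream.

```python
from collections import defaultdict

# Raw events arrive ordered by event time, interleaved across apps.
raw_events = [
    {"app": "app-1", "time": 1, "type": "launch"},
    {"app": "app-2", "time": 2, "type": "in-app"},
    {"app": "app-1", "time": 3, "type": "in-app"},
]

# Re-project the stream into per-app buckets (the "sort by app" step).
by_app = defaultdict(list)
for event in raw_events:
    by_app[event["app"]].append(event)

# A query for one app now touches a small subset of the data.
print(len(by_app["app-1"]))  # 2
```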

Then we need to consider the “hotness” of the data, i.e. what response times we need and for which types of data. For instance, aggregations such as retention reports need to be accessed online (so-called “sub-second” response), latest counts need near real-time, explorations of data for new patterns can take hours, etc. To support these varied usage scenarios, we need to create multiple projections of our data, most likely using several different technologies. AppsFlyer stores raw data in sequence files, processed data in Parquet files (accessible via Apache Spark), aggregations and recent data in a columnar RDBMS, and near real-time data in memory.

The three different storage mechanisms mentioned above (Parquet, columnar RDBMS and an in-memory data grid) used at AppsFlyer all have SQL access; this is not by chance. While we (the industry) went through a short period of NoSQL, SQL or almost-SQL is becoming the norm again, even for semi-structured and poly-structured data. Providing an SQL interface to your data is another important aspect of data accessibility, as it expands the user base for the data beyond R&D. Again, this is important not just for your relational data…

Data Should Be Usable

What’s the difference between accessible data and usable data? For one, there’s data cleansing. This is a no-brainer if you pull data from disparate systems, but it is also needed if your source is a single system. Data cleansing is what traditional ETL is all about, and the techniques still apply.
Another aspect of making data usable is enriching it or connecting it to additional data. Enrichment can come from internal sources, like linking CRM data to the account info, or from external sources, such as getting the app category from the app store or the device screen size from a device database.
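Enrichment can be sketched as a lookup join (the lookup tables and field names here are hypothetical): each raw event gains attributes from reference data such as an app-store category list or a device database.

```python
# Hypothetical external reference data.
app_categories = {"com.example.game": "Games"}
device_screens = {"iPhone6": (750, 1334)}

def enrich(event):
    """Attach app-store category and device screen size to a raw event."""
    enriched = dict(event)
    enriched["category"] = app_categories.get(event["app"], "Unknown")
    enriched["screen"] = device_screens.get(event["device"])
    return enriched

event = {"app": "com.example.game", "device": "iPhone6"}
print(enrich(event)["category"])  # Games
```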

Last but not least, is to consider legal and privacy aspects of the data. Before allowing access to the data you may need to mask sensitive information or remove privacy-related data (sometimes you shouldn’t even save it in the first place). At AppsFlyer we take this issue very seriously and make major efforts to comply when working with partners and clients to make sure privacy-related data is handled correctly. In fact, we are also undergoing independent SOC auditing to make sure we are compliant with the highest standards.

To summarize: to make the data usable you have to make sure it is correct, connect it to other data, and make sure it complies with legal and privacy requirements.

Data Should Be Distilled

Distilling insights is the reason we perform all the previous steps. Data in itself is of little use if it doesn’t help us make better decisions. There are multiple types of insights you can generate here, from the more traditional BI scenarios of slice-and-dice analytics, through real-time aggregations and trend analysis, to applying machine learning or “advanced analytics”. You can see one example of the type of insights that can be gleaned from our data in the Gaming Advertising Performance Index we recently published.

Data Should Be Presented

This point ties in nicely with the Gaming Advertising Performance Index example provided above. Getting insights is an important step, but if you fail to present them in a coherent and cohesive manner, then the actual value users can derive from them is limited at best. Note that even if you use insights for making decisions (e.g. recommending a product to a user), you’d still need to present how well this decision is doing.

There are many issues that need to be dealt with from a UX perspective, both in how users interact with the data and in how the data is presented. An example of the former is choosing chart types for the data. A simple example of the latter is that when presenting projected or inaccurate data, it should be clear to users that they are looking at approximations, to prevent support calls about numbers not adding up.

Making sure all the areas discussed above are covered and handled properly is a lot of work, but providing a solution that actually helps your users make better decisions is well worth it. The data’s hierarchy of needs is not a prescription of how to get there; it is merely a set of waypoints to help navigate toward this end goal. It helps me think holistically about AppsFlyer's data needs, and I hope it will help you as well.

For more information about our architecture, check out the presentation from the meetup:

Categories: Architecture

Goofy Innovation Techniques

If your team or company isn’t thriving with innovation, it’s not a big surprise.

In the book Ten Types of Innovation: The Discipline of Building Breakthroughs, Larry Keeley, Helen Walters, Ryan Pikkel, and Brian Quinn explain what holds innovation back.

Goofy innovation techniques are at least one part of the puzzle.

What holds innovation back is that many people still use goofy innovation techniques that either don’t work in practice or aren’t very pragmatic. For example, “brainstorming” often leads to collaboration fixation.

Via Ten Types of Innovation: The Discipline of Building Breakthroughs:

“Part of the Innovation Revolution is rooted in superior tradecraft: better ways to innovate that are suited for tougher problems.  Yet most teams are stuck using goofy techniques that have been discredited long ago.  This book is part of a new vanguard, a small group of leading thinkers who see innovation as urgent and essential, who know it needs to be cracked as a deep discipline and subjected to the same rigors as any other management science.”

The good news is that there are many innovation techniques that do work.

If you’re stuck in a rut and wondering how to get innovation going, then abandon the goofy innovation techniques and cast a wider net to find some of the approaches that actually do work. For example, Dr. Tony McCaffrey suggests “brainswarming.” (Here is a video of brainswarming.) Or check out the book Blue Ocean Strategy for a pragmatic approach to strategic market disruption.

Innovate in your approach to innovation.

You Might Also Like

How To Get Innovation to Succeed Instead of Fail

Management Innovation is at the Top of the Innovation Stack

No Slack = No Innovation

The Innovation Revolution

The Myths of Business Model Innovation

Categories: Architecture, Programming

Stuff The Internet Says On Scalability For May 29th, 2015

Hey, it's HighScalability time:

Just imagine. 0-100 mph in 1.2 seconds. Astronaut's view from the Dragon spacecraft.
  • $850B: mobile web market in 2018; 107: unicorns; 3.2 billion: # of people on the Internet; 10^82: atoms in the observable universe
  • Quotable Quotes:
    • @cloud_opinion: appropriate term for people that resist Docker is "VM Huggers"
    • @mikeloukides: Scale systems, not teams. Adding scale shouldn’t mean adding people. Teams should scale sublinearly.  @shinynew_oz @ #velocityconf
    • Marc Levinson: If the market repeatedly misjudged the container, so did the state. Governments in New York City and San Francisco ignored the consequences of containerization as they wasted hundreds of millions of dollars reconstructing ports that were outmoded before the concrete was dry
    • @corbett: doesn't describe ultimate origin but "Inflation describes how the universe emerges from a patch of 10^-28cm & mass of only a few grams" -AG
    • @Gizmodo: Since last year, over 600 million more people have smartphones. It’s the age of mobile, says Sundar Pichai. #io15
    • @stshank: Android in a nutshell: >1 billion users, 4000 devices, 500 carriers, 400 device makers says @sundarpichai at #io15 
    • Carlos C:  Congratulations, FP hackers. You won the battle of simplicity to express...and here is where Go wins the battle of simplicity to achieve.
    • @markimbriaco: @joestump In my day, we emitted HTML from our apps. Pushed the packets uphill to the browsers. Through driving DDoS. And we liked it.
    • aikah: Yep, hail "Isomorphic micro-service oriented management."
    • @bitfield: "We haven't got time to automate this stuff, because we're too busy dealing with the problems caused by our lack of automation." —Everyone
    • @raju: India reported 851 Million active mobile connections in February 2015
    • @ValaAfshar: The average smartphone user checks their mobile device 214 times per day... and 86% of the time is apps (vs 14% browser). #codecon
    • @BradStone: Meeker: 87 percent millennials say smartphones never leave their side night or day. 44 percent use camera at least once a day. #CodeCon
    • @sequoia: "We're close to 1M people everyday staying at an @Airbnb home. We're here to stay" @bchesky #codecon
    • @pmarca: Moore's Law used to be about faster, now it's more about cheaper. Huge change with the biggest possible consequences. 
    • Nicolas Liochon: CAP: if all you have is a timeout, everything looks like a partition
    • See the complete post for the full list...

  • This would change things. What Memory Will Intel’s Purley Platform Use?: One slide, titled: “Purley: Biggest Platform Advancement Since Nehalem” includes this post’s graphic, which tells of a memory with: “Up to 4x the capacity & lower cost than DRAM, and 500x faster than NAND.” Also, What High-Bandwidth Memory Is and Why You Should Care

  • The question seldom asked with these kind of efforts: Does your idea of merit have merit? Startup Aims to Make Silicon Valley an Actual Meritocracy.

  • The reason for us to save everything is that our collective data is the training ground for future AIs. We should train them to understand all of humanity. Hopefully they'll learn pity. Oh, wait...  The Internet With A Human Face: I've come to believe that a lot of what's wrong with the Internet has to do with memory. The Internet somehow contrives to remember too much and too little at the same time. 

  • If you would like a rich exploration of the ethical implications of post-humanism then Apex: Nexus Arc Book 3 by Ramez Naam is the book for you. The framework is a game of iterated tit-for-tat. Ultimately if we don't want post-humans to destroy us lowly humans then we humans need to treat them well, from the start. If we harm them then the correct move on their part is to tat us. That won't be good. So open with a trust move and be nice. This radical notion might even work with normal humans.

Don't miss all that the Internet has to say on Scalability. Click below and become eventually consistent with all scalability knowledge (this post has many more items to read, so please keep reading)...

Categories: Architecture

Microservices architecture principle #4: Asynchronous communication over synchronous communication

Xebia Blog - Fri, 05/29/2015 - 13:37

Microservices are a hot topic. Because of that a lot of people are saying a lot of things. To help organizations make the best of this new architectural style Xebia has defined a set of principles that we feel should be applied when implementing a Microservice Architecture. Over the next couple of days we will cover each of these principles in more detail in a series of blog posts.
This blog explains why we prefer asynchronous communication over synchronous communication

In a previous post in this series we explained that we prefer autonomy of Microservices over coordination between Microservices. That does not imply a Microservices landscape in which no Microservice depends on any other; there will always be dependencies, but we try to minimise their number. Once you have minimised the number of dependencies, how should these be implemented such that autonomy is maintained as much as possible? Synchronous dependencies between services imply that the calling service is blocked, waiting for a response from the called service before continuing its operation. This is tight coupling, it does not scale very well, and the calling service may be impacted by errors in the called service. In a highly available, robust Microservices landscape that is not preferable. Measures can be taken (think of circuit breakers), but they require extra effort.

The preferred alternative is to use asynchronous communication. In this pattern the calling service simply publishes its request (or data) and continues with other work (unrelated to this request). The service has a separate thread listening for incoming responses (or data) and processes these when they come in. It does not block waiting for a response after it sends a request, which improves scalability. Problems in another service will not break this service. If other services are temporarily broken, the calling service might not be able to complete a process entirely, but the calling service itself is not broken. Thus, using the asynchronous pattern, the services are more decoupled than with the synchronous pattern, which preserves the autonomy of the service.
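The pattern described above can be sketched with a queue and a listener thread (a toy in-process stand-in for a real message broker; names are made up): the caller publishes a message and immediately moves on, while a separate thread consumes and processes messages as they arrive.

```python
import queue
import threading

messages = queue.Queue()
processed = []

def listener():
    """Separate thread: process incoming messages as they arrive."""
    while True:
        msg = messages.get()
        if msg is None:  # shutdown signal for this sketch
            break
        processed.append(f"processed:{msg}")

t = threading.Thread(target=listener)
t.start()

# The calling side publishes its request and continues with other work,
# instead of blocking until the other service responds.
messages.put("order-42")

messages.put(None)
t.join()
print(processed)  # ['processed:order-42']
```

In a real landscape the queue would be a broker such as Kafka or RabbitMQ, so a temporarily broken consumer does not break the publisher.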

Microservices Architecture Principle #3: small bounded contexts over one comprehensive model

Xebia Blog - Wed, 05/27/2015 - 21:18

Microservices are a hot topic. To help organizations make the best of this new architectural style Xebia has defined a set of principles that we feel should be applied when implementing a Microservice Architecture. Over the next couple of days we will cover each of these principles in more detail in a series of blog posts. Today we discuss the Domain Driven Design (DDD) concept of "Bounded Context" and how it plays a major role in designing Microservices.

One of the discussion points around Microservices, since the term was coined in 2013, is how big (or rather, how small) a Microservice should be. Some people, such as Fred George, claim services should be small, maybe between 100-1000 lines of code (LoC). However, LoC is a poor metric for measuring software in general and even more so for determining the scope of a Microservice. Rather, when identifying the scope of our Microservices, we look at the functionality that a service needs to provide and how the service relates to other services. Our aim is to design Microservices that are autonomous, i.e. have low coupling with other services and well-defined interfaces, and that implement a single business capability, i.e. have high cohesion.

A technique that can be used for this is “Context Mapping”. Via this technique we identify the various contexts in the IT landscape and their boundaries. The Context Map is the primary tool used to make boundaries between domains explicit. A Bounded Context encapsulates the details of a single domain, such as the domain model, data model, application services, etc., and defines the integration points with other bounded contexts/domains. This matches perfectly with our definition of a Microservice: autonomous, with well-defined interfaces, implementing a business capability. This makes Context Mapping (and DDD in general) an excellent tool in the architect’s toolbox for identifying and designing Microservices.

Another factor in sizing our services is that we would like to have models that can "fit in your head", so we are able to reason about them efficiently. Most projects define a single comprehensive model encompassing the full domain; this seems natural and appears easier to maintain, as one does not have to worry about the interaction between multiple models or translate from one context to the other.

For small systems this may be true, but for large systems the costs will start to outweigh the benefits: maintaining a single model requires centralization. Naturally the model will tend to fragment: a domain expert from the accounting domain will think differently about 'inventory' than a logistics domain expert, for example. It requires a lot of coordinated effort to disambiguate all terms across all domains. Worse, this 'unified vocabulary' is awkward and unnatural to use, and will very likely be ignored in most cases. Here bounded contexts help again: they make clear where we can safely use the natural domain terms and where we need to bridge to other domains. With the right boundaries and sizes for our bounded contexts, we can make sure our domain models "fit in your head" and that we do not have to switch between models too often.
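The 'inventory' example can be sketched in code (hypothetical models): each bounded context keeps its own natural model of inventory, and a small, explicit translator bridges them at the boundary instead of forcing one unified model on both.

```python
from dataclasses import dataclass

# Accounting context: inventory is a monetary valuation.
@dataclass
class AccountingInventory:
    sku: str
    book_value: float

# Logistics context: inventory is physical stock at a location.
@dataclass
class LogisticsStock:
    sku: str
    quantity: int
    warehouse: str

def to_accounting(stock: LogisticsStock, unit_cost: float) -> AccountingInventory:
    """Explicit translation at the context boundary instead of one shared model."""
    return AccountingInventory(sku=stock.sku, book_value=stock.quantity * unit_cost)

stock = LogisticsStock(sku="SKU-1", quantity=10, warehouse="AMS")
print(to_accounting(stock, unit_cost=2.5).book_value)  # 25.0
```

Each context's model stays natural for its own experts; only the translator needs to know about both.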

So maybe the best answer to the question of how big a Microservice should be is: it should have a well-defined bounded context that enables us to work without having to consider, or swap between, contexts.

A Toolkit to Measure Basic System Performance and OS Jitter

Jean Dagenais published a great response on a mechanical-sympathy thread to Gil Tene's article, The Black Magic Of Systematically Reducing Linux OS Jitter. It's full of helpful tools for tracking down jitter problems. I apologize for the incomplete attribution. I did not find a web presence for Jean. 

To complement the great information I got on the “Systematic Way to Find Linux Jitter”, I have created a toolkit that I now use to evaluate current and future trading platforms.

In case this can be useful, I have listed these tools, as well as the URLs to get the source code and a description of their usage. I am learning a lot by reading the source code, and the blog entry associated.

This is far from an exhaustive list, as every week I find either a new problem area or a new tool that improves my understanding of this beautiful problem domain ;)

These tools are grouped into these categories: 

  1. CPU, Memory, Disk, Network
  2. X86, Linux, and Java time resolution
  3. Context Switches & Inter Thread Latency
  4. System Jitter
  5. Application Building Blocks: Disruptor, OpenHFT, Aeron & workload generator
  6. Application Performance Testing

Happy Benchmarking and Jitter Chasing!

1. CPU, Memory, Disk, Network

Categories: Architecture

Sponsored Post: Tumblr, Power Admin, Learninghouse, MongoDB, Internap, Aerospike, SignalFx, InMemory.Net, Couchbase, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Who's Hiring?
  • Make Tumblr fast, reliable and available for hundreds of millions of visitors and tens of millions of users.  As a Site Reliability Engineer you are a software developer with a love of highly performant, fault-tolerant, massively distributed systems. Apply here now! 

  • At Scalyr, we're analyzing multi-gigabyte server logs in a fraction of a second. That requires serious innovation in every part of the technology stack, from frontend to backend. Help us push the envelope on low-latency browser applications, high-speed data processing, and reliable distributed systems. Help extract meaningful data from live servers and present it to users in meaningful ways. At Scalyr, you’ll learn new things, and invent a few of your own. Learn more and apply.

  • UI Engineer: AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data: AppDynamics, a leader in next-generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (all levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.
Fun and Informative Events
  • 90 Days. 1 Bootcamp. A whole new life. Interested in learning how to code? Concordia St. Paul's Coding Bootcamp is an intensive, fast-paced program where you learn to be a software developer. In this full-time, 12-week on-campus course, you will learn either .NET or Java and acquire the skills needed for entry-level developer positions. For more information, read the Guide to Coding Bootcamp or visit

  • June 2nd – 4th, Santa Clara: Register for the largest NoSQL event of the year, Couchbase Connect 2015, and hear how innovative companies like Cisco, TurboTax, Joyent, PayPal, Nielsen and Ryanair are using our NoSQL technology to solve today’s toughest big data challenges. Register Today.

  • The Art of Cyberwar: Security in the Age of Information. Cybercrime is an increasingly serious issue both in the United States and around the world; the estimated annual cost of global cybercrime has reached $100 billion with over 1.5 million victims per day affected by data breaches, DDOS attacks, and more. Learn about the current state of cybercrime and the cybersecurity professionals in charge with combatting it in The Art of Cyberwar: Security in the Age of Information, provided by Russell Sage Online, a division of The Sage Colleges.

  • MongoDB World brings together over 2,000 developers, sysadmins, and DBAs in New York City on June 1-2 to get inspired, share ideas and get the latest insights on using MongoDB. Organizations like Salesforce, Bosch, the Knot, Chico’s, and more are taking advantage of MongoDB for a variety of ground-breaking use cases. Find out more at but hurry! Super Early Bird pricing ends on April 3.
Cool Products and Services
  • Here's a little quiz for you: What do these companies all have in common? Symantec, RiteAid, CarMax, NASA, Comcast, Chevron, HSBC, Sauder Woodworking, Syracuse University, USDA, and many, many more? Maybe you guessed it? Yep! They are all customers who use and trust our software, PA Server Monitor, as their monitoring solution. Try it out for yourself and see why we’re trusted by so many. Click here for your free, 30-Day instant trial download!

  • Turn chaotic logs and metrics into actionable data. Scalyr replaces all your tools for monitoring and analyzing logs and system metrics. Imagine being able to pinpoint and resolve operations issues without juggling multiple tools and tabs. Get visibility into your production systems: log aggregation, server metrics, monitoring, intelligent alerting, dashboards, and more. Trusted by companies like Codecademy and InsideSales. Learn more and get started with an easy 2-minute setup. Or see how Scalyr is different if you're looking for a Splunk alternative or Loggly alternative.

  • Instructions for implementing Redis functionality in Aerospike. Aerospike Director of Applications Engineering, Peter Milne, discusses how to obtain the semantic equivalent of Redis operations, on simple types, using Aerospike to improve scalability, reliability, and ease of use. Read more.

  • SQL for Big Data: Price-performance Advantages of Bare Metal. When building your big data infrastructure, price-performance is a critical factor to evaluate. Data-intensive workloads with the capacity to rapidly scale to hundreds of servers can escalate costs beyond your expectations. The inevitable growth of the Internet of Things (IoT) and fast big data will only lead to larger datasets, and a high-performance infrastructure and database platform will be essential to extracting business value while keeping costs under control. Read more.

  • SignalFx: just launched an advanced monitoring platform for modern applications that's already processing 10s of billions of data points per day. SignalFx lets you create custom analytics pipelines on metrics data collected from thousands or more sources to create meaningful aggregations--such as percentiles, moving averages and growth rates--within seconds of receiving data. Start a free 30-day trial!

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex goes beyond monitoring and measures the system's work on your MySQL and PostgreSQL servers, providing unparalleled insight and query-level analysis. This unique approach ultimately enables your team to work more effectively, ship more often, and delight more customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here:

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Also available on Amazon Web Services. Free instant trial, 2 hours of FREE deployment support, no sign-up required.

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • Site24x7: Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Categories: Architecture

Quickly build an XL Deploy plugin for deploying container applications to CoreOS

Xebia Blog - Mon, 05/25/2015 - 21:57

You can use fleetctl and script files to deploy your container applications to CoreOS. However, using XL Deploy for deployment automation is a great solution when you need to deploy and track versions of many applications. What does it take to create an XL Deploy plugin to deploy these container applications to your CoreOS clusters?

XL Deploy can be extended with custom plugins which add deployment capabilities. Using XL Rules, custom plugins can be created quickly with limited effort. In this blog you can read how a plugin can be created in a matter of hours.

In a number of blog posts, Mark van Holsteijn explained how to create a highly available Docker container platform using CoreOS and Consul. In those posts, shell scripts (with fleetctl commands) are used to deploy container applications. Based on these scripts I built an XL Deploy plugin which deploys fleet unit configuration files to a CoreOS cluster.


Deploying these container applications using XL Deploy has a number of advantages:

  • Docker containers can be deployed without creating, adjusting and maintaining scripts for individual applications.
  • XL Deploy will track and report the applications deployed to the CoreOS clusters.
  • Additional deployment scenarios can be added with limited effort.
  • Deployments will be consistent and configuration is managed across environments.
  • XL Deploy permissions can be used to control (direct) access to the CoreOS cluster(s).


Building an XL Deploy plugin is fast, since you can:

  • Reuse existing XL Deploy capabilities, like the Overthere plugin.
  • Utilize XL Deploy template processing to inject property values in rules and deploy scripts.
  • Exploit the XL Deploy unified deployment model to get the deltas which drive the required fleetctl deployment commands for any type of deployment (new, update, undeploy and rollback deployments).
  • Use XML- and script-based rules to build deployment tasks.

Getting started

  • Install XL Deploy, you can download a free edition here. If you are not familiar with XL Deploy, read the getting started documentation.
  • Next, add the plugin resources to the ext directory of your XL Deploy installation. You can find the plugin resources in this GitHub repository. Add the synthetic.xml and xl-rules.xml files from the repository root. In addition, add the scripts directory and its contents. Restart XL Deploy.
  • Next, setup a CoreOS cluster. This blog post explains how you can setup such a platform locally.
  • Now you can connect to XL Deploy using your browser. On the deployment tab you can import the sample application, located in the sample-app folder of the plugin resources GitHub repository.
  • You can now set up the target deployment container based on the Overthere.SshHost configuration item type. Verify that you can connect to your CoreOS cluster using this XL Deploy container.
  • Next, you can set up an XL Deploy environment, which contains your target deployment container.
  • Now you can use the deployment tab to deploy and undeploy your fleet configuration file applications.


Building the plugin

The plugin consists of two xml files and a number of script files. Below you find a description of the plugin implementation steps.

The CoreOS container application deployments are based on fleet unit configuration files. So, first we create an XL Deploy configuration item type definition which represents such a file. This XL Deploy deployed type is defined in the synthetic.xml file; I have named it “fleet.DeployedUnit”.


The definition contains a container-type attribute, which references the Overthere.SshHost container. The plugin can simply use the Overthere.SshHost container type to connect to the CoreOS cluster and execute fleet commands.

Furthermore, I have added two properties. The first specifies the number of instances; note that XL Deploy dictionaries can be utilized to define the number of instances for each environment separately. The second is a flag which controls whether instances will be started (or only submitted and loaded).

If you want to deploy a fleet configuration file using fleetctl, you issue the following three commands: submit, load and start. In the plugin, I have created a separate script file for each of these fleetctl commands. The caption below shows the script file that loads a fleet configuration file. This load script uses the file name and numberOfInstances properties of the “fleet.DeployedUnit” configuration item.
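The script itself is missing from this copy of the post, so the sketch below only shows the shape such a load script could take. The function name is an assumption, the property values arrive as plain shell arguments rather than via XL Deploy template processing, and the fleetctl command is echoed as a dry run instead of being executed.

```shell
# Hypothetical sketch of the load script (the original did not survive).
# In the real plugin, XL Deploy template processing injects the property
# values; here they are passed as arguments and the command is echoed.
load_units() {
  unit_template="$1"    # fleet unit template file, e.g. myapp@.service
  num_instances="$2"    # value of the numberOfInstances property
  name=$(basename "$unit_template" .service)
  i=1
  while [ "$i" -le "$num_instances" ]; do
    # A unit template myapp@.service yields instances myapp@1.service, ...
    echo "fleetctl load ${name}${i}.service"
    i=$((i + 1))
  done
}
```

For example, `load_units myapp@.service 2` prints the fleetctl load commands for instances 1 and 2 of the unit template.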


Finally, the plugin can be completed with XML-based rules which create the deployment steps. The caption below shows the rule which adds steps to [1] submit the unit configuration and [2] load the unit when (a version of) the application is deployed.
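The rule itself did not survive in this copy, so here is a plausible sketch of such an XML-based rule. The rule name, step types and script paths are assumptions modelled on the XL Deploy rules DSL, not the plugin's exact contents.

```xml
<!-- Hypothetical reconstruction of the CREATE rule; names are assumptions
     modelled on the XL Deploy rules DSL. -->
<rules xmlns="http://www.xebialabs.com/xl-deploy/xl-rules">
  <rule name="fleet.DeployedUnit.CREATE" scope="deployed">
    <conditions>
      <type>fleet.DeployedUnit</type>
      <operation>CREATE</operation>
    </conditions>
    <steps>
      <!-- [1] submit the unit configuration file -->
      <os-script>
        <script>fleet/submit-unit</script>
      </os-script>
      <!-- [2] load the unit (and its instances) on the cluster -->
      <os-script>
        <script>fleet/load-unit</script>
      </os-script>
    </steps>
  </rule>
</rules>
```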


Using rules, you can easily define logic to add deployment steps. These steps can closely resemble the commands you perform when you are using fleetctl. For this plugin I have utilized XML-based rules only. Using script rules, you can add more intelligence to your plugin. For example, the logic of the restart script can be converted to rules and more fine-grained deployment steps.

More information

If you are interested in building your own XL Deploy plugin, the XL Deploy product documentation contains tutorials which will get you started.

If you want to know how you can create a High Available Docker Container Platform Using CoreOS And Consul, the following blogs are a great starting point:

Appknox Architecture - Making the Switch from AWS to the Google Cloud

This is a guest post by dhilipsiva, Full-Stack & DevOps Engineer at Appknox.

Appknox helps detect and fix security loopholes in mobile applications. Securing your app is as simple as submitting your store link. We upload your app, scan for security vulnerabilities, and report the results. 

What's notable about our stack:
  • Modular Design. We took modularization so far that we decoupled our front-end from our back-end. This architecture has many advantages that we'll talk about later in the post.
  • Switch from AWS to Google Cloud. We made our code largely vendor independent so we were able to easily make the switch from AWS to the Google Cloud. 
Primary Languages
  1. Python & Shell for the Back-end
  2. CoffeeScript and LESS for Front-end
Our Stack
  1. Django
  2. Postgres (Migrated from MySQL)
  3. RabbitMQ
  4. Celery
  5. Redis
  6. Memcached
  7. Varnish
  8. Nginx
  9. Ember
  10. Google Compute 
  11. Google Cloud Storage
Categories: Architecture

Microservices architecture principle #2: Autonomy over coordination

Xebia Blog - Mon, 05/25/2015 - 16:13

Microservices are a hot topic. Because of that a lot of people are saying a lot of things. To help organizations make the best of this new architectural style Xebia has defined a set of principles that we feel should be applied when implementing a Microservice Architecture. Over the next couple of days we will cover each of these principles in more detail in a series of blog posts.
This blog explains why we prefer autonomy of services over coordination between services.

Our Xebia colleague Serge Beaumont posted "Autonomy over Coordination" in a tweet earlier this year, and for me it summarised one of the crucial aspects of creating an agile, scalable and robust IT system or organisational structure. Autonomy over coordination is closely related to the business capabilities described in the previous post in this series: each capability should be implemented in one microservice. Once you have defined your business capabilities correctly, the dependencies between those capabilities are minimised, so minimal coordination between capabilities is required, leading to optimal autonomy. Increased autonomy for a microservice gives it freedom to evolve without impacting other services: the optimal technology can be used, it can scale without having to scale others, etc. For the team responsible for the service the advantages are similar: autonomy enables them to make the choices that let their team function at its best.

The drawbacks of less autonomy and more coordination are evident and we all have experienced these. For example, a change leads to a snowball of dependent changes that must be deployed at the same moment, making changes to a module requires approval of other teams,  not being able to scale up a compute intensive function without scaling the whole system, ... the list is endless.

So in summary, pay attention to defining your business capabilities (microservices) in such a manner that autonomy is maximised; it will give you both organisational and technical advantages.

Microservices architecture principle #1: Each Microservice delivers a single complete business capability

Xebia Blog - Sat, 05/23/2015 - 21:13

Microservices are a hot topic. Because of that a lot of people are saying a lot of things. To help organizations make the best of this new architectural style Xebia has defined a set of principles that we feel should be applied when implementing a Microservice Architecture. Over the next couple of days we will cover each of these principles in more detail in a series of blog posts.
This blog explains why a Microservice should deliver a complete business capability.

A complete business capability is a process that can be finished consecutively without interruptions or excursions to other services. This means that a business capability should not depend on other services to complete its work.
If a process in a microservice depends on other microservices, we would end up in the dependency hell that ESBs introduced: in order to service a customer request we need many other services, and therefore if one of them fails everything stops. A more robust solution is to define a service that handles a process that makes sense to a user. An example is ordering a book in a web shop: this process would start with the selection of a book and end with creating an order. Actually fulfilling the order is a different process that lives in its own service. The fulfillment process might run right after the order process, but it doesn’t have to. If the customer orders a PDF version of a book, order fulfillment may be completed right away. If the order was for the print version, all the order service can promise is to ask shipping to send the book. Separating these two processes into different services allows us to make choices about the way each process is completed, making sure that a problem or delay in one service has no impact on other services.

So, building a microservice such that it does a single thing well without interruptions or waiting time is at the foundation of a robust architecture.

Are You an Integration Specialist?

Some people specialize in a narrow domain.  They are called specialists because they focus on a specific area of expertise, and they build skills in that narrow area.

Rather than focus on breadth, they go for depth.

Others focus on the bigger picture or connecting the dots.  Rather than focus on depth, they go for breadth.

Or do they?

It actually takes a lot of knowledge and depth to be effective at integration and “connecting the dots” in a meaningful way.  It’s like being a skilled entrepreneur or a skilled business developer.   Not just anybody who wants to generalize can be effective.  

True integration specialists are great pattern matchers and have deep skills in putting things together to make a better whole.

I was reading the book Business Development: A Market-Oriented Perspective where Hans Eibe Sørensen introduces the concept of an Integrating Generalist and how they make the world go round.

I wrote a post about it on Sources of Insight:

The Integrating Generalist and the Art of Connecting the Dots

Given the description, I’m not sure which is better, the Integration Specialist or the Integrating Generalist.  The value of the Integrating Generalist is that it breathes new life into people that want to generalize so that they can put the bigger puzzle together.  Rather than de-value generalists, this label puts a very special value on people that are able to fit things together.

In fact, the author claims that it’s Integrating Generalists that make the world go round.

Otherwise, there would be a lot of great pieces and parts, but nothing to bring them together into a cohesive whole.

Maybe that’s a good metaphor for the Integrating Generalist.  While you certainly need all the parts of the car, you also need somebody to make sure that all the parts come together.

In my experience, Integrating Generalists are able to help shape the vision, put the functions that matter in place, and make things happen.

I would say the most effective Program Managers I know do exactly that.

They are the Oil and the Glue for the team because they are able to glue everything together, and, at the same time, remove friction in the system and help people bring out their best, towards a cohesive whole.

It’s synergy in action, in more ways than one.

You Might Also Like

Anatomy of a High-Potential

E-Shape People, Not T-Shape

Generalists vs. Specialists

Categories: Architecture, Programming

Stuff The Internet Says On Scalability For May 22nd, 2015

Hey, it's HighScalability time:

Where is the World Brain? San Fernando marshes in Spain (by Cristobal Serrano)
  • 569TB: 500px total data transfer per month; 82% faster: elite athletes' brains; billions and millions: Facebook's graph store read and write load; 1.3 billion: daily Pinterest spam fighting events; 1 trillion: increase in processing power performance over six decades; 5 trillion: Facebook pub-sub messages per day
  • Quotable Quotes:
    • Silicon Valley: “Tell me the truth,” Gavin demands of a staff member. “Is it Windows Vista bad? Zune bad?” “I’m sorry,” the staffer tells Gavin, “but it’s Apple Maps bad!”
    • @garybernhardt: Reminder to people whose "big data" is under a terabyte: servers with 1 TB RAM can be had about $20k. Your data set fits in RAM.
    • @epc: μServices and AWS Lambda are this year’s containers and Docker at #Gluecon
    • orasis: So by this theory the value of a tech startup is the developer's laptops and the value of a yoga studio is the loaner mats.
    • @ajclayton: An average attacker sits on your network for 229 days, collecting information. @StephenCoty #gluecon
    • @mipsytipsy: people don't *cause* problems, they trigger latent conditions that make failures more likely.  @allspaw on post mortems #srecon15europe
    • @pas256: The future of cloud infrastructure is a secure, elastically scalable, highly reliable, and continuously deployed microservices architecture
    • Kevin Marks: The Web is the network
    • @cdixon: We asked for flying cars and all we got was the entire planet communicating instantly via $34 pocket supercomputers 
    • @ajclayton: Uh oh, @pas256 just suggested that something could be called a "nanoservice"...microservices are already old. #gluecon
    • @jamesurquhart: A sign that containers are interim step? Pkging procs better than pkging servers, but not as good as pkging functs? 
    • @markburgess_osl: Let's rename "immutable infrastructure" to "prefab/disposable" infrastructure, to decouple it from the false association with functionalprog
    • @Beaker: Key to startup success: solve a problem that has been solved before but was constrained due to platform tech cost or non-automated ops scale
    • @mooreds: 10M req/month == $45 for lambda.  Cheap. -- @pas256 #gluecon
    • @ajclayton: Microservices "exist on all points of the hype cycle simultaneously" @johnsheehan #gluecon
    • @oztalip: "Treat web server as a library not as a container, start it inside your application, not the other way around!" -@starbuxman #GOTOChgo
    • @sharonclin: If a site doesn't load in 3 sec, 57% abandon, 80% never return.  @krbenedict #m6xchange #Telerik
    • QuirksMode: Tools don’t solve problems any more, they have become the problem.
    • @rzazueta: Was considering taking a shot every time I saw "Microservices" on the #gluecon hashtag. But I've already gone through two livers.
    • @MariaSallis: "If you don't invest in infrastructure, don't invest in microservices" @johnsheehan #gluecon
    • Brian Gallagher: If the world devolved into a single cloud provider, there would be no need for Cloud Foundry.
    • @b6n: startup idea: use technology from the 70s.
    • Stephen Hawking: The greatest enemy of knowledge is not ignorance, it is the illusion of knowledge
    • @aneel: "Monolithic apps have unlimited invisible internal dependencies" -@adrianco #gluecon
    • @windley: microservices don’t reduce complexity, they move it around, from dev to ops. #gluecon
    • @paulsbruce: "When everyone has to be an expert in everything, that doesn't scale." @dberkholz @451research #gluecon
    • @oamike: I didn’t do SOA right, I didn’t do REST right, I’m sure as hell not going to do micro services right. #gluecon @kinlane
    • Urs Hölzle: My biggest worry is that regulation will threaten the pace of innovation.
    • @mccrory: There has been an explosion in managed OpenStack solutions - Platform9, MetaCloud, BlueBox
    • @viktorklang: Remember that you heard it here first, CPU L1 cache is the new disk.

  • This is more a measure of the fecundity of the ecosystem than an indication of disease. By its very nature, the magic creation machine that is Silicon Valley must create both wonder and bewilderment. Silicon Valley Is a Big Fat Lie: That gap between the Silicon Valley that enriches the world and the Silicon Valley that wastes itself on the trivial is widening daily.

  • In a liquidity crisis all those promises mean nothing. RadioShack Sold Your Data to Pay Off Its Debts.

  • YouTube has to work at it too. To Take On HBO And Netflix, YouTube Had To Rewire Itself: All of the things that InnerTube has enabled—faster iteration, improved user testing, mobile user analytics, smarter recommendations, and more robust search—have paid off in a big way. As of early 2015, YouTube was finally becoming a destination: On mobile, 80% of YouTube sessions currently originate from within YouTube itself.

  • If you aren't doing web stuff, do you really need to use HTTP? Do you really know why you prefer REST over RPC? There's no reason for API requests to pass through an HTTP stack.

  • If scaling is specialization and the cloud is the computer then why are we still using TCP/IP between services within a datacenter? Remote Direct Memory Access is fast. FaRM: Fast Remote Memory: FaRM’s per-machine throughput of 6.3 million operations per second is 10x that reported for Tao. FaRM’s average latency at peak throughput was 41µs which is 40–50x lower than reported Tao latencies. 

  • MigratoryData with 10 Million Concurrent Connections on a single commodity server. Lots of details on how the benchmark was run and the various configuration options. CPU usage under 50% (with spikes), memory usage was predictable, network traffic was  0.8 Gbps for 168,000 messages per second, 95th Percentile Latency: 374.90 ms. Next up? C100M.

  • Does anyone have a ProductHunt invite that they would be willing share with me?

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Database Scaling Redefined: Scaling Demanding Queries, High Velocity Data Modifications and Fast Indexing All At Once for Big Data

This is a guest post by Cihan Biyikoglu, Director of Product Management at Couchbase.

Question: A few million people are out looking for a setup to efficiently live and interact. What is the most optimized architecture they can use?

  1. Build one giant high-rise for everyone,
  2. Build many single-family homes OR
  3. Build something in between?

Schools, libraries, retail stores, corporate HQs and homes are all there to optimize a variety of interactions. Sizes of groups and types of exchange vary drastically… It turns out that what we have chosen to do is build all of the above. To optimize different interactions, different architectures make sense.

While high-rises can be effective for interactions among a high density of people on a small amount of land, it is impractical to build 500-story buildings. It is also hard to add or remove floors as you need them. So high-rises feel awfully like scaling up: a cluster of processors communicating over fast memory to compute fast, but with a limited scale ceiling and limited elasticity.

As your home, the single-family architecture works great. A nice backyard to play in and private space for family dinners... You may need to get in your car to interact with other families, BUT it is easy to build more single-family houses: easy elasticity and scale. The single-family structure feels awfully like scaling out, doesn't it? A cluster of commodity machines that communicate over slower networks and come with great elasticity.

“How does this all relate to database scalability?” you ask…

Categories: Architecture

Paper: FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs

It's amazing what you can accomplish these days on a single machine using SSDs and smart design. In the paper FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs they:

demonstrate that FlashGraph is able to process graphs with billions of vertices and hundreds of billions of edges on a single commodity machine.

The challenge is SSDs are a lot slower than RAM:

The throughput of SSDs is an order of magnitude less than that of DRAM and the I/O latency is multiple orders of magnitude slower. Also, I/O performance is extremely non-uniform and needs to be localized. Finally, high-speed I/O consumes many CPU cycles, interfering with graph processing.

Their solution exploits caching, parallelism, smart scheduling and smart placement algorithms:

We build FlashGraph on top of a user-space SSD file system called SAFS [32] to overcome these technical challenges. The set-associative file system (SAFS) refactors I/O scheduling, data placement, and data caching for the extreme parallelism of modern NUMA multiprocessors. The lightweight SAFS cache enables FlashGraph to adapt to graph applications with different cache hit rates. We integrate FlashGraph with the asynchronous user-task I/O interface of SAFS to reduce the overhead of accessing data in the page cache and memory consumption, as well as overlapping computation with I/O.

The result performs up to 80% of its in-memory implementation:

We observe that in many graph applications a large SSD array is capable of delivering enough I/Os to saturate the CPU. This suggests the importance of optimizing for CPU and RAM in such an I/O system. It also suggests that SSDs have been sufficiently fast to be an important extension for RAM when we build a machine for large-scale graph analysis applications.


Categories: Architecture

Do you have a shared vocabulary?

Coding the Architecture - Simon Brown - Mon, 05/18/2015 - 16:57

"This is a component of our system", says one developer, pointing to a box on a diagram labelled "Web Application". Next time you're sitting in a conversation about software design, listen out for how people use terms like "component", "module", "sub-system", etc. We can debate whether UML is a good notation to visually communicate the design of a software system, but we have a more fundamental problem in our industry. That problem is our vocabulary, or rather, the lack of it.


I've been running my software architecture sketching exercises in various formats for nearly ten years, and thousands of people have participated. The premise is very simple - here are some requirements, design a software solution and draw some pictures to illustrate the design. The range of diagrams I've seen, and still see, is astounding. The percentage of people who choose to use UML is tiny, with most people choosing to use an informal boxes and lines notation instead. With some simple guidance, the notational aspects of the diagrams are easy to clean up. There's a list of tips in my book that can be summarised with this slide from my workshop.

Some tips for effective sketches


What's more important though is the set of abstractions used. What exactly are people supposed to be drawing? How should they think about, describe and communicate the design of their software system? The primary aspect I'm interested in is the static structure. And I'm interested in the static structure from different levels of abstraction. Once this static structure is understood and in place, it's easy to supplement it with other views to illustrate runtime/behavioural characteristics, infrastructure, deployment models, etc.

In order to get to this point though, we need to agree upon some vocabulary. And this is the step that is usually missed during my workshops. Teams charge headlong into the exercise without having a shared understanding of the terms they are using. I've witnessed groups of people having design discussions using terms like "component" where they are clearly not talking about the same thing. Yet everybody in the group is oblivious to this. For me, the answer is simple. Each group needs to agree upon the vocabulary, terminology and abstractions they are going to use. The notation can then evolve.

My Simple Sketches for Diagramming Your Software Architecture article explains why I believe that abstractions are more important than notation. Maps are a great example of this. Two maps of the same location will show the same things, but they will often use different notations. The key to understanding these different maps is exactly that - a key tucked away in the corner of each map somewhere. I teach people my C4 model, based upon a simple set of abstractions (software systems, containers, components and classes), which can be used to describe the static structure of a software system from a number of different levels of abstraction. A common set of abstractions allows you to have better conversations and easily compare solutions. In my workshops, the notation people use to represent this static structure is their choice, with the caveat that it must be self-describing and understandable by other people without explanation.

Next time you have a design discussion, especially if it's focussed around some squiggles on a whiteboard, stop for a second and take a step back to make sure that everybody has a shared understanding of the vocabulary, terminology and abstractions that are being used. If this isn't the case, take some time to agree upon it. You might be surprised with the outcome.

Categories: Architecture

How MySQL is able to scale to 200 Million QPS - MySQL Cluster

This is a guest post by Andrew Morgan, MySQL Principal Product Manager at Oracle.


The purpose of this post is to introduce MySQL Cluster - which is the in-memory, real-time, scalable, highly available version of MySQL. Before addressing the incredible claim in the title of 200 Million Queries Per Second it makes sense to go through an introduction of MySQL Cluster and its architecture in order to understand how it can be achieved.

Introduction to MySQL Cluster
Categories: Architecture

Agile goes beyond Epic Levels

Xebia Blog - Fri, 05/15/2015 - 17:10

A snapshot from my personal backlog last week:

  • The Agile transformation at ING was front-page news in the Netherlands. This made us realize even more how epic this transformation and assignment actually are.
  • The Agile-built hydrogen race car from TU Delft set an official track record on the Nürburgring. We're proud of our guys in Delft!
  • Hanging out with Boeing's Agile champs at their facilities in Seattle, exchanging knowledge. Impressive and extremely fruitful!
  • Coaching the State of Washington on their ground breaking Agile initiatives together with my friend and fellow consultant from ScrumInc, Joe Justice.

One thing became clear for me after a week like this: Something Agile is cookin’.  And it’s BIG!

In this blog I will be explaining why and how Agile will develop in the near future.

Introduction; what’s happening?

Humankind is currently facing the biggest era change since the 19th century. Our industries, education, technologies and society are simply no longer suited to today’s and tomorrow’s needs. Some systems, like healthcare and the economy, are so broken that they actually should be reinvented. Everything has just become too vulnerable and complex. Merely “lubricating the engine”, for example by quantitative easing of the economy, is no longer a sustainable solution. As Russell Ackoff said, you can only fix a system as a whole, not by fixing its separate parts.

This reinvention will only succeed when we are able to learn and adjust our systems very rapidly. Agile, Lean and a different way of organizing ourselves can make this a reality. Lean will provide us with the right tools to do exactly what’s needed, nothing more, nothing less. But applying Lean for efficiency purposes only will not bring the innovations and creativity we need. We also need an additional catalyst and engine: Agile. It will provide us with the right mindset and tools to innovate, inspect and adapt very fast. And finally, we must reorganize ourselves around cooperation, not directive command and control. The latter was useful in the industrial revolution, not in our modern, complex times.

Agile’s for everything

Unlike most people think, Agile is not only a software development tool. You can apply it to almost everything. For example, as Xebia consultants we've successfully coached Agile and Lean non-IT initiatives in Marketing, Innovation, Education, Automotive, Aviation and non-profit organizations. It just simply works! And how: a productivity increase of 500% is no exception. But above all, team members and customers are much happier.

Agile's for everybody

At this moment, a lot of people are still unconsciously addicted to their patterns and unaware of the unlimited possibilities out there. It’s like taking a walk in the forest: you can bring your own lunch like you always do, but there are some delicious fruits out there for free! Technology like 3D printing offers unlimited possibilities straight on your desk, where only a few years ago you needed a complicated, million-dollar machine for this. The same goes for Agile. It’s open source and out there waiting for you. It will also help you get more out of all these new, awesome developments!

The maturity of Agile explained

Until recently, most agile initiatives emerged bottom up, but stalled on a misfit between Agile and (conventional) organizations.  Loads of software was produced, but could not be brought to production, simply because the whole development chain was not Agile yet. Tools like TDD and Continuous Integration improved the situation significantly, but dependencies were still not managed properly most of the time.


The last couple of years, some good scaled agile frameworks like LeSS and SAFe emerged. They manage the dependencies better, but do not directly encourage the Agile mindset and motivation of people. In parallel, departments like HR, Control and Finance were struggling with Agile. A scaled agile framework was implemented, but the hierarchical organization structure was not adjusted, creating a gap between fast-moving Agile teams and departments still hindered by non-Agile procedures, processes and systems.

Therefore, we see conventional organizations moving towards a more Agile, community-based model like Spotify, Google or Zappos. ING is now transforming towards a similar organization model.

Predictions for the near future

My expectation is that we will see Agile transformations continue on a much wider scale. For example, people developing their own products in an agile fashion while using 3D printing. Governments will use Agile and Holacracy to solve issues like reshaping the economic system together with society. Or, as I observed last week, the State of Washington government using these techniques successfully to solve the challenges they're facing.

For me, it currently feels like the early Nineties again, when the Internet emerged. At that time I explained to many people that the Internet would be like electricity for them in the near future. Most people laughed and stated it was just a new way of communicating. The same applies now to the Agile mindset. It's not just hype or a corporate tool. It will reshape the world as we know it today.





Stuff The Internet Says On Scalability For May 15th, 2015

Hey, it's HighScalability time:

Stand atop a volcano and survey the universe. (By Shane Black & Judy Schmidt)
  • 1 million: Airbnb's room inventory; 2 billion: Telegram messages sent daily; Two billion: photos shared daily on Facebook; 10,000: sensors in every Airbus wing
  • Quotable Quotes:
    • Silicon Valley: “We’re about shaving yoctoseconds off latency for every layer in the stack,” he said. “If we rent from a public cloud, we’re using servers that are, by definition, generic and unpredictable.”
    • @liviutudor: Netflix: approx 250 Cassandra clusters over 7,000+ server instances #cloud
    • @GreylockVC: "More billion-dollar marketplaces will be created in the next five years than in the previous 20." - @simonrothman 
    • CDIXON: Exponential growth curves in the “feels gradual” phase are deceptive. There are many things happening today in technology that feel gradual and disappointing but will soon feel sudden and amazing.
    • @badnima: OH: "The gossip protocol has reached its scaling limits"
    • marcosdumay: People get pretty excited every time physicists talk about information. The bottom line is that information manipulation is just Math, viewed by a different angle.
    • Bill Janeway: There's only one way to hedge against uncertainty in venture and control. Enough cash that when something goes wrong you can buy time to figure out what is and assess what you can do about it. 
    • zylo4747's coworker: Where's the step about preparing to have all your plans crushed and rushing shit out the door as fast as possible?
    • Martin Fowler: don't even consider microservices unless you have a system that's too complex to manage as a monolith. 
    • @postwait: Ingesting, querying, & visualizing data isn't a monitoring system. It isn't even sufficient plumbing for such a system. #srecon15europe
    • @techsummitpr: "Up to date weather conditions? It's not a marvel from Google, it's a marvel from the National Weather Service." @timoreilly #techsummitpr
    • @sovereignfund: Verified as legit: The top 25 hedge fund managers earn more than all kindergarten teachers in U.S. combined. 
    • Adrian Colyer: In their evaluation, the authors found that mixing MapReduce and memcached traffic in the same network extended memcached latency in the tail by 85x compared to an interference free network. 
    • @BenedictEvans: US ecommerce revenues 1999: $12bn 2013: $219bn
    • Gregory Hickok: the brain samples the world in rhythmic pulses, perhaps even discrete time chunks, much like the individual frames of a movie. From the brain’s perspective, experience is not continuous but quantized.
    • David Bollier: There is no master inventory of commons. They can arise whenever a community decides it wishes to manage a resource in a collective manner, with a special regard for equitable access, use and sustainability.

  • What’s Next for Moore’s Law?: I predict that Intel's 10nm process technology will use Quantum Well FETs (QWFETs) with a 3D fin geometry, InGaAs for the NFET channel, and strained Germanium for the PFET channel, enabling lower voltage and more energy efficient transistors in 2016, and the rest of the industry will follow suit at the 7nm node.

  • Don't read How to Build a Unicorn From Scratch – and Walk Away with Nothing if you are easily frightened. Years of work down the drain. **chills** To walk safely through the Valley: Focus on terms, not just valuation; Build a waterfall; Don’t do bad business deals just to get investment capital; Understand the motivations of others; Understand your own motivation.

  • How do you build a real-time chat system? Scaling Secret: Real-time Chat. Goal was to handle 50,000 simultaneous conversations. Pusher was used to deliver messages. For a database Secret used Google App Engine’s High-Replication Datastore. Some nice details on the schema and other issues. Good thread on HN where the main point of contention is should an expensive service like Pusher be used to do something so simple? Usual arguments about wasting money vs displaying your hacker plumage. 

  • Under the hood: Facebook’s cold storage system. A top to bottom reengineering to save power for infrequently accessed photos. Yes, that's cool. Each cold storage datacenter uses 1/6th the energy as a normal datacenter while storing hundreds of petabytes of data. Erasure coding is used to store data. Data is scanned every 30 days to recreate any lost data.  As capacity is added data is rebalanced to the new racks. No file system is used at all. 
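The erasure-coding idea mentioned above can be illustrated with a toy sketch: instead of keeping full replicas, store data plus parity so a lost block can be reconstructed. Real systems (Facebook's included) use Reed-Solomon codes across many blocks; this minimal illustration uses simple XOR parity across two equal-length data blocks:

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(block1: bytes, block2: bytes) -> bytes:
    """Return a parity block for two equal-length data blocks."""
    return xor_bytes(block1, block2)

def recover(surviving: bytes, parity: bytes) -> bytes:
    """Reconstruct the lost block from the survivor and the parity block."""
    return xor_bytes(surviving, parity)

data1, data2 = b"photo-a1", b"photo-b2"
parity = encode(data1, data2)

# Simulate losing data1: rebuild it from data2 and the parity block.
assert recover(data2, parity) == data1
```

With XOR parity you pay 50% storage overhead for two blocks instead of 200% for triple replication, which is the economics that makes cold storage viable.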


Categories: Architecture

Suprastructure – how come “Microservices” are getting small?


Now, I don’t want to get off on a rant here*, but it seems like “Microservices” are all the rage these days – at least judging from my Twitter, Feedly and Prismatic feeds. I already wrote that, in my opinion, “Microservices” is just a new name for SOA. I thought I’d give a couple of examples of what I mean.

I worked on systems years ago (as early as 2004/5) that today would pass for microservices. For instance, in 2007 I worked at a startup called xsights. We developed something like Google Goggles for brands (or a barcodeless barcode), so users could snap a picture of an ad, brochure, etc. and get relevant content or perks in response (e.g. we ran campaigns in Germany with a book publisher where MMSing shots of newspaper ads or outdoor signage returned information and discounts on the advertised books). The architecture driving that was a set of small, focused, autonomous services. Each service encapsulated its own data store (if it had one), and services were replaceable (e.g. we had MMS, web, apps & 3G video call gateways). We developed the infrastructure to support automatic service discovery and the ability to create ad-hoc long-running interactions, a.k.a. Sagas, that enabled different cooperations between the services (e.g. the flow for a 3G video call needed more services for fulfilment than an app one). You can read a bit about it in the “putting it all together” chapter of my SOA Patterns book or view the presentation I gave at QCon a few years back called “Building reliable systems from unreliable components” (see slides below); both elaborate some more on that system.

Another example is a naval command and control system I designed (along with Udi Dahan) back in 2004 for an unmanned surface vessel (like a drone, but on water). In that system we had services like “Navigation”, which suggested navigation routes based on waypoints and data from other services (e.g. weather); a “Protector” service that handled communications to and from the actual USVs; a “Common Operational Picture” (COP) service that aggregated target data from external services and sensors (e.g. the ones on the protectors); an “Alerts” service where business rules could trigger various actions, etc. These services communicated using events and messages, with flows like: the protector publishes its current positioning, the COP publishes updated target positions (protector + other targets), the navigation service spots a potential interception problem and publishes that, the alerts service identifies that the potential problem crosses its threshold and triggers an alert to users, who then initiate a request for alternate navigation plans, etc. Admittedly, some of these services could have been smaller and more focused, but they were still autonomous, with separate storage – and hey, that was 2004 :)
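The event-driven flow between those services can be sketched in a few lines. The service names follow the description above, but the bus itself is a toy in-process stand-in for the real messaging infrastructure, and the threshold and positions are made up:

```python
from collections import defaultdict

class EventBus:
    """Toy in-process stand-in for the messaging layer between services."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self.subscribers[topic]:
            handler(event)

bus = EventBus()
alerts = []

# COP service: aggregates reported positions into an operational picture.
bus.subscribe("position", lambda e: bus.publish("picture", {"targets": [e]}))
# Navigation service: spots a potential interception and publishes it.
bus.subscribe("picture", lambda e: bus.publish("interception", {"distance_m": 40}))
# Alerts service: a business rule triggers an alert past a threshold.
bus.subscribe("interception",
              lambda e: alerts.append("ALERT") if e["distance_m"] < 100 else None)

# The protector publishes its current positioning, kicking off the chain.
bus.publish("position", {"vessel": "protector-1", "lat": 32.1, "lon": 34.8})
assert alerts == ["ALERT"]
```

The point is that no service calls another directly; each only reacts to events and publishes new ones, which is what kept the services autonomous and replaceable.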

So, what changed in the last decade? For one, I guess that after years of “enterprisy” hype ruined SOA’s name, the actual architectural style is finally getting some traction (even if it had to change its name for that to happen).

However, this post is not just a rant on Microservices…

The more interesting change is the shift in the role of infrastructure: from a set of libraries and tools embedded within the software we write, to larger constructs running outside the software and running/managing it – or in other words, the emergence of “suprastructure” instead of infrastructure (infra = below, supra = above). It isn’t that infrastructure vanishes, but a lot of its functionality is “outsourced” to suprastructure. This is something that started a few years back with PaaS, but (IMHO) it is getting more acceptance and use in the last couple of years, especially with the growing popularity of Docker (and, more importantly, its ecosystem).

Consider, for example, the architecture of AppsFlyer, which I recently joined (you can listen to Nir Rubinshtein, our system architect, presenting it (in Hebrew), or check out the slides on Speaker Deck or below (English)).

Instead of writing or using elaborate service hosts and application servers, you can host simple apps in Docker; run and schedule them with Mesos; get cluster and discovery services from Consul; recover from failure by rereading logs from Kafka, etc. Back in the day we also had these capabilities, but we wrote tons of code to make them happen – tons of code that was specific to the solution and technology (and was prone to bugs and problems). Today, all these capabilities are available almost off the shelf, everywhere: on premise, on clouds and even across clouds.
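To make the contrast concrete: where we once wrote custom discovery code, registering a service with Consul today is one small JSON payload sent to the agent's HTTP API (`PUT /v1/agent/service/register`). A minimal sketch follows – the service name, port and health-check path are made up for illustration, and actually posting the payload requires a running Consul agent (by default on `localhost:8500`), so the HTTP call is shown as a comment rather than executed:

```python
import json

def registration_payload(name: str, port: int, health_path: str = "/health") -> dict:
    """Build a Consul agent service-registration body with an HTTP health check."""
    return {
        "Name": name,
        "Port": port,
        "Check": {
            "HTTP": f"http://localhost:{port}{health_path}",
            "Interval": "10s",
        },
    }

payload = registration_payload("attribution-api", 8080)
print(json.dumps(payload, indent=2))

# To actually register, PUT this JSON to a local Consul agent:
#   curl -X PUT --data @payload.json \
#       http://localhost:8500/v1/agent/service/register
```

Once registered, other services can find this one through Consul's DNS or HTTP interfaces instead of any hand-rolled discovery mechanism – that is the "outsourcing" to suprastructure in practice.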

The importance of suprastructure with regard to “microservices” is that this “outsourcing” of functionality helps drive down the overhead and costs associated with making services small(er). In previous years, the threshold at which useful services degenerated into nanoservices was easy to cross. Today it is almost reversed – you spend the effort of setting up all this suprastructure once, and you really begin to see the return only when you have enough services to make it worthwhile.

Another advantage of suprastructure is that it makes polyglot services easier – it is easier to write different services using different technologies. Instead of investing in a lot of technology-specific infrastructure, you can get more generic capabilities from the suprastructure and spend more time solving the business problems with the right tool for the job. It also makes it easier to change and evolve technologies over time – again saving the sunk costs of investing in elaborate infrastructure.


Of course, that’s just my opinion – I could be wrong…*

PS – we (AppsFlyer) are hiring: UI tech lead, data scientist, senior devs and more… :)

Building reliable systems from unreliable components

* with apologies to Dennis Miller

Categories: Architecture