
Software Development Blogs: Programming, Software Testing, Agile Project Management



Your Load Generator is Probably Lying to You - Take the Red Pill and Find Out Why

Pretty much all your load generation and monitoring tools do not work correctly. Those charts you thought were full of relevant information about how your system is performing are really just telling you a lie. Your sensory inputs are being jammed. 

To find out how, listen to Gil Tene, the Morpheus of performance monitoring and CTO and co-founder at Azul Systems, makers of truly high performance JVMs, in a mesmerizing talk on How NOT to Measure Latency.

This talk is about removing the wool from your eyes. It's the red pill option for what you thought you were testing with load generators.

Some highlights:

  • If you want to hide the truth from someone, show them a chart of all normal traffic with just one bad spike surging into 95th percentile territory.

  • The number one indicator you should never get rid of is the maximum value. That’s not noise, it’s the signal, the rest is noise.

  • 99% of users experience ~99.995%’ile response times, so why are you even looking at 95%'ile numbers? (See the sketch after this list.)

  • Monitoring tools routinely drop important samples in the result set, leading you to draw really bad conclusions about the quality of the performance of your system.
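
To make the spike-hiding effect concrete, here is a minimal, hypothetical C sketch (my own illustration, not from Gil's talk): a single multi-millisecond stall barely moves the 95th percentile of a sample, while the maximum reports it faithfully.

#include <stdio.h>
#include <stdlib.h>

/* qsort comparator: ascending order of longs. */
static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Value at the given percentile (0-100) of an already sorted sample. */
static long percentile(const long *sorted, size_t n, double pct)
{
    size_t idx = (size_t)((pct / 100.0) * (n - 1));
    return sorted[idx];
}

int main(void)
{
    /* 10,000 synthetic latencies in microseconds: mostly ~100 us, one 2 ms stall. */
    enum { N = 10000 };
    static long lat[N];
    for (size_t i = 0; i < N; i++)
        lat[i] = 100 + (long)(i % 7);   /* "normal" traffic */
    lat[N / 2] = 2000;                  /* the one bad spike */

    qsort(lat, N, sizeof lat[0], cmp_long);

    printf("p95 = %ld us\n", percentile(lat, N, 95.0));  /* still ~106 us */
    printf("max = %ld us\n", lat[N - 1]);                /* reports the 2 ms stall */
    return 0;
}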

It doesn't take long into the talk to realize Gil really knows his stuff. It's a deep talk with deep thoughts based on deep experience, filled with surprising insights. So if you take the red pill, you'll learn a lot, but you may not always like what you've learned.

Here's my inadequate gloss on Gil's amazing talk:

How to Lie With Percentiles
Categories: Architecture

Blogging Resources at a Glance

I’ve put together a massive collection of the best-of-the-best blogging resources so they are at your fingertips:

It’s a serious collection of blogging resources including:

  • Getting Started Blogging
  • Start Your Blog
  • Articles on Blogging
  • Books on Blogging
  • Checklists for Blogging
  • Courses for Blogging (Free + Paid)
  • Guides for Blogging (Free + Paid)
  • How They Got Started
  • Podcasts on Blogging
  • Success Stories of Bloggers
  • Videos on Blogging

And by serious, I mean serious.  It’s a hard-core collection of some of the best blogging resources that will help you succeed where others fail.

I will continue to add blogging resources, but you will already find a treasure trove of great articles, books, podcasts, videos and more to help you start your blog, improve your blog, or bring an old blog back to life.

I help a lot of people start blogs.  I shave years of potentially painful lessons off of their learning curve, so they can get started doing more of what they love, avoid some of the many pitfalls, and build a blog they love (if it feels like a chore, you’re doing it wrong.)

If you haven’t already started a blog, this might be just the resource roundup you need to help you get started and to help you leapfrog ahead.

There are lots of reasons why you might start a blog, if you haven't already.  Maybe you want to start a movement.  Maybe you want to land your next dream job.  Maybe you want to make friends around the world.  Maybe you want to explore your creativity.  Maybe you want to launch a writing career and build your next book.  Maybe you want to build an online business, one post at a time.

The thing that I try to teach people is that working on your blog is working on your life.  You learn a lot about your personal productivity, your values, your ability to ship ideas, your ability to connect with people, and ultimately, what you want to spend more time doing.  A blog is a great way to build a personal platform for giving your best, where you have your best to give, in the service of others.

And if you monetize your blog, and if you master creating and capturing value, it can be one of the smartest ways to combine passion and profit.  The key to keep in mind is to do what you would do for free, but blend it with doing what people will pay you for, in a way that uses your unique strengths, makes you come alive, adds value, and helps change the world in your way.

Everybody has ideas.  Some share them.  Some shape them. Some ship them.  Some productize them.  Some let them die.

Put a little dent in the universe, a post at a time.

Categories: Architecture, Programming

GCC Compiler Optimizations: Dissection of a Benchmark

Xebia Blog - Mon, 10/05/2015 - 08:06

The idea for this post came from another blog post that compared the performance of a little benchmark in C, Go and Python. The surprising result in that blog was that the Go implementation performed much better than the C version.

The benchmark was a simple program that took one command line argument and computed the sum of all integers up to that argument.
I wanted to see what was going on, so I ran it locally and indeed, when invoked with the parameter 100,000,000 it took 0.259 seconds for the C implementation to finish and only 0.140 seconds for the Go version.

The C version of the benchmark is the following:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
  int arg1 = 1;

  arg1 = atoi(argv[1]);

  long a;
  long sum = 0;
  /* for loop execution */
  for( a = 0; a < arg1; a++ )
      sum += a;
  printf("sum: %ld\n", sum);
  return 0;
}
The blog suggested using some optimization flags for compiling the C program, so I tried:

gcc -O3 bench.c -o bench -march=native

The -O3 flag tells gcc to use its most aggressive optimization strategies, and -march=native tells it to take advantage of the local CPU when compiling to machine code.
This had quite a dramatic effect: Instead of 0.259 seconds, the entire program now took only 0.001 seconds!
And it stayed this low when I increased the parameter to 1,000,000,000. So it seems that the compiler has rewritten our program to an implementation which only takes constant time.

To find out what was causing this amazing speedup I compiled again using the -S flag, so I would get the assembly code. Here it is, including some comments that explain the instructions:

        .section        __TEXT,__text,regular,pure_instructions
        .macosx_version_min 10, 10
        .globl  _main
        .align  4, 0x90
_main:                                  ## @main
## BB#0:
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        movq    8(%rsi), %rdi
        callq   _atoi                ; Converts the argument to integer, put the result in ax
        xorl    %esi, %esi
        testl   %eax, %eax
        jle     LBB0_2               ; Skip the calculation if ax is 0  
## BB#1:                                ##
        cltq                         ; the next couple of lines represent the for-loop:
        leaq    -1(%rax), %rdx       ; dx = ax - 1
        leaq    -2(%rax), %rcx       ; cx = ax - 2
        mulxq   %rcx, %rcx, %rdx     ; cx = cx * dx
        shldq   $63, %rcx, %rdx      ; dx = cx / 2
        leaq    -1(%rax,%rdx), %rsi  ; si = dx + ax - 1
        leaq    L_.str(%rip), %rdi   ; ready, transform the result to a string
        xorl    %eax, %eax
        callq   _printf              ; and print it
        xorl    %eax, %eax
        popq    %rbp

        .section        __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
        .asciz  "sum: %ld\n"


The compiler has transformed the for-loop into a single calculation:
(ax - 1) * (ax - 2) / 2 + ax - 1, which can be simplified to: ax * (ax - 1) / 2
This is the well known closed-form formula for the sum of the integers 0 through ax - 1, i.e. an arithmetic series with ax terms. GCC has recognized that our loop could be rewritten as a single expression!
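
As a quick sanity check (my own snippet, not from the original post), the loop and the closed form the compiler derived can be compared side by side:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    long n = (argc > 1) ? atol(argv[1]) : 100000000L;

    /* The loop as written in the benchmark: sum of 0 .. n-1. */
    long loop_sum = 0;
    for (long a = 0; a < n; a++)
        loop_sum += a;

    /* The constant-time formula GCC effectively substitutes. */
    long formula_sum = n * (n - 1) / 2;

    printf("loop:    %ld\n", loop_sum);
    printf("formula: %ld\n", formula_sum);
    return 0;
}

Both print the same value; only the loop version's running time grows with n.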

It also uses a couple of micro optimizations along the way. For instance, to compute dx = ax - 1, the lines:

mov %rax, %rdx
dec %rdx

would have done the trick. However, the compiler chooses to do this in a single instruction:

leaq    -1(%rax), %rdx

This instruction was originally meant for manipulating address pointers with offsets, but it can be used to perform simple 64-bit additions as well. On modern CPUs it is faster than the two separate instructions, and it saves an instruction.
Another compiler trick is the following line:

shldq   $63, %rcx, %rdx

The shld instruction performs a shift left on the second operand, rcx, and the overflow bits are moved into the third operand, rdx. By left-shifting the 64-bit rcx register 63 positions into rdx, it effectively performs an integer division by 2 on rcx, with the result ending up in rdx.
This again saves an instruction, but this time the result is only as fast as moving rcx into rdx and then dividing rdx by 2 (with a shift right).


The conclusion: writing a proper benchmark is tough, comparing the performance of languages is difficult and modern compilers are amazingly clever.
An open question for me remains what kind of algorithm the compiler uses to recognize the nature of our for-loop. Does it use some simple heuristic or is our for-loop a special case of a generic class of loops that can be simplified?

Stuff The Internet Says On Scalability For October 2nd, 2015

Hey, it's HighScalability time:

Elon Musk's presentation of the Tesla Model X had more in common with a new iPhone event than a traditional car demo.
If you like Stuff The Internet Says On Scalability then please consider supporting me on Patreon.
  • 1.4 billion: Android devices; 1000: # of qubits in Google's new quantum computer; 150Gbps: Linux botnet DDoS attack; 3,000: iPhones sold per minute; Smith: the most common last name in the US; 50%: storage reduction by using erasure coding in Hadoop; 101: calories burned during sex.

  • Quotable Quotes:
    • @peterseibel: How to be a 10x engineer: help ten other engineers be twice as good.
    • The Master Algorithm: Scientists make theories, and engineers make devices. Computer scientists make algorithms, which are both theories and devices
    • @immolations: Feudalism may not be perfect but it's the best system we've got. More of us have chainmail today than at any point in history
    • @mjpt777: We managed to transfer almost 10 GB/s worth of 1000 byte messages via Aeron IPC. That's more than a 100GigE network. Way to scale up on box!
    • @caitie: lol what my services do 1.5 billion writes per minute ~25 million writes per second
    • @mjpt777: Think of your QPI links in a multi-socket server as a fast network. Communicate to share memory; don't share memory to communicate.
    • @aalmiray: "you can't have a second CPU until you prove you can use the first one" - @mjpt777
    • Periscope: a hard drive is over 3x faster than gigabit ethernet
    • thom: Any sufficiently complicated distributed architecture contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of SOAP.
    • @dabeaz: Instead of teaching everyone how to code, I wish we'd just focus on getting everyone's curiosity from kindergarten back.
    • Matthew Jones: It's a Catch-22. We need the metrics to choose the best architecture, but we need to actually implement the damn thing in order to get metrics, and implementation requires us to select an architecture. 
    • @jmwind: Today we built Shopify 500 times, deployed to prod 22 times, peaked at 700 build agents, spun 50k docker containers in test and 25k prod.
    • antirez: Redis, especially using pipelining, can serve an impressive amount of requests per second per thread (half a million is a common figure with very intensive pipelining. Without pipelining it is around 100,000 ops/sec). 
    • @jcox92: "This is my invitation to you to start using languages that were discovered rather than languages that were invented." #strangeloop
    • @tyler_treat: "Measuring latency at saturation is like looking at your bumper after wrapping your car around a pole." —@giltene
    • There are a lot of great quotes this week. So to see all of the Quotable Quotes please see the full article.

  • Another example of the diffusion of the software ethos. Elon Musk's presentation of the Tesla Model X had more in common with a new iPhone event than a traditional car demo. First, it was a livecast that started a touch late. Second, throngs of fanpeople clapped and whooped in all the appropriate places. Gone are the beauty shots of cars simply meant to stroke the lizard brain. Elon hit the use cases. He talked vision statement. He talked safety specs and features. He talked air quality in depth. He didn't wait for iFixit to do a tear down, he showed construction details and how they reinforced features and quality. He showed how the Falcon Wing door auto opened and closed; how the doors worked in a crowded parking lot; and how the door design also allowed passengers to easily access the third row of seats. This focus on the car as an engineered product for solving tangible problems in real life may be the lasting legacy of Tesla. 

  • Tools are to programmers like shoes are to the mundane fashion world. Which is what makes this discussion of Why Fogbugz lost to Jira in the bug tool wars so fascinating. In one corner we have gecko with a nice analysis of the FogBugz side and we have carlfish with a quality response from the Atlassian perspective. It's painful to remember how convoluted product deployment was before software as a service. 

  • How does the CIA provide advanced state-of-the-art analytics? On Amazon of course. Amazon birthed the CIA their own region in 9 months. The CIA decided the only way to reach commercial parity was to stop trying to do it themselves and leverage those who already know how to do it. The CIA will have its own private version of the marketplace so they can transition tools as fast as possible into the hands of analysts. The CIA really likes themselves some Spark. Partnering for expertise is something the CIA is trying to learn how to do. Oh, the CIA is hiring.

  • Jeff Atwood has the sense of this. Learning to code is overrated: An accomplished programmer would rather his kids learn to read and reason. One caveat: understanding algorithms will be a necessary life skill now and certainly in the future. We'll need to see algorithms for what they are, biased tools that serve someone else's purpose. It's common even among the learned today to see algorithms as objective and benign. The easiest way of piercing the algorithm-washing veil may be for people to learn a little programming. That may help demystify what's really going on.

  • Embrace, extend and extinguish. Amazon Will Ban Sale of Apple, Google Video-Streaming Devices. This kind of cross division strategy tax often marks the beginning of the end. Amazon is no longer an everything store. Once we begin to not think of going to Amazon First when shopping then we may transition to Amazon Maybe and then to Amazon Never. 

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

AzureCon Keynote Announcements: India Regions, GPU Support, IoT Suite, Container Service, and Security Center

ScottGu's Blog - Scott Guthrie - Thu, 10/01/2015 - 06:43

Yesterday we held our AzureCon event and were fortunate to have tens of thousands of developers around the world participate.  During the event we announced several great new enhancements to Microsoft Azure including:

  • General Availability of 3 new Azure regions in India
  • Announcing new N-series of Virtual Machines with GPU capabilities
  • Announcing Azure IoT Suite available to purchase
  • Announcing Azure Container Service
  • Announcing Azure Security Center

We were also fortunate to be joined on stage by several great Azure customers who talked about their experiences using Azure, including Nascar, Alaska Airlines, Walmart, and ThyssenKrupp.

Watching the Videos

All of the talks presented at AzureCon (including the 60 breakout talks) are now available to watch online.  You can browse and watch all of the sessions here.


My keynote to kick off the event was an hour long and provided an end-to-end look at Azure and some of the big new announcements of the day.  You can watch it here.

Below are some more details of some of the highlights:

Announcing General Availability of 3 new Azure regions in India

Yesterday we announced the general availability of our new India regions: Mumbai (West), Chennai (South) and Pune (Central).  They are now available for you to deploy solutions into.

This brings our worldwide presence of Azure regions up to 24 regions, more than AWS and Google combined. Over 125 customers and partners have been participating in the private preview of our new India regions.  We are seeing tremendous interest from industry sectors like Public Sector, Banking Financial Services, Insurance and Healthcare whose cloud adoption has been restricted by data residency requirements.  You can all now deploy your solutions too.

Announcing N-series of Virtual Machines with GPU Support

This week we announced our new N-series family of Azure Virtual Machines that enable GPU capabilities.  Featuring NVidia’s best of breed Tesla GPUs, these Virtual Machines will help you run a variety of workloads ranging from remote visualization to machine learning to analytics.

The N-series VMs feature NVidia’s flagship GPU, the K80 which is well supported by NVidia’s CUDA development community. N-series will also have VM configurations featuring the latest M60 which was recently announced by NVidia. With support for M60, Azure becomes the first hyperscale cloud provider to bring the capabilities of NVidia’s Quadro High End Graphics Support to the cloud. In addition, N-series combines GPU capabilities with the superfast RDMA interconnect so you can run multi-machine, multi-GPU workloads such as Deep Learning and Skype Translator Training.

Announcing Azure Security Center

This week we announced the new Azure Security Center—a new Azure service that gives you visibility and control of the security of your Azure resources, and helps you stay ahead of threats and attacks.  Azure is the first cloud platform to provide unified security management with capabilities that help you prevent, detect, and respond to threats.


The Azure Security Center provides a unified view of your security state, so your team and/or your organization’s security specialists can get the information they need to evaluate risk across the workloads they run in the cloud.  Based on customizable policy, the service can provide recommendations. For example, the policy might be that all web applications should be protected by a web application firewall. If so, the Azure Security Center will automatically detect when web apps you host in Azure don’t have a web application firewall configured, and provide a quick and direct workflow to get a firewall from one of our partners deployed and configured:


Of course, even with the best possible protection in place, attackers will still try to compromise systems. To address this problem and adopt an “assume breach” mindset, the Azure Security Center uses advanced analytics, including machine learning, along with Microsoft’s global threat intelligence network to look for and alert on attacks. Signals are automatically collected from your Azure resources, the network, and integrated security partner solutions and analyzed to identify cyber-attacks that might otherwise go undetected. Should an incident occur, security alerts offer insights into the attack and suggest ways to remediate and recover quickly. Security data and alerts can also be piped to existing Security Information and Event Management (SIEM) systems your organization has already purchased and is using on-premises.


No other cloud vendor provides the depth and breadth of these capabilities, and they are going to enable you to build even more secure applications in the cloud.

Announcing Azure IoT Suite Available to Purchase

The Internet of Things (IoT) provides tremendous new opportunities for organizations to improve operations, become more efficient at what they do, and create new revenue streams.  We have had a huge interest in our Azure IoT Suite which until this week has been in public preview.  Our customers like Rockwell Automation and ThyssenKrupp Elevators are already connecting data and devices to solve business problems and improve their operations. Many more businesses are poised to benefit from IoT by connecting their devices to collect and analyze untapped data with remote monitoring or predictive maintenance solutions.

In working with customers, we have seen that getting started on IoT projects can be a daunting task starting with connecting existing devices, determining the right technology partner to work with and scaling an IoT project from proof of concept to broad deployment. Capturing and analyzing untapped data is complex, particularly when a business tries to integrate this new data with existing data and systems they already have. 

The Microsoft Azure IoT Suite helps address many of these challenges.  The Microsoft Azure IoT Suite helps you connect and integrate with devices more easily, and to capture and analyze untapped device data by using our preconfigured solutions, which are engineered to help you move quickly from proof of concept and testing to broader deployment. Today we support remote monitoring, and soon we will be delivering support for predictive maintenance and asset management solutions.

These solutions reliably capture data in the cloud and analyze the data both in real-time and in batch processing. Once your devices are connected, Azure IoT Suite provides real time information in an intuitive format that helps you take action from insights. Our advanced analytics then enable you to easily process data—even when it comes from a variety of sources, including devices, line of business assets, sensors and other systems—and provide rich built-in dashboards and analytics tools for access to the data and insights you need. User permissions can be set to control reporting and share information with the right people in your organization.

Below is an example of the types of built-in dashboard views that you can leverage without having to write any code:


To support adoption of the Azure IoT Suite, we are also announcing the new Microsoft Azure Certified for IoT program, an ecosystem of partners whose offerings have been tested and certified to help businesses with their IoT device and platform needs. The first set of partners includes Beaglebone, Freescale, Intel, Raspberry Pi, Seeed and Texas Instruments. These partners, along with experienced global solution providers, are helping businesses harness the power of the Internet of Things today.

You can learn more about our approach and the Azure IoT Suite, and partners can learn more about the Microsoft Azure Certified for IoT program.

Announcing Azure IoT Hub

This week we also announced the public preview of our new Azure IoT Hub service which is a fully managed service that enables reliable and secure bi-directional communications between millions of IoT devices and an application back end. Azure IoT Hub offers reliable device-to-cloud and cloud-to-device hyper-scale messaging, enables secure communications using per-device security credentials and access control, and includes device libraries for the most popular languages and platforms.

Providing secure, scalable bi-directional communication from heterogeneous devices to the cloud is a cornerstone of any IoT solution, which Azure IoT Hub addresses in the following ways:

  • Per-device authentication and secure connectivity: Each device uses its own security key to connect to IoT Hub. The application back end is then able to individually whitelist and blacklist each device, enabling complete control over device access.
  • Extensive set of device libraries: Azure IoT device SDKs are available and supported for a variety of languages and platforms such as C, C#, Java, and JavaScript.
  • IoT protocols and extensibility: Azure IoT Hub provides native support of the HTTP 1.1 and AMQP 1.0 protocols for device connectivity. Azure IoT Hub can also be extended via the Azure IoT protocol gateway open source framework to provide support for MQTT v3.1.1.
  • Scale: Azure IoT Hub scales to millions of simultaneously connected devices, and millions of events per second.

Getting started with Azure IoT Hub is easy. Simply navigate to the Azure Preview portal and choose Internet of Things -> Azure IoT Hub. Choose the name, pricing tier, number of units and location, and select Create to provision and deploy your IoT Hub:


Once the IoT hub is created, you can navigate to Settings and create new shared access policies and modify other messaging settings for granular control.

The bi-directional communication enabled with an IoT Hub provides powerful capabilities in a real world IoT solution such as the control of individual device security credentials and access through the use of a device identity registry.  Once a device identity is in the registry, the device can connect, send device-to-cloud messages to the hub, and receive cloud-to-device messages from backend applications with just a few lines of code in a secure way.

Learn more about Azure IoT Hub and get started with your own real world IoT solutions.

Announcing the new Azure Container Service

We’ve been working with Docker to integrate Docker containers with both Azure and Windows Server for some time. This week we announced the new Azure Container Service which leverages the popular Apache Mesos project to deliver a customer proven orchestration solution for applications delivered as Docker containers.


The Azure Container Service enables users to easily create and manage a Docker enabled Apache Mesos cluster. The container management software running on these clusters is open source, and in addition to the application portability offered by tooling such as Docker and Docker Compose, you will be able to leverage portable container orchestration and management tooling such as Marathon, Chronos and Docker Swarm.

When utilizing the Azure Container Service, you will be able to take advantage of the tight integration with Azure infrastructure management features such as tagging of resources, Role Based Access Control (RBAC), Virtual Machine Scale Sets (VMSS) and the fully integrated user experience in the Azure portal. By coupling the enterprise class Azure cloud with key open source build, deploy and orchestration software, we maximize customer choice when it comes to containerized workloads.

The service will be available for preview by the end of the year.

Learn More

Watch the AzureCon sessions online to learn more about all of the above announcements – plus a lot more that was covered during the day.  We are looking forward to seeing what you build with what you learn!

Hope this helps,

Scott

Categories: Architecture, Programming

Strategy: Taming Linux Scheduler Jitter Using CPU Isolation and Thread Affinity

When nanoseconds matter you have to pay attention to OS scheduling details. Mark Price, who works in the rarified high performance environment of high finance, shows how in his excellent article on Reducing system jitter.

For a tuning example he uses the famous Disruptor inter-thread messaging library. The goal is to keep the OS continuously feeding CPUs work from high priority threads. His baseline test shows the fastest message is sent in 76 nanoseconds, 1 in 100 messages took longer than 2 milliseconds, and the longest delay was 11 milliseconds.

The next section of the article shows in loving detail how to bring those latencies lower and more consistent, a job many people will need to do in practice. You'll want to read the article for a full explanation, including how to use perf_events and HdrHistogram. It's really great at showing the process, but in short:

  • Turning off power save mode on the CPU brought the max latency from 11 msec down to 8 msec.
  • Guaranteeing threads will always have CPU resources, using CPU isolation and thread affinity, brought the maximum latency down to 14 microseconds (a minimal affinity sketch follows below).
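
Mark's article tunes a JVM workload, but at the OS level the affinity half of that recipe looks roughly like the minimal C sketch below (my own illustration, not code from the article). It assumes a core such as CPU 2 has been isolated from the general scheduler, for example with the isolcpus= kernel boot parameter, and pins the latency-critical thread to it (compile with -pthread):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Pin the calling thread to a single CPU that was isolated from the
 * general scheduler (e.g. via the isolcpus= kernel boot parameter). */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

static void *hot_path(void *arg)
{
    (void)arg;
    if (pin_to_cpu(2) != 0)   /* CPU 2 is an assumption; match your boot configuration */
        fprintf(stderr, "failed to set thread affinity\n");
    /* ... the latency-critical message-processing loop would run here ... */
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, hot_path, NULL);
    pthread_join(t, NULL);
    return 0;
}
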
Related Articles
Categories: Architecture

Sponsored Post: iStreamPlanet, Instrumental, Location Labs, Enova, Surge, Redis Labs, VoltDB, Datadog, SignalFx, InMemory.Net, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Who's Hiring?
  • As a Networking & Systems Software Engineer at iStreamPlanet you’ll be driving the design and implementation of a high-throughput video distribution system. Our cloud-based approach to video streaming requires terabytes of high-definition video routed throughout the world. You will work in a highly-collaborative, agile environment that thrives on success and eats big challenges for lunch. Please apply here.

  • As a Scalable Storage Software Engineer at iStreamPlanet you’ll be driving the design and implementation of numerous storage systems including software services, analytics and video archival. Our cloud-based approach to world-wide video streaming requires performant, scalable, and reliable storage and processing of data. You will work on small, collaborative teams to solve big problems, where you can see the impact of your work on the business. Please apply here.

  • is a *profitable* fast-growing SaaS startup looking for a Lead DevOps/Infrastructure engineer to join our ~10 person team in Palo Alto or *remotely*. Come help us improve API performance, tune our databases, tighten up security, setup autoscaling, make deployments faster and safer, scale our MongoDB/Elasticsearch/MySQL/Redis data stores, setup centralized logging, instrument our app with metric collection, set up better monitoring, etc. Learn more and apply here.

  • Location Labs is the global pioneer in mobile security for humans. Our services are used by millions of monthly paying subscribers worldwide. We were named one of Entrepreneur magazine’s “most brilliant” companies and TechCrunch said we’ve “cracked the code” for mobile monetization. If you are someone who enjoys the scrappy, get your hands dirty atmosphere of a startup, but has the measured patience and practices to keep things robust, well documented, and repeatable, Location Labs is the place for you. Please apply here.

  • As a Lead Software Engineer at Enova you’ll be one of Enova’s heavy hitters, overseeing technical components of major projects. We’re going to ask you to build a bridge, and you’ll get it built, no matter what. You’ll balance technical requirements with business needs, while advocating for a high quality codebase when working with full business teams. You’re fluent in ‘technical’ language and ‘business’ language, because you’re the engineer everyone counts on to understand how it works now, how it should work, and how it will work. Please apply here.

  • As a UI Architect at Enova, you will be the elite representative of our UI culture. You will be responsible for setting a vision, guiding direction and upholding high standards within our culture. You will collaborate closely with a group of talented UI Engineers, UX Designers, Visual Designers, Marketing Associates and other key business stakeholders to establish and maintain frontend development standards across the company. Please apply here.

  • VoltDB's in-memory SQL database combines streaming analytics with transaction processing in a single, horizontal scale-out platform. Customers use VoltDB to build applications that process streaming data the instant it arrives to make immediate, per-event, context-aware decisions. If you want to join our ground-breaking engineering team and make a real impact, apply here.  

  • At Scalyr, we're analyzing multi-gigabyte server logs in a fraction of a second. That requires serious innovation in every part of the technology stack, from frontend to backend. Help us push the envelope on low-latency browser applications, high-speed data processing, and reliable distributed systems. Help extract meaningful data from live servers and present it to users in meaningful ways. At Scalyr, you’ll learn new things, and invent a few of your own. Learn more and apply.

  • UI Engineer - AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data - AppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (All-Levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.
Fun and Informative Events
  • Surge 2015. Want to mingle with some of the leading practitioners in the scalability, performance, and web operations space? Looking for a conference that isn't just about pitching you highly polished success stories, but that actually puts an emphasis on learning from real world experiences, including failures? Surge is the conference for you.

  • Your event could be here. How cool is that?
Cool Products and Services
  • Instrumental is a hosted real-time application monitoring platform. In the words of one of our customers: "Instrumental is the first place we look when an issue occurs. Graphite was always the last place we looked."

  • Real-time correlation across your logs, metrics and events. just released its operations data hub into beta and we are already streaming in billions of log, metric and event data points each day. Using our streaming analytics platform, you can get real-time monitoring of your application performance, deep troubleshooting, and even product analytics. We allow you to easily aggregate logs and metrics by micro-service, calculate percentiles and moving window averages, forecast anomalies, and create interactive views for your whole organization. Try it for free, at any scale.

  • Datadog is a monitoring service for scaling cloud infrastructures that bridges together data from servers, databases, apps and other tools. Datadog provides Dev and Ops teams with insights from their cloud environments that keep applications running smoothly. Datadog is available for a 14 day free trial.

  • Turn chaotic logs and metrics into actionable data. Scalyr replaces all your tools for monitoring and analyzing logs and system metrics. Imagine being able to pinpoint and resolve operations issues without juggling multiple tools and tabs. Get visibility into your production systems: log aggregation, server metrics, monitoring, intelligent alerting, dashboards, and more. Trusted by companies like Codecademy and InsideSales. Learn more and get started with an easy 2-minute setup. Or see how Scalyr is different if you're looking for a Splunk alternative or Sumo Logic alternative.

  • SignalFx: just launched an advanced monitoring platform for modern applications that's already processing 10s of billions of data points per day. SignalFx lets you create custom analytics pipelines on metrics data collected from thousands or more sources to create meaningful aggregations--such as percentiles, moving averages and growth rates--within seconds of receiving data. Start a free 30-day trial!

  • InMemory.Net provides a .NET native in-memory database for analysing large amounts of data. It runs natively on .NET, and provides native .NET, COM & ODBC APIs for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex goes beyond monitoring and measures the system's work on your servers, providing unparalleled insight and query-level analysis. This unique approach ultimately enables your team to work more effectively, ship more often, and delight more customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial.

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Also available on Amazon Web Services. Free instant trial, 2 hours of FREE deployment support, no sign-up required.

  • ManageEngine Applications Manager: Monitor physical, virtual and Cloud Applications.

  • Site24x7: Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Categories: Architecture

Announcing General Availability of HDInsight on Linux + new Data Lake Services and Language

ScottGu's Blog - Scott Guthrie - Mon, 09/28/2015 - 21:54

Today, I’m happy to announce several key additions to our big data services in Azure, including the General Availability of HDInsight on Linux, as well as the introduction of our new Azure Data Lake and Language services.

General Availability of HDInsight on Linux

Today we are announcing general availability of our HDInsight service on Ubuntu Linux.  HDInsight enables you to easily run managed Hadoop clusters in the cloud.  With today’s release we now allow you to configure these clusters to run using both a Windows Server Operating System as well as an Ubuntu based Linux Operating System.

HDInsight on Linux enables even broader support for Hadoop ecosystem partners to run in HDInsight providing you even greater choice of preferred tools and applications for running Hadoop workloads. Both Linux and Windows clusters in HDInsight are built on the same standard Hadoop distribution and offer the same set of rich capabilities.

Today’s new release also enables additional capabilities, such as cluster scaling, virtual network integration and script action support. Furthermore, in addition to the Hadoop cluster type, you can now create HBase and Storm clusters on Linux for your NoSQL and real time processing needs, such as building an IoT application.

Create a cluster

HDInsight clusters running using Linux can now be easily created from the Azure Management portal under the Data + Analytics section.  Simply select Ubuntu from the cluster operating system drop-down, as well as optionally choose the cluster type you wish to create (we support base Hadoop as well as clusters pre-configured for workloads like Storm, Spark, HBase, etc).


All HDInsight Linux clusters can be managed by Apache Ambari. Ambari provides the ability to customize configuration settings of your Hadoop cluster while giving you a unified view of the performance and state of your cluster and providing monitoring and alerting within the HDInsight cluster.


Installing additional applications and Hadoop components

Similar to HDInsight Windows clusters, you can now customize your Linux cluster by installing additional applications or Hadoop components that are not part of default HDInsight deployment. This can be accomplished using Bash scripts with script action capability.  As an example, you can now install Hue on an HDInsight Linux cluster and easily use it with your workloads:


Develop using Familiar Tools

All HDInsight Linux clusters come with SSH connectivity enabled by default. You can connect to the cluster via an SSH client of your choice. Moreover, SSH tunneling can be leveraged to remotely access all of the Hadoop web applications from the browser.

New Azure Data Lake Services and Language

We continue to see customers enabling amazing scenarios with big data in Azure including analyzing social graphs to increase charitable giving, analyzing radiation exposure and using the signals from thousands of devices to simulate ways for utility customers to optimize their monthly bills. These and other use cases are resulting in even more data being collected in Azure. In order to be able to dive deep into all of this data, and process it in different ways, you can now use our Azure Data Lake capabilities – which are 3 services that make big data easy.

The first service in the family is available today: Azure HDInsight, our managed Hadoop service that lets you focus on finding insights, and not spend your time having to manage clusters. HDInsight lets you deploy Hadoop, Spark, Storm and HBase clusters, running on Linux or Windows, managed, monitored and supported by Microsoft with a 99.9% SLA.

The other two services, Azure Data Lake Store and Azure Data Lake Analytics introduced below, are available in private preview today and will be available broadly for public usage shortly.

Azure Data Lake Store

Azure Data Lake Store is a hyper-scale HDFS repository designed specifically for big data analytics workloads in the cloud. Azure Data Lake Store solves the big data challenges of volume, variety, and velocity by enabling you to store data of any type, at any size, and process it at any scale. Azure Data Lake Store can support near real-time scenarios such as the Internet of Things (IoT) as well as throughput-intensive analytics on huge data volumes. The Azure Data Lake Store also supports a variety of computation workloads by removing many of the restrictions constraining traditional analytics infrastructure like the pre-definition of schema and the creation of multiple data silos. Once located in the Azure Data Lake Store, Hadoop-based engines such as Azure HDInsight can easily mine the data to discover new insights.

Some of the key capabilities of Azure Data Lake Store include:

  • Any Data: A distributed file store that allows you to store data in its native format, Azure Data Lake Store eliminates the need to transform or pre-define schema in order to store data.
  • Any Size: With no fixed limits to file or account sizes, Azure Data Lake Store enables you to store kilobytes to exabytes with immediate read/write access.
  • At Any Scale: You can scale throughput to meet the demands of your analytic systems including the high throughput needed to analyze exabytes of data. In addition, it is built to handle high volumes of small writes at low latency making it optimal for near real-time scenarios like website analytics, and Internet of Things (IoT).
  • HDFS Compatible: It works out-of-the-box with the Hadoop ecosystem including other Azure Data Lake services such as HDInsight.
  • Fully Integrated with Azure Active Directory: Azure Data Lake Store is integrated with Azure Active Directory for identity and access management over all of your data.
Azure Data Lake Analytics with U-SQL

The new Azure Data Lake Analytics service makes it much easier to create and manage big data jobs. Built on YARN and years of experience running analytics pipelines for Office 365, XBox Live, Windows and Bing, the Azure Data Lake Analytics service is the most productive way to get insights from big data. You can get started in the Azure management portal, querying across data in blobs, Azure Data Lake Store, and Azure SQL DB. By simply moving a slider, you can scale up as much computing power as you’d like to run your data transformation jobs.


Today we are introducing a new U-SQL offering in the analytics service, an evolution of the familiar syntax of SQL.  U-SQL allows you to write declarative big data jobs, as well as easily include your own user code as part of those jobs. Inside Microsoft, developers have been using this combination in order to be productive operating on massive data sets of many exabytes of scale, processing mission critical data pipelines. In addition to providing an easy to use experience in the Azure management portal, we are delivering a rich set of tools in Visual Studio for debugging and optimizing your U-SQL jobs. This lets you play back and analyze your big data jobs, understanding bottlenecks and opportunities to improve both performance and efficiency, so that you can pay only for the resources you need and continually tune your operations.

Learn More

For more information and to get started, check out the following links:

Hope this helps,


Categories: Architecture, Programming

GTD (Getting Things Done)

Xebia Blog - Mon, 09/28/2015 - 17:58

As a consultant I am doing a lot of things, so to keep up I have always used some form of a TODO list. The reason I did this is that it helped me break down my tasks into smaller ones and keep focused, but also because I kept remembering a quote I once heard: “smart people write things down, dumb people try to remember it”.

Years ago I read the books “Seven Habits of Highly Effective People” and “Switch”. In my research into how to become more effective I came into contact with GTD and decided to try it out. In this post I want to show people who have heard about GTD how I use it and how it helps me.

For those who don’t know GTD or haven’t heard about the two books I mentioned, please follow the links to Getting Things Done, Seven Habits of Highly Effective People and Switch, and have fun.

How do I use GTD?
Because of my experience before GTD I knew I needed a digital list that I could access from my phone, laptop or tablet. It was also crucial for me to be able to have multiple lists and reminders. Because of these requirements I settled on using Todoist after having evaluated others as well.

I have a couple of main groups like Personal, Work, Read-me, Someday & Maybe, Home, etc. In those lists I have topics like Xebia, CustomerX, CustomerY, etc. And in those topics I have the actual tasks or lists with tasks that I need to do.

I write everything into my Inbox list the moment a task or idea pops into my mind. I do this because most of the time I don’t have the time to figure out directly what I want to do with it, so I just put it into my Inbox and deal with it later.

As a consultant I have to deal with external lists like team scrum boards. I don’t want to depend on them, so I write down my own tasks with references to them. In those tasks I write the things that I need to do today, tomorrow or longer if it is relevant for making decisions in the long run.

Every day when I finish my work I quickly look at my task list and decide on priorities for tomorrow; then every morning I evaluate my decisions based on the input of that day.

As a programmer I like to break down a programming task into smaller solutions, so sometimes I use my list as a reference or for micromanagement of tasks that take a couple of hours.

To have a better overview of things in my list I also use tags like tablet, PC, home and tel, which helps me pick up tasks based on the context I am in at that moment.

Besides deciding on priorities every day, I also do a weekly review where I decide on weekly priorities and add or remove tasks. I also set reminders on tasks; these are usually day based, but if it is really necessary I use time based reminders.

Because I want to have one Inbox I am sticking to zero mail in my mail inboxes. This means every time I check my mail I read it and then either delete it or archive it, and if needed make a task to do something.

What does GTD do for me?
The biggest win for me is that it helps me clear my mind, because I put everything in there that I need to do; this way my mind does not have to remember all those things. It is also a relaxing feeling knowing that everything you need to do is registered and you will not forget to do it.

It gives me structure and it allows me to make better priorities and decisions about saying yes or no to something because everything is in one place.

Having a task registered so that you can cross it off when it’s done gives a nice fulfilling feeling.

Having a big list of things that you think you need to do can be quite overwhelming, and in some cases it can also feel like pressure when tasks have been sitting on the list for a long time. That’s why I keep evaluating tasks and remove things after some time.

Task lists and calendars can sometimes collide, so it is important to treat your agenda as a part of your list even though it’s not a list.

What am I still missing?
Right now, from the GTD point of view, I miss better time management. Besides that, I would like to start a journal of relevant things that happened or people I spoke to, and also incorporate that with my GTD lists (for example combining Evernote/OneNote with my Todoist lists).

Everything else I miss is technical:

I miss a proper integration of all my mail where all the calendar invites work without having to connect all accounts on all different devices. I also miss a proper integration between my mail and Todoist for creating and referencing mail.

Because I write down every task or idea that pops into my mind, it would be great if voice recognition would work a little better for me when I am in a car.

GTD has helped me structure my tasks and has given me more control in making decisions around them. By writing this post I am hoping to trigger somebody else to look into GTD and maybe have an impact on his or her effectiveness.

I am also curious what your opinion is on GTD, and whether you have any tips for me regarding GTD.

How Facebook Tells Your Friends You're Safe in a Disaster in Under Five Minutes

In a disaster there’s a raw and immediate need to know your loved ones are safe. I felt this way during 9/11. I know I’ll feel this way during the next wildfire in our area. And I vividly remember feeling this way during the 1989 Loma Prieta earthquake.

Most earthquakes pass beneath notice. Not this one and everyone knew it. After ceiling tiles stopped falling like snowflakes in the computer lab, we convinced ourselves the building would not collapse, and all thoughts turned to the safety of loved ones. As it must have for everyone else. Making an outgoing call was nearly impossible, all the phone lines were busy as calls poured into the Bay Area from all over the nation. Information was stuck. Many tense hours were spent in ignorance as the TV showed a constant stream of death and destruction.

It’s over a quarter of a century later, can we do any better?

Facebook can, through a product called Safety Check, which connects friends and loved ones during a disaster. When a disaster hits, Safety Check prompts people in the area to indicate whether they are OK or not. Then Facebook closes the worry loop by telling their friends how they are doing.

Brian Sa, Engineering Manager at Facebook, created Safety Check out of his experience of the devastating earthquake in Fukushima, Japan in 2011. He told his very moving story in a talk he gave at @Scale.

During the earthquake Brian put a banner on Facebook with helpful information sources, but he was moved to find a better way to help people in need. That impulse became Safety Check.

My first reaction to Safety Check was damn, why didn’t anyone think of this before? It’s such a powerful idea.

The answer became clear as I listened to a talk in the same video given by Peter Cottle, Software Engineer at Facebook, who also talked about building Safety Check.

It’s likely only Facebook could have created Safety Check. This observation dovetails nicely with Brian’s main lesson in his talk:

  • Solve real-world problem in a way that only YOU can. Instead of taking the conventional route, think about the unique role you and your company can play.

Only Facebook could create Safety Check, not because of resources as you might expect, but because Facebook lets employees build crazy things like Safety Check, and because only Facebook has 1.5 billion geographically distributed users, with a degree of separation between them of only 4.74 edges, and only Facebook has users who are fanatical about reading their news feeds. More about this later.

In fact, Peter talked about how resources were a problem in a sort of product development Catch-22 at Facebook. The team for Safety Check was small and didn’t have a lot of resources attached to it. They had to build the product and prove its success without resources before they could get the resources to build the product. The problem had to be efficiently solved at scale without the application of lots of money and lots of resources.

As is often the case constraints led to a clever solution. A small team couldn’t build a big pipeline and index, so they wrote some hacky PHP and effectively got the job done at scale.

So how did Facebook build Safety Check? Here’s my gloss on both Brian’s and Peter’s talks:

Categories: Architecture

Online AzureCon Conference this Tuesday

ScottGu's Blog - Scott Guthrie - Mon, 09/28/2015 - 04:35

This Tuesday, Sept 29th, we are hosting our online AzureCon event – which is a free online event with 60 technical sessions on Azure presented by both the Azure engineering team as well as MVPs and customers who use Azure today and will share their best practices.

I’ll be kicking off the event with a keynote at 9am PDT.  Watch it to learn the latest on Azure, and hear about a lot of exciting new announcements.  We’ll then have some fantastic sessions that you can watch throughout the day to learn even more.


Hope to see you there!


Categories: Architecture, Programming

Stuff The Internet Says On Scalability For September 25th, 2015

Hey, it's HighScalability time:

 How long would you have lasted? Loved The Martian. Can't wait for the game, movie, and little potato action figures. Me, I would have died on the first level.

  • 60 miles: new record distance for quantum teleportation; 160: size of minimum viable Mars colony; $3 trillion: assets managed by hedge funds; 5.6 million: fingerprints stolen in cyber attack; 400 million: Instagram monthly active users; 27%: increase in conversion rate from mobile pages that are 1 second faster; 12BN: daily Telegram messages; 1800 B.C: oldest beer recipe; 800: meetings booked per day at Facebook; 65: # of neurons it takes to walk with 6 legs

  • Quotable Quotes:
    • @bigdata: assembling billions of pieces of evidence: Not even the people who write algorithms really know how they work
    • @zarawesome: "This is the most baller power move a billionaire will pull in this country until Richard Branson finally explodes the moon."
    • @mtnygard: An individual microservice fits in your head, but the interrelationships among them exceeds any human's ability. Automate your awareness.
    • Ben Thompson~ The mistake that lots of BuzzFeed imitators have made is to imitate the BuzzFeed article format when actually what should be imitated from BuzzFeed is the business model. The business model is creating portable content that will live and thrive on all kinds of different platforms. The BuzzFeed article is relatively unsophisticated, it's mostly images and text, and mostly images.
    • For more Quotable Quotes please see the full article.

  • Is what Volkswagen did really any different than what happens on benchmarks all the time? Cheating and benchmarks go together like a clear conscience and rationalization. Clever subterfuge is part of the software ethos. There are many many examples. "Cars are now software" is a slick meme, but that transformation has deep implications. The software culture and the manufacturing culture are radically different.

  • Can we ever trust the fairness of algorithms? Of course not. Humans in relation to their algorithms are now in the position of priests trying to divine the will of god. Computer Scientists Find Bias in Algorithms: Many people believe that an algorithm is just a code, but that view is no longer valid, says Venkatasubramanian. “An algorithm has experiences, just as a person comes into life and has experiences.”

  • Stuff happens, even to the best. But maybe having a significant percentage of the world's services on the same platform is not wise or sustainable. Summary of the Amazon DynamoDB Service Disruption and Related Impacts in the US-East Region.

  • According to patent drawings what does the Internet look like? Noah Veltman has put together a fun list of examples: it's a cloud, or a bean, or a web, or an explosion, or a highway, or maybe a weird lump.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Better Density and Lower Prices for Azure’s SQL Elastic Database Pools

ScottGu's Blog - Scott Guthrie - Wed, 09/23/2015 - 21:41

A few weeks ago, we announced the preview availability of the new Basic and Premium Elastic Database Pools Tiers with our Azure SQL Database service.  Elastic Database Pools enable you to run multiple, isolated and independent databases that are automatically scaled across a private pool of resources dedicated to just you and your apps.  This provides a great way for software-as-a-service (SaaS) developers to better isolate their individual customers in an economical way.

Today, we are announcing some nice changes to the pricing structure of Elastic Database Pools as well as changes to the density of elastic databases within a pool.  These changes make it even more attractive to use Elastic Database Pools to build your applications.

Specifically, we are making the following changes:

  • Finalizing the eDTU price – With Elastic Database Pools you purchase units of capacity called eDTUs – which you can then use to run multiple databases within a pool.  We have decided to not increase the price of eDTUs as we go from preview->GA.  This means that you’ll be able to pay a much lower price (about 50% less) for eDTUs than many developers expected.
  • Eliminating the per-database fee – In addition to lower eDTU prices, we are also eliminating the per-database fee that we had during the preview. This means you no longer need to pay a per-database charge to use an Elastic Database Pool, which makes the pricing much more attractive for scenarios where you want to have lots of small databases.
  • Pool density – We are announcing increased density limits that enable you to run many more databases per Elastic Database pool. See the chart below under “Maximum databases per pool” for specifics. This change will take effect at the time of general availability, but you can design your apps around these numbers.  The increased pool density limits will make Elastic Database Pools even more attractive.



Below are the updated parameters for each of the Elastic Database Pool options with these new changes:


For more information about Azure SQL Database Elastic Database Pools and Management tools go to the technical overview here.

Hope this helps,

Scott

Categories: Architecture, Programming

How will new memory technologies impact in-memory databases?

This is a guest post by Yiftach Shoolman, Co-founder & CTO of redislabs. Will 3D XPoint change everything? Not as much as you might hope...

Recently, investors, analysts, partners and customers have asked me how the announcement from Intel and Micron about their new 3D XPoint memory technology will affect the in-memory databases market. In these discussions, a common question was “Who needs an in-memory database if all the non in-memory databases will achieve similar performance with 3D XPoint technology?” Well, I think that's a valid question so I've decided to take a moment to describe how we think this technology will influence our market.

First, a little background...

The motivation of Intel and Micron is clear -- DRAM is expensive and hasn’t changed much during the last few years (as shown below). In addition, there are currently only three major makers of DRAM on the planet (Samsung Electronics, Micron and SK Hynix), which means that the competition between them is not as cutthroat as it was a few years ago, when there were four or five major manufacturers.

DRAM Price Trends
Categories: Architecture

Publishing ES6 code to npm

Xebia Blog - Tue, 09/22/2015 - 07:58

This post is part of a series of ES2015 posts. We'll be covering new JavaScript functionality every week!

Most of the software we work with at Xebia is open source. Our primary expertise is in open source technology, which spans our entire application stack. We don’t just consume open source projects, we also contribute back to them and occasionally release some of our own. Releasing an open source project doesn’t just mean making the GitHub repository public. If you want your project to be used, it should be easy to consume. For JavaScript this means publishing it to npm, the package manager for JavaScript code.

Nowadays we write our JavaScript code using the ES6 syntax. We can do this because we’re using Babel to compile it down to ES5 before we run it. When you’re publishing the code to npm however you can’t expect your package consumers to use Babel. In fact if you’re using the Require Hook, it excludes anything under node_modules by default and thus will not attempt to compile that code.


The reality is that we have to compile our code to ES5 before publishing to npm, which is what people writing libraries in CoffeeScript have been doing for a long time. Luckily npm provides a helpful way to automate this process: the prepublish script. There’s a whole list of scripts which you can use to automate your npm workflow. In package.json we simply define the scripts object:

  "name": "my-awesome-lib",
  "version": "0.0.0",
  "scripts": {
    "compile": "babel --optional runtime -d lib/ src/",
    "prepublish": "npm run compile"
  "main": "lib/index.js",
  "devDependencies": {
    "babel": "^5.8.23",
    "babel-runtime": "^5.8.24"

This will make npm automatically run the compile script when we run `npm publish` on the command line, which in turn triggers babel to compile everything in ./src to ./lib (the de-facto standard for compiled sources in npm packages, originating from CommonJS). Finally we also define the main entry point of our package, which denotes the file to import when require(‘my-awesome-lib’) is called.

An important part of the above configuration is the runtime option. This will include polyfills for ES6 features such as Promise and Map into your compiled sources.

Ignoring files

It’s easy to simply publish all of our source code to npm, but we don’t have to. In fact it’s much nicer to only publish the compiled sources to npm, so package consumers only have to download the files they need. We can achieve this using a file called .npmignore:
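The file contents aren't reproduced here, but for this setup a plausible .npmignore (an assumption on my part, not taken from the original post) would exclude only the uncompiled sources:

src/

Because lib/ is not listed, the compiled output is what ends up in the published package.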


This file is very similar to .gitignore, in fact npm will fall back to .gitignore if .npmignore is not present. Since we probably want to ignore ./lib in our git repository because it holds compiled sources, npm would normally not even publish that directory, so we have to use .npmignore to get the desired result. However if you want your library to also be directly consumable via a git URL you can commit ./lib to git as well.


Now that everything is set up, the only thing left to do is publish it. Well, not so fast. First we should specify our version number:

npm version 1.0.0

This will update package.json with the new version number, commit this change to git and make a git tag for v1.0.0, all in one go. This of course assumes you're using git, otherwise it will just update package.json. An even better way is to have npm automatically determine the next version number by specifying the scope of change. We should also specify a commit message:

npm version patch -m "Bump to %s because reasons"

For more options check out the npm documentation. Now finish up by pushing your new version commit and tag to GitHub and publishing the package to npm:

git push --follow-tags
npm publish

Uber Goes Unconventional: Using Driver Phones as a Backup Datacenter

In How Uber Scales Their Real-Time Market Platform one of the most intriguing hints was how Uber handles datacenter failovers using driver phones as an external distributed storage system for recovery.

Now we know a lot more about how that system works from Uber's Nikunj Aggarwal and Joshua Corbin, who gave a very interesting talk at the @Scale conference: How Uber Uses your Phone as a Backup Datacenter.

Rather than use a traditional backend replication scheme where databases sync state between datacenters to achieve a measure of k-safety, Uber did something different: they store enough state on driver phones so that if a datacenter failover occurs, trip information is not lost.

Why choose this approach? The traditional approach would be much simpler. I think it is to make sure the customer always has a good experience; losing trip information for an active trip would make for a horrible customer experience. 

By building their syncing strategy around the phone, even though it's complicated and takes a lot of work, Uber is able to preserve trip data and provide a seamless customer experience even on datacenter failures. And making the customer happy is what counts, especially in a market with near zero switching costs.

So the goal is not to lose trip information, even on a datacenter failover. Using a traditional database replication strategy it would not be possible to make this guarantee for reasons that have parallels to how network management systems have always had to work. Let me explain.

In a network, devices are the authoritative source for state information like packet errors, alarms, packets sent and received, and so on. The network management system is authoritative for configuration data like alarm thresholds and customer information. The complication is that devices and the network management system are not always in contact, so they get out of sync because they work independently of each other. Which means on bootup, failover, and communication reconnection, all this information has to be merged in both directions using a complicated dance that ensures correctness and consistency. 

Uber has the same problem, only the devices are smart phones and the authoritative state the phone contains is trip information. So on bootup, failover, and communication reconnection the trip information must be preserved because the phone is the authoritative source for trip information.

Even when connectivity is lost, the phone has an accurate record of all trip data. So you wouldn't want to sync trip data from the datacenter down to the phone, because that would wipe out the correct data on the phone. The correct information must come from the phone.

Uber also takes another trick from network management systems. They periodically query phones to test the integrity of information in the datacenter. 

Let's see how they do it...

Motivation for Using Phones as Storage for Datacenter Failure
Categories: Architecture

The Union-Find Algorithm in Scala: a Purely Functional Implementation

Xebia Blog - Mon, 09/21/2015 - 09:29

In this post I will implement the union-find algorithm in Scala, first in an impure way and then in a purely functional manner, so without any state or side effects. Then we can check both implementations and compare the code and also the performance.

The reason I chose union-find for this blog is that it is relatively simple. It is a classic algorithm that is used to solve the following problem: suppose we have a set of objects. Each of them can be connected to zero or more others. And connections are transitive: if A is connected to B and B is connected to C, then A is connected to C as well. Now we take two objects from the set, and we want to know: are they connected or not?
This problem comes up in a number of areas, such as in social networks (are two people connected via friends or not), or in image processing (are pixels connected or separated).
Because the total number of objects and connections in the set might be huge, the performance of the algorithm is important.

Quick Union

The implementation I chose is the so called Quick Union implementation. It scales well but there are still faster implementations around, one of which is given in the references below the article. For this post I chose to keep things simple so we can focus on comparing the two implementations.

The algorithm keeps track of connected elements with a data structure: it represents every element as a Node which points to another element to which it is connected. Every Node points to only one Node it is connected to, and this Node is called its parent. This way, groups of connected Nodes form trees. The root of such a connected tree is a Node which has an empty parent property.
When the question is asked if two Nodes are connected, the algorithm looks up the roots of the connected trees of both Nodes and checks if they are the same.

The tricky part in union find algorithms is to be able to add new connections to a set of elements without losing too much performance. The data structure with the connected trees enables us to do this really well. We start by looking up the root of both elements, and then set the parent element of one tree to the root of the other tree.

Some care must still be taken when doing this, because over time connected trees might become unbalanced. Therefore the size of every tree is kept in its root Node; when connecting two subtrees, the smaller one is always added to the larger one. This keeps all trees reasonably balanced.

This was only a brief description of the algorithm but there are some excellent explanations on the Internet. Here is a nice one because it is visual and interactive: visual algo

The Impure Implementation

Now let's see some code! The impure implementation:

import scala.annotation.tailrec

class IUnionFind(val size: Int) {

  private case class Node(var parent: Option[Int], var treeSize: Int)

  private val nodes = Array.fill[Node](size)(new Node(None, 1))

  def union(t1: Int, t2: Int): IUnionFind = {
    if (t1 == t2) return this

    val root1 = root(t1)
    val root2 = root(t2)
    if (root1 == root2) return this

    val node1 = nodes(root1)
    val node2 = nodes(root2)

    if (node1.treeSize < node2.treeSize) {
      node1.parent = Some(t2)
      node2.treeSize += node1.treeSize
    } else {
      node2.parent = Some(t1)
      node1.treeSize += node2.treeSize
    }
    this
  }

  def connected(t1: Int, t2: Int): Boolean = t1 == t2 || root(t1) == root(t2)

  @tailrec
  private def root(t: Int): Int = nodes(t).parent match {
    case None => t
    case Some(p) => root(p)
  }
}

As you can see I used an array of Nodes to represent the connected components. Most textbook implementations use two integer arrays: one for the parents of every element, and the other one for the tree sizes of the components to which the elements belong. Memory-wise that is a more efficient implementation than mine, but apart from that the concept of the algorithm stays the same, and in terms of speed the difference doesn't matter much. I do think that using Node objects is more readable than having two integer arrays, so I went with the Nodes.
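For comparison, here is a minimal sketch of that textbook two-integer-array variant. This is my own illustration of the same weighted quick-union idea, not code from the article:

import scala.annotation.tailrec

// Hypothetical sketch: parent(i) points to the element i is connected to,
// and treeSize(r) holds the size of the tree rooted at r.
class ArrayUnionFind(size: Int) {
  private val parent = Array.tabulate(size)(identity) // every element starts as its own root
  private val treeSize = Array.fill(size)(1)

  @tailrec
  private def root(t: Int): Int = if (parent(t) == t) t else root(parent(t))

  def connected(t1: Int, t2: Int): Boolean = root(t1) == root(t2)

  def union(t1: Int, t2: Int): Unit = {
    val r1 = root(t1)
    val r2 = root(t2)
    if (r1 != r2) {
      // Weighted union: attach the smaller tree to the root of the larger one.
      if (treeSize(r1) < treeSize(r2)) { parent(r1) = r2; treeSize(r2) += treeSize(r1) }
      else { parent(r2) = r1; treeSize(r1) += treeSize(r2) }
    }
  }
}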

The purely functional implementation

import scala.annotation.tailrec

case class Node(parent: Option[Int], treeSize: Int)

object FUnionFind {
  def create(size: Int): FUnionFind = {
    val nodes = Vector.fill(size)(Node(None, 1))
    new FUnionFind(nodes)
  }
}

class FUnionFind(nodes: Vector[Node]) {

  def union(t1: Int, t2: Int): FUnionFind = {
    if (t1 == t2) return this

    val root1 = root(t1)
    val root2 = root(t2)
    if (root1 == root2) return this

    val node1 = nodes(root1)
    val node2 = nodes(root2)
    val newTreeSize = node1.treeSize + node2.treeSize

    val (newNode1, newNode2) =
      if (node1.treeSize < node2.treeSize) {
        val newNode1 = Node(Some(t2), newTreeSize)
        val newNode2 = Node(node2.parent, newTreeSize)

        (newNode1, newNode2)
      } else {
        val newNode2 = Node(Some(t1), newTreeSize)
        val newNode1 = Node(node1.parent, newTreeSize)

        (newNode1, newNode2)
      }

    val newNodes = nodes.updated(root1, newNode1).updated(root2, newNode2)
    new FUnionFind(newNodes)
  }

  def connected(t1: Int, t2: Int): Boolean = t1 == t2 || root(t1) == root(t2)

  @tailrec
  private def root(t: Int): Int = nodes(t).parent match {
    case None => t
    case Some(p) => root(p)
  }
}

Compared to the first implementation, some parts remained the same, such as the Node, except that it is no longer an inner class. The connected and root methods did not change either.
What did change is the method which deals with updating the connections: union. In the purely functional implementation we can't update an array in place, so union instead creates a new FUnionFind object and returns it at the end. Also, two new Node objects need to be created when subtrees are merged: the root of the smaller one because it gets a new parent, and the root of the larger one because its tree size needs to be increased.
Perhaps surprisingly, the pure implementation needs more lines of code than the impure one.
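To make the difference in style concrete, here is a small usage sketch of my own (not from the original article). The impure version mutates its internal array in place, while the pure version returns a new FUnionFind on every union:

val impure = new IUnionFind(10)      // elements 0..9
impure.union(1, 2)                   // mutates internal state
impure.union(2, 3)
println(impure.connected(1, 3))      // true

val empty = FUnionFind.create(10)
val joined = empty.union(1, 2).union(2, 3)  // every union returns a new value
println(joined.connected(1, 3))      // true
println(empty.connected(1, 3))       // false: the original value is untouched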

The Performance

The pure implementation has to do a bit of extra work when it creates the new objects in its union method. The question is how much this costs in terms of performance.
To find this out, I ran both implementations through a series of performance tests (using ScalaMeter) where I added a large number of connections to a set of objects. I added an (impure) Java 8 implementation to the test as well.
Here are the results:

Nr of elements and connections    Impure    Pure      Java 8
10000                             2.2 s     3.8 s     2.3 s
15000                             4.4 s     7.9 s     4.2 s
20000                             6.2 s     10.3 s    6.3 s

Not surprisingly, the time grows with the number of connections and elements. The growth is a bit faster than linear because the asymptotic time complexity of the quick union algorithm is of the order n log(n).
The pure algorithm is about 65% slower than the impure implementation. The cause is clear: in every call to union the pure algorithm has to allocate and garbage collect three extra objects.

For completeness I added Java 8 to the test too. The code is not given here but if you're interested, there's a link to the complete source below the article. Its implementation is really similar to the Scala version.


Purely functional code has a couple of advantages over impure implementations: because of the lack of side effects it can be easier to reason about blocks of code, and concurrency becomes easier because there is no shared state.
In general it also leads to more concise code because collection methods like map and filter can easily be used, but in this example that was not the case; the pure implementation even needed a few extra lines.
The biggest disadvantage of the pure union-find algorithm was its performance. Whether that is a showstopper, or whether the better concurrency story of the pure implementation outweighs it, depends on the requirements of the project where the code is used.

Explanation of a faster union-find with path compression
All the source code in the article, including the tests

Stuff The Internet Says On Scalability For September 18th, 2015

Hey, it's HighScalability time:

This is how you blast microprocessors with high-energy beams to test them for space.

  • terabits: Facebook's network capacity; 56.2 Gbps: largest extortion DDoS attack seen by Akamai; 220: minutes spent using apps per day; $33 billion: 2015 in-app purchases; 2334: web servers running in containers on a Raspberry Pi 2; 121: startups valued over $1 billion

  • Quotable Quotes:
    • A Beautiful Question: Finding Nature's Deep Design: Two obsessions are the hallmarks of Nature’s artistic style: Symmetry—a love of harmony, balance, and proportion; Economy—satisfaction in producing an abundance of effects from very limited means
    • : ad blocking Apple has done to Google what Google did to MSFT. Added a feature they can't compete with without breaking their biz model
    • @shellen: FWIW - Dreamforce is a localized weather system that strikes downtown SF every year causing widespread panic & bad slacks. 
    • @KentBeck: first you learn the value of abstraction, then you learn the cost of abstraction, then you're ready to engineer
    • @doctorow: Arab-looking man of Syrian descent found in garage building what looks like a bomb 
    • @kixxauth: Idempotency is not something you take a pill for. -- ZeroMQ
    • @sorenmacbeth: Alice in Blockchains
    • Sebastian Thrun: BECAUSE of the increased efficiency of machines, it is getting harder and harder for a human to make a productive contribution to society
    • Coding Horror: Getting the details right is the difference between something that delights, and something customers tolerate.
    • @mamund: "[92% of] all catastrophic failures are the result of incorrect handling of non-fatal errors."
    • Charles Weitz: Almost every cell in our body has a circadian clock. It helps every cell figure out when to use energy, when to rest, when to repair DNA, or to replicate DNA.
    • @kfury: Web development skills are like cells in your body. Every 7 years they're completely replaced by new ones.
    • Alexey Gorshkov: We’re learning how to build complex states of light that, in turn, can be built into more complex objects. 
    • @BenedictEvans: Ad blocking = taking money away from people whose work you read. Everyone has reasons, or excuses. But it remains true
    • Gaffer on Games: I swear you guys are like the f*cking climate change deniers of network programming..not just a rant, also deeply informative.
    • @anoemi: I don't use emojis because when I use smiley faces, I like to stay close to the metal.
    • @neil_conway: in practice, basically no app logic gets retry logic right (esp. for read-only xacts, which can abort under serializable).
    • @xaprb: All roads lead to Rome. All queueing theory studies lead to Agner Erlang. All scalability studies lead to Neil Gunther.

  • Why doesn't Google use git? Here's why. Stats on the Google source code repository: 1 billion files, 9 million source files, 2 billion lines of code, 35 million commits, 86 terabytes, 45 thousand commits per workday, 25,000 Googlers from all over the world, billions of file read requests per day (800K QPS peak). All in one single repository. The rate of change is on an exponential growth curve. Of note: robots commit 30K times per day, humans only 15K. From a talk by Rachel Potvin: The Motivation for a Monolithic Codebase

  • The problem is as soon as Medium becomes everything it also becomes nothing. Medium's Evan Williams To Publishers: Your Website Is Toast

  • If you appreciate the technical aspects of the intricate bot games Ashley Madison is said to have played then you might enjoy Darknet, a book that takes the same idea to chilling extremes. AI driven Distributed Autonomous Corporations use bitcoin and anonymous markets to take the world to the brink. Only a gambit worthy of Captain Kirk saves the day.

  • Points to ponder. Why I wouldn’t use rails for a new company: I worry now that rails is past its zenith, and that starting a new company with rails today might be like starting a company using Java Spring in 2007...Everyone knows that ruby is slow...over time other frameworks simply picked up those innovations [Rails]...If you want to future-proof your web application, you have to make a bet on what engineers will want to use in three years. 

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Log Uncaught Errors in Scala/ Akka

Xebia Blog - Thu, 09/17/2015 - 12:00

At my work we have a long running service that's using the Akka-Spray stack. Recently it crashed and we wanted to check its logfile to find some clue about the cause. But there was nothing there to help us.
Eventually we did find the cause: it was an OutOfMemoryError thrown in one of the actors, and because this wasn't caught anywhere (and it shouldn't be), it terminated the entire actor system.
It would have saved us some time if this error had been logged somewhere, so that is what this blog will be about.


Any uncaught exception or error is written to System.err automatically. System.err writes to the standard error stream of the process, which normally is the console. I want to change this for our server, so at the startup of the server I redirect System.err to a custom output stream of my own, called LoggingOutputStream, like this:

import java.io.PrintStream
import org.apache.log4j.{Level, Logger}

System.setErr(new PrintStream(new LoggingOutputStream(Logger.getRootLogger, Level.ERROR), true))

The LoggingOutputStream will write anything that would normally go to System.err to log4j's RootLogger instead, with log level ERROR.
What's left is the implementation of our LoggingOutputStream:

import java.io.{IOException, OutputStream}

import org.apache.log4j.{Priority, Category}

class LoggingOutputStream(category: Category, priority: Priority) extends OutputStream {
  private val LINE_SEPARATOR = System.getProperty("line.separator")
  private var closed = false
  private var buffer = new Array[Byte](2048)
  private var count = 0

  override def close() {
    flush()
    closed = true
  }

  override def write(b: Int) {
    if (closed) {
      throw new IOException("The stream has been closed!")
    }
    if (b == 0) {
      // Ignore NUL bytes
      return
    }
    if (count == buffer.length) {
      // The buffer is full; grow it
      val newBuffer = new Array[Byte](2 * buffer.length)
      System.arraycopy(buffer, 0, newBuffer, 0, buffer.length)
      buffer = newBuffer
    }
    buffer(count) = b.toByte
    count += 1
  }

  override def flush() {
    if (count == 0) {
      return
    }
    // Don't print out blank lines; flushing from PrintStream puts these out
    if (!isBlankLine) category.log(priority, new String(buffer.slice(0, count)))
    reset()
  }

  private def isBlankLine = (count == LINE_SEPARATOR.length) &&
    ((buffer(0).toChar == LINE_SEPARATOR.charAt(0) && count == 1)
      || (buffer(1).toChar == LINE_SEPARATOR.charAt(1)) && count == 2)

  private def reset() {
    count = 0
  }
}

Of course this solution is not specific to Akka; it will work in any Scala application.
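To sanity-check the setup, here is a minimal sketch of my own (not from the post; the LoggingOutputStreamDemo name is made up). It throws an uncaught exception in a separate thread; the JVM's default handler prints the stack trace to System.err, which should now end up in the log4j root logger:

import java.io.PrintStream

import org.apache.log4j.{Level, Logger}

object LoggingOutputStreamDemo extends App {
  // Redirect System.err to the root logger, as described above.
  System.setErr(new PrintStream(
    new LoggingOutputStream(Logger.getRootLogger, Level.ERROR), true))

  // The uncaught RuntimeException in this thread is written to System.err by
  // the default uncaught-exception handler, so it shows up as an ERROR log entry.
  new Thread(new Runnable {
    override def run(): Unit = throw new RuntimeException("boom")
  }).start()
}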

5 Lessons and 8 Industry Changes Over 5 Years as Etsy CTO

Endings are often a time for reflection and from reflection often comes wisdom. That is the case for Kellan Elliott-McCrea, who recently announced he was leaving his job after five successful years as the CTO of Etsy. Kellan wrote a rather remarkable going away post: Five years, building a culture, and handing it off, brimming with both insight and thoughtful commentary.

This post is just a short gloss of the major points. He goes into more depth on each point, so please read his post.

The Five Lessons:

  1. Nothing we “know” about software development should be assumed to be true.
  2. Technology is the product of the culture that builds it.
  3. Software development should be thought of as a cycle of continual learning and improvement rather than a progression from start to finish, or a search for correctness.
  4. You build a culture of learning by optimizing globally not locally.
  5. If you want to build for the long term, the only guarantee is change.

The Eight Industry Changes

  1. Five years ago, continuous deployment was still a heretical idea. 
  2. Five years ago, it was crazy to discuss that monitoring, testing, debugging, QA, staged releases, game days, user research, and prototypes are all tools with the same goal, improving confidence, rather than separate disciplines handled by distinct teams.
  3. Five years ago, focusing on detection and response vs prevention in order to achieve better, more reliable, more scalable, and more secure software was unprofessional.
  4. Five years ago, suggesting that better software is written by a diverse team of kind people who care about each other was antithetical to our self-image as an industry.
  5. Five years ago, trusting not only our designers and product managers to code and deploy to production, but trusting everyone in the company to deploy to production.
  6. Five years ago, rooms of people excitedly talking about their own contribution to a serious outage would have been a prelude to mass firings, rather than a path to profound learning.
  7. And five years ago no one was experimenting in public about how to do this stuff, sharing their findings, and open sourcing code to support this way of working.
  8. Five years ago, it would have seemed ludicrous to think a small team supporting a small site selling crafts could aspire to change how software is built and, in the process, cause us to rethink how the economy works.

While many of these ideas were around more than five years ago, the point still stands: the industry has undergone a lot of change recently, and sometimes it's worth taking a little time to reflect on that. 

Categories: Architecture