
Stuff The Internet Says On Scalability For August 15th, 2014

Hey, it's HighScalability time:


Somehow this seems quite appropriate. (via John Bredehoft)
  • 75 acres: Pizza eaten in US daily; 270TB: Backblaze storage pod; 14nm: Intel extends Moore's Law
  • Quotable Quotes
    • discreteevent: The dream of reuse has made a mess of many systems.
    • David Crawley: Don't think of Moore's Law in terms of technology; think of it in terms of economics and you get much greater understanding. The limits of Moore's Law are not driven by current technology. The limits of Moore's Law are really a matter of cost.
    • Simon Brown: If you can't build a monolith, what makes you think microservices are the answer?
    • smileysteve: The net result is that you should be able to transmit QPSK at 32GBd in 2 polarizations in maybe 80 waves in each direction. 2bits x 2 polarizations x 32G ~128Gb/s per wave or nearly 11Tb/s for 1 fiber. If this cable has 6 strands, then it could easily meet the target transmission capacity [60TB].
    • Eric Brumer: Highly efficient code is actually memory efficient code.

  • How to be a cloud optimist. Tell yourself: an instance is half full, it's not half empty; Downtime is temporary; Failures aren't your fault.

  • Mother Earth, Motherboard by Neal Stephenson. Goes without saying it's gorgeously written. The topic: The hacker tourist ventures forth across the wide and wondrous meatspace of three continents, chronicling the laying of the longest wire on Earth. < Related to Google Invests In $300M Submarine Cable To Improve Connection Between Japan And The US.

  • IBM compares virtual machines against Linux containers: Our results show that containers result in equal or better performance than VMs in almost all cases. Both VMs and containers require tuning to support I/O-intensive applications.

  • Does Psychohistory begin with BigData? Of a crude kind, perhaps. Google uses BigQuery to uncover patterns of world history: What’s even more amazing is that this analysis is not the result of a massive custom-built parallel application built by a team of specialized HPC programmers and requiring a dedicated cluster to run on: in stark contrast, it is the result of a single line of SQL code (plus a second line to create the initial “view”). All of the complex parallelism, data management, and IO optimization is handled transparently by Google BigQuery. Imagine that – a single line of SQL performing 2.5 million correlations in just 2.5 minutes to uncover the underlying patterns of global society.

  • Fabian Giesen with a deep perspective on how communication has evolved to use a similar pattern. Networks all the way down (part 2): anything we would call a computer these days is in fact, for all practical purposes, a heterogeneous cluster made up of various specialized smaller computers, all connected using various networks that go by different names and are specified in different standards, yet are all suspiciously similar at the architecture level; a fractal of switched, packet-based networks of heterogeneous nodes that make up what we call a single “computer”. It means that all the network security problems that plague inter-computer networking also exist within computers themselves. Implementations may change substantially over time, but the interfaces – protocols, to stay within our networking terminology – stay mostly constant over large time scales, warts and all.

Don't miss all that the Internet has to say on Scalability. Click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read, so please keep on reading)...

Categories: Architecture

Hamsterdb: An Analytical Embedded Key-value Store

 

In this post, I’d like to introduce you to hamsterdb, an Apache 2-licensed, embedded analytical key-value database library similar to Google's leveldb and Oracle's BerkeleyDB.

hamsterdb is not a new contender in this niche. In fact, hamsterdb has been around for over 9 years. In that time, it has grown dramatically, and the focus has shifted from a pure key-value store to an analytical database offering functionality similar to a column store database.

hamsterdb is single-threaded and non-distributed, and users usually link it directly into their applications. hamsterdb offers a unique (at least, as far as I know) implementation of Transactions, as well as other unique features similar to column store databases, making it a natural fit for analytical workloads. It can be used natively from C/C++ and has bindings for Erlang, Python, Java, .NET, and even Ada. It is used in embedded devices and on-premise applications with millions of deployments, as well as in cloud instances for caching and indexing.

hamsterdb has a unique feature in the key-value niche: it understands schema information. While most databases do not know or care what kind of keys are inserted, hamsterdb supports key types for binary keys...

Categories: Architecture

Success Articles for Work and Life

"Success consists of going from failure to failure without loss of enthusiasm." -- Winston Churchill

I now have more than 300 articles on the topic of Success to help you get your game on in work and life:

Success Articles

That’s a whole lot of success strategies and insights right at your fingertips. (And it includes genius from a wide variety of sources, including Scott Adams, Tony Robbins, Bruce Lee, Zig Ziglar, and more.)

Success is a hot topic. 

Success has always been a hot topic, but it seems to be growing in popularity.  I suspect it’s because so many people are being tested in so many new ways and competition is fierce.

But What is Success? (I tried to answer that using Zig Ziglar’s frame for success.)

For another perspective, see Success Defined (It includes definitions of success from Stephen Covey and John Maxwell.)

At the end of the day, the most important definition of success is the one that you apply to you and your life.

People can make or break themselves based on how they define success for their life.

Some people define success as another day above ground, but others set a very high, very strict bar that only a few mere mortals can ever achieve.

That said, everybody is looking for an edge.   And, I think our best edge is always our inner edge.

As one of my mentors put it, “the fastest thing you can change in any situation is yourself.”  And as we all know, nature favors the flexible.  Our ability to adapt and respond to our changing environment is the backbone of success.  Otherwise, success is fleeting, and it has a funny way of eluding or evading us.

I picked a few of my favorite articles on success.  They are a little different by design.  Here they are:

Scott Adams’ (Dilbert) Success Formula

It’s the Pebble in Your Shoe

The Wolves Within

Personal Leadership Helps Renew You

The Power of Personal Leadership

Tony Robbins on the 7 Traits of Success

The Way of Success

The future is definitely uncertain.  I’m certain of that.   But I’m also certain that life’s better with skill and that the right success strategies under your belt can make or break you in work and life.

And the good news for us is that success leaves clues.

So make like a student and study.

Categories: Architecture, Programming

The Easy Way of Building a Growing Startup Architecture Using HAProxy, PHP, Redis and MySQL to Handle 1 Billion Requests a Week

This Case Study is a guest post written by Antoni Orfin, Co-Founder and Software Architect at Octivi

In this post I'll show you how we developed a quite simple architecture based on HAProxy, PHP, Redis and MySQL that seamlessly handles approximately 1 billion requests every week. There'll also be a note on possible ways of scaling it out further, and a look at the uncommon patterns that are specific to this project.

Stats:
Categories: Architecture

The AngularJS Promise DSL

Xebia Blog - Mon, 08/11/2014 - 10:21

As promised in my previous post, I just pushed the first version of our "Angular Promise DSL" to Github. It extends AngularJS's $q promises with a number of helpful methods to create cleaner applications.
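
To give a taste of what that means in practice, here's a minimal before-and-after sketch using the library's thenSet method (described in my earlier post on decorating $q); the service and scope names are just illustrative:

CustomerService.getCustomer(currentCustomer)
  .then(function (customer) {
    // plain $q: a custom callback just to assign the result to the scope
    $scope.customer = customer;
  });

CustomerService.getCustomer(currentCustomer)
  .thenSet($scope, 'customer'); // with ngPromiseDsl: the assignment becomes a named method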

The project is a V1; it may be a bit rough around the edges in terms of practical applicability and documentation, but that's why it's open source now.

The repository is at https://github.com/fwielstra/ngPromiseDsl and licensed as MIT. It's the first OS project I've created, so bear with me. I am accepting pull requests and issues, of course.

Questions? Ask them on the issues page, ask me via Twitter (@frwielstra) or send me an e-mail. I'd invite you to come by my office too... if I had one.

Introduction to big data presentation

I presented big data to Amdocs’ product group last week. One of the sessions I did was recorded, so I might be able to add it here later. Meanwhile you can check out the slides.

Note that, trying to keep the slides visual, I put some of the information in the slide notes and not on the slides themselves.

Big data Overview from Arnon Rotem-Gal-Oz

Categories: Architecture

Are You Doing Agile Results?

If you already use Agile Results as your personal results system, you have a big advantage.

Why?

Because most people are running around, scrambling through a laundry list of too many things to do, with no clarity around what the end results or outcomes should be, and no clarity around which high-value things to focus on.  They are using their worst energy for their most important things.  They are spending too much time on the things that don’t matter and not enough time on the things that do.  They are feeling at their worst when they need to feel at their best, and they are struggling to keep up with the pace of change.

I created Agile Results to deal with the chaos in work and life, as a way to rise above the noise, and to easily leverage the most powerful habits and practices for getting better results in work and life.

Agile Results, in a nutshell, is a simple system for mastering productivity and time management, while at the same time, achieving more impact, realizing your potential, and feeling more fulfillment.

I wrote about the system in the book Getting Results the Agile Way.  It’s been a best seller in time management.

How Does Agile Results Work?

Agile Results works by combining proven practices for productivity, time management, psychology, project management, and some of the best lessons learned on high-performance.   And it’s been tested for more than a decade under extreme scenarios and a variety of conditions from individuals to large teams.

Work-Life balance is baked into the system, but more importantly Agile Results helps you live your values wherever you are, play to your strengths, and rapidly learn how to improve your results in any situation.  When you spend more time in your values, you naturally tap into the skills and abilities that help bring out your best.

The simplest way to think of Agile Results is that it helps you direct your attention and apply your effort on the things that count.  By spending more time on high-value activities and by getting intentional about your outcomes, you dramatically improve your ability to get better results.

But none of that matters if you aren’t using Agile Results.

How Can You Start Using Agile Results?

Start simple.

Simply ask yourself, “What are the 3 wins, results, or outcomes that I want for today?”   Consider the demands you have on your plate, the time and energy you’ve got, and the opportunities you have for today, and write those 3 things down.

That’s it.   You’re doing Agile Results.

Of course, there’s more, but that’s the single most important thing you can do to immediately gain clarity, regain your focus, and spend your time and energy on the most valuable things.

Now, let’s assume this is the only post you ever read on Agile Results.   Let’s take a fast walkthrough of how you could use the system on a regular basis to radically and rapidly improve your results.

How I Do Agile Results …

Here’s a summary of how I do Agile Results.

I create a new monthly list at the start of each month that lists out all the things that I think I need to do, and I bubble up the 3 best things I could achieve or must get done to the top.   I look at it at the start of the week, and any time I’m worried that I’m missing something.  This entire process takes me anywhere from 10-20 minutes a month.

I create a weekly list at the start of the week, and I look at it at the start of each day, as input to my 3 target wins or outcomes for the day, and any time I’m worried if I’m missing anything.   This tends to take me 5-10 minutes at the start of the week.

I barely ever have to look at my lists – it’s the act of writing things down that gives me quick focus on what’s important.   I’m careful not to put a bunch of minutia in my lists, because then I’d train my brain to stop focusing on what’s important, and I would become forgetful and distracted.  Instead, it’s simple scaffolding.

Each day, I write a simple list of what’s on my mind and things I think I need to achieve.   Next, I step back and ask myself, “What are the 3 things I want to accomplish today?”, and I write those down.   (This tends to take me 5 minutes or less.  When I first started it took me about 10.)

Each Friday, I take the time to think through three things going well and three things to improve.   I take what I learn as input into how I can simplify work and life, and how I can improve my results with less effort and more effectiveness.   This takes me 10-20 minutes each Friday.

How Can You Adopt Agile Results?

Use it to plan your day, your week, and your month.

Here is a simple recipe for adopting Agile Results and using it to get better results in work and life:

  1. Add a recurring appointment on your calendar for Monday mornings.  Call it Monday Vision.   Add this text to the body of the reminder: “What are your 3 wins for this week?”
  2. Add a recurring appointment on your calendar to pop up every day in the morning.  Call it Daily Wins.  Add this text to the body of the reminder: “What are your 3 wins for today?”
  3. Add a recurring appointment on your calendar to pop up every Friday mid-morning.  Call it Friday Reflection.  Add this text to the body of your reminder:  “What are 3 things going well?  What are 3 things to improve?”
  4. On the last day of the month, make a full list of everything you care about for the next month.   Alphabetize the list.  Identify the 3 most important things that you want to accomplish for the month, and put those at the top of the list.   Call this list  Monthly Results for Month XYZ.  (Note – Alphabetizing your list helps you name your list better and sort your list better.  It’s hard to refer to something important you have to do if you don’t even have a name for it.  If naming the things on your list and sorting them is too much to do, you don’t need to.  It’s just an additional tip that helps you get even more effective and more efficient.)
  5. On Monday of each week, when you wake up, make a full list of everything you care about accomplishing for the week.  Alphabetize the list.  Identify the 3 most important things you want to accomplish and add that to the top of the list.  (Again, if you don’t want to alphabetize then don’t.)
  6. On Wednesdays, in the morning, review the three things you want to accomplish for the week to see if there is anything that matters that you should have spent time on or completed by now.  Readjust your priorities and focus as appropriate.  Remember that the purpose of having the list of your most important outcomes for the week isn’t to get good at predicting what’s important.  It’s to help you focus and to help you make better decisions about what to spend time on throughout the week.  If something better comes along, then at least you can make a conscious decision to trade up and focus on that.  Keep trading up.   And when you look back on Friday, you’ll know whether you are getting better at trading up, or whether you are just getting randomized or focusing on the short term while hurting the long term.
  7. On Fridays, in the morning, do your Friday Reflection.  As part of the exercise, check against the weekly outcomes and monthly outcomes that you want to accomplish.  If you weren’t effective for the week, don’t ask “why not,” ask “how to.”   Ask how you can bite off better things and make better choices throughout the week.  Just focus on little behavior changes, and this will add up over time.  You’ll get better and better as you go, as long as you keep learning and changing your approach.   That’s the Agile Way.

There are lots of success stories by other people who have used Agile Results: everybody from presidents of companies to people in the trenches, from doctors and teachers to teams and leaders, as well as single parents and social workers.

But none of that matters if it’s not your story.

Work on your success story and just start getting better results, right here, right now.

What are the three most important things you really want to accomplish or achieve today?

Categories: Architecture, Programming

Stuff The Internet Says On Scalability For August 8th, 2014

Hey, it's HighScalability time:


Physicists reveal the scaling behaviour of exotic giant molecules.
  • 5 billion: Transistors Intel manufactures each second; 396M: WeChat active users
  • Quotable Quotes:
    • @BenedictEvans: Every hour or so, Apple ships phones with something around 2x more transistors than were in all the PCs on earth in 1995.
    • @robgomes: New client. Had one of their employees tune an ORM-generated query. Reduced CPU by 99.999%, IO by 99.996%.  Server now idle.
    • @pbailis: As a hardware-oriented systems builder, I'd pay attention to, say, ~100 ns RTTs via on-chip photonic interconnects
    • @CompSciFact: "Fancy algorithms are buggier than simple ones, and they're much harder to implement." -- Rob Pike's rule No. 4
    • @LusciousPear: I'm probably doing in Google what would have taken 5-8 engineers on AWS.
    • C. Michael Holloway, NASA: To a first approximation, we can say that accidents are almost always the result of incorrect estimates of the likelihood of one or more things.
    • Stephen O'Grady: More specific to containers, however, is the steady erosion in the importance of the operating system.

  • Wait, I thought mobile meant making single purpose apps? Mobile meant tearing down the portal cathedrals built by giants of the past. Then Why aren’t App Constellations working?: The App Constellation strategy works when you have a core resource which can be shared across multiple apps. 

  • Decentralization: I Want to Believe. The irony is mobile loves centralization, not p2p. Mobile IP addresses change all the time and you can't run a server on a phone. The assumption that people want decentralization has been disproven. Centralized services have won. People just want a service that works. The implementation doesn't matter that much. Good discussion on HackerNews and on Reddit.

  • Myth: It takes less money to start a startup these days. Sort of.  Why the Structural Changes to the VC Industry Matter: It turns out that, while it is in fact cheaper to get started and enter the market, it also requires more money for the breakout companies to win the market. Ultimately, today’s winners have a chance to be a lot bigger. But winning requires more money for geographic expansion, full-stack support of multiple new disciplines, and product expansion. And these companies have to do all of this while staying private for a much longer period of time; the median for money raised by companies prior to IPO has doubled in the past five years. 

Don't miss all that the Internet has to say on Scalability. Click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read, so please keep on reading)...

Categories: Architecture

Extending AngularJS services with the Decorate method

Xebia Blog - Fri, 08/08/2014 - 12:00

Many large Angular applications tend to see a lot of repetition - same API endpoint, same method of dealing with and transforming data, etcetera. One technique you can use to at least alleviate that is using AngularJS's decorate method, which allows you to extend, adjust or even fully replace any existing service.

As you'll see in this post, using this allows you to modify and extend the framework you build your app in, which will lead to a cleaner, more legible codebase, written in a more functional style (the what, not the how).

Update 11/8: The follow-up is now live, along with the GitHub repository.

A feature not often used when developing AngularJS applications is the $provide service, which is the primary service used to register components with the $injector. More commonly, a developer would use methods like $provide.service() or $provide.factory() to do so, but those are merely utility methods defined in the $provide service and exposed via angular.module().

The main reason to use $provide over the service() and factory() methods is to configure the service before it's instantiated. While there may be more advanced use-cases for using $provide, I haven't yet encountered them while developing regular applications and I'm sure they won't occur often.

One of the methods listed at the very bottom of the $provide documentation is the decorate() method. It doesn't look like much (it's at the bottom, after all), but its documentation hints that it's very powerful:

"A service decorator intercepts the creation of a service, allowing it to override or modify the behaviour of the service. The object returned by the decorator may be the original service, or a new service object which replaces or wraps and delegates to the original service."

Nothing to add there. You can use decorate() to change, add to, or completely replace the behaviour of a service without having to edit its code. This can be done on any code not your own - core AngularJS services, but also third-party libraries. It's the equivalent of overriding methods in OO languages or monkey-patching in the more dynamic languages.
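
To make the pattern concrete before we dive into $q, here's a minimal sketch that decorates a hypothetical 'greeter' service so every call to its greet() method gets logged. The service and its method are made up for illustration; $provide.decorator() and the $delegate injection are the actual AngularJS mechanics:

angular.module('myApp')
  .config(function ($provide) {
    $provide.decorator('greeter', function ($delegate, $log) {
      // 'greeter' and greet() are hypothetical; substitute any registered service
      var originalGreet = $delegate.greet;

      // wrap the original method; behaviour is added, not replaced
      $delegate.greet = function (name) {
        $log.debug('greeting ' + name);
        return originalGreet.call($delegate, name);
      };

      // hand the (modified) original service back to the injector
      return $delegate;
    });
  });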

“Isn’t that evil?”, I hear you ask. As with every programming-related question, the only correct answer is: it depends. I’m going to give a few practical examples of when I believe using decorate() is appropriate. In a future blog post, I'll expand on this example, showing how relatively simple code can positively influence your entire application architecture.

Here’s a practical example, a neat follow-up on my previous blog about angular promises: decorating $q to add methods to the promise object. The promise API itself defines only one method: then(). $q adds a few simple methods to that like catch() and finally(), but for your own application you can add a few more.

If you’ve been working with promises for a little while in your AngularJS application, you’ve probably noticed some operations are pretty common; assigning the promise result to the scope (or any object), logging the result in the console, or calling some other method. Using decorate(), we can add methods to the promise object to simplify those. Here's a bit of code from my previous post; we'll add a method to $q to remove the need for a callback:

CustomerService.getCustomer(currentCustomer)
 .then(CartService.getCart)
 .then(function(cart) {
   $scope.cart = cart;
 })
 .catch($log.error);

First, we’ll need some boilerplate: we create a function that adds our methods to the promise object, and then we replace all the default promise methods. Note that the decorating function also applies itself to the result of the wrapped promise.then call, so that our customisations aren’t lost further down a promise chain:

angular.module('ngPromiseDsl', [])
  .config(function ($provide) {
    $provide.decorator('$q', function ($delegate, $location) {

      // decorating method

      function decoratePromise(promise) {
        var then = promise.then;

        // Overwrite promise.then. Note that $q's custom methods (.catch and .finally) are implemented by using .then themselves, so they're covered too.

        promise.then = function (thenFn, errFn, notifyFn) {
          return decoratePromise(then(thenFn, errFn, notifyFn));
        };

        return promise;
      }

      // wrap and overwrite $q's deferred object methods

      var defer = $delegate.defer,
        when = $delegate.when,
        reject = $delegate.reject,
        all = $delegate.all;

      $delegate.defer = function () {
        var deferred = defer();
        decoratePromise(deferred.promise);
        return deferred;
      };

      $delegate.when = function () {
        return decoratePromise(when.apply(this, arguments));
      };

      $delegate.reject = function () {
        return decoratePromise(reject.apply(this, arguments));
      };

      $delegate.all = function () {
        return decoratePromise(all.apply(this, arguments));
      };

      return $delegate;

    });
  });

With that boilerplate in place, we can now start adding methods. As I mentioned earlier, one of the most common uses of a then() function is to set the result onto the scope (or some other object). This is a fairly trivial operation, and it’s pretty straightforward to add it to the promise object using our decorator, too:

function decoratePromise(promise) {
  var then = promise.then;

  promise.then = function (thenFn, errFn, notifyFn) {
    return decoratePromise(then(thenFn, errFn, notifyFn));
  };

  // assigns the value given to .then on promise resolution to the given object under the given varName
  promise.thenSet = function (obj, varName) {
    return promise.then(function (value) {
      obj[varName] = value;
      return value;
    });
  };

  return promise;
}

That’s all. Put this .config block in your application's module definition, or create a new module and add a dependency to it, and you can use it throughout your application. Here's the same piece of code, now with our new thenSet method:

CustomerService.getCustomer(currentCustomer)
  .then(CartService.getCart)
  .thenSet($scope, 'cart')
  .catch($log.error);

This particular example can be extended in a multitude of ways to add useful utilities to promises. In my current project we’ve added a number of methods to the promise object, which allows us to reduce the number of callback definitions in our controllers and thus create cleaner, more legible code.

 

Replacing custom callbacks with named methods allows for a more functional programming style, and allows readers to read and write code as a list of “whats”, instead of “hows” - and it's also fully asynchronous.

Extending $q is just the start though: Every angular service can be extended for various purposes - add performance monitoring and logging to $http, set common prefixes or fixed properties on $resource urls or template paths, you name it. Leave a remark in the comments about how you've used decorate() to create a better application.
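
As a rough illustration of that last point, here's what a timing-and-logging decorator for $http could look like. Consider it a sketch under assumptions rather than a recipe: it only times requests made by calling $http(config) directly, because the get()/post() shortcut methods are copied over from the original function and bypass the wrapper:

$provide.decorator('$http', function ($delegate, $log) {
  // $http is itself a function with utility methods attached, so we wrap the function...
  var timedHttp = function (config) {
    var started = Date.now();
    return $delegate(config).finally(function () {
      $log.debug(config.method + ' ' + config.url + ' took ' + (Date.now() - started) + 'ms');
    });
  };

  // ...and copy get(), post(), defaults, etc. over from the original
  angular.extend(timedHttp, $delegate);
  return timedHttp;
});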

Stay tuned for an upcoming post where I release a small open source project that extends angular’s promise objects with a number of helpful methods to perform common tasks.


The Principles of Modern Management

Are your management practices long in the tooth?

I think I was lucky that early on, I worked in environments that shook things up and rattled the cage in pursuit of more customer impact, employee engagement, and better organizational performance.

In one of the environments, a manufacturing plant, the management team flipped the typical pyramid of the management hierarchy upside down to reflect that the management team is there to empower and support the production line.

And when I was on the Microsoft patterns & practices team, we had an interesting mix of venture capitalist type management coupled with some early grandmasters of the Agile movement.   More than just Agile teams, we had an Agile management culture that encouraged a customer-connected approach to product development, complete with self-organizing, multi-disciplinary teams, empowered people, a focus on execution excellence, and a fierce focus on being a rapid learning machine. 

We thrived on change.

We also had a relentless focus on innovation.  Not just in our product, but in our process.  If we didn’t innovate in our process, then we got pushed out of market by becoming too slow, too expensive, or by lacking the quality experience that customers have come to expect.

But not everybody knows what a great environment for helping people thrive and do great things for the world looks like.

While a lot of people in software or in manufacturing have gotten a taste of Agile and Lean practices, there are many more businesses that don’t know what a modern learning machine of people and processes operating at a higher level looks like.

Many, many businesses and people are still operating and looking at the world through the lens of old world management principles.

In the book The Future of Management, Gary Hamel walks through the principles upon which modern management is based.

The Principles of Modern Management

Hamel gives us a nice way to frame looking at the modern management principles, by looking at their application, and their intended goal.

Via The Future of Management:

  • Standardization. Application: Minimize variances from standards around inputs, outputs, and work methods. Goal: Cultivate economies of scale, manufacturing efficiency, reliability, and quality.
  • Specialization (of tasks and functions). Application: Group like activities together in modular organizational units. Goal: Reduce complexity and accelerate learning.
  • Goal alignment. Application: Establish clear objectives through a cascade of subsidiary goals and supporting metrics. Goal: Ensure that individual efforts are congruent with top-down goals.
  • Hierarchy. Application: Create a pyramid of authority based on a limited span of control. Goal: Maintain control over a broad scope of operations.
  • Planning and control. Application: Forecast demand, budget resources, and schedule tasks, then track and correct deviations from plan. Goal: Establish regularity and predictability in operations; conformance to plans.
  • Extrinsic rewards. Application: Provide financial rewards to individuals and teams for achieving specified outcomes. Goal: Motivate effort and ensure compliance with policies and standards.

What are the Principles Upon Which Your Management Beliefs are Based?

Most people aren’t aware of the principles behind the management beliefs that they practice or preach.  But before coming up with new ones, it helps to know what current management thinking is rooted in.

Via The Future of Management:

“Have you ever asked yourself, what are the deepest principles upon which your management beliefs are based? Probably not.  Few executives, in my experience, have given much thought to the foundational principles that underlie their views on how to organize and manage.  In that sense, they are as unaware of their management DNA as they are of their biological DNA.  So before we set off in search of new management principles, we need to take a moment to understand the principles that comprise our current management genome, and how those tenets may limit organizational performance.”

A Small Nucleus of Core Principles

It really comes down to a handful of core principles.  These principles serve as the backbone for much of today’s management philosophy.

Via The Future of Management:

“These practices and processes of modern management have been built around a small nucleus of core principles: standardization, specialization, hierarchy, alignment, planning, and control, and the use of extrinsic rewards to shape human behavior.”

How To Maximize Operational Efficiency and Reliability in Large-Scale Organizations

It’s not by chance that the early management thinkers came to the same conclusions.  They were working on the same problems in a similar context.  Of course, the challenge now is that the context has changed, and the early management principles are often like fish out of water.

Via The Future of Management:

“These principles were elucidated early in the 20th century by a small band of pioneering management thinkers -- individuals like Henri Fayol, Lyndall Urwick, Luther Gullick, and Max Weber. While each of these theorists had a slightly different take on the philosophical foundations of modern management, they all agreed on the principles just enumerated. This concordance is hardly surprising, since they were all focusing on the same problem: how to maximize operational efficiency and reliability in large-scale organizations. Nearly 100 years on, this is still the only problem that modern management is fully competent to address.”

If your management philosophy and guiding principles are nothing more than a set of hand-me-downs from previous generations, it might be time for a re-think.

You Might Also Like

Elizabeth Edersheim on Management Lessons of a Lifelong Student

How Employees Lost Empathy for their Work, for the Customer, and for the Final Product

No Slack = No Innovation

The Drag of Old Mental Models on Innovation and Change

The New Competitive Landscape

The New Realities that Call for New Organizational and Management Capabilities

Who’s Managing Your Company

Categories: Architecture, Programming

A Valuable Sprint Review (a.k.a. Demo): How To

Xebia Blog - Thu, 08/07/2014 - 10:05

A valuable Sprint Review (from now on in this blog referred to as Demo) can be built in three steps. It starts during the Sprint planning session with agreeing on and understanding the user stories on the Sprint backlog. Then, during the Sprint, the team constantly exchanges ideas and results of the realisation of the Story. Finally, during the demo itself, the Product Owner and the rest of the team demo the stories to the stakeholders to display the value delivered and open up for feedback.

Planning for a good demo

During the planning session, it is imperative that the Product Owner and the rest of the team understand the stories that will be picked up. This sounds obvious, but it often happens that this is not the case. Stories might be so technical that the Product Owner is disconnected, or so high level that it is hard to determine what needs to be done.

Make sure stories are formulated from the perspective of an end-user of the functionality that will be delivered. This could be an actual user, a system that picks up whatever result is created or any other manifestation of who or what will use the result of the story.

Also take care to get the acceptance criteria clear. This way it will be clear to developers what to build, to testers what to test for, and to designers what to design. It will help the Product Owner get a better idea of what is in scope and what might have to be defined in a new or other user story.

It is important that everyone understands the context in which the story ‘lives’. What part of the system is touched (end-to-end is preferred but not always possible), which parties are affected by the change, what prerequisites are needed, etc.

Building for a great demo

When, during the creation of the value of each story, the whole team is in constant contact about intermediate results and decisions taken, everyone will be able to add to the value and be aware of what the result of the story will be. It is very important that the whole team is involved, including the Product Owner. When the PO sees the intermediate results, she or he can already form an image of what the result will be like. Also, the PO can contact stakeholders that might have an opinion on what is created and, when needed, adjust the end result to match expectations.

Delivering a valuable demo

In the demo, the Product Owner should present to the stakeholders the value of each user story that has been delivered. So, per story, explain what has changed from the perspective of the end-user and have the rest of the team show this. Also, when stories are not done, explain which (sub-)functionality is not yet finished. Make sure to ask for feedback from the end-user or other stakeholders on what is demonstrated.

Conclusion

The value of the demo depends largely on the cooperation of the entire team. When the Product Owner and the rest of the team work together on understanding what will be delivered, and help each other to get the most value from each story delivered, the demo will be focused, valuable and fun.

Scrumban

Xebia Blog - Wed, 08/06/2014 - 19:19

Scrum has become THE revolution in the world of software development. The main philosophy behind scrum is accepting that a problem cannot be fully understood or defined at the start; scrum focuses on maximizing the team's ability to deliver quickly and respond to emerging requirements. It came as a truly refreshing change at a time when projects were ruled by procedure and MS Project planning. Because of scrum:

  1. Projects can deliver what the customer needs, not just what he thought he wanted.
  2. Teams are efficient. They work as a unit to reach a common goal.
  3. We have better project roles (like a product owner and scrum master), ceremonies (like daily stand-ups, grooming) and a scrumboard.

But the central question is: "are we there yet?" And the answer is: "No!" We can optimize scrum by mixing it with kanban, which leads to scrumban.

A kanban introduction for scrummers

Unlike scrum, which is a software development framework in the widest sense of the term, kanban is a method. It, for instance, does not define ceremonies and project roles. There are two main principles in kanban I would like to highlight:

  1. Each column on the kanban board represents a step in the workflow. So, instead of the lanes 'todo', 'in progress' and 'done' like in scrum, you have 'defining', 'developing', 'testing' and 'deploying'. That is a more full-stack view; a task has a wider lifecycle. This concept is also called 'from concept to cash': from user research and strategic planning to data center operations and product support.
  2. Another principle of kanban is that it limits WIP (work in progress). An example of a WIP limit is limiting the number of cards allowed in each column. The advantage is that it reveals bottlenecks dynamically. Because of the WIP limit, kanban is a pull mechanism. For instance, a tester can only pick up a next work item if there are items available in the done column of the development lane and the WIP limit of the test lane isn't exceeded.

All in all, kanban is incredibly simple, but at the same time incredibly powerful.

What's wrong with scrum?

  1. The reason we went to scrum is that we did not want the waterfall approach anymore. But in fact each sprint in scrum has become a mini waterfall. In each sprint teams plan, try to design, develop and test. At the end the product owner reviews the completed work and decides which of the stories are shippable and ready for production. Those sprints can result in a staccato flow, which can be exhausting. With kanban we can make sprints more agile, and the goal is to have a more continuous flow. Compare it with running a marathon: you don't run it as a series of 200-meter sprints, but rather at a constant rate.
  2. Scrum is a push mechanism and therefore 'pushes' the quality out of your product. When a sprint backlog is defined, the team is asked for a commitment. Whatever happens, a team must satisfy its commitment, and at the end of the sprint the product owner must sign off on the work; otherwise the team has failed. No team wants to publicly fail, so most of the time, at the end of the sprint, teams take shortcuts to meet the deadline. Those shortcuts are killing for quality! Asking for commitment is like not trusting the intrinsic motivation of the team. The real commitment is visible during each standup. During a standup, team members have to tell each other what they've done the day before. If someone is working too long on a story, another team member will rebel. That is the real commitment.
  3. One of the reasons we do scrum is that it is better to start immediately than to do an estimation and a feasibility study upfront, because almost always, after the study is completed, the project will be executed anyway. The estimation at the start is not reliable and the feasibility study is just a waste of time. Aren't we making the same mistake in scrum with the grooming and ready sessions, which cause a lot of overhead? The first overhead during grooming is that we give an estimation with relative precision. It is in a developer's nature to argue about the story points; is it 3, 5, 8 or maybe 1 point? And that is waste. You should only talk about story sizes: large, medium or small. Making a more precise estimation is just a waste of time, because there are too many external factors. Second, with the grooming we do a mini feasibility study. With the team we think about a direction for the solution, which is fine. But most of the time it takes two or three sprints before it is realized in a sprint. And with all the weekends of beer in between, we've already forgotten the solution. So one smart guy says: 'yes, let's document it', but that is an inefficient answer to the real problem: there is too much time between the grooming and the realization.

Scrumban: the board of kanban


A scrumban board

The first column in a scrumban board is reserved for the backlog, where stories are ordered by their respective priority. There are several rules for the kanban backlog:

  1. It is the product owner's responsibility to fill this lane with stories, and keep it steadily supplied. The product owner must avoid creating or analyzing too many stories, because this is waste and it conflicts with the Just-In-Time principle of scrumban. Therefore the scrumban board has a WIP limit of 5 backlog stories.
  2. Assure the necessary level of analysis before starting development. Each story must be analyzed with minimal effort. That should be done in the Weekly Timeblock (WTB), which will be discussed later on.
  3. The backlog should be event-driven with an order point.
  4. Prioritization-on-demand. The ideal work planning process should always provide the team with the best things to work on next, no more and no less.

Next to the backlog column is the tasking column, in which there should always be at least one story that is tasked (a minimum WIP limit). If this isn't the case, the team will task after the standup to satisfy this condition. A story is picked up from the backlog and is tasked by the team. Tip: put the tasked cards at the back of the story cards. The next columns are the realization columns. Each team is free to add, remove or change columns so the board suits the business. In the realization columns there should be a maximum number of stories being worked on. If the maximum limit has not been reached, a story can be pulled from the tasking column and unfolded on the 'to implement' lane. Now the team can work on the tasks of the story. Each task that is implemented can be moved to the 'ready' lane. If all of the tasks are done for a story, the story can be moved to the next lane. When the story and tasks are ready, the cards can be moved to the bottom right of the board, so there is a new horizontal lane available for the next story.

Scrumban: the ceremonies of scrum

With scrumban we only have two types of meetings: the daily standup and the Weekly Timeblock. The Weekly Timeblock is a recurring meeting used for multiple purposes. It should be set up in the middle of the week. The big advantage of the Weekly Timeblock is that developers are distracted from their work only once a week (instead of by the various meetings of scrum).

The Weekly Timeblock contains three parts. First, there is a demo of the work done. Second, there is a retrospective on the development process of the last week. Third, the team has a preview of upcoming work items. The team tries to understand the intent of each item and provides feedback. The only size indication a team has to make is whether a story is small, medium or large. Avoid using poker cards/story points, which are too fine-grained and too vague.

Conclusion

Scrumban is a mix of the scrum ceremonies and the kanban method. With scrumban we

  1. Introduce the Weekly Timeblock (WTB). The Weekly Timeblock should be around 4 hours, and there are no other meetings besides the daily standup.
  2. Have a wider lifecycle of a story: 'from concept to cash'.
  3. Change the scrumboard to match company workflows, and avoid the push principle of a sprint by using a pull mechanism.

Tokutek White Paper: A Comparison of Log-Structured Merge (LSM) and Fractal Tree Indexing

What data structure does your database use? It's not something the typical database user spends much time pondering. Since data structure is destiny, what data structure your database uses is a key point to consider in your selection process.

We know CouchDB uses a modified B+ tree. We've learned a lot of fascinating details over the years about the use of Log-structured merge-trees in Cassandra, HBase and LevelDB. So B+ trees and LSMs seem familiar by now.

What may not be so familiar is Tokutek's Fractal Tree Indexing technology, which is supposed to be even better than B+ trees and LSMs.

As a comparison between Fractal Tree Indexing and LSMs, Bradley Kuszmaul, Chief Architect at Tokutek, has written a detailed paper, a must read for the algorithmically inclined or someone interested in database internals: A Comparison of Log-Structured Merge (LSM) and Fractal Tree Indexing.

Here's a quick intro to Fractal Tree (FT) indexes:

Categories: Architecture

Kanban should be the default choice for DevOps teams

Xebia Blog - Wed, 08/06/2014 - 14:59

We had a successful workshop at DevOpsDays 2014. Our main point was that Kanban should be the default choice for DevOps teams. The presentation can be downloaded here.

DevOpsDays 2014 was a success

On the 19th, 20th & 21st of June 2014 the second edition of DevOpsDays Amsterdam was held in Pakhuis De Zwijger in Amsterdam. This year I was asked to teach a course there on Kanban for DevOps. At the 2013 edition I also gave a presentation about this subject and it was nice to be invited back to this great event.

With the Open Source mindset in mind, I teamed up with Maarten Hoppen and Bas van Oudenaarde. Our message: Kanban should be the default choice for DevOps teams.

The response to this workshop was very positive and because we received a lot of great feedback I thought I’d share the slide deck. The presentation assumes you are working in an environment where DevOps might work or is already being implemented. 

Main points of the presentation

DevOps is about Culture, Automation, Measurement and Sharing (CAMS). These four values require a way of working that looks past existing processes, handovers and role descriptions. 

The Kanban Method is about looking at your organization in a different way, from the perspective of:

  • Sustainability: by shaping demand and limiting Work in Progress
  • Service-Orientation: by creating an SLA based on past results and data
  • Survivability: by creating an improvement mindset in the organization to respond to rapidly changing environments

These three different ways the Kanban Method makes you look at your organization make it an extremely powerful solution for DevOps.

If you are interested to learn more about the workshop, check out the slides here:

http://www.slideshare.net/jsonnevelt/kanban-bootcamp-devopsdays-2014

How Employees Lost Empathy for their Work, for the Customer, and for the Final Product

“In most cases being a good boss means hiring talented people and then getting out of their way.” -- Tina Fey

The Digital Revolution marked the beginning of the Information Age.

The Information Age, or Digital Age, or New Media Age, is a shift away from the industrial revolution to an economy based on information computerization.  Some would say, along with this shift, we are now in a Knowledge Economy or a Digital Economy. 

This opens the door to new ways of working and a new world of work to generate new business value and customer impact.

But what did the Industrial Age do to employees and what paradigms could limit us in this new world?

In the book The Future of Management, Gary Hamel walks through how industrialization and large Enterprises have created a disconnect between employees and their customers, their final product, and the big financial picture.  And in the process, he argues, this has led to disengaged employees, crippled innovation, and inflexible organizations.

If you don’t know Gary Hamel, he’s been ranked the #1 influential business thinker by the Wall Street Journal.

According to Hamel, what we traded for scale and efficiencies created gaps between workers and owners, and gaps between employees and their customers, the product, and the financial impact … along with a diminished sense of responsibility for quality and efficiency.

Maybe We Have Managers Because We Have Employees

Do managers exist because employees do?

Via The Future of Management:

“Here's a thought.  Maybe we need 'managers' because we have 'employees.'  (Be patient, this is not as tautological as it sounds.)  Think about the way computers are dependent on software.  PCs aren't smart enough to write their own operating instructions, and they sit idle until a user sets them to work.  Perhaps the same is true for employees.”

Did We Manufacture a Need for Managers?

When we manufactured employees, did we manufacture a need for managers?

Via The Future of Management:

“Earlier, I talked about the invention of 'the employee.' What happened in this process, at the dawn of the 20th century?  How did work life change as individuals left their farms and workshops to be absorbed into large-scale organizations?  In manufacturing employees, did we manufacture a need for managers as well?  I think so.  If we understood how this came about, we will gain clues into how we might learn to manage without managers -- or, at least, with a lot fewer of them.”

Disconnected from the Customer

As the size and scale of industrial organizations grew, so did the disconnect between employees and their final customers.

Via The Future of Management:

“In pre-industrial times, farmers and artisans enjoyed an intimate relationship with their customers.  The feedback they received each day from their patrons was timely and unfiltered.  Yet as industrial organizations grew in size and scale, millions of employees found themselves disconnected from the final customer.  Robbed of direct feedback, they were compelled to rely on others who were closer to the customer to calibrate the effectiveness of their efforts and to tell them how they could better please their clients.”

A Diminished Sense of Responsibility for Producer Quality and Efficiency

Without a connection to the customer, employees lose empathy for their work, for the customer, and for the final product.

Via The Future of Management:

“As companies divided themselves into departments and functions, employees also became disconnected from the final product.  As tasks became narrower and more specialized, employees lost their emotional bond with the end product.  The result? A diminished sense of responsibility for producer quality and efficiency.  No longer were workers product craftsmen, now they were cogs in an industrial machine over which they had little control.”

Employees No Longer Have a System Wide View of the Production Process

It’s hard to make changes to the system when you no longer have a system wide view.

Via The Future of Management:

“Size and scale also separate employees from their coworkers.  Working in semi-isolated departments, they no longer had a system wide view of the production process.  If that system was suboptimal, they had no way of knowing it and no way of correcting it.”

The Gap Widens Between Workers and Owners

People at the top don’t hear from the people at the bottom.

Via The Future of Management:

“Industrialization also enlarged the gulf between workers and owners.  While a 19th-century apprentice would have had the ear of the proprietor, most 20th-century employees reported to low-level supervisors.  In a large enterprise a junior employee could work for decades and never have the chance to speak one-on-one with someone empowered to make important policy decisions.”

The Scoreboard is Contrived

Scoreboards tell employees how they are doing their jobs, but not how the company is doing overall.

Via The Future of Management:

“In addition, growing operational complexity fractured the information that was available to employees.  In a small proprietorship, the financial scoreboard was simple and real time; there was little mystery about how the firm was doing.  In a big industrial company, employees had a scoreboard but it was contrived.  It told workers how they were doing their jobs, but little about how the company was doing overall.  With no more than a knothole view of the company's financial model, and only a sliver of responsibility for results, it was difficult for an employee to feel a genuine burden for the company's performance.”

Industrialization Disconnects Employees from Their Own Creativity

Standardizing jobs and processes limits innovation in the jobs and processes.  They are at odds.

Via The Future of Management:

“Finally, and worst of all, industrialization disconnected employees from their own creativity.  In the industrial world, work methods and procedures were defined by experts and, once defined, were not easily altered.  No matter how creative an employee might be, the scope for exercising that gift was severely truncated.”

The Pursuit of Scale and Efficiency Advantages Disconnected Workers from Their Essential Inputs

With the disconnect between employees and their inputs, there was a natural need for the management class.

Via The Future of Management:

“To put it simply, the pursuit of scale and efficiency advantages disconnected workers from the essential inputs that had, in earlier times, allowed them to be (largely) self-managing -- and in so doing, it made the growth of an expansive managerial class inevitable.”

Employees Don’t Lack Wisdom and Experience

Employees don’t lack wisdom and experience.  They just lack information and context.

Via The Future of Management:

“To a large extent, employees need managers for the same reason 13-year-olds need parents: they are incapable of self-regulation.  Adolescents, with their hormone-addled brains and limited life experience, lack the discernment to make consistently wise choices.  Employees on the other hand, aren't short of wisdom and experience, but they do lack information and context -- since they are so often disconnected from customers, associates, end products, owners, and the big financial picture.  Deprived of the ability to exercise control from within, employees must accept control from above.  The result: disaffection.  It turns out that employees enjoy being treated like 13-year-olds even less than 13-year-olds.”

Disengaged Employees, Hamstrung Innovation, and Inflexible Organizations

What is the result of all this disconnect?   Stifled innovation, rigid organizations, and disinterested employees.

Via The Future of Management:

“Disengaged employees.  Hamstrung innovation.  Inflexible organizations.  Although we are living in a new century, we are still plagued by the side effects of a management model that was invented roughly a hundred years ago.  Yet history doesn't have to be destiny -- not if you are willing to go back and reassess the time-forgotten choices that so many others still take for granted.  With the benefit of hindsight, you can ask: How have circumstances changed? Are new approaches possible? Must we be bound by the shackles of the past?  These are essential questions for every management innovator.”

Does history have to be destiny?

We’re writing new chapters of history each and every day.

In all of my experience, where I’ve seen productivity thrive, people shine, and innovation unleashed, it’s when employees are connected with customers, they are empowered and encouraged to make changes to processes and products, and they are part of a learning organization with rapid feedback loops.

You Might Also Like

Ability to Execute

Business Value Generation is the New Bottleneck

Customer-Connected Development

How We Adhered to the Agile Manifesto on the Microsoft patterns & practices Team

Why So Many Ideas Die or Don’t Get Adopted

Categories: Architecture, Programming

100+ Motivational Quotes to Inspire Your Greatness

I updated my Motivational Quotes page.

I’ve got more than 100 motivational quotes on the page to help you find your inner-fire.

It’s not your ordinary motivational quotes list.  

It’s deep and it draws from several masters of inspiration including Bruce Lee, Jim Rohn, and Zig Ziglar.

Here is a sampling of some of my personal favorite motivational quotes:

“If you always put limits on everything you do, physical or anything else, it will spread into your work and into your life. There are no limits. There are only plateaus, and you must not stay there; you must go beyond them.” – Bruce Lee

“Knowing is not enough; we must apply. Willing is not enough; we must do.” - Johann Wolfgang von Goethe

“Kites rise highest against the wind; not with it.” – Winston Churchill

“To hell with circumstances; I create opportunities.” – Bruce Lee

“Our greatest glory is not in never falling but in rising every time we fall.” – Confucius

“There is no such thing as failure. There are only results.” – Tony Robbins

“When it’s time to die, let us not discover that we have never lived.” – Henry David Thoreau

“People who say it cannot be done should not interrupt those who are doing it.” – Anonymous

“Motivation alone is not enough. If you have an idiot and you motivate him, now you have a motivated idiot.” – Jim Rohn

“If you love life, don’t waste time, for time is what life is made up of.” – Bruce Lee

For more quotes, check out my motivational quotes page.

It’s a living page and at some point I’ll do a complete revamp.

I think in the future I’ll organize it by sub-categories within motivation rather than by people. At the time it made sense to group words of wisdom by various folks, but now I think grouping motivational quotes by sub-categories would work much better, especially with such a large quantity of quotes.

Categories: Architecture, Programming

Using C3js with AngularJS

Gridshore - Tue, 07/29/2014 - 17:17

C3js is a JavaScript charting library built on top of D3js. For a project we needed graphs, and we ended up using C3js. How this happened, and some of the first steps we took, is written down by Roberto van der Linden in his blogpost: Creating charts with C3.js

Using C3js without other JavaScript libraries is of course perfectly doable. Still, I think using it in combination with AngularJS is interesting to evaluate. In this blogpost I am going to document the steps I took to go from the basic C3 sample to a more advanced AngularJS-powered sample.

Setup project – most basic sample

The easiest setup is cloning my github repository: c3-angular-sample. If you are more the follow-along kind of person, then these are the steps.

We start by downloading C3js, d3 and AngularJS and create the first html page to get us started.
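For reference, the samples below assume a file layout along these lines. The d3 and c3 paths match the script tags used in the samples; the AngularJS path and file name are my assumption, as is graph-app-2.js, which is introduced later.

css/c3-0.2.4.css
js/d3/d3-3.4.11.min.js
js/c3/c3-0.2.4.min.js
js/angular/angular.min.js     (assumed path; any AngularJS 1.2.x build should do)
js/graph-app-2.js             (our own controller code, introduced below)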

Next we create index1.html, which loads the stylesheet as well as the required JavaScript files. Two things are important to notice. One, the div with id chart: this is used to position the chart. Two, the script block that creates the chart object with data.

<!doctype html>
<html>
<head>
	<meta charset="utf-8">

	<link href="css/c3-0.2.4.css" rel="stylesheet" type="text/css">
</head>
<body>
	<h1>Sample 1: Most basic C3JS sample</h1>
	<div id="chart"></div>

	<!-- Load the javascript libraries -->
	<script src="js/d3/d3-3.4.11.min.js" charset="utf-8"></script>
	<script src="js/c3/c3-0.2.4.min.js"></script>

	<!-- Initialize and draw the chart -->
	<script type="text/javascript">
		var chart = c3.generate({
		    bindto: '#chart',
		    data: {
		      columns: [
		        ['data1', 30, 200, 100, 400, 150, 250],
		        ['data2', 50, 20, 10, 40, 15, 25]
		      ]
		    }
		});
	</script>
</body>
</html>

Next we introduce AngularJS to create the chart object.

Use AngularJS to create the chart

The first step is to remove the script block and add more javascript libraries to load. We load the AngularJS library and our own js file called graph-app-2.js. In the html tag we initialise the AngularJS app using ng-app=”graphApp”. In the body tag we initialise the Graph controller and call the function when the Angular app is initialised.

<html ng-app="graphApp">
<body ng-controller="GraphCtrl" ng-init="showGraph()">
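For completeness, the extra script tags could look like the following sketch; the file names are assumed to match the layout described earlier.

	<!-- Load AngularJS and our own controller code -->
	<script src="js/angular/angular.min.js"></script>
	<script src="js/graph-app-2.js"></script>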

Now let us have a look at the AngularJS JavaScript code. Notice that there is hardly any difference between the JavaScript code in the two samples.

var graphApp = angular.module('graphApp', []);
graphApp.controller('GraphCtrl', function ($scope) {
	$scope.chart = null;
	
	$scope.showGraph = function() {
		$scope.chart = c3.generate({
			    bindto: '#chart',
			    data: {
			      columns: [
			        ['data1', 30, 200, 100, 400, 150, 250],
			        ['data2', 50, 20, 10, 40, 15, 25]
			      ]
			    }
			});		
	}
});

Still, everything is really static and mostly uses defaults. Let us move on and make it possible for the user to change the data and the type of chart.

Give the power to the user to change the data

We have two input boxes that accept comma-separated data and two dropdowns to change the type of the chart. It is fun to try out all the different types available. Below is first the piece of html that contains the form accepting the user input.

	<form novalidate>
		<p>Enter in format: val1,val2,val3,etc</p>
		<input ng-model="config.data1" type="text" size="100"/>
		<select ng-model="config.type1" ng-options="typeOption for typeOption in typeOptions"></select>
		<p>Enter in format: val1,val2,val3,etc</p>
		<input ng-model="config.data2" type="text" size="100"/>
		<select ng-model="config.type2" ng-options="typeOption for typeOption in typeOptions"></select>
	</form>
	<button ng-click="showGraph()">Show graph</button>

This should be easy to understand. There are a few fields taken from the model as well as a button that generates the graph. Below is the complete javascript code for the controller.

var graphApp = angular.module('graphApp', []);

graphApp.controller('GraphCtrl', function ($scope) {
	$scope.chart = null;
	$scope.config={};
	$scope.config.data1="30, 200, 100, 200, 150, 250";
	$scope.config.data2="70, 30, 10, 240, 150, 125";

	$scope.typeOptions=["line","bar","spline","step","area","area-step","area-spline"];

	$scope.config.type1=$scope.typeOptions[0];
	$scope.config.type2=$scope.typeOptions[1];


	$scope.showGraph = function() {
		var config = {};
		config.bindto = '#chart';
		config.data = {};
		config.data.json = {};
		config.data.json.data1 = $scope.config.data1.split(",");
		config.data.json.data2 = $scope.config.data2.split(",");
		config.axis = {"y":{"label":{"text":"Number of items","position":"outer-middle"}}};
		config.data.types={"data1":$scope.config.type1,"data2":$scope.config.type2};
		$scope.chart = c3.generate(config);		
	}
});

In the first part we initialise all the variables in the $scope that are used on the form. In the showGraph function we have added a few things compared to the previous sample. We no longer use the columns data provider; now we use the json provider. Therefore we create arrays out of the comma-separated number strings. We also add a label to the y-axis, and we set the chart types using the same names as in the json object with the data. I think this is a good time to show a screenshot, though it is much nicer to open the index3.html file yourself and play around with it.
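Concretely, with the default inputs above, showGraph effectively builds the call sketched below. Note that split(",") produces strings, some with leading spaces; the original sample relies on c3 being happy to coerce numeric strings.

// Sketch of the config that showGraph assembles for the default inputs.
var chart = c3.generate({
	bindto: '#chart',
	data: {
		json: {
			data1: ["30", " 200", " 100", " 200", " 150", " 250"],
			data2: ["70", " 30", " 10", " 240", " 150", " 125"]
		},
		types: { data1: "line", data2: "bar" }
	},
	axis: { y: { label: { text: "Number of items", position: "outer-middle" } } }
});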

[Screenshot of the resulting chart]

Now we want to do more with Angular: introduce a service that could obtain data from the server, and also make the data time-based.

Introducing time and data generation

The html is very basic; it only contains two buttons, one to start generating data and one to stop. Let us focus on the JavaScript we have created. When creating the application we now tell it to look for another module called graphApp.services. Then we create the service using the factory and register it with the application. That way we can inject the service later on into the controller.

var graphApp = angular.module('graphApp', ['graphApp.services']);
var services = angular.module('graphApp.services', []);
services.factory('dataService', [function() {
	function DataService() {
		var data = [];
		var numDataPoints = 60;
		var maxNumber = 200;

		this.loadData = function(callback) {
			if (data.length > numDataPoints) {
				data.shift();
			}
			data.push({"x":new Date(),"data1":randomNumber(),"data2":randomNumber()});
			callback(data);
		};

		function randomNumber() {
			return Math.floor((Math.random() * maxNumber) + 1);
		}
	}
	return new DataService();
}]);

This service exposes one method, loadData(callback). The result is that you get back a collection of objects with 3 fields: x, data1 and data2. The number of points is capped at 60, and the values are random numbers between 1 and 200.
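A minimal sketch of the callback contract, assuming we hold a reference to the service instance (for example via injection, as in the controller below):

// Each loadData call appends one random point and hands the whole
// collection to the callback.
dataService.loadData(function (points) {
	var latest = points[points.length - 1];
	console.log(latest.x, latest.data1, latest.data2); // Date, 1..200, 1..200
});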

Next is the controller. I start with the controller declaration, the parameters that are initialised, and the method that draws the graph.

graphApp.controller('GraphCtrl', ['$scope','$timeout','dataService',function ($scope,$timeout,dataService) {
	$scope.chart = null;
	$scope.config={};

	$scope.config.data=[]

	$scope.config.type1="spline";
	$scope.config.type2="spline";
	$scope.config.keys={"x":"x","value":["data1","data2"]};

	$scope.keepLoading = true;

	$scope.showGraph = function() {
		var config = {};
		config.bindto = '#chart';
		config.data = {};
		config.data.keys = $scope.config.keys;
		config.data.json = $scope.config.data;
		config.axis = {};
		config.axis.x = {"type":"timeseries","tick":{"format":"%S"}};
		config.axis.y = {"label":{"text":"Number of items","position":"outer-middle"}};
		config.data.types={"data1":$scope.config.type1,"data2":$scope.config.type2};
		$scope.chart = c3.generate(config);		
	}

Take note of $scope.config.keys, which is used later on in the config.data.keys object. With this we move from an array of data per line to an object with all the datapoints. The x is coming from the x property; the value properties are coming from data1 and data2. Also take note of the config.axis.x property. Here we specify that we are now dealing with timeseries data, and we format the ticks to show only the seconds. This is logical since we configured it to have a maximum of 60 points and we create a new point every second, as we will see later on. The next code block shows the other three functions in the controller.

	$scope.startLoading = function() {
		$scope.keepLoading = true;
		$scope.loadNewData();
	}

	$scope.stopLoading = function() {
		$scope.keepLoading = false;
	}

	$scope.loadNewData = function() {
		dataService.loadData(function(newData) {
			var data = {};
			data.keys = $scope.config.keys;
			data.json = newData;
			$scope.chart.load(data);
			$timeout(function(){
				if ($scope.keepLoading) {
					$scope.loadNewData()				
				}
			},1000);			
		});
	}

In the loadNewData method we use the $timeout function of AngularJS. After a second we call loadNewData again, unless the keepLoading property has been set explicitly to false. Note that adding a point to the graph and scheduling the recursive call are both done in the callback that is passed to the dataService.
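As a design note, AngularJS also ships $interval (since 1.2), which can express the same polling loop without explicit recursion. A sketch, assuming $interval is injected alongside $timeout and dataService; one difference is that $interval fires every second regardless of how long the callback takes, whereas the recursive $timeout never overlaps itself.

// Alternative polling loop; $interval.cancel replaces the keepLoading flag.
var poller = null;

$scope.startLoading = function () {
	if (poller) return; // already polling
	poller = $interval(function () {
		dataService.loadData(function (newData) {
			$scope.chart.load({ keys: $scope.config.keys, json: newData });
		});
	}, 1000);
};

$scope.stopLoading = function () {
	$interval.cancel(poller);
	poller = null;
};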

Now run the sample for longer than 60 seconds and see what happens. Below is a screenshot, but it is more fun when you run the sample yourself.

[Screenshot of the live-updating chart]

That is about it; now I am going to integrate this with my elasticsearch-ui plugin.

The post Using C3js with AngularJS appeared first on Gridshore.

Categories: Architecture, Programming

Transform the input before indexing in elasticsearch

Gridshore - Sat, 07/26/2014 - 07:51

Sometimes you are indexing data and have little, or maybe even no, influence on the input. Still you need to make changes: you want other content, or other fields, or maybe even to remove fields. In elasticsearch 1.3 a new feature is introduced called Transform. In this blogpost I am going to show some of the aspects of this new feature.

Insert the document with the problem

The input we get is coming from a system that puts the string null in a field if it is empty. We do not want null as a string in the elasticsearch index; therefore we want to remove this field completely when indexing such a document. We start with the example, and with proof that you can search on the field.

PUT /transform/simple/1
{
  "title":"This is a document with text",
  "description":"null"
}

Now search for the word null in the description.
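A match query along these lines finds it (a minimal sketch; the exact query is implied by the response below):

GET /transform/_search
{
  "query": {
    "match": {
      "description": "null"
    }
  }
}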

For completeness I’ll show you the response as well.

Response:
{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 0.30685282,
      "hits": [
         {
            "_index": "transform",
            "_type": "simple",
            "_id": "1",
            "_score": 0.30685282,
            "_source": {
               "title": "This is a document with text",
               "description": "null"
            }
         }
      ]
   }
}
Change mapping to contain transform

Next we are going to use the transform functionality to remove the field if it contains the string null. To do that we need to remove the index and create a new mapping containing the transform functionality. We use the groovy language for the script. Beware that the script is only validated when the first document is inserted.
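Removing the existing index first is a one-liner:

DELETE /transform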

PUT /transform
{
  "mappings": {
    "simple": {
      "transform": {
        "lang":"groovy",
        "script":"if (ctx._source['description']?.equals('null')) ctx._source['description'] = null"
      },
      "properties": {
        "title": {
          "type": "string"
        },
        "description": {
          "type": "string"
        }
      }
    }
  }
}
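With the new mapping in place, we reinsert the same document and rerun the same search as before:

PUT /transform/simple/1
{
  "title":"This is a document with text",
  "description":"null"
}

GET /transform/_search
{
  "query": {
    "match": {
      "description": "null"
    }
  }
}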

Indeed, this time the query returns no hits: the description field is no longer indexed. An important aspect is that the actual _source is not changed. When requesting the _source of the document you still get back the original document.

GET transform/simple/1/_source
Response:
{
   "title": "This is a document with text",
   "description": "null"
}
Add a field to the mapping

To add a bit more complexity, we add a field called nullField, which will contain the name of the field that was null. Not very useful, but it serves to show the possibilities.

PUT /transform
{
  "mappings": {
    "simple": {
      "transform": {
        "lang":"groovy",
        "script":"if (ctx._source['description']?.equals('null')) {ctx._source['description'] = null;ctx._source['nullField'] = 'description';}"
      },
      "properties": {
        "title": {
          "type": "string"
        },
        "description": {
          "type": "string"
        },
        "nullField": {
          "type": "string"
        }
      }
    }
  }
}

Notice that the script has changed: not only do we remove the description field, we also add a new field called nullField. Check that the _source is still unchanged. Now we do a search and only return the fields description and nullField. Before scrolling to the response, think about the response you would expect.

GET /transform/_search
{
  "query": {
    "match_all": {}
  },
  "fields": ["nullField","description"]
}

Did you really think about it? Try it out and notice that the nullField is not returned. That is because we did not store it in the index, and it cannot be obtained from the _source either. So if we really need this value, we can store the nullField in the index and we are fine.

PUT /transform
{
  "mappings": {
    "simple": {
      "transform": {
        "lang":"groovy",
        "script":"if (ctx._source['description']?.equals('null')) {ctx._source['description'] = null;ctx._source['nullField'] = 'description';}"
      },
      "properties": {
        "title": {
          "type": "string"
        },
        "description": {
          "type": "string"
        },
        "nullField": {
          "type": "string",
          "store": "yes"
        }
      }
    }
  }
}

Then, with the match_all query requesting the two fields, we get the following response.

GET /transform/_search
{
  "query": {
    "match_all": {}
  },
  "fields": ["nullField","description"]
}
Response:
{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
         {
            "_index": "transform",
            "_type": "simple",
            "_id": "1",
            "_score": 1,
            "fields": {
               "description": [
                  "null"
               ],
               "nullField": [
                  "description"
               ]
            }
         }
      ]
   }
}

Yes, now we do have the new field. That is it, but there is one more thing you need to know: there is a way to check what is actually passed to the index for a certain document.

GET transform/simple/1?pretty&_source_transform
Result:
{
   "_index": "transform",
   "_type": "simple",
   "_id": "1",
   "_version": 1,
   "found": true,
   "_source": {
      "description": null,
      "nullField": "description",
      "title": "This is a document with text"
   }
}

Notice the null description and the nullField in the _source.

Final remark

You cannot update the transform part of a mapping; think about what would happen to your index if some documents had passed through version 1 of the transform and others through version 2.

I would be gentle with this feature: try to solve the problem before sending data to elasticsearch. But maybe you have exactly the use case for this feature; now you know it exists.

In my next blogpost I dive a little bit deeper into the scripting module.

The post Transform the input before indexing in elasticsearch appeared first on Gridshore.

Categories: Architecture, Programming

Playing with two most interesting new features of elasticsearch 1.3.0

Gridshore - Fri, 07/25/2014 - 11:47

Just a few days ago elasticsearch released version 1.3.0 of their flagship product. Two features stand out. The first is the long-awaited Top hits aggregation. Basically this is what is called grouping: you want to group certain items based on one characteristic, but within each group you want the best matching result(s) based on score. The other very important feature is the new support for scripts: better security options through sandboxed scripting languages.

In this blogpost I am going to explain and show the top_hits feature as well as the new scripting support.


Top hits

I am going to show a very simple example of top hits using my music index. This index contains all the songs in my itunes library. The first step is to find songs by genre; the following query gives (the default) 10 hits based on the match_all query, plus the terms aggregation as requested.

GET /mymusic/_search
{
  "aggs": {
    "byGenre": {
      "terms": {
        "field": "genre",
        "size": 10
      }
    }
  }
}

The response is of the format:

{
	"hits": {},
	"aggregations": {
		"byGenre": {
			"buckets": [
				{"key":"rock","doc_count":1910},
				...
			]
		}
    }
}

Now we add the query to the request, songs containing the word love in the title.

GET /mymusic/_search
{
  "query": {
    "match": {
      "name": "love"
    }
  }, 
  "aggs": {
    "byGenre": {
      "terms": {
        "field": "genre",
        "size": 10
      }
    }
  }
}

Now we have fewer hits, still a number of buckets, and the number of songs that match our query within each bucket. The biggest change is the score in the returned hits. In the previous query the score was always 1; now the score differs due to the query we execute. The highest score now is the song Love by The Mission. The genre for this song is Rock and the song is from 1990. Time to introduce the top hits aggregation. With this query we can return the top song containing the word love in the title per genre.

GET /mymusic/_search
{
  "query": {
    "match": {
      "name": "love"
    }
  },
  "aggs": {
    "byGenre": {
      "terms": {
        "field": "genre",
        "size": 5
      },
      "aggs": {
        "topFoundHits": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}

Again we get hits, but they are no different from the previous query. The interesting part is the aggs section. Here we add a sub-aggregation to the byGenre aggregation. This aggregation is called topFoundHits and is of type top_hits. We only return the best hit per genre. The next code block shows the part of the response with the top hits; I removed the content of the _source field in the top_hits to keep the response shorter.

{
   "took": 4,
   "timed_out": false,
   "_shards": {
      "total": 3,
      "successful": 3,
      "failed": 0
   },
   "hits": {
      "total": 141,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "byGenre": {
         "buckets": [
            {
               "key": "rock",
               "doc_count": 52,
               "topFoundHits": {
                  "hits": {
                     "total": 52,
                     "max_score": 4.715253,
                     "hits": [
                        {
                           "_index": "mymusic",
                           "_type": "itunes",
                           "_id": "4147",
                           "_score": 4.715253,
                           "_source": {
                              "name": "Love",
                           }
                        }
                     ]
                  }
               }
            },
            {
               "key": "pop",
               "doc_count": 39,
               "topFoundHits": {
                  "hits": {
                     "total": 39,
                     "max_score": 3.3341873,
                     "hits": [
                        {
                           "_index": "mymusic",
                           "_type": "itunes",
                           "_id": "11381",
                           "_score": 3.3341873,
                           "_source": {
                              "name": "Love To Love You",
                           }
                        }
                     ]
                  }
               }
            },
            {
               "key": "alternative",
               "doc_count": 12,
               "topFoundHits": {
                  "hits": {
                     "total": 12,
                     "max_score": 4.1945505,
                     "hits": [
                        {
                           "_index": "mymusic",
                           "_type": "itunes",
                           "_id": "7889",
                           "_score": 4.1945505,
                           "_source": {
                              "name": "Love Love Love",
                           }
                        }
                     ]
                  }
               }
            },
            {
               "key": "b",
               "doc_count": 9,
               "topFoundHits": {
                  "hits": {
                     "total": 9,
                     "max_score": 3.0271564,
                     "hits": [
                        {
                           "_index": "mymusic",
                           "_type": "itunes",
                           "_id": "2549",
                           "_score": 3.0271564,
                           "_source": {
                              "name": "First Love",
                           }
                        }
                     ]
                  }
               }
            },
            {
               "key": "r",
               "doc_count": 7,
               "topFoundHits": {
                  "hits": {
                     "total": 7,
                     "max_score": 3.0271564,
                     "hits": [
                        {
                           "_index": "mymusic",
                           "_type": "itunes",
                           "_id": "2549",
                           "_score": 3.0271564,
                           "_source": {
                              "name": "First Love",
                           }
                        }
                     ]
                  }
               }
            }
         ]
      }
   }
}

Did you notice a problem with my analyzer for the genre field? Hint: R&B!
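The standard analyzer tokenizes R&B into the two terms r and b, which is why they show up as separate buckets. A sketch of a fix, assuming the index is recreated, is to map genre as not_analyzed so it is aggregated as a single term:

PUT /mymusic
{
  "mappings": {
    "itunes": {
      "properties": {
        "genre": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}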

More information on the top_hits aggregation can be found here:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html

Scripting

Elasticsearch has had support for scripts for a long time. The default scripting language was and is mvel up to version 1.3; it will change to groovy in 1.4. Mvel is not a well-known scripting language. Its biggest advantage is that it is very powerful; the disadvantage is that it is too powerful. Mvel does not come with a sandbox principle, so it is possible to write some very nasty scripts, even when only doing a query. This was very well shown by a colleague of mine (Byron Voorbach), who created a query to read private keys on the machines of developers who did not safeguard their elasticsearch instance. Therefore dynamic scripting was switched off by default in version 1.2.

This came with a very big disadvantage: it was no longer possible to use the function_score query without resorting to stored scripts on the server. In version 1.3 of elasticsearch a much better way is introduced. Now you can use sandboxed scripting languages like groovy and keep the flexible approach. Groovy can be configured with a whitelist of the object creation and method calls that are allowed. More information about this is provided in the elasticsearch documentation about scripting.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html
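The switch that controls this lives in elasticsearch.yml. To the best of my recollection of the 1.3 docs, the setting and its default are as sketched below; do verify against the documentation linked above.

# elasticsearch.yml
# true    = no dynamic scripts at all
# false   = all dynamic scripts allowed (the pre-1.2 behaviour)
# sandbox = only sandboxed languages (e.g. groovy) may run dynamic scripts; the 1.3 default
script.disable_dynamic: sandbox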

Next is an example query against my music index, which contains all the songs from my music library. It queries all the songs from after the year 1999 and calculates the score based on the year, so the newest songs get the highest score. And yes, I know a sort by year desc would have given the same result.

GET mymusic/_search
{
  "fields": ["year"],
  "query": {
    "function_score": {
      "query": {
        "range": {
          "year": {
            "gte": 2000
          }
        }
      },
      "functions": [
        {
          "script_score": {
            "lang": "groovy", 
            "script": "_score * doc['year'].value"
          }
        }
      ]
    }
  }
}

Since we do a range query, every document comes back with a score of 1; using the function_score to multiply that score by the year, the end score equals the year. I added the year as the only field to return; some of the results then are:

{
   "took": 4,
   "timed_out": false,
   "_shards": {
      "total": 3,
      "successful": 3,
      "failed": 0
   },
   "hits": {
      "total": 2895,
      "max_score": 2014,
      "hits": [
         {
            "_index": "mymusic",
            "_type": "itunes",
            "_id": "12965",
            "_score": 2014,
            "fields": {
               "year": [
                  "2014"
               ]
            }
         },
         {
            "_index": "mymusic",
            "_type": "itunes",
            "_id": "12975",
            "_score": 2014,
            "fields": {
               "year": [
                  "2014"
               ]
            }
         }
      ]
   }
}

Next up is the last sample, a combination of top_hits and scripting.

Top hits with scripting

We start with the top_hits sample from above, using my music index. Now we want to sort the buckets on the score of the best matching document in each bucket; the default is the number of documents in the bucket. As mentioned in the documentation, you need a trick to do this.

The top_hits aggregator isn’t a metric aggregator and therefore can’t be used in the order option of the terms aggregator.

GET /mymusic/_search?search_type=count
{
  "query": {
    "match": {
      "name": "love"
    }
  },
  "aggs": {
    "byGenre": {
      "terms": {
        "field": "genre",
        "size": 5,
        "order": {
          "best_hit":"desc"
        }
      },
      "aggs": {
        "topFoundHits": {
          "top_hits": {
            "size": 1
          }
        },
        "best_hit": {
          "max": {
            "lang": "groovy", 
            "script": "doc.score"
          }
        }
      }
    }
  }
}

The results of this query, again with most of the _source taken out, are as follows. Compare it to the query in the top_hits section. Notice the different genres that we get back now, and also check the scores.

{
   "took": 4,
   "timed_out": false,
   "_shards": {
      "total": 3,
      "successful": 3,
      "failed": 0
   },
   "hits": {
      "total": 141,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "byGenre": {
         "buckets": [
            {
               "key": "rock",
               "doc_count": 37,
               "topFoundHits": {
                  "hits": {
                     "total": 37,
                     "max_score": 4.715253,
                     "hits": [
                        {
                           "_index": "mymusic",
                           "_type": "itunes",
                           "_id": "4147",
                           "_score": 4.715253,
                           "_source": {
                              "name": "Love",
                           }
                        }
                     ]
                  }
               },
               "best_hit": {
                  "value": 4.715252876281738
               }
            },
            {
               "key": "alternative",
               "doc_count": 12,
               "topFoundHits": {
                  "hits": {
                     "total": 12,
                     "max_score": 4.1945505,
                     "hits": [
                        {
                           "_index": "mymusic",
                           "_type": "itunes",
                           "_id": "7889",
                           "_score": 4.1945505,
                           "_source": {
                              "name": "Love Love Love",
                           }
                        }
                     ]
                  }
               },
               "best_hit": {
                  "value": 4.194550514221191
               }
            },
            {
               "key": "punk",
               "doc_count": 3,
               "topFoundHits": {
                  "hits": {
                     "total": 3,
                     "max_score": 4.1945505,
                     "hits": [
                        {
                           "_index": "mymusic",
                           "_type": "itunes",
                           "_id": "7889",
                           "_score": 4.1945505,
                           "_source": {
                              "name": "Love Love Love",
                           }
                        }
                     ]
                  }
               },
               "best_hit": {
                  "value": 4.194550514221191
               }
            },
            {
               "key": "pop",
               "doc_count": 24,
               "topFoundHits": {
                  "hits": {
                     "total": 24,
                     "max_score": 3.3341873,
                     "hits": [
                        {
                           "_index": "mymusic",
                           "_type": "itunes",
                           "_id": "11381",
                           "_score": 3.3341873,
                           "_source": {
                              "name": "Love To Love You",
                           }
                        }
                     ]
                  }
               },
               "best_hit": {
                  "value": 3.3341872692108154
               }
            },
            {
               "key": "b",
               "doc_count": 7,
               "topFoundHits": {
                  "hits": {
                     "total": 7,
                     "max_score": 3.0271564,
                     "hits": [
                        {
                           "_index": "mymusic",
                           "_type": "itunes",
                           "_id": "2549",
                           "_score": 3.0271564,
                           "_source": {
                              "name": "First Love",
                           }
                        }
                     ]
                  }
               },
               "best_hit": {
                  "value": 3.027156352996826
               }
            }
         ]
      }
   }
}

This is just a first introduction into the top_hits and scripting. Stay tuned for more blogs around these topics.

The post Playing with two most interesting new features of elasticsearch 1.3.0 appeared first on Gridshore.

Categories: Architecture, Programming

Review The Twitter Story by Nick Bilton

Gridshore - Thu, 07/24/2014 - 23:35


A lot of people dream of creating the new Facebook, Twitter or Instagram. Nobody knows in advance when they will create it; most of these ideas just happened to grow big. Personally I do not believe I will ever create such a product: I am too busy with too many different things, usually based on the great ideas of others. One thing I do like is reading about the success stories of others. I have read the books about Starbucks, Microsoft, Apple and a few others.

Recently I started reading the Twitter story. It reads like an exciting novel; nevertheless it tells a story based on interviews and facts behind one of the most exciting companies on the internet of the last decade.

I do not think it is a coincidence that a lot of what I read in The Lean Startup, and also in the Starbucks story, comes back in the Twitter story. One thing that struck me in this book is what business does to friendship. It also shows that the people with great ideas are usually not the people who turn those ideas into a profitable company.

When starting a company based on your terrific idea, read this book and learn from it. It might make your life a lot better.

The post Review The Twitter Story by Nick Bilton appeared first on Gridshore.

Categories: Architecture, Programming