
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools


Feed aggregator

Humpty Dumpty and #NoEstimates

Herding Cats - Glen Alleman - Wed, 06/03/2015 - 06:06

"When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean - neither more nor less."

"The question is," said Alice, "whether you can make words mean so many different things."

"The question is," said Humpty Dumpty, "which is to be master."

Through the Looking Glass, Chapter 6

The mantra of #NoEstimates is that No Estimates is not about Not Estimating. Along with that oxymoron comes

Forecasting is Not Estimating

  • Forecasting the future based on past performance is not the same as estimating the future from past performance.
  • The Humpty Dumpty logic is Forecasting ≠ Estimating.

This of course redefines the standard definition of both terms. Estimating is a rough calculation or judgment of a value, number, quantity, or extent of some outcome. 

An estimate is an approximation, prediction, or projection of a quantity based on experience and/or information available at the time, with the recognition that other pertinent facts are unclear or unknown.

  • Let’s estimate how many Great Horned Owls are in the county by sampling.
  • Let’s estimate the total cost of this project using reference classes assigned to work element durations and running a Monte Carlo simulation (a minimal sketch of this approach follows below).
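
To make the second bullet concrete, here is a minimal Monte Carlo sketch in R. The work elements, duration ranges and daily rate are all hypothetical, and a real model would draw durations from reference-class distributions (triangular or PERT, for example) rather than uniform ones.

set.seed(42)

# Hypothetical work elements with duration ranges (days) drawn from reference classes
elements = data.frame(low = c(5, 10, 8), high = c(14, 25, 20))
daily_rate = 1000  # hypothetical cost per day

# One trial: draw a duration for each work element and total the cost
one_trial = function() {
  durations = runif(nrow(elements), elements$low, elements$high)
  sum(durations) * daily_rate
}

totals = replicate(10000, one_trial())
quantile(totals, c(0.5, 0.8))  # cost estimates at 50% and 80% confidence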

Forecasting is a prediction of a future event.

  • Let’s produce a weather forecast for the next five days.

Both Estimating and Forecasting result in a probabilistic output in the presence of uncertainty.

Slicing is Not Estimating??

Slicing breaks work into smaller pieces so that a "standard" size can be used to project the work effort and completion time. This is a standard basis of estimate in many domains. Yet in the #NoEstimates paradigm slicing is Not Estimating. In fact slicing is Estimating, another inversion of the term:
No means Yes

Past Performance is #NoEstimates

Using Past Performance to estimate future performance is core to all estimating processes. Using a time series to estimate possible future outcomes is easily done with ARIMA, four lines of R, and some raw data, as shown in The Flaw of Averages. But as described there, care is needed to confirm that the future is like the past.
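
As an illustration of how little code that takes, here is a minimal ARIMA sketch in R (not the code from The Flaw of Averages post; the throughput series and the model order below are made up). A real analysis would select the model order from the data, for example with auto.arima from the forecast package, and confirm that the past is representative before trusting the projection.

# Hypothetical throughput history, e.g. completed items per sprint
throughput = ts(c(12, 15, 11, 14, 16, 13, 17, 15, 18, 16))
fit = arima(throughput, order = c(1, 0, 0))  # simple AR(1) model, chosen by hand for illustration
predict(fit, n.ahead = 5)                    # point forecasts plus standard errors for the next 5 periods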

When We Redefine Words to Suit Our Needs We're Humpty Dumpty

Lewis Carroll's Alice in Wonderland is a political allegory of 19th-century England. When #NoEstimates redefines established mathematical terms like Forecasting and Estimating and ignores the underlying mathematics of time series forecasting (ARIMA, for example), they are willfully ignoring established practices and replacing them with their own untested conjectures.

No Estimates

The key claim here is that there are ways to make decisions with NO ESTIMATES. OK, show how that is not actually an estimating technique, no matter how simple or flawed that estimating technique may be.

Related articles: Mr. Franklin's Advice, There is No Such Thing as Free, The Fallacy of the Planning Fallacy, Do The Math, Monte Carlo Simulation of Project Performance, Essential Reading List for Managing Other People's Money
Categories: Project Management

Traffic Light Indicators for Metrics and KPIs

Stop on Red!

One of the most common indicators used in measurement and status reports is the traffic light indicator.  Traffic light indicators are most commonly shown as a set of red, yellow and green lights.  The metaphor draws from the nearly ubiquitous traffic light seen at almost every intersection.  Traffic light indicators are part of a family of indicators that combine indices and scales. Indices are typically used when a single measure or metric does not tell the whole story. An index reflects a composite of measures. Measures and/or metrics are averaged together or combined using more complex mathematics.  The index is then transposed onto a scale so that it can be interpreted and used.  For example, wind chill is an index that combines temperature and wind speed into a temperature perceived by the skin; once calculated, wind chill is shown on a temperature scale. As a project status indicator, a traffic light indicator typically reflects a synthesis of many attributes.  The traffic light uses a simple scale in which red means trouble, yellow means caution and green means clear sailing. Traffic lights are adopted for three highly related reasons.

  1. Traffic lights are easy to recognize. The traffic light is a common symbol that every driver has been taught to recognize. Attaching a traffic light instantly indicates that a summary of status is being communicated.
  2. Traffic lights provide a consolidated view of complex attributes. The traffic light scale is a simple metaphor with three possible indications of overall performance.  Even in a simple project, attributes such as budget, client satisfaction and risk must be synthesized into a single perception of status that can be communicated. Traffic light indicators force a synthesized view.
  3. Traffic lights are easy to explain. Once an organization reaches a consensus on the business rules that set a traffic light indicator to red, yellow or green, it is easy to explain: red is bad and requires immediate action, yellow means caution and performance issues require mitigation, and green means business as usual.   Paul Byrnes, CMMI Lead Appraiser, when asked why people are drawn to traffic lights, noted that “colors are easy…except for people that can’t see them… .”

 

Karl Jentzsch, a colleague at David Consulting Group summarized the case for traffic light indicators as “the appeal is that it provides an easily manageable number of ‘buckets’ to drop things into where the categorical distinctions are still fairly clear and inherently understood – good/go (green), bad/no (red), and in between (yellow).”

I often hear traffic lights defended with statements like “we have always used traffic lights” or “they are required by the PMO.” These are excuses that reflect an abrogation of thought and responsibility. It is too easy to succumb to the simplicity of the indicator without reflecting on all the hard work and analysis needed to set it. Typically there should be a lot of math and analysis behind setting the traffic light to red, yellow or green.  The math and the analysis are where the real magic happens, and they require thought and understanding. As an indicator, the traffic light is elegant in its simplicity; however, that simplicity can also be its undoing.
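
As a sketch of the kind of math that should sit behind the light, the R fragment below combines a few normalized measures into a weighted index and then applies threshold business rules to pick the color. The measures, weights and thresholds are all hypothetical; each organization has to agree on its own.

# Combine normalized measures into a weighted index (all values hypothetical)
status_index = function(budget_variance, satisfaction, risk_exposure) {
  budget_score = 1 - min(abs(budget_variance), 1)  # 0% variance -> 1, 100%+ variance -> 0
  satisfaction_score = satisfaction / 10           # assumes a 1-10 survey scale
  risk_score = 1 - risk_exposure                   # assumes exposure already scaled 0-1
  weights = c(0.4, 0.3, 0.3)
  sum(weights * c(budget_score, satisfaction_score, risk_score))
}

# Business rules: map the index onto the red/yellow/green scale
to_traffic_light = function(index) {
  cut(index, breaks = c(-Inf, 0.5, 0.75, Inf), labels = c("red", "yellow", "green"))
}

to_traffic_light(status_index(budget_variance = 0.15, satisfaction = 7, risk_exposure = 0.3))
# yields "green" with these illustrative numbers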


Categories: Process Management

Microservices architecture principle #6: One team is responsible for the full life cycle of a Microservice

Xebia Blog - Tue, 06/02/2015 - 20:08

Microservices are a hot topic. Because of that a lot of people are saying a lot of things. To help organizations make the best of this new architectural style Xebia has defined a set of principles that we feel should be applied when implementing a Microservice Architecture. Today's blog is the last in a series about our Microservices principles. This blog explains why a Microservice should be the responsibility of exactly one team (but one team may be responsible for more services).

Being responsible for the full life cycle of a service means that a single team can deploy and manage a service as well as create new versions and retire obsolete ones. This means that users of the service have a single point of contact for all questions regarding the use of the service. This property makes it easier to track changes in a service. Developers can focus on a specific area of the business they are supporting so they will become specialists in that area. This in turn will lead to better quality. The need to also fix errors and problems in production systems is a strong motivator to make sure code works correctly and problems are easy to find.
Having different teams working on different services introduces a challenge that may lead to a more robust software landscape. If TeamA needs a change in TeamB’s service in order to complete its task, some form of planning has to take place. Both teams have to cater for slipping schedules and unforeseen events that cause the delivery date of a feature to change. However, depending on a commitment made by another team is tricky; there are lots of valid reasons why a change may be late (e.g. production issues or illness temporarily reduce a team’s capacity, or high-priority changes take precedence). So TeamA should never depend on TeamB finishing before the deadline. TeamA will learn to protect its weekends and evenings by changing its architecture. Not making assumptions about another team’s schedule, in a Microservice environment, will therefore lead to more robust software.


Using Lines of Code as a Software Size Measure - New Lecture Posted

10x Software Development - Steve McConnell - Tue, 06/02/2015 - 18:08

I've posted this week's lecture in my Understanding Software Projects series at https://cxlearn.com. Most of the lectures that have been posted are still free. Lectures posted so far include:  

0.0 Understanding Software Projects - Intro
     0.1 Introduction - My Background
     0.2 Reading the News

1.0 The Software Lifecycle Model - Intro
     1.1 Variations in Iteration 
     1.2 Lifecycle Model - Defect Removal

2.0 Software Size
     2.05 Size - Comments on Lines of Code (New)
     2.1 Size - Staff Sizes 
     2.2 Size - Schedule Basics 

Check out the lectures at http://cxlearn.com!

Understanding Software Projects - Steve McConnell

 

Why You Don't Want to Aim for 100% Uptime According to Google's Urs Hölzle

Wait, you don't want 100% uptime? Who said such a crazy thing? Risk taker Urs Hölzle, senior VP for technical infrastructure, in Google's Infrastructure Chief Talks SDN: Whenever you try something new, there are going to be problems with it....We were willing to take the risk to get the innovation. Our VP who runs our site reliability gave a great talk about not aiming for 100% uptime....The easiest way to make it be at 100% is to resist change, because change is when bad things happen. Looks great for your SLA, but it's bad for your business because you slow down innovation.... In the first year of running B4, [we asked] "Will we have an outage?" Realistically, yes there's a high chance because it was all new code. Are we going to be perfect? Probably not. You have to have a willingness to take a little risk.
Categories: Architecture

Erik Gross is Teaching People How to Code

Making the Complex Simple - John Sonmez - Tue, 06/02/2015 - 16:00

Developer bootcamps are a very popular topic today. There is a lot of mystery around the idea of a developer boot camp and it can be difficult to figure out if it is something that is worth pursuing. For example, if you are just starting out today, are you better off going to a bootcamp […]

The post Erik Gross is Teaching People How to Code appeared first on Simple Programmer.

Categories: Programming

Holacracy and the Search for Agile Organization

Mike Cohn's Blog - Tue, 06/02/2015 - 15:00

I met Brian Robertson back when he was still the CEO of Ternary Software and experimenting with the ideas that have become Holacracy. You’ve probably heard of Holacracy by now. It’s a way of running organizations. But it’s a very different way of running organizations. It’s been written up in Forbes and Wired magazines. But it’s often poorly described and misunderstood. So, what better way for us to learn about it than from its primary developer, Brian Robertson, who has written this guest post for us. —Mike

Agile software development is truly a stark contrast to the machine-like predict-and-control methods of a waterfall approach. For better or worse, agile methods are also in stark contrast to the organizational leadership, management, and governance structures of modern day business, which — like waterfall approaches — rely on autocratic predict-and-control management and tend to fight change.

This clash often creates significant stress between agile teams and the rest of the organization — stress that can severely slow or even kill the shift to agile methods. For organizations that do manage to make agile stick in their software teams, interesting questions then arise like, “can we run the rest of our organization on similar principles?” and “what would it take to make our entire organization agile?”

Fortunately, there are emerging methods that do for entire organizations what agile has done for software teams. This post examines one approach, called Holacracy, which offers a complete system for achieving agility in all aspects of an organization.

While Holacracy is by no means the first attempt to take agile approaches outside of software teams, as The Economist points out, “Holacracy goes further in shaking up working practices than most [other] approaches.” Holacracy also has more traction than any other defined system: it is already being used by hundreds of companies around the world, including notable mentions like Zappos, the David Allen Company, and Medium.

Adam Pisoni, co-founder of Yammer, adds: “Just like agile development systems break work into sprints, Holacracy forces a company to revisit its rules, roles, objectives and authorities in short cycles. This prevents you from over-planning upfront. It also gives you the chance to re-evaluate your plans, direction and beliefs on a regular, frequent basis.”

So, what is Holacracy? Essentially, it’s a new way of running an organisation that removes power from a management hierarchy and distributes it across clear roles, which can then be executed autonomously, without a micromanaging boss. More specifically, Holacracy differs from the traditional management model in four ways:

  1. No more job descriptions
    In most companies each person has a single job description that is often imprecise, outdated and irrelevant to their day-to-day work. In Holacracy, people have multiple roles, often on different teams, and those role descriptions are constantly updated by the team actually doing the work. This allows people working in a Holacracy-powered company a lot more freedom to express their creative talents. It also means that the company can take advantage of those skills in a way it couldn’t before.

    Ev Williams, cofounder of Blogger, Twitter and Medium, puts it this way: “In the past, as my companies have grown, I’ve hired these amazing people, and I’ve felt like I was getting less and less of them as the company got bigger. Part of that was because they were in a particular area, and if they had ideas or concerns or perspectives that were relevant outside that area, it wasn’t clear what to do with them. Holacracy provides a very specific way where people are actually encouraged to bring this stuff up. You really take advantage of everybody’s perspectives and ideas.”

  2. No more delegated authority
    The agility that Holacracy provides comes directly from truly distributed authority. In traditional organisations, managers loosely delegate authority, but ultimately their decisions always trump those they manage and everybody knows it.

    In Holacracy, authority is truly distributed and decisions are made locally, by the individual closest to the front line. Teams are self-organised: they’re given a purpose, but they decide internally how to best reach it. In this way, Holacracy replaces the traditional hierarchy with a series of interconnected but autonomous teams (“circles” in Holacracy’s vernacular). And once a circle has distributed some responsibility or authority to one of its roles, whoever fills that role has a whole lot of power in that area -- power no one else can trump.

    This is very different than in most companies, where only the management has the kind of authority needed to make important decisions. In practice, this means that everyone in the company is asked to take the reins and become a leader of their roles, and, conversely, a follower of others’ roles.

  3. No more big re-orgs
    In traditional companies, the organisation chart gets revamped every few years. These cyclical ‘re-orgs’ are an attempt to keep up with the changing environment, but since they only occur every three to five years, they are almost always out of date. In Holacracy, the structure of the organisation is updated every month in every circle.

    Yammer co-founder Adam Pisoni says it this way, “Most startups believe in iteration of their products. Now they need to apply the same thinking to their organisations.” Holacracy is precisely this type of solution: a rapid and agile approach to organisational development, rather than the industrial-age approach of predict, plan, and then cross your fingers and hope that you’ve made the right prediction.

    With Holacracy, the definitions of the roles and the circles are not prescribed in advance, nor are they rigidly defined. Instead, Holacracy allows a company to evolve whatever organisational structure is most suited to its current environment. In this way, Holacracy doesn’t just help organisations become evolved – it helps them become evolutionary.

  4. No more office politics
    In most companies, things are done a certain way because “that’s how we’ve always done it”, and those implicit rules are hard to change. Often no one knows why those rules exist, who decided them, or who can change them. This makes distributing authority almost impossible, because there is no way to ensure that everyone is following the same set of rules. The traditional management hierarchy is based mostly on ‘people who get it’ promoting other ‘people who get it.’

In Holacracy, distributing authority is not just a matter of taking power out of the hands of a leader and giving it to someone else or even to a group. Rather, the seat of power shifts from the person at the top to an explicit process, which is defined in detail in a written document (called the Holacracy ‘constitution’). In fact, when an organisation adopts Holacracy, the very first step is having the CEO or current power-holder formally cede his or her power into its rule system, meaning that they are subject to the same rules as everyone else.

This shift from personal leadership to constitutionally derived power is central to Holacracy’s new paradigm. The transparency of the rules means that you no longer have to depend on office politics to get things done.

Conclusion: a better way of working?
Companies designed in the 20th century have very little capacity to evolve and adapt. They are subject to evolution’s process at the market level and may survive or die as a result, but they are rarely adaptive organisms themselves, at least on more than a superficial level.

“The key to doing better,” argues Oxford economist Eric Beinhocker, “is to ‘bring evolution inside’ and get the wheels of differentiation, selection, and amplification spinning within a company’s four walls.” Holacracy offers the possibility of doing just that: embedding an enhanced capacity to dynamically and continually evolve, within an organisation’s core DNA.

It helps create organisations that are fast, agile and succeed by pursuing their purpose, free from the tyranny of top-down planning or the time-consuming pursuit of consensus.

It’s not a silver bullet – it takes hard work and practice to make the shift into such a dramatically different way of organising, but those who see and experience it in action are excited about its results. In the words of David Allen, author of "Getting Things Done" and a business leader with years of Holacracy experience in his own company, “Holacracy is not a panacea – it won’t resolve all of an organisation’s tensions and dilemmas. But, in my experience, it does provide the most stable ground from which to recognise, frame, and address them.”

Pre-order Brian’s book, "Holacracy: The New Management System for a Rapidly Changing World" (released June 2nd) here: http://holacracybook.com/

You can also read the first chapter here:
http://holacracy.org/sites/default/files/images/holacracybook-ch1.pdf

Share your thoughts, reactions, and questions about Holacracy in the comments section of the blog. We'll choose two commentators at random and send them a free copy of Holacracy.

From Software Delivery to Software Creativity

From the Editor of Methods & Tools - Tue, 06/02/2015 - 13:04
This editorial was inspired by a quote from Mary and Tom Poppendieck’s book “Lean Mindset“. They wrote “What’s next is to stop thinking about software development as a delivery process and to start thinking of it as a problem-solving process, a creative process.” In many large companies, software development has often been traditionally considered as […]

Extracting software architecture from code

Coding the Architecture - Simon Brown - Tue, 06/02/2015 - 09:19

I'm running a short "In The Brain" session at Skills Matter in London next Monday evening, focussed around the topic of extracting the software architecture model from code.

It’s often said that the code is the true embodiment of the software architecture, yet my experience suggests that it’s difficult to actually extract this information from the code. Why isn’t the architecture in the code? Join me for this "In The Brain" session where we’ll look at a simple Java web application to see what information we can extract from the code and how to supplement it with information we can't. This session will be interactive, so bring a laptop.

This will be an interactive session so you will need to bring a laptop, or at least something that you can browse a GitHub repository with. We'll be looking at a Java web application, but the concepts are applicable to other programming languages and platforms. See you next week.

Categories: Architecture

R: dplyr – removing empty rows

Mark Needham - Tue, 06/02/2015 - 07:49

I’m still working my way through the exercises in Think Bayes and in Chapter 6 needed to do some cleaning of the data in a CSV file containing information about the Price is Right.

I downloaded the file using wget:

wget http://www.greenteapress.com/thinkbayes/showcases.2011.csv

And then loaded it into R and explored the first few rows using dplyr

library(dplyr)
df2011 = read.csv("~/projects/rLearning/showcases.2011.csv")
 
> df2011 %>% head(10)
 
           X Sep..19 Sep..20 Sep..21 Sep..22 Sep..23 Sep..26 Sep..27 Sep..28 Sep..29 Sep..30 Oct..3
1              5631K   5632K   5633K   5634K   5635K   5641K   5642K   5643K   5644K   5645K  5681K
2                                                                                                  
3 Showcase 1   50969   21901   32815   44432   24273   30554   20963   28941   25851   28800  37703
4 Showcase 2   45429   34061   53186   31428   22320   24337   41373   45437   41125   36319  38752
5                                                                                                  
...

As you can see, we have some empty rows which we want to get rid of to ease future processing. I couldn’t find an easy way to filter those out but what we can do instead is have empty columns converted to ‘NA’ and then filter those.

First we need to tell read.csv to treat empty columns as NA:

df2011 = read.csv("~/projects/rLearning/showcases.2011.csv", na.strings = c("", "NA"))

And now we can filter them out using na.omit:

df2011 = df2011 %>% na.omit()
 
> df2011  %>% head(5)
             X Sep..19 Sep..20 Sep..21 Sep..22 Sep..23 Sep..26 Sep..27 Sep..28 Sep..29 Sep..30 Oct..3
3   Showcase 1   50969   21901   32815   44432   24273   30554   20963   28941   25851   28800  37703
4   Showcase 2   45429   34061   53186   31428   22320   24337   41373   45437   41125   36319  38752
6        Bid 1   42000   14000   32000   27000   18750   27222   25000   35000   22500   21300  21567
7        Bid 2   34000   59900   45000   38000   23000   18525   32000   45000   32000   27500  23800
9 Difference 1    8969    7901     815   17432    5523    3332   -4037   -6059    3351    7500  16136
...

Much better!

Categories: Programming

For a Few Dollars More

Phil Trelford's Array - Mon, 06/01/2015 - 19:02

This is a follow up to my last post A Fistful of Dollars, where I looked at test-driven approaches to implementing a money type based on the example running through Kent Beck’s Test-Driven Development by Example book:

Test-Driven Development By Example

In this run I decided as an exercise to skip formal unit testing altogether and just script the functionality for the multi-currency report that Kent is working towards over 100 pages or so of his book:

Money Example

Unsurprisingly, it ended up being relatively quick and easy to implement.

Money type

First off we need a money type with an amount and a currency and support for multiplication and addition:

type Money = private { Amount:decimal; Currency:Currency } 
   with   
   static member ( * ) (lhs:Money,rhs:decimal) = 
      { lhs with Amount=lhs.Amount * rhs }
   static member ( + ) (lhs:Money,rhs:Money) =
      if lhs.Currency <> rhs.Currency then invalidOp "Currency mismatch"
      { lhs with Amount=lhs.Amount + rhs.Amount}
   override money.ToString() = sprintf "%M%s" money.Amount money.Currency
and  Currency = string

In the code above I’ve used an F# record type with operator overloads for multiplication and addition.

Exchange rates

Next we need to be able to do currency conversion based on a rate table:

type RateTable = { To:Currency; From:Map<Currency,decimal> }

let exchangeRate (rates:RateTable) cy =   
   if rates.To = cy then 1.0M else rates.From.[cy]

let convertCurrency (rates:RateTable) money =
   let rate = exchangeRate rates money.Currency
   { Amount=money.Amount / rate; Currency=rates.To }

Here I’ve used a record type for the table and simple functions to look up a rate and perform the conversion.

Report model

Now we need a representation for the input and output, i.e. the user’s positions and the report respectively:

type Report = { Rows:Row list; Total:Money }
and  Row = { Position:Position; Total:Money }
and  Position = { Instrument:string; Shares:int; Price:Money }

Again this is easily described using F# record types

Report generation

Here we need a function that takes the rates and positions and returns a report instance:

let generateReport rates positions =
   let rows =
      [for position in positions ->        
         let total = position.Price * decimal position.Shares
         { Position=position; Total=total } ]
   let total =
      rows
      |> Seq.map (fun row -> convertCurrency rates row.Total)   
      |> Seq.reduce (+)
   { Rows=rows; Total=total }

For the report generation I’ve used a simple projection to generate the rows followed by a map/reduce block to compute the total in the target currency.

Report view

There are a number of different ways to view the generated report. At first I looked at WinForms and WPF, which provide built-in data grids, but unfortunately I couldn’t find anything “simple” for showing summary rows.

In the end I plumped for a static HTML view with an embedded table:

let toHtml (report:Report) =
   html [
      head [ title %"Multi-currency report" ]      
      body [
         table [
            "border"%="1"
            "style"%="border-collapse:collapse;"
            "cellpadding"%="8"
            thead [
               tr [th %"Instrument"; th %"Shares"; th %"Price"; th %"Total"] 
            ]
            tbody [
               for row in report.Rows ->
                  let p = row.Position
                  tr [td %p.Instrument; td %p.Shares; td %p.Price; td %row.Total]
            ]
            tfoot [
               tr [td ("colspan"%="3"::"align"%="right"::[strong %"Total"])
                   td %report.Total]
            ]
         ]
      ]
   ]

For the HTML generation I wrote a small internal DSL for defining a page.

If you’re after something a little more polished, I found these static HTML DSLs on my travels:

Report data

Finally I can define the inputs and generate the report:

let USD amount = { Amount=amount; Currency="USD" }
let CHF amount = { Amount=amount; Currency="CHF" }

let positions =
   [{Instrument="IBM";      Shares=1000; Price=USD( 25M)}
    {Instrument="Novartis"; Shares= 400; Price=CHF(150M)}]

let inUSD = { To="USD"; From=Map.ofList ["CHF",1.5M] }

let positionsInUSD = generateReport inUSD positions

let report = positionsInUSD |> toHtml |> Html.toString

Which I think is pretty self-explanatory.

Report table

The resultant HTML appears to match the table from the book pretty well:

Instrument   Shares   Price    Total
IBM            1000   25USD    25000USD
Novartis        400   150CHF   60000CHF
Total                          65000USD

 

Summary

I was able to implement the report in small steps, using F# interactive to get quick feedback and test out scenarios, with the final result being as expected the first time of running.

Overall I’m pretty happy with the brevity of the implementation. F# made light work of generating the report, and statically generated HTML produced a nice result with minimal effort, a technique I’ll be tempted to repeat in the future.

The full script is available as an F# Snippet.

Categories: Programming

Developing Products in the Style of Etsy

How should you go about structuring your project? We have two general paradigms that I'll characterize as flowing from the Etsy coaching tree, emphasizing the monolith, and from the Netflix coaching tree, emphasizing microservices. This is of course an oversimplification, but it's for instructional purposes only. For a broad comparison of the two approaches take a look at The Great Microservices Vs Monolithic Apps Twitter Melee.

This is not a good vs. evil sort of mythos. The Force is truly one. We simply have two valid and functional ways of looking at the world.

I think wdewind nails the heart of the difference:

The point of the article is that local optimization gives you this tiny boost in the beginning for a long term cost that eventually moves the organization in a direction of shipping less. It's not that innovative technologies are bad.

The mentioned article is Choose Boring Technology by Dan McKinley, in which Dan does a great job exploring Etsy style development with both insight and wisdom. 

Dan explores four different principles:

Categories: Architecture

The Simple Programmer European Tour Begins

Making the Complex Simple - John Sonmez - Mon, 06/01/2015 - 16:00

I’m sitting here writing this post on a plane heading from Iceland to Paris, as I’m about to embark upon a 3-month journey. A lot of people have asked me if it is a vacation. In a way it is—but it’s also not. I don’t really think in terms of “vacation” anymore. This is my […]

The post The Simple Programmer European Tour Begins appeared first on Simple Programmer.

Categories: Programming

Microservices principles #5: Best technology for the job over one technology for all

Xebia Blog - Mon, 06/01/2015 - 11:39

Microservices are a hot topic. Because of that a lot of people are saying a lot of things. To help organizations make the best of this new architectural style Xebia has defined a set of principles that we feel should be applied when implementing a Microservice Architecture. Over the next couple of days we will cover each of these principles in more detail in a series of blog posts.
In this blog we cover: “Best technology for the job over one technology for all”

A common benefit of service based (and loosely coupled) architectures is the possibility to choose a different technology for each service. Even though this concept isn’t new, it’s rarely applied. Often the reason for this seems to be that even though the services should operate independently they do share (parts of) the same stack. This is further fueled by an urge to consolidate all development under a single technology. Reasoning here usually being that developers become more interchangeable and therefore more valuable if everything runs on the same technology, which should be a good thing.

So, if this isn’t new territory why drag it up again? Why would a Microservices architecture merit changing an existing approach? The short answer is autonomy. The long(er) answer is that a Microservices Architecture does not try to centralize common (technological) functions in singleton-esque services. No, the focus of a Microservices architecture is on service autonomy, centered around business capability, and a Microservice can therefore implement its own stack. This makes a Microservice easier to deploy on its own and removes dependencies on other services as much as possible.

But autonomous deployment isn’t the most important reason to consider technology on a per-service basis. The most important reason is the simple “use the best tool for the job”. Not all technology is created equal. This isn’t limited to the choice of programming language or even the framework. It applies to the whole stack including the data layer.

Instead of spending a significant sum to buy large, bloated, multipurpose middleware, consider lightweight, single purpose containers. Pick containers that run the tech you need. You don't need Java applications with a relational database for everything. Other languages, frameworks and even datastores exist that cater to specific needs. Think of other languages like Scala or Go, frameworks like Akka or Play and database alternatives that focus on specific needs like storing (and retrieving) geographical data.

The choice of stack also relates to the choices you can make for your application landscape. If you have existing components that work for you or if you have components you want to buy off the shelf, it’s a real benefit to not be limited by an existing stack. For example, if you have opted for a Windows-only environment you are limiting your options.

Concerns about maintaining such a diverse landscape should consider that a lot of complexity comes from trying to maintain a single stack for everything. Smaller and simpler stacks should be easier to maintain. And having a single operations team for all those different technologies doesn't sound like a good idea? You're right! If you still have separate development and operations teams it may also be time to revisit that strategy. The devops approach makes running the services a shared responsibility. This doesn’t happen overnight but it is also a reason why Microservices can be such a good fit for organizations that have adopted an Agile way of working and/or apply Continuous Delivery.

Finally giving your developers a broader toolset to play with should keep them engaged. The opportunity to work with more than one technology can be a factor in retaining and attracting talent.

A Fistful of Dollars

Phil Trelford's Array - Mon, 06/01/2015 - 08:08

Just over a week ago I took the Eurostar over to Paris for NCrafts, a conference bringing together over 300 software craftsmen and craftswomen:

@ncraftsConf A good balance between the software craftsmanship, the DDD, The functional. It was so good #ncrafts Cc @rhwy @dotnetstation

— brunoboucard (@brunoboucard) May 23, 2015

The event was held in a crypt and featured a good number of F# sessions:

. @CedricRup it's the second time I attend a #fsharp heavy conference in a crypt. What's wrong with us people? @rhwy

— Steffen Forkmann (@sforkmann) May 22, 2015

Mathias Brandewinder gave an excellent closing talk on The T in TDD : tests, types, tales.

 NCrafts 2015 - May 2015

In this live coding session, Mathias took the multi-currency money example from Kent Beck’s seminal Test-Driven Development by Example book. First implementing a dollars class in C# driven by a unit test for quick feedback and then contrasting it with a similar implementation in F# using the REPL for immediate feedback.

Unit Test

The system needs to be able to multiply a price in dollars by a number of shares, so that 5 USD * 2 = 10 USD:

public class Tests
{
   [Test]
   public void five_dollars_times_two_should_equal_ten_dollars()
   {
      // arrange
      var five = new Dollars(5);
      // act
      var result = five.Times(2);
      // assert
      Assert.AreEqual(new Dollars(10), result);
   }
}

C# Dollars

Based on the test an immutable dollars class can be implemented:

public class Dollars
{
   private readonly decimal _amount;

   public Dollars(decimal value)
   {
      _amount = value;
   }

   public decimal Amount
   {
      get { return _amount; }  
   }

   public Dollars Times(decimal multiplier)
   {
      return new Dollars(this._amount * multiplier);
   }
}

The code now compiles, but the test fails!

C# Equality

The test fails because in C# class types use reference equality, so we must override Equals:

public class Dollars
{
   private readonly decimal _amount;

   public Dollars(decimal value)
   {
      _amount = value;
   }

   public decimal Amount
   {
      get { return _amount; }  
   }

   public Dollars Times(decimal multiplier)
   {
      return new Dollars(this._amount * multiplier);
   }

   public override bool Equals(object obj)
   {
      var that = obj as Dollars;
      return
         that != null
         ? this.Amount == that.Amount
         : false;
   }
}

Note: at this point FXCop will also recommend that we implement GetHashCode as we’ve implemented Equals.

F# Dollars

In F#, the simplest thing that could possibly work is a measure type which gives compile time type safety:

[<Measure>] type usd

5.0M<usd> * 2.0M = 10.0M<usd>

We can also test it immediately in F# Interactive as above, or alternatively write a unit test as below:

let [<Test>] ``5 USD * 2 = 10 USD`` () =
   Assert.AreEqual(10M<usd>, 5M<usd> * 2M)

Note: F# units of measure are erased at compile time meaning there’s no runtime performance penalty.

F# Money

For a report we’d probably want to encode money dynamically with a currency component. Below I’ve chosen an F# record type:

type Money = { Amount:decimal; Currency:string } with
   member this.Times(multiplier) = { this with Amount = this.Amount * multiplier }

let USD amount = { Amount=amount; Currency="USD" }

USD 10M = (USD 5M).Times(2M)

This succeeds immediately as F# implements equality (and GetHashCode) by default for us on record types.

Unquote

As an aside, I find assertions over numerical types are more natural using the Unquote library which lets you assert equality using the equals operator, i.e.

let [<Test>] ``5 USD * 2 = 10 USD`` () =
   test <@ (USD 5M).Times(2M) = USD 10M @>

Summary

When writing code we may seek quick feedback on our first implementations. In C# we’d typically write reflection-based unit tests to get early feedback; in F# we may use F# Interactive first for immediate feedback and later promote useful tests to reflection-based tests that run as part of our continuous build and may help find regressions.

Also in this scenario implementing types in F# required a lot less boilerplate than the equivalent C# code.

Categories: Programming

Data’s hierarchy of needs


This post originally published in the AppsFlyer blog.

A couple of weeks ago Nir Rubinshtein and I presented AppsFlyer’s data architecture in a meetup of Big Data & Data Science Israel. One of the concepts that I presented there, which is worth expanding upon, is “Data’s Hierarchy of Needs:”

  • Data should Exist
  • Data should be Accessible
  • Data should be Usable
  • Data should be Distilled
  • Data should be Presented

How can we make data “achieve its pinnacle of existence” and be acted upon? In other words, what are the areas that should be addressed when designing a data architecture if you want it to be complete and enable creating insights and value from the data you generate and collect?

If done properly, your users might just act upon the data you provide. This list might seem a little simplistic but it is not a prescription of what to do but rather a set of reminders of areas we need to cover and questions we need answered to properly create a data architecture.

Data Should Exist

Well, of course data should exist, and it probably does. You should ask yourself, however, whether the data that exists is the right data. Does the retention policy you have serve the business needs? Does the availability fit your needs? Do you have all the needed links (foreign keys) to other data so you’d be able to connect it later for analysis?

To make this more concrete, consider the following example: AppsFlyer accepts several types of events (launches, in-app events, etc.) which are tied to apps. Apps are connected to accounts (an account would have one or more applications, usually at least an iOS app and an Android one). If we saved the accounts as the latest snapshot and an app changed ownership, the historical data before that change would be skewed. If we treat the accounts as a slowly changing dimension of the events, then we’d be able to handle the transition correctly. Note that we may still choose to provide the new owner the historic data, but now it is not the only option the system supports and the decision can be based on the business needs.

Data Should Be Accessible

If data is written to disk it is accessible programmatically at least, however, there can be many levels of accessibility and we need to think about our end users needs and the level of access they’d require. At AppsFlyer, the data existence (mentioned above) is handled by processing all the messages that go through our queues using Kafka but that data is saved in sequence files and stored by event time. Most of our usage scenarios do have a time component but they are primarily handled by the app or account. Any processing that needs a specific account and would access the raw events would have to sift through tons of records (3.7+ billion a day at the time of this post) to find the few relevant ones. Thus, one basic move toward accessibility of data is to sort by apps so that queries will only need to access a small subset of the data and thus run much faster.

Then we need to consider the “hotness” of the data, i.e. what response times we need and for which types of data. For instance, aggregations such as retention reports need to be accessed online (so-called “sub-second” response), latest counts need near real-time, explorations of data for new patterns can take hours, etc. To enable support of these varied usage scenarios, we need to create multiple projections of our data, most likely using several different technologies.  AppsFlyer stores raw data in sequence files, processed data in Parquet files (accessible via Apache Spark), aggregations and recent data in a columnar RDBMS, and near real-time data in memory.

The three different storage mechanisms I mentioned above (Parquet, columnar RDBMS and In-Memory Data Grid) used in AppsFlyer all have SQL access; this is not by chance. While we (the industry) went through a short period of NoSQL, SQL or almost-SQL is getting back to be the norm, even for semi-structured and poly-structured data. Providing an SQL interface to your data is another important aspect of data accessibility as it allows expanding the user base for the data beyond R&D. Again, this is important not just for your relational data…

Data Should Be Usable

What’s the difference between accessible data and usable data?  For one there’s data cleansing. This is a no-brainer if you pull data from disparate systems but it is also needed if your source is a single system. Data cleansing is what traditional ETL is all about and the techniques still apply.
Another aspect of making data usable is enriching it or connecting it to additional data. Enriching can happen from internal sources like linking CRM data to the account info. This can also be facilitated by external sources as with getting the app category from the app store or getting device screen size from a device database.

Last but not least, is to consider legal and privacy aspects of the data. Before allowing access to the data you may need to mask sensitive information or remove privacy-related data (sometimes you shouldn’t even save it in the first place). At AppsFlyer we take this issue very seriously and make major efforts to comply when working with partners and clients to make sure privacy-related data is handled correctly. In fact, we are also undergoing independent SOC auditing to make sure we are compliant with the highest standards.

To summarize, to make the data usable you have to make sure it is correct, connect it to other data and you need to make sure that it is compliant with legal and privacy issues.

Data Should Be Distilled

Distilling insights is the reason we perform all the previous steps. Data in itself is of little use if it doesn’t help us make better decisions. There are multiple types of insights you can generate here, beginning with the more traditional BI scenarios of slice and dice analytics, going through real-time aggregations and trend analysis, and ending with applying machine learning or “advanced analytics”. You can see one example of the type of insights that can be gleaned from our data by looking at the Gaming Advertising Performance Index we recently published.

Data Should Be Presented

This point ties in nicely with the Gaming Advertising Performance Index example provided above. Getting insights is an important step, but if you fail to present them in a coherent and cohesive manner then the actual value users would be able to make of it is limited at best.  Note that even if you use insights for making decisions (e.g. recommending a product to a user) you’d still need to present how well this decision is doing.

There are many issues that need to be dealt with from UX perspective both in how users interact with the data and how the data is presented. An example of the former is deciding on chart types for the data. A simple example for the latter is when presenting projected or inaccurate data it  should be clear to the users that they are looking at approximations to prevent support calls on numbers not adding up.

Making sure all the areas discussed above are covered and handled properly is a lot of work but providing a solution that actually helps your users make better decisions is well worth it. The data’s hierarchy of needs is not a prescription of how to get there, it is merely a set of waypoints to help navigate toward this end goal. It helps me think holistically about AppsFlyer data needs and I hope following this post it would also help you.

For more information about our architecture, check out the presentation from the meetup:

Categories: Architecture

R: Think Bayes Euro Problem

Mark Needham - Mon, 06/01/2015 - 00:11

I’ve got back to working my way through Think Bayes after a month’s break and started out with the one euro coin problem in Chapter 4:

A statistical statement appeared in “The Guardian” on Friday January 4, 2002:

When spun on edge 250 times, a Belgian one-euro coin came up heads 140 times and tails 110. ‘It looks very suspicious to me,’ said Barry Blight, a statistics lecturer at the London School of Economics. ‘If the coin were unbiased, the chance of getting a result as extreme as that would be less than 7%.’

But do these data give evidence that the coin is biased rather than fair?

We’re going to create a data frame with each row representing the probability that heads shows up that often. We need one row for each value between 0 (no heads) and 100 (all heads) and we’ll start with the assumption that each value can be chosen equally (a uniform prior):

library(dplyr)
 
values = seq(0, 100)
scores = rep(1.0 / length(values), length(values))  
df = data.frame(score = scores, value = values)
 
> df %>% sample_n(10)
         score value
60  0.00990099    59
101 0.00990099   100
10  0.00990099     9
41  0.00990099    40
2   0.00990099     1
83  0.00990099    82
44  0.00990099    43
97  0.00990099    96
100 0.00990099    99
12  0.00990099    11

Now we need to feed in our observations. We need to create a vector containing 140 heads and 110 tails. The ‘rep’ function comes in handy here:

observations = c(rep("T", times = 110), rep("H", times = 140))
> observations
  [1] "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T"
 [29] "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T"
 [57] "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T"
 [85] "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "H" "H"
[113] "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H"
[141] "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H"
[169] "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H"
[197] "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H"
[225] "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H"

Now we need to iterate over each of the observations and update our data frame appropriately.

for(observation in observations) {
  if(observation == "H") {
    df = df %>% mutate(score = score * (value / 100.0))
  } else {
    df = df %>% mutate(score = score * (1.0 - (value / 100.0)))
  }    
}
 
df = df %>% mutate(weighted = score / sum(score))

Now that we’ve done that we can calculate the maximum likelihood, mean, median and credible interval. We’ll create a ‘percentile’ function to help us out:

percentile = function(df, p) {
    df %>% filter(cumsum(weighted) > p) %>% head(1) %>% select(value) %>% as.numeric
}

And now let’s calculate the values:

# Maximum likelihood
> df %>% filter(weighted == max(weighted)) %>% select(value) %>% as.numeric
[1] 56
 
# Mean
> df %>% mutate(mean = value * weighted) %>% select(mean) %>% sum
[1] 55.95238
 
# Median
> percentile(df, 0.5)
[1] 56
 
# Credible Interval
percentage = 90
prob = (1 - percentage / 100.0) / 2
 
# lower
> percentile(df, prob)
[1] 51
 
# upper
> percentile(df, 1 - prob)
[1] 61

This all wraps up nicely into a function:

euro = function(values, priors, observations) {
  df = data.frame(score = priors, value = values)
 
  for(observation in observations) {
    if(observation == "H") {
      df = df %>% mutate(score = score * (value / 100.0))
    } else {
      df = df %>% mutate(score = score * (1.0 - (value / 100.0)))
    }    
  }
 
  return(df %>% mutate(weighted = score / sum(score)))
}

which we can call like so:

values = seq(0,100)
priors = rep(1.0 / length(values), length(values))
observations = c(rep("T", times = 110), rep("H", times = 140))
df = euro(values, priors, observations)

The next part of the problem requires us to change the prior distribution to be more weighted to values close to 50%. We can tweak the parameters we pass into the function accordingly:

values = seq(0,100)
priors = sapply(values, function(x) ifelse(x < 50, x, 100 - x))
priors = priors / sum(priors)
observations = c(rep("T", times = 110), rep("H", times = 140))
df = euro(values, priors, observations)

In fact even with the adjusted priors we still end up with the same posterior distribution:

> df %>% filter(weighted == max(weighted)) %>% select(value) %>% as.numeric
[1] 56
 
> df %>% mutate(mean = value * weighted) %>% select(mean) %>% sum
[1] 55.7435
 
> percentile(df, 0.5)
[1] 56
 
> percentile(df, 0.05)
[1] 51
 
> percentile(df, 0.95)
[1] 61

The book describes this phenomenon as follows:

This is an example of swamping the priors: with enough data, people who start with different priors will tend to converge on the same posterior.

Categories: Programming

Python: CSV writing – TypeError: ‘builtin_function_or_method’ object has no attribute ‘__getitem__’

Mark Needham - Sun, 05/31/2015 - 23:33

When I’m working in Python I often find myself writing to CSV files using the built-in csv library, and every now and then I make a mistake when calling writerow:

import csv
writer = csv.writer(file, delimiter=",")
writer.writerow["player", "team"]

This results in the following error message:

TypeError: 'builtin_function_or_method' object has no attribute '__getitem__'

The error message is a bit weird at first but it’s basically saying that I’ve tried to do an associative lookup on an object which doesn’t support that operation.

The resolution is simply to include the appropriate parentheses instead of leaving them out!

writer.writerow(["player", "team"])

This one’s for future Mark.

Categories: Programming

SPaMCAST 344 – Susan Parente, Agile Risk Management

Software Process and Measurement Cast - Sun, 05/31/2015 - 22:00

Software Process and Measurement Cast 344 features our conversation with Susan Parente.  We talked about Agile risk management. Risk is not always discussed in polite Agile circles; however, Susan suggests that if you do not have a plan to address risk, you are asking for pain for yourself and everyone around you.

Susan’s Bio

Susan Parente is a Principal Consultant at S3 Technologies, LLC and an Associate Professor at Post University. She is an author, mentor and teacher focused on project and risk management. Her experience is augmented by her Masters in Engineering Management with a focus in Marketing of Technology from George Washington University, DC, along with a number of professional certifications. Ms. Parente has 16+ years’ experience leading software and business development projects in the private and public sectors, including a decade of experience implementing IT projects for the DoD.

Contact Data:

Email: parente@s3-tec.com

Phone: 203-307-5246

LinkedIn: https://www.linkedin.com/in/susanparente

Risk Management Resources: www.techriskmanager.com

Company website: www.s3-tec.com

Agile Risk Management LinkedIn Group:

https://www.linkedin.com/groups?mostRecent=&gid=4020498&trk=my_groups-tile-flipgrp

Call to action!

Reviews of the Podcast help to attract new listeners.  Can you write a review of the Software Process and Measurement Cast and post it on the podcatcher of your choice?  Whether you listen on iTunes or any other podcatcher, a review will help to grow the podcast!  Thank you in advance!

Re-Read Saturday News

The Re-Read Saturday focus on Eliyahu M. Goldratt and Jeff Cox’s The Goal: A Process of Ongoing Improvement began on February 21st. The Goal has been hugely influential because it introduced the Theory of Constraints, which is central to lean thinking. The book is written as a business novel. Visit the Software Process and Measurement Blog and catch up on the re-read.

Note: If you don’t have a copy of the book, buy one.  If you use the link below it will support the Software Process and Measurement blog and podcast.

Dead Tree Version or Kindle Version 

Next . . . The Mythical Man-Month. Get a copy now and start reading! We will start in four weeks!

Upcoming Events

2015 ICEAA PROFESSIONAL DEVELOPMENT & TRAINING WORKSHOP
June 9 – 12 
San Diego, California
http://www.iceaaonline.com/2519-2/
I will be speaking on June 10.  My presentation is titled “Agile Estimation Using Functional Metrics.”

Let me know if you are attending!

Other upcoming conferences I will be involved in include SQTM in September. More on these great conferences next week.

Next SPaMCast

The next Software Process and Measurement Cast will feature our essay on Cognitive Bias.  The core of software development, enhancements and maintenance is people. Knowledge of cognitive biases can help us understand and predict team behaviors. We will also have the first installment of Jeremy Berriault’s QA Corner.  QA Corner is all about testing.

 

Shameless Ad for my book!

Mastering Software Project Management: Best Practices, Tools and Techniques co-authored by Murali Chematuri and myself and published by J. Ross Publishing. We have received unsolicited reviews like the following: “This book will prove that software projects should not be a tedious process, neither for you or your team.” Support SPaMCAST by buying the book here.

Available in English and Chinese.

Categories: Process Management