
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools


Feed aggregator

How To Lie With Statistics

Herding Cats - Glen Alleman - Fri, 08/21/2015 - 22:51

How To Lie With Statistics is a critically important book to have on your desk if you're involved in any decision making. My edition is a First Edition, but I don't have the dust jacket, so it's not worth much beyond the current versions.

The reason for this post is to lay the groundwork for assessing reports, presentations, webinars, and other selling documents that contain statistical information.

The classic statistical misuse is the Standish Report, describing the success and failure of IT projects.

Here's my summation of the elements of How To Lie in our project domain:

  • Sample with the Built-In Bias - the population of the sample space is not defined. The samples are self-selected, in that only those who respond form the basis of the statistics. There is no adjustment for all those who did not respond to a survey, for example.
  • The Well-Chosen Average - the arithmetic mean, median, and mode are estimators of the population statistics. Any of these without a variance is of little value for decision making.
  • Little Figures That Are Not There - the classic is "use this approach (in this case #NoEstimates) and your productivity will improve 10X" - that's 1000%, by the way. A 1000% improvement. That's unbelievable, literally unbelievable. The actual improvements are not stated, only the percentage. The baseline performance is not stated either.
  • Much Ado About Practically Nothing - the probability of being in the range of normal. This is the basis of advertising. What's the variance?
  • Gee-Whiz Graphs - graphics with adjustable scales provide the opportunity to manipulate the message. The classic example is the estimating-error graph popular with the No Estimates advocates. It shows the number of projects that complete over their estimated cost and schedule. What's not shown is the credibility of the original estimate.
  • One-Dimensional Picture - using a picture to show numbers, where the picture is not to the scale of the numbers, provides a messaging path for visual readers.
  • Semi-attached Figure - if you can't prove what you want to prove, demonstrate something else and pretend they are the same thing. In one example, the logic is inverted: estimating is conjectured to be the root cause of problems. With no evidence of that, the statement becomes "we don't see how estimating can produce success, so not estimating will increase the probability of success."
  • Post Hoc Rides Again - post hoc causality is common in the absence of a cause-and-effect understanding. The difference between correlation and causality is many times not understood.
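
The Well-Chosen Average point is easy to demonstrate in a few lines of Python. The velocity numbers below are invented purely for illustration: two teams share exactly the same mean, and only the spread reveals which one you can base a decision on.

```python
import statistics

# Hypothetical sprint velocities, invented for illustration only.
team_a = [20, 20, 21, 19, 20]  # consistent delivery
team_b = [5, 40, 2, 50, 3]     # erratic delivery

# The "well chosen average" looks identical for both teams...
assert statistics.mean(team_a) == statistics.mean(team_b) == 20

# ...but the variance tells the story that matters for decision making.
spread_a = statistics.stdev(team_a)  # about 0.7
spread_b = statistics.stdev(team_b)  # about 23.1
assert spread_b > 20 * spread_a
```

Any of the three averages quoted without this second number invites exactly the misuse Huff describes.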

Here's a nice example of How To Lie

There's a chart from an IEEE Computer article showing the numbers of projects that exceeded their estimated cost. But let's start with some research on the problem. Coping with the Cone of Uncertainty.

There is a graph popularly used to show that estimates are routinely exceeded by actual results.



This diagram is actually MISUSED by the #NoEstimates advocates.

The presentation below shows the follow-on information for how estimates can be improved to increase confidence in the process and improvements in the business. It also shows the root causes of poor estimates and their corrective actions. Please ignore any use of Todd's chart without the full presentation.


My mistake was doing just that.

So before anyone accepts any conjecture from a #NoEstimates advocate using the graph above, please read the briefing at the link below to see the corrective actions for poor estimates.


Here's the link to Todd's entire briefing, not just the often-misused graph of estimates not representing the actuals: Uncertainty Surrounding the Cone of Uncertainty.

Related articles: Root Cause of Project Failure; Estimating and Making Decisions in Presence of Uncertainty; Are Estimates Really The Smell of Dysfunction?
Categories: Project Management

Neo4j: Summarising neo4j-shell output

Mark Needham - Fri, 08/21/2015 - 21:59

I frequently find myself trying to optimise a set of cypher queries and I tend to group them together in a script that I feed to the Neo4j shell.

When tweaking the queries it’s easy to make a mistake and end up not creating the same data so I decided to write a script which will show me the aggregates of all the commands executed.

I want to see the number of constraints created, indexes added, nodes, relationships and properties created. The first 2 don’t need to match across the scripts but the latter 3 should be the same.

I put together the following script:

import re
import sys
from tabulate import tabulate

lines = sys.stdin.readlines()

def search(term, line):
    m = re.match(term + ": (.*)", line)
    return int( if m else 0

nodes_created, relationships_created, constraints_added, indexes_added, labels_added, properties_set = 0, 0, 0, 0, 0, 0
time = ""
for line in lines:
    nodes_created = nodes_created + search("Nodes created", line)
    relationships_created = relationships_created + search("Relationships created", line)
    constraints_added = constraints_added + search("Constraints added", line)
    indexes_added = indexes_added + search("Indexes added", line)
    labels_added = labels_added + search("Labels added", line)
    properties_set = properties_set + search("Properties set", line)
    time_match = re.match(r"real.*([0-9]+m[0-9]+\.[0-9]+s)$", line)
    if time_match:
        time =

table = [
    ["Constraints added", constraints_added],
    ["Indexes added", indexes_added],
    ["Nodes created", nodes_created],
    ["Relationships created", relationships_created],
    ["Labels added", labels_added],
    ["Properties set", properties_set],
    ["Time", time],
]
print(tabulate(table))

Its input is the piped output of the neo4j-shell command which will contain a description of all the queries it executed.

$ cat
{ ./neo4j-community-2.2.3/bin/neo4j stop; } 2>&1
rm -rf neo4j-community-2.2.3/data/graph.db/
{ ./neo4j-community-2.2.3/bin/neo4j start; } 2>&1
{ time ./neo4j-community-2.2.3/bin/neo4j-shell --file $1; } 2>&1

We can use the script in two ways.

Either we can pipe the output of our shell straight into it and just get the summary e.g.

$ ./ local.import.optimised.cql | python
---------------------  ---------
Constraints added      5
Indexes added          1
Nodes created          13249
Relationships created  32227
Labels added           21715
Properties set         36480
Time                   0m17.595s
---------------------  ---------

…or we can make use of the Unix ‘tee’ command and pipe the output both to stdout and to a file, and then either tail the file in another window or inspect it afterwards to see the detailed timings. e.g.

$ ./ local.import.optimised.cql | tee /tmp/output.txt |  python
---------------------  ---------
Constraints added      5
Indexes added          1
Nodes created          13249
Relationships created  32227
Labels added           21715
Properties set         36480
Time                   0m11.428s
---------------------  ---------
$ tail -f /tmp/output.txt
| appearances |
| 3771        |
1 row
Nodes created: 3439
Properties set: 3439
Labels added: 3439
289 ms
| appearances -> player, match, team |
| 3771                               |
1 row
Relationships created: 10317
1006 ms

My only dependency is the tabulate package to get the pretty table:

$ cat requirements.txt
tabulate
The cypher script I’m running creates a BBC football graph which is available as a github project. Feel free to grab it and play around – any problems let me know!

Categories: Programming

Project Tango I/O Apps now released in Google Play

Google Code Blog - Fri, 08/21/2015 - 20:15

Posted by Larry Yang, Lead Product Manager, Project Tango

At Google I/O, we showed the world many of the cool things you can do with Project Tango. Now you can experience it yourself by downloading these apps on Google Play onto your Project Tango Tablet Development Kit.

A few examples of creative experiences include:

MeasureIt is a sample application that shows how easy it is to measure general distances. Just point a Project Tango device at two or more points. No tape measures and step ladders required.

Constructor is a sample 3D content creation tool where you can scan a room and save the scan for further use.

Tangosaurs lets you walk around and dig up hidden fossils that unlock a portal into a virtual dinosaur world.

Tango Village and Multiplayer VR are simple apps that demonstrate how Project Tango’s motion tracking enables you to walk around VR worlds without requiring an input device.

Tango Blaster lets you blast swarms of robots in a virtual world, and can even work with the Tango device mounted on a toy gun.

We also showed a few partner apps that are also now available in Google Play. Break A Leg is a fun VR experience where you’re a magician performing tricks on stage.

SideKick’s Castle Defender uses Project Tango’s depth perception capability to place a virtual world onto a physical playing surface.

Defective Studio’s VRMT is a world-building sandbox designed to let anyone create, collaborate on, and share their own virtual worlds and experiences. VRMT gives you libraries of props and intuitive tools, to make the virtual creation process as streamlined as possible.

We hope these applications inspire you to use Project Tango’s motion tracking, area learning and depth perception technologies to create 3D experiences. We encourage you to explore the physical space around the user, including precise navigation without GPS, windows into virtual 3D worlds, measurement of spaces, and games that know where they are in the room and what’s around them.

As we mentioned in our previous post, Project Tango Tablet Development Kits will go on sale in the Google Store in Denmark, Finland, France, Germany, Ireland, Italy, Norway, Sweden, Switzerland and the United Kingdom starting August 26.

We have a lot more to share over the coming months! Sign-up for our monthly newsletter to keep up with the latest news. Connect with the 5,000 other developers in our Google+ community. Get help from other developers by using the Project Tango tag in Stack Overflow. See what others are creating on our YouTube channel. And share your story on Twitter with #ProjectTango.

Join us on our journey.

Categories: Programming

SE-Radio Episode 236: Rebecca Parsons on Evolutionary Architecture

Johannes Thönes talks to Rebecca Parsons, Chief Technology Officer at ThoughtWorks, about evolutionary architecture. The practice of evolutionary software architecture means making decisions as late as possible (last responsible moment) and setting up cross-functional requirements that the architecture has to meet (architectural fitness function). In the beginning, Parsons and Thönes introduce the term evolutionary architecture and […]
Categories: Programming

Stuff The Internet Says On Scalability For August 21st, 2015

Hey, it's HighScalability time:

Hunter-Seeker? Nope. This is the beauty of what a Google driverless car sees. Great TED talk.
  • $2.8 billion: projected Instagram ad revenue in 2017; 1 trillion: Azure event hub events per month; 10 million: Stack Overflow questions asked; 1 billion: max volts generated by a lightning strike; 850: apps downloaded every second from the AppStore; 2000: years data can be stored in DNA; 60: # of robots needed to replace 600 humans; 1 million: queries per second with Nginx, Ubuntu, EC2

  • Quotable Quotes:
    • Tales from the Lunar Module Guidance Computer: we landed on the moon with 152 Kbytes of onboard computer memory.
    • @ijuma: Included in JDK 8 update 60: "changes GHASH internals from using byte[] to long, improving performance about 10x"
    • @ErrataRob: I love the whining over the Bitcoin XT fork. It's as if anarchists/libertarians don't understand what anarchy/libertarianism means.
    • Network World: the LHC Computing Grid has 132,992 physical CPUs, 553,611 logical CPUs, 300PB of online disk storage and 230PB of nearline (magnetic tape) storage. It's a staggering amount of processing capacity and data storage that relies on having no single point of failure.
    • @petereisentraut: Chef is kind of a distributed monkey-patching festival running as root.
    • @SciencePorn: If you were to remove all of the empty space from the atoms that make up every human on earth, all humans would fit into an apple.
    • SDN for the cloud: Most of the concepts presented in the papers have been put into practice in Microsoft cloud infrastructures. As a result of these improvements, modern Azure services can carry up to 1,400,000 SQL databases. Moreover, a typical Azure event hub sees as high as 1 trillion events per month.

  • On the Alphabet Google reorg...what Horace Dediu has to say on functional vs divisional organizations may provide insight. A functional organization, which is used by the Army and Apple, prevents cross divisional fights for resources and power. Those are the kind of internal politics that kill a company. Why not just sidestep all that?

  • Here's how Pinterest shards MySQL to scale: All data needed to be replicated to a slave machine for backup, with high availability and dumping to S3 for MapReduce...You never want to read/write to a slave in production...Slaves lag, which causes strange bugs; I still recommend startups avoid the fancy new stuff — try really hard to just use MySQL. Trust me. I have the scars to prove it...We created a 64 bit ID that contains the shard ID...To create a new Pin, we gather all the data and create a JSON blob...A mapping table links one object to another...there are three primary ways to add more capacity...open up new ranges...move some shards to new machines...This system is best effort. It does not give you Atomicity, Isolation or Consistency in all cases...We stored the shard configuration table in ZooKeeper...This system has been in production at Pinterest for 3.5 years now and will likely be in there forever.
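
The 64-bit ID scheme mentioned above can be sketched in a few lines of Python. The field widths and helper names here are my own illustrative assumptions, not Pinterest's actual layout:

```python
# Pack a shard ID, an object-type ID, and a shard-local ID into one
# 64-bit integer. The widths below are assumptions for illustration.
SHARD_BITS, TYPE_BITS, LOCAL_BITS = 16, 10, 36

def make_id(shard_id, type_id, local_id):
    assert shard_id < (1 << SHARD_BITS)
    assert type_id < (1 << TYPE_BITS)
    assert local_id < (1 << LOCAL_BITS)
    return (shard_id << (TYPE_BITS + LOCAL_BITS)) | (type_id << LOCAL_BITS) | local_id

def parse_id(packed):
    shard_id = packed >> (TYPE_BITS + LOCAL_BITS)
    type_id = (packed >> LOCAL_BITS) & ((1 << TYPE_BITS) - 1)
    local_id = packed & ((1 << LOCAL_BITS) - 1)
    return shard_id, type_id, local_id

pin_id = make_id(shard_id=241, type_id=1, local_id=7075733)
assert parse_id(pin_id) == (241, 1, 7075733)
assert pin_id.bit_length() <= 62  # fits in a signed 64-bit BIGINT column
```

The payoff of such a scheme is that any service holding the ID can route to the right shard with pure bit arithmetic, with no lookup table required.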

  • Nobody expects the quadruple fault! Google loses data as lightning strikes: four successive lightning strikes on the local utilities grid that powers our European datacenter caused a brief loss of power to storage systems...only a very small number of disks remained affected, totalling less than 0.000001% of the space of allocated persistent disks...full recovery is not possible.

  • Flxone upgraded to Go version 1.5 and reduced their 95th-percentile garbage-collection pause from 279 milliseconds down to just 10 ms, a 96% decrease in garbage collection pause time. Average request latency dropped by 53%. I wonder now if they can reduce the number of nodes required to meet their SLA? And would the results hold if they wrote their app more natively, that is to generate garbage?

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

You Are Not Agile . . . If You Do Waterfall

The spiral method is just one example of an Agile hybrid.

Many organizations have titled themselves Agile. Who wouldn’t want to be Agile? If you are not Agile, aren’t you by definition clumsy, slow or dull? Very few organizations would sign up for those descriptions; however, Agile in the world of software development, enhancement and maintenance means more than being able to move quickly and easily. Agile means that a team or organization has embraced a set of principles that shape behaviors and lead to the adoption of a set of techniques. When there is a disconnect between the Agile walk and the Agile talk, management is often the barrier when it comes to principles, and practitioners are when it comes to techniques. Techniques are often deeply entrenched and require substantial change efforts. Many organizations state they are using a hybrid approach to Agile to transition from a more classic approach to some combination of Scrum, Kanban and Extreme Programming. This is considered a safe, conservative approach that allows an organization to change organically. The problem is that this tactic rarely works, and organizations often get stuck. Failure to spend the time and effort on change management often leads to hybrid frameworks that are neither fish nor fowl. Those neither-fish-nor-fowl frameworks are rarely Agile. Attributes of stuck (or potentially stuck) organizations are:

The iterative waterfall. The classic iterative waterfall traces its roots to the Boehm Spiral Model. In the faux Agile version of iterative development, short, time-boxed iterations are used for each of the classic waterfall phases. A requirements sprint is followed by a design sprint, then a development sprint, and you know the rest. Both the classic spiral model and the faux Agile version are generally significantly better than the classic waterfall model for generating feedback and delivering value faster; therefore, organizations stop moving toward Agile and reap the partial rewards.

Upfront requirements. In this hybrid approach to Agile, a team or organization will gather all of the requirements (sometimes called features) at the beginning of the project and then have them locked down before beginning "work." Agile is based on a number of assumptions about requirements. Two key assumptions are that requirements are emergent, and that once known, requirements decay over time. Locking product backlogs flies in the face of both of these assumptions, which puts teams and organizations back into the age of building solutions that, when delivered, don't meet the current business needs. This approach is typically caused when the Agile rollout is done using a staggered approach, beginning with the developers and only later reaching out to the business analysts and the business. The interface between groups who have embraced Agile and those that have not often generates additional friction, which is often blamed on Agile, making further change difficult.

Testing after development is "done." One of the most pernicious Agile hybrids is testing the sprint after development is complete. I have heard this hybrid called "development+1 sprint." In this scenario a team will generate a solution (functional code if this is a software problem), demo it to customers, and declare it to be done, and THEN throw it over the wall to testers. Testers will ALWAYS find defects, which requires them to throw the software back over the wall either to be worked on, disrupting the current development sprint, or to be put on the backlog to be addressed later. Agile principles espouse the delivery of shippable software (or at least potentially shippable) at the end of every sprint. Shippable means TESTED. Two slightly less pernicious variants of this problem are the use of hardening sprints or doing all of the testing at the end of the project. At least in those cases you are not pretending to be Agile.

How people work is the only cut-and-dried indicator of whether an organization is Agile or not. Sometimes how people work is a reflection of a transition; however, without a great deal of evidence that the transition is moving along with alacrity, I assume they are or will soon be stuck. When a team or organization adopts Agile, pick a project and have everyone involved with that project adopt Agile at the same time, across the whole flow of work. If that means you have to coach one whole project or team at a time, so be it. Think of it as an approach that slices the onion, addressing each layer at the same time, rather than peeling it layer by layer.

One final note: Getting stuck in most of these hybrids is probably better than the method(s) that was being used before. This essay should not be read as an indictment of people wrestling with adopting Agile, but rather as a prod to continue to move forward.

Categories: Process Management

Polymer Summit Schedule Released!

Google Code Blog - Thu, 08/20/2015 - 20:12

Posted by Taylor Savage, Product Manager

We’re excited to announce that the full speaker list and talk schedule has been released for the first ever Polymer Summit! Find the latest details on our newly launched site here. Look forward to talks about topics like building full apps with Polymer, Polymer and ES6, adaptive UI with Material Design, and performance patterns in Polymer.

The Polymer Summit will start on Monday, September 14th with an evening of Code Labs, followed by a full day of talks on Tuesday, September 15th. All of this will be happening at the Muziekgebouw aan 't IJ, right on the IJ river in downtown Amsterdam. All tickets to the summit were claimed on the first day, but you can sign up for the waitlist to be notified, should any more tickets become available.

Can’t make it to the summit? Sign up here if you’d like to receive updates on the livestream and tune in live on September 15th on We’ll also be publishing all of the talks as videos on the Google Developers YouTube Channel.

Categories: Programming

What’s in a message? Getting attachments right with the Google beacon platform

Google Code Blog - Thu, 08/20/2015 - 19:17

Posted by Hoi Lam, Developer Advocate

If your users’ devices know where they are in the world – the place that they’re at, or the objects they’re near – then your app can adapt or deliver helpful information when it matters most. Beacons are a great way to explicitly label the real-world locations and contexts, but how does your app get the message that it’s at platform 9, instead of the shopping mall, or that the user is standing in front of a food truck, rather than just hanging out in the parking lot?

With the Google beacon platform, you can associate information with registered beacons by using attachments in Proximity Beacon API, and serve those attachments back to users’ devices as messages via the Nearby Messages API. In this blog post, we will focus on how we can use attachments and messages most effectively, making our apps more context-aware.

Think per message, not per beacon

Suppose you are creating an app for a large train station. You’ll want to provide different information to the user who just arrived and is looking for the ticket machine, as opposed to the user who just wants to know where to stand to be the closest to her reserved seat. In this instance, you’ll want more than one beacon to label important places, such as the platform, entrance hall and waiting area. Some of the attachments for each beacon will be the same (e.g. the station name), others will be different (e.g. platform number). This is where the design of Proximity Beacon API, and the Nearby Messages API in Android and iOS helps you out.

When your app retrieves the beacon attachments via the Nearby Messages API, each attachment will appear as an individual message, not grouped by beacon. In addition, Nearby Messages will automatically de-duplicate any attachments (even if they come from different beacons). So the situation looks like this:

This design has several advantages:

  • It abstracts the API away from implementation (beacon in this case), so if in the future we have other kinds of devices which send out messages, we can adopt them easily.
  • Built in deduplication means that you do not need to build your own to react to the same message, such as the station name in the above example.
  • You can add finer grained context messages later on, without re-deploying.
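
The attachment-as-message behaviour described above can be modelled in plain Python. This is not Nearby Messages API code; the beacon names and payloads are invented for illustration:

```python
# Three beacons in one station: each carries the shared station-name
# attachment plus its own location-specific attachment.
beacon_attachments = {
    "beacon-entrance": [("station", "King's Cross"), ("area", "entrance hall")],
    "beacon-platform9": [("station", "King's Cross"), ("platform", "9")],
    "beacon-waiting": [("station", "King's Cross"), ("area", "waiting area")],
}

# Each attachment is delivered as its own message, and identical
# payloads are de-duplicated even when they come from different beacons.
seen, messages = set(), []
for attachments in beacon_attachments.values():
    for message in attachments:
        if message not in seen:
            seen.add(message)
            messages.append(message)

assert messages.count(("station", "King's Cross")) == 1  # delivered once
assert len(messages) == 4  # station name + two areas + one platform
```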

In designing your beacon user experience, think about the context of your user, the places and objects that are important for your app, and then label those places. The Proximity Beacon API makes beacon management easy, and Nearby Messages API abstract the hardware away, allowing you to focus on creating relevant and timely experiences. The beacons themselves should be transparent to the user.

Using beacon attachments with external resources

In most cases, the data you store in attachments will be self-contained and will not need to refer to an external database. However, there are several exceptions where you might want to keep some data separately:

  • Large data items such as pictures and videos.
  • Where the data resides on a third party database system that you do not control.
  • Confidential or sensitive data that should not be stored in beacon attachments.
  • If you run a proprietary authentication system that relies on your own database.

In these cases, you might need to use a more generic identifier in the beacon attachment to look up the relevant data from your infrastructure.
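
A sketch of that indirection, with a plain dict standing in for the external store and every name invented for illustration:

```python
# The attachment carries only a small opaque key; the heavyweight data
# (here, a video record) lives in your own backend.
video_catalogue = {
    "promo-clip-17": {"url": "https://example.test/videos/17", "bytes": 48_000_000},
}

def resolve_attachment(attachment_value):
    # Returns None for unknown keys, e.g. a beacon someone re-registered.
    return video_catalogue.get(attachment_value)

resource = resolve_attachment("promo-clip-17")
assert resource is not None
assert resource["bytes"] > 1_000_000  # far too large to inline in an attachment
```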

Combining the virtual and the real worlds

With beacons, we have an opportunity to delight users by connecting the virtual world of personalization and contextual awareness with real world places and things that matter most. Through attachments, the Google beacon platform delivers a much richer context for your app that goes beyond the beacon identifier and enables your apps to better serve your users. Let’s build the apps that connect the two worlds!

Categories: Programming

Interactive watch faces with the latest Android Wear update

Android Developers Blog - Thu, 08/20/2015 - 17:31

Posted by Wayne Piekarski, Developer Advocate

The Android Wear team is rolling out a new update that includes support for interactive watch faces. Now, you can detect taps on the watch face to provide information quickly, without having to open an app. This gives you new opportunities to make your watch face more engaging and interesting. For example, in this animation for the Pujie Black watch face, you can see that just touching the calendar indicator quickly changes the watch face to show the agenda for the day, making the watch face more helpful and engaging.

Interactive watch face API

The first step in building an interactive watch face is to update your build.gradle to use version 1.3.0 of the Wearable Support library. Then, you enable interactive watch faces in your watch face style using setAcceptsTapEvents(true):

setWatchFaceStyle(new WatchFaceStyle.Builder(mService)
    .setAcceptsTapEvents(true)
    // other style customizations
    .build());

To receive taps, you can override the following method:

@Override
public void onTapCommand(int tapType, int x, int y, long eventTime) { }

You will receive events TAP_TYPE_TOUCH when the user initially taps on the screen, TAP_TYPE_TAP when the user releases their finger, and TAP_TYPE_TOUCH_CANCEL if the user moves their finger while touching the screen. The events will contain (x,y) coordinates of where the touch event occurred. You should note that other interactions such as swipes and long presses are reserved for use by the Android Wear system user interface.

And that’s it! Adding interaction to your existing watch faces is really easy with just a few extra lines of code. We have updated the WatchFace sample to show a complete implementation, and design and development documentation describing the API in detail.

Wi-Fi added to LG G Watch R

This release also brings Wi-Fi support to the LG G Watch R. Wi-Fi support is already available in many Android Wear watches and allows the watch to communicate with the companion phone without requiring a direct Bluetooth connection. So, you can leave your phone at home, and as long as you have Wi-Fi, you can use your watch to receive notifications, send messages, make notes, or ask Google a question. As a developer, you should ensure that you use the Data API to abstract away your communications, so that your application will work on any kind of Android Wear watch, even those without Wi-Fi.

Updates to existing watches

This update to Android Wear will roll out via an over-the-air (OTA) update to all Android Wear watches over the coming weeks. The wearable support library version 1.3 provides the implementation for touch interactions, and is designed to continue working on devices which have not been updated. However, the touch support will only work on updated devices, so you should wait to update your apps on Google Play until the OTA rollout is complete, which we’ll announce on the Android Wear Developers Google+ community. If you want to release immediately but check if touch interactions are available, you can use this code snippet:

PackageInfo packageInfo = getPackageManager().getPackageInfo("", 0);
if (packageInfo.versionCode > 720000000) {
  // Supports taps - cache this result to avoid calling PackageManager again
} else {
  // Device does not support taps yet
}

Android Wear developers have created thousands of amazing apps for the platform and we can’t wait to see the interactive watch faces you build. If you’re looking for a little inspiration, or just a cool new watch face, check out the Interactive Watch Faces collection on Google Play.

Join the discussion on +Android Developers
Categories: Programming

Will My Blogging Get Me Fired?

Making the Complex Simple - John Sonmez - Thu, 08/20/2015 - 15:00

In this episode, I talk more about blogging and breaching confidentiality. Full transcript: John: Hey, John Sonmez from I got a question about blogging. I think this is kind of an interesting topic about if you should actually blog, if it could get you fired from your job. This question comes from Chuck. Chuck […]

The post Will My Blogging Get Me Fired? appeared first on Simple Programmer.

Categories: Programming

What is a Software Intensive System?

Herding Cats - Glen Alleman - Thu, 08/20/2015 - 14:52

When we hear about software development in the absence of a domain, it's difficult to have a discussion about the appropriate principles, processes, and practices of that work. Here's one paradigm that has served us well.

In the Software Intensive System world, Number 6 and beyond, here's some background. Related articles: Making Conjectures Without Testable Outcomes; Root Cause of Project Failure
Categories: Project Management

One More #NoEstimates Post

Herding Cats - Glen Alleman - Thu, 08/20/2015 - 02:37

Steve McConnell's recent post on estimating prompted me to make one more post on this topic. First some background on my domain and point of view.

I work in what is referred to as Software Intensive Systems (SIS), involving Introduction, Foundations, Development Lifecycle, Requirements, Analysis and Design, Implementation, Verification and Validation, Summary and Outlook, and those SISs are usually embedded in Systems of Systems.

This may not be the domain where the No Estimates advocates work. Their systems may not be software intensive and, more often than not, are not systems of systems. And as one of the more vocal supporters of No Estimates likes to say, the color of your sky is different than mine. And yes it is, it's Blue, and we know why: it's Rayleigh Scattering. The reason we know why is that engineers and scientists occupy the hallways of our office, along with all the IT and business SW developers running the enterprise IT systems that enable the production of all the SISs embedded in the SoS products.

Here's a familiar framework for the spectrum of software systems

I'll add to Steve's comments in italics, while editing out material not germane to my responses but still in support of Steve's. Before we start, here's one important concept: In project management we do not seek perfect prediction. We seek early warning signals to enable predictive corrective actions.

1. Estimation is often done badly and ineffectively and in an overly time-consuming way. 

My company and I have taught upwards of 10,000 software professionals better estimation practices, and believe me, we have seen every imaginable horror story of estimation done poorly. There is no question that ‚Äúestimation is often done badly‚ÄĚ is a true observation of the state of the practice.¬†

The role of estimating is found in many domains. Independent Cost Estimates (ICE) are mandated in many domains I work in. Estimating professional organizations provide guidance, materials, and communities. NASA, DOD, DOE, DHS, DOJ - most every "heavy industry," from dirt moving to writing software for money - has some formalized estimating process.

2. The root cause of poor estimation is usually lack of estimation skills. 

Estimation done poorly is most often due to lack of estimation skills. Smart people using common sense is not sufficient to estimate software projects. Reading two page blog articles on the internet is not going to teach anyone how to estimate very well. Good estimation is not that hard, once you’ve developed the skill, but it isn’t intuitive or obvious, and it requires focused self-education or training. 

One of the most common estimation problems is people engaging with so-called estimates that are not really Estimates, but that are really Business Targets or requests for Commitments. You can read more about that in my estimation book or watch my short video on Estimates, Targets, and Commitments. 

Root Cause Analysis is one of our formal processes. We apply Reality Charting® to all technologies of our work. RCA is part of governance and continuous process improvement. Conjecturing that estimates are somehow the "smell" of something else - without stating that problem and, most importantly, confirming the Root Cause of the problem, providing corrective actions, and confirming the corrective action removes the root cause - is bad management at best and naive management at worst.

3. Many comments in support of #NoEstimates demonstrate a lack of basic software estimation knowledge. 

I don’t expect most #NoEstimates advocates to agree with this thesis, but as someone who does know a lot about estimation I think it’s clear on its face. Here are some examples:

(a) Are estimation and forecasting the same thing? As far as software estimation is concerned, yes they are. (Just do a Google or Bing search of ‚Äúdefinition of forecast‚ÄĚ.) Estimation, forecasting, prediction--it's all the same basic activity, as far as software estimation is concerned.¬†

The notion of redefining terms to suit the needs of the speaker is troubling. Estimating is about the past, present, and future. As a former physicist, I made estimates of the scattering cross section of particle collisions, so we knew where to look for the signature of the collision. In a second career - since I really didn't have the original ideas needed for the profession of particle physics - I estimated the signature parameters in mono-pulse Doppler radar signals to identify targets in missile defense systems. Same for signatures from sonar systems used to separate whales and Biscayne Bay speed boats from Oscar-class Russian submarines.

Forecasting is estimating some outcome in the future. Weather forecasters make predictions of the probability of rain in the coming days. 

(b) Is showing someone several pictures of kitchen remodels that have been completed for $30,000 and implying that the next kitchen remodel can be completed for $30,000 estimation? Yes, it is. That’s an implementation of a technique called Reference Class Forecasting. 

Reference Class Forecasting is fundamental to good estimating, but other techniques are useful as well. Parametric modeling and design-based models in systems engineering (SysML has estimating databases) are examples; Model Based Design is a well developed discipline in our domain and others. Even Subject Matter Experts (although less desirable) can be a start, with wide-band Delphi.
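To make the mechanics concrete, here is a minimal sketch of a reference-class estimate in Python. The historical remodel costs are hypothetical, and real reference-class forecasting would use a larger, validated dataset; the point is that the output is a range with percentiles, not a single number.

```python
import statistics

# Hypothetical reference class: completed kitchen-remodel costs, in $1,000s.
reference_class = [27, 30, 28, 35, 31, 29, 41, 26, 33, 30]

def reference_class_estimate(history, low_pct=0.2, high_pct=0.8):
    """Estimate a new project from its reference class as a range,
    not a point: the quantiles bound the likely outcome."""
    ordered = sorted(history)
    n = len(ordered)
    return {
        "median": statistics.median(ordered),
        "p20": ordered[int(low_pct * (n - 1))],
        "p80": ordered[int(high_pct * (n - 1))],
    }

print(reference_class_estimate(reference_class))
# For this data: median $30k, with a $27k-$33k 20th-80th percentile band
```

The range, not the median alone, is what supports a decision; a single "most likely" number hides the variance that the next sections insist on.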

(c) Is doing a few iterations, calculating team velocity, and then using that empirical velocity data to project a completion date count as estimation? Yes it does. Not only is it estimation, it is a really effective form of estimation. I’ve heard people argue that because velocity is empirically based, it isn’t estimation. Good estimation is empirically based, so that argument exposes a lack of basic understanding of the nature of estimation. 

All good estimates are based on some "reference class." Gathering data to build a reference class may be needed. But care is needed in using the "first few sprints" without first answering some questions. An empirical estimate built from those sprints:

  • Is a forecast of the future. Is the future like the past?
  • Are there changes in the underlying statistical process in the future that are not accounted for in the past?
  • Are the underlying statistical processes for irreducible (aleatory) uncertainty stationary? That is, are the natural variances in the project work the same across the life span of the project, or do they change as time passes?

Empirical estimation requires knowing something about the underlying statistical and probabilistic processes. Without this knowledge, those empirical measurements are "point" measures and not likely to be representative of the future.
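Those stationarity questions can be asked of the data itself before extrapolating. The sketch below uses hypothetical throughput numbers and a crude split-half comparison rather than a formal statistical test; it only illustrates the spirit of the question "is the future like the past?"

```python
import statistics

def looks_stationary(samples, ratio_limit=2.0):
    """Crude drift check: compare mean and variance of the first and
    second half of the history. A real analysis would use a proper
    time-series test; this only captures the idea."""
    half = len(samples) // 2
    a, b = samples[:half], samples[half:]
    mean_ok = abs(statistics.mean(a) - statistics.mean(b)) <= statistics.stdev(samples)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    var_ok = max(var_a, var_b) / min(var_a, var_b) <= ratio_limit
    return mean_ok and var_ok

# Hypothetical sprint throughput (stories completed per sprint).
steady = [8, 9, 7, 8, 9, 8, 7, 9]
drifting = [3, 4, 3, 4, 9, 11, 10, 12]
print(looks_stationary(steady))    # True: the past resembles the future
print(looks_stationary(drifting))  # False: don't extrapolate naively
```

When the check fails, the "first few sprints" are point measures of a changing process, and projecting them forward unadjusted is exactly the error described above.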

(d) Is counting the number of stories completed in each sprint rather than story points, calculating the average number of stories completed each sprint, and using that for sprint planning, estimation? Yes, for the same reasons listed in point (c). 

This is estimating. But the numbers alone are not good "estimators." The variance, and the stability of that variance, is needed. The past is a predictor of the future ONLY if the future is like the past. This is the role of time series analysis, where simple and free tools can be used to produce a credible estimate of the future from the past.
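As a sketch of that idea, a simple bootstrap Monte Carlo over past throughput produces a probabilistic completion estimate rather than a point number. The history values and backlog size are hypothetical, and the forecast is only valid if the stationarity questions have been answered:

```python
import random

def forecast_sprints(throughput_history, backlog_size, trials=10_000, seed=42):
    """Bootstrap forecast: resample past per-sprint throughput to estimate
    how many sprints the remaining backlog needs. Returns the 50th and
    85th percentile sprint counts -- a range, not a point estimate."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(trials):
        done, sprints = 0, 0
        while done < backlog_size:
            done += rng.choice(throughput_history)  # resample the past
            sprints += 1
        outcomes.append(sprints)
    outcomes.sort()
    return outcomes[trials // 2], outcomes[int(trials * 0.85)]

# Hypothetical history: stories finished in each of the last six sprints.
p50, p85 = forecast_sprints([6, 8, 5, 7, 9, 6], backlog_size=60)
print(f"50% confidence within {p50} sprints, 85% within {p85}")
```

The spread between the 50th and 85th percentiles is the variance information that a bare average of "stories per sprint" throws away.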

(e) Most of the #NoEstimates approaches that have been proposed, including (c) and (d) above, are approaches that were defined in my book Software Estimation: Demystifying the Black Art, published in 2006. The fact that people are claiming these long-ago-published techniques as "new" under the umbrella of #NoEstimates is another reason I say many of the #NoEstimates comments demonstrate a lack of basic software estimation knowledge. 

The use of Slicing - one proposed #NoEstimates technique - is estimating. Using the NO in front of Estimates and then referencing "slicing" seems a bit disingenuous. But slicing is subject to the same issue as all reference classes that are not adjusted for future changes. The past may not be like the future. Confirmation and adjustment are part of good estimating.

(f) Is estimation time consuming and a waste of time? One of the most common symptoms of lack of estimation skill is spending too much time on ineffective activities. This work is often well-intentioned, but it’s common to see well-intentioned people doing more work than they need to get worse estimates than they could be getting.

This notion that those spending the money get to say what is and is not waste would be considered hubris in any other context. In an attempt not to be rude (one of the No Estimates advocates' favorite comebacks when presented with a tough question - ala Jar Jar Binks): estimates are primarily not for those spending the money but for those providing the money. How much, when, and what are business questions that need answers in any non-trivial business transaction. If the need to know is not there, it is likely the "value at risk" for the work is low enough that no one cares what it costs, when it will be done, or what we'll get when we're done.

Just to be crystal clear, I use the term non-trivial to mean a project whose cost and schedule, and possibly whose produced content, do - when missed - impact the business in a manner detrimental to its operation. 

(g) Is it possible to get good estimates? Absolutely. We have worked with multiple companies that have gotten to the point where they are delivering 90%+ of their projects on time, on budget, with intended functionality. 

Of course it is, and good estimates happen all the time. Bad estimates happen all the time as well. One of my engagements is with the Performance Assessment and Root Cause Analyses division of the US DOD. Root Cause Analysis of ACAT1 Nunn-McCurdy programs shows the following; similar root causes can be found for commercial projects. 

Gary Bliss Chart

One reason many people find estimation discussions (aka negotiations) challenging is that they don't really believe the estimates they came up with themselves. Once you develop the skill needed to estimate well -- as well as getting clear about whether the business is really talking about an estimate, a target, or a commitment -- estimation discussions become more collaborative and easier. 

The Basis of Estimate problem is universal. I was on a proposal team that lost to an arch rival because our "basis of estimate" included an unrealistic staffing plan. Building a credible estimate is actual work. The size of the project, the "value at risk," the tolerance for risk, and a myriad of other factors all go into deciding how to make the estimate. All good estimates and estimating practices are full collaborations. 

When management abuse of estimates is called out, it has not been explained how NOT estimating corrects that management abuse.

4. Being able to estimate effectively is a skill that any true software professional needs to develop, even if they don’t need it on every project. 

‚ÄúEstimation often doesn't work very well, therefore software professionals should not develop estimation skill‚ÄĚ ‚Äď this is a common line of reasoning in #NoEstimates. This argument doesn't make any more sense than the argument, "Scrum often doesn't work very well, therefore software professionals should not try to use Scrum." The right response in both cases is, "Get better at the practice," not "Throw out the practice altogether."¬†

The notion of "I can't learn to estimate well" is not the same as "it's not possible to learn to estimate well." There are professional estimating organizations, books, journals, and courses. What is really being said is "I don't want to learn to estimate." 

#NoEstimates advocates say they're just exploring the contexts in which a person or team might be able to do a project without estimating. That exploration is fine, but until someone can show that the vast majority of projects do not need estimates at all, deciding to not estimate and not develop estimation skills is premature. And my experience tells me that when all the dust settles, the cases in which no estimates are needed will be the exception rather than the rule. Thus software professionals will benefit -- and their organizations will benefit -- from developing skill at estimation. 

The #NoEstimates advocates appear not to have asked those paying their salary what they need in terms of estimates. Ignore for the moment the Dilbert managers. This is a day one issue. #NoEstimates willfully ignores the needs of the business, and when called on it says "if management needs estimates, we should estimate." Any manager accountable for a non-trivial expenditure who doesn't have some type of Estimate to Complete and Estimate at Completion isn't going to be a manager for very long when the project shows up late, over budget, and doesn't deliver the needed capabilities.

I would go further and say that a true software professional should develop estimation skill so that you can estimate competently on the numerous projects that require estimation. I don't make these claims about software professionalism lightly. I spent four years as chair of the IEEE committee that oversees software professionalism issues for the IEEE, including overseeing the Software Engineering Body of Knowledge, university accreditation standards, professional certification programs, and coordination with state licensing bodies. I spent another four years as vice-chair of that committee. I also wrote a book on the topic, so if you're interested in going into detail on software professionalism, you can check out my book, Professional Software Development. Or you can check out a much briefer, more specific explanation in my company's white paper about our Professional Development Ladder. 

5. Estimates serve numerous legitimate, important business purposes.

Estimates are used by businesses in numerous ways, including: 

  • Allocating budgets to projects (i.e., estimating the effort and budget of each project)
  • Making cost/benefit decisions at the project/product level, which is based on cost (software estimate) and benefit (defined feature set)
  • Deciding which projects get funded and which do not, which is often based on cost/benefit
  • Deciding which projects get funded this year vs. next year, which is often based on estimates of which projects will finish this year
  • Deciding which projects will be funded from CapEx budget and which will be funded from OpEx budget, which is based on estimates of total project effort, i.e., budget
  • Allocating staff to specific projects, i.e., estimates of how many total staff will be needed on each project
  • Allocating staff within a project to different component teams or feature teams, which is based on estimates of scope of each component or feature area
  • Allocating staff to non-project work streams (e.g., budget for a product support group, which is based on estimates for the amount of support work needed)
  • Making commitments to internal business partners (based on projects‚Äô estimated availability dates)
  • Making commitments to the marketplace (based on estimated release dates)
  • Forecasting financials (based on when software capabilities will be completed and revenue or savings can be booked against them)
  • Tracking project progress (comparing actual progress to planned (estimated) progress)
  • Planning when staff will be available to start the next project (by estimating when staff will finish working on the current project)
  • Prioritizing specific features on a cost/benefit basis (where cost is an estimate of development effort)

These are just a subset of the many legitimate reasons that businesses request estimates from their software teams. I would be very interested to hear how #NoEstimates advocates suggest that a business would operate if you remove estimates for each of these purposes.

The #NoEstimates response to these business needs is typically of the form, ‚ÄúEstimates are inaccurate and therefore not useful for these purposes‚ÄĚ rather than, ‚ÄúThe business doesn‚Äôt need estimates for these purposes.‚Ä̬†

That argument really just says that businesses are currently operating on the basis of much worse predictions than they should be, and probably making poorer decisions as a result, because the software staff are not providing very good estimates. If software staff provided more accurate estimates, the business would make better decisions in each of these areas, which would make the business stronger. 

The other #NoEstimates response is that "Estimates are always waste." I don't agree with that. By that line of reasoning, daily stand ups are waste. Sprint planning is waste. Retrospectives are waste. Testing is waste. Everything but code-writing itself is waste. I realize there are Lean purists who hold those views, but I don't buy any of that. 

Estimates, done well, support business decision making, including the decision not to do a project at all. Taking the #NoEstimates philosophy to its logical conclusion, if #NoEstimates eliminates waste, then #NoProjectAtAll eliminates even more waste. In most cases, the business will need an estimate to decide not to do the project at all.  

In my experience businesses usually value predictability, and in many cases, they value predictability more than they value agility. Do businesses always need predictability? No, there are few absolutes in software. Do businesses usually need predictability? In my experience, yes, and they need it often enough that doing it well makes a positive contribution to the business. Responding to change is also usually needed, and doing it well also makes a positive contribution to the business. This whole topic is a case where both predictability and agility work better than either/or. Competency in estimation should be part of the definition of a true software professional, as should skill in Scrum and other agile practices. 

Estimates are the basis of managerial finance and decision making in the presence of uncertainty (the microeconomics of software development). The accuracy and precision of the estimates is usually determined by the value at risk - from low risk, which may mean no estimates, to high risk, which means frequently updated independent validation of the estimates. But in nearly all business decisions - unless the value at risk can be written off - there is a need to know something about the potential loss as well as the potential gain. 
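A hedged sketch of that "value at risk drives estimating rigor" idea; the dollar thresholds are purely illustrative, not from any standard:

```python
def estimating_rigor(value_at_risk, write_off_limit=10_000, high_risk_limit=250_000):
    """Map value at risk to the estimating rigor it warrants.
    Thresholds are illustrative placeholders, not prescribed values."""
    if value_at_risk <= write_off_limit:
        return "no estimate needed - loss can be written off"
    if value_at_risk < high_risk_limit:
        return "coarse estimate with stated variance"
    return "frequently updated, independently validated estimate"

for var in (5_000, 80_000, 2_000_000):
    print(f"${var:>9,}: {estimating_rigor(var)}")
```

The design point is that "no estimates" falls out as the low end of a continuum driven by potential loss, rather than being a universal policy.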

6. Part of being an effective estimator is understanding that different estimation techniques should be used for different kinds of estimates. 

One thread that runs throughout the #NoEstimates discussions is lack of clarity about whether we‚Äôre estimating before the project starts, very early in the project, or after the project is underway. The conversation is also unclear about whether the estimates are project-level estimates, task-level estimates, sprint-level estimates, or some combination. Some of the comments imply ineffective attempts to combine kinds of estimates‚ÄĒthe most common confusion I‚Äôve read is trying to use task-level estimates to estimate a whole project, which is another example of lack of software estimation skill.¬†

You can see a summary of estimation techniques and their areas of applicability here. This quick reference sheet assumes familiarity with concepts and techniques from my estimation book and is not intended to be intuitive on its own. But just looking at the categories you can see that different techniques apply for estimating size, effort, schedule, and features. Different techniques apply for small, medium, and large projects. Different techniques apply at different points in the software lifecycle, and different techniques apply to Agile (iterative) vs. Sequential projects. Effective estimation requires that the right kind of technique be applied to each different kind of estimate. 

Learning these techniques is not hard, but it isn't intuitive. Learning when to use each technique, as well as learning each technique, requires some professional skills development. 

When we separate the kinds of estimates we can see parts of projects where estimates are not needed. One of the advantages of Scrum is that it eliminates the need to do any sort of miniature milestone/micro-stone/task-based estimates to track work inside a sprint. If I'm doing sequential development without Scrum, I need those detailed estimates to plan and track the team's work. If I'm using Scrum, once I've started the sprint I don't need estimation to track the day-to-day work, because I know where I'm going to be in two weeks and there's no real value added by predicting where I'll be day-by-day within that two week sprint. 

That doesn't eliminate the need for estimates in Scrum entirely, however. I still need an estimate during sprint planning to determine how much functionality to commit to for that sprint. Backing up earlier in the project, before the project has even started, businesses need estimates for all the business purposes described above, including deciding whether to do the project at all. They also need to decide how many people to put on the project, how much to budget for the project, and so on. Treating all the requirements as emergent on a project is fine for some projects, but you still need to decide whether you're going to have a one-person team treating requirements as emergent, or a five-person team, or a 50-person team. Defining team size in the first place requires estimation. 

7. Estimation and planning are not the same thing, and you can estimate things that you can’t plan. 

Many of the examples given in support of #NoEstimates are actually indictments of overly detailed waterfall planning, not estimation. The simple way to understand the distinction is to remember that planning is about ‚Äúhow‚ÄĚ and estimation is about ‚Äúhow much.‚Ä̬†

Can I ‚Äúestimate‚ÄĚ a chess game, if by ‚Äúestimate‚ÄĚ I mean how each piece will move throughout the game? No, because that isn‚Äôt estimation; it‚Äôs planning; it‚Äôs ‚Äúhow.‚ÄĚ

Can I estimate a chess game in the sense of ‚Äúhow much‚ÄĚ? Sure. I can collect historical data on the length of chess games and know both the average length and the variation around that average and predict the length of a game.¬†

More to the point, estimating an individual software project is not analogous to estimating one chess game. It’s analogous to estimating a series of chess games. People who are not skilled in estimation often assume it’s more difficult to estimate a series of games than to estimate an individual game, but estimating the series is actually easier. Indeed, the more chess games in the set, the more accurately we can estimate the set, once you understand the math involved. 
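The math behind "the series is easier to estimate than the game" is the square-root law for sums of independent variables. A small sketch, using hypothetical single-game statistics:

```python
import math

# Hypothetical single-game statistics: mean length 40 moves, stdev 12.
MEAN_MOVES, STDEV_MOVES = 40.0, 12.0

def series_estimate(n_games):
    """For n independent games the total's mean grows linearly with n,
    but its standard deviation grows only with sqrt(n), so the relative
    spread (coefficient of variation) shrinks as 1/sqrt(n)."""
    total_mean = n_games * MEAN_MOVES
    total_stdev = math.sqrt(n_games) * STDEV_MOVES
    return total_mean, total_stdev, total_stdev / total_mean

for n in (1, 4, 16, 64):
    mean, sd, cv = series_estimate(n)
    print(f"{n:3d} games: {mean:6.0f} +/- {sd:.0f} moves (relative spread {cv:.1%})")
```

The relative spread drops from 30% for one game to 7.5% for sixteen, which is exactly why the aggregate is easier to estimate than any individual game.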

This all goes back to the idea that we need estimates for different purposes at different points in a project. An agile project may be about "steering" rather than estimating once the project gets underway. But it may not be allowed to get underway in the first place if there aren't early estimates that show there's a business case for doing the project. 

Plans are strategies for the success of the project. What accomplishments must occur, and how those accomplishments are assessed in units of measure meaningful to the decision makers, are the start of planning. Choices made during the planning process, and most certainly during the execution process, are informed by estimates of future outcomes from the decisions made today and the possible decisions made in the future. This is the basis of the microeconomics of decision making.

Strategy making is many times invoked by #NoEstimates advocates when they are actually describing operational effectiveness. Strategic decision making is a critical success factor for non-trivial projects.

Strategic portfolio management from Glen Alleman

8. You can estimate what you don’t know, up to a point. 

In addition to estimating ‚Äúhow much,‚ÄĚ you can also estimate ‚Äúhow uncertain.‚ÄĚ In the #NoEstimates discussions, people throw out lots of examples along the lines of, ‚ÄúMy project was doing unprecedented work in Area X, and therefore it was impossible to estimate the whole project.‚ÄĚ This is essentially a description of the common estimation mistake of allowing high variability in one area to insert high variability into the whole project's estimate rather than just that one area's estimate.¬†

Most projects contain a mix of precedented and unprecedented work (also known as certain/uncertain, high risk/low risk, predictable/unpredictable, high/low variability--all of which are loose synonyms as far as estimation is concerned). Decomposing the work, estimating uncertainty in each area, and building up an overall estimate that includes that uncertainty proportionately is one technique for dealing with uncertainty in estimates. 

Why would that ever be needed? Because a business that perceives a whole project as highly risky might decide not to approve the whole project. A business that perceives a project as low to moderate risk overall, with selected areas of high risk, might decide to approve that same project. 
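A minimal Monte Carlo roll-up illustrates how one high-variability area widens, but need not dominate, the project-level estimate. The work areas and three-point ranges below are hypothetical:

```python
import random

# Hypothetical work areas: (low, likely, high) effort in person-weeks.
# Only one area is truly unprecedented (wide range); the rest are routine.
areas = {
    "ui":            (4, 6, 9),
    "database":      (3, 4, 6),
    "reporting":     (2, 3, 5),
    "new_algorithm": (4, 10, 30),   # the high-risk, high-variability area
}

def rollup(work_areas, trials=20_000, seed=7):
    """Monte Carlo roll-up: sample each area's triangular distribution
    and sum, so uncertainty is carried per area rather than smeared
    across the whole project."""
    rng = random.Random(seed)
    totals = sorted(
        sum(rng.triangular(lo, hi, mode) for lo, mode, hi in work_areas.values())
        for _ in range(trials)
    )
    return totals[trials // 2], totals[int(trials * 0.85)]

p50, p85 = rollup(areas)
print(f"50% within {p50:.0f} person-weeks, 85% within {p85:.0f}")
```

Decomposed this way, the business sees a mostly predictable project with one flagged risk area, instead of a single scary "unprecedented" label on the whole effort.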

You can estimate anything that is knowable. You personally may not know it - so go find someone who does, do research, "explore," experiment, build models, build prototypes. Do whatever is necessary to improve your knowledge (epistemology) of the uncertainties and improve your understanding of the natural variance (aleatory uncertainty). But if it's knowable, then don't say it's unknown. It's just unknown to you. 

The classic error and unbounded hubris about estimates comes from Donald Rumsfeld when he used the Unknown Unknowns in the Iraq war. He never read The Histories, Herodotus, 5th Century B.C., where the author told the reader "don't go to what is now Iraq" - the tribal powers will never comply with your will. Same for what is now Afghanistan, where Alexander the Great was ejected by the local tribesmen.  

9. Both estimation and control are needed to achieve predictability. 

Much of the writing on Agile development emphasizes project control over project estimation. I actually agree that project control is more powerful than project estimation; however, effective estimation usually plays an essential role in achieving effective control. 

Closed loop control, and especially feedforward adaptive control, requires making estimates of future states - before they unfavorably impact the outcome. This means estimating. Software development is a closed loop adaptive control system.

To put this in Agile Manifesto-like terms:

We have come to value project control over project estimation, 
as a means of achieving predictability.

My first disagreement with Steve. Control is based on estimating; both are needed in any closed loop control system. By the way, the conjectured use of slicing is not closed loop control. There is no steering target. The slicing data does not say what performance (how many slices, or whatever units you want) is needed to meet the goals of the project. Slicing is open loop control. The #NoEstimates advocates need to pick up any control systems book to see how this works.
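The closed-loop idea can be sketched in a few lines: use performance to date to estimate the future state, compare it with the plan, and emit a steering signal. This mirrors earned-value style indices (EAC = budget / efficiency); the 0.95 threshold is illustrative:

```python
def closed_loop_forecast(planned_total, completed, periods_elapsed, periods_planned):
    """Closed-loop control sketch: estimate the future state (Estimate At
    Completion) from performance to date, then compare with the plan to
    produce an early warning signal before the variance is unrecoverable."""
    planned_to_date = planned_total * periods_elapsed / periods_planned
    efficiency = completed / planned_to_date          # CPI/SPI-like index
    estimate_at_completion = planned_total / efficiency
    signal = "on track" if efficiency >= 0.95 else "corrective action needed"
    return round(estimate_at_completion, 1), signal

# Hypothetical: 100 units of work planned over 10 periods; after 4 periods
# only 32 units are done (planned to date: 40).
print(closed_loop_forecast(100, 32, 4, 10))
```

Here the steering target (the plan) is what makes the loop closed; velocity or slicing data alone, with no target to compare against, produces no corrective signal.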

As in the Agile Manifesto, we value both terms, which means we still value the term on the right. 

#NoEstimates seems to pay lip service to both terms, but the emphasis from the hashtag onward is really about discarding the term on the right. This is another case where I believe the right answer is both/and, not either/or. 

I wrote an essay when I was Editor in Chief of IEEE Software called "Sitting on the Suitcase" that discussed the interplay between estimation and control and discussed why we estimate even though we know the activity has inherent limitations. This is still one of my favorite essays. 

10. People use the word "estimate" sloppily. 

No doubt. Lack of understanding of estimation is not limited to people tweeting about #NoEstimates. Business partners often use the word ‚Äúestimate‚ÄĚ to refer to what would more properly be called a ‚Äúplanning target‚ÄĚ or ‚Äúcommitment.‚ÄĚ ¬†

The word "estimate" does have a clear definition, for those who want to look it up.  

The gist of these definitions is that an "estimate" is something that is approximate, rough, or tentative, and is based upon impressions or opinion. People don't always use the word that way, and you can see my video on that topic here. 

Better yet, how about definitions from the actual estimating community:

  • Software Cost Estimation with COCOMO II
  • Software Sizing and Estimating
  • Forecasting and Simulating Software Development Projects: Effective Modeling of Kanban and Scrum Projects using Monte-Carlo Simulation
  • Estimating Software-Intensive Systems: Projects, Products, and Processes
  • Making Hard Decisions
  • Forecasting Methods and¬†Applications¬†¬†
  • Probability Methods for Cost Uncertainty Analysis
  • Cost¬†Estimate¬†Classification¬†System, AACEI
  • Cost¬†Estimating¬†Body of Knowledge, ICEAA
  • Parametric¬†Estimating¬†Handbook, ICEAA
  • Basic¬†Software¬†Cost Estimating, CEB 09, ICEAA Online

The last opens with "Any sufficiently advanced technology is indistinguishable from magic." - Arthur C. Clarke. This may be one of the root causes for the #NoEstimates advocates: they've encountered a sufficiently advanced technology and see it as magic, and therefore not within their grasp.

There is no need to redefine anything. The estimating community has done that already.

Because people use the word sloppily, one common mistake software professionals make is trying to create a predictive, approximate estimate when the business is really asking for a commitment, or asking for a plan to meet a target, but using the word ‚Äúestimate‚ÄĚ to ask for that. It's common for businesses to think they have a problem with estimation when the bigger problem is with their commitment process.¬†

We have worked with many companies to achieve organizational clarity about estimates, targets, and commitments. Clarifying these terms makes a huge difference in the dynamics around creating, presenting, and using software estimates effectively. 

11. Good project-level estimation depends on good requirements, and average requirements skills are about as bad as average estimation skills. 

A common refrain in Agile development is ‚ÄúIt‚Äôs impossible to get good requirements,‚ÄĚ and that statement has never been true. I agree that it‚Äôs impossible to get¬†perfect¬†requirements, but that isn‚Äôt the same thing as getting¬†good¬†requirements. I would agree that ‚ÄúIt is impossible to get good requirements if you don‚Äôt have very good requirement skills,‚ÄĚ and in my experience that is a common case. ¬†I would also agree that ‚ÄúProjects usually don‚Äôt have very good requirements,‚ÄĚ as an empirical observation‚ÄĒbut not as a normative statement that we should accept as inevitable.¬†

"If you don't know where you are going, you'll end up someplace else." - Yogi Berra

Figure it out; don't put up with being lazy. Use Capabilities Based Planning to elicit the requirements. What do you want this thing to do when it's done? Don't know? Then why are you spending the customer's money to build something? 

Agile is essentially spending the customer's money to find out what the customer doesn't know. Ask first: is this the best use of the money?

Like estimation skill, requirements skill is something that any true software professional should develop, and the state of the art in requirements at this time is far too advanced for even really smart people to invent everything they need to know on their own. Like estimation skill, a person is not going to learn adequate requirements skills by reading blog entries or watching short YouTube videos. Acquiring skill in requirements requires focused, book-length self-study or explicit training or both. 

If your business truly doesn’t care about predictability (and some truly don’t), then letting your requirements emerge over the course of the project can be a good fit for business needs. But if your business does care about predictability, you should develop the skill to get good requirements, and then you should actually do the work to get them. You can still do the rest of the project using by-the-book Scrum, and then you’ll get the benefits of both good requirements and Scrum.

From my point of view, I often see agile-related claims that look kind of like this: what practices should you use if you have: 

  • Mediocre skill in Estimation
  • Mediocre skill in Requirements
  • Good to excellent skill in Scrum and Related Practices

Not too surprisingly, the answer to this question is: Scrum and Related Practices. I think a more interesting question is: what practices should you use if you have: 

  • Good to excellent skill in Estimation
  • Good to excellent skill in Requirements
  • Good to excellent skill in Scrum and related practices

Having competence in multiple areas opens up some doors that will be closed with a lesser skill set. In particular, it opens up the ability to favor predictability if your business needs that, or to favor flexibility if your business needs that. Agile is supposed to be about options, and I think that includes the option to develop in the way that best supports the business. 

12. The typical estimation context involves moderate volatility and a moderate level of unknowns.

Ron Jeffries writes, "It is conventional to behave as if all decent projects have mostly known requirements, low volatility, understood technology, ..., and are therefore capable of being more or less readily estimated by following your favorite book." I don't know who said that, but it wasn't me, and I agree with Ron that that statement doesn't describe most of the projects that I have seen.

The color of Ron's sky must not be blue - the normal color. Every project we work on has volatile requirements.

Don't undertake a project unless it is manifestly important and nearly impossible. - Edwin Land 

For enterprise IT, there are databases showing the performance of past projects.

I think it would be more true to say, "The typical software project has requirements that are knowable in principle, but that are mostly unknown in practice due to insufficient requirements skills; low volatility in most areas with high volatility in selected areas; and technology that tends to be either mostly leading edge or mostly mature." In other words, software projects are challenging, but the challenge level is manageable. If you have developed the full set of skills a software professional should have, you will be able to overcome most of the challenges or all of them.

Of course there is a small percentage of projects that do have truly unknowable requirements and across-the-board volatility. I consider those to be corner cases. It’s good to explore corner cases, but also good not to lose sight of which cases are most common. 

13. Responding to change over following a plan does not imply not having a plan. 

It's amazing that in 2015 we're still debating this point. Many of the #NoEstimates comments literally emphasize not having a plan, i.e., treating 100% of the project as emergent. They advocate a process, typically Scrum, but no plan beyond instantiating Scrum.

Plans are strategies for the success of the project. Strategies are hypotheses. Hypotheses need tests (experiments) to continually validate them. Ron can lecture us all he wants, but agile is a software development paradigm embedded in a larger strategic business paradigm, and plans come from there. That's how enterprises function. Both are needed.

According to the Agile Manifesto, while agile is supposed to value responding to change, it also is supposed to value following a plan. The Agile Manifesto says, "there is value in the items on the right" which includes the phrase "following a plan." 

While I agree that minimizing planning overhead is good project management, doing no planning at all is inconsistent with the Agile Manifesto, not acceptable to most businesses, and wastes some of Scrum's capabilities. One of the amazingly powerful aspects of Scrum is that it gives you the ability to respond to change; that doesn’t imply that you need to avoid committing to plans in the first place. 

My company and I have seen Agile adoptions shut down in some companies because an Agile team is unwilling to commit to requirements up front or refuses to estimate up front. As a strategy, that’s just dumb. If you fight your business about providing estimates, even if you win the argument that day, you will still get knocked down a peg in the business’s eyes. 

I've commented in other contexts that I have come to the conclusion that most businesses would rather be wrong than vague. Businesses prefer to plant a stake in the ground and move it later rather than avoiding planting a stake in the ground in the first place. The assertion that businesses value flexibility over predictability is Agile's great unvalidated assumption. Some businesses do value flexibility over predictability, but most do not. If in doubt, ask your business. 

If your business does value predictability, use your velocity to estimate how much work you can do over the course of a project, and commit to a product backlog based on your demonstrated capacity for work. Your business will like that. Then, later, when your business changes its mind, which it probably will, you'll still be able to respond to change. Your business will like that even more.
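The arithmetic behind committing to a backlog against demonstrated capacity can be sketched as follows. This is a minimal illustration, not a prescribed method, and all the numbers are hypothetical:

```python
# Sketch: forecast how many sprints a backlog will take at the team's
# demonstrated velocity. Velocities and backlog size are made-up numbers.
import math

def sprints_remaining(backlog_points, recent_velocities):
    """Divide remaining story points by average observed velocity,
    rounding up because a partial sprint still has to be scheduled."""
    avg_velocity = sum(recent_velocities) / len(recent_velocities)
    return math.ceil(backlog_points / avg_velocity)

velocities = [21, 18, 24, 20]   # points completed in the last four sprints
backlog = 160                   # story points remaining in the product backlog
print(sprints_remaining(backlog, velocities))  # → 8
```

The commitment is then "about 8 sprints at our demonstrated pace," which is exactly the kind of number a business that values predictability is asking for.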

14. Scrum provides better support for estimation than waterfall ever did, and there does not have to be a trade off between agility and predictability. 

Not quite true. Waterfall projects have excellent estimating processes. The trouble is that during the execution of the project things change. When the Plan and the Estimate aren't updated to match this change - which is one of the root causes of project failure - the estimate becomes of little use. Applying agile processes to estimating is the same as applying agile processes to coding: frequent assessments of progress to plan, and corrective actions when variances appear.

Some of the #NoEstimates discussion seems to interpret challenges to #NoEstimates as challenges to the entire ecosystem of Agile practices, especially Scrum. Many of the comments imply that estimation will somehow impair agility. The examples cited to support that are mostly examples of unskilled misapplications of estimation practices, so I see them as additional examples of people not understanding estimation very well. 

The idea that we have to trade off agility to achieve predictability is a false trade off. If we define "agility" to mean "no notion of our destination" or "treat all the requirements on the project as emergent," then of course there is a trade off, by definition. If, on the other hand, we define "agility" as "ability to respond to change," then there doesn't have to be any trade off. Indeed, if no one had ever uttered the word "agile" or applied it to Scrum, I would still want to use Scrum because of its support for estimation and predictability, as well as for its support for responding to change.

The combination of story pointing, velocity calculation, product backlog, short iterations, just-in-time sprint planning, and timely retrospectives after each sprint creates a nearly perfect context for effective estimation. To put it in estimation terminology, story pointing is a proxy based estimation technique. Velocity is calibrating the estimate with project data. The product backlog (when constructed with estimation in mind) gives us a very good proxy for size. Sprint planning and retrospectives give us the ability to "inspect and adapt" our estimates. All this means that Scrum provides better support for estimation than waterfall ever did. 
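One hedged way to express "calibrating the estimate with project data" is to forecast a range from the spread of observed velocities rather than a single average. The velocities and backlog below are hypothetical:

```python
# Sketch: turn the spread of observed sprint velocities into an
# optimistic/pessimistic range of sprints for the remaining backlog.
import math

velocities = [14, 22, 17, 19, 25, 16]   # observed story points per sprint
backlog = 120                            # story points remaining

optimistic = math.ceil(backlog / max(velocities))    # at the fastest observed pace
pessimistic = math.ceil(backlog / min(velocities))   # at the slowest observed pace
print(optimistic, pessimistic)  # → 5 9
```

Reporting "between 5 and 9 sprints, narrowing as we learn" is an inspect-and-adapt estimate: each retrospective adds a data point and tightens the range.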

If a company truly is operating in a high uncertainty environment, Scrum can be an effective approach. In the more typical case in which a company is operating in a moderate uncertainty environment, Scrum is well-equipped to deal with the moderate level of uncertainty and provide high predictability (e.g., estimation) at the same time. 

15. There are contexts where estimates provide little value. 

I don’t estimate how long it will take me to eat dinner, because I know I’m going to eat dinner regardless of what the estimate says. If I have a defect that keeps taking down my production system, the business doesn’t need an estimate for that because the issue needs to get fixed whether it takes an hour, a day, or a week. 

The most common context I see where estimates are not done on an ongoing basis and truly provide little business value is online contexts, especially mobile, where the cycle times are measured in days or shorter, the business context is highly volatile, and the mission truly is, "Always do the next most useful thing with the resources available."

In both these examples, however, there is a point on the scale at which estimates become valuable. If the work on the production system stretches into weeks or months, the business is going to want and need an estimate. As the mobile app matures from one person working for a few days to a team of people working for a few weeks, with more customers depending on specific functionality, the business is going to want more estimates. As the group doing the work expands, they'll need budget and headcount, and those numbers are determined by estimates. Enjoy the #NoEstimates context while it lasts; don’t assume that it will last forever. 

Start with Value at Risk: what are you willing to lose if your estimate is wrong? Then decide if the cost of estimating covers that risk.
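The value-at-risk test can be sketched as a simple expected-loss comparison. All the figures here are hypothetical, chosen only to show the shape of the calculation:

```python
# Sketch: compare the expected loss of proceeding without an estimate
# against the cost of estimating plus the (reduced) expected loss with one.
cost_of_estimating = 5_000     # effort spent producing the estimate
value_at_risk = 200_000        # what the business loses if the work goes wrong
p_loss_without = 0.20          # chance of that loss with no estimate
p_loss_with = 0.05             # chance of that loss with an estimate in hand

expected_without = p_loss_without * value_at_risk                    # 40,000
expected_with = cost_of_estimating + p_loss_with * value_at_risk     # 15,000
print(expected_with < expected_without)  # → True: estimating pays off here
```

When the value at risk is small - a dinner, a one-day fix - the inequality flips and estimating genuinely isn't worth the effort, which is the point of the dinner example above.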

16. This is not religion. We need to get more technical and more economic about software discussions. 

I’ve seen #NoEstimates advocates treat these questions of requirements quality, estimation effectiveness, agility, and predictability as value-laden moral discussions. "Agile" is a compliment and "Waterfall" is an invective. The tone of the argument is more moral than economic. The arguments are of the form, "Because this practice is good," rather than of the form, "Because this practice supports business goals X, Y, and Z." 

That religion isn’t unique to Agile advocates, and I’ve seen just as much religion on the non-Agile sides of various discussions. It would be better for the industry at large if people could stay more technical and economic more often. 

Agile is About Creating Options, Right?

I subscribe to the idea that engineering is about doing for a dime what any fool can do for a dollar, i.e., it's about economics. If we assume professional-level skills in agile practices, requirements, and estimation, the decision about how much work to do up front on a project should be an economic decision about which practices will achieve the business goals in the most cost-effective way. We consider issues including the cost of changing requirements and the value of predictability. If the environment is volatile and a high percentage of requirements are likely to spoil before they can be implemented, then it’s a bad economic decision to do lots of up front requirements work. If predictability provides little or no business value, emphasizing up front estimation work would be a bad economic decision.

On the other hand, if predictability does provide business value, then we should support that in a cost-effective way. If we do a lot of the requirements work up front, and some requirements spoil, but most do not, and that supports improved predictability, that would be a good economic choice. 

The economics of these decisions are affected by the skills of the people involved. If my team is great at Scrum but poor at estimation and requirements, the economics of up front vs. emergent will tilt toward Scrum. If my team is great at estimation and requirements but poor at Scrum, the economics will tilt toward estimation and requirements. 

Of course, skill sets are not divinely dictated or cast in stone; they can be improved through focused self-study and training. So we can treat the decision to invest in skills development as an economic issue too. 

Decision to Develop Skills is an Economic Decision Too

What is the cost of training staff to reach competency in estimation and requirements? Does the cost of achieving competency exceed the likely benefits that would derive from competency? That goes back to the question of how much the business values predictability. If the business truly places no value on predictability, there won’t be any ROI from training staff in practices that support predictability. But I do not see that as the typical case. 

My company and I can train software professionals to approach competency in both requirements and estimation in about a week. In my experience most businesses place enough value on predictability that investing a week to make that option available provides a good ROI to the business. Note: this is about making the option available, not necessarily exercising the option on every project. 

My company and I can also train software professionals to approach competency in a full complement of Scrum and other Agile technical practices in about a week. That produces a good ROI too. In any given case, I would recommend both sets of training. If I had to recommend only one or the other, sometimes I would recommend starting with the Agile practices. But my real recommendation is to "embrace the and" and develop both sets of skills.  

For context about training software professionals to "approach competency" in requirements, estimation, Scrum, and other Agile practices, I am using that term based on work we've done with our  Professional Development Ladder. In that ladder we define capability levels of "Introductory," "Competence," "Leadership," and "Mastery." A few days of classroom training will advance most people beyond Introductory and much of the way toward Competence in a particular skill. Additional hands-on experience, mentoring, and feedback will be needed to cement Competence in an area. Classroom study is just one way to acquire these skills. Self-study or working with an expert mentor can work about as well. The skills aren't hard to learn, but they aren't self-evident either. As I've said above, the state of the art in estimation, requirements, and agile practices has moved well beyond what even a smart person can discover on their own. Focused professional development of some kind or other is needed to acquire these skills. 

Is a week enough to accomplish real competency? My company has been training software professionals for almost 20 years, and our consultants have trained upwards of 50,000 software professionals during that time. All of our consultants are highly experienced software professionals first, trainers second. We don't have any methodological ax to grind, so we focus on what is best for each individual client. We all work hands-on with clients so we know what is actually working on the ground and what isn't, and that experience feeds back into our training. We have also invested heavily in training our consultants to be excellent trainers. As a result, our service quality is second to none, and we can make a tremendous amount of progress with a few days of training. Of course additional coaching, mentoring and support are always helpful.

17. Agility plus predictability is better than agility alone. 

Agility in the absence of steering targets, created by estimating in the presence of uncertainty, is of little value. Any closed loop control system requires rapid response to changing conditions and a steering signal, which may require an estimate of where we want to be when we arrive.

Skills development in practices that support estimation and predictability vs. practices that support agility is not an either/or choice. A truly agile business would be able to be flexible when needed, or predictable when needed. A true software professional will be most effective when skilled in both skill sets. 

If you think your business values agility only, ask your business what it values. Businesses vary, and you might work in a business that truly does value agility over predictability or that values agility exclusively. Many businesses value predictability over agility, however, so don't just assume it's one or the other.  

I think it’s self-evident that a business that has both agility and predictability will outperform a business that has agility only. With today's powerful agile practices, especially Scrum, there's no reason we can't have both.  

Overall, #NoEstimates seems like the proverbial solution in search of a problem. I don't see businesses clamoring to get rid of estimates. I see them clamoring to get better estimates. The good news for them is that agile practices, Scrum in particular, can provide excellent support for agility and estimation at the same time. 

My closing thought, in this hash tag-happy discussion, is that #AgileWithEstimationWorksBest -- and #EstimationWithAgileWorksBest too. 

Woody has successfully created what he wanted - a discussion of sorts - about estimating. The trouble is that without a principled discussion it turns into personal anecdotes rather than fact-based dialog. Those of us asking for fact-based examples are then seen as improperly challenging the anecdotes, and since there is not yet any fact-based response, the need to improve the probability of success for software development goes unanswered, replaced by accusations and name calling.

Related articles: Making Conjectures Without Testable Outcomes; Root Cause of Project Failure; IT Risk Management; Deadlines Always Matter; Thinking, Talking, Doing on the Road to Improvement; Information Technology Estimating Quality
Categories: Project Management

Python: Extracting Excel spreadsheet into CSV files

Mark Needham - Thu, 08/20/2015 - 00:27

I’ve been playing around with the Road Safety open data set and the download comes with several CSV files and an excel spreadsheet containing the legend.

There are 45 sheets in total and each of them looks like this:

(screenshot of one of the spreadsheet sheets)

I wanted to create a CSV file for each sheet so that I can import the data set into Neo4j using the LOAD CSV command.

I came across the Python Excel website which pointed me at the xlrd library since I’m working with a pre 2010 Excel file.

The main documentation is very extensive but I found the github example much easier to follow.

I ended up with the following script which iterates through all but the first two sheets in the spreadsheet – the first two sheets contain instructions rather than data:

from xlrd import open_workbook
import csv

wb = open_workbook('Road-Accident-Safety-Data-Guide-1979-2004.xls')

for i in range(2, wb.nsheets):
    sheet = wb.sheet_by_index(i)
    with open("data/%s.csv" % (sheet.name.replace(" ", "")), "w") as file:
        writer = csv.writer(file, delimiter=",")
        print sheet.name, sheet.ncols, sheet.nrows

        # first row is the header, the remaining rows are data
        header = [cell.value for cell in sheet.row(0)]
        writer.writerow(header)

        for row_idx in range(1, sheet.nrows):
            row = [int(cell.value) if isinstance(cell.value, float) else cell.value
                   for cell in sheet.row(row_idx)]
            writer.writerow(row)

I’ve replaced spaces in the sheet name so that the file name on a disk is a bit easier to work with. For some reason the numeric values were all floats whereas I wanted them as ints so I had to explicitly apply that transformation.

Here are a few examples of what the CSV files look like:

$ cat data/1stPointofImpact.csv
0,Did not impact
-1,Data missing or out of range
$ cat data/RoadType.csv
2,One way street
3,Dual carriageway
6,Single carriageway
7,Slip road
12,One way street/Slip road
-1,Data missing or out of range
$ cat data/Weather.csv
1,Fine no high winds
2,Raining no high winds
3,Snowing no high winds
4,Fine + high winds
5,Raining + high winds
6,Snowing + high winds
7,Fog or mist
-1,Data missing or out of range

And that’s it. Not too difficult!

Categories: Programming

Unix: Stripping first n bytes in a file / Byte Order Mark (BOM)

Mark Needham - Thu, 08/20/2015 - 00:27

I’ve previously written a couple of blog posts showing how to strip out the byte order mark (BOM) from CSV files to make loading them into Neo4j easier and today I came across another way to clean up the file using tail.

The BOM is 3 bytes long at the beginning of the file, so if we know that a file contains it then we can strip out those first 3 bytes using tail like this:

$ time tail -c +4 Casualty7904.csv > Casualty7904_stripped.csv
real	0m31.945s
user	0m31.370s
sys	0m0.518s

The -c option is described thus:

-c number
             The location is number bytes.

So in this case we start reading at byte 4 (i.e. skipping the first 3 bytes) and then direct the output into a new file.

Although using tail is quite simple, it took 30 seconds to process a 300MB CSV file which might actually be slower than opening the file with a Hex editor and manually deleting the bytes!
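A small Python alternative (a sketch, not what the original posts used) avoids blindly dropping bytes: it only strips the first three bytes when they actually are a UTF-8 BOM, and copies the rest in large chunks. The tiny sample file here stands in for the real 300MB CSV, and the filenames are examples:

```python
# Sketch: strip a UTF-8 BOM only if present, copying the file in 1MB chunks.
import shutil

BOM = b'\xef\xbb\xbf'

def strip_bom(src, dst):
    """Copy src to dst, dropping a leading UTF-8 byte order mark if present."""
    with open(src, 'rb') as fin, open(dst, 'wb') as fout:
        head = fin.read(len(BOM))
        if head != BOM:          # no BOM: keep those first bytes
            fout.write(head)
        shutil.copyfileobj(fin, fout, length=1024 * 1024)  # chunked copy

# Tiny demonstration file standing in for the real CSV:
with open('bom_sample.csv', 'wb') as f:
    f.write(BOM + b'Accident_Index,Year\n')

strip_bom('bom_sample.csv', 'bom_stripped.csv')
with open('bom_stripped.csv', 'rb') as f:
    print(f.read())  # → b'Accident_Index,Year\n'
```

Unlike `tail -c +4`, this is safe to run on files that have already been cleaned, since a file without a BOM passes through unchanged.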

Categories: Programming

Why Bother with Probability and Statistics?

Herding Cats - Glen Alleman - Wed, 08/19/2015 - 18:06

It is conjectured that uncertainty can be dealt with by ordinary means - open conversation, identification of the uncertainties, and their handling strategies - and that quantitative methods are too elaborate and unnecessary for all but the most technical and complicated problems.

When asked what is meant by uncertainty, the answer many times is "probably" or "very likely" - but not any quantitative measure meaningful to the decision makers. Since the future is always uncertain in our project domain, making decisions in the presence of uncertainty is a critical success factor [1] for all project work.

Decision making is one of the hard things in life. True decision-making occurs not when we already know the outcome, but when we do not know what to do. When we have to balance conflicting values, costs, schedule, needed capabilities, sort through complex situations, and deal with real uncertainty. To make decisions in the presence of this uncertainty we need to know the possible outcomes of our decision, the possible alternatives and their costs - in the short term and in the long term. Making these types of decisions requires we make estimates of all the variables involved in the decision-making process.

What Are Probabilities? 

There is a trend in the software development domain to redefine well established terms in mathematics, engineering, and science - it seems to suit the needs of those proffering that in the presence of uncertainty decisions can't be made.

Probabilities represent our state of knowledge. They are a statement of how likely we think an event might occur, or the possibility of a value being within a range of values.

These probabilities are based in uncertainty, and uncertainty comes in two forms. Aleatory and Epistemic. 

  • Aleatory uncertainty is the natural randomness in a process. For discrete variables, the randomness is parameterized by the probability of each possible value. For continuous variables, the randomness is parameterized by the probability density function (pdf).
  • Epistemic uncertainty is the uncertainty in the model of the process. It is due to limited data and knowledge. The epistemic uncertainty is characterized by alternative models. For discrete random variables, the epistemic uncertainty is modeled by alternative probability distributions. For continuous random variables, the epistemic uncertainty is modeled by alternative probability density functions. In addition, there is epistemic uncertainty in parameters that are not random but have only a single correct (but unknown) value.
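The distinction can be sketched with a small Monte Carlo run over a single task's duration. Everything here is hypothetical: the triangular parameters and both candidate models are made up to illustrate the idea, not drawn from any real project:

```python
# Sketch: aleatory uncertainty shows up as random draws within one model;
# epistemic uncertainty shows up as not knowing which model to draw from.
import random

def simulate(model, n=10_000, seed=1):
    """Aleatory: each trial draws a random duration (days) from the model."""
    low, mode, high = model
    rng = random.Random(seed)
    return [rng.triangular(low, high, mode) for _ in range(n)]

# Epistemic: two candidate parameterizations of the same task; more data
# (e.g. short agile iterations) lets us retire one of them.
model_a = (5, 8, 15)    # optimistic view of the task
model_b = (5, 10, 20)   # pessimistic view

for model in (model_a, model_b):
    draws = sorted(simulate(model))
    mean = sum(draws) / len(draws)
    p80 = draws[int(0.8 * len(draws))]   # 80th percentile, a margin target
    print(round(mean, 1), round(p80, 1))
```

The gap between the mean and the 80th percentile within one model is the aleatory margin; the gap between the two models is the epistemic uncertainty you can pay to reduce.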

Both these uncertainties exist on projects. When making good decisions on projects we know something about these uncertainties and have handling plans for the resulting risk produced by the uncertainties.

  • For Aleatory uncertainty (irreducible risk) we need margin. The margin protects the project deliverables from the unfavorable cost, schedule, and technical performance impacts that are part of the naturally occurring variances.
  • For Epistemic uncertainty (reducible risk) we can buy down the uncertainty - paying money to learn more.

This, by the way, is a primary benefit of Agile Software Development, where forced short-term deliverables provide information to reduce risk. Agile is not a risk management process - many other steps are needed for that. But Agile is a means to reveal risk and take corrective action on much shorter time boundaries, reducing the accumulation of risk.

Some Background on Decision Making in the Presence of Uncertainty 

One way to distinguish good decisions from bad decisions is to assess the outcomes of those decisions. The criteria for calling a decision good or bad need some definition themselves. There are issues of course. The results of the decision may not appear for some time, yet we need to know something about the possible results before we make the decision. As well, we'd like to see the results of the alternatives - the choices that weren't made or were rejected.

A fundamental purpose of quantitative decision making is to distinguish between good and bad decisions. And to provide criteria for assessing the goodness of the decision. To do this we need first to establish what the decision is about.

  • When do you think we'll be ready to go live with the needed capabilities we're paying you to develop?
  • If we switch from our legacy systems to an ERP system, how much will we save over the next 5 years, including the sunk cost of the entire project?
  • On the list of desirable features, which ones can we get by the current need date if we reduce the budget by 15%?

Making decisions like these in the presence of uncertainty by estimating future outcomes is a normal, everyday, business process. Any suggestion these decisions can be made without estimates is utter nonsense.

Decision analysis starts with defining what a decision is - a commitment of resources that is irrevocable only at some cost. If there is no cost associated with making the decision, or with changing your mind after the decision has been made, then - in the business domain - the decision was of little if any value. This is the value at risk discussion: how much are we willing to risk if we don't know to some level of confidence what the outcome of our decision is?

The elements of good decision analysis are shown below [2]. For any good decision and its decision-making process, we'll need answers to the questions on the left, some form of logic to make a decision, the defined actionable steps from that decision, and then an assessment of the outcomes to inform future decisions - learning from our decisions.

(figure: the elements of decision analysis, from [2])

Decision support systems that implement the process above are based in part on the underlying uncertainties of the systems under management. Research into the cost and schedule behaviors of these systems is well developed. Here's one example.

(figure: example research into project cost and schedule behaviors)

In the end, the decision-making process will not meet the needs of the decision makers if we don't have alternatives defined, information at hand - most of the time probabilistic information about future conditions in the presence of uncertainty - and the value we assign to the outcomes. Without these, the decisions are going to turn out BAD.

We're driving in the dark with the lights off while spending other people's money, and our project will end up like this...

(photo: car flipped upside down)

Reference Material for Further Understanding

  1. Strategic Planning with Critical Success Factors and Future Scenarios: An Integrated Strategic Planning Framework, Technical Report CMU/SEI-2010-TR-037, ESC-TR-2010-102.
  2. Decision Analysis for the Professional, 4th Edition, Peter McNamee and John Celona.
  3. Real Options: Managing Strategic Investment in an Uncertain World, Martha Amram and Nalin Kulatilaka, Harvard Business School Press, 1999.
  4. Making Hard Decisions: An Introduction to Decision Analysis, Robert Clemen, Duxbury Press, 1996.
  5. Software Design as an Investment Activity: A Real Options Perspective, Kevin Sullivan and Prasad Chalasani.
  6. Probabilistic Modeling as an Exploratory Decision-Making Tool, Martin Pergler and Andrew Freeman, McKinsey & Company, Number 6, September 2008.
  7. Value at Risk for IS/IT Project and Portfolio Appraisal and Risk Management, Stefan Koch, Department of Information Business, Vienna University of Economics and BA, Austria.
Categories: Project Management

Announcing Windows Server 2016 Containers Preview

ScottGu's Blog - Scott Guthrie - Wed, 08/19/2015 - 17:01

At DockerCon this year, Mark Russinovich, CTO of Microsoft Azure, demonstrated the first ever application built using code running in both a Windows Server Container and a Linux container connected together. This demo helped demonstrate Microsoft's vision that in partnership with Docker, we can help bring the Windows and Linux ecosystems together by enabling developers to build container-based distributed applications using the tools and platforms of their choice.

Today we are excited to release the first preview of Windows Server Containers as part of our Windows Server 2016 Technical Preview 3 release. We're also announcing great updates from our close collaboration with Docker, including enabling support for the Windows platform in the Docker Engine and a preview of the Docker Engine for Windows. Our Visual Studio Tools for Docker, which we previewed earlier this year, have also been updated to support Windows Server Containers, providing you a seamless end-to-end experience straight from Visual Studio to develop and deploy code to both Windows Server and Linux containers. Last but not least, we've made it easy to get started with Windows Server Containers in Azure via a dedicated virtual machine image.

Windows Server Containers

Windows Server Containers create a highly agile Windows Server environment, enabling you to accelerate the DevOps process to efficiently build and deploy modern applications. With today's preview release, millions of Windows developers will be able to experience the benefits of containers for the first time using the languages of their choice - whether .NET, ASP.NET, PowerShell or Python, Ruby on Rails, Java and many others.

Today's announcement delivers on the promise we made in partnership with Docker, the fast-growing open platform for distributed applications, to offer container and DevOps benefits to Linux and Windows Server users alike. Windows Server Containers are now part of the Docker open source project, and Microsoft is a founding member of the Open Container Initiative. Windows Server Containers can be deployed and managed either using the Docker client or PowerShell.

Getting Started using Visual Studio

The preview of our Visual Studio Tools for Docker, which enables developers to build and publish ASP.NET 5 Web Apps or console applications directly to a Docker container, has been updated to include support for today’s preview of Windows Server Containers. The extension automates creating and configuring your container host in Azure, building a container image which includes your application, and publishing it directly to your container host. You can download and install this extension, and read more about it, at the Visual Studio Gallery here:

Once installed, developers can right-click on their projects within Visual Studio and select "Publish":


Doing so will display a Publish dialog which will now include the ability to deploy to a Docker Container (on either a Windows Server or Linux machine):


You can choose to deploy to any existing Docker host you already have running:


Or use the dialog to create a new Virtual Machine running either Windows Server or Linux with containers enabled. The below screen-shot shows how easy it is to create a new VM hosted on Azure that runs today's Windows Server 2016 TP3 preview that supports Containers - you can do all of this (and deploy your apps to it) easily without ever having to leave the Visual Studio IDE:

Getting Started Using Azure

In June of last year, at the first DockerCon, we enabled a streamlined Azure experience for creating and managing Docker hosts in the cloud. Up until now these hosts have only run on Linux. With the new preview of Windows Server 2016 supporting Windows Server Containers, we have enabled a parallel experience for Windows users.

Directly from the Azure Marketplace, users can now deploy a Windows Server 2016 virtual machine pre-configured with the container feature enabled and Docker Engine installed. Our quick start guide has all of the details including screen shots and a walkthrough video, so take a look here.


Once your container host is up and running, the quick start guide includes step by step guides for creating and managing containers using both Docker and PowerShell.

Getting Started Locally Using Hyper-V

Creating a virtual machine on your local machine using Hyper-V to act as your container host is now really easy. We’ve published some PowerShell scripts to GitHub that automate nearly the whole process so that you can get started experimenting with Windows Server Containers as quickly as possible. The quick start guide has all of the details at

Once your container host is up and running, the quick start guide includes step-by-step guides for creating and managing containers using both Docker and PowerShell.
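As a rough sketch of that workflow with the Docker client (the base image name and commands below mirror the general Docker CLI of the TP3 preview and are illustrative; the quick start guide has the authoritative steps):

```shell
# List the base images available on the Windows container host
docker images

# Create and start an interactive Windows Server Core container
docker run -it --name demo windowsservercore cmd

# From another session: list running containers, then stop and remove it
docker ps
docker stop demo
docker rm demo
```

These commands assume a container host with the Docker Engine already running, as set up by the quick start guide.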

Additional Information and Resources

A great list of resources, including links to past presentations on containers, blogs and samples, can be found in the community section of our documentation. We have also set up a dedicated Windows containers forum where you can provide feedback, ask questions and report bugs. If you want to learn more about the technology behind containers, I would highly recommend reading Mark Russinovich’s blog on “Containers: Docker, Windows and Trends” that was published earlier this week.

Summary

At the //Build conference earlier this year we talked about our plan to make containers a fundamental part of our application platform, and today’s releases are a set of significant steps in making this a reality. The decision we made to embrace Docker and the Docker ecosystem to enable this in both Azure and Windows Server has generated a lot of positive feedback and we are just getting started.

While there is still more work to be done, users in the Windows Server ecosystem can now begin experiencing the world of containers. I highly recommend you download the Visual Studio Tools for Docker, create a Windows container host in Azure or locally, and try out our PowerShell and Docker support. Most importantly, we look forward to hearing feedback on your experience.

Hope this helps,

Scott

Categories: Architecture, Programming

The Microsoft Take on Containers and Docker

This is a guest repost by Mark Russinovich, CTO of Microsoft Azure (and novelist!). We all benefit from a vibrant competitive cloud market and Microsoft is part of that mix. Here's a good container overview along with Microsoft's plan of attack. Do you like their story? Is it interesting? Is it compelling?

You can’t have a discussion on cloud computing lately without talking about containers. Organizations across all business segments, from banks and major financial service firms to e-commerce sites, want to understand what containers are, what they mean for applications in the cloud, and how to best use them for their specific development and IT operations scenarios.

From the basics of what containers are and how they work, to the scenarios they’re being most widely used for today, to emerging trends supporting “containerization”, I thought I’d share my perspectives to better help you understand how to best embrace this important cloud computing development to more seamlessly build, test, deploy and manage your cloud applications.

Containers Overview

In abstract terms, all of computing is based upon running some “function” on a set of “physical” resources, like processor, memory, disk, network, etc., to accomplish a task, whether a simple math calculation, like 1+1, or a complex application spanning multiple machines, like Exchange. Over time, as the physical resources became more and more powerful, often the applications did not utilize even a fraction of the resources provided by the physical machine. Thus “virtual” resources were created to simulate underlying physical hardware, enabling multiple applications to run concurrently – each utilizing fractions of the physical resources of the same physical machine.

We commonly refer to these simulation techniques as virtualization. While many people immediately think virtual machines when they hear virtualization, that is only one implementation of virtualization. Virtual memory, a mechanism implemented by all general purpose operating systems (OSs), gives applications the illusion that a computer’s memory is dedicated to them and can even give an application the experience of having access to much more RAM than the computer has available.

Containers are another type of virtualization, also referred to as OS Virtualization. Today’s containers on Linux create the perception of a fully isolated and independent OS to the application. To the running container, the local disk looks like a pristine copy of the OS files, the memory appears only to hold files and data of a freshly-booted OS, and the only thing running is the OS. To accomplish this, the “host” machine that creates a container does some clever things.

The first technique is namespace isolation. Namespaces include all the resources that an application can interact with, including files, network ports and the list of running processes. Namespace isolation enables the host to give each container a virtualized namespace that includes only the resources that it should see. With this restricted view, a container can’t access files not included in its virtualized namespace regardless of their permissions because it simply can’t see them. Nor can it list or interact with applications that are not part of the container, which fools it into believing that it’s the only application running on the system when there may be dozens or hundreds of others.
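On a Linux host you can see this illusion directly with `unshare`, the userland entry point to the same kernel namespace facility that container runtimes build on (a minimal sketch; it typically requires root):

```shell
# Run `ps` inside fresh PID and mount namespaces: it sees only itself,
# running as PID 1 -- the "only application on the system" illusion
sudo unshare --pid --fork --mount-proc ps aux
```

The `--mount-proc` flag remounts `/proc` inside the new namespace so that `ps` reads the namespace's own process list rather than the host's.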

For efficiency, many of the OS files, directories and running services are shared between containers and projected into each container’s namespace. Only when an application makes changes to its containers, for example by modifying an existing file or creating a new one, does the container get distinct copies from the underlying host OS – but only of those portions changed, using Docker’s “copy-on-write” optimization. This sharing is part of what makes deploying multiple containers on a single host extremely efficient.
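Docker exposes this copy-on-write behavior through `docker diff`, which lists only the paths a container has changed relative to its image (a sketch against a generic `alpine` image; the container name is illustrative):

```shell
# Modify one file inside a fresh container
docker run --name cow-demo alpine sh -c 'echo hello > /etc/motd'

# Only the changed paths are listed; everything else is still
# shared read-only with the underlying image layers
docker diff cow-demo

# Clean up the stopped container
docker rm cow-demo
```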

Second, the host controls how much of the host’s resources can be used by a container. Governing resources like CPU, RAM and network bandwidth ensures that a container gets the resources it expects and that it doesn’t impact the performance of other containers running on the host. For example, a container can be constrained so that it cannot use more than 10% of the CPU. That means that even if the application within it tries, it can’t access the other 90%, which the host can assign to other containers or for its own use. Linux implements such governance using a technology called “cgroups.” Resource governance isn’t required in cases where containers placed on the same host are cooperative, allowing for standard OS dynamic resource assignment that adapts to the changing demands of application code.
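With the Docker CLI, the 10% CPU cap described above maps onto cgroup CFS quota flags (a hedged sketch; exact flag names depend on the Docker version in use):

```shell
# cpu-period/cpu-quota set the cgroup CFS bandwidth quota:
# 10000 us of CPU time per 100000 us period = at most 10% of one CPU
docker run -d --name capped \
  --cpu-period=100000 --cpu-quota=10000 \
  alpine sh -c 'while :; do :; done'

# Despite the busy loop, reported CPU usage stays near 10%
docker stats --no-stream capped

# Stop and remove the container
docker rm -f capped
```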

The combination of instant startup that comes from OS virtualization and reliable execution that comes from namespace isolation and resource governance makes containers ideal for application development and testing. During the development process, developers can quickly iterate. Because its environment and resource usage are consistent across systems, a containerized application that works on a developer’s system will work the same way on a different production system. The instant-start and small footprint also benefits cloud scenarios, since applications can scale-out quickly and many more application instances can fit onto a machine than if they were each in a VM, maximizing resource utilization.
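That build-once, run-anywhere iteration loop looks roughly like this (file names and image tag are illustrative):

```shell
# A minimal image definition: a base OS layer plus the app on top
cat > Dockerfile <<'EOF'
FROM alpine
COPY app.sh /app.sh
CMD ["sh", "/app.sh"]
EOF
echo 'echo "hello from a container"' > app.sh

# Build once, then start as many identical instances as needed;
# each launches in about a second because no OS has to boot
docker build -t myapp .
docker run --rm myapp
```

Because the image bundles the application with its environment, the same `docker run` behaves identically on the developer's machine and in production.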

Comparing a similar scenario that uses virtual machines with one that uses containers highlights the efficiency gained by the sharing. In the example shown below, the host machine has three VMs. In order to provide the applications in the VMs complete isolation, they each have their own copies of OS files, libraries and application code, along with a full in-memory instance of an OS. Starting a new VM requires booting another instance of the OS, even if the host or existing VMs already have running instances of the same version, and loading the application libraries into memory. Each application VM pays the cost of the OS boot and the in-memory footprint for its own private copies, which also limits the number of application instances (VMs) that can run on the host.

App Instances on Host

The figure below shows the same scenario with containers. Here, containers simply share the host operating system, including the kernel and libraries, so they don’t need to boot an OS, load libraries or pay a private memory cost for those files. The only incremental space they take is any memory and disk space necessary for the application to run in the container. While the application’s environment feels like a dedicated OS, the application deploys just like it would onto a dedicated host. The containerized application starts in seconds and many more instances of the application can fit onto the machine than in the VM case.

Containers on Host

Docker’s Appeal
Categories: Architecture

Quote of the Day

Herding Cats - Glen Alleman - Wed, 08/19/2015 - 14:55

The door of a bigoted mind opens outwards. The pressure of facts merely closes it more snugly.
- Ogden Nash

When there are new ideas being conjectured, it is best for the conversation to establish the principles on which those ideas can be tested. Without this, the person making the conjecture has to defend the idea on personality, personal anecdotes, and personal experience alone.

Categories: Project Management

KISS — One Best Practice to Rule Them All

Making the Complex Simple - John Sonmez - Wed, 08/19/2015 - 13:00

Why KISS isn’t easy Let’s talk about KISS, or “Keep It Simple, Stupid.” But before I go any further, just think a little about your favorite best practice when writing code. Is it DRY—Don’t Repeat Yourself? Or are you more a YAGNI—You Aren’t Gonna Need It—person? Do you follow SOLID principles? Or are you really […]

The post KISS — One Best Practice to Rule Them All appeared first on Simple Programmer.

Categories: Programming

Quote of the Month August 2015

From the Editor of Methods & Tools - Wed, 08/19/2015 - 09:39
Acknowledging that something isn’t working takes courage. Many organizations encourage people to spin things in the most positive light rather than being honest. This is counterproductive. Telling people what they want to hear just defers the inevitable realization that they won’t get what they expected. It also takes from them the opportunity to react to […]