
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools


Feed aggregator

Thinking, Talking, Doing on the Road to Improvement

Herding Cats - Glen Alleman - Mon, 07/20/2015 - 19:54

When there is a discussion about improving anything, trouble starts when we don't have a shared understanding of the outcomes. For example, speculating that something can be done, or that something should be stopped in pursuit of improvement, gains little traction in the absence of a framework for that discussion.

The discussion falls into a he said, she said style, or an "I'll tell you a story (anecdote) of how this worked for me and it'll work for you."

Over the years I've been trained to work on proposals, provide training materials, write guidance documents, and use other outlets - podcasts, conference presentations - all designed to convey a new and sometimes controversial topic. Connecting agile and earned value management is the latest.

There are several guides that have formed the basis of my work. The critical success factor for this work is to move away from personal anecdotes - although those are many times used inside a broader context to make the message more personal. Rather, start with a framework for the message.

A good place to start is Cliff Atkinson's Beyond Bullet Points. It's not so much about making PowerPoint briefings as about the process of sorting through what you are trying to say. Version 1 of the book is my favorite, because it was simple and actually changed how we thought about communication. Here's a framework from Cliff's 1st edition.

There is a deeper framework though. Our daughter is a teacher, and she smiled when I mentioned one time that we're starting to use Bloom's Taxonomy for building briefing materials designed to change how people do their work. "Dad, I do that every week, it's called a lesson plan." Here's an approach we use, from the revised Bloom's Taxonomy handout. When we start a podcast, or any information conveyance effort, we begin by asking: what will the listener, attendee, or reader be able to do when they go back to their desk or place of work? This helps us avoid the open-ended taking out your brain and playing with it syndrome. It is the basis of Actionable Outcomes.

So when we hear we're exploring or all we want is a conversation, and at the same time the suggestion - conjecture actually - that what we're talking about is a desire to change an existing paradigm, make some dysfunction go away, or take some corrective action - ask some important questions:

  • Is there a framework for discussing these topics? Are we trying to understand the problem before applying a solution?
  • When applying the solution based on the understanding, is there any way to assess the effectiveness of the solution? Is this solution applicable beyond our personal anecdotal experience?
  • Can we analyze the outcomes of the solution applied to the problem and determine if the solution results in correcting the problem?
  • Do we have some means of evaluating this effectiveness? What are the units of measure by which we can confirm this effectiveness? Anecdotes aren't evidence.
  • And finally, can this solution be syndicated outside the personal experience? That is, are our problem areas subject to the same solution?
Related articles:
  • Capabilities Based Planning
  • Estimating Processes in Support of Economic Analysis
  • Applying the Right Ideas to the Wrong Problem
  • The Art of Systems Architecting
  • Are Estimates Really The Smell of Dysfunction?
  • Strategy is Not the Same as Operational Effectiveness
Categories: Project Management

Easier Auth for Google Cloud APIs: Introducing the Application Default Credentials feature.

Google Code Blog - Mon, 07/20/2015 - 19:27

Originally posted to the Google Cloud Platform blog

When you write applications that run on Google Compute Engine instances, you might want to connect them to Google Cloud Storage, Google BigQuery, and other Google Cloud Platform services. Those services use OAuth2, the global standard for authorization, to help ensure that only the right callers can make the right calls. Unfortunately, OAuth2 has traditionally been hard to use. It often requires specialized knowledge and a lot of boilerplate auth setup code just to make an initial API call.

Today, with Application Default Credentials (ADC), we're making things easier. In many cases, all you need is a single line of auth code in your app:

Credential credential = GoogleCredential.getApplicationDefault();
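
To see that line in context, here is a minimal sketch of a complete call in Java, assuming the google-api-client and Cloud Storage client libraries are on the classpath; the bucket and application names are hypothetical:

  import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
  import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
  import com.google.api.client.json.jackson2.JacksonFactory;
  import com.google.api.services.storage.Storage;
  import com.google.api.services.storage.StorageScopes;
  import com.google.api.services.storage.model.StorageObject;

  public class AdcExample {
      public static void main(String[] args) throws Exception {
          // The single line of auth code: ADC picks up your gcloud credentials
          // locally, or the built-in service account on Compute Engine / App Engine.
          GoogleCredential credential = GoogleCredential.getApplicationDefault();
          if (credential.createScopedRequired()) {
              credential = credential.createScoped(StorageScopes.all());
          }

          // The credential plugs into any Cloud API client, e.g. Cloud Storage.
          Storage storage = new Storage.Builder(
                  GoogleNetHttpTransport.newTrustedTransport(),
                  JacksonFactory.getDefaultInstance(),
                  credential)
              .setApplicationName("adc-example")       // hypothetical name
              .build();
          for (StorageObject o : storage.objects().list("my-bucket").execute().getItems()) {
              System.out.println(o.getName());
          }
      }
  }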

If you're not already familiar with auth concepts, including 2LO, 3LO, and service accounts, you may find this introduction useful.

ADC takes all that complexity and packages it behind a single API call. Under the hood, it makes use of:

  • 2-legged vs. 3-legged OAuth (2LO vs. 3LO) -- OAuth2 includes support for user-owned data, where the user, the API provider, and the application developer all need to participate in the authorization dance. Most Cloud APIs don't deal with user-owned data, and therefore can use much simpler two-party flows between the API provider and the application developer.
  • gcloud CLI -- while you're developing and debugging your app, you probably already use the gcloud command-line tool to explore and manage Cloud Platform resources. ADC lets your application piggyback on the auth flows in gcloud, so you only have to set up your credentials once.
  • service accounts -- if your application runs on Google App Engine or Google Compute Engine, it automatically has access to the built-in "service account," which helps the API provider trust that the API calls are coming from a trusted source. ADC lets your application benefit from that trust.

You can find more about Google Application Default Credentials here. This is available for Java, Python, Node.js, Ruby, and Go. Libraries for PHP and .NET are in development.

Categories: Programming

MongoDB and WTFs and Anger

Eric.Weblog() - Eric Sink - Mon, 07/20/2015 - 19:00

Recently, Sven Slootweg (joepie91) published a blog entry entitled Why you should never, ever, ever use MongoDB. It starts out with the words "MongoDB is evil" and proceeds to give a list of negative statements about same.

I am not here to respond to each of his statements. He labels them as "facts", and some (or perhaps all) of them surely are. In fact, for now, let's assume that everything he wrote is correct. My point here is not to say that the author is wrong.

Rather, my point here is that this kind of blog entry tells me very little about MongoDB while it tells me a great deal about the emotions of the person who wrote it.

Like I said, it may be true that every WTF the author listed is correct. It is also true that some software has more WTFs than others.

I'm not a MongoDB expert, but I've been digging into it quite a bit, and I could certainly make my own list of its WTFs. And I would also admit that my own exploration of Couchbase has yielded fewer of those moments. Therefore, every single person on the planet who chooses MongoDB instead of Couchbase is making a terrible mistake, right?

Let me briefly shift to a similar situation where I personally have a lot more knowledge: Microsoft SQL Server vs PostgreSQL. For me, it is hard to study SQL Server without several WTF moments. And while PostgreSQL is not perfect, I have found that a careful study there tends to produce more admiration than WTFs.

So, after I discovered that (for example) SQL Server has no support for deferred foreign keys, why didn't I write a blog entry entitled "Why you should never, ever, ever use SQL Server"?

Because I calmed down and looked at the bigger picture.

I think I could make an entirely correct list of negative things about SQL Server that is several pages long. And I suppose if I wanted to do that, and if I were really angry while I was writing it, I would include only the facts that support my feelings, omitting anything positive. For example, my rant blog entry would have no reason to acknowledge that SQL Server is the most widely used relational database server in the world. These kinds of facts merely distract people from my point.

But what would happen if I stopped writing my rant and spent some time thinking about the fact I just omitted?

I just convinced myself that this piece of software is truly horrible, and yet, millions of people are using it every day. How do I explain this?

If I tried to make a complete list of theories that might fit the facts, today's blog entry would get too long. Suffice it to say this: Some of those theories might support an anti-Microsoft rant (for example, maybe Microsoft's field sales team is really good at swindling people), but I'm NOT going to be able to prove that every single person who chose SQL Server has made a horrible mistake. There is no way I can credibly claim that PostgreSQL is the better choice for every single company simply because I admire it. Even though I think (for example) that SQL Server handles NULL and UNIQUE in a broken way, there is some very large group of people for whom SQL Server is a valid and smart choice.

So why would I write a blog entry that essentially claims that all SQL Server users are stupid when that simply cannot be true? I wouldn't. Unless I was really angry.

MongoDB is indisputably the top NoSQL vendor. It is used by thousands of companies who serve millions of users every day. Like all young software serving a large user base, it has bugs and flaws, some of which are WTF-worthy. But it is steadily getting better. Any discussion of its technical deficiencies which does not address these things is mostly just somebody venting emotion.

 

Algolia's Fury Road to a Worldwide API Steps Part 2


The most frequent questions we answer for developers and devops are about our architecture and how we achieve such high availability. Some of them are very skeptical about high availability with bare metal servers, while others are skeptical about how we distribute data worldwide. However, the question I prefer is “How is it possible for a startup to build an infrastructure like this?” It is true that our current architecture is impressive for a young company:

  • Our high-end dedicated machines are hosted in 13 worldwide regions with 25 data centers

  • Our master-master setup replicates our search engine on at least 3 different machines

  • We process over 6 billion queries per month

  • We receive and handle over 20 billion write operations per month

Just like Rome, our infrastructure wasn't built in a day. This series of posts will explore the 15 instrumental steps we took when building our infrastructure. I will even discuss our outages and bugs, so that you can understand how we used them to improve our architecture.

The first blog post of the series focused on the early days of our beta. This blog post will focus on the first 18 months of our service from September 2013 to December 2014 and even include our first outages!

Step 4: January 2014
Categories: Architecture

Released Today: Visual Studio 2015, ASP.NET 4.6, ASP.NET 5 & EF 7 Previews

ScottGu's Blog - Scott Guthrie - Mon, 07/20/2015 - 16:14

Today is a big day with major release announcements for Visual Studio 2015, Visual Studio 2013 Update 5, and .NET Framework 4.6. All these releases have been covered in great detail on Soma’s Blog, the Visual Studio Blog, and the .NET Blog.

Join us online for the Visual Studio 2015 Release Event, where you can see Soma, Brian Harry, Scott Hanselman, and many others demo new Visual Studio 2015 features and technologies. This year, in a new segment called “In The Code”, we share how a team of Microsoft engineers created a real app in 3 days. There will be opportunities along the way to interact in live Q&A with the team on subjects such as Agile development, web and cloud development, cross-platform mobile dev, and much more.

In this post I’d like to specifically talk about some of the ground we have covered in ASP.NET and Entity Framework. In this release of Visual Studio, we are releasing ASP.NET 4.6, updating our Visual Studio Web Development Tools, and updating the latest beta release of our new ASP.NET 5 framework. Below are details on just a few of the great updates available today.

ASP.NET Tooling Improvements

Today’s VS 2015 release delivers some great updates for web development. Here are just a few of the updates we are shipping in this release:

JSON Editor

JSON has become a first class experience in Visual Studio 2015 and we are now giving you a great editor to allow you to maintain your JSON content. With support for JSON Schema validation, intellisense, and SchemaStore.org, writing and producing JSON content has never been easier. We’ve also added intellisense support for bower.json and package.json files for bower and npm package manager use.

HTML Editor Updates

Our HTML editor received a lot of attention in this update.  We wanted to deliver an editor that kept up with HTML 5 standards and provided rich support for popular new frameworks and libraries.  We previously shipped the bootstrap responsive web framework with our ASP.NET templates, and we are now providing intellisense for their classes with an indicator icon to show that they are bootstrap CSS classes.


 

This helps you distinguish the classes that you wrote in your project, like the page-inner class above, from the bootstrap classes marked with the B icon.

We are also keeping up with support for the emerging web components standard, with support for the import link that web components markup uses.


We are also providing intellisense for AngularJS directives and attributes with an appropriate Angular icon, so you know you’re triggering AngularJS functionality.

JavaScript Editor Improvements

With the VS 2015 release we are introducing support for AngularJS structures including controllers, services, factories, directives and animations. There is also support for the new ECMAScript 6 features such as classes, arrow functions, and template strings. We are also bringing a navigation bar to the editor to help you navigate between the major elements of your JavaScript. With JSDoc support to deliver intellisense, JavaScript development gets easier.

ReactJS Editor Support

We spent some time with the folks at Facebook to make sure that we delivered first class capabilities for developers using their ReactJS framework.  With appropriate syntax highlighting and intellisense for React methods, developers should be very comfortable building React applications with the new Visual Studio:

Support for JavaScript Package Managers and Task Runners like Grunt and Gulp

JavaScript and modern web development techniques are the new recommended way to build client-side code for your web application.  We support these tools and programming techniques with our new Task Runner Explorer that executes grunt and gulp task runners.  You can open this tool window with the Ctrl+Alt+Backspace hotkey combination.


Execute any of the tasks defined in your gruntfile.js or gulpfile.js by right-clicking the task name in the left panel and choosing “Run” from the context menu that appears. You can even use this context menu to attach grunt or gulp tasks to project build events in Visual Studio, like “After Build” as shown in the figure above. Every time the .NET objects in your web project finish compiling, the ‘build’ task will be executed from the gruntfile.js.

Combined with the intellisense support for JavaScript and JSON editors, we think that developers wanting to use grunt and gulp tasks will really enjoy this new Visual Studio experience.  You can add grunt and gulp tasks with the newly integrated npm package manager capabilities.  When you create a package.json file in your web project, we will install and upgrade local copies of all packages referenced.  Not only do we deliver syntax highlighting and intellisense for package.json terms, we also provide package name and version lookup against the npmjs.org gallery.


The bower package manager is also supported with great intellisense, syntax highlighting and the same package name and version support in the bower.json file that we provide for package.json.


These improvements in managing and writing JavaScript configuration files and executing grunt or gulp tasks bring a new level of functionality to Visual Studio 2015 that we think web developers will really enjoy.

ASP.NET 4.6 Runtime Improvements

Today’s release also includes a bunch of enhancements to ASP.NET from a runtime perspective.

HTTP/2 Support

Starting with ASP.NET 4.6 we are introducing support for the HTTP/2 standard.  This new version of the HTTP protocol delivers a true multiplexing of requests and responses between browser and web server.  This exciting update is as easy as enabling SSL in your web projects to immediately improve your ASP.NET application responsiveness.


With SSL enabled (which is a requirement of the HTTP/2 protocol), IISExpress on Windows 10 will begin interacting with the browser using the updated protocol.  The difference between the protocols is clear.  Consider the network performance presented by Microsoft Edge when requesting the same website without SSL (and receiving HTTP/1.x) and with SSL to activate the HTTP/2 protocol:



Both samples are showing the default ASP.NET project template’s home page. In both scenarios the HTML for the page is retrieved in line 1. In HTTP/1.x on the left, the first six elements are requested and we see grey bars to indicate waiting to request the last two elements. In HTTP/2 on the right, all eight page elements are loaded concurrently, with no waiting.

Support for the .NET Compiler Platform

We now support the new .NET compilers provided in the .NET Compiler Platform (codenamed Roslyn). These compilers allow you to access the new language features of Visual Basic and C# throughout your Web Forms markup and MVC view pages. Our markup can look much simpler and more readable with new language features like string interpolation:

Instead of building a link in Web Forms like this:

  <a href="/Products/<%: model.Id %>/<%: model.Name %>"><%: model.Name %></a>

We can deliver a more readable piece of markup like this:

  <a href="<%: $"/Products/{model.Id}/{model.Name}" %>"><%: model.Name %></a>

We’ve also bundled the Microsoft.CodeDom.Providers.DotNetCompilerPlatform NuGet package to enable your Web Forms assets to compile significantly faster, without requiring any changes to your code or project.

Async Model Binding for Web Forms

Model binding was introduced for Web Forms applications in ASP.NET 4, and we introduced async methods in .NET 4.5. We heard your requests to be able to execute your model binding methods on a Web Form asynchronously with the new language features. Our team has made this as easy as adding an async=”true” attribute to the @Page directive and returning a Task from your model binding methods:

    public async Task<IEnumerable<Product>> myGrid_GetData()
    {
        var repo = new Repository();
        return await repo.GetAll();
    }

We have a blog post with more information and tips about this feature on our MSDN Web Development blog.

ASP.NET 5

I introduced ASP.NET 5 back in February and shared in detail what this release would bring. I’ll reiterate just a few high-level points here; check out my post Introducing ASP.NET 5 for a more complete rundown.

ASP.NET 5 works with .NET Core as well as the full .NET Framework to give you greater flexibility when hosting your web apps. With ASP.NET MVC 6 we are merging the complementary features and functionality from MVC, Web API, and Web Pages. With ASP.NET 5 we are also introducing a new HTTP request pipeline based on our learnings from Katana, which enables you to add only the components you need with an opt-in strategy. Additionally, included in this release are multiple development features for improved productivity and to enable you to build better web applications. ASP.NET 5 is also open source. You can find us on GitHub, view and download the code, submit changes, and track when changes are made.

The ASP.NET 5 Beta 5 runtime packages are in preview and not recommended for use in production, so please continue using ASP.NET 4.6 for building production-grade apps. For details on the latest ASP.NET 5 beta enhancements added and issues fixed, check out the published release notes for ASP.NET 5 Beta 5 on GitHub. To get started with ASP.NET 5, get the docs and tutorials on the ASP.NET site.

To learn more and keep an eye on all updates to ASP.NET, check out the WebDev blog and read along with the tutorials and documentation at www.asp.net/vnext.

Entity Framework

With today’s release, we not only have an update to Entity Framework 6 that primarily includes bug fixes and community contributions, but we have also released a preview version of Entity Framework 7. Keep reading for details.

Entity Framework 6.x

Visual Studio 2015 includes Entity Framework 6.1.3. EF 6.1.3 primarily focuses on bug fixes and community contributions; you can see a list of the changes included in EF 6.1.3 in this EF 6.1.3 announcement blog post. The Entity Framework 6.1.3 runtime is included in a number of places in this release. In EF 6.1.3, when you create a new model using the Entity Framework Tools in a project that does not already have the EF runtime installed, the runtime is automatically installed for you. Additionally, the runtime is pre-installed in new ASP.NET projects, depending on the project template you select.


To learn more and keep an eye on all updates to Entity Framework, check out the ADO.NET blog.

Entity Framework 7

Entity Framework 7 is in preview and not yet ready for production. This new version of Entity Framework enables new platforms and new data stores. Universal Windows Platform, ASP.NET 5, and traditional desktop applications can now use EF7. EF7 can also be used in .NET applications that run on Mac and Linux. Visual Studio 2015 includes an early preview of the EF7 runtime that is installed in new ASP.NET 5 projects.


For more information on EF7, check out the GitHub page explaining what EF7 is all about.

Summary

Today’s Visual Studio release is a big one that we are proud to share with you all. Thank you for your continued support by providing feedback on the interim releases (CTPs, Preview, RC).  We are really looking forward to seeing what you build with it.

Hope this helps,

Scott

P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me @scottgu.

Categories: Architecture, Programming

You’ve Just Been Laid Off From Your Programming Job. Now What?

Making the Complex Simple - John Sonmez - Mon, 07/20/2015 - 16:00

Don’t worry. It happens to the best of us. We’ve all been “laid off” at some point in our lives—well, at least most of us. Maybe your employer went through a “workforce reduction” and randomly selected people to be laid off—you just got unlucky. Or perhaps it wasn’t so much luck, but your tendency to […]

The post You’ve Just Been Laid Off From Your Programming Job. Now What? appeared first on Simple Programmer.

Categories: Programming

Quote of the Month July 2015

From the Editor of Methods & Tools - Mon, 07/20/2015 - 15:16
Many users blame themselves for errors that occur when using technology, thinking that maybe they did something wrong. You must reverse this belief if you want to be an effective tester. Here is a rule of thumb: If something unexpected occurs, don’t blame yourself; blame the technology. Source: Tap Into Mobile Application Testing, Jonathan Koh, […]

SPaMCAST 351 – Distributed Agile, Illusion of Control, QA Corner

Software Process and Measurement Cast - Sun, 07/19/2015 - 22:00

Software Process and Measurement Cast 351 includes three columns.  The first is our essay on distributed Agile. What is distributed Agile? The phrase "distributed Agile" is often used indiscriminately; therefore definitions can cover a wide range of situations and evoke a wide range of emotions. A precise definition encompasses three concepts. The first is a team, project or program that is using Agile techniques. The second is geographic distribution describing where team members are located. The location of team members in a distributed team can range from being spread across a single building to members sprinkled across continents. Finally, the third is organizational distribution, meaning that teams can be comprised of members from different companies. No matter the definition, distributed Agile is different.

The Software Sensei, Kim Pries dives into the Illusion of Control.  Kim reminds us to drop the egos before you start working and choose your weapons unemotionally!

Jeremy Berriault brings a new installment of his QA Corner. Jeremy discusses why testing is not just a random event. Testing requires planning, or you will waste time and effort and compromise quality.

Call to Action!

I have a challenge for the Software Process and Measurement Cast listeners for the next few weeks. I would like you to find one person that you think would like the podcast and introduce them to the cast. This might mean sending them the URL or teaching them how to download podcasts. If you like the podcast and think it is valuable they will be thankful to you for introducing them to the Software Process and Measurement Cast. Thank you in advance!

Re-Read Saturday News

Remember that the Re-Read Saturday of The Mythical Man-Month is in full swing.  This week we tackle the essay titled “The Surgical Team”!

The Re-Read Saturday and other great articles can be found on the Software Process and Measurement Blog.

Remember: We just completed the Re-Read Saturday of Eliyahu M. Goldratt and Jeff Cox’s The Goal: A Process of Ongoing Improvement, which began on February 21st. What did you think? Did the re-read cause you to read The Goal for a refresher? Visit the Software Process and Measurement Blog and review the whole re-read.

Note: If you don’t have a copy of the book, buy one. If you use the link below it will support the Software Process and Measurement blog and podcast.

Dead Tree Version or Kindle Version 

Upcoming Events

Software Quality and Test Management Conference
September 13 – 18, 2015
San Diego, California
http://qualitymanagementconference.com/

I will be speaking on the impact of cognitive biases on teams!  Let me know if you are attending!

I HAVE A SPECIAL DISCOUNT CODE . . . just ask!

More on other great conferences soon!

Next SPaMCAST

The next Software Process and Measurement Cast features our interview with Gil Broza.  We discussed Gil’s new book The Agile Mindset.  Teams and organizations with an Agile mindset deliver more value; however many in the Agile community don’t know or don’t embrace an Agile Mindset.  Gil’s new book explains the concept of the Agile Mindset and how you can find it!

Shameless Ad for my book!

Mastering Software Project Management: Best Practices, Tools and Techniques, co-authored by Murali Chemuturi and myself and published by J. Ross Publishing. We have received unsolicited reviews like the following: “This book will prove that software projects should not be a tedious process, neither for you or your team.” Support SPaMCAST by buying the book here.

Available in English and Chinese.

 

Categories: Process Management


R: Bootstrap confidence intervals

Mark Needham - Sun, 07/19/2015 - 20:44

I recently came across an interesting post on Julia Evans’ blog showing how to generate a bigger set of data points by sampling the small set of data points that we actually have using bootstrapping. Julia’s examples are all in Python so I thought it’d be a fun exercise to translate them into R.

We’re doing the bootstrapping to simulate the number of no-shows for a flight so we can work out how many seats we can overbook the plane by.

We start out with a small sample of no-shows and work off the assumption that it’s ok to kick someone off a flight 5% of the time. Let’s work out how many people that’d be for our initial sample:

> data = c(0, 1, 3, 2, 8, 2, 3, 4)
> quantile(data, 0.05)
  5% 
0.35

0.35 people! That’s not a particularly useful result so we’re going to resample the initial data set 10,000 times, taking the 5%ile each time and see if we come up with something better:

We’re going to use the sample function with replacement to generate our resamples:

> sample(data, replace = TRUE)
[1] 0 3 2 8 8 0 8 0
> sample(data, replace = TRUE)
[1] 2 2 4 3 4 4 2 2

Now let’s write a function to do that multiple times:

library(ggplot2)
 
# Resample the data n_bootstraps times (with replacement),
# taking the 5th percentile of each resample.
bootstrap_5th_percentile = function(data, n_bootstraps) {
  return(sapply(1:n_bootstraps, 
                function(iteration) quantile(sample(data, replace = TRUE), 0.05)))
}
 
values = bootstrap_5th_percentile(data, 10000)
 
ggplot(aes(x = value), data = data.frame(value = values)) + geom_histogram(binwidth=0.25)

[Histogram of the bootstrapped 5th percentile values]

So this visualisation is telling us that we can oversell by 0-2 people but we don’t know an exact number.

Let’s try the same exercise but with a bigger initial data set of 1,000 values rather than just 8. First we’ll generate a distribution (with a mean of 5 and standard deviation of 2) and visualise it:

library(dplyr)
 
df = data.frame(value = rnorm(1000, 5, 2))
df = df %>% filter(value >= 0) %>% mutate(value = as.integer(round(value)))
ggplot(aes(x = value), data = df) + geom_histogram(binwidth=1)

[Histogram of the generated distribution]

Our distribution seems to have a lot more values around 4 & 5, whereas the Python version has a flatter distribution – I’m not sure why that is, so if you have any ideas let me know. In any case, let’s check the 5%ile for this data set:

> quantile(df$value, 0.05)
5% 
 2

Cool! Now at least we have an integer value rather than the 0.35 we got earlier. Finally let’s do some bootstrapping over our new distribution and see what 5%ile we come up with:

resampled = bootstrap_5th_percentile(df$value, 10000)
byValue = data.frame(value = resampled) %>% count(value)
 
> byValue
Source: local data frame [3 x 2]
 
  value    n
1   1.0    3
2   1.7    2
3   2.0 9995
 
ggplot(aes(x = value, y = n), data = byValue) + geom_bar(stat = "identity")

[Bar chart of the bootstrapped 5th percentile counts]

‘2’ is by far the most popular 5%ile here although it seems weighted more towards that value than with Julia’s Python version, which I imagine is because we seem to have sampled from a slightly different distribution.

Categories: Programming

Re-read Saturday: The Mythical Man-Month, Part 3 The Surgical Team

The Mythical Man-Month

When we began the re-read of The Mythical Man-Month my plan was to go through two essays every week. To date the planned cadence of two essays has been out of reach. Each of the essays is full of incredibly rich ideas that need to be shared, therefore I am amending the plan to one essay per week. Today we re-read The Surgical Team. In this essay, Brooks addresses the impact of team size and team composition on the ability to deliver large projects.

The concept of a small team did not jump into popular discussion with the Agile Manifesto of 2001. Even before The Mythical Man-Month was published in 1975, the software development industry was beginning to coalesce around the idea that smaller teams were more efficient. Smaller teams are easier to coordinate, and information sharing is easier because there are fewer communication paths with fewer people. The problem that Brooks postulated was that big systems, which are needed, can’t be built by a single, small team in any reasonable period of time. Efficiency does not always translate to effectiveness. Paraphrasing Brooks, the question we have to ask is: if having an efficient single small team of first class people focused on a problem is great, how do you build the large systems that are needed? If small teams can’t build big systems efficiently, the solution is either not to build large solutions all at once, to find a mechanism to scale smaller teams, or to revert to using brute force (which is often the method of choice).

The brute force method has been the most often leveraged answer for building large systems before (and after) the publication of The Mythical Man-Month. Brute force is a large project team typically coordinated by a team of program and project managers. Earlier in my career I was involved in the systems side of bank mergers. During one of the larger mergers I worked on, over 300 coders, business analysts, testers, project managers and others worked together to meet a merger date. Unquestionably the approach taken was brute force, and as the date got closer the amount of force being brought to bear became more obvious. Brute force is problematic because of its lack of efficiency and predictability. Brute force methods are affected by the variability of an individual’s capability and productivity. The goals of the systems portion of the bank mergers were not the efficiency of the process, but rather making the date given to the regulators for cut over to a single system without messing people’s money up and ending up on the front page of The Plain Dealer.

If brute force is an anathema (and it should be), a second solution is to use only small, single-purpose teams. Products would evolve as small pieces of functionality are conceived, built and then integrated into a larger whole. Scrum, at the team level, uses small teams to achieve a goal. Team-level Agile embraces the effectiveness of small teams discussed in The Surgical Team; however, it does not address bringing large quantities of tightly integrated functionality to market quickly.

Agile has recognized the need to get large pieces of functionality to market faster than incremental evolution allows, without abandoning the use of small teams, by adding scaling techniques. A Scrum of Scrums is a technique to scale Scrum and other team-level Agile frameworks. Other scaling frameworks include DSDM, SAFe and Scaled Scrum. All of these frameworks leverage semi-autonomous small teams with some form of coordination to keep teams moving in the same direction. Scaling adds some overhead to the process, which reduces the efficiency gains from small teams, but allows larger pieces of work to be addressed.

In The Mythical Man-Month, Brooks leverages the metaphor of the surgical team to describe a highly effective AND highly efficient model of a team. In a surgical team, the surgeon (responsible party) delivers value with a team that supports him or her. Transferring the surgical team metaphor to a development team, the surgeon writes the code and is responsible for it, while the backup surgeon (Brooks uses the term co-pilot) is the surgeon’s helper and is typically less experienced. The backup is not responsible for the code. The rest of the team supports the surgeon and backup. The goal of the support team is to administer, test, remove roadblocks and document the operation or project. While we might dicker about the definition of specific roles and what they are called, the concept of the small, goal-oriented team is not out of line with many Scrum teams in today’s Agile environment.

Scrum and most other Agile techniques move past the concept of teams of individuals with specific solo roles towards teams of more cross-functional individuals. Cross-functional teams tend to yield more of a peer relationship than the hierarchy seen in the surgical team. The flatter team will require more complex communication patterns, which can be addressed in Scrum with techniques like the stand-up meeting. The concept of the Scrum team is a natural evolution of the concepts in The Surgical Team. Scrum tunes the small team concept to software development, where a number of coordinated hands can be in the patient simultaneously, if coordinated through the techniques of a common goal, stand-up meetings, reviews and continuous integration.

 

Previous installments of Re-read Saturday for the The Mythical Man-Month

Intro and Tar Pit

The Mythical Man-month Part 2


Categories: Process Management

How To Get Smarter By Making Distinctions

"Whatever you do in life, surround yourself with smart people who'll argue with you." -- John Wooden

There’s a very simple way to get smarter.

You can get smarter by creating categories.

Not only will you get smarter, but you’ll also be more mindful, and you’ll expand your vocabulary, which will improve your ability to think more deeply about a given topic or domain.

In my post, The More Distinctions You Make, the Smarter You Get, I walk through the ins and outs of creating categories to increase your intelligence, and I use the example of “fat.”   I attempt to show how “Fat is bad” isn’t very insightful, and how by breaking “fat” down into categories, you can dive deeper and reveal new insight to drive better decisions and better outcomes.

In this post, I’m going to walk this through with an example, using “security” as the topic.

The first time I heard the word “security”, it didn’t mean much to me, beyond “protect.”

The next thing somebody taught me, was how I had to focus on CIA:  Confidentiality, Integrity, and Availability.

That was a simple way to break security down into meaningful parts.

And then along came Defense in Depth.   A colleague explained that Defense in Depth meant thinking about security in terms of multiple layers:  Network, Host, Application, and Data.

But then another colleague said, the real key to thinking about security and Defense in Depth, was to think about it in terms of people, process, and technology.

As much as I enjoyed these thought exercises, I didn’t find them actionable enough to actually improve software or application security.  And my job was to help Enterprise developers build better Line-Of-Business applications that were scalable and secure.

So our team went to the drawing board to map out actionable categories to take application security much deeper.

Right off the bat, just focusing on “application” security vs. “network” security or “host” security helped us get more specific and make security more tangible and more actionable from a Line-of-Business application perspective.

Security Categories

Here are the original security categories that we used to map out application security and make it more actionable:

  1. Input and Data Validation
  2. Authentication
  3. Authorization
  4. Configuration Management
  5. Sensitive Data
  6. Session Management
  7. Cryptography
  8. Exception Management
  9. Auditing and Logging

Each of these buckets helped us create actionable principles, patterns, and practices for improving security.

Security Categories Explained

Here is a brief description of each application security category:

Input and Data Validation
How do you know that the input your application receives is valid and safe? Input validation refers to how your application filters, scrubs, or rejects input before additional processing. Consider constraining input through entry points and encoding output through exit points. Do you trust data from sources such as databases and file shares?

Authentication
Who are you? Authentication is the process where an entity proves the identity of another entity, typically through credentials, such as a user name and password.

Authorization
What can you do? Authorization is how your application provides access controls for resources and operations.

Configuration Management
Who does your application run as? Which databases does it connect to? How is your application administered? How are these settings secured? Configuration management refers to how your application handles these operational issues.

Sensitive Data
How does your application handle sensitive data? Sensitive data refers to how your application handles any data that must be protected either in memory, over the network, or in persistent stores.

Session Management
How does your application handle and protect user sessions? A session refers to a series of related interactions between a user and your Web application.

Cryptography
How are you keeping secrets (confidentiality)? How are you tamper-proofing your data or libraries (integrity)? How are you providing seeds for random values that must be cryptographically strong? Cryptography refers to how your application enforces confidentiality and integrity.

Exception Management
When a method call in your application fails, what does your application do? How much do you reveal? Do you return friendly error information to end users? Do you pass valuable exception information back to the caller? Does your application fail gracefully?

Auditing and Logging
Who did what and when? Auditing and logging refer to how your application records security-related events.

As you can see, just by calling out these different categories, you suddenly have a way to dive much deeper and explore application security in depth.

The Power of a Security Category

Let’s use a quick example.  Let’s take Input Validation.

Input Validation is a powerful security category, given how many software security flaws, vulnerabilities, and attacks stem from a lack of input validation, including buffer overflows.

But here’s the interesting thing.   After quite a bit of research and testing, we found a powerful security pattern that could help more applications stand up to more security attacks.  It boiled down to the following principle:

Validate for length, range, format, and type.

That’s a pithy but powerful piece of insight when it comes to implementing software security.

And, when you can’t validate the input, make it safe by sanitizing the output.  And along these lines, keep user input out of the control path, where possible.

All of these insights flow from just focusing on Input Validation as a security category.
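
To make that principle concrete, here is a minimal sketch in Java; the field name, length limit, and range bounds are hypothetical, chosen only to illustrate the pattern:

  import java.util.regex.Pattern;

  public final class QuantityValidator {

      // Format: digits only, which also constrains the type before parsing.
      private static final Pattern DIGITS = Pattern.compile("[0-9]{1,4}");

      // Validates an untrusted "quantity" field for length, range, format, and type.
      public static int validateQuantity(String raw) {
          if (raw == null || raw.length() > 4) {                    // length
              throw new IllegalArgumentException("quantity: bad length");
          }
          if (!DIGITS.matcher(raw).matches()) {                     // format
              throw new IllegalArgumentException("quantity: bad format");
          }
          int value = Integer.parseInt(raw);                        // type
          if (value < 1 || value > 1000) {                          // range
              throw new IllegalArgumentException("quantity: out of range");
          }
          return value;
      }

      public static void main(String[] args) {
          System.out.println(validateQuantity("42"));   // prints 42
      }
  }

Note how the whitelist approach (constrain to known-good input rather than hunting for known-bad patterns) falls out naturally once you name the four checks.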

Threats, Attacks, Vulnerabilities, and Countermeasures

Another distinction our team made was to think in terms of threats, attacks, vulnerabilities, and countermeasures.  We knew that threats could be intentional and malicious (as in the case of attacks), but they could also be accidental and unintended.

We wanted to identify vulnerabilities as weaknesses that could be addressed in some way.

We wanted to identify countermeasures as the actions to take to help mitigate risks, reduce the attack surface, and address vulnerabilities.

Just by chunking up the application security landscape into threats, attacks, vulnerabilities, and countermeasures, we empowered more people to think more deeply about the application security space.

Security Vulnerabilities Organized by Security Categories

Using the security categories above, we could easily focus on finding security vulnerabilities and group them by the relevant security category.

Here are some examples:

Input/Data Validation

  • Using non-validated input in the Hypertext Markup Language (HTML) output stream
  • Using non-validated input to generate SQL queries
  • Relying on client-side validation
  • Using input file names, URLs, or user names for security decisions
  • Using application-only filters for malicious input
  • Looking for known bad patterns of input
  • Trusting data read from databases, file shares, and other network resources
  • Failing to validate input from all sources including cookies, query string parameters, HTTP headers, databases, and network resources

Authentication

  • Using weak passwords
  • Storing clear text credentials in configuration files
  • Passing clear text credentials over the network
  • Permitting over-privileged accounts
  • Permitting prolonged session lifetime
  • Mixing personalization with authentication

Authorization

  • Relying on a single gatekeeper
  • Failing to lock down system resources against application identities
  • Failing to limit database access to specified stored procedures
  • Using inadequate separation of privileges

Configuration Management

  • Using insecure administration interfaces
  • Using insecure configuration stores
  • Storing clear text configuration data
  • Having too many administrators
  • Using over-privileged process accounts and service accounts

Sensitive Data

  • Storing secrets when you do not need to
  • Storing secrets in code
  • Storing secrets in clear text
  • Passing sensitive data in clear text over networks

Session Management

  • Passing session identifiers over unencrypted channels
  • Permitting prolonged session lifetime
  • Having insecure session state stores
  • Placing session identifiers in query strings

Cryptography

  • Using custom cryptography
  • Using the wrong algorithm or a key size that is too small
  • Failing to secure encryption keys
  • Using the same key for a prolonged period of time
  • Distributing keys in an insecure manner

Exception Management

  • Failing to use structured exception handling
  • Revealing too much information to the client

Auditing and Logging

  • Failing to audit failed logons
  • Failing to secure audit files
  • Failing to audit across application tiers

Threats and Attacks Organized by Security Categories

Again, using our security categories, we could then group threats and attacks by relevant security categories.

Here are some examples of security threats and attacks organized by security categories:

Input/Data Validation

  • Buffer overflows
  • Cross-site scripting
  • SQL injection
  • Canonicalization attacks
  • Query string manipulation
  • Form field manipulation
  • Cookie manipulation
  • HTTP header manipulation

Authentication

  • Network eavesdropping
  • Brute force attacks
  • Dictionary attacks
  • Cookie replay attacks
  • Credential theft

Authorization

  • Elevation of privilege
  • Disclosure of confidential data
  • Data tampering
  • Luring attacks

Configuration Management

  • Unauthorized access to administration interfaces
  • Unauthorized access to configuration stores
  • Retrieval of clear text configuration secrets
  • Lack of individual accountability

Sensitive Data

  • Accessing sensitive data in storage
  • Accessing sensitive data in memory (including process dumps)
  • Network eavesdropping
  • Information disclosure

Session Management

  • Session hijacking
  • Session replay
  • Man-in-the-middle attacks

Cryptography

  • Loss of decryption keys
  • Encryption cracking

Exception Management

  • Revealing sensitive system or application details
  • Denial of service attacks

Auditing and Logging

  • User denies performing an operation
  • Attacker exploits an application without trace
  • Attacker covers his tracks

Countermeasures Organized by Security Categories

Now here is where the rubber really meets the road.  We could group security countermeasures by security categories to make them more actionable.

Here are example security countermeasures organized by security categories:

Input/Data Validation

  • Do not trust input
  • Validate input: length, range, format, and type
  • Constrain, reject, and sanitize input
  • Encode output

Authentication

  • Use strong password policies
  • Do not store credentials
  • Use authentication mechanisms that do not require clear text credentials to be passed over the network
  • Encrypt communication channels to secure authentication tokens
  • Use HTTPS only with forms authentication cookies
  • Separate anonymous from authenticated pages

Authorization

  • Use least privilege accounts
  • Consider granularity of access
  • Enforce separation of privileges
  • Use multiple gatekeepers
  • Secure system resources against system identities

Configuration Management

  • Use least privileged service accounts
  • Do not store credentials in clear text
  • Use strong authentication and authorization on administrative interfaces
  • Do not use the Local Security Authority (LSA)
  • Avoid storing sensitive information in the Web space
  • Use only local administration

Sensitive Data

  • Do not store secrets in software
  • Encrypt sensitive data over the network
  • Secure the channel

Session Management

  • Partition site by anonymous, identified, and authenticated users
  • Reduce session timeouts
  • Avoid storing sensitive data in session stores
  • Secure the channel to the session store
  • Authenticate and authorize access to the session store

Cryptography

  • Do not develop and use proprietary algorithms (XOR is not encryption. Use platform-provided cryptography)
  • Use the RNGCryptoServiceProvider method to generate random numbers (see the sketch after this list)
  • Avoid key management. Use the Windows Data Protection API (DPAPI) where appropriate
  • Periodically change your keys
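
The countermeasures above are framed in .NET terms (RNGCryptoServiceProvider, DPAPI). As a hedged illustration of the same “use platform-provided cryptography” guidance, here is the Java analogue using the platform’s SecureRandom; the token length is arbitrary:

  import java.security.SecureRandom;
  import java.util.Base64;

  public class TokenGenerator {

      // Platform-provided, cryptographically strong RNG -- the Java analogue
      // of .NET's RNGCryptoServiceProvider. Don't use java.util.Random for secrets.
      private static final SecureRandom RNG = new SecureRandom();

      // Generates a random, URL-safe token of the given byte length.
      public static String newToken(int numBytes) {
          byte[] bytes = new byte[numBytes];
          RNG.nextBytes(bytes);
          return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
      }

      public static void main(String[] args) {
          System.out.println(newToken(32));   // e.g. a session identifier seed
      }
  }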

Exception Management

  • Use structured exception handling (by using try/catch blocks; see the sketch after this list)
  • Catch and wrap exceptions only if the operation adds value/information
  • Do not reveal sensitive system or application information
  • Do not log private data such as passwords
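
Here is a minimal Java sketch of the first two countermeasures; the service, method, and exception names are hypothetical:

  import java.sql.SQLException;

  public class OrderService {

      // Hypothetical domain exception that adds context without leaking internals.
      public static class OrderLookupException extends RuntimeException {
          public OrderLookupException(String message, Throwable cause) {
              super(message, cause);
          }
      }

      public String loadOrder(String id) {
          try {
              return queryDatabase(id);
          } catch (SQLException e) {
              // Wrap with context that aids diagnosis, but never echo SQL,
              // connection strings, or other sensitive details to the caller.
              throw new OrderLookupException("Unable to load order " + id, e);
          }
      }

      private String queryDatabase(String id) throws SQLException {
          throw new SQLException("simulated failure");   // stand-in for real data access
      }

      public static void main(String[] args) {
          try {
              new OrderService().loadOrder("42");
          } catch (OrderLookupException e) {
              System.out.println(e.getMessage());        // friendly message only
          }
      }
  }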

Auditing and Logging

  • Identify malicious behavior
  • Know your baseline (know what good traffic looks like)
  • Use application instrumentation to expose behavior that can be monitored

As you can see, the security countermeasures can easily be reviewed, updated, and moved forward, because the actionable principles are well organized by the security categories.

There are many ways to use category creation to get smarter and get better results.

In the future, I’ll walk through how we created an Agile Security approach, using categories.

Meanwhile, check out my post on The More Distinctions You Make, the Smarter You Get to gain some additional insights into how to use empathy and creating categories to dive deeper, learn faster, and get smarter on any topic you want to take on.

Categories: Architecture, Programming

We Help Our Customers Transform

"Innovation—the heart of the knowledge economy—is fundamentally social." -- Malcolm Gladwell

I’m a big believer in having clarity around what you help your customers do.

I was listening to Satya Nadella’s keynote at the Microsoft Worldwide Partner Conference, and I like how he put it so simply, that we help our customers transform.

Here’s what Satya had to say about how we help our customers transform their business:

“These may seem like technical attributes, but they are key to how we drive business success for our customers, business transformation for our customers, because all of what we do, collectively, is centered on this core goal of ours, which is to help our customers transform.

When you think about any customer of ours, they're being transformed through the power of digital technology, and in particular software.

There isn't a company out there that isn't a software company.

And our goal is to help them differentiate using digital technology.

We want to democratize the use of digital technology to drive core differentiation.

It's no longer just about helping them operate their business.

It is about them excelling at their business using software, using digital technology.

It is about our collective ability to drive agility for our customers.

Because if there is one truth that we are all faced with, and our customers are faced with, it's that things are changing rapidly, and they need to be able to adjust to that.

And so everything we do has to support that goal.

How do they move faster, how do they interpret data quicker, how are they taking advantage of that to take intelligent action.

And of course, cost.

But we'll keep coming back to this theme of business transformation throughout this keynote and throughout WPC, because that's where I want us to center in on.

What's the value we are adding to the core of our customer and their ability to compete, their ability to create innovation.

And anchored on that goal is our technical ambition, is our product ambition.”

Transformation is the name of the game.

You Might Also Like

Satya Nadella is All About Customer Focus

Satya Nadella on a Mobile-First, Cloud-First World

Satya Nadella on Empower Every Person on the Planet

Satya Nadella on Everyone Has To Be a Leader

Satya Nadella on How the Key To Longevity is To Be a Learning Organization

Satya Nadella on Live and Work a Meaningful Life

Satya Nadella on The Future of Software

Categories: Architecture, Programming

Satya Nadella on a Mobile-First, Cloud-First World

You hear Mobile-First, Cloud-First all the time.

But do you ever hear it really explained?

I was listening to Satya Nadella’s keynote at the Microsoft Worldwide Partner Conference, and I like how he walked through how he thinks about a Mobile-First, Cloud-First world.

Here’s what Satya had to say:

“There are a couple of attributes.

When we talk about Mobile-First, we are talking about the mobility of the experience.

What do we mean by that?

As we look out, the computing that we are going to interface with, in our lives, at home and at work, is going to be ubiquitous.

We are going to have sensors that recognize us.

We are going to have computers that we are going to wear on us.

We are going to have computers that we touch, computers that we talk to, the computers that we interact with as holograms.

There is going to be computing everywhere.

But what we need across all of this computing, is our experiences, our applications, our data.

And what enables that is in fact the cloud acting as a control plane that allows us to have that capability to move from device to device, on any given day, at any given meeting.

So that core attribute of thinking of mobility, not by being bound to a particular device, but it's about human mobility, is very core to our vision.

Second, when we think about our cloud, we think distributed computing will remain distributed.

In fact, we think of our servers as the edge of our cloud.

And this is important, because there are going to be many legitimate reasons where people will want digital sovereignty, people will want data residency, there is going to be regulation that we can't anticipate today.

And so we have to think about a distributed cloud infrastructure.

We are definitely going to be one of the key hyper-scale providers.

But we are also going to think about how do we get computing infrastructure, the core compute, storage, network, to be distributed throughout the world.

These may seem like technical attributes, but they are key to how we drive business success for our customers, business transformation for our customers, because all of what we do, collectively, is centered on this core goal of ours, which is to help our customers transform.”

That’s a lot of insight, and very well framed for creating our future and empowering the world.

You Might Also Like

Microsoft Explained: Making Sense of the Microsoft Platform Story

Satya Nadella is All About Customer Focus

Satya Nadella on Empower Every Person on the Planet

Satya Nadella on Everyone Has To Be a Leader

Satya Nadella on How the Key To Longevity is To Be a Learning Organization

Satya Nadella on Live and Work a Meaningful Life

Satya Nadella on The Future of Software

Categories: Architecture, Programming

Empower Every Person on the Planet to Achieve More

It’s great to get back to the basics, and purpose is always a powerful starting point.

I was listening to Satya Nadella’s keynote at the Microsoft Worldwide Partner Conference, and I like how he walked through the Microsoft mission in a mobile-first, cloud-first world.

Here’s what Satya had to say:

“Our mission:  Empowering every person and every business on the planet to achieve more.

(We find that by going back into our history and re-discovering that core sense of purpose, that soul ... a PC in every home, democratizing client/server computing.)

We move forward to a Mobile-First, Cloud-First world.

We care about empowerment.

There is no other ecosystem that is primarily, and solely, built to help customers achieve greatness.

We are focused on helping our customers achieve greatness through digital technology.

We care about both individuals and organizations.  That intersection of people and organizations is the cornerstone of what we represent as excellence.

We are a global company.  We want to make sure that the power of technology reaches every country, every vertical, every organization, irrespective of size.

There will be many goals.

What remains constant is this sense of purpose, the reason why this ecosystem exists.

This is a mission that we go and exercise in a Mobile-First, Cloud-First world.”

If I think back to why I originally joined Microsoft, it was to empower every person on the planet to achieve more.

And the cloud is one powerful enabler.

You Might Also Like

Satya Nadella is All About Customer Focus

Satya Nadella on Everyone Has To Be a Leader

Satya Nadella on How the Key To Longevity is To Be a Learning Organization

Satya Nadella on Live and Work a Meaningful Life

Satya Nadella on The Future of Software

Categories: Architecture, Programming

R: Blog post frequency anomaly detection

Mark Needham - Sat, 07/18/2015 - 00:34

I came across Twitter’s anomaly detection library last year but hadn’t yet had a reason to take it for a test run, so having got my blog post frequency data into shape I thought it’d be fun to run it through the algorithm.

I wanted to see if it would detect any periods of time when the number of posts differed significantly – I don’t really have an action I’m going to take based on the results, it’s curiosity more than anything else!

First we need to get the library installed. It’s not on CRAN so we need to use devtools to install it from the github repository:

install.packages("devtools")
devtools::install_github("twitter/AnomalyDetection")
library(AnomalyDetection)

The expected data format is two columns – one containing a time stamp and the other a count, e.g. using the ‘raw_data’ data frame that is in scope when you load the library:

> library(dplyr)
> raw_data %>% head()
            timestamp   count
1 1980-09-25 14:01:00 182.478
2 1980-09-25 14:02:00 176.231
3 1980-09-25 14:03:00 183.917
4 1980-09-25 14:04:00 177.798
5 1980-09-25 14:05:00 165.469
6 1980-09-25 14:06:00 181.878

In our case the timestamps will be the start date of a week and the count the number of posts in that week. But first let’s get some practice calling the anomaly function using the canned data:

res = AnomalyDetectionTs(raw_data, max_anoms=0.02, direction='both', plot=TRUE)
res$plot

[Plot: AnomalyDetectionTs output for the sample raw_data, with anomalies highlighted]

From this visualisation we learn that we should expect both high and low outliers to be identified. Let’s give it a try with the blog post publication data.

We need to get the data into shape so we’ll start by getting a count of the number of blog posts by (week, year) pair:

> df %>% sample_n(5)
                                                           title                date
1425                            Coding: Copy/Paste then refactor 2009-10-31 07:54:31
783  Neo4j 2.0.0-M06 -> 2.0.0-RC1: Working with path expressions 2013-11-23 10:30:41
960                                        R: Removing for loops 2015-04-18 23:53:20
966   R: dplyr - Error in (list: invalid subscript type 'double' 2015-04-27 22:34:43
343                     Parsing XML from the unix terminal/shell 2011-09-03 23:42:11
 
> byWeek = df %>% 
    mutate(year = year(date), week = week(date)) %>% 
    group_by(week, year) %>% summarise(n = n()) %>% 
    ungroup() %>% arrange(desc(n))
 
> byWeek %>% sample_n(5)
Source: local data frame [5 x 3]
 
  week year n
1   44 2009 6
2   37 2011 4
3   39 2012 3
4    7 2013 4
5    6 2010 6

Great. The next step is to translate this data frame into one containing a date representing the start of that week and the number of posts:
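The calculate_start_of_week helper comes from the earlier post on getting the data into shape. As a rough sketch of what it could look like (my assumption, using lubridate’s convention that week 1 starts on January 1st):

library(lubridate)

# Hypothetical sketch of the helper: January 1st of year y,
# plus (w - 1) whole weeks, matching lubridate's week() numbering
calculate_start_of_week = function(w, y) {
  ymd(paste(y, 1, 1, sep = "-")) + weeks(w - 1)
}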

> data = byWeek %>% 
    mutate(start_of_week = calculate_start_of_week(week, year)) %>%
    filter(start_of_week > ymd("2008-07-01")) %>%
    select(start_of_week, n)
 
> data %>% sample_n(5)
Source: local data frame [5 x 2]
 
  start_of_week n
1    2010-09-10 4
2    2013-04-09 4
3    2010-04-30 6
4    2012-03-11 3
5    2014-12-03 3

We’re now ready to plug it into the anomaly detection function:

res = AnomalyDetectionTs(data, 
                         max_anoms=0.02, 
                         direction='both', 
                         plot=TRUE)
res$plot

[Plot: anomalies in weekly blog post counts – only high outliers flagged]

Interestingly I don’t seem to have any low end anomalies – there were a couple of really high frequency weeks when I first started writing and I think one of the other weeks contains a New Year’s Eve when I was particularly bored!

If we group by month instead, only the very first month stands out as an outlier:
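The byMonth data frame isn’t built in this post; a minimal sketch of how it could be derived, mirroring the byWeek pipeline above (month() is lubridate’s):

# Sketch: monthly post counts, built the same way as byWeek
byMonth = df %>%
  mutate(year = year(date), month = month(date)) %>%
  group_by(month, year) %>%
  summarise(n = n()) %>%
  ungroup()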

data = byMonth %>% 
  mutate(start_of_month = ymd(paste(year, month, 1, sep="-"))) %>%
  filter(start_of_month > ymd("2008-07-01")) %>%
  select(start_of_month, n)
res = AnomalyDetectionTs(data, 
                         max_anoms=0.02, 
                         direction='both',       
                         #longterm = TRUE,
                         plot=TRUE)
res$plot

[Plot: anomalies in monthly blog post counts – only the first month flagged]

I’m not sure what else to do as far as anomaly detection goes but if you have any ideas please let me know!

Categories: Programming

Estimating and Making Decisions in Presence of Uncertainty

Herding Cats - Glen Alleman - Fri, 07/17/2015 - 18:03

There is a nice post from Trent Hone on No Estimates. It triggered some more ideas about why we estimate, the root cause of the problem #NoEstimates is trying to solve, and a summary of the problem.

A Few Comments

All project work is probabilistic, driven by the underlying statistical uncertainties. These uncertainties are of two types - reducible and irreducible. Reducible uncertainty is driven by the lack of information. This information can be increased with direct work - we can "buy down" the uncertainty with testing, alternative designs, and redundancy. Reducible uncertainty is "event based": your power outage, for example, or D-Day being pushed one day by weather.

Irreducible uncertainty is just "part of the environment." It's the natural variability embedded in all project work - the "vibrations" of all the variables. This is handled by margin: schedule margin, cost margin, technical margin.
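As an illustration (mine, not from the original post, with assumed numbers), here's a small R sketch of sizing schedule margin from irreducible variability - simulate the natural "vibrations" of the task durations and set margin as the distance from the point estimate to a chosen confidence level:

# Illustrative sketch only - all numbers are assumed.
# Five tasks, each with a most likely duration of 10 days and
# irreducible variability modeled as lognormal noise.
set.seed(42)
totals = replicate(10000, sum(rlnorm(5, meanlog = log(10), sdlog = 0.25)))

point_estimate = 5 * 10              # the deterministic "point" schedule
p80 = quantile(totals, 0.80)         # completion duration at 80% confidence
margin = p80 - point_estimate        # schedule margin protecting the date
cat("P80:", round(p80, 1), "days; margin:", round(margin, 1), "days\n")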

Here's an approach to "managing in the presence of uncertainty"

From my experience in Software Intensive Systems in a variety of domains (ERP, real-time embedded systems, defense, space, nuclear power, pulp and paper, new drug development, heavy manufacturing, and more), #NE is a reaction to Bad Management. This inverts the Cause and Effect model of Root Cause Analysis. The conjecture that "estimates are the smell of dysfunction" - without stating the dysfunction, the corrective action for that dysfunction, applying that corrective action, and then reassessing the conjecture - is a hollow statement. So the entire notion of #NE is a house built on sand.

Lastly, the microeconomics of decision making in SWDev in the presence of uncertainty means estimating is needed to "decide" between alternatives - opportunity costs. This paradigm is the basis of any non-trivial business governance process.
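A toy R sketch (my illustration, with assumed numbers) of that opportunity cost trade: estimate the uncertain net value of each alternative, then choose - the expected value of the alternative not chosen is the opportunity cost:

# Illustrative sketch only - all numbers are assumed.
set.seed(7)
n = 10000
# Alternative A: higher upside, wider spread; B: lower upside, tighter spread
net_a = rnorm(n, mean = 200, sd = 120)
net_b = rnorm(n, mean = 180, sd = 40)

mean(net_a)       # expected net value of A
mean(net_b)       # expected net value of B
mean(net_a < 0)   # chance A actually loses money
# Choosing A forgoes B's expected value - that is the opportunity cost
opportunity_cost_of_a = mean(net_b)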

No Estimates is a solution looking for a problem to solve.

Categories: Project Management

Stuff The Internet Says On Scalability For July 17th, 2015

Hey, it's HighScalability time:


In case you were wondering, the world is weird. Large Hadron Collider discovers new pentaquark particle.


  • 3x: Uber bigger than taxi market; 250x: traffic in HotSchedules' DDoS attack; 92%: Apple’s share of the smartphone profit pie; 7: Airbnb rejections
  • Quotable Quotes:
    • Netflix: A slow or unhealthy server is worse than a down server 
    • @inconshreveable: ngrok production servers, max GC pause: Go1.4 (top) vs Go1.5. Holy 85% reduction! /cc Go team
    • Nic Fleming: The fungal internet exemplifies one of the great lessons of ecology: seemingly separate organisms are often connected, and may depend on each other.
    • @IBMResearch: With 20+ billion transistors on new chip, that's a 50% scaling improvement over today’s tech #ibmresearch #7nm 

  • Apple and Google Race to See Who Can Kill the App First. Honest question, how are people supposed to make money in this new world? Apps are quickly becoming just an identity that ties together 10 or so components that appear integrated as part of the OS, but don't look like your app at all. Reminds me of laminar flow. We are seeing a rebirth of CORBA, COM and OLE 2, this time the container is an app bound by deep linking and some ad hoc ways to push messages around. Show developers the money.

  • The dark side of Google 10x: One former exec told Business Insider that the gospel of 10x, which is promoted by top execs including CEO Larry Page, has two sides. “It’s enormously energizing on one side, but on the other it can be totally paralyzing,”

  • Wait, are we going all RAM or all flash? So confusing. MIT Develops Cheaper Supercomputer Clusters By Nixing Costly RAM In Favor Of Flash: researchers presented evidence at the International Symposium on Computer Architecture that if servers executing a distributed computation go to disk for data even just 5 percent of the time, performance takes a hit to where it's comparable with flash memory anyway. 40 servers with 10 terabytes of RAM wouldn't chew through a 10.5TB computation any better than 20 servers with 20TB of flash memory. What's involved here is moving a little computational power off of the servers and onto the chips that control the flash drives.

  • Is disruption merely a Silicon Valley fantasy? Corporate America Hasn’t Been Disrupted: the advantage enjoyed by incumbents, always substantial, has been growing in recent years...more Americans worked for big companies...Large companies are becoming more dominant in part by buying up their rivals...Consolidation could explain at least part of the rising failure rate among startups...The startup rate has declined in every major industry, every state and nearly every city, and the failure rate’s rise has been nearly as universal. 

What's a unikernel and why should you care? Amir Chaudhry reveals all in his Unikernels talk given at PolyConf 15. And here's the supporting blog post. Why are we still applications on top of operating systems? Most applications are single purpose so why all the complexity? Why are we building software for the cloud the same way we build it for desktops? We can do better with unikernels, where every application is a single purpose VM with a single address space.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Estimating Processes in Support of Economic Analysis

Herding Cats - Glen Alleman - Fri, 07/17/2015 - 03:34

On any project with significant Value at Risk, Economic Analysis provides visibility to the data needed for decision making. This Value at Risk paradigm is a critical starting point for applying all processes of decision making. The choice of decision process must be matched to the opportunity cost (actually the value of the loss for the alternative not chosen).

[Diagram: the eight-step economic analysis process described below]

  1. Objective - what capabilities need to be produced by this project that the customer will value (in some useful units of measure)? These objectives can be easily described by the Capabilities of the Outcomes. Features, stories, and requirements are of little use to the customer if they do not directly enable a capability to accomplish the business mission or vision. The customer bought the capability, not the feature.
  2. Assumptions and Constraints - there are always assumptions. These are conditions in place that impact the project. 
  3. Alternatives - there is always more than one way to do something. What are the costs for each alternative?
  4. Benefits - what are the measurable benefits for this work? It can be monetary. It can be some intangible benefit.
  5. Costs - what will it cost to produce the value to be delivered to the customer? Along with this cost, what resources are needed? What schedule are these resources available?
  6. Rank Alternatives - with this information we can rank the alternatives in some objective manner. These measures can be assessments of effectiveness or measures of performance.
  7. Sensitivity and Risk Analysis - tradeoffs are always probabilistic in nature, since all project work is probabilistic in nature. Rarely, if ever, are these single-valued, non-varying numbers; that is the case only when the work is complete and no more activities are being performed. Those actual values are useful, but they can be used for making future decisions only if their past performance statistical behaviors are collected. This is the Flaw of Averages problem: no average has value without knowing the variance (see the sketch after this list).
  8. Make a decision - with all this information we can now make decisions. Of course the information about the past can be used, and of course there is information about the future. Both are probabilistic in nature.
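Here is the sketch referenced in step 7 (my illustration, with assumed numbers): two plans with the same average duration but different variances demand very different margin - the average alone hides the risk.

# Illustrative sketch of the Flaw of Averages - numbers are assumed.
set.seed(1)
n = 10000
low_var  = replicate(n, sum(rnorm(10, mean = 5, sd = 0.5)))
high_var = replicate(n, sum(rnorm(10, mean = 5, sd = 3.0)))

mean(low_var); mean(high_var)   # both plans average ~50 days
quantile(low_var, 0.80)         # ~51 days at 80% confidence
quantile(high_var, 0.80)        # ~58 days at 80% confidence
# Same average, very different margin needed to commit with confidence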

With these probabilistic outcomes driven by the underlying statistical process of all project work, we need to be able to estimate all the values of the random variables and their impact on the processes above.   

Next is an example of applying this probabilistic decision making in the presence of uncertainty for cost and schedule assessment. The same can be done for other probabilistic variables on the project: Technical Performance Measures, Measures of Effectiveness, Measures of Performance, Key Performance Parameters, and many other ...ilities (maintainability, supportability, survivability, etc.).

[Chart: probabilistic cost and schedule assessment example]

Related articles

What Happened to our Basic Math Skills?

Information Technology Estimating Quality

Making Decisions In The Presence of Uncertainty

What's the Smell of Dysfunction?
Categories: Project Management

Blisters: Thoughts on Change


Change is the currency of every organization involved in developing, enhancing or maintaining software. Change includes the business process being automated, the techniques used to do work or even the technology upon which the software is built. Very little significant change is frictionless. For change to occur, the need and the benefit to be gained from solving the problem must overcome inertia: the work needed to find a solution and implement the solution.

I recently purchased a new pair of shoes, an act that I had put off for a bit longer than I should have. The shoes I had owned for three years, but had been wearing nearly every day for the last year, were a pair of Italian loafers. The leather was exquisite, and over the three years I had owned the shoes they had become very old but very comfortable friends. Unfortunately the soles had worn out, and because of how the shoes were constructed, they were not repairable. As a rule, consultants wearing worn-out shoes, however treasured, generally do not confer confidence. The hole in the sole and a need to earn a living established the need for me to change. The final impetus to overcome inertia was delivered when I found an important meeting on my schedule for the next week. Why was there any inertia? Unlike my wife, I can’t order 10 pairs of shoes online and then return seven after trying them on, for a few reasons. First, my need was immediate: the worn-out soles were obvious to anyone sitting near me. Second, I am fairly hard to fit when it comes to dress shoes. Shopping (something I find as enjoyable as a prostate exam) is an event to be avoided. Deciding to change/shop requires an investment in effort and time. Almost every significant organizational change requires the same upfront investment in time, effort and rationalization to break the day-to-day inertia needed to begin to pursue change.

Breaking through the barrier of inertia by establishing a need and weighing the benefit to be gained by fulfilling that need is only the first step along the path of change. All leaders know that change requires spending blood, sweat and tears to find the correct solution. A team that has decided to change how they are working might have to sort through and evaluate Agile, lean or even classic software development techniques before finding the solution that fits their needs and culture. The process is not terribly different from my shopping for shoes. The shoe story continues with a trip to the local mall with two “major” department stores. Once at the mall I began the process of evaluating options. The process included an hour that I will never get back in one store, being told that there were no size 10.5 shoes in black in stock that would be suitable for an office, and then being offered a pair of 11’s that I could lace up myself to try on. That last offer caused me to immediately go to another store where I bought a pair (my normal brand, in stylish black). Just like the team finding and deciding on a new development framework, I had to evaluate alternatives, try them on (a sort of prototype) and then negotiate the sale. Change is not frictionless.

Once an organization decides to change and settles on how they will meet their need, implementation remains. Regardless of all the groundwork done up to this point, further effort and sometimes pain are required to implement the change. Teams embracing Agile, kanban or even waterfall will need to learn new concepts, practice those techniques and understand that mistakes will be made. Looping back to the shoe story, I am now suffering through a blister. Organizational process change might not generate physical pain like new shoes; however, the stress of the change has to be accounted for when determining whether the cost of change is less than the gains foreseen from addressing the need.
In the end, change is unavoidable whether we are discussing new shoes or process improvements. The question is rarely whether we will change, but rather when we will change and how big a need we have to generate to offset the effort and aggravation that any change requires.

Now for something completely different!

I need your help! I am proposing a talk at AgileDC (Washington DC, October 26th). The title is

Budgeting, Estimation, Planning, #NoEstimates and the Agile Planning Onion – They ALL make sense!

Can you go to the AgileDC site and like the presentation (click the heart next to the title)? The likes will help get the attention of the organizers! I would also like your feedback on the topic.


Categories: Process Management