
Re-read Saturday: Consolidating Gains and Producing More Change, John P. Kotter Chapter 9


My local football (US variety) team seems to have the ability to lead throughout the game, only to wear out, lose focus or generally find a way to lose with very little time left on the clock. A significant organizational change, like pursuing a goal or winning a game, is not over until the change is complete and has become part of the culture. Stage 7 of Kotter's 8-stage process for creating major change is consolidating gains and producing more change. Just like my local football team, you must make sure you hold on to your gains and use them to build toward the next step forward. Kotter addresses two major topics in this stage: resistance and interdependence.

As time goes by and you experience short-term successes, it is easy to begin to lose the urgency that helped power a change program. Until a change is written into an organization's culture, resistance will always be lurking, and loss of urgency can lead to programs stalling. For example, I recently discussed a change program with colleagues that had been designed to transform an organization using Agile techniques including Scrum, continuous delivery and test-driven development. Scrum was implemented as the first step and generated significant benefits. As the initial benefits were recognized, a number of leaders began to argue that 80% of the benefit had been generated and that the rest of the changes would be difficult. The argument led to a loss of urgency and momentum, and a reduction in funding, as attention wandered to another program with less resistance. Urgency and constancy of purpose must be continually maintained or resistance can lead to regression.

Significant organizational change typically requires changes to many different groups and processes to be effective. The larger the intended change, the larger the number of moving parts and interactions that will be involved in making it. As most change programs progress, they evolve. Evolution is typically generated by feedback from the short-term wins and other sources within the environment. Changes help to identify new interactions and dependencies, which add complexity and increase the level of effort. Kotter uses the example of the difference between rearranging an office in which all of the furniture is connected with rubber bands and one in which it is not. The one in which the furniture is connected will require significantly more planning and effort, because each item pulls against the others as changes are made. As changes are identified, the program will potentially need to add new people and resources, or new subprojects may need to be established. Senior management needs to provide a sense of urgency for the change program and a vision of where the program is going. At the same time, the complexity of any significant change program requires tactical leadership and management. Change programs require both strategic vision and tactical management for effective delivery. The combination of interactions and dependencies causes complexity that requires focus and constancy of purpose by senior, middle and line management to facilitate change.

While any project or program evolves as new information and knowledge is discovered, we need to continually challenge the validity of each change, because change causes complexity. The higher the complexity of a program, the less likely it is to complete, at least effectively. One of the principles noted in the Agile Manifesto is that “simplicity – the art of maximizing the amount of work not done – is essential.” Each part of the change program, and especially any changes or additions that are discovered as progress is made, must be evaluated to ensure that only what is required to deliver the vision is addressed. Remember the adage: keep it simple, stupid. The tool to manage change is the guiding coalition. Use the guiding coalition to accept changes and to prioritize the change program's backlog (sounds like a product owner).

Kotter summarizes stage 7 this way:

  • More change, not less – The program must build on the credibility and the feedback of the short-term wins.
  • More help – As inter-dependencies are identified, bring new people and resources into the program with the needed experience and knowledge.
  • Continued leadership – Senior management must have constancy of purpose. They need to continually provide and maintain a clear vision.
  • Project management and leadership from below – The individual projects and initiatives require tactical leadership and management to implement the vision of senior management.
  • Reduction of unnecessary inter-dependencies – Keep the change program as simple as it needs to be.

All large projects, whether they are significant organizational change programs or not, take time and evolve. At some point, as change programs progress and generate benefits, it will become tempting to declare victory. As Yogi Berra put it, "It ain't over till it's over." A change program is not complete until it attains the vision for the program and has been integrated into the organization's culture.


Categories: Process Management

Planning is the Basis of Decision Making in the Presence of Uncertainty

Herding Cats - Glen Alleman - Sat, 01/10/2015 - 23:46

Another twitter conversation - of sorts - prompted a thought about the purpose of planning, from the Quote of the Day post. In the enterprise IT world, planning and plans provide the road map for development and deployment of systems that implement the strategy of the business.

Planning

It is the process of planning that reveals the needed sequence of delivered capabilities. In the picture below, a health insurance provider's network application is deployed to replace several legacy systems over the course of time. Specific capabilities must be in place before later capabilities can be deployed.

This is a Plan for the development and deployment of those capabilities. From the plan, technical and operational requirements are developed; then software is developed, tested, IV&V'd, confirmed to be compliant with security and regulatory guidelines, business users trained, and (medical) providers trained.

[Figure: a plan showing the sequenced development and deployment of the application's capabilities]

The sequence of these capabilities is well defined from the business operations side. The development of the software inside each capability-providing activity has some flexibility, but if we're going to arrive on the planned need date for, say, Data Store Lookup, that flexibility is not arbitrary. Further levels of planning are needed for resource availability: people, machines, services, facilities, etc.
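
To make the sequencing idea concrete, here is a minimal sketch - in Python, with invented capability names and dependencies rather than the ones in the plan above - of how a feasible deployment order falls out of the dependencies between capabilities. Capabilities within the same batch have some flexibility in ordering; the batches themselves do not.

def deployment_order(prerequisites):
    # Kahn's algorithm: deploy a capability only after its prerequisites
    remaining = {cap: set(pre) for cap, pre in prerequisites.items()}
    order = []
    while remaining:
        # capabilities whose prerequisites are all deployed; order within
        # this batch is flexible, order across batches is not
        ready = sorted(cap for cap, pre in remaining.items() if not pre)
        if not ready:
            raise ValueError("circular dependency - no feasible sequence")
        order.extend(ready)
        for cap in ready:
            del remaining[cap]
        for pre in remaining.values():
            pre.difference_update(ready)
    return order

# invented example dependencies
prerequisites = {
    "Provider Directory": [],
    "Member Eligibility": ["Provider Directory"],
    "Data Store Lookup": ["Provider Directory"],
    "Claims Processing": ["Member Eligibility", "Data Store Lookup"],
}

print(deployment_order(prerequisites))
# ['Provider Directory', 'Data Store Lookup', 'Member Eligibility', 'Claims Processing']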

Planning is Planning for Capabilities to Be Available to Deliver Value

The planning process is driven by Capabilities Based Planning. This process has some simple straightforward steps.

[Figure: the steps of the Capabilities Based Planning process]

So when we hear "deliver value on day one," we need to ask in what domain that is even possible. We need to deliver value on the day the value is needed by the business. Having a capability ready before the business is ready to use it is poor resource utilization planning.

We'll have spent money and time on something the business can't use yet. We might have made better use of that time and money on another capability. Which capabilities come in what order, and at what time the business is capable of absorbing them, is the first and primary role of PLANNING.

So the conjecture that plans are almost certainly wrong begs the question - do you know what you're doing? If your plans are almost certainly wrong - at least in the Enterprise IT business - you've got the wrong people managing the project. And you almost certainly are wasting money developing things that won't be used.

This doesn't mean we don't explore, try different things to see if they'll work. But that work is also planned. It's not our money.

Domain is King

If you've got a project that is just you, or maybe you and a few close friends, and that's going to take a week or maybe 6 weeks, planning in this manner is likely a waste - along with estimating your work, keeping track of progress to plan, and even counting the money.

But if you're on a $200M enterprise IT development, integration, and deployment project with ~100 developers, testers, QA people, security, compliance, server ops, DBAs, etc., you'd better have some notion of the order of work, the order of value delivery, the cost of that work, the probability of showing up on time, on budget, with that value in hand, and how you're going to herd all the cats that surround the project.

Categories: Project Management

Diamond Kata - TDD with only Property-Based Tests

Mistaeks I Hav Made - Nat Pryce - Sat, 01/10/2015 - 21:44
The Diamond Kata is a simple exercise that Seb Rose described in a recent blog post. Seb describes the Diamond Kata as: given a letter, print a diamond starting with ‘A’ with the supplied letter at the widest point. For example, print-diamond ‘C’ prints:

  A
 B B
C   C
 B B
  A

Seb used the exercise to illustrate how he “recycles” tests to help him work incrementally towards a full solution. Seb's approach prompted Alastair Cockburn to write an article in response in which he argued for more thinking before programming. Alastair's article shows how he approached the Diamond Kata with more up-front analysis. Ron Jeffries and George Dinwiddie responded to Alastair's article, showing how they approached the Diamond Kata relying on emergent design to produce an elegant solution (“thinking all the time”, as Ron Jeffries put it). There was some discussion on Twitter, and several other people published their approaches. (I'll list as many as I know about at the end of this article.)

The discussion sparked my interest, so I decided to have a go at the exercise myself. The problem seemed to me, at first glance, to be a good fit for property testing. So I decided to test-drive a solution using only property-based tests and see what happens. I wrote the solution in Scala and used ScalaTest to run and organise the tests and ScalaCheck for property testing.

What follows is an unexpurgated, warts-and-all walkthrough of my progress, not just the eventual complete solution. I made wrong turns and stupid mistakes along the way. The walkthrough is pretty long, so if you don't want to follow through step by step, jump straight to the complete solution and/or my conclusions on how the exercise went and what I learned. Alternatively, if you want to follow the walkthrough in more detail, the entire history is on GitHub, with a commit per TDD step (add a failing test, commit, make the implementation pass the test, commit, refactor, commit, … and repeat).

Walkthrough

Getting Started: Testing the Test Runner

The first thing I like to do when starting a new project is make sure my development environment and test runner are set up right, that I can run tests, and that test failures are detected and reported. I use Gradle to bootstrap a new Scala project with dependencies on the latest versions of ScalaTest and ScalaCheck and import the Gradle project into IntelliJ IDEA.

ScalaTest supports several different styles of test and assertion syntax. The user guide recommends writing an abstract base class that combines traits and annotations for your preferred testing style and test runner, so that's what I do first:

@RunWith(classOf[JUnitRunner])
abstract class UnitSpec extends FreeSpec with PropertyChecks {
}

My test class extends UnitSpec:

class DiamondSpec extends UnitSpec {
}

I add a test that explicitly fails, to check that the test framework, IDE and build hang together correctly. When I see the test failure, I'm ready to write the first real test.

The First Test

Given that I'm writing property tests, I have to start with a simple property of the diamond function, not a simple example. The simplest property I can think of is: for all valid input characters, the diamond contains one or more lines of text. To turn that into a property test, I must define “all valid input characters” as a generator. The description of the Diamond Kata defines valid input as a single upper case character.
ScalaCheck has a predefined generator for that:

val inputChar = Gen.alphaUpperChar

At this point, I haven't decided how I will represent the diamond. I do know that my test will assert on the number of lines of text, so I write the property with respect to an auxiliary function, diamondLines(c: Char): Vector[String], which will generate a diamond for input character c and return the lines of the diamond in a vector.

"produces some lines" in {
  forAll (inputChar) { c =>
    assert(diamondLines(c).nonEmpty)
  }
}

I like the way that the test reads in ScalaTest/ScalaCheck. It is pretty much a direct translation of my English description of the property into code. To make the test fail, I write diamondLines as:

def diamondLines(c: Char): Vector[String] = {
  Vector()
}

The entire test class is:

import org.scalacheck._

class DiamondSpec extends UnitSpec {
  val inputChar = Gen.alphaUpperChar

  "produces some lines" in {
    forAll (inputChar) { c =>
      assert(diamondLines(c).nonEmpty)
    }
  }

  def diamondLines(c: Char): Vector[String] = {
    Vector()
  }
}

The simplest implementation that will make that property pass is to return a single string:

object Diamond {
  def diamond(c: Char): String = {
    "A"
  }
}

I make the diamondLines function in the test call the new function and split its result into lines:

def diamondLines(c: Char) = {
  Diamond.diamond(c).lines.toVector
}

The implementation can be used like this:

object DiamondApp extends App {
  import Diamond.diamond

  println(diamond(args.lift(0).getOrElse("Z").charAt(0)))
}

A Second Test, But It Is Not Very Helpful

I now need to add another property, to more tightly constrain the solution. I notice that the diamond always has an odd number of lines, and decide to test that: for all valid input characters, the diamond has an odd number of lines. This implies that the number of lines is greater than zero (because vectors cannot have a negative number of elements and zero is even), so I change the existing test rather than adding another one:

"produces an odd number of lines" in {
  forAll (inputChar) { c =>
    assert(isOdd(diamondLines(c).length))
  }
}

def isOdd(n: Int) = n % 2 == 1

But this new test has a problem: my existing solution already passes it. The diamond function returns a single line, and 1 is an odd number. This choice of property is not helping drive the development forwards.

A Failing Test To Drive Development, But a Silly Mistake

The next simplest property I can think of is the number of lines of the diamond. If ord(c) is the number of letters between ‘A’ and c (zero for A, 1 for B, 2 for C, etc.), then: for all valid input characters, c, the number of lines in a diamond for c is 2*ord(c)+1.

At this point I make a silly mistake. I write my property as:

"number of lines" in {
  forAll (inputChar) { c =>
    assert(diamondLines(c).length == ord(c)+1)
  }
}

def ord(c: Char): Int = c - 'A'

I don't notice the mistake immediately. When I do, I decide to leave it in the code as an experiment, to see if the property tests will detect the error by becoming inconsistent, and how long it will take before they do so. This kind of mistake would easily be caught by an example test. It's a good idea to have a few examples, as well as properties, to act as smoke tests.

I make the test pass with the smallest amount of production code possible. I move the ord function from the test into the production code and use it to return the required number of lines that are all the same.
def diamond(c: Char): String = {
  "A\n" * (ord(c)+1)
}

def ord(c: Char): Int = c - 'A'

Despite sharing the ord function between the test and production code, there's still some duplication. Both the production and test code calculate ord(c)+1. I want to address that before writing the next test.

Refactor: Duplicated Calculation

I replace ord(c)+1 with lineCount(c), which calculates the number of lines generated for an input letter, and inline the ord(c) function, because it's now only used in one place.

object Diamond {
  def diamond(c: Char): String = {
    "A\n" * lineCount(c)
  }

  def lineCount(c: Char): Int = (c - 'A')+1
}

And I use lineCount in the test as well:

"number of lines" in {
  forAll (inputChar) { c =>
    assert(diamondLines(c).length == lineCount(c))
  }
}

On reflection, using the lineCount calculation from production code in the test feels like a mistake.

Squareness

The next property I add is: for all valid input characters, the text containing the diamond is square, where “is square” means the length of each line is equal to the total number of lines. In Scala, this is:

"squareness" in {
  forAll (inputChar) { c =>
    assert(diamondLines(c) forall {_.length == lineCount(c)})
  }
}

I can make the test pass like this:

object Diamond {
  def diamond(c: Char): String = {
    val side: Int = lineCount(c)
    ("A" * side + "\n") * side
  }

  def lineCount(c: Char): Int = (c - 'A')+1
}

Refactor: Rename the lineCount Function

The lineCount function is also being used to calculate the length of each line, so I rename it to squareSide.

object Diamond {
  def diamond(c: Char): String = {
    val side: Int = squareSide(c)
    ("A" * side + "\n") * side
  }

  def squareSide(c: Char): Int = (c - 'A')+1
}

Refactor: Clarify the Tests

I'm now a little dissatisfied with the way the tests read:

"number of lines" in {
  forAll (inputChar) { c =>
    assert(diamondLines(c).length == squareSide(c))
  }
}

"squareness" in {
  forAll (inputChar) { c =>
    assert(diamondLines(c) forall {_.length == squareSide(c)})
  }
}

The “squareness” property does not stand alone. It doesn't communicate that the output is square unless combined with the “number of lines” property. I refactor the test to disentangle the two properties:

"squareness" in {
  forAll (inputChar) { c =>
    val lines = diamondLines(c)
    assert(lines forall {line => line.length == lines.length})
  }
}

"size of square" in {
  forAll (inputChar) { c =>
    assert(diamondLines(c).length == squareSide(c))
  }
}

The Letter on Each Line

The next property I write specifies which characters are printed on each line. The characters of each line should be either a letter that depends on the index of the line, or a space. Because the diamond is vertically symmetrical, I only need to consider the lines from the top to the middle of the diamond. This makes the calculation of the letter for each line much simpler. I make a note to add a property for the vertical symmetry once I have made the implementation pass this test.
"single letter per line" in { forAll (inputChar) { c => val allLines = diamondLines(c) val topHalf = allLines.slice(0, allLines.size/2 + 1) for ((line, index) <- topHalf.zipWithIndex) { val lettersInLine = line.toCharArray.toSet diff Set(' ') val expectedOnlyLetter = ('A' + index).toChar assert(lettersInLine == Set(expectedOnlyLetter), "line " + index + ": \"" + line + "\"") } } } To make this test pass, I change the diamond function to: def diamond(c: Char) : String = { val side: Int = squareSide(c) (for (lc <- 'A' to c) yield lc.toString * side) mkString "\n" } This repeats the correct letter for the top half of the diamond, but the bottom half of the diamond is wrong. This will be fixed by the property for vertical symmetry, which I’ve noted down to write next. Vertical Symmetry The property for vertical symmetry is: For all input character, c, the lines from the top to the middle of the diamond, inclusive, are equal to the reversed lines from the middle to the bottom of the diamond, inclusive. "is vertically symmetrical" in { forAll(inputChar) { c => val allLines = diamondLines(c) val topHalf = allLines.slice(0, allLines.size / 2 + 1) val bottomHalf = allLines.slice(allLines.size / 2, allLines.size) assert(topHalf == bottomHalf.reverse) } } The implementation is: def diamond(c: Char) : String = { val side: Int = squareSide(c) val topHalf = for (lc <- 'A' to c) yield lineFor(side, lc) val bottomHalf = topHalf.slice(0, topHalf.length-1).reverse (topHalf ++ bottomHalf).mkString("\n") } But this fails the “squareness” and “size of square” tests! My properties are now inconsistent. The test suite has detected the erroneous implementation of the squareSide function. The correct implementation of squareSide is: def squareSide(c: Char) : Int = 2*(c - 'A') + 1 With this change, the implementation passes all of the tests. The Position Of The Letter In Each Line Now I add a property that specifies the position and value of the letter in each line, and that all other characters in a line are spaces. Like the previous test, I can rely on symmetry in the output to simplify the arithmetic. This time, because the diamond has horizontal symmetry, I only need specify the position of the letter in the first half of the line. I add a specification for horizontal symmetry, and factor out generic functions to return the first and second half of strings and sequences. 
"is vertically symmetrical" in { forAll (inputChar) { c => val lines = diamondLines(c) assert(firstHalfOf(lines) == secondHalfOf(lines).reverse) } } "is horizontally symmetrical" in { forAll (inputChar) { c => for ((line, index) <- diamondLines(c).zipWithIndex) { assert(firstHalfOf(line) == secondHalfOf(line).reverse, "line " + index + " should be symmetrical") } } } "position of letter in line of spaces" in { forAll (inputChar) { c => for ((line, lineIndex) <- firstHalfOf(diamondLines(c)).zipWithIndex) { val firstHalf = firstHalfOf(line) val expectedLetter = ('A'+lineIndex).toChar val letterIndex = firstHalf.length - (lineIndex + 1) assert (firstHalf(letterIndex) == expectedLetter, firstHalf) assert (firstHalf.count(_==' ') == firstHalf.length-1, "number of spaces in line " + lineIndex + ": " + line) } } } def firstHalfOf[AS, A, That](v: AS)(implicit asSeq: AS => Seq[A], cbf: CanBuildFrom[AS, A, That]) = { v.slice(0, (v.length+1)/2) } def secondHalfOf[AS, A, That](v: AS)(implicit asSeq: AS => Seq[A], cbf: CanBuildFrom[AS, A, That]) = { v.slice(v.length/2, v.length) } The implementation is: object Diamond { def diamond(c: Char) : String = { val side: Int = squareSide(c) val topHalf = for (letter <- 'A' to c) yield lineFor(side, letter) (topHalf ++ topHalf.reverse.tail).mkString("\n") } def lineFor(length: Int, letter: Char): String = { val halfLength = length/2 val letterIndex = halfLength - ord(letter) val halfLine = " "*letterIndex + letter + " "*(halfLength-letterIndex) halfLine ++ halfLine.reverse.tail } def squareSide(c: Char) : Int = 2*ord(c) + 1 def ord(c: Char): Int = c - 'A' } It turns out the ord function, which I inlined into squareSide a while ago, is needed after all. The implementation is now complete. Running the DiamondApp application prints out diamonds. But there’s plenty of scope for refactoring both the production and test code. Refactoring: Delete the “Single Letter Per Line” Property The “position of letter in line of spaces” property makes the “single letter per line” property superflous, so I delete “single letter per line”. Refactoring: Simplify the Diamond Implementation I rename some parameters and simplify the implementation of the diamond function. object Diamond { def diamond(maxLetter: Char) : String = { val topHalf = for (letter <- 'A' to maxLetter) yield lineFor(maxLetter, letter) (topHalf ++ topHalf.reverse.tail).mkString("\n") } def lineFor(maxLetter: Char, letter: Char): String = { val halfLength = ord(maxLetter) val letterIndex = halfLength - ord(letter) val halfLine = " "*letterIndex + letter + " "*(halfLength-letterIndex) halfLine ++ halfLine.reverse.tail } def squareSide(c: Char) : Int = 2*ord(c) + 1 def ord(c: Char): Int = c - 'A' } The implementation no longer uses the squareSide function. It’s only used by the “size of square” property. Refactoring: Inline the squareSide function I inline the squareSide function into the test. "size of square" in { forAll (inputChar) { c => assert(diamondLines(c).length == 2*ord(c) + 1) } } I believe the erroneous calculation would have been easier to notice if I had done this from the start. Refactoring: Common Implementation of Symmetry There’s one last bit of duplication in the implementation. The expressions that create the horizontal and vertical symmetry of the diamond can be replaced with calls to a generic function. 
I'll leave that as an exercise for the reader…

Complete Tests and Implementation

The tests:

import Diamond.ord
import org.scalacheck._

import scala.collection.generic.CanBuildFrom

class DiamondSpec extends UnitSpec {
  val inputChar = Gen.alphaUpperChar

  "squareness" in {
    forAll (inputChar) { c =>
      val lines = diamondLines(c)
      assert(lines forall {line => line.length == lines.length})
    }
  }

  "size of square" in {
    forAll (inputChar) { c =>
      assert(diamondLines(c).length == 2*ord(c) + 1)
    }
  }

  "is vertically symmetrical" in {
    forAll (inputChar) { c =>
      val lines = diamondLines(c)
      assert(firstHalfOf(lines) == secondHalfOf(lines).reverse)
    }
  }

  "is horizontally symmetrical" in {
    forAll (inputChar) { c =>
      for ((line, index) <- diamondLines(c).zipWithIndex) {
        assert(firstHalfOf(line) == secondHalfOf(line).reverse, "line " + index + " should be symmetrical")
      }
    }
  }

  "position of letter in line of spaces" in {
    forAll (inputChar) { c =>
      for ((line, lineIndex) <- firstHalfOf(diamondLines(c)).zipWithIndex) {
        val firstHalf = firstHalfOf(line)
        val expectedLetter = ('A'+lineIndex).toChar
        val letterIndex = firstHalf.length - (lineIndex + 1)

        assert (firstHalf(letterIndex) == expectedLetter, firstHalf)
        assert (firstHalf.count(_==' ') == firstHalf.length-1, "number of spaces in line " + lineIndex + ": " + line)
      }
    }
  }

  def firstHalfOf[AS, A, That](v: AS)(implicit asSeq: AS => Seq[A], cbf: CanBuildFrom[AS, A, That]) = {
    v.slice(0, (v.length+1)/2)
  }

  def secondHalfOf[AS, A, That](v: AS)(implicit asSeq: AS => Seq[A], cbf: CanBuildFrom[AS, A, That]) = {
    v.slice(v.length/2, v.length)
  }

  def diamondLines(c: Char) = {
    Diamond.diamond(c).lines.toVector
  }
}

The implementation:

object Diamond {
  def diamond(maxLetter: Char): String = {
    val topHalf = for (letter <- 'A' to maxLetter) yield lineFor(maxLetter, letter)

    (topHalf ++ topHalf.reverse.tail).mkString("\n")
  }

  def lineFor(maxLetter: Char, letter: Char): String = {
    val halfLength = ord(maxLetter)
    val letterIndex = halfLength - ord(letter)
    val halfLine = " "*letterIndex + letter + " "*(halfLength-letterIndex)

    halfLine ++ halfLine.reverse.tail
  }

  def ord(c: Char): Int = c - 'A'
}

Conclusions

In his article, “Thinking Before Programming”, Alastair Cockburn writes:

The advantage of the Dijkstra-Gries approach is the simplicity of the solutions produced. The advantage of TDD is modern fine-grained incremental development. … Can we combine the two?

I think property-based tests in the TDD process combined the two quite successfully in this exercise. I could record my half-formed thoughts about the problem and solution as generators and properties while using “modern fine-grained incremental development” to tighten up the properties and grow the code that met them.

In Seb's original article, he writes that when working from examples…

it's easy enough to get [the tests for ‘A’ and ‘B’] to pass by hardcoding the result. Then we move on to the letter ‘C’. The code is now screaming for us to refactor it, but to keep all the tests passing most people try to solve the entire problem at once. That's hard, because we'll need to cope with multiple lines, varying indentation, and repeated characters with a varying number of spaces between them.

I didn't encounter this problem when driving the implementation with properties. Adding a new property always required an incremental improvement to the implementation to get the tests passing again. Neither did I need to write throw-away tests for behaviour that was not actually desired of the final implementation, as Seb did with his “test recycling” approach.
Every property I added applied to the complete solution. I only deleted properties that were implied by properties I added later, and so had become unnecessary duplication. I took the approach of starting from very generic properties and incrementally adding more specific properties as I refined the implementation. Generic properties were easy to come up with, and helped me make progress on the problem. The suite of properties reinforced one another, testing the tests, and detected the mistake I made in one property that caused it to be inconsistent with the rest.

I didn't know Scala, ScalaTest or ScalaCheck well. Now that I've learned them better, I wish I had written a minimisation strategy for the input character. This would have made test failure messages easier to understand.

I also didn't address what the diamond function would do with input outside the range of ‘A’ to ‘Z’. Scala doesn't let one define a subtype of Char, so I can't enforce the input constraint in the type system. I guess the Scala way would be to define diamond as a PartialFunction[Char,String].

I haven't yet looked at any other people's solutions in detail. I'll post a follow-up article if I find any interesting differences.

Other Solutions

Other solutions to the Diamond Kata that I know about are:

  • Seb Rose: Recycling Tests in TDD
  • Alastair Cockburn: Thinking Before Programming
  • Seb Rose: Diamond recycling (and painting yourself into a corner)
  • Ron Jeffries: a detailed walkthrough of his solution
  • George Dinwiddie: Another Approach to the Diamond Kata
  • Ivan Sanchez: a walkthrough of his Clojure solution
  • Jon Jagger: print “squashed-circle” diamond
  • Sandro Mancuso: a Java solution on GitHub
  • Krzysztof Jelski: a Python solution on GitHub
  • Philip Schwarz: a Clojure solution on GitHub
Categories: Programming, Testing & QA

Python: scikit-learn: ImportError: cannot import name __check_build

Mark Needham - Sat, 01/10/2015 - 09:48

In part 3 of Kaggle's series on text analytics I needed to install scikit-learn and, having done so, ran into the following error when trying to use one of its classes:

>>> from sklearn.feature_extraction.text import CountVectorizer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/markneedham/projects/neo4j-himym/himym/lib/python2.7/site-packages/sklearn/__init__.py", line 37, in <module>
    from . import __check_build
ImportError: cannot import name __check_build

This error doesn’t reveal very much but I found that when I exited the REPL and tried the same command again I got a different error which was a bit more useful:

>>> from sklearn.feature_extraction.text import CountVectorizer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/markneedham/projects/neo4j-himym/himym/lib/python2.7/site-packages/sklearn/__init__.py", line 38, in <module>
    from .base import clone
  File "/Users/markneedham/projects/neo4j-himym/himym/lib/python2.7/site-packages/sklearn/base.py", line 10, in <module>
    from scipy import sparse
ImportError: No module named scipy

The fix for this is now obvious:

$ pip install scipy

And I can now load CountVectorizer without any problem:

$ python
Python 2.7.5 (default, Aug 25 2013, 00:04:04)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from sklearn.feature_extraction.text import CountVectorizer
Categories: Programming

Python: gensim – clang: error: unknown argument: ‘-mno-fused-madd’ [-Wunused-command-line-argument-hard-error-in-future]

Mark Needham - Sat, 01/10/2015 - 09:39

While working through part 2 of Kaggle’s bag of words tutorial I needed to install the gensim library and initially ran into the following error:

$ pip install gensim

...

cc -fno-strict-aliasing -fno-common -dynamic -arch x86_64 -arch i386 -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch x86_64 -arch i386 -pipe -I/Users/markneedham/projects/neo4j-himym/himym/build/gensim/gensim/models -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -I/Users/markneedham/projects/neo4j-himym/himym/lib/python2.7/site-packages/numpy/core/include -c ./gensim/models/word2vec_inner.c -o build/temp.macosx-10.9-intel-2.7/./gensim/models/word2vec_inner.o

clang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future

command 'cc' failed with exit status 1
an integer is required

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/markneedham/projects/neo4j-himym/himym/build/gensim/setup.py", line 166, in <module>
    include_package_data=True,
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/core.py", line 152, in setup
    dist.run_commands()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/Users/markneedham/projects/neo4j-himym/himym/lib/python2.7/site-packages/setuptools/command/install.py", line 59, in run
    return orig.install.run(self)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/command/install.py", line 573, in run
    self.run_command('build')
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/command/build.py", line 127, in run
    self.run_command(cmd_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/Users/markneedham/projects/neo4j-himym/himym/build/gensim/setup.py", line 71, in run
    "There was an issue with your platform configuration - see above.")
TypeError: an integer is required
----------------------------------------
Cleaning up...
Command /Users/markneedham/projects/neo4j-himym/himym/bin/python -c "import setuptools, tokenize;__file__='/Users/markneedham/projects/neo4j-himym/himym/build/gensim/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/sb/6zb6j_7n6bz1jhhplc7c41n00000gn/T/pip-i8aeKR-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/markneedham/projects/neo4j-himym/himym/include/site/python2.7 failed with error code 1 in /Users/markneedham/projects/neo4j-himym/himym/build/gensim
Storing debug log for failure in /Users/markneedham/.pip/pip.log

The exception didn’t make much sense to me but I came across a blog post which explained it:

The Apple LLVM compiler in Xcode 5.1 treats unrecognized command-line options as errors. This issue has been seen when building both Python native extensions and Ruby Gems, where some invalid compiler options are currently specified.

The author suggests this only became a problem with Xcode 5.1, so I'm surprised I hadn't come across it sooner, since I haven't upgraded Xcode in a long time.

We can work around the problem by telling the compiler to treat extra command line arguments as a warning rather than an error:

export ARCHFLAGS=-Wno-error=unused-command-line-argument-hard-error-in-future

Now it installs with no problems.

Categories: Programming

Python NLTK/Neo4j: Analysing the transcripts of How I Met Your Mother

Mark Needham - Sat, 01/10/2015 - 02:22

After reading Emil’s blog post about dark data a few weeks ago I became intrigued about trying to find some structure in free text data and I thought How I met your mother’s transcripts would be a good place to start.

I found a website which has the transcripts for all the episodes and then, having manually downloaded the two pages which listed all the episodes, I wrote a script to grab each of the transcripts so I could use them on my machine.

I wanted to learn a bit of Python and my colleague Nigel pointed me towards the requests and BeautifulSoup libraries to help me with my task. The script to grab the transcripts looks like this:

import requests
from bs4 import BeautifulSoup
from soupselect import select
 
episodes = {}
for i in range(1,3):
    page = open("data/transcripts/page-" + str(i) + ".html", 'r')
    soup = BeautifulSoup(page.read())
 
    for row in select(soup, "td.topic-titles a"):
        parts = row.text.split(" - ")
        episodes[parts[0]] = {"title": parts[1], "link": row.get("href")}
 
for key, value in episodes.iteritems():
    parts = key.split("x")
    season = int(parts[0])
    episode = int(parts[1])
    filename = "data/transcripts/S%d-Ep%d" %(season, episode)
    print filename
 
    with open(filename, 'wb') as handle:
        headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
        response = requests.get("http://transcripts.foreverdreaming.org" + value["link"], headers = headers)
        if response.ok:
            for block in response.iter_content(1024):
                if not block:
                    break
 
                handle.write(block)

The files containing the lists of episodes are named ‘page-1’ and ‘page-2’.

The code is reasonably simple – we find all the links inside the table, put them in a dictionary and then iterate through the dictionary and download the files to disk. The code to save the file is a bit of a monstrosity but there didn’t seem to be a ‘save’ method that I could use.
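
For what it's worth, requests does expose the whole body as bytes, so the save loop can be collapsed. This is a sketch using the same filename, value and headers variables as the script above; it buffers the entire response in memory, so the chunked version is still preferable for very large files:

with open(filename, 'wb') as handle:
    response = requests.get("http://transcripts.foreverdreaming.org" + value["link"], headers = headers)
    if response.ok:
        handle.write(response.content)  # the whole body as bytes, no manual chunking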

Having downloaded the files, I thought through all sorts of clever things I could do, including generating a bag of words model for each episode or performing sentiment analysis on each sentence which I’d learnt about from a Kaggle tutorial.

In the end I decided to start simple and extract all the words from the transcripts and count how many times a word occurred in a given episode.

I ended up with the following script which created a dictionary of (episode -> words + occurrences):

import csv
import nltk
import re
 
from bs4 import BeautifulSoup
from soupselect import select
from nltk.corpus import stopwords
from collections import Counter
from nltk.tokenize import word_tokenize
 
def count_words(words):
    tally=Counter()
    for elem in words:
        tally[elem] += 1
    return tally
 
episodes_dict = {}
with open('data/import/episodes.csv', 'r') as episodes:
    reader = csv.reader(episodes, delimiter=',')
    reader.next()
 
    for row in reader:
        print row
        transcript = open("data/transcripts/S%s-Ep%s" %(row[3], row[1])).read()
        soup = BeautifulSoup(transcript)
        rows = select(soup, "table.tablebg tr td.post-body div.postbody")
 
        raw_text = rows[0]
        [ad.extract() for ad in select(raw_text, "div.ads-topic")]
        [ad.extract() for ad in select(raw_text, "div.t-foot-links")]
 
        text = re.sub("[^a-zA-Z]", " ", raw_text.text.strip())
        words = [w for w in nltk.word_tokenize(text) if not w.lower() in stopwords.words("english")]
 
        episodes_dict[row[0]] = count_words(words)

Next I wanted to explore the data a bit to see which words occurred across episodes or which word occurred most frequently and realised that this would be a much easier task if I stored the data somewhere.

s/somewhere/in Neo4j

Neo4j’s query language, Cypher, has a really nice ETL-esque tool called ‘LOAD CSV’ for loading in CSV files (as the name suggests!) so I added some code to save my words to disk:

with open("data/import/words.csv", "w") as words:
    writer = csv.writer(words, delimiter=",")
    writer.writerow(["EpisodeId", "Word", "Occurrences"])
    for episode_id, words in episodes_dict.iteritems():
        for word in words:
            writer.writerow([episode_id, word, words[word]])

This is what the CSV file contents look like:

$ head -n 10 data/import/words.csv
EpisodeId,Word,Occurrences
165,secondly,1
165,focus,1
165,baby,1
165,spiders,1
165,go,4
165,apartment,1
165,buddy,1
165,Exactly,1
165,young,1

Now we need to write some Cypher to get the data into Neo4j:

// words
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-himym/data/import/words.csv" AS row
MERGE (word:Word {value: row.Word})
// episodes
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-himym/data/import/words.csv" AS row
MERGE (episode:Episode {id: TOINT(row.EpisodeId)})
// words to episodes
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-himym/data/import/words.csv" AS row
MATCH (word:Word {value: row.Word})
MATCH (episode:Episode {id: TOINT(row.EpisodeId)})
MERGE (word)-[:USED_IN_EPISODE {times: TOINT(row.Occurrences) }]->(episode);

Having done that we can write some simple queries to explore the words used in How I met your mother:

MATCH (word:Word)-[r:USED_IN_EPISODE]->(episode) 
RETURN word.value, COUNT(episode) AS episodes, SUM(r.times) AS occurrences
ORDER BY occurrences DESC
LIMIT 10
 
==> +-------------------------------------+
==> | word.value | episodes | occurrences |
==> +-------------------------------------+
==> | "Ted"      | 207      | 11437       |
==> | "Barney"   | 208      | 8052        |
==> | "Marshall" | 208      | 7236        |
==> | "Robin"    | 205      | 6626        |
==> | "Lily"     | 207      | 6330        |
==> | "m"        | 208      | 4777        |
==> | "re"       | 208      | 4097        |
==> | "know"     | 208      | 3489        |
==> | "Oh"       | 197      | 3448        |
==> | "like"     | 208      | 2498        |
==> +-------------------------------------+
==> 10 rows

The main 5 characters occupy the top 5 positions, which is probably what you'd expect. I'm not sure why ‘m’ and ‘re’ are in the next two positions – I expect that might be scraping gone wrong!
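
That guess is easy to check: the re.sub("[^a-zA-Z]", " ", ...) step in the extraction script replaces apostrophes with spaces, so contractions like "I'm" and "you're" shed orphaned "m" and "re" tokens. A minimal demonstration of the effect (whether the orphans then survive the stopword filter depends on the stopword list in your NLTK version):

import re
import nltk

text = re.sub("[^a-zA-Z]", " ", "I'm sure you're right")
print(text)                      # I m sure you re right
print(nltk.word_tokenize(text))  # ['I', 'm', 'sure', 'you', 're', 'right']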

Our next query might focus on checking which character is referred to the most in each episode:

WITH ["Ted", "Barney", "Robin", "Lily", "Marshall"] as mainCharacters
MATCH (word:Word) WHERE word.value IN mainCharacters
MATCH (episode:Episode)<-[r:USED_IN_EPISODE]-(word)
WITH episode, word, r
ORDER BY episode.id, r.times DESC
WITH episode, COLLECT({word: word.value, times: r.times})[0] AS topWord
RETURN episode.id, topWord.word AS word, topWord.times AS occurrences
LIMIT 10
 
==> +---------------------------------------+
==> | episode.id | word       | occurrences |
==> +---------------------------------------+
==> | 72         | "Barney"   | 75          |
==> | 143        | "Ted"      | 16          |
==> | 43         | "Lily"     | 74          |
==> | 156        | "Ted"      | 12          |
==> | 206        | "Barney"   | 23          |
==> | 50         | "Marshall" | 51          |
==> | 113        | "Ted"      | 76          |
==> | 178        | "Barney"   | 21          |
==> | 182        | "Barney"   | 22          |
==> | 67         | "Ted"      | 84          |
==> +---------------------------------------+
==> 10 rows

If we dig into it further there’s actually quite a bit of variety in the number of times the top character in each episode is mentioned which again probably says something about the data:

WITH ["Ted", "Barney", "Robin", "Lily", "Marshall"] as mainCharacters
MATCH (word:Word) WHERE word.value IN mainCharacters
MATCH (episode:Episode)<-[r:USED_IN_EPISODE]-(word)
WITH episode, word, r
ORDER BY episode.id, r.times DESC
WITH episode, COLLECT({word: word.value, times: r.times})[0] AS topWord
RETURN MIN(topWord.times), MAX(topWord.times), AVG(topWord.times), STDEV(topWord.times)
 
==> +-------------------------------------------------------------------------------------+
==> | MIN(topWord.times) | MAX(topWord.times) | AVG(topWord.times) | STDEV(topWord.times) |
==> +-------------------------------------------------------------------------------------+
==> | 3                  | 259                | 63.90865384615385  | 42.36255207691068    |
==> +-------------------------------------------------------------------------------------+
==> 1 row

Obviously this is a very simple way of deriving structure from text; here are some of the things I want to try out next:

  • Detecting common phrases/memes used in the show (e.g. the yellow umbrella) – this should be possible by creating n-grams of different lengths and then searching for those phrases across the corpus (see the sketch after this list).
  • Pull out scenes – some of the transcripts use the keyword ‘scene’ to denote this, although some of them don't. Depending on how many transcripts contain scene demarcations, perhaps we could train a classifier to detect where scenes should be in the transcripts which don't have them.
  • Analyse who talks to each other or who talks about each other most frequently
  • Create a graph of conversations as my colleagues Max and Michael have previously blogged about.
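
As a starting point for the first idea, here is a minimal sketch of counting n-grams with NLTK. The words list below is a stand-in for a transcript's tokens in document order - the per-episode token lists from the earlier script would work, but the Counter stored in episodes_dict would not, because it discards word order:

from collections import Counter
from nltk.util import ngrams

# stand-in for a transcript's tokens, in document order
words = ["under", "the", "yellow", "umbrella", "she", "waited",
         "and", "the", "yellow", "umbrella", "was", "gone"]

trigram_counts = Counter(ngrams(words, 3))
for phrase, count in trigram_counts.most_common(1):
    print("%s %d" % (" ".join(phrase), count))
# the yellow umbrella 2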
Categories: Programming

Stuff The Internet Says On Scalability For January 9th, 2015

Hey, it's HighScalability time:


UFOs or Floating Solar Balloon power stations? You decide.

 

  • 700 Million: WhatsApp active monthly users; 17 million: comments on Stack Exchange in 2014
  • Quotable Quotes
    • John von Neumann: It is easier to write a new code than to understand an old one.
    • @BenedictEvans: Gross revenue on Apple & Google's app stores was a little over $20bn in 2014. Bigger than recorded music, FWIW.
    • Julian Bigelow: Absence of a signal should never be used as a signal. 
    • Bigelow, paraphrased: separate signal from noise at every stage of the process—in this case, at the transfer of every single bit—rather than allowing noise to accumulate along the way
    • cgb_: One of the things I've found interesting about rapidly popular opensource solutions in the last 1-2 years is how quickly venture cap funding comes in and drives the direction of future development.
    • @miostaffin: "If Amazon wants to test 5,000 users to use a feature, they just need to turn it on for 45 seconds." -@jmspool #uxdc
    • Roberta Ness: Amazing possibility on the one hand and frustrating inaction on the other—that is the yin and yang of modern science. Invention generates ever more gizmos and gadgets, but imagination is not providing clues to solving the scientific puzzles that threaten our very existence.

  • Can HTTPS really be faster than HTTP? Yes, it can. Take the test for yourself. The secret: SPDY. More at Why we don’t use a CDN: A story about SPDY and SSL

  • A fascinating and well told tale of the unexpected at Facebook. Solving the Mystery of Link Imbalance: A Metastable Failure State at Scale: The most literal conclusion to draw from this story is that MRU connection pools shouldn’t be used for connections that traverse aggregated links. At a meta-level, the next time you are debugging emergent behavior, you might try thinking of the components as agents colluding via covert channels. At an organizational level, this investigation is a great example of why we say that nothing at Facebook is somebody else’s problem.

  • Everything old is new again. Facebook on disaggregation vs. hyperconvergence: Just when everyone agreed that scale-out infrastructure with commodity nodes of tightly-coupled CPU, memory and storage is the way to go, Facebook’s Jeff Qin, a capacity management engineer – in a talk at Storage Visions 2015 – offers an opposing vision: disaggregated racks. One rack for computes, another for memory and a third – and fourth – for storage.

  • Why Instagram Worked. Instagram was the result of a pivot away from a not popular enough social networking site to a stripped down app that allowed people to document their world in pictures. Though the source article is short on the why, there's a good discussion on Hacker News. Some interesting reasons: Instagram worked because it algorithmically hides flaws in photographs so everyone's pictures look "good"; Snapping a photo is easy and revolves around a moment -- something easier to recognize when it's worthy of sharing; Startups need lucky breaks, but connections with the right people increase the odds considerably; Instagram worked because it was at the right place at the right time; It worked because it's a simple, quick, ultra-low friction way of sharing photos.

  • Atheists, it's not what you think. The God Login. The incomparable Jeff Atwood does a deep dive on the design of a common everyday object: the Login page. The title was inspired by one of Jeff's teachers, who asked what the "God Algorithm" for a problem was - that is, if God solved a problem, what would the solution look like? While you may not agree with the proposed solution to the Login page problem, you may at least come away believing that one may or may not exist.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Quote of the Day

Herding Cats - Glen Alleman - Fri, 01/09/2015 - 17:17

It was overheard on Twitter:

"Don't fall in love with your plan, it is almost certainly wrong "

A plan is a strategy for success. Strategies are hypotheses. Hypotheses need tests to verify them - just like you learned in high school science class. That test is the Measures of Effectiveness and Measures of Performance of the outcomes from your project, as well as the Technical Performance Measures that are used to take the corrective actions needed to reach the goal of delivering value in exchange for time and money.

The Plan describes where we are going, the various paths we can take to reach our destination, and the progress or performance assessment points along the way to assure we are on the right path.

These assessment points measure the “maturity” of the product or service against the planned maturity. This is the only real measure of progress – not the passage of time or the consumption of money.

Wrong in the planning sense can only be wrong if you are managing your project Open Loop, with no assessment of Effectiveness, Performance, Risk reduction, cost absorption, time performance, or all the ...ilities associated with spending other people's money.

So it is true that the Plan is wrong if you're not managing in the presence of uncertainty, using feedback, and taking corrective action.

Categories: Project Management

The God Login

Coding Horror - Jeff Atwood - Fri, 01/09/2015 - 12:32

I graduated with a Computer Science minor from the University of Virginia in 1992. The reason it's a minor and not a major is because to major in CS at UVa you had to go through the Engineering School, and I was absolutely not cut out for that kind of hardcore math and physics, to put it mildly. The beauty of a minor was that I could cherry pick all the cool CS classes and skip everything else.

One of my favorite classes, the one I remember the most, was Algorithms. I always told people my Algorithms class was the one part of my college education that influenced me most as a programmer. I wasn't sure exactly why, but a few years ago I had a hunch so I looked up a certain CV and realized that Randy Pausch – yes, the Last Lecture Randy Pausch – taught that class. The timing is perfect: University of Virginia, Fall 1991, CS461 Analysis of Algorithms, 50 students.

I was one of them.

No wonder I was so impressed. Pausch was an incredible, charismatic teacher, a testament to the old adage that you should choose your teacher first and the class material second, if you bother to at all. It's so true.

In this case, the combination of great teacher and great topic was extra potent, as algorithms are central to what programmers do. Not that we invent new algorithms, but we need to understand the code that's out there, grok why it tends to be fast or slow due to the tradeoffs chosen, and choose the correct algorithms for what we're doing. That's essential.

And one of the coolest things Mr. Pausch ever taught me was to ask this question:

What's the God algorithm for this?

Well, when sorting a list, obviously God wouldn't bother with a stupid Bubble Sort or Quick Sort or Shell Sort like us mere mortals, God would just immediately place the items in the correct order. Bam. One step. The ultimate lower bound on computation, O(1). Not just fixed time, either, but literally one instantaneous step, because you're freakin' God.

This kind of blew my mind at the time.

I always suspected that programmers became programmers because they got to play God with the little universe boxes on their desks. Randy Pausch took that conceit and turned it into a really useful way of setting boundaries and asking yourself hard questions about what you're doing and why.

So when we set out to build a login dialog for Discourse, I went back to what I learned in my Algorithms class and asked myself:

How would God build this login dialog?

And the answer is, of course, God wouldn't bother to build a login dialog at all. Every user would already be logged into GodApp the second they loaded the page because God knows who they are. Authoritatively, even.

This is obviously impossible for us, because God isn't one of our investors.

But.. how close can we get to the perfect godlike login experience in Discourse? That's a noble and worthy goal.

Wasn't it Bill Gates who once asked why the hell every programmer was writing the same File Open dialogs over and over? It sure feels that way for login dialogs. I've been saying for a long time that the best login is no login at all and I'm a staunch supporter of logging in with your Internet Driver's license whenever possible. So we absolutely support that, if you've configured it.

But today I want to focus on the core, basic login experience: user and password. That's the default until you configure up the other methods of login.

A login form with two fields, two buttons, and a link on it seems simple, right? Bog standard. It is, until you consider all the ways the simple act of logging in with those two fields can go wrong for the user. Let's think.

Let the user enter an email to log in

The critical fault of OpenID, as much as I liked it as an early login solution, was its assumption that users could accept a URL as their "identity". This is flat out crazy, and in the long run this central flawed assumption in OpenID broke it as a future standard.

User identity is always email, plain and simple. What happens when you forget your password? You get an email, right? Thus, email is your identity. Some people even propose using email as the only login method.

It's fine to have a username, of course, but always let users log in with either their username or their email address. Because I can tell you with 100% certainty that when those users forget their password, and they will, all the time, they'll need that email anyway to get a password reset. Email and password are strongly related concepts and they belong together. Always!

(And a fie upon services that don't allow me to use my email as a username or login. I'm looking at you, Comixology.)

Tell the user when their email doesn't exist

OK, so we know that email is de-facto identity for most people, and this is a logical and necessary state of affairs. But which of my 10 email addresses did I use to log into your site?

This was the source of a long discussion at Discourse about whether it made sense to reveal to the user, when they enter an email address in the "forgot password" form, whether we have that email address on file. On many websites, here's the sort of message you'll see after entering an email address in the forgot password form:

If an account matches name@example.com, you should receive an email with instructions on how to reset your password shortly.

Note the coy "if" there, which is a hedge against all the security implications of revealing whether a given email address exists on the site just by typing it into the forgot password form.

We're deadly serious about picking safe defaults for Discourse, so out of the box you won't get exploited or abused or overrun with spammers. But after experiencing the real world "which email did we use here again?" login state on dozens of Discourse instances ourselves, we realized that, in this specific case, being user friendly is way more important than being secure.

The new default is to let people know when they've entered an email we don't recognize in the forgot password form. This will save their sanity, and yours. You can turn on the extra security of being coy about this, if you need it, via a site setting.
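Here's a sketch of that default, with the coy behavior behind a setting. The helper names and the coy_mode flag are hypothetical illustrations, not Discourse's actual code:

def forgot_password(email, known_emails, coy_mode, send_reset):
    if email in known_emails:
        send_reset(email)
        return "We emailed password reset instructions to " + email
    if coy_mode:
        # Extra-security option: don't reveal whether the address exists.
        return ("If an account matches " + email +
                ", you should receive an email with instructions shortly.")
    return "We couldn't find an account matching " + email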

Let the user switch between Log In and Sign Up any time

Many websites have started to show login and signup buttons side by side. This perplexed me; aren't the acts of logging in and signing up very different things?

Well, from the user's perspective, they don't appear to be. This Verge login dialog illustrates just how close the sign up and log in forms really are. Check out this animated GIF of it in action.

We've acknowledged that similarity by having either form accessible at any time from the two buttons at the bottom of the form, as a toggle:

And both can be kicked off directly from any page via the Sign Up and Log In buttons at the top right:

Pick common words

That's the problem with language; we have so many words for these concepts:

  • Sign In
  • Log In
  • Sign Up
  • Register
  • Join <site>
  • Create Account
  • Get Started
  • Subscribe

Which are the "right" ones? User research data isn't conclusive.

I tend to favor the shorter versions when possible, mostly because I'm a fan of the whole brevity thing, but there are valid cases to be made for each depending on the circumstances and user preferences.

Sign In may be slightly more common, though Log In has some nautical and historical computing basis that makes it worthy:

A couple of years ago I did a survey of top websites in the US and UK and whether they used “sign in”, “log in”, “login”, “log on”, or some other variant. The answer at the time seemed to be that if you combined “log in” and “login”, it exceeded “sign in”, but not by much. I’ve also noticed that the trend toward “sign in” is increasing, especially with the most popular services. Facebook seems to be a “log in” hold-out.

Work with browser password managers

Every login dialog you create should be tested to work with the default password managers in …

At an absolute minimum. Upon subsequent logins in that browser, you should see the username and password autofilled automatically.

Users rely on the default password managers built into their browsers, and any proper modern login form should respect that and be designed sensibly: the password field should have type="password" in the HTML and a name that's readily identifiable as a password entry field.

There's also LastPass and so forth, but I generally assume if the login dialog works with the built in browser password managers, it will work with third party utilities, too.

Handle common user mistakes

Oops, the user is typing their password with caps lock on? You should let them know about that.

Oops, the user entered their email as name@gmal.com instead of name@gmail.com? Or name@hotmail.cm instead of name@hotmail.com? You should either fix typos in common email domains for them, or let them know about that.
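One way to catch those typos is a "did you mean?" check against a shortlist of common domains. Here's a sketch using Python's standard difflib; the domain list is illustrative, and libraries like mailcheck.js do this more thoroughly:

import difflib

# Illustrative shortlist; a real implementation would use a larger one.
COMMON_DOMAINS = ["gmail.com", "hotmail.com", "yahoo.com", "outlook.com"]

def suggest_email(address):
    """Return a corrected address for a likely domain typo, or None."""
    local, sep, domain = address.rpartition("@")
    if not sep or domain in COMMON_DOMAINS:
        return None
    matches = difflib.get_close_matches(domain, COMMON_DOMAINS, n=1, cutoff=0.8)
    return local + "@" + matches[0] if matches else None

# suggest_email("name@gmal.com")   -> "name@gmail.com"
# suggest_email("name@hotmail.cm") -> "name@hotmail.com"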

(I'm also a big fan of native browser "reveal password" support for the password field, so the user can verify that she typed in or autofilled the password she expects. Only Internet Explorer and I think Safari offer this, but all browsers should.)

Help users choose better passwords

There are many schools of thought on forcing (er, helping) users to choose passwords that aren't unspeakably awful, e.g. password123 and iloveyou and so on.

There's the common password strength meter, which updates in real time as you type in the password field.

It's a clever idea, but it gets awful preachy for my tastes on some sites. The implementation also leaves a lot to be desired, as it's left up to the whims of the site owner to decide what password strength means. One site's "good" is another site's "get outta here with that Fisher-Price toy password". It's frustrating.

So, with Discourse, rather than all that, I decided we'd default to a solid absolute minimum password length of 8 characters, and then verify that the password is not one of the 10,000 most common known passwords by checking its hash.
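A sketch of that check in Python. The blocklist file name is hypothetical, and SHA-1 here is just a lookup key into the list of known-bad passwords, not a way to store passwords:

import hashlib

def load_common_hashes(path="common-passwords-sha1.txt"):
    # One precomputed SHA-1 hex digest per line, lowercased.
    with open(path) as f:
        return {line.strip() for line in f}

def password_ok(password, common_hashes, min_length=8):
    """Reject passwords that are too short or on the common list."""
    if len(password) < min_length:
        return False
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest()
    return digest not in common_hashes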

Don't forget the keyboard

I feel like keyboard users are a dying breed at this point, but for those of us who, when presented with a login dialog, like to rapidly type

name@example.com, tab, p4$$w0rd, enter

please verify that this works as it should. Tab order, enter to submit, etcetera.

Rate limit all the things

You should be rate limiting everything users can do, everywhere, and that's especially true of the login dialog.

If someone forgets their password and makes 3 attempts to log in, or issues 3 forgot password requests, that's probably OK. But if someone makes a thousand attempts to log in, or issues a thousand forgot password requests, that's a little weird. Why, I might even venture to guess they're possibly … not human.

You can do fancy stuff like temporarily disable accounts or start showing a CAPTCHA if there are too many failed login attempts, but this can easily become a griefing vector, so be careful.

I think a nice middle ground is to insert standard pauses of moderately increasing size after repeated sequential failures or repeated sequential forgot password requests from the same IP address. So that's what we do.
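A sketch of that middle ground (the step and cap values are illustrative, not Discourse's actual settings):

import time
from collections import defaultdict

class LoginThrottle:
    """Insert a moderately growing pause after repeated failures per IP."""

    def __init__(self, step=2.0, cap=10.0):
        self.failures = defaultdict(int)
        self.step = step
        self.cap = cap

    def record_failure(self, ip):
        self.failures[ip] += 1

    def record_success(self, ip):
        self.failures.pop(ip, None)

    def pause(self, ip):
        # Each sequential failure adds `step` seconds, up to `cap`.
        delay = min(self.failures[ip] * self.step, self.cap)
        if delay:
            time.sleep(delay)

Humans barely notice a few seconds of delay; a script hammering the form slows to a crawl without any account ever being locked out.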

Stuff I forgot

I tried to remember everything we went through when we were building our ideal login dialog for Discourse, but I'm sure I forgot something, or could have been more thorough. Remember, Discourse is 100% open source and by definition a work in progress – so as my friend Miguel de Icaza likes to say, when it breaks, you get to keep both halves. Feel free to test out our implementation and give us your feedback in the comments, or point to other examples of great login experiences, or cite other helpful advice.

Logging in involves a simple form with two fields, a link, and two buttons. And yet, after reading all this, I'm sure you'll agree that it's deceptively complex. Your best course of action is not to build a login dialog at all, but instead rely on authentication from an outside source whenever you can.

Like, say, God.

Categories: Programming

Decision Making in the Presence of Uncertainty

Herding Cats - Glen Alleman - Fri, 01/09/2015 - 06:00

Decision theory is concerned with the problem of making decisions. Statistical decision theory is decision making in the presence of statistical knowledge, by understanding some of the uncertainties involved in the problem.

Decision theory deals with situations where decisions have to be made in the presence of uncertainty, and its goal is to provide a rational framework for dealing with such situations. To make good choices we must calculate and manage the resulting risks from those choices. Today, we have tools to perform these calculations.

A few hundred years ago, decision making in the presence of uncertainty and the resulting risk had only the tools of faith, hope, and guesswork. This is because risk is a numbers game. Before the 17th century, our understanding of numbers did not provide us with the tools needed to make choices in the presence of uncertainty.

A good book about the history of making choices in the presence of uncertainty - risk management - is Against the Gods: The Remarkable Story of Risk, by Peter Bernstein. The efforts it chronicles culminated in Bernoulli's insight, which focused not on probabilistic events, but on the human beings who desire or fear certain outcomes to a greater or lesser degree.

Bernoulli showed how to create mathematical tools to allow anyone to “estimate his prospects from any risky undertaking in light of [his] specific financial circumstances.” This is the basis of the microeconomics of decision making, in which the opportunity cost of a collection of choices can be assessed by estimating both the cost of each decision and the resulting beneficial outcome or loss.

In 1921, Frank Knight distinguished between risk, when the probability of an outcome is possible to calculate — or is knowable — and uncertainty, when the probability of an outcome is not possible to determine — or is unknowable.

This is a distinction that rendered insurance attractive and entrepreneurship tragic. Twenty years later, John von Neumann and Oskar Morgenstern established the foundation of game theory, which deals with situations where people’s decisions are influenced by the unknowable decisions of live variables — in the gaming world, this means other people.

Decision making in the presence of uncertainty is a normal business function as well as a normal technical development process. The world is full of uncertainty.

Those seeking certainty will be woefully disappointed. Those conjecturing that decisions can't be made in the presence of uncertainty are woefully misinformed.

Along with all this woefulness is the boneheaded notion that estimating is guessing, and that decisions can actually be made in the presence of uncertainty in the absence of estimating.

Here's why. When we are faced with a decision - a choice between multiple options, each with multiple outcomes - each outcome is probabilistic. If it were not - that is, if we had 100% visibility into the consequences of our decision, the cost involved in making that decision, and the cost or benefit impact of that decision - it's no longer a decision. It's a choice to pick between several options based on something other than time, money, or benefit.

Buying an ERP system, funding the development of a new product, or funding the consolidation of the data center in another city is a much different choice process than picking apples. These decisions have uncertainty. Uncertainty of the cost. Uncertainty of the benefits, revenue, savings, and increases in reliability and maintainability. Uncertainty in almost every variable.
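To make the point concrete, here is a minimal Monte Carlo sketch in Python of how such a decision can be analyzed. The triangular (low, most likely, high) ranges are invented for illustration; real ranges would come from your own estimates:

import random

def tri(low, likely, high):
    # random.triangular's argument order is (low, high, mode)
    return random.triangular(low, high, likely)

def simulate(cost, benefit, trials=100000):
    """Return (expected net value, probability of a loss) for one option."""
    outcomes = [tri(*benefit) - tri(*cost) for _ in range(trials)]
    return sum(outcomes) / trials, sum(o < 0 for o in outcomes) / trials

# Hypothetical three-point estimates, in $M
erp = simulate(cost=(1.0, 1.4, 2.2), benefit=(1.2, 1.8, 2.4))
consolidation = simulate(cost=(0.8, 1.5, 3.0), benefit=(1.0, 2.2, 3.5))
# Comparing the two (mean, loss probability) pairs is the decision;
# neither number is knowable without estimating the ranges first.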

Managing in the presence of uncertainty and the resulting risk is called business management. It's also called how adults manage projects (Tim Lister).

The Presence of Uncertainty is One of the Most Significant Characteristics of Project Work

Managing in the presence of uncertainty is unavoidable. You can ignore the uncertainty, but it's still there even if you do. Uncertainty comes in many forms:

  • Statistical uncertainty - Aleatory uncertainty, only margin can address this uncertainty.
  • Subjective judgement - bias, anchoring, and adjustment.
  • Systematic error - lack of understanding of the reference model.
  • Incomplete knowledge - Epistemic Uncertainty, this lack of knowledge can be improved with effort.
  • Temporal variation - instability in the observed and measured system.
  • Inherent stochasticity - instability between and within collaborative system elements.
So Back To the Problem at Hand

If decisions - credible decisions - are to be made in the presence of uncertainty, then somehow we need information to address the sources of that uncertainty in the bulleted list above. This information can be obtained through many means: modeling, sampling, parametric analysis, past performance, reference classes. Each of these sources has an inherent uncertainty of its own. So in the end, it comes down to this...

To make a credible decision in the presence of uncertainty, we need to estimate the factors that go into that decision.

We Need To Estimate

There's no way out of it. We can't make a credible decision of any importance without an estimate of the impact of that decision, the cost incurred from making that decision, the potential benefits from that decision, and the opportunity cost of NOT selecting an outcome from a decision. Anyone suggesting we can make decisions in the absence of estimating needs to provide clear, concise, actionable information, with examples, of how this can be done in the presence of the uncertainty created by the underlying statistical processes of project work and the resulting probabilistic outcomes of those processes.

If you have certainty, you don't need to estimate. Measure your empirical performance to date, use the Most Likely value from that performance and its variance to project future performance, ignore risk, ignore naturally occurring (aleatory) and event-based (epistemic) uncertainties in the underlying processes, and proceed to spend your customer's money by applying faith, hope, and guesswork, just like they did before the 17th century.

Related articles:

  • What is Governance?
  • Your Project Needs a Budget and Other Things
  • The False Notion of "we haven't seen this before"
  • Conveniently Unburdened by Evidence
  • Taxonomy of Logical Fallacies
Categories: Project Management

DevOps Primer: Three Major Goals

Increasing business value is the single most important reason for any significant organizational change.

Over the past few years the concept of DevOps has developed into an important framework for structuring work in IT organizations. The “newness” of the concept has led to a wide range of definitions of DevOps, and the lack of an industry-standard definition has led organizations to pursue DevOps as a tool to address a wide range of goals. If we embrace the definition of DevOps as the exercise of combining operations and development personnel who participate together across the entire life cycle, from analysis to production support, leveraging Agile principles and techniques, then there are three macro goals for embracing and implementing DevOps:

  1. Increasing Business Value
  2. Changing Organizational Culture
  3. Optimizing Delivery

Reaching any of these goals can have many benefits. (Note: it is easy to conflate benefits and goals, however benefits are the result of attaining a goal rather than the other way around.)

Increasing business value is the single most important reason for any significant organizational change. Business value is a term that encompasses a wide range of concepts, including increased revenue, lower costs, improved quality, reduced time-to-market and increased customer, employee and stakeholder satisfaction. All of these can be potential benefits of implementing DevOps when pursuing a goal of increasing business value.

Changing organizational culture breaks down entrenched behaviors within an organization. Implementing DevOps integrates development, technical operations and testing personnel, which erases barriers between organizational silos. Benefits include increased collaboration and reduced conflict.

Optimizing delivery is often the goal most organizations cite for implementing DevOps. DevOps can lead to fewer mistakes in the process of delivery, increasing the amount of functionality delivered and reducing the number of hand-offs.

All three of the high-level goals for implementing DevOps have some degree of overlap. For example, changing organizational culture by adopting DevOps will also increase employee satisfaction (increased business value) and improve collaboration (optimizing delivery). Optimizing delivery often leads to reduced costs which increases business value. Overlaps allow organizations to focus on one goal while getting some of the benefits of another.

Effective organizational transformation is not merely the pursuit of benefits, but rather a pursuit of goals in support of a greater vision. Understanding the goal or goals of a DevOps transformation is important. However, goals are a reflection of the future established when an organization embraces a vision based on its sense of urgency. A vision represents a picture of a state of being at some point in the future, and it acts as an anchor that establishes the goal of the transformation. Attaining goals yields benefits, which provide feedback on progress toward goals and vision. The relationship between benefits, goals and vision establishes a virtuous cycle.


Categories: Process Management

BA Tips and Tricks for 2015

Software Requirements Blog - Seilevel.com - Thu, 01/08/2015 - 16:00
As a guide to less experienced business analysts here at Seilevel and out there on the Internet, and because a specific colleague of mine requested I write this post (this is for you Amanda!!), I wanted to share several tips and tricks that I have uncovered in the past couple of months that have made […]
Categories: Requirements

How Much Depth When Answering Interview Questions?

Making the Complex Simple - John Sonmez - Thu, 01/08/2015 - 16:00

How much depth should you go into when answering interview questions? In this video, I talk about how to respond to software development interview questions in a way that will make anyone want to hire you right away.

The post How Much Depth When Answering Interview Questions? appeared first on Simple Programmer.

Categories: Programming

Shneiderman's mantra

Coding the Architecture - Simon Brown - Thu, 01/08/2015 - 10:01

I attended a fantastic talk about big data visualisation at the YOW! 2014 conference in Sydney last month (slides), where Doug Talbott talked about how to understand and visualise large quantities of data. One of the things he mentioned was Shneiderman's mantra:

Overview first, zoom and filter, then details-on-demand

Leaving aside the thorny issue of how teams structure their software systems as code, one of the major problems I see teams having with software architecture is how to think about their systems. There are various ways to do this, including a number of view catalogs (e.g. logical view, design view, development view, etc) and I have my C4 model that focuses on the static structure of a software system. If you inherit an existing codebase and are asked to create a software architecture model though, where do you start? And how do people start understanding the model as quickly as possible so they can get on with their job?

Shneiderman's mantra fits really nicely with the C4 model because it's hierarchical.

Shneiderman's mantra and the C4 software architecture model

Overview first (context and container diagrams)

My starting point for understanding any software system is to draw a system context diagram. This helps me to understand the scope of the system, who is using it and what the key system dependencies are. It's usually quick to draw and quick to understand.

Next I'll open up the system and draw a diagram showing the containers (web applications, mobile apps, standalone applications, databases, file systems, message buses, etc) that make up the system. This shows the overall shape of the software system, how responsibilities have been distributed and the key technology choices that have been made.

Zoom and filter (component diagrams)

As developers, we often need more detail, so I'll then zoom into each (interesting) container in turn and show the "components" inside it. This is where I show how each application has been decomposed into components, services, modules, layers, etc, along with a brief note about key responsibilities and technology choices. If you're hand-drawing the diagrams, this part can get a little tedious, which is why I'm focussing on creating a software architecture model as code, and automating as much of this as possible.
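To illustrate why the hierarchy supports "overview first, zoom and filter", here is a generic sketch in Python of a model that can be rendered at different depths. It is not Structurizr or any real tool's API; the names and example elements are purely illustrative:

class Element:
    """One node in a context -> container -> component hierarchy."""

    def __init__(self, name, technology="", description=""):
        self.name = name
        self.technology = technology
        self.description = description
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

def overview(element, depth=0, max_depth=1):
    """Print the model down to max_depth: 0 = context, 1 = containers, ..."""
    print("  " * depth + element.name)
    if depth < max_depth:
        for child in element.children:
            overview(child, depth + 1, max_depth)

# Illustrative model
system = Element("Online Banking", description="system context level")
web = system.add(Element("Web Application", "Java/Spring"))
system.add(Element("Database", "PostgreSQL"))
web.add(Element("Accounts Component", "Spring MVC"))

Calling overview(system, max_depth=0) gives the context view, max_depth=1 the containers, and so on; the "zoom" is just how far down the tree you render.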

Details on demand (class diagrams)

Optionally, I might progress deeper into the hierarchy to show the classes* that make up a particular component, service, module, layer, etc. Ultimately though, this detail resides in the code and, as software developers, we can get that on demand.

Understanding a large and/or complex software system

Next time you're asked to create an architecture model, understand an existing system, present a system overview, do some software archaeology, etc, my advice is to keep Shneiderman's mantra in mind. Start at the top and work into the detail, creating a story that gets deeper into the detail as it progresses. The C4 model is a great way to do this and if you'd like an introduction to it (with example diagrams), you can take a look at Simple Sketches for Diagramming Your Software Architecture on the new Voxxed website.

* this assumes an OO language like Java or C#, for example

Categories: Architecture

Capabilities Based Planning

Herding Cats - Glen Alleman - Thu, 01/08/2015 - 01:11

It has been conjectured that ...

What in the beginning you thought you needed is never what you actually need

This conjecture fails to recognize several critical success factors for project success...

  • Without stating what capabilities we need from the project, we have no means to assess whether the value produced by the project is worth our investment. These capabilities aren't requirements - yet. They're the mechanisms to earn back the investment. Typical capabilities sound like...
    • We need to process provider enrollments for $0.07 per transaction versus the current $0.12 per transaction.
    • I need the capability to move a brigade of 3,000 to 5,000 troops 100 miles up the coast in ten hours. — Gen Norman Schwarzkopf

Capabilities decipher the intent of the leader (Commander)

  • Requirements always emerge. But capabilities should not, without rethinking why we're doing the project.
  • If we're changing our needed capabilities, we likely don't know what Done looks like in any meaningful way, and are therefore wasting our money exploring.
    • Exploring in a research and development domain is mandated, but we should do that with the full knowledge and participation of the people paying for work.
    • Agile is essentially a process for buying knowledge about things we don't know about.
    • Before spending our customer's money on experiments, ask: can we gain this knowledge in other, cheaper ways?

Capabilities based planning (v2) from Glen Alleman

So if we're on a project that doesn't know what Done looks like, we've got to ask a serious question: Do we know what we're doing? If the answer is No, we may want to rethink why we're here. [The original post includes a process diagram for answering that question and a framework for discovering those answers.]

Related articles:

  • What is Governance?
  • Your Project Needs a Budget and Other Things
  • The False Notion of "we haven't seen this before"
Categories: Project Management

Habits, Dreams, and Goals

I’ve been talking to people in the halls about what they learned about goals from last year, and what they are going to do differently this year.   We’ve had chats about New Years Resolutions, habits, goals, and big dreams. (My theme is Dream Big for 2015.)

Here are a few of the insights that I’ve been sharing with people that really seem to create a lot of clarity:

  1. Dream big first, then create your goals.  Too many people start with goals, but miss the dream that holds everything together.   The dream is the backdrop and it needs to inspire you and pull you forward.  Your dream needs to be actionable and believable, and it needs to reflect your passion and your purpose.
  2. There are three types of actions:  habits, goals, and inspired actions.   Habits can help support our goals and reach our dreams.   Goals are really the above and beyond that we set our sights on and help us funnel and focus our energy to reach meaningful milestones.   They take deliberate focus and intent.  You don’t randomly learn to play the violin with skill.  It takes goals.  Inspired actions are the flashes of insight and moments of brilliance.
  3. People mess up by focusing on goals, but not having any habits that support them.  For example, if I have an extreme fitness goal, but I have the ice-cream habit, I might not reach my goals.  Or, if I want to be an early bird, but I have the party-all-night habit, or I’m a late-night reader, that might not work out so well.
  4. People mess up on their habits when they have no goals.  They might inch their way forward, but they can easily spend an entire year and not actually have anything significant or meaningful for themselves, because they never took the chance to dream big, or set a goal they cared about.   So while they’ve made progress, they didn’t make any real pop.   Their life was slow and steady.  In some cases, this is great, if that’s all they wanted.  But I also know people that feel like they wasted the year, because they didn’t do what they knew they were capable of, or wanted to achieve.
  5. People can build habits that help them reach new goals.   Some people I know have built fantastic habits.  They put a strong foundation in place that helps them reach for more.  They grow better, faster, stronger, and more powerful.   In my own experience, I had some extreme fitness goals, but I started with a few healthy habits.  My best one is wake up, work out.  I just do it.  I do a 30 minute workout.   I don’t have to think about it, it’s just part of my day like brushing my teeth.  Since it’s a habit, I keep doing it, so I get better over time.  When I first started the workout, I sucked.  I repeated the same workout three times, but by the third time, I was on fire.   And, since it’s a habit, it’s there for me, as a staple in my day, and, in reality, the most empowering part of my day.  It boosts me and gives me energy that makes everything else in my day way easier, much easier to deal with, and I can do things in half the time, or in some cases 10X.

Maybe the most important insight is that while you don’t need goals to make your habits effective, it’s really easy to spend a year, and then wonder where the year went, without the meaningful milestones to look back on.   That said, I’ve had a few years, where I simply focused on habits without specific goals, but I always had a vision for a better me, or a better future in mind (more like a direction than a destination.)

As I’ve taken friends and colleagues through some of my learnings over the holidays, regarding habits, dreams, and goals, I’ve had a few people say that I should put it all together and share it, since it might help more people add some clarity to setting and achieving their goals.

Here it is:

How Dreams, Goals, and Habits Fit Together

Enjoy, and Dream Big for 2015.

Categories: Architecture, Programming

Episode 217: James Turnbull on Docker

James Turnbull joins Charles Anderson to discuss Docker, an open source platform for distributed applications for developers and system administrators. Topics include Linux containers and the functions they provide, container images and how they are built, use cases for containers, and the future of containers versus virtual machines. Venue: Internet Related Links James’s home page: […]
Categories: Programming

The Ultimate Guide: 5 Methods for Debugging Production Servers at Scale

This is a guest post by Alex Zhitnitsky, an engineer working at Takipi, who is on a mission to help Java and Scala developers solve bugs in production and rid the world of buggy software.

How to approach the production debugging conundrum?

All sorts of wild things happen when your code leaves the safe and warm development environment. Unlike the comfort of the debugger in your favorite IDE, when errors happen on a live server - you better come prepared. No more breakpoints, step over, or step into, and you can forget about adding that quick line of code to help you understand what just happened. In production, bad things happen first and then you have to figure out what exactly went wrong. To be able to debug in this kind of environment we first need to switch our debugging mindset to plan ahead. If you’re not prepared with good practices in advance, roaming around aimlessly through the logs wouldn’t be too effective.

And that’s not all. With high scalability architectures, enter high scalability errors. In many cases we find transactions that originate on one machine or microservice and break something on another. Together with Continuous Delivery practices and constant code changes, errors find their way to production with an increasing rate. The biggest problem we’re facing here is capturing the exact state which led to the error, what were the variable values, which thread are we in, and what was this piece of code even trying to do?

Let’s take a look at 5 methods that can help us answer just that. Distributed logging, advanced jstack techniques, BTrace and other custom JVM agents:

1. Distributed Logging
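As an illustration of the core idea, tag every log line with a correlation id that travels with the transaction, so the trail can be stitched back together across machines. The sketch below uses Python for brevity (the helper and logger names are invented); on the JVM the same pattern is commonly implemented with the MDC facility in Log4j or Logback:

import logging
import uuid

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(txid)s] %(levelname)s %(message)s")
log = logging.getLogger("orders")

def handle_order(order_id, txid=None):
    # Reuse the id from the upstream caller if one was passed in,
    # so the same id appears in every service's logs.
    txid = txid or uuid.uuid4().hex[:12]
    log.info("processing order %s", order_id, extra={"txid": txid})
    # ... forward txid in headers/messages to downstream services ...
    log.info("finished order %s", order_id, extra={"txid": txid})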
Categories: Architecture

Change the Indispensable Employee Mindset

Years ago, I was the expert for two specific products in a small development organization. When it came time for my manager to divide up the work, I always got those products to add features to, or maintain. That was fine for a while, until I got bored. I went to my boss with a request for different work.

“Who will do the work if you don’t?” My boss was concerned.

“Steve or Dave will. They’re good. They can take over for me.” I knew my colleagues. They could do the work.

“But, they’ll have to learn what you do.”

“I know. I can take a few days to explain, if you want. I don’t think it will take a few days to explain. They’re smart. I’m still available if they have questions.”

“I don’t know. You’re indispensable where you are.”

I faced my boss and stood up. “No one is indispensable. And, if I am, you should replace me on those systems anyway. What are you going to do if I leave?”

My boss paled, and asked, “Are you planning to leave?”

“I don’t know. I’m bored. I want new work. I told you that. I don’t see why I can’t have new work. You need developers on these projects.” I named three of them. “Why do I have to stay doing work on the old stuff when I want to do new things? I don’t see why I should. Just because I’ve been doing it for a year is no reason to pigeon-hole me. No. I want new work. I’m not indispensable. You can hire someone and I can train that person if you want.”

My boss reluctantly agreed to let me stop working on the old systems and work on the new projects. I was no longer indispensable.

The problem with being an indispensable employee is that your options are limited. Your boss wants you to keep doing the same thing you’ve always done. Maybe you want that too, for now. The problem is that one day, you realize no one needs what you do. You have become such an expert that you are quite dispensable. You have the same year of experience for several years.

Instead of being indispensable, consider how to help other people learn your work. What do you want to learn next? You need to plan your career development.

What do you do if you’re a manager, and you have indispensable employees? “Fire” them.

I’m serious. When you have people who are indispensable, they are experts. They create bottlenecks and a cost of delay. If you need flexibility in your organization, you need people who know more than one area. You need teams who are adaptable and can learn quickly. A narrow expert is not what you need.

When I say “fire” people, I mean don’t let them work on their area of expertise alone. Create a transition plan and help the expert discover new skills.

Why should you do this? Because if not, people and projects across the organization decide they need that person. Sometimes with quite bad results.

This month’s management myth is based on a true story. The organization wanted an expert to change teams and move. All because of his expertise. That’s nuts. Go read Management Myth 36: You Have an Indispensable Employee.

Categories: Project Management

DevOps Primer: Definition

 

DevOps is like a blend of flavors.

If you ask 20 people for the definition of DevOps, you will get 12 definitions and 10 people who don’t know the term DevOps. I know, I asked, and yes, I am aware that the numbers don’t add up (a couple of people gave me an “it is either this or that” answer). The definitions I heard were generally a mixture of concepts, activities and how those concepts and activities were implemented. More specifically, the definitions I received included a description of a technical operations group, an automation framework for delivery and an overview of a collaborative organizational structure. All interesting, and all part of DevOps to a greater or lesser extent, but in general the definitions were complicated. I am not a fan of complicated definitions. The simplest and most complete definition I have been able to mold from all of the conversations and personal experiences is:

DevOps is the exercise of combining operations and development personnel who participate together across the entire life cycle, from analysis to production support, leveraging Agile principles and techniques.

The problem DevOps is trying to solve is not a new problem. The development, delivery, support and maintenance of software (substitute product, service or application as you will) is time consuming, costly and error prone. Much of the impetus for embracing Agile and lean methods was to become more responsive to development’s customers. DevOps is a representation of the spreading implementation of Agile and lean techniques and principles across the entire organization. The basic goal of DevOps is to deliver functionality faster and with higher quality. The Agile principle of delivering functionality quickly, either continuously or once per sprint, requires that the processes needed to build, test and implement functionality be automated. You can envision DevOps as the intersection of three macro sets of roles: development, technical operations and quality assurance (testing). All three of the macro categories are needed to develop, deliver and maintain a product. In many IT organizations, these three categories are independent silos with different goals. The implementation of DevOps seeks to break down the silos through collaboration to deliver on a single goal. Dominque Bourget of RSA said, “It seems that the DevOps groups are more oriented toward solving the problems of delivering more functionality and stability which is very positive.” Generating that benefit requires a change in how IT organizations both operate and manage themselves.

At a high operational level, DevOps requires interaction of personnel and processes from technical operations, testing and development across the product life cycle. This interaction is more than having TechOps or testing personnel review and provide sign-off on deliverables that development personnel create. Increasing the delivery cadence requires automating build, testing and promotion processes. In the simplest terms, involvement and automation typically require a reassessment of how work progresses across the entire product life cycle (including development, enhancement, maintenance, support and retirement).
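As a toy illustration of what automating build, testing and promotion means in practice, the essence is a scripted, repeatable sequence that fails loudly. The stages and make targets below are placeholders, not a real pipeline:

import subprocess

PIPELINE = [
    ("build",   ["make", "build"]),
    ("test",    ["make", "test"]),
    ("promote", ["make", "deploy-staging"]),
]

def run_pipeline():
    # Run each stage in order; stop at the first failure so a broken
    # build can never be promoted.
    for stage, cmd in PIPELINE:
        print("==> " + stage)
        if subprocess.run(cmd).returncode != 0:
            raise SystemExit("pipeline failed at stage: " + stage)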

Implementing DevOps requires synchronizing the goals of groups that often have different vested interests at a tactical level. Conflicting goals can include: development - speed to delivery, technical operations - environmental stability, and QA - quality. Increasing collaboration by including all parties in team membership and then using Agile techniques such as self-organization are tools to break down walls. However, to make any of these techniques work at the team level requires empowering individuals and changing some of the time-honored hierarchical management structures to reduce the conflicting forces between groups.

David Herron, a colleague, provided a simple definition: “(DevOps is)…an integrated approach to software delivery that include process, tools, technology, resources.” As Paul Laberge of Wolters Kluwer pointed out when defining DevOps, “delivery includes development.” A truly simple definition might not be possible for DevOps without a lot of clarifications. What can be said is that DevOps forces any discussion of development into a broader discussion of how a product is developed, delivered and supported in an effective and efficient manner.

 

Special Note:  For just a little over two years I have been publishing content daily.  In 2015 I have decided to post new content on Tuesday, Thursday and Saturday.  On Sunday, I will continue to post the podcast announcement.  I may, however, go back to posting daily at a moment's notice!


Categories: Process Management