Skip to content

Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools

Subscribe to Methods & Tools
if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!

Feed aggregator

If it needs to happen: Schedule it!

Mike Cohn's Blog - Tue, 01/13/2015 - 15:00

The following is a guest post from Lisa Crispin. Lisa is the co-author with Janet Gregory of "Agile Testing: A Practical Guide for Testers and Agile Teams" and the newly published "More Agile Testing: Learning Journeys for the Whole Team". I highly recommend both of these books--in fact, I recommend reading everything Lisa and Janet write. In the following post, Lisa argues the benefits of scheduling anything that's important. I am a fanatic for doing this. Over the holiday I put fancy new batteries in my smoke detectors that are supposed to last 10 years. So I put a note in my calendar to replace them in 9 years. But, don't schedule time to read Lisa's guest post--just do it now. --Mike

During the holidays, some old friends came over to our house for lunch. We hadn’t seen each other in a few months, though we live 10 minutes apart. We had so much fun catching up. As they prepared to leave, my friend suggested, “Let’s pick a date to get together again soon. So often, six months go by without our seeing each other.” We got out our calendars, and penciled in a date a few weeks away. The chances are good that we will achieve our goal of meeting up soon.

Scheduling time for important activities is key in my professional life, too. Here’s a recent example. My current team has only three testers, and we all have other duties as well, such as customer support. We have multiple code bases, products and platforms to test, so we’re always juggling priorities.

Making time

The product owner for our iOS product wanted us to experiment with automating some regression smoke tests through its UI. Another tester on the team and I decided we’d pair on a spike for this. However, we had so many competing priorities, we kept putting it off. As much as we try avoid multi-tasking, it seems there is always some urgent interruption.

Finally, we created a recurring daily meeting on the calendar, scheduled shortly after lunchtime when few other meetings were going on. As soon as we did that, we started making the time we needed. We might miss a day here or there, but we’re much more disciplined about taking our time for the iOS test automation. As a result, we made steady, if slow, progress, and achieved many of our goals.

Scheduling help

Even though both of us were new to iOS, pairing gave us courage, and two heads were better than one. We’d still get stuck, though, and we needed the expertise of programmers and testers with experience automating iOS tests. Simply adding a meeting to the calendar with the right people has gotten us the help we needed. Even busy people can spare 30 minutes or an hour out of their week. Our iOS team is in another city two time zones away. If we put a meeting on the calendar with a Zoom meeting link, we can easily make contact at the appointed time. Screensharing enables us to make progress quickly, so we can stick to short time boxes.

Another way we ensured time on our schedule for the automation work was to add stories for it to our backlog. For example, we had stories for specific scripts, starting with writing an automated test for logging into the iOS app. Once we had some working test scripts, we added infrastructure-type chores, for example, get the tests running from the command line so we can add them to the team’s continuous integration later. These stories make our efforts more visible. As a result, team members anticipate our meeting invitations and think of ideas to overcome challenges with the automation.

Time for testing

Putting time on the calendar works when I need to pair with a programmer to understand a story better, or when we need help with exploratory testing for a major new feature. I can always ask for help at the morning standup, but if we don’t set a specific time, it’s easy for everyone to get involved with other priorities and forget.

The calendar is your friend. Once you create a meeting, you might still need to negotiate what time works for everyone involved, but you’ve started the collaboration process. Of course, if it’s easy to simply walk over to someone’s desk to ask a question, or pick up the phone if they’re not co-located, do that. But if your team is like ours, where programmers and other roles pair full time, and there’s always a lot going on, a scheduled meeting helps everyone plan the time for it.

If you have a tough project ahead, find a pair and set up a recurring meeting to work together. If you need one-off help, add a time-boxed meeting for today or tomorrow. If you need the whole team to brainstorm about some testing issues, schedule a meeting for the time of day with the fewest distractions. And if you haven’t seen an old friend in too long, schedule a date for that, too!

If it needs to happen: Schedule it!

Mike Cohn's Blog - Tue, 01/13/2015 - 15:00

The following is a guest post from Lisa Crispin. Lisa is the co-author with Janet Gregory of "Agile Testing: A Practical Guide for Testers and Agile Teams" and the newly published "More Agile Testing: Learning Journeys for the Whole Team". I highly recommend both of these books--in fact, I recommend reading everything Lisa and Janet write. In the following post, Lisa argues the benefits of scheduling anything that's important. I am a fanatic for doing this. Over the holiday I put fancy new batteries in my smoke detectors that are supposed to last 10 years. So I put a note in my calendar to replace them in 9 years. But, don't schedule time to read Lisa's guest post--just do it now. --Mike

During the holidays, some old friends came over to our house for lunch. We hadn’t seen each other in a few months, though we live 10 minutes apart. We had so much fun catching up. As they prepared to leave, my friend suggested, “Let’s pick a date to get together again soon. So often, six months go by without our seeing each other.” We got out our calendars, and penciled in a date a few weeks away. The chances are good that we will achieve our goal of meeting up soon.

Scheduling time for important activities is key in my professional life, too. Here’s a recent example. My current team has only three testers, and we all have other duties as well, such as customer support. We have multiple code bases, products and platforms to test, so we’re always juggling priorities.

Making time

The product owner for our iOS product wanted us to experiment with automating some regression smoke tests through its UI. Another tester on the team and I decided we’d pair on a spike for this. However, we had so many competing priorities, we kept putting it off. As much as we try avoid multi-tasking, it seems there is always some urgent interruption.

Finally, we created a recurring daily meeting on the calendar, scheduled shortly after lunchtime when few other meetings were going on. As soon as we did that, we started making the time we needed. We might miss a day here or there, but we’re much more disciplined about taking our time for the iOS test automation. As a result, we made steady, if slow, progress, and achieved many of our goals.

Scheduling help

Even though both of us were new to iOS, pairing gave us courage, and two heads were better than one. We’d still get stuck, though, and we needed the expertise of programmers and testers with experience automating iOS tests. Simply adding a meeting to the calendar with the right people has gotten us the help we needed. Even busy people can spare 30 minutes or an hour out of their week. Our iOS team is in another city two time zones away. If we put a meeting on the calendar with a Zoom meeting link, we can easily make contact at the appointed time. Screensharing enables us to make progress quickly, so we can stick to short time boxes.

Another way we ensured time on our schedule for the automation work was to add stories for it to our backlog. For example, we had stories for specific scripts, starting with writing an automated test for logging into the iOS app. Once we had some working test scripts, we added infrastructure-type chores, for example, get the tests running from the command line so we can add them to the team’s continuous integration later. These stories make our efforts more visible. As a result, team members anticipate our meeting invitations and think of ideas to overcome challenges with the automation.

Time for testing

Putting time on the calendar works when I need to pair with a programmer to understand a story better, or when we need help with exploratory testing for a major new feature. I can always ask for help at the morning standup, but if we don’t set a specific time, it’s easy for everyone to get involved with other priorities and forget.

The calendar is your friend. Once you create a meeting, you might still need to negotiate what time works for everyone involved, but you’ve started the collaboration process. Of course, if it’s easy to simply walk over to someone’s desk to ask a question, or pick up the phone if they’re not co-located, do that. But if your team is like ours, where programmers and other roles pair full time, and there’s always a lot going on, a scheduled meeting helps everyone plan the time for it.

If you have a tough project ahead, find a pair and set up a recurring meeting to work together. If you need one-off help, add a time-boxed meeting for today or tomorrow. If you need the whole team to brainstorm about some testing issues, schedule a meeting for the time of day with the fewest distractions. And if you haven’t seen an old friend in too long, schedule a date for that, too!

Does Agile Apply to Your Project?

I have a new column posted at projectmanagement.com. It’s called Does Agile Apply to Your Project? (You might need a free registration.)

 

Categories: Project Management

Why isn't the architecture in the code?

Coding the Architecture - Simon Brown - Tue, 01/13/2015 - 10:25

In response to my System Context diagram as code post yesterday was this question:

@simonbrown why is that information not already in the system's code?

— Nat Pryce (@natpryce) January 12, 2015

I've often asked the same thing and, if the code is the embodiment/implementation of the architecture, this information really should be present in the code. But my experience suggests this is rarely the case.

System context

My starting point for describing a software system is to draw a system context diagram. This shows the system in question along with key user types (e.g. actors, roles, personas, etc) and system dependencies.

I should be able to get a list of user roles from the code. For example, many web applications will have some configuration that describes the various user roles, Active Directory groups, etc and the parts of the web application that they have access too. This will differ from codebase to codebase and technology to technology, but in theory this information is available somewhere.

The key system dependencies are a little harder to extract from a codebase. Again, we can scrape security configuration to identify links to systems such as LDAP and Active Directory. We could also search the codebase for links to known libraries or APIs, and make the assumption that these are a system dependencies. But what about those system interactions that are done by copying a file into a network share? I know this sounds archaic, but it still happens. Understanding inbound dependencies is also tricky, especially if you don't keep track of your API consumers.

Containers

The next level in my C4 model is a container diagram, which basically shows the various web applications, mobile apps, databases, file systems, standalone applications, etc and how they interact to form the overall software system. Again, some of this information will be present, in one form or another, in the codebase. For example, you could scrape this information out of an IDE such as IntelliJ IDEA (i.e. modules) or Visual Studio (i.e. projects). The output from build scripts for code (e.g. Ant, Maven, MSBuild, etc) and infrastructure (e.g. Puppet, Chef, Vagrant, Docker, etc) will probably result in deployable units, which can again be identified and this information used to create the containers model.

Components

The third level of the C4 model is components (or modules, services, layers, etc). Since even a relatively small application may consist of a large number of components, this is a level that we certainly want to automate. But it turns out that even this is tricky. Usually there's a lack of an architecturally-evident coding style, which means you get a conflict between the software architecture model and the code. This is particularly true in older systems where the codebase lacks modularity and looks like a sea of thousands of classes interacting with one another. As Robert Annett suggests, there are a number of strategies that we can use to identify "components" from a codebase though; including annotations/attributes, packaging conventions, naming conventions, module systems (e.g. OSGi), library dependencies and so on.

Auto-generating the software architecture model

Ultimately, I'd like to auto-generate as much of the software architecture model as possible from the code, but this isn't currently realistic. Why?

@natpryce @simonbrown because code doesn't contain the structures needed (and we don't train/show people how to do it)

— Eoin Woods (@eoinwoodz) January 13, 2015

We face two key challenges here. First of all, we need to get people thinking about software architecture once again so that they are able to think about, describe and discuss the various structures needed to reason about a large and/or complex software system. And secondly, we need to find a way to get these structures into the codebase. We have a way to go but, in time, I hope that the thought of using Microsoft Visio for drawing software architecture diagrams will seem ridiculous.

Categories: Architecture

Python: Counter – ValueError: too many values to unpack

Mark Needham - Tue, 01/13/2015 - 00:16

I recently came across Python’s Counter tool which makes it really easy to count the number of occurrences of items in a list.

In my case I was trying to work out how many times words occurred in a corpus so I had something like the following:

>> from collections import Counter
>> counter = Counter(["word1", "word2", "word3", "word1"])
>> print counter
Counter({'word1': 2, 'word3': 1, 'word2': 1})

I wanted to write a for loop to iterate over the counter and print the (key, value) pairs and started with the following:

>>> for key, value in counter:
...   print key, value
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack

I’m not sure why I expected this to work but in fact since Counter is a sub class of dict we need to call iteritems to get an iterator of pairs rather than just keys.

The following does the job:

>>> for key, value in counter.iteritems():
...   print key, value
...
word1 2
word3 1
word2 1

Hopefully future Mark will remember this!

Categories: Programming

The Stunning Scale of AWS and What it Means for the Future of the Cloud

James Hamilton, VP and Distinguished Engineer at Amazon, and long time blogger of interesting stuff, gave an enthusiastic talk at AWS re:Invent 2014 on AWS Innovation at Scale. He’s clearly proud of the work they are doing and it shows.

James shared a few eye popping stats about AWS:

  • 1 million active customers
  • All 14 other cloud providers combined have 1/5th the aggregate capacity of AWS (estimate by Gartner in 2013)
  • 449 new services and major features released in 2014
  • Every day, AWS adds enough new server capacity to support all of Amazon’s global infrastructure when it was a $7B annual revenue enterprise (in 2004).
  • S3 has 132% year-over-year growth in data transfer
  • 102Tbps network capacity into a datacenter.

The major theme of the talk is the cloud is a different world. It’s a special environment that allows AWS to do great things at scale, things you can’t do, which is why the transition from on premise x86 servers to the public cloud is happening at a blistering pace. With so many scale driven benefits to the public cloud, it's a transition that can't be stopped. The cloud will keep getting more reliable, more functional, and cheaper at a rate that you can't begin to match with your limited resources, generalist gear, bloated software stacks, slow supply chains, and outdated innovation paradigms.

That's the PR message at least. But one thing you can say about Amazon is they are living it. They are making it real. So a healthy doubt is healthy, but extrapolating out the lines of fate would also be wise.

One of the fickle finger of fate advantages AWS has is resources. At one million customers they have the scale to keep the engine of expansion and improvement going. Profits aren't being taken out, money is being reinvested. This is perhaps the most important advantage of scale.

But money without smarts is simply waste. Amazon wants you to know they have the smarts. We've heard how Google and Facebook build their own gear, Amazon does too. They build their own networking gear, networking software, racks, and they work with Intel to get faster processor versions of processors than are available on the market. The key is they know everything and control everything about their environment, so they can build simpler gear that does exactly what they want, which turns out to be cheaper and more reliable in the end.

Complete control allows quality metrics to be built into everything. Metrics drive a constant quality increase in all parts of the system, which is why against all odds AWS is getting more reliable as the pace of innovation quickens. Great pools of actionable data turned into knowledge is another huge advantage of scale.

Another thing AWS can do that you can't is the Availability Zone architecture itself. Each AZ is its own datacenter and AZs within a region are located very close together. This reduces messaging latencies, which means state can be synchronously replicated between AZs, which greatly improves availability compared to the typical approach where redundant datacenters are very far apart. 

It's a talk rich with information and...well, spunk. The real meta-theme of the talk is how Amazon consciously uses scale to their competitive advantage. For Amazon scale isn't just an expense to be dealt with, scale is a resource to exploit, if you know how.

Here's my gloss of James Hamilton's incredible talk...

Everything in the Talk has a Foundation in Scale
Categories: Architecture

How Taiseer Joudeh, Built an Incredibly Successful Technical Blog in 14 Months

Making the Complex Simple - John Sonmez - Mon, 01/12/2015 - 16:30

I recently had the opportunity to interview very prolific and friendly, Taiseer Joudeh. He first caught my attention when I saw he had mentioned me on Twitter, thanking me for the help in getting his Microsoft MVP and mentioning me in his blog post. John @jsonmez Thanks for all the inspiration and motivation you blog about! I’m MVP now. Special ... Read More

The post How Taiseer Joudeh, Built an Incredibly Successful Technical Blog in 14 Months appeared first on Simple Programmer.

Categories: Programming

System Context diagram as code

Coding the Architecture - Simon Brown - Mon, 01/12/2015 - 15:10

As I said in Resolving the conflict between software architecture and code, my focus for this year is representing a software architecture model as code. In Simple Sketches for Diagramming Your Software Architecture, I showed an example System Context diagram for my techtribes.je website.

techtribes.je System Context diagram

It's a simple diagram that shows techtribes.je in the middle, surrounded by the key types of users and system dependencies. It's your typical "big picture" view. This diagram was created using OmniGraffle (think Microsoft Visio for Mac OS X) and it's exactly that - a static diagram that needs to be manually kept up to date. Instead, wouldn't it be great if this diagram was based upon a model that we could better version control, collaborate on and visualise? If you're not sure what I mean by a "model", take a look at Models, sketches and everything in between.

This is basically what the aim of Structurizr is. It's a way to describe a software architecture model as code, and then visualise it in a simple way. The Structurizr Java library is available on GitHub and you can download a prebuilt binary. Just as a warning, this is very much a work in progress and so don't be surprised if things change! Here's some Java code to recreate the techtribes.je System Context diagram.

Executing this code creates this JSON, which you can then copy and paste into the try it page of Structurizr. The result (if you move the boxes around) is something like this.

techtribes.je System Context diagram

Don't worry, there will eventually be an API for uploading software architecture models and the diagrams will get some styling, but it proves the concept. What we have then is an API that implements the various levels in my C4 software architecture model, with a simple browser-based rendering tool. Hopefully that's a nice simple introduction of how to represent a software architecture model as code, and gives you a flavour for the sort of direction I'm taking it. Having the software architecture as code provides some interesting opportunities that you don't get with static diagrams from Visio, etc and the ability to keep the models up to date automatically by scanning the codebase is what I find particularly exciting. If you have any thoughts on this, please do drop me a note.

Categories: Architecture

Plan Management

Herding Cats - Glen Alleman - Mon, 01/12/2015 - 15:05

When we hear ...

"Don't fall in love with your plan, it is almost certainly wrong "

let's reflect on advice from IEEE Std 730-2014

Plan Management: It is obvious that the project must be managed during its execution, but perhaps not so obvious that the plan itself must be managed. Despite the fact that nearly continuous change destabilizes any plan, the plan itself must not be allowed to float and become meaningless. The project must adopt a discipline for monitoring, reviewing and revising plans that they are stable in application while responsive to change.

Plan Do Check Act CIThe management process of IEEE/ISO 12207-2008 requires the manager to control the execution of the project by monitoring and reporting progress, and investigating, analyzing and resolving problems that arise.

The common phrase used by many in agile is Plan-Do-Check-Act.

This starts with Plan. In order to Do, Check, and Act, we need a place to start. The Plan. So falling in love with our Plan and then not, Doing, Checking the outcome, and Acting on the progress to that Plan, determining the varaincies from that Plan, assessing the impact from those variances, would mean we are not managing the project.

This would also mean that falling in love with the Plan would require use ignorning the very basis of management in the presence of change.

Plan First

So make plans, get feedback on the progress to plan, update the plan with this information, make a new or updated plan, schedule the work that the plan shows needed to be done, and measure progress to plan in units of measure meaningful to the decision makers.

Like the picture above, if you're going to embark on a day hike to Twin Sisters in Rocky Mountain National Park, you'd better have a plan. Not just for how long it's going to take, what your path is going to be, but have a Plan-B and likely a Plan-C when the weather goes bad, thunder storms come over the mountain, or any other disruptive event.

But most of all PLAN, and don't fall for that non-actionable statement that plans are almost certainly wrong. Because without the plan, you have no clue about what DONE looks like, how to get there, how long it will take, how much it will cost, or how you'll recognize DONE when it arrives.

Plans are Strategies for Success

They can be neither Right or Wrong, they can only represent you're current understanding of the emerging situation that leads to the current path to success.

Related articles Taxonomy of Logical Fallacies The False Notion of "we haven't seen this before" What is Governance? Your Project Needs a Budget and Other Things
Categories: Project Management

Does Agile Work Because We are Optimistic?

I read the Business Week opening remarks, How Optimism Strengthens Economies.  See this quote at the end:

the group of people who turn out to be most accurate about predicting how long it will take to complete tasks—and how likely they are to succeed—are the clinically depressed. Optimists underestimate how difficult it will be to succeed. But that self-deception is precisely what makes them willing to take more risks and invest in a better future, while the pessimists slouch toward self-fulfilling failure. 

Pessimistic people are more accurate with their estimation. Optimists underestimate. Their optimism allows them to take more risks and innovate. Which kind of person are you? (I’m both, in different circumstances.)

That got me thinking about why agile works.

Agile and lean (I’m using agile as shorthand for both) help us make progress in small chunks. That creates hope and optimism in the project team. When the project team demos or releases to other people, they trust the team and a become hopeful and optimistic.

We know from The Progress Principle: Using Small Wins to Ignite Joy, Engagement, and Creativity at Work that the more we make progress with small wins, the better we feel and the more likely we are to make more progress. That leads to hope and optimism.

Is this why agile works? Because we can make progress daily?

It’s not the only reason. We also need feedback. When we provide demos to other people, as often as possible, we build trust. With trust comes the possibility of better connection and feedback.

We get feature-itis because we’re no longer in requirements hell. Other people can see that a project team can deliver. That leads to optimism and hope in the organization. (I’m differentiating the two, because they are different. See my review of Seligman’s book.) With hope, people can rise to many occasions. Without hope? I bet you’ve been there on a project. It isn’t pretty.

Agile is not for everyone. Agile approaches? Yes. Completing small chunks of work and showing it to other people? You can do that with deliverable-based planning, building incrementally, and iterative approaches to replanning. If you want a name for that approach, it’s called staged delivery or design to schedule.

If you’re doing agile well, you’re delivering new small features into the code base every day or every other day. That helps you feel as if you’re making progress. When you feel as if you’re making progress, you can be more optimistic or hopeful. That helps you see new possibilities.

I would rather work in a hopeful way, making progress on a project, than feel as if I’m dragging. What about you?

So, agile might increase optimism, which allows us to make more progress and innovate. Agile done well brings joy to our work.

People are always asking my why agile works, or to prove it works. I can’t prove anything.

I have said that in my experience, when people work in an agile way, they are more productive and more effective. Now I wonder if this is because they are optimistic and hopeful about their work.

Categories: Project Management

Models, sketches and everything in between

Coding the Architecture - Simon Brown - Mon, 01/12/2015 - 13:35

Eoin Woods (co-author of the Software Systems Architecture book) and I presented a session at the Software Architect 2014 conference titled Models, sketches and everything in between, where we discussed the differences between diagrams and models for capturing and communicating the software architecture of a system.

Just the mention of the word "modelling" brings back horrible memories of analysis paralysis for many software developers. And, in their haste to adopt agile approaches, we’ve seen countless software teams who have thrown out the modelling baby with the process bathwater. In extreme cases, this has led to the creation of software systems that really are the stereotypical "big ball of mud". In this session, Simon and Eoin will discuss models, sketches and everything in between, providing you with some real world advice on how even a little modelling can help you avoid chaos.

Models, sketches and everything in between - video

Models, sketches and everything in between - slides

The video and slides are both available. After a short overview of our (often differing!) opinions, we answered the following questions.

  1. Modelling - Why Bother?
  2. Modelling and Agility?
  3. How to Do It?
  4. UML - Is It Worth the Hassle?
  5. Modelling in the Large vs the Small

It was a very fun session to do and I'd recommend taking a look if you're interested in describing/communicating the software architecture of your system. Enjoy!

Categories: Architecture

SPaMCAST 324 – Software Non-Functional Assessment Process, SNAP

www.spamcast.net

http://www.spamcast.net

Listen to the Software Process and Measurement Cast (Here)

The Software Process and Measurement Cast features our interview with Charley Tichenor and Talmon Ben-Cnaan on the Software Non-Functional Assessment Process (SNAP).  SNAP is a standard process for measuring non-functional size.  Both Talmon and Charley are playing an instrumental role in developing and evolving the SNAP process and metric.  SNAP helps developers and leaders to shine a light on non-functional work required for software development and is useful for analyzing, planning and estimating work.

Talmon’s Bio:

Talmon Ben-Cnaan is the chairperson of the International Function Point User Group (IFPUG) committee for Non-Functional Software Sizing (NFSSC) and a Quality Manager at Amdocs. He led the Quality Measurements in his company, was responsible for collecting and analyzing measurements of software development projects and provided reports to senior management, based on those measurements. Talmon was also responsible for implementing Function Points in his organization.

Currently he manages quality operations and test methodology in Amdocs Testing division. The Amdocs Testing division includes more than 2,200 experts, located at more than 30 sites worldwide, and specializing in testing for the Telecommunication Service Providers.

Amdocs is the market leader in the Telecommunications market, with over 22,000 employees, delivering the most advanced business support systems (BSS), operational support systems (OSS), and service delivery to Communications Service Providers in more than 50 countries around the world.

Charley’s Bio:

Charley Tichenor has been a member of the International Function Point Users Group since 1991, and twice certified as a Certified Function Point Specialist.  He is currently a member of the IFPUG Non-functional Sizing Standards Committee, providing data collection and analysis support.  He recently retired from the US government with 32 years’ experience as an Operations Research Analyst, and is currently an Adjunct Professor with Marymount University in Washington, DC, teaching business analytics courses.  He has a BSBA degree from The Ohio State University, an MBA from Virginia Tech, and a Ph.D. in Business from Berne University.

 

Note:  Charley begins the interview with a work required disclaimer but then we SNAP to it … so to speak.

Next

In the next Software Process and Measurement Cast we will feature our essay on product owners.  The role of the product owner is one of the hardest to implement when embracing Agile. However how the role of the product owner is implemented is often a clear determinant of success with Agile.  The ideas in our essay can help you get it right.

We will also have new columns from the Software Sensei, Kim Pries and Jo Ann Sweeney with her Explaining Communication series.

Call to action!

We are in the middle of a re-read of John Kotter’s classic Leading Change on the Software Process and Measurement Blog.  Are you participating in the re-read? Please feel free to jump in and add your thoughts and comments!

After we finish the current re-read will need to decide which book will be next.  We are building a list of the books that have had the most influence on readers of the blog and listeners to the podcast.  Can you answer the question?

What are the two books that have most influenced you career (business, technical or philosophical)?  Send the titles to spamcastinfo@gmail.com.

First, we will compile a list and publish it on the blog.  Second, we will use the list to drive future  “Re-read” Saturdays. Re-read Saturday is an exciting new feature that began on the Software Process and Measurement blog on November 8th.  Feel free to choose you platform; send an email, leave a message on the blog, Facebook or just tweet the list (use hashtag #SPaMCAST)!

Shameless Ad for my book!

Mastering Software Project Management: Best Practices, Tools and Techniques co-authored by Murali Chematuri and myself and published by J. Ross Publishing. We have received unsolicited reviews like the following: “This book will prove that software projects should not be a tedious process, neither for you or your team.” Support SPaMCAST by buying the book here.

Available in English and Chinese.


Categories: Process Management

SPaMCAST 324 – Software Non-Functional Assessment Process, SNAP

Software Process and Measurement Cast - Sun, 01/11/2015 - 23:00

The Software Process and Measurement Cast features our interview with Charley Tichenor and Talmon Ben-Cnaan on the Software Non-Functional Assessment Process (SNAP).  SNAP is a standard process for measuring non-functional size.  Both Talmon and Charley are playing an instrumental role in developing and evolving the SNAP process and metric.  SNAP helps developers and leaders to shine a light on non-functional work required for software development and is useful for analyzing, planning and estimating work.

Talmon’s Bio:

Talmon Ben-Cnaan is the chairperson of the International Function Point User Group (IFPUG) committee for Non-Functional Software Sizing (NFSSC) and a Quality Manager at Amdocs. He led the Quality Measurements in his company, was responsible for collecting and analyzing measurements of software development projects and provided reports to senior management, based on those measurements. Talmon was also responsible for implementing Function Points in his organization.

Currently he manages quality operations and test methodology in Amdocs Testing division. The Amdocs Testing division includes more than 2,200 experts, located at more than 30 sites worldwide, and specializing in testing for the Telecommunication Service Providers.

Amdocs is the market leader in the Telecommunications market, with over 22,000 employees, delivering the most advanced business support systems (BSS), operational support systems (OSS), and service delivery to Communications Service Providers in more than 50 countries around the world.

Charley’s Bio:

Charley Tichenor has been a member of the International Function Point Users Group since 1991, and twice certified as a Certified Function Point Specialist.  He is currently a member of the IFPUG Non-functional Sizing Standards Committee, providing data collection and analysis support.  He recently retired from the US government with 32 years’ experience as an Operations Research Analyst, and is currently an Adjunct Professor with Marymount University in Washington, DC, teaching business analytics courses.  He has a BSBA degree from The Ohio State University, an MBA from Virginia Tech, and a Ph.D. in Business from Berne University.

 

Note:  Charley begins the interview with a work required disclaimer but then we SNAP to it … so to speak.

Next

In the next Software Process and Measurement Cast we will feature our essay on product owners.  The role of the product owner is one of the hardest to implement when embracing Agile. However how the role of the product owner is implemented is often a clear determinant of success with Agile.  The ideas in our essay can help you get it right.

We will also have new columns from the Software Sensei, Kim Pries and Jo Ann Sweeney with her Explaining Communication series.

Call to action!

We are in the middle of a re-read of John Kotter’s classic Leading Change on the Software Process and Measurement Blog.  Are you participating in the re-read? Please feel free to jump in and add your thoughts and comments!

After we finish the current re-read will need to decide which book will be next.  We are building a list of the books that have had the most influence on readers of the blog and listeners to the podcast.  Can you answer the question?

What are the two books that have most influenced you career (business, technical or philosophical)?  Send the titles to spamcastinfo@gmail.com.

First, we will compile a list and publish it on the blog.  Second, we will use the list to drive future  “Re-read” Saturdays. Re-read Saturday is an exciting new feature that began on the Software Process and Measurement blog on November 8th.  Feel free to choose you platform; send an email, leave a message on the blog, Facebook or just tweet the list (use hashtag #SPaMCAST)!

Shameless Ad for my book!

Mastering Software Project Management: Best Practices, Tools and Techniques co-authored by Murali Chematuri and myself and published by J. Ross Publishing. We have received unsolicited reviews like the following: “This book will prove that software projects should not be a tedious process, neither for you or your team.” Support SPaMCAST by buying the book here.

Available in English and Chinese.

Categories: Process Management

MVVM and Threading

Actively Lazy - Sun, 01/11/2015 - 21:01

The Model-View-ViewModel pattern is a very powerful design pattern when building WPF applications, even if I’m not sure everyone interprets it the same way. However, it’s never been clear to me how to easily manage multi-threaded WPF applications: writing multi-threaded code is hard and there seems to be no real support baked into WPF or the idea of MVVM to make multi-threaded code easier to get right.

The Problem

All WPF applications are effectively bound by the same three constraints:

  1. All interaction with UI elements must be done on the UI thread
  2. All long-running tasks (web service calls etc) should be done on a background thread to keep the UI responsive
  3. Repeatedly switching between threads is expensive

Bound by these constraints it means that all WPF code has two types of thread running through it: UI threads and background threads. It is clearly important to know which type of thread will be executing any given line of code: a background thread cannot interact with UI elements and UI threads should not make expensive calls.

Example

A very brief, contrived example might help. The source code for this is available on github.

Here is a viewmodel for a very simple view:

class ViewModel
{
  public ObservableCollection<string> Items { get; private set; }
  public ICommand FetchCommand { get; private set; }

  public async void Fetch()
  {
    var items = await m_model.Fetch();
    foreach (var item in items)
      Items.Add(item);
  }
}

The ViewModel exposes a command, which calls ViewModel.Fetch() to retrieve some data from the model; once retrieved this data is added to the list of displayed items. Since Fetch is called by an ICommand and interacts with an ObservableCollection (bound to the view) it clearly runs on the UI thread.

Our model is then responsible for fetching the data requested by the viewmodel:

class Model
{
  public async Task<IList<string>> Fetch()
  {
    return await Task.Run(() => DoFetch());
  }

  private IList<string> DoFetch()
  {
    Thread.Sleep(1000);
    return new[] { "Hello" };
  }
}

In a real application DoFetch() would obviously call a database or web service or whatever is required; it is also probable that the Fetch() method might be more complex, e.g. coordinating multiple service calls or managing caching or other application logic.

There is a difference in threading in the model compared to the viewmodel: the Fetch method will, in our example, be called on the UI thread whereas DoFetch will be called on a background thread. Here we have a class through which may run two separate types of thread.

In this very simple example which thread type calls each method is obvious. But scale this up to a real application with dozens of classes and many methods and it becomes less obvious. It suddenly becomes very easy to add a long-running service call to a method which runs on the UI thread; or a switch to a background thread from a method that already runs on a background thread. These errors can be difficult to spot: neither causes an obvious failure. The first will only show if the service call is obviously slow, the observed behaviour may simply be a UI that intermittently pauses for no apparent reason. The second simply slows tasks down, the application will seem slower than it ought to be with no obvious indication of why.

It seems as though WPF and MVVM have carefully guided us into a multi-threaded minefield.

First Approach

The first idea is to simply apply a naming convention, each method is suffixed with _UI or _Worker to indicate which type of thread invoked it. E.g. our model class becomes:

class Model
{
  public async Task<IList<string>> Fetch_UI()
  {
    return await Task.Run(() => Fetch_Worker());
  }

  private IList<string> Fetch_Worker()
  {
    Thread.Sleep(1000);
    return new[] { "Hello" };
  }
}

This at least makes it obvious to my future self which type of thread I think executes each method. Simple code inspection shows that a _UI method calling someWebService.Invoke(…) is a Bad Thing. Similarly, Task.Run(…) in a _Worker method is obviously suspect. However, it looks ugly and isn’t guaranteed to be correct – it is just a naming convention, nothing stops me calling a _UI method from a background thread, or vice versa.

Introducing PostSharp

If you haven’t tried PostSharp, the C# AOP library, it is definitely worth a look. Even the free version is quite powerful and allows us to evolve the first idea into something more powerful.

PostSharp allows you to create an attribute which will introduce “advice” (i.e. code to run) around any method annotated with the attribute. For example, we can annotate the ViewModel constructor with a new UIThreadPolicy attribute:

  [UIThreadPolicy]
  public ViewModel()
  {
    Items = new ObservableCollection<string>();
    FetchCommand = new FetchCommand(this);
  }

This attribute is logically similar to using a suffix on the method name, in that it documents our intention. However, by using AOP it also allows us to introduce code to run before the method executes:

class UIThreadPolicy : OnMethodBoundaryAspect
{
  public override void OnEntry(MethodExecutionArgs args)
  {
#if DEBUG
    if (Thread.CurrentThread.IsBackground)
      Console.WriteLine("*** Thread policy warning: \n" + Environment.StackTrace);
#endif
  }
}

The OnEntry method will be triggered before the annotated method is invoked. If the method is ever invoked on a background thread, we’ll output a warning to the console. In this rudimentary way, we not only document our intention that the ViewModel should only be created on a UI thread, we also enforce it at runtime to ensure that it remains correct.

We can define another attribute, WorkerThreadPolicy, to enforce the reverse: that a method is only invoked on a background thread. With discipline one of these two attributes can be applied to every method. This makes it trivial when making changes in months to come to know whether a given method runs on the UI thread or a background thread.

Locking

Understanding situations where multiple threads access code is hard, so wouldn’t it be great if we could easily identify situations where it’s safe to ignore it?

By using the thread attributes, we can identify situations where code is only accessed by the UI thread. In this case, we have no concurrency to deal with. There is exactly one UI thread so we can ignore any concurrency concerns. For example, our simple Fetch() method above can add a property to keep track of whether we’re already busy:

  [UIThreadPolicy]
  public async void Fetch()
  {
    IsFetching = true;
    try
    {
      var items = await m_model.Fetch();
      foreach (var item in items)
      Items.Add(item);
    }
    finally
    {
      IsFetching = false;
    }
  }

So long as all access to the IsFetching property is on the UI thread, there is no need to worry about locking. We can enforce this by adding attributes to the property itself, too:

  private bool m_isFetching;

  internal bool IsFetching
  {
    [UIThreadPolicy]
    get { return m_isFetching; }

    [UIThreadPolicy]
    set
    {
      m_isFetching = value;
      IsFetchingChanged(this, new EventArgs());
    }
  }

Here we use a simple, unlocked bool – knowing that the only access is from a single thread. Without these attributes, it is possible that someone in the future writes to IsFetching from a background thread. It will generally work – but access from the UI thread could continue to see a stale value for a short period.

Conclusion

In general we have aimed for a pattern where ViewModels are only accessed on the UI thread. Model classes, however, tend to have a mixture. And since the “model” in any non-trivial application actually includes dozens of classes this mixture of threads permeates the code. By using these attributes it is possible to understand which threads are used where without exhaustively searching up call stacks.

Writing multi-threaded code is hard: but, with a bit of PostSharp magic, knowing which type of thread will execute any given line of code at least makes it a little bit easier.


Categories: Programming, Testing & QA

The Notion of Enterprise Software Development

Herding Cats - Glen Alleman - Sun, 01/11/2015 - 20:09

The Road Map to SWEWhen we hear about some new fangled way of writing software for money, first ask in what domain has this new idea been applied, and what were the measures of effectiveness and measures of performance for that approach?

Then ask if there is any external frameworks applicable in that domain for developing software based business systems that govern the development, deployment, and operational aspects of the work? 

When the answer is yes, then next ask what are those governance frameworks? In a current engagement, ISO 12207 is the overarching framework for the development, testing, deployment, and operations of software systems. These systems provide services for Health and Human Services, Center for Medicaid Services and the disbursement of $492 billion. 

The management of software development at the Federal, State, Local level, and the commercial providers of services to those levels is guided by ISO 12207 which is composed of the following processes. 

Screen Shot 2015-01-11 at 10.50.17 AM

Now you may say we never ever have a project of this size. Such projects are insanely large and outside any domain we'll ever see.

So the inverse question is at what size do you no longer care about:

  • Budgeting?
  • Scheduling?
  • Resource management?
  • Estimate to Complete?
  • Estimate at Completion?
  • Risk informed estimates of cost, schedule, and technical performance of the deliverables?
  • Risk informed decision making in general?
  • Definitive descriptions of the deliverables in units of measure meaningful to the decision makers?

Then look through the 12207 list above and ask, which of these process areas has NO value for my project? That is, I don't need to know how to do this while spending other peoples money. 

I'll be just find in the absence of the guidance provided by this process area.

By the Way, the use of Agile methods fits right in with development work described in 12207.

With the answers to those questions, you can come back to see where you fit in the spectrum of projects.

Are you a Lone Wolf writing code for yourself or a customer where only you work or are you a member of an enterprise development team working on systems that are essential to the business in which they cannot fail, even cannot fail the first time they go live? Without the answer to this question, there can be no way to assess any suggestion, conjecture, or wild assed ideas about improving the probability of success for any software project. Related articles Taxonomy of Logical Fallacies Planning is the basis of decision making in the presence of uncertainty What is Governance? Good Project and Bad Project The Myth and Half-Truths of "Myths and Half-Truths"
Categories: Project Management

Re-read Saturday: Consolidating Gains and Producing More Change, John P. Kotter Chapter 9

index

My local football (US variety) team seems to have the ability to be winning throughout the game only to wear out, lose focus or generally find a way to lose with very little time left on the clock. Significant organizational changes, like pursuing a goal or winning a game, is not over until the change is complete and it has become part of the culture. Stage 7 of Kotter’s 8-stage process for creating major changes is consolidating gains and producing more change. Just like my local football team, you must make sure you hold on to your gains and use them to build toward the next step forward. Kotter addresses two major topics in the stage: resistance and interdependence.

As time goes by and you experience short-term successes, it is easy to begin lose the urgency that helped power a change program. Until a change is written into an organization’s culture, resistance will always be lurking. Loss of urgency can lead to programs stalling. For example, I recently discussed a change program with colleagues that had been designed to transform an organization using Agile techniques including Scrum, continuous delivery and test driven development. Scrum was implemented as the first step and generated significant benefits. As the initial benefits were recognized, a number of leaders began to argue that 80% of the benefit had been generated and that the rest of the changes would be difficult. The argument led to a loss of urgency, momentum and a reduction in funding as attention wandered to another program with less resistance. Urgency and constancy of purpose must be continually maintained or resistance can lead to regression.

Significant organizational change typically requires changes to many different groups and processes to be effective. The larger the intended change, the larger the number of moving parts and interactions that will need to be involved when making a change. As most change programs progress, they evolve. Evolution is typically generated by feedback from the short-term wins and other sources within the environment. Changes help to identify new interactions and dependencies, which add complexity and the level of effort. Kotter uses an example of the difference of rearranging an office with all of the furniture attached with rubber bands and one without. The one in which the furniture is connected with rubber bands will require significantly more planning and effort. Each item will pull against each other as changes are made. As changes are identified, the program will potentially need to add new people and resources or perhaps even new subprojects many need to be established. Senior management needs to provide a sense of urgency for the change program and a vision of where the program is going. At the same time, the complexity of any significant change program requires tactical leadership and management. Effective change programs require both strategic vision and tactical management for effective delivery. The combination of interactions and dependencies cause complexity that requires focus and constancy of purpose by senior, middle and line management to facilitate change.

While any project or program evolves as new information and knowledge is discovered, we need to continually challenge the validity of change. Change causes complexity.  The higher the complexity of any program, the less likely they are to complete, at least effectively. One of the principles noted in the Agile Manifesto, is that “simplicity – the art of maximizing the amount of work not done – is essential.” Each part of the change program, and especially any changes or additions that are discovered as progress is made, must be evaluated to ensure that only what is required to deliver the vision is addressed. Remember the adage: keep it simple stupid. The tool to manage change is the guiding collation.  Use the guiding collation to accept change and to prioritize the change program’s backlog (sounds like a product owner).

Kotter summarizes stage 7 this way:

  • More change, not less – The program must build on the credibility and the feedback of the short-term wins.
  • More help – As inter-dependencies are identified, bring new people and resources into the program with the needed experience and knowledge.
  • Continued leadership – Senior management must have constancy of purpose. They need to continually provide and maintain a clear vision.
  • Project management and leadership from below – The individual projects and initiatives require tactical leadership and management to implement the visions of senior management.
  • Reduction of unnecessary inter-dependencies – Keep the change program as simple as it need to be.

All large projects, whether they are significant organizational change programs or not, take time and evolve. At some point, as change programs progress and generate benefits it will become tempting to declare victory. Yogi Berra stated “It ain’t over till it’s over.” A change program is not complete until it attains the vision for the program and has been integrated into the organization’s culture.


Categories: Process Management

Planning is the Basis of Decision Making in the Presence of Uncertainty

Herding Cats - Glen Alleman - Sat, 01/10/2015 - 23:46

Another twitter conversation - of sorts - prompted a thought about the purpose of planning, from the Quote of the Day post. In the enterprise IT world, planning and plans provide the road map for development and deployment of systems that implement the strategy of the business.

Planning

It is the process of planning that reveals the needed seqeunce of delivered capabilities. In the picture below a health insurance providers network application is deployed to replace several legacy systems over the course of time. Specific capabilities must be in place before later capabilities can be deployed.

This is a Plan for the development and deployment of those capabilities. From the plan, technical and operational requirements ae developed, software developed, tested, IV&V's, confirmed to be compliant with security and regulatory guidelines, business users trained, and providers (medical) trained.

Screen Shot 2015-01-10 at 11.41.06 AM

The sequence of these capabilities is well defined from the business operations side. The development of software inside each capability providing activity has some flexibility. But if we're going to arrive on the planned - NEED = date for say Data Store Lookup, then the sequence of that software has some flexibility. But not arbitrary flexibility. Further levels of planning are needed for resource availability - both people, machines, services, facilities, etc. 

Planning is Planning for Capabilities to Be Available to Deliver Value

The planning process is driven by Capabilities Based Planning. This process has some simple straightforward steps.

IdentifyCapabilities

So when we hear deliver value on day one, we need to ask in what domain is that even possible. We need to deliver value on the day the value is needed for the business. Having a capability ready before the business is ready to use it is poor resource utilization planning.

We'll have spent money and time on something the business can't use yet. We may have made better use of that time and money on another capability. Which capabilities come in what order, at what time the business is capable of absorbing them is the first and primary role of PLANNING.

So the conjecture that plans are almost certainly wrong, begs the question - do you know what you're doing? If you're plans are almost certainly wrong - at least in the Enterprise IT business - you've got the wrong people managing the project. And you almost certaintly are wasting money developing things that won't be used.

This doesn't mean we don't explore, try different things to see if they'll work. But that work is also planned. It's not our money.

Domain is King

If you've got a project that is just you or maybe you and a few close friends that's going to take a week or maybe 6 weeks, planning in this manner is likely a waste. Along with estimating your work, keeping track of progress to plan, and even counting the money.

But if you're on a $200M enterprise IT development, integration, and deployment project with ~100 developers, testers, QA people, security, compliance, server ops, DBA's etc. you'd better have some notion of the order of work, the order of value delivery, the cost of that work, the probability of showing up on timem, on budget with that value in hand and how you going to herd all the cats that surround the project.

Related articles All About Me or All About the Mission? Your Project Needs a Budget and Other Things What is Governance? Taxonomy of Logical Fallacies Good Project and Bad Project Closed Loop Control The False Notion of "we haven't seen this before"
Categories: Project Management

Diamond Kata - TDD with only Property-Based Tests

Mistaeks I Hav Made - Nat Pryce - Sat, 01/10/2015 - 21:44
The Diamond Kata is a simple exercise that Seb Rose described in a recent blog post. Seb describes the Diamond Kata as: Given a letter, print a diamond starting with ‘A’ with the supplied letter at the widest point. For example: print-diamond ‘C’ prints A B B C C B B A Seb used the exercise to illustrate how he “recycles” tests to help him work incrementally towards a full solution. Seb’s approach prompted Alastair Cockburn to write an article in response in which he argued for more thinking before programming. Alastair’s article shows how he approached the Diamond Kata with more up-front analysis. Ron Jeffries and George Dinwiddie resonded to Alastair’s article, showing how they approached the Diamond Kata relying on emergent design to produce an elegant solution (“thinking all the time”, as Ron Jeffries put it). There was some discussion on Twitter, and several other people published their approaches. (I’ll list as many as I know about at the end of this article). The discussion sparked my interest, so I decided to have a go at the exercise myself. The problem seemed to me, at first glance, to be a good fit for property testing. So I decided to test-drive a solution using only property-based tests and see what happens. I wrote the solution in Scala and used ScalaTest to run and organise the tests and ScalaCheck for property testing. What follows is an unexpurgated, warts-and-all walkthrough of my progress, not just the eventual complete solution. I made wrong turns and stupid mistakes along the way. The walkthrough is pretty long, so if you want you don’t want to follow through step by step, jump straight to the complete solution and/or my conclusions on how the exercise went and what I learned. Alternatively, if you want to follow the walkthrough in more detail, the entire history is on GitHub, with a commit per TDD step (add a failing test, commit, make the implementation pass the test, commit, refactor, commit, … and repeat). Walkthrough Getting Started: Testing the Test Runner The first thing I like to do when starting a new project is make sure my development environment and test runner are set up right, that I can run tests, and that test failures are detected and reported. I use Gradle to bootstrap a new Scala project with dependencies on the latest versions of ScalaTest and ScalaCheck and import the Gradle project into IntelliJ IDEA. ScalaTest supports several different styles of test and assertion syntax. The user guide recommends writing an abstract base class that combines traits and annotations for your preferred testing style and test runner, so that’s what I do first: @RunWith(classOf[JUnitRunner]) abstract class UnitSpec extends FreeSpec with PropertyChecks { } My test class extends UnitSpec: class DiamondSpec extends UnitSpec { } I add a test that explicitly fails, to check that the test framework, IDE and build hang together correctly. When I see the test failure, I’m ready to write the first real test. The First Test Given that I’m writing property tests, I have to start with a simple property of the diamond function, not a simple example. The simplest property I can think of is: For all valid input character, the diamond contains one or more lines of text. To turn that into a property test, I must define “all valid input characters” as a generator. The description of the Diamond Kata defines valid input as a single upper case character. ScalaCheck has a predefined generator for that: val inputChar = Gen.alphaUpperChar At this point, I haven’t decided how I will represent the diamond. I do know that my test will assert on the number of lines of text, so I write the property with respect to an auxiliary function, diamondLines(c:Char):Vector[String], which will generate a diamond for input character c and return the lines of the diamond in a vector. "produces some lines" in { forAll (inputChar) { c => assert(diamondLines(c).nonEmpty) } } I like the way that the test reads in ScalaTest/ScalaCheck. It is pretty much a direct translation of my English description of the property into code. To make the test fail, I write diamondLines as: def diamondLines(c : Char) : Vector[String] = { Vector() } The entire test class is: import org.scalacheck._ class DiamondSpec extends UnitSpec { val inputChar = Gen.alphaUpperChar "produces some lines" in { forAll (inputChar) { c => assert(diamondLines(c).nonEmpty) } } def diamondLines(c : Char) : Vector[String] = { Vector() } } The simplest implementation that will make that property pass is to return a single string: object Diamond { def diamond(c: Char) : String = { "A" } } I make the diamondLines function in the test call the new function and split its result into lines: def diamondLines(c : Char) = { Diamond.diamond(c).lines.toVector } The implementation can be used like this: object DiamondApp extends App { import Diamond.diamond println(diamond(args.lift(0).getOrElse("Z").charAt(0))) } A Second Test, But It Is Not Very Helpful I now need to add another property, to more tightly constrain the solution. I notice that the diamond always has an odd number of lines, and decide to test that: For all valid input character, the diamond has an odd number of lines. This implies that the number of lines is greater than zero (because vectors cannot have a negative number of elements and zero is even), so I change the existing test rather than adding another one: "produces an odd number lines" in { forAll (inputChar) { c => assert(isOdd(diamondLines(c).length)) } } def isOdd(n : Int) = n % 2 == 1 But this new test has a problem: my existing solution already passes it. The diamond function returns a single line, and 1 is an odd number. This choice of property is not helping drive the development forwards. A Failing Test To Drive Development, But a Silly Mistake The next simplest property I can think of is the number of lines of the diamond. If ‘ord(c)’ is the number of letters between ‘A’ and c, (zero for A, 1 for B, 2 for C, etc.) then: For all valid input characters, c, the number of lines in a diamond for c is 2*ord(c)+1. At this point I make a silly mistake. I write my property as: "number of lines" in { forAll (inputChar) { c => assert(diamondLines(c).length == ord(c)+1) } } def ord(c: Char) : Int = c - 'A' I don’t notice the mistake immediately. When I do, I decide to leave it in the code as an experiment to see if the property tests will detect the error by becoming inconsistent, and how long it will take before they do so. This kind of mistake would easily be caught by an example test. It’s a good idea to have a few examples, as well as properties, to act as smoke tests. I make the test pass with the smallest amount of production code possible. I move the ord function from the test into the production code and use it to return the required number of lines that are all the same. def diamond(c: Char) : String = { "A\n" * (ord(c)+1) } def ord(c: Char) : Int = c - 'A' Despite sharing the ord function between the test and production code, there’s still some duplication. Both the production and test code calculate ord(c)+1. I want to address that before writing the next test. Refactor: Duplicated Calculation I replace ord(c)+1 with lineCount(c), which calculates number of lines generated for an input letter, and inline the ord(c) function, because it’s now only used in one place. object Diamond { def diamond(c: Char) : String = { "A\n" * lineCount(c) } def lineCount(c: Char) : Int = (c - 'A')+1 } And I use lineCount in the test as well: "number of lines" in { forAll (inputChar) { c => assert(diamondLines(c).length == lineCount(c)) } } On reflection, using the lineCount calculation from production code in the test feels like a mistake. Squareness The next property I add is: For all valid input character, the text containing the diamond is square Where “is square” means: The length of each line is equal to the total number of lines In Scala, this is: "squareness" in { forAll (inputChar) { c => assert(diamondLines(c) forall {_.length == lineCount(c)}) } } I can make the test pass like this: object Diamond { def diamond(c: Char) : String = { val side: Int = lineCount(c) ("A" * side + "\n") * side } def lineCount(c: Char) : Int = (c - 'A')+1 } Refactor: Rename the lineCount Function The lineCount is also being used to calculate the length of each line, so I rename it to squareSide. object Diamond { def diamond(c: Char) : String = { val side: Int = squareSide(c) ("A" * side + "\n") * side } def squareSide(c: Char) : Int = (c - 'A')+1 } Refactor: Clarify the Tests I’m now a little dissatisfied with the way the tests read: "number of lines" in { forAll (inputChar) { c => assert(diamondLines(c).length == squareSide(c)) } } "squareness" in { forAll (inputChar) { c => assert(diamondLines(c) forall {_.length == squareSide(c)}) } } The “squareness” property does not stand alone. It doesn’t communicate that the output is square unless combined with “number of lines” property. I refactor the test to disentangle the two properties: "squareness" in { forAll (inputChar) { c => val lines = diamondLines(c) assert(lines forall {line => line.length == lines.length}) } } "size of square" in { forAll (inputChar) { c => assert(diamondLines(c).length == squareSide(c)) } } The Letter on Each Line The next property I write specifies which characters are printed on each line. The characters of each line should be either a letter that depends on the index of the line, or a space. Because the diamond is vertically symmetrical, I only need to consider the lines from the top to the middle of the diamond. This makes the calculation of the letter for each line much simpler. I make a note to add a property for the vertical symmetry once I have made the implementation pass this test. "single letter per line" in { forAll (inputChar) { c => val allLines = diamondLines(c) val topHalf = allLines.slice(0, allLines.size/2 + 1) for ((line, index) <- topHalf.zipWithIndex) { val lettersInLine = line.toCharArray.toSet diff Set(' ') val expectedOnlyLetter = ('A' + index).toChar assert(lettersInLine == Set(expectedOnlyLetter), "line " + index + ": \"" + line + "\"") } } } To make this test pass, I change the diamond function to: def diamond(c: Char) : String = { val side: Int = squareSide(c) (for (lc <- 'A' to c) yield lc.toString * side) mkString "\n" } This repeats the correct letter for the top half of the diamond, but the bottom half of the diamond is wrong. This will be fixed by the property for vertical symmetry, which I’ve noted down to write next. Vertical Symmetry The property for vertical symmetry is: For all input character, c, the lines from the top to the middle of the diamond, inclusive, are equal to the reversed lines from the middle to the bottom of the diamond, inclusive. "is vertically symmetrical" in { forAll(inputChar) { c => val allLines = diamondLines(c) val topHalf = allLines.slice(0, allLines.size / 2 + 1) val bottomHalf = allLines.slice(allLines.size / 2, allLines.size) assert(topHalf == bottomHalf.reverse) } } The implementation is: def diamond(c: Char) : String = { val side: Int = squareSide(c) val topHalf = for (lc <- 'A' to c) yield lineFor(side, lc) val bottomHalf = topHalf.slice(0, topHalf.length-1).reverse (topHalf ++ bottomHalf).mkString("\n") } But this fails the “squareness” and “size of square” tests! My properties are now inconsistent. The test suite has detected the erroneous implementation of the squareSide function. The correct implementation of squareSide is: def squareSide(c: Char) : Int = 2*(c - 'A') + 1 With this change, the implementation passes all of the tests. The Position Of The Letter In Each Line Now I add a property that specifies the position and value of the letter in each line, and that all other characters in a line are spaces. Like the previous test, I can rely on symmetry in the output to simplify the arithmetic. This time, because the diamond has horizontal symmetry, I only need specify the position of the letter in the first half of the line. I add a specification for horizontal symmetry, and factor out generic functions to return the first and second half of strings and sequences. "is vertically symmetrical" in { forAll (inputChar) { c => val lines = diamondLines(c) assert(firstHalfOf(lines) == secondHalfOf(lines).reverse) } } "is horizontally symmetrical" in { forAll (inputChar) { c => for ((line, index) <- diamondLines(c).zipWithIndex) { assert(firstHalfOf(line) == secondHalfOf(line).reverse, "line " + index + " should be symmetrical") } } } "position of letter in line of spaces" in { forAll (inputChar) { c => for ((line, lineIndex) <- firstHalfOf(diamondLines(c)).zipWithIndex) { val firstHalf = firstHalfOf(line) val expectedLetter = ('A'+lineIndex).toChar val letterIndex = firstHalf.length - (lineIndex + 1) assert (firstHalf(letterIndex) == expectedLetter, firstHalf) assert (firstHalf.count(_==' ') == firstHalf.length-1, "number of spaces in line " + lineIndex + ": " + line) } } } def firstHalfOf[AS, A, That](v: AS)(implicit asSeq: AS => Seq[A], cbf: CanBuildFrom[AS, A, That]) = { v.slice(0, (v.length+1)/2) } def secondHalfOf[AS, A, That](v: AS)(implicit asSeq: AS => Seq[A], cbf: CanBuildFrom[AS, A, That]) = { v.slice(v.length/2, v.length) } The implementation is: object Diamond { def diamond(c: Char) : String = { val side: Int = squareSide(c) val topHalf = for (letter <- 'A' to c) yield lineFor(side, letter) (topHalf ++ topHalf.reverse.tail).mkString("\n") } def lineFor(length: Int, letter: Char): String = { val halfLength = length/2 val letterIndex = halfLength - ord(letter) val halfLine = " "*letterIndex + letter + " "*(halfLength-letterIndex) halfLine ++ halfLine.reverse.tail } def squareSide(c: Char) : Int = 2*ord(c) + 1 def ord(c: Char): Int = c - 'A' } It turns out the ord function, which I inlined into squareSide a while ago, is needed after all. The implementation is now complete. Running the DiamondApp application prints out diamonds. But there’s plenty of scope for refactoring both the production and test code. Refactoring: Delete the “Single Letter Per Line” Property The “position of letter in line of spaces” property makes the “single letter per line” property superflous, so I delete “single letter per line”. Refactoring: Simplify the Diamond Implementation I rename some parameters and simplify the implementation of the diamond function. object Diamond { def diamond(maxLetter: Char) : String = { val topHalf = for (letter <- 'A' to maxLetter) yield lineFor(maxLetter, letter) (topHalf ++ topHalf.reverse.tail).mkString("\n") } def lineFor(maxLetter: Char, letter: Char): String = { val halfLength = ord(maxLetter) val letterIndex = halfLength - ord(letter) val halfLine = " "*letterIndex + letter + " "*(halfLength-letterIndex) halfLine ++ halfLine.reverse.tail } def squareSide(c: Char) : Int = 2*ord(c) + 1 def ord(c: Char): Int = c - 'A' } The implementation no longer uses the squareSide function. It’s only used by the “size of square” property. Refactoring: Inline the squareSide function I inline the squareSide function into the test. "size of square" in { forAll (inputChar) { c => assert(diamondLines(c).length == 2*ord(c) + 1) } } I believe the erroneous calculation would have been easier to notice if I had done this from the start. Refactoring: Common Implementation of Symmetry There’s one last bit of duplication in the implementation. The expressions that create the horizontal and vertical symmetry of the diamond can be replaced with calls to a generic function. I’ll leave that as an exercise for the reader… Complete Tests and Implementation Tests: import Diamond.ord import org.scalacheck._ import scala.collection.generic.CanBuildFrom class DiamondSpec extends UnitSpec { val inputChar = Gen.alphaUpperChar "squareness" in { forAll (inputChar) { c => val lines = diamondLines(c) assert(lines forall {line => line.length == lines.length}) } } "size of square" in { forAll (inputChar) { c => assert(diamondLines(c).length == 2*ord(c) + 1) } } "is vertically symmetrical" in { forAll (inputChar) { c => val lines = diamondLines(c) assert(firstHalfOf(lines) == secondHalfOf(lines).reverse) } } "is horizontally symmetrical" in { forAll (inputChar) { c => for ((line, index) <- diamondLines(c).zipWithIndex) { assert(firstHalfOf(line) == secondHalfOf(line).reverse, "line " + index + " should be symmetrical") } } } "position of letter in line of spaces" in { forAll (inputChar) { c => for ((line, lineIndex) <- firstHalfOf(diamondLines(c)).zipWithIndex) { val firstHalf = firstHalfOf(line) val expectedLetter = ('A'+lineIndex).toChar val letterIndex = firstHalf.length - (lineIndex + 1) assert (firstHalf(letterIndex) == expectedLetter, firstHalf) assert (firstHalf.count(_==' ') == firstHalf.length-1, "number of spaces in line " + lineIndex + ": " + line) } } } def firstHalfOf[AS, A, That](v: AS)(implicit asSeq: AS => Seq[A], cbf: CanBuildFrom[AS, A, That]) = { v.slice(0, (v.length+1)/2) } def secondHalfOf[AS, A, That](v: AS)(implicit asSeq: AS => Seq[A], cbf: CanBuildFrom[AS, A, That]) = { v.slice(v.length/2, v.length) } def diamondLines(c : Char) = { Diamond.diamond(c).lines.toVector } } Implementation: object Diamond { def diamond(maxLetter: Char) : String = { val topHalf = for (letter <- 'A' to maxLetter) yield lineFor(maxLetter, letter) (topHalf ++ topHalf.reverse.tail).mkString("\n") } def lineFor(maxLetter: Char, letter: Char): String = { val halfLength = ord(maxLetter) val letterIndex = halfLength - ord(letter) val halfLine = " "*letterIndex + letter + " "*(halfLength-letterIndex) halfLine ++ halfLine.reverse.tail } def ord(c: Char): Int = c - 'A' } Conclusions In his article, “Thinking Before Programming”, Alastair Cockburn writes: The advantage of the Dijkstra-Gries approach is the simplicity of the solutions produced. The advantage of TDD is modern fine-grained incremental development. … Can we combine the two? I think property-based tests in the TDD process combined the two quite successfully in this exercise. I could record my half-formed thoughts about the problem and solution as generators and properties while using “modern fine-grained incremental development” to tighten up the properties and grow the code that met them. In Seb’s original article, he writes that when working from examples… it’s easy enough to get [the tests for ‘A’ and ‘B’] to pass by hardcoding the result. Then we move on to the letter ‘C’. The code is now screaming for us to refactor it, but to keep all the tests passing most people try to solve the entire problem at once. That’s hard, because we’ll need to cope with multiple lines, varying indentation, and repeated characters with a varying number of spaces between them. I didn’t encounter this problem when driving the implementation with properties. Adding a new property always required an incremental improvement to the implementation to get the tests passing again. Neither did I need to write throw-away tests for behaviour that was not actually desired of the final implementation, as Seb did with his “test recycling” approach. Every property I added applied to the complete solution. I only deleted properties that were implied by properties I added later, and so had become unnecessary duplication. I took the approach of starting from very generic properties and incrementally adding more specific properties as I refine the implementation. Generic properties were easy to come up with, and helped me make progress in the problem. The suite of properties reinforced one another, testing the tests, and detected the mistake I made in one property that caused it to be inconsistent with the rest. I didn’t know Scala, ScalaTest or ScalaCheck well. Now I’ve learned them better I wish I had written a minimisation strategy for the input character. This would have made test failure messages easier to understand. I also didn’t address what the diamond function would do with input outside the range of ‘A’ to ‘Z’. Scala doesn’t let one define a subtype of Char, so I can’t enforce the input constraint in the type system. I guess the Scala way would be to define diamond as a PartialFunction[Char,String]. I haven’t yet looked at any other people’s solutions in detail. I’ll post a follow up article if I find any interesting differences. Other Solutions Other solutions to the Diamond Kata that I know about are: Seb Rose: Recycling Tests in TDD Alastair Cockburn: Thinking Before Programming Seb Rose: Diamond recycling (and painting yourself into a corner) Ron Jeffries: a detailed walkthrough of his solution George Dinwiddie: Another Approach to the Diamond Kata Ivan Sanchez: A walkthrough of his Clojure solution. Jon Jagger: print “squashed-circle” diamond Sandro Mancuso: A Java solution on GitHub Krzysztof Jelski: A Python solution on GitHub Philip Schwarz: A Clojure solution on GitHub
Categories: Programming, Testing & QA

Python: scikit-learn: ImportError: cannot import name __check_build

Mark Needham - Sat, 01/10/2015 - 09:48

In part 3 of Kaggle’s series on text analytics I needed to install scikit-learn and having done so ran into the following error when trying to use one of its classes:

>>> from sklearn.feature_extraction.text import CountVectorizer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/markneedham/projects/neo4j-himym/himym/lib/python2.7/site-packages/sklearn/__init__.py", line 37, in <module>
    from . import __check_build
ImportError: cannot import name __check_build

This error doesn’t reveal very much but I found that when I exited the REPL and tried the same command again I got a different error which was a bit more useful:

>>> from sklearn.feature_extraction.text import CountVectorizer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/markneedham/projects/neo4j-himym/himym/lib/python2.7/site-packages/sklearn/__init__.py", line 38, in <module>
    from .base import clone
  File "/Users/markneedham/projects/neo4j-himym/himym/lib/python2.7/site-packages/sklearn/base.py", line 10, in <module>
    from scipy import sparse
ImportError: No module named scipy

The fix for this is now obvious:

$ pip install scipy

And I can now load CountVectorizer without any problem:

$ python
Python 2.7.5 (default, Aug 25 2013, 00:04:04)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from sklearn.feature_extraction.text import CountVectorizer
Categories: Programming

Python: gensim – clang: error: unknown argument: ‘-mno-fused-madd’ [-Wunused-command-line-argument-hard-error-in-future]

Mark Needham - Sat, 01/10/2015 - 09:39

While working through part 2 of Kaggle’s bag of words tutorial I needed to install the gensim library and initially ran into the following error:

$ pip install gensim
 
...
 
cc -fno-strict-aliasing -fno-common -dynamic -arch x86_64 -arch i386 -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch x86_64 -arch i386 -pipe -I/Users/markneedham/projects/neo4j-himym/himym/build/gensim/gensim/models -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -I/Users/markneedham/projects/neo4j-himym/himym/lib/python2.7/site-packages/numpy/core/include -c ./gensim/models/word2vec_inner.c -o build/temp.macosx-10.9-intel-2.7/./gensim/models/word2vec_inner.o
 
clang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future]
 
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
 
command 'cc' failed with exit status 1
 
an integer is required
 
Traceback (most recent call last):
 
  File "<string>", line 1, in <module>
 
  File "/Users/markneedham/projects/neo4j-himym/himym/build/gensim/setup.py", line 166, in <module>
 
    include_package_data=True,
 
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/core.py", line 152, in setup
 
    dist.run_commands()
 
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 953, in run_commands
 
    self.run_command(cmd)
 
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command
 
    cmd_obj.run()
 
  File "/Users/markneedham/projects/neo4j-himym/himym/lib/python2.7/site-packages/setuptools/command/install.py", line 59, in run
 
    return orig.install.run(self)
 
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/command/install.py", line 573, in run
 
    self.run_command('build')
 
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/cmd.py", line 326, in run_command
 
    self.distribution.run_command(command)
 
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command
 
    cmd_obj.run()
 
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/command/build.py", line 127, in run
 
    self.run_command(cmd_name)
 
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/cmd.py", line 326, in run_command
 
    self.distribution.run_command(command)
 
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command
 
    cmd_obj.run()
 
  File "/Users/markneedham/projects/neo4j-himym/himym/build/gensim/setup.py", line 71, in run
 
    "There was an issue with your platform configuration - see above.")
 
TypeError: an integer is required
 
----------------------------------------
Cleaning up...
Command /Users/markneedham/projects/neo4j-himym/himym/bin/python -c "import setuptools, tokenize;__file__='/Users/markneedham/projects/neo4j-himym/himym/build/gensim/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/sb/6zb6j_7n6bz1jhhplc7c41n00000gn/T/pip-i8aeKR-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/markneedham/projects/neo4j-himym/himym/include/site/python2.7 failed with error code 1 in /Users/markneedham/projects/neo4j-himym/himym/build/gensim
Storing debug log for failure in /Users/markneedham/.pip/pip.log

The exception didn’t make much sense to me but I came across a blog post which explained it:

The Apple LLVM compiler in Xcode 5.1 treats unrecognized command-line options as errors. This issue has been seen when building both Python native extensions and Ruby Gems, where some invalid compiler options are currently specified.

The author suggests this only became a problem with XCode 5.1 so I’m surprised I hadn’t come across it sooner since I haven’t upgraded XCode in a long time.

We can work around the problem by telling the compiler to treat extra command line arguments as a warning rather than an error

export ARCHFLAGS=-Wno-error=unused-command-line-argument-hard-error-in-future

Now it installs with no problems.

Categories: Programming