Skip to content

Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools

Subscribe to Methods & Tools
if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!

Testing & QA

Quote of the Month August 2014

From the Editor of Methods & Tools - Tue, 08/26/2014 - 09:55
We don’t mean that you should put on your Super Tester cape and go protect the world from bugs. There’s no room for big egos on agile teams. Your teammates share your passion for quality. Focus on the teams goals and do what you can to help everyone do their best work. Source: Agile Testing, Lisa Crispin and Janet Gregory, Addison Wesley

How Not to Standardize Testing (ISO 29119)

James Bach’s Blog - Mon, 08/25/2014 - 09:15

Many years ago I took a management class. One of the exercises we did was on achieving consensus. My group did not reach an agreement because I wouldn’t lower my standards. I wanted to discuss the matter further, but the other guys grew tired of arguing with me and declared “consensus” over my objections. This befuddled me, at first. The whole point of the exercise was to reach a common decision, and we had failed, by definition, to do that– so why declare consensus at all? It’s like getting checkmated in chess and then declaring that, well, you still won the part of the game that you cared about… the part before the checkmate.

Later I realized this is not so bizarre. What they had effectively done is ostracize me from the team. They had changed the players in the game. The remaining team did come to consensus. In the years since, I have found that changing the boundaries or membership of a community is indeed an important pillar of consensus building. I have used this tactic many times to avoid unhelpful debate. It is one reason why I say that I’m a member of the Context-Driven School of Testing. My school does not represent all schools, and the other schools do not represent mine. Therefore, we don’t need consensus with them.

Then what about ISO 29119?

The ISO organization claims to have a new standard for software testing. But ISO 29119 is not a standard for testing. It cannot be a standard for testing.

A standard for testing would have to reflect the values and practices of the world community of testers. Yet, the concerns of the Context-Driven School of thought, which has been in development for at least 15 years have been ignored and our values shredded by this so-called standard and the process used to create it. They have done this by excluding us. There are two organizations explicitly devoted to Context-Driven values (AST and ISST) and our community holds several major conferences a year. Members of our community speak at all the major practitioners conferences, and our ideas are widely cited. Some of the most famous testers in the the world, including me, are Context-Driven testers. We exist, and together with the Agilists, we are the source of nearly every new idea in testing in the last decade.

The reason they have excluded us is that they know we won’t agree to any simplistic standard based on templates or simple formulae. We know those things look pretty but they don’t help. If ISO doesn’t exclude us, they worry they will never finish. They know we will challenge their evidence, and even their ethics and basic competence. This is why I say the craft is not ready for standards. It will be years before all the recognized experts in testing can come together and agree on anything substantial.

The people running the ISO effort know exactly who we are. I personally have had multiple public debates with Stuart Reid, on stage. He cannot pretend we don’t exist. He cannot pretend we are some sort of lunatic fringe. Tens of thousands of testers have watched my video lectures or bought my books. This is not a case where ISO can simply declare us to be outsiders.

The Burden of Proof

The Context-Driven community stands for excellence in testing. This is why we must reject this depraved attempt by ISO to grab power and assert control over our craft. Our craft is still an open marketplace of ideas, and it is full of strong debates. We must protect that marketplace and allow it to evolve. I want the fair chance to put my competitors out of business (or get them to change their business) with the high quality of my work. Context-Driven testing has been growing in strength and numbers over the years. Whereas this ISO effort appears to be a job protection program for people who can’t stomach debate. They can’t win the debate so they want to remake the rules.

The burden of proof is not on me or any of us to show that the standard is wrong, nor is it our job to make it right. The burden is on those who claim that the craft can be standardized to study the craft and recognize and resolve the deep differences among us. Failing that, there can be no ethical or rational basis for standardization.

This blog post puts me on record as opposing the ISO 29119 standard. Together with my colleagues, we constitute a determined and sustained and principled opposition.

Categories: Testing & QA

Software Development Linkopedia August 2014

From the Editor of Methods & Tools - Thu, 08/21/2014 - 14:48
Here is our monthly selection of interesting knowledge material on programming, software testing and project management.  This month you will find some interesting information and opinions about Agile retrospectives, software architecture, software developer psychology, software testing  in Agile teams, quality code and the (funny) history of programming. Web site: Fun Retrospectives Blog: How to make software architecture decisions? Blog: Cognitive Biases in Software Engineering Blog: How the Other Half Works: an Adventure in the Low Status of Software Engineers Blog: Confessions of an ex-developer Article: Tearing Down the Walls – Embedding QA in a TDD/Pairing and ...

Is pairing for everybody?

Actively Lazy - Thu, 08/21/2014 - 08:11

Pair programming is a great way to share knowledge. But every developer is different, does pairing work for everyone?

Pairing helps a team normalise its knowledge – what one person knows, everyone else learns through pairing: keyboard shortcuts, techniques, practices, third party libraries as well as the details of the source code you’re working in. This pushes up the average level of the team and stops knowledge becoming siloed.

Pairing also helps with discipline: it’s a lot harder to argue that you don’t need a unit test when there’s someone sitting next to you, literally acting as your conscience. It’s also a lot harder to just do the quick and dirty hack to get on to the next task, when the person sitting next to you has taken control of the keyboard to stop you committing war crimes against the source code.

The biggest problem most teams face is basically one of communication: coordinating, in detail, the activities of a team of developers is difficult. Ideally, every developer would know everything that is going on across the team – but this clearly isn’t practical. Instead, we have to draw boundaries to make it easier to reason about the system as a whole, without knowing the whole system to the same level of detail. I’ll create an API, some boundary layer, and we each work to our own side of it. I’ll create the service, you sort out the user interface. I’ll sort out the network protocol, you sort out the application layer. You have to introduce an architectural boundary to simplify the communication and coordination. Your architecture immediately reflects the relationships of the developers building it.

Whereas on teams that pair, these boundaries can be softer. They still happen, but the boundary becomes softer because as pairs rotate you see both sides of any boundary so it doesn’t become a black box you don’t know about and can’t change. One day I’m writing the user interface code, the next I’m writing the service layer that feeds it. This is how you spot inconsistencies and opportunities to fix the architecture and take advantage of implementation details on both sides. Otherwise this communication is hard. Continuous pair rotation means you can get close to the ideal that each developer knows, broadly, what is happening everywhere.

However, let’s be honest: pairing isn’t for everyone. I’ve worked with some people who were great at pairing, who were a pleasure to work with. People who had no problem explaining their thought process and no ego to get bruised when you point out the fatal flaw in their idea. People who spot when you’ve lost the train of thought and pick up where you drifted off from.

A good pairing session becomes very social. A team that is pairing can sound very noisy. It can be one of the hardest things to get used to when you start pairing: I seem to spend my entire day arguing and talking. When are we gonna get on and write some damned code? But that just highlights how little of the job is actually typing in source code. Most of the day is figuring out which change to make and where. A single line of code can take hours of arguing to get right and in the right place.

But programming tends to attract people who are less sociable than others – and let’s face it, we’re a pretty anti-social bunch: I spend my entire day negotiating with a machine that works in 1s and 0s. Not for me the subtle nuances of human communication, it either compiles or it doesn’t. I don’t have to negotiate or try and out politick the compiler. I don’t have to deal with the compiler having “one of those days” (well, I say that, sometimes I swear…). I don’t have to take the compiler to one side and offer comforting words because its cat died. I don’t have to worry about hurting the compiler’s feelings because I made the same mistake for the hundredth time: “yes of course I’m listening to you, no I’m not just ignoring you. Of course I value your opinions, dear. But seriously, this is definitely an IList of TFoo!”

So it’s no surprise that among the great variety of programmers you meet, some are extrovert characters who relish the social, human side of working in a team of people, building software. As well as the introvert characters who relish the quiet, private, intellectual challenge of crafting an elegant solution to a fiendish problem.

And so to pairing: any team will end up with a mixture of characters. The extroverts will tend to enjoy pairing, while the introverts will tend to find it harder and seek to avoid it. This isn’t necessarily a question of education or persuasion, the benefits are relatively intangible and more introverted developers may find the whole process less enjoyable than working solo. It sounds trite: but happy developers are productive developers. There’s no point doing anything that makes some of your peers unhappy. All teams need to agree rules. For example, some people like eating really smelly food in an open plan office. Good teams tend to agree rules about this kind of behaviour; everyone agrees that small sacrifices for an individual make a big difference for team harmony.

However, how do you resolve a difference of opinion with pairing? As a team decision, pairing is a bit all or nothing. Either we agree to pair on everything, so there’s no code ownership, regular rotation and we learn from each other. Or we don’t, and we each become responsible for our own dominion. We can’t agree that those that want to pair will go into the pairing room so as not to upset everyone else.

One option is to simply require that everyone on your team has to love pairing. I don’t know about you: hiring good people is hard. The last thing I want to do is start excluding people who could otherwise be productive. Isn’t it better to at least have somebody doing something, even if they’re not pairing?

Another option is to force developers to pair, even if they find it difficult or uncomfortable. But is that really going to be productive? Building resentment and unhappiness is not going to create a high performance team. Of course, the other extreme is just as likely to cause upset: if you stop all pairing, then those that want to will feel resentful and unhappy.

And what about the middle ground? Can you have a team where some people pair while others work on their own? It seems inevitable that Conway’s law will come into play: the structure of the software will reflect the structure of the team. It’s very difficult for there to be overlap between developers working on their own and developers that are pairing. For exactly the same reason it’s difficult for a group of individual developers to overlap on the same area of code at the same time: you’ll necessarily introduce some architectural boundary to ease coordination.

This means you still end up with a collection of silos, some owned by individual developers, some owned by a group of developers. Does this give you the best compromise? Or the worst of both worlds?

What’s your experience? What have you tried? What worked, what didn’t?


Categories: Programming, Testing & QA

Two lessons on haproxy checks and swap space

Agile Testing - Grig Gheorghiu - Thu, 08/21/2014 - 02:03
Let's assume you want to host a Wordpress site which is not going to get a lot of traffic. You want to use EC2 for this. You still want as much fault tolerance as you can get at a decent price, so you create an Elastic Load Balancer endpoint which points to 2 (smallish) EC2 instances running haproxy, with each haproxy instance pointing in turn to 2 (not-so-smallish) EC2 instances running Wordpress (Apache + MySQL). 
You choose to run haproxy behind the ELB because it gives you more flexibitity in terms of load balancing algorithms, health checks, redirections etc. Within haproxy, one of the Wordpress servers is marked as a backup for the other, so it only gets hit by haproxy when the primary one goes down. On this secondary Wordpress instance you set up MySQL to be a slave of the primary instance's MySQL. 
Here are two things (at least) that you need to make sure you have in this scenario:
1) Make sure you specify the httpchk option in haproxy.cfg, otherwise the primary server will not be marked as down even if Apache goes down. So you should have something like:
backend servers-http  server s1 10.0.1.1:80 weight 1 maxconn 5000 check port 80  server s2 10.0.1.2:80 backup weight 1 maxconn 5000 check port 80  option httpchk GET /
2) Make sure you have swap space in case the memory on the Wordpress instances gets exhausted, in which case random processes will be killed by the oom process (and one of those processes can be mysqld). By default, there is no swap space when you spin up an Ubuntu EC2 instance. Here's how to set up a 2 GB swapfile:
dd if=/dev/zero of=/swapfile1 bs=1024 count=2097152mkswap /swapfile1chmod 0600 /swapfile1swapon /swapfile1echo "/swapfile1 swap swap defaults 0 0" >> /etc/fstab
I hope these two things will help you if you're not already doing them ;-)

More Than 800 Videos on TVAgile.com

From the Editor of Methods & Tools - Wed, 08/20/2014 - 09:33
TVAgile.com has just passed the mark of the 800 resources available with an Oredev conference presentation that discusses how to foment creative collaboration based on the tenets of improv and open spaces. TVAgile.com is a directory of videos, interviews and tutorials focused agile software development approaches and practices: Scrum, Extreme Programming (XP), Test Driven Development (TDD) , Lean Software Development, Kanban, Behavior Driven Development (BDD), Agile Requirements, Continuous Integration, Pair Programming, Refactoring, … Explore all these resources on http://www.tvagile.com/

Managing OpenStack security groups from the command line

Agile Testing - Grig Gheorghiu - Fri, 08/15/2014 - 20:47
I had an issue today where I couldn't connect to a particular OpenStack instance on port 443. I decided to inspect the security group it belongs (let's call it myapp) to from the command line:

# nova secgroup-list-rules myapp
+-------------+-----------+---------+------------+--------------+
| IP Protocol | From Port | To Port | IP Range   | Source Group |
+-------------+-----------+---------+------------+--------------+
| tcp         | 80        | 80      | 0.0.0.0/0  |              |
| tcp         | 443       | 443     | 0.0.0.0/24 |              |
+-------------+-----------+---------+------------+--------------+

Note that the IP range for port 443 is wrong. It should be all IPs and not a /24 network.

I proceeded to delete the wrong rule:

# nova secgroup-delete-rule myapp tcp 443 443 0.0.0.0/24                                                               
+-------------+-----------+---------+------------+--------------+
| IP Protocol | From Port | To Port | IP Range   | Source Group |
+-------------+-----------+---------+------------+--------------+
| tcp         | 443       | 443     | 0.0.0.0/24 |              |
+-------------+-----------+---------+------------+--------------+


Then I added back the correct rule:
 # nova secgroup-add-rule myapp tcp 443 443 0.0.0.0/0                                                                   +-------------+-----------+---------+-----------+--------------+| IP Protocol | From Port | To Port | IP Range  | Source Group |+-------------+-----------+---------+-----------+--------------+| tcp         | 443       | 443     | 0.0.0.0/0 |              |+-------------+-----------+---------+------------+--------------+
Finally, I verified that the rules are now correct:
# nova secgroup-list-rules myapp                                                                                       +-------------+-----------+---------+-----------+--------------+| IP Protocol | From Port | To Port | IP Range  | Source Group |+-------------+-----------+---------+-----------+--------------+| tcp         | 443       | 443     | 0.0.0.0/0 |              || tcp         | 80        | 80      | 0.0.0.0/0 |              |+-------------+-----------+---------+-----------+--------------+
Of course, the real test was to see if I could now hit port 443 on my instance, and indeed I was able to.

Testing on the Toilet: Web Testing Made Easier: Debug IDs

Google Testing Blog - Wed, 08/13/2014 - 18:58
by Ruslan Khamitov 

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.

Adding ID attributes to elements can make it much easier to write tests that interact with the DOM (e.g., WebDriver tests). Consider the following DOM with two buttons that differ only by inner text:
Save buttonEdit button
<div class="button">Save</div>
<div class="button">Edit</div>

How would you tell WebDriver to interact with the “Save” button in this case? You have several options. One option is to interact with the button using a CSS selector:
div.button

However, this approach is not sufficient to identify a particular button, and there is no mechanism to filter by text in CSS. Another option would be to write an XPath, which is generally fragile and discouraged:
//div[@class='button' and text()='Save']

Your best option is to add unique hierarchical IDs where each widget is passed a base ID that it prepends to the ID of each of its children. The IDs for each button will be:
contact-form.save-button
contact-form.edit-button

In GWT you can accomplish this by overriding onEnsureDebugId()on your widgets. Doing so allows you to create custom logic for applying debug IDs to the sub-elements that make up a custom widget:
@Override protected void onEnsureDebugId(String baseId) {
super.onEnsureDebugId(baseId);
saveButton.ensureDebugId(baseId + ".save-button");
editButton.ensureDebugId(baseId + ".edit-button");
}

Consider another example. Let’s set IDs for repeated UI elements in Angular using ng-repeat. Setting an index can help differentiate between repeated instances of each element:
<tr id="feedback-{{$index}}" class="feedback" ng-repeat="feedback in ctrl.feedbacks" >

In GWT you can do this with ensureDebugId(). Let’s set an ID for each of the table cells:
@UiField FlexTable table;
UIObject.ensureDebugId(table.getCellFormatter().getElement(rowIndex, columnIndex),
baseID + colIndex + "-" + rowIndex);

Take-away: Debug IDs are easy to set and make a huge difference for testing. Please add them early.

Categories: Testing & QA

Testing on the Toilet: Don't Put Logic in Tests

Google Testing Blog - Thu, 07/31/2014 - 17:59
by Erik Kuefler

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.

Programming languages give us a lot of expressive power. Concepts like operators and conditionals are important tools that allow us to write programs that handle a wide range of inputs. But this flexibility comes at the cost of increased complexity, which makes our programs harder to understand.

Unlike production code, simplicity is more important than flexibility in tests. Most unit tests verify that a single, known input produces a single, known output. Tests can avoid complexity by stating their inputs and outputs directly rather than computing them. Otherwise it's easy for tests to develop their own bugs.

Let's take a look at a simple example. Does this test look correct to you?

@Test public void shouldNavigateToPhotosPage() {
String baseUrl = "http://plus.google.com/";
Navigator nav = new Navigator(baseUrl);
nav.goToPhotosPage();
assertEquals(baseUrl + "/u/0/photos", nav.getCurrentUrl());
}

The author is trying to avoid duplication by storing a shared prefix in a variable. Performing a single string concatenation doesn't seem too bad, but what happens if we simplify the test by inlining the variable?

@Test public void shouldNavigateToPhotosPage() {
Navigator nav = new Navigator("http://plus.google.com/");
nav.goToPhotosPage();
assertEquals("http://plus.google.com//u/0/photos", nav.getCurrentUrl()); // Oops!
}

After eliminating the unnecessary computation from the test, the bug is obvious—we're expecting two slashes in the URL! This test will either fail or (even worse) incorrectly pass if the production code has the same bug. We never would have written this if we stated our inputs and outputs directly instead of trying to compute them. And this is a very simple example—when a test adds more operators or includes loops and conditionals, it becomes increasingly difficult to be confident that it is correct.

Another way of saying this is that, whereas production code describes a general strategy for computing outputs given inputs, tests are concrete examples of input/output pairs (where output might include side effects like verifying interactions with other classes). It's usually easy to tell whether an input/output pair is correct or not, even if the logic required to compute it is very complex. For instance, it's hard to picture the exact DOM that would be created by a Javascript function for a given server response. So the ideal test for such a function would just compare against a string containing the expected output HTML.

When tests do need their own logic, such logic should often be moved out of the test bodies and into utilities and helper functions. Since such helpers can get quite complex, it's usually a good idea for any nontrivial test utility to have its own tests.

Categories: Testing & QA

Software Development Conferences Forecast July 2014

From the Editor of Methods & Tools - Mon, 07/28/2014 - 08:39
Here is a list of software development related conferences and events on Agile ( Scrum, Lean, Kanban) software testing and software quality, programming (Java, .NET, JavaScript, Ruby, Python, PHP) and databases (NoSQL, MySQL, etc.) that will take place in the coming weeks and that have media partnerships with the Methods & Tools software development magazine. Agile on the Beach, September 4-5 2014, Falmouth in Cornwall, UK SPTechCon, September 16-19 2014, Boston, USA Future of Web Apps, September 29-October 1 2014, London, UK STARWEST, October 12-17 2014, Anaheim, USA JAX London, October 13-15 2014, ...

Quote of the Month July 2014

From the Editor of Methods & Tools - Fri, 07/25/2014 - 08:22
Research has shown that the presumption of selfishness is true for maybe 30% of most populations; another 50% are reliably unselfish, and the remaining 20% could go either way, depending on the context. If a company presumes that the undecided 20% are selfish, you can bet they will be selfish—it’s a self-fulfilling prophecy. But worse, the company will create an environment where the 50% of the people who are unselfish are forced to act selfishly. And losing the energy, commitment, and intelligence of half the workforce is perhaps the biggest ...

The Deadline to Sign up for GTAC 2014 is Jul 28

Google Testing Blog - Wed, 07/23/2014 - 01:53
Posted by Anthony Vallone on behalf of the GTAC Committee

The deadline to sign up for GTAC 2014 is next Monday, July 28th, 2014. There is a great deal of interest to both attend and speak, and we’ve received many outstanding proposals. However, it’s not too late to add yours for consideration. If you would like to speak or attend, be sure to complete the form by Monday.

We will be making regular updates to our site over the next several weeks, and you can find conference details there:
  developers.google.com/gtac

For those that have already signed up to attend or speak, we will contact you directly in mid August.

Categories: Testing & QA

Troubleshooting haproxy 502 errors related to malformed/large HTTP headers

Agile Testing - Grig Gheorghiu - Wed, 07/23/2014 - 00:02
We had a situation recently where our web application started to behave strangely. First nginx (which sits in front of the application) started to error out with messages of this type:

upstream sent too big header while reading response header from upstream

A quick Google search revealed that a fix for this is to bump up proxy_buffer_size in nginx.conf, for both http and https traffic, along these lines:

proxy_buffer_size   256k;
proxy_buffers   4 256k;
proxy_busy_buffers_size   256k;

Now nginx was happy when hit directly. However, haproxy was still erroring out with a 502 'bad gateway' return code, followed by PH. Here is a snippet from the haproxy log file:

Jul 22 21:27:13 127.0.0.1 haproxy[14317]: 172.16.38.57:53408 [22/Jul/2014:21:27:12.776] www-frontend www-backend/www2:80 1/0/1/-1/898 502 8396 - - PH-- 0/0/0/0/0 0/0 "GET /someurl HTTP/1.1"

Another Google search revealed that PH means that haproxy rejected the header from the backend because it was malformed.

At this point, an investigation into the web app did discover a loop in the code that kept adding elements to a cookie included in the response header.

Anyway, I leave this here in the hope that somebody will stumble on it and benefit from it.

First experiences with OpenStack

Agile Testing - Grig Gheorghiu - Thu, 07/17/2014 - 21:37
We hit a big milestone this week, as we started to use OpenStack as a private cloud, intially just for QA/integration environments. Up to now we've been creating KVM machines semi-manually, which used to take minutes. Now we cut down that process to seconds, calling the Nova API from the command line, e.g.:

$ nova boot --image precise-image --flavor www --key_name mykey --nic net-id=3eafbd4f-0389-4c5b-93ba-7764742ee8cd www1.qa1

Once an instance is provisioned, we bootstrap it with Chef:

$ knife bootstrap www1.qa1.mydomain.com -x ubuntu --sudo -E qa1 -N www1.qa1 -r "role[base], role[www]"

Our internal network architecture is fairly complex, so my colleague Jeff Roberts spent quite some time bending OpenStack Neutron to his will (in conjunction with Open vSwitch) in order to support our internal VLANs. The OpenStack infrastructure has been stable so far, and it's just such a pleasure to do everything via an API and not to spin VMs up manually. Being back to working with a (private) cloud feels good.

This is just version 1.0 of our OpenStack rollout. Soon we'll start spinning up one environment at a time using chef-metal and fog  and we'll also integrate instance + environment spin-up with Jenkins. Exciting times ahead!

Measuring Coverage at Google

Google Testing Blog - Mon, 07/14/2014 - 20:45
By Marko Ivanković, Google Zürich

Introduction
Code coverage is a very interesting metric, covered by a large body of research that reaches somewhat contradictory results. Some people think it is an extremely useful metric and that a certain percentage of coverage should be enforced on all code. Some think it is a useful tool to identify areas that need more testing but don’t necessarily trust that covered code is truly well tested. Others yet think that measuring coverage is actively harmful because it provides a false sense of security.

Our team’s mission was to collect coverage related data then develop and champion code coverage practices across Google. We designed an opt-in system where engineers could enable two different types of coverage measurements for their projects: daily and per-commit. With daily coverage, we run all tests for their project, where as with per-commit coverage we run only the tests affected by the commit. The two measurements are independent and many projects opted into both.

While we did experiment with branch, function and statement coverage, we ended up focusing mostly on statement coverage because of its relative simplicity and ease of visualization.

How we measured
Our job was made significantly easier by the wonderful Google build system whose parallelism and flexibility allowed us to simply scale our measurements to Google scale. The build system had integrated various language-specific open source coverage measurement tools like Gcov (C++), Emma / JaCoCo (Java) and Coverage.py (Python), and we provided a central system where teams could sign up for coverage measurement.

For daily whole project coverage measurements, each team was provided with a simple cronjob that would run all tests across the project’s codebase. The results of these runs were available to the teams in a centralized dashboard that displays charts showing coverage over time and allows daily / weekly / quarterly / yearly aggregations and per-language slicing. On this dashboard teams can also compare their project (or projects) with any other project, or Google as a whole.

For per-commit measurement, we hook into the Google code review process (briefly explained in this article) and display the data visually to both the commit author and the reviewers. We display the data on two levels: color coded lines right next to the color coded diff and a total aggregate number for the entire commit.


Displayed above is a screenshot of the code review tool. The green line coloring is the standard diff coloring for added lines. The orange and lighter green coloring on the line numbers is the coverage information. We use light green for covered lines, orange for non-covered lines and white for non-instrumented lines.

It’s important to note that we surface the coverage information before the commit is submitted to the codebase, because this is the time when engineers are most likely to be interested in improving it.

Results
One of the main benefits of working at Google is the scale at which we operate. We have been running the coverage measurement system for some time now and we have collected data for more than 650 different projects, spanning 100,000+ commits. We have a significant amount of data for C++, Java, Python, Go and JavaScript code.

I am happy to say that we can share some preliminary results with you today:


The chart above is the histogram of average values of measured absolute coverage across Google. The median (50th percentile) code coverage is 78%, the 75th percentile 85% and 90th percentile 90%. We believe that these numbers represent a very healthy codebase.

We have also found it very interesting that there are significant differences between languages:

C++JavaGoJavaScriptPython56.6%61.2%63.0%76.9%84.2%

The table above shows the total coverage of all analyzed code for each language, averaged over the past quarter. We believe that the large difference is due to structural, paradigm and best practice differences between languages and the more precise ability to measure coverage in certain languages.

Note that these numbers should not be interpreted as guidelines for a particular language, the aggregation method used is too simple for that. Instead this finding is simply a data point for any future research that analyzes samples from a single programming language.

The feedback from our fellow engineers was overwhelmingly positive. The most loved feature was surfacing the coverage information during code review time. This early surfacing of coverage had a statistically significant impact: our initial analysis suggests that it increased coverage by 10% (averaged across all commits).

Future work
We are aware that there are a few problems with the dataset we collected. In particular, the individual tools we use to measure coverage are not perfect. Large integration tests, end to end tests and UI tests are difficult to instrument, so large parts of code exercised by such tests can be misreported as non-covered.

We are working on improving the tools, but also analyzing the impact of unit tests, integration tests and other types of tests individually.

In addition to languages, we will also investigate other factors that might influence coverage, such as platforms and frameworks, to allow all future research to account for their effect.

We will be publishing more of our findings in the future, so stay tuned.

And if this sounds like something you would like to work on, why not apply on our job site?

Categories: Testing & QA

Affordance Open-Space at XP Day

Mistaeks I Hav Made - Nat Pryce - Tue, 12/03/2013 - 00:20
Giovanni Asproni and I facilitated an open-space session at XP Day on the topic of affordance in software design. Affordance is "a quality of an object, or an environment, which allows an individual to perform an action". Don Norman distinguishes between actual afforance and perceived affordance, the latter being "those action possibilities that are readily perceivable by an actor". I think this distinction applies as well to software programming interfaces. Unless running in a sandboxed execution environments, software has great actual affordance, as long as we're willing to do the wrong thing when necessary: break encapsulation or rely on implementation-dependent data structures. What makes programming difficult is determining how and why to work with the intentions of those who designed the software that we are building upon: the perceived affordance of those lower software layers. Here are the notes I took during the session, cleaned up a little and categorised into things that provide help affordance, hinder affordance or can both help and hinder. Helping Repeated exposure to language you understand results in fluency. Short functions Common, consistent language. Domain types. Good error messages Things that behave differently should look different. e.g. convention for ! suffix in Scheme and Ruby for functions that mutate data. Able to explore software: able to put "probes" into/around the software to experiment with its behaviour know where to put those probes TDD, done properly, is likely to make most important affordances visible simpler APIs ability to experiment and probe software running in isolation. Hindering Two ways software has bad affordance: can't work out what it does (poor perceived affordance) it behaves unexpectedly (poor actual affordance) (are these two viewpoints on the same thing?) Null pointer exceptions detected far from origin. Classes named after design patterns incorrectly. e.g. programmer did not understand the pattern. Factories that do not create objects Visitors that are really internal iterators Inconsistencies can be caused by unfinished refactoring which code style to use? agree in the team spread agreement by frequent pair rotation Names that have subtly different meanings in natural language and in the code. E.g. OO design patterns. Compare with FP design patterns that have strict definitions and meaningless names (in natural language). You have to learn what the names really mean instead of rely on (wrong) intuition. Both Helping and Hindering Conventions e.g. Ruby on Rails alternative to metaphor? but: don't be magical! hard to test code in isolation If you break convention, draw attention to where it has been broken. Naming types after patterns better to use domain terms than pattern names? e.g. CustomerService mostly does stuff with addresses. Better named AddressBook, can see it also has unrelated behaviour that belongs elsewhere. but now need to communicate the system metaphors and domain terminology (is that ever a bad thing?) Bad code style you are familiar with does offer perceived affordance. Bad design might provide good actual affordance. e.g. lots of global variables makes it easy to access data. but provides poor perceived affordance e.g. cannot understand how state is being mutated by other parts of the system Method aliasing: makes code expressive common in Ruby but Python takes the opposite stance - only one way to do anything Frameworks: new jargon to learn new conventions provide specific affordances if your context changes, you need different affordances and the framework may get in the way. Documentation vs Tests actual affordance of physical components is described in a spec documents and not obvious from appearance. E.g. different grade of building brick. Is this possible for software? docs are more readable tests are more trustworthy best of both worlds - doctests? Used in Go and Python How do we know what has changed between versions of a software component? open source libraries usually have very good release notes why is it so hard in enterprise development?
Categories: Programming, Testing & QA

Multimethods

Mistaeks I Hav Made - Nat Pryce - Thu, 10/10/2013 - 00:30
I have observed that... Unless the language already provides them, any sufficiently complicated program written in a dynamically typed language contains multiple, ad hoc, incompatible, informally-specified, bug-ridden, slow implementations of multimethods. (To steal a turn of phrase from Philip Greenspun).
Categories: Programming, Testing & QA

Thought after Test Automation Day 2013

Henry Ford said “Obstacles are those frightful things you see when take your eyes off the goal.” After I’ve been to Test Automation Day last month I’m figuring out why industrializing testing doesn’t work. I try to put it in this negative perspective, because I think it works! But also when is it successful? A lot of times the remark from Ford is indeed the problem. People tend to see obstacles. Obstacles because of the thought that it’s not feasible to change something. They need to change. But that’s not an easy change.

After attending the #TAD2013 as it was on Twitter I saw a huge interest in better testing, faster testing, even cheaper testing by using tools to industrialize. Test automation has long been seen as an interesting option that can enable faster testing. it wasn’t always cheaper, especially the first time, but at least faster. As I see it it’ll enable better testing. “Better?” you may ask. Test automation itself doesn’t enable better testing, but by automating regression tests and simple work the tester can focus on other areas of the quality.

images

And isn’t that the goal? In the end all people involved in a project want to deliver a high quality product, not full of bugs. But they also tend to see the obstacles. I see them less and less. New tools are so well advanced and automation testers are becoming smarter and smarter that they enable us to look beyond the obstacles. I would like to say look over the obstacles.

At the Test Automation Day I learned some new things, but it also proved something I already know; test automation is here to stay. We don’t need to focus on the obstacles, but should focus on the goal.

Categories: Testing & QA

The Toyota Way: The need for doing it right the first time

After WWII Toyota started developing its Toyota Production System (TPS); which was identified as ‘Lean’ in the 1990s. Taiichi Ohno, Shigeo Shingo and Eiji Toyoda developed the system between 1948 and 1975. In the myth surrounding the system it was not inspired by the American automotive industry, but from a visit to American supermarkets, Ohno saw the supermarket as model for what he was trying to accomplish in the factor and perfect the Just-in-Time (JIT) production system. While accomplishing this low inventory levels were a key outcome of the TPS, and an important element of the philosophy behind its system is to work intelligently and eliminate waste so that only minimal inventory is needed.

As TPS and Lean have their own principles as outlined by Toyota:

  • Long-term Philosophy
  • Right process will produce the right results
  • Value to organization by developing people
  • Solving root problems drives organizational learning

As these principles were summed up and published by Toyota in 2001, by naming it “The Toyota Way 2001”. It consists the above named principles in two key areas: Continuous Improvement, and Respect for People.

the-toyota-way

The principles for a continuous improvement include establishing a long-term vision, working on challenges, continual innovation, and going to the source of the issue or problem. The principles relating to respect for people include ways of building respect and teamwork. When looking at the ALM all these principles come together in the ‘first time right’ approach already mentioned. And from Toyota’s view they were outlined as followed:

  • The right process will produce the right results
    • Create continuous process flow to bring problems to the surface.
    • Use the ‘pull’ system to avoid overproduction (kanban)
    • Level out the workload (heijunka).
    • Build a culture of stopping to fix problems, to get quality right from the first (jidoka)
  • Continuously solving root problems drives organizational learning
    • Go and see for yourself to thoroughly understand the situation (Genchi Genbutsu);
    • Make decisions slowly by consensus, thoroughly considering all options (nemawashi); implement decisions rapidly;
    • Become a learning organization through relentless reflection (hansei) and continuous improvement (kaizen).
Let’s do it right now!

As the economy is changing and IT is more common sense throughout ore everyday life the need for good quality software products has never been this high. Software issues create bigger and bigger issues in our lives. Think about trains that cannot ride due to software issues, bank clients that have no access to their bank accounts, and people oversleeping because their alarm app didn’t work on their iPhone. As Capers Jones [Jones, 2011] states in his 2011 study that “software is blamed for more major business problems than any other man-made product” and that “poor quality has become one of the most expensive topics in human history”. The improvement of software quality is a key topic for all industries.

 Right the first time vs jidoka

In both TPS and Lean autonomation or jidoka are used. Autonomation can be described as ‘intelligent autonomation’, it means that when an abnormal situation arises the ‘machine’ stops and fix the abnormality. Autonomation prevents the production of defective products, eliminates overproduction, and focuses attention on understanding the problem and ensuring that it never recurs; a quality control process that applies the following four principles:

  • Detect the abnormality.
  • Stop.
  • Fix or correct the immediate condition.
  • Investigate the root cause and install a countermeasure.
Find defects as early as possible

In other words autonomation helps to get quality right the first time perfectly. With IT projects being different from the Toyota car production line, ‘perfectly’ may be a bit too much, but the process around quality assurance should be the same:

  • Find the defect.
  • Stop.
  • Fix or correct the error.
  • Investigate the root cause and take countermeasures.

The defect should be found as early as possible to be fixed as early as possible. And as with Lean and TPS the reason behind this is to make it possible to address the identification and correction of defects immediately in the process.

Categories: Testing & QA

When will you start with test automation?

I just came back from vacation and when I started again I noticed a slight change in resource requests I now see coming by; as almost all requests are with a statement around test automation. In the last two days I had two separate sessions around test automation tools. Has test automation all of a sudden become more important? Did people follow up on my last post, where I state that tools are a prerequisite in testing today, or actually: yesterday!

If you missed the latest cycle in new tools for test automation you’re either an ostrich with your head in the ground (“sorry vacation was in Southern Africa”), or you just simply still afraid. Afraid of change that test automation would cannibalize your manual test execution.

Test automation is not anymore that it takes over test execution in a very complex and unmanageable way. No it offers higher efficiency on test design, test execution, but also more options to test certain non-functional parts of applications – that could not be done without those tools – like security and performance, and virtual environments to do end-2-end tests without test environments that are down all the time. Tools are now also offering more support for testing mobile solutions. Tools are everywhere!

automated-testing

Test automation offers us testers the opportunity to do more, faster, les risky, and cheaper. I set these words specifically in that order. Test automation is often seen as a way to do test cheaper. You can, but you also can do more, for instance:

  • Let the tool do the checks and you explore the application further;
  • Setup a virtual test environment that doesn’t go down after 1 hour of use and test more in the same time;
  • Create and execute more test cases by generating and executing them automatically.
  • Get higher quality by really doing a thorough regression test, instead of a simple check, to find integration errors.

There are enough reasons to work on test automation and I don’t see why not. I think now it is even time for the next step in test automation. What that is time will tell, but I look forward to hearing that at the Test Automation Day in June. Where Bryan Bakker will tell more on this next step in his presentation “Design for Testability – the next step in Test Automation”. After the congress I’ll post my ideas here.

Categories: Testing & QA