
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools


Feed aggregator

Populist Books Can Be Misleading

Herding Cats - Glen Alleman - Sun, 07/05/2015 - 21:32

Populist books serve an important role in the process of "thinking about things." They are simple, understandable in ways that resonate with those not familiar with a topic, and are hopefully gateways to the next level of understanding. Populist books have a downside as well. They are usually simplified versions of the underlying topic, devoid of the details, which unfortunately involve mathematics that may be beyond the casual reader.

I've written about the issues with populist books before. There is a new set of issues that needs to be addressed. Thinking, Fast and Slow is a recent example of a populist book. It has useful material, but leaves out all the groundwork and heavy lifting needed to put its ideas to work.

In graduate school, there are several things you learn before starting your thesis work. Do a literature search. Your bright idea may have already been done. Or worse, your bright idea is a cockamamie idea on day one. If everyone tells you it's a cockamamie idea, you may be able to show the world they're wrong. To do that you need to get through a peer review and a test of your idea by strangers, using actual data, that holds up to ruthless testing by others. There have been a few of those; most have gone on to win the Nobel Prize.

So if you hear some idea that doesn't quite make sense, ask for the data that supports that idea, so you can do independent testing. Better yet, if that idea is an obvious violation of basic principles - either of physics (cold fusion) or of economics (#NoEstimates) - ask those proposing the idea for direct evidence of its applicability that can also be independently tested.

Here's a list of supporting papers, from my library, needed to put the populist ideas to work. Google will find these for you:

  • Anchoring and Adjustment in Software Estimation, Jorge Aranda and Steve Easterbrook
  • Judgement under Uncertainty, Amos Tversky and Daniel Kahneman
  • The Fragile Basic Anchoring Effect, Noel Brewer and Gretchen Chapman
  • The Anchoring-and-Adjustment Heuristic, Nicholas Epley and Thomas Gilovich
  • Review of Tversky and Kahneman (1974): Judgement under uncertainty: Heuristics and Biases
  • Reference points and redistribution preferences: Experimental evidence, Jimmy Charité, Raymond Fisman, and Ilyana Kuziemko
  • Availability: A heuristic for judging frequency and probability, Amos Tversky and Daniel Kahneman
  • Attention and Effort, Daniel Kahneman
  • Assessing Range of Probabilities, Strategic Decision and Risk Management, Stanford Certificate Program, Decision Analysis for the Professional, Chapter 12
  • Anchoring Unbound, Nicholas Epley and Thomas Gilovich
  • Anchoring and Adjustment in Software Project Management: An Experimental Investigation, Timothy Costello, Naval Postgraduate School, Monterey, California

These are a small sample of the background that needs to be examined after reading the populist book.

With this example, you can move beyond populist ideas - no matter how valid - to technical ideas and start putting them to work and testing the outcomes for their efficacy in your domain.

Here's a starting point for that effort in Populist versus Technical View of Problems.


Categories: Project Management

SPaMCAST 349 – Agile Testing, QA Corner – Test Cases, TameFlow Column

http://www.spamcast.net

Listen Now

Subscribe on iTunes

To paraphrase Ed Sullivan, "We have a big, big show this week," so we will keep the up-front chit-chat to a minimum. First up is our essay on Agile Testing. Even if you are not a tester, understanding how testing flows in Agile projects is important to maximize value.

Second, we have a new installment from Jeremy Berriault’s QA Corner.  In this installment Jeremy talks about test cases.  More is not always the right answer.

Anchoring the Cast is Steve Tendon’s column discussing the TameFlow methodology and his great new book, Hyper-Productive Knowledge Work Performance.

Call to Action!

I have a challenge for the Software Process and Measurement Cast listeners for the next few weeks. I would like you to find one person that you think would like the podcast and introduce them to the cast. This might mean sending them the URL or teaching them how to download podcasts. If you like the podcast and think it is valuable they will be thankful to you for introducing them to the Software Process and Measurement Cast. Thank you in advance!

Re-Read Saturday News

We have just begun the Re-Read Saturday of The Mythical Man-Month. We are off to a rousing start, beginning with the Tar Pit. Get a copy now and start reading!

The Re-Read Saturday and other great articles can be found on the Software Process and Measurement Blog.

Remember: We just completed the Re-Read Saturday of Eliyahu M. Goldratt and Jeff Cox's The Goal: A Process of Ongoing Improvement, which began on February 21st. What did you think? Did the re-read cause you to read The Goal for a refresher? Visit the Software Process and Measurement Blog and review the whole re-read.

Note: If you don’t have a copy of the book, buy one. If you use the link below it will support the Software Process and Measurement blog and podcast.

Dead Tree Version or Kindle Version 

Upcoming Events

Software Quality and Test Management 

September 13 – 18, 2015

San Diego, California

http://qualitymanagementconference.com/

I will be speaking on the impact of cognitive biases on teams!  Let me know if you are attending!

 

More on other great conferences soon!

Next SPaMCast

The next Software Process and Measurement Cast will feature our interview with Arlene Minkiewicz. Arlene and I talked about technical debt. Not sure what technical debt is? Well, to some people it is a metaphor for cut corners, and to others it is a measure of work that will need to be done later. In either case, a little goes a long way!

 

Shameless Ad for my book!

Mastering Software Project Management: Best Practices, Tools and Techniques, co-authored by Murali Chemuturi and myself and published by J. Ross Publishing. We have received unsolicited reviews like the following: "This book will prove that software projects should not be a tedious process, neither for you or your team." Support SPaMCAST by buying the book here.

Available in English and Chinese.


Categories: Process Management

R: Wimbledon – How do the seeds get on?

Mark Needham - Sun, 07/05/2015 - 09:38

Continuing on with the Wimbledon data set I’ve been playing with I wanted to do some exploration on how the seeded players have fared over the years.

Taking the last 10 years' worth of data, there have always been 32 seeds, and with the following function we can feed in a seeding and get back the round that player would be expected to reach:

expected_round = function(seeding) {  
  if(seeding == 1) {
    return("Winner")
  } else if(seeding == 2) {
    return("Finals") 
  } else if(seeding <= 4) {
    return("Semi-Finals")
  } else if(seeding <= 8) {
    return("Quarter-Finals")
  } else if(seeding <= 16) {
    return("Round of 16")
  } else {
    return("Round of 32")
  }
}
 
> expected_round(1)
[1] "Winner"
 
> expected_round(4)
[1] "Semi-Finals"

We can then have a look at each of the Wimbledon tournaments and work out how far they actually got.

round_reached = function(player, main_matches) {
  furthest_match = main_matches %>% 
    filter(winner == player | loser == player) %>% 
    arrange(desc(round)) %>% 
    head(1)  
 
    return(ifelse(furthest_match$winner == player, "Winner", as.character(furthest_match$round)))
}
 
seeds = function(matches_to_consider) {
  winners =  matches_to_consider %>% filter(!is.na(winner_seeding)) %>% 
    select(name = winner, seeding =  winner_seeding) %>% distinct()
  losers = matches_to_consider %>% filter( !is.na(loser_seeding)) %>% 
    select(name = loser, seeding =  loser_seeding) %>% distinct()
 
  return(rbind(winners, losers) %>% distinct() %>% mutate(name = as.character(name)))
}

Let's have a look at how the seeds got on last year:

matches_to_consider = main_matches %>% filter(year == 2014)
 
result = seeds(matches_to_consider) %>% group_by(name) %>% 
    mutate(expected = expected_round(seeding), round = round_reached(name, matches_to_consider)) %>% 
    ungroup() %>% arrange(seeding)
 
rounds = c("Did not enter", "Round of 128", "Round of 64", "Round of 32", "Round of 16", "Quarter-Finals", "Semi-Finals", "Finals", "Winner")
result$round = factor(result$round, levels = rounds, ordered = TRUE)
result$expected = factor(result$expected, levels = rounds, ordered = TRUE) 
 
> result %>% head(10)
Source: local data frame [10 x 4]
 
             name seeding       expected          round
1  Novak Djokovic       1         Winner         Winner
2    Rafael Nadal       2         Finals    Round of 16
3     Andy Murray       3    Semi-Finals Quarter-Finals
4   Roger Federer       4    Semi-Finals         Finals
5   Stan Wawrinka       5 Quarter-Finals Quarter-Finals
6   Tomas Berdych       6 Quarter-Finals    Round of 32
7    David Ferrer       7 Quarter-Finals    Round of 64
8    Milos Raonic       8 Quarter-Finals    Semi-Finals
9      John Isner       9    Round of 16    Round of 32
10  Kei Nishikori      10    Round of 16    Round of 16

We’ll wrap all of that code into the following function:

expectations = function(y, matches) {
  matches_to_consider = matches %>% filter(year == y)  
 
  result = seeds(matches_to_consider) %>% group_by(name) %>% 
    mutate(expected = expected_round(seeding), round = round_reached(name, matches_to_consider)) %>% 
    ungroup() %>% arrange(seeding)
 
  result$round = factor(result$round, levels = rounds, ordered = TRUE)
  result$expected = factor(result$expected, levels = rounds, ordered = TRUE)  
 
  return(result)
}

Next, instead of showing the round names it'd be cool to come up with a numerical value indicating how well the player did:

  • -1 would mean they lost in the round before their seeding suggested e.g. seed 2 loses in Semi Final
  • 2 would mean they got 2 rounds further than they should have e.g. Seed 7 reaches the Final

The unclass function comes to our rescue here:

# expectations plot
years = 2005:2014
exp = data.frame()
for(y in years) {
  differences = expectations(y, main_matches) %>% 
    mutate(expected_n = as.numeric(unclass(expected)), 
           round_n = as.numeric(unclass(round)), 
           difference = round_n - expected_n, 
           year = y) %>% 
    select(name, seeding, expected_n, round_n, difference, year)
  exp = rbind(exp, differences) 
}
 
> exp %>% sample_n(10)
Source: local data frame [10 x 6]
 
              name seeding expected_n round_n difference year
1    Tomas Berdych       6          6       5         -1 2011
2    Tomas Berdych       7          6       6          0 2013
3     Rafael Nadal       2          8       5         -3 2014
4    Fabio Fognini      16          5       4         -1 2014
5  Robin Soderling      13          5       5          0 2009
6    Jurgen Melzer      16          5       5          0 2010
7  Nicolas Almagro      19          4       2         -2 2010
8    Stan Wawrinka      14          5       3         -2 2011
9     David Ferrer       7          6       5         -1 2011
10 Mikhail Youzhny      14          5       5          0 2007

We can then group by the ‘difference’ column to see how seeds are getting on as a whole:

> exp %>% count(difference)
Source: local data frame [9 x 2]
 
  difference  n
1         -5  2
2         -4  7
3         -3 24
4         -2 70
5         -1 66
6          0 85
7          1 43
8          2 17
9          3  4
 
library(ggplot2)
ggplot(aes(x = difference, y = n), data = exp %>% count(difference)) +
  geom_bar(stat = "identity") +
  scale_x_continuous(limits = c(min(exp$difference), max(exp$difference) + 1))

So from this visualisation we can see that the most common outcome for a seed is that they reach the round they were expected to reach. There are still a decent number of seeds who do 1 or 2 rounds worse than expected as well though.

Antonios suggested doing some analysis of how the seeds fared on a year by year basis – we’ll start by looking at what % of them exactly achieved their seeding:

exp$correct_pred = 0
exp$correct_pred[exp$difference == 0] = 1
 
exp %>% group_by(year) %>% 
  summarise(MeanDiff = mean(difference),
            PrcCorrect = mean(correct_pred),
            N=n())
 
Source: local data frame [10 x 4]
 
   year   MeanDiff PrcCorrect  N
1  2005 -0.6562500  0.2187500 32
2  2006 -0.8125000  0.2812500 32
3  2007 -0.4838710  0.4193548 31
4  2008 -0.9677419  0.2580645 31
5  2009 -0.3750000  0.2500000 32
6  2010 -0.7187500  0.4375000 32
7  2011 -0.7187500  0.0937500 32
8  2012 -0.7500000  0.2812500 32
9  2013 -0.9375000  0.2500000 32
10 2014 -0.7187500  0.1875000 32

Some years are better than others – we can use a chisq test to see whether there are any significant differences between the years:

tbl = table(exp$year, exp$correct_pred)
tbl
 
> chisq.test(tbl)
 
	Pearson's Chi-squared test
 
data:  tbl
X-squared = 14.9146, df = 9, p-value = 0.09331

This looks for at least one statistically significant difference between the years, although it doesn't look like there are any. We can also try doing a comparison of each year against all the others:

> pairwise.prop.test(tbl)
 
	Pairwise comparisons using Pairwise comparison of proportions 
 
data:  tbl 
 
     2005 2006 2007 2008 2009 2010 2011 2012 2013
2006 1.00 -    -    -    -    -    -    -    -   
2007 1.00 1.00 -    -    -    -    -    -    -   
2008 1.00 1.00 1.00 -    -    -    -    -    -   
2009 1.00 1.00 1.00 1.00 -    -    -    -    -   
2010 1.00 1.00 1.00 1.00 1.00 -    -    -    -   
2011 1.00 1.00 0.33 1.00 1.00 0.21 -    -    -   
2012 1.00 1.00 1.00 1.00 1.00 1.00 1.00 -    -   
2013 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 -   
2014 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
 
P value adjustment method: holm


2007/2011 and 2010/2011 show the biggest differences, but they're still not significant. Since we have so few data items in each bucket, there would have to be a really massive difference for it to show up as significant.
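To get a sense of how large a gap those small samples could ever detect, a rough power calculation helps. This is a sketch rather than part of the original analysis; the 25% baseline is roughly the typical proportion of seeds meeting their seeding above:

# With 32 seeds per year, how far apart would two years' "met their seeding"
# rates need to be for a two-sample comparison to reach 80% power at the 5% level?
# Solving for p2 with p1 = 0.25 gives a rate of roughly 0.6 - well over double.
power.prop.test(n = 32, p1 = 0.25, power = 0.80, sig.level = 0.05)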

The data I used in this post is available on this gist if you want to look into it and come up with your own analysis.

Categories: Programming

Born on the 4th of July

Star of India, San Diego, CA

I wasn't born on the 4th of July, but my wife was. This week we have a perfect storm of activities that have conspired to postpone the second installment of the re-read of The Mythical Man-Month: Essays on Software Engineering this Saturday. My wife's birthday, a national holiday and a family reunion are keeping me away from the keyboard today. So please pardon the interruption!


Categories: Process Management

Happy 4th of July

Herding Cats - Glen Alleman - Sat, 07/04/2015 - 20:10

Semper Fidelis to all my colleagues and friends. Wait till minute 3:57 

Categories: Project Management

Agile Program Manager

Individually, each boy is a project. Together they're a program to be managed.

Scaling Agile project management to large, complex endeavors requires an Agile Program Manager to address the big-picture coordination of programs. Program management is the discipline of coordinating and managing large efforts comprised of a number of parallel and related projects. Scrum leverages a concept called the Scrum of Scrums to perform many of the activities needed for program management. Agile Program Management is not just repurposed project management or a part-time job for a Scrum Master.

Agile Program Managers coordinate and track expectations across all projects under the umbrella of the program, whether the projects are using Agile or not. Coordination includes activities like identifying and tracking dependencies, tracking risks and issues, and communication. Coordination of the larger program generally requires developing a portfolio of moving parts at the epic or function level across all of the related projects (epics are large user stories that represent large concepts that will be broken down later). Agile Program Managers layer each project's release plans on top of the program portfolio to provide a platform for coordinated release planning. Techniques like Kanban can be used for tracking and visualizing the portfolio. Visualization shows how the epics or functions are progressing as they are developed and staged for delivery to the program's customers.

Facilitating communication is one of the roles of an Agile Program Manager. The Scrum of Scrums is the primary vehicle for ensuring communication. The Scrum of Scrums is a meeting of all of the directly responsible individuals (DRIs) from each team in the program. The DRI has the responsibility to act as the conduit of information between his or her team, the Agile Program Manager and the other DRIs. The DRI raises issues, risks, concerns and needs. In short, the DRI communicates to the team and to the Scrum of Scrums. The Scrum of Scrums is best run as a daily meeting of the DRIs chaired by the Agile Program Manager; however, the frequency can be tailored to meet the program's needs. A pattern I have seen used to minimize overhead is varying the frequency of the Scrum of Scrums based on project risk.

Another set of activities that generally falls to the Agile Program Manager is the development and communication of program status information. Chairing high-level status meetings, such as those with sponsors or other guidance groups, is a natural extension of the role. However, this requires the Agile Program Manager to act as a conduit of information by transferring knowledge from the Scrum of Scrums to the sponsors and back again. Any problem with information flow can potentially cause bad decisions and will affect the program.

It is important to recognize that Agile Program Management is more than a specialization within the realm of project management or a side job a Scrum Master can do in his or her spare time. Agile Program Managers need to be well versed in both Agile techniques and standard program management techniques, because the Agile Program Manager is a hybrid of both camps. Agile Program Managers build the big-picture view that a portfolio view of all of the related projects delivers. They also must facilitate communication via the Scrum of Scrums and standard program status vehicles. The Agile Program Manager must often straddle the line between the Agile and waterfall worlds.


Categories: Process Management

R: Calculating the difference between ordered factor variables

Mark Needham - Thu, 07/02/2015 - 23:55

In my continued exploration of Wimbledon data I wanted to work out whether a player had done as well as their seeding suggested they should.

I therefore wanted to work out the difference between the round they reached and the round they were expected to reach. A ’round’ in the dataset is an ordered factor variable.

These are all the possible values:

rounds = c("Did not enter", "Round of 128", "Round of 64", "Round of 32", "Round of 16", "Quarter-Finals", "Semi-Finals", "Finals", "Winner")

And if we want to factorise a couple of strings into this factor we would do it like this:

round = factor("Finals", levels = rounds, ordered = TRUE)
expected = factor("Winner", levels = rounds, ordered = TRUE)  
 
> round
[1] Finals
9 Levels: Did not enter < Round of 128 < Round of 64 < Round of 32 < Round of 16 < Quarter-Finals < ... < Winner
 
> expected
[1] Winner
9 Levels: Did not enter < Round of 128 < Round of 64 < Round of 32 < Round of 16 < Quarter-Finals < ... < Winner

In this case the difference between the actual round and the expected round should be -1 – the player was expected to win the tournament but lost in the final. We can calculate that difference by calling the unclass function on each variable:

 
> unclass(round) - unclass(expected)
[1] -1
attr(,"levels")
[1] "Did not enter"  "Round of 128"   "Round of 64"    "Round of 32"    "Round of 16"    "Quarter-Finals"
[7] "Semi-Finals"    "Finals"         "Winner"

That still seems to have some remnants of the factor variable so to get rid of that we can cast it to a numeric value:

> as.numeric(unclass(round) - unclass(expected))
[1] -1

And that’s it! We can now go and apply this calculation to all seeds to see how they got on.

Categories: Programming

Game Performance: Data-Oriented Programming

Android Developers Blog - Thu, 07/02/2015 - 22:12

Posted by Shanee Nishry, Game Developer Advocate

To improve game performance, we’d like to highlight a programming paradigm that will help you maximize your CPU potential, make your game more efficient, and code smarter.

Before we get into detail of data-oriented programming, let’s explain the problems it solves and common pitfalls for programmers.

Memory

The first thing a programmer must understand is that memory is slow, and the way you code affects how efficiently it is utilized. Inefficient memory layout and ordering of operations force the CPU to sit idle, waiting for memory, before it can proceed with its work.

The easiest way to demonstrate this is with an example. Take this simple code for instance:

char data[1000000]; // One Million bytes
unsigned int sum = 0;

for ( int i = 0; i < 1000000; ++i )
{
  sum += data[ i ];
}

An array of one million bytes is declared and iterated over one byte at a time. Now let's change things a little to illustrate the underlying hardware; the changes are the array size and the loop stride:

char data[16000000]; // Sixteen Million bytes
unsigned int sum = 0;

for ( int i = 0; i < 16000000; i += 16 )
{
  sum += data[ i ];
}

The array is changed to contain sixteen million bytes and we iterate over one million of them, skipping 16 at a time.

A quick look suggests there shouldn't be any effect on performance as the code is translated to the same number of instructions and runs the same number of times, however that is not the case. Here is the difference graph. Note that this is on a logarithmic scale--if the scale were linear, the performance difference would be too large to display on any reasonably-sized graph!


Graph in logarithmic scale

The simple change making the loop skip 16 bytes at a time makes the program run 5 times slower!

The average difference in performance is 5x and is consistent when iterating over anything from 1,000 bytes up to a million bytes, sometimes increasing up to 7x. This is a serious change in performance.

Note: The benchmark was run on multiple hardware configurations including a desktop with Intel 5930K 3.50GHz CPU, a Macbook Pro Retina laptop with 2.6 GHz Intel i7 CPU and Android Nexus 5 and Nexus 6 devices. The results were pretty consistent.

If you wish to replicate the test, you might have to ensure the memory is out of the cache before running the loop because some compilers will cache the array on declaration. Read below to understand more on how it works.

Explanation

What happens in the example is quite simply explained once you understand how the CPU accesses data. The CPU can't access data directly in RAM; the data must first be copied into the cache, a smaller but extremely fast memory which resides near the CPU chip.

When the program starts, the CPU is set to run an instruction on part of the array but that data is still not in the cache, therefore causing a cache miss and forcing the CPU to wait for the data to be copied into the cache.

For simplicity's sake, assume a line size of 16 bytes for the L1 cache; this means 16 bytes will be copied starting from the requested address for the instruction.

In the first code example, the program next tries to operate on the following byte, which was already copied into the cache following the initial cache miss, so it continues smoothly. The same is true for the next 14 bytes. Sixteen bytes after the first cache miss, the loop will encounter another cache miss and the CPU will again wait for data to operate on while the next 16 bytes are copied into the cache.

In the second code sample, the loop skips 16 bytes at a time but the hardware continues to operate the same way. The cache copies the 16 subsequent bytes each time it encounters a cache miss, which means the loop triggers a cache miss on every iteration and causes the CPU to wait idle for data each time!

Note: Modern hardware implements cache prefetch algorithms to avoid incurring a cache miss on every iteration, but even with prefetching, more bandwidth is used and performance is lower in our example test.

In reality, cache lines tend to be larger than 16 bytes, and the program would run much slower if it had to wait for data at every iteration. The Krait 400 found in the Nexus 5 has an L0 data cache of 4 KB with 64 bytes per line.

If you are wondering why cache lines are so small, the main reason is that making fast memory is expensive.

Data-Oriented Design

The way to solve such performance issues is to design your data to fit into the cache and to have the program operate on that data contiguously.

This can be done by organizing your game objects inside Structures of Arrays (SoA) instead of Arrays of Structures (AoS) and pre-allocating enough memory to contain the expected data.

For example, a simple physics object in an AoS layout might look like this:

struct PhysicsObject
{
  Vec3 mPosition;
  Vec3 mVelocity;

  float mMass;
  float mDrag;
  Vec3 mCenterOfMass;

  Vec3 mRotation;
  Vec3 mAngularVelocity;

  float mAngularDrag;
};

This is a common way to represent an object in C++.

On the other hand, using SoA layout looks more like this:

class PhysicsSystem
{
private:
  size_t mNumObjects;
  std::vector< Vec3 > mPositions;
  std::vector< Vec3 > mVelocities;
  std::vector< float > mMasses;
  std::vector< float > mDrags;

  // ...
};

Let’s compare how a simple function to update object positions by their velocity would operate.

For the AoS layout, a function would look like this:

void UpdatePositions( PhysicsObject* objects, const size_t num_objects, const float delta_time )
{
  for ( int i = 0; i < num_objects; ++i )
  {
    objects[i].mPosition += objects[i].mVelocity * delta_time;
  }
}

The PhysicsObject is loaded into the cache but only the first two variables are used. At 12 bytes each, that amounts to just 24 bytes of the cache line being utilised per iteration, and it causes a cache miss for every object on the 64-byte cache line of a Nexus 5.

Now let’s look at the SoA way. This is our iteration code:

void PhysicsSystem::SimulateObjects( const float delta_time )
{
  for ( int i = 0; i < mNumObjects; ++i )
  {
    mPositions[ i ] += mVelocities[i] * delta_time;
  }
}

With this code, we immediately cause 2 cache misses, but we are then able to run smoothly for about 5.3 iterations before causing the next 2 cache misses resulting in a significant performance increase!

The way data is sent to the hardware matters. Be aware of data-oriented design and look for places it will perform better than object-oriented code.

We have barely scratched the surface. There is still more to data-oriented programming than structuring your objects. For example, the cache is used for storing instructions and function memory so optimizing your functions and local variables affects cache misses and hits. We also did not mention the L2 cache and how data-oriented design makes your application easier to multithread.

Make sure to profile your code to find out where you might want to implement data-oriented design. You can use different profilers for different architectures, including the NVIDIA Tegra System Profiler, ARM Streamline Performance Analyzer, Intel and PowerVR PVRMonitor.

If you want to learn more about how to optimize for your cache, read up on cache prefetching for various CPU architectures.

Join the discussion on

+Android Developers
Categories: Programming

Agency Product Owner Training Starts in August

We have an interesting problem in some projects. Agencies, consulting organizations, and consultants help their clients understand what the client needs in a product. Often, these people and their organizations then implement what the client and agency develop as ideas.

As the project continues, the agency manager continues to help the client identify and update the requirements. Because this is a limited-time contract, the client doesn't have a product manager or product owner. The agency person—often the owner—acts as a product owner.

This is why Marcus Blankenship and I have teamed up to offer Product Owner Training for Agencies.

If you are an agency/consultant/outside your client’s organization and you act as a product owner, this training is for you. It’s based on my workshop Agile and Lean Product Ownership. We won’t do everything in that workshop. Because it’s an online workshop, you’ll work on your projects/programs in between our meetings.

If you are not part of an organization and you find yourself acting as a product owner, this training is for you. See Product Owner Training for Agencies.

Categories: Project Management

Do You Really Bill $300 an Hour?

Making the Complex Simple - John Sonmez - Thu, 07/02/2015 - 15:00

In this episode, I explain why I bill $300 an hour. Full transcript: John: Hey, this is John Sonmez from simpleprogrammer.com. I got a question today about my billing rate, so whether it's real or not, and I found a few questions like this. I just want to say upfront that when I say something […]

The post Do You Really Bill $300 an Hour? appeared first on Simple Programmer.

Categories: Programming

Flappy

Phil Trelford's Array - Thu, 07/02/2015 - 08:04

This week I ran a half-day hands on games development session at the Progressive .Net Tutorials hosted by Skills Matter in London. I believe this was the last conference to be held in Goswell Road before the big move to an exciting new venue.

My session was on mobile games development with F# as the implementation language:

Ready, steady, cross platform games - ProgNet 2015 from Phillip Trelford

Here’s a quick peek inside the room:

Game programming with @ptrelford @ #prognet2015 I'ma make and sell a BILLION copies of this game y'all! pic.twitter.com/5qa4wOSY1G

— Adron (@adron) July 1, 2015

The session tasks were around 2 themes:

  • implement a times table question and answer game (think Nintendo's Brain Training game)
  • extend a simple Flappy Bird clone

Times table game

The motivation behind this example was to help people:

  • build a simple game loop
  • pick up some basic F# skills

The first tasks, like asking a multiplication question, could be built using F#'s REPL (F# Interactive), and later tasks that took user input required running as a console application.

Here are some of the great solutions that were posted up to F# Snippets:

To run them, create a new F# Console Application project in Xamarin Studio or Visual Studio and paste in the code (use the Raw view in F# Snippets to copy the code).

Dominic Finn’s source code includes some fun ASCII art too:

// _____ _   _ _____ _____ _____  ______  _  _   _____  _____ _     
//|  __ \ | | |  ___/  ___/  ___| |  ___|| || |_|  _  ||  _  | |    
//| |  \/ | | | |__ \ `--.\ `--.  | |_ |_  __  _| | | || | | | |    
//| | __| | | |  __| `--. \`--. \ |  _| _| || |_| | | || | | | |    
//| |_\ \ |_| | |___/\__/ /\__/ / | |  |_  __  _\ \_/ /\ \_/ / |____
// \____/\___/\____/\____/\____/  \_|    |_||_|  \___/  \___/\_____/
//  

Flappy Bird clone

For this example I sketched out a Flappy Bird clone using MonoGame (along with WinForms and WPF versions for comparison), with the idea that people could enhance and extend the game:


MonoGame lets you target multiple platforms including iOS and Android along with Mac, Linux, Windows and even Raspberry Pi!

The different flavours are available on F# Snippets; simply cut and paste them into an F# script file to run them:

All the samples and tasks are also available in a zip: http://trelford.com/ProgNet15.zip

Have fun!

Categories: Programming

SE-Radio Episode 231: Joshua Suereth and Matthew Farwell on SBT and Software Builds

Joshua Suereth and Matthew Farwell discuss SBT (Simple Build Tool) and their new book SBT in Action. They first look at the factors creating a need for build systems and why they think SBT – a new addition to this area – is a valuable contribution in spite of the vast number of existing build tools. Host Tobias Kaatz, […]
Categories: Programming

What Happened to our Basic Math Skills?

Herding Cats - Glen Alleman - Wed, 07/01/2015 - 15:29

Making decisions in the presence of uncertainty about the future outcomes resulting from those decisions is an important topic in the project management, product development, and engineering domains. The first question in this domain is...

If the future is not identical to the past, how can we make a decision in the presence of this future uncertainty?

The answer is we need some means of taking what we know about the past and the present and turning it into information about the future. This information can be measurements of actual activities - cost, duration of work, risks, dependencies, performance and effectiveness measures, models and simulation of past and future activities, reference classes, parametric models.

If the future is identical to the past and the present, then all this data can show us a simple straight line projection from the past to the future.
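As a minimal sketch (in R, with invented throughput numbers), that straight-line case is nothing more than a linear fit to past performance extrapolated forward:

# If the future really is like the past, the simplest forecast is a straight
# line through past performance. The throughput numbers are invented.
weeks     = 1:10
completed = c(4, 5, 3, 6, 5, 7, 6, 8, 7, 9)   # items finished in each past week
 
fit = lm(completed ~ weeks)
predict(fit, newdata = data.frame(weeks = 11:14))  # naive projection for the next four weeks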

But there are some questions:

  • Is the future like the past? Have we just assumed this? Or have we actually developed an understanding of the future by looking at what could possibly change from the past?
  • If there is no change, can that future be sustained long enough for our actions to have a beneficial impact?
  • If we discover the future may not be like the past, what is the statistical behavior of this future, how can we discover this behavior, and how will these changes impact our decision making processes?

The answers to these and many other questions can be found in the mathematics of probability and statistics. Here are some popular misconceptions of mathematical concepts.

Modeling is the Key to Decision Making

"All models are wrong, some are useful," George Box and Norman R. Draper (1987). Empirical Model-Building and Response Surfaces, p. 424, Wiley. ISBN 0471810339. 

  • This book is about process control systems and the statistical process models used to design and operate the control systems in chemical plants. (This is a domain I have worked in and developed software for.)
  • This quote has been wildly misquoted, not only out of context, but also completely outside the domain it is applicable to.
  • "All models are wrong" says that every model is wrong because it is a simplification of reality. This is the definition of a model.
  • Some models, in the "hard" sciences, are only a little wrong. They ignore things like friction or the gravitational effect of tiny bodies. Other models are a lot wrong - they ignore bigger things. In the social sciences, big things are ignored.
  • Statistical models are descriptions of systems using mathematical language. In many cases we can add a certain layer of abstraction to enable an inferential procedure.
  • It is almost impossible for a single model to describe perfectly a real-world phenomenon, given our own subjective view of the world, since our sensory system is not perfect.
  • But - and this is the critical misinterpretation of Box's quote - successful statistical inference does happen, because there is a degree of consistency in the world that we can exploit.
  • So our almost-always-wrong models do prove useful.

We can't possibly estimate activities in the future if we don't already know what they are

We actually do this all the time. More importantly, there are simple step-by-step methods for making credible estimates about unknown - BUT KNOWABLE - outcomes.
This notion of unknown but knowable is critical. If we really can't know - if it is unknowable - then the work is not a project. It is pure research. So move on, unless you're a PhD researcher.

Here's a little dialog showing how to estimate most anything in the software development world; a small numeric sketch follows it.
With your knowledge and experience in the domain and a reasonable understanding of what the customer wants (no units of measure for reasonable, by the way, sorry), let's ask some questions.

I have no pre-defined expectation of the duration. That is, I have no anchor to start from. If I did, and didn't have a credible estimate, I'd be a Dilbert manager - and I'm not.

  • Me - now that you know a little bit about my needed feature, can you develop this in less than 6 months?
  • You - of course I can, I'm not a complete moron.
  • Me - good, I knew I was right to hire you. How about developing this feature in a week?
  • You - are you out of your mind? I'd have to be a complete moron to sign up for that.
  • Me - good, still confirms I hired the right person for the job. How about getting it done in 4 months?
  • You - well that still seems like too long, but I guess it'll be more than enough time if we run into problems or it turns out you don't really know what you want and change your mind.
  • Me - thanks for the confidence in my ability. How about 6 weeks for this puppy?
  • You - aw come on, now you're making me cranky. I don't know anyone, except someone who has done this already, that can do it in 6 weeks. That's a real stretch for me. A real risk of failure, and I don't want that. You hired me to be successful, and now you're setting me up for failure.
  • Me - good, just checking. How about 2½ months - about 10 weeks?
  • You - yea, that still sounds pretty easy, with some margin. I'll go for that.
  • Me - nice, I like the way you think. How about 7 weeks?
  • You - boy, you're a pushy one aren't you. That's a stretch, but I've got some sense of what you want. It's possible, but I can't really commit to being done in that time; it'll be risky but I'll try.
  • Me - good, let's go with 8½ weeks for now, and we'll update the estimate after a few weeks of you actually producing output I can look at.
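The same back-and-forth can be turned into a number. Here is a minimal sketch, in R, of a three-point (PERT-style) estimate; the optimistic, most likely, and pessimistic values are my reading of the dialog above, and the (o + 4m + p)/6 weighting is a common rule of thumb rather than anything prescribed by this post:

# Three-point estimate - a sketch, with week values read from the dialog above
optimistic  = 7    # "that's a stretch, but possible"
most_likely = 10   # "sounds pretty easy, with some margin"
pessimistic = 17   # 4 months was "more than enough time"
 
expected = (optimistic + 4 * most_likely + pessimistic) / 6
spread   = (pessimistic - optimistic) / 6
 
expected  # ~10.7 weeks
spread    # ~1.7 weeks - quote the estimate as a range, not a single number

The point is not this particular formula; it is that the answers in the dialog already contain enough information to produce a ranged estimate rather than a single number.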

Microeconomics of Decision Making

Making decisions about the future in the presence of uncertainty can be addressed by microeconomics principles. Microeconomics is a branch of economics that studies the behavior of individuals and small organizations in making decisions on the allocation of limited resources. Projects have limited resources; businesses have limited resources. All human endeavors have limited resources - time, money, talent, capacity for work, skills, and other unknowns.

The microeconomics of decision making involves several variables; a small worked example follows the list:

  • Opportunity cost - the value of what we give up by taking that action. If we decide between A and B and choose B, what is the cost of A that we're giving up?
  • Marginal cost analysis - the impact of small changes in the "how-much" decision.
  • Sunk cost - costs that have already been incurred and cannot be recovered.
  • Present Value - the value today of a future cost or benefit.
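As a minimal sketch (in R, with invented numbers), here is how two of these variables show up when comparing options:

# Present value: a benefit of 120 received two years from now, discounted at 10% per year.
discount_rate = 0.10
pv_benefit    = 120 / (1 + discount_rate)^2   # ~99.2 in today's money
 
# Opportunity cost: choosing option B means forgoing option A's net value.
net_value_a = 80
net_value_b = 95
opportunity_cost_of_b = net_value_a               # what we give up by choosing B
gain_from_choosing_b  = net_value_b - net_value_a # only worth it if this is positive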

Formally, defining this choice problem is simple: there is a state space S, whose elements are called states of nature and represent all the possible realizations of uncertainty; there is an outcome space X, whose elements represent the possible results of any conceivable decision; and there is a preference relation ⪰ over the mappings from S to X. †

This of course provides little in the way of making a decision on a project. But the point here is that making decisions in the presence of uncertainty is a well-developed discipline. Conjecturing that it can't be done simply ignores this discipline.

The Valuation of Project Deliverables

It's been conjectured that focusing on value is the basis of good software development efforts. When it is suggested that this value is independent of cost, this is misinformed. Valuation, and the resulting value used to compare choices, is the process of determining the economic value of an asset, be it a created product, a service, or a process. Value is defined as the net worth, or the difference between the benefits produced by the asset and the costs to develop or acquire the asset, all adjusted appropriately for probabilistic risk, at some point in time.

This valuation has several difficulties:

  • Costs and benefits might occur at different points in time and need to be adjusted, or discounted, to account for the time value of money - the fundamental principle that money is worth more today than in the future under ordinary economic conditions.
  • Not all determinants of value are known at the time of the valuation, since there is uncertainty inherent in all project and business environments.
  • Intangible benefits like learning, growth or emergent opportunities, and embedded flexibility are the primary sources of value in the presence of uncertainty.

The valuation of the outcomes of software projects depends on the analysis of these underlying costs and benefits. A prerequisite for cost-benefit analysis is the identification of the relevant value drivers and the cost drivers needed to produce that value. Both cost and value are probabilistic, driven by uncertainty - both reducible and irreducible.

Modeling Uncertainty

In addition to the measurable benefits and costs of the software project, the valuation process must consider uncertainty. Uncertainty arises from different sources. Natural (aleatory) uncertainty is irreducible; it relates to variations in the environment's variables. Dealing with irreducible uncertainty requires margin for cost, schedule, and the performance of the outcomes - for both value and cost.

Event-based (epistemic) uncertainty is reducible. That is, we can buy down this uncertainty with our actions. We can pay money to find things out. We can pay money to improve the value delivered from the cost we invest to produce that value.

Parameter uncertainty relates to the estimation of parameters (e.g., the reliability of the average number of defects). Model uncertainty relates to the validity of the specific models used (e.g., the suitability of a certain distribution to model the defects). There is a straightforward taxonomy of uncertainty for software engineering that includes additional sources such as scope error and assumption error. The standard approach to handling uncertainty is to define probability distributions for the underlying quantities, allowing the application of a standard calculus. Other approaches, based on fuzzy measures or Bayesian networks, consider different types of prior knowledge. ‡
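As a minimal sketch (in R, with an invented triangular distribution), this is what defining a probability distribution for an underlying quantity and reading a margin off it looks like:

# Model a task duration as triangular(min = 7, mode = 10, max = 17) weeks -
# invented numbers - then read the schedule margin off the simulated outcomes.
set.seed(42)
rtriangular = function(n, min, mode, max) {
  u  = runif(n)
  fc = (mode - min) / (max - min)
  ifelse(u < fc,
         min + sqrt(u * (max - min) * (mode - min)),
         max - sqrt((1 - u) * (max - min) * (max - mode)))
}
 
durations = rtriangular(10000, min = 7, mode = 10, max = 17)
 
median(durations)                              # the 50/50 point estimate
quantile(durations, 0.80)                      # value to commit to at 80% confidence
quantile(durations, 0.80) - median(durations)  # margin carried against the irreducible uncertainty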

The Final Point Once Again

The conjecture that we can make informed decisions about choices in an uncertain future in the absence of estimates of the impacts of those choices has no basis in the mathematics of decision making.

This conjecture is simply not true. Any attempt to show this can be done has yet to materialize in any testable manner. This is where the basic math skills come into play. There is no math that supports this conjecture. Therefore there is no way to test this conjecture. It's personal opinion uninformed by any mathematics.

Proceed with caution when you hear this.

† Decision Theory Under Uncertainty, Johanna Etner, Meglena Jeleva, Jean-Marc Tallon,  Centre d’Economie de la Sorbonne 2009.64

‡ Estimates, Uncertainty and Risk, IEEE Software, 69-74 (May 1997), Kitchenham and Linkman, and "Belief Functions in Business Decisions," Studies in Fuzziness and Soft Computing, Vol. 88, Srivastava and Mock

Categories: Project Management

Software Architecture for Developers in Chinese

Coding the Architecture - Simon Brown - Wed, 07/01/2015 - 11:14

Although it's been on sale in China for a few months, my copies of the Chinese translation of my Software Architecture for Developers book have arrived. :-)

Software Architecture for Developers

I can't read it, but seeing my C4 diagrams in Chinese is fun! Stay tuned for more translations.

Categories: Architecture

Agile Teams Making Decisions (Refined)

Team members working together (or not).

In Agile projects, the roles of the classic project manager in Scrum are spread across the three basic roles (Product Owner, Scrum Master and Development Team). A fourth role, the Agile Program Manager (known as a Release Train Engineer in SAFe), is needed when multiple projects are joined together to become a coordinated program.  The primary measures of success in Agile projects are delivered business value and customer satisfaction.  These attributes subsume the classic topics of on-time, on-budget and on-scope. (Note: Delivered value and customer satisfaction should be the primary measure of success in ALL types of projects, however these are not generally how project teams are held accountable.)

As teams learn to embrace and use Agile principles, they need to learn how to make decisions as a team. The decisions that teams need to learn how to make for themselves always have consequences, and sometimes those consequences will be negative. To accomplish this learning process in the least risky manner, the team should use techniques like delaying decisions as late as is practical and delivering completed work within short time boxes. These techniques reduce risk by increasing the time the team has to gather knowledge and by getting the team feedback quickly. The organization also must learn how to encourage the team to make good decisions while giving them the latitude to mess up. This requires the organization to accept some level of explicit initial ambiguity that is caused by delaying decisions, rather than the implicit ambiguity of making decisions early that later turn out to be wrong. The organization must also learn to evaluate teams and individuals less on the outcome of a single decision and more on the value the team delivered.

Teams also have to unlearn habits, for example, relying on others to plan for them. In order to do that, all leaders and teams must have an understanding of the true goals of the project (listen to my interview with David Marquet) and how the project fits into the strategic goals of the organization.

Teams make decisions daily that affect the direction of the sprint and the project. The faster these decisions are made, the higher the team's velocity or productivity. Having a solid understanding of the real goals of the project helps the team make decisions more effectively. Organizations need to learn how to share knowledge that today is generally compartmentalized between developers, testers and analysts.

The process of learning and unlearning occurs on a continuum as teams push toward a type of collective self-actualization. As any team moves toward its full potential, the organization's need to control planning and decisions falls away. If the organization doesn't back away from the tenets of command and control and move toward the Agile principles, the ability of any team to continue to grow will be constrained. The tipping point generally occurs when an organization realizes that self-managing and self-organizing teams deliver superior value and higher customer satisfaction, and that in the long run is what keeps CIOs employed.


Categories: Process Management

R: write.csv – unimplemented type 'list' in 'EncodeElement'

Mark Needham - Tue, 06/30/2015 - 23:26

Every now and then I want to serialise an R data frame to a CSV file so I can easily load it up again if my R environment crashes, without having to recalculate everything. Recently I ran into the following error:

> write.csv(foo, "/tmp/foo.csv", row.names = FALSE)
Error in .External2(C_writetable, x, file, nrow(x), p, rnames, sep, eol,  : 
  unimplemented type 'list' in 'EncodeElement'

If we take a closer look at the data frame in question it looks ok:

> foo
  col1 col2
1    1    a
2    2    b
3    3    c

However, one of the columns contains a list in each cell and we need to find out which one it is. I’ve found the quickest way is to run the typeof function over each column:

> typeof(foo$col1)
[1] "double"
 
> typeof(foo$col2)
[1] "list"

So 'col2' is the problem one, which isn't surprising if you consider the way I created 'foo':

library(dplyr)
foo = data.frame(col1 = c(1,2,3)) %>% mutate(col2 = list("a", "b", "c"))

If we do have a list that we want to add to the data frame we need to convert it to a vector first so we don’t run into this type of problem:

foo = data.frame(col1 = c(1,2,3)) %>% mutate(col2 = list("a", "b", "c") %>% unlist())

And now we can write to the CSV file:

write.csv(foo, "/tmp/foo.csv", row.names = FALSE)
$ cat /tmp/foo.csv
"col1","col2"
1,"a"
2,"b"
3,"c"

And that’s it!

Categories: Programming

GTAC 2015: Call for Proposals & Attendance

Google Testing Blog - Tue, 06/30/2015 - 22:11
Posted by Anthony Vallone on behalf of the GTAC Committee

The GTAC (Google Test Automation Conference) 2015 application process is now open for presentation proposals and attendance. GTAC will be held at the Google Cambridge office (near Boston, Massachusetts, USA) on November 10th - 11th, 2015.

GTAC will be streamed live on YouTube again this year, so even if you can’t attend in person, you’ll be able to watch the conference remotely. We will post the live stream information as we get closer to the event, and recordings will be posted afterward.

Speakers
Presentations are targeted at student, academic, and experienced engineers working on test automation. Full presentations are 30 minutes and lightning talks are 10 minutes. Speakers should be prepared for a question and answer session following their presentation.

Application
For presentation proposals and/or attendance, complete this form. We will be selecting about 25 talks and 200 attendees for the event. The selection process is not first come, first served (no need to rush your application), and we select a diverse group of engineers from various locations, company sizes, and technical backgrounds (academic, industry expert, junior engineer, etc.).

Deadline
The due date for both presentation and attendance applications is August 10th, 2015.

Fees
There are no registration fees, but speakers and attendees must arrange and pay for their own travel and accommodations.

More information
You can find more details at developers.google.com/gtac.

Categories: Testing & QA

Debian Size Claims - New Lecture Posted

10x Software Development - Steve McConnell - Tue, 06/30/2015 - 19:17

In this week's lecture (https://cxlearn.com) I demonstrate how to use some of the size information we've discussed in other lectures by diving into the Wikipedia claims about the sizes of various versions of Debian.  The point of this week's lecture is to show how to apply critical thinking to size information presented by an authoritative source (Wikipedia), and how to arrive at a confident conclusion that that information is not credible. Practicing software professionals should be able to look at size claims like the Debian size claims and, based on general knowledge, immediately think, "That seems far from credible." Yet, few professionals actually do that. My hope is that working through public examples like this in the lecture series will help software professionals improve their instincts and judgment, which can then be applied to projects in their own organizations. 

Lectures posted so far include:  

0.0 Understanding Software Projects - Intro
     0.1 Introduction - My Background
     0.2 Reading the News
     0.3 Definitions and Notations 

1.0 The Software Lifecycle Model - Intro
     1.1 Variations in Iteration 
     1.2 Lifecycle Model - Defect Removal
     1.3 Lifecycle Model Applied to Common Methodologies
     1.4 Lifecycle Model - Selecting an Iteration Approach  

2.0 Software Size
     2.05 Size - Comments on Lines of Code
     2.1 Size - Staff Sizes 
     2.2 Size - Schedule Basics 
     2.3 Size - Debian Size Claims (New)

Check out the lectures at http://cxlearn.com!

Understanding Software Projects - Steve McConnell

 

Succeeding with Geographically Distributed Scrum Teams - New White Paper

10x Software Development - Steve McConnell - Tue, 06/30/2015 - 19:02

We have a new white paper, "Succeeding with Geographically Distributed Scrum Teams." To quote the white paper itself: 

When organizations adopt Agile throughout the enterprise, they typically apply it to both large and small projects. The gap is that most Agile methodologies, such as Scrum and XP, are team-level workflow approaches. These approaches can be highly effective at the team level, but they do not address large project architecture, project management, requirements, and project planning needs. Our clients find that succeeding with Scrum on a large, geographically distributed team requires adopting additional practices to ensure the necessary coordination, communication, integration, and architectural work. This white paper discusses common considerations for success with geographically distributed Scrum.

Check it out!

What’s new with Google Fit: Distance, Calories, Meal data, and new apps and wearables

Google Code Blog - Tue, 06/30/2015 - 18:52

Posted by Angana Ghosh, Lead Product Manager, Google Fit

To help users keep track of their physical activity, we recently updated the Google Fit app with some new features, including an Android Wear watch face that helps users track their progress throughout the day. We also added data types to the Google Fit SDK and have new partners tracking data (e.g. nutrition, sleep, etc.) that developers can now use in their own apps. Find out how to integrate Google Fit into your app and read on to check out some of the cool new stuff you can do.


Distance traveled per day

The Google Fit app now computes the distance traveled per day. Subscribe to it using the Recording API and query it using the History API.

Calories burned per day

If a user has entered their details into the Google Fit app, the app now computes their calories burned per day. Subscribe to it using the Recording API and query it using the History API.

Nutrition data from LifeSum, Lose It!, and MyFitnessPal

LifeSum and Lose It! are now writing nutrition data, like calories consumed, macronutrients (proteins, carbs, fats), and micronutrients (vitamins and minerals) to Google Fit. MyFitnessPal will start writing this data soon too. Query it from Google Fit using the History API.

Sleep activity from Basis Peak and Sleep as Android

Basis Peak and Sleep as Android are now writing sleep activity segments to Google Fit. Query this data using the History API.

New workout sessions and activity data from even more great apps and fitness wearables!

Endomondo, Garmin, the Daily Burn, the Basis Peak and the Xiaomi miBand are new Google Fit partners that will allow users to store their workout sessions and activity data. Developers can access this data with permission from the user, which will also be shown in the Google Fit app.

How are developers using the Google Fit platform?

Partners like LifeSum and Lose It! are reading all-day activity to help users keep track of their physical activity in their favorite fitness app.

Runkeeper now shows a Google Now card to its users encouraging them to "work off" their meals, based on the meals written to Google Fit by other apps.

Instaweather has integrated Google Fit into a new Android Wear face that they’re testing in beta. To try out the face, first join this Google+ community and then follow the link to join the beta and download the app.

We hope you enjoy checking out these Google Fit updates. Thanks to all our partners for making it possible! Find out more about integrating the Google Fit SDK into your app.

Categories: Programming