
Software Development Blogs: Programming, Software Testing, Agile Project Management


Feed aggregator

Flaws and Fallacies of #NoEstimates

Herding Cats - Glen Alleman - Sun, 08/09/2015 - 23:39

All the work we do in the projects domain is driven by uncertainty: uncertainty about probabilistic future events impacting our project, and uncertainty in the work activities performed while developing a product or service.

Decision making in the presence of these uncertainties is a natural process in all of business.

The decision maker is asked to express her beliefs by assigning probabilities to certain possible states of the system in the future and the resulting outcomes of those states.

What's the chance we'll have this puppy ready for VMWorld in August? What's the probability that when we go live and 300,000 users log on we'll be able to handle the load? What's our test coverage for the upcoming release given we've added 14 new enhancements to the code base this quarter? Questions like these are normal everyday business questions, along with: what's the expected delivery date, what's the expected total sunk cost, and what's the expected bookable value measured in Dead Presidents for the system when it goes live?

To answer these and the unlimited number of other business, technical, operational, performance, security, and financial questions, we need to know something about probability and statistics. This knowledge is an essential tool for decision making no matter the domain.

Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write - H.G. Wells

If we accept the notion that all project work is probabilistic, driven by the underlying statistical processes of time, cost, and technical outcomes (including Effectiveness, Performance, Capabilities, and all the ...ilities that manifest and determine value after a system is put into initial use), then these conditions are the source of uncertainty, which comes in two types:

  • Reducible - event based, with a probability of occurrence within a specified time period.
  • Irreducible - naturally occurring variances, described by the Probability Distribution Function of the underlying process.
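
To make the distinction concrete, here is a minimal Monte Carlo sketch in Scala (all numbers are hypothetical stand-ins): irreducible uncertainty shows up as natural variance in a task's duration, reducible uncertainty as a discrete risk event that may or may not occur.

import scala.util.Random

object UncertaintySketch {
  val rng = new Random(42)

  // Irreducible: duration varies naturally around a mean
  // (a hypothetical 10-day task with a 2-day standard deviation).
  def taskDuration(): Double = 10.0 + 2.0 * rng.nextGaussian()

  // Reducible: a risk event (say, a vendor slip) with a 20% probability
  // of occurrence in the period, adding 5 days when it fires.
  def riskDelay(): Double = if (rng.nextDouble() < 0.20) 5.0 else 0.0

  def main(args: Array[String]): Unit = {
    val trials = 100000
    val outcomes = Seq.fill(trials)(taskDuration() + riskDelay()).sorted
    // An estimate is a percentile with an explicit confidence level,
    // not a single number.
    println(f"p80 completion: ${outcomes((0.8 * trials).toInt)}%.1f days")
  }
}

Note that the output is a confidence level, not a point value; that is what it means to estimate in the presence of these two kinds of uncertainty.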

If you don't accept this - that all project work is probabilistic in nature - then stop reading; this blog is not for you.

If you do accept that all project work is uncertain, then we need a few more definitions to make sense of the decision-making process. The term statistic has two definitions - an older one and a current one. The older one means a fact, referring to numerical facts: a measurement, a count, or a rank. Such a number can represent a total, an average, or a percentage of several such measures. The term also applies to the broad discipline of statistical manipulation, in the same way accounting applies to entering and balancing accounts.

Statistics in the second sense is a set of methods for obtaining, organizing, and summarizing numerical facts. These facts usually represent partial rather than complete knowledge about a situation, for example a sample of the population rather than a count of the entire population, as in a census.

These numbers - statistics - are usually subjected to formal statistical analysis to help in our decision making in the presence of uncertainty.

In our software project world uncertainty is an inherent fact. Software uncertainty is likely much higher than in construction, since the requirements in software development are soft, unlike the requirements in interstate highway development. But while each domain may have a different variance in its level of uncertainty, estimates are still needed to make decisions in the presence of these uncertainties. Highway development has many uncertainties too - not the least of which are weather delays.

When you measure what you are speaking about and express it in numbers you know something about it; but when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind - Lord Kelvin

Decisions are made on data. Otherwise those decisions are just gut feel, intuition, and at their core guesses. When you are guessing with other people's money you have a low probability of keeping your job or of the business staying in business.

... a tale told by an idiot, full of sound and fury, signifying nothing - Shakespeare

When we hear personal anecdotes about how to correct a problem, and the conjecture that those anecdotes are applicable outside the individual telling them - beware. Without a test, any conjecture is just a conjecture.

He uses statistics as a drunken man uses lampposts - for support rather than illumination - Andrew Lang

We often confuse a symptom with the cause. When reading about all the failures in IT projects - the probability of failure, the number of failures versus successes - there is rarely, in those naive posts on the topic, any assessment of the cause of the failure. The Root Cause analysis is not present. The Chaos Report is the most egregious of these.

There is no merit where there is no trial; and till experience stamps the mark of strength, cowards may pass for heroes, and faith for falsehood - A. Hill

Tossing out anecdotes, platitudes, and misquoted quotes does not make for a credible argument for anything. "I knew a person who did X successfully, therefore you should have the same experience" is common. So is "just try it, you may find it works for you just like it worked for me."

It seems there are no Principles or tested Practices in this approach to improving project success. Just platitudes and anecdotes - chatter masked as process improvement advice.

I started to write a detailed exposition using this material for the #NoEstimates conjecture that decisions can be made without an estimate. But Steve McConnell's post is much better than anything I could have done. So here's the wrap up...

When it is conjectured that decisions - any decisions, some decisions, self-selected decisions - can be made in the presence of uncertainty without also making an estimate of the outcome of that decision, the cost of that decision, and the impact of that decision - then let's hear how, so we can test it outside personal opinion and anecdote.

References 

It's time for #NoEstimates advocates to provide some principle-based examples of how to make decisions in the presence of uncertainty without estimating. The books below are populist books (books without the heavy math), but they are still capable of conveying the principles of the topic and can be a source of learning.

  1. Flaws and Fallacies in Statistical Thinking, Stephen K. Campbell, Prentice Hall, 1974
  2. The Economics of Iterative Software Development: Steering Toward Better Business Results, Walker Royce, Kurt Bittner, and Mike Perrow, Addison Wesley, 2009.
  3. How Not to be Wrong: The Power of Mathematical Thinking, Jordan Ellenberg, Penguin Press, 2014
  4. Hard Facts, Dangerous Half-Truths & Total Nonsense: Profiting from Evidence Based Management, Jeffery Pfeffer and Robert I. Sutton, Harvard Business School Press, 2006.
  5. How to Measure Anything, Finding the Value of Intangibles in Business, 3rd Edition, Douglas W. Hubbard, John Wiley & Sons, 2014.
  6. Standard Deviations: Flawed Assumptions, Tortured Data, and Other Ways to Lie With Statistics, Gary Smith
  7. Center for Informed Decision Making
  8. Decision Making for the Professional, Peter McNamee and John Celona

Some actual math books on the estimating problem

  1. Probability Methods for Cost Uncertainty Analysis, Paul R. Garvey
  2. Making Hard Decisions: An Introduction to Decision Analysis, 2nd Edition, Robert T. Clemen, Duxbury Press, 1996.
  3. Estimating Software Intensive Systems, Richard D. Stutzke, Addison Wesley, 2005.
  4. Probabilities as Similarity-Weighted Frequencies, Antoine Billot, Itzhak Gilboa, Dov Samet, David Schmeidler
Related articles
  • Making Conjectures Without Testable Outcomes
  • Estimating Processes in Support of Economic Analysis
  • Applying the Right Ideas to the Wrong Problem
  • Estimating and Making Decisions in Presence of Uncertainty
Categories: Project Management

SPaMCAST 354 - Allan Kelly, #NoProjects

Software Process and Measurement Cast - Sun, 08/09/2015 - 22:00

This week’s Software Process and Measurement Cast features our interview with Allan Kelly. We talked #NoProjects and having a focus on delivering a consistent flow of value. The classic project framework causes us to focus on being on-time, on-budget and on-scope, but not on-value. If we don’t focus on delivering the maximum value we are doing both our customers and ourselves a great disservice.

Allan Kelly advises teams from many different companies and domains on adopting and deepening Agile practices and development in general. He specializes in working with software product companies and aligning products and processes with company strategy. When he is not with clients he writes far too much.  

He holds BSc and MBA degrees, is the author of three books: "Xanpan - team centric Agile Software Development" (https://leanpub.com/xanpan), "Business Patterns for Software Developers" and “Changing Software Development: Learning to be Agile”. In addition he is the originator of Retrospective Dialogue Sheets (http://www.dialoguesheets.com) and a regular conference speaker. He can be found on Twitter as @allankellynet (http://twitter.com/allankellynet) and blogs (http://blog.allankelly.net).

Call to Action!

I have a challenge for the Software Process and Measurement Cast listeners for the next few weeks. I would like you to find one person that you think would like the podcast and introduce them to the cast. This might mean sending them the URL or teaching them how to download podcasts. If you like the podcast and think it is valuable they will be thankful to you for introducing them to the Software Process and Measurement Cast. Thank you in advance!

Re-Read Saturday News

Remember that the Re-Read Saturday of The Mythical Man-Month is in full swing.  This week we tackle the essay titled “Passing the Word”!  Check out the new installment at Software Process and Measurement Blog.

Upcoming Events

Software Quality and Test Management
September 13 – 18, 2015
San Diego, California
http://qualitymanagementconference.com/

I will be speaking on the impact of cognitive biases on teams.  Let me know if you are attending! If you are still deciding on attending let me know because I have a discount code.

Agile Development Conference East
November 8-13, 2015
Orlando, Florida
http://adceast.techwell.com/

I will be speaking on November 12th on the topic of Agile Risk. Let me know if you are going and we will have a SPaMCAST Meetup.

Next SPaMCAST

The next Software Process and Measurement Cast features our essay titled Agile Success. How do we define success with Agile? If we can’t define what success using Agile is and how we can measure it, anyone adopting Agile is bound to wander aimlessly. Wandering aimlessly is bad for your career and potentially for the careers of everyone around you!

Shameless Ad for my book!

Mastering Software Project Management: Best Practices, Tools and Techniques, co-authored by Murali Chemuturi and myself and published by J. Ross Publishing. We have received unsolicited reviews like the following: “This book will prove that software projects should not be a tedious process, neither for you or your team.” Support SPaMCAST by buying the book here. Available in English and Chinese.

Categories: Process Management


Re-Read Saturday: The Mythical Man-Month, Part 6 – Passing the Word


In the sixth essay of The Mythical Man-Month, titled Passing the Word, Brooks tackles one of the largest problems any large project will have: communicating the architecture. Whether you have defined the architecture upfront or just as it is needed, passing the word is critical to ensuring everyone stays on the same page and that what gets built works and is what is wanted. This essay provides Brooks’ take on how to ensure that a large number of people hear, understand and implement the architects’ decisions. He describes seven interlocking techniques for passing the word; they are:

  1. Written specifications. Many developers value documentation on the same level as reality TV shows; however, as projects and products scale past one or two co-located teams, documentation becomes a valuable tool. Written specifications define limits, appearance, UX and interfaces. In essence, the written specification is the primary output of the architect; it provides everyone with the boundaries of the product and how the user will interact with it. What the written spec, the product of the architects, does not define is how the guts of the product will work; that is the purview of the developers. Written specifications do not have to mean large paper manuals; tools like wikis have been used to capture and transmit specifications and to solicit interaction.
  2. Formal definitions. Words are great but imprecise, even when everyone involved in an effort shares the same first language (which is less and less true as the metaphorical world shrinks). Formal languages and modeling techniques can be used to document the specification and to capture exceptions and explanations. Alternately, simulators and prototypes are mechanisms that can be used to capture and document the specifications.
    The problem with having two definitions of the same idea is what happens when they disagree. Brooks’ answer is to never have exactly two: either have one method of communicating the spec, or have three so that a tie can be broken if a disagreement occurs (think odd numbers).
  3. Direct incorporation. Direct incorporation builds a structure or framework for the product that cannot be changed by the implementer. For example, a set of predefined objects or classes. Deviations and changes, when needed, require renaming and recompiling modules and interfaces. I view this as more of a control mechanism; however, the original structure acts as a baseline to communicate the architectural vision.
  4. Conferences and courts. This category can be described at a high level in one word – meetings. Brooks suggests two types of meetings to control and communicate change. The first is the conference. A conference is a group meeting held on a periodic basis (weekly or monthly) that includes all architects and representatives from the hardware and software developers. Changes and refinements are reviewed and decisions are made. Consensus drives decisions; however, if consensus cannot be achieved the lead architect decides (appeals to overall project leader are allowed). This type of meeting might be recognized as a type of architectural change control board (CCB). The second type of meeting is the “court.” The court is more of a formal meeting of the architects, representatives of the implementers, management, marketing (if relevant) and other stakeholders to make decisions about any nagging issues on how the architectural specification is to be implemented. Courts are typically held annually or semi-annually.
  5. Multiple implementations. One possible solution to the issue of discrepancies between the specifications and what is implemented is to support multiple implementations. Alternate solutions can move forward and be evaluated. While sometimes possible, in general this solution can generate a significant drain on people and resources.
  6. The telephone log. Questions to the architects come up as implementers interact with the specification. In this technique you capture all questions and answers and publish them so everyone can benefit from the conversations. Wikis make a great tool for capturing and disseminating Q&A content.
  7. The product test. The independent test is a tool to identify discrepancies between the specification and the implementation. Some form of independence, whether as an independent test group or through test driven development, is needed to ensure a consistent translation of the vision into a product. Remember that the final arbiter is the customer/user and their product test will be merciless.

Communication is the single most prevalent problem any large group effort will encounter. In Passing the Word, Brooks provides seven possible mechanisms to ensure that everyone hears the same story and has the chance to develop a clear and consistent understanding of that story.

Do you have other solutions that you can suggest? Please share!

Previous installments of the Re-read of The Mythical Man-Month

Introductions and The Tar Pit

The Mythical Man-Month (The Essay)

The Surgical Team

Aristocracy, Democracy and System Design

The Second-System Effect


Categories: Process Management

Record Linkage: Playing around with Duke

Mark Needham - Sat, 08/08/2015 - 23:50

I’ve become quite interested in record linkage recently and came across the Duke project, which provides some tools to help solve this problem. I thought I’d give it a try.

The typical problem when doing record linkage is that we have two records from different data sets which represent the same entity but don’t have a common key that we can use to merge them together. We therefore need to come up with a heuristic that will allow us to do so.

Duke has a few examples showing it in action and I decided to go with the countries linking example. Here we have countries from DBpedia and the Mondial database and we want to link them together.

The first thing we need to do is build the project:

export JAVA_HOME=`/usr/libexec/java_home`
mvn clean package -DskipTests

At the time of writing this will put a zip file containing everything we need at duke-dist/target/. Let’s unpack that:

unzip duke-dist/target/duke-dist-1.3-SNAPSHOT-bin.zip

Next we need to download the data files and Duke configuration file:

wget https://raw.githubusercontent.com/larsga/Duke/master/doc/example-data/countries-dbpedia.csv
wget https://raw.githubusercontent.com/larsga/Duke/master/doc/example-data/countries.xml
wget https://raw.githubusercontent.com/larsga/Duke/master/doc/example-data/countries-mondial.csv
wget https://raw.githubusercontent.com/larsga/Duke/master/doc/example-data/countries-test.txt

Now we’re ready to give it a go:

java -cp "duke-dist-1.3-SNAPSHOT/lib/*" no.priv.garshol.duke.Duke --testfile=countries-test.txt --testdebug --showmatches countries.xml
 
...
 
NO MATCH FOR:
ID: '7706', NAME: 'guatemala', AREA: '108890', CAPITAL: 'guatemala city',
 
MATCH 0.9825124555160142
ID: '10052', NAME: 'pitcairn islands', AREA: '47', CAPITAL: 'adamstown',
ID: 'http://dbpedia.org/resource/Pitcairn_Islands', NAME: 'pitcairn islands', AREA: '47', CAPITAL: 'adamstown',
 
Correct links found: 200 / 218 (91.7%)
Wrong links found: 0 / 24 (0.0%)
Unknown links found: 0
Percent of links correct 100.0%, wrong 0.0%, unknown 0.0%
Records with no link: 18
Precision 100.0%, recall 91.74311926605505%, f-number 0.9569377990430622
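
The summary line follows directly from the counts above; a quick sanity check in Scala (numbers copied from the output):

val correct  = 200.0  // links found that are right
val expected = 218.0  // links the test file says should exist
val found    = 200.0  // links found in total (none were wrong)
val precision = correct / found                         // 1.0     -> 100.0%
val recall    = correct / expected                      // 0.91743 -> 91.74%
val f1 = 2 * precision * recall / (precision + recall)  // 0.95693...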

We can look in countries.xml to see how the similarity between records is being calculated:

  <schema>
    <threshold>0.7</threshold>
...
    <property>
      <name>NAME</name>
      <comparator>no.priv.garshol.duke.comparators.Levenshtein</comparator>
      <low>0.09</low>
      <high>0.93</high>
    </property>
    <property>
      <name>AREA</name>
      <comparator>no.priv.garshol.duke.comparators.NumericComparator</comparator>
      <low>0.04</low>
      <high>0.73</high>
    </property>
    <property>
      <name>CAPITAL</name>
      <comparator>no.priv.garshol.duke.comparators.Levenshtein</comparator>
      <low>0.12</low>
      <high>0.61</high>
    </property>
  </schema>

So we’re working out the similarity of the capital city and country name by calculating their Levenshtein distance, i.e. the minimum number of single-character edits required to change one word into the other.
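
For intuition, here’s a sketch of the classic dynamic-programming algorithm in Scala (the textbook version, not necessarily Duke’s exact implementation):

def levenshtein(a: String, b: String): Int = {
  // dist(i)(j) = edits to turn the first i chars of a into the first j chars of b
  val dist = Array.tabulate(a.length + 1, b.length + 1) { (i, j) =>
    if (i == 0) j else if (j == 0) i else 0
  }
  for (i <- 1 to a.length; j <- 1 to b.length) {
    val cost = if (a(i - 1) == b(j - 1)) 0 else 1
    dist(i)(j) = math.min(math.min(
      dist(i - 1)(j) + 1,         // deletion
      dist(i)(j - 1) + 1),        // insertion
      dist(i - 1)(j - 1) + cost)  // substitution
  }
  dist(a.length)(b.length)
}

levenshtein("guatemala", "guatamala") // 1 - a single typo still scores as very similar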

This works very well if there is a typo or difference in spelling in one of the data sets. However, I was curious what would happen if the country had two completely different names, e.g. Cote d’Ivoire is sometimes known as Ivory Coast. Let’s try changing the country name in one of the files:

"19147","Cote dIvoire","Yamoussoukro","322460"
java -cp "duke-dist-1.3-SNAPSHOT/lib/*" no.priv.garshol.duke.Duke --testfile=countries-test.txt --testdebug --showmatches countries.xml
 
NO MATCH FOR:
ID: '19147', NAME: 'ivory coast', AREA: '322460', CAPITAL: 'yamoussoukro',

I also tried it out with the BBC and ESPN match reports of the Man Utd vs Tottenham match – the BBC references players by surname, while ESPN has their full names.

When I compared the full name against surname using the Levenshtein comparator there were no matches as you’d expect. I had to split the ESPN names up into first name and surname to get the linking to work.

Equally, when I varied the team names to be ‘Man Utd’ rather than ‘Manchester United’ and ‘Tottenham’ rather than ‘Tottenham Hotspur’, that didn’t work either.

I think I probably need to write a domain specific comparator but I’m also curious whether I could come up with a bunch of training examples and then train a model to detect what makes two records similar. It’d be less deterministic but perhaps more robust.
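
As a sketch of that training idea (everything here is hypothetical; turning record pairs into per-field similarity scores is assumed to happen elsewhere), one could learn a weight per field with simple logistic regression:

case class LabeledPair(features: Array[Double], isMatch: Boolean) // e.g. name/area/capital similarities

def train(data: Seq[LabeledPair], epochs: Int = 500, lr: Double = 0.1): Array[Double] = {
  val w = Array.fill(data.head.features.length)(0.0)
  for (_ <- 1 to epochs; p <- data) {
    val z = w.zip(p.features).map { case (wi, xi) => wi * xi }.sum
    val pred = 1.0 / (1.0 + math.exp(-z))          // logistic match probability
    val err = (if (p.isMatch) 1.0 else 0.0) - pred // log-loss gradient signal
    for (i <- w.indices) w(i) += lr * err * p.features(i)
  }
  w
}

The learned weights would play the same role as Duke’s hand-tuned high/low numbers, but fitted to labeled examples instead.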

Categories: Programming

Refactoring JavaScript from Sync to Async in Safe Baby-Steps

Mistaeks I Hav Made - Nat Pryce - Sat, 08/08/2015 - 17:10
Consider some JavaScript code that gets and uses a value from a synchronous call or built-in data structure:

function to_be_refactored() {
    var x;
    ...
    x = get_x();
    ...use x...
}

Suppose we want to replace this synchronous call with a call to a service that has an asynchronous API (an HTTP fetch, for example). How can we refactor the code from synchronous to asynchronous style in small safe steps?

First, wrap the remainder of the function after the line that gets the value in a “continuation” function that takes the value as a parameter and closes over any other variables in its environment. Pass the value to the continuation function:

function to_be_refactored() {
    var x, cont;
    ...
    x = get_x();
    cont = function(x) {
        ...use x...
    };
    cont(x);
}

Then pull the definition of the continuation function before the code that gets the value:

function to_be_refactored() {
    var x, cont;
    cont = function(x) {
        ...use x...
    };
    ...
    x = get_x();
    cont(x);
}

Now extract the last two lines, which get the value and call the continuation, into a single function that takes the continuation as a parameter, and pass the continuation to it:

function to_be_refactored() {
    ...
    get_x_and(function(x) {
        ...use x...
    });
}

function get_x_and(cont) {
    cont(get_x());
}

If you have calls to get_x in many places in your code base, move get_x_and into a common module so that it can be called everywhere that get_x is called. Transform the remaining uses of get_x to “continuation passing style”, replacing the calls to get_x with calls to get_x_and.

Finally, replace the implementation of get_x_and with a call to the async service and delete the get_x function.

Wouldn’t it be nice if IDEs could do this refactoring automatically?

The Trouble With Shared Mutable State

Dale Hagglund asked via Twitter: “What if cont assumes that some [shared mutable] property remains constant across the async invocation? I’ve always found these very hard to unmake.”

In that case, you’ll have to copy the current value of the shared, mutable property into a local variable that is then closed over by the continuation. E.g.

function to_be_refactored() {
    var x;
    ...
    x = get_x();
    ...use x and shared_mutable_y()...
}

would have to become:

function to_be_refactored() {
    var y;
    ...
    y = shared_mutable_y();
    get_x_and(function(x) {
        ...use x and y...
    });
}
Categories: Programming, Testing & QA

More Misconceptions of Waterfall

Herding Cats - Glen Alleman - Sat, 08/08/2015 - 16:47

It is popular in some agile circles to use Waterfall as the stalking horse for every bad management practice in software development. A recent example:

Go/No Go decisions are a residue of waterfall thinking. All software can be built incrementally and most released incrementally.

Nothing in Waterfall prohibits incremental release. In fact the notion of block release is the basis of most Software Intensive Systems development. From the point of view of the business, capabilities are what it bought: the capability to do something of value in exchange for the cost of that value. Here's an example from the health insurance business. Incremental release of features is of little value if those features don't work together to provide some needed capability to conduct business. A naive approach is the release early and release often platitude of some in the agile domain. Let's say we're building a personnel management system. This includes recruiting, on-boarding, provisioning, benefits signup, time keeping, and payroll. It would not be very useful to release the time keeping feature if the payroll feature was not ready.

[Screenshot: a capabilities map showing the business capabilities and the order in which they are needed]

So before buying into the platitude of release early and often, ask what the business needs to do business. Then draw a picture like the one above, and develop a Plan for producing those capabilities in the order they are needed to deliver the needed value. Without this approach, you'll be spending money without producing value and calling that agile.

That way you can stop managing other people's money with platitudes and replace them with actual business management processes. Every time you hear a platitude masquerading as good management advice, ask whether the person offering it works anywhere with high value at risk. If not, they probably have yet to encounter the actual management of other people's money.

Related articles
  • Capabilities Based Planning - Part 2
  • Are Estimates Really The Smell of Dysfunction?
Categories: Project Management

Welcome to The Internet of Compromised Things

Coding Horror - Jeff Atwood - Sat, 08/08/2015 - 11:59

This post is a bit of a public service announcement, so I'll get right to the point:

Every time you use WiFi, ask yourself: could I be connecting to the Internet through a compromised router with malware?

It's becoming more and more common to see malware installed not at the server, desktop, laptop, or smartphone level, but at the router level. Routers have become quite capable, powerful little computers in their own right over the last 5 years, and that means they can, unfortunately, be harnessed to work against you.

I write about this because it recently happened to two people I know.

.@jchris A friend got hit by this on newly paved win8.1 computer. Downloaded Chrome, instantly infected with malware. Very spooky.

— not THE Damien Katz (@damienkatz) May 20, 2015

@codinghorror *no* idea and there’s almost ZERO info out there. Essentially malicious JS adware embedded in every in-app browser

— John O'Nolan (@JohnONolan) August 7, 2015

In both cases, they eventually determined the source of the problem was that the router they were connecting to the Internet through had been compromised.

This is way more evil genius than infecting a mere computer. If you can manage to systematically infect common home and business routers, you can potentially compromise every computer connected to them.

Hilarious meme images I am contractually obligated to add to each blog post aside, this is scary stuff and you should be scared.

Router malware is the ultimate man-in-the-middle attack. For all meaningful traffic sent through a compromised router that isn't HTTPS encrypted, it is 100% game over. The attacker will certainly be sending all that traffic somewhere they can sniff it for anything important: logins, passwords, credit card info, other personal or financial information. And they can direct you to phishing websites at will – if you think you're on the "real" login page for the banking site you use, think again.

Heck, even if you completely trust the person whose router you are using, they could technically be doing this to you. But they probably aren't.

Probably.

In John's case, the attackers inserted annoying ads in all unencrypted web traffic, which is an obvious tell to a sophisticated user. But how exactly would the average user figure out where this junk is coming from (or worse, assume the regular web is just full of ad junk all the time), when even a technical guy like John – founder of the open source Ghost blogging software used on this very blog – was flummoxed?

But that's OK, we're smart users who would only access public WiFi using HTTPS websites, right? Sadly, even if the traffic is HTTPS encrypted, it can still be subverted! There's an extremely technical blow-by-blow analysis at Cryptostorm, but the TL;DR is this:

Compromised router answers DNS req for *.google.com to 3rd party with faked HTTPS cert, you download malware Chrome. Game over.

HTTPS certificate shenanigans. DNS and BGP manipulation. Very hairy stuff.

How is this possible? Let's start with the weakest link, your router. Or more specifically, the programmers responsible for coding the admin interface to your router.

They must be terribly incompetent coders to let your router get compromised over the Internet, since one of the major selling points of a router is to act as a basic firewall layer between the Internet and you… right?

In their defense, that part of a router generally works as advertised. More commonly, you aren't being attacked from the hardened outside. You're being attacked from the soft, creamy inside.

That's right, the calls are coming from inside your house!

By that I mean you'll visit a malicious website that scripts your own browser to access the web-based admin pages of your router, and reset (or use the default) admin passwords to reconfigure it.

Nasty, isn't it? They attack from the inside using your own browser. But that's not the only way.

  • Maybe you accidentally turned on remote administration, so your router can be modified from the outside.

  • Maybe you left your router's admin passwords at default.

  • Maybe there is a legitimate external exploit for your router and you're running a very old version of firmware.

  • Maybe your ISP provided your router and made a security error in the configuration of the device.

In addition to being kind of terrifying, this does not bode well for the Internet of Things.

Internet of Compromised Things, more like.

OK, so what can we do about this? There's no perfect answer; I think it has to be a defense in depth strategy.

Inside Your Home

Buy a new, quality router. You don't want a router that's years old and hasn't been updated. But on the other hand you also don't want something too new that hasn't been vetted for firmware and/or security issues in the real world.

Also, any router your ISP provides is going to be about as crappy and "recent" as the awful stereo system you get in a new car. So I say stick with well known consumer brands. There are some hardcore folks who think all consumer routers are trash, so YMMV.

I can recommend the Asus RT-AC87U – it did very well in the SmallNetBuilder tests, Asus is a respectable brand, it's been out a year, and for most people, this is probably an upgrade over what you currently have without being totally bleeding edge overkill. I know it is an upgrade for me.

(I am also eagerly awaiting Eero as a domestic best of breed device with amazing custom firmware, and have one pre-ordered, but it hasn't shipped yet.)

Download and install the latest firmware. Ideally, do this before connecting the device to the Internet. But if you connect and then immediately use the firmware auto-update feature, who am I to judge you.

Change the default admin passwords. Don't leave it at the documented defaults, because then it could be potentially scripted and accessed.

Turn off WPS. Turns out the Wi-Fi Protected Setup feature intended to make it "easy" to connect to a router by pressing a button or entering a PIN made it … a bit too easy. This is always on by default, so be sure to disable it.

Turn off uPNP. Since we're talking about attacks that come from "inside your house", uPNP offers zero protection as it has no method of authentication. If you need it for specific apps, you'll find out, and you can forward those ports manually as needed.

Make sure remote administration is turned off. I've never owned a router that had this on by default, but check just to be double plus sure.

For Wifi, turn on WPA2+AES and use a long, strong password. Again, I feel most modern routers get the defaults right these days, but just check. The password is your responsibility, and password strength matters tremendously for wireless security, so be sure to make it a long one – at least 20 characters with all the variability you can muster.

Pick a unique SSID. Default SSIDs just scream hack me, for I have all defaults and a clueless owner. And no, don't bother "hiding" your SSID, it's a waste of time.

Optional: use less congested channels for WiFi. The default is "auto", but you can sometimes get better performance by picking less used frequencies at the ends of the spectrum. As summarized by official ASUS support reps:

  • Set 2.4 GHz channel bandwidth to 40 MHz, and change the control channel to 1, 6 or 11.

  • Set 5 GHz channel bandwidth to 80 MHz, and change the control channel to 165 or 161.

Experts only: install an open source firmware. I discussed this a fair bit in Everyone Needs a Router, but you have to be very careful which router model you buy, and you'll probably need to stick with older models. There are several which are specifically sold to be friendly to open source firmware.

Outside Your Home

Well, this one is simple. Assume everything you do outside your home, on a remote network or over WiFi is being monitored by IBGs: Internet Bad Guys.

I know, kind of an oppressive way to voyage out into the world, but it's better to start out with a defensive mindset, because you could be connecting to anyone's compromised router or network out there.

But, good news. There are only two key things you need to remember once you're outside, facing down that fiery ball of hell in the sky and armies of IBGs.

  1. Never access anything but HTTPS websites.

    If it isn't available over HTTPS, don't go there!

    You might be OK with HTTP if you are not logging in to the website, just browsing it, but even then IBGs could inject malware in the page and potentially compromise your device. And never, ever enter anything over HTTP you aren't 100% comfortable with bad guys seeing and using against you somehow.

    We've made tremendous progress in HTTPS Everywhere over the last 5 years, and these days most major websites offer (or even better, force) HTTPS access. So if you just want to quickly check your GMail or Facebook or Twitter, you will be fine, because those services all force HTTPS.

  2. If you must access non-HTTPS websites, or you are not sure, always use a VPN.

    A VPN encrypts all your traffic, so you no longer have to worry about using HTTPS. You do have to worry about whether or not you trust your VPN provider, but that's a much longer discussion than I want to get into right now.

    It's a good idea to pick a go-to VPN provider so you have one ready and get used to how it works over time. Initially it will feel like a bunch of extra work, and it kinda is, but if you care about your security an encrypt-everything VPN is bedrock. And if you don't care about your security, well, why are you even reading this?

If it feels like these are both variants of the same rule, always strongly encrypt everything, you aren't wrong. That's the way things are headed. The math is as sound as it ever was – but unfortunately the people and devices, less so.

Be Safe Out There

Until I heard Damien's story and John's story, I had no idea router hardware could be such a huge point of compromise. I didn't realize that you could be innocently visiting a friend's house, and because he happens to be the parent of three teenage boys and the owner of an old, unsecured router that you connect to via WiFi … your life will suddenly get a lot more complicated.

As the amount of stuff we connect to the Internet grows, we have to understand that the Internet of Things is a bunch of tiny, powerful computers, too – and they need the same strong attention to security that our smartphones, laptops, and servers already enjoy.

Categories: Programming

The Deadline to Apply for GTAC 2015 is Monday Aug 10

Google Testing Blog - Fri, 08/07/2015 - 18:33
Posted by Anthony Vallone on behalf of the GTAC Committee


The deadline to apply for GTAC 2015 is this Monday, August 10th, 2015. There has been a great deal of interest in both attending and speaking, and we’ve received many outstanding proposals. However, it’s not too late to submit your proposal for consideration. If you would like to speak or attend, be sure to complete the form by Monday.

We will be making regular updates to the GTAC site (developers.google.com/gtac/2015/) over the next several weeks, and you can find conference details there.

For those that have already signed up to attend or speak, we will contact you directly by mid-September.

Categories: Testing & QA

Stuff The Internet Says On Scalability For August 7th, 2015

Hey, it's HighScalability time:


A feather? Brass relief? River valley? Nope. It's frost on Mars!
  • $10 billion: Microsoft data center spend per year; 1: hours from London to New York at Mach 4.5; 1+: million Facebook requests per second; 25TB: raw data collected per day at Criteo; 1440: minutes in a day; 2.76: farthest distance a human eye can detect a candle flame in kilometers.

  • Quotable Quotes:
    • @drunkcod: IT is a cost center you say? Ok, let's shut all the servers down until you figure out what part of revenue we contribute to.
    • Beacon 23: I’m here because they ain’t made a computer yet that won’t do something stupid one time out of a hundred trillion. Seems like good odds, but when computers are doing trillions of things a day, that means a whole lot of stupid. 
    • @johnrobb: China factory: Went from 650 employees to 60 w/ robots. 3x production increase.  1/5th defect rate.
    • @twotribes: "Metrics are the internet’s heroin and we’re a bunch of junkies mainlining that black tar straight into the jugular of our organizations."
    • @javame: @adrianco I've seen a 2Tb erlang monolith and I don't want to see that again cc/@martinfowler
    • @micahjay1: Thinking about @a16z podcast about bio v IT ventures. Having done both, big diff is cost to get started and burn rate. No AWS in bio...yet
    • @0xced: XML: 1996 XLink: 1997 XML-RPC: 1998 XML Schema: 1998 JSON: 2001 JSON-LD: 2010 JSON-RPC: 2005 JSON Schema: 2009
    • Inside the failure of Google+: What people failed to understand was Facebook had network effects. It’s like you have this grungy night club and people are having a good time and you build something next door that’s shiny and new, and technically better in some ways, but who wants to leave? People didn't need another version of Facebook.
    • @bdu_p: Old age and treachery will beat youth and skill every time. A failed attempt to replace unix grep 

  • The New World looks a lot like the old Moscow. The Master of Disguise: My Secret Life in the CIA: we assume constant surveillance. This saturation level of surveillance, which far surpassed anything Western intelligence services attempted in their own democratic societies, had greatly constrained CIA operations in Moscow for decades.

  • How Netflix made their website startup time 70% faster. They removed a lot of server side complexity by moving to mostly client side rendering. Java, Tomcat, Struts, and Tiles were replaced with Node.js and React.js.  They call this Universal JavaScript, JavaScript on the server side and the client side. "Using Universal JavaScript means the rendering logic is simply passed down to the client." Only a bootstrap view is rendered on the server with everything else rendered incrementally on the client.

  • How Facebook fights spam with Haskell. Haskell is used as an expressive, latency sensitive rules engine. Sitting at the front of the ingestion point pipeline, it synchronously handles every single write request to Facebook and Instagram. That's more than one million requests per second. So not so slow. Haskell works well because it's a purely functional strongly typed language, supports hot swapping, supports implicit concurrency, performs well, and supports interactive development. Haskell is not used for the entire stack however. It's sandwiched: on the top there's C++ to process messages and on the bottom there's C++ client code that interacts with other services. Key design decision: rules can't make writes, which means an abstract syntax tree of fetches can be overlapped and batched.

  • You know how kids these days don't know the basics, like how eggs come from horses or that milk comes from chickens? The disassociation disorder continues. Now Millions of Facebook users have no idea they’re using the internet: A while back, a highly-educated friend and I were driving through an area that had a lot of data centers. She asked me what all of those gigantic blocks of buildings contained. I told her that they were mostly filled with many servers that were used to host all sorts of internet services. It completely blew her mind. She had no idea that the services that she and billions of others used on their phones actually required millions and millions of computers to transmit and process the data.

  • History rererepeats itself. Serialization is still evil. Improving Facebook's performance on Android with FlatBuffers:  It took 35 ms to parse a JSON stream of 20 KB...A JSON parser needs to build field mappings before it can start parsing, which can take 100 ms to 200 ms...FlatBuffers is a data format that removes the need for data transformation between storage and the UI...Story load time from disk cache is reduced from 35 ms to 4 ms per story...Transient memory allocations are reduced by 75 percent...Cold start time is improved by 10-15 percent.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Finding What To Learn Next

Making the Complex Simple - John Sonmez - Fri, 08/07/2015 - 13:00

Being able to learn things quickly is an amazing skill to have - even more so for developers because of the speed of technology change. Most people change careers 15 times throughout their life. Not jobs, careers! So it’s safe to assume that the average developer will have multiple jobs throughout their career. Each job change has the potential […]

The post Finding What To Learn Next appeared first on Simple Programmer.

Categories: Programming

Always #notimplementednovalue . . . Maybe or Maybe Not

Does writing throwaway code to generate information and knowledge have value?

When we talk about implementing software into production at the end of each sprint (or, for the more avant-garde, continuously as it is completed) as a reflection of the value a team delivers to its customer, there is always push back. Agile practitioners, in particular, are concerned that some of the code is never implemented. If unimplemented code is not perceived to have value, then why are development teams generating code they are not going to put into production? The problem is often exacerbated by the perceived need of developers to get credit for the most tangible output of their work (which is code) rather than the knowledge the code generates. In order to understand why some of the code that is created is not put into production, it is important to understand the typical sources of code that does not get implemented: research and development (R&D), prototypes and development aids.

Research and Development (R&D): R&D is defined as the “investigative activities with the intention of making a discovery that can either lead to the development of new products or procedures, or to improvement of existing products or procedures.” R&D is not required to create a new report from SAP or a new database table to support a CRM system. In R&D in an IT department, researchers generate experiments to explore ideas and test hypotheses, not to create shippable products or an installed production base of code. The value in the process is the knowledge that is generated (also known as intellectual property). Credit (e.g. adulation and promotions) accrues for generating the IP rather than to the code.

Prototypes: Prototypes are often used to sort out whether an idea is worth pursuing (this could be considered a special, micro form of R&D) and/or as a technique to generate and validate requirements. Prototypes are preliminary models that once constructed are put aside. As with R&D, the goal is less to generate code than to generate information that can be used in subsequent steps in solving a specific problem. As with R&D, credit (e.g. adulation and promotions) accrues for generating the IP rather than to the code.

Development Aids: Developers and testers often create tools to aid in the construction and testing of functionality. Rarely are these tools developed to be put into production. The value of this code is reflected in the efficiency and quality of the functionality they are created to support.

Whether in an R&D environment or at a team level building prototypes or development aids, does writing throwaway code to generate information and knowledge have value? While this question sounds pedantic, it is a question that gets discussed when a coach begins to push the idea of #NotInProductionNoValue. The answer is to focus the discussion on the information and knowledge generated. In the end, it is the information and knowledge that has value, and that knowledge moves forward with the project or organization even when the code is sloughed off like skin after a bad sunburn. Most simply, when testing an assumption keeps you from making a mistake or provides the information to make a good decision, doing whatever is needed makes sense. However, it is not the code that has value per se, but rather the information generated.

Side Note: Many IT departments have rebranded themselves as R&D departments. The R&D metaphor is used to convey that the IT department is identifying products and leading the business. In some startup and cutting-edge technology firms this may well be true; however, the use of the term is generally a misnomer or wishful thinking. Most IT departments are instead focused on product delivery, i.e. building solutions based on relatively tried and true frameworks and methods. If you doubt the veracity of that statement, just observe the amount of packaged software (e.g. SAP, PeopleSoft) your own organization supports.


Categories: Process Management

Spark: Convert RDD to DataFrame

Mark Needham - Thu, 08/06/2015 - 22:11

As I mentioned in a previous blog post I’ve been playing around with the Databricks Spark CSV library and wanted to take a CSV file, clean it up and then write out a new CSV file containing some of the columns.

I started by processing the CSV file and writing it into a temporary table:

import org.apache.spark.sql.{SQLContext, Row, DataFrame}
 
val sqlContext = new SQLContext(sc)
val crimeFile = "Crimes_-_2001_to_present.csv"
sqlContext.load("com.databricks.spark.csv", Map("path" -> crimeFile, "header" -> "true")).registerTempTable("crimes")

I wanted to get to the point where I could call the following function which writes a DataFrame to disk:

import java.io.File
import org.apache.hadoop.fs.FileUtil
 
private def createFile(df: DataFrame, file: String, header: String): Unit = {
  // Remove any previous output, then write the distinct rows as CSV to a temp path.
  FileUtil.fullyDelete(new File(file))
  val tmpFile = "tmp/" + System.currentTimeMillis() + "-" + file
  df.distinct.save(tmpFile, "com.databricks.spark.csv")
}

The first file only needs to contain the primary type of crime, which we can extract with the following query:

val rows = sqlContext.sql("select `Primary Type` as primaryType FROM crimes LIMIT 10")
 
rows.collect()
res4: Array[org.apache.spark.sql.Row] = Array([ASSAULT], [ROBBERY], [CRIMINAL DAMAGE], [THEFT], [THEFT], [BURGLARY], [THEFT], [BURGLARY], [THEFT], [CRIMINAL DAMAGE])

Some of the primary types have trailing spaces which I want to get rid of. As far as I can tell Spark’s variant of SQL doesn’t have the LTRIM or RTRIM functions but we can map over ‘rows’ and use the String ‘trim’ function instead:

rows.map { case Row(primaryType: String) => Row(primaryType.trim) }
res8: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = MapPartitionsRDD[29] at map at DataFrame.scala:776

Now we’ve got an RDD of Rows which we need to convert back to a DataFrame again. ‘sqlContext’ has a function which we might be able to use:

sqlContext.createDataFrame(rows.map { case Row(primaryType: String) => Row(primaryType.trim) })
 
<console>:27: error: overloaded method value createDataFrame with alternatives:
  [A <: Product](data: Seq[A])(implicit evidence$4: reflect.runtime.universe.TypeTag[A])org.apache.spark.sql.DataFrame <and>
  [A <: Product](rdd: org.apache.spark.rdd.RDD[A])(implicit evidence$3: reflect.runtime.universe.TypeTag[A])org.apache.spark.sql.DataFrame
 cannot be applied to (org.apache.spark.rdd.RDD[org.apache.spark.sql.Row])
              sqlContext.createDataFrame(rows.map { case Row(primaryType: String) => Row(primaryType.trim) })
                         ^

These are the signatures we can choose from:

[Screenshot: the two createDataFrame signatures, as shown in the error message above]

If we want to pass in an RDD of type Row we’re going to have to define a StructType or we can convert each row into something more strongly typed:

case class CrimeType(primaryType: String)
 
sqlContext.createDataFrame(rows.map { case Row(primaryType: String) => CrimeType(primaryType.trim) })
res14: org.apache.spark.sql.DataFrame = [primaryType: string]
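
As an aside, the StructType route mentioned above would look roughly like this (a sketch, assuming the Spark 1.3-era createDataFrame(rowRDD, schema) overload):

import org.apache.spark.sql.types.{StringType, StructField, StructType}
 
// Describe the schema explicitly instead of going through a case class.
val schema = StructType(Seq(StructField("primaryType", StringType, nullable = true)))
val trimmed = rows.map { case Row(primaryType: String) => Row(primaryType.trim) }
val dfViaSchema = sqlContext.createDataFrame(trimmed, schema)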

Great, we’ve got our DataFrame which we can now plug into the ‘createFile’ function like so:

createFile(
  sqlContext.createDataFrame(rows.map { case Row(primaryType: String) => CrimeType(primaryType.trim) }),
  "/tmp/crimeTypes.csv",
  "crimeType:ID(CrimeType)")

We can actually do better though!

Since we’ve got an RDD of a specific class we can make use of the ‘rddToDataFrameHolder’ implicit function and then the ‘toDF’ function on ‘DataFrameHolder’. This is what the code looks like:

import sqlContext.implicits._
createFile(
  rows.map { case Row(primaryType: String) => CrimeType(primaryType.trim) }.toDF(),
  "/tmp/crimeTypes.csv",
  "crimeType:ID(CrimeType)")

And we’re done!

Categories: Programming

Understanding Software Project Size - New Lecture Posted

10x Software Development - Steve McConnell - Thu, 08/06/2015 - 21:36

I've uploaded a new lecture in my Understanding Software Projects lecture series. This lecture focuses on the critical topic of Software Size. If you've ever wondered why some early projects succeed while later similar projects fail, this lecture explains the basic dynamics that cause that. If you've wondered why Scrum projects struggle to scale, I share some insights on that topic. 

I believe this is one of my best lectures in the series so far -- and it's a very important topic. It will be free for the next week, so check it out: https://cxlearn.com.

Lectures posted so far include:  

0.0 Understanding Software Projects - Intro
     0.1 Introduction - My Background
     0.2 Reading the News
     0.3 Definitions and Notations 

1.0 The Software Lifecycle Model - Intro
     1.1 Variations in Iteration 
     1.2 Lifecycle Model - Defect Removal
     1.3 Lifecycle Model Applied to Common Methodologies 
     1.4 Lifecycle Model - Selecting an Iteration Approach  

2.0 Software Size - Introduction (New)
     2.01 Size - Examples of Size
     2.05 Size - Comments on Lines of Code
     2.1 Size - Staff Sizes 
     2.2 Size - Schedule Basics 
     2.3 Size - Debian Size Claims 

3.0 Human Variation - Introduction

Check out the lectures at http://cxlearn.com!

Why You Shouldn’t Message Me

Making the Complex Simple - John Sonmez - Thu, 08/06/2015 - 16:00

In this episode, I tell you why you shouldn’t message me. Full transcript: John: Hey, John Sonmez from simpleprogrammer.com. I’m going to talk about something that kind of annoys me a little bit here, which is when people message me. This video is why you shouldn’t message me. I don’t mean to offend you if […]

The post Why You Shouldn’t Message Me appeared first on Simple Programmer.

Categories: Programming

Lean ops for startups: 4 leaders share their secrets

Google Code Blog - Wed, 08/05/2015 - 20:05

Posted by Ori Weinroth, Google Cloud Platform Marketing

As a CTO, VP R&D, or CIO at a technology startup you typically need to maneuver and make the most out of limited budgets. Chances are, you’ve never had your CEO walk in and tell you, “We’ve just closed our Series A round. You now have unlimited funding to launch version 1.5.”

So how do you extract real value from what you’re given to work with? We’re gathering four startup technology leaders for a free webinar discussion around exactly that: their strategies and tactics for operating lean. They will cover key challenges and share tips and tricks for:

  • Reducing burn rate by making smart tech choices
  • Getting the most out of a critical but finite resource - your dev team
  • Avoiding vendor lock-in so as to maximize cost efficiencies

We’ve invited the following technology leaders from some of today’s most dynamic startups:

Sign up for our Lean Ops Webinar in your timezone to hear their take:

Americas
Wednesday, 13 August 2015
11:00 AM PT
[Click here to register]

Europe, Middle East and Africa
Wednesday, 13 August 2015
10:00 AM (UK), 11:00 AM (France), 12:00 PM (Israel)
[Click here to register]

Asia Pacific
Wednesday, 13 August 2015
10:30AM (India), 1:00 PM (Singapore/Hong Kong), 3:00PM (Sydney, AEDT)
[Click here to register]

Our moderator will be Amir Shevat, senior program manager at Google Developer Relations. We look forward to an insightful and open discussion and hope you can attend.

Categories: Programming

Hugh MacLeod’s Illustrated Guide to Life Inside Microsoft

If you remember the little blue monster that says, “Microsoft, change the world or go home,” you know Hugh MacLeod.

Hugh is the creative director at Gaping Void.  I got to meet Hugh, along with Jason Korman (CEO), and Jessica Higgins, last week to talk through some ideas.

Hugh uses cartoons as a snappy and insightful way to change the world.  You can think of it as “Motivational Art for Smart People.”

The Illustrated Guide to Life Inside Microsoft

One of Hugh’s latest creations is the Illustrated Guide to Life Inside Microsoft. It’s a set of cards you can flip, with a cartoon on the front and a quote on the back. It’s truly insight at your fingertips.


I like them all … from “Microsoft is a ‘Get Stuff Done’ company” to “Software is the thing between the things”, but my favorite is:

“It’s more fun being the underdog.”

It’s a reminder that while you can take the dog out of the fight, you can’t take the fight out of the dog, and that as long as you’re still in the game, and you are truly a learning company, a company that continues to grow and evolve, you can change the world … your unique way.

Tweaking People in the Right Direction

Hugh is an observer and participant who inspires and prods people in the right direction …

Via Hugh MacLeod Connects the Dots:

“’Attaching art to business outcomes can articulate deep emotions and bring things to light fast,’ said MacLeod. To get there requires MacLeod immersing himself within a company, so he can look for what he calls ‘freaks of light’—epiphanies about a company that express the collected motivations of its people. ‘My cartoons make connections,’ said MacLeod. ‘I create work in an ambient way to tweak people in the right direction.’”

Via Hugh MacLeod Connects the Dots:

“He’s an observer and a participant, mingling temporarily within a culture to better understand it. He’s also a listener, taking your thoughts and combining them with his own to piece together the puzzle he is trying to solve about the human condition and business environment.”

Check out the Illustrated Guide to Life Inside Microsoft and some of the ideas just might surprise you, or, at least inspire and motivate you today – you smart person, you.

Categories: Architecture, Programming

How do you program a computer with 10 terabytes of RAM?

How do you program a computer with 10 terabytes of RAM in a single address space? When the great Adrian Cockcroft was interviewed for an Enterprise Initiatives episode, that’s one of the answers he gave to the question of “What’s the next big thing?”

Adrian says we are already taking big machines and running tiny little containers on them. He thinks another interesting workload is huge memory systems. Building computers with many terabytes of main memory will soon be affordable. We already know the JVM has problems garbage collecting on machines with tens of gigabytes of RAM. What about machines with terabytes of RAM? We don’t really have the programming models worked out yet. It may be that garbage collected languages won't make the cut.
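
One mitigation the JVM world already uses is keeping bulk data off the garbage-collected heap entirely, as Cassandra and various caching layers do. A minimal sketch in Scala, with a toy size standing in for the terabytes under discussion:

import java.nio.ByteBuffer

// Direct buffers are allocated outside the Java heap, so data held in
// them never adds to the garbage collector's marking workload.
val offHeap = ByteBuffer.allocateDirect(1 << 20) // 1 MiB off-heap
offHeap.putLong(0, 42L)     // absolute write, no object allocation
println(offHeap.getLong(0)) // prints 42

Whether that kind of manual memory management scales to a 10 terabyte single address space is exactly the open programming-model question.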

Sounds like a good idea for a post, right? Here’s the problem: I found surprisingly little on huge memory systems. If you have any ideas on good sources please leave a comment. Here’s some of what I did find…

SGI’s 64TB Computer
Categories: Architecture

Are 64% of Features Really Rarely or Never Used?

Mike Cohn's Blog - Wed, 08/05/2015 - 15:00

A very oft-cited metric is that 64 percent of features in products are “rarely or never used.” The source for this claim was Jim Johnson, chairman of the Standish Group, who presented it in a keynote at the XP 2002 conference in Sardinia. The data Johnson presented can be seen in the following chart.

Johnson’s data has been repeated again and again to the extent that those citing it either don’t understand its origins or never bothered to check into them.

The misuse or perhaps just overuse of this data has been bothering me for a while, so I decided to investigate it. I was pretty sure of the facts but didn’t want to rely solely on my memory, so I got in touch with the Standish Group, and they were very helpful in clarifying the data.

The results Jim Johnson presented at XP 2002 and that have been repeated so often were based on a study of four internal applications. Yes, four applications. And, yes, all internal-use applications. No commercial products.

So, if you’re citing this data and using it to imply that every product out there contains 64 percent “rarely or never used features,” please stop. Please be clear that the study was of four internally developed projects at four companies.

Specification-based Test Design Techniques for Enhancing Unit Tests Part 1

Making the Complex Simple - John Sonmez - Wed, 08/05/2015 - 13:05

The primary goal of most developers is usually achieving 100% code coverage if they write any unit tests at all. In this test design how-to article, I am going to show you how to use specification-based test design techniques to cover more requirements through your unit tests. I’ve seen a lot of unit tests, and […]

The post Specification-based Test Design Techniques for Enhancing Unit Tests Part 1 appeared first on Simple Programmer.

Categories: Programming