
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools


Feed aggregator

Quote of the Month March 2015

From the Editor of Methods & Tools - Wed, 03/25/2015 - 09:10
Competencies versus Roles: We’ve seen a positive move toward emphasizing competencies in a team rather than roles or titles. As teams make that change, we see fewer “It’s not my job” excuses and more “How can I help?” conversations. Team members will continue to have core competencies in some areas more than others, but they may not identify as strongly with a particular role. For example, saying, “I am a tester” really means, “I perform mainly testing activities because that is my primary passion and strength. I can provide leadership ...

CMMI: The Easy Way or The Hard Way? A Simple CMMI Readiness Checklist

Are you ready to build change?

Many times an organization or individual will start a change program because they deem it necessary for survival. But change is never easy. Survival and pain avoidance, while powerful, can lead to pursuing change as a reaction to pain rather than as a pursuit of value. Both pain avoidance and the generation of business value need to be understood: the intellectual benefits persuade, while pain avoidance sells. To ensure that both sides of the change are addressed, a framework can be useful to generate focus. The simple CMMI Readiness Checklist can be used for any major change initiative, but it is tailored toward testing whether the requirements for implementing a framework like the CMMI have been addressed.

I have broken this checklist into three categories: resources, plans and attitudes.  Each can be leveraged separately; however, using all three components will help you focus on the big picture.

Scale

The simple checklist can be used as a tool to evaluate how well you have prepared for your CMMI journey, using the questions as evaluation criteria.  To use the checklist, evaluate each question on a scale of high, medium, low and not present (with one exception). Each question will potentially contribute points toward the total that can be used to evaluate preparation.

Section and Question Weights:

Resources: Forty-two total points. Each component contributes up to 7 points (7, 3, 1, 0).

Plans: Eighteen total points. Each component contributes up to 6 points (6, 3, 1, 0).

Attitude: Forty total points. Each component contributes up to 8 points (8, 4, 2, 0).

Resources

Resources are the raw materials that you will consume on your journey.  As with any journey, having both the correct resources and the correct amount of resources will make the journey easier.  Just think of trying to canoe from New York to London for a meeting; the wrong resources can make the trip difficult.

Management Support

Support from management is critical as we have discussed in past checklists, but so is support from your peers and from the teams that will be using the processes.

Score

7 – Senior management is actively involved in guiding and using the outputs of the CMMI.  Senior managers stop people in the hall to discuss progress and recent process implementations. Discussion of progress is an agenda item at all managers' staff meetings.

3 – Senior and middle managers attend formal CMMI informational meetings and talk about the need to support the CMMI initiative.

1 – Senior managers attended the kick-off meeting, then relocated en masse to Aruba, leaving the middle managers in charge.

0 – The change initiative is a grass-roots effort.

Cash

Change costs money. Costs can include consultants, training, travel and an odd late-night pizza or two.

7 – A reasonable budget has been established and the implementation team can draw from the budget for planned expenditures.  Emergency funding can be attained to handle issues.

3 – A reasonable budget has been established and approved; however, access must be requested and justified for all expenditures.

1 – Any time that money is required funding must be requested and approved.

0 – Donations are sought in the organization’s lunchroom on a periodic basis (consider a PayPal donation button on your homepage).

Effort

Even if you have bales of cash, developing and implementing processes will require effort. Effort will be required from many constituencies, including the process-improvement team, management and the teams using the process, just to name a few.

7 – A reasonable staffing plan has been established and the change program is the only project the assigned resources have been committed to.

4 – A reasonable staffing plan has been established and the change initiative is the highest priority for the assigned resources.

1 – All resources are shared between the change initiative and other high-priority projects.

0 – You have all the effort you need after 5 PM and before 8 AM and during company holidays.

Change Specialist

Organizational change requires skills that are not generally found in an IT department. The skills needed include sales, marketing and communication.

7 – An organizational-change specialist has been assigned as a full-time resource for the project.

3 – An organizational-change specialist is available within the organization and works on many projects simultaneously. The specialist may or may not have had experience with IT change programs.

1 – Someone on the team has helped craft an organizational change plan in the past.

0 – Organizational change websites are blocked and your best bet is buying a book on Amazon using your own cash.

Projects

Change requires something to impact.  The organization needs to have a consistent flow of projects so that changes are not one-shot attempts.

7 – Projects are constantly beginning that will provide a platform for implementing process changes.

3 – There are numerous projects in the organization; however they typically begin early in the year or on some other periodic basis that makes waiting a necessity if you are not ready exactly on time.

1 – The organization does only a small number of projects every year.

0 – The organization does one large project every year.

Calendar Time

Calendar time is a resource that is as important as any other resource. Severe calendar constraints can lead to irrational or bet-the-farm behaviors which increase risk.

7 – The schedule for implementing the CMMI is in line with industry norms and includes time for tweaking the required processes before appraising.

3 – The schedule is realistic but bare bones. Any problems could cause delay.

1 – Expectations have been set that will require a compressed schedule; however, a delay will only be career limiting rather than having a critical impact on the business.

0 – The CMMI implementation is critical for the organization’s survival and is required on an extremely compressed schedule.

Expertise

A deep understanding of the CMMI (or any other framework for that matter) will be needed to apply the model in a dynamic environment.  Experience is generally hard won. “Doing” it once generally does not provide enough expertise to allow the level of tailoring needed to apply the model in more than one environment. Do not be afraid to get a mentor if this is a weakness.

7 – The leaders and team members working to implement the CMMI have been intimately involved in successfully implementing the framework in different environments.

3 – The leader and at least one of the team members have been involved in implementing the CMMI in the past in a similar environment.

1 – Only the leader of the CMMI program has been involved with implementing the CMMI in another environment.

0 – All of the team members have taken the basic CMMI course and can spell CMMI assuming they can buy a vowel.

Plans

Planning for the implementation of change can take many forms — from classic planning documents and schedules to backlogs.  The structure of the plan is less of a discussion point than the content.  You need several plans when changing an organization. "Several" does not mandate many volumes of paper and schedules; rather, it means that the required activities are thought through and recorded, the goal is known and the constraints on the program have been identified (in other words, the who, what, when, why and how are known to the level required).

Scale and Scoring

Plans: Eighteen total points. Each component contributes up to 6 points (6, 3, 1, 0).

Organizational Change Plan

The Organizational Change Plan includes information on how the changes required to implement the CMMI will be communicated, marketed, reported, discussed, supported, trained and, if necessary, escalated.

6 – A full change management plan has been developed, implemented and is being constantly monitored.

3 – An Organizational Change Plan is planned but is yet to be developed.

1 – When created, the Organizational Change Plan will be referenced occasionally.

0 – No Organizational Change Plan has or will be created.

Backlog

The backlog records what needs to be changed, in prioritized order. The backlog should include all changes, issues and risks. The items in the backlog will be broken down into tasks as they are selected to be worked on.  The format needs to match the corporate culture and can range from an Agile backlog to, in a waterfall organization, a requirements document.

6 – A prioritized backlog exists and is constantly maintained.

3 – A prioritized backlog exists and is periodically maintained.

1 – A rough list of tasks and activities is kept on a whiteboard.

0 – No backlog or list of tasks exists.

Governance

Any change program requires resources, perseverance and political capital. In most corporations these types of requirements scream the need for oversight (governance is a code word for the less friendly word oversight). Governance defines who decides which changes will be made, when they will be made and who will pay for them. I strongly recommend that you decide how governance will be handled, write it down, and make sure all of your stakeholders are comfortable with how you will get their advice, counsel, budget and, in some cases, permission.

6 – A full-governance plan has been developed, implemented and is being constantly monitored.

3 – A governance plan is planned, but is yet to be developed.

1 – When created, the governance plan will be used to show the process auditors.

0 – Governance . . . who needs it!

Attitude

When you talk about attitude it seems personal rather than organizational, but when it comes to large changes I believe that both the attitude of the organization and critical individuals are important.  As you prepare to address the CMMI, the onus is on you as a change leader to develop a nuanced understanding of who you need to influence within the organization. The checklist will portray an organizational view; however, you can and should replicate the exercise for specific critical influencers.

Scale and Scoring

Attitude: Forty total points. Each component contributes up to 8 points (8, 4, 2, 0).

Vision of tomorrow

Is there a belief that tomorrow will be demonstrably better based on the actions that are being taken? The organization needs to have a clear vision that tomorrow will be better than today in order to positively motivate the team to aspire to be better than they are.

8 – The organization is excited about the changes that are being implemented.  Volunteers to help or to pilot are numerous.

4 – Most of the organization is excited about most of the changes and their impact on the future.

2 – A neutral outlook (or at least undecided) is present.

0 – Active disenchantment with or dissension about the future is present.

Minimalist

The view that the simplest process change that works is the best is important in today’s lean world.  In many cases heavy processes are wearing on everyone who uses them and even when the process is okay today, entropy will add steps and reviews over time, which adds unneeded weight.  Score this attribute higher if the organization has a process to continually apply lean principles as a step in process maintenance.

8 – All processes are designed with lean principles formally applied.  Productivity and throughput are monitored to ensure that output isn’t negatively impacted.

4 – All processes are designed with lean principles formally applied; however, they are not monitored quantitatively.

2 – All processes are designed with lean principles informally applied.

0 – Processes are graded by the number of steps required, with a higher number being better.

Learner

A learner is someone who understands that they don’t know everything and that mistakes will be made. They understand that when mistakes are made, they are to be examined and corrected rather than swept under the carpet. Another attribute of a learner is the knowledge that the synthesis of data and knowledge from other sources is required for growth.  In most organizations, an important source of process knowledge and definition is the practitioners — but not the sole source.

8 – New ideas are actively pursued and evaluated on an equal footing with any other idea or concept.

4 – New ideas are actively pursued and evaluated, but those that reflect the way work is currently done are given more weight.

2 – The “not invented here” point of view has a bit of a hold on the organization, making the introduction of new ideas difficult.

0 – There is only one way to do anything and it was invented here sometime early last century.  Introduction of new ideas is considered dangerous.

Goal Driven

The organization needs to have a real need to drive the change and must be used to pursuing longer-term goals. The Process Philosopher of Sherbrooke once told me that being goal-driven is required to be serious about change.  In many cases a good, focused near-death experience increases the probability of change, but waiting that long can create a negative atmosphere. A check-the-box goal rarely provides more than short-term localized motivation.

8 – The organization has a well-stated positive goal that the CMMI not only supports but is integral to attaining.

2 – The pursuit of the CMMI is about checking a box on a RFP response.

0 – CMMI is being pursued for no apparent purpose.

Conviction

Belief in the underlying concepts of the CMMI (or other change framework) provides motivation to the organization and individuals.  Conviction creates a scenario where constancy of purpose (Deming) is not an afterthought but the way things are done. Implementing a framework like the CMMI is a long-term effort — generally with levels of excitement cycling through peaks and valleys.  In the valley, when despair becomes a powerful force, conviction is often the thread that keeps things moving forward. Without a critical mass of conviction it will be easy to wander off to focus on the next new idea.

8 – We believe, and we have evidence from the past that we can continue to believe over time.

4 – We believe but this is the first time we’ve attempted something this big!

2 – We believe  . . . mostly.

0 – No Organizational-Change Plan has been created.

Scoring

Sum all of the scores and apply the following criteria.

100 – 80   You have a great base; live the dream.

79 – 60   Focus on building your change infrastructure as you begin the CMMI journey.

59 – 30   Remediate your weaknesses before you start wrestling with the CMMI.

29 –   0   Run away! Trying to implement the CMMI will be equivalent to putting your hand in a running garbage disposal; avoid it if you possibly can!
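
For those who like to see the arithmetic spelled out, here is a minimal scoring sketch in Python; the per-question scores are invented example values and the bands are the ones listed above.

# Minimal sketch of the checklist arithmetic; the scores below are invented
# example values (one entry per checklist question).
resource_scores = [7, 3, 7, 1, 3, 7, 3]   # seven Resources questions
plan_scores     = [6, 3, 1]               # three Plans questions
attitude_scores = [8, 4, 2, 8, 4]         # five Attitude questions

total = sum(resource_scores) + sum(plan_scores) + sum(attitude_scores)

if total >= 80:
    verdict = "You have a great base; live the dream."
elif total >= 60:
    verdict = "Focus on building your change infrastructure as you begin."
elif total >= 30:
    verdict = "Remediate your weaknesses before wrestling with the CMMI."
else:
    verdict = "Run away!"

print(total, verdict)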


Categories: Process Management

Topic Modelling: Working out the optimal number of topics

Mark Needham - Tue, 03/24/2015 - 23:33

In my continued exploration of topic modelling I came across The Programming Historian blog and a post showing how to derive topics from a corpus using the Java library mallet.

The instructions on the blog make it very easy to get up and running but as with other libraries I’ve used, you have to specify how many topics the corpus consists of. I’m never sure what value to select but the authors make the following suggestion:

How do you know the number of topics to search for? Is there a natural number of topics? What we have found is that one has to run the train-topics with varying numbers of topics to see how the composition file breaks down. If we end up with the majority of our original texts all in a very limited number of topics, then we take that as a signal that we need to increase the number of topics; the settings were too coarse.

There are computational ways of searching for this, including using MALLETs hlda command, but for the reader of this tutorial, it is probably just quicker to cycle through a number of iterations (but for more see Griffiths, T. L., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Science, 101, 5228-5235).

Since I haven’t yet had the time to dive into the paper or explore how to use the appropriate option in mallet I thought I’d do some variations on the stop words and number of topics and see how that panned out.

As I understand it, the idea is to try to get a uniform spread of documents across topics, i.e. we don’t want all the documents to have the same topic, otherwise any topic similarity calculations we run won’t be that interesting.
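
One way to put a rough number on that spread (my own addition, not something from the original post) is to look at how documents distribute over their dominant topic, for example via a normalised entropy. The sketch below assumes the mallet composition format shown further down, where each line lists topic/proportion pairs sorted by proportion.

import math
from collections import Counter

def dominant_topic_counts(composition_file):
    """Count how many documents have each topic as their dominant topic."""
    counts = Counter()
    with open(composition_file) as f:
        for line in f:
            if line.startswith("#"):
                continue
            fields = line.strip().split("\t")
            counts[fields[2]] += 1  # first topic listed has the highest proportion
    return counts

def normalised_entropy(counts):
    """1.0 = documents spread evenly across topics, 0.0 = all in one topic."""
    total = sum(counts.values())
    probabilities = [c / total for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probabilities)
    return entropy / math.log(len(counts)) if len(counts) > 1 else 0.0

counts = dominant_topic_counts("output/himym_10_all.stop.words_composition.txt")
print(counts)
print("spread:", normalised_entropy(counts))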

I tried running mallet with 10, 15, 20 and 30 topics and also varied the stop words used. I had one version which just stripped out the main characters and the word ‘narrator’, and another where I stripped out the top 20% of words by occurrence and any words that appeared fewer than 10 times.

The reason for doing this was that it should identify interesting phrases across episodes better than TF/IDF can while not just selecting the most popular words across the whole corpus.
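
As a rough sketch of how that second stop-word list could have been built (the paths, tokenisation and my reading of "top 20% of words by occurrence" are assumptions, not the original code):

import glob
import re
from collections import Counter

# Count word frequencies across all episode transcripts (path is illustrative).
counts = Counter()
for path in glob.glob("mallet-2.0.7/sample-data/himym/*.txt"):
    with open(path) as f:
        counts.update(re.findall(r"[a-z']+", f.read().lower()))

# Interpret "top 20% of words by occurrence" as the top fifth of distinct words
# ranked by frequency; also drop anything seen fewer than 10 times.
ranked = [word for word, _ in counts.most_common()]
top_20_percent = set(ranked[: len(ranked) // 5])
rare = {word for word, count in counts.items() if count < 10}

# Write the combined list for mallet's --extra-stopwords option.
with open("main-words-stop.txt", "w") as f:
    f.write("\n".join(sorted(top_20_percent | rare)))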

I used mallet from the command line and ran it in two parts.

  1. Generate the model
  2. Work out the allocation of topics and documents based on hyperparameters

I wrote a script to help me out:

#!/bin/sh
 
train_model() {
  ./mallet-2.0.7/bin/mallet import-dir \
    --input mallet-2.0.7/sample-data/himym \
    --output ${2} \
    --keep-sequence \
    --remove-stopwords \
    --extra-stopwords ${1}
}
 
extract_topics() {
  ./mallet-2.0.7/bin/mallet train-topics \
    --input ${2} --num-topics ${1} \
    --optimize-interval 20 \
    --output-state himym-topic-state.gz \
    --output-topic-keys output/himym_${1}_${3}_keys.txt \
    --output-doc-topics output/himym_${1}_${3}_composition.txt
}
 
train_model "stop_words.txt" "output/himym.mallet"
train_model "main-words-stop.txt" "output/himym.main.words.stop.mallet"
 
extract_topics 10 "output/himym.mallet" "all.stop.words"
extract_topics 15 "output/himym.mallet" "all.stop.words"
extract_topics 20 "output/himym.mallet" "all.stop.words"
extract_topics 30 "output/himym.mallet" "all.stop.words"
 
extract_topics 10 "output/himym.main.words.stop.mallet" "main.stop.words"
extract_topics 15 "output/himym.main.words.stop.mallet" "main.stop.words"
extract_topics 20 "output/himym.main.words.stop.mallet" "main.stop.words"
extract_topics 30 "output/himym.main.words.stop.mallet" "main.stop.words"

As you can see, this script first generates a bunch of models from text files in ‘mallet-2.0.7/sample-data/himym’ – there is one file per episode of HIMYM. We then use that model to generate differently sized topic models.

The output is two files; one containing a list of topics and another describing what percentage of the words in each document come from each topic.

$ cat output/himym_10_all.stop.words_keys.txt
 
0	0.08929	back brad natalie loretta monkey show call classroom mitch put brunch betty give shelly tyler interview cigarette mc laren
1	0.05256	zoey jerry arthur back randy arcadian gael simon blauman blitz call boats becky appartment amy gary made steve boat
2	0.06338	back claudia trudy doug int abby call carl stuart voix rachel stacy jenkins cindy vo katie waitress holly front
3	0.06792	tony wendy royce back jersey jed waitress bluntly lucy made subtitle film curt mosley put laura baggage officer bell
4	0.21609	back give patrice put find show made bilson nick call sam shannon appartment fire robots top basketball wrestlers jinx
5	0.07385	blah bob back thanksgiving ericksen maggie judy pj valentine amanda made call mickey marcus give put dishes juice int
6	0.04638	druthers karen back jen punchy jeanette lewis show jim give pr dah made cougar call jessica sparkles find glitter
7	0.05751	nora mike pete scooter back magazine tiffany cootes garrison kevin halloween henrietta pumpkin slutty made call bottles gruber give
8	0.07321	ranjit back sandy mary burger call find mall moby heather give goat truck made put duck found stangel penelope
9	0.31692	back give call made find put move found quinn part ten original side ellen chicago italy locket mine show
$ head -n 10 output/himym_10_all.stop.words_composition.txt
#doc name topic proportion ...
0	file:/Users/markneedham/projects/mallet/mallet-2.0.7/sample-data/himym/1.txt	0	0.70961794636687	9	0.1294699168584466	8	0.07950442338871108	2	0.07192178481473664	4	0.008360809510263838	5	2.7862560133367015E-4	3	2.562409242784946E-4	7	2.1697378721335337E-4	1	1.982849604752168E-4	6	1.749937876710496E-4
1	file:/Users/markneedham/projects/mallet/mallet-2.0.7/sample-data/himym/10.txt	2	0.9811551470820473	9	0.016716882136209997	4	6.794128563082893E-4	0	2.807350575301132E-4	5	2.3219634098530471E-4	8	2.3018997315244256E-4	3	2.1354177341696056E-4	7	1.8081798384467614E-4	1	1.6524340216541808E-4	6	1.4583339433951297E-4
2	file:/Users/markneedham/projects/mallet/mallet-2.0.7/sample-data/himym/100.txt	2	0.724061485807234	4	0.13624729774423758	0	0.13546964196228636	9	0.0019436342339785994	5	4.5291919356563914E-4	8	4.490055982996677E-4	3	4.1653183421485213E-4	7	3.5270123154213927E-4	1	3.2232165301666123E-4	6	2.8446074162457316E-4
3	file:/Users/markneedham/projects/mallet/mallet-2.0.7/sample-data/himym/101.txt	2	0.7815231689893246	0	0.14798271520316794	9	0.023582384458063092	8	0.022251052243582908	1	0.022138209217973336	4	0.0011804626661380394	5	4.0343527385745457E-4	3	3.7102343418895774E-4	7	3.1416667687862693E-4	6	2.533818368250992E-
4	file:/Users/markneedham/projects/mallet/mallet-2.0.7/sample-data/himym/102.txt	6	0.6448245189567259	4	0.18612146979166502	3	0.16624873439661025	9	0.0012233726722317548	0	3.4467218590717303E-4	5	2.850788252495599E-4	8	2.8261550915084904E-4	2	2.446611421432842E-4	7	2.2199909869250053E-4	1	2.028774216237081E-
5	file:/Users/markneedham/projects/mallet/mallet-2.0.7/sample-data/himym/103.txt	8	0.7531586740033047	5	0.17839539108961253	0	0.06512376460651902	9	0.001282794040111701	4	8.746645156304241E-4	3	2.749100345664577E-4	2	2.5654476523149865E-4	7	2.327819863700214E-4	1	2.1273153572848481E-4	6	1.8774342292520802E-4
6	file:/Users/markneedham/projects/mallet/mallet-2.0.7/sample-data/himym/104.txt	7	0.9489502365148181	8	0.030091466847852504	4	0.017936457663121977	9	0.0013482824985091328	0	3.7986419553884905E-4	5	3.141861834124008E-4	3	2.889445824352445E-4	2	2.6964174000656E-4	1	2.2359178288566958E-4	6	1.9732799141958482E-4
7	file:/Users/markneedham/projects/mallet/mallet-2.0.7/sample-data/himym/105.txt	8	0.7339694064061175	7	0.1237041841318045	9	0.11889696041555338	0	0.02005288536233353	4	0.0014026751618923005	5	4.793786828705149E-4	3	4.408655780020889E-4	2	4.1141370625324785E-4	1	3.411516484151411E-4	6	3.0107890675777946E-4
8	file:/Users/markneedham/projects/mallet/mallet-2.0.7/sample-data/himym/106.txt	5	0.37064909999661005	9	0.3613559917055785	0	0.14857567731040344	6	0.09545466082502917	4	0.022300625744661403	8	3.8725629469313333E-4	3	3.592484711785775E-4	2	3.3524900189121E-4	7	3.041961449432886E-4	1	2.779945050112539E-4

The output is a bit tricky to understand on its own so I did a bit of post processing using pandas and then ran the results of that through matplotlib to see the distribution of documents for different topic sizes with different stop words. You can see the script here.
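
The script itself isn't reproduced here, but the post-processing could look roughly like this (the 2x4 grid mirrors the eight runs above; the details of the original script may well differ):

import glob
import pandas as pd
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 4, figsize=(16, 6))

# One bar chart per (stop-word set, number of topics) combination.
for ax, path in zip(axes.flat, sorted(glob.glob("output/himym_*_composition.txt"))):
    dominant_topics = []
    with open(path) as f:
        for line in f:
            if line.startswith("#"):
                continue
            fields = line.strip().split("\t")
            # Columns alternate topic / proportion, highest proportion first.
            dominant_topics.append(int(fields[2]))
    pd.Series(dominant_topics).value_counts().sort_index().plot(kind="bar", ax=ax)
    ax.set_title(path.split("/")[-1], fontsize=7)
    ax.set_xlabel("dominant topic")
    ax.set_ylabel("documents")

plt.tight_layout()
plt.show()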

I ended up with the following chart:

[chart: distribution of documents across topics for each stop-word and topic-count variation]

On the left hand side we’re using more stop words and on the right just the main ones. For most of the variations there are one or two topics which most documents belong to but interestingly the most uniform distribution seems to be when we have few topics.

These are the main words for the most popular topics on the left hand side:

15 topics

8       0.50732 back give call made put find found part move show side ten mine top abby front fire full fianc

20 topics

12      0.61545 back give call made put find show found part move side mine top front ten full cry fire fianc

30 topics

22      0.713   back call give made put find show part found side move front ten full top mine fire cry bottom

All contain more or less the same words which at first glance seem like quite generic words so I’m surprised they weren’t excluded.

On the right hand side we haven’t removed many words so we’d expect common words in the English language to dominate. Let’s see if they do:

10 topics

1       3.79451 don yeah ll hey ve back time guys good gonna love god night wait uh thing guy great make

15 topics

5       2.81543 good time love ll great man guy ve night make girl day back wait god life yeah years thing
 
10      1.52295 don yeah hey gonna uh guys didn back ve ll um kids give wow doesn thing totally god fine

20 topics

1       3.06732 good time love wait great man make day back ve god life years thought big give apartment people work
 
13      1.68795 don yeah hey gonna ll uh guys night didn back ve girl um kids wow guy kind thing baby

30 topics

14      1.42509 don yeah hey gonna uh guys didn back ve um thing ll kids wow time doesn totally kind wasn
 
24      2.19053 guy love man girl wait god ll back great yeah day call night people guys years home room phone
 
29      1.84685 good make ve ll stop time made nice put feel love friends big long talk baby thought things happy

Again we have similar words across each run and as expected they are all quite generic words.

My take away from this exploration is that I should vary the stop word percentages as well and see if that leads to an improved distribution.

Taking out very common words like we do with the left hand side charts seems to make sense although I need to work out why there’s a single outlier in each group.

The authors suggest that having the majority of our texts in a small number of topics means we need to create more of them so I will investigate that too.

The code is all on github along with the transcripts so give it a try and let me know what you think.

Categories: Programming

Two Programmers Turned WordPress Entrepreneurs Profiled

Making the Complex Simple - John Sonmez - Tue, 03/24/2015 - 20:18

I’ve been involved with the WordPress community quite a bit, since I’ve launched my “How to Create a Blog to Boost Your Career” course. As a result, I’ve been very interested in what is happening in the WordPress development world. I’ve gotten a few requests for information about WordPress development, but I’m really not an […]

The post Two Programmers Turned WordPress Entrepreneurs Profiled appeared first on Simple Programmer.

Categories: Programming

Calculating Value from Software Projects - Estimating is a Risk Reduction Process

Herding Cats - Glen Alleman - Tue, 03/24/2015 - 15:41

Software can provide increased benefits to internal users, with the firm paying for the development or acquisition of that software. Software can also reduce costs to the users in exchange for the cost of development or acquisition of the software.

This exchange of benefit for cost is the basis of all business decision making where the governance of resources is in place. This governance process assumes there are limited resources - cost, time, capacity for work, available labor, equipment, and any other element that enters in the production of output of the firm. 

Benefits produced in exchange for this cost also include risk reduction, as well as increased revenue, and lowered cost. But again this risk reduction is in exchange for cost - money paid for reduced risk.

To determine benefits we can use a decision framework †

  • Business scope - measures deliverable benefits in either cost or revenue.
  • Value and Potential Impact:
    • Improvement benefits — based on existing valuation tools or what if calculations.
    • Scaling benefits — the total value of the associated business change resulting from the software acquisition and deployment.
    • Risk reduction benefits — the expected value of the risks being mitigated (see the sketch after this list).
    • Future options — using Black-Scholes, Real Options, or Monte Carlo Simulation to determine future benefits.
  • Project cost - assess using standard project costing tools.
  • Direct savings — through operational cost reduction.
  • Real cash effective — netted off against direct savings.
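
As a toy illustration of the "expected value of the risks being mitigated" item above (the risks, probabilities and impacts are invented, not taken from the referenced framework):

# Illustrative only: expected monetary value of the risks a project mitigates.
risks = [
    {"name": "rework from data-entry errors", "probability": 0.30, "impact": 250_000},
    {"name": "regulatory fine",                "probability": 0.05, "impact": 1_000_000},
    {"name": "unplanned downtime",             "probability": 0.20, "impact": 400_000},
]

risk_reduction_benefit = sum(r["probability"] * r["impact"] for r in risks)
print(f"Expected value of mitigated risks: ${risk_reduction_benefit:,.0f}")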

Making Decisions in the Presence of Uncertainty ‡

It is better to be approximately right than precisely wrong - Warren Buffet

This quote is many times misused to say we can't possibly estimate so let's not estimate, let's just get started coding and we'll see what comes out. 

Since all project work is based on uncertainty — reducible and irreducible — risk is the result of this uncertainty. So first we need to determine what is the Value At Risk before we can say how we are going to manage in the presence of this uncertainty. 

Risk is a quantity that has relevance on its own. What's the risk that if I make this next turn on a Double-Black run in Breckenridge, there will be ice and I'll fall? Hard to say on the first run of the day, so I'll slow down as I come around the tree line. On the third run in a row, I've got the experience on that run today and the experience of skiing fast and hard on that run for several years to know more about the risk.

Since statistical uncertainty drives projects, the resulting risk from that uncertainty is probabilistic. When interdependencies between work elements, capacity for work, technical and operational processes, changing understanding, and a myriad of other variables are also uncertain, we are faced with a problem: no deterministic assessment of time, cost, or capabilities can be performed for our project. We must have a probabilistic process. And of course this probabilistic process requires estimating the range of possible values for each variable that interacts with the other variables.
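
As a toy illustration of such a probabilistic process (this is not the RiskyProject model shown below; the three-task chain, the triangular distributions and the deadline are all invented):

import random

def simulate_once():
    """One sample of a simple three-task chain with uncertain durations (days).

    Triangular(low, high, mode) distributions stand in for the reducible and
    irreducible uncertainty on each task; the numbers are invented.
    """
    design    = random.triangular(10, 25, 15)
    build     = random.triangular(20, 60, 30)
    integrate = random.triangular(5, 30, 10)
    return design + build + integrate

trials = 100_000
durations = sorted(simulate_once() for _ in range(trials))

deadline = 70  # days from start; illustrative
on_time = sum(1 for d in durations if d <= deadline) / trials
p80 = durations[int(0.8 * trials)]

print(f"P(finish within {deadline} days) = {on_time:.0%}")
print(f"80% confidence completion time  = {p80:.0f} days")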

Sampling from the past (empirical data) is a good way to start, but those past samples tell us nothing about the future statistical process unless all work in the future is identical to the work in the past. This is naive at best and dangerous at worst. Such naive assumptions are many times the Root Cause of major cost overruns in our space and defense software intensive systems business. Those same naive assumptions are applicable across all software domains.

When there is a network of work activities - as there is on any non-trivial project - each activity is a probabilistic process. The notion of independent work in the agile paradigm must always be confirmed before assuming a simple queuing process can be put to work. So when you hear about Little's Law and bootstrapping simulations, confirm the interdependencies of the work. The model in the chart below and the Probability Distribution Function below that are from a Monte Carlo Simulation, where the interdependencies of the work activities are modeled by the tool. In this case the tool is RiskyProject, which provides a Risk Register for reducible risks, a means of modeling the irreducible uncertainty in the duration of the work, and indicators - such as the coupling between work elements (the Cruciality Index) - of trouble to come based on the status of past performance.

[chart: RiskyProject schedule model with risk register and work interdependencies]

The chart below says that the activity being modeled (and all the activities in the network of activities are modeled, I just picked this one), has a 54% chance of completing on or before Oct 18th, 2015. 

Probabilistic Finish

The End

If your project is small enough to be able to see all the work in one place, see how to produce outcomes from all that work, and has a low value at risk, making business-based estimates using risk reduction is probably not needed.

If your project has interdependencies between the work elements, the work is of sufficient duration that you can't see the end in enough detail to remove all the statistical variances, and the value at risk is sufficiently high that the business impact of not showing up on time, on budget, with the needed outcomes will be unacceptable - then the process of managing in the presence of uncertainty must be able to estimate all the interacting variables.

It's this simple

  • No risk, low impact for being wrong, low value at risk project — no need to worry about the future, just start  and work out problems as you go.

But when we hear that we don't need to estimate to make decisions, and no domain and context is provided, those making that conjecture haven't considered the domain or context either. They're either unaware of the probability and statistics of projects or they're intentionally ignoring them. Since they ignore these fundamental processes of all non-trivial projects, ignore their advice.

‡ How to Measure Anything: Finding the Value of Intangibles in Business, 3rd Edition, Douglas Hubbard

† A New Framework for IT Investment Decisions, Anthony Barnes

Categories: Project Management

Announcing the new Azure App Service

ScottGu's Blog - Scott Guthrie - Tue, 03/24/2015 - 15:23

In a mobile first, cloud first world, every business needs to deliver great mobile and web experiences that engage and connect with their customers, and which enable their employees to be even more productive.  These apps need to work with any device, and to be able to consume and integrate with data anywhere.

I'm excited to announce the release of our new Azure App Service today - which provides a powerful new offering to deliver these solutions.  Azure App Service is an integrated service that enables you to create web and mobile apps for any platform or device, easily integrate with SaaS solutions (Office 365, Dynamics CRM, Salesforce, Twilio, etc), easily connect with on-premises applications (SAP, Oracle, Siebel, etc), and easily automate business processes while meeting stringent security, reliability, and scalability needs.

Azure App Service includes the Web App + Mobile App capabilities that we previously delivered separately (as Azure Websites + Azure Mobile Services).  It also includes powerful new Logic/Workflow App and API App capabilities that we are introducing today for the very first time - along with built-in connectors that make it super easy to build logic workflows that integrate with dozens of popular SaaS and on-premises applications (Office 365, SalesForce, Dynamics, OneDrive, Box, DropBox, Twilio, Twitter, Facebook, Marketo, and more). 

All of these features can be used together at one low price.  In fact, the new Azure App Service pricing is exactly the same price as our previous Azure Websites offering.  If you are familiar with our Websites service you now get all of the features it previously supported, plus additional new mobile support, plus additional new workflow support, plus additional new connectors to dozens of SaaS and on-premises solutions at no extra charge

Web + Mobile + Logic + API Apps

Azure App Service enables you to easily create Web + Mobile + Logic + API Apps:


You can run any number of these app types within a single Azure App Service deployment.  Your apps are automatically managed by Azure App Service and run in managed VMs isolated from other customers (meaning you don't have to worry about your app running in the same VM as another customer).  You can use the built-in AutoScaling support within Azure App Service to automatically increase and decrease the number of VMs that your apps use based on the actual resource consumption of them. 

This provides an incredibly cost-effective way to build and run highly scalable apps that provide both Web and Mobile experiences, and which contain automated business processes that integrate with a wide variety of apps and data sources.

Below are additional details on the different app types supported by Azure App Service.  Azure App Service is generally available starting today for Web apps, with the Mobile, Logic and API app types available in public preview:

Web Apps

The Web App support within Azure App Service includes 100% of the capabilities previously supported by Azure Websites.  This includes:

  • Support for .NET, Node.js, Java, PHP, and Python code
  • Built-in AutoScale support (automatically scale up/down based on real-world load)
  • Integrated Visual Studio publishing as well as FTP publishing
  • Continuous Integration/Deployment support with Visual Studio Online, GitHub, and BitBucket
  • Virtual networking support and hybrid connections to on-premises networks and databases
  • Staged deployment and test in production support
  • WebJob support for long running background tasks

Customers who have previously deployed an app using the Azure Website service will notice today that these apps are now called "Web Apps" within the Azure management portals.  You can continue to run these apps exactly as before - or optionally now also add mobile + logic + API app support to your solution as well without having to pay anything more.

Mobile Apps

The Mobile App support within Azure App Service provides the core capabilities we previously delivered using Azure Mobile Services.  It also includes several new enhancements that we are introducing today including:

  • Built-in AutoScale support (automatically scale up/down based on real-world load)
  • Traffic Manager support (geographically scale your apps around the world)
  • Continuous Integration/Deployment support with Visual Studio Online, GitHub, and BitBucket
  • Virtual networking support and hybrid connections to on-premises databases
  • Staged deployment and test in production support
  • WebJob support for long running background tasks

Because we have an integrated App Service offering, you can now run both Web and Mobile Apps using a single Azure App Service deployment.  This allows you to avoid having to pay for a separate web and mobile backend - and instead optionally pool your resources to save even more money.

Logic Apps

The Logic App support within Azure App Services is brand new and enables you to automate workflows and business processes.  For example, you could configure a workflow that automatically runs every time your app calls an API, or saves data within a database, or on a timer (e.g. once a minute) - and within your workflows you can do tasks like create/retrieve a record in Dynamics CRM or Salesforce, send an email or SMS message to a sales-rep to follow up on, post a message on Facebook or Twitter or Yammer, schedule a meeting/reminder in Office 365, etc. 

Constructing such workflows is now super easy with Azure App Services.  You can define a workflow either declaratively using a JSON file (which you can check-in as source code) or using the new Logic/Workflow designer introduced today within the Azure Portal.  For example, below I've used the new Logic designer to configure an automatically recurring workflow that runs every minute, and which searches Twitter for tweets about Azure, and then automatically send SMS messages (using Twilio) to have employees follow-up on them:

[screenshot: Logic App designer showing the recurring Twitter-to-Twilio workflow]

Creating the above workflow is super easy and takes only a minute or so to do using the new Logic App designer.  Once saved it will automatically run within the same VMs/Infrastructure that the Web Apps and Mobile Apps you've built using Azure App Service use as well.  This means you don't have to deploy or pay for anything extra - if you deploy a Web or Mobile App on Azure you can now do all of the above workflow + integration scenarios at no extra cost

Azure App Service today includes support for the following built-in connectors that you can use to construct and automate your Logic App workflows:

[image: gallery of built-in connectors]

Combined the above connectors provide a super powerful way to build and orchestrate tasks that run and scale within your apps.  You can now build much richer web and mobile apps using it.

Watch this Azure Friday video about Logic Apps with Scott Hanselman and Josh Twist to learn more about how to use it.

API Apps

The API Apps support within Azure App Service provides additional support that enables you to easily create, consume and call APIs - both APIs you create (using a framework like ASP.NET Web API or the equivalent in other languages) as well as APIs from other SaaS and cloud providers.

API Apps enable simple access control and credential management within your applications, as well as automatic SDK generation support that enables you to easily expose and integrate APIs across a wide variety of languages.  You can optionally integrate these APIs with Logic Apps.

Getting Started

Getting started with Azure App Service is easy.  Simply sign-into the Azure Preview Portal and click the "New" button in the bottom left of the screen.  Select the "Web + Mobile" sub-menu and you can now create Web Apps, Mobile Apps, Logic Apps, and API Apps:


You can create any number of Web, Mobile, Logic and API apps and run them on a single Azure App Service deployment at no additional cost. 

Learning More

I'll be hosting a special Azure App Service launch event online on March 24th at 11am PDT which will contain more details about Azure App Service, a great demo from Scott Hanselman, and talks by several customers and analysts about their experiences.  You can watch the online event for free here.

Also check out our new Azure Friday App Service videos with Scott Hanselman that go into detail about all of the new capabilities, and show off how to build Web, Mobile, Logic and API Apps using Azure App Service:

Then visit our documentation center to learn more about the service and how to get started with it today.  Pricing details are available here.

Summary

Today’s Microsoft Azure release enables a ton of great new scenarios, and makes building great web and mobile applications hosted in the cloud even easier.

If you don’t already have an Azure account, you can sign-up for a free trial and start using all of the above features today.  Then visit the Microsoft Azure Developer Center to learn more about how to build apps with it.

Hope this helps,

Scott

P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu

Categories: Architecture, Programming

A High Available Docker Container Platform using CoreOS and Consul

Xebia Blog - Tue, 03/24/2015 - 11:35

Docker containers are hot, but containers in themselves are not very interesting. They need an ecosystem to make it into 24x7 production deployments. Just handing your container names to operations does not cut it.

In this blog post, we will show you how CoreOS can be used to provide a Highly Available Docker Container Platform as a Service, with a standard, out-of-the-box way to deploy Docker containers. Consul is added to the mix to create a lightweight HTTP Router to any Docker application offering an HTTP service.

We will be killing a few processes and machines on the way to prove our point...

Architecture

The basic architecture for our Docker Container Platform as a Service, consists of the following components

[diagram: architecture of the CoreOS container platform]

  • CoreOS cluster
    The CoreOS cluster will provide us with a cluster of Highly Available Docker Hosts. CoreOS is an open source lightweight operating system based on the Linux kernel and provides an infrastructure for clustered deployments of applications. The interesting part of CoreOS is that you cannot install applications or packages on CoreOS itself. Any custom application has to be packed and deployed as a Docker container. At the same time CoreOS provides only basic functionality for managing these applications.
  • Etcd
    etcd is the CoreOS distributed key value store and provides a reliable mechanism to distribute data through the cluster.
  • Fleet
    Fleet is the cluster wide init system of CoreOS which allows you to schedule applications to run inside the cluster and provides the much needed nanny system for your apps.
  • Consul
    Consul from Hashicorp is a tool that eases service discovery and configuration. Consul allows services to be discovered via DNS and HTTP and provides us with the ability to respond to changes in the service registration.
  • Registrator
    The Registrator from Gliderlabs will automatically register and deregister any Docker container as a service in Consul. The registrator runs on each Docker Host.
  • HttpRouter
    Will dynamically route HTTP traffic to any application providing an HTTP service, running anywhere in the cluster.  It listens on port 80.
  • Load Balancer
    An external load balancer which will route the HTTP traffic to any of the CoreOS nodes listening on port 80.
  • Apps
    These are the actual applications that may advertise HTTP services to be discovered and accessed. These will be provided by you.

 

Getting Started

In order to get your own container platform as a service running, we have created an Amazon AWS CloudFormation file which installs the basic services: Consul, Registrator, HttpRouter and the load balancer.

In the infrastructure we create two autoscaling groups: one for the Consul servers, which is limited to 3 to 5 machines, and one for the Consul clients, which is basically unlimited and depends on your need.

The nice thing about the autoscaling group is that it will automatically launch a new machine if the number of machines drops below the minimum or desired number.  This adds robustness to the platform.

The Amazon Elastic Load Balancer balances incoming traffic to port 80 on any machine in either autoscaling group.

We created a little script that creates your CoreOS cluster. This has the prerequisite that you are running MacOS and have installed:

In addition, the CloudFormation file assumes that you have a Route53 HostedZone in which we can add Records for your domain. It may work on other Linux platforms, but I did not test that.

 

git clone https://github.com/mvanholsteijn/coreos-container-platform-as-a-service
cd coreos-container-platform-as-a-service
./bin/create-stack.sh -d cargonauts.dutchdevops.net

...
{
"StackId": "arn:aws:cloudformation:us-west-2:233211978703:stack/cargonautsdutchdevopsnet/b4c802f0-d1ff-11e4-9c9c-5088484a585d"
}
INFO: create in progress. sleeping 15 seconds...
INFO: create in progress. sleeping 15 seconds...
INFO: create in progress. sleeping 15 seconds...
INFO: create in progress. sleeping 15 seconds...
INFO: create in progress. sleeping 15 seconds...
INFO: create in progress. sleeping 15 seconds...
INFO: create in progress. sleeping 15 seconds...
INFO: create in progress. sleeping 15 seconds...
INFO: create in progress. sleeping 15 seconds...
INFO: create in progress. sleeping 15 seconds...
INFO: create in progress. sleeping 15 seconds...
CoreOSServerAutoScale 54.185.55.139 10.230.14.39
CoreOSServerAutoScaleConsulServer 54.185.125.143 10.230.14.83
CoreOSServerAutoScaleConsulServer 54.203.141.124 10.221.12.109
CoreOSServerAutoScaleConsulServer 54.71.7.35 10.237.157.117

Now you are ready to look around. Use one of the external IP addresses to setup a tunnel for fleetctl.

export FLEETCTL_TUNNEL=54.203.141.124

fleetctl is the command line utility that allows you to manage the units that you deploy on CoreOS.


fleetctl list-machines
....
MACHINE		IP		METADATA
1cdadb87...	10.230.14.83	consul_role=server,region=us-west-2
2dde0d31...	10.221.12.109	consul_role=server,region=us-west-2
7f1f2982...	10.230.14.39	consul_role=client,region=us-west-2
f7257c36...	10.237.157.117	consul_role=server,region=us-west-2

will list all the machines in the platform with their private IP addresses and roles. As you can see we have tagged 3 machines for the consul server role and 1 machine for the consul client role. To see all the docker containers that have started on the individual machines, you can run the following script:

for machine in $(fleetctl list-machines -fields=machine -no-legend -full) ; do
   fleetctl ssh $machine docker ps
done
...
CONTAINER ID        IMAGE                                  COMMAND                CREATED             STATUS              PORTS                                                                                                                                                                                                                                      NAMES
ccd08e8b672f        cargonauts/consul-http-router:latest   "/consul-template -c   6 minutes ago       Up 6 minutes        10.221.12.109:80->80/tcp                                                                                                                                                                                                                   consul-http-router
c36a901902ca        progrium/registrator:latest            "/bin/registrator co   7 minutes ago       Up 7 minutes                                                                                                                                                                                                                                                   registrator
fd69ac671f2a        progrium/consul:latest                 "/bin/start -server    7 minutes ago       Up 7 minutes        172.17.42.1:53->53/udp, 10.221.12.109:8300->8300/tcp, 10.221.12.109:8301->8301/tcp, 10.221.12.109:8301->8301/udp, 10.221.12.109:8302->8302/udp, 10.221.12.109:8302->8302/tcp, 10.221.12.109:8400->8400/tcp, 10.221.12.109:8500->8500/tcp   consul
....

To inspect the Consul console, you need to first setup a tunnel to port 8500 on a server node in the cluster:

ssh-add stacks/cargonautsdutchdevopsnet/cargonauts.pem
ssh -A -L 8500:10.230.14.83:8500 core@54.185.125.143
open http://localhost:8500

Consul Console

You will now see that there are two services registered: consul and the consul-http-router. Consul registers itself and the http router was detected and registered by the Registrator on 4 machines.
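
The same information can also be pulled from Consul's HTTP catalog API; a minimal sketch, assuming the port 8500 tunnel set up above is still open:

import json
import urllib.request

# Assumes the SSH tunnel above is forwarding localhost:8500 to a Consul server.
base = "http://localhost:8500/v1/catalog"

with urllib.request.urlopen(f"{base}/services") as response:
    services = json.load(response)   # e.g. {"consul": [], "consul-http-router": ["http"]}

for name, tags in services.items():
    with urllib.request.urlopen(f"{base}/service/{name}") as response:
        nodes = json.load(response)
    print(f"{name} (tags: {tags}) registered on {len(nodes)} nodes")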

Deploying an application

Now we can deploy an application and we have a wonderful app to do so: the paas-monitor. It is a simple web application which continuously gets the status of a backend service and shows who is responding in a table.

In order to deploy this application we have to create a fleet unit file, which is basically a systemd unit file. It describes all the commands that it needs to execute for managing the life cycle of a unit. The paas-monitor unit file looks like this:


[Unit]
Description=paas-monitor

[Service]
Restart=always
RestartSec=15
ExecStartPre=-/usr/bin/docker kill paas-monitor-%i
ExecStartPre=-/usr/bin/docker rm paas-monitor-%i
ExecStart=/usr/bin/docker run --rm --name paas-monitor-%i --env SERVICE_NAME=paas-monitor --env SERVICE_TAGS=http -P --dns 172.17.42.1 --dns-search=service.consul mvanholsteijn/paas-monitor
ExecStop=/usr/bin/docker stop paas-monitor-%i

It states that this unit should always be restarted, with a 15 second interval. Before it starts, it stops and removes the previous container (ignoring any errors) and when it starts, it runs a docker container - non-detached. This allows systemd to detect that the process has stopped. Finally there is also a stop command.

The file also contains %i: This is a template file which means that more instances of the unit can be started.

In the environment settings of the Docker container, hints for the Registrator are set. The environment variable SERVICE_NAME indicates the name under which it would like to be registered in Consul and the SERVICE_TAGS indicates which tags should be attached to the service. These tags allow you to select the  'http' services in a domain or even from a single container.

If the container were to expose more ports, for instance 8080 and 8081 for http and administrative traffic, you could specify per-port environment variables:

SERVICE_8080_NAME=paas-monitor
SERVICE_8080_TAGS=http
SERVICE_8081_NAME=paas-monitor-admin
SERVICE_8081_TAGS=admin-http

Deploying the file goes in two stages: submitting the template file and starting an instance:

cd fleet-units/paas-monitor
fleetctl submit paas-monitor@.service
fleetctl start paas-monitor@1

Unit paas-monitor@1.service launched on 1cdadb87.../10.230.14.83

Now fleet reports that it is launched, but that does not mean it is running. In the background Docker has to pull the image, which takes a while. You can monitor the progress using fleetctl status.

fleetctl status paas-monitor@1

paas-monitor@1.service - paas-monitor
   Loaded: loaded (/run/fleet/units/paas-monitor@1.service; linked-runtime; vendor preset: disabled)
   Active: active (running) since Tue 2015-03-24 09:01:10 UTC; 2min 48s ago
  Process: 3537 ExecStartPre=/usr/bin/docker rm paas-monitor-%i (code=exited, status=1/FAILURE)
  Process: 3529 ExecStartPre=/usr/bin/docker kill paas-monitor-%i (code=exited, status=1/FAILURE)
 Main PID: 3550 (docker)
   CGroup: /system.slice/system-paas\x2dmonitor.slice/paas-monitor@1.service
           └─3550 /usr/bin/docker run --rm --name paas-monitor-1 --env SERVICE_NAME=paas-monitor --env SERVICE_TAGS=http -P --dns 172.17.42.1 --dns-search=service.consul mvanholsteijn/paas-monitor

Mar 24 09:02:41 ip-10-230-14-83.us-west-2.compute.internal docker[3550]: 85071eb722b3: Pulling fs layer
Mar 24 09:02:43 ip-10-230-14-83.us-west-2.compute.internal docker[3550]: 85071eb722b3: Download complete
Mar 24 09:02:43 ip-10-230-14-83.us-west-2.compute.internal docker[3550]: 53a248434a87: Pulling metadata
Mar 24 09:02:44 ip-10-230-14-83.us-west-2.compute.internal docker[3550]: 53a248434a87: Pulling fs layer
Mar 24 09:02:46 ip-10-230-14-83.us-west-2.compute.internal docker[3550]: 53a248434a87: Download complete
Mar 24 09:02:46 ip-10-230-14-83.us-west-2.compute.internal docker[3550]: b0c42e8f4ac9: Pulling metadata
Mar 24 09:02:47 ip-10-230-14-83.us-west-2.compute.internal docker[3550]: b0c42e8f4ac9: Pulling fs layer
Mar 24 09:02:49 ip-10-230-14-83.us-west-2.compute.internal docker[3550]: b0c42e8f4ac9: Download complete
Mar 24 09:02:49 ip-10-230-14-83.us-west-2.compute.internal docker[3550]: b0c42e8f4ac9: Download complete
Mar 24 09:02:49 ip-10-230-14-83.us-west-2.compute.internal docker[3550]: Status: Downloaded newer image for mvanholsteijn/paas-monitor:latest

Once it is running you can navigate to http://paas-monitor.cargonauts.dutchdevops.net and click on start.


You can now add new instances and watch them appear in the paas-monitor! It definitely takes a while because the Docker images have to be pulled from the registry before they can be started, but in the end they will all appear!

fleetctl start paas-monitor@{2..10}
Unit paas-monitor@2.service launched on 2dde0d31.../10.221.12.109
Unit paas-monitor@4.service launched on f7257c36.../10.237.157.117
Unit paas-monitor@3.service launched on 7f1f2982.../10.230.14.39
Unit paas-monitor@6.service launched on 2dde0d31.../10.221.12.109
Unit paas-monitor@5.service launched on 1cdadb87.../10.230.14.83
Unit paas-monitor@8.service launched on f7257c36.../10.237.157.117
Unit paas-monitor@9.service launched on 1cdadb87.../10.230.14.83
Unit paas-monitor@7.service launched on 7f1f2982.../10.230.14.39
Unit paas-monitor@10.service launched on 2dde0d31.../10.221.12.109

to see all deployed units, use the list-units command

fleetctl list-units
...
UNIT MACHINE ACTIVE SUB
paas-monitor@1.service 94d16ece.../10.90.9.78 active running
paas-monitor@2.service f7257c36.../10.237.157.117 active running
paas-monitor@3.service 7f1f2982.../10.230.14.39 active running
paas-monitor@4.service 94d16ece.../10.90.9.78 active running
paas-monitor@5.service f7257c36.../10.237.157.117 active running
paas-monitor@6.service 7f1f2982.../10.230.14.39 active running
paas-monitor@7.service 7f1f2982.../10.230.14.39 active running
paas-monitor@8.service 94d16ece.../10.90.9.78 active running
paas-monitor@9.service f7257c36.../10.237.157.117 active running
How does it work?

Whenever there is a change to the Consul service registry, the consul-http-router is notified. It selects all http-tagged services and generates a new nginx.conf. After the configuration is generated, it is reloaded by nginx so that there is little impact on the current traffic.
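To get a feel for the data the router reacts to, here is a minimal sketch (not part of the original setup) that queries Consul's catalog API for all services, keeps those tagged 'http' and prints the address and port of each registered instance. The Consul HTTP address (172.17.42.1:8500) and the use of Python 2 with urllib2 are assumptions.

import json
import urllib2

CONSUL = "http://172.17.42.1:8500"  # assumed address of the local Consul agent

# /v1/catalog/services maps each service name to its list of tags
services = json.load(urllib2.urlopen(CONSUL + "/v1/catalog/services"))

for name, tags in services.items():
    if "http" not in tags:
        continue
    # /v1/catalog/service/<name> lists every registered instance of that service
    for instance in json.load(urllib2.urlopen(CONSUL + "/v1/catalog/service/" + name)):
        address = instance["ServiceAddress"] or instance["Address"]
        print("%s %s:%s" % (name, address, instance["ServicePort"]))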

The consul-http-router uses the Go template language to regenerate the config. It looks like this:

events {
    worker_connections 1024;
}

http {
{{range $index, $service := services}}{{range $tag, $services := service $service.Name | byTag}}{{if eq "http" $tag}}

    upstream {{$service.Name}} {
	least_conn;
	{{range $services}}server {{.Address}}:{{.Port}} max_fails=3 fail_timeout=60 weight=1;
	{{end}}
    }
{{end}}{{end}}{{end}}

{{range $index, $service := services}}{{range $tag, $services := service $service.Name | byTag}}{{if eq "http" $tag}}
    server {
	listen 		80;
	server_name 	{{$service.Name}}.*;

	location / {
	    proxy_pass 		http://{{$service.Name}};
	    proxy_set_header 	X-Forwarded-Host	$host;
	    proxy_set_header 	X-Forwarded-For 	$proxy_add_x_forwarded_for;
	    proxy_set_header 	Host 			$host;
	    proxy_set_header 	X-Real-IP 		$remote_addr;
	}
    }
{{end}}{{end}}{{end}}

    server {
	listen		80 default_server;

	location / {
	    root /www;
	    index index.html index.htm Default.htm;
	}
    }
}

It loops through all the services, selects those tagged 'http' and creates a virtual host for servicename.* which sends all requests to the registered upstream services. Using the following two commands you can see the current configuration file:

AMACHINE=$(fleetctl list-machines -fields=machine -no-legend -full | head -1)
fleetctl ssh $AMACHINE docker exec consul-http-router cat /etc/nginx/nginx.conf
...
events {
    worker_connections 1024;
}

http {

    upstream paas-monitor {
	least_conn;
	server 10.221.12.109:49154 max_fails=3 fail_timeout=60 weight=1;
	server 10.221.12.109:49153 max_fails=3 fail_timeout=60 weight=1;
	server 10.221.12.109:49155 max_fails=3 fail_timeout=60 weight=1;
	server 10.230.14.39:49153 max_fails=3 fail_timeout=60 weight=1;
	server 10.230.14.39:49154 max_fails=3 fail_timeout=60 weight=1;
	server 10.230.14.83:49153 max_fails=3 fail_timeout=60 weight=1;
	server 10.230.14.83:49154 max_fails=3 fail_timeout=60 weight=1;
	server 10.230.14.83:49155 max_fails=3 fail_timeout=60 weight=1;
	server 10.237.157.117:49153 max_fails=3 fail_timeout=60 weight=1;
	server 10.237.157.117:49154 max_fails=3 fail_timeout=60 weight=1;

    }

    server {
	listen 		80;
	server_name 	paas-monitor.*;

	location / {
	    proxy_pass 		http://paas-monitor;
	    proxy_set_header 	X-Forwarded-Host	$host;
	    proxy_set_header 	X-Forwarded-For 	$proxy_add_x_forwarded_for;
	    proxy_set_header 	Host 			$host;
	    proxy_set_header 	X-Real-IP 		$remote_addr;
	}
    }

    server {
	listen		80 default_server;

	location / {
	    root /www;
	    index index.html index.htm Default.htm;
	}
    }
}

This also happens when you stop or kill an instance. Just stop an instance and watch your monitor respond.
fleetctl destroy paas-monitor@10
...
Destroyed paas-monitor@10.service
Killing a machine

Now let's be totally brave and stop an entire machine!

ssh core@$FLEETCTL_TUNNEL sudo shutdown -h now
...
Connection to 54.203.141.124 closed by remote host.

Keep watching your paas-monitor. You will notice a slowdown and also notice that a number of backend services are no longer responding. After a short while (1 or 2 minutes) you will see new instances appear in the list.

(screenshot: paas-monitor after restart)

What happened is that Amazon AWS started a new instance in the cluster, and all units that were running on the stopped node have been moved to the running instances, with only 6 HTTP errors!

Please note that CoreOS is not capable of automatically recovering from loss of a majority of the servers at the same time. In that case, manual recovery by operations is required.

Conclusion

CoreOS provides all the basic functionality to manage Docker containers and provides high availability for your application with a minimum of fuss. Consul and Consul templates actually make it very easy to use custom components like NGiNX to implement dynamic service discovery.

Outlook

In the next blog we will be deploying a multi-tier application that uses Consul DNS to connect application parts to databases!

References

This blog is based on information, ideas and source code snippets from  https://coreos.com/docs/running-coreos/cloud-providers/ec2, http://cargonauts.io/mitchellh-auto-dc and https://github.com/justinclayton/coreos-and-consul-cluster-via-terraform

2015 is The Year of Your Launch

Google Code Blog - Mon, 03/23/2015 - 20:01

Posted by Amir Shevat, Google Developers Launchpad Program Manager

With new events, improved courses and an expanded mentorship network - Startup Launch is now Google Developers Launchpad. We’re changing our program name to emphasize how you can use our resources as a launch pad to scale and monetize your app business. Read on to learn about our upcoming events and how you can apply to participate.

Events: Launchpad Week goes global

Launchpad Week, Launchpad’s weeklong in-person bootcamp for early-stage apps, continues to expand, with new 2015 programs planned in Munich, Mexico City, Helsinki, Bogota, and Sydney, to name a few. We’ll also regularly host these events in Tel Aviv, London, Berlin, and Paris.

We kicked off Launchpad Week in Bengaluru, India and Bordeaux, France last month. 32 startups and 80 experts from these communities gathered at Idiom Design Center and Le Node for a week of product, UX, and technology sprints designed to help transform ideas into validated, scalable businesses.

Featured startups from Bengaluru included iReff, an app that helps pre-paid mobile users find the best recharge plan for their specific needs. In Bordeaux, Google Developer Expert David Gageot volunteered as a tech mentor, helping startups “ship early, ship often” through testing and continuous integration.

Events: Google Developers Summits

For later-stage startups, we’re providing some of the best tech experts to help optimize apps for Material Design, Android TV, and Google Cast at two-day Google Developer Summits. At an event in Buenos Aires, Argentina, last week, we had participants such as game developer Etermax, the team behind Trivia Crack. Similar events happened in Kuala Lumpur, Bangkok, and Bengaluru this month, and we’re looking forward to inviting more startups to this program in London, Tokyo, Tel Aviv, and New York in 2015.

Products: Your app, powered by Google

In 2014, we helped over 5,000 developers in 170 countries get their ideas off the ground by providing the infrastructure back-end that allows developers to build incredible products. For example, our program delivered software architecture reviews and Google Cloud Platform credits to help entrepreneurs in the program build businesses that scale with them. Check out how Fansino is using Google Cloud Platform to let artists interact with their fans.

We’ve also expanded our product offer for early-stage startups to include AdWords promotional offers for new accounts. Whatever your monetization plan, we’re making it easy to get started with tools like the new In-app Billing API and instruction from the AdMob team.

Courses: Upskilling you and your app

Starting this month, we’ll offer a virtual curriculum of how Google products can help your startup. We’re kicking things off with new Launchpad Online videos covering Google Analytics - are you observing how your users use your app? How do different promotional channels perform?

The series continues in April 2015 with AdMob products, and will expand with instruction in implementing material design and conducting user research later in the year.

If you can’t wait, we’ve also built courses together with Udacity to take your technical skills to the next level on topics, including Android, Java, Web Fundamentals, and UX.

Apply to get involved

Apply to Google Developers Launchpad program to take advantage of these offers - g.co/launchpad. Here’s to a great launch!

Categories: Programming

Three Simple Rules for Building Data Products that People Will Actually Use

Tim Trefren is one of the founders at Mixpanel, the most advanced analytics platform for web & mobile applications. He has many years of experience building compelling, accessible interfaces to data. To learn more, check out the Mixpanel engineering blog.

Building data products is not easy.

Many people are uncomfortable with numbers, and even more don't really understand statistics. It's very, very easy to overwhelm people with numbers, charts, and tables - and yet numbers are more important than ever. The trend toward running companies in a data-driven way is only growing...which means more programmers will be spending time building data products. These might be internal reporting tools (like the dashboards that your CEO will use to run the company) or, like Mixpanel, you might be building external-facing data analysis products for your customers.

Either way, the question is: how do you build usable interfaces to data that still give deep insights?

We've spent the last 6 years at Mixpanel working on this problem. In that time, we've come up with a few simple rules that apply to almost everyone:

Categories: Architecture

Impact-Driven Scrum, Code Review & #NoEstimates in Methods & Tools Spring 2015 issue

From the Editor of Methods & Tools - Mon, 03/23/2015 - 16:13
Methods & Tools – the free e-magazine for software developers, testers and project managers – has just published its Spring 2015 issue that discusses Impact-Driven Scrum, Code Review, #NoEstimates, Self-Selecting Teams, Software Laws, Kanboard and ConQAT:

* Impact-Driven Scrum Delivery
* Code Review: Why It Matters
* #NoEstimates – Alternative to Estimate-Driven Software Development
* Self-Selecting Teams Part 2 – Keeping the Momentum
* Laws for Software Development Teams
* Kanboard – Open Source Kanban Board
* ConQAT – The Continuous Quality Assessment Toolkit

50 pages of software development knowledge that you can freely download from http://www.methodsandtools.com/mt/download.php?spring15

The Ultimate List of Programming Books

Making the Complex Simple - John Sonmez - Mon, 03/23/2015 - 16:00

Quite often I am asked about the top programming books that I’d recommend all software developers should read. I’ve finally decided to put together a list of the programming books that I find most beneficial and that I think every programmer should read. Now, just like my Ultimate List of Developer Podcasts, this is my […]

The post The Ultimate List of Programming Books appeared first on Simple Programmer.

Categories: Programming

How to write an Amazon RDS service broker for Cloud Foundry

Xebia Blog - Mon, 03/23/2015 - 10:58

Cloud Foundry is a wonderful on-premise PaaS that makes it very easy to build and deploy applications while providing scalability and high availability for your stateless applications. But Cloud Foundry is really an Application Platform Service and does not provide high availability and scalability for your data. Fortunately, there is Amazon RDS, which excels in providing this as a service.

In this blog I will show you how easy it is to build, install and use a Cloud Foundry Service Broker for Amazon RDS.  The broker was developed in Node.JS using the Restify framework and can be deployed as a normal Cloud Foundry application. Finally,  I will point you to a skeleton service broker which you can use as the basis for your own.

Cloud Foundry Service Broker Domain

Before I race off into the details of the implementation, I would like to introduce you to the Cloud Foundry lingo. If you already know the lingo, just skip to the paragraph 'AWS RDS Service Broker operations'.

Service - an external resource that can be used by an application. It can be a database, a messaging system or an external application.  Commonly provided services are mysql, postgres, redis and memcached.

Service Plan - a plan specifies the quality of the service and governs the amount of memory, disk space, nodes, etc. provided with the service.

Service Catalog - a document containing all services and service plans of a service broker.

Service Broker - a program that is capable of creating services and providing the necessary information to applications to connect to the service.

Now a service broker can provide the following operations:

Describe Services - Show me all the services this broker can provide.

Create Service - Creating an instance of a service matching a specified plan. When the service is a database, it depends on the broker what this means: It may create an entire database server, or just a new database instance, or even just a database schema.   Cloud Foundry calls this 'provisioning a service instance'.

Binding a Service - providing a specific application with the necessary information to connect to an existing service. When the service is a database, it provides the hostname, port, database name, username and password. Depending on the service broker, the broker may even create specific credentials for each bind request/application. The Cloud Controller will store the returned credentials in a JSON document stored as a UNIX environment variable (VCAP_SERVICES).

Unbind service - depending on the service broker, undo what was done on the bind.

Destroy Service - Easy, just deleting what was created. Cloud Foundry calls this 'deprovisioning a service instance'.
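To illustrate the binding description above: after a bind, an application can read its credentials back from the VCAP_SERVICES variable. The sketch below is not part of the broker; the 'mysql' service label and the credential field names are assumptions that depend on what your broker returns.

import json
import os

# VCAP_SERVICES is a JSON document keyed by service label,
# with one entry per bound service instance
vcap = json.loads(os.environ["VCAP_SERVICES"])

for binding in vcap.get("mysql", []):  # the 'mysql' label is an assumption
    creds = binding["credentials"]
    # the field names below are assumptions; they depend on the broker
    print("%s %s:%s (%s)" % (binding["name"], creds.get("host"),
                             creds.get("port"), creds.get("username")))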

In the next paragraph I will map these operations to Amazon AWS RDS services.

AWS RDS Service Broker operations

Any service broker has to implement the REST API of the Cloud Foundry service broker specification. To create the Amazon AWS RDS service broker, I had to implement four out of the five methods:

  • describe services - returns available services and service plans
  • create service - call the createDBInstance operation and store generated credentials as tags on the instance.
  • bind service - call the describeDBInstances operation and return the stored credentials.
  • delete service - just call the deleteDBInstance operation.

I implemented these REST calls using the Restify framework and the Amazon AWS RDS API for JavaScript. The skeleton looks like this:

// get catalog
server.get('/v2/catalog', function(request, response, next) {
    response.send(config.catalog);
    next();
});

// create service
server.put('/v2/service_instances/:id', function(request, response, next) {
        response.send(501, { 'description' : 'create/provision service not implemented' });
        next();
    });

// delete service
server.del('/v2/service_instances/:id', function(req, response, next) {
        response.send(501, { 'description' : 'delete/unprovision service not implemented' });
        next();
    });

// bind service
server.put('/v2/service_instances/:instance_id/service_bindings/:id', function(req, response, next) {
        response.send(501, { 'description' : 'bind service not implemented' });
        next();
});

// unbind service
server.del('/v2/service_instances/:instance_id/service_bindings/:id', function(req, response, next) {
    response.send(501, { 'description' : 'unbind service not implemented' });
    next();
});

For the actual implementation of each operation on AWS RDS, I would like to refer you to the source code of aws-rds-service-broker.js on github.com.

Design decisions

That does not look too difficult, does it? Here are some of my design decisions:

Where do I store the credentials?

I store the credentials as tags on the instance. I wanted to create a service broker that was completely stateless so that I could deploy it in Cloud Foundry itself. I did not want to be dependent on a complete database for a little bit of information, and the tags seemed to fit the purpose.
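For illustration only, this is roughly what storing and reading credentials as RDS tags looks like. The original broker does this with the AWS SDK for JavaScript, so the Python/boto3 calls, the region and the instance ARN below are merely assumptions used to show the idea.

import boto3

rds = boto3.client("rds", region_name="eu-west-1")  # region is an assumption
arn = "arn:aws:rds:eu-west-1:123456789012:db:cfdb-3529e5764"  # hypothetical instance ARN

# store the generated credentials on the instance itself, keeping the broker stateless
rds.add_tags_to_resource(ResourceName=arn, Tags=[
    {"Key": "username", "Value": "root"},
    {"Key": "password", "Value": "generated-at-create-time"},
])

# on bind, read them back instead of keeping state in a database
tags = rds.list_tags_for_resource(ResourceName=arn)["TagList"]
credentials = {tag["Key"]: tag["Value"] for tag in tags}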

Why does bind return the same credentials for every bind?

I wanted the bind service to be as simple as possible. I did not want to generate new user accounts and passwords, because if I did, I would have even more state to maintain. What is more, I found that if I bind two applications to the same MySQL service, they can see each other's data, so why bother creating users per bind? Finally, keeping the bind service simple kept the unbind service even simpler, because there is nothing to undo.

How to implement different service plans?

The createDBInstance operation of the AWS RDS API takes a JSON object as input parameter that is basically the equivalent of a plan. I just had to add an appropriate JSON record to the configuration file for each plan. See the description of the params parameter of the createDBInstance operation.
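To make that concrete, a plan record might carry parameters such as the ones below. The parameter names come from the RDS createDBInstance API; the concrete values are assumptions (the real records live in the broker's configuration file on GitHub) and the snippet is written as a Python dict purely for illustration.

# hypothetical 'default' plan: a small, non-HA MySQL database
default_plan_params = {
    "DBInstanceClass": "db.t2.micro",   # instance size is an assumption
    "AllocatedStorage": 5,              # 5 GB, matching the plan description
    "Engine": "mysql",
    "MultiAZ": False,                   # the HA plan would set this to True
    "DBSubnetGroupName": "stackato-db-subnet-group",
}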

How do I create an AWS RDS service within 60 seconds?

Well, I don't. The service broker API states that you have to create a service within the timeout of the cloud controller (which is 60 seconds), but RDS takes a wee bit more time. So the create request is initiated within seconds, but it may take a few minutes before you can bind an application to it. Nothing I can do about that.

Why store the service broker credentials in environment variables?

I want the service broker to be configured at deployment time. When the credentials are in the config file, you need to change the files of the application on each deployment.

Installation

In these instructions, I presume you have access to an AWS account and an installation of Cloud Foundry. I used Stackato, which is a Cloud Foundry implementation by ActiveState. These instructions assume you are using it too!

  1. Create an AWS IAM user
    First create an AWS IAM user (cf-aws-service-broker) with at least the following privileges.
  2. Assign privileges to execute AWS RDS operations
    The newly created IAM user needs the privileges to create RDS databases. I used the following permissions:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
             "rds:AddTagsToResource",
             "rds:CreateDBInstance",
             "rds:DeleteDBInstance",
             "rds:DescribeDBInstances",
             "rds:ListTagsForResource"
          ],
          "Resource": [
             "*"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
             "iam:GetUser"
          ],
          "Resource": [
              "*"
          ]
        }
      ]
    }
    
  3. Generate AWS access key and secret for the user 'cf-aws-service-broker'
  4. Create a database subnet group
    Create a database subnet group 'stackato-db-subnet-group' in the AWS region where you want the databases to be created.
  5. Check out the service broker
    git clone https://github.com/mvanholsteijn/aws-rds-service-broker
    cd aws-rds-service-broker
    
  6. Add all your parameters as environment variables to the manifest.yml
    applications:
       - name: aws-rds-service-broker
         mem: 256M
         disk: 1024M
         instances: 1
         env:
           AWS_ACCESS_KEY_ID: <fillin>
           AWS_SECRET_ACCESS_KEY: <fillin>
           AWS_REGION: <of db subnet group,eg eu-west-1>
           AWS_DB_SUBNET_GROUP: stackato-db-subnet-group
           SERVICE_BROKER_USERNAME: <fillin>
           SERVICE_BROKER_PASSWORD: <fillin>
         stackato:
           ignores:
             - .git
             - bin
             - node_modules
    
  7. Deploy the service broker
    stackato target <your-service-broker> --skip-ssl-validation
    stackato login
    push
    
  8. Install the service broker
    This script is a cunning implementation which creates the service broker in Cloud Foundry and makes all the plans publicly available. In Stackato we use curl commands to achieve this. The script requires you to have installed jq, the wonderful JSON command line processor by Stephen Dolan.

    bin/install-service-broker.sh
    

Now you can use the service broker!

Using the Service Broker

Now we are ready to use the service broker.

  1. Deploy a sample application
    $ git clone https://github.com/mvanholsteijn/paas-monitor
    $ stackato push -n 
    
  2. Create a service for the mysql services
    $ stackato create-service
    1. filesystem 1.0, by core
    2. mysql
    3. mysql 5.5, by core
    4. postgres
    5. postgresql 9.1, by core
    6. redis 2.8, by core
    7. user-provided
    Which kind to provision:? 2
    1. 10gb: 10Gb HA MySQL database.
    2. default: Small 5Gb non-HA MySQL database
    Please select the service plan to enact:? 2
    Creating new service [mysql-844b1] ... OK
    
  3. Bind the service to the application
    stackato bind-service mysql-844b1 paas-monitor
      Binding mysql-844b1 to paas-monitor ... Error 10001: Service broker error: No endpoint set on the instance 'cfdb-3529e5764'. The instance is in state 'creating'. please retry a few minutes later (500)
    

    retry until the database is actually created (3-10 minutes on AWS)

    stackato bind-service mysql-844b1 paas-monitor
     Binding mysql-844d1 to paas-monitor ...
    Stopping Application [paas-monitor] ... OK
    Starting Application [paas-monitor] ...
    OK
    http://paas-monitor.<your-api-endpoint>/ deployed
    
  4. Check the environment of the application
    curl -s http://paas-monitor.<your-api-endpoint>/environment | jq .DATABASE_URL
    "mysql://root:e1zfMf7OXeq3@cfdb-3529e5764.c1ktcm2kjsfu.eu-central-1.rds.amazonaws.com:3306/mydb"
    

    As you can see, the credentials for the newly created database have been inserted into the environment of the application.

Creating your own service broker

If you want to create your own service broker in Node.JS you may find the Skeleton Service Broker  a good starting point. It includes a number of utilities to test your broker in the bin directory.

  • catalog.sh - calls the catalog operation
  • provision.sh - calls the create operation
  • unprovision.sh - call the delete operation
  • bind.sh - calls the bind operation on a specified instance
  • unbind.sh - calls the unbind operation on a specified instance and bind id.
  • list.sh - calls the list all service instances operation
  • getenv.sh - gets the environment variables of a CF application as sourceable output
  • install-service-broker.sh - installs the application and makes all plans public.
  • docurl.sh - calls the stackato CURL operation.

getenv.sh, install-service-broker.sh and provision.sh require jq to be installed.

Conclusion

As you can see, it is quite easy to create your own Cloud Foundry service broker!

By: thecribb.com » Business Analyst resource guide

Software Requirements Blog - Seilevel.com - Mon, 03/23/2015 - 03:11

[…] Seilevel Blog – Visit this site for relevant and timely articles on Business Analysis. The authors are business analysts who write about their work and what they’ve learnt on the job. The tips you get from this site are practical and can be applied to your projects. […]

Categories: Requirements

Python: Equivalent to flatMap for flattening an array of arrays

Mark Needham - Mon, 03/23/2015 - 01:45

I found myself wanting to flatten an array of arrays while writing some Python code earlier this afternoon and being lazy my first attempt involved building the flattened array manually:

episodes = [
    {"id": 1, "topics": [1,2,3]},
    {"id": 2, "topics": [4,5,6]}
]
 
flattened_episodes = []
for episode in episodes:
    for topic in episode["topics"]:
        flattened_episodes.append({"id": episode["id"], "topic": topic})
 
for episode in flattened_episodes:
    print episode

If we run that we’ll see this output:

$ python flatten.py
 
{'topic': 1, 'id': 1}
{'topic': 2, 'id': 1}
{'topic': 3, 'id': 1}
{'topic': 4, 'id': 2}
{'topic': 5, 'id': 2}
{'topic': 6, 'id': 2}

What I was really looking for was the Python equivalent to the flatmap function which I learnt can be achieved in Python with a list comprehension like so:

flattened_episodes = [{"id": episode["id"], "topic": topic}
                      for episode in episodes
                      for topic in episode["topics"]]
 
for episode in flattened_episodes:
    print episode

We could also choose to use itertools in which case we’d have the following code:

from itertools import chain, imap
flattened_episodes = chain.from_iterable(
                        imap(lambda episode: [{"id": episode["id"], "topic": topic}
                                             for topic in episode["topics"]],
                             episodes))
for episode in flattened_episodes:
    print episode

We can then simplify this approach a little by wrapping it up in a ‘flatmap’ function:

def flatmap(f, items):
        return chain.from_iterable(imap(f, items))
 
flattened_episodes = flatmap(
    lambda episode: [{"id": episode["id"], "topic": topic} for topic in episode["topics"]], episodes)
 
for episode in flattened_episodes:
    print episode

I think the list comprehensions approach still works but I need to look into itertools more – it looks like it could work well for other list operations.

Categories: Programming

SPaMCAST 334 – Mario Lucero, It’s All About Agile Coaching

www.spamcast.net

Listen Now

Subscribe on iTunes

In this episode of the Software Process and Measurement Cast we feature our interview with Agile coach Mario Lucero.  Mario and I discussed the nuts and bolts of coaching Agile teams, what is and isn’t Agile and the impact of coaching on success. Mario provided insights on Agile that span both Americas!

Mario describes himself as an Agile evangelist (including Kanban) delivering coaching for Agile transformations and Scrum mastery. He performs as a Scrum Master for several teams while mentoring and coaching other teams, Scrum Masters and product owners.

Mario is as comfortable advising senior management on the Agile transformation strategy and implementation as he is working with teams.

Email: metlucero@gmail.com

Twitter: @metlucero

Blog:  http://mariolucero.cl/

LinkedIn: http://cl.linkedin.com/in/luceromet/en

Call to action!

Can you tell a friend about the podcast? If your friends don't know how to subscribe or listen to a podcast, show them how you listen and subscribe them!  Remember to send us the name of the person you subscribed (and a picture) and I will give both you and the horde you have converted to listeners a call out on the show.

Re-Read Saturday News

The Re-Read Saturday focus on Eliyahu M. Goldratt and Jeff Cox's The Goal: A Process of Ongoing Improvement began on February 21st. The Goal has been hugely influential because it introduced the Theory of Constraints, which is central to lean thinking. The book is written as a business novel. Visit the Software Process and Measurement Blog and catch up on the re-read.

Note: If you don’t have a copy of the book, buy one.  If you use the link below it will support the Software Process and Measurement blog and podcast.

Dead Tree Version or Kindle Version 

I am beginning to think of which book will be next. Do you have any ideas?

Upcoming Events

CMMI Institute Conference EMEA 2015
March 26 -27 London, UK
I will be presenting “Agile Risk Management.”
http://cmmi.unicom.co.uk/

QAI Quest 2015
April 20 -21 Atlanta, GA, USA
Scale Agile Testing Using the TMMi
http://www.qaiquest.org/2015/

DCG will also have a booth!

Next SPaMCast

The next Software Process and Measurement Cast will feature our essay on the definitions of four critical words.  What do the words effectiveness, efficiency, frameworks and methodologies really mean?  These words get used ALL the time, however they really do have fairly specific meanings.  Meanings that, once understood and used to guide how we work, can help everyone to deliver more value and make our customers more satisfied!

Shameless Ad for my book!

Mastering Software Project Management: Best Practices, Tools and Techniques co-authored by Murali Chematuri and myself and published by J. Ross Publishing. We have received unsolicited reviews like the following: “This book will prove that software projects should not be a tedious process, neither for you or your team.” Support SPaMCAST by buying the book here.

Available in English and Chinese.


Categories: Process Management


Capabilities Based Planning First Then Requirements

Herding Cats - Glen Alleman - Sun, 03/22/2015 - 16:23

When I hear about requirements churn, bad requirements management (which is really bad business management), or emergent requirements that turn over 20% a month for a complete turnover in 4 months, it's clear there is a serious problem in understanding how to manage the development of a non-trivial project.

Let's start here. Start with the question: what capabilities does this project need to produce when it is done? The order of the capabilities is dependent not only on the business's ability to absorb each capability, but also on the value stream of those capabilities in support of the business strategy.

That picture at the bottom shows a value stream of capabilities for a health insurance provider network system. The notion of INVEST in agile has to be tested for any project. Dependencies exist and are actually required for enterprise projects. See the flow of capabilities chart below. Doing work in independent order would simply not work. 

Once we have the needed capabilities, and know their dependencies, we can determine - from the business strategy - what order they need to be delivered.
The Point

When you hear about all the problems with requirements - or anything to do with software development - stop and remember: it is trivial to point out problems. The classical example of this trivial approach is "estimates are the smell of dysfunction." This approach is a Dilbert-cartoon management method. It's not only lame, it's not managing projects as an adult. Adults don't whine, they provide solutions.

So here's a place to start with requirements management. Each of these books informs our Command Media for requirements elicitation and management for software intensive systems. As well, professional journals provide up to date guidance. There are also tools for requirements management. But don't start with tools, start with a process. Analytic Hierarchy Process (AHP) is my favorite. There is no reason to not have a credible requirements process - don't let the whiners dominate the conversation. Provide solutions to the problem.

Related articles

Why We Need Governance
I Think You'll Find It's a Bit More Complicated Than That
The Use, Misuse, and Abuse of Complexity and Complex
Categories: Project Management

Quote of the Day

Herding Cats - Glen Alleman - Sun, 03/22/2015 - 15:40

Science is the great antidote to the poison of enthusiasm and superstition.
- Adam Smith, The Wealth of Nations

If you hear a conjecture or a claim that sounds like it is not what you were taught in school, doesn't seem to make sense in a common sense way, or appears to violate established principles of science, math, or business - ask for the numbers.

Categories: Project Management

Python: Simplifying the creation of a stop word list with defaultdict

Mark Needham - Sun, 03/22/2015 - 02:51

I've been playing around with topic models again and recently read a paper by David Mimno which suggested the following heuristic for working out which words should go onto the stop list:

A good heuristic for identifying such words is to remove those that occur in more than 5-10% of documents (most common) and those that occur fewer than 5-10 times in the entire corpus (least common).

I decided to try this out on the HIMYM dataset that I’ve been working on over the last couple of months.

I started out with the following code to build a dictionary of words, their total occurrences and the episodes they’d been used in:

import csv
from sklearn.feature_extraction.text import CountVectorizer
from collections import defaultdict
 
episodes = defaultdict(str)
with open("sentences.csv", "r") as file:
    reader = csv.reader(file, delimiter = ",")
    reader.next()
    for row in reader:
        episodes[row[1]] += row[4]
 
vectorizer = CountVectorizer(analyzer='word', min_df = 0, stop_words = 'english')
matrix = vectorizer.fit_transform(episodes.values())
features = vectorizer.get_feature_names()
 
words = {}
for doc_id, doc in enumerate(matrix.todense()):
    for word_id, score in enumerate(doc.tolist()[0]):
        word = features[word_id]
        if not words.get(word):
            words[word] = {}
 
        if not words[word].get("score"):
            words[word]["score"] = 0
        words[word]["score"] += score
 
        if not words[word].get("episodes"):
            words[word]["episodes"] = set()
 
        if score > 0:
            words[word]["episodes"].add(doc_id)

This works fine, but the code inside the last for block is ugly and most of it is handling the case when parts of a dictionary aren't yet initialised, which is defaultdict territory. You'll notice I am using defaultdict in the first part of the code but not in the second, as I'd struggled to get it working.

This was my first attempt to make the ‘words’ variable based on it:

>>> words = defaultdict({})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: first argument must be callable

We can see why this doesn’t work if we try to evaluate ‘{}’ as a function which is what defaultdict does internally:

>>> {}()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'dict' object is not callable

Instead what we need to do is pass in 'dict':

>>> dict()
{}
 
>>> words = defaultdict(dict)
 
>>> words
defaultdict(<type 'dict'>, {})

That simplifies the first bit of the loop:

words = defaultdict(dict)
for doc_id, doc in enumerate(matrix.todense()):
    for word_id, score in enumerate(doc.tolist()[0]):
        word = features[word_id]
        if not words[word].get("score"):
            words[word]["score"] = 0
        words[word]["score"] += score
 
        if not words[word].get("episodes"):
            words[word]["episodes"] = set()
 
        if score > 0:
            words[word]["episodes"].add(doc_id)

We've still got a couple of other places to simplify, though, which we can do by defining a custom function and passing that into defaultdict:

def default_dict_function():
   return {"score": 0, "episodes": set()}
 
>>> words = defaultdict(default_dict_function)
 
>>> words
defaultdict(<function default_dict_function at 0x10963fcf8>, {})

And here’s the final product:

def default_dict_function():
   return {"score": 0, "episodes": set()}
words = defaultdict(default_dict_function)
 
for doc_id, doc in enumerate(matrix.todense()):
    for word_id, score in enumerate(doc.tolist()[0]):
        word = features[word_id]
        words[word]["score"] += score
        if score > 0:
            words[word]["episodes"].add(doc_id)

After this we can write out the words to our stop list:

with open("stop_words.txt", "w") as file:
    writer = csv.writer(file, delimiter = ",")
    for word, value in words.iteritems():
        # appears in > 10% of episodes
        if len(value["episodes"]) > int(len(episodes) / 10):
            writer.writerow([word.encode('utf-8')])
 
        # less than 10 occurences
        if value["score"] < 10:
            writer.writerow([word.encode('utf-8')])
Categories: Programming

Python: Forgetting to use enumerate

Mark Needham - Sun, 03/22/2015 - 02:28

Earlier this evening I found myself writing the equivalent of the following Python code while building a stop list for a topic model…

words = ["mark", "neo4j", "michael"]
word_position = 0
for word in words:
   print word_position, word
   word_position +=1

…which is very foolish given that there’s already a function that makes it really easy to grab the position of an item in a list:

for word_position, word in enumerate(words):
   print word_position, word
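As a side note that is not in the original post, enumerate also accepts an optional start value if you want the numbering to begin somewhere other than 0:

for word_position, word in enumerate(words, 1):
   print word_position, word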

Python does make things extremely easy at times – you’re welcome future Mark!

Categories: Programming