Software Development Blogs: Programming, Software Testing, Agile Project Management


Feed aggregator

A Short Note On How the Wayback Machine Stores More Pages than Stars in the Milky Way

How does the Wayback Machine work? Now with over 400 billion webpages indexed, allowing the Internet to be browsed all the way back to 1996, it's an even more compelling question. I've looked several times but I've never found a really good answer.

Here's some information from a thread on Hacker News. It starts with mmagin, a former Archive employee:

Categories: Architecture

3 Simple Techniques to Make APIs Easier to Use and Understand

Making the Complex Simple - John Sonmez - Mon, 05/19/2014 - 16:00

How many times have you tried to use an API only to find that you had to fill in some ridiculous number of parameters with values that you had no idea about? If you’ve ever done Windows programming and had to call into some of the Win32 APIs, I’m sure you’ve experienced this pain. Do […]

The post 3 Simple Techniques to Make APIs Easier to Use and Understand appeared first on Simple Programmer.

Categories: Programming

Top 200 Management & Leadership Authors (Maybe Not Experts)

NOOP.NL - Jurgen Appelo - Mon, 05/19/2014 - 12:57

Several weeks ago, I wanted to know who the most interesting management & leadership writers are (worldwide, in the English language). Which authors should I follow? Which blogs should I read? Which books should I add to my backlog? Which writers have a great reputation? Of course, a not unreasonable secondary objective was to learn how my work compares to that of the others. As I always say, how do you know you’re making progress when you don’t measure?

The post Top 200 Management & Leadership Authors (Maybe Not Experts) appeared first on NOOP.NL.

Categories: Project Management

SPaMCAST 290 – Jan Beaver, The Agile Team Handbook

Software Process and Measurement Cast - Sun, 05/18/2014 - 22:00

Listen to the Software Process and Measurement Cast 290. SPaMCAST 290 features our interview with Jan Beaver, author of The Agile Team Handbook. Jan’s book provides team members with the resources needed not only to become Agile but to practice Agile.

Jan Beaver is a Ph.D. educator with over 25 years of experience in the software industry. His experience covers the gamut of management, development, QA, and technical writing. His first practical exposure to Scrum was a dramatic eye-opening experience that for the first time demonstrated that there really was a better way not just to develop software, but a better way to work in general.

Jan is an expert Agile trainer and coach who has worked with a wide variety of companies in an equally wide variety of industries including telecommunications, medical, insurance, financial, media, utilities, and smart-grid energy. He specializes in bringing Scrum and broader Agile and Lean principles and practices to teams, helping them to become effective, productive, and self-sustaining. He also works at the enterprise level using both training and coaching to help organizations become successful in their Agile practice.

Jan recently brought his passion for teams and teamwork to bear in a concise, content-rich volume. Applying Agile values and principles through Scrum practices provides teams and organizations with a roadmap for success in today's challenging business environment. The Agile Team Handbook is a practical, hands-on guide to building great teams and great organizations.

Buy The Agile Team Handbook (I did!)

Contact Jan:
Twitter: @Jan_Beaver
Website: http://www.greyrockagile.com/home.html
Email: jbeaver@greyrockagile.com

I have shortened the introduction of the cast this week. I would like your feedback. Get in touch with us anytime or leave a comment here on the blog. Help support the SPaMCAST by reviewing and rating it on iTunes. It helps people find the cast. Like us on Facebook while you’re at it.

Next week we will feature our essay on splitting user stories. User stories are a powerful tool used by many Agile teams to conceptualize the value they will deliver. Generally, captured stories range from granular units of work that can be accomplished in a day to gargantuan features that will require multiple sprints to complete (epics). Epics are very difficult to work with; therefore, all Agile teams need techniques for splitting user stories into smaller units of work. Next week we will feature hands-on practical advice.

Upcoming Events

ITMPI Webinar!

On June 3 I will be presenting the webinar titled “Rescuing a Troubled Project With Agile.” The webinar will demonstrate how Agile can be used to rescue troubled projects. You will learn how to recognize that a project is in trouble and how the discipline, focus, and transparency of Agile can promote recovery. Register now!

 

Upcoming DCG Webinars:
May 22 11:30 EDT – Agile User Stories
June 19 11:30 EDT – How To Split User Stories
July 24 11:30 EDT – The Impact of Cognitive Bias On Teams

Check these out at www.davidconsultinggroup.com

I look forward to seeing or hearing all SPaMCAST readers and listeners at all of these great events!

The Software Process and Measurement Cast has a sponsor.

As many of you know, I do at least one webinar for the IT Metrics and Productivity Institute (ITMPI) every year. The ITMPI provides a great service to the IT profession. ITMPI's mission is to pull together the expertise and educational efforts of the world's leading IT thought leaders and to create a single online destination where IT practitioners and executives can meet all of their educational and professional development needs. The ITMPI offers a premium membership that gives members unlimited free access to 400 PDU-accredited webinar recordings, and waives the PDU processing fees on all live and recorded webinars. The Software Process and Measurement Cast gets some support if you sign up here. All the revenue our sponsorship generates goes toward bandwidth, hosting and new cool equipment to create more and better content for you. Support the SPaMCAST and learn from the ITMPI.

 

Shameless Ad for my book!

Mastering Software Project Management: Best Practices, Tools and Techniques, co-authored by Murali Chematuri and myself and published by J. Ross Publishing, has received unsolicited reviews like the following: "This book will prove that software projects should not be a tedious process, neither for you or your team." Support SPaMCAST by buying the book here.

Available in English and Chinese.

Categories: Process Management

Words Matter: Errors, Defects, and Failures!

Call them bunnies or opportunities or whatever…

Every once in a while the person sitting next to me on a flight home works in IT. We generally trade war stories, the human version of dogs meeting on the street.  Recently a seatmate described an environment in which defects and production problems had been renamed as opportunities. Discovering opportunities was considered to be a positive and the organization seemed to have a lot of them.

Every profession has a common language, and that language tends to be built into common processes and frameworks. Common industry language generates a specific behavioral bias. Testing and software development are no different. The product of software development, enhancement and maintenance activities is software. In development and testing, things that go wrong are typically called errors, defects and failures. Each has an industry-standard meaning.

Errors can be generated by human and external factors. In most IT departments human factors are the most significant source of errors, because humans write software. People make mistakes for reasons ranging from the complexity of the business process to distractions caused when someone in the next cube spills coffee on their lap. The bottom line is that we make errors that produce incorrect results. Calling mistakes opportunities, or something else with a similar positive spin, changes the behavioral bias from something to avoid to something to embrace.

Errors can translate into defects (also known as bugs). In software, a defect is a flaw that can cause the code to fail to perform as required. As noted in the Principles of Testing, not all defects are discovered, and some never manifest. I have spent more than a few nights poring over code that had not been changed in years, only for it to finally be exposed to a strange set of conditions that had never been seen before. Many years ago I was working for a credit card processing firm that discovered that if the same person bought the same item costing 99 cents, 100 times in a row, using a credit card, our file maintenance system would fail spectacularly. Finding and fixing that bug funded at least one coffee plantation.

Defects that occur when the code is executed, and that represent a difference between what the application is supposed to do and what actually happens, are termed failures. The defect in the credit card file maintenance system existed for several years before the fateful night that someone ordered 100 99-cent items on The Home Shopping Network at 1 AM (what was even stranger was that the same person did the same thing the next day at approximately the same time). As soon as the defect actually occurred, it became a failure.

Mistakes, defects and failures, whether generated by human factors or external factors (e.g. pollution, radiation or a Superstorm Sandy), are in some sense opportunities to learn and refine how we practice our profession. During the Software Testing Foundations class I recently took, the theme of avoiding industry-standard definitions came up, because words like mistakes, defects and failures can cause poor behavior (e.g. defect hiding, finger pointing or team strife). Dawn Haynes, the class instructor and industry consultant, recounted a story of an organization that once called defects and failures “bunnies” in an attempt to avoid negativity. Like my seatmate’s company, they found that they had lots of bunnies, and finding and removing bunnies was not taken very seriously. Renaming mistakes, defects and failures to opportunities or bunnies trivializes the efforts of everyone who spends time reviewing, testing and monitoring software execution. I would rather focus my creativity on learning and improving how value is delivered than on finding neutral or happy terms to describe errors, defects and failures.


Categories: Process Management

To Stay In Business You Need to Know Both Value and Cost

Herding Cats - Glen Alleman - Sun, 05/18/2014 - 00:08

I hear all the time: in agile, we focus on value generation. Any viable business focuses on producing value for its products or services. Saying that is a tautology. There can be no viable business without some kind of value proposition that is acceptable to enough customers to produce enough revenue to cover the costs of producing that value.

But if you don't know the cost of that value, the cost to produce it, how to manage that cost, and how to control costs during the value production process, there is no sustainable business.

And by the way, the date on which that value will appear matters too, since the revenue or benefit stream needed to pay for the cost of that value needs to start at the planned time to produce the planned business performance needed to stay in business. Nice products late, or nice products at too much cost, is a going-out-of-business strategy.

The next time you hear "we focus on value first," call BS. Focusing on value in the absence of focusing on the cost of that value is a going-out-of-business strategy (unless you're a sovereign).

In the bottom line management of any viable business you get a divide by zero error if you don't know the cost of the value produced.

Since both value and cost are random variables in any development business, and random variables in any production business as well, we need estimates for both cost and value at the same time. These estimates need confidence levels: cumulative probabilities of being at or below a cost, and on or before a date. And we need an understanding of how the work processes drive these random variables as the project or service proceeds.

It can't be said any more clearly:

Both Value and the Cost of that Value are needed for any hope of staying in Business

So don't let those asserting "we focus on value" get away with that weak statement. Producing products for money is driven by the business processes first. Those producing those products need to get paid.

And when we hear "don't estimate, just budget," then we're fixing one of the three variables of all projects, and the others are still free to vary in random ways. These need to be estimated as well, otherwise that fixed budget may or may not be enough to deliver the needed capabilities on the needed date to deliver that needed value.


Related articles: Four Critical Elements of Project Success; Everything is a Random Variable
Categories: Project Management

Wrestling with the Browser Event Queue

Xebia Blog - Sat, 05/17/2014 - 16:03

In this post I want to show some things I’ve learned recently about the browser event queue.
It all started with the following innocent looking problem:

I was building a single page web application, and on one form the user should have two options to enter a stock order.
The options were represented as radio buttons, each of them having its own input field. The radio buttons gave the user the choice: he could either enter an order amount, or an order quantity.
Both fields should have their own specific input validations, which should be triggered when the user leaves the input field.

Here's the HTML snippet:

  <form>
    <input type="radio" name="options" id="option_amount"/><label for="option_amount">Amount</label>
    <input id="option_amount_input"/>
    <input type="radio" name="options" id="option_quantity"/><label for="option_quantity">Quantity</label>
    <input id="option_quantity_input"/>
  </form>

And the JavaScript:

function validate_amount() {
    // Use val() to read an input element's value; text() is for element content
    if(!$.isNumeric($("#option_amount_input").val())) {
      alert("Please enter a valid amount");
    }
}

$("#option_amount_input").on("blur", validate_amount);

All was working fine, except for one scenario: what if the user entered an amount incorrectly, but then decided he wanted to enter a quantity instead, so he clicked the quantity option?
In that case I wouldn’t want the validation for the amount to go off.

Here is my first attempt to fix this scenario:

var shouldValidateAmount = true;

function validate_amount() {
    setTimeout(delayedAmountValidation, 0);
}

function click_quantity() {
    shouldValidateAmount = false;
}

function click_amount() {
    shouldValidateAmount = true;
}

function delayedAmountValidation() {
    if(shouldValidateAmount) {
         if(!$.isNumeric($("#option_amount_input").val())) {
            alert("Please enter a valid amount");
         }
    }
}

$("#option_amount_input").on("blur", validate_amount);
$("#option_quantity").on("click", click_quantity);
$("#option_amount").on("click", click_amount);

The idea is that when the user leaves the amount field, the amount validation will be triggered, but only after a timeout. The timeout is zero, so the validation is simply put at the end of the event queue.
That way, before the validation executes, the click handler for the quantity radio option will be handled first, which will cancel any postponed validation (or so I thought).

But this didn’t work. For some reason the postponed validation was still being executed before the click handler.
Only if I adjusted the timout delay until it was high enough, in my case that was 70ms, it would be executed after the click handler. But that would be a very brittle solution of course.

Here's a diagram showing how the event queue handled my solution:

[Diagram: event queue]
So what to do about this? Time to look at the browser event queue in some more depth.

It turns out that the browser considers certain groups of events as batches: groups of events that logically belong together.
An example is a blur event for one input combined with a focus-gained event for another input.
Timeouts scheduled from within a batch have to wait until all the events from their batch are handled, and only then get a chance to execute.

A blur event and a click event on another element apparently do not belong to the same batch. Therefore my timeout was executed immediately after the blur event was handled, and before the click event handler.
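
To make the ordering visible, here is a minimal logging sketch (assuming jQuery and the markup above). Given the batching behavior just described, leaving the amount field by clicking the quantity radio should log blur, then mousedown, then the timeout, and finally click:

// Minimal logging sketch: per the batch behavior described above, blur and
// mousedown share a batch, so the zero-delay timeout scheduled in the blur
// handler fires after mousedown but before the click handler.
$("#option_amount_input").on("blur", function () {
    console.log("blur handled");
    setTimeout(function () { console.log("timeout handled"); }, 0);
});
$("#option_quantity").on("mousedown", function () { console.log("mousedown handled"); });
$("#option_quantity").on("click", function () { console.log("click handled"); });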

Listening for the focus event on the radio option was no solution, of course, because the option gaining focus didn’t necessarily mean it was selected; the user could have tabbed there as well.
But it turns out that there are more events than just focus that belong to the same batch as blur: mousedown and mouseenter events!

Here is the final solution, which listens for a mouseDown event on the radio button:

[Diagram: event queue batch]
And the code:

var shouldValidateAmount = true;

function validate_amount() {
    setTimeout(delayedAmountValidation, 0);
}

function click_quantity() {
    shouldValidateAmount = false;
}

function click_amount() {
    shouldValidateAmount = true;
}

function delayedAmountValidation() {
    if(shouldValidateAmount) {
         if(!$.isNumeric($("#option_amount_input").val())) {
            alert("Please enter a valid amount");
         }
    }
}

$("#option_amount_input").on("blur", validate_amount);
$("#option_quantity").on("mousedown", click_quantity);
$("#option_amount").on("mousedown", click_amount);

References:
Link to Mozilla doc on events and timing

Get Up And Code 054: Austin Dimmer On Ergonomics

Making the Complex Simple - John Sonmez - Sat, 05/17/2014 - 15:00

Throwback time. (I know, it’s not Thursday, but…) This was an older episode I recorded when Iris was still on the show. Been saving this one, but decided to release it this week. So, here you go. Full transcript show John: Hey everyone, welcome back to Get Up and CODE. I’m John Sonmez and in […]

The post Get Up And Code 054: Austin Dimmer On Ergonomics appeared first on Simple Programmer.

Categories: Programming

Beyond software craftsmanship

Coding the Architecture - Simon Brown - Sat, 05/17/2014 - 08:49

I had the pleasure of attending the Island Innovators unconference that took place in Jersey last month ... an event co-hosted by Yossi Vardi (the "godfather of Israel's tech industry") and Daniel Seal in conjunction with Locate Jersey and Digital Jersey. This was very different to the usual software development conferences that I attend, with the background of the attendees being very broad. To give you a flavour, the session highlights for me were discussions about SaaS businesses (including pricing models, etc) and one which presented some fascinating insights into the Israeli culture.

Software development - what's next?

Since this was an unconference, I signed up to lead a session to discuss where software development is heading.

First we talked about the current state of software development and I jotted down some notes on flip chart paper (red writing is bad stuff that happens, green represents some solutions). Nothing new here, but it set the scene for the rest of the discussion.

There was a huge number of ideas to solve the problems outlined above, and rather than writing them all down, I attempted to summarise them based upon a list of principles that a "good software developer" should adopt. This is what the next sheet shows.

Finally, we wrapped up the last few minutes of the session by discussing how all of this could possibly be achieved in the real world.

It's an interesting time for the software development industry. Although there's a ton of debate out there about whether software development is art, craft or engineering, I do know that we need to get better at it. Protected/regulated titles and apprenticeships have all come up in conversations before, but it's a complex issue given the sheer quantity and variety of software development out there. Jersey is a small place and perhaps we can use this to our advantage. Perhaps we should do something on a small scale regarding the professionalism of software developers and see what happens...

Categories: Architecture

Testing Principles: Part 2

Let these principles be a caution!

In Testing Principles Part 1: This is not Pokémon we noted that we need to strive to find ways of not injecting defects into systems, because while testing can find many defects, it will never find them all. The next set of principles suggests both where and when to work, and then finally leaves us with a strong and sobering reminder. The principles are:

4. Early testing. Testing includes activities that execute the product (dynamic testing) and activities that review the product (static testing). In software, most people would recognize dynamic testing, which includes executing test cases and comparing results.  Static testing includes reviews and inspections in which a person (or tool in some cases) looks at the code or deliverable and compares it to requirements or another standard. Reviews and inspections can be applied to any piece of work at any time in the development life cycle. Reviewing or testing work products as soon as they are available will find defects earlier in the process and will reduce the possibility of rework.

5. Defect clustering. When you find one defect in a deliverable, there will generally be others waiting to be found. The clustering can be caused by the technical or business complexity of the product, misinterpretation of a requirement or other issues (for example, left to my own devices I would get the “ie” thing wrong when spelling words – thank you, static testing tool: spell check). Given that defects tend to cluster, if the budget for testing isn’t unlimited, then spending more time on areas where defects have been found is a good risk mitigation strategy.

6. Testing is context dependent.  The type of testing that will be the most effective and efficient is driven by the type of product or project. For example, a list of user stories can be reviewed or inspected but can’t be dynamically tested (words can’t be executed like code). Programmers will unit test code based on their knowledge of the structure of the code (white box testing) while product owners and other business stakeholders will test based on understanding of how the overall product will behave (black box testing).

Principle 7 could have been included in part one; however, it serves well as a final cautionary tale.

7. Absence of errors fallacy. Just because you have found and corrected all of the defects possible does not mean that the product being delivered is what the customer wants or is useful. My personal translation of this principle is that unless you build the right thing, building it right isn’t even an interesting conversation.

The seven testing principles lead us to understand that we need to focus our efforts on building the right product, using risk to target our limited resources, and finding defects as early as possible. Testing is an important part of delivering value from any project; however, it is never sufficient. If you remember one concept based on the seven Testing Principles, it is that we can’t test in quality or value. Those two attributes require that everyone on the project consider quality and value, rather than putting that mantle on the shoulders of testers alone.


Categories: Process Management

The Infinite Space Between Words

Coding Horror - Jeff Atwood - Fri, 05/16/2014 - 20:42

Computer performance is a bit of a shell game. You're always waiting for one of four things:

  • Disk
  • CPU
  • Memory
  • Network

But which one? How long will you wait? And what will you do while you're waiting?

Did you see the movie "Her"? If not, you should. It's great. One of my favorite scenes is the AI describing just how difficult it becomes to communicate with humans:

It's like I'm reading a book… and it's a book I deeply love. But I'm reading it slowly now. So the words are really far apart and the spaces between the words are almost infinite. I can still feel you… and the words of our story… but it's in this endless space between the words that I'm finding myself now. It's a place that's not of the physical world. It's where everything else is that I didn't even know existed. I love you so much. But this is where I am now. And this who I am now. And I need you to let me go. As much as I want to, I can't live your book any more.

I have some serious reservations about the work environment pictured in Her where everyone's spending all day creepily whispering to their computers, but there is deep fundamental truth in that one pivotal scene. That infinite space "between" what we humans feel as time is where computers spend all their time. It's an entirely different timescale.

The book Systems Performance: Enterprise and the Cloud has a great table that illustrates just how enormous these time differentials are. Just translate computer time into arbitrary seconds:

Operation                        Actual time   Scaled time
1 CPU cycle                      0.3 ns        1 s
Level 1 cache access             0.9 ns        3 s
Level 2 cache access             2.8 ns        9 s
Level 3 cache access             12.9 ns       43 s
Main memory access               120 ns        6 min
Solid-state disk I/O             50-150 μs     2-6 days
Rotational disk I/O              1-10 ms       1-12 months
Internet: SF to NYC              40 ms         4 years
Internet: SF to UK               81 ms         8 years
Internet: SF to Australia        183 ms        19 years
OS virtualization reboot         4 s           423 years
SCSI command time-out            30 s          3,000 years
Hardware virtualization reboot   40 s          4,000 years
Physical system reboot           5 m           32 millennia

The above Internet times are kind of optimistic. If you look at the AT&T real time US internet latency chart, the time from SF to NYC is more like 70ms. So I'd double the Internet numbers in that chart.

Latency is one thing, but it's also worth considering the cost of that bandwidth.

Speaking of the late, great Jim Gray, he also had an interesting way of explaining this. If the CPU registers are how long it takes you to fetch data from your brain, then going to disk is the equivalent of fetching data from Pluto.

He was probably referring to traditional spinning rust hard drives, so let's adjust that extreme endpoint for today:

  • Distance to Pluto: 4.67 billion miles.
  • Latest fastest spinning HDD performance (49.7) versus latest fastest PCI Express SSD (506.8). That's an improvement of 10x.
  • New distance: 467 million miles.
  • Distance to Jupiter: 500 million miles.

So instead of travelling to Pluto to get our data from disk in 1999, today we only need to travel to … Jupiter.

That's disk performance over the last decade. How much faster did CPUs, memory, and networks get in the same time frame? Would a 10x or 100x improvement really make a dent in these vast infinite spaces in time that computers deal with?

To computers, we humans work on a completely different time scale, practically geologic time. Which is completely mind-bending. The faster computers get, the bigger this time disparity grows.

[advertisement] Stack Overflow Careers matches the best developers (you!) with the best employers. You can search our job listings or create a profile and even let employers find you.
Categories: Programming

The Multiple SQLite Problem

Eric.Weblog() - Eric Sink - Fri, 05/16/2014 - 19:00
Eric, why the #$%! is your SQLite PCL taking so long?

It's Google's fault. And Apple's fault.

Seriously?

No. Yes. Kinda. Not really.

The Multiple SQLite Problem, In a Nutshell

If your app makes use of two separate instances of the SQLite library, you can end up with a corrupted SQLite data file.

From the horse's mouth

On the SQLite website, section 2.2.1 of How to Corrupt an SQLite Database File is entitled "Multiple copies of SQLite linked into the same application", and says:

As pointed out in the previous paragraph, SQLite takes steps to work around the quirks of POSIX advisory locking. Part of that work-around involves keeping a global list (mutex protected) of open SQLite database files. But, if multiple copies of SQLite are linked into the same application, then there will be multiple instances of this global list. Database connections opened using one copy of the SQLite library will be unaware of database connections opened using the other copy, and will be unable to work around the POSIX advisory locking quirks. A close() operation on one connection might unknowingly clear the locks on a different database connection, leading to database corruption. The scenario above sounds far-fetched. But the SQLite developers are aware of at least one commercial product that was released with exactly this bug. The vendor came to the SQLite developers seeking help in tracking down some infrequent database corruption issues they were seeing on Linux and Mac. The problem was eventually traced to the fact that the application was linking against two separate copies of SQLite. The solution was to change the application build procedures to link against just one copy of SQLite instead of two.

At its core, SQLite is written in C. It is plain old-fashioned native/unmanaged code. If you are accessing SQLite using C#, you are doing so through some kind of a wrapper. That wrapper is loading the SQLite library from somewhere. You may not know where. You probably don't [want to] care.

This is an abstraction. And it can leak. C# is putting some distance between you and the reality of what SQLite really is. That distance can somewhat increase the likelihood of you accidentally having two instances of the SQLite library without even knowing it.

SQLite as part of the mobile OS

Both iOS and Android contain an instance of SQLite as part of the basic operating system. This is a blessing. And a curse.

Built-in SQLite is nice because your app doesn't have to include it. This makes the size of your app smaller. It avoids the need to compile SQLite as part of your build process.

But the problem is that the OS has contributed one instance of the SQLite library that you can't eliminate. It's always there. The multiple SQLite problem cannot happen if only one SQLite is available to your app. Anybody or anything which adds one is risking a plurality.

If SQLite is always in the OS, why not always use it?

Because Apple and Google do a terrible job of keeping it current.

  • iOS 7 ships with SQLite 3.7.13. That shipped in June of 2012.

  • Android ships with SQLite 3.7.11. That shipped in March of 2012.

  • Since Android users never update their devices, a large number of them are still running SQLite 3.7.4, which shipped in December of 2010. (Yes, I know the sweeping generalization in the previous sentence is unfair. I like Android a lot, but I think Google's management of the Android world has been bad enough that I'm entitled to a little crabbiness.)

If you are targeting Android or iOS and using the built-in SQLite library, you are missing out on at least TWO YEARS of excellent development work by DRH and his team. Current versions of SQLite are significantly faster, with many bug fixes, and lots of insanely cool new features. This is just one of the excellent reasons to bundle a current version of SQLite into your app instead of using the one in the OS.

And as soon as you do that, there are two instances in play. You and Apple/Google have collaborated to introduce the risk of database corruption.

Windows

AFAIK, no version of Windows includes a SQLite library. This is a blessing. And a curse. For all of the opposite reasons discussed above.

In general, building a mobile app for Windows (Phone or RT or whatever) means you have to include SQLite as part of the app. And when doing so, it certainly makes sense to just use the latest version.

And that introduces another reason somebody might want to use an application-private version of SQLite instead of the one built-in to iOS or Android. If you're building a cross-platform app, you probably want all your platforms using the same version of SQLite. Have fun explaining to your QA people that your app is built on SQLite 3.8.4 on Windows and 3.7.11 on Android and 3.7.13 on iOS.

BTW, it's not clear how or if Windows platforms suffer from the data corruption risk of the multiple SQLite problem. Given that the DRH explanation talks about workarounds for quirks in POSIX file locking, it seems likely that the situation on Windows is different in significant ways. Nonetheless, even if using multiple SQLite instances on Windows platforms is safe, it is still wasteful. And sad.

SQLCipher or SEE

Mobile devices get lost or stolen. A significant portion of mobile app developers want their data encrypted on the device. And the SQLite instance built-in to iOS and Android is plain, with no support for encryption.

The usual solution to this problem is to use SQLCipher (open source, from Zetetic) or SEE (proprietary, from the authors of SQLite). Both of these are drop-in replacements for SQLite.

In other words, this is yet another reason the OS-provided SQLite library might not be sufficient.

SQLite compilation options

SQLite can be compiled in a lot of different ways. Do you want the full-text-search feature? Do you want foreign keys to be default on or off? What do you want the default thread-safety mode to be? Do you need the column metadata feature? Do you need ICU for full Unicode support in collations? The list goes on and on.
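
As an illustrative sketch, those choices show up as compile-time defines. The option names below are from the SQLite documentation; the particular values are examples only:

# Illustrative build command; each -D flag answers one of the questions above:
# full-text search, foreign keys on by default, thread-safety mode,
# column metadata, and ICU collation support.
cc -c sqlite3.c \
  -DSQLITE_ENABLE_FTS4 \
  -DSQLITE_DEFAULT_FOREIGN_KEYS=1 \
  -DSQLITE_THREADSAFE=2 \
  -DSQLITE_ENABLE_COLUMN_METADATA \
  -DSQLITE_ENABLE_ICU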

Did Apple or Google compile SQLite with the exact set of build options your app needs? Maybe. Or maybe your app just needs to have its own.

Adding a SQLite instance without knowing it

Another way to get two SQLite instances is to add a component or library which includes one. Even if you don't know.

For example, the client side of Zumero (our mobile SQL sync product) needs to call SQLite. Should it bundle a SQLite library? Or should it always call the one in the mobile OS (when available)?

Some earlier versions of the Zumero client SDK included a SQLite instance in our Xamarin component builds. Because, why on earth would we want our code running against the archaic version of SQLite provided by Apple and Google?

And then we had a customer run into this exact problem. They called Zumero for sync. And they used Mono.Data.Sqlite for building their app.

Now we ship builds which contain no SQLite library instance, because it minimizes the likelihood of this kind of accident happening.

There are all kinds of libraries and components and SDKs out there which build on SQLite. Are they calling the instance provided by the OS? Or are they bundling one? Do you even know?

So maybe app developers should just be more careful

Knee-jerk reaction: Yes, absolutely.

Better answer: Certainly not.

App developers don't want to think about this stuff. It's a bit of esoterica that nobody cares about. Most people who started reading this blog entry gave up several paragraphs ago. The ones that are still here (both of you) are wondering why you are still reading when right now there are seven cable channels showing a rerun of Law and Order.

An increasingly easy accident

The multiple SQLite scenario is sounding less far-fetched all the time. SQLite is now one of the most widely deployed pieces of software in history. It is incredibly ubiquitous, and still growing. And people love to build abstractions on top of it.

This problem is going to get more and more common.

And it can have very significant consequences for end users.

Think of it this way

The following requirements are very typical:

  • App developers want to be using a current version of SQLite (because DRH has actually been working for the last two years).

  • App developers want their SQLite data on the mobile device to be encrypted (because even grown-ups lose mobile devices).

  • App developers want to be using the same version of SQLite on all of their mobile app platforms (because it simplifies testing).

  • App developers want no risk of data corruption (because end users don't like that kind of thing).

  • App developers want to work with abstractions, also-known-as ORMs and sync tools, also-known-as things that makes their lives easier (because writing mobile apps is insanely expensive and it is important to reduce development costs).

  • App developers want to NOT have to think about anything in this blog entry (because they are paid to focus on their actual business, which is medicine or rental cars or construction, and it's 2014, so they shouldn't have to spend any time on the ramifications of quirky POSIX file locking).

Those requirements are not just typical, they are reasonable. To ask app developers to give up any of these things would be absurd.

And right now, there is NO WAY to satisfy all the requirements above. In the terminology of high school math, this is a system of equations with no solution.

To be fair

The last several weeks of "the NuGet package is almost ready" are also due to some reasons I can't blame Apple or Google or POSIX for.

When I started working on SQLitePCL.raw, I didn't know nearly enough about MSBuild or NuGet. Anything involving native code with NuGet is pretty tricky. I've spent time climbing the learning curve. My particular way of learning new technologies is to write the code three times. The commit history on GitHub contains the whole story.

Ramifications for SQLitePCL.raw

I want users of my SQLite PCL to have a great experience, so I'm spending [perhaps too much] time trying to find the sweetest subsets of the requirements above.

For example: C++/CX is actually pretty cool. I can build a single WP8 component DLL which is visible to C# while statically building SQLite itself inside. Fewer pieces. Fewer dependencies. Nice. But if anything else in the app needs direct access to SQLite, they'll have to include another instance of the library. Yuck.

Another example: I see [at least] three reasonable choices for iOS:

  • Use the SQLite provided by iOS. It's a shared library. Access it with P/Invoke, DllImport("sqlite3").

  • Bundle the latest SQLite. DllImport("__Internal"), and embed a sqlite3.a as a resource and use the MonoTouch LinkWith attribute.

  • Use the Xamarin SQLCipher component. DllImport("__Internal"), but don't bundle anything, relying on the presence of the SQLCipher component to make the link succeed.
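
In code terms, the three choices differ only in where the native symbols are resolved from. A minimal C# sketch (hypothetical wrapper class; only the familiar sqlite3_open entry point shown):

using System;
using System.Runtime.InteropServices;

static class NativeMethods
{
    // Choice 1: resolve sqlite3_open from the sqlite3 shared library in iOS.
    [DllImport("sqlite3")]
    internal static extern int sqlite3_open(string filename, out IntPtr db);

    // Choices 2 and 3 would instead declare DllImport("__Internal"),
    // resolving the symbol from a statically linked sqlite3.a or SQLCipher.
}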

Which one should the NuGet package assume that people want? How do people that prefer the others get a path that Just Works?

So, Eric, when will the SQLitePCL.raw NuGet package be ready?

Soon. ;-)

Bottom line

"I don't know the key to success, but the key to failure is trying to please everybody." -- Bill Cosby

 

Helping You Go Global with More Seamless Google Play Payments

Android Developers Blog - Fri, 05/16/2014 - 18:52

By Ibrahim Elbouchikhi, Google Play Product Manager

Sales of apps and games on Google Play are up by more than 300 percent over the past year. And today, two-thirds of Google Play purchases happen outside of the United States, with international sales continuing to climb. We’re hoping to fuel this momentum by making Google Play payments easier and more convenient for people around the world.

PayPal support

Starting today, we're making it possible for people to choose PayPal for their Google Play purchases in 12 countries, including the U.S., Germany, and Canada. When you make a purchase on Google Play in these countries, you'll find PayPal as an option in your Google Wallet; just enter your PayPal account login and you'll easily be able to make purchases. Our goal is to provide users with a frictionless payment experience, and this new integration is another example of how we work with partners from across the payments industry to deliver this to the user.

Carrier billing and Google Play gift cards in more countries

Carrier billing—which lets people charge purchases in Google Play directly to their phone bill—continues to be a popular way to pay. We’ve just expanded coverage to seven more countries for a total of 24, including Singapore, Thailand and Taiwan. That means almost half of all Google Play users have this option when making their purchases.

We’ve also made Google Play gift cards available to a total of 13 countries, including Japan and Germany.

Support for developer sales in more countries

Developers based in 13 new countries can now sell apps on Google Play (with new additions such as Indonesia, Malaysia and Turkey), bringing the total to 45 countries with support for local developers. We’ve also increased our buyer currency support to 28 new countries, making it even easier for you to tailor your pricing in 60 countries.

Nothing for you to do!

Of course, as developers, when it comes to payments, there’s nothing for you to do; we process all payments, reconcile all currencies globally, and make a monthly deposit in your designated bank account. This means you get to focus on what you do best: creating beautiful and engaging apps and games.

Visit developer.android.com for more information.

Per-country availability of forms of payment is summarized here.

Join the discussion on

+Android Developers
Categories: Programming

Google I/O 2014: start planning your schedule

Google Code Blog - Fri, 05/16/2014 - 17:30
By Katie Miller, Google Developer Marketing

From making your apps as powerful as they can be to putting them in front of hundreds of millions of users, our focus at Google is to help you design, develop and distribute compelling experiences for your users. At Google I/O 2014, happening June 25-26 at Moscone West in San Francisco, we’re bringing you sessions and experiences ranging from design principles and techniques to the latest developer tools and implementations to developer-minded products and strategies to help distribute your app.

If you're coming in person, the schedule will give you more time to interact in the Sandbox, where partners will be on hand to demo apps built on the best of Google and open source, and where you can interact with Googlers 1:1 and in small groups. Don’t worry, though: we’ll have plenty of content online for those following along remotely! Visit the schedule on the Google I/O website (and check back often for updates). As you start your I/O planning, we want to highlight the experiences we’re working on to help you build and grow your apps:

  • Breakout sessions: This year, we’ll once again bring you a deep selection of technical content, including sessions such as "What's New in Android" and "Wearable computing with Google" from Android, Chrome and Cloud, and cross-product, cross-platform implementations. There will be a full slate of design sessions that will bring to life Google’s design principles and teach best practices, and an update on how our monetization, measurement and payment products are better suited than ever to help developers grow the reach of their applications. Sessions from Ray Kurzweil, Ignite and Women Techmakers will take the stage and make us uncomfortably excited about what is possible. The first sessions are now listed; keep checking back for more.
  • Workshops and code labs: Roll up your sleeves, dig in to hands-on experiences and code. Learn how to build better products, apply quantitative data to user experiences, and prototype new Glassware through interactive workshops on UX, experience mapping and design principles. To maximize your learning and give you more interaction with Googlers and peers, visit our coding work space, with work stations preloaded with self-paced modules. Dive into Android, Chrome, Cloud and APIs with experts on hand for guidance.
  • Connect with Googlers in the sandbox: Check out your favorite Google products and meet the Googlers who built them. From there, join a ‘Box talk or app review, ranging from conceptual prototyping, to performance testing with the latest tools, to turning your app into a successful business.
  • Learn from peers at the partner sandbox: We love to see partners build cool things with Google, and have invited a few of them to showcase inspiring integrations of what’s possible. You will be able to see demos and talk in-depth with them about how they designed, created and grew their apps.
  • Beyond Moscone, with I/O Extended: Experience I/O around the world, in an event setting, with I/O Extended. The I/O Extended events include everything from live streaming sessions from I/O to local speaker sessions and hackathons. It is great to see so many events taking place around the world, and we can't wait to see I/O Extended events have another strong year.

We look forward to seeing you next month, whether it’s in-person in San Francisco, at I/O Extended or online through the livestream!

Katherine Miller is part of the Developer Marketing team, working on session programming for Google I/O and developer research efforts. In her spare time she runs (both competitively and after her 2 children) and memorizes passages from beloved children's books.

Posted by Louis Gray, Googler
Categories: Programming

Stuff The Internet Says On Scalability For May 16th, 2014

Hey, it's HighScalability time:


Cross Section of an Undersea Cable. It's industrial art. The parts. The story.
  • 400,000,000,000: Wayback Machine pages indexed; 100 billion: Google searches per month; 10 million: Snapchat monthly user growth.
  • Quotable Quotes:
    • @duncanjw: The Great Rewrite - many apps will be rewritten not just replatformed over next 10 years says @cote #openstacksummit
    • @RFFlores: The Openstack conundrum. If you don't adopt it, you will regret it in the future. If you do adopt it, you will regret it now
    • elementai: I love Redis so much, it became like a superglue where "just enough" performance is needed to resolve a bottleneck problem, but you don't have resources to rewrite a whole thing in something fast.
    • @antirez: "when software engineering is reduced to plumbing together generic systems, software engineers lose their sense of ownership"
    • Tom Akehurst: Microservices vs. monolith is a false dichotomy.
    • @joestump: “Keep in mind that any piece of butt-based infrastructure can fail at any time. Plan your infrastructure accordingly.” Ain’t that the truth?
    • @SalesforceEng: Check out the scale of Kafka @LinkedInEng. @bonkoif says these numbers are about a month old. 3.25 million msgs/sec. 
    • Don Neufeld: The first is to look deeply into the stack of implicit assumptions I’m working with. It’s often the unspoken assumptions that are the most important ones. The second flows from the first and it’s to focus less on building the right thing and more how we’re going to meet our immediate needs.
    • Dan Gillmor: We’re in danger of losing what’s made the Internet the most important medium in history – a decentralized platform where the people at the edges of the networks – that would be you and me – don’t need permission to communicate, create and innovate.

  • If you think of a Hotel as an app, hotels have been doing in-app purchases for a long time. They lead with a teaser rate and then charge for anything that might cross a desire-money threshold. Wifi, that's extra. Gym, that's extra. The bar, a cover charge. Drinks, so so expensive. The pool, extra. A lounge by the pool is double extra extra. To go all the way hotels just need to let you stay for free and then fully monetize all the gamification points.

  • Apple: We handle hundreds of millions of active users using some of the most desirable devices on the planet and several billion iMessages/day, 40 billion push notifications/day, 16+ trillion push notifications sent to date.

  • It's a data prison for everyone! Comcast plans data caps for all customers in 5 years, could be 500GB. Or just a few 4K movies.

  • From the future of everything to the verge of extinction. The Slow Decline of Peer-to-Peer File Sharing: People have shifted their activities to streaming over file sharing. Subscribers get quality content at a reasonable price and it's dead simple to use, whereas torrenting or file sharing is a little more complicated.

  • I don't think people understand how hard this is to do in practice. European Court Lets Users Erase Records on Web. Once data is stored on tape, deleting it requires rewriting all the non-deleted data to another tape. So it's far more efficient to forget the indexes to the data than to delete the data itself. Which defeats the point, I'd imagine.

  • How is a strategy tax hands off? @parislemon: Instagram's decision to use Facebook's much worse place database over Foursquare's has made the product worse. Stupid.

  • Excellent detailed example of the SEDA architecture in action. Guide to Cassandra Thread Pools. Follow the regal message as it flows from thread pool to thread pool, transforming as it makes its way to its final resting place.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so keep on going)...

Categories: Architecture

Spike It! Article Posted

One of my clients was having trouble with estimating work they had never done before, so I wrote an article explaining spikes. That article is up on agileconnection: Need to Learn More about the Work You’re Doing? Spike It!

It was a little serendipity; I taught an estimation workshop right after I explained how to spike in email. That article almost wrote itself.

You can use a spike in any project, agile or not. Spikes are for learning.

I also explain what to do when managers say, “Use the code in the spike” and you think, “Oh no!” See for yourself.

Would-be authors: want to write an article for agileconnection.com? I’m interested.

 

Categories: Project Management

Interview with Esko Kilpi

NOOP.NL - Jurgen Appelo - Fri, 05/16/2014 - 14:21
esko-kilpi

As part of my global book tour I hope to have fascinating conversations with management & leadership experts around the world. One of them is Esko Kilpi, a researcher and consultant in Finland who has a focus on the “arts and sciences of complex interaction”. With Esko I talked about management & complexity.

The post Interview with Esko Kilpi appeared first on NOOP.NL.

Categories: Project Management

Testing Principles Part 1:  This is not Pokémon


I recently studied and passed the test for the International Software Testing Qualifications Board’s (ISTQB) Certified Tester, Foundation Level (CTFL). During my career I have been a tester, managed a test group and consulted on testing processes. During my time as a tester and a test manager, I was not explicitly aware of the seven principles of testing; however, I think I understood them in my gut. Unfortunately most of my managers and clients did not understand them, which meant they behaved in a way that never felt rational and always devolved into a discussion of why bugs made it into production. Whether you are involved in testing, developing, enhancing, supporting or managing IT projects, an understanding of the principles of testing can and should influence your professional behavior. I have broken the seven principles into two groups. Group one relates to why we can’t catch them all, and the second focuses on where we find defects. The first group includes:

  1. Testing shows the presence of defects. Stated differently, testing proves that the defects you find exist, but does not prove that there aren’t any other defects that you did not find. Understanding that testing does not prove that software or any product is defect free means that we always need to plan and mitigate the risk that we will find a defect as the development process progresses through to a production environment.
  2. Exhaustive testing is impossible. Testing all combinations of inputs, outputs and processing conditions is generally not possible (I was involved in a spirited argument at a testing conference that suggested in very simple cases, exhaustive testing might be possible). Even if we set aside esoteric test cases, such as the possibility of a neutrino changing active memory while your software, application or product is using it, the number of possible permutations for even simple changes is eye-popping (consider a simple change with 15 independent inputs, each having 10 possible values: that is 10^15, or one quadrillion, combinations). If exhaustive testing is not possible, then testers and test managers must use other techniques to focus the time and effort they have on what is important and risky. Developing an understanding of the potential impact and probability of problems (risk) is needed to target testing resources.
  3. Pesticide paradox. The value of running the same test over and over on an application wanes over time. The pesticide metaphor draws attention to the fact that once a test finds the bugs it is designed to find (or can find, a factor of how the test is implemented), the remaining bugs will not be found by that test. Tests must be refactored over time to remain effective, which is why simply automating a set of tests and running them over and over is not an adequate risk-reduction strategy.
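
To make the arithmetic behind the second principle concrete, here is a short sketch in Java; the 15-inputs-by-10-values figures come from the example above, and the throughput number is purely an assumption for illustration:

import java.math.BigInteger;

public class ExhaustiveTestCount {
    public static void main(String[] args) {
        // 15 independent inputs, each with 10 possible values:
        // exhaustive testing requires 10^15 distinct input combinations.
        BigInteger combinations = BigInteger.TEN.pow(15);
        System.out.println(combinations + " test cases"); // 1000000000000000

        // Even at an (assumed, very generous) one million test
        // executions per second, running them all takes ~31.7 years.
        long seconds = combinations.divide(BigInteger.valueOf(1_000_000)).longValue();
        System.out.printf("~%.1f years at one million tests per second%n",
                seconds / (365.25 * 24 * 3600));
    }
}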

The first three principles forcibly remind everyone involved in developing, maintaining or supporting IT applications (hardware or software) that zero defects is aspirational, not realistic. That understanding should dispel the shocked disbelief and manic finger pointing when defects are discovered late in the development cycle or in production: defects exist and will be found. Our strategy should be to first avoid creating defects, then focus testing (the whole range, from reviews to dynamic testing) on the areas of the application or change that pose the greatest risk to the business if a defect is missed, and finally have a plan in place for the bugs that run the gauntlet. In the world of IT everyone, developers, testers, operators and network engineers alike, needs to work together to improve quality within real-world constraints because, unlike Pokémon, you are never going to catch them all.


Categories: Process Management

Using Dropwizard in combination with Elasticsearch

Gridshore - Thu, 05/15/2014 - 21:09

Dropwizard logo

How often do you start creating a new application? How often have you thought about configuring an application: where to locate the config file, how to load it, what format to use? Another thing you regularly do is add timers to track execution time, management tools to do thread analysis, and so on. From a more functional perspective, you may want a rich client-side application using AngularJS, which means you need a REST backend that delivers JSON documents. Does this sound like something you need regularly? Then this blog post is for you. If you never need this, please keep on reading anyway; you might like it.

In this blog post I will create an application that shows you all the available indexes in your elasticsearch cluster. Not very sexy, but I am going to use AngularJS, Dropwizard and elasticsearch. That should be enough to get a lot of you interested.


What is Dropwizard

Dropwizard is a framework that combines a number of other frameworks that have become de facto standards in their own domains: Jersey for the REST interface, Jetty as a lightweight container, Jackson for JSON parsing, Freemarker for front-end templates, Metrics for metrics and SLF4J for logging. Dropwizard adds utilities to combine these frameworks and enables you as a developer to be very productive in constructing your application. It provides building blocks like lifecycle management, resources, views, loading of bundles, configuration and initialization.

Time to jump in and start creating an application.

Structure of the application

The application is set up as a Maven project. To start off we only need one dependency:

<dependency>
    <groupId>io.dropwizard</groupId>
    <artifactId>dropwizard-core</artifactId>
    <version>${dropwizard.version}</version>
</dependency>

If you want to follow along, you can check my github repository:


https://github.com/jettro/dropwizard-elastic

Configure your application

Every application needs configuration; in our case we need to configure how to connect to elasticsearch. In Dropwizard you extend the Configuration class and create a POJO, using Jackson and Hibernate Validator annotations to configure serialization and validation. In our case the configuration object looks like this:

public class DWESConfiguration extends Configuration {
    @NotEmpty
    private String elasticsearchHost = "localhost:9200";

    @NotEmpty
    private String clusterName = "elasticsearch";

    @JsonProperty
    public String getElasticsearchHost() {
        return elasticsearchHost;
    }

    @JsonProperty
    public void setElasticsearchHost(String elasticsearchHost) {
        this.elasticsearchHost = elasticsearchHost;
    }

    @JsonProperty
    public String getClusterName() {
        return clusterName;
    }

    @JsonProperty
    public void setClusterName(String clusterName) {
        this.clusterName = clusterName;
    }
}

Then you need to create a yml file containing the properties of the configuration class, along with some sensible values. In my case it looks like this:

elasticsearchHost: localhost:9300
clusterName: jc-play

How often have you built this kind of configuration mechanism yourself at the start of a project? Usually I start with Maven and quickly move to Tomcat; not this time. We have done Maven, and now we have done configuration. Note that the host uses port 9300, the elasticsearch transport port, rather than the HTTP port 9200, because later on we will connect with the transport client. Next up is the runner for the application.

Add the runner

This is the class we run to start the application; internally Jetty is started. We extend the Application class, with the configuration class as its generic type. This is the class that initializes the complete application: bundles are initialized, and classes are created and passed to the classes that need them.

public class DWESApplication extends Application<DWESConfiguration> {
    private static final Logger logger = LoggerFactory.getLogger(DWESApplication.class);

    public static void main(String[] args) throws Exception {
        new DWESApplication().run(args);
    }

    @Override
    public String getName() {
        return "dropwizard-elastic";
    }

    @Override
    public void initialize(Bootstrap<DWESConfiguration> dwesConfigurationBootstrap) {
    }

    @Override
    public void run(DWESConfiguration config, Environment environment) throws Exception {
        logger.info("Running the application");
    }
}

When starting this application, we have no success: a big error appears because we did not register any resources.

ERROR [2014-05-14 16:58:34,174] com.sun.jersey.server.impl.application.RootResourceUriRules: 
	The ResourceConfig instance does not contain any root resource classes.
Nothing to worry about; we just need a resource.

Before we can return something, we need to have something to return. We create a POJO called Index that contains one property called name; for now we just return this object as a JSON document.
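
The Index class itself is not listed in the post; a minimal sketch of such a POJO, assuming the same Jackson conventions as the configuration class, could look like this:

import com.fasterxml.jackson.annotation.JsonProperty;

public class Index {
    private String name;

    @JsonProperty
    public String getName() {
        return name;
    }

    @JsonProperty
    public void setName(String name) {
        this.name = name;
    }
}

With the POJO in place, the following IndexResource handles the requests related to indexes.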

@Path("/indexes")
@Produces(MediaType.APPLICATION_JSON)
public class IndexResource {

    @GET
    @Timed
    public Index showIndexes() {
        Index index = new Index();
        index.setName("A Dummy Index");

        return index;
    }
}

The @GET, @Path and @Produces annotations come from the Jersey REST library; @Timed comes from the Metrics library. Before starting the application we need to register our index resource with Jersey.

    @Override
    public void run(DWESConfiguration config, Environment environment) throws Exception {
        logger.info("Running the application");
        final IndexResource indexResource = new IndexResource();
        environment.jersey().register(indexResource);
    }

Now we can start the application using the following runner from IntelliJ. Later on we will create the executable jar.

Running the app from IntelliJ

Run the application again; this time it works. You can browse to http://localhost:8080/indexes and see our dummy index as a nice JSON document. There is something in the logs though. I love this message; it is what you get when running the application without health checks.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!    THIS APPLICATION HAS NO HEALTHCHECKS. THIS MEANS YOU WILL NEVER KNOW      !
!     IF IT DIES IN PRODUCTION, WHICH MEANS YOU WILL NEVER KNOW IF YOU'RE      !
!    LETTING YOUR USERS DOWN. YOU SHOULD ADD A HEALTHCHECK FOR EACH OF YOUR    !
!         APPLICATION'S DEPENDENCIES WHICH FULLY (BUT LIGHTLY) TESTS IT.       !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Creating a health check

Since we are creating an application that interacts with elasticsearch, we add a health check for elasticsearch. Don’t think too much about how we connect to elasticsearch yet; we will get there later on.

public class ESHealthCheck extends HealthCheck {

    private ESClientManager clientManager;

    public ESHealthCheck(ESClientManager clientManager) {
        this.clientManager = clientManager;
    }

    @Override
    protected Result check() throws Exception {
        ClusterHealthResponse clusterIndexHealths = clientManager.obtainClient().admin().cluster().health(new ClusterHealthRequest())
                .actionGet();
        switch (clusterIndexHealths.getStatus()) {
            case GREEN:
                return HealthCheck.Result.healthy();
            case YELLOW:
                return HealthCheck.Result.unhealthy("Cluster state is yellow, maybe replication not done? New Nodes?");
            case RED:
            default:
                return HealthCheck.Result.unhealthy("Something is very wrong with the cluster", clusterIndexHealths);

        }
    }
}

Just like the resource handler, the health check needs to be registered. Besides the standard HTTP port for normal users, Dropwizard exposes a second port for administration, where you can find reports such as metrics, ping, threads and the health checks.

    @Override
    public void run(DWESConfiguration config, Environment environment) throws Exception {
        logger.info("Running the application");

        // The manager wraps the elasticsearch client (see the next section)
        // and is handed to the resource as well as the health check.
        ESClientManager esClientManager = new ESClientManager(config.getElasticsearchHost(), config.getClusterName());

        final IndexResource indexResource = new IndexResource(esClientManager);
        environment.jersey().register(indexResource);

        final ESHealthCheck esHealthCheck = new ESHealthCheck(esClientManager);
        environment.healthChecks().register("elasticsearch", esHealthCheck);
    }

You as a reader now have an assignment: start the application and check the admin pages yourself at http://localhost:8081 (the health check report, for instance, is at http://localhost:8081/healthcheck). In the meantime, we are going to connect to elasticsearch.

Connecting to elasticsearch

We connect to elasticsearch using the transport client; this is taken care of by the ESClientManager. We make use of Dropwizard’s managed classes, whose lifecycle is managed by Dropwizard itself. From the configuration object we take the host(s) and the cluster name, create the client in the start method, and pass the manager to the classes that need it. The first class that needs it is the health check we already looked at; through the ESClientManager, other classes get access to the client as well. The Managed interface mandates both a start and a stop method.

    @Override
    public void start() throws Exception {
        Settings settings = ImmutableSettings.settingsBuilder().put("cluster.name", clusterName).build();

        logger.debug("Settings used for connection to elasticsearch : {}", settings.toDelimitedString('#'));

        TransportAddress[] addresses = getTransportAddresses(host);

        logger.debug("Hosts used for transport client : {}", (Object) addresses);

        this.client = new TransportClient(settings).addTransportAddresses(addresses);
    }

    @Override
    public void stop() throws Exception {
        this.client.close();
    }
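
The surrounding ESClientManager class, and in particular the getTransportAddresses helper, is not shown in the post. A minimal sketch, assuming a comma-separated list of host:port pairs and the elasticsearch 1.x transport API, might look like this:

import io.dropwizard.lifecycle.Managed;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.transport.TransportAddress;

public class ESClientManager implements Managed {
    private final String host;
    private final String clusterName;
    private Client client;

    public ESClientManager(String host, String clusterName) {
        this.host = host;
        this.clusterName = clusterName;
    }

    // Hands the client to resources and health checks.
    public Client obtainClient() {
        return client;
    }

    // Parses "host1:port1,host2:port2" into transport addresses.
    private TransportAddress[] getTransportAddresses(String hosts) {
        String[] parts = hosts.split(",");
        TransportAddress[] addresses = new TransportAddress[parts.length];
        for (int i = 0; i < parts.length; i++) {
            String[] hostAndPort = parts[i].trim().split(":");
            addresses[i] = new InetSocketTransportAddress(hostAndPort[0],
                    Integer.parseInt(hostAndPort[1]));
        }
        return addresses;
    }

    // The start() and stop() methods shown above complete the class.
}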

We need to register our managed class with the lifecycle of the environment in the runner class.

    @Override
    public void run(DWESConfiguration config, Environment environment) throws Exception {
        ESClientManager esClientManager = new ESClientManager(config.getElasticsearchHost(), config.getClusterName());
        environment.lifecycle().manage(esClientManager);
    }	

Next we want to change the IndexResource to use the elasticsearch client to list all indexes.
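
The post does not show the updated constructor, but since showIndexes uses a clientManager field, the resource presumably now receives the ESClientManager along these lines (a sketch, not the actual code from the repository):

@Path("/indexes")
@Produces(MediaType.APPLICATION_JSON)
public class IndexResource {
    private final ESClientManager clientManager;

    public IndexResource(ESClientManager clientManager) {
        this.clientManager = clientManager;
    }
}

The showIndexes method then asks elasticsearch for the status of all indexes: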

    public List<Index> showIndexes() {
        IndicesStatusResponse indices = clientManager.obtainClient().admin().indices().prepareStatus().get();

        List<Index> result = new ArrayList<>();
        for (String key : indices.getIndices().keySet()) {
            Index index = new Index();
            index.setName(key);
            result.add(index);
        }
        return result;
    }

Now we can browse to http://localhost:8080/indexes and we get back a nice JSON array. In my case I got this:

[
	{"name":"logstash-tomcat-2014.05.02"},
	{"name":"mymusicnested"},
	{"name":"kibana-int"},
	{"name":"playwithip"},
	{"name":"logstash-tomcat-2014.05.08"},
	{"name":"mymusic"}
]
Creating a better view

Having this REST interface with JSON documents is nice, but not if you are a human like me (well, kind of). So let us add some AngularJS magic to create a slightly better view. The following page could of course also be created with simpler view technologies, but I want to demonstrate what you can do with Dropwizard.

First we make it possible to use Freemarker as a template engine. To make this work we need two additional dependencies: dropwizard-views and dropwizard-views-freemarker. The first step is a view class that knows which Freemarker template to load and provides the fields that your template can read. In our case we want to expose the cluster name.

public class HomeView extends View {
    private final String clusterName;

    protected HomeView(String clusterName) {
        super("home.ftl");
        this.clusterName = clusterName;
    }

    public String getClusterName() {
        return clusterName;
    }
}

Then we have to create the Freemarker template, which looks like the following code block:

<#-- @ftlvariable name="" type="nl.gridshore.dwes.HomeView" -->
<html ng-app="myApp">
<head>
    <title>DWAS</title>
</head>
<body ng-controller="IndexCtrl">
<p>Underneath a list of indexes in the cluster <strong>${clusterName?html}</strong></p>

<div ng-init="initIndexes()">
    <ul>
        <li ng-repeat="index in indexes">{{index.name}}</li>
    </ul>
</div>

<script src="/assets/js/angular-1.2.16.min.js"></script>
<script src="/assets/js/app.js"></script>
</body>
</html>

By default you put these templates in the resources folder, using the same sub folders as your view class has for its package. If you look closely you will see some AngularJS code; more on that later. First we need to map a URL to the view. This is done with a resource class: the following code block shows the HomeResource class that maps “/” to the HomeView.

@Path("/")
@Produces(MediaType.TEXT_HTML)
public class HomeResource {
    private String clusterName;

    public HomeResource(String clusterName) {
        this.clusterName = clusterName;
    }

    @GET
    public HomeView goHome() {
        return new HomeView(clusterName);
    }
}

Notice that we now configure it to produce text/html. The goHome method is annotated with @GET, so each GET request to the path “/” is mapped to the HomeView class. Now we need to tell Jersey about this mapping; that is done in the runner class.

final HomeResource homeResource = new HomeResource(config.getClusterName());
environment.jersey().register(homeResource);
Using assets

The final part I want to show is how to use the assets bundle from Dropwizard to map a folder, “/assets”, to part of the URL space. To use this bundle you have to add the following dependency in Maven: dropwizard-assets. Then we can easily map the assets folder in our resources folder to the web assets folder.

    @Override
    public void initialize(Bootstrap<DWESConfiguration> dwesConfigurationBootstrap) {
        dwesConfigurationBootstrap.addBundle(new ViewBundle());
        dwesConfigurationBootstrap.addBundle(new AssetsBundle("/assets/", "/assets/"));
    }

That is it; now you can load the AngularJS javascript file. My very basic sample has one Angular controller. This controller uses the $http service to call our /indexes URL, and the result is used to show the indexes in a list view.

// The myApp module must be created before the controller can be
// registered on it; the template's ng-app="myApp" refers to it.
var myApp = angular.module('myApp', []);

myApp.controller('IndexCtrl', function ($scope, $http) {
    $scope.indexes = [];

    $scope.initIndexes = function () {
        $http.get('/indexes').success(function (data) {
            $scope.indexes = data;
        });
    };
});

And the result

the very basic screen showing the indexes

Concluding

This was my first go at using Dropwizard, and I must admit I like what I have seen so far. I am not sure whether I would create a big application with it; on the other hand, it is really structured. Before moving on I would need to read a bit more about the library and check all of its options. There is a lot more possible than what I have shown you here.

The post Using Dropwizard in combination with Elasticsearch appeared first on Gridshore.

Categories: Architecture, Programming