Software Development Blogs: Programming, Software Testing, Agile, Project Management


Feed aggregator

I/O session: Location and Proximity Superpowers: Eddystone + Google Beacon Platform

Google Code Blog - Mon, 07/25/2016 - 19:10

Originally posted on Geo Developers blog

Bluetooth beacons mark important places and objects in a way that your phone understands. Last year, we introduced the Google beacon platform including Eddystone, Nearby Messages and the Proximity Beacon API that helps developers build beacon-powered proximity and location features in their apps.
Since then, we’ve learned that when deployment of physical infrastructure is involved, it’s important to get the best possible value from your investment. That’s why the Google beacon platform works differently from the traditional approach.
We don’t think of beacons as only pointing to a single feature in an app, or a single web resource. Instead, the Google beacon platform enables extensible location infrastructure that you can manage through your Google Developer project and reuse many times. Each beacon can take part in several different interactions: through your app, through other developers’ apps, through Google services, and the web. All of this functionality works transparently across Eddystone-UID and Eddystone-EID -- because using our APIs means you never have to think about monitoring for the individual bytes that a beacon is broadcasting.
For example, we’re excited that the City of Amsterdam has adopted Eddystone and the newly released publicly visible namespace feature for the foundation of their open beacon network. Or, through Nearby Notifications, Eddystone and the Google beacon platform enable explorers of the BFG Dream Jar Trail to discover cloud-updateable content in Dream Jars across London.
To make getting started as easy as possible we’ve provided a set of tools to help developers, including links to beacon manufacturers that can help you with Eddystone, Beacon Tools (for Android and iOS), the Beacon Dashboard, a codelab and of course our documentation. And, if you were not able to attend Google I/O in person this year, you can watch my session, Location and Proximity Superpowers: Eddystone + Google Beacon Platform. We can’t wait to see what you build!
About Peter: I am a Product Manager for the Google beacon platform, including the open beacon format Eddystone, and Google's cloud services that integrate beacon technology with first and third party apps. When I’m not working at Google I enjoy taking my dog, Oscar, for walks on Hampstead Heath.
Categories: Programming

SE-Radio Episode 263: Camille Fournier on Real-World Distributed Systems

Stefan Tilkov talks to Camille Fournier about the challenges developers face when building distributed systems. Topics include the definition of a distributed system, whether developers can avoid building them at all, and what changes occur once they choose to. They also talk about the role distributed consensus tools such as Apache Zookeeper play, and whether […]
Categories: Programming

Scrum Day Europe 2016

Xebia Blog - Mon, 07/25/2016 - 10:50
During the 5th edition of Scrum Day Europe, Laurens and I facilitated a workshop on how to “Add Visual Flavor to Your Organization Transformation with Videoscribe.” The theme of the conference, “The Next Iteration,” was all about the future of Scrum. We wanted to tie our workshop into the theme of the conference, so we had

The Raspberry Pi Has Revolutionized Emulation

Coding Horror - Jeff Atwood - Sun, 07/24/2016 - 23:12

Every geek goes through a phase where they discover emulation. It's practically a rite of passage.

I think I spent most of my childhood – and a large part of my life as a young adult – desperately wishing I was in a video game arcade. When I finally obtained my driver's license, my first thought wasn't about the girls I would take on dates, or the road trips I'd take with my friends. Sadly, no. I was thrilled that I could drive myself to the arcade any time I wanted.

My two arcade emulator builds in 2005 satisfied my itch thoroughly. I recently took my son Henry to the California Extreme expo, which features almost every significant pinball and arcade game ever made, live and in person and real. He enjoyed it so much that I found myself again yearning to share that part of our history with my kids – in a suitably emulated, arcade form factor.

Down, down the rabbit hole I went again:

I discovered that emulation builds are so much cheaper and easier now than they were when I last attempted this a decade ago. Here's why:

  1. The ascendance of Raspberry Pi has single-handedly revolutionized the emulation scene. The Pi is now on version 3, which adds critical WiFi and Bluetooth functionality on top of additional speed. It's fast enough to emulate N64 and PSX and Dreamcast reasonably, all for a whopping $35. Just download the RetroPie bootable OS on a $10 32GB SD card, slot it into your Pi, and … well, basically you're done. The distribution comes with some free games on it. Add additional ROMs and game images to taste.

  2. Chinese all-in-one JAMMA cards are available everywhere for about $90. Pandora's Box is one "brand". These things are an entire 60-in-1 to 600-in-1 arcade on a board, with an ARM CPU and built-in ROMs and everything … probably completely illegal and unlicensed, of course. You could buy some old broken down husk of an arcade game cabinet, anything at all as long as it's a JAMMA compatible arcade game – a standard introduced in 1985 – with working monitor and controls. Plug this replacement JAMMA box in, and bam: you now have your own virtual arcade. Or you could build or buy a new JAMMA compatible cabinet; there are hundreds out there to choose from.

  3. Cheap, quality IPS arcade size LCDs. The CRTs I used in 2005 may have been truer to old arcade games, but they were a giant pain to work with. They're enormous, heavy, and require a lot of power. Viewing angle and speed of refresh are rather critical for arcade machines, and both are largely solved problems for LCDs at this point, which are light, easy to work with, and sip power for $100 or less.

Add all that up – it's not like the price of MDF or arcade buttons and joysticks has changed substantially in the last decade – and what we have today is a console and arcade emulation wonderland! If you'd like to go down this rabbit hole with me, bear in mind that I've just started, but I do have some specific recommendations.

Get a Raspberry Pi starter kit. I recommend this particular starter kit, which includes the essentials: a clear case, heatsinks – you definitely want small heatsinks on your 3, as it dissipates almost 4 watts under full load – and a suitable power adapter. That's $50.

Get a quality SD card. The primary "drive" on your Pi will be the SD card, so make it a quality one. Based on these excellent benchmarks, I recommend the Sandisk Extreme 32GB or Samsung Evo+ 32GB models for best price to performance ratio. That'll be $15, tops.

Download and install the bootable RetroPie image on your SD card. It's amazing how far this project has come since 2013; it is now about as close to plug and play as it gets for free, open source software. The install is, dare I say … "easy"?

Decide how much you want to build. At this point you have a fully functioning emulation brain for well under $100 which is capable of playing literally every significant console and arcade game created prior to 1997. Your 1985 self is probably drunk with power. It is kinda awesome. Stop doing the Safety Dance for a moment and ask yourself these questions:

  • What controls do you plan to plug in via the USB ports? This will depend heavily on which games you want to play. Beyond the absolute basics of joystick and two buttons, there are Nintendo 64 games (think analog stick(s) required), driving games, spinner and trackball games, multiplayer games, yoke control games (think Star Wars), virtual gun games, and so on.

  • What display do you plan to plug in via the HDMI port? You could go with a tiny screen and build a handheld emulator; the Pi is certainly small enough. Or you could have no display at all, and jack in via HDMI to any nearby display for whatever gaming jamboree might befall you and your friends. I will say that, for whatever size you build, more display is better. Absolutely go as big as you can in the allowed form factor, though the Pi won't effectively use much more than a 1080p display maximum.

  • How much space do you want to dedicate to the box? Will it be portable? You could go anywhere from ultra-minimalist – a control box you can plug into any HDMI screen with a wireless controller – to a giant 40" widescreen stand up arcade machine with room for four players.

  • What's your budget? We've only spent under $100 at this point, and great screens and new controllers aren't a whole lot more, but sometimes you want to build from spare parts you have lying around, if you can.

  • Do you have the time and inclination to build this from parts? Or do you prefer to buy it pre-built?

These are all your calls to make. You can get some ideas from the pictures I posted at the top of this blog post, or search the web for "Raspberry Pi Arcade" for lots of other ideas.

As a reasonable all-purpose starting point, I recommend the Build-Your-Own-Arcade kits from Retro Built Games, which range from $330 for the full kit down to $90 for just the wood case.

You could also buy the arcade controls alone for $75, and build out (or buy) a case to put them in.

My "mainstream" recommendation is a bartop arcade. It uses a common LCD panel size in the typical horizontal orientation, it's reasonably space efficient and somewhat portable, while still being comfortably large enough for a nice big screen with large speakers gameplay experience, and it supports two players if that's what you want. That'll be about $100 to $300 depending on options.

I remember spending well over $1,500 to build my old arcade cabinets. I'm excited that it's no longer necessary to invest that much time, effort or money to successfully revisit our arcade past.

Thanks largely to the Raspberry Pi 3 and the RetroPie project, this is now a simple Maker project you can (and should!) take on in a weekend with a friend or family. For a budget of $100 to $300 – maybe $500 if you want to get extra fancy – you can have a pretty great classic arcade and classic console emulation experience. That's way better than I was doing in 2005, even adjusting for inflation.

[advertisement] At Stack Overflow, we put developers first. We already help you find answers to your tough coding questions; now let us help you find your next job.
Categories: Programming

SPaMCAST 404 – Ryan Ripley, The Business of Agile


http://www.spamcast.net

Listen Now
Subscribe on iTunes
Check out the podcast on Google Play Music

Software Process and Measurement Cast 404 features our interview with Ryan Ripley. We discussed The Business of Agile: Better, Faster, Cheaper at Agile. We discussed why having the answer for whether Agile is better, faster and cheaper is still important in the business world. Along the way we wrestled with the concept of value and why having value sooner is not the same as going fast.

Ryan Ripley has worked on agile teams for the past 10 years in development, scrum master, and management roles. He’s worked at various Fortune 500 companies in the medical device, wholesale, and financial services industries.

Ryan is great at taking tests and holds the PMI-ACP, PSM I, PSM II, PSE, PSPO I, PSD I, CSM and CSPO agile certifications.

Ryan lives in Indiana with his wife Kristin and their three children.

He blogs at ryanripley.com and hosts the Agile for Humans podcast.

You can also follow Ryan on twitter: @ryanripley

Re-Read Saturday News

This week we continue our re-read of Kent Beck’s XP Explained, Second Edition with a discussion of Chapters 10 and 11. It is great to see the concepts we explored when we re-read Goldratt’s The Goal come back to roost. This week we focus on roles, the definition of team, flow and more flow.

Use the link to XP Explained in the show notes when you buy your copy to read along to support both the blog and podcast. Visit the Software Process and Measurement Blog (www.tcagley.wordpress.com) to catch up on past installments of Re-Read Saturday.

Next SPaMCAST

The next Software Process and Measurement Cast will feature our essay on productivity. A lot of people would tell you productivity does not matter or that discussing productivity in today’s Agile world is irrational. They are wrong. Productivity is about jobs. We will also have columns from the QA Corner and from Jon Quigley. I think 405 might be just a bit controversial.

Shameless Ad for my book!

Mastering Software Project Management: Best Practices, Tools and Techniques co-authored by Murali Chematuri and myself and published by J. Ross Publishing. We have received unsolicited reviews like the following: “This book will prove that software projects should not be a tedious process, for you or your team.” Support SPaMCAST by buying the book here. Available in English and Chinese.


Categories: Process Management


Extreme Programming Explained, Second Edition: Week 6

XP Explained

This week we tackle teams in XP and why XP works based on the Theory of Constraints in Extreme Programming Explained, Second Edition (2005). The two chapters are linked by the idea that work is delivered most effectively when teams or organizations achieve a consistent flow.

Chapter 10 – The Whole XP Team

The principle of flow, as described in our re-read of Goldratt’s The Goal, holds that more value is created when a system achieves a smooth and steady stream of output. In order to achieve a state of flow, everyone on the team needs to be linked to the work to reduce delay and friction between steps. Implementing the steps necessary to address complex work within a team is often at odds with how waterfall projects break work down based on specialties. Unless the barriers between specialties are broken down, it is hard to get people to agree that you can work incrementally in small chunks versus specialty-based phases such as planning, analysis, design and more.

Every “specialty” needs to understand its role in XP.

Testers – XP assumes that programmers using XP take on the responsibility for catching unit-level mistakes. XP uses the concept of test-first programming. In test-first programming, the team begins each cycle (sprint in Scrum) by writing tests that will fail until the code is written. Once the tests are written and executed to prove they will fail, the team writes the code and runs the tests until they pass. It is at least a partial definition of done. As the team uncovers new details, new tests will be specified and incorporated into the team’s test suite. When testers are not directly involved in writing and executing tests, they can work on extending automated testing.
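
To make the test-first rhythm concrete, here is a minimal sketch (not from the book; the word_count function and the use of Python’s unittest module are an invented example): the tests are written and run first so the team sees them fail, and only then is just enough code written to make them pass.

import unittest

# Step 1: write the tests first. Running them now fails, because
# word_count does not exist yet; seeing that failure is the point.
class WordCountTest(unittest.TestCase):
    def test_counts_words_separated_by_whitespace(self):
        self.assertEqual(word_count("extreme programming explained"), 3)

    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)

# Step 2: write just enough code to make the tests pass, then re-run them.
def word_count(text):
    return len(text.split())

if __name__ == "__main__":
    unittest.main()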

Interaction designers – Interaction designers work with customers to write and clarify stories. Interaction designers deliver analysis of actual usage of the system to decide what the system needs to do next. The interaction designer in XP would also encompass the UX and UI designer roles as they have evolved since XP Explained was written and updated. The designer tends to work a bit in front of the developers to reduce potential delays.

Architects – Architects help the team to keep the big picture in mind as development progresses in small incremental steps. Similar to the interaction designer, the architect evolves the big picture just enough ahead of the development team to provide direction (SAFe calls this the architectural runway). Evolving the architecture in small steps and gathering feedback from incremental system testing as development progresses reduces the risk that the project will wander off track.

Project managers – Project managers (PM) facilitate communication both inside the team and between customers, suppliers and the rest of the organization. Beck suggests that the PMO act as the team historians. Project managers keep the team “plan” synchronized with reality, based on how the team is performing and on the world happening outside the team.

Product managers – Product managers write stories, pick themes and stories for the quarterly cycle, pick stories in the weekly cycle, answer questions as development progresses and help when new information is uncovered. The product manager helps the whole team prioritize work. (Note: this is different from the concept of the product owner in Scrum.) The product manager should help the team focus on pieces of work that allow the system to be whole at the end of every cycle.

Executives – The executives’ role in XP is to provide an environment for a team so they have courage, confidence, and accountability. Beck suggests that executives trust the metrics. The first metric is the number of defects found after development. The fewer the better. The second metric that executives should leverage to build trust in XP is the time lag between idea inception and when the idea begins generating revenue. This metric is also known as “concept to cash” (faster is better).

Technical writers – In XP, the technical writer role generates feedback by asking the question, “How can I explain that?” The tech writer can also help to create a closer relationship with users as they help them to learn about the product, listen to their feedback and then to address any confusion between the development team and the user community. Embedding the tech writer role into the XP team allows the team to get feedback on a more timely basis, rather than waiting until much later in the development cycle.

Users – Users help write user stories, provide business intelligence and make business domain decisions as development progresses. Users must be able to speak for the larger business community; they need to command a broad consensus for the decisions they make. If users can’t command a broad consensus from the business community for the decisions they make, they should let the team work on something else first while they get their ducks in a row.

Programmers – Programmers estimate stories and tasks, break stories down into smaller pieces, write tests, write code, run tests, automate tedious development processes, and gradually improve the design of the system. As with all roles in XP, the most valuable developers combine specialization with a broad set of capabilities.

Human resources – Human resources needs to find a way to hire the right individuals and then to evaluate teams. Evaluating teams requires changing the review process to focus on teams, rather than on individuals.

XP addresses roles that most discussions of Scrum have ignored, but that are needed to deliver a project. Roles should not be viewed as a rigid set of specialties that every project requires at every moment. Teams and organizations need to add and subtract roles as needed. XP team members need to have the flexibility to shift roles as needed to maximize the flow of work.

Chapter 11 – The Theory of Constraints

In order to find opportunities for improvement in the development process using XP, begin by determining which problems are development problems and which are caused outside of the development process. This first step is important because XP is only focused on the software development process (areas like marketing are out of scope). One approach for improving software development is to look at the throughput of the software development process. The theory of constraints (ToC) is a systems thinking approach for process improvement. A simple explanation of ToC is that the output of any system or process is limited by a very small number of constraints within the process. Using the ToC to measure the throughput of the development process (a system from the point of view of the ToC) provides the basis for identifying constraints, making a change and then finding the next constraint. Using the ToC as an improvement approach maximizes the output of the overall development process rather than focusing on the local maximization of individual steps.
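
The core ToC idea, that throughput is set by the constraint, is easy to see with a toy example (the stages and weekly capacities below are invented for illustration, not from the book): improving any step other than the bottleneck leaves overall throughput unchanged, while elevating the constraint simply exposes the next one.

# Purely illustrative: a three-stage delivery pipeline where each stage
# can process a fixed number of stories per week (invented numbers).
stages = {"analysis": 10, "development": 4, "testing": 7}

# Throughput of the whole process is set by the slowest stage (the constraint).
print(min(stages.values()), "stories/week; constraint:", min(stages, key=stages.get))

# Improving a non-constraint stage does not change overall throughput ...
stages["analysis"] = 20
print(min(stages.values()), "stories/week after speeding up analysis")

# ... elevating the constraint does, and then the next constraint appears.
stages["development"] = 8
print(min(stages.values()), "stories/week; new constraint:", min(stages, key=stages.get))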

The theory of constraints is not a perfect fit for software development because software development is more influenced by people and is therefore more variable than the mechanical transformation of raw materials. An over-reliance on concepts like the ToC will tend to overemphasize process and engineering solutions over people solutions, such as a team approach. This is a caution rather than a warning to avoid process approaches. In addition, systems approaches can highlight issues outside of development’s span of control. Getting others in the organization to recognize issues they are not ready to accept or address can cause conflict. Beck ends the chapter with the advice, “If you don’t have executive sponsorship, be prepared to do a better job yourself without recognition or protection.” Unsaid is that you will also have to be prepared for the consequences of your behavior.

Previous installments of Extreme Programming Explained, Second Edition (2005) on Re-read Saturday:

Extreme Programming Explained: Embrace Change Second Edition Week 1, Preface and Chapter 1

Week 2, Chapters 2 – 3

Week 3, Chapters 4 – 5

Week 4, Chapters 6 – 7

Week 5, Chapters 8 – 9

 


Categories: Process Management

Stuff The Internet Says On Scalability For July 22nd, 2016

Hey, it's HighScalability time:


It's not too late, London. There's still time to make this happen

 

If you like this sort of Stuff then please support me on Patreon.
  • 40%: energy Google saves in datacenters using machine learning; 2.3: times more energy knights in armor spend than when walking; 1000x: energy efficiency of 3D carbon nanotubes over silicon chips; 176,000: searchable documents from the Founding Fathers of the US; 93 petaflops: China’s Sunway TaihuLight; $800m: Azure's quarterly revenue; 500 Terabits per square inch: density when storing a bit with an atom; 2 billion: Uber rides; 46 months: jail time for accessing a database; 

  • Quotable Quotes:
    • Lenin: There are decades where nothing happens; and there are weeks where decades happen.
    • Nitsan Wakart: I have it from reliable sources that incorrectly measuring latency can lead to losing ones job, loved ones, will to live and control of bowel movements.
    • Margaret Hamilton: part of the culture on the Apollo program “was to learn from everyone and everything, including from that which one would least expect.”
    • @DShankar: Basically @elonmusk plans to compete with -all vehicle manufacturers (cars/trucks/buses) -all ridesharing companies -all utility companies
    • @robinpokorny: ‘Number one reason for types is to get idea what the hell is going on.’ @swannodette at #curryon
    • Dan Rayburn: Some have also suggested that the wireless carriers are seeing a ton of traffic because of Pokemon Go, but that’s not the case. Last week, Verizon Wireless said that Pokemon Go makes up less than 1% of its overall network data traffic.
    • @timbaldridge: When people say "the JVM is slow" I wonder to what dynamic, GC'd, runtime JIT'd, fully parallel, VM they are comparing it to.
    • @papa_fire: “Burnout is when long term exhaustion meets diminished interest.”  May be the best definition I’ve seen.
    • Sheena Josselyn: Linking two memories was very easy, but trying to separate memories that were normally linked became very difficult
    • @mstine: if your microservices must be deployed as a complete set in a specific order, please put them back in a monolith and save yourself some pain
    • teaearlgraycold: Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.
    • Erik Duindam:  I bake minimum viable scalability principles into my app.
    • Hassabis: It [DeepMind] controls about 120 variables in the data centers. The fans and the cooling systems and so on, and windows and other things. They were pretty astounded.
    • @WhatTheFFacts: In 1989, a new blockbuster store was opening in America every 17 hours.
    • praptak: It [SRE] changes the mindset from "Failure? Just log an error, restore some 'good'-ish state and move on to the next cool feature." towards "New cool feature? What possible failures will it cause? How about improving logging and monitoring on our existing code instead?"
    • plusepsilon: I transitioned from using Bayesian models in academia to using machine learning models in industry. One of the core differences in the two paradigms is the "feel" when constructing models. For a Bayesian model, you feel like you're constructing the model from first principles. You set your conditional probabilities and priors and see if it fits the data. I'm sure probabilistic programming languages facilitated that feeling. For machine learning models, it feels like you're starting from the loss function and working back to get the best configuration

  • Isn't it time we admit Dark Energy and Dark Matter are simply optimizations in the algorithms running the sim of our universe? Occam's razor. Even the Eldritch engineers of our creation didn't have enough compute power to simulate an entire universe. So they fudged a bit. What's simpler than making 90 percent of matter in our galaxy invisible?

  • Do you have one of these? Google has a Head of Applied AI.

  • Uber with a great two-article series on their stack. Part uno. Part deux: Our business runs on a hybrid cloud model, using a mix of cloud providers and multiple active data centers...We currently use Schemaless (built in-house on top of MySQL), Riak, and Cassandra...We use Redis for both caching and queuing. Twemproxy provides scalability of the caching layer without sacrificing cache hit rate via its consistent hashing algorithm. Celery workers process async workflow operations using those Redis instances...for logging, we use multiple Kafka clusters...This data is also ingested in real time by various services and indexed into an ELK stack for searching and visualizations...We use Docker containers on Mesos to run our microservices with consistent configurations scalably...Aurora for long-running services and cron jobs...Our service-oriented architecture (SOA) makes service discovery and routing crucial to Uber’s success...we’re moving to a pub-sub pattern (publishing updates to subscribers). HTTP/2 and SPDY more easily enable this push model. Several poll-based features within the Uber app will see a tremendous speedup by moving to push....we’re prioritizing long-term reliability over debuggability...Phabricator powers a lot of internal operations, from code review to documentation to process automation...We search through our code on OpenGrok...We built our own internal deployment system to manage builds. Jenkins does continuous integration. We combined Packer, Vagrant, Boto, and Unison to create tools for building, managing, and developing on virtual machines. We use Clusto for inventory management in development. Puppet manages system configuration...We use an in-house documentation site that autobuilds docs from repositories using Sphinx...Most developers run OSX on their laptops, and most of our production instances run Linux with Debian Jessie...At the lower levels, Uber’s engineers primarily write in Python, Node.js, Go, and Java...We rip out and replace older Python code as we break up the original code base into microservices. An asynchronous programming model gives us better throughput. And lots more.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Mahout/Hadoop: org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4

Mark Needham - Fri, 07/22/2016 - 14:55

I’ve been working my way through Dragan Milcevski’s mini tutorial on using Mahout to do content based filtering on documents and reached the final step where I needed to read in the generated item-similarity files.

I got the example compiling by using the following Maven dependency:

<dependency>
      <groupId>org.apache.mahout</groupId>
      <artifactId>mahout-core</artifactId>
      <version>0.9</version>
</dependency>

Unfortunately when I ran the code I ran into a version incompatibility problem:

Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
	at org.apache.hadoop.ipc.Client.call(Client.java:1113)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
	at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
	at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.ipc.RPC.checkVersion(RPC.java:422)
	at org.apache.hadoop.hdfs.DFSClient.createNamenode(DFSClient.java:183)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:281)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:245)
	at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:124)
	at com.markhneedham.mahout.Similarity.getDocIndex(Similarity.java:86)
	at com.markhneedham.mahout.Similarity.main(Similarity.java:25)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

Version 0.9.0 of mahout-core was published in early 2014 so I expect it was built against an earlier version of Hadoop than I’m using (2.7.2).

I tried updating the Hadoop dependencies referenced in the stack trace, to no avail.

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.2</version>
</dependency>
 
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.7.2</version>
</dependency>

When stepping through the stack trace I noticed that my program was still using an old version of hadoop-core, so with one last throw of the dice I decided to try explicitly excluding that:

<dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-core</artifactId>
    <version>0.9</version>
 
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>

And amazingly it worked. Now, finally, I can see how similar my documents are!

Categories: Programming

Hadoop: DataNode not starting

Mark Needham - Fri, 07/22/2016 - 14:31

In my continued playing with Mahout I eventually decided to give up using my local file system and use a local Hadoop instead since that seems to have much less friction when following any examples.

Unfortunately all my attempts to upload any files from my local file system to HDFS were being met with the following exception:

java.io.IOException: File /user/markneedham/book2.txt could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1448)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:690)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:342)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1350)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1346)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1344)
 
at org.apache.hadoop.ipc.Client.call(Client.java:905)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
at $Proxy0.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy0.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:928)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:811)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:427)

I eventually realised, from looking at the output of jps, that the DataNode wasn’t actually starting up which explains the error message I was seeing.

A quick look at the log files showed what was going wrong:


/usr/local/Cellar/hadoop/2.7.1/libexec/logs/hadoop-markneedham-datanode-marks-mbp-4.zte.com.cn.log

2016-07-21 18:58:00,496 WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: Incompatible clusterIDs in /usr/local/Cellar/hadoop/hdfs/tmp/dfs/data: namenode clusterID = CID-c2e0b896-34a6-4dde-b6cd-99f36d613e6a; datanode clusterID = CID-403dde8b-bdc8-41d9-8a30-fe2dc951575c
2016-07-21 18:58:00,496 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to /0.0.0.0:8020. Exiting.
java.io.IOException: All specified directories are failed to load.
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:477)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1361)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1326)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:316)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:801)
        at java.lang.Thread.run(Thread.java:745)
2016-07-21 18:58:00,497 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to /0.0.0.0:8020
2016-07-21 18:58:00,602 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
2016-07-21 18:58:02,607 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2016-07-21 18:58:02,608 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2016-07-21 18:58:02,610 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:

I’m not sure how my clusterIDs got out of sync, although I expect it’s because I reformatted HDFS without realising at some stage. There are other ways of solving this problem but the quickest for me was to just nuke the DataNode’s data directory which the log file told me sits here:

sudo rm -r /usr/local/Cellar/hadoop/hdfs/tmp/dfs/data/current

I then re-ran the hstart script that I stole from this tutorial and everything, including the DataNode this time, started up correctly:

$ jps
26736 NodeManager
26392 DataNode
26297 NameNode
26635 ResourceManager
26510 SecondaryNameNode

And now I can upload local files to HDFS again. #win!

Categories: Programming

Customer Satisfaction Metrics and Quality

On a scale of fist to five, I’m at a ten.

Quality is partly about the number of defects delivered in a piece of software and partly about how the stakeholders and customers experience the software. Experience is typically measured as customer satisfaction. Customer satisfaction is a measure of how products and services supplied by a company meet or surpass customer expectations. Customer satisfaction is impacted by all three aspects of software quality: functional (what the software does), structural (whether the software meets standards) and process (how the code was built).

Surveys can be used to collect customer and team level data. Satisfaction measures whether products, services, behaviors or the work environment meet expectations.

  1. Asking: Asking the question, “Are you happy (or some variant of the word happy) with the results of XYZ project?” is an assessment of satisfaction. The answer to that simple question will indicate whether the people you are asking are “happy”, or whether you need to ask more questions. Asking is a powerful tool and can be as simple as asking a single question of a team or group of customers, or as complicated as using multifactor surveys. Even though just asking whether someone is satisfied and then listening to the answer can provide powerful information, the size of projects or the complexity of software being delivered often dictates a more formal approach, which means that surveys are often used to collect satisfaction data. Product or customer satisfaction is typically measured after a release or on a periodic basis.

    Fist to Five, a simple asking technique: Agile teams measure team level satisfaction using simple techniques such as Fist-to-Five. Fist-to-five is a simple asking technique in which team members are asked to vote on how satisfied they are by flashing a number of fingers all at the same time. Showing five fingers means you are very satisfied and a fist (no fingers) means you are unsatisfied. This form of measurement can be used to assess team satisfaction on a daily basis. (A simple video explanation) I generally post an average score on the wall in the team room in order to track the team’s satisfaction trend.

  2. The Net Promoter metric is a more advanced form of a customer satisfaction measure than simple asking, but less complicated than the multifactor indexes that are sometimes generated. Promoters are people who are so satisfied that they will actively spread knowledge to others. Generating the metric begins by asking “How likely are you to recommend the product or organization being measured to a friend or colleague?” I have seen many variants of the net promoter question, but at the heart of it, the question is whether the respondent will recommend the service, product, team or organization. The response is scored using a scale from 1 – 10. Answers of 10 or 9 represent promoters, 7 or 8 are neutral and all other answers represent detractors. The score is calculated using the following formula: (# of Promoters - # of Detractors) / (Total Promoters + Neutral + Detractors) x 100. If ten people responded to a net promoter question and 5 were promoters, 3 neutral and 2 detractors, the net promoter score is 30 ((5 - 2) / 10 x 100). Over time the goal is to improve the net promoter score, which will increase the chance your work will be recommended.
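
The arithmetic is simple enough to script. Here is a small illustrative sketch of the calculation described above (the survey responses are invented), using the 9–10 promoter and 7–8 neutral bands from the paragraph:

def net_promoter_score(responses):
    """Score 1-10 responses: 9-10 are promoters, 7-8 neutral, the rest detractors."""
    promoters = sum(1 for r in responses if r >= 9)
    detractors = sum(1 for r in responses if r <= 6)
    return (promoters - detractors) / len(responses) * 100

# The worked example from above: 5 promoters, 3 neutrals, 2 detractors.
print(net_promoter_score([10, 9, 9, 10, 9, 7, 8, 7, 5, 3]))  # 30.0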

Software quality is a nuanced concept that reflects many factors, some of which are functional, structural or process related. Satisfaction is a reflection of quality from a different perspective than measuring defects or code structure. The essence of customer satisfaction is the very simple question, are you happy with what we delivered? Knowing if the team, stakeholders, and customers are happy with what was delivered or the path that was taken to get to that delivery is often just as important as knowing the number of defects that were delivered.


Categories: Process Management

Quote of the Day

Herding Cats - Glen Alleman - Thu, 07/21/2016 - 19:03

A skeptic will question claims, then embrace the evidence. A denier will question claims, then reject the evidence. - Neil deGrasse Tyson

Think of this whenever there is a conjecture that has no testable evidence of the claim. And think even more when those making the conjectured claim refuse to provide evidence. When that is the case, it is appropriate to ignore the conjecture altogether.

Categories: Project Management

Mahout: Exception in thread “main” java.lang.IllegalArgumentException: Wrong FS: file:/… expected: hdfs://

Mark Needham - Thu, 07/21/2016 - 18:57

I’ve been playing around with Mahout over the last couple of days to see how well it works for content based filtering.

I started following a mini tutorial from Stack Overflow but ran into trouble on the first step:

bin/mahout seqdirectory \
--input file:///Users/markneedham/Downloads/apache-mahout-distribution-0.12.2/foo \
--output file:///Users/markneedham/Downloads/apache-mahout-distribution-0.12.2/foo-out \
-c UTF-8 \
-chunk 64 \
-prefix mah
16/07/21 21:19:20 INFO AbstractJob: Command line arguments: {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[file:///Users/markneedham/Downloads/apache-mahout-distribution-0.12.2/foo], --keyPrefix=[mah], --method=[mapreduce], --output=[file:///Users/markneedham/Downloads/apache-mahout-distribution-0.12.2/foo-out], --startPhase=[0], --tempDir=[temp]}
16/07/21 21:19:20 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/07/21 21:19:20 INFO deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
16/07/21 21:19:20 INFO deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
16/07/21 21:19:20 INFO deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: file:/Users/markneedham/Downloads/apache-mahout-distribution-0.12.2/foo, expected: hdfs://localhost:8020
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:646)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:194)
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106)
	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
	at org.apache.mahout.text.SequenceFilesFromDirectory.runMapReduce(SequenceFilesFromDirectory.java:156)
	at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:90)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:64)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
	at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152)
	at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

I was trying to run the command against the local file system on my laptop, which should have been possible according to the instructions. I couldn’t find any flag that I could pass to Mahout to tell it not to use HDFS, but I eventually stumbled on someone else experiencing a similar problem.

It turns out the last time I was playing around with Hadoop, in late 2015, I’d actually configured that and had completely forgotten. I needed to comment out the following config:

/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/core-site.xml

<property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
</property>

I commented that property out and all was happy with the (Hadoop) world again.

Categories: Programming

Making Release Frictionless, a Business Decision, Part 2

In Part 1, I talked about small stories/chunks of work, checking in all the time so you could build often and see progress. That assumes you know what done means. Project “done” means release criteria. Here are some stories about how I started using release criteria.

Back in the 70s, I worked in a small development group. We had 5 or 6 people, depending on the time of year. We worked alone on our parts of the system. We all integrated into one instrument, but we worked primarily alone. This is back in the days of microcomputers. I wrote assembler, Fortran, or microcode, depending on the part of the system. I still worked on small chunks, “checked in,” as in I made sure I saved my files. No, we had no real version control then.

We had a major release in about 1979 or something like that. I’d been there about 15 months by then. Our customers called the President of the company, complaining about the software. Yes, it was that bad.

Why was it that bad? We had thought we were working towards one goal. Our customers wanted a different goal. If I remember correctly, these were some of the problems (that was a long time ago. I might have forgotten some details.):

  • We did not have a unified approach to how the system asked for information. There was no GUI, but the command line was not consistent.
  • Parts of the system were quite buggy. The calculations were correct, but the information presentation was slow, in the wrong place, or didn’t make sense. Our customers had a difficult time using the system.
  • Some parts of the system were quite slow. Not the instrument, but how the instrument showed the data.
  • The parts didn’t fit together. No one had the responsibility of making sure that the system looked like one system. We all integrated our own parts. No one looked at the whole.

Oops. My boss told us we needed to fix it. I asked the question, “How will we know we are done?” He responded, “When the customers stop calling.” I said, “No, we’re not going to keep shipping more tape to people. What are all the things you want us to do?” He said, “You guys are the software people. You decide.”

I asked my colleagues if it was okay if I developed draft release criteria, so we would know that the release was done. They agreed. I developed them in the next half day, wrote a memo and asked people for a meeting to see if they agreed. (This is in the days before email.)

We met and we changed my strawman criteria to something we could all agree on. We now knew what we had to do. I showed the criteria to my boss. He agreed with them. We worked to the release criteria, sent out a new tape (before the days of disks or CDs!) to all our customers and that project finally finished.

I used the idea of release criteria on every single project since. For me, it’s a powerful idea, to know what done means for the project.

I wrote a release criteria article (see the release criteria search for my release criteria writing) and explained it more in Manage It! Your Guide to Modern, Pragmatic Project Management.

In the 80s, I used it for a project where we did custom machine vision implementations. If I hadn’t, the customer would have continued asking for more features. The customer did anyway, but we could ask for more money every time we changed the release criteria to add more features.

I use release criteria and milestone criteria for any projects and programs longer than three months in duration, so we can see our progress (or lack thereof) earlier, rather than later. To be honest, even if we think the project is only a couple of months, I always ask, “Do we know what done means for this project?” For small projects, I want to make sure we finish and don’t add more to the project. For programs, I want to make sure we all know where we are headed, so we get there.

Here’s how small chunks of work, checking in every day, and release criteria all work together:

  1. Release criteria tell you what done means for the project. Once you know, you can develop scenarios for checking on your “doneness” as often as you like. I like automated tests that we can run with each build. The tests tell us if we are getting closer or farther away from what we want out of our release.
  2. When you work in small chunks, check them in every day and build at least as often as every day, you can check on the build progress. You know if the build is good or not.
  3. If you add the idea of scenarios for testing as you proceed, release becomes a business decision, not a “hardening sprint” or some such.
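
One way to picture points 1 and 3 working together: each release criterion can often be expressed as an automated scenario that runs with every build, so “does this build meet the criteria?” is a question the build answers every day. The sketch below is purely illustrative; the criteria, helper functions, and thresholds are invented, not from any project described here.

import unittest

# Hypothetical helpers standing in for real measurements; on a real project
# these would query the build's test environment and the defect tracker.
def search_response_seconds():
    return 1.4

def open_defects(severity):
    return 0

# Invented release criteria expressed as automated scenarios that run with
# every build. A green run means the criteria are met; releasing externally
# is still a business decision.
class ReleaseCriteria(unittest.TestCase):
    def test_search_responds_within_two_seconds(self):
        self.assertLessEqual(search_response_seconds(), 2.0)

    def test_no_open_severity_one_defects(self):
        self.assertEqual(open_defects(severity=1), 0)

if __name__ == "__main__":
    unittest.main()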

Here’s a little list that might help you achieve friction-less releases:

  1. What do you need to do to make your stories small? If they are not one day, can you pair, swarm, or mob to finish one story in one day? What would you have to change to do so?
  2. If you have curlicue stories, what can you do to make your stories look like the straight line through the architecture?
  3. What can you do to check in all the time? Is it a personal thing you can do, or do you need to ask your entire team to check in all the time? I don’t know how to really succeed at agile without continuous integration. What prevents you from integrating all the time? (Hint, it might be your story size.)
  4. Do you know what done means for this release (interim and project)? Do you have deliverable-based planning to achieve those releases?

Solve these problems and you may find frictionless release possible.

When you make releasing externally a business decision—because you can release internally any time you want—you will find your project proceeds more smoothly, regardless of whether you are agile.

Reminder: If you want to learn how to make your stories smaller or solve some of the problems of non-frictionless releases, join my Practical Product Owner workshop, starting August 23, 2016. You’ll practice on your projects, so you can see maximum business value from the workshop.

Categories: Project Management

The Ultimate Tester: Wrap-Up

Xebia Blog - Thu, 07/21/2016 - 16:14
To everyone who has read all or some of the past blog posts in this series: thank you so much for reading. I hope I have given you some food for thought on where you can improve as a tester (or developer who tests!). In four blog posts, we explored what it takes to become

Making Release Frictionless, a Business Decision, Part 1

Would you like to release your product at any time? I like it when releases are a business decision, not a result of blood, sweat, and tears. It’s possible, and it might not be easy for you. Here are some stories that show how I did it, long ago and more recently.

Story 1: Many years ago, I was a developer on a moderately complex system. There were three of us working together. We used RCS (yes, it was in the ’80s or something like that). I hated that system. Maybe it was our installation of it. I don’t know. All I know is that it was too easy to lock each other out, and not be able to do a darn thing. My approach was to make sure I could check in my work in as few files as possible (Single Responsibility Principle, although I didn’t know it at the time), and to work on small chunks.

I checked in every day at least before I went to lunch, once in the middle of the afternoon, and before I left for the day. I did not do test-first development, and I didn’t check my tests in at the time. It took me a while to learn that lesson. I only checked in working code—at least, it worked on my machine.

We built almost every day. (No, we hadn’t learned that lesson either.) We could release at least once a week, closer to twice a week. Not frictionless, but close enough for our needs.

Story 2: I learned some lessons, and a few years later, I had graduated to SCCS. I still didn’t like it. Merging was not possible for us, so we each worked on our own small stuff. I still worked on small chunks and checked in at least three times a day. This time, I was smarter, and checked in my tests as I wrote code. I still wrote code first and tests second. However, I worked in really small chunks (small functions and the tests that went with them) and checked them in as a unit. The only time I didn’t do that was when it was lunchtime or the end of the day. If I was done with code but not tests, I checked in anyway. (No, I was not perfect.) We all had a policy of checking in all our code every day. That way, someone else could take over if one of us got sick.

Each of us did the same thing. This time, we built whenever we wanted a new system. Often, it was a couple of times a day. We told each other, “Don’t go there. That part’s not done, but it shouldn’t break anything.” We had internal releases at least once a day. We released as a demo once a week to our manager.

After that, I worked at a couple of places with home-grown version control systems that looked a lot like Subversion does now. That was in the late ’80s. I became a project manager and program manager.

Story 3: I was a program manager for a 9-team software program. We’d had trouble in the past getting to the point where we could release. I asked teams to do these things: Work towards a program-wide deliverable (release) every month, and use continuous integration. I said, “I want you to check everything in every day and make sure we always have a working build. I want to be able to see the build work every morning when I arrive.” Seven teams said yes. Two teams said no. I explained to the teams they could work in any way they wanted, as long as they could integrate within 24 hours of seeing everyone else’s code. “No problem, JR. We know what we’re doing.”

Well, those two teams didn’t deliver their parts at the first month milestone. They were pissed when I explained they could not work on any more features until they integrated what they had. Until they had everything working, no new features. (I was pissed, too.)

It took them almost three weeks to integrate their four weeks of work. They finally asked for help and a couple of other guys worked with the teams to untangle their code and check everything in.

I learned the value of continuous integration early. Mostly because I was way too lazy (forgetful? not smart enough?) to be able to keep the details of an entire system in my head for an entire project. I know people who can. I cannot. I used to think it was one of my failings. I now realize many people only think they can keep all the details. They can’t either.

Here’s the technical part of how I got to frictionless releases:

  1. Make the thing you work on small. If you use stories, make the story a one-day or smaller story. I don’t care if the entire team works on it or one person works on it (well, I do care, and that’s a topic for another post), but being able to finish something of value in one day means you can descend into it. You finish it. You come up for air/more work and descend again. You don’t have to remember a ton of stuff that is related to, but not directly part of, this feature.
  2. Use continuous integration. Check in all the time. Now that I write books using Subversion, I check in whenever I have several paragraphs (one chunk) or it’s been an hour. I check that the book builds, and I fix problems right away, while the work is fresh in my mind. It’s one of the ways I can write fast and write well. Our version control systems are much more sophisticated than the ones I used in the early days. I’m not sure I buy automated merge. I prefer to keep the stories small and cohesive. (See this post on curlicue features. Avoid those by implementing by feature.)
  3. Check in all the associated data. I check in automated tests and test data when I write code. I check in bibliographic references when I write books. If you need something else with your work product, create it at the time you create the work. If I were a developer now, I would check in all my unit tests when I check in the code. If I were really smart, I might even check in the tests first, to do TDD. (TDD helps people design, not test.) If I were a tester, I would definitely check in all the automated tests as soon as possible. I could then ask the developers to run those tests to make sure they didn’t make a mistake. I could do the hard-to-define and necessary exploratory testing. (Yes, I did this as a tester.) A small sketch of tests checked in with code follows this list.
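
As a minimal sketch of item 3 in Python (the file names and the discount rule are invented): the test and the code it exercises go into the same small check-in, and if you work test-first, the test file exists before the code does.

# test_discount.py -- checked in together with discount.py as one small chunk
from discount import bulk_discount

def test_no_discount_below_threshold():
    assert bulk_discount(quantity=5, unit_price=10.0) == 50.0

def test_ten_percent_off_at_ten_or_more():
    assert bulk_discount(quantity=10, unit_price=10.0) == 90.0

# discount.py -- small enough to write, test, and check in before lunch
def bulk_discount(quantity: int, unit_price: float) -> float:
    total = quantity * unit_price
    return total * 0.9 if quantity >= 10 else total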

Frictionless releases are not just technical. You have to know what done means for a release. That’s why I started using release criteria back in the 70s. I’ll write a part 2 about release criteria.

Categories: Project Management

Our Answer To the Alert Storm: Introducing Team View Alerts

Xebia Blog - Thu, 07/21/2016 - 11:39
As a Dev or Ops professional, it’s hard to focus on the things that really matter. Applications, systems, tools, and other environments generate notifications at a frequency and volume greater than you can cope with. It's a problem for every Dev and Ops professional. Alerts are used to identify trends, spikes or dips

Neo4j: Cypher – Detecting duplicates using relationships

Mark Needham - Wed, 07/20/2016 - 18:32

I’ve been building a graph of computer science papers on and off for a couple of months, and now that I’ve got a few thousand loaded in, I realised that there are quite a few duplicates.

They’re not duplicates in the sense of multiple entries with the same identifier; rather, they have different identifiers but appear to be the same paper!

e.g. there are a couple of papers titled ‘Authentication in the Taos operating system’:

http://dl.acm.org/citation.cfm?id=174614

http://dl.acm.org/citation.cfm?id=168640

This is the same paper published in two different journals as far as I can tell.

Now in this case it’s quite easy to just do a string similarity comparison of the titles of these papers and realise that they’re identical. I’ve previously used the excellent dedupe library to do this, and there’s also an excellent talk from Berlin Buzzwords 2014 where the author uses locality-sensitive hashing to achieve a similar outcome.
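
As a rough sketch of that kind of title comparison (using Python's built-in difflib here rather than the dedupe library itself):

from difflib import SequenceMatcher

def title_similarity(a, b):
    # ratio in [0, 1]; identical titles score 1.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

print(title_similarity("Authentication in the Taos operating system",
                       "Authentication in the Taos operating system"))  # 1.0
print(title_similarity("Performance of Firefly RPC",
                       "Performance of the Firefly RPC"))               # high, but below 1.0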

However, I was curious whether I could use any of the relationships these papers have to detect duplicates rather than just relying on string matching.

This is what the graph looks like:

[graph model diagram: papers are Resource nodes connected by REFERENCES relationships, with authors connected to papers by AUTHORED relationships]

We’ll start by writing a query to see how many common references the different Taos papers have:

MATCH (r:Resource {id: "168640"})-[:REFERENCES]->(other)
WITH r, COLLECT(other) as myReferences
 
UNWIND myReferences AS reference
OPTIONAL MATCH path = (other)-[:REFERENCES]->(reference)
WITH other, COUNT(path) AS otherReferences, SIZE(myReferences) AS myReferences
WITH other, 1.0 * otherReferences / myReferences AS similarity WHERE similarity > 0.5
 
RETURN other.id, other.title, similarity
ORDER BY similarity DESC
LIMIT 10
╒════════╀═══════════════════════════════════════════╀══════════╕
β”‚other.idβ”‚other.title                                β”‚similarityβ”‚
β•žβ•β•β•β•β•β•β•β•β•ͺ═══════════════════════════════════════════β•ͺ══════════║
β”‚168640  β”‚Authentication in the Taos operating systemβ”‚1         β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚174614  β”‚Authentication in the Taos operating systemβ”‚1         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

This query:

  • picks one of the Taos papers and finds its references
  • finds other papers which reference those same papers
  • calculates a similarity score based on how many common references they have
  • returns papers that have more than 50% of the same references with the most similar ones at the top

I tried it with other papers to see how it fared:

Performance of Firefly RPC

╒════════╀════════════════════════════════════════════════════════════════╀══════════════════╕
β”‚other.idβ”‚other.title                                                     β”‚similarity        β”‚
β•žβ•β•β•β•β•β•β•β•β•ͺ════════════════════════════════════════════════════════════════β•ͺ══════════════════║
β”‚74859   β”‚Performance of Firefly RPC                                      β”‚1                 β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚77653   β”‚Performance of the Firefly RPC                                  β”‚0.8333333333333334β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚110815  β”‚The X-Kernel: An Architecture for Implementing Network Protocolsβ”‚0.6666666666666666β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚96281   β”‚Experiences with the Amoeba distributed operating system        β”‚0.6666666666666666β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚74861   β”‚Lightweight remote procedure call                               β”‚0.6666666666666666β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚106985  β”‚The interaction of architecture and operating system design     β”‚0.6666666666666666β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚77650   β”‚Lightweight remote procedure call                               β”‚0.6666666666666666β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Authentication in distributed systems: theory and practice

╒════════╀══════════════════════════════════════════════════════════╀══════════════════╕
β”‚other.idβ”‚other.title                                               β”‚similarity        β”‚
β•žβ•β•β•β•β•β•β•β•β•ͺ══════════════════════════════════════════════════════════β•ͺ══════════════════║
β”‚121160  β”‚Authentication in distributed systems: theory and practiceβ”‚1                 β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚138874  β”‚Authentication in distributed systems: theory and practiceβ”‚0.9090909090909091β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Sadly it’s not as simple as finding 100% matches on references! I expect later revisions of a paper add more content and therefore additional references.

What if we look for author similarity as well?

MATCH (r:Resource {id: "121160"})-[:REFERENCES]->(other)
WITH r, COLLECT(other) as myReferences

UNWIND myReferences AS reference
OPTIONAL MATCH path = (other)-[:REFERENCES]->(reference)
WITH r, other, COUNT(path) AS otherReferences, SIZE(myReferences) AS myReferences
WITH r, other, 1.0 * otherReferences / myReferences AS referenceSimilarity
WHERE referenceSimilarity > 0.5

MATCH (r)<-[:AUTHORED]-(author)
WITH r, other, referenceSimilarity, COLLECT(author) AS myAuthors

UNWIND myAuthors AS author
OPTIONAL MATCH path = (other)<-[:AUTHORED]-(author)
WITH other, referenceSimilarity, COUNT(path) AS otherAuthors, SIZE(myAuthors) AS myAuthors
WITH other, referenceSimilarity, 1.0 * otherAuthors / myAuthors AS authorSimilarity
WHERE authorSimilarity > 0.5

RETURN other.id, other.title, referenceSimilarity, authorSimilarity
ORDER BY (referenceSimilarity + authorSimilarity) DESC
LIMIT 10
╒════════╀══════════════════════════════════════════════════════════╀═══════════════════╀════════════════╕
β”‚other.idβ”‚other.title                                               β”‚referenceSimilarityβ”‚authorSimilarityβ”‚
β•žβ•β•β•β•β•β•β•β•β•ͺ══════════════════════════════════════════════════════════β•ͺ═══════════════════β•ͺ════════════════║
β”‚121160  β”‚Authentication in distributed systems: theory and practiceβ”‚1                  β”‚1               β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚138874  β”‚Authentication in distributed systems: theory and practiceβ”‚0.9090909090909091 β”‚1               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
╒════════╀══════════════════════════════╀═══════════════════╀════════════════╕
β”‚other.idβ”‚other.title                   β”‚referenceSimilarityβ”‚authorSimilarityβ”‚
β•žβ•β•β•β•β•β•β•β•β•ͺ══════════════════════════════β•ͺ═══════════════════β•ͺ════════════════║
β”‚74859   β”‚Performance of Firefly RPC    β”‚1                  β”‚1               β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚77653   β”‚Performance of the Firefly RPCβ”‚0.8333333333333334 β”‚1               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

I’m sure I could find other papers where neither of these similarities works well, but it’s an interesting start.

I think the next step is to build up a training set of pairs of documents that are and aren’t similar to each other. We could then train a classifier to determine whether two documents are identical.
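
As a rough sketch of how that could look (assuming scikit-learn and a small hand-labelled set of pairs; the feature values below are made up for illustration):

# Hypothetical sketch: classify candidate pairs as duplicate / not duplicate.
from sklearn.linear_model import LogisticRegression

# Features per labelled pair: [title_similarity, reference_similarity, author_similarity]
X = [
    [1.00, 0.91, 1.0],  # duplicate
    [0.93, 0.83, 1.0],  # duplicate
    [0.35, 0.67, 0.0],  # not a duplicate
    [0.10, 0.05, 0.0],  # not a duplicate
]
y = [1, 1, 0, 0]

classifier = LogisticRegression()
classifier.fit(X, y)

# Probability that a new pair of papers is a duplicate
print(classifier.predict_proba([[0.97, 0.85, 1.0]])[0][1])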

But that’s for another day!

Categories: Programming

Android Developer Story: StoryToys finds success in the β€˜Family’ section on Google Play

Android Developers Blog - Wed, 07/20/2016 - 17:21

Posted by Lily Sheringham, Google Play team

Based in Dublin, Ireland, StoryToys is a leading publisher of interactive books and games for children. Like most kids’ app developers, they faced the challenges of engaging with the right audiences to get their content discovered. Since the launch of the Family section on Google Play, StoryToys has experienced an uplift of 270% in revenue and an increase of 1300% in downloads.

Hear Emmet O’Neill, Chief Product Officer, and Gavin Barrett, Commercial Director, discuss how the Family section creates a trusted and creative space for families to find new content. Also hear how beta testing, localized pricing, and more have allowed StoryToys’ flagship app, My Very Hungry Caterpillar, to significantly increase engagement and revenue.

Learn more about Google Play for Families and get the Playbook for Developers app to stay up-to-date with more features and best practices that will help you grow a successful business on Google Play.

Categories: Programming

Building Highly Scalable V6 Only Cloud Hosting

This is a guest repost by Donatas Abraitis, Lead Systems Engineer at Hostinger International.

This article is about how we built our new, highly scalable cloud hosting solution using IPv6-only communication between commodity servers, the problems we faced with the IPv6 protocol, and how we tackled them to handle more than ten million active users.

Why did we decide to run an IPv6-only network?

At Hostinger we care a lot about innovative technologies, so we decided to base a new project, named Awex, on this protocol. If we can start today, why wait? Only frontend (user-facing) services run in a dual-stack environment; everything else is IPv6-only for east-west traffic.
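
As a minimal illustration of the difference (a Python sketch, not our actual stack), here is an internal IPv6-only listener next to a dual-stack, user-facing one:

import socket

# Internal (east-west) service: IPv6 only, no IPv4 at all
backend = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
backend.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
backend.bind(("::", 8080))
backend.listen()

# User-facing service: dual stack, IPv4 clients arrive as v4-mapped IPv6 addresses
frontend = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
frontend.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
frontend.bind(("::", 8443))
frontend.listen()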

Architecture
Categories: Architecture