
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools

Subscribe to Methods & Tools if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!

Feed aggregator

Reference Class Forecasting

Herding Cats - Glen Alleman - Thu, 03/30/2017 - 17:18

Our long-time friends have moved to our neighborhood here in Colorado. Their moving van arrived today.

Moving Van

We brought coffee to them while their old house was being unloaded into the new house. Talking with the moving van owner, he started telling stories about estimating loads in pounds. The agent makes the first estimate of the weight of the load and issues a quote for the cost of the move. Then the van owner picks up the load and weighs the truck before getting on the road. He told us the range of precision and accuracy is all over the place, depending on the agent. Sometimes it's very close. Sometimes it's not.

The quality of the estimate depends on the skill and experience of the estimator. The reference class estimating process is part of that skill and experience.

Switching to software development: I re-listened to the Agile for Humans podcast with Steve McConnell about estimating. Steve's discussion focused on how to make good estimates and how to put them to work in making business and technical decisions. These themes are based on his book Software Estimation: Demystifying the Black Art, which should be on the shelf of any credible developer.

After listening again, it was clear that those providing a forum for the No Estimates advocates have failed to address the fatal flaw of the No Estimates discussion.

There is NO principle by which you can decide in the presence of uncertainty without estimating. 

The podcast provided a nice overview of why we should estimate, how to estimate, and what to do with those estimates. It even discussed how to deal with the dysfunctional aspects of management when making estimates. But in the end, we need estimates to credibly provide value in exchange for money.

There can be no consideration for NOT estimating, except on de minimis projects. Just like the moving van owner: he needs an estimate of the weight of the load. From there he can confirm the load will fit on the truck. The trailer has a load limit of 33,000 pounds. Moving household goods has a reference class parameter sized around that 33,000-pound limit, built from empirical data of past moves. Some houses have heavier goods, some have lighter goods, but the estimator knows how to adjust for that.
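To make the reference class idea concrete, here is a minimal sketch in Python. The past load weights are hypothetical stand-ins for the empirical data an experienced agent carries in his head; the 33,000-pound limit is the trailer's, as above.

import statistics

# Hypothetical reference class: actual weights (in pounds) of past moves
# for similar-sized houses - the empirical data the estimator draws on.
past_loads = [18200, 21400, 19800, 24100, 22650, 20300, 23800, 19500]

TRAILER_LIMIT = 33000  # pounds, from the van owner's story

p50 = statistics.median(past_loads)               # the "likely" load
p90 = statistics.quantiles(past_loads, n=10)[-1]  # 90th percentile, a hedge

print(f"Median load: {p50:.0f} lbs; 90th percentile: {p90:.0f} lbs")
print(f"Fits the trailer even at the 90th percentile: {p90 <= TRAILER_LIMIT}")

The point of the reference class is exactly this: the estimate is anchored to a distribution of past actuals, not to a single guess, and the estimator adjusts from there.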

It seems those in the software development business who conjecture that estimates are not needed, are the smell of dysfunction, are evil, and should be stopped have failed to understand not only the basic need for an estimate when making decisions in the presence of uncertainty, but also the basic principles of estimating in the presence of the uncertainties that always exist on projects.

When the idea of No Estimates is given equal weight to Steve's message, it willfully ignores the principles of decision making, replacing them with practices and processes that attempt to make decisions without estimates.

This is like the moving van owner saying: I get wild estimates, wrong estimates, estimates that cause me problems with my trailer, tractor, load management, and fee payments for carrying cargo across state lines. So I have an idea: I'll just not ask for or use any estimate. I'll just start emptying the house in Seal Beach, California and start planning to move the household to Colorado. What could possibly be wrong with that?

 

Related articles
  • The Flaw of Empirical Data Used to Make Decisions About the Future
  • The Flaw of Averages and Not Estimating
  • Managing in Presence of Uncertainty
  • Monte Carlo Simulation of Project Performance
  • Myth's Abound
  • Misunderstanding Making Decisions in the Presence of Uncertainty
  • Herding Cats: Software Development for the 21st Century
  • Want To Learn How To Estimate?
  • Eyes Wide Shut - A View of No Estimates
Categories: Project Management

Game developers rejoice—new tools for developing on Actions on Google

Google Code Blog - Thu, 03/30/2017 - 17:00
By Sunil Vemuri, Product Manager for Actions on Google

Since we launched the Actions on Google platform last year, we've seen a lot of creative actions for use cases ranging from meditation to insurance. But one of the areas we're especially excited about is gaming. Games from Akinator to SongPop demonstrate that developers can create new and engaging experiences for users. To bring more great games online, we're adding new tools to Actions on Google to make it easier than ever for you to build games for the Google Assistant.

First, we're releasing a brand new sound effect library. These effects can make your games more engaging, help you create a more fun persona for your action, and hopefully put smiles on your users' faces. From airplanes, slide whistles, and bowling to cats purring and thunder, you'll find hundreds of options that will add some pizzazz to your Action.
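As a rough sketch of how an effect could be plugged in: the Assistant can play a clip through an SSML audio tag returned by your Action's fulfillment. A minimal Python sketch, where the clip URL is a placeholder assumption rather than a documented path in the new library:

# Build the SSML an Action's fulfillment webhook would return.
# SOUND_URL is a hypothetical placeholder; substitute a real clip
# from the sound effect library.
SOUND_URL = "https://actions.google.com/sounds/v1/example/thunder.ogg"

ssml = (
    "<speak>"
    "A storm rolls in over the castle. "
    f'<audio src="{SOUND_URL}"/>'
    " What do you do next?"
    "</speak>"
)

The same pattern works for any of the effects: speak some text, drop in the clip, and keep the conversation going.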

Second, for those of you who feel nostalgic about interactive text adventures, we just published a handy guide on how to bring these games to life with the Google Assistant. With many old favorites being open source or in the public domain, you are now able to re-introduce these classics to Google Assistant users on Google Home.

Finally, for those of you who are looking to build new types of games, we've recently expanded the list of tool and consulting companies that have integrated their development solutions with Actions on Google. New collaborators like Pullstring, Converse.AI, Solstice and XAPP Media are now also able to help turn your vision into reality.

We can't wait to see how you use our sound library and for the new and classic games you'll bring to Google Assistant users on Google Home! Make sure you join our Google+ community to discuss Actions on Google with other developers.
Categories: Programming

Software Development Conferences Forecast March 2017

From the Editor of Methods & Tools - Thu, 03/30/2017 - 16:19
Here is a list of software development related conferences and events on Agile project management ( Scrum, Lean, Kanban), software testing and software quality, software architecture, programming (Java, .NET, JavaScript, Ruby, Python, PHP), DevOps and databases (NoSQL, MySQL, etc.) that will take place in the coming weeks and that have media partnerships with the Methods […]

Monitoring a Kubernetes Environment

Xebia Blog - Thu, 03/30/2017 - 14:28

This post is part 3 in a 4-part series about Container Monitoring. Post 1 dives into some of the new challenges containers and microservices create and the information you should focus on. Post 2 describes how you can monitor your Mesos cluster. This article describes the challenges of monitoring Kubernetes, how it works and what this means for […]

The post Monitoring a Kubernetes Environment appeared first on Xebia Blog.

The Entropy of Projects

Herding Cats - Glen Alleman - Wed, 03/29/2017 - 23:04

Entropy is the natural tendency of any system to move from order to disorder in the absence of an external force

In project work, the disorder is created by uncertainty. These uncertainties come in two forms: reducible and irreducible. In the absence of external forces, the naturally occurring and event-based uncertainties of all project work create risk to the project's success through the variances that result - the entropy of the project drives this risk created by uncertainty. Research shows most projects fail through managerial failures, not technical failures, and most of these failures are due to unaddressed risk [3]. And as is always the case

Risk Management is how Adults Manage Projects - Tim Lister

The development of software products or services is a collection of tasks and decisions to produce these outcomes for the organization paying for their development. The measurement of processes producing these outcomes is important to those paying since the process of developing the software not only creates value, but also creates cost.

In physical systems, energy must be added to reduce entropy and prevent the creation of disorder. In projects, this energy is called Project Management, or Management in general. This is the role of Management: to restrict the increase of entropy in the work environment. Management requires a process, so the process is part of entropy reduction as well. Management is applying a process to reduce entropy.

Uncertainty and variability are caused by probabilistic events and statistical natural variances that force the system (the project) to deviate from regular, predictable behavior.

In operational systems - projects, software development, DevOps - reducing variability or uncertainty is an important success factor. Reducing this uncertainty enables the work processes to increase predictability and managerial efficiency. Again, reducing this entropy-creating uncertainty requires effort. This effort is the energy in the principle of entropy management.

As always, to manage in the presence of uncertainty and to reduce the entropy of project work, we need to estimate both the amount of uncertainty and the amount of effort needed to reduce it.

Dealing with uncertainty in project work consists of three activities [1]:

  1. Identifying sources of project development uncertainty
  2. Quantifying project uncertainty
  3. Using these uncertainty measures to improve the decision-making process for projects

When uncertainty exists on projects, estimates are needed to manage in its presence; without them, we cannot make informed decisions. Just like physical systems, project systems have entropy that must be managed by spending energy - physical intervention to reduce uncertainty.
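As a toy illustration of activity 2, quantifying uncertainty, in the spirit of the entropy-based measures in [1] and [2] (though not their exact formulation): the Shannon entropy of a task's outcome distribution is highest when the estimate tells us the least.

import math

def shannon_entropy(probabilities):
    # H = -sum(p * log2(p)) in bits, over outcomes with nonzero probability
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical outcome distributions for one task's duration estimate.
tight = [0.8, 0.1, 0.1]     # confident estimate: low entropy (~0.92 bits)
vague = [0.34, 0.33, 0.33]  # near-uniform: high entropy (~1.58 bits)

print(shannon_entropy(tight), shannon_entropy(vague))

Spending management energy to reduce uncertainty means moving distributions from vague toward tight.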

  1. Arden Asllani and Lawrence Ettkin, "An Entropy-Based Approach for Measuring Project Uncertainty," Academy of Information and Management Sciences Journal, 2007.
  2. Jae-Yoon Jung, Chang-Ho Chin, and Jorge Cardoso, "An entropy-based uncertainty measure of process models," Information Processing Letters, 111 (2011), pp. 135-142.
  3. T. W. Kwan and H. K. N. Leung, "A risk management methodology for project risk dependencies," IEEE Transactions on Software Engineering, Vol. 37, 2011, pp. 635-648.
  4. Pradnya Purandare, "An Entropy Based Approach for Risk Factor Analysis in a Software Development Project," International Journal of Applied Engineering Research, Volume 11, Number 4 (2016), pp. 2258-2262.
  5. Rong Jiang, "An Information-Entropy-based Risk Measurement Method of Software Development Project," Journal of Information Science and Engineering, 30, pp. 1279-1301 (2014).
Related articles
  • Decision Analysis and Software Project Management
  • Economics of Software Development
  • Herding Cats: Where is the Adult Supervision on this Program?
  • Misunderstanding Making Decisions in the Presence of Uncertainty
  • Estimating and Making Decisions in Presence of Uncertainty
Categories: Project Management

How to speed up your MySQL with replication to in-memory database

Original article available at https://habrahabr.ru/company/mailru/blog/323870/

I’d like to share with you an article based on my talk at the Tarantool Meetup (the video is in Russian, though). It’s a short story of why Mamba, one of the biggest dating websites in the world and the largest one in Russia, started using Tarantool. Why did we decide to busy ourselves with MySQL-to-Tarantool replication?

First, we had to migrate to MySQL 5.7 at some point, but this version doesn’t have HandlerSocket, which was being actively used on our MySQL 5.6 servers. We even contacted the Percona team, and they confirmed that MySQL 5.6 is the last version to have HandlerSocket.

Second, we gave Tarantool a try and were pleased with its performance. We compared it against Memcached as a key-value store and saw latency drop by half, from 0.6 ms to 0.3 ms, on the same hardware. In relative terms, Tarantool’s twice as fast as Memcached. In absolute terms, it’s not that cool, but still impressive.

Third, we wanted to keep the whole existing architecture. There’s a MySQL master server and its slaves — we didn’t want to change anything in this structure. Can MySQL 5.6 slaves with HandlerSocket be replaced with something else without having to make significant architectural changes?

We learned that the Mail.Ru Group team has a replicator they created for their own purposes. The idea of replicating data from MySQL to Tarantool belongs to them. We asked the team to share the source code, which they did. We had to rewrite the code, though, since it worked with MySQL 5.1 and Tarantool 1.5, not 1.7. The replicator uses libslave, an open-source solution for reading events from a MySQL master server, and is built statically without any of MySQL’s system libraries. It’s been open-sourced under the BSD license, so anyone can use it for free.
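For readers who want to experiment without the C++ libslave-based replicator, a conceptually similar loop can be sketched in Python with the python-mysql-replication and tarantool packages. The hosts, credentials, and the users space below are assumptions for illustration; this is the idea of binlog-driven replication, not the Mail.Ru replicator itself.

from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import WriteRowsEvent, UpdateRowsEvent
import tarantool

# Stream row events from the MySQL master's binlog (row-based replication).
stream = BinLogStreamReader(
    connection_settings={"host": "127.0.0.1", "port": 3306,
                         "user": "repl", "passwd": "secret"},
    server_id=100,  # must be unique among the master's replicas
    blocking=True,
    only_events=[WriteRowsEvent, UpdateRowsEvent],
)

tnt = tarantool.connect("127.0.0.1", 3301)
space = tnt.space("users")  # hypothetical Tarantool space mirroring a table

for event in stream:
    for row in event.rows:
        values = row["values"] if isinstance(event, WriteRowsEvent) else row["after_values"]
        space.replace((values["id"], values["name"]))  # upsert into Tarantool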

Replication constraints
Categories: Architecture

Sponsored Post: ButterCMS, Aerospike, Loupe, Clubhouse, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Who's Hiring? 
  • Etleap is looking for Senior Data Engineers to build the next-generation ETL solution. Data analytics teams need solid infrastructure and great ETL tools to be successful. It shouldn't take a CS degree to use big data effectively, and abstracting away the difficult parts is our mission. We use Java extensively, and distributed systems experience is a big plus! See full job description and apply here.

  • Advertise your job here! 
Fun and Informative Events
  • Analyst Webinar: Forrester Study on Hybrid Memory NoSQL Architecture for Mission-Critical, Real-Time Systems of Engagement. Thursday, March 30, 2017 | 11 AM PT / 2 PM ET. In today’s digital economy, enterprises struggle to cost-effectively deploy customer-facing, edge-based applications with predictable performance, high uptime and reliability. A new, hybrid memory architecture (HMA) has emerged to address this challenge, providing real-time transactional analytics for applications that require speed, scale and a low total cost of ownership (TCO). Forrester recently surveyed IT decision makers to learn about the challenges they face in managing Systems of Engagement (SoE) with traditional database architectures and their adoption of an HMA. Join us as our guest speaker, Forrester Principal Analyst Noel Yuhanna, and Aerospike’s VP Marketing, Cuneyt Buyukbezci, discuss the survey results and implications for your business. Learn and register

  • Advertise your event here!
Cool Products and Services
  • Etleap provides a SaaS ETL tool that makes it easy to create and operate a Redshift data warehouse at a small fraction of the typical time and cost. It combines the ability to do deep transformations on large data sets with self-service usability, and no coding is required. Sign up for a 30-day free trial.

  • InMemory.Net provides a .NET native in-memory database for analysing large amounts of data. It runs natively on .NET and provides native .NET, COM, and ODBC APIs for integration. It also has an easy-to-use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • ButterCMS is an API-based CMS that seamlessly drops into your app or website. Great for blogs, dynamic pages, knowledge bases, and more. Butter works with any language/framework including Ruby, Rails, Node.js, .NET, Python, Django, Flask, React, Angular, Go, PHP, Laravel, Elixir, Phoenix, and Meteor.

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, including apps with 30 million users. With your help we'd like to add a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Categories: Architecture

Agile Leadership Newsletters Posted

If you only read my blog, you might not know I publish a monthly newsletter, the Pragmatic Manager. The last two issues have been about agile leadership. Take a look at Being An Agile Leader and Own Your Leadership, Part 1. Those newsletters, along with my 5-part series culminating in Becoming an Agile Leader, Part 5: Learning to Learn, may help you with your agile changes.

I wrote them so you could envision the value the Influential Agile Leader event might have for you. Take a look at my writing, and do join us in Toronto in May. Early-bird pricing ends this week.

As always, email me if you have questions. Or, let’s have a quick chat. See Book a Meeting With Me.

Categories: Project Management

Story Telling Techniques: The Premortem

Identify the risks before you start with a premortem.

Storytelling generates the big picture to guide a project or to help people frame their thoughts. A story can provide a deeper and more nuanced connection with information than most lists of PowerPoint bullets or a structured requirements document. Storytelling can also be used as a risk management tool. Premortems are a useful technique for helping project teams anticipate risks; they were described by Gary Klein in the Harvard Business Review, September 2007. The basic premortem approach can be customized with storytelling to increase the power of the technique.

The basic premortem technique is as follows:

Step 1 – Prepare: Gather the project team.

Step 2 – Have the team assume the project has utterly failed, and then ask what caused the failure.

Step 3 – Give each person three minutes to quietly write down all of the reasons they think the failure occurred.

Step 4 – Using a round-robin approach, have each person share one item from their list at a time, with a facilitator recording the reasons on a whiteboard or flipchart. Continue until all items are shared and recorded.

**The first four steps help defeat groupthink.

Step 5 – Identify the top 3 to 5 items on the list and create user stories identifying the risks. These high-priority risks will be added to the backlog and revisited during grooming. Common issues should be added to the team’s definition of done.

Step 6 – Periodically review the overall list with the team to determine whether any of the risks not added in step 5 have become more urgent. 

A more powerful twist to the standard process replaces steps 2 – 4 with a storytelling technique.

  • Break the team into pairs.
  • Provide the participants with an overview of the storytelling process, storytelling formats and the goal of the session.
  • Provide the participants with the premise that the project has failed and ask them to tell the story of how that point was reached.
  • Use probing questions to help the teams progress in generating the story. The sub-teams should be cross-functional. Time box this portion of the session to 15 minutes.
  • Have each team debrief the group with their stories.
  • Have the full team identify the issues that shaped the stories; these are potential risks. The most critical risks should be added to the backlog (or, if common, to the definition of done).

When using the premortem storytelling technique there are a few important rules (many of these are useful for all types of storytelling sessions).

  1. Minimize interruptions: close laptops and have people put their phones away (consider collecting people’s phones).
  2. Set aside approximately two hours for generating the stories and to discuss the results.
  3. The whole project team and important stakeholders should be present or you will risk blind spots.
  4. If some members are not present, video conferencing is important to create personal connections.
  5. A facilitator is important to making the process effective.  The facilitator should not be a critical team member or stakeholder. 
  6. The facilitator must ensure that the stories from the session are captured and that the top 3 to 5 (more or less at the team’s discretion) are added to the product backlog.

The premortem is an excellent tool to increase the team’s involvement in and understanding of risks. Adding storytelling to the technique increases the richness of the experience over common brainstorming and listing techniques. The results of a storytelling premortem will not only identify risks but also provide context for how the team members think the risks will emerge and turn into issues.

 


Categories: Process Management

SE-Radio Episode 286: Katie Malone Intro to Machine Learning

Edaena Salinas talks with Katie Malone about Machine Learning. Katie Malone is a Data Scientist in the Research and Development department at Civis Analytics. She has a PhD in Physics from Stanford University, with a background specifically in particle physics. She is an instructor of the Intro to Machine Learning online course from Udacity and […]
Categories: Programming

A New Home for Google Open Source

Google Code Blog - Tue, 03/28/2017 - 19:36
Originally on Google Open Source Blog
Posted by Will Norris, Open Source Programs Office

Free and open source software has been part of our technical and organizational foundation since Google's early beginnings. From servers running the Linux kernel to an internal culture of being able to patch any other team's code, open source is part of everything we do. In return, we've released millions of lines of open source code, run programs like Google Summer of Code and Google Code-in, and sponsor open source projects and communities through organizations like Software Freedom Conservancy, the Apache Software Foundation, and many others.

Today, we're launching opensource.google.com, a new website for Google Open Source that ties together all of our initiatives with information on how we use, release, and support open source.

This new site showcases the breadth and depth of our love for open source. It will contain the expected things: our programs, organizations we support, and a comprehensive list of open source projects we've released. But it also contains something unexpected: a look under the hood at how we "do" open source.
Helping you find interesting open source

One of the tenets of our philosophy towards releasing open source code is that "more is better." We don't know which projects will find an audience, so we help teams release code whenever possible. As a result, we have released thousands of projects under open source licenses, ranging from larger products like TensorFlow, Go, and Kubernetes to smaller projects such as Light My Piano, Neuroglancer and Periph.io. Some are fully supported while others are experimental or just for fun. With so many projects spread across 100 GitHub organizations and our self-hosted Git service, it can be difficult to see the scope and scale of our open source footprint.

To provide a more complete picture, we are launching a directory of our open source projects which we will expand over time. For many of these projects we are also adding information about how they are used inside Google. In the future, we hope to add more information about project lifecycle and maturity.
How we do open source

Open source is about more than just code; it's also about community and process. Participating in open source projects and communities as a large corporation comes with its own unique set of challenges. In 2014, we helped form the TODO Group, which provides a forum to collaborate and share best practices among companies that are deeply committed to open source. Inspired by many discussions we've had over the years, today we are publishing our internal documentation for how we do open source at Google.

These docs explain the process we follow for releasing new open source projects and submitting patches to others' projects, and how we manage the open source code that we bring into the company and use ourselves. But in addition to the how, it outlines why we do things the way we do, such as why we only use code under certain licenses or why we require contributor license agreements for all patches we receive.

Our policies and procedures are informed by many years of experience and lessons we've learned along the way. We know that our particular approach to open source might not be right for everyone—there's more than one way to do open source—and so these docs should not be read as a "how-to" guide. Similar to how it can be valuable to read another engineer's source code to see how they solved a problem, we hope that others find value in seeing how we approach and think about open source at Google.

To hear a little more about the backstory of the new Google Open Source site, we invite you to listen to the latest episode from our friends at The Changelog. We hope you enjoy exploring the new site!
Categories: Programming

Calling all early adopters for Android Studio previews

Android Developers Blog - Tue, 03/28/2017 - 17:03
Posted by Scott Main, Technical Writer

If you love trying out all of the newest features in Android Studio and helping us make it a better IDE, we're making it even easier to download early preview builds with a new website. Here, you can download and stay up to date on all the latest Android Studio previews and other tools announcements.



Android Studio previews give you early access to new features in all aspects of the IDE, plus early versions of other tools such as the Android Emulator and platform SDK previews. You can install multiple versions of Android Studio side-by-side, so if a bug in the preview build blocks your app development, you can keep working on the same project from the stable version.

The latest preview for Android Studio 2.4 just came out last week, and it includes new features to support development with the Android O Developer Preview. You can download and set up the O preview SDK from inside Android Studio, and then use Android O’s XML font resources and autosizing TextView in the Layout Editor.

By building your apps with the Android Studio preview, you're also helping us create a better version of Android Studio. We want to hear from you if you encounter any bugs.
Categories: Programming

Better User Stories: 24 Hours Until Doors Close

Mike Cohn's Blog - Tue, 03/28/2017 - 17:00

This blog post refers to a four-part series of videos on overcoming challenges with user stories. Topics covered are conducting story-writing workshops with story maps, splitting stories, and achieving the right level of detail in user stories.

To be notified when the videos are again available, sign up below:

Notify Me!

Just a quick post this week to let you know that we will be closing registration to Better User Stories tomorrow at 6 P.M. Pacific, 9 P.M. Eastern.

We still have spaces for the Expert and Professional Levels, but Work With Mike is now completely sold out.

Click here to register before the deadline

Just a quick reminder of what people are saying about the course:

I could squeeze videos in between meeting packed days

“I loved the acronyms used to test story quality and that the modules were broken up into small enough segments that I could squeeze videos in between meeting packed days… I really enjoyed the worksheets that forced me to use my own backlog as practice to cement the concepts in my brain. It's way too easy to go through an online course and not really retain information that is useful later but that's what made it real for me.” - Sarah Fraser

Immediately able to apply what I learned

“I've used user stories for many years. I wasn't sure if this course was really going to teach me something new… I thought if anyone is going to be able to teach me more about user stories it will be Mike Cohn… The Q&A calls with the training were great. I think this is a big differentiator to other online trainings I've done. I was immediately able to apply what I learned in this course to help teams get their backlog set up as they begin delivering using the scrum framework.” - Amber Burke

If you’re on the fence, jump in…

“It has already influenced and changed how I deliver story writing workshops. There is a lot of valuable information. It is split up into logical and digestible segments. For anyone willing to put in the time that needs to understand how to write better stories; you will find value here. If you're on the fence, jump in...you won't regret it.” - Max Lamers

You still have (some) time to access the free mini-course

When we close registration to the full Better User Stories course, we will also be taking down the free video training. If you’ve not yet seen those, you still have time to register and watch them before tomorrow’s deadline.

Click here to access the free mini-course

I don’t know when we’ll be opening doors again to the full, advanced course, so if you and your team want to sharpen your user stories skills, this is a great time to join.

Any last minute questions about the course? Let me know in the comments below.

Agile & Software Testing in Methods & Tools Q1 2017 articles

From the Editor of Methods & Tools - Tue, 03/28/2017 - 15:35
Here is a list of the articles published during the first quarter of 2017 on the Methods & Tools website. This quarter Methods & Tools has published articles discussing the management debt issue in Agile transformation, Agile forecasting and testing microservices. We also published two articles presenting open source software testing tools: CasperJS and Nitrate. […]

Luigi: Defining dynamic requirements (on output files)

Mark Needham - Tue, 03/28/2017 - 06:39

In my last blog post I showed how to convert a JSON document containing meetup groups into a CSV file using Luigi, the Python library for building data pipelines. As well as creating that CSV file I wanted to go back to the meetup.com API and download all the members of those groups.

This was a rough flow of what I wanted to do:

  • Take JSON document containing all groups
  • Parse that document and for each group:
    • Call the /members endpoint
    • Save each one of those files as a JSON file
  • Iterate over all those JSON files and create a members CSV file

In the previous post we created the GroupsToJSON task which calls the /groups endpoint on the meetup API and creates the file /tmp/groups.json.

Our new task has that as its initial requirement:

class MembersToCSV(luigi.Task):
    key = luigi.Parameter()
    lat = luigi.Parameter()
    lon = luigi.Parameter()

    def requires(self):
        yield GroupsToJSON(self.key, self.lat, self.lon)

But we also want to create a requirement on a task that will make those calls to the /members endpoint and store the result in a JSON file.

One of the patterns that Luigi imposes on us is that each task should only create one file, so we actually have a requirement on a collection of tasks rather than just one. It took me a little while to get my head around that!

We don’t know the parameters of those tasks at compile time – we can only calculate them by parsing the JSON file produced by GroupsToJSON.

In Luigi terminology what we want to create is a dynamic requirement. A dynamic requirement is defined inside the run method of a task and can rely on the output of any tasks specified in the requires method, which is exactly what we need.

This code does the delegating part of the job:

class MembersToCSV(luigi.Task):
    key = luigi.Parameter()
    lat = luigi.Parameter()
    lon = luigi.Parameter()


    def run(self):
        outputs = []
        for input in self.input():
            with input.open('r') as group_file:
                groups_json = json.load(group_file)
                groups = [str(group['id']) for group in groups_json]


                for group_id in groups:
                    members = MembersToJSON(group_id, self.key)
                    outputs.append(members.output().path)
                    yield members


    def requires(self):
        yield GroupsToJSON(self.key, self.lat, self.lon)

Inside our run method, we iterate over the output of GroupsToJSON (which is our input), yield to another task, and collect its outputs in the outputs list that we’ll use later.
MembersToJSON looks like this:

class MembersToJSON(luigi.Task):
    group_id = luigi.IntParameter()
    key = luigi.Parameter()


    def run(self):
        results = []
        uri = "https://api.meetup.com/2/members?&group_id={0}&key={1}".format(self.group_id, self.key)
        while uri is not None:
            r = requests.get(uri)
            response = r.json()
            for result in response["results"]:
                results.append(result)
            uri = response["meta"]["next"] if response["meta"]["next"] else None


        with self.output().open("w") as output:
            json.dump(results, output)

    def output(self):
        return luigi.LocalTarget("/tmp/members/{0}.json".format(self.group_id))

This task generates one file per group containing a list of all the members of that group.

We can now go back to MembersToCSV and convert those JSON files into a single CSV file:

class MembersToCSV(luigi.Task):
    out_path = "/tmp/members.csv"
    key = luigi.Parameter()
    lat = luigi.Parameter()
    lon = luigi.Parameter()


    def run(self):
        outputs = []
        for input in self.input():
            with input.open('r') as group_file:
                groups_json = json.load(group_file)
                groups = [str(group['id']) for group in groups_json]


                for group_id in groups:
                    members = MembersToJSON(group_id, self.key)
                    outputs.append(members.output().path)
                    yield members

        with self.output().open("w") as output:
            writer = csv.writer(output, delimiter=",")
            writer.writerow(["id", "name", "joined", "topics", "groupId"])

            for path in outputs:
                group_id = path.split("/")[-1].replace(".json", "")
                with open(path) as json_data:
                    d = json.load(json_data)
                    for member in d:
                        topic_ids = ";".join([str(topic["id"]) for topic in member["topics"]])
                        if "name" in member:
                            writer.writerow([member["id"], member["name"], member["joined"], topic_ids, group_id])

    def output(self):
        return luigi.LocalTarget(self.out_path)

    def requires(self):
        yield GroupsToJSON(self.key, self.lat, self.lon)

We then just need to add our new task as a requirement of the wrapper task:
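A minimal sketch of that wrapper, assuming the Meetup task invoked on the command line below and a GroupsToCSV task carried over from the previous post:

class Meetup(luigi.WrapperTask):
    key = luigi.Parameter()
    lat = luigi.Parameter()
    lon = luigi.Parameter()

    def requires(self):
        # GroupsToCSV comes from the previous post; MembersToCSV is the new task
        yield GroupsToCSV(self.key, self.lat, self.lon)
        yield MembersToCSV(self.key, self.lat, self.lon)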

And we’re ready to roll:

$ PYTHONPATH="." luigi --module blog --local-scheduler Meetup --workers 3

We’ve defined the number of workers here as we can execute those calls to the /members endpoint in parallel and there are ~ 600 calls to make.

All the code from both blog posts is available as a gist if you want to play around with it.

Any questions/advice let me know in the comments or I’m @markhneedham on twitter.

The post Luigi: Defining dynamic requirements (on output files) appeared first on Mark Needham.

Categories: Programming

Join us live on May 23rd as we announce the latest Ads, Analytics and DoubleClick innovations

Google Code Blog - Mon, 03/27/2017 - 19:03
Posted by Sridhar Ramaswamy Senior Vice President, Ads and Commerce

What: Google Marketing Next keynote live stream
When: Tuesday, May 23rd at 9:00 a.m. PT/12:00 p.m. ET.
Duration: 1 hour
Where: On the Inside AdWords Blog



Be the first to hear about Google’s latest marketing innovations, the moment they’re announced. Watch live as my team and I share new Ads, Analytics and DoubleClick innovations designed to improve your ability to reach consumers, simplify campaign measurement and increase your productivity. We’ll also give you a sneak peek at how brands are starting to use the Google Assistant to delight customers.

Register for the live stream here.

Until then, follow us on Twitter, Google+, Facebook and LinkedIn for previews of what's to come.
Categories: Programming

AgilePath Podcast Up

I’ve said before that agile is a cultural change, not merely a project management framework or approach. One of the big changes is around transparency and safety.

We need safety to experiment. We need safety to be transparent. Creating that safe environment can be difficult for everyone involved.

John LeDrew has started a new podcast, agilepath.fm. I had the pleasure of chatting with John for the podcast. He wove a story from several interviews, and it’s now up: In Search of Safety.

I hope you enjoy it.

Categories: Project Management

Faster Networks + Cheaper Messages => Microservices => Functions => Edge

When Adrian Cockcroft—the guy who helped put the loud in Cloud through his energetic evangelism of Cloud Native and Microservice architectures—talks about what’s next, it pays to listen. And you can listen: here’s a fascinating forward-looking talk he gave at microXchg 2017: Shrinking Microservices to Functions. It’s typically Cockcroftian: understated, thoughtful, and full of insight drawn from experience.

Adrian makes a compelling case that the same technology drivers, faster networking and cheaper messaging, that drove the move to Microservices are now driving the move to Functions.

The payoffs are all those you’ve no doubt heard about Serverless for some time, but Adrian develops them in an interesting way. He traces how architectures have evolved over time. Take a look at my gloss of his talk for more details.

What’s next after Functions? Adrian talks about pushing Lambda functions to the edge. A topic I’m excited about and have been interested in for some time, though I didn’t quite see it playing out like this.

Datacenters disappear. Functions are not running in an AWS region anymore, code is placed near the customer using a CDN at CDN endpoints. Now you have a fully distributed, at the edge, low latency, milliseconds from the customer way of running code. Now you can build architectures that are partly in the datacenter, partly at the edge, and partly at the customer premises. And since this is AWS, it’s all, of course, built around Lambda. AWS Greengrass and Snowball Edge are peeks into what the future might look like.

There’s a hidden tension here. Once you put code at the edge, you violate two of Lambda’s key assumptions: that functions are composed using scalable backend services, and that messaging is low latency. The edge will have a high-latency path back to services in the datacenter, so how do you make a function-based distributed application at the edge? Does edge computing argue for a more retro architecture with fewer messages back to a more monolithic core?
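As a thought experiment only (the names and endpoint below are illustrative assumptions, not AWS services), one way that tension could resolve is to keep per-request work against local edge state and push reconciliation with the datacenter onto an asynchronous, retry-tolerant path:

import json
import queue
import threading
import urllib.request

# Edge-resident state (e.g., a warmed cache); the origin is reconciled lazily.
local_state = {"greeting": "hello from the edge"}
pending = queue.Queue()

ORIGIN_URL = "https://origin.example.com/events"  # hypothetical datacenter service

def handler(event, context=None):
    """Lambda-style handler: answer from local state, defer the slow hop."""
    key = event.get("key", "greeting")
    pending.put({"seen": key})  # queue the fact for later reconciliation
    return {"statusCode": 200, "body": local_state.get(key, "miss")}

def reconcile():
    """Background drain of the high-latency path back to the datacenter."""
    while True:
        item = pending.get()
        req = urllib.request.Request(
            ORIGIN_URL,
            data=json.dumps(item).encode(),
            headers={"Content-Type": "application/json"},
        )
        try:
            urllib.request.urlopen(req, timeout=5)
        except OSError:
            pending.put(item)  # retry later instead of blocking request handling

threading.Thread(target=reconcile, daemon=True).start()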

Or does edge computing require something completely different? Here's one thought as to what that something completely different might look like: Datanet: A New CRDT Database That Lets You Do Bad Bad Things To Distributed Data.

Now, let’s see the future by first taking a tour of the past….

From Monoliths, to Microservices, to Functions
Categories: Architecture

SPaMCAST 435 – Allan Kelly, #NoProjects, Value


http://www.spamcast.net

Listen Now
Subscribe on iTunes
Check out the podcast on Google Play Music

The Software Process and Measurement Cast 435 features our interview with Allan Kelly.  Our discussion touched on the concepts behind #NoProjects.  Allan describes how the concept of a project leads to a number of unintended consequences.  Those consequences aren’t pretty.

Allan makes digital development teams more effective and improves delivery with continuous agile approaches to reduce delay and risk while increasing value delivered. He helps teams and smaller companies – including start-ups and scale-ups – with advice, coaching and training. Managers, product, and technical staff are all involved in his improvements. He is the originator of Retrospective Dialogue Sheets and Value Poker, the author of four books, including “Xanpan – team-centric Agile Software Development” and “Business Patterns for Software Developers”. On Twitter he is @allankellynet.

Re-Read Saturday News

This week we tackle Chapter 8 of Carol Dweck’s Mindset: The New Psychology of Success (buy your copy and read along). Chapter 8 is titled “Changing Mindsets.” The whole concept of mindsets would be an interesting footnote if we did not believe they could change. Chapter 8 drives home the point, made multiple times in the book, that mindsets are malleable with self-awareness and a lot of effort. The question of whether all people want to be that self-aware will be addressed next week as we wrap up our re-read.

We are quickly closing in on the end of our re-read of Mindset.  I anticipate one more week.   The next book in the series will be Holacracy (Buy a copy today). After my recent interview with Jeff Dalton on Software Process and Measurement Cast 433, I realized that I had only read extracts from Holacracy by Brian J. Robertson, therefore we will read (first time for me) the whole book together.

Every week we discuss a chapter and then consider the implications of what we have “read” from the point of view of both pursuing an organizational transformation and using the material when coaching teams.

Remember to buy a copy of Carol Dweck’s Mindset and start the re-read from the beginning!

Visit the Software Process and Measurement Cast blog to participate in this and previous re-reads.

Next SPaMCAST

The next Software Process and Measurement Cast will feature our essay on incremental change approaches. We will also have columns from Jeremy Berriault, who blogs at https://jberria.wordpress.com/, and from Jon M Quigley, who brings his column, the Alpha and Omega of Product Development, to the Cast. One of the places you can find Jon is at Value Transformation LLC.

 


Categories: Process Management
