
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools

Subscribe to Methods & Tools
if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!

Architecture

Facebook Mobile Drops Pull For Push-based Snapshot + Delta Model

We've learned mobile is different. In If You're Programming A Cell Phone Like A Server You're Doing It Wrong we learned programming for a mobile platform is its own specialty. In How Facebook Makes Mobile Work At Scale For All Phones, On All Screens, On All Networks we learned bandwidth on mobile networks is a precious resource. 

Given all that, how do you design a protocol to sync state (think messages, comments, etc.) between mobile nodes and the global state-holding servers located in a datacenter?

Facebook recently wrote about their new solution to this problem in Building Mobile-First Infrastructure for Messenger. They were able to reduce bandwidth usage by 40% and reduce by 20% the terror of hitting send on a phone.

That's a big win...that came from a protocol change.

Facebook Messenger went from a traditional notification-triggered full state pull to a push-based snapshot + delta model.
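
Roughly, the idea looks like this: under the old model, every notification triggers a full state download; under the new model, the client fetches one snapshot and afterwards applies only the small deltas the server pushes. The sketch below is an illustration only, with hypothetical types and names, not Facebook's actual wire protocol.

// Hypothetical types for illustration -- not Facebook's actual protocol.
import java.util.ArrayList;
import java.util.List;

// A snapshot is the complete conversation state at a known version.
record Snapshot(long version, List<String> messages) {}

// A delta carries only what changed between two versions.
record Delta(long fromVersion, long toVersion, List<String> newMessages) {}

interface SnapshotServer {
    Snapshot fetchSnapshot();
}

class SyncClient {
    private long version = 0;
    private final List<String> messages = new ArrayList<>();

    // Old pull model: every notification triggers a full snapshot download.
    void onNotification(SnapshotServer server) {
        Snapshot snap = server.fetchSnapshot(); // full state, every time
        messages.clear();
        messages.addAll(snap.messages());
        version = snap.version();
    }

    // New push model: the server pushes only what changed since our version.
    void onDeltaPushed(SnapshotServer server, Delta delta) {
        if (delta.fromVersion() != version) {
            onNotification(server); // out of sync: fall back to a fresh snapshot
            return;
        }
        messages.addAll(delta.newMessages()); // apply just the change
        version = delta.toVersion();
    }
}

Since deltas are tiny compared to full snapshots, most syncs cost only a few bytes of bandwidth.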

Categories: Architecture

How to deploy a Docker application into production on Amazon AWS

Xebia Blog - Fri, 10/17/2014 - 17:00

Docker reached production status a few months ago. But having the container technology alone is not enough. You need a complete platform infrastructure before you can deploy your docker application in production. Amazon AWS offers exactly that: a production quality platform that offers capacity provisioning, load balancing, scaling, and application health monitoring for Docker applications.

In this blog, you will learn how to deploy a Docker application to production in five easy steps.

For demonstration purposes, you are going to use the node.js application that was built for CloudFoundry and used to demonstrate Deis in a previous post: a truly useful app whose sources are available on GitHub.

1. Create a Dockerfile

The first thing you need to do is create a Dockerfile to build an image. This is quite simple: you install the nodejs and npm packages, copy the source files, and install the JavaScript modules.

# DOCKER-VERSION 1.0
FROM    ubuntu:latest
#
# Install nodejs npm
#
RUN apt-get update
RUN apt-get install -y nodejs npm
#
# add application sources
#
COPY . /app
RUN cd /app; npm install
#
# Expose the default port
#
EXPOSE  5000
#
# Start command
#
CMD ["nodejs", "/app/web.js"]
2. Test your Docker application

Now you can create the Docker image and test it.

$ docker build -t sample-nodejs-cf .
$ docker run -d -p 5000:5000 sample-nodejs-cf

Point your browser at http://localhost:5000, click the 'start' button and Presto!

3. Zip the sources

Now that you know the instance works, zip the source files. The image will be built on Amazon AWS based on your Dockerfile.

$ zip -r /tmp/sample-nodejs-cf-srcs.zip .
4. Deploy Docker application to Amazon AWS

Now install and configure the Amazon AWS command line interface (CLI) and deploy the Docker source files to Elastic Beanstalk. You can do this all manually, but here you use the script deploy-to-aws.sh that I created.

$ deploy-to-aws.sh \
         sample-nodejs-cf \
         /tmp/sample-nodejs-cf-srcs.zip \
         demo-env

After about 8-10 minutes your application is running. The output should look like this:

INFO: creating application sample-nodejs-cf
INFO: Creating environment demo-env for sample-nodejs-cf
INFO: Uploading sample-nodejs-cf-srcs.zip for sample-nodejs-cf, version 1412948762.
upload: ./sample-nodejs-cf-srcs.zip to s3://elasticbeanstalk-us-east-1-233211978703/1412948762-sample-nodejs-cf-srcs.zip
INFO: Creating version 1412948762 of application sample-nodejs-cf
INFO: demo-env in status Launching, waiting to get to Ready..
...
INFO: demo-env in status Launching, waiting to get to Ready..
INFO: Updating environment demo-env with version 1412948762 of sample-nodejs-cf
INFO: demo-env in status Updating, waiting to get to Ready..
...
INFO: demo-env in status Updating, waiting to get to Ready..
INFO: Version 1412948762 of sample-nodejs-cf deployed in environment
INFO: current status is Ready, goto http://demo-env-vm2tqi3qk4.elasticbeanstalk.com
5. Test your Docker application on the Internet!

Your application is now available on the Internet. Browse to the designated URL and click on start. When you increase the number of instances at Amazon, they will appear in the application. When you deploy a new version of the application, you can observe how the new version appears without any errors in the client application.

For more information, see Amazon Elastic Beanstalk adds Docker support and Dockerizing a Node.js Web App.

Stuff The Internet Says On Scalability For October 17th, 2014

Hey, it's HighScalability time:


What could this be? Swarms of drones painting 3D light sculptures against the night sky!
  • Quotable Quotes:
    • Visnja Zeljeznjak: Steve Jobs' product pricing formula: cost of materials x 3 + 33%
    • Benedict Evans: We now have over 2bn iOS and Android devices on earth, and this will grow in the next few years to well over 3bn.
    • @ClearStoryData: It's true! Avg beer drinker attracts 4.4% more Mosquitos than water drinker #Strataconf
    • Leslie Lamport: The core idea of the problem of that notion of causality came about because of my familiarity with special relativity...where one event could causally affect another depended on whether or not information from one could physically reach the other.
    • @laurelatoreilly: Fascinating session about cargo ships going dark to shift market prices #IoT #strataconf "your decisions are only as good as your data"
    • @muratdemirbas: Distributed/decentralized coordination is expensive & hard to scale. Centralized coordination is cheap & scales easily using hierarchies.
    • @froidianslip: “Kafka is awesome. We heard it cures cancer.” -- @gwenshap #Strataconf
    • @timoreilly: RT @grapealope: The self-driving car has 6000 sensors, and takes readings at 4Hz. That's a lot of data. @MCSrivas #strataconf #MapR
    • @froidianslip: Love the paraphrase borrowed from Ray Bradbury, "Any sufficiently complex configuration is indistinguishable from code." #Strataconf
    • @matei_zaharia: Spark shatters MapReduce's 100 TB and 1 PB sort records... with 10x fewer nodes
    • @msallstr: “Synchronous calls in this environment are the crystal meth of programming” @mjpt777 on the new Reactive Manifesto
    • @postwait: “If you put them under enough stress, perfectly rational people will panic and start believing in science” #priceless
    • Ilya Grigorik: It's great to see access from mobile is around 30% faster compared to last year.
    • @ryandotsmith: Recently migrated an async system to SQS. Much simple. Tiny latency. Here is the code (maybe a gem?)

  • People just don't appreciate the power of messy. The problematic culture of "Worse is Better". There's an implied notion here that people can't recognize better when they see it. Better is not a platonic ideal. It can't be proved by argument. Better, like evolution, is something that works itself out in practice. Like evolution, Worse is Better is an algorithm for stepping through a possibility space by jumping from one working phenotype to the next more adapted working phenotype. And for many, that's better. Not Ideal, but Better.

  • The Times They Are a-Changin'. Docker and Microsoft partner to drive adoption of distributed applications. What's the goal? nickstinemates: Package your Windows app in a docker container, use same tooling you would otherwise use to deploy to a docker engine running on a Windows host. Package your Linux app in a docker container, use same tooling you would otherwise use to deploy to a docker engine running on a Linux host.

  • Leandro Pereira writes a fine autobiography in Life of a HTTP request, as seen by my toy web server. All the stages of life are there. Socket creation. Acceptance. Scheduling. Coroutines. Reading requests. Parsing requests. All the way to the reply and the death of the connection. A lot to learn if you want to look at the simplified internals of a service.

  • Wonderful talk: Call Me Maybe: Carly Rae Jepsen and the Perils of Network Partitions. Kyle Kingsbury takes a detailed look at different partition problems in different databases. There are split brains. Masters dying. Lost data. General network mayhem. It's great. The lesson: what's written down in the marketing documentation is not always what you get. Test your application and see what really happens. The world is not simple. A dumb solution where you understand the failure modes can be a good choice.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Then When Given

Xebia Blog - Fri, 10/17/2014 - 14:50

People who practice ATDD all know how frustrating it can be to write automated examples. Especially when you get stuck overthinking the preconditions of examples.

This post describes an alternative approach to writing acceptance tests: write them backwards!

Imagine that you are building the very first online phone book. We need to define an acceptance test for viewing the location of a florist. Using the Given-When-Then formula, you would probably describe the behaviour like this.


Given I am on the online phone book homepage
When I type “Florist” in the business type field
And I click …
...

Most of the time you will be discussing and describing details that have nothing to do with viewing the location of a florist. To avoid this, write down the Then clause of the formula first.
Make sure the Then clause contains an observable result.


Then I see the location “Floriststreet 123”

Next, we will try to answer the following question: What caused the Then clause?
Make sure the When clause contains an actor and an action.


When I click “View map” of the search result
Then I see the location “Floriststreet 123”

The last thing we will need to do is answer the following question: Why can I perform that action?
Make sure the Given clause contains a simple precondition.


Given I see a search result for florist “Floral Designs”
When I click “View map” of the search result
Then I see the location “Floriststreet 123”

You might have noticed that I left out certain parts where the user goes to the homepage and selects UI objects in the search area. They were not worth mentioning in the Given-When-Then formula. Too much detail makes us lose focus on what we really want to check. The essence of this acceptance test is clicking on the link "View map" and exposing the location to the user.

Try it a couple of times and let me know how it went.

Testing CDN and geolocation with webpagetest.org

Agile Testing - Grig Gheorghiu - Wed, 10/15/2014 - 19:31
Assume you want to migrate example.mycompany.com to a new CDN provider. Eventually you'll have to point example.mycompany.com as a CNAME to a domain name handled by the CDN provider, let's call it example.cdnprovider.com. To test this setup before you put it in production, the usual way is to get an IP address corresponding to example.cdnprovider.com, then associate example.mycompany.com with that IP address in your local /etc/hosts file.

This works well for testing most of the functionality of your web site, but it doesn't work when you want to test geolocation-specific features such as displaying the currency based on the user's country of origin. For this, you can use a nifty feature of the amazing free service WebPageTest.

On the main page of WebPageTest, you can specify the test location from the dropdown. It contains a generous list of locations across the globe. To fake your DNS setting and point example.mycompany.com at the CDN provider's domain, you can specify something like this in the Script tab:

setDNSName example.mycompany.com example.cdnprovider.com
navigate http://example.mycompany.com
This will effectively associate the page you want to test with the CDN provider-specified URL, so you will hit the CDN first from the location you chose.

Building Better Business Cases for Digital Initiatives

It’s hard to drive digital initiatives and business transformation if you can’t create the business case.  Stakeholders want to know what their investment is supposed to get them.

One of the simplest ways to think about business cases is to think in terms of stakeholders, benefits, KPIs, costs, and risks over time frames.

While that’s the basic frame, there’s a bit of art and science when it comes to building effective business cases, especially when it involves transformational change.

Lucky for us, in the book Leading Digital: Turning Technology into Business Transformation, George Westerman, Didier Bonnet, and Andrew McAfee share some of their lessons learned in building better business cases for digital initiatives.

What I like about their guidance is that it matches my experience.

Link Operational Changes to Tangible Business Benefits

The more you can link your roadmap to benefits that people care about and can measure, the better off you are.

Via Leading Digital:

“You need initiative-based business cases that establish a clear link from the operational changes in your roadmap to tangible business benefits.  You will need to involve employees on the front lines to help validate how operational changes will contribute to strategic goals.”

Work Out the Costs, the Benefits, and the Timing of Return

On a good note, the same building blocks that apply to any business case, apply to digital initiatives.

Via Leading Digital:

“The basic building blocks of a business case for digital initiatives are the same as for any business case.  Your team needs to work out the costs, the benefits, and the timing of the return.  But digital transformation is still uncharted territory.  The cost side of the equation is easier, but benefits can be difficult to quantify, even when, intuitively, they seem crystal clear.”

Start with What You Know

Building a business case is an art and a science.   To avoid getting lost in analysis paralysis, start with what you know.

Via Leading Digital:

“Building a business case for digital initiatives is both an art and a science.  With so many unknowns, you'll need to take a pragmatic approach to investments in light of what you know and what you don't know.

Start with what you know, where you have most of the information you need to support a robust cost-benefit analysis.  A few lessons learned from our Digital Masters can be useful.”

Don’t Build Your Business Case as a Series of Technology Investments

If you only consider the technology part of the story, you’ll miss the bigger picture.  Digital initiatives involve organizational change management as well as process change.  A digital initiative is really a change in terms of people, process, and technology, and adoption is a big deal.

Via Leading Digital:

“Don't build your business case as a series of technology investments.  You will miss a big part of the costs.  Cost the adoption efforts--digital skill building, organizational change, communication, and training--as well as the deployment of the technology.  You won't realize the full benefits--or possibly any benefits--without them.”

Frame the Benefits in Terms of Business Outcomes

If you don’t work backwards from the end-in-mind, you might not get there.  You need clarity on the business outcomes so that you can chunk up the right path to get there, while flowing continuous value along the way.

Via Leading Digital:

“Frame the benefits in terms of the business outcomes you want to reach.  These outcomes can be the achievement of goals or the fixing of problems--that is, outcomes that drive more customer value, higher revenue, or a better cost position.  Then define the tangible business impact and work backward into the levers and metrics that will indicate what 'good' looks like.  For instance, if one of your investments is supposed to increase digital customer engagement, your outcome might be increasing engagement-to-sales conversion.  Then work back into the main metrics that drive this outcome, for example, visits, likes, inquiries, ratings, reorders, and the like.

When the business impact of an initiative is not totally clear, look at companies that have already made similar investments.  Your technology vendors can also be a rich, if somewhat biased, source of business cases for some digital investments.”

Run Small Pilots, Evaluate Results, and Refine Your Approach

To reduce risk, start with pilots to live and learn.   This will help you make informed decisions as part of your business case development.

Via Leading Digital:

“But, whatever you do, some digital investment cases will be trickier to justify, be they investments in emerging technologies or cutting-edge practices.  For example, what is the value of gamifying your brand's social communities?  For these types of investment opportunities, experiment with a test-and-learn approach.  State your measures of success, run small pilots, evaluate results, and refine your approach.  Several useful tools and methods exist, such as hypothesis-driven experiments with control groups, or A/B testing.  The successes (and failures) of small experiments can then become the benefits rationale to invest at greater scale.  Whatever the method, use an analytical approach; the quality of your estimated return depends on it.

Translating your vision into strategic goals and building an actionable roadmap is the first step in focusing your investment.  It will galvanize the organization into action.  But if you needed to be an architect to develop your vision, you need to be a plumber to develop your roadmap.  Be prepared to get your hands dirty.”

While practice makes perfect, business cases aren’t about perfect.  Their job is to help you get the right investment from stakeholders so you can work on the right things, at the right time, to make the right impact.

You Might Also Like

Cloud Changes the Game from Deployment to Adoption

How Digital is Changing Physical Experiences

McKinsey on Unleashing the Value of Big Data Analytics

Categories: Architecture, Programming

Using a SSD Cache in Front of EBS Boosted Throughput by 50%, for Free

Using EBS has lots of advantages--reliability, snapshotting, resizing--but overcoming the performance problems by using Provisioned IOPS is expensive. 

Swrve, an integrated marketing and A/B testing and optimization platform for mobile apps, did something clever. They are using the c3.xlarge EC2 instances, that have two 40GB SSD devices per instance, as a cache.

Through testing, they found that RAID-0 striping (a 4-way stripe) combined with enhanceio effectively increased throughput by over 50%, for free, with no filesystem corruption problems.

How is it free? "We were planning on upgrading to the C3 class of instance anyway, and sticking with EBS as the backing store. Once you’re using an instance which has SSD ephemeral storage, there are no additional fees to use that hardware."

For great analysis, lots of juicy details, graphs, and configuration commands, please take a look at How we increased our EC2 event throughput by 50%, for free.

Categories: Architecture

Docker and Microsoft: Integrating Docker with Windows Server and Microsoft Azure

ScottGu's Blog - Scott Guthrie - Wed, 10/15/2014 - 14:30

I’m excited to announce today that Microsoft is partnering with Docker, Inc to enable great container-based development experiences on Linux, Windows Server and Microsoft Azure.

Docker is an open platform that enables developers and administrators to build, ship, and run distributed applications. Consisting of Docker Engine, a lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments.

Earlier this year, Microsoft released support for Docker containers with Linux on Azure.  This support integrates with the Azure VM agent extensibility model and Azure command-line tools, and makes it easy to deploy the latest and greatest Docker Engine in Azure VMs and then deploy Docker based images within them.

Docker Support for Windows Server + Docker Hub integration with Microsoft Azure

Today, I’m excited to announce that we are working with Docker, Inc to extend our support for Docker much further.  Specifically, I’m excited to announce that:

1) Microsoft and Docker are integrating the open-source Docker Engine with the next release of Windows Server.  This release of Windows Server will include new container isolation technology, and support running both .NET and other application types (Node.js, Java, C++, etc) within these containers.  Developers and organizations will be able to use Docker to create distributed, container-based applications for Windows Server that leverage the Docker ecosystem of users, applications and tools.  It will also enable a new class of distributed applications built with Docker that use Linux and Windows Server images together.

2) We will support the Docker client natively on Windows.  Developers and administrators running Windows will be able to use the same standard Docker client and interface to deploy and manage Docker based solutions with both Linux and Windows Server environments.

3) Docker for Windows Server container images will be available in the Docker Hub alongside the Docker for Linux container images available today.  This will enable developers and administrators to easily share and automate application workflows using both Windows Server and Linux Docker images.

4) We will integrate Docker Hub with the Microsoft Azure Gallery and Azure Management Portal.  This will make it trivially easy to deploy and run both Linux and Windows Server based Docker images in Microsoft Azure.

5) Microsoft is contributing code to Docker’s Open Orchestration APIs.  These APIs provide a portable way to create multi-container Docker applications that can be deployed into any datacenter or cloud provider environment. This support will allow a developer or administrator using the Docker command line client to launch either Linux or Windows Server based Docker applications directly into Microsoft Azure from his or her development machine.

Exciting Opportunities Ahead

At Microsoft we continue to be inspired by technologies that can dramatically improve how quickly teams can bring new solutions to market. The partnership we are announcing with Docker today will enable developers and administrators to use the best container tools available for both Linux and Windows Server based applications, and to run all of these solutions within Microsoft Azure.  We are looking forward to seeing the great applications you build with them.

You can learn more about today’s announcements here and here.

Hope this helps,

Scott

Categories: Architecture, Programming

Sponsored Post: Apple, Hypertable, VSCO, Gannett, Sprout Social, Scalyr, FoundationDB, AiScaler, Aerospike, AppDynamics, ManageEngine, Site24x7

Who's Hiring?
  • Apple has multiple openings. Changing the world is all in a day's work at Apple. Imagine what you could do here. 
    • Senior Engineer: Mobile Services. The Emerging Technologies/Mobile Services team is looking for a proactive and hardworking software engineer to join our team. The team is responsible for a variety of high quality and high performing mobile services and applications for internal use. We seek an accomplished server-side engineer capable of delivering an extraordinary portfolio of features and services based on emerging technologies to our internal customers. Please apply here.
    • Apple Pay Automation Engineer. The Apple Pay group within iOS Systems is looking for an outstanding automation engineer with strong experience in building client and server test automation. We work in an agile software development environment and are building infrastructure to move towards continuous delivery, where every code change is thoroughly tested at the push of a button and is considered ready to be deployed if we so choose. Please apply here
    • Site Reliability Engineer. As a member of the Apple Pay SRE team, you’re expected to not just find the issues, but to write code and fix them. You’ll be involved in all phases and layers of the application, and you’ll have a direct impact on the experience of millions of customers. Please apply here.
    • Software Engineering Manager. In this role, you will be communicating extensively with business teams across different organizations, development teams, support teams, infrastructure teams and management. You will also be responsible for working with cross-functional teams to deliver large initiatives. Please apply here

  • VSCO. Do you want to: ship the best digital tools and services for modern creatives at VSCO? Build next-generation operations with Ansible, Consul, Docker, and Vagrant? Autoscale AWS infrastructure to multiple Regions? Unify metrics, monitoring, and scaling? Build self-service tools for engineering teams? Contact me (Zo, zo@vs.co) and let’s talk about working together. vs.co/careers.

  • Gannett Digital is looking for talented Front-end developers with strong Python/Django experience to join their Development & Integrations team. The team focuses on video, user generated content, API integrations and cross-site features for Gannett Digital’s platform that powers sites such as http://www.usatoday.com, http://www.wbir.com or http://www.democratandchronicle.com. Please apply here.

  • Platform Software Engineer, Sprout Social, builds world-class social media management software designed and built for performance, scale, reliability and product agility. We pick the right tool for the job while being pragmatic and scrappy. Services are built in Python and Java using technologies like Cassandra and Hadoop, HBase and Redis, Storm and Finagle. At the moment we’re staring down a rapidly growing 20TB Hadoop cluster and about the same amount stored in MySQL and Cassandra. We have a lot of data and we want people hungry to work at scale. Apply here.

  • UI Engineer. AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data. AppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (all levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.
Fun and Informative Events
  • Sign Up for New Aerospike Training Courses.  Aerospike now offers two certified training courses, Aerospike for Developers and Aerospike for Administrators & Operators, to help you get the most out of your deployment.  Find a training course near you. http://www.aerospike.com/aerospike-training/

  • November TokuMX Meetups for Those Interested in MongoDB. Join us in one of the following cities in November to learn more about TokuMX and hear TokuMX use cases. 11/5 - London; 11/11 - San Jose; 11/12 - San Francisco. Not able to get to these cities? Check out our website for other upcoming Tokutek events in your area - www.tokutek.com/events.
Cool Products and Services
  • Hypertable Inc. Announces New UpTime Support Subscription Packages. The developer of Hypertable, an open-source, high-performance, massively scalable database, announces three new UpTime support subscription packages – Premium 24/7, Enterprise 24/7 and Basic. 24/7/365 support packages start at just $1995 per month for a ten node cluster -- $49.95 per machine, per month thereafter. For more information visit us on the Web at http://www.hypertable.com/. Connect with Hypertable: @hypertable--Blog.

  • FoundationDB launches SQL Layer. SQL Layer is an ANSI SQL engine that stores its data in the FoundationDB Key-Value Store, inheriting its exceptional properties like automatic fault tolerance and scalability. It is best suited for operational (OLTP) applications with high concurrency. Users of the Key-Value Store will have free access to SQL Layer. SQL Layer is also open source; you can get started with it on GitHub.

  • Diagnose server issues from a single tab. Scalyr replaces all your monitoring and log management services with one, so you can pinpoint and resolve issues without juggling multiple tools and tabs. Engineers say it's powerful and easy to use. Customer support teams use it to troubleshoot user issues. CTOs consider it a smart alternative to Splunk, with enterprise-grade functionality, sane pricing, and human support. Trusted by in-the-know companies like Codecademy – learn more!

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Categories: Architecture

AngularJS Training Week

Xebia Blog - Tue, 10/14/2014 - 07:00

Just a few more weeks and it's the AngularJS Training Week at Xebia in Hilversum (The Netherlands): four days packed with AngularJS content, from 17 to 20 October 2014. Across these days we will cover the AngularJS basics, AngularJS advanced topics, Tooling & Scaffolding, and Testing with Jasmine, Karma and Protractor.

If you already have some experience or if you are only interested in one or two of the topics, then you can sign up for just the days that are of interest to you.

Visit www.angular-training.com for a full overview of the days and topics or sign up on the Xebia Training website using the links below.

Fast and Easy integration testing with Docker and Overcast

Xebia Blog - Mon, 10/13/2014 - 18:40
Challenges with integration testing

Suppose that you are writing a MongoDB driver for Java. To verify that all the implemented functionality works correctly, you ideally want to test it against a REAL MongoDB server. This brings a couple of challenges:

  • MongoDB is not written in Java, so we cannot embed it easily in our Java application.
  • We need to install and configure MongoDB somewhere, and maintain the installation, or write scripts to set it up as part of our test run.
  • Every test we run against the Mongo server will change its state, and tests might influence each other. We want to isolate our tests as much as possible.
  • We want to test our driver against multiple versions of MongoDB.
  • We want to run the tests as fast as possible. If we want to run tests in parallel, we need multiple servers. How do we manage them?

Let's try to address these challenges.

First of all, we do not really want to implement our own MongoDB driver. Many implementations exist, and we will be reusing the mongo-java-driver to focus on how one would write the integration test code.

Overcast and Docker

We are going to use Docker and Overcast. You probably already know Docker: it's a technology for running applications inside software containers. Overcast is the library we will use to manage Docker for us. Overcast is an open source Java library developed by XebiaLabs to help you write tests that connect to cloud hosts. Overcast has support for various cloud platforms, including EC2, VirtualBox, Vagrant and Libvirt (KVM). I recently added support for Docker in Overcast version 2.4.0.

Overcast helps you decouple your test code from the cloud host setup. You can define a cloud host with all its configuration separately from your tests. In your test code you only refer to a specific Overcast configuration. Overcast takes care of creating, starting, and provisioning that host. When the tests are finished, it also tears down the host. In your tests you use Overcast to get the hostname and ports of this cloud host, because these are usually determined dynamically.

We will use Overcast to create Docker containers running a MongoDB server. Overcast will help us retrieve the port dynamically exposed by the Docker host. The host in our case will always be the Docker host, which runs on an external Linux machine. Overcast will use a TCP connection to communicate with Docker. We map the internal ports to a port on the Docker host to make them externally available. MongoDB will internally run on port 27017, but Docker will map this port to a local port in the range 49153 to 65535 (defined by Docker).

Setting up our tests

Let's get started. First, we need a Docker image with MongoDB installed. Thanks to the Docker community, this is as easy as reusing one of the existing images from the Docker Hub. All the hard work of creating such an image is already done for us, and thanks to containers we can run it on any host capable of running Docker containers. How do we configure Overcast to run the MongoDB container? This is the minimal configuration we put in a file called overcast.conf:

mongodb {
    dockerHost="http://localhost:2375"
    dockerImage="mongo:2.7"
    exposeAllPorts=true
    remove=true
    command=["mongod", "--smallfiles"]
}

That's all! The dockerHost is configured to be localhost with the default port. This is the default value and you can omit it. The Docker image called mongo, version 2.7, will be automatically pulled from the central Docker registry. We set exposeAllPorts to true to inform Docker it needs to dynamically map all ports exposed by the Docker image. We set remove to true to make sure the container is automatically removed when stopped. Notice we override the default container startup command by passing in an extra parameter "--smallfiles" to improve testing performance. For our setup this is all we need, but Overcast also has support for defining static port mappings, setting environment variables, etc. Have a look at the Overcast documentation for more details.

How do we use this Overcast host in our test code? Let's have a look at the test code that sets up the Overcast host and instantiates the MongoDB client that is used by every test. The code uses the TestNG @BeforeMethod and @AfterMethod annotations.

private CloudHost itestHost;
private Mongo mongoClient;

@BeforeMethod
public void before() throws UnknownHostException {
    itestHost = CloudHostFactory.getCloudHost("mongodb");
    itestHost.setup();

    String host = itestHost.getHostName();
    int port = itestHost.getPort(27017);

    MongoClientOptions options = MongoClientOptions.builder()
        .connectTimeout(300 * 1000)
        .build();

    mongoClient = new MongoClient(new ServerAddress(host, port), options);
    logger.info("Mongo connection: " + mongoClient.toString());
}

@AfterMethod
public void after(){
    mongoClient.close();
    itestHost.teardown();
}

It is important to understand that the mongoClient is the object under test. As mentioned before, we borrowed this library to demonstrate how one would integration-test such a library. The itestHost is the Overcast CloudHost. In before(), we instantiate the cloud host by using the CloudHostFactory. The setup() will pull the required images from the Docker registry, create a Docker container, and start this container. We get the host and port from the itestHost and use them to build our Mongo client. Notice that we put a high connection timeout on the connection options, to make sure the MongoDB server is started in time. Especially on the first run, pulling images can take some time; you can of course always pull the images beforehand. In the @AfterMethod, we simply close the connection with MongoDB and tear down the Docker container.

Writing a test

The before and after methods are executed for every test, so we get a completely clean MongoDB server for every test, running on a different port. This completely isolates our test cases so that no tests can influence each other. You are free to choose your own testing strategy; sharing a cloud host between multiple tests is also possible. Let's have a look at one of the tests we wrote for the Mongo client:

@Test
public void shouldCountDocuments() throws DockerException, InterruptedException, UnknownHostException {

    DB db = mongoClient.getDB("mydb");
    DBCollection coll = db.getCollection("testCollection");

    for (int i=0; i < 100; i++) {
        WriteResult writeResult = coll.insert(new BasicDBObject("i", i));
        logger.info("writing document " + writeResult);
    }

    int count = (int) coll.getCount();
    assertThat(count, equalTo(100));
}

Even without knowledge of MongoDB, this test should not be that hard to understand. It creates a database and a new collection, and inserts 100 documents into the database. Finally, the test asserts that the getCount method returns the correct number of documents in the collection. Many more aspects of the MongoDB client can be tested in additional tests in this way. In our example setup, we have implemented two more tests to demonstrate this. Our example project contains 3 tests. When you run the 3 example tests sequentially (assuming the Mongo Docker image has been pulled), you will see that it takes only a few seconds to run them all. This is extremely fast.
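
As an illustration of what such an additional test might look like, here is a sketch in the same style (a hypothetical example, not one of the three tests in the actual project):

@Test
public void shouldFindInsertedDocument() {
    DB db = mongoClient.getDB("mydb");
    DBCollection coll = db.getCollection("testCollection");

    // insert a single document and query it back by field value
    coll.insert(new BasicDBObject("name", "MongoDB"));
    DBObject found = coll.findOne(new BasicDBObject("name", "MongoDB"));

    assertThat((String) found.get("name"), equalTo("MongoDB"));
}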

Testing against multiple MongoDB versions

We also want to run all our integration tests against different versions of the MongoDB server to ensure there are no regressions. Overcast allows you to define multiple configurations. Let's add configurations for two more versions of MongoDB:

defaultConfig {
    dockerHost="http://localhost:2375"
    exposeAllPorts=true
    remove=true
    command=["mongod", "--smallfiles"]
}

mongodb27=${defaultConfig}
mongodb27.dockerImage="mongo:2.7"

mongodb26=${defaultConfig}
mongodb26.dockerImage="mongo:2.6"

mongodb24=${defaultConfig}
mongodb24.dockerImage="mongo:2.4"

The default configuration contains the settings we have already seen. The other three configurations extend from the defaultConfig and define a specific MongoDB image version. Let's also change our test code a little bit to make the Overcast configuration we use in the test setup depend on a parameter:

@Parameters("overcastConfig")
@BeforeMethod
public void before(String overcastConfig) throws UnknownHostException {
    itestHost = CloudHostFactory.getCloudHost(overcastConfig);
    // ... the rest of before() is unchanged from the version shown earlier
}

Here we use the parameterized test feature from TestNG. We can now define a TestNG suite that defines our test cases and how to pass in the different Overcast configurations. Let's have a look at our TestNG suite definition:

<suite name="MongoSuite" verbose="1">
    <test name="MongoDB27tests">
        <parameter name="overcastConfig" value="mongodb27"/>
        <classes>
            <class name="mongo.MongoTest" />
        </classes>
    </test>
    <test name="MongoDB26tests">
        <parameter name="overcastConfig" value="mongodb26"/>
        <classes>
            <class name="mongo.MongoTest" />
        </classes>
    </test>
    <test name="MongoDB24tests">
        <parameter name="overcastConfig" value="mongodb24"/>
        <classes>
            <class name="mongo.MongoTest" />
        </classes>
    </test>
</suite>

With this test suite definition we define 3 test cases that pass a different Overcast configuration to the tests. The Overcast configuration plus the TestNG configuration enables us to externally configure which MongoDB versions we want to run our test cases against.

Parallel test execution

Until this point, all tests were executed sequentially. Due to the dynamic nature of cloud hosts and Docker, nothing stops us from running multiple containers at once. Let's change the TestNG configuration a little bit to enable parallel testing:

<suite name="MongoSuite" verbose="1" parallel="tests" thread-count="3">

This configuration will cause all 3 test cases from our test suite definition to run in parallel (in other words, our 3 Overcast configurations with different MongoDB versions). Let's now run the tests from IntelliJ and see if all tests pass:

[Screenshot: TestNG test results in IntelliJ]
We see 9 executed tests, because we have 3 tests and 3 configurations. All 9 tests passed. The total execution time turned out to be under 9 seconds. That's pretty impressive!

During test execution we can see Docker starting up multiple containers (see the next screenshot). As expected, it shows 3 containers with different image versions running simultaneously. It also shows the dynamic port mappings in the "PORTS" column:

[Screenshot: docker ps output showing three MongoDB containers with dynamic port mappings]

That's it!

Summary

To summarise, the advantages of using Docker with Overcast for integration testing are:

  1. Minimal setup. Only a Docker-capable host is required to run the tests.
  2. Save time. Minimal configuration and infrastructure setup is required to run the integration tests, thanks to the Docker community.
  3. Isolation. All tests run in their own isolated environments, so they do not affect each other.
  4. Flexibility. Use multiple Overcast configurations and parameterized tests to test against multiple versions.
  5. Speed. Docker containers start up very quickly, and Overcast and TestNG even let you parallelize testing by running multiple containers at once.

The example code for our integration test project is available here. You can use Boot2Docker to set up a Docker host on Mac or Windows.

Happy testing!

Paul van der Ende 

Note: Due to a bug in the Gradle parallel test runner you might run into this random failure when you run the example test code yourself. The workaround is to disable parallelism or to use a different test runner like IntelliJ or Maven.

 

Watch the open files limit when running Riak

Agile Testing - Grig Gheorghiu - Mon, 10/13/2014 - 17:53
I was close to expressing my unbridled joy at how little hand-holding our Riak cluster needs, when we started to see strangely increased latencies when hitting the cluster, on calls that should have been very fast. Also, the health of the Riak nodes seemed fine in terms of CPU, memory and disk. As usual, our good old friend the error log file pointed us towards the solution. We saw entries like this in /var/log/riak/error.log:

2014-10-11 03:22:40.565 UTC [error] <0.12830.4607> CRASH REPORT Process <0.12830.4607> with 0 neighbours exited with reason: {error,accept_failed} in mochiweb_acceptor:init/3 line 34
2014-10-11 03:22:40.619 UTC [error] <0.168.0> {mochiweb_socket_server,310,{acceptor_error,{error,accept_failed}}}
2014-10-11 03:22:40.619 UTC [error] <0.12831.4607> application: mochiweb, "Accept failed error", "{error,emfile}"
A Google search revealed that a possible cause of these errors is the dreaded open file descriptor limit, which is 1024 by default on Ubuntu.

To be perfectly honest, we had done almost no tuning on our Riak cluster, because it had been running so smoothly. But recently we started to throw more traffic at it, hence issues with open file descriptors made sense. To fix it, we followed the advice in this Riak doc and created /etc/default/riak with the contents:
ulimit -n 65536
We also took the opportunity to apply the networking-related kernel tuning recommendations from this other Riak tuning doc and added these lines to /etc/sysctl.conf:
net.ipv4.tcp_max_syn_backlog = 40000
net.core.somaxconn = 4000
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_tw_reuse = 1
Then we ran sysctl -p to update the above values in the kernel. Finally we restarted our Riak nodes one at a time.
I am happy to report that ever since, we've had absolutely no issues with our Riak cluster.  I should also say we are running Riak 1.3, and I understand that Riak 2.0 has better tests in place for avoiding this issue.

I do want to give kudos to Basho for an amazingly robust piece of technology, whose only fault is that it gets you into the habit of ignoring it because it just works!

How Digital is Changing Physical Experiences

The business economy is going through massive change, as the old world meets the new world.

The convergence of mobility, analytics, social media, cloud computing, and embedded devices is driving the next wave of digital business transformation, where the physical world meets new online possibilities.

And it’s not limited to high-tech and media companies.

Businesses that master the digital landscape are able to gain strategic, competitive advantage.   They are able to create new customer experiences, they are able to gain better insights into customers, and they are able to respond to new opportunities and changing demands in a seamless and agile way.

In the book Leading Digital: Turning Technology into Business Transformation, George Westerman, Didier Bonnet, and Andrew McAfee share some of the ways that businesses are meshing the physical experience with the digital experience to generate new business value.

Provide Customers with an Integrated Experience

Businesses that win find new ways to blend the physical world with the digital world.  To serve customers better, businesses are integrating the experience across physical, phone, mail, social, and mobile channels for their customers.

Via Leading Digital: Turning Technology into Business Transformation:

“Companies with multiple channels to customers--physical, phone, mail, social, mobile, and so on--are experiencing pressure to provide an integrated experience.  Delivering these omni-channel experiences requires envisioning and implementing change across both front-end and operational processes.  Innovation does not come from opposing the old and the new.  But as Burberry has shown,  innovation comes from creatively meshing the digital and the physical to reinvent new and compelling customer experiences and to foster continuous innovation.”

Bridge In-Store Experiences with New Online Possibilities

Starbucks is a simple example of blending digital experiences with their physical store.   To serve customers better, they deliver premium content to their in-store customers.

Via Leading Digital: Turning Technology into Business Transformation:

“Similarly, the unique Starbucks experience is rooted in connecting with customers in engaging ways.  But Starbucks does not stop with the physical store.  It has digitally enriched the customer experience by bridging its local, in-store experience with attractive new online possibilities.  Delivered via a free Wi-Fi connection, the Starbucks Digital Network offers in-store customers premium digital content, such as the New York Times or The Economist, to enjoy alongside their coffee.  The network also offers access to local content, from free local restaurant reviews from Zagat to check-in via Foursquare.”

An Example of Museums Blending Technology + Art

Museums can create new possibilities by turning walls into digital displays.  With a digital display, the museum can showcase all of their collections and provide rich information, as well as create new backdrops, or tailor information and tours for their customers.

Via Leading Digital: Turning Technology into Business Transformation:

“Combining physical and digital to enhance customer experiences is not limited to just commercial enterprises.  Public services are getting in on the act.  The Cleveland Museum of Art is using technology to enhance the experience and the management of visitors.  'EVERY museum is searching for this holy grail, this blending of technology and art,' said David Franklin, the director of the museum.

 

Forty-foot-wide touch screens display greeting-card-sized images of all three thousand objects, and offer information like the location of the actual piece.  By touching an icon on the image, visitors can transfer it from the wall to an iPad (their own, or rented from the museum for $5 a day), creating a personal list of favorites.  From this list, visitors can design a personalized tour, which they can share with others.

 

'There is only so much information you can put on a wall, and no one walks around with catalogs anymore,' Franklin said.  The app can produce a photo of the artwork's original setting--seeing a tapestry in a room filled with tapestries, rather than in a white-walled gallery, is more interesting.  Another feature lets you take the elements of a large tapestry and rearrange them in either comic-book or movie-trailer format.  The experience becomes fun, educational, and engaging.  This reinvention has lured new technology-savvy visitors, but has also made seasoned museum-goers come more often.”

As you figure out the future capability vision for your business, and re-imagine what’s possible, consider how the Nexus of Forces (Cloud, Mobile, Social, and Big Data), along with the mega-trend (Internet-of-Things), can help you shape your digital business transformation.

You Might Also Like

Cloud Changes the Game from Deployment to Adoption

Management Innovation is at the Top of the Innovation Stack

McKinsey on Unleashing the Value of Big Data Analytics

Categories: Architecture, Programming

How League of Legends Scaled Chat to 70 Million Players - It Takes Lots of Minions

How would you build a chat service that needed to handle 7.5 million concurrent players, 27 million daily players, 11K messages per second, and 1 billion events per server, per day?

What could generate so much traffic? A game of course. League of Legends. League of Legends is a team based game, a multiplayer online battle arena (MOBA), where two teams of five battle against each other to control a map and achieve objectives.

For teams to succeed, communication is crucial. I learned that from Michal Ptaszek, in an interesting talk on Scaling League of Legends Chat to 70 million Players (slides) at the Strange Loop 2014 conference. Michal gave a good example of why multiplayer team games require good communication between players. Imagine a basketball game without the ability to call plays. It wouldn’t work. So that means chat is crucial. Chat is not a Wouldn’t It Be Nice feature.

Michal structures the talk in an interesting way, using as a template the expression: Make it work. Make it right. Make it fast.

Making it work meant starting with XMPP as a base for chat. WhatsApp followed the same strategy. Out of the box you get something that works and scales well...until the user count really jumps. To make it right and fast, like WhatsApp, League of Legends found themselves customizing the Erlang VM. Adding lots of monitoring capabilities and performance optimizations to remove the bottlenecks that kill performance at scale.

Perhaps the most interesting part of their chat architecture is the use of Riak’s CRDTs (commutative replicated data types) to achieve their goal of shared-nothing fueled, massively linear horizontal scalability. CRDTs are still esoteric, so you may not have heard of them yet, but they are the next cool thing if you can make them work for you. It’s a different way of thinking about handling writes.
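
If CRDTs are new to you, the classic toy example is the grow-only G-Counter, sketched below in Java. This is an illustration of the idea only, not League of Legends' actual Riak-based implementation: each replica increments only its own slot, and merging takes the per-replica maximum, so replicas can accept writes independently and still converge.

import java.util.HashMap;
import java.util.Map;

// G-Counter: a grow-only counter CRDT.
class GCounter {
    private final Map<String, Long> counts = new HashMap<>();
    private final String replicaId;

    GCounter(String replicaId) { this.replicaId = replicaId; }

    // Each replica only ever writes to its own slot.
    void increment() {
        counts.merge(replicaId, 1L, Long::sum);
    }

    // The counter's value is the sum across all replicas.
    long value() {
        return counts.values().stream().mapToLong(Long::longValue).sum();
    }

    // Merging takes the per-replica maximum; it is commutative, associative,
    // and idempotent, so replicas can exchange state in any order and agree.
    void merge(GCounter other) {
        other.counts.forEach((replica, n) -> counts.merge(replica, n, Math::max));
    }
}

Riak 2.0 generalizes this idea into ready-made data types such as counters, sets, and maps.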

Let’s learn how League of Legends built their chat system to handle 70 million players...

Stats
Categories: Architecture

Xebia KnowledgeCast Episode 5: Madhur Kathuria and Scrum Day Europe 2014

Xebia Blog - Mon, 10/13/2014 - 10:48

The Xebia KnowledgeCast is a bi-weekly podcast about software architecture, software development, lean/agile, continuous delivery, and big data. Also, we'll have some fun with stickies!

In this 5th episode, we share key insights of Madhur Kathuria, Xebia India’s Director of Agile Consulting and Transformation, as well as some impressions of our Knowledge Exchange and Scrum Day Europe 2014. And of course, Serge Beaumont will have Fun With Stickies!

First, Madhur Kathuria shares his vision on Agile and we interview Guido Schoonheim at Scrum Day Europe 2014.

In this episode's Fun With Stickies Serge Beaumont talks about wide versus deep retrospectives.

Then, we interview Martin Olesen and Patricia Kong at Scrum Day Europe 2014.

Want to subscribe to the Xebia KnowledgeCast? Subscribe via iTunes, or use our direct rss feed.

Your feedback is appreciated. Please leave your comments in the shownotes. Better yet, send in a voice message so we can put you ON the show!

Credits

Stuff The Internet Says On Scalability For October 10th, 2014

Hey, it's HighScalability time:


Social climber: Instagram explorer scales to new heights in New York.

 

  • 11 billion: world population in 2100; 10 petabytes: Size of Netflix data warehouse on S3; $600 Billion: the loss when a trader can't type; 3.2: 0-60 mph time of probably not my next car.
  • Quotable Quotes:
    • @kahrens_atl: Last week #NewRelic Insights wrote 618 billion events and ran 237 trillion queries with 9 millisecond response time #FS14
    • @sustrik: Imagine debugging on a quantum computer: Looking at the value of a variable changes its value. I hope I'll be out of business by then.
    • Arrival of the Fittest: Solving Evolution's Greatest Puzzle: Every cell contains thousands of such nanomachines, each of them dedicated to a different chemical reaction. And all their complex activities take place in a tiny space where the molecular building blocks of life are packed more tightly than a Tokyo subway at rush hour. Amazing.
    • Eric Schmidt: The simplest outcome is we're going to end up breaking the Internet," said Google's Schmidt. Foreign governments, he said, are "eventually going to say, we want our own Internet in our country because we want it to work our way, and we don't want the NSA and these other people in it.
    • Antirez: Basically it is neither a CP nor an AP system. In other words, Redis Cluster does not achieve the theoretical limits of what is possible with distributed systems, in order to gain certain real world properties.
    • @aliimam: Just so we can fathom the scale of 1B vs 1M: 1,000,000 seconds is 11.5 days. 1,000,000,000 seconds is 31.6 YEARS
    • @kayousterhout: 92% of catastrophic failures in distributed data-intensive systems caused by incorrect error handling https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf … #osdi14
    • @DrQz: 'The purpose of computing is insight, not numbers.' (Hamming) Sometimes numbers ARE the insight, so make them accessible too. (Me)

  • Robert Scoble on the Gillmor Gang said that because of the crush of signups, ello had to throttle invites. Their single PostgreSQL server couldn't handle it, captain.

  • Containers are getting much larger with new composite materials. Not that kind of container. Shipping containers. High oil costs have driven ships carrying 5000 containers to evolve. Now they can carry 18,000 and soon 19,000 containers!

  • If you've wanted to make a network game then this is a great start. Making Fast-Paced Multiplayer Networked Games is Hard: Fast-paced multiplayer games over the Internet are hard, but possible. First understanding your constraints then building within them is essential. I hope I have shed some light on what those constraints are and some of the techniques you can use to build within them. No doubt there are other ways out there and ways yet to be used. Each game is different and has its own set of priorities. Learning from what has been done before could help a great deal.

  • Arrival of the Fittest: Solving Evolution's Greatest Puzzle: Environmental change requires complexity, which begets robustness, which begets genotype networks, which enable innovations, the very kind that allow life to cope with environmental change, increase its complexity, and so on, in an ascending spiral of ever-increasing innovability...is the hidden architecture of life.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

New daily stand up questions

Xebia Blog - Fri, 10/10/2014 - 15:51

This post provides some alternative stand-up questions to make your stand-up forward-looking, goal-focused, and team-focused.

The questions are:

  1. What have I achieved since our last SUM?
  2. What is my goal for today?
  3. What things keep me from reaching my goal?
  4. What is our team goal for the end of our sprint day?

The daily stand up runs on a few standard questions. The traditional questions are:

  • What did I accomplish yesterday?
  • What will I be doing today?
  • What obstacles are impeding my progress?

A couple of effects I see when using the above list are:

  • A lot of emphasis is placed on past activities rather than getting the most out of the day at hand.
  • Team members say what they will be busy with, but not what they aim to complete.
  • Impediments are not related to daily goals.
  • There is no summary for the team relating to the sprint goal.

If you are experiencing the same issues, you could try the alternative questions. They worked for me, but any feedback is of course appreciated. Are you using other questions? Let me know your experience. You can use the PDF below to print out the questions for your scrum board.

STAND_EN

 

The LGPL on Android

Xebia Blog - Fri, 10/10/2014 - 08:11

My client had me code review an Android app built for them by a third party. As part of my review, I checked the licensing terms of the open source libraries that it used. Most were using Apache 2.0 without a NOTICE file. One was using the GNU Lesser General Public License (LGPL).

My client has commercial reasons to avoid Copyleft-style licenses and so I flagged the library as unusable. The supplier understandably was not thrilled about the rework that implied and asked for an explanation and ideally some way to make it work within the license. Looking into it in more detail, I'm convinced that if you share my client's concerns, then there is no way to use LGPL licensed code on Android. Here's why I believe this to be the case.

The GNU LGPL

When I first encountered the LGPL years ago, it was explained to me as “the GPL, without the requirement to publish your source code”. The actual license terms turn out to be a bit more restrictive. The LGPL is an add-on to the full GPL that weakens (only) the restrictions on how you license and distribute your work. These weaker restrictions are in section 4.

Here's how I read that section:

You may convey a Combined Work under terms of your choice that […] if you also
do each of the following:
  a) [full attribution]
  b) [include a copy of the license]
  c) [if you display any copyright notices, you must mention the licensed Library]
  d) Do one of the following:
    0) [provide means for the user to rebuild or re-link your application against
       a modified version of the Library]
    1) [use runtime linking against a copy already present on the system, and allow
       the user to replace that copy]
  e) [provide clear instructions how to rebuild or re-link your application in light
     of the previous point]

The LGPL on Android

An Android app can use two kinds of libraries: Java libraries and native libraries. Both run into the same problem with the LGPL.

The APK file format for Android apps is a single, digitally signed package. It contains native libraries directly, while Java libraries are packaged along with your own bytecode into the dex file. Android has no means of installing shared libraries into the system outside of your APK, ruling out (d)(1) as an option. That leaves (d)(0). Making the library replaceable is not the issue. It may not be the simplest thing, but I'm sure there is some way to make it work for both kinds of libraries.

That leaves the digital signature, and here's where it breaks down. Any user who replaces the LGPL licensed library in your app will have to digitally sign their modified APK file. You can't publish your code signing key, so they have to sign with a different key. This breaks signature compatibility, which breaks updates and custom permissions and makes shared preferences and expansion files inaccessible. It can therefore be argued that such an APK file is not usable in lieu of the original app, thus violating the license.

In short

The GNU Lesser General Public License ensures that a user has freedom to modify a so licensed library used by your application, even if your application is itself closed source. Android's app packaging and signature requirements are such that I believe it is impossible to comply with the license when using an LGPL licensed library in a closed source Android app.

Function references in Swift and retain cycles

Xebia Blog - Thu, 10/09/2014 - 14:49

The Swift programming language comes with some nice features. One of those features is closures, which are similar to blocks in Objective-C. As mentioned in the Apple guides, functions are special types of closures and they too can be passed around to other functions and set as property values. In this post I will go through some sample uses and especially explain the dangers of the retain cycles that you can quickly run into when retaining function references.

Let's first have a look at a fairly simple Objective-C sample before we write something similar in Swift.

Objective-C

We will create a button that executes a block statement when tapped.

In the header file we define a property for the block:

@interface BlockButton : UIButton

@property (nonatomic, strong) void (^action)();

@end

Keep in mind that this is a strong reference: the block, and anything the block references, will be retained.

And then the implementation will execute the block when tapped:

#import "BlockButton.h"

@implementation BlockButton

-(void)setAction:(void (^)())action
{
    _action = action;
    [self addTarget:self action:@selector(performAction) forControlEvents:UIControlEventTouchUpInside];
}

-(void)performAction {
    self.action();
}

@end

We can now use this button in one of our view controllers as follows:

self.button.action = ^{
    NSLog(@"Button Tapped");
};

We will now see the message "Button Tapped" logged to the console each time we tap the button. And since we don't reference self within our block, we won't get into trouble with retain cycles.

In many cases, however, it's likely that you will reference self within the block, because you might want to call a function that you also need to call from other places. Let's look at such an example:

-(void)viewDidLoad {
    self.button.action = ^{
        [self buttonTapped];
    };
}

-(void)buttonTapped {
    NSLog(@"Button Tapped");
}

Because our view controller (or its view) retains our button, and the button retains the block, we're creating a retain cycle here: the block will create a strong reference back to self. That means that our view controller will never be deallocated and we'll have a memory leak.

This can easily be solved by using a weak reference to self:

__weak typeof(self) weakSelf = self;
self.button.action = ^{
    [weakSelf buttonTapped];
};

Nothing new so far, so let's continue with creating something similar in Swift.

Swift

In Swift we can create a similar button that executes a closure instead of a block:

class ClosureButton: UIButton {

    var action: (() -> ())? {
        didSet {
            addTarget(self, action: "callClosure", forControlEvents: .TouchUpInside)
        }
    }

    func callClosure() {
        if let action = action {
            action()
        }
    }
}

It does the same as the Objective-C version (and in fact you could use it from Objective-C with the same block as before). We can assign it an action from our view controller as follows:

button.action = {
    println("Button Tapped")
}

Since this closure doesn't capture self, we won't be running into problems with retain cycles here.

As mentioned earlier, functions are just a special type of closure. That's pretty nice, because it lets us reference functions directly like this:

override func viewDidLoad() {
    button.action = buttonTapped
}

func buttonTapped() {
    println("Button Tapped")
}

Nice and easy syntax, and good for functional programming. If only it didn't give us problems. Without it being immediately obvious, the above sample does create a retain cycle. Why? We're not referencing self anywhere. Or are we? The problem is that the buttonTapped function is part of our view controller instance. So when button.action references that function, it creates a strong reference to the view controller as well. In this case we could fix it by making buttonTapped a class function. But since in most cases you'll want to do something with self in such a function, for example accessing variables, that is not an option.
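To make the cycle visible, here is a minimal sketch (the class name and the deinit logging are illustrative, not from the sample above) in which deinit never runs because the stored function reference keeps the view controller alive:

class LeakyViewController: UIViewController {

    let button = ClosureButton()

    override func viewDidLoad() {
        super.viewDidLoad()
        view.addSubview(button)
        // self retains button, button.action retains self: a retain cycle
        button.action = buttonTapped
    }

    func buttonTapped() {
        println("Button Tapped")
    }

    deinit {
        // Never printed: the cycle keeps this instance alive forever
        println("LeakyViewController deallocated")
    }
}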

The only thing we can do to fix this is to make sure that the button won't get a strong reference to the view controller. Just like in our last Objective-C sample, we need to create a weak reference to self. Unfortunately there is no easy way to simply get a weak reference to our function. So we need a workaround here.

Workaround 1: wrapping in a closure

We can create a weak reference by wrapping the function in a closure:

button.action = { [weak self] in
    self!.buttonTapped()
}

Here we first create a weak reference to self. In Swift, weak references are always optionals. That means self within this closure is now an optional and we need to unwrap it first, which is what the exclamation mark is for. Since we know this code cannot be called after self is deallocated, we can safely use ! instead of ?.
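If you'd rather avoid the force unwrap, optional chaining achieves the same result and simply does nothing once self has been deallocated. A sketch, equivalent to the version above:

button.action = { [weak self] in
    // Optional chaining: the call is silently skipped if self is gone
    self?.buttonTapped()
}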

A lot less elegant than referencing our function directly.

In theory, using an unowned reference to self should also work, as follows:

button.action = { [unowned self] in
    self.buttonTapped()
}

Unfortunately (for reasons unknown to me) this crashes with an EXC_BAD_ACCESS upon deallocation of the ClosureButton. Probably a bug.

Workaround 2: method pointer function

Thanks to a question on StackOverflow about this same problem and an answer provided by Rob Napier, there is a way to make the code a bit more elegant again. We can define a function that does the wrapping in a closure for us:

func methodPointer<T: AnyObject>(obj: T, method: (T) -> () -> Void) -> (() -> Void) {
    return { [weak obj] in
        method(obj!)()
    }
}

Now we can get a weak reference to our function a bit more easily.

button.action = methodPointer(self, MyViewController.buttonTapped)

The reason this works is that you can get a reference to any instance function by calling it as a class function with the instance (in this case self) as an argument. For example, the following all do the same thing:

// normal call
self.buttonTapped()

// get reference through class
let myFunction = MyViewController.buttonTapped(self)
myFunction()

// directly through class
MyViewController.buttonTapped(self)()

However, the downside of this is that it only works with functions that take no arguments and return Void, i.e. methods with a () -> () signature, like our buttonTapped.

For each signature we would have to create a separate function. For example, for a function that takes a String parameter and returns an Int:

func methodPointer<T: AnyObject>(obj: T, method: (T) -> (String) -> Int) -> ((String) -> Int) {
    return { [weak obj] string in
        method(obj!)(string)
    }
}

We can then use it the same way:

func someFunction() {
    let myFunction = methodPointer(self, MyViewController.stringToInt)
    let myInt = myFunction("123")
}

func stringToInt(string: String) -> Int {
    // toInt() returns an Int?, so fall back to 0 when parsing fails
    return string.toInt() ?? 0
}

Retain cycles within a single class instance

Retain cycles do not only happen when strong references are made between two instances of a class. It's also possible, and probably less obvious, to create a strong reference within the same instance. Let's look at an example:

var print: ((String) -> ())?

override func viewDidLoad() {
    print = printToConsole
}

func printToConsole(message: String) {
    println(message)
}

Here we do pretty much the same as in our button examples. We define an optional closure variable and then assign a function reference to it. This creates a strong reference from the print variable to self, and thus a retain cycle. We need to solve it using the same tricks we used earlier.

Another example is when we define a lazy variable. Since lazy variables are assigned after initialisation, they are allowed to reference self directly. That means we can set them to a function reference as follows:

lazy var print: ((String) -> ()) = self.printToConsole

Of course this also creates a retain cycle.
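The fix follows the same pattern as before: wrap the call in a closure that captures self weakly. A sketch of what that could look like for the lazy variable:

lazy var print: ((String) -> ()) = { [weak self] message in
    // The weak capture breaks the cycle; do nothing once self is gone
    if let strongSelf = self {
        strongSelf.printToConsole(message)
    }
}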

Conclusion

To avoid creating retain cycles in Swift you should always remember that a reference to an instance function means that you're referencing the instance as well. Thus, when assigning it to a variable, you're creating a strong reference. Always make sure to wrap such references in a closure with a weak reference to the instance, or manually set the variables to nil once you're done with them.
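For the manual option, a sketch of breaking the cycle by hand (assuming the ClosureButton example from earlier): clear the reference at a point where you know you no longer need it, for example when the view disappears:

override func viewDidDisappear(animated: Bool) {
    super.viewDidDisappear(animated)
    // With action set to nil, nothing retains the view controller anymore
    button.action = nil
}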

Unfortunately Swift does not support weak closure variables, which is something that would solve the problem. Hopefully they will support it in the future or come up with a way to create a weak reference to a function much like we can use [weak self] now in closures.

That's Not My Problem - I'm Renting Them

Scott Hanselman gives a hilarious and insightful talk in Virtual Machines, JavaScript and Assembler, a keynote at Velocity Santa Clara 2014. The topic of his talk is an intuitive understanding of the cloud and why it's the best thing ever. 

At about 6:30 into the video Scott is at his standup comic best when he recounts a story of a talk Adrian Cockcroft gave on Netflix’s move to SSDs. An audience member energetically questioned the move to SSDs, saying they had high failure rates and that moving to SSDs was a stupid idea.

To which Mr. Cockcroft replies:

That's not my problem, I'm renting them.

Scott selected the ideal illustration of the high level of abstraction the cloud provides. If you are new to the cloud, that's a very hard idea to grasp. "That's not my problem, I'm renting them" is the perfect mantra whenever you find yourself worrying about things you no longer need to worry about.

Categories: Architecture