Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools

Engine Yard Blog

Polyglot Background Jobs

Tue, 06/25/2013 - 20:05

There are many things we end up needing background jobs for, but the main reason is to provide a snappy, non-blocking user experience.

Whether the task is encoding a video file, importing a batch of data, or (in one case I ran into) Jabber instant messaging, we want to offload it from our web servers as quickly as possible.

There are lots of tools to accomplish this across all languages, including Resque, Sidekiq, delayed_job, node-schedule, beanstalkd, Amazon Simple Queue Service (SQS) and then there is my personal favorite: Gearman.

Gearman has client libraries in C, PHP, Ruby, Node.js, Python, Java, Perl, C#/.NET and even includes tools that can be called via shell script, and user-defined functions for both MySQL and PostgreSQL.

Gearman itself is written in C, and is super simple. If you get a chance, I highly recommend checking out the source code. Note: Gearman was originally written in Perl and later rewritten in C. Be sure not to use the Perl version (e.g. dev-perl/Gearman* in Gentoo portage).

The main reason I like gearmand is its simplicity. Gearman has three parts to it:

  1. GearmanClient submits tasks to the job queue

  2. gearmand is the job queue itself (running as a daemon)

  3. GearmanWorker retrieves the tasks from the job queue and handles them
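The three roles above can be sketched in a single Ruby process. This is only an analogy, not Gearman itself: a thread-safe Queue stands in for gearmand, a thread stands in for the worker, and the task name 'upcase' and its handler are made up for illustration. Real Gearman clients and workers talk to gearmand over the network.

```ruby
require 'json'
require 'thread'

# Stand-in for gearmand: a thread-safe FIFO queue (hypothetical, in-process only).
job_queue = Queue.new
results   = Queue.new

# Stand-in for a worker's registered abilities: task name => handler.
abilities = {
  'upcase' => ->(data) { data['text'].upcase }
}

# "Worker": pulls jobs off the queue until it sees the shutdown marker.
worker = Thread.new do
  while (job = job_queue.pop) != :shutdown
    task, payload = job
    results << abilities.fetch(task).call(JSON.parse(payload))
  end
end

# "Client": submits a task name plus a JSON payload, then carries on.
job_queue << ['upcase', JSON.generate({ 'text' => 'hello' })]
job_queue << :shutdown
worker.join

first_result = results.pop
puts first_result
```

The client never calls the handler directly; it only knows the task name and the payload format, which is what lets the two sides be written in different languages.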

Gearman Communication Diagram-1

By default, the Gearman queue is stored in memory; however, you can also make it persistent by storing it in MySQL, PostgreSQL, memcached or SQLite. With memcached, if it runs on the same machine as gearmand, you are likely to lose the queue just as easily as the in-memory one. The only difference is that you can restart gearmand without losing the queue.

However, another potential option is to use the new MySQL 5.6 NoSQL Interface, which supports the memcached protocol. This should be faster than using the Gearman MySQL backend without sacrificing the persistence it brings.

Gearman can obviously run background jobs, since that is what this post is all about, but it also supports foreground jobs, which allow the GearmanClient and the GearmanWorker to communicate with each other using gearmand as the middleman.

The best thing about Gearman is that you can use different languages for different pieces. So you build your website in PHP, but maybe it's not the best option for wrangling text; so you schedule a job with gearmand, and a Python worker picks it up. Or Ruby, or Node.js, or… you get the idea.

What this allows us to do is to pick the correct tool for every task in our stack. Why work around the pitfalls of our primary language when we can simply pick up a better tool and do things right?

Using Gearman

First we are going to use PHP to schedule a task with the job queue. This uses the pecl/gearman extension.

function createBackgroundJob($task, $data = array()) {
    $client = new \GearmanClient();
    $client->addServer(/* Defaults to 127.0.0.1, 4730 */);
    $handle = $client->doBackground($task, json_encode($data));

    if ($client->returnCode() != GEARMAN_SUCCESS) {
        return false;
    }

    return $handle;
}

In this simple example we create an instance of the \GearmanClient class, tell it to connect to the default server (localhost:4730) and send a background task ($client->doBackground()).

Next we ensure that the task was added successfully, and return the job handle.

We might call it with something like this, passing in the username:

$handle = createBackgroundJob('sendWelcomeEmail', ['username' => 'dshafik']);

We would then want to store the handle so that we can later check the status of the task.

The Worker

Next we'll create a worker, this time in Ruby:

require 'rubygems'
require 'gearman'
require 'json'

servers = ['localhost:4730']
worker = Gearman::Worker.new(servers)

# Add a handler for the "sendWelcomeEmail" task
worker.add_ability('sendWelcomeEmail') do |data, job|
  # The payload is the JSON string sent by the PHP client
  data = JSON.parse(data)
  user = User.first(:conditions => ["username = ?", data["username"]])
  user.sendWelcomeEmail
end

# Keep asking gearmand for jobs, one at a time
loop { worker.work }

Here we use the gearman-ruby gem to create a Gearman::Worker, and then register the task handler.

In this case, we first decode the JSON data passed in from our GearmanClient and then find our user in the database by the username. We then call the sendWelcomeEmail method.

For something that takes more time, you could send back a running status. The job variable is an instance of the Gearman::Worker::Job class, which allows you to respond using job.report_status(numerator, denominator).

It’s important to note that you can run as many workers for each task as you’d like. Gearman will not hand the same job to multiple workers (although there is a retry config option should a job fail), and because workers pull jobs from the queue, Gearman will not overload them, though you may run out of free workers. The number of workers you run can also act as a way to manage priority (higher-priority jobs get more workers) and to balance resources.

Checking the Status

Finally, we'll need a way to check the status of the request. For this we’ll use Node.js/JavaScript. In our case we are only looking to see if the job has completed, as we haven't sent any other status.

var http = require('http'), 
    url = require("url"),
    querystring = require("querystring"),
    gearman = require("gearman");

var server = http.createServer(function (request, response) {
    var client = gearman.createClient();
    var query = url.parse(request.url, true).query;

    if (!("handle" in query)) {
        response.writeHead(404, {"Content-Type": "text/plain"});
        response.end("Job not found!\n");
    } else {
        var status = { };
        client.getJobStatus(query.handle, function(s) { 
            if (s) {
                status = s;
            }

            response.writeHead(200, {"Content-Type": "application/json"});
            response.end(JSON.stringify(status));
        });
    }
});

server.listen(8000);

This creates an HTTP server on port 8000 that, when passed a job handle as a GET argument, returns the status of that job.

Using Gearman with Engine Yard Cloud

In order to make Gearman a part of the background job processes on your Engine Yard Cloud account, it is necessary to create a custom chef recipe to compile it yourself (chef recipes can be used to take advantage of software outside of the current stack). For more details on using Chef with Engine Yard Cloud, check out our knowledge base.

As with all background jobs, best practices recommend that Gearman be run on a Utility Instance, so that all jobs are processed without interfering with the Application Instances themselves.

Can't we all just get along?

So, as you can see, Gearman can act like glue between the various parts of your application. It's super fast, has low resource usage and can be used with almost any language you can think of.

Additionally, it can not only do foreground tasks (with communication), but can also prioritize jobs into high/standard/low priority queues.
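To make the high/standard/low idea concrete, here is a minimal sketch of a three-level queue in plain Ruby. It illustrates the concept only and is not how gearmand is implemented; the class name PriorityJobQueue, its methods and the job names are all hypothetical.

```ruby
# Hypothetical three-level job queue: higher levels are always drained first,
# and jobs within a level are served in the order they were submitted.
class PriorityJobQueue
  LEVELS = [:high, :normal, :low].freeze

  def initialize
    @queues = LEVELS.map { |level| [level, []] }.to_h
  end

  # Raises KeyError for an unknown priority level.
  def submit(job, priority: :normal)
    @queues.fetch(priority) << job
    self
  end

  # Returns the oldest job from the highest non-empty level, or nil if empty.
  def next_job
    LEVELS.each do |level|
      job = @queues[level].shift
      return job unless job.nil?
    end
    nil
  end
end

queue = PriorityJobQueue.new
queue.submit('rebuild-search-index', priority: :low)
queue.submit('send-welcome-email')
queue.submit('reset-password', priority: :high)

order = []
while (job = queue.next_job)
  order << job
end
puts order.inspect
```

A worker that always asks for the next job this way sees urgent work first, regardless of the order in which clients submitted it.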

You can also easily scale Gearman, as the clients and workers both support multiple servers, allowing you to spread your queue and your workers out over multiple machines.

I highly recommend checking it out at http://gearman.org.

 

The post Polyglot Background Jobs appeared first on Engine Yard Developer Blog.

Categories: Programming

June 21, 2013: This Week at Engine Yard

Fri, 06/21/2013 - 22:38

We've been very busy on a number of exciting projects that we're looking forward to sharing with you all. While I can’t tell you what they are just yet, I can point you to some adorable animal photos.

Have an awesome weekend!

--Tasha Drew, Product Manager

Engineering Updates

PHP customers will be excited to know we’ve released Composer support! Composer is a dependency manager for PHP and allows developers to specify project dependencies in a composer.json file - Composer then handles the rest. Big props to the legendary Ben Chapman for his work getting this ready -- read all about it in our docs!

We’ve released a feature into Early Access to improve snapshot management for our customers. This feature allows you to see the snapshots attached to your environments and delete them. You can also set a default policy in the UI for how long you want to keep snapshots around. Once you enable the feature, the default limit is 90 days, so if you have a stopped environment with some super old snapshots in it, talk to support before you enable it. Hopefully this will make snapshot management significantly easier and more transparent!

Data Data Data

This week Ines and team worked on some deep diving projects dealing with how backups will be performed on new relational clusters going forward. This super forward focused work will eventually begin to reveal itself as we reveal new cluster types, enhancements, and a few new features.

Also, for the Postgres fans out there, Tom Lane’s excellent SF-PUG presentation on The Architecture of the PostgreSQL query planner is up!

Social Calendar (Come say hi!)

Tuesday June 25th, 6:30pm: San Francisco Office: Let’s meet up and talk about LevelDB and Node! Speakers this week include Dominic Tarr, Rod Vagg, Jake Verbaten, Paolo Fragomeni, and Mikeal Rogers.

Tuesday June 25th, 6:30pm: Buffalo Office: Girl Develop It! is teaching an Intro to HTML & CSS course, the third in a series of four.

Wednesday June 26th, 6:30pm: Portland Office: We will be hosting Coder Dojo for students K-12 to learn about software. Parents welcome to attend and participate!

NodeConf (June 27-29, Walker Creek Ranch, CA). We’re sponsoring - if you see Engine Yard t-shirts, come by and say hello! We’re excited to be sponsoring NodeConf for the first time, summer camp style.

Thursday June 27th, 6:30pm: Dublin, Ireland Office: Node.js Dublin will be investigating all things Node. Grab a ticket here!

EuRuKo (June 29-30, Athens, Greece). Stop by the Engine Yard booth! Grab a t-shirt and some swag, learn what’s new with Cloud and meet our awesome community manager, Kelsey Schimmelman.

Lonestar PHP (June 29-30, Dallas, TX). We’re excited to be returning to Lonestar PHP--if you see Davey Shafik walking around, say hi!

Articles of Interest

Google finally admits that those crazy brain teasers do not, in fact, indicate anything about how good a hire an engineer is going to be. But they’re definitely fun to dream up.

The post June 21, 2013: This Week at Engine Yard appeared first on Engine Yard Developer Blog.

How to Troubleshoot PostgreSQL Alerts

Fri, 06/21/2013 - 17:17

So you have your PostgreSQL application deployed on Engine Yard Cloud and everything is going great. You have enabled a few extensions, added basic redundancy by spinning up a database replica, and are busy developing new features. One day, though, you look at the dashboard and see this message:

What do these alerts mean? Is the database at risk? Should you escalate to support? This post will help you understand PostgreSQL dashboard alerts and correlate them to the health of your database and application.

Monitoring and the checkpoint check

The alerts I showed you popped up in one of our mission-critical applications. This post discusses the steps I took to troubleshoot the cause of the problem, and the resolution. But first, a little bit of background.

We monitor the health of your PostgreSQL database using a combination of our own custom checks and Bucardo’s check_postgres scripts. I’ll wave a big wand here and tell you that either Collectd or Nagios (depending on your stack and features) consumes the results of these checks and presents them on the Engine Yard dashboard.

The following documentation page provides an explanation of the alerts Engine Yard issues for PostgreSQL. Today I’ll focus only on the alert I received, but refer to the documentation if you see something different in your application’s dashboard.

Let’s take a closer look at the message:

POSTGRES_CHECKPOINT CRITICAL: Last checkpoint was 16204 seconds ago

I know from this message that the checkpoint check originated the alert and the severity of the alert is critical. In human talk, the message means that the database has not had a checkpoint for about 4.5 hours!  Here is another example:

POSTGRES_CHECKPOINT WARNING: Last checkpoint was 1265 seconds ago

This message means that the database has not had a checkpoint for about 21 minutes.

We issue a WARNING severity when checkpoint delays range from 20 to 30 minutes. For anything that exceeds 30 minutes, the severity of the alert goes to CRITICAL.
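The thresholds above, and the arithmetic behind the two example messages, can be sketched in a few lines of Ruby. The helper name checkpoint_severity and the 'OK' label are assumptions made for illustration; this is not the actual check_postgres or Engine Yard code, only the 20 and 30 minute limits come from the text.

```ruby
# Hypothetical helper: maps "seconds since last checkpoint" to a severity.
# Thresholds from the text: WARNING from 20 to 30 minutes, CRITICAL beyond 30.
def checkpoint_severity(seconds, warning_after: 20 * 60, critical_after: 30 * 60)
  if seconds > critical_after
    'CRITICAL'
  elsif seconds >= warning_after
    'WARNING'
  else
    'OK'
  end
end

critical_age = 16_204 # seconds, from the first alert
warning_age  = 1_265  # seconds, from the second alert

puts checkpoint_severity(critical_age)  # CRITICAL
puts checkpoint_severity(warning_age)   # WARNING

# Converting the raw seconds into the human-readable figures quoted above.
hours   = (critical_age / 3600.0).round(1)
minutes = (warning_age / 60.0).round
puts "#{hours} hours, #{minutes} minutes"
```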

A checkpoint is a point in the transaction log sequence at which all data files have been updated to reflect the information in the log and flushed to disk. If your system crashes, recovery will start from the last known checkpoint. So the checkpoint check helps us confirm two things: that your database consistently moves forward the position from which recovery would start, and, in the case of replicas, that your standby is keeping up with its master (since the activity the replica sees is what the master has sent it).

For more information about checkpoints and replication, please refer to the PostgreSQL replication and write-ahead log (WAL) documentation.

Back to my App

Now we understand that the alert I received means that there was a problem with the database replica and its ability to checkpoint. The database logs showed nothing out of order, so I logged into the server console and discovered the following:

# psql
psql (9.1.9, server 9.1.3)
Type "help" for help.

The psql banner showed me that there was a version mismatch between the installed database binaries (psql reports 9.1.9) and the running server process (still 9.1.3). This typically happens after a stack update that includes a minor version bump of your database is applied to a running environment and the database process is not restarted. The database server is left in a state where it’s effectively running two versions at the same time. To ensure that the postgresql process is running the latest version of the database, you MUST always restart the database process after upgrading your environment.
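If you want to spot this situation from a script rather than by eye, a rough sketch is to parse the first line psql prints. The helper name parse_psql_banner is hypothetical, and the pattern assumes the banner format shown above, where psql adds the "server" part only when the two versions differ.

```ruby
# Hypothetical helper: extracts client and server versions from the psql banner.
# Returns nil if the line does not look like a psql banner.
def parse_psql_banner(line)
  match = line.match(/\Apsql \((?<client>[\d.]+)(?:, server (?<server>[\d.]+))?\)/)
  return nil unless match

  client = match[:client]
  # Assumption: no "server" part means the server matches the client version.
  server = match[:server] || client
  { client: client, server: server, mismatch: client != server }
end

stale   = parse_psql_banner('psql (9.1.9, server 9.1.3)')
healthy = parse_psql_banner('psql (9.1.9)')

puts "restart needed: #{stale[:mismatch]}"
puts "restart needed: #{healthy[:mismatch]}"
```

A mismatch here only tells you that the running server is older than the installed client binaries; it does not by itself prove that replication is broken.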

The stack and version update was absolutely necessary as it included critical security patches (see April 4, 2013 - PostgreSQL security update). But when the stack was applied, the person who applied the update didn’t restart the postgresql process as outlined in the upgrade instructions. This is typically not a problem, as replication is known to work between patch-level versions, but we hit a replication bug in the 9.1.3 to 9.1.9 upgrade which caused replication to break.

Our Solution

So in a nutshell, our database replica became unable to receive WAL archives from its master, checkpoints started falling behind, and we were alerted. Restarting the database process would have solved the problem, but instead we decided to utilize the maintenance window to upgrade the server to PostgreSQL 9.2 and create a new replica.

I performed an in-place upgrade of the database master (something that professional services has a lot of experience with) and within minutes the application was back online running the latest version of PostgreSQL.

But troubleshooting this alert made me aware of the issues with our current upgrade process:

  • Documentation on alerts was lacking. There is no place to quickly look up the alerts we present in the UI and their meaning.
  • Our upgrade message did not remind us to restart the database process (though the release notes did).
  • An unexpected replication bug between patch versions caused my database replica to become stale.

Here is what we’ll do to make sure you don’t experience the problems I did last week.

New PostgreSQL alert documentation

PostgreSQL alerts will be explained in a new documentation page. We’ll work on documenting MySQL and Riak alerts as well.

Improved stack upgrade messages

We will enhance stack release notes with icons to visually indicate if a process restart is needed when a new version of a database is available.


Ability to lock your database version

Without a doubt, we want customers to keep their database stacks up to date with security releases and patches. But it would be fantastic to be able to lock your entire database version (to the patch level) and still receive stack updates.

We have developed (and are internally testing) a toggle to lock your database version. With this feature, I can schedule a maintenance window (to restart the database process when I’m ready) while continuing to receive stack updates. We are still working on documentation but if this feature is something that interests you, please open a support ticket and let me know. It should be in limited access soon!

Hopefully you now have a little more context and information available to interpret the alerts we display in your environment. Exciting things are happening in Engine Yard’s data stack (think new clusters!). A little hint for the curious: Tasha Drew’s excellent weekly recap of engineering always includes juicy details on what we are up to ;)

The post How to Troubleshoot PostgreSQL Alerts appeared first on Engine Yard Developer Blog.

Announcing Composer Support

Tue, 06/18/2013 - 21:55

We’re pleased to announce Composer support for PHP applications.

This has been one of our most requested features, and should make it even easier for you to manage your apps. If you’re already using Composer, you can dive right in. If not, now is a great time to try it out. We recommend Composer for all PHP apps!

What is Composer?

Composer is a popular dependency manager for PHP. With it, you can specify project dependencies in a composer.json file and Composer will automatically handle the rest. For more information about Composer, take a look at the project website.

Why is It Useful?

Composer allows you to manage third-party dependencies separate from your code, decluttering your repository. What’s more, it makes updating your dependencies a snap. Just run composer update and Composer will fetch the latest compatible versions.

How Can I Use It?

Using Composer with Engine Yard is very simple. We’ll detect the presence of a composer.lock file in your repository, and automatically install your app’s dependencies. To get started with Composer for Engine Yard, take a look at the documentation.

The post Announcing Composer Support appeared first on Engine Yard Developer Blog.

June 14, 2013: This Week at Engine Yard

Fri, 06/14/2013 - 17:17

This is the week a big chunk of the San Francisco development team went on a roadtrip to our Portland office to do some intense cross-office feature pollination. Things may have started out with some office rivalry, but developers quickly overcame any differences to work together to build, drink copious amounts of amazing coffee, and figure out the location of some of the awesome restaurants Portland has to offer. Pro-Tip: check out Blue Star donuts #amazing.

--Tasha Drew, Product Manager

Engineering Updates

Customer feedback is important to us and is an important part of how we prioritize work within our product management process. We received a few comments from customers who were frustrated because they couldn’t figure out why they were being charged money when they didn’t have any running instances. The answer was that they still had IP addresses that were detached from instances when the instances were terminated, but not deleted.

Customers can always see IP addresses and manage them in the dashboard by going to Tools -> IP addresses, but we decided to add more messaging to call this out to people.

Going forward, you will see a dashboard notice if you delete an instance and don’t delete the IP address - and you will also receive an email. We will also be sending out emails to any customer who has an account where the only items they’re being billed for are IP addresses and snapshots to let them know.

Hope this helps going forward! Big thanks to one of our newest platform developers, the amazing Daniela, for turning this request around so quickly.

We’re also wrapping up some cool new features around snapshot management which you should be reading about in this space next week!

Data Data Data

Our lead data engineer, Ines, has been busy working on the underlying code for exciting new features that we’ll be rolling out in the next few months. She also handed off a new feature that allows for database version locking to alleviate upgrade pains. The DBA team is actively testing and improving it and we should make it available soon. Watch out for her blogpost next week.

Ines and I were delighted to get to meet up with local Postgres ladies while we were in Portland. Selena Deckelmann has some great thoughts on the intersection of developers and Operations on her blog for those of you who need some fun weekend reading.  Kris Pennella gave me a valuable reminder to take a deep breath when facing stressful situations in her blog, “3 Tips Channeling a Negative into a Positive.”

We also had the pleasure of seeing Basho’s Eric Redmond (author of 7 Databases in 7 weeks and the Little Riak Book). We got a chance to hear some of the features that will come in Riak 1.4 and we are very excited!

Social Calendar (Come say hi!)

Friday June 14 - Saturday June 15: DevOps Days Amsterdam!: Meet the always charming Slava and the ridiculously knowledgeable Richard as they hang out and participate in this awesome DevOps conference where we are not only a PaaS -- we are also a cake.

Tuesday June 18, 19:00: Ruby Ireland Meetup at Engine Yard Dublin. We are Going off The Rails this month at Ruby Ireland as we go through some of the options for extending your web apps with mobile apps or through a Javascript framework. Kevin Fagan, Fergal Condron, Simon Rand, Gavin Joyce and Paul Watson will be speaking.

Thursday June 20 - Friday June 21st: Lyon, France, Ruby Lugdunum: Crowd favorite Engine Yard engineer PJ Hagerty will be presenting at Ruby Lugdunum in exotic Lyon, France, on how to grow and nurture your local Ruby group.

Thursday June 20, 18:30: Open Data Ireland #8 at Engine Yard Dublin. The general theme for ODI Meetup #8 is 'Open Government Partnership'. This meetup will be facilitated by Denis Parfenov, Tom Stewart and Nuala Haughey. We'll be hosting a brief presentation from an OGP representative. The rest of the evening will be dedicated to building topic-specific, multi-stakeholder/multi-disciplinary working groups with a view to taking an active part in co-drafting/crowdsourcing Ireland’s first national Action Plan around OGP principles.

Thursday June 20, 19:00: Engine Yard’s Buffalo Offices: Riak is bustin’ out all over in June, a meetup led by renowned Riakifier Dave Parfitt.

Articles of Interest

Drink coffee: avoid death! The New York Times tells us exactly what we’ve been hoping to hear.

Nobody Understands the GIL: Jesse Storimer explores MRI and analyzes functions for thread safety.

And for our distributed systems fans (that’s everyone, right?) a deep dive into non-blocking transactional atomicity by Peter Bailis.

Call me maybe: Kyle Kingsbury’s summary post on Jepsen looking at how various databases handle network partitions.

The post June 14, 2013: This Week at Engine Yard appeared first on Engine Yard Developer Blog.

You Cannot Win Engineering

Thu, 06/13/2013 - 20:51

For as long as I can remember, I’ve been a fan of Saturday Night Live and improvisational theater. Improv looks chaotic and uncontrolled, but the best practitioners operate under strict rules that govern interactions between players. Some of the most successful entertainers today, people like Stephen Colbert and Tina Fey, directly credit what they have learned in improv with making them better at what they do both on and off screen.

Unlike the workplace policies that you are probably used to, the rules of improv aren’t meant to constrain you, but to open you up to the ideas of others. Let’s take a look at some of the rules that Mr. Colbert and Ms. Fey live by and see how they can improve team collaboration.

Agree and Say “Yes”.

Here’s Tina, from her book Bossypants, talking about the rules of engagement:

The first rule of improvisation is AGREE. Always agree and SAY YES. When you’re improvising, this means you are required to agree with whatever your partner has created. So if we’re improvising and I say, “Freeze, I have a gun,” and you say, “That’s not a gun. It’s your finger. You’re pointing your finger at me,” our improvised scene has ground to a halt. But if I say, “Freeze, I have a gun!” and you say, “Yes! The gun I gave you for Christmas! You bastard!” then we have started a scene because we have AGREED that my finger is in fact a Christmas gun.

The same is true of engineering teams. When one of your teammates has an idea, your first response needs to be affirmative. Take any and all ideas from your teammates as positive contributions and you start from a place of being open-minded and welcoming. Nothing kills team morale faster than someone who says “No, that won’t work” in response to any idea that they didn’t come up with.

It’s Not Just “Yes”, it’s “Yes, and…”

Everyone loves games, and games are more fun when everyone plays nicely. Make positive contributions and you will foster a spirit of openness, collaboration and, dare I say, fun. Make it your habit to answer your teammates’ ideas with “Yes, and…” instead of “No, because”. Always offer your ideas; you are just as entitled to be silly and wrong as everyone else. Ideas seldom spring fully-formed from the head of Zeus, and the part you’re holding back out of fear might be the thing that makes it work. “Yes, and…” makes you part of the solution; “No, because” makes you part of the problem.

Your Team is the Most Important Person on Your Team

Stephen Colbert went back to his alma mater, Northwestern University, to give the commencement address in 2011. He may play a know-it-all blowhard on The Colbert Report, but that’s clearly not the case in real life. Here’s an excerpt from his speech:

…One of the things I was taught early on is that you are not the most important person in the scene. Everybody else is. And if they are the most important people in the scene, you will naturally pay attention to them and serve them. But the good news is you're in the scene too. So hopefully to them you're the most important person, and they will serve you. No one is leading, you're all following the follower, serving the servant.

You cannot win improv.

And life is an improvisation. You have no idea what's going to happen next and you are mostly just making things up as you go along.

And like improv, you cannot win your life.

The software corollary to this is: “You cannot win engineering”.

Think about the implications of this for a moment. If everyone on your team acts as if their teammates are more important than they are, you create an environment of support, giving, and progress that is mutually enriching and productive. You’ll know you have succeeded when no one on your team remembers where a great idea came from. More importantly, no one will care.

When one of your teammates asks you a question, don’t tell them to Google it (which is a bit of a jerk response in any case). Act as if their problems are more important than yours; serve the team by serving them. When you are stuck on a problem, they will treat you the same way.

None of these rules for improvisation will make you funnier or get you a slot on Weekend Update, but applying them to your co-workers will almost certainly make your team awesome. Everyone wins.

The post You Cannot Win Engineering appeared first on Engine Yard Developer Blog.

June 7, 2013: This Week at Engine Yard

Fri, 06/07/2013 - 23:41

Things are pretty busy right now as we ship a bunch of customer enhancements on Engine Yard Cloud and continue with our planned infrastructure abstractions and cluster model improvements. Exciting things to come! In the meantime, here’s what’s available as of this week.

--Tasha Drew, Product Manager

Engineering Updates

Now in GA: Application takeover preferences. Based on your application's customizations, you might not want to use the default application takeover behavior we've developed to automatically promote your application slaves when the app master goes away or becomes totally unresponsive for some reason.

Engine Yard Cloud now provides two automated options for replacing capacity in an application takeover situation. We also provide alternatives if you need to handle part or all of an app takeover yourself.

We now have Provisioned IOPs and EBS Optimized instances available for customers to use in Early Access! To enable them for your environment from your cloud dashboard, click the Tools menu -> Early Access, and then enable “EBS Optimized Instances” and “Provisioned IOPs.”

Keep in mind that they work best in tandem, and they will only be an option on instances booted after you enable the feature.

Data Data Data

Databases love I/O and provisioned IOPs and EBS optimized instances are very well suited for applications where the database can use more performance (think backups and snapshots too).  

You can enhance the performance of your application by having a volume with provisioned IOPs on the database master. If your application has already been deployed, you can add new replicas to the environment (that have this performance boost) and have them promoted to master.

As usual, don’t hesitate to ask us if PIOPs or EBS optimized instances can give your database a boost.

Social Calendar (Come say hi!)

Tuesday, June 11th: Our Buffalo office will be hosting the WNY Ruby Meetup Group. Mark Josef will be providing us with some code katas.

Wednesday, June 12th: Our PDX office will be hosting the weekly CoderDojo K-12 night, ably assisted by one of the San Francisco sprint teams, who will be on site for an off site (as it were).

Wednesday, June 12th: Girl Develop It will be doing a Code and Coffee night in our Buffalo office. The participants will be focusing on honing their skills and working in groups. Swing by for the whole thing or just for a part of it.

Friday, June 14th: DevOps Day Amsterdam will be happening! Be sure to meet our own Slava and Richard and let them tell you about how Engine Yard can make your lives easier.

Articles of Interest

Mozilla's John O’Duinn discusses how to use release engineering as a force-multiplier!

David Padilla explains why hash lookups are so fast in Ruby on the Engine Yard blog.

The post June 7, 2013: This Week at Engine Yard appeared first on Engine Yard Developer Blog.

Speaking at Conferences: How to write a talk and get it accepted

Fri, 06/07/2013 - 21:38

At php[tek] 2013, Engine Yard sponsored the Mentorship Summit, a special forum to discuss the value of mentoring to create more connections and advancement opportunities for developers. A common theme that came out of the summit was that speaking at conferences is a great way to further oneself both personally and professionally. During the discussion, inevitably someone said they've submitted numerous times but had never been accepted to speak, then someone else said they don't know how to write a good proposal, and many discussions were had about what it means to write a good talk proposal.

I've had this conversation many times over the past few years with those I mentor, but this year something was different: I was a member of the selection committee for Distill, Engine Yard’s first developer conference.

Being on the other side for once has changed how I think about what is important when submitting for a talk and I thought it might be helpful to share how my perspective has evolved.

Questions and Answers

A proposal typically consists of a title, a short abstract, and sometimes a larger body of text to give the reviewers more detail about what to expect in the talk.

The first two items are really important because in addition to helping your talk stand out to the reviewer, they are also how an attendee will choose your talk over others on the schedule.

It's important to understand that there is an implicit question being asked in the title of your talk. For example, the two talks I presented at php[tek] were:

  1. "PHP 5.5: The New Bits"

  2. "MySQL High Availability, Disaster Recovery and Load Balancing"

The first one has an implicit question along the lines of "What's new in PHP 5.5?" and the second, "How do I make my database more robust and more scalable? How do I make it easier to recover from failures?"

Your abstract will then further inform the attendee (and the reviewer) of that question, and more importantly, create an expectation, or even a promise of what the talk will cover.

Most negative feedback I hear about talks isn't that the speaker was terrible or didn't know what they were talking about; it’s simply that the talk wasn't what the attendee wanted. Communicating the intent of your talk clearly is essential. If an attendee is disappointed, it’s often either because they misinterpreted the intention of the talk, or because the speaker failed to properly set their expectations.

For example, if my MySQL talk focused mainly on using memcache (as a means to reduce load on the DB and therefore reducing the need to scale) it would have been a terrible talk — though a worthwhile topic, it's not addressing the question posed by my title and therefore not meeting the promise of the talk.

However, a talk is not always a matter of simply answering a question. Sometimes you are teaching people how to better ask their question. If the talk had been titled "High performance websites with MySQL", then memcache would certainly be answering some of the implicit question by teaching them how to ask a better question: "How do I make my website faster when using MySQL?"

Is Your Question Relevant?

One of the more important things to remember is to ask the right question. What do people care about? Look at schedules from previous years for the conference (if they exist) to get an idea of what experience level the conference targets, and observe the associated community to see what people are interested in — what are emerging trends and topics that people are fascinated by?

Are you the right person for the job?

A very important thing to realize about your proposal is that who you are matters. To be more specific: why are you qualified to speak on a specific subject, or why do your opinions about it matter?

When selecting talks for Distill, I ran into mostly speakers I was unfamiliar with. So I Googled them.

If you want to establish yourself as a possible speaker, you need to be part of the larger conversation on the subject on which you want to speak. This can take many forms:

  • Blog posts

  • Tweets with people about the subject (yes, that's right, your tweets can matter!)

  • Any other media (podcasts, books, magazine articles)

  • Code contributions

The last one takes the longest to verify and is a last resort — if the reviewer hasn't been grabbed by your title/abstract then they may or may not even bother.

If you are contributing code to projects but doing nothing else, then you should start blogging about what you're contributing.

Submit Early, Submit Often

This one seems like it will be quite controversial in communities other than the PHP community, which has made this the norm: submit many proposals.

When I submit to a conference, I submit no fewer than four proposals. This not only increases your chances (by the numbers); if the reviewer has absolutely no other information on a speaker, it also gives them a better idea of what you do and a little more about what experience you may have.

Speaking Experience

The ability to communicate as a speaker is also something to worry about when selecting talks. Having a history of speaking, be it at other conferences, user groups, or even via webcasts/podcasts, is a big plus. Be sure to share all the media from your talks: slides, video and audio recordings. Similarly, podcasts are another way to give great insight into your ability to communicate verbally.

However, even writing can give insight into how you communicate.

Hard skills vs Soft skills Talks

There are two types of talks: so-called “hard talks” that teach technical skills, and “soft talks” which teach personal skills (e.g. team working, project management, how to get hired... how to write talk proposals).

When setting up a conference schedule it's very important to get the balance of hard skills vs soft skills talks right. Too few hard skills talks, and it's hard to justify the expense to your employer.

But quite often, the soft skills talks are the ones that get the most people talking, and when you are an experienced developer are often what you need to advance your career.

What this means is that you are far more likely to get a hard skills talk accepted than a soft skills talk, simply because of the ratios. And because the few soft skills talks that are selected have to really stand out, well-established speakers are typically chosen.

The "Distillation" Process: How We Selected Talks for Distill

For Distill, our review process was what I called "Iron Chef Style". We rated each talk as follows:

  • Content (15 points)

  • Fit (10 points)

  • Speaker (5 points).

The content is the topic covered, the fit is how well the talk fits into the overarching theme of the conference, and the speaker is not who they were, but what made them the right person to talk on the subject.

Once they were all rated by each member of the committee, we tallied the numbers and sorted by the totals.

At this point, we asked if anybody had any specific talks they felt strongly about, and also used the standard deviation of each talk's scores to see which talks the reviewers disagreed on most, so we could start discussion on those.
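To make the tallying concrete, here is a minimal Ruby sketch of that kind of scoring. The talk titles, the scores, and the helper names (`Rating`, `std_dev`, `summarize`) are all made up for illustration; only the 15/10/5 point scale comes from the process described above, and this is not the tooling the committee actually used.

```ruby
# Hypothetical data: each reviewer rates content (max 15), fit (max 10)
# and speaker (max 5). Titles and numbers are invented for illustration.
Rating = Struct.new(:content, :fit, :speaker) do
  def total
    content + fit + speaker
  end
end

# Population standard deviation of a list of numbers.
def std_dev(values)
  avg = values.sum.to_f / values.size
  Math.sqrt(values.sum { |v| (v - avg)**2 } / values.size)
end

# One summary row per talk: combined score and spread between reviewers.
def summarize(ratings_by_talk)
  ratings_by_talk.map do |title, ratings|
    totals = ratings.map(&:total)
    { title: title, total: totals.sum, spread: std_dev(totals) }
  end
end

ratings_by_talk = {
  "Talk A" => [Rating.new(12, 8, 4), Rating.new(13, 7, 4), Rating.new(12, 8, 5)],
  "Talk B" => [Rating.new(15, 10, 5), Rating.new(5, 4, 2), Rating.new(10, 6, 3)]
}

summary = summarize(ratings_by_talk)
ranked = summary.sort_by { |row| -row[:total] }          # highest total first
most_contested = summary.max_by { |row| row[:spread] }   # biggest disagreement

ranked.each do |row|
  puts format("%-8s total=%d spread=%.2f", row[:title], row[:total], row[:spread])
end
puts "Discuss first: #{most_contested[:title]}"
```

In this made-up data, "Talk A" ranks first on total points, while "Talk B" has the widest spread between reviewers and so would be the first one to discuss.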

This gave us our top 32. We then looked at any speaker who had multiple talks in the top 32, and made a decision on which one of them we preferred and removed the duplicates (there were only 3).

Finally, as we further refined our theme, we narrowed our list down to the final 15 — who were selected as speakers for Distill.

I was surprised at just how lengthy and difficult a task this was. To anyone who has ever done this with my talks… I promise that my bribes will be much bigger in the future. You deserve it!

The post Speaking at Conferences: How to write a talk and get it accepted appeared first on Engine Yard Developer Blog.

Categories: Programming

Hash lookup in Ruby, why is it so fast?

Tue, 06/04/2013 - 18:00

Note: Our friend and CEO at Crowd Interactive, David Padilla, wrote this great piece about hash lookup in Ruby. Be sure to check out MagmaConf!

Have you ever noticed that in Ruby looking for a specific key-value pair in a Hash is very fast?

Allow me to explain the logic behind hashes in Ruby with a language that you probably understand: Ruby.

Let's imagine for a second that we want to emulate the functionality in hashes because, for some strange reason, they have not been implemented yet.

If we want to have some sort of key-value structure, we'll have to implement it ourselves, so let's get to work.

First, we'll need a Struct to represent a HashEntry, or the key-value objects that we will add to our hashes.

HashEntry = Struct.new(:key, :value)

Now, we'll need a class that represents our Hashes or (to avoid conflicts with the original Hash class) HashTable.

class HashTable
  attr_accessor :bins

  def initialize
    self.bins = []
  end
end

We add the `bins` attribute to the class. `bins` will be an array where we'll store our HashEntry elements.

Now, let's write a method to add a HashEntry to a HashTable. To follow convention, we'll use the traditional `<<` as method name.

class HashTable
  attr_accessor :bins

  def initialize
    self.bins = []
  end

  def <<(entry)
    self.bins << entry
  end
end

Great, now we can add HashEntry elements to our HashTable like so:

entry = HashEntry.new :foo, :bar
table = HashTable.new
table << entry

What if we want to look for an entry by key? Let's write the `[]` method on the HashTable class to handle that.

def [](key)
  self.bins.detect { |entry| entry.key == key }
end

What we're doing here is simply going element by element comparing the given key until we find what we're looking for. Efficient? Let's figure it out.

Benchmarking

We'll use Ruby's benchmarking tools to figure out how much time we're spending looking for elements on our hash tables.

require 'benchmark'

#
# HashTable instance
#
table = HashTable.new

#
# CREATE 1,000,000 entries and add them to the table
#

(1..1000000).each do |i|
  entry = HashEntry.new i.to_s, "bar#{i}"

  table << entry
end

#
# Look for an element at the beginning, middle and end of the HashTable.
# Benchmark it
#
%w(100000 500000 900000).each do |key|
  time = Benchmark.realtime do
    table[key]
  end

  puts "Finding #{key} took #{time * 1000} ms"
end

When we run this benchmark, we get the following results:

Finding 100000 took 33.641 ms
Finding 500000 took 192.678 ms
Finding 900000 took 345.329 ms

What we see here is that lookup times grow with the number of entries and depend on the entry's position within the `bins` array. This is obviously very inefficient and unacceptable for real life scenarios.

Now, let's see how Ruby tackles this problem internally.

Bins

Instead of using a single array to store all its entries, hashes in Ruby use an array of arrays or "bins".

First, it calculates an integer hash value for each entry's key. Equal keys always produce the same value, though two different keys can occasionally produce the same one. For this example we will use `Object#hash`. Then, Ruby divides this hash integer by the total number of bins and takes the remainder, or modulus. This modulus will be used as the bin index for that specific entry.

When you look up a key, you calculate its bin index again using the same algorithm and look for the corresponding entry directly in that bin.

Let's add an attribute on the HashTable class that will determine how many bins each HashTable will have, and we'll initialize it with 500.

class HashTable
  # ...

  attr_accessor :bin_count

  def initialize
    self.bin_count = 500
    self.bins = []
  end

  # ...
end

Now, let's write a method that calculates the bin for a specific entry depending on the number of bins.

class HashTable
  # ...

  def bin_for(key)
    key.hash % self.bin_count
  end

  # ...
end

When storing the HashEntry in the HashTable, we won't just store it on an array, we'll store it on an array that corresponds to the `bins` index depending on what the `bin_for` method returns:

class HashTable
  # ...

  def <<(entry)
    index = bin_for(entry.key)
    self.bins[index] ||= []
    self.bins[index] << entry
  end

  # ...
end

And last, whenever we want to retrieve a HashEntry, we'll recalculate the bin index again using the `bin_for` method and once we have that, we'll know exactly where to look for our entry.

def [](key)
  index = bin_for(key)
  # A bin that never received an entry is still nil, so fall back to [].
  (self.bins[index] || []).detect do |entry|
    entry.key == key
  end
end

When we run the same benchmark that we used earlier, we can see times improve dramatically:

Finding 100000 took 0.025 ms
Finding 500000 took 0.094 ms
Finding 900000 took 0.112 ms

Not only did times improve, but we got rid of the variance that we used to have depending on the position of the element in the bin pool.

There's still room for improvement here. Let's add more bins and see what happens.

class HashTable
  # ...

  def initialize
    self.bin_count = 300000
    self.bins = []
  end

  # ...
end

When we run the benchmark we get:

Finding 100000 took 0.014 ms
Finding 500000 took 0.016 ms
Finding 900000 took 0.005 ms

Even more improvement. This means that the more bins there are, the less time is spent looking for a specific key in a bin.
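A rough way to see why: a lookup only scans one bin, so its cost follows the number of entries sharing that bin. The sketch below takes the same 1,000,000 keys as the benchmark and counts how many land in each bin for both bin counts. `bin_sizes` is a helper name introduced here for illustration. `Object#hash` is seeded differently in every Ruby process, so the largest bin varies from run to run; only the averages are predictable.

```ruby
# Count how many keys land in each bin for a given number of bins.
# `bin_sizes` is a hypothetical helper, not part of the original example.
def bin_sizes(keys, bin_count)
  sizes = Array.new(bin_count, 0)
  keys.each { |key| sizes[key.hash % bin_count] += 1 }
  sizes
end

# Same keys as the benchmark: "1" through "1000000".
keys = (1..1_000_000).map(&:to_s)

[500, 300_000].each do |bin_count|
  sizes = bin_sizes(keys, bin_count)
  average = keys.size.to_f / bin_count
  puts "#{bin_count} bins: average #{average.round(1)} entries per bin, " \
       "largest bin holds #{sizes.max}"
end
```

With 500 bins each lookup scans about 2,000 entries on average; with 300,000 bins it scans only three or four.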

How many bins does Ruby actually use?

Ruby manages the number of bins dynamically. In MRI 1.9 and 2.0 it starts with 11 bins and, as soon as the average number of entries per bin exceeds 5, the number of bins is increased and all hash entries are reallocated to their new corresponding bin.

At some point you pay an exponentially increased time penalty while Ruby resizes the bin pool, but if you think about it, it's worth the time since this will keep lookup times and memory usage as low as possible.
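Here is a minimal sketch of how such a growing table could look in plain Ruby. It is not how MRI implements it. The class name `GrowingHashTable`, the trigger (average entries per bin above 5) and the "double plus one" growth rule are assumptions made for illustration.

```ruby
# A toy table that grows its bin array as entries are added.
# MAX_DENSITY, the starting size of 11 and the growth rule are
# illustrative assumptions, not MRI's actual resizing code.
Entry = Struct.new(:key, :value)

class GrowingHashTable
  MAX_DENSITY = 5

  attr_reader :bin_count, :size

  def initialize
    @bin_count = 11
    @bins = Array.new(@bin_count) { [] }
    @size = 0
  end

  def []=(key, value)
    entry = find_entry(key)
    if entry
      entry.value = value   # existing key: overwrite, size is unchanged
    else
      @bins[key.hash % @bin_count] << Entry.new(key, value)
      @size += 1
      # Grow once the average number of entries per bin passes MAX_DENSITY.
      rehash if @size > @bin_count * MAX_DENSITY
    end
  end

  def [](key)
    entry = find_entry(key)
    entry && entry.value
  end

  private

  def find_entry(key)
    @bins[key.hash % @bin_count].detect { |entry| entry.key == key }
  end

  # Allocate a bigger bin array and move every entry to its new bin.
  def rehash
    @bin_count = @bin_count * 2 + 1
    new_bins = Array.new(@bin_count) { [] }
    @bins.each do |bin|
      bin.each { |entry| new_bins[entry.key.hash % @bin_count] << entry }
    end
    @bins = new_bins
  end
end

table = GrowingHashTable.new
1_000.times { |i| table[i.to_s] = "bar#{i}" }

puts table["500"]     # prints bar500
puts table.bin_count  # prints 383 (11, 23, 47, 95, 191, 383)
```

Every `rehash` touches all entries, which is the resize penalty; in exchange, no lookup ever scans more than a handful of entries on average.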

Further reading

If you want to learn where this algorithm came from and a little more about Ruby internals, I really recommend that you read Pat Shaughnessy's Ruby Under a Microscope book. Pat explains how the Ruby VM works in a way that anyone can understand. No C knowledge required. I really enjoyed reading it.

You can find the working code I used for this post on this gist.

You could also read some of the Rubinius source code. Take a look at their implementation of the Hash class; you'll probably understand a little more of the logic they used after you've read this post and Pat's book.

Thanks for reading.

The post Hash lookup in Ruby, why is it so fast? appeared first on Engine Yard Developer Blog.

Categories: Programming

May 31, 2013: This Week at Engine Yard

Fri, 05/31/2013 - 23:47

Spent an awesome week at DevOps Day Berlin and visiting customers in Berlin with application engineer extraordinaire Kevin Holler and our fearless sales leader Bridget Gleason. So fantastic to see the exchange of best practices and ideas around DevOps and get a chance to check out the Berlin tech community!

We’re planning to be at DevOps Day Amsterdam in a few weeks too, so hope to see you there!

--Tasha Drew, Product Manager

Engineering Updates

The EngineYard gem 2.1.0 has been released with a bunch of enhanced capabilities, including new asset strategies. We received a lot of customer requests around these areas and hope that this will make you super happy. Read all about it on the change log and the readme on GitHub.

Ruby 2.0 is now in GA on Engine Yard Cloud! Read all about it.

We have released our new hardened Gentoo stack in GA (Gentoo 12.11), with all the latest and greatest packages and features that the new Gentoo has to offer. It only works in new environments, so if you are going to spin up a new environment, check it out!

We've updated the worker counts for Passenger/Unicorn/etc. and exposed some easy ways for customers to alter the worker counts, memory thresholds, and other cool stuff. Read all about how to do it here! This allows customers to avoid writing custom chef to alter some of this stuff. As always, if you need any help, open a support ticket!

Workers are the processes that allow your application to respond to incoming web requests. Regardless of the application server stack you select, it provides one or more workers per instance to run your application.  Learn all about how to tune your workers per server and use an excellent new calculator to figure out your ideal set up in our new documentation!

Social Calendar (Come say hi!)

Postgres Ireland: June 4, Engine Yard Dublin. Come and meet some Postgres backend developers based in Ireland, and discuss the recent landmark 9.2 release. We encourage participation from people of all levels of experience with Postgres - from people who aren't current users, and would like to know more, right up to experienced users with large installations. Greg Stark, a long-time Postgres contributor and committer, will speak.

Magma Rails: June 5-7, Manzanillo, Mexico. MagmaRails is now MagmaConf!, one of the most important web development conferences in Mexico; with Ruby, Ruby on Rails, Frontend and more web technologies sessions for developers and engineers that build cutting-edge web applications.

GoRuco: New York’s premier community-organized Ruby conference. Stop by the Engine Yard table and say hello!

Articles of Interest

Our inaugural developer’s conference, Distill, on Treasure Island in the San Francisco Bay, still has early bird tickets left. Hope to see you there!

Northwestern University’s engineering students are given dance lessons to help them have more balanced brains.

Fan favorite and lead AP support manager Chris Rigor shares his talking points from his presentation at Ruby Kaigi this week, in which he created a debian wheezy vagrant box with chef recipes to set up a rails app!

And this was my favorite slide of the week, where Meagan Fisher demystifies design process at BTConf.

The post May 31, 2013: This Week at Engine Yard appeared first on Engine Yard Developer Blog.

Categories: Programming

We Love Our Sponsors!

Fri, 05/31/2013 - 22:15

Over the years, Engine Yard has been a proud sponsor of hundreds of incredible open source conferences. When it came time to seek sponsors for Distill, our inaugural developer conference, we were thrilled and humbled by the amount of love and support we received from companies who also put community first. We are proud to announce our first round of sponsors for this year--without them, Distill would not be possible!

AppFirst will be making sure that all talks can be seen on demand after the conference by sponsoring our videos! Get the AppFirst DevOps dashboard and free offering for full stack insights across all of your Engine Yard environments.

GitHub, helping us all build software better, together, will be sponsoring the lightning talks portion of our schedule. These rapid, info-packed talks are always a conference highlight and are sure to be meaningful, memorable and entertaining.

Kony helps you take your business mobile. They’re also going to be mobilizing attendees by sponsoring the transportation from the conference hotel to Treasure Island and back again.

MongoLab takes the sweetest sponsorship--make sure to thank them for the delicious Smitten ice cream you’ll be enjoying at Distill! They provide a fully-managed MongoDB Database-as-a-Service platform automating the operational aspects of running MongoDB in the cloud.

New Relic will be sponsoring our kick-off party, Moonshine. New Relic makes awesome app-monitoring software. Check it out.

SendGrid will be keeping you alert throughout Distill (especially the morning following Moonshine) with excellent coffee from Ritual Coffee Roasters! SendGrid are the masters of simple, reliable email delivery.

Zendesk makes customer support painless. Think of them as you jot down words of wisdom in your custom Field Notes notebook!

If you haven’t bought your ticket to Distill yet, do it quick! The early bird discount ends on June 1. At that time, the price will go up from $400 to $500.

The post We Love Our Sponsors! appeared first on Engine Yard Developer Blog.

Categories: Programming

PaaS Pricing and Transparency – A Message from the CEO

Wed, 05/29/2013 - 19:12

At Engine Yard, we work hard to cultivate close relationships with you, our valued customers.  Every part of our organization, from sales and support to product development and engineering, is keenly focused on our clients' needs.  We believe that is the only way we can continue to deliver increasing value so you can spend your time and attention on important innovation.  So far this year, in direct response to your input, we’ve rolled out a variety of new features and capabilities that we hope make your jobs easier (see our This Week at Engine Yard blog post series and our 2013 announcements).  Most recently, we announced significant price reductions for Engine Yard Cloud that lower the cost to get started (just 5¢ an hour) and make it easier to scale up (with additional reductions on our largest instance sizes).

As we go forward, we’ll continue to invest our resources based on your feedback.  We look to you to give us advice, help us set priorities and share your best-practice experiences. For example, from your feedback we’ve seen how important transparency is.  Today our service is tightly coupled to the specific infrastructure instance size – and both PaaS and IaaS are combined in a single price.  While this does give you full visibility into exactly what kind of infrastructure your app is running (unlike other PaaS offerings), we’re considering going one step further and breaking out the pricing of the infrastructure from the price of our platform.  As part of this approach, we are looking at tiering our platform services to more closely match the varying requirements you have.  Our platform services for small, simple environments would cost less than our services for large complex environments that utilize the most advanced features.  We’d love your feedback on these approaches – and would greatly appreciate your comments and thoughts.  Please post them below!

The post PaaS Pricing and Transparency – A Message from the CEO appeared first on Engine Yard Developer Blog.

Categories: Programming

RailsInstaller: An Unexpected Journey

Tue, 05/28/2013 - 21:39

In the course of every man’s life, there comes a time when he must rise up and own the moment. He must buckle down and become the leader he was meant to be. He must bear a great burden and push through to become a stronger person. This is not that journey…or is it?

The past defines us

A long time ago (2008) I was a lowly college student. I had started a course that would change the way I look at the programming world. This course introduced me to Ruby on Rails, though on a Windows machine it was anything but pleasant.

At that time, setting up an environment for building Rails applications was not a simple task. There were many different components that may or may not be easy to find and installing them was not trivial. While I was able to get things figured out, I dreamt of the day when somebody would make something to ease my installation woes.

Enter the wizard

It is now 2011 and Wayne E. Seguin has been conjuring up something that would shock the Rails community. This “magic” would be the saving grace of so many a developer on Windows. One package that would finally put an end to the madness and make installation easy. And like that, RailsInstaller was born.

Along with the installer, a website was created as a place to download RailsInstaller and get more information about Ruby on Rails development. This also included a tutorial screencast, done by Wayne, that deployed a simple application to Engine Yard Cloud.

My journey begins

In June 2011, I was hired by Engine Yard to be what is known as a PANDA. Among my other many duties, I pretty much took on the role of helping Windows users who were trying to deploy applications on our platform. It was at this time that I really became familiar with RailsInstaller as well as the people using it. I helped a few users and also worked on a couple of the updated releases but did not really get too much into it until about a year later.

Into darkness

After working as a PANDA for about six months, I transitioned into the role of Application Support Engineer. Although I didn’t typically work with Windows users since most of our customers are using Mac or Linux, I did keep RailsInstaller in the back of my mind. I released an updated version of the video for Windows and also did the one for the brand new OSX version of the installer, built by Michal Papis.

One of the other things I did was submit a talk proposal to Aloha Ruby Conf, titled “Rails Development on Windows. Seriously” which focused on using RailsInstaller to get started. The talk was accepted and at this point, I knew that I’d need to spend much more time living in the Windows world as a Rails developer.

I spent the next six months before the conference trying to immerse myself in the Windows world. At this time, I was also following the RailsInstaller Google Group and the questions/issues/discussions happening there. I came to the realization that while RailsInstaller will get a development environment set up quickly, there are many issues that can crop up afterwards. I also saw the same questions popping up repeatedly, even though they had just been answered in a previous thread.

All of this time spent working in an operating system that seemed counter-productive, working my daily support duties, and answering questions on the Google Group had me pretty burnt out. After the conference was over, I disappeared from the RailsInstaller world.

An interesting proposition

At the beginning of 2013, all of the Support Engineers went over employee development plans with their respective managers. One of the things in the plan is a week-long sabbatical to work on something outside of support tickets. I talked with my manager for a while and he brought up the idea of doing something with RailsInstaller. At this time, there wasn’t really anybody guiding the project and there hadn’t been an update to the Windows version in months. We picked a week in April for me to take over as the project maintainer and re-focus the vision.

The return

During the week of April 15th, I stepped into my new role. I focused on learning how both versions worked and the different parts that are needed to build each one. One of the other things I wanted to accomplish was to change the focus of the project.

Originally, RailsInstaller had been pegged as a way for new developers to get started with building Ruby on Rails applications. While it is good to bring in new people and teach them development skills, there are also many things to learn outside of just Ruby on Rails. I really wanted it to be more focused on getting from a fresh, new computer to a full-blown development environment as fast as possible.

So today, we’re re-launching RailsInstaller and we’re focused on building applications and making money instead of messing with setting up everything manually. Please check out the updated site at railsinstaller.org as well as the organization on GitHub. If you need help with the installer or want to contribute, feel free to reach out in the Google Group, IRC, or hit me up on Twitter.

I really appreciate everyone who has contributed to the project as well as everyone who has used the installer. Looking forward to what comes next and the evolution of RailsInstaller. Thanks!

The post RailsInstaller: An Unexpected Journey appeared first on Engine Yard Developer Blog.

Categories: Programming

May 24, 2013: This Week at Engine Yard

Fri, 05/24/2013 - 21:58

Just finished speaking at the delightful Cloud East Conference in Cambridge, England! Glad to say that my velociraptor slide went over well. Bridget and I have been visiting customers in London and Brighton, and are going to Berlin on Monday for DevOps Day! If you’re there, be sure to say hi! The intrepid Kevin Holler and inimitable Slava will also be in attendance.

--Tasha Drew, Product Manager

Engineering Updates

We have a couple fun and useful items in our early access phase! If you go to the dashboard, and from the menu select Tools >> Early Access, you will now find Application Takeover Preference and Ruby 2.0.

Application takeover preference allows you to easily set how you want our automated application takeovers to occur, in case our default behavior isn’t working for you.  Once enabled, you can go to “Edit Environment,” and you will see a new dropdown:

So if you’ve been having any trouble with your specific customizations not booting properly when we are automatically handling application takeover, you can select to instead boot from a new volume; have a takeover occur, but without booting a new application slave to replace the one being promoted; or disable the feature entirely.

As with all our early access features, we would be delighted if you shared your thoughts on these as you use them. Need something additional? Anything not working correctly? Documentation confusing? Love it and want to let us know? Tell us any and everything at our Early Access Forum!

And, also in early access, we have big news for our Rubyists: Engine Yard now has Ruby 2.0! We’ve had Rails 4 for a while now if you want to try them together. The full featured Ruby 2.0 integration in the UI, etc., is still getting some work done, so for now the installation instructions have some unique steps (i.e. not through the dashboard). See our docs for all the details!

Data Data Data

As we continue to enhance our new cluster model, we are in the planning phase for upcoming new data stacks! One of our top priorities is our MySQL database stack and how to make it more awesome as it takes advantage of the new data model. We are in the early stages of re-productizing it, so if you have any wish lists of things you’d like to see from our MySQL offering, please let us know in the feature request forums!

Social Calendar (Come say hi!)

Monday May 27th - Tuesday May 28th: Engine Yard will be participating as a sponsor of DevOps Day Berlin! Kevin Holler will be delivering a 5 minute talk about Engine Yard Cloud, and Tasha Drew and Bridget Gleason will cheer him on from the audience.

Wednesday May 29th: Our Portland office will be hosting the weekly Coder Dojo PDX!

Thursday May 30th: Engine Yard's Dublin office is host to another edition of Node.js Dublin, featuring Dominykas Blyžė, Daniel McKay, and Isaac Schlueter.  Pizza and beer will be served!

Saturday June 1st: Engine Yard Dublin will host an IXDA Prototyping workshop. In this workshop you will learn how to create wireframes and interactive prototypes using the popular tool Axure RP.

Articles of Interest 

CERN launched a public appeal to find the world’s oldest webpage, and this is what they found.

What would the Game of Thrones characters look like if the show was staged in the 1990s? Now we know.

The post May 24, 2013: This Week at Engine Yard appeared first on Engine Yard Developer Blog.

Categories: Programming

Announcing: Moonshine, the Distill Kick-off Party, Sponsored by New Relic

Thu, 05/23/2013 - 21:38

By now you’ve heard that Engine Yard is proudly presenting our inaugural developer conference, Distill, on August 8-9. In addition to a lovely Treasure Island venue and a stellar lineup of speakers, we will also be throwing a kickoff party the likes of which you’ve never seen.

In keeping with the theme of distillation, which during the daytime events means learning, best practices and collaboration, we’ve dubbed the sure-to-be-awesome kick-off soiree “Moonshine” as a nod to the tasty beverages we’ll be providing for you all. World-class DJs will kick out the jams as you enjoy our distillation-themed smorgasbord of hors d'oeuvres and the company of your fellow conference-goers. Throughout the evening, we’ll also be doing a few surprise giveaways (and if you know Engine Yard, you know that we don’t disappoint!)

Moonshine will take place at the Old Mint in San Francisco. We want to thank our partner New Relic for sponsoring. If you haven’t purchased your ticket for Distill yet, hurry and grab your First Batch (early bird discounted) ticket! Tickets will be increasing in cost from $400 to $500 on June 1.

Categories: Programming

A Conversation About Testing in PHP

Wed, 05/22/2013 - 21:45

We are proud to sponsor Chris Hartjes and Ed Finkler's Development Hell podcast series where they record their freewheeling, uncensored discussions on programming the web, so future generations can learn from their failures.

Read on to get the low down on different testing tools and their relative merits--check it out as Ed and Chris weep for the future, come to some interesting conclusions and get their hands dirty so you don't have to.

To hear more from Chris and Ed tune in to their podcast, /dev/hell

Ed and Chris had a little chat about testing in PHP.

Chris: Okay, so today's topic is PHP testing

Ed: Word up

Chris: Now, Ed, I know that for the most part you are not a big fan of the mainstream PHP testing tools

Ed: Yes, that's true

Chris: So what is it that you don't like about them?

Ed: I guess realistically my complaints are aimed at PHPUnit. It's very powerful and very complete from what I can tell, but I think it's difficult to pick up and I think that difficulty makes people less likely to use it. Because it's by far the best known testing tool, I think that tends to limit the use of unit testing, period, in PHP. That's not necessarily PHPUnit's fault per se. I just think it's the situation we're in. I think the documentation, the setup, and just obtaining PHPUnit is a challenge, particularly when compared to unit testing options I've seen in other languages. Python, for example, has a simple but effective unit testing library built into the core.

Chris: So, when you say "difficult to pick up", is it because tests look like this?

<?php 
class Labels
{
    public $db;

    /**
     * @param GrumpyDb $db
     */
    public function __construct($db)
    {
        $this->db = $db;
    }

    /**
     * Turns label values like codingStandardsSuck into
     * CODING_STANDARDS_SUCK
     */
    public function screamingSnakeLabels()
    {
        $results = $this->db->query("SELECT name FROM labels");
        $labels = array();
        foreach ($results as $result) {
            $labels[] = $this->_camelToScreamingSnake($result);
        }
        return $labels;
    }

    /**
     * Method that takes a camelCase string into SCREAMING_SNAKE_CASE
     *
     * @param string $value
     */
    protected function _camelToScreamingSnake($value)
    {
        $result = preg_replace_callback(
            '/[A-Z]/',
            function ($match) {
                return "_" . strtolower($match[0]);
            },
            $value
        );
        return strtoupper($result);
    }
}

class DevhellTest extends PHPUnit_Framework_TestCase
{
    public function testShowEdHow()
    {
        $db = $this->getMockBuilder('GrumpyDb')
            ->disableOriginalConstructor()
            ->setMethods(array('query'))
            ->getMock();
        $db->expects($this->once())
            ->method('query')
            ->will($this->returnValue(array('devHell', 'camelCase')));
        $label = new Labels($db);
        $expectedResults = array('DEV_HELL', 'CAMEL_CASE');
        $testResults = $label->screamingSnakeLabels();
        $this->assertEquals(
            $expectedResults,
            $testResults,
            "Labels were not correctly converted to screaming snake case"
        );
    }
}

Chris: Maybe it's because I've worked with it a lot, but all I see is some boilerplate and then a few statements that seem pretty intuitive to me.

Ed: I think boilerplate is part of the issue. I think that's intimidating. Tools can mitigate that to some extent, but I don't think it eliminates the problem entirely. I just don't think writing a simple test should be anything more than a couple lines of code. Then you can build upon that iteratively as you need. I think that approach of starting simply and building up your set of tests really helps you understand what's going on, and I think it makes testing a lot more accessible to people who haven't done it before. A lot of testing framework docs I see throw a ton of nomenclature out at the reader. I think if you don't already understand that nomenclature, you won't understand what's up.
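To make Ed's "couple lines of code" point concrete, here is a minimal sketch of a framework-free test in plain PHP. It is an illustration only, not a tool either speaker mentions: camelToScreamingSnake() is a standalone copy of the conversion logic from the Labels class above, and expectEquals() is a hypothetical helper invented for this example.

```php
<?php
// Standalone copy of the conversion logic from Labels::_camelToScreamingSnake().
function camelToScreamingSnake($value)
{
    // Prefix every capital letter with an underscore, then upper-case everything.
    $result = preg_replace_callback(
        '/[A-Z]/',
        function ($match) {
            return '_' . strtolower($match[0]);
        },
        $value
    );
    return strtoupper($result);
}

// Hypothetical helper (not part of PHPUnit): throws if the two values differ.
function expectEquals($expected, $actual)
{
    if ($expected !== $actual) {
        throw new Exception("Expected '$expected' but got '$actual'");
    }
}

// The tests themselves are one line each.
expectEquals('DEV_HELL', camelToScreamingSnake('devHell'));
expectEquals('CAMEL_CASE', camelToScreamingSnake('camelCase'));
echo "All tests passed\n";
```

Saved to a file and run with the php command-line binary, this either prints the success message or stops at the first failing expectation. It gives up everything PHPUnit provides (reporting, isolation, mocks), which is the trade-off the two are debating.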

Chris: So when you say 'nomenclature', you're talking about things like what exactly? Assertions and mocks?

Ed: Knowing how to mock that stuff up is pretty complex. In my experience the majority of people who work with PHP don't have a lot of formal training and even if they do, it often doesn't cover testing concepts. Like, what's a "unit?" What's an assertion? What's a mock or a stub?

Chris: I weep for the future, Ed. A unit is a small amount of code that you're trying to test. In PHP, that's usually one object. An assertion is simply a statement that "I am saying that the following is true", whatever that assertion happens to be. I do agree that there is lots of confusion about what a mock or a stub is, so in my book I devote a chapter to explaining those things.
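One common way to draw the line between a stub and a mock can be sketched without any framework. Everything below is hypothetical and hand-rolled for illustration: the LabelSource interface, both test doubles, and countLabels() are invented names, and PHPUnit's own mock builder works differently under the hood.

```php
<?php
// Hypothetical interface standing in for the database dependency.
interface LabelSource
{
    public function query($sql);
}

// A stub only supplies canned answers; it does not care how it is used.
class StubLabelSource implements LabelSource
{
    public function query($sql)
    {
        return array('devHell', 'camelCase');
    }
}

// A mock also records how it was called, so a test can verify the interaction.
class MockLabelSource implements LabelSource
{
    public $calls = array();

    public function query($sql)
    {
        $this->calls[] = $sql;
        return array('devHell');
    }
}

// Code under test: counts the labels returned by the source.
function countLabels(LabelSource $source)
{
    return count($source->query('SELECT name FROM labels'));
}

// With a stub, the assertion is about the result only.
$stubResult = countLabels(new StubLabelSource());
if ($stubResult !== 2) {
    throw new Exception('Stub check failed');
}

// With a mock, the assertion is about the interaction itself.
$mock = new MockLabelSource();
countLabels($mock);
if ($mock->calls !== array('SELECT name FROM labels')) {
    throw new Exception('Mock check failed: query() was not called as expected');
}
echo "Stub and mock checks passed\n";
```

In the PHPUnit example earlier, the `expects($this->once())` call is what makes the test double a mock in this sense; dropping that expectation and keeping only the canned return value would leave a stub.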

Ed: So I know what that stuff is (although I get confused about the diff between a mock and stub). But the real problem is that in order to write tests, you have to already know how to program, and that in itself is super-intimidating for people. PHP has a very shallow learning curve: the time between learning and becoming productive in some way is very short. That's certainly one of the reasons PHP is so popular. We need, I think, to mirror that in how we present testing, and make it easy to get into. It shouldn't be something that is terribly complex to set up and do.

Chris: In that light, I understand the motivation to develop your own testing tools, but I still think PHPUnit is the way to go. So many people use it and there are so many resources available to learn it, that picking it up isn't as difficult as I think you're making it out to be. Alternately, I think the Behavior-Driven Development (BDD) model that Behat offers is appealing, and easier to pick up than the xUnit style. Behat combined with Mink is a solid alternative to PHPUnit.

Ed: If you are doing acceptance testing (meaning that you only care that the application as a whole is working) I don't think you can go wrong with being able to write tests that look like this:

Feature:
    Scenario: Main page loads
    Given I am on "/index.php"
    Then I should see "Lies I Told My Kids"

    Scenario: Empty form fields trigger errors
    Given I am on "/index.php"
    When I press "submitButton"
    Then I should see "You submitted an invalid e-mail address"

    Scenario: Missing description triggers errors
    Given I am on "/index.php"
    When I fill in "email" with "test@domain.com"
    And I press "submitButton"
    Then I should see "You submitted a blank description"

Chris: The Behat and Mink combo can let you create some very interesting acceptance tests, and it even provides you with tools that will tell you when you will have to write your own helpers to supplement what they can provide you. It took me a few days to figure out Behat's own way of doing things, but once I did I was able to create some very interesting tests, even ones where JavaScript (long the bane of automated acceptance testing) was being used.

If your mind doesn't align well with unit testing, then something like Behat is definitely the way to go. There's something neat about watching PHP run Behat which in turn opens up a browser and starts acting like a user and hopefully using your application correctly.

Ed: Ultimately, though, a lot of the problem with testing in PHP is that PHP's insane flexibility makes it super easy to write code that you cannot test. That and PHP is almost always working in concert with other systems, like a web server, so it can be tough to know what you can easily test inside the CLI and where you'd need to use a different approach.

To write testable code, you really have to be thinking about testing when you write your code. It takes a bit of time to get used to that, but I think it's very doable. In much the same way, it's taken us a long time to make security a first-order concern in PHP development, but I think we've done a decent job of that. We need to do that for testing as well.
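A small sketch of what "thinking about testing when you write your code" can mean in practice, assuming the web-server dependency Ed describes. Both function names are hypothetical. The first version reads the `$_SERVER` superglobal directly, so its result depends on the environment it runs in; the second takes the same data as a parameter, so a CLI test can pass in any array.

```php
<?php
// Hard to test from the CLI: reads the web server's superglobal directly.
function isSecureRequestHardToTest()
{
    return isset($_SERVER['HTTPS']) && $_SERVER['HTTPS'] === 'on';
}

// Easier to test: the server data is a parameter, so a test controls it.
function isSecureRequest(array $server)
{
    return isset($server['HTTPS']) && $server['HTTPS'] === 'on';
}

// In a web request you would call isSecureRequest($_SERVER).
// In a test, plain arrays stand in for the web server.
if (isSecureRequest(array('HTTPS' => 'on')) !== true) {
    throw new Exception('Expected a secure request');
}
if (isSecureRequest(array()) !== false) {
    throw new Exception('Expected an insecure request');
}
echo "Request checks passed\n";
```

The same move, passing a dependency in rather than reaching out for it, is what lets the Labels class earlier accept a mocked database through its constructor.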

Chris: If only, Ed. If only.

Categories: Programming

May 17, 2013: This Week at Engine Yard

Fri, 05/17/2013 - 19:36

I spent this week with the team of engineers who made Riak on Engine Yard Cloud possible, attending RICON East: all Distributed Systems, all the time. Later in the week we took advantage of being in New York City to visit local customers and discuss the various features we’re working on and field any technical, product, and data questions.

Both our engineering and product teams love incorporating customer feedback into our direction. Speaking of which -- if you’re in San Francisco, I’m organizing customer UX feedback sessions! Hit me up :)

--Tasha Drew, Product Manager

Engineering Updates

PHP is now GA on Engine Yard Cloud! Per Product Manager Noah Slater: “PHP has been an important part of Engine Yard’s growing family since the acquisition of Orchestra in 2011. And now, PHP on Engine Yard Cloud represents the culmination of our efforts to deliver the industry’s best Platform as a Service for PHP developers. The result of this work is a unified service offering for PHP, Node.js, and Ruby applications.” Read all about the GA launch announced by Davey Shafik at php[tek] in Chicago this week!

Data Data Data

Riak and Clusters are live! See our blog post for more info - https://blog.engineyard.com/2013/riak-is-ga-engine-yard

A cluster is a new way to organize and manage instances that share a specific function. Clusters take much of the functionality that was once placed at the environment level and move it down to the cluster level. One environment can have many clusters, and each cluster can run different cookbooks and be in different regions.

We drove the cluster model hand in hand with our productization of Riak on Cloud because the distributed model of Riak paired perfectly with where we wanted to drive the future of our platform. We can now take this underlying work and begin to re-productize other offerings to take advantage of its flexibility in many ways.

Social Calendar (Come say hi!)

Tuesday May 20th: Engine Yard Dublin hosts the PHP meetup where Eugene Kenny of Adverts.ie discusses his "Developer Toolbox", and then Matthew Weier O'Phinney of Zend Framework & Nate Abele of Lithium go head to head on the subject of Frameworks.

Wednesday May 21st: Engine Yard’s San Francisco HQ will be hosting the monthly Riak meetup! Lead data engineer and fan favorite Ines Sombra will be presenting about Riak on Engine Yard Cloud, followed by Basho’s Mark Phillips discussing Riak CS.

Wednesday May 21st: Our PDX office will be hosting Coder Dojo for students K-12 to learn about software! Grab a ticket and bring your parents for some software fun.

Thursday May 22nd: Engine Yard Dublin plays host to Open Data Ireland, “Give us our health data!”

Friday May 23rd: In which I talk about myself in the 3rd person? Tasha Drew will be speaking at Cloud East in Cambridge, UK, about deployments in the cloud, including various strategies we at Engine Yard see for environments of different sizes -- and concluding with sharing our own deployment strategy.

Articles of Interest 

Lightweight screenshot and annotation tool http://glui.me/ has gained some fans in our office!

Engine Yard friend Daragh Curran, Head of Product Engineering at Intercom, shared an awesome blog post here. “Shipping brings life to your team, to your product, and to your customers. Shipping is your company’s heartbeat.”

Categories: Programming

Shipping is your company’s heartbeat

Thu, 05/16/2013 - 12:38

Note: Engine Yard friend Daragh Curran, Head of Product Engineering at Intercom, has graciously let us post this great piece about code deployment on our blog. Check it out on their own blog here.

Software only becomes valuable when you ship it to customers. Before then it's just a costly accumulation of hard work and assumptions.

Shipping unlocks a feedback loop that confirms or challenges those assumptions. It makes new things possible for your customers, and gives you the opportunity to focus on the next thing.

Shipping brings life to your team, to your product, and to your customers. Shipping is your company's heartbeat.

Shipping will try to kill you

The scramble to get that one last feature done, the late nights, the compromises, the sinking feeling when we realise something major is broken, the post-mortems… It's agony, but if it was easy everyone would do it. Shipping exposes mistakes. We're nervous about it, and our natural reaction is to do it reluctantly and infrequently, which actually carries higher risk, causing more reluctance in the future.

The cost of shipping is approaching zero

Not too long ago, shipping software involved actual ships, disks, and printed manuals. It happened perhaps once a year. Bug fixes weren't automatic over the internet like today. Everything was slower and more controlled. The cost of shipping was massive, the consequence of a mistake was large. Today, the cost of shipping has approached zero. Most people can deploy in seconds or minutes with a single command or button click. With a little thought you can do that without your customers noticing, and with automated monitoring you'll find out immediately if something goes wrong.

Despite the cost of shipping approaching zero, many people still ship software guided by very old habits.

Shipping cadence defines your company

The cadence at which you ship defines your company. A yearly cadence results in a very structured approach to the design->build->test cycle. A few months are spent building, while the rest is spent fixing. Engineers can join and leave before seeing their hard work end up in the hands of customers. The approach to design becomes one of anticipating all possible needs, rather than focusing and iterating on the important ones.

Obstacles downstream propagate upstream

An obstacle downstream propagates upstream. If you're not allowed to implement new ideas, you stop having them.
- Paul Graham

The right approach to shipping has a positive influence on your company's productivity and your team's happiness & job satisfaction. Shipping infrequently is an obstacle. Ship slow, and you'll introduce challenges that push you to ship even slower. Ship frequently, and see positive effects everywhere in your company. For example, let's examine how behaviour changes along with shipping frequency, while handling a simple request from a customer.

Time to production behavior

Let's say a customer gets in touch to say "No matter what I do, I cannot save my name correctly, I think it doesn't like hyphens". In a company where you ship continuously, you see this and think: simple, I'll tweak a test and a regex pattern, get a quick code review from my buddy beside me, merge to mainline, and 1 minute later when it's deployed to production, reply to the customer: "Sorry about this, it's fixed now, thanks for letting us know". They'll reply: "Wow, thanks for fixing so quickly". High fives all around!
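For a sense of how small such a change can be, here is a hypothetical sketch of the "tweak a test and a regex pattern" fix in PHP. The validator names and patterns are invented for illustration and deliberately simplistic (ASCII letters only); the article does not show the real code.

```php
<?php
// Hypothetical validator before the fix: letters and spaces only, so hyphens fail.
function isValidNameOld($name)
{
    return preg_match('/^[A-Za-z ]+$/', $name) === 1;
}

// After the fix: the character class also allows a literal hyphen.
function isValidName($name)
{
    return preg_match('/^[A-Za-z -]+$/', $name) === 1;
}

// The tweaked test: a hyphenated name must now be accepted.
if (isValidNameOld('Anne-Marie') !== false) {
    throw new Exception('The old pattern was expected to reject hyphens');
}
if (isValidName('Anne-Marie') !== true) {
    throw new Exception('The new pattern should accept hyphens');
}
echo "Name validation checks passed\n";
```

The change is a single character in the pattern plus one new test case, which is why the time it takes to reach production, rather than the time it takes to write, ends up dominating the customer's experience.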

If we stretch the time to production (TTP) out a little, even to 10 minutes, the behaviour changes. You either do the same, but reply saying it'll be fixed with our next deploy (probably 10 minutes) - or you wait, so that you can communicate with certainty. The waiting is time where you'll shift focus to something else, but have the baggage of having to follow up. Perhaps you'll think, I'll have a quick coffee, then move on to something else afterwards. Even though your deployments are entirely automated, you lose time because of waiting and losing focus.

Customer support shipping

If TTP is hours, the behaviour changes again. No longer can you say with certainty when the change will be out there, so you're tempted to batch it up with other similar small changes. You postpone replying until you get time to do it, sometimes forgetting about it. You're less likely to take prompt action, wowing the customer, and you pay some mental cost for having it on a todo list. Since getting to production takes hours now, your team will start restricting itself to morning-only deploys, so miss that slot and it's further delays.

If TTP is days, it exacerbates that further - perhaps you'll reply "Thanks for letting us know. We'll fix this in our next sprint". It gets bundled in with a whole load of other small, low-priority items, and you spend more time debating estimates and priorities than the first guy took to fix it and reply to the customer. Miss the beginning-of-week deploy window and there's further slippage. The larger releases bring higher risk: you'll tell your customer it's fixed, only to later require rolling back because of a separate change. Your bug database gets bigger and bigger, with little details that you'll probably never fix.

When TTP is weeks, it exaggerates that even further - perhaps you'll reply "Sorry about this, I'll let the development team know" or something equally lame from your customer’s standpoint. Deep down you realise nothing will be fixed, and the job of talking to customers becomes a cost or hassle, rather than an opportunity to improve your product and nurture happy loyal customers.

Shipping continuously

Better approaches to writing or testing software help us iterate more quickly and confidently, but the benefits are quite local to engineering teams. Continuous shipping, on the other hand, touches all parts of your company, as do the benefits and the behaviours it enables and encourages.

LinkedIn's transition to continuous deployment is linked to their recent financial success.

Good products are a side effect of combining good people with an idea in an environment that helps those people to kick ass. Your attitude to shipping is a big part of the environment you create.

Shipping breathes life into how we think. The feedback loop helps us learn, gain confidence in making quick decisions, and build momentum. Momentum in product improvements excites and engages our customers. Quickly seeing the benefits of our hard work motivates us to do more. Building a team where people can work hard and move fast attracts others to join you - hiring gets easier.

Shipping continuously isn't an achievement you unlock and then move on. You've got to constantly obsess about it. If you believe in the benefits it brings, you'll be driven to shrink 20 minutes down to 1 minute or less, you'll consider 'ability to ship' as an equal to 'does it scale' when building new systems. And you'll do that because of all the life it breathes into your company and your product.

Shipping is your company's heartbeat.

Categories: Programming

Riak is GA on Engine Yard Cloud

Tue, 05/14/2013 - 20:56

Hello from NYC! We stopped by RICON East to share great news. We are thrilled to announce the General Availability of Riak on Engine Yard Cloud.

Riak is our first highly available, non-relational database and the first component of our stack to use a new cluster provisioning model. Riak exemplifies the future of Engine Yard and you should totally check it out! Here’s why.

Highlights of Using Riak on Engine Yard Cloud

Riak’s use case primarily fits applications with loosely structured data where even seconds of downtime are unacceptable. Riak has a key/value data model and is completely data agnostic, meaning you can store anything you want in a value (media, JSON, XML, text, etc.).

Riak is masterless. You can send writes to any node in the cluster and data will be appropriately stored, even in the case of individual node failures. Riak also supports tunable consistency, allowing you to make the datastore more strict on certain types of data and more responsive on others.
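As a rough illustration of what "tunable" means in Dynamo-style stores such as Riak: each value is kept on N replicas, a read waits for R of them to answer and a write for W. When R + W is greater than N, every read quorum overlaps the most recent write quorum, which is the stricter setting; smaller values respond faster but may return stale data. The function below is a hypothetical helper that only does this arithmetic. It is not part of any Riak client library, and the numbers are illustrative rather than Engine Yard defaults.

```php
<?php
// Hypothetical helper: true when any read quorum must overlap any write quorum.
// $n = replicas per value, $r = replicas a read waits for, $w = replicas a write waits for.
function quorumsOverlap($n, $r, $w)
{
    return $r + $w > $n;
}

// Stricter setting: with 3 replicas, reading 2 and writing 2 always overlap.
var_dump(quorumsOverlap(3, 2, 2));

// More responsive setting: reading 1 and writing 1 may miss each other.
var_dump(quorumsOverlap(3, 1, 1));
```

The first call reports an overlap and the second does not, which mirrors the trade-off the paragraph describes: stricter for some data, more responsive for other data.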

Painless Installation, Management, and Support

We have invested in simplifying Riak's installation and configuration to make the learning curve less steep. In one easy step you can define the flavor and size of your cluster, the location of your data (EBS, ephemeral, etc.), optimize your cluster by selecting desired backends, and even enable full text search.

Once your cluster is up and running you can painlessly grow it if you need to add capacity. Removing nodes is also a trivial operation. If for any reason you want to archive your entire cluster, you can easily do this, too.

Riak clusters come with the fantastic support you have come to expect from Engine Yard. As partners of Riak's makers, Basho, we can quickly escalate tickets on your behalf when they require extra engineering insight.

A Whole New (Clustered) World

The cluster model used by Riak evolves the deployment topology of environments. Environments become more flexible with the ability to specify zero to many clusters per environment, and have all clusters properly deployed and balanced within availability zones in your region. We are also working on the ability to have clusters within a single environment provisioned in a different region.

As of today, clusters are exposed to all customers.


We will be migrating individual stack components to our new cluster model. All supported databases will be re-done and acquire the provisioning features you see in Riak. We are very excited about what we'll be releasing over the next few months.

Introducing Cluster Behaviors

The cluster provisioning model also allows us to express cluster-specific behaviors and act upon them in a scheduled way (or on demand). For example: all Riak clusters have access to rolling backups as their first supported behavior.

With rolling backups we can archive the entire contents of a cluster one node at a time without compromising its overall performance and ability to respond to requests.  We will be introducing new behaviors (like rolling snapshots) very soon.

Things You Must Know

To prepare for the migration of legacy components to clusters we have decided to change the way environments update. We have pushed stack responsibilities down to the cluster level. This means that clusters are now responsible for managing their stacks and updates which gives us greater granularity and flexibility (it’s a great thing, we promise!).

An important thing to note is that environment-wide custom Chef runs will no longer be applied to cluster instances. Clusters are isolated from system-wide versions of Chef as they carry their own stack and updates.

What Comes Next?

Here are a few things we have in store as we continue to evolve Riak and clusters:

We want to make Riak’s management tasks more intuitive than ever, so we will roll out enhancements to the environment page and overall cluster user experience. We are also working towards improved cluster monitoring and alerting.

Enhancements to instance booting times are in the pipeline. You will be able to go from zero to a fully running cluster faster than ever!

Where Can I Learn More?

Our documentation has been updated and it’s a great place to get started. We will be leveraging Basho’s excellent Riak documentation, too.

If you are in San Francisco we will be giving a tour of Riak on Engine Yard on May 22nd. Come ask questions! We’ll hand out a few gifts to the best ones.

http://www.meetup.com/San-Francisco-Riak-Meetup/events/118840422/

Still Have Not Tried Riak?

Riak is available on all trial accounts. Simply sign up, boot up a cluster, and you’ll be able to experiment with it.

Also feel free to open a Support ticket if you are wondering if Riak is a good fit for your application.  We love hearing from our customers and want your feedback.

Categories: Programming

Announcing PHP on Engine Yard Cloud

Tue, 05/14/2013 - 15:45

We’re excited to announce the general availability of PHP on Engine Yard Cloud.

PHP has been an important part of Engine Yard’s growing family since the acquisition of Orchestra in 2011. And now, PHP on Engine Yard Cloud represents the culmination of our efforts to deliver the industry’s best Platform as a Service for PHP developers. The result of this work is a unified service offering for PHP, Node.js, and Ruby applications.

With PHP on Engine Yard Cloud, users get a proven, robust platform on which they can both horizontally and vertically scale applications – including content, media, e-commerce, and more. As a highly configurable PaaS, Engine Yard Cloud gives PHP developers – from enterprises to digital agencies to SMBs – a wider range of instance sizes, a fully curated PHP stack, and advanced automation and orchestration features such as database replication and failover.

Whether deploying a simple WordPress blog or an advanced MySQL-backed web application, developers get a range of control over configuration, deployment and management of their application environments, including full root access on virtual servers and the flexibility of using custom Chef recipes to control and automate entire environments, regardless of size.

Get Started With Our Lowest Entry-Level Cost Ever

We recently announced several big price reductions including a new entry level price that gives you a dedicated EC2 small instance for $0.05 per hour. That's an average of $36.50 per month — almost 50 percent less than the original price! This means you can immediately start using Engine Yard Cloud to deploy your PHP applications at an entry level cost so low, it's less than the cost of a basic application on Orchestra.
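The monthly figure follows directly from the hourly rate. A quick check of the arithmetic, assuming a 365-day year averaged over 12 months (the variable names are illustrative):

```php
<?php
// Hourly price quoted above for the entry-level instance.
$hourlyRate = 0.05;

// Assumption: an instance running around the clock for a 365-day year.
$hoursPerYear = 24 * 365;

// Average monthly cost is the yearly cost spread over 12 months.
$monthly = $hourlyRate * $hoursPerYear / 12;

echo '$' . number_format($monthly, 2) . " per month\n";
```

That is 8,760 hours a year at $0.05, or $438, which averages out to the $36.50 per month quoted.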

What’s more, if you haven't already made use of the free trial, you can log in to Engine Yard Cloud with your existing login and claim your free 500 hours to get started!

Want to try it out? Head over to our documentation and give things a whirl.

What Does This Mean for Orchestra Customers?

We plan to retire Orchestra later this year, as we have already communicated to our Orchestra customers. In fact, we are already working with some customers to help them migrate to Cloud. And if you haven't already migrated, there are several reasons why you might want to try PHP on Engine Yard Cloud right away.

Some of the benefits of PHP on Engine Yard Cloud:

  • Choose the dedicated instance sizes you need
  • Run your database in your environment. No more third party providers required!
  • More control over your deployments
  • SSH access. Logs. Debugging.
  • Automated backups and snapshots of your environment
  • Stop and start environments

If you haven't migrated yet, you can open a support ticket and we will work with you on the migration. Or you can read more about our plans in the unification FAQ.

Thanks

We know we couldn’t have gotten this far without the support from this community, so we’d like to say a big “THANK YOU” to everyone involved. The whole Orchestra team is now working on Engine Yard Cloud. And we hope you’re as excited as we are about the expanded PHP service with more deployment choices, increased flexibility, better management, and — as always — the industry’s best support included.

Please note: GA features will go live at 1 pm PST today.

Categories: Programming