
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools


Feed aggregator

How combined Lean- and Agile practices will change the world as we know it

Xebia Blog - Tue, 07/01/2014 - 08:50

You might have attended our presentation about eXtreme Manufacturing earlier this month, or Nalden's keynote at XebiCon 2014 last week. There are a few epic takeaways and additions I would like to share with you in this blog post.

Epic TakeAway #1: The Learn, Unlearn and Relearn Cycle

As Nalden expressed in his inspiring keynote, one of the major things that made him successful is being able to Learn, Unlearn and Relearn, time and again. In my opinion, this will be the key ability for every successful company in the near future. In fact, this is how nature evolves: in the end, only the species that are able to adapt to changing circumstances survive and evolve. This mechanism is why, for example, most startups fail, but those that do survive can be extremely disruptive for non-agile organizations. The best example of this is of course WhatsApp, which shook up the telco industry by almost destroying its whole business model in only a few months. Learn more about disruptive innovation from one of my personal heroes, Harvard Professor Clayton Christensen.

Epic TakeAway #2: Unlearning Waterfall, Relearning Lean & Agile

Globally, Waterfall is still the dominant method in companies and universities. Waterfall has its origins more than 40 years ago, and times have changed. A lot. A new, successful and disruptive product can now appear in a matter of days instead of (many) years. Finally, things are changing. For example, the US Department of Defense has recently embraced Lean and Agile as mandatory practices, especially Scrum. Schools and universities are also increasingly adopting the Agile way of working. More on this later in this blog post.

Epic TakeAway #3: Combined Lean and Agile Practices = XM

Lean practices arose in Japan in the 1980s, mainly in the manufacturing industry, with Toyota as the frontrunner. Agile practices like Scrum were first introduced in the 1990s by Ken Schwaber and Jeff Sutherland and were mainly applied in the IT industry. Until recently, the manufacturing and IT worlds didn't really join forces to combine Lean and Agile practices. The WikiSpeed initiative of Joe Justice proved that combining these practices results in a hyper-productive environment, in which a 100 mile-per-gallon, road-legal sports car could be developed in less than three months. Out of this success, eXtreme Manufacturing (XM) arose: finally, a powerful combination of best practices from the manufacturing and IT worlds came together.

Epic TakeAway #4: Agile Mindset & Education

As Sir Ken Robinson and Dan Pink have described in their famous TED talks, the way most people are educated and rewarded is no longer suitable for modern times and even conflicts with the way we are born. We learn by "failing", not by preventing it. Failing in its essence should stimulate creativity to do things better next time, not be punished. In the long run, failing (read: learning!) has more added value than short-term success gained, for example, by chasing milestones blindly. EduScrum in the Netherlands stimulates schools and universities to apply Scrum in their daily classes in order to stimulate creativity, happiness, self-reliance and talent. The results at the schools joining this initiative are spectacular: happy students, fewer dropouts and significantly higher grades. For Forze, a prestigious Delft University project to develop a hydrogen race car, the students are currently being trained and coached to apply Agile and Lean practices, and these results too are more than promising: the Forze team is happier, more productive and better able to learn from setbacks. In fact, they are taking the first steps towards being anti-fragile. At the intercession of the Forze team members themselves, the current support by agile (Xebia) coaches is now planned to be extended to the flagship of Delft University: the NUON solar team.

The Final Epic TakeAway

In my opinion, we have reached a tipping point in the way goals should be achieved. Organizations are massively abandoning Waterfall and embracing Agile practices like Scrum. Adding Lean practices, like Joe Justice did in his WikiSpeed project, makes Agile and Lean extremely powerful. Yes, this will even make the world a much better place. We cannot prevent natural disasters with this, but we can be anti-fragile. We cannot prevent every epidemic, but we can respond to it in an XM fashion by developing a vaccine in days instead of years. This brings me, finally, to the missing statement of the current Agile Manifesto: we should Unlearn and Relearn before we Judge. Dare to dream like a little kid again. Unlearn your skepticism. Companies like Boeing, Lockheed Martin and John Deere already did; adopting XM sped up their velocity, in some cases by more than 7 times.

Keeping a journal

Gridshore - Sun, 06/29/2014 - 23:34

Today I was reading the first part of a book I got as a gift from one of my customers. The book is called Show Your Work by Austin Kleon (Show Your Work! @ Amazon). The whole idea of this book is that you must be open and share what you learn, and the steps you took to learn it.

I think this fits me like a glove, but I can be more expressive. Therefore I have decided to do things differently. I want to start by writing smaller pieces about the things I want to do that day, or what I accomplished that day, and give some excerpts of things I am working on. Not real blog posts or tutorials, but more notes that I share with you. Since it is a Sunday, I only want to share the book I am reading.


The post Keeping a journal appeared first on Gridshore.

Categories: Architecture, Programming

Is there a future for Map/Reduce?


Google's Jeffrey Dean and Sanjay Ghemawat filed the patent request and published the map/reduce paper 10 years ago (2004). According to Wikipedia, Doug Cutting and Mike Cafarella created Hadoop, with its own implementation of Map/Reduce, one year later at Yahoo – both these implementations were done for the same purpose: batch indexing of the web.

Back then, the web began its "web 2.0" transition: pages became more dynamic and people began to create more content, so an efficient way to reprocess and build the web index was needed, and map/reduce was it. Web indexing was a great fit for map/reduce since the initial processing of each source (web page) is completely independent from any other – i.e. a very convenient map phase – and you need to combine the results to build the inverted index. That said, even the core Google algorithm – the famous PageRank – is iterative (so less appropriate for map/reduce), not to mention that as the internet got bigger and the updates became more and more frequent, map/reduce wasn't enough. Again Google (who seem to be consistently a few years ahead of the industry) began coming up with alternatives like Google Percolator or Google Dremel (both papers were published in 2010; Percolator was introduced that year, and Dremel has been used at Google since 2006).

So now it is 2014, and it is time for the rest of us to catch up with Google and get over Map/Reduce, for multiple reasons:

  • end-users’ expectations (who hear “big data” but interpret that as  “fast data”)
  • iterative problems like graph algorithms, which are inefficient because you need to load and reload the data on each iteration (see the sketch after this list)
  • continuous ingestion of data (increments coming on as small batches or streams of events) – where joining to existing data can be expensive
  • real-time problems – both queries and processing
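
To make the point about iteration concrete, here is a minimal sketch (my own illustration, not from the post; the HDFS paths and iteration count are hypothetical) of an iterative algorithm driven by plain MapReduce: every iteration is a separate job that re-reads its whole input from HDFS, writes its whole output back, and pays the job startup cost again, even if only a small part of the data changed. The mapper and reducer are left as the identity defaults here; a real PageRank-style job would plug in its own classes via setMapperClass/setReducerClass.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IterativeDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String input = "ranks/iter-0";                        // hypothetical HDFS path
        for (int i = 1; i <= 20; i++) {                       // e.g. 20 PageRank-style iterations
            Job job = Job.getInstance(conf, "iteration-" + i);
            job.setJarByClass(IterativeDriver.class);
            FileInputFormat.addInputPath(job, new Path(input));               // full re-read from HDFS
            FileOutputFormat.setOutputPath(job, new Path("ranks/iter-" + i)); // full re-write to HDFS
            if (!job.waitForCompletion(true)) {
                System.exit(1);
            }
            input = "ranks/iter-" + i;                        // the next iteration starts from disk again
        }
    }
}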

In my opinion, Map/Reduce is an idea whose time has come and gone – it won't die in a day or a year; there are still a lot of working systems that use it and the alternatives are still maturing. I do think, however, that if you need to write or implement something new that would build on map/reduce, you should look at other options, or at the very least carefully consider them.

So how is this change going to happen? Luckily, Hadoop has recently adopted YARN (you can see my presentation on it here), which opens up the possibility to go beyond map/reduce without changing everything… even though, in effect, a lot will change. Note that some of the new options do have migration paths, and we still retain access to all that "big data" we have in Hadoop, as well as reuse of much of the ecosystem.

The first type of effort to replace map/reduce is to actually subsume it by offering more flexible batch processing. After all, saying map/reduce is not relevant doesn't mean that batch processing is not relevant; it does mean that there is a need for more complex processing. There are two main candidates here, Tez and Spark: Tez offers a nice migration path as it is replacing map/reduce as the execution engine for both Pig and Hive, and Spark has a compelling offer by combining batch and stream processing (more on this later) in a single engine.

The second type of effort or processing capability that will help kill map/reduce is MPP databases on Hadoop. Like the "flexible batch" approach mentioned above, this is replacing functionality that map/reduce was used for – unleashing the data already processed and stored in Hadoop. The idea here is twofold:

  • To provide fast query capabilities* – by using specialized columnar data formats and database engines deployed as daemons on the cluster
  • To provide rich query capabilities – by supporting more and more of the SQL standard and enriching it with analytics capabilities (e.g. via MADlib)

Efforts in this arena include Impala from Cloudera, Hawq from Pivotal (which is essentially Greenplum over HDFS), startups like Hadapt, or even Actian trying to leverage their ParAccel acquisition with the recently announced Actian Vector. Hive is somewhere in the middle, relying on Tez on one hand and using vectorization and a columnar format (ORC) on the other.

The third type of processing that will help dethrone Map/Reduce is stream processing. Unlike the two previous types of effort, this covers ground that map/reduce can't cover, even inefficiently. Stream processing is about handling a continuous flow of new data (e.g. events) and processing it (enriching, aggregating, etc.) in seconds or less. The two major contenders in the Hadoop arena seem to be Spark Streaming and Storm, though, of course, there are several other commercial and open source platforms that handle this type of processing as well.
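
As a rough illustration (my own sketch, not from the post, assuming Spark 1.x with the Java 8 API), here is a streaming word count that handles a continuous flow of text from a socket in one-second micro-batches, using the same RDD-style operations Spark offers for batch processing:

import java.util.Arrays;
import scala.Tuple2;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class StreamingWordCount {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("streaming-word-count");
        // process the incoming stream in one-second micro-batches
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));

        JavaReceiverInputDStream<String> lines = jssc.socketTextStream("localhost", 9999);
        JavaDStream<String> words = lines.flatMap(line -> Arrays.asList(line.split(" ")));
        JavaPairDStream<String, Integer> counts = words
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey((a, b) -> a + b);

        counts.print();          // counts per micro-batch, i.e. results within seconds
        jssc.start();
        jssc.awaitTermination();
    }
}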

In summary – Map/Reduce is great. It has served us (as an industry) for a decade, but it is now time to move on and bring the richer processing capabilities we have elsewhere to bear on our big data problems as well.

Last note – I focused on Hadoop in this post even though there are several other platforms and tools around. I think that regardless of whether Hadoop is the best platform, it is the one becoming the de facto standard for big data (remember Betamax vs. VHS?).

One really, really last note – if you read up to here, and you are a developer living in Israel, and you happen to be looking for a job –  I am looking for another developer to join my Technology Research team @ Amdocs. If you’re interested drop me a note: arnon.rotemgaloz at amdocs dot com or via my twitter/linkedin profiles

*esp. in regard to analytical queries – operational SQL on Hadoop, with efforts like Phoenix, IBM's BigSQL or Splice Machine, is also happening, but that's another story

illustration idea found in James Mickens's talk at Monitorama 2014 (which is, by the way, a really funny presentation – go watch it)… oh yeah, and Pulp Fiction :)

Categories: Architecture

Hadoop YARN overview

I did a short overview of Hadoop YARN for our big data development team. The presentation covers the motivation for YARN, how it works and its major weaknesses.

You can watch/download on slideshare

Categories: Architecture

Using Dropwizard in combination with Elasticsearch

Gridshore - Thu, 05/15/2014 - 21:09

Dropwizard logo

How often do you start creating a new application? How often have you thought about configuring an application: where to locate a config file, how to load the file, what format to use? Another thing you regularly do is add timers to track execution time, management tools to do thread analysis, and so on. From a more functional perspective, you want a rich client-side application using AngularJS, so you need a REST backend to deliver json documents. Does this sound like something you need regularly? Then this blog post is for you. If you never need this, please keep on reading anyway; you might like it.

In this blog post I will create an application that shows you all the available indexes in your elasticsearch cluster. Not very sexy, but I am going to use AngularJS, Dropwizard and elasticsearch. That should be enough to get a lot of you interested.


What is Dropwizard

Dropwizard is a framework that combines a lot of other frameworks that have become the de facto standard in their own domain: Jersey for the REST interface, Jetty as a lightweight container, Jackson for json parsing, Freemarker for front-end templates, Metrics for metrics and slf4j for logging. Dropwizard has some utilities to combine these frameworks and enable you as a developer to be very productive in constructing your application. It provides building blocks like lifecycle management, resources, views, loading of bundles, configuration and initialization.

Time to jump in and start creating an application.

Structure of the application

The application is set up as a maven project. To start off we only need one dependency:

<dependency>
    <groupId>io.dropwizard</groupId>
    <artifactId>dropwizard-core</artifactId>
    <version>${dropwizard.version}</version>
</dependency>

If you want to follow along, you can check my github repository:


https://github.com/jettro/dropwizard-elastic

Configure your application

Every application needs configuration. In our case we need to configure how to connect to elasticsearch. In Dropwizard you extend the Configuration class and create a pojo. Using Jackson and Hibernate Validator annotations, we configure validation and serialization. In our case the configuration object looks like this:

public class DWESConfiguration extends Configuration {
    @NotEmpty
    private String elasticsearchHost = "localhost:9200";

    @NotEmpty
    private String clusterName = "elasticsearch";

    @JsonProperty
    public String getElasticsearchHost() {
        return elasticsearchHost;
    }

    @JsonProperty
    public void setElasticsearchHost(String elasticsearchHost) {
        this.elasticsearchHost = elasticsearchHost;
    }

    @JsonProperty
    public String getClusterName() {
        return clusterName;
    }

    @JsonProperty
    public void setClusterName(String clusterName) {
        this.clusterName = clusterName;
    }
}

Then you need to create a yml file containing the properties of the configuration class, together with their values. In my case it looks like this:

elasticsearchHost: localhost:9300
clusterName: jc-play

How often have you started a project by creating the configuration mechanism? Usually I start with maven and quickly move on to tomcat. Not this time: we have done maven, and now we have done configuration. Next up is the runner for the application.

Add the runner

This is the class we run to start the application; internally, jetty is started. We extend the Application class and use the configuration class as its generic type parameter. This is the class that initializes the complete application: the bundles we use are initialized, and classes are created and passed to other classes.

public class DWESApplication extends Application<DWESConfiguration> {
    private static final Logger logger = LoggerFactory.getLogger(DWESApplication.class);

    public static void main(String[] args) throws Exception {
        new DWESApplication().run(args);
    }

    @Override
    public String getName() {
        return "dropwizard-elastic";
    }

    @Override
    public void initialize(Bootstrap<DWESConfiguration> dwesConfigurationBootstrap) {
    }

    @Override
    public void run(DWESConfiguration config, Environment environment) throws Exception {
        logger.info("Running the application");
    }
}

When starting this application, we have no success: we get a big error because we did not register any resources.

ERROR [2014-05-14 16:58:34,174] com.sun.jersey.server.impl.application.RootResourceUriRules: 
	The ResourceConfig instance does not contain any root resource classes.
Nothing happens; we just need a resource.

Before we can return something, we need to have something to return. We create a pojo called Index that contains one property called name; for now we just return this object as a json document. The following code shows the IndexResource that handles the requests related to indexes (a sketch of the Index pojo follows it).

@Path("/indexes")
@Produces(MediaType.APPLICATION_JSON)
public class IndexResource {

    @GET
    @Timed
    public Index showIndexes() {
        Index index = new Index();
        index.setName("A Dummy Index");

        return index;
    }
}
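
The Index pojo itself is not shown in the post; a minimal sketch could look like this (Jackson serializes the name property through its getter):

public class Index {
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}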

The @GET, @Path and @Produces annotations are from the jersey rest library; @Timed is from the metrics library. Before starting the application we need to register our index resource with jersey.

    @Override
    public void run(DWESConfiguration config, Environment environment) throws Exception {
        logger.info("Running the application");
        final IndexResource indexResource = new IndexResource();
        environment.jersey().register(indexResource);
    }

Now we can start the application using a run configuration from IntelliJ, shown below. Later on we will create the executable jar.

Running the app from IntelliJ

Run the application again; this time it works. You can browse to http://localhost:8080/indexes and see our dummy index as a nice json document. There is something in the logs though. I love this message – this is what you get when running the application without health checks.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!    THIS APPLICATION HAS NO HEALTHCHECKS. THIS MEANS YOU WILL NEVER KNOW      !
!     IF IT DIES IN PRODUCTION, WHICH MEANS YOU WILL NEVER KNOW IF YOU'RE      !
!    LETTING YOUR USERS DOWN. YOU SHOULD ADD A HEALTHCHECK FOR EACH OF YOUR    !
!         APPLICATION'S DEPENDENCIES WHICH FULLY (BUT LIGHTLY) TESTS IT.       !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Creating a health check

Since we are creating an application that interacts with elasticsearch, we add a health check for elasticsearch. Don't think too much about how we connect to elasticsearch yet; we will get there later on.

public class ESHealthCheck extends HealthCheck {

    private ESClientManager clientManager;

    public ESHealthCheck(ESClientManager clientManager) {
        this.clientManager = clientManager;
    }

    @Override
    protected Result check() throws Exception {
        ClusterHealthResponse clusterIndexHealths = clientManager.obtainClient().admin().cluster().health(new ClusterHealthRequest())
                .actionGet();
        switch (clusterIndexHealths.getStatus()) {
            case GREEN:
                return HealthCheck.Result.healthy();
            case YELLOW:
                return HealthCheck.Result.unhealthy("Cluster state is yellow, maybe replication not done? New Nodes?");
            case RED:
            default:
                return HealthCheck.Result.unhealthy("Something is very wrong with the cluster", clusterIndexHealths);

        }
    }
}

Just like with the resource handler, we need to register the health check. Next to the standard http port for normal users, another port is exposed for administration. There you can find reports like Metrics, Ping, Threads and Healthcheck.

    @Override
    public void run(DWESConfiguration config, Environment environment) throws Exception {
        logger.info("Running the application");
        // the ESClientManager (explained below) wraps the elasticsearch client
        ESClientManager esClientManager = new ESClientManager(config.getElasticsearchHost(), config.getClusterName());

        final IndexResource indexResource = new IndexResource(esClientManager);
        environment.jersey().register(indexResource);

        final ESHealthCheck esHealthCheck = new ESHealthCheck(esClientManager);
        environment.healthChecks().register("elasticsearch", esHealthCheck);
    }

You as a reader now have an assignment: start the application and check the admin pages yourself at http://localhost:8081. In the meantime, we are going to connect to elasticsearch.

Connecting to elasticsearch

We connect to elasticsearch using the transport client. This is taken care of by the ESClientManager. We make use of Dropwizard's managed classes: the lifecycle of these classes is managed by Dropwizard. From the configuration object we take the host(s) and the cluster name. We obtain a client in the start method and pass this client to the classes that need it. The first class that needs it is the health check, but we already had a look at that one. Through the ESClientManager, other classes have access to the client. The Managed interface mandates a start as well as a stop method.

    @Override
    public void start() throws Exception {
        Settings settings = ImmutableSettings.settingsBuilder().put("cluster.name", clusterName).build();

        logger.debug("Settings used for connection to elasticsearch : {}", settings.toDelimitedString('#'));

        TransportAddress[] addresses = getTransportAddresses(host);

        logger.debug("Hosts used for transport client : {}", (Object) addresses);

        this.client = new TransportClient(settings).addTransportAddresses(addresses);
    }

    @Override
    public void stop() throws Exception {
        this.client.close();
    }
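
The surrounding class is not shown in full in the post. Here is a minimal sketch of what it could look like (my reconstruction, assuming the elasticsearch 1.x transport client and a single host:port value; the original uses a getTransportAddresses helper to support multiple hosts):

import io.dropwizard.lifecycle.Managed;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class ESClientManager implements Managed {
    private final String host;
    private final String clusterName;
    private Client client;

    public ESClientManager(String host, String clusterName) {
        this.host = host;
        this.clusterName = clusterName;
    }

    // gives resources and health checks access to the client created in start()
    public Client obtainClient() {
        return client;
    }

    @Override
    public void start() throws Exception {
        Settings settings = ImmutableSettings.settingsBuilder()
                .put("cluster.name", clusterName).build();
        String[] parts = host.split(":"); // e.g. "localhost:9300"
        this.client = new TransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress(parts[0], Integer.parseInt(parts[1])));
    }

    @Override
    public void stop() throws Exception {
        client.close();
    }
}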

We need to register our managed class with the lifecycle of the environment in the runner class.

    @Override
    public void run(DWESConfiguration config, Environment environment) throws Exception {
        ESClientManager esClientManager = new ESClientManager(config.getElasticsearchHost(), config.getClusterName());
        environment.lifecycle().manage(esClientManager);
    }	

Next we want to change the IndexResource to use the elasticsearch client to list all indexes.

    public List<Index> showIndexes() {
        IndicesStatusResponse indices = clientManager.obtainClient().admin().indices().prepareStatus().get();

        List<Index> result = new ArrayList<>();
        for (String key : indices.getIndices().keySet()) {
            Index index = new Index();
            index.setName(key);
            result.add(index);
        }
        return result;
    }
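
The post does not show how the resource gets hold of the client manager; presumably its constructor now receives the ESClientManager, roughly like this:

@Path("/indexes")
@Produces(MediaType.APPLICATION_JSON)
public class IndexResource {
    private final ESClientManager clientManager;

    public IndexResource(ESClientManager clientManager) {
        this.clientManager = clientManager;
    }

    // showIndexes() as shown above
}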

Now we can browse to http://localhost:8080/indexes and get back a nice json array. In my case I got this:

[
	{"name":"logstash-tomcat-2014.05.02"},
	{"name":"mymusicnested"},
	{"name":"kibana-int"},
	{"name":"playwithip"},
	{"name":"logstash-tomcat-2014.05.08"},
	{"name":"mymusic"}
]
Creating a better view

Having this REST based interface with json documents is nice, but not if you are a human like me (well, kind of). So let us add some AngularJS magic to create a slightly better view. The following page can of course also be created with simpler view technologies, but I want to demonstrate what you can do with Dropwizard.

First we make it possible to use Freemarker as a template engine. To make this work we need two additional dependencies: dropwizard-views and dropwizard-views-freemarker. The first step is a view class that knows which Freemarker template to load and provides the fields that your template can read. In our case we want to expose the cluster name.

public class HomeView extends View {
    private final String clusterName;

    protected HomeView(String clusterName) {
        super("home.ftl");
        this.clusterName = clusterName;
    }

    public String getClusterName() {
        return clusterName;
    }
}

Then we have to create the Freemarker template, which looks like the following code block:

<#-- @ftlvariable name="" type="nl.gridshore.dwes.HomeView" -->
<html ng-app="myApp">
<head>
    <title>DWAS</title>
</head>
<body ng-controller="IndexCtrl">
<p>Underneath a list of indexes in the cluster <strong>${clusterName?html}</strong></p>

<div ng-init="initIndexes()">
    <ul>
        <li ng-repeat="index in indexes">{{index.name}}</li>
    </ul>
</div>

<script src="/assets/js/angular-1.2.16.min.js"></script>
<script src="/assets/js/app.js"></script>
</body>
</html>

By default you put these templates in the resources folder, using the same sub folders as the package of your view class. If you look closely you see some AngularJS code; more on this later. First we need to map a url to the view. This is done with a resource class. The following code block shows the HomeResource class that maps “/” to the HomeView.

@Path("/")
@Produces(MediaType.TEXT_HTML)
public class HomeResource {
    private String clusterName;

    public HomeResource(String clusterName) {
        this.clusterName = clusterName;
    }

    @GET
    public HomeView goHome() {
        return new HomeView(clusterName);
    }
}

Notice that we now configure it to return text/html. The goHome method is annotated with @GET, so each GET request to the path “/” is mapped to the HomeView class. Now we need to tell jersey about this mapping; that is done in the runner class.

final HomeResource homeResource = new HomeResource(config.getClusterName());
environment.jersey().register(homeResource);
Using assets

The final part I want to show is how to use the assets bundle from Dropwizard to map a folder “/assets” to a part of the url. To use this bundle you have to add the following dependency in maven: dropwizard-assets. Then we can easily map the assets folder in our resources folder to the web assets folder.

    @Override
    public void initialize(Bootstrap<DWESConfiguration> dwesConfigurationBootstrap) {
        dwesConfigurationBootstrap.addBundle(new ViewBundle());
        dwesConfigurationBootstrap.addBundle(new AssetsBundle("/assets/", "/assets/"));
    }

That is it, now you can load the angular javascript file. My very basic sample has one angular controller. This controller uses the $http service to call our /indexes url. The result is used to show the indexes in a list view.

// app.js – define the module referenced by ng-app="myApp" in the template
var myApp = angular.module('myApp', []);

myApp.controller('IndexCtrl', function ($scope, $http) {
    $scope.indexes = [];

    $scope.initIndexes = function () {
        $http.get('/indexes').success(function (data) {
            $scope.indexes = data;
        });
    };
});

And the result

the very basic screen showing the indexes

Concluding

This was my first go at using Dropwizard, and I must admit I like what I have seen so far. I am not sure if I would create a big application with it; on the other hand, it is really well structured. Before moving on I would need to read a bit more about the library and check all of its options. There is a lot more possible than what I have shown you here.


The post Using Dropwizard in combination with Elasticsearch appeared first on Gridshore.

Categories: Architecture, Programming

Yet More Change for the Capitals

DevHawk - Harry Pierson - Sat, 04/26/2014 - 21:13

Six years ago, I was pretty excited about the future for the Washington Capitals. They had just lost their first round match up with the Flyers – which was a bummer – but they had made the playoffs for the first time in 3 seasons. I wrote at the time:

Furthermore, even though they lost, these playoffs are a promise of future success. I tell my kids all the time that the only way to get good at something is to work hard while you’re bad at it. Playoff hockey is no different. Most of the Caps had little or no playoff experience going into this series and it really showed thru the first three games. But they kept at it and played much better over the last four games of the series. They went 2-2 in those games, but the two losses went to overtime. A little more luck (or better officiating) and the Caps are headed to Pittsburgh instead of the golf course.

What a difference six seasons makes. Sure, they won the Presidents' Trophy in 2010. But the promise of future playoff success has been broken, badly. The Caps have been on a pretty steep decline since getting beaten by the eighth-seeded Canadiens in the first round of the playoffs in 2010. Since then, they've switched systems three times and head coaches twice. This year, they missed the playoffs entirely, even with Alex Ovechkin racking up a league-leading 51 goals.

Today, the word came down that both the coach and general manager have been let go. As a Caps fan, I’m really torn about this. I mean, I totally agree that the coach and GM had to go – frankly, I was surprised it didn’t happen 7-10 days earlier. But now what do you do? The draft is two months and one day away, free agency starts two days after that. The search for a GM is going to have to be fast. Then the GM will have to make some really important decisions about players at the draft, free agency and compliance buyouts with limited knowledge of the players in our system. Plus, he’ll need to hire a new head coach – preferably before the draft as well.

The one positive note is that the salary cap situation for the Capitals looks pretty good for next year. The Capitals currently have the second largest amount of cap space per open roster slot in the league. (The Islanders are first with $14.5 million per open roster slot; the Caps have just over $7 million per open roster slot.) They have only a handful of unrestricted free agents to re-sign – with arguably only one "must sign" (Mikhail Grabovski) in the bunch. Of course, this could also be a bug rather than a feature – having that many players under contract may make it harder for the new GM to shape the team in his image.

Whoever the Capitals hire as GM and coach, I'm not expecting a promising start. It feels like next season is already a wash, and we're not even finished with the first round of this year's playoffs yet.

I guess it could be worse.

I could be a Toronto Leafs fan.

Categories: Architecture, Programming

Brokered WinRT Components Step Three

DevHawk - Harry Pierson - Fri, 04/25/2014 - 16:45

So far, we’ve created two projects, written all of about two lines of code and we have both our brokered component and its proxy/stub ready to go. Now it’s time to build the Windows Runtime app that uses the component. So far, things have been pretty easy – the only really tricky and/or manual step so far has been registering the proxy/stub, and that’s only tricky if you don’t want to run VS as admin. Unfortunately, tying this all together in the app requires a few more manual steps.

But before we get to the manual steps, let's create the WinRT client app. Again, we're going to create a new project, but this time we're going to select “Blank App (Windows)” from the Visual C# -> Store Apps -> Windows App node of the Add New Project dialog. Note, I'm not using “Blank App (Universal)” or “Blank App (Windows Phone)” because the brokered WinRT component feature is not supported on Windows Phone. Call the client app project whatever you like; I'm calling mine “HelloWorldBRT.Client”.

Before we start writing code, we need to reference the brokered component. We can't reference the brokered component directly or it will load in the sandboxed app process. Instead, the app needs to reference a reference-assembly version of the .winmd that gets generated automatically by the proxy/stub project. Remember in the last step when I said Kieran Mockford is an MSBuild wizard? The proxy/stub template project includes a custom target that automatically publishes the reference assembly winmd file used by the client app. When he showed me that, I was stunned – as I said, the man is a wizard. This means all you need to do is right click on the References node of the WinRT client app project and select Add Reference. In the Reference Manager dialog, add a reference to the proxy/stub project you created in step two.

Now I can add the following code to the top of my App.OnLaunched function. Since this is a simple Hello World walkthrough, I'm not going to bother to build any UI; I'm just going to inspect variables in the debugger. Believe me, the less UI I write, the better for everyone involved. Note, I've also added the P/Invoke signatures for GetCurrentProcessId and GetCurrentThreadId to the client app, like I did in the brokered component in step one. This way, I can get the process and thread IDs for both the app and broker process and compare them.

var pid = GetCurrentProcessId();
var tid = GetCurrentThreadId();

var c = new HelloWorldBRT.Class();
var bpid = c.CurrentProcessId;
var btid = c.CurrentThreadId;

At this point the app will compile, but if I run it the app will throw a TypeLoadException when it tries to create an instance of HelloWorldBRT.Class. The type can't be loaded because we're using the reference assembly .winmd published by the proxy/stub project – it has no implementation details, so it can't load. In order to be able to load the type, we need to declare HelloWorldBRT.Class as a brokered component in the app's package.appxmanifest file. For non-brokered components, Visual Studio does this for you automatically; for brokered components we unfortunately have to do it manually. Every activatable class (i.e. a class you can construct via “new”) needs to be registered in the appx manifest this way.

To register HelloWorldBRT.Class, right click the Package.appxmanifest file in the client project, select “Open With” from the context menu and then select “XML (Text) editor” from the Open With dialog. Then you need to insert an inProcessServer extension that includes an ActivatableClass element for each class you can activate (aka has a public constructor). Each ActivatableClass element contains an ActivatableClassAttribute element that points to the folder where the brokered component is installed. Here's what I added to the Package.appxmanifest of my HelloWorldBRT.Client app.

<Extensions>
  <Extension Category="windows.activatableClass.inProcessServer">
    <InProcessServer>
      <Path>clrhost.dll</Path>
      <ActivatableClass ActivatableClassId="HelloWorldBRT.Class" 
                        ThreadingModel="both">
        <ActivatableClassAttribute 
             Name="DesktopApplicationPath" 
             Type="string" 
             Value="D:\dev\HelloWorldBRT\Debug\HelloWorldBRT.PS"/>
      </ActivatableClass>
    </InProcessServer>
  </Extension>
</Extensions>

The key thing here is the addition of the DesktopApplicationPath ActivatableClassAttribute. This tells the WinRT activation logic that HelloWorldBRT.Class is a brokered component and where the managed .winmd file with the implementation details is located on the device. Note, you can use multiple brokered components in your side loaded app, but they all have the same DesktopApplicationPath.

Speaking of DesktopApplicationPath, the path I'm using here is the final output location of the proxy/stub components generated by the compiler. Frankly, this isn't a good choice for a production deployment, but for the purposes of this walkthrough it'll be fine.


Now when we run the app, we can load a HelloWorldBRT.Class instance and access its properties. We're definitely seeing different process IDs when comparing the result of calling GetCurrentProcessId directly in App.OnLaunched vs. the result of calling GetCurrentProcessId in the brokered component. Of course, each run of the app will have different ID values, but this proves that we are loading our brokered component into a different process from the one where our app code is running.

Now you’re ready to go build your own brokered components! Here’s hoping you’ll find more interesting uses for them than comparing the process IDs of the app and broker processes in the debugger! :)

Categories: Architecture, Programming

Brokered WinRT Components Step Two

DevHawk - Harry Pierson - Fri, 04/25/2014 - 16:43

Now that we have built the brokered component, we have to build a proxy/stub for it. Proxies and stubs are how WinRT method calls are marshalled across process boundaries. If you want to know more – or you have insomnia – feel free to read all the gory details up on MSDN.

Proxies and stubs look like they might be scary, but they’re actually trivial (at least in the brokered component scenario) because 100% of the code is generated for you. It couldn’t be much easier.

Right click the solution node and select Add -> New Project. Alternatively, you can select File -> New -> Project in the Visual Studio main menu, but if you do that make sure you change the default solution from “Create new Solution” to “Add to Solution”. Regardless of how you launch the new project wizard, search for “broker” again, but this time select the “Brokered Windows Runtime ProxyStub” template. Give the project a name – I chose “HelloWorldBRT.PS”.

Once you've created the proxy/stub project, you need to set a reference to the brokered component you created in step one. Since proxies and stubs are native, this is a VC++ project. Adding a reference in a VC++ project is not as straightforward as it is in C# projects. Right click the proxy/stub project, select “Properties” and then select Common Properties -> References from the tree on the left. Press the “Add New Reference…” button to bring up the same Add Reference dialog you've seen in managed code projects. Select the brokered component project and press OK.

Remember when I said that 100% of the code for the proxy/stub is generated? I wasn't kidding – creating the project from the template and setting a reference to the brokered component project is literally all you need to do. Want proof? Go ahead and build now. If you watch the output window, you'll see a bunch of output go by referencing IDL files and MIDLRT among other stuff. This proxy/stub template has some custom MSBuild tasks that generate the proxy/stub code using winmdidl and midlrt. The process is similar to what is described here. BTW, if you get a chance, check out the proxy/stub project file – it is a work of art. Major props to Kieran Mockford for his msbuild wizardry.

Unfortunately, it's not enough just to build the proxy/stub – you also have to register it. The brokered component proxy/stub needs to be registered globally on the machine, which means you have to be running as an admin to do it. VS can register the proxy/stub for you automatically, but that means you have to run VS as an administrator. That always makes me nervous, but if you're OK with running as admin you can enable proxy/stub registration by right clicking the proxy/stub project file, selecting Properties, navigating to Configuration Properties -> Linker -> General in the tree of the project properties page, and then changing Register Output to “Yes”.

If you don't like running VS as admin, you can manually register the proxy/stub by running “regsvr32 <proxystub dll>” from an elevated command prompt. Note, you do have to re-register every time the public surface area of your brokered component changes, so letting VS handle registration is definitely the easier route to go.

In the third and final step, we’ll build a client app that accesses our brokered component.

Categories: Architecture, Programming

Brokered WinRT Components Step One

DevHawk - Harry Pierson - Fri, 04/25/2014 - 16:41

In this step, we'll build the brokered component itself. Frankly, the only thing that makes a brokered component different from a normal WinRT component is some small tweaks to the project file that enable access to the full .NET Runtime and Base Class Library. The brokered component whitepaper describes these tweaks in detail, but the new brokered component template takes care of them for you.

Start by selecting File -> New -> Project in Visual Studio. With the sheer number of templates to choose from these days, I find it's easier to just search for the one I want. Type “broker” in the search box in the upper left and you'll end up with two choices – the brokered WinRT component and the brokered WinRT proxy/stub. For now, choose the brokered component; we'll be adding a brokered proxy/stub in step two. Name the project whatever you want. I named mine “HelloWorldBRT”.

This is probably the easiest step of the three as there’s nothing really special you have to do – just write managed code like you always do. In my keynote demo, this is where I wrote the code that wrapped the existing ADO.NET based data access library. For the purposes of this walkthrough, let’s do something simpler. We’ll use P/Invoke to retrieve the current process and thread IDs. These Win32 APIs are supported for developing WinRT apps and will make it obvious that the component is running in a separate process than the app. Here’s the simple code to retrieve those IDs (hat tip to pinvoke.net for the interop signatures):

public sealed class Class
{
    [DllImport("kernel32.dll")]
    static extern uint GetCurrentThreadId();

    [DllImport("kernel32.dll")]
    static extern uint GetCurrentProcessId();

    public uint CurrentThreadId
    {
        get { return GetCurrentThreadId(); }
    }

    public uint CurrentProcessId
    {
        get { return GetCurrentProcessId(); }
    }
}

That’s it! I didn’t even bother to change the class name for this simple sample.

Now, to be clear, there’s no reason why this code needs to run in a broker process. As I pointed out, the Win32 functions I’m wrapping here are supported for use in Windows Store apps. For this walkthrough, I’m trying to keep the code simple in order to focus on the specifics of building brokered components. If you want to see an example that actually leverages the fact that it’s running outside of the App Container, check out the NorthwindRT sample.

In the next step, we’ll add the proxy/stub that enables this component to communicate across a process boundary.

Categories: Architecture, Programming

Brokered WinRT Components Step-by-Step

DevHawk - Harry Pierson - Fri, 04/25/2014 - 16:40

Based on the feedback I’ve gotten since my keynote appearance @ Build – both in person and via email & twitter – there are a lot of folks who are excited about the Brokered WinRT Component feature. However, I’ve been advising folks to hold off a bit until the new VS templates were ready. Frankly, the developer experience for this feature is a bit rough and the VS template makes the experience much better. Well, hold off no longer! My old team has published the Brokered WinRT Component Project Templates up on the Visual Studio Gallery!

Now that the template is available, I’ve written a step-by-step guide demonstrating how to build a “Hello World” style brokered component. Hopefully, this will help folks in the community take advantage of this cool new feature in Windows 8.1 Update.

To keep it readable, I've broken it into three separate posts:

  • Brokered WinRT Components Step One – building the brokered component itself
  • Brokered WinRT Components Step Two – building and registering the proxy/stub
  • Brokered WinRT Components Step Three – building the client app that uses the component

Note, this walkthrough assumes you’re running Windows 8.1 Update, Visual Studio 2013 with Update 2 RC (or later) and the Brokered WinRT Component Project Templates installed.

I hope this series helps you take advantage of brokered WinRT components. If you have any further questions, feel free to drop me an email or hit me up on Twitter.

Categories: Architecture, Programming

Affordance Open-Space at XP Day

Mistaeks I Hav Made - Nat Pryce - Tue, 12/03/2013 - 00:20
Giovanni Asproni and I facilitated an open-space session at XP Day on the topic of affordance in software design. Affordance is "a quality of an object, or an environment, which allows an individual to perform an action". Don Norman distinguishes between actual affordance and perceived affordance, the latter being "those action possibilities that are readily perceivable by an actor". I think this distinction applies to software programming interfaces as well. Unless running in a sandboxed execution environment, software has great actual affordance, as long as we're willing to do the wrong thing when necessary: break encapsulation or rely on implementation-dependent data structures. What makes programming difficult is determining how and why to work with the intentions of those who designed the software that we are building upon: the perceived affordance of those lower software layers. Here are the notes I took during the session, cleaned up a little and categorised into things that help affordance, hinder affordance, or can do both.

Helping

  • Repeated exposure to language you understand results in fluency.
  • Short functions.
  • Common, consistent language.
  • Domain types.
  • Good error messages.
  • Things that behave differently should look different, e.g. the convention of a ! suffix in Scheme and Ruby for functions that mutate data.
  • Being able to explore software: able to put "probes" into/around the software to experiment with its behaviour, and knowing where to put those probes.
  • TDD, done properly, is likely to make the most important affordances visible: simpler APIs, and the ability to experiment with and probe software running in isolation.

Hindering

  • Two ways software has bad affordance: you can't work out what it does (poor perceived affordance), or it behaves unexpectedly (poor actual affordance). (Are these two viewpoints on the same thing?)
  • Null pointer exceptions detected far from their origin.
  • Classes named after design patterns incorrectly, e.g. the programmer did not understand the pattern: Factories that do not create objects, Visitors that are really internal iterators.
  • Inconsistencies can be caused by unfinished refactoring. Which code style to use? Agree in the team; spread agreement by frequent pair rotation.
  • Names that have subtly different meanings in natural language and in the code, e.g. OO design patterns. Compare with FP design patterns, which have strict definitions and meaningless names (in natural language); you have to learn what the names really mean instead of relying on (wrong) intuition.

Both Helping and Hindering

  • Conventions, e.g. Ruby on Rails: an alternative to metaphor? But don't be magical, and it can be hard to test code in isolation. If you break a convention, draw attention to where it has been broken.
  • Naming types after patterns: better to use domain terms than pattern names? E.g. a CustomerService that mostly does stuff with addresses is better named AddressBook; then you can see it also has unrelated behaviour that belongs elsewhere. But now you need to communicate the system metaphors and domain terminology (is that ever a bad thing?).
  • Bad code style you are familiar with does offer perceived affordance.
  • Bad design might provide good actual affordance, e.g. lots of global variables makes it easy to access data, but poor perceived affordance, e.g. you cannot understand how state is being mutated by other parts of the system.
  • Method aliasing: makes code expressive, and is common in Ruby; but Python takes the opposite stance – only one way to do anything.
  • Frameworks: new jargon to learn, new conventions; they provide specific affordances, but if your context changes, you need different affordances and the framework may get in the way.
  • Documentation vs tests: the actual affordance of physical components is described in spec documents and is not obvious from appearance, e.g. different grades of building brick. Is this possible for software? Docs are more readable, tests are more trustworthy; the best of both worlds – doctests, as used in Go and Python?
  • How do we know what has changed between versions of a software component? Open source libraries usually have very good release notes; why is it so hard in enterprise development?
Categories: Programming, Testing & QA

Multimethods

Mistaeks I Hav Made - Nat Pryce - Thu, 10/10/2013 - 00:30
I have observed that... Unless the language already provides them, any sufficiently complicated program written in a dynamically typed language contains multiple, ad hoc, incompatible, informally-specified, bug-ridden, slow implementations of multimethods. (To steal a turn of phrase from Philip Greenspun).
Categories: Programming, Testing & QA

Thought after Test Automation Day 2013

Henry Ford said "Obstacles are those frightful things you see when you take your eyes off the goal." After attending Test Automation Day last month, I have been figuring out why industrializing testing doesn't work. I put it in this negative perspective on purpose, because I think it does work! But when is it successful? A lot of the time, Ford's remark is indeed the problem: people tend to see obstacles – obstacles born from the thought that it is not feasible to change something. They need to change, but that is not an easy change.

After attending #TAD2013, as it was known on Twitter, I saw a huge interest in better testing, faster testing, even cheaper testing by using tools to industrialize. Test automation has long been seen as an interesting option that can enable faster testing. It wasn't always cheaper, especially the first time, but at least faster. As I see it, it will enable better testing. "Better?" you may ask. Test automation itself doesn't enable better testing, but by automating regression tests and simple work, the tester can focus on other areas of quality.


And isn't that the goal? In the end, all people involved in a project want to deliver a high quality product, not one full of bugs. But they also tend to see the obstacles. I see them less and less. New tools are so advanced, and automation testers are becoming smarter and smarter, that they enable us to look beyond the obstacles – or, as I would rather say, over the obstacles.

At the Test Automation Day I learned some new things, but it also proved something I already knew: test automation is here to stay. We don't need to focus on the obstacles, but should focus on the goal.

Categories: Testing & QA

Polyglot Background Jobs

Engine Yard Blog - Tue, 06/25/2013 - 20:05

There are many things we end up needing background jobs for, but the main reason is to provide a snappy, non-blocking user experience.

Whether that task is encoding a video file, batch data import, or (in one case I ran into) jabber instant messaging, we want to offload them from our web servers as quickly as possible.

There are lots of tools to accomplish this across all languages, including Resque, Sidekiq, delayed_job, node-schedule, beanstalkd, Amazon Simple Queue Service (SQS) and then there is my personal favorite: Gearman.

Gearman has client libraries in C, PHP, Ruby, Node.js, Python, Java, Perl, C#/.NET and even includes tools that can be called via shell script, and user-defined functions for both MySQL and PostgreSQL.

Gearman itself is written in C, and is super simple. If you get a chance, I highly recommend checking out the source code. Note: gearman was originally written in Perl and later re-written in C. Be sure not to use the perl version (e.g. dev-perl/Gearman* in Gentoo portage).

The main reason I like gearmand is its simplicity. Gearman has three parts to it:

  1. GearmanClient submits tasks to the job queue

  2. gearmand is the job queue itself (running as a daemon)

  3. GearmanWorker retrieves the tasks from the job queue and handles them

Gearman Communication Diagram-1


By default, the Gearman queue is stored in memory; however, you can also make it persistent and store it in MySQL, PostgreSQL, memcached or SQLite. With memcached, obviously, if it's on the same machine as gearmand then you're likely to lose it just as easily as the regular in-memory queue; the only difference is that you could restart gearmand without losing the queue.

However, another potential option is to use the new MySQL 5.6 NoSQL Interface, which supports the memcached protocol. This should be faster than using the Gearman MySQL backend without sacrificing the persistence it brings.

It obviously has the ability to run background jobs, seeing as that is what this post is all about, but it also supports foreground jobs, which allow the GearmanClient and the GearmanWorker to communicate with each other using gearmand as the middle-man.

The best thing about Gearman, is that you can use different languages for different pieces. So you build your website in PHP, but maybe it's not the best option for wrangling text; so you schedule a job with gearmand, and a Python worker picks it up. Or Ruby, or Node.js, or… you get the idea.

What this allows us to do is pick the correct tool for every task in our stack. Why work around the pitfalls of our primary language when you can simply pick up a better tool and do things right?

Using Gearman

First we are going to use PHP to schedule a task with the job queue. This uses the pecl/gearman extension.

function createBackgroundJob($task, $data = array()) {
    $client = new \GearmanClient();
    $client->addServer(/* Defaults to 127.0.0.1, 4730 */);
    $handle = $client->doBackground($task, json_encode($data));

    if ($client->returnCode() != GEARMAN_SUCCESS) {
        return false;
    }

    return $handle;
}

In this simple example we create an instance of the \GearmanClient class, tell it to connect to the default server (localhost:4730) and send a background task ($client->doBackground()).

Next we ensure that the task was added successfully, and return the job handle.

We might call it with something like this, passing in the username:

$handle = createBackgroundJob('sendWelcomeEmail', ['username' => 'dshafik']);

We would then want to store the handle so that we can later check the status of the task.

The Worker

Next we'll create a worker, this time in Ruby:

require 'rubygems'
require 'gearman'
require 'json'

servers = ['localhost:4730']
worker = Gearman::Worker.new(servers)

# Add a handler for the "sendWelcomeEmail" task
worker.add_ability('sendWelcomeEmail') do |data,job|
    data = JSON.parse data
    user = User.first(:conditions => [ "username = ?", data["username"] ])
    user.sendWelcomeEmail();
end
loop { worker.work }

Here we use the gearman-ruby gem to create a Gearman::Worker, and then register the task handler.

In this case, we first decode the JSON data passed in from our GearmanClient and then find our user in the database by the username. We then call the sendWelcomeEmail method.

For something that takes more time, you could send back a running status. The job variable is an instance of Gearman::Worker::Job class which allows you to respond using job.report_status(numerator, denominator).

It's important to note that you can run as many workers for each task as you'd like; Gearman will not hand the same job to multiple workers (however, there is a retry config option should a job fail), and because the workers pull jobs it will not overload them, though you may run out of workers. The number of workers you run can also act as a way to manage priority – higher priority jobs get more workers – and to balance resources.

Checking the Status

Finally, we'll need a way to check the status of the request. For this we'll use Node.js/JavaScript. In our case we are only looking to see if the job has completed, as we haven't sent any other status.

var http = require('http'), 
    url = require("url"),
    querystring = require("querystring"),
    gearman = require("gearman");

var server = http.createServer(function (request, response) {
    var client = gearman.createClient();
    var query = url.parse(request.url, true).query;

    if (!("handle" in query)) {
        response.writeHead(404, {"Content-Type": "text/plain"});
        response.end("Job not found!\n");
    } else {
        var status = { };
        client.getJobStatus(query.handle, function(s) { 
            if (s) {
                status = s;
            }

            response.writeHead(200, {"Content-Type": "application/json"});
            response.end(JSON.stringify(status));
        });
    }
});

server.listen(8000);

This creates an HTTP server on port 8000 that when passed a handle via GET arguments will return the status.

Using Gearman with Engine Yard Cloud

In order to make Gearman a part of the background job processes on your Engine Yard Cloud account, it is necessary to create a custom chef recipe to compile it yourself (chef recipes can be used to take advantage of software outside of the current stack). For more details on using Chef with Engine Yard Cloud, check out our knowledge base.

As with all background jobs, best practice is to run Gearman on a Utility Instance, so that jobs are processed without interfering with the Application Instances themselves.

Can't we all just get along?

So, as you can see, Gearman can act like glue between the various parts of your application. It's super fast, has low resource usage and can be used with almost any language you can think of.

Additionally, it can not only run foreground tasks (with communication back to the client), but can also prioritize jobs into high, standard, and low priority queues.
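With the PHP extension, the background variants map to separate client methods; a quick sketch reusing the client and payload from earlier:

// Queue the same kind of background task at different priorities.
$client->doHighBackground('sendWelcomeEmail', json_encode($data));
$client->doBackground('sendWelcomeEmail', json_encode($data));
$client->doLowBackground('sendWelcomeEmail', json_encode($data));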

You can also easily scale Gearman, as both clients and workers support multiple servers, allowing you to spread your queue and your workers out over multiple machines.
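For instance, a PHP client can be pointed at a comma-separated list of job servers (the host names below are placeholders); the Ruby worker above would likewise just list several entries in its servers array:

// Register multiple job servers; jobs will be spread across them.
$client = new \GearmanClient();
$client->addServers('gearman1.example.com:4730,gearman2.example.com:4730');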

I highly recommend checking it out at http://gearman.org.

 

The post Polyglot Background Jobs appeared first on Engine Yard Developer Blog.

Categories: Programming

June 21, 2013: This Week at Engine Yard

Engine Yard Blog - Fri, 06/21/2013 - 22:38

We've been very busy on a number of exciting projects that we're looking forward to sharing with you all. While I can’t tell you what they are just yet, I can point you to some adorable animal photos.

Have an awesome weekend!

--Tasha Drew, Product Manager

Engineering Updates

PHP customers will be excited to know we’ve released Composer support! Composer is a dependency manager for PHP and allows developers to specify project dependencies in a composer.json file - Composer then handles the rest. Big props to the legendary Ben Chapman for his work getting this ready -- read all about it in our docs!

We’ve released a feature into Early Access to improve snapshot management for our customers. This feature allows you to see the snapshots attached to your environments and delete them. You can also set a default policy via the UI for how long you want to keep snapshots around. If you enable the feature, the default retention limit is 90 days, so if you have a stopped environment with some very old snapshots in it, talk to support before you enable it. Hopefully this will make snapshot management significantly easier and more transparent!

Data Data Data

This week Ines and the team worked on some deep-dive projects dealing with how backups will be performed on new relational clusters going forward. This forward-looking work will gradually surface as we roll out new cluster types, enhancements, and a few new features.

Also, for the Postgres fans out there, Tom Lane’s excellent SF-PUG presentation on The Architecture of the PostgreSQL query planner is up!

Social Calendar (Come say hi!)

Tuesday June 25th, 6:30pm: San Francisco Office: Let’s meet up and talk about LevelDB and Node! Speakers this week include Dominic Tarr, Rod Vagg, Jake Verbaten, Paolo Fragomeni, and Mikeal Rogers.

Tuesday June 25th, 6:30pm: Buffalo Office: Girl Develop It! is teaching an Intro to HTML & CSS course, the third in a series of four.

Wednesday June 26th, 6:30pm: Portland Office: We will be hosting Coder Dojo for students K-12 to learn about software. Parents welcome to attend and participate!

NodeConf (June 27-29, Walker Creek Ranch, CA). We’re sponsoring - if you see Engine Yard t-shirts, come by and say hello! We’re excited to be sponsoring NodeConf for the first time, summer camp style.

Thursday June 27th, 6:30pm: Dublin, Ireland Office: Node.js Dublin will be investigating all things Node. Grab a ticket here!

EuRuKo (June 29-30, Athens, Greece). Stop by the Engine Yard booth! Grab a t-shirt and some swag, learn what’s new with Cloud and meet our awesome community manager, Kelsey Schimmelman.

Lonestar PHP (June 29-30, Dallas, TX). We’re excited to be returning to Lonestar PHP--if you see Davey Shafik walking around, say hi!

Articles of Interest

Google finally admits that those crazy brain teasers do not, in fact, indicate anything about how good a hire an engineer is going to be. But they’re definitely fun to dream up.

The post June 21, 2013: This Week at Engine Yard appeared first on Engine Yard Developer Blog.

Categories: Programming

How to Troubleshoot PostgreSQL Alerts

Engine Yard Blog - Fri, 06/21/2013 - 17:17

So you have your PostgreSQL application deployed on Engine Yard Cloud and everything is going great. You have enabled a few extensions, have added basic redundancy by spinning up a database replica, and are busy developing new features. One day though, you look at the dashboard and see an alert message.

What do these alerts mean? Is the database at risk? Should you escalate to support? This post will help you understand PostgreSQL dashboard alerts and correlate them to the health of your database and application.

Monitoring and the checkpoint check

The alerts in question popped up in one of our mission-critical applications. This post discusses the steps I took to troubleshoot the cause of the problem and the resolution. But first, a little bit of background.

We monitor the health of your PostgreSQL database using a combination of our own custom checks and Bucardo’s check_postgres scripts. I’ll wave a big wand here and tell you that either Collectd or Nagios (depending on your stack and features) consumes the results of these checks and presents them on the Engine Yard dashboard.

The following documentation page provides an explanation of the alerts Engine Yard issues for PostgreSQL. Today I’ll focus only on the alert I received, but refer to the documentation if you see something different in your application’s dashboard.

Let’s take a closer look at the message:

POSTGRES_CHECKPOINT CRITICAL: Last checkpoint was 16204 seconds ago

I know from this message that the checkpoint check originated the alert and the severity of the alert is critical. In human talk, the message means that the database has not had a checkpoint for about 4.5 hours!  Here is another example:

POSTGRES_CHECKPOINT WARNING: Last checkpoint was 1265 seconds ago

This message means that the database has not had a checkpoint for about 21 minutes.

We issue a WARNING severity when checkpoint delays range from 20 to 30 minutes. For anything that exceeds 30 minutes, the severity of the alert goes to CRITICAL.

A checkpoint is a point in the transaction log sequence at which all data files have been updated to reflect the information in the log and flushed to disk. If your system crashes, recovery will start from the last known checkpoint. So the checkpoint check helps us confirm two things: that your database consistently advances the position from which recovery would start, and, in the case of replicas, that your standby is keeping up with its master (since the activity the replica sees is what the master has sent it).

For more information about checkpoints and replication, please refer to the PostgreSQL replication and write-ahead log (WAL) documentation.

Back to my App

Now we understand that the alert I received means that there was a problem with the database replica and its ability to checkpoint. The database logs showed nothing out of order, so I logged into the server console and discovered the following:

# psql
psql (9.1.9, server 9.1.3)
Type "help" for help.

The psql prompt showed me that there was a version mismatch between the installed database binaries (9.1.9) and the running database server process (9.1.3). This typically happens when a stack update (that includes a minor version bump of your database) is applied on a running environment and the database process is not restarted. The database server is left in a state where it's effectively running two versions at the same time. To ensure that the PostgreSQL process is running the latest version of the database, you MUST always restart the database process after upgrading your environment.

The stack and version update was absolutely necessary as it included critical security patches - see April 4, 2013 - PostgreSQL security update. But when the stack was applied, the person who applied the stack update didn’t restart the PostgreSQL process as outlined in the upgrade instructions. This is typically not a problem, as replication is known to work between patch-level versions, but we hit a replication bug in the 9.1.3 to 9.1.9 upgrade which caused replication to break.

Our Solution

So in a nutshell, our database replica became unable to receive WAL archives from its master, checkpoints started falling behind, and we were alerted. Restarting the database process would have solved the problem, but instead we decided to use the maintenance window to upgrade the server to PostgreSQL 9.2 and create a new replica.

I performed an in-place upgrade of the database master (something that professional services has a lot of experience with) and within minutes the application was back online running the latest version of PostgreSQL.

But troubleshooting this alert made me aware of the issues with our current upgrade process:

  • Documentation on alerts was lacking. There is no place to quickly look up the alerts we present in the UI and their meaning.
  • Our upgrade message did not remind us to restart the database process (though the release notes did).
  • An unexpected replication bug between patch versions caused my database replica to become stale.

Here is what we’ll do to make sure you don’t experience the problems I did last week.

New PostgreSQL alert documentation

PostgreSQL alerts will be explained in a new documentation page. We’ll work on documenting MySQL and Riak alerts as well.

Improved stack upgrade messages

We will enhance stack release notes with icons to visually indicate if a process restart is needed when a new version of a database is available.


Ability to lock your database version

Without a doubt, we want customers to keep their database stacks up to date with security releases and patches. But it would be fantastic to be able to lock your entire database version (to the patch level) and still receive stack updates.

We have developed (and are internally testing) a toggle to lock your database version. With this feature, I can schedule a maintenance window (to restart the database process when I’m ready) while continuing to receive stack updates. We are still working on documentation, but if this feature is something that interests you, please open a support ticket and let me know. It should be in limited access soon!

Hopefully you now have a little more context and information available to interpret the alerts we display in your environment. Exciting things are happening in Engine Yard’s Data stack (think new clusters!). A little hint for the curious: Tasha Drew’s excellent weekly engineering recap always includes juicy details on what we are up to ;)

The post How to Troubleshoot PostgreSQL Alerts appeared first on Engine Yard Developer Blog.

Categories: Programming

Announcing Composer Support

Engine Yard Blog - Tue, 06/18/2013 - 21:55

We’re pleased to announce Composer support for PHP applications.

This has been one of our most requested features, and should make it even easier for you to manage your apps. If you’re already using Composer, you can dive right in. If not, now is a great time to try it out. We recommend Composer for all PHP apps!

What is Composer?

Composer is a popular dependency manager for PHP. With it, you can specify project dependencies in a composer.json file and Composer will automatically handle the rest. For more information about Composer, take a look at the project website.
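As a quick illustration, a minimal composer.json might look something like this (the package and version constraint are just examples):

{
    "require": {
        "monolog/monolog": "~1.0"
    }
}

Running composer install then pulls the listed packages into the vendor/ directory and writes a composer.lock file recording the exact versions that were resolved.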

Why is It Useful?

Composer allows you to manage third-party dependencies separately from your code, decluttering your repository. What’s more, it makes updating your dependencies a snap. Just run composer update and Composer will fetch the latest compatible versions.

How Can I Use It?

Using Composer with Engine Yard is very simple. We’ll detect the presence of a composer.lock file in your repository, and automatically install your app’s dependencies. To get started with Composer for Engine Yard, take a look at the documentation.

The post Announcing Composer Support appeared first on Engine Yard Developer Blog.

Categories: Programming

June 14, 2013: This Week at Engine Yard

Engine Yard Blog - Fri, 06/14/2013 - 17:17

This is the week a big chunk of the San Francisco development team went on a roadtrip to our Portland office to do some intense cross-office feature pollination. Things may have started out with some office rivalry, but developers quickly overcame any differences to work together to build, drink copious amounts of amazing coffee, and figure out the location of some of the awesome restaurants Portland has to offer. Pro-Tip: check out Blue Star donuts #amazing.

--Tasha Drew, Product Manager

Engineering Updates

Customer feedback is important to us and is a key part of how we prioritize work within our product management process. We received a few comments from customers who were frustrated because they couldn’t figure out why they were being charged money when they didn’t have any running instances. The answer was that they still had IP addresses that were detached from instances when the instances were terminated, but not deleted.

Customers can always see IP addresses and manage them in the dashboard by going to Tools -> IP addresses, but we decided to add more messaging to call this out to people.

Going forward, you will see a dashboard notice if you delete an instance and don’t delete the IP address - and you will also receive an email. We will also email any customer whose account is being billed only for IP addresses and snapshots to let them know.

Hope this helps going forward! Big thanks to one of our newest platform developers, the amazing Daniela, for turning this request around so quickly.

We’re also wrapping up some cool new features around snapshot management which you should be reading about in this space next week!

Data Data Data

Our lead data engineer, Ines, has been busy working on the underlying code for exciting new features that we’ll be rolling out in the next few months. She also handed off a new feature that allows for database version locking to alleviate upgrade pains. The DBA team is actively testing and improving it and we should make it available soon. Watch out for her blogpost next week.

Ines and I were delighted to get to meet up with local Postgres ladies while we were in Portland. Selena Deckelmann has some great thoughts on the intersection of developers and Operations on her blog for those of you who need some fun weekend reading.  Kris Pennella gave me a valuable reminder to take a deep breath when facing stressful situations in her blog, “3 Tips Channeling a Negative into a Positive.”

We also had the pleasure of seeing Basho’s Eric Redmond (author of 7 Databases in 7 weeks and the Little Riak Book). We got a chance to hear some of the features that will come in Riak 1.4 and we are very excited!

Social Calendar (Come say hi!)

Friday June 14 - Saturday June 15: DevOps Days Amsterdam!: Meet the always charming Slava and the ridiculously knowledgeable Richard as they hang out and participate in this awesome DevOps conference where we are not only a PaaS -- we are also a cake.

Tuesday June 18, 19:00: Ruby Ireland Meetup at Engine Yard Dublin. We are Going off The Rails this month at Ruby Ireland as we go through some of the options for extending your web apps with mobile apps or through a Javascript framework. Kevin Fagan, Fergal Condron, Simon Rand, Gavin Joyce and Paul Watson will be speaking.

Thursday June 20 - Friday June 21st: Lyon, France, Ruby Lugdunum: Crowd favorite Engine Yard engineer PJ Hagerty will be presenting at Ruby Lugdunum in exotic Lyon, France, on how to grow and nurture your local Ruby group.

Thursday June 20, 18:30: Open Data Ireland #8 at Engine Yard Dublin. The general theme for ODI Meetup #8 is 'Open Government Partnership'. This meetup will be facilitated by Denis Parfenov, Tom Stewart and Nuala Haughey. We'll be hosting a brief presentation from an OGP representative. The rest of the evening will be dedicated to building topic-specific, multi-stakeholder/multi-disciplinary working groups with a view to taking an active part in co-drafting/crowdsourcing Ireland’s first national Action Plan around OGP principles.

Thursday June 20, 19:00: Engine Yard’s Buffalo Offices: Riak is bustin’ out all over in June, a meetup led by renowned Riakifier Dave Parfitt.

Articles of Interest

Drink coffee: avoid death! The New York Times tells us exactly what we’ve been hoping to hear.

Nobody Understands the GIL: Jesse Storimer explores MRI and analyzes functions for thread safety.

And for our distributed systems fans (that’s everyone, right?) a deep dive into non-blocking transactional atomicity by Peter Bailis.

Call me maybe: Kyle Kingsbury’s summary post on Jepsen looking at how various databases handle network partitions.

The post June 14, 2013: This Week at Engine Yard appeared first on Engine Yard Developer Blog.

Categories: Programming

You Cannot Win Engineering

Engine Yard Blog - Thu, 06/13/2013 - 20:51

For as long as I can remember, I’ve been a fan of Saturday Night Live and improvisational theater. Improv looks chaotic and uncontrolled, but the best practitioners operate under strict rules that govern interactions between players. Some of the most successful entertainers today, people like Stephen Colbert and Tina Fey, directly credit what they have learned in improv with making them better at what they do both on and off screen.

Unlike workplace policies that you are probably used to, the rules of improv aren’t meant to constrain you, but to open you up to the ideas of others. Let’s take a look at some of the rules that Mr. Colbert and Ms. Fey live by and see how they can improve team collaboration.

Agree and Say “Yes”.

Here’s Tina, from her book Bossypants, talking about the rules of engagement:

The first rule of improvisation is AGREE. Always agree and SAY YES. When you’re improvising, this means you are required to agree with whatever your partner has created. So if we’re improvising and I say, “Freeze, I have a gun,” and you say, “That’s not a gun. It’s your finger. You’re pointing your finger at me,” our improvised scene has ground to a halt. But if I say, “Freeze, I have a gun!” and you say, “Yes! The gun I gave you for Christmas! You bastard!” then we have started a scene because we have AGREED that my finger is in fact a Christmas gun.

The same is true of engineering teams. When one of your teammates has an idea, your first response needs to be affirmative. Take any and all ideas from your teammates as positive contributions and you start from a place of being open-minded and welcoming. Nothing kills team morale faster than someone who says “No, that won’t work” in response to any idea that they didn’t come up with.

It’s Not Just “Yes”, it’s “Yes, and…”

Everyone loves games, and games are more fun when everyone plays nicely. Make positive contributions and you will foster a spirit of openness, collaboration and — dare I say — fun. Make it your habit to answer your teammates’ ideas with “Yes, and…” instead of “No, because”. Always offer your ideas; you are just as entitled to be silly and wrong as everyone else. Ideas seldom spring fully formed from the head of Zeus, and the part you’re holding back out of fear might be the thing that makes it work. “Yes, and…” makes you part of the solution; “No, because” makes you part of the problem.

Your Team is the Most Important Person on Your Team

 Stephen Colbert went back to his alma mater, Northwestern University, to give the commencement address in 2011. He may play a know-it-all blowhard on The Colbert Report, but that’s clearly not the case in real life. Here’s an excerpt from his speech:

…One of the things I was taught early on is that you are not the most important person in the scene. Everybody else is. And if they are the most important people in the scene, you will naturally pay attention to them and serve them. But the good news is you're in the scene too. So hopefully to them you're the most important person, and they will serve you. No one is leading, you're all following the follower, serving the servant.

You cannot win improv.

And life is an improvisation. You have no idea what's going to happen next and you are mostly just making things up as you go along.

And like improv, you cannot win your life.

The software corollary to this is: “You cannot win engineering”.

Think about the implications of this for a moment. If everyone on your team acts as if their teammates are more important than they are, you create an environment of support, giving, and progress that is mutually enriching and productive. You’ll know you have succeeded when no one on your team remembers where a great idea came from. More importantly, no one will care.

When one of your teammates asks you a question, don’t tell them to Google it (which is a bit of a jerk response in any case). Act as if their problems are more important than yours, serve the team by serving them. When you are stuck on a problem, they will treat you the same way.

None of these rules for improvisation will make you funnier or get you a slot on Weekend Update, but applying them to your co-workers will almost certainly make your team awesome. Everyone wins.

The post You Cannot Win Engineering appeared first on Engine Yard Developer Blog.

Categories: Programming

June 7, 2013: This Week at Engine Yard

Engine Yard Blog - Fri, 06/07/2013 - 23:41

Things are pretty busy right now as we ship a bunch of customer enhancements on Engine Yard Cloud and continue with our planned infrastructure abstractions and cluster model improvements. Exciting things to come! In the meantime, here’s what’s available as of this week.

--Tasha Drew, Product Manager

Engineering Updates

Now in GA: Application takeover preferences. Based on your application's customizations, you might not want to use the default application takeover behavior we've developed to automatically promote your application slaves when the app master goes away or becomes totally unresponsive for some reason.

Engine Yard Cloud now provides two automated options for replacing capacity in an application takeover situation. We also provide alternatives if you need to handle part or all of an app takeover yourself.

We now have Provisioned IOPs and EBS Optimized instances available for customers to use in Early Access! To enable them for your environment from your cloud dashboard, click the Tools menu -> Early Access, and then enable “EBS Optimized Instances” and “Provisioned IOPs.”

Keep in mind that they work best in tandem, and they will only be an option on instances booted after you enable the feature.

Data Data Data

Databases love I/O, and provisioned IOPs and EBS optimized instances are very well suited to applications where the database can use more performance (think backups and snapshots too).

You can enhance the performance of your application by putting a volume with provisioned IOPs on the database master. If your application has already been deployed, you can add new replicas (that have this performance boost) to the environment and have one of them promoted to master.
As usual, don’t hesitate to ask us whether PIOPs or EBS optimized instances can give your database a boost.

Social Calendar (Come say hi!)

Tuesday, June 11th: Our Buffalo office will be hosting the WNY Ruby Meetup Group. Mark Josef will be providing us with some code katas.

Wednesday, June 12th: Our PDX office will be hosting the weekly CoderDojo K-12 night, ably assisted by one of the San Francisco sprint teams, who will be on site for an off site (as it were).

Wednesday, June 12th: Girl Develop It will be doing a Code and Coffee night in our Buffalo office. The participants will be focusing on honing their skills and working in groups. Swing by for the whole thing or just for a part of it.

Friday, June 14th: DevOps Day Amsterdam will be happening! Be sure to meet our own Slava and Richard and let them tell you about how Engine Yard can make your lives easier.

Articles of Interest

Mozilla's John O’Dunn discusses how to use release engineering as a force-multiplier!

David Padilla explains why hash lookups are so fast in Ruby on the Engine Yard blog.

The post June 7, 2013: This Week at Engine Yard appeared first on Engine Yard Developer Blog.

Categories: Programming