Software Development Blogs: Programming, Software Testing, Agile, Project Management

Feed aggregator

Running an application using Kubernetes on AWS

Agile Testing - Grig Gheorghiu - Wed, 11/23/2016 - 02:13
I've been knee-deep in Kubernetes for the past few weeks and to say that I like it is an understatement. It's exhilarating to have at your fingertips a distributed platform created by Google's massive brain power.

I'll jump right in and talk about how I installed Kubernetes in AWS and how I created various resources in Kubernetes in order to run a database-backed PHP-based web application.

Installing Kubernetes

I used the tack tool from my laptop running OSX to spin up a Kubernetes cluster in AWS. Tack uses terraform under the hood, which I liked a lot because it makes it very easy to delete all AWS resources and start from scratch while you are experimenting with it. I went with the tack defaults and spun up 3 m3.medium EC2 instances for running etcd and the Kubernetes API, the scheduler and the controller manager in an HA configuration. Tack also provisioned 3 m3.medium EC2 instances as Kubernetes workers/minions, in an EC2 auto-scaling group. Finally, tack spun up a t2.nano EC2 instance to serve as a bastion host for getting access into the Kubernetes cluster. All 7 EC2 instances launched by tack run CoreOS.

Using kubectl

Tack also installs kubectl, which is the Kubernetes command-line management tool. I used kubectl to create the various Kubernetes resources needed to run my application: deployments, services, secrets, config maps, persistent volumes etc. It pays to become familiar with the syntax and arguments of kubectl.

Creating namespaces

One thing I needed to do right off the bat was to think about ways to achieve multi-tenancy in my Kubernetes cluster. This is done with namespaces. Here's my namespace.yaml file:

$ cat namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: tenant1

To create the namespace tenant1, I used kubectl create:

$ kubectl create -f namespace.yaml

To list all namespaces:

$ kubectl get namespaces
NAME          STATUS    AGE
default       Active    12d
kube-system   Active    12d
tenant1       Active    11d 

If you don't need a dedicated namespace per tenant, you can just run kubectl commands in the 'default' namespace.
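
With more than a handful of tenants, one hand-written namespace.yaml per tenant gets repetitive. Below is a minimal Python sketch, not something taken from the setup described here, that generates the same kind of manifest as JSON (kubectl create -f accepts JSON as well as YAML). The function name namespace_manifest and the tenant name tenant2 are hypothetical.

```python
import json


def namespace_manifest(name):
    """Return a Namespace manifest like namespace.yaml, as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name},
    }


if __name__ == "__main__":
    # Hypothetical tenant names; print one JSON manifest per tenant.
    for tenant in ["tenant1", "tenant2"]:
        print(json.dumps(namespace_manifest(tenant), indent=2))
```

Each printed document can be saved to a file and passed to kubectl create -f in the same way as namespace.yaml.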

Creating persistent volumes, storage classes and persistent volume claims

I'll show how you can create two types of Kubernetes persistent volumes in AWS: one based on EFS, and one based on EBS. I chose the EFS one for my web application layer, for things such as shared configuration and media files. I chose the EBS one for my database layer, to be mounted as the data volume.

First, I created an EFS share using the AWS console (I recommend using terraform to do it automatically, but I am not there yet). I allowed the Kubernetes worker security group to access this share. I noted one of the DNS names available for it, e.g. us-west-2a.fs-c830ab1c.efs.us-west-2.amazonaws.com. I used this Kubernetes manifest to define a persistent volume (PV) based on this EFS share:

$ cat web-pv-efs.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-efs-web
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: us-west-2a.fs-c830ab1c.efs.us-west-2.amazonaws.com
    path: "/"

To create the PV, I used kubectl create, and I also specified the namespace tenant1:

$ kubectl create -f web-pv-efs.yaml --namespace tenant1

However, creating a PV is not sufficient. Pods use persistent volume claims (PVC) to refer to persistent volumes in their manifests. So I had to create a PVC:

$ cat web-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: web-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi

$ kubectl create -f web-pvc.yaml --namespace tenant1

Note that a PVC does not refer directly to a PV. The storage specified in the PVC is provisioned from available persistent volumes.
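
To make the claim-to-volume matching more concrete, here is a rough Python model of it: a claim is bound to an unclaimed volume that offers the requested access modes and at least the requested capacity, preferring the smallest volume that fits. This is a simplified sketch of the idea, not the actual Kubernetes binder. The dictionary layout, the claimed_by field and the pv-small volume are assumptions made for the illustration.

```python
UNITS = {"Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(quantity):
    """Convert a quantity such as '50Gi' to bytes (binary suffixes only)."""
    number, suffix = quantity[:-2], quantity[-2:]
    return int(number) * UNITS[suffix]


def find_volume(claim, volumes):
    """Return the name of the smallest unclaimed PV that satisfies the claim, or None."""
    candidates = [
        pv for pv in volumes
        if pv["claimed_by"] is None
        and set(claim["accessModes"]) <= set(pv["accessModes"])
        and parse_quantity(pv["storage"]) >= parse_quantity(claim["storage"])
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda pv: parse_quantity(pv["storage"]))
    return best["name"]


volumes = [
    {"name": "pv-efs-web", "storage": "50Gi",
     "accessModes": ["ReadWriteMany"], "claimed_by": None},
    # Hypothetical second volume, too small and with the wrong access mode.
    {"name": "pv-small", "storage": "10Gi",
     "accessModes": ["ReadWriteOnce"], "claimed_by": None},
]
web_pvc = {"name": "web-pvc", "storage": "50Gi", "accessModes": ["ReadWriteMany"]}
print(find_volume(web_pvc, volumes))
```

In this model web-pvc ends up on pv-efs-web because it is the only volume that is large enough and supports ReadWriteMany.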

Instead of defining a persistent volume for the EBS volume I wanted to use for the database, I created a storage class:

$ cat db-storageclass-ebs.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: db-ebs
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2

$ kubectl create -f db-storageclass-ebs.yaml --namespace tenant1

I also created a PVC which does refer directly to the storage class name db-ebs. When the PVC is used in a pod, the underlying resource (i.e. the EBS volume in this case) will be automatically provisioned by Kubernetes.

$ cat db-pvc-ebs.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-pvc-ebs
  annotations:
     volume.beta.kubernetes.io/storage-class: 'db-ebs'
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi

$ kubectl create -f db-pvc-ebs.yaml --namespace tenant1

To list the newly created resources, you can use:

$ kubectl get pv,pvc,storageclass --namespace tenant1

Creating secrets and ConfigMaps

I followed the "Persistent Installation of MySQL and Wordpress on Kubernetes" guide to figure out how to create and use Kubernetes secrets. Here is how to create a secret for the MySQL root password, necessary when you spin up a pod based on a Percona or plain MySQL image:
$ echo -n $MYSQL_ROOT_PASSWORD > mysql-root-pass.secret
$ kubectl create secret generic mysql-root-pass --from-file=mysql-root-pass.secret --namespace tenant1 
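
Two details of this command are easy to miss: the key inside the secret is the file name (mysql-root-pass.secret), and the file content is stored verbatim in base64, which is why echo -n matters. The Python sketch below models the resulting Secret object under those assumptions. The function name generic_secret and the example password are hypothetical, and the dict is only an approximation of what kubectl sends to the API server.

```python
import base64


def generic_secret(name, namespace, filename, content):
    """Model the Secret created by 'kubectl create secret generic --from-file'."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        # The key is the file name; the value is the base64 of the raw bytes.
        "data": {filename: base64.b64encode(content).decode("ascii")},
    }


# Hypothetical password, with and without a trailing newline.
clean = generic_secret("mysql-root-pass", "tenant1",
                       "mysql-root-pass.secret", b"example-password")
dirty = generic_secret("mysql-root-pass", "tenant1",
                       "mysql-root-pass.secret", b"example-password\n")
print(clean["data"] == dirty["data"])
```

The two data values differ, so a password written with a plain echo would carry an extra newline into the container.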

Kubernetes also has the handy notion of ConfigMap, a resource where you can store either entire configuration files, or key/value properties that you can then use in other Kubernetes resource definitions. For example, I save the GitHub branch and commit environment variables for the code I deploy in a ConfigMap:
$ kubectl create configmap git-config --namespace tenant1 \
                 --from-literal=GIT_BRANCH=$GIT_BRANCH \
                 --from-literal=GIT_COMMIT=$GIT_COMMIT

I'll show how to use secrets and ConfigMaps in pod definitions a bit later on.

Creating an ECR image pull secret and a service account

We use AWS ECR to store our Docker images. Kubernetes can access images stored in ECR, but you need to jump through a couple of hoops to make that happen. First, you need to create a Kubernetes secret of type dockerconfigjson which encapsulates the ECR credentials in base64 format. Here's a shell script that generates a file called ecr-pull-secret.yaml:



PASSWORD=$(aws --profile default --region us-west-2 ecr get-login | cut -d ' ' -f 6)


cat > ecr-pull-secret.yaml << EOF
apiVersion: v1
kind: Secret
metadata:
  name: ecr-key
  namespace: tenant1
data:
  .dockerconfigjson: $BASE64CONFIG
type: kubernetes.io/dockerconfigjson
EOF
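
The value stored under .dockerconfigjson is a base64-encoded Docker config document. As an illustration of what that value contains, here is a minimal Python sketch that builds it from a registry URL and the password returned by aws ecr get-login. It assumes the usual "auths" layout of a Docker config file and the AWS user name that ECR logins use; the function name docker_config_b64 and the example token are hypothetical.

```python
import base64
import json


def docker_config_b64(registry, password, username="AWS"):
    """Return the base64 text for the .dockerconfigjson field of the Secret."""
    # Docker stores "user:password" in base64 under the "auth" key.
    auth = base64.b64encode(f"{username}:{password}".encode()).decode("ascii")
    config = {
        "auths": {
            registry: {"username": username, "password": password, "auth": auth}
        }
    }
    return base64.b64encode(json.dumps(config).encode()).decode("ascii")


if __name__ == "__main__":
    # Placeholder registry and token, for illustration only.
    print(docker_config_b64(
        "https://MY_ECR_ID.dkr.ecr.us-west-2.amazonaws.com", "example-token"))
```

Keep in mind that ECR login tokens expire, so a secret built this way has to be regenerated periodically.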


Once you run the script and generate the file, you can then define a Kubernetes service account that will use this secret:

$ cat service-account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: tenant1
  name: tenant1-dev
imagePullSecrets:
 - name: ecr-key

Note that the service account refers to the ecr-key secret in the imagePullSecrets property.

As usual, kubectl create will create these resources based on their manifests:

$ kubectl create -f ecr-pull-secret.yaml
$ kubectl create -f service-account.yaml

Creating deployments

The atomic unit of scheduling in Kubernetes is a pod. You don't usually create a pod directly (though you can, and I'll show you a case where it makes sense.) Instead, you create a deployment, which keeps track of how many pod replicas you need, and spins up the exact number of pods to fulfill your requirement. A deployment actually creates a replica set under the covers, but in general you don't deal with replica sets directly. Note that deployments are the new recommended way to create multiple pods. The old way, which is still predominant in the documentation, was to use replication controllers.

Here's my deployment manifest for a pod running a database image:

$ cat db-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: db-deployment
  labels:
    app: myapp
spec:
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: myapp
        tier: db
    spec:
      containers:
      - name: db
        image: MY_ECR_ID.dkr.ecr.us-west-2.amazonaws.com/myapp-db:tenant1
        imagePullPolicy: Always
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-root-pass
              key: mysql-root-pass.secret
        - name: MYSQL_DATABASE
          valueFrom:
            configMapKeyRef:
              name: tenant1-config
              key: MYSQL_DATABASE
        - name: MYSQL_USER
          valueFrom:
            configMapKeyRef:
              name: tenant1-config
              key: MYSQL_USER
        - name: MYSQL_DUMP_FILE
          valueFrom:
            configMapKeyRef:
              name: tenant1-config
              key: MYSQL_DUMP_FILE
        - name: S3_BUCKET
          valueFrom:
            configMapKeyRef:
              name: tenant1-config
              key: S3_BUCKET
        ports:
        - containerPort: 3306
          name: mysql
        volumeMounts:
        - name: ebs
          mountPath: /var/lib/mysql
      volumes:
      - name: ebs
        persistentVolumeClaim:
          claimName:  db-pvc-ebs
      serviceAccount: tenant1-dev

The template section specifies the elements necessary for spinning up new pods. Of particular importance are the labels, which, as we will see, are used by services to select pods that are included in a given service.  The image property specifies the ECR Docker image used to spin up new containers. In my case, the image is called myapp-db and it is tagged with the tenant name tenant1. Here is the Dockerfile from which this image was generated:

$ cat Dockerfile
FROM mysql:5.6

# disable interactive functions
ARG DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
    apt-get install -y python-pip
RUN pip install awscli

VOLUME /var/lib/mysql

COPY etc/mysql/my.cnf /etc/mysql/my.cnf
COPY scripts/db_setup.sh /usr/local/bin/db_setup.sh

Nothing out of the ordinary here. The image is based on the mysql DockerHub image, specifically version 5.6. The my.cnf is getting added in as a customization, and a db_setup.sh script is copied over so it can be run at a later time.

Some other things to note about the deployment manifest:

  • I made pretty heavy use of secrets and ConfigMap key/values
  • I also used the db-pvc-ebs Persistent Volume Claim and mounted the underlying physical resource (an EBS volume in this case) as /var/lib/mysql
  • I used the tenant1-dev service account, which allows the deployment to pull down the container image from ECR
  • I didn't specify the number of replicas I wanted, which means that 1 pod will be created (the default)

To create the deployment, I ran kubectl:

$ kubectl create -f db-deployment.yaml --record --namespace tenant1

Note that I used the --record flag, which tells Kubernetes to keep a history of the commands used to create or update that deployment. You can show this history with the kubectl rollout history command:

$ kubectl --namespace tenant1 rollout history deployment db-deployment 

To list the running deployments, replica sets and pods, you can use:

$ kubectl get deployments,rs,pods --namespace tenant1 --show-all

Here is another example of a deployment manifest, this time for redis:

$ cat redis-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: redis-deployment
spec:
  replicas: 1
  minReadySeconds: 10
  template:
    metadata:
      labels:
        app: myapp
        tier: redis
    spec:
      containers:
        - name: redis
          command: ["redis-server", "/etc/redis/redis.conf", "--requirepass", "$(REDIS_PASSWORD)"]
          image: MY_ECR_ID.dkr.ecr.us-west-2.amazonaws.com/myapp-redis:tenant1
          imagePullPolicy: Always
          env:
          - name: REDIS_PASSWORD
            valueFrom:
              secretKeyRef:
                name: redis-pass
                key: redis-pass.secret
          ports:
          - containerPort: 6379
            protocol: TCP
      serviceAccount: tenant1-dev

One thing that is different from the db deployment is the way a secret (REDIS_PASSWORD) is used as a command-line parameter for the container command. Make sure you use in this case the syntax $(VARIABLE_NAME) because that's what Kubernetes expects.
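
To see why the $(VARIABLE_NAME) form matters, here is a small Python model of how such references in a container command are resolved against the container's environment. It is a sketch of the documented behavior, not Kubernetes code: references that cannot be resolved are left unchanged, and shell-style $VAR or ${VAR} is not touched because no shell is involved. The function name expand_command is hypothetical, and the $$(VAR) escape form is deliberately not modeled.

```python
import re

# Matches $(NAME) where NAME looks like an environment variable name.
_REFERENCE = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def expand_command(command, env):
    """Replace $(NAME) in each argument with env[NAME]; keep unknown names as-is."""
    def substitute(match):
        return env.get(match.group(1), match.group(0))
    return [_REFERENCE.sub(substitute, argument) for argument in command]


command = ["redis-server", "/etc/redis/redis.conf",
           "--requirepass", "$(REDIS_PASSWORD)"]
# Placeholder value standing in for the content of the redis-pass secret.
print(expand_command(command, {"REDIS_PASSWORD": "example"}))
```

With the shell-style "$REDIS_PASSWORD" in the manifest, the same function (and Kubernetes) would pass the literal string through to redis-server.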

Also note the labels, which have app: myapp in common with the db deployment, but a different value for tier, redis instead of db.

My last deployment example for now is the one for the web application pods:

$ cat web-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 2
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: myapp
        tier: frontend
    spec:
      containers:
      - name: web
        image: MY_ECR_ID.dkr.ecr.us-west-2.amazonaws.com/myapp-web:tenant1
        imagePullPolicy: Always
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: web-persistent-storage
          mountPath: /var/www/html/shared
      volumes:
      - name: web-persistent-storage
        persistentVolumeClaim:
          claimName: web-pvc
      serviceAccount: tenant1-dev

Note that replicas is set to 2, so that 2 pods will be launched and kept running at all times. The labels have the same common part app: myapp, but the tier is different, set to frontend.  The persistent volume claim web-pvc for the underlying physical EFS volume is used to mount /var/www/html/shared over EFS.

The image used for the container is derived from a stock ubuntu:14.04 DockerHub image, with apache and php 5.6 installed on top. Something along these lines:

FROM ubuntu:14.04

RUN apt-get update && \
    apt-get install -y ntp build-essential binutils zlib1g-dev telnet git acl lzop unzip mcrypt expat xsltproc python-pip curl language-pack-en-base && \
    pip install awscli

RUN export LC_ALL=en_US.UTF-8 && export LANG=en_US.UTF-8 && \
        apt-get install -y mysql-client-5.6 software-properties-common && add-apt-repository ppa:ondrej/php

RUN apt-get update && \
    apt-get install -y --allow-unauthenticated apache2 apache2-utils libapache2-mod-php5.6 php5.6 php5.6-mcrypt php5.6-curl php-pear php5.6-common php5.6-gd php5.6-dev php5.6-opcache php5.6-json php5.6-mysql

RUN apt-get remove -y libapache2-mod-php5 php7.0-cli php7.0-common php7.0-json php7.0-opcache php7.0-readline php7.0-xml

RUN curl -sSL https://getcomposer.org/composer.phar -o /usr/bin/composer \
    && chmod +x /usr/bin/composer \
    && composer selfupdate

COPY files/apache2-foreground /usr/local/bin/
RUN chmod +x /usr/local/bin/apache2-foreground
CMD bash /usr/local/bin/apache2-foreground

Creating services

In Kubernetes, you are not supposed to refer to individual pods when you want to target the containers running inside them. Instead, you need to use services, which provide endpoints for accessing a set of pods based on a set of labels.

Here is an example of a service for the db-deployment I created above:

$ cat db-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: db
  labels:
    app: myapp
spec:
  ports:
    - port: 3306
  selector:
    app: myapp
    tier: db
  clusterIP: None

Note the selector property, which is set to app: myapp and tier: db. By specifying these labels, we make sure that only the pods tagged with those labels will be included in this service. There is only one deployment whose pods carry those 2 labels, and that is db-deployment.
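
The selection rule itself is simple: a pod belongs to the service when every key/value pair of the selector is present in the pod's labels. Here is a minimal Python sketch of that rule, using the labels from the three deployments in this post. The pod names and the function name matches are hypothetical.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical pod names mapped to the labels from the pod templates.
pods = {
    "db-pod": {"app": "myapp", "tier": "db"},
    "redis-pod": {"app": "myapp", "tier": "redis"},
    "web-pod": {"app": "myapp", "tier": "frontend"},
}

db_selector = {"app": "myapp", "tier": "db"}
selected = [name for name, labels in pods.items() if matches(db_selector, labels)]
print(selected)
```

A selector with only app: myapp would match all three pods, which is why the tier label is needed to tell the services apart.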

Here are similar service manifests for the redis and web deployments:

$ cat redis-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: redis
  labels:
    app: myapp
spec:
  ports:
    - port: 6379
  selector:
    app: myapp
    tier: redis
  clusterIP: None

$ cat web-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: web
  labels:
    app: myapp
spec:
  ports:
    - port: 80
  selector:
    app: myapp
    tier: frontend
  type: LoadBalancer

The selector properties for each service are set so that the pods of the proper deployment are included in each service.

One important thing to note in the definition of the web service: its type is set to LoadBalancer. Since Kubernetes is AWS-aware, the service creation will create an actual ELB in AWS, so that the application can be accessible from the outside world. It turns out that this is not the best way to expose applications externally, since this LoadBalancer resource operates only at the TCP layer. What we need is a proper layer 7 load balancer, and in a future post I'll show how to use a Kubernetes ingress controller in conjunction with the traefik proxy to achieve that. In the meantime, here is a KubeCon presentation from Gerred Dillon on "Kubernetes Ingress: Your Router, Your Rules".

To create the services defined above, I used kubectl:

$ kubectl create -f db-service.yaml --namespace tenant1
$ kubectl create -f redis-service.yaml --namespace tenant1
$ kubectl create -f web-service.yaml --namespace tenant1

At this point, the web application can refer to the database 'host' in its configuration files by simply using the name of the database service, which is db in our example. Similarly, the web application can refer to the redis 'host' by using the name of the redis service, which is redis. The Kubernetes magic will make sure calls to db and redis are properly routed to their end destinations, which are the actual containers running those services.
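
The short names db and redis work because the web pods live in the same namespace as the services. From another namespace you would need a longer form of the name. The Python sketch below lists the name forms that cluster DNS typically resolves for a service. It assumes the default cluster domain cluster.local, which your cluster may override, and the function name service_dns_names is hypothetical.

```python
def service_dns_names(service, namespace, cluster_domain="cluster.local"):
    """Return the DNS names for a service, from shortest to fully qualified."""
    return [
        service,                                           # same namespace only
        f"{service}.{namespace}",                          # from other namespaces
        f"{service}.{namespace}.svc.{cluster_domain}",     # fully qualified
    ]


for name in service_dns_names("db", "tenant1"):
    print(name)
```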

Running commands inside pods with kubectl exec

Although you are not really supposed to do this in a container world, I found it useful to run a command such as loading a database from a MySQL dump file on a newly created pod. Kubernetes makes this relatively easy via the kubectl exec functionality. Here's how I did it:


POD=$(kubectl --namespace $NAMESPACE get pods --show-all | grep $DEPLOYMENT | awk '{print $1}')
echo Running db_setup.sh command on pod $POD
kubectl --namespace $NAMESPACE exec $POD -it /usr/local/bin/db_setup.sh

where db_setup.sh downloads a sql.tar.gz file from S3 and loads it into MySQL.

A handy troubleshooting tool is to get a shell prompt inside a pod. First you get the pod name (via kubectl get pods --show-all), then you run:

$ kubectl --namespace tenant1 exec -it $POD -- bash -il

Sharing volumes across containers

One of the patterns I found useful in docker-compose files is to mount a container volume into another container, for example to check out the source code in a container volume, then mount it as /var/www/html in another container running the web application. This pattern is not extremely well supported in Kubernetes, but you can find your way around it by using init-containers.

Here's an example of creating an individual pod for the sole purpose of running a Capistrano task against the web application source code. Simply running two regular containers inside the same pod would not achieve this goal, because the order of creation for those containers is random. What we need is to force one container to start before any regular containers by declaring it to be an 'init-container'.

$ cat capistrano-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: capistrano
  annotations:
     pod.beta.kubernetes.io/init-containers: '[
            {
                "name": "data4capistrano",
                "image": "MY_ECR_ID.dkr.ecr.us-west-2.amazonaws.com/myapp-web:tenant1",
                "command": ["cp", "-rH", "/var/www/html/current", "/tmpfsvol/"],
                "volumeMounts": [
                    {
                        "name": "crtvol",
                        "mountPath": "/tmpfsvol"
                    }
                ]
            }
        ]'
spec:
  containers:
  - name: capistrano
    image: MY_ECR_ID.dkr.ecr.us-west-2.amazonaws.com/capistrano:tenant1
    imagePullPolicy: Always
    command: [ "cap", "$(CAP_STAGE)", "$(CAP_TASK)", "--trace" ]
    env:
    - name: CAP_STAGE
      valueFrom:
        configMapKeyRef:
          name: tenant1-cap-config
          key: CAP_STAGE
    - name: CAP_TASK
      valueFrom:
        configMapKeyRef:
          name: tenant1-cap-config
          key: CAP_TASK
    - name: DEPLOY_TO
      valueFrom:
        configMapKeyRef:
          name: tenant1-cap-config
          key: DEPLOY_TO
    volumeMounts:
    - name: crtvol
      mountPath: /var/www/html
    - name: web-persistent-storage
      mountPath: /var/www/html/shared
  volumes:
  - name: web-persistent-storage
    persistentVolumeClaim:
      claimName: web-pvc
  - name: crtvol
    emptyDir: {}
  restartPolicy: Never
  serviceAccount: tenant1-dev

The logic here is a bit convoluted. Hopefully some readers of this post will know a better way to achieve the same thing. What I am doing here is launching a container based on the myapp-web:tenant1 Docker image, which already contains the source code checked out from GitHub. This container is declared as an init-container, so it's guaranteed to run first. What it does is it mounts a special Kubernetes volume declared at the bottom of the pod manifest as an emptyDir. This means that Kubernetes will allocate some storage on the node where this pod will run. The data4capistrano container runs a command which copies the contents of the /var/www/html/current directory from the myapp-web image into this storage space mounted as /tmpfsvol inside data4capistrano. One other thing to note is that init-containers are a beta feature currently, so their declaration needs to be embedded into an annotation.
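
Because the annotation value is a JSON array embedded as a string in YAML, a single missing brace or bracket makes it invalid. One way to avoid that, sketched here in Python as an illustration rather than as part of the original workflow, is to describe the init-container as a data structure and let json.dumps produce the string. The function name init_container_annotations is hypothetical.

```python
import json

# The init-container from the capistrano pod, as plain Python data.
init_containers = [
    {
        "name": "data4capistrano",
        "image": "MY_ECR_ID.dkr.ecr.us-west-2.amazonaws.com/myapp-web:tenant1",
        "command": ["cp", "-rH", "/var/www/html/current", "/tmpfsvol/"],
        "volumeMounts": [{"name": "crtvol", "mountPath": "/tmpfsvol"}],
    }
]


def init_container_annotations(containers):
    """Return the annotations mapping carrying the init-containers as JSON text."""
    return {"pod.beta.kubernetes.io/init-containers": json.dumps(containers)}


annotations = init_container_annotations(init_containers)
print(annotations["pod.beta.kubernetes.io/init-containers"])
```

The printed string can be pasted between single quotes as the annotation value in the pod manifest.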

When the regular capistrano container is created inside the pod, it also mounts the same emptyDir volume (which is not empty at this point, because it was populated by the init-container), this time as /var/www/html. It also mounts the shared EFS file system as /var/www/html/shared. With these volumes in place, it has all it needs in order to run Capistrano locally via the cap command. The stage, task, and target directory for Capistrano are passed via ConfigMap values.

One thing to note is that the restartPolicy is set to Never for this pod, because we only want to run it once and be done with it.

To run the pod, I used kubectl again:

$ kubectl create -f capistrano-pod.yaml --namespace tenant1

Creating jobs

Kubernetes also has the concept of jobs, which differ from deployments in that they run one instance of a pod and make sure it completes. Jobs are useful for one-off tasks that you want to run, or for periodic tasks such as cron commands. Here is an example of a job manifest which runs a script that uses the twig template engine under the covers in order to generate a configuration file for the web application:

$ cat template-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-template
spec:
  template:
    metadata:
      name: myapp-template
    spec:
      containers:
      - name: myapp-template
        image: MY_ECR_ID.dkr.ecr.us-west-2.amazonaws.com/myapp-template:tenant1
        imagePullPolicy: Always
        command: [ "php", "/root/scripts/templatize.php"]
        env:
        - name: DBNAME
          valueFrom:
            configMapKeyRef:
              name: tenant1-config
              key: MYSQL_DATABASE
        - name: DBUSER
          valueFrom:
            configMapKeyRef:
              name: tenant1-config
              key: MYSQL_USER
        - name: DBPASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-db-pass
              key: mysql-db-pass.secret
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: redis-pass
              key: redis-pass.secret
        volumeMounts:
        - name: web-persistent-storage
          mountPath: /var/www/html/shared
      volumes:
      - name: web-persistent-storage
        persistentVolumeClaim:
          claimName: web-pvc
      restartPolicy: Never
      serviceAccount: tenant1-dev

The templatize.php script substitutes DBNAME, DBUSER, DBPASSWORD and REDIS_PASSWORD with the values passed in the job manifest, obtained from either Kubernetes secrets or ConfigMaps.
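
The actual script is PHP and uses twig, and it is not reproduced in this post. Purely to illustrate the substitution step, here is a Python sketch that fills a configuration template from environment-style values using string.Template. The template text and the function name render_config are hypothetical; only the four variable names come from the job manifest.

```python
from string import Template

# Hypothetical template text; the real application config format is not shown here.
CONFIG_TEMPLATE = Template(
    "db_name = $DBNAME\n"
    "db_user = $DBUSER\n"
    "db_password = $DBPASSWORD\n"
    "redis_password = $REDIS_PASSWORD\n"
)


def render_config(env):
    """Fill the template from a mapping such as os.environ.

    Raises KeyError if one of the four values is missing, so a misconfigured
    job fails instead of writing a half-filled file.
    """
    return CONFIG_TEMPLATE.substitute(env)


# Placeholder values standing in for what the job receives from Kubernetes.
example_env = {
    "DBNAME": "appdb",
    "DBUSER": "appuser",
    "DBPASSWORD": "example-db-pass",
    "REDIS_PASSWORD": "example-redis-pass",
}
print(render_config(example_env))
```

Inside the job container the mapping would be os.environ, populated from the ConfigMap and secret references in the manifest.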

To create the job, I used kubectl:

$ kubectl create -f template-job.yaml --namespace tenant1

Performing rolling updates and rollbacks for Kubernetes deployments

Once your application pods are running, you'll need to update the application to a new version. Kubernetes allows you to do a rolling update of your deployments. One advantage of using deployments as opposed to the older method of using replication controllers is that the update process for deployment happens on the Kubernetes server side, and can be paused and restarted. There are a few ways of doing a rolling update for a deployment (and a recent linux.com article has a good overview as well).

a) You can modify the deployment's yaml file and change a label such as a version or a git commit, then run kubectl apply:

$ kubectl --namespace tenant1 apply -f deployment.yaml

Note from the Kubernetes documentation on updating deployments:

a Deployment’s rollout is triggered if and only if the Deployment’s pod template (i.e. .spec.template) is changed, e.g. updating labels or container images of the template. Other updates, such as scaling the Deployment, will not trigger a rollout.

b) You can use kubectl set to specify a new image for the deployment containers. Example from the documentation:

$ kubectl set image deployment/nginx-deployment nginx=nginx:1.9.1
deployment "nginx-deployment" image updated

c) You can use kubectl patch to add a unique label to the deployment spec template on the fly. This is the method I've been using, with the label being set to a timestamp:

$ kubectl patch deployment web-deployment --namespace tenant1 -p \
  "{\"spec\":{\"template\":{\"metadata\":{\"labels\":{\"date\":\"`date +'%Y%m%d%H%M%S'`\"}}}}}"
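
Escaping JSON by hand inside a shell string is error-prone, so it can help to generate the patch body with a script. The Python sketch below builds the same kind of patch (using %m for the month in the date format) and also models the rule quoted earlier: a rollout is triggered only when .spec.template changes. The merge function is a simplified stand-in for how kubectl patch merges maps, and all function names are hypothetical.

```python
import copy
import json
from datetime import datetime


def date_label_patch(now):
    """Build the JSON patch body that sets a 'date' label on the pod template."""
    stamp = now.strftime("%Y%m%d%H%M%S")
    return json.dumps(
        {"spec": {"template": {"metadata": {"labels": {"date": stamp}}}}})


def merge(target, patch):
    """Recursively merge patch into a copy of target (simplified patch model)."""
    result = copy.deepcopy(target)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def triggers_rollout(old, new):
    """A rollout happens only when the pod template differs."""
    return old["spec"]["template"] != new["spec"]["template"]


# Trimmed-down model of web-deployment: only the fields needed here.
deployment = {"spec": {"replicas": 2, "template": {
    "metadata": {"labels": {"app": "myapp", "tier": "frontend"}}}}}

patch = date_label_patch(datetime(2016, 11, 23, 2, 13, 0))
patched = merge(deployment, json.loads(patch))
scaled = merge(deployment, {"spec": {"replicas": 3}})

print(patch)
print(triggers_rollout(deployment, patched))  # label change in the template
print(triggers_rollout(deployment, scaled))   # replica count only
```

The string returned by date_label_patch can be passed to kubectl patch with the -p flag, as in the command above.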

When updating a deployment, a new replica set will be created for that deployment, and the specified number of pods will be launched by that replica set, while the pods from the old replica set will be shut down. However, the old replica set itself will be preserved, allowing you to perform a rollback if needed. 
If you want to roll back to a previous version, you can use kubectl rollout history to show the revisions of your deployment updates:

$ kubectl --namespace tenant1 rollout history deployment web-deployment
deployments "web-deployment"
REVISION CHANGE-CAUSE
1 kubectl create -f web-deployment.yaml --record --namespace tenant1
2 kubectl patch deployment web-deployment --namespace tenant1 -p {"spec":{"template":{"metadata":{"labels":{"date":"1479161196"}}}}}
3 kubectl patch deployment web-deployment --namespace tenant1 -p {"spec":{"template":{"metadata":{"labels":{"date":"1479161573"}}}}}
4 kubectl patch deployment web-deployment --namespace tenant1 -p {"spec":{"template":{"metadata":{"labels":{"date":"1479243444"}}}}}

Now use kubectl rollout undo to roll back to a previous revision:

$ kubectl --namespace tenant1 rollout undo deployments web-deployment --to-revision=3
deployment "web-deployment" rolled back

I should note that all these kubectl commands can be easily executed out of Jenkins pipeline scripts or shell steps. I use a Docker image to wrap kubectl and its keys so that I don't have to install them on the Jenkins worker nodes.

And there you have it. I hope the examples I provided will shed some light on some aspects of Kubernetes that go past the 'Kubernetes 101' stage. Before I forget, here's a good overview from the official documentation on using Kubernetes in production.

I have a lot more Kubernetes things on my plate, and I hope to write blog posts on all of them. Some of these:

  • ingress controllers based on traefik
  • creation and renewal of Let's Encrypt certificates
  • monitoring
  • logging
  • using the Helm package manager
  • ...and more

Let's Encrypt Everything

Coding Horror - Jeff Atwood - Wed, 11/23/2016 - 01:03

I'll admit I was late to the HTTPS party.

But post Snowden, and particularly after the result of the last election here in the US, it's clear that everything on the web should be encrypted by default.


  1. You have an unalienable right to privacy, both in the real world and online. And without HTTPS you have zero online privacy – from anyone else on your WiFi, from your network provider, from website operators, from large companies, from the government.

  2. The performance penalty of HTTPS is gone; in fact, HTTPS arguably performs better than HTTP on modern devices.

  3. Using HTTPS means nobody can tamper with the content in your web browser. This was a bit of an abstract concern five years ago, but these days, there are more and more instances of upstream providers actively mucking with the data that passes through their pipes. For example, if Comcast detects you have a copyright strike, they'll insert banners into your web content, all your web content! And that's what the good guy scenario looks like – or at least a corporation trying to follow the rules. Imagine what it looks like when someone, or some large company, decides the rules don't apply to them?

So, how do you as an end user "use" encryption on the web? Mostly, you lobby for the websites you use regularly to adopt it. And it's working. In the last year, the use of HTTPS by default on websites has doubled.

Browsers can help, too. By January 2017, Google Chrome will show this alert in the UI when a login or credit card form is displayed on an unencrypted connection:

Additionally, Google is throwing their considerable weight behind this effort by ranking non-encrypted websites lower in search results.

But there's another essential part required for encryption to work on any websites – the HTTPS certificate. Historically these certificates have been issued by certificate authorities, and they were at least $30 per year per website, sometimes hundreds of dollars per year. Without that required cash each year, without the SSL certificate that you must re-purchase every year in perpetuity – you can't encrypt anything.

That is, until Let's Encrypt arrived on the scene.

Let's Encrypt is a 501(c)(3) non-profit organization supported by the Linux Foundation. They've been in beta for about a year now, and to my knowledge they are the only reliable, official free source of SSL certificates that has ever existed.

However, because Let's Encrypt is a non-profit organization, not owned by any company that must make a profit from each SSL certificate they issue, they need our support:

As a company, we've donated a Discourse hosted support community, and a cash amount that represents how much we would have paid in a year to one of the existing for-profit certificate authorities to set up HTTPS for all the Discourse websites we host.

I urge you to do the same:

  • Estimate how much you would have paid for any free SSL certificates you obtained from Let's Encrypt, and please donate that amount to Let's Encrypt.

  • If you work for a large company, urge them to sponsor Let's Encrypt as a fundamental cornerstone of a safe web.

If you believe in an unalienable right to privacy on the Internet for every citizen in every nation, please support Let's Encrypt.

Categories: Programming

Vanity Metrics? Maybe Not!

A beard without gray might be a reflection of vanity at this point in my life!

Unlike vanity license plates, calling a measure or metric a ‘vanity metric’ is not meant as a compliment. The real answer is never as cut and dried as when someone jumps up in the middle of a presentation and yells, “that is a vanity metric, you are suggesting we go back to the middle ages.”  Before you brand a metric with the pejorative of “vanity metric,” consider:

  1. Not all vanity metrics are useless.
  2. Your perception might not be the same as someone else's.
  3. Just because you call something a vanity metric does not make it true.

I recently toured several organizations that had posted metrics. Several charts caught my eye. Three examples included:

  1. Number of workdays injury-free;
  2. Number of function points billed in the current quarter, and
  3. A daily total of user calls.

Using our four criteria (gamability, linked to business outcomes, provides process knowledge and actionable) I could classify each of the metrics above as a vanity metric, but that might just be my perception based on the part of the process I understand.

The number of workdays injury-free is a simple metric I have seen at construction job sites, manufacturing plants and warehouses since I entered the workplace.  The number tends to increment over time until it suddenly shifts to zero.  By all definitions of a vanity metric, the number shown has little to do with the output of the process, nor is it really actionable.  That said, the metric provides workers with some assurance that management is “interested” in the well-being of their employees, or at least wants to avoid the fines for not posting the chart. Clearly some vanity metrics are useful.

IFPUG function points are a measure of software functionality delivered.  Function points were introduced in the late 1980s and have evolved over the years. Function points are sometimes perceived as a vanity metric when not used as a system metric. While this might be true in some scenarios, if we consider the common purchasing practice of paying per function point used in several countries (including the US, Brazil, Australia, Korea and Japan, to name a few), the metric clearly is linked to a business outcome and is actionable, and therefore is not a vanity metric. The perception of whoever is using the metric clearly impacts how a metric is classified.

I have worked for call centers, help desks and voice credit card authorization organizations during my career.  One of the most ubiquitous metrics collected and displayed is the total number of calls answered daily (versions of this chart are limitless and include calls per hour and calls during peak hours). During a recent tour of a warehouse call center, one of the people on the tour suggested the metric was purely for the vanity of the organization and for showing visitors. The tour leader pointed out the metric was used for staffing the call center properly so employees would not burn out and so that customers got their questions answered in a timely manner. I made sure I was not standing next to him for the rest of the tour in case retribution for the snide question was required. Calling something a vanity metric can be related to perception; however, in some cases it is a knee-jerk negative reaction to any form of measurement. Clearly just because someone calls a measure or metric a vanity metric does not mean the epithet is true.

The concept of measurement and metrics in software development is always an interesting discussion. Metrics and measures are useful tools to support empirical processes, such as Scrum used in software development. Measures and metrics provide transparency so that we can inspect and adapt. That does not mean that vanity metrics don’t exist. For example, on my tours I saw a chart that represented the number of user stories completed this year by month . . . for the whole department.  I have no clue how the metric could be used, nor did my hosts when I asked what decisions were driven by the data shown.  Clearly a vanity metric based on any criteria you might propose.


Categories: Process Management

Kubernetes: Writing hostname to a file

Mark Needham - Tue, 11/22/2016 - 20:56

Over the weekend I spent a bit of time playing around with Kubernetes and to get the hang of the technology I set myself the task of writing the hostname of the machine to a file.

I’m using the excellent minikube tool to create a local Kubernetes cluster for my experiments so the first step is to spin that up:

$ minikube start
Starting local Kubernetes cluster...
Kubectl is now configured to use the cluster.

The first thing I needed to work out was how to get the hostname. I figured there was probably an environment variable that I could access. We can call the env command to see a list of all the environment variables in a container so I created a pod template that would show me that information:


apiVersion: v1
kind: Pod
metadata:
  name: mark-super-simple-test-pod
spec:
  containers:
    - name: test-container
      image: gcr.io/google_containers/busybox:1.24
      command: [ "/bin/sh", "-c", "env" ]
  dnsPolicy: Default
  restartPolicy: Never

I then created a pod from that template and checked the logs of that pod:

$ kubectl create -f hostname_super_simple.yaml 
pod "mark-super-simple-test-pod" created
$ kubectl logs  mark-super-simple-test-pod

The information we need is in $HOSTNAME so the next thing we need to do is create a pod template which puts that into a file.


apiVersion: v1
kind: Pod
metadata:
  name: mark-test-pod
spec:
  containers:
    - name: test-container
      image: gcr.io/google_containers/busybox:1.24
      command: [ "/bin/sh", "-c", "echo $HOSTNAME > /tmp/bar; cat /tmp/bar" ]
  dnsPolicy: Default
  restartPolicy: Never

We can create a pod using this template by running the following command:

$ kubectl create -f hostname_simple.yaml
pod "mark-test-pod" created

Now let’s check the logs of the instance to see whether our script worked:

$ kubectl logs mark-test-pod

Indeed it did, good times!
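For comparison, here is a minimal Python sketch of the same step the shell command performs: read the hostname and write it to a file. It assumes a container image that ships Python (the busybox image used above does not), and the function name write_hostname is made up for this illustration.

```python
import os
import socket
import tempfile
from pathlib import Path


def write_hostname(path: Path) -> str:
    """Write this machine's hostname to `path` and return what was written."""
    # Prefer $HOSTNAME, mirroring the shell command; fall back to the OS call.
    hostname = os.environ.get("HOSTNAME") or socket.gethostname()
    path.write_text(hostname + "\n")
    return hostname


if __name__ == "__main__":
    # Equivalent of: echo $HOSTNAME > /tmp/bar; cat /tmp/bar
    target = Path(tempfile.gettempdir()) / "bar"
    write_hostname(target)
    print(target.read_text(), end="")
```

Inside a pod, $HOSTNAME is normally set to the pod name, so the file would contain the same value that kubectl logs shows for the shell version.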

Categories: Programming

SE-Radio Episode 275: Josh Doody on Salary Negotiation for Software Engineers

Marcus Blankenship talks with Josh Doody about salary negotiation. Topics include a framework for thinking about salary negotiations, how you can know what you’re worth, the employers view of salary negotiation, and missed negotiation opportunities. Also discussed are common fears about negotiating and how to overcome them, common mistakes during negotiations, and how negotiation makes […]
Categories: Programming

Welcoming the third class of Launchpad Accelerator with expansion into new countries!

Google Code Blog - Tue, 11/22/2016 - 19:43

Roy Glasberg, Global Lead, Launchpad Program & Accelerator

After two successful classes, we're excited to announce the next group of promising startups for the third class of Launchpad Accelerator. The startups from Brazil, India, Indonesia, and Mexico will be joined by developers from five additional countries: Argentina, Colombia, Philippines, Thailand and Vietnam.

The program includes intensive mentoring from Google engineers, product managers and other expert mentors from top technology companies and VCs in Silicon Valley. Participants receive equity-free support, credits for Google products, PR support and work closely with Google for six months in their home country.

Class 3 kicks off early next year (January 30) at Launchpad Space, our physical space in San Francisco where developers and startups can get free technical training, one-on-one mentoring and more education geared towards helping them successfully build their apps & startups.

Here's the full list of participating startups (by country):



Delivery Direto, QuintoAndar, DogHero, Mobills, Portal Telemedicina, Meus Pedidos

Happy Adda Studios

Jurnal, Mapan (Ruma), PicMix

ELSA Speak, Haravan

If you're interested in applying for future Launchpad Accelerator cohorts, we encourage you to follow us on the Launchpad Accelerator site to receive updates. We also expect to add more countries to the program in the future. Stay tuned!

Categories: Programming

IT Hare: Ultimate DB Heresy: Single Modifying DB Connection. Part I. Performanc

Sergey Ignatchenko continues his excellent book series with a new chapter on databases. This is a guest repost

The idea of a single-write-connection is used extensively in the post. As it's defined elsewhere, I asked Sergey for a definition so the article would make a little more sense...

As for single-write-connection - I mean that there is just one app (named "DB Server" in the article) having a single DB connection to the database which is allowed to issue modifying statements (UPDATEs/INSERTs/DELETEs). This allows us to achieve several important simplifications - first of all, all fundamentally non-testable concurrency issues (such as missing SELECT FOR UPDATE and deadlocks) are eliminated entirely, second - the whole thing becomes deterministic (which is a significant help in figuring out bugs - even simple text logging has been seen to make the system quite debuggable, including post-mortem), and last but not least - this monopoly on updates can be used in quite creative ways to improve performance (in particular, to keep an always-coherent app-level cache which can be like 100x-1000x more efficient than going to DB).
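To make the definition concrete, here is a minimal Python sketch of a single modifying connection. It is only an illustration under stated assumptions, not the design from the book: SQLite stands in for the DBMS, and the DBServer class, its modify method and the accounts table are names invented for this example. Every modifying statement is queued and executed, one at a time, by the one thread that owns the write connection.

```python
import queue
import sqlite3
import threading


class DBServer:
    """Owns the only connection that is allowed to modify the database."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._requests: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # The write connection is created and used on this thread only.
        conn = sqlite3.connect(self._path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS accounts "
            "(id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
        )
        conn.commit()
        while True:
            item = self._requests.get()
            if item is None:  # shutdown signal
                break
            sql, params, done = item
            try:
                with conn:  # commits on success, rolls back on error
                    conn.execute(sql, params)
            except sqlite3.Error as exc:
                done["error"] = exc
            finally:
                done["event"].set()
        conn.close()

    def modify(self, sql: str, params: tuple = ()) -> None:
        """Queue one modifying statement and wait until it has been applied."""
        done = {"event": threading.Event(), "error": None}
        self._requests.put((sql, params, done))
        done["event"].wait()
        if done["error"] is not None:
            raise done["error"]

    def close(self) -> None:
        self._requests.put(None)
        self._thread.join()
```

Because statements are applied strictly in queue order by one connection, callers on many threads can issue updates without SELECT FOR UPDATE or deadlock handling; readers may still open their own read-only connections.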

After we finished with all the preliminaries, we can now get to the interesting part – implementing our transactional DB and DB Server. We already mentioned implementing DB Server briefly in Chapter VII, but now we need much more detailed discussion on this all-important topic.

First of all, let’s re-iterate what we’re speaking about. Transactional/operational DB is a place where all the automated decisions are made about your game (stock exchange, bank, etc.).

It stores things such as player accounts, with all their persistent attributes etc. etc.; it also stores communications related to payment processing, and so on, and so forth. And “DB Server” is our app handling access to DBMS (as noted in Chapter VII, I am firmly against having SQL statements issued directly by your Game Servers/Game Logic, so an intermediary such as DB Server is necessary).

As discussed above, ACID properties tend to be extremely important for transactional/operational DB. We don’t want money – or that artifact which is sold for real $20K on eBay – to be lost or duplicated. For this and some other reasons, we’ll be speaking about SQL databases for our transactional/operational DB (while it is possible to use NoSQL for transactional/operational DB – achieving strict guarantees is usually difficult, in particular because of lack of multi-object ACID transactions in most of NoSQL DBs out there, see discussion in [[TODO]] section above).

And now, we’re finally ready to start discussing interesting things.

Multi-Connection DB Access
Categories: Architecture

Sponsored Post: Loupe, New York Times, ScaleArc, Aerospike, Scalyr, Gusto, VividCortex, MemSQL, InMemory.Net, Zohocorp

Who's Hiring?
  • The New York Times is looking for a Software Engineer for its Delivery/Site Reliability Engineering team. You will also be a part of a team responsible for building the tools that ensure that the various systems at The New York Times continue to operate in a reliable and efficient manner. Some of the tech we use: Go, Ruby, Bash, AWS, GCP, Terraform, Packer, Docker, Kubernetes, Vault, Consul, Jenkins, Drone. Please send resumes to: technicaljobs@nytimes.com

  • IT Security Engineering. At Gusto we are on a mission to create a world where work empowers a better life. As Gusto's IT Security Engineer you'll shape the future of IT security and compliance. We're looking for a strong IT technical lead to manage security audits and write and implement controls. You'll also focus on our employee, network, and endpoint posture. As Gusto's first IT Security Engineer, you will be able to build the security organization with direct impact to protecting PII and ePHI. Read more and apply here.
Fun and Informative Events
  • Your event here!
Cool Products and Services
  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • ScaleArc's database load balancing software empowers you to “upgrade your apps” to consumer grade – the never down, always fast experience you get on Google or Amazon. Plus you need the ability to scale easily and anywhere. Find out how ScaleArc has helped companies like yours save thousands, even millions of dollars and valuable resources by eliminating downtime and avoiding app changes to scale. 

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips.  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in-memory database for analysing large amounts of data. It runs natively on .Net, and provides native .Net, COM & ODBC APIs for integration. It also has an easy-to-use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex measures your database servers’ work (queries), not just global counters. If you’re not monitoring query performance at a deep level, you’re missing opportunities to boost availability, turbocharge performance, ship better code faster, and ultimately delight more customers. VividCortex is a next-generation SaaS platform that helps you find and eliminate database performance problems at scale.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Categories: Architecture

Calling European game developers, enter the Indie Games Contest by December 31

Google Code Blog - Tue, 11/22/2016 - 10:30

Posted by Matteo Vallone, Google Play Partner Development Manager

To build awareness of the awesome innovation and art that indie game developers are bringing to users on Google Play, we have invested heavily over the past year in programs like Indie Corner, as well as events like the Google Play Indie Games Festivals in North America and Korea.

As part of that sustained effort, we also want to celebrate the passion and innovation of indie game developers with the introduction of the first-ever Google Play Indie Games Contest in Europe. The contest will recognize the best indie talent in several countries and offer prizes that will help you get your game noticed by industry experts and gamers worldwide.

Prizes for the finalists and winners:

  • An open showcase held at the Saatchi Gallery in London
  • YouTube influencer campaigns worth up to 100,000 EUR
  • Premium placements on Google Play
  • Tickets to Google I/O 2017 and other top industry events
  • Promotions on our channels
  • Special prizes for the best Unity game
  • And more!

Entering the contest:

If you're based in the Czech Republic, Denmark, Finland, France (coming soon), Germany, Iceland, Israel, Netherlands, Norway, Poland (coming soon), Romania, Spain, Sweden, Turkey, or the UK (excl. Northern Ireland), have 15 or fewer full-time employees, and published a new game on Google Play after 1 January 2016, you may now be eligible to enter the contest. If you're planning on publishing a new game soon, you can also enter by submitting a private beta. Check out all the details in the terms and conditions. Submissions close on 31 December 2016.

The process:

Up to 20 finalists will get to showcase their games at an open event at the Saatchi Gallery in London on the 16th February 2017. At the event, the top 10 will be selected by the event attendees and the Google Play team. The top 10 will then get the opportunity to pitch to a jury of industry experts, from which the final winner and runners up will be selected.

Even if someone is NOT entering the contest:

Even if you're not eligible to enter the contest, you can still register to attend the final showcase event in London on 16 February 2017, check out some great indie games, and have fun with various industry experts and indie developers. We will also be hosting a workshop for all indie games developers from across EMEA in the new Google office in Kings Cross the next day, so this will be a packed week.

Get started:

Enter the Indie Games Contest now and visit the contest site to find out more about the contest, the event, and the workshop.

Categories: Programming

Firebase App Indexing for Personal Content

Google Code Blog - Mon, 11/21/2016 - 23:43
Originally posted on Firebase blog. Posted by Fabian Schlup, Software Engineer

In September, we launched a new way to search for content in apps on Android phones. With this update, users were able to find personal content like messages, notes, music and more across apps like OpenTable, Ticketmaster, Evernote, Glide, Asana, Gmail, and Google Keep from a single search box. Today, we're inviting all Android developers to enable this functionality for their apps.

Starting with version 10.0, the Firebase App Indexing API on Android lets apps add their content to Google's on-device index in the background, and update it in real-time as users make changes in the app. We've designed the API with three principles in mind:

  • making it simple to integrate
  • keeping all personal data on the device
  • giving the developer full control over what goes into the index and when

There are several predefined data types that make it easy to represent common things such as messages, notes, and songs, or you can add custom types to represent additional items. Plus, logging user actions like a user listening to a specific song provides an important signal to help rank user content across the Google app.

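Indexable note = Indexables.noteDigitalDocumentBuilder()
    .setUrl("http://example.net/users/42/lists/23")  // placeholder URL for illustration
    .setName("Shopping list")
    .setText("steak, pasta, wine")
    .build();
FirebaseAppIndex.getInstance().update(note);  // add or update the note in the on-device index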
Example of adding or updating a user's shopping list in the on-device index.

Integrating with Firebase App Indexing helps increase user engagement with your app, as users can get back to their personal content in an instant with Google Search. Because that data is indexed directly on the device, this even works when offline.

To get started, check out our implementation guide and codelab.

Categories: Programming

Google Play services and Firebase for Android will support API level 14 at minimum

Android Developers Blog - Mon, 11/21/2016 - 23:37
Posted by Doug Stevenson, Developer Advocate

Version 10.0.0 of the Google Play services client libraries, as well as the Firebase client libraries for Android, will be the last version of these libraries that supports Android API level 9 (Android 2.3, Gingerbread). The next scheduled release of these libraries, version 10.2.0, will increase the minimum supported API level from 9 to 14 (Android 4.0.1, Ice Cream Sandwich). This change will happen in early 2017.

Why are we discontinuing support for Gingerbread and Honeycomb in Google Play services?

The Gingerbread platform is almost six years old. Many Android developers have already discontinued support for Gingerbread in their apps. This helps them build better apps that make use of the newer capabilities of the Android platform. For us, the situation is the same. By making this change, we will be able to provide a more robust collection of tools for Android developers with greater speed.

What this means for your Android app that uses Google Play services or Firebase:

You may use version 10.0.0 of Google Play services and Firebase as you are currently. It will continue to work with Gingerbread devices as it has in the past.

When you choose to upgrade to the future version 10.2.0, and if your app minimally supports API level 14 or greater (typically specified as "minSdkVersion" in your build.gradle), you will not encounter any versioning problems. However, if your app supports lower than API level 14, you will encounter a problem at build time with an error that looks like this:

Error:Execution failed for task ':app:processDebugManifest'.
> Manifest merger failed : uses-sdk:minSdkVersion 9 cannot be smaller than version 14 declared in library [com.google.android.gms:play-services:10.2.0]
        Suggestion: use tools:overrideLibrary="com.google.android.gms:play_services" to force usage

Unfortunately, the stated suggestion will not help you successfully run your app on older devices. In order to use Google Play services 10.2.0 and later, you can choose one of the following options:

1. Target API level 14 as the minimum supported API level.

This is the recommended course of action. To discontinue support for API levels that will no longer receive Google Play services updates, simply increase the minSdkVersion value in your app's build.gradle to at least 14. If you update your app in this way and publish it to the Play Store, users of devices with less than that level of support will not be able to see or download the update. However, they will still be able to download and use the most recently published version of the app that does target their device.

A very small percentage of all Android devices are using API levels less than 14. You can read more about the current distribution of Android devices. We believe that many of these old devices are not actively being used.

If your app still has a significant number of users on older devices, you can use multiple APK support in Google Play to deliver an APK that uses Google Play services 10.0.0. This is described below.

2. Build multiple APKs to support devices with an API level less than 14.

Along with some configuration and code management, you can build multiple APKs that support different minimum API levels, with different versions of Google Play services. You can accomplish this with build variants in Gradle. First, define build flavors for legacy and newer versions of your app. For example, in your build.gradle, define two different product flavors, with two different compile dependencies for the components of Play Services you're using:

productFlavors {
    legacy {
        minSdkVersion 9
        versionCode 901  // Min API level 9, v01
    }
    current {
        minSdkVersion 14
        versionCode 1401  // Min API level 14, v01
    }
}

dependencies {
    legacyCompile 'com.google.android.gms:play-services:10.0.0'
    currentCompile 'com.google.android.gms:play-services:10.2.0'
}

In the above situation, there are two product flavors being built against two different versions of the Google Play services client libraries. This will work fine if only APIs are called that are available in the 10.0.0 library. If you need to call newer APIs made available with 10.2.0, you will have to create a compatibility library for the newer API calls so that they are only built into the version of the application that can use them:

  • Declare a Java interface that exposes the higher-level functionality you want to perform that is only available in current versions of Play services.
  • Build two Android libraries that implement that interface. The "current" implementation should call the newer APIs as desired. The "legacy" implementation should no-op or otherwise act as desired with older versions of Play services. The interface should be added to both libraries.
  • Conditionally compile each library into the app using "legacyCompile" and "currentCompile" dependencies.
  • In the app's code, call through to the compatibility library whenever newer Play APIs are required.
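The steps above describe a compatibility layer: one interface, two implementations, and app code that only talks to the interface. The post is about Java and Android, but the shape of the pattern is language-neutral, so here is a rough sketch in Python. All names (FeatureCompat, send_invite, feature_for) are invented for this illustration, and the flavor lookup is only a stand-in for the build-time choice Gradle makes through legacyCompile and currentCompile.

```python
from abc import ABC, abstractmethod


class FeatureCompat(ABC):
    """Hypothetical interface for functionality only newer Play services provides."""

    @abstractmethod
    def send_invite(self, recipient: str) -> bool:
        """Return True if the invite was handed to the newer API."""


class CurrentFeature(FeatureCompat):
    """'current' flavor: would call the newer API (simulated here with a list)."""

    def __init__(self) -> None:
        self.sent: list[str] = []

    def send_invite(self, recipient: str) -> bool:
        self.sent.append(recipient)  # stand-in for a call that exists only in 10.2.0
        return True


class LegacyFeature(FeatureCompat):
    """'legacy' flavor: the newer API is unavailable, so this is a no-op."""

    def send_invite(self, recipient: str) -> bool:
        return False


def feature_for(flavor: str) -> FeatureCompat:
    """Stand-in for the build-time selection of one implementation per flavor."""
    implementations = {"legacy": LegacyFeature, "current": CurrentFeature}
    return implementations[flavor]()


def invite_friend(feature: FeatureCompat, friend: str) -> str:
    """App code depends only on the interface, never on a concrete flavor."""
    if feature.send_invite(friend):
        return "invite sent"
    return "invites unavailable on this device"
```

In the real Android setup each APK contains exactly one of the two implementations, so the legacy build never references classes that exist only in the newer client library.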

After building a release APK for each flavor, you then publish them both to the Play Store, and the device will update with the most appropriate version for that device. Read more about multiple APK support in the Play Store.

Categories: Programming

Software Development Linkopedia November 2016

From the Editor of Methods & Tools - Mon, 11/21/2016 - 15:08
Here is our monthly selection of knowledge on programming, software testing and project management. This month you will find some interesting information and opinions about introvert project manager, scaling Agile, Test-Driven Development, UX vs UI, philosophy and programming, retrospectives, BDD in Java and Agile metrics. Blog: How Introvert Can Survive as Project Manager Blog: #AgileAfrica […]

Dart Developer Summit 2016 Videos: Soundness, AngularDart 2.0, and the Fastest Growing Language at Google

Google Code Blog - Mon, 11/21/2016 - 13:00
Posted by Filip Hracek, Program Manager, Dart
Videos from last month’s Dart Developer Summit are up on YouTube and we thought we’d cherry-pick the highlights for you. The summit keynote is a summary of all the major news and of the direction the team is taking now. It’s where we announced that Dart is the fastest growing language at Google. Teams switching to Dart report up to twice the productivity and development speed of what they had previously.
Next, AngularDart 2.0 was launched in a presentation by Ferhat Buyukkokten and Matan Lurey. They showed how they made the framework’s output 40% smaller and 15% faster in the 4 months since AngularDart got its own dedicated team. They also showed our 60 fps table using setState(), and the new testing framework called NgTestBed. Later in the day, Ted Sander launched AngularDart Components — the material design widgets Google is using in customer-facing apps like AdWords and AdSense. Hundreds of Google engineers work with these components every day. Watch the video to learn how they make our teams more productive, and our web apps more performant.
If you’re interested  in language design, watch Sound Dart, a talk by Leaf Petersen in which he explains Dart’s strong mode. With strong mode, Dart’s type system becomes sound, so that when you write types they are guaranteed to be correct (while still allowing you to write dynamically typed code where you want the flexibility). This differentiates strong-mode Dart from many popular compile-to-JavaScript languages, and improves both performance and developer productivity.
Another presentation that made waves was the Flutter keynote from Day 2 of the summit. Eric Seidel impressed the audience by showing just how fast mobile development can be with Flutter.
After Eric’s talk, John McCutchan and Todd Turnidge went into details about Flutter hot reloading. They also showed, for the first time, code rewind in Dart.
These are just 6 out of the 18 talks that are available on YouTube. For example, Will Ekiel’s talk titled Learnings from building a CRM app at Google gives a perspective on managing a product built with Dart and deploying it across both web and mobile. Another interesting practical presentation on using Dart in production is the one given by Faisal Abid and Kevin Birch about their large-scale JS-to-AngularDart rewrite. And the list goes on. We’re very happy with how the event went, and we’re already looking forward to next year’s summit. In the meantime, follow our blog, our Twitter account, our G+ page, or join the conversation in any other way. We want to hear from you. Thanks for building in Dart.
Categories: Programming

Five Dysfunctions of a Team, Patrick Lencioni: Re-Read Week 8

The Five Dysfunctions of a Team Cover

The “Book” during unboxing!

I am back from Øredev in Malmo, Sweden. It was a wonderful conference. Check out my short review.

In this week’s re-read of The Five Dysfunctions of a Team  by Patrick Lencioni (Jossey-Bass, Copyright 2002, 33rd printing), the team returns to the office and quickly begins the transformation process.

(Remember to buy a copy from the link above and read along.)

Part Three – Heavy Lifting


Kathryn and the team return to the day-to-day grind of the office. Significant progress building teams can be made when the day-to-day pressures of the office are removed, but Kathryn immediately observes that the progress the team has made offsite deteriorates.  I have observed that much of the progress made when away from the office is transitory without reinforcement. Behavior tends to revert when confronted by the same triggers. All progress goes out the window when Nick calls a meeting to propose acquiring another firm and includes only a subset of the team. When called on not including Mikey, Nick slams her competence. Despite a rocky start, the team holds a fairly good discussion of the plusses and minuses until Nick blurts out that while Kathryn might be great at teamwork, she doesn’t know anything about the business and isn’t qualified to participate in the discussion. Kathryn doesn’t let the slight slide and gives Nick the choice of having it out right there in public or behind closed doors.

Nick is frustrated that he is underutilized. He feels that he could be leading the organization.  The acquisition is a reflection of his frustration.  He implies that perhaps he should quit.  Kathryn points out that he is underutilized because he is only interested in advancing his own career rather than advancing the goals of the organization. Earlier in my career as a quality manager, I reported to the general manager of the organization. One of my co-workers was a Nick.  All that was important to him was the next rung on the ladder.  He never did anything that did not directly benefit this goal. He was not much of a team player and often caused conflict amongst the team. Everyone was happy when he was promoted to another site (I heard he flamed out). Kathryn leaves Nick to sort things out in advance of her first staff meeting later in the day.


The staff meeting starts at two with everyone present except Nick and JR. Nick arrives at the last second and interrupts Kathryn as she begins. Nick delivers an apology for his outburst during the meeting earlier in the day. He publicly admits to the team (showing trust) that he feels underutilized and that his underutilization will reflect poorly on his career.  Even though he is frustrated, he doesn’t want to leave yet and needs everyone’s help to find something he can hang his hat on. Lencioni uses the reversal in his behavior to provide an example of how team members should be able to safely ask for help. After Nick bares his soul, Kathryn drops the bombshell that JR had quit the night before. With JR gone, someone needs to step up and take the sales role.  Carlos volunteers (Carlos tries to please, as we have seen before). In the end, Nick decides to take the sales role (he is underutilized), even though when he came to DecisionTech he felt that sales pigeonholed him, despite being “damn good at it.” Carlos is relieved not to have been called to deliver on his suggestion. Remember to be careful what you ask for…you just might get it, and in Carlos’s case, he was not underutilized.

In the end, Nick’s underutilization poisoned his attitude in the same way over-utilization can poison attitudes. Team members need to be able to trust each other enough to ask and provide help.  

Three quick takeaways:

  • Never tell your boss they are unqualified unless you are willing to suffer the consequences.
  • Not everyone can fit into every team (team members are not easily replaceable parts).
  • Trust can be learned in theory at off-site meetings, but trust is really learned on-site.

Previous Installments in the re-read of  The Five Dysfunctions of a Team by Patrick Lencioni:

Week 1 – Introduction through Observations

Week 2 – The Staff through the End Run

Week 3 – Drawing the Line through Pushing Back

Week 4 – Entering Danger through Rebound

Week 5 – Awareness through Goals

Week 6 – Deep Tissue through Exhibition

Week 7 – Film Noir through Application

Categories: Process Management

What Test Engineers do at Google: Building Test Infrastructure

Google Testing Blog - Fri, 11/18/2016 - 18:13
Author: Jochen Wuttke

In a recent post, we broadly talked about What Test Engineers do at Google. In this post, I talk about one aspect of the work TEs may do: building and improving test infrastructure to make engineers more productive.

Refurbishing legacy systems makes new tools necessary
A few years ago, I joined an engineering team that was working on replacing a legacy system with a new implementation. Because building the replacement would take several years, we had to keep the legacy system operational and even add features, while building the replacement so there would be no impact on our external users.

The legacy system was so complex and brittle that the engineers spent most of their time triaging and fixing bugs and flaky tests, but had little time to implement new features. The goal for the rewrite was to learn from the legacy system and to build something that was easier to maintain and extend. As the team's TE, my job was to understand what caused the high maintenance cost and how to improve on it. I found two main causes:
  • Tight coupling and insufficient abstraction made unit testing very hard, and as a consequence, a lot of end-to-end tests served as functional tests of that code.
  • The infrastructure used for the end-to-end tests had no good way to create and inject fakes or mocks for the external services the code depended on. As a result, the tests had to run a large number of servers for all these external dependencies. This led to very large and brittle tests that our existing test execution infrastructure was not able to handle reliably.
Exploring solutions
At first, I explored if I could split the large tests into smaller ones that would test specific functionality and depend on fewer external services. This proved impossible, because of the poorly structured legacy code. Making this approach work would have required refactoring the entire system and its dependencies, not just the parts my team owned.

In my second approach, I also focussed on large tests and tried to mock services that were not required for the functionality under test. This also proved very difficult, because dependencies changed often and individual dependencies were hard to trace in a graph of over 200 services. Ultimately, this approach just shifted the required effort from maintaining test code to maintaining test dependencies and mocks.

My third and final approach, illustrated in the figure below, made small tests more powerful. In the typical end-to-end test we faced, the client made RPC calls to several services, which in turn made RPC calls to other services. Together the client and the transitive closure over all backend services formed a large graph (not tree!) of dependencies, which all had to be up and running for the end-to-end test. The new model changes how we test client and service integration. Instead of running the client on inputs that will somehow trigger RPC calls, we write unit tests for the code making method calls to the RPC stub. The stub itself is mocked with a common mocking framework like Mockito in Java. For each such test, a second test verifies that the data used to drive that mock "makes sense" to the actual service. This is also done with a unit test, where a replay client uses the same data the RPC mock uses to call the RPC handler method of the service.

This pattern of integration testing applies to any RPC call, so the RPC calls made by a backend server to another backend can be tested just as well as front-end client calls. When we apply this approach consistently, we benefit from smaller tests that still test correct integration behavior, and make sure that the behavior we are testing is "real".
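The two-test pattern described above can be sketched in a few lines. This is only an illustration, not the framework the post describes: it uses Python's standard unittest.mock in place of Mockito, and the service, handler, client function, and recorded data (UserServiceHandler, greeting_for, RECORDED_REQUEST, RECORDED_RESPONSE) are all hypothetical names invented for the example. The key idea it tries to show is that one recorded request/response pair drives both the mocked-stub test and the replay test against the real handler.

```python
from unittest import mock

# Hypothetical recorded exchange, shared by both tests below.
RECORDED_REQUEST = {"user_id": 7}
RECORDED_RESPONSE = {"user_id": 7, "name": "Ada"}


class UserServiceHandler:
    """Stand-in for the real service's RPC handler (hypothetical)."""

    def __init__(self, users):
        self._users = users

    def get_user(self, request):
        uid = request["user_id"]
        return {"user_id": uid, "name": self._users[uid]}


def greeting_for(stub, user_id):
    """Client code under test: makes one method call on the RPC stub."""
    response = stub.get_user({"user_id": user_id})
    return f"Hello, {response['name']}!"


def test_client_against_mocked_stub():
    # Test 1: the stub is mocked and driven by the recorded response.
    stub = mock.Mock()
    stub.get_user.return_value = RECORDED_RESPONSE
    assert greeting_for(stub, 7) == "Hello, Ada!"
    stub.get_user.assert_called_once_with(RECORDED_REQUEST)


def test_recorded_data_against_real_handler():
    # Test 2: a "replay" call checks that the same recorded data
    # makes sense to the actual handler.
    handler = UserServiceHandler(users={7: "Ada"})
    assert handler.get_user(RECORDED_REQUEST) == RECORDED_RESPONSE


test_client_against_mocked_stub()
test_recorded_data_against_real_handler()
```

If the service changes its response shape, the second test fails, which signals that the data behind the mock in the first test has gone stale.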

To arrive at this solution, I had to build, evaluate, and discard several prototypes. While it took a day to build a proof-of-concept for this approach, it took me and another engineer a year to implement a finished tool developers could use.

The engineers embraced the new solution very quickly when they saw that the new framework removes large amounts of boilerplate code from their tests. To further drive its adoption, I organized multi-day events with the engineering team where we focussed on migrating test cases. It took a few months to migrate all existing unit tests to the new framework, close gaps in coverage, and create the new tests that validate the mocks. Once we converted about 80% of the tests, we started comparing the efficacy of the new tests and the existing end-to-end tests.

The results are very good:
  • The new tests are as effective in finding bugs as the end-to-end tests are.
  • The new tests run in about 3 minutes instead of 30 minutes for the end-to-end tests.
  • The client side tests are 0% flaky. The verification tests are usually less flaky than the end-to-end tests, and never more.
Additionally, the new tests are unit tests, so you can run them in your IDE and step through them to debug. These results allowed us to run the end-to-end tests very rarely, only to detect misconfigurations of the interacting services, but not as functional tests.

Building and improving test infrastructure to help engineers be more productive is one of the many things test engineers do at Google. Running this project from requirements gathering all the way to a finished product gave me the opportunity to design and implement several prototypes, drive the full implementation of one solution, lead engineering teams to adoption of the new framework, and integrate feedback from engineers and actual measurements into the continuous refinement of the tool.
Categories: Testing & QA

Stuff The Internet Says On Scalability For November 18th, 2016

Hey, it's HighScalability time:


Now you don't have to shrink yourself to see inside a computer. Here's a fully functional 16-bit computer that's over 26 square feet huge! Bighex machine


If you like this sort of Stuff then please support me on Patreon.
  • 50%: drop in latency and CPU load after adopting PHP7 at Tumblr; 4,425: satellites for Skynet; 13%: brain connectome shared by identical twins; 20: weird & wonderful datasets for machine learning; 200 Gb/sec: InfiniBand data rate; 15 TB: data generated nightly by Large Synoptic Survey Telescope; 17.24%: top comments that were also first comments on reddit; $120 million: estimated cost of developing Kubernetes; 3-4k: proteins involved in the intracellular communication network;

  • Quotable Quotes:
    • Westworld: Survival is just another loop.
    • Leo Laporte: All bits should be treated equally. 
    • Paul Horner: Honestly, people are definitely dumber. They just keep passing stuff around. Nobody fact-checks anything anymore
    • @WSJ: "A conscious effort by a nation-state to attempt to achieve a specific effect" NSA chief on WikiLeaks 
    • encoderer: For the saas business I run, Cronitor, aws costs have consistently stayed around 10% total MRR. I think there are a lot of small and medium sized businesses who realize a similar level of economic utility.
    • @joshtpm: 1: Be honest: Facebook and Twitter maxed out election frenzy revenues and cracked down once the cash was harvested. Also once political ...
    • boulos: As a counter argument: very few teams at Google run on dedicated machines. Those that do are enormous, both in the scale of their infrastructure and in their team sizes. I'm not saying always go with a cloud provider, I'm reiterating that you'd better be certain you need to.
    • Renegade Facebook Employees: Sadly, News Feed optimizes for engagement. As we've learned in this election, bullshit is highly engaging. A bias towards truth isn't an impossible goal.
    • Russ White: The bottom line is this—don’t be afraid to use DNS for what it’s designed for in your network...We need to learn to treat DNS like it’s a part of the IP stack, rather than something that “only the server folks care about,” or “a convenience for users we don’t really take seriously for operations.”
    • Wizart_App: It's always about speed – never about beauty.
    • Michael Zeltser: MapReduce is just too low level and too dumb. Mixing complex business logic with MapReduce low level optimization techniques is asking too much. 
    • Michael Zeltser: One thing that always bugged me in MapReduce is its inability to reason about my data as a dataset. Instead you are forced to think in single key-value pair, small chunk, block, split, or file. Coming from SQL, it felt like going backwards 20 years. Spark has solved this perfectly.  
    • Guillaume Sachot: I can confirm that I've seen high availability appliances fail more often than non-clustered ones. And it's not limited to firewalls that crash together due to a bug in session sharing, I have noticed it for almost anything that does HA: DRBD instances, Pacemaker, shared filesystems...
    • Albert-Laszlo Barabasi: The bottom line is: Brother, never give up. When you give up, that’s when your creativity ends
    • SpaceX: According to a transcript received by Space News, he argued that the supercooled liquid oxygen that SpaceX uses as propellant actually became so cold that it turned into a solid. And that’s not supposed to happen.
    • Murat: Safety is a system-level property, unit testing of components is not enough.
    • @alexjc: 1/ As deep learning evolves as a discipline, it's becoming more about architecting highly complex systems that leverage data & optimization.
    • btgeekboy: Indeed. If there's one thing I've learned in >10 years of building large, multi-tenant systems, it's that you need the ability to partition as you grow. Partitioning eases growth, reduces blast radius, and limits complexity.
    • @postwait: Monitoring vendors that say they support histograms and only support percentiles are lying to their customers. Full stop. #NowYouKnow
    • @crucially: Fastly hit 5mm request per seconds tonight with a cache hit ratio of 96% -- proud of the team.
    • Rick Webb: Just because Silicon Valley has desperately wanted to believe for twenty years that communities can self-police does not make it true. 
    • Cybiote: Humans can additionally predict other agents and other things about the world based on intuitive physics. This is why they can get on without the huge array of sensors and cars cannot. Humans make up for the lack of sensors by being able to use the poor quality data more effectively. To put this in perspective, 8.75 megabits / second is estimated to pass through the human retina but only on the order of a 100 bits is estimated to reach conscious attention.
    • David Rand: What I found was consistent with the theory and the initial results: in situations where there're no future consequences, so it's in your clear self-interest to be selfish, intuition leads to more cooperation than deliberation.   
    • SpaceX: With deployment of the first 800 satellites, SpaceX will be able to provide widespread U.S. and international coverage for broadband services. Once fully optimized through the Final Deployment, the system will be able to provide high bandwidth (up to 1 Gbps per user), low latency broadband services for consumers and businesses in the U.S. and globally.
    • Steve Gibson: Anyone can make a mistake [regarding Pixel ownage], and Google is playing security catch up. But what they CAN and SHOULD be proud of is that they had the newly discovered problem patched within 24 hours!
    • dragonnyxx: Calling a 10,000 line program a "large project" is like calling dating someone for a week a "long-term relationship".
    • Brockman: I have three friends: confusion, contradiction, and awkwardness. That’s how I try to meander through life. Make it strange.
    • Martin Sústrik: In this particular case, almost everybody will agree that adding the abstraction was not worth it. But why? It was a tradeoff between code duplication and increased level of abstraction. But why would one decide that the well known cost of code duplication is lower than somewhat fuzzy "cost of abstraction"?

  • Biomedical engineering might be an area where a lot of tech people interested in real-time monitoring and control at scale could be of help. Hr2: Wireless Spinal Tech, Climate Policy, Moon Impact. Researchers want to use wireless technology to record 100k+ neurons simultaneously, 24x7, for long periods of time. The goal is to use this data to control high dimensional systems, like when reaching and grasping, where the shoulder, elbow, hand, wrist, and fingers must all work together in real-time. Sound familiar?

  • Making the Switch from Node.js to Golang. Digg switched an S3-heavy service from Node to Go and: Our average response time from the service was almost cut in half, our timeouts (in the scenario that S3 was slow to respond) were happening on time, and our traffic spikes had minimal effects on the service...With our Golang upgrade, we are easily able to handle 200 requests per minute and 1.5 million S3 item fetches per day. And those 4 load-balanced instances we were running Octo on initially? We’re now doing it with 2.

  • Not a lie. The best explanation to resilience. Resilience is how you maintain the self-organizing capacity of a system. Great explanation. The way you maintain the resilience of a system is by letting it probe its boundaries. The only way to make a forest resilient to fire is to burn it. Efficiency is riding as close as possible to the boundary by using feedback to keep the system self-organizing.

  • Facebook does a lot of work making their mobile apps work over poor networks. One change they are making is Client-side ranking to more efficiently show people stories in feed. Previously, all story ranking occurred on the server and entries paged up to the device and displayed in order. The problem with this approach is that an article's rank could change while media is being loaded. Now a pool of stories is kept on the client and as new stories are added they are reranked and shown to users in rank order. This approach adapts well to slow networks because slow-loading content is temporarily down-ranked while it loads.
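The client-side ranking idea in the last item can be sketched roughly as follows. This is not Facebook's implementation: the Story fields, the ClientPool class, and the LOADING_PENALTY factor are assumptions made up for illustration. The sketch only shows the general shape of keeping a pool on the device, re-sorting it whenever something changes, and temporarily down-ranking stories whose media has not finished loading.

```python
from dataclasses import dataclass


@dataclass
class Story:
    story_id: str
    score: float        # hypothetical relevance score sent by the server
    media_loaded: bool  # whether the story's media is ready to display


# Assumed penalty: a story that is still loading keeps half its score.
LOADING_PENALTY = 0.5


def effective_score(story: Story) -> float:
    """Score used for ordering; stories still loading are down-ranked."""
    if story.media_loaded:
        return story.score
    return story.score * LOADING_PENALTY


class ClientPool:
    """Pool of candidate stories kept on the device."""

    def __init__(self) -> None:
        self._stories: dict[str, Story] = {}

    def add(self, story: Story) -> None:
        self._stories[story.story_id] = story

    def mark_loaded(self, story_id: str) -> None:
        self._stories[story_id].media_loaded = True

    def ranked(self) -> list[Story]:
        # Re-rank on every read so new or newly loaded stories move up.
        return sorted(self._stories.values(), key=effective_score, reverse=True)


pool = ClientPool()
pool.add(Story("video", score=0.9, media_loaded=False))
pool.add(Story("text", score=0.6, media_loaded=True))
before = [s.story_id for s in pool.ranked()]  # "text" first while the video loads
pool.mark_loaded("video")
after = [s.story_id for s in pool.ranked()]   # "video" first once it is ready
```

A multiplicative penalty is just one possible choice; any rule that lowers the rank of unloaded content until it is ready would fit the description.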

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Pixel Security: Better, Faster, Stronger

Android Developers Blog - Fri, 11/18/2016 - 06:07

Posted by Paul Crowley, Senior Software Engineer and Paul Lawrence, Senior Software Engineer

Encryption protects your data if your phone falls into someone else's hands. The new Google Pixel and Pixel XL are encrypted by default to offer strong data protection, while maintaining a great user experience with high I/O performance and long battery life. In addition to encryption, the Pixel phones debuted running the Android Nougat release, which has even more security improvements.

This blog post covers the encryption implementation on Google Pixel devices and how it improves the user experience, performance, and security of the device.

File-Based Encryption Direct Boot experience

One of the security features introduced in Android Nougat was file-based encryption. File-based encryption (FBE) means different files are encrypted with different keys that can be unlocked independently. FBE also separates data into device encrypted (DE) data and credential encrypted (CE) data.

Direct boot uses file-based encryption to allow a seamless user experience when a device reboots by combining the unlock and decrypt screen. For users, this means that applications like alarm clocks, accessibility settings, and phone calls are available immediately after boot.

Enhanced with TrustZone® security

Modern processors provide a means to execute code in a mode that remains secure even if the kernel is compromised. On ARM®-based processors this mode is known as TrustZone. Starting in Android Nougat, all disk encryption keys are stored encrypted with keys held by TrustZone software. This secures encrypted data in two ways:

  • TrustZone enforces the Verified Boot process. If TrustZone detects that the operating system has been modified, it won't decrypt disk encryption keys; this helps to secure device encrypted (DE) data.
  • TrustZone enforces a waiting period between guesses at the user credential, which gets longer after a sequence of wrong guesses. With 1624 valid four-point patterns and TrustZone's ever-growing waiting period, trying all patterns would take more than four years. This improves security for all users, especially those who have a shorter and more easily guessed pattern, PIN, or password.
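The figure of 1624 four-point patterns can be checked with a short enumeration. The sketch below assumes the usual rules of the 3x3 Android pattern grid (each dot used at most once, and a stroke may pass over an intermediate dot only if that dot was already visited); the function name count_patterns is made up for the example. The post does not give TrustZone's actual delay schedule, so the last lines only derive the average wait per guess that "more than four years" implies, nothing more.

```python
def count_patterns(length: int) -> int:
    """Count unlock patterns of the given length on a 3x3 grid."""

    def midpoint(a: int, b: int):
        # Dots are numbered 0..8 row by row. A dot lies exactly between
        # a and b only when both the row sum and column sum are even.
        ra, ca = divmod(a, 3)
        rb, cb = divmod(b, 3)
        if (ra + rb) % 2 == 0 and (ca + cb) % 2 == 0:
            return ((ra + rb) // 2) * 3 + (ca + cb) // 2
        return None

    def extend(current: int, visited: frozenset, remaining: int) -> int:
        if remaining == 0:
            return 1
        total = 0
        for nxt in range(9):
            if nxt in visited:
                continue
            mid = midpoint(current, nxt)
            if mid is not None and mid not in visited:
                continue  # would jump over a dot that was not yet used
            total += extend(nxt, visited | {nxt}, remaining - 1)
        return total

    return sum(extend(s, frozenset({s}), length - 1) for s in range(9))


pattern_count = count_patterns(4)
print(pattern_count)  # 1624, matching the figure quoted above

# Implied average wait if exhausting all patterns takes four years
# (the real schedule grows over time and is not specified in the post).
hours_in_four_years = 4 * 365 * 24
average_wait_hours = hours_in_four_years / pattern_count
print(f"{average_wait_hours:.1f} hours per guess on average")
```

In other words, the growing delay has to average out to a bit under a day per guess for the four-year estimate to hold.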
Encryption on Pixel phones

Protecting different folders with different keys required a distinct approach from full-disk encryption (FDE). The natural choice for Linux-based systems is the industry-standard eCryptFS. However, eCryptFS didn't meet our performance requirements. Fortunately one of the eCryptFS creators, Michael Halcrow, worked with the ext4 maintainer, Ted Ts'o, to add encryption natively to ext4, and Android became the first consumer of this technology. ext4 encryption performance is similar to full-disk encryption, which is as performant as a software-only solution can be.

Additionally, Pixel phones have an inline hardware encryption engine, which gives them the ability to write encrypted data at line speed to the flash memory. To take advantage of this, we modified ext4 encryption to use this hardware by adding a key reference to the bio structure, within the ext4 driver before passing it to the block layer. (The bio structure is the basic container for block I/O in the Linux kernel.) We then modified the inline encryption block driver to pass this to the hardware. As with ext4 encryption, keys are managed by the Linux keyring. To see our implementation, take a look at the source code for the Pixel kernel.

While this specific implementation of file-based encryption using ext4 with inline encryption benefits Pixel users, FBE is available in AOSP and ready to use, along with the other features mentioned in this post.

Categories: Programming

Vanity Metrics in Software Organizations


Measurement and metrics are lightning rods for discussion and argument in software development.  One of the epithets used to disparage measures and metrics is the term ‘vanity metric’. Eric Ries, author of The Lean Startup, is often credited with coining the term ‘vanity metric’ to describe metrics that make people feel good, but are less useful for making decisions about the business.  For example, I could measure Twitter followers or I could measure the number of blog reads or podcast listens that come from Twitter. The count of raw Twitter followers is a classic vanity metric.

In order to shortcut the discussion (and reduce the potential vitriol) of whether a measure or metric can be classified as actionable or vanity I ask four questions:

  1. Are there mechanisms in place to ensure the measure isn’t game-able? Does the metric reflect how work is being done, or are guidelines in place so it can’t easily be manipulated without changing the outcome of the process?  For example, I can buy 10,000 Twitter followers, but adding these users will not translate into blog readers or podcast listeners, which is the important output.
  2. Do changes in the measure or metric correlate to changes in the business outcome? For example, a measure of automated code coverage is typically positively correlated with product quality and with the amount of value delivered. If a metric is not correlated, there is a strong possibility that the metric is a vanity metric.
  3. Does the metric provide an understanding of what is happening within the process being measured without confusion or ambiguity?  For example, if a team measures the number of test cases run and that number increases (or decreases), what would the change mean?
  4. When the measure shows a change, can and will you be able to take action?  If criterion 3 is true and you understand the signal being sent, can and will your organization do something about it? If true, can a decision be made? For example, an organization I recently met with measured overtime amongst developers.  Coders and testers had chronically put in 8 hours of overtime each week for over a year. The organization either could not use the data to make a change or chose not to use the data; this was a vanity metric.

If you can answer the four questions with a yes, the metric will be actionable.  A no to any of the four questions generally indicates a vanity metric. Cutting out vanity metrics provides a better focus on the measures that provide value.  

Next – one person’s vanity metric is another’s actionable metric


Categories: Process Management

Moving to Google Sign-In for a better user experience and higher conversion rates

Google Code Blog - Thu, 11/17/2016 - 22:33
Posted by Steven Soneff, Product Manager

We're always working to make Google Sign-In a better experience for developers and end users. Over the last year, we've simplified the user experience by reducing the default amount of information requested from the user and updated the branding. Major apps like The Guardian have taken advantage of these updates and we now see over twice as many people use Google Sign-In with their app.

The more streamlined experience begins with updated sign-in buttons that show the standard Google logo, reflecting our new logo design. Furthermore, Google Sign-In now works for all users, not just those with a G+ profile. The consent screen has been redesigned so that the user sees inline the information that will be provided to the app (name, email, and profile photo) on Android and the web, and soon on iOS, too.

With these improvements in place, we are now announcing the migration from our Google+ Sign-In product to the new model. Making this change for your app is simple: just use the latest libraries with default sign-in configuration, or replace the "https://www.googleapis.com/auth/plus.login" scope with "profile" and update branding of the Google Sign-In button (your existing users will not be asked to sign in again).
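For apps that build their scope list by hand, the scope swap described above might look like the sketch below. The helper migrate_scopes is a hypothetical function written for this example, not part of any Google SDK; the two scope strings are the ones quoted in the post. It only rewrites a list of scope strings and does not contact any service.

```python
LEGACY_SCOPE = "https://www.googleapis.com/auth/plus.login"
REPLACEMENT_SCOPE = "profile"


def migrate_scopes(scopes):
    """Return the scope list with the Google+ login scope swapped for
    "profile", keeping the original order and dropping duplicates."""
    migrated = []
    for scope in scopes:
        new_scope = REPLACEMENT_SCOPE if scope == LEGACY_SCOPE else scope
        if new_scope not in migrated:
            migrated.append(new_scope)
    return migrated


requested = migrate_scopes(["email", LEGACY_SCOPE])
print(requested)
```

Dropping duplicates covers the case where an app already requested "profile" alongside the legacy scope.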

For developers who continue to use Google+ Sign-In scopes, expect some changes in behavior. New users going through the older sign-in flow will no longer be asked to share social graph data with your app. In the upcoming versions of SDKs on all platforms, we'll replace the Google+ branded assets with the new Google branding. So, if your app uses the default button, expect a new look and improved user experience with Google Sign-In. And after January 2017, calling our Plus People or Games Players APIs for users who had previously granted you access may begin returning empty results.

With these changes, we are deprecating the Plus People API. You can read the deprecation notes here: Android, Web. If your app needs social information and more extensive profile data, we have better alternatives for you. The new contacts-based People API provides a rich set of users' connections. To enhance the distribution of your app through the social graphs of your app's userbase, use the recently launched Firebase Invites, a cross-platform solution for sending personalized email and SMS invitations. On Android, you may also get rich cloud and device-based Contacts data from the Contacts Provider.

In addition to these user facing changes, we've also overhauled our Identity/authentication APIs to simplify implementation on both the client and server. Please check out our previous blog posts if you missed them:

Categories: Programming

Is Your Organization Killing Your Software?

From the Editor of Methods & Tools - Thu, 11/17/2016 - 16:19
When asked “What is your architecture?” most people immediately respond with how their software is laid out and what their plans are for improving parts of it. Rarely does anybody really think through their team and organizational architecture, and even more rarely do people understand how that may fundamentally impact how the software gets written […]