Skip to content

Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools

Subscribe to Methods & Tools
if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!

Database

Why does it work in staging but not in production?

Agile Testing - Grig Gheorghiu - Mon, 04/07/2014 - 23:07
This is a question that I am sure was faced by every developer and operation engineer out there. There can be multiple answers to this question, and I'll try to offer some of the ones we arrived at, having to do mainly with our Chef workflow, but that can be applied I think to any other configuration management tool.

1) A Chef cookbook version in staging is different from the version in production

This is a common scenario, and it's supposed to work this way. You do want to test out new versions of your cookbooks in staging first, then update the version of the cookbook in production.

2) A feature flag is turned on in staging but turned off in production

We have Chef attributes defined in attributes/default.rb that serve as feature flags. If a certain attribute is true, some recipe code or template section gets included which wouldn't be included if the attribute were false. The situation can occur where a certain attribute is set to true in the staging environment but is set to false in the production environment, at which point things can get out of sync. Again, this is expected, as you do want to test new features out in staging first, but don't forget to turn them on in production at some point.

3) A block of code or template is included in staging but not in production

We had this situation very recently. Instead of using attributes as feature flags, we were directly comparing the environment against 'stg' or 'prod' inside an if block in a template, and only including that template section if the environment was 'stg'. So things were working perfectly in staging, but mysteriously the template section wasn't even there in production. An added difficulty was that the template in question was peppered with non-indented if blocks, so it took us a while to figure out what was going on.

Two lessons here:

a) Make your templates readable by indenting code blocks.

b) Use attributes as feature flags, and don't compare directly against the current environment. This way, it's easier to always look at the default attribute file and see if a given feature flag is true or false.

4) A modification is made to the cookbook version in production directly on the Chef server

I blogged about this issue in the past. Suppose you have an environments file that pins a given cookbook (let's designate it as cookbook C) to 1.0.1 in staging and to 1.0.0 in production. You want to upgrade production to 1.0.1, because it was tested in staging and it worked fine. However, instead of i) modifying the environments/prod.rb file and pinning the cookbook C to 1.0.1, ii) updating the Chef server via "knife environment from file environments/prod.rb" and iii) committing your changes in git, you modify the version of the cookbook C directly on the Chef server with "knife environment edit prod".

Then, the next time you or somebody else modifies environments/prod.rb to bump up another cookbook to the next version, the version of cookbook C in that file is still 1.0.0, so when you upload environments/prod.rb to the Chef server, it will downgrade cookbook C from 1.0.1 to 1.0.0. Chaos will ensue the next time chef-client runs on the nodes that have recipes from cookbook C. Production will be broken, while staging will still happily work.

Here are 2 other scenarios not related directly to staging vs production, but instead having the potential to break production altogether.

You forget to upload the new version of the cookbook to the Chef server

You make all of your modifications to the cookbook, you commit your code to git, but for some reason you forget to upload the cookbook to the Chef server. Particularly if you keep the same version of the cookbook that is in staging (and possibly in production), then your modifications won't take effect and you may spend some quality time pulling your hair.

You upload a cookbook to the Chef server without bumping its version

There is another, even worse, scenario though: you do upload your cookbook to the Chef server, but you realize that you didn't bump up the version number compared to what is currently pinned to production. As a consequence, all the nodes in production that have recipes from that cookbook will be updated the next time they run chef-client. That's a nasty one. It does happen. So make sure you pay attention to your cookbook versioning process and stick to it!

More on haproxy geolocation detection and CDN services

Agile Testing - Grig Gheorghiu - Fri, 03/07/2014 - 19:38
In a previous blog post I described a method to do geolocation detection with haproxy. The country detection was based on the user's client IP. However, if you have a CDN service in front of your load balancer, then the source IPs will all belong to the CDN server farm, and the closest such server to an end user may not be in the same country as the user. Fortunately, CDN services generally pass that end user IP address in some specific HTTP header, so you can still perform the geolocation detection by inspecting that header. For example, Akamai passes the client IP in a header called True-Client-IP.

In our haproxy.cfg rules detailed below we wanted to handle both the case where our load balancer is hit directly by end users (in case we bypass any CDN service), and the case where the load balancer is hit via a CDN.

1) We set our own HTTP headers containing the country code as detected by geolocation based on a) the source IP (this is so we can still look at the source IP in case we bypass the CDN and hit our load balancer directly) and b) the specific CDN header containing the actual client IP (True-Client-IP in the case of Akamai):

http-request set-header X-Country-Src %[src,map_ip(/etc/haproxy/geolocation.txt)]

http-request set-header X-Country-Akamai %[req.hdr_ip(True-Client-IP,-1),map_ip(/etc/haproxy/geolocation.txt)]

2) We set an ACL that is true if we detect the presence of the True-Client-IP header, which tells us that we are hit via Akamai:
acl acl_akamai_true_client_ip_header_exists req.hdr(True-Client-IP) -m found

3) We set an ACL that is true if we detect that the country of origin (obtained via Akamai's True-Client-IP) is US:

acl acl_geoloc_akamai_true_client_ip_us req.hdr(X-Country-Akamai) -m str -i US
4) We set an ACL that is true if we detect that the country of origin (obtained via the source IP of the client) is US:
acl acl_geoloc_src_us req.hdr(X-Country-Src) -m str -i US
5) Based on the ACLs defined above, we send non-US traffic to a specific backend, IF we are being hit via Akamai (ACL #2) AND we detected that the country of origin is non-US (negation of ACL #3) OR if we detected that the country of origin if non-US via the source IP (negation of ACL #4):

use_backend www-backend-non-us if acl_akamai_true_client_ip_header_exists !acl_geoloc_akamai_true_client_ip_us or !acl_geoloc_src_us

(note that the AND is implicit in the way haproxy looks at combinations of ACLs)
6) We also we an HTTP header called X-Country which our application inspects in order to perform country-specific logic. We first set this header to the X-Country-Src header set in rule #1, but we override it if we are getting hit via Akamai:
http-request set-header X-Country %[req.hdr(X-Country-Src)]
http-request set-header X-Country %[req.hdr(X-Country-Akamai)] if acl_akamai_true_client_ip_header_exists

This looks pretty complicated, but it works well :-)


Example of Chef workflow

Agile Testing - Grig Gheorghiu - Fri, 02/28/2014 - 21:15
Here is a quick example of a Chef workflow that has been working for us. It can be easily improved on, especially around testing, but it's a good foundation.

1) Put your chef-repo on Github.
2) When you want to modify a cookbook, do a git pull to get the latest version of the cookbook.
3) Modify the cookbook.
4) Check your environments (I'll assume staging and production for now, to keep it simple) to see what version of the cookbook is used in production vs staging. Let's assume both staging and production environments use the latest version of the cookbook, say 0.1.
5) Modify metadata.rb and bump up the version of the cookbook to 0.2.
6) Modify the staging environment file (for example environments/stg.rb) and pin the cookbook you modified to version 0.2. Make sure the production environment is still pinned to 0.1.
7) Update the staging environment on the Chef server via: 'knife environment from file environments/stg.rb'
8) Upload the new version of the cookbook (0.2) to the Chef server via: 'knife cookbook upload mycookbook' (it should report version 0.2 after the upload)
9) Run chef-client on a staging box that uses the cookbook you modified. Check that everything looks good.
10) Assuming everything looks good in staging, modify the production environment file (for example environments/prod.rb) and pin the cookbook you modified to the new version 0.2.
11) Update the production environment on the Chef server via: 'knife environment from file environments/prod.rb'.
12) Run chef-client on a prod box and check that everything is OK. If it looks good, either let chef-client run by itself on all prod boxes, or run chef-client manually to force the change.
13) Commit your coobook and environment changes into git and push to Github.

Note that there is the possibility of screw-ups if somebody forgets step #13. For this reason, I usually am double careful and check especially my local version of the environment files (stg.rb and prod.rb) against what is actually running on the Chef server. I run 'knife environment show stg' and compare the result to stg.rb. I also run 'knife environment show prod' and compare the result to prod.rb. Only if they both look good do I modify my local copies of stg.rb and prod.rb and then upload them to the Chef server. We've had issues in the past with changes that were made to the Chef server directly (via 'knife environment edit') that got overwritten when somebody uploaded their version of the environment file that contained an older version of the given cookbook. For this reason I don't recommed making changes directly on the Chef server by editing roles, environments, etc, but instead making all changes on your local files, then uploading those files to Chef and also committing those changes to Github.

As I said in the beginning, there is the opportunity to run various testing tools (at a minimum rubocop and Foodcritic) on your cookbook before uploading it to the Chef server. But that is for another post.

Geolocation detection with haproxy

Agile Testing - Grig Gheorghiu - Tue, 01/14/2014 - 00:33
A useful feature for a web application is the ability to detect the user's country of origin based on their source IP address. This used not to be possible in haproxy unless you applied Cyril Bonté's geolocation patches (see the end of this blog post for how exactly to do that if you don't want to live on the bleeding edge of haproxy). However, the latest development version of haproxy (which is 1.5-dev21 at this time) contains geolocation detection functionality.

Here's how to use the geolocation detection feature of haproxy:

1) Generate text file which maps IP address ranges to ISO country codes

This is done using Cyril's haproxy-geoip utility, which is available in his geolocation patches. Here's how to locate and run this utility:
  • clone patch git repo: git clone https://github.com/cbonte/haproxy-patches.git
  • the haproxy-geoip script is now available in haproxy-patches/geolocation/tools
    • for the script to run, you need to have the funzip utility available on your system (it's part of the unzip package in Ubuntu)
    • you also need the iprange binary, which you can 'make' from its source file available in the haproxy-1.5-dev21/contrib/iprange directory; once you generate the binary, copy it somewhere in your PATH so that haproxy-geoip can locate it
  • run haproxy-geoip, which prints its output (IP ranges associated to ISO country codes) to stdout, and capture stdout to a file: haproxy-geoip > geolocation.txt
  • copy geolocation.txt to /etc/haproxy
2) Set custom HTTP header based on geolocation
For this, haproxy provides the map_ip function, which locates the source IP (the predefined 'src' variable in the line below) in the IP range in geolocation.txt and returns the ISO country code. We assign this country code to the custom X-Country HTTP header:
http-request set-header X-Country %[src, map_ip(/etc/haproxy/geolocation.txt)]
If you didn't want to map the source IP to a country code, but instead wanted to inspect the value of an HTTP header such as X-Forwarded-For, you could do this:
http-request set-header X-Country %[req.hdr_ip(X-Forwarded-For,-1), map_ip(/etc/haproxy/geolocation.txt)]
3) Use geolocation in ACLs
Let's assume that if the country detected via geolocation is not US, then you want to send the user to a different backend. You can do that with an ACL. Note that we compare the HTTP header X-Country which we already set above to the string 'US' using the '-m str' string matching functionality of haproxy, and we also specify that we want a case insensitive comparison with '-i US':
acl acl_geoloc_us req.hdr(X-Country) -m str -i USuse_backend www-backend-non-us if !acl_geoloc_us
If you didn't want to set the custom HTTP header, you could use the map_ip function directly in the definition of the ACL, like this:
acl acl_geoloc_us %[src, map_ip(/etc/haproxy/geolocation.txt)] -m str -i USuse_backend www-backend-non-us if !acl_geoloc_us
Speaking of ACLs, here's an example of defining ACLs based on the existence of a cookie and based on the value of the cookie then choosing a backend based on those ACLs:
acl acl_cookie_country req.cook_cnt(country_code) eq 1acl acl_cookie_country_us req.cook(country_code) -m str -i USuse_backend www-backend-non-us if acl_cookie_country !acl_cookie_country_us
And now for something completely different...which is what I mentioned in the beginning of this post: 
How to use the haproxy geolocation patches with the current stable (1.4) version of haproxy
a) Patch haproxy source code with gelocation patches, compile and install haproxy:
  • clone patch git repo: git clone https://github.com/cbonte/haproxy-patches.git
  • change to haproxy-1.4.24 directory
  • copy haproxy-1.4-geolocation.patc to the root of haproxy-1.4.24 
  • apply the patch: patch -p1 < haproxy-1.4-geolocation.patch
  • make clean
  • make TARGET=linux26
  • make install
b) Generate text file which maps IP address ranges to ISO country codes
  • install funzip: apt-get install unzip
  • create iprange binary
    • cd haproxy-1.4.24/contrib/iprange
    • make
    • the iprange binary will be created in the same folder. copy that to /usr/local/sbin
  • haproxy-geoip is located here: haproxy-patches/geolocation/tools
  • haproxy-geoip > geolocation.txt
  • copy geolocation.txt to /etc/haproxy 
c) Obtain country code based on source IP and use it in ACL
This is done via the special 'geolocate' statement and the 'geoloc' variable added to the haproxy configuration syntax by the geolocation patch:

geolocate src /etc/haproxy/geolocation.txt
acl acl-au geoloc eq AU
use_backend www-backend-au if acl-au

If instead of the source IP you want to map the value of the X-Forwarded-For header to a country, use:
geolocate hdr_ip(X-Forwarded-For,-1) /etc/haproxy/geolocation.txt

If you wanted to redirect to another location instead of using an ACL, use:
redirect location http://some.location.example.com:4567 if { geoloc AU }

That's it for now. I want to thank Cyril Bonté, the author of the geolocation patches, and Willy Tarreau, the author of haproxy, for their invaluable help and their amazingly fast responses to my emails. It's a pleasure to deal with such open source developers passionate about the software they produce.  Also thanks to my colleagues Zmer Andranigian for working on getting version 1.4 of haproxy to work with geolocation, and Jeff Roberts for working on getting 1.5-dev21 to work.
One last thing: haproxy-1.5-dev21 has been very stable in production for us, but of course test it thoroughly before deploying it in your environment.