
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools

Subscribe to Methods & Tools
if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!

Programming

5 Reasons why you should test your code

Xebia Blog - Mon, 07/27/2015 - 09:37

It is just like in mathematics class: when I had to prove Thales’ theorem, I wrote “Can’t you see that B has a right angle?! Q.E.D.” – but the teacher still gave me an F.

You want to make things work, right? So you start programming until your feature is implemented. When it is implemented, it works, so you do not need any tests. You want to proceed and make more cool features.

Suddenly feature 1 breaks, because you did something weird in some service that is reused all over your application. Ok, let’s fix it, keep refreshing the page until everything is stable again. This is the point in time where you regret that you (or even better, your teammate) did not write tests.

In this article I give you 5 reasons why you should write them.

1. Regression testing

The scenario described in the introduction is a typical example of a regression bug. Something works, but it breaks when you are looking the other way.
If you had tests with 100% code coverage, a red error would have appeared in the console or – even better – a siren would have gone off in the room where you are working.

Although there are some misconceptions about coverage, it at least tells others that there is a fully functional test suite. And it may give you a high grade when an audit company like SIG inspects your software.

100% Coverage feels so good

100% code coverage does not mean that you have tested everything.
It only means that the test suite is implemented in such a way that it calls every line of the tested code; it says nothing about the assertions made during the test run. If you want to measure whether your specs make a fair amount of assertions, you have to do mutation testing.

This works as follows.

An automated task runs the test suite once. Then some parts of your code are modified – mainly conditions flipped, loops made shorter or longer, etc. – and the test suite is run a second time. If tests fail after the modifications have been made, there is an assertion covering that case, which is good.
However, 100% coverage does feel really good if you are an OCD-person.
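
To make this concrete, here is a minimal sketch of the idea in Python (the "mutant" is written by hand purely for illustration – real mutation testing tools generate and run the mutants for you):

# Original implementation and a hand-made "mutant" with a flipped condition.
def is_adult(age):
    return age >= 18

def is_adult_mutant(age):
    return age > 18  # the mutation: >= flipped to >

def run_suite(impl):
    # 100% line coverage either way; only the boundary assertion kills the mutant.
    assert impl(30) is True
    assert impl(10) is False
    assert impl(18) is True

if __name__ == "__main__":
    run_suite(is_adult)        # passes
    try:
        run_suite(is_adult_mutant)
    except AssertionError:
        print("Mutant killed: the suite really asserts on the boundary case")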

The better your test coverage and assertion density are, the higher the probability of catching regression bugs. Especially as an application grows, you may encounter a lot of regression bugs during development – and it is good to catch them there rather than in production.

Suppose that a form shows a funny easter egg when the filled-in birthdate is 06-06-2006 and the line of code responsible for this behaviour is hidden in a complex method. A fellow developer may change that line – not because he is not funny, but simply because he does not know about it. A failing test notifies him immediately that he is removing your easter egg, while without a test you would only find out about the removal two years later.

Still, every application contains bugs you are unaware of. When an end user tells you about a broken page, you may find out that the link he clicked on was generated with some missing information, i.e. users//edit instead of users/24/edit.

When you find a bug, first write a (failing) test that reproduces it, then fix the bug. That particular bug will never come back. You win.
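
A small Python sketch of that workflow, using a hypothetical build_edit_link helper (not from the original article) that once silently produced the broken users//edit link:

# Hypothetical helper that used to produce "users//edit" when the id was missing.
def build_edit_link(user_id):
    if user_id is None:
        raise ValueError("user_id is required to build an edit link")
    return "users/%s/edit" % user_id

# Step 1: a test that reproduces the reported bug (it failed before the fix above).
def test_missing_user_id_is_rejected():
    try:
        build_edit_link(None)
        assert False, "expected a ValueError for a missing user id"
    except ValueError:
        pass

# Step 2: a test that pins down the expected link format.
def test_link_contains_the_user_id():
    assert build_edit_link(24) == "users/24/edit"

if __name__ == "__main__":
    test_missing_user_id_is_rejected()
    test_link_contains_the_user_id()
    print("regression covered")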

2. Improve the implementation via new insights

“Premature optimization is the root of all evil” is something you hear a lot. That does not mean you should implement your solution pragmatically, without any code reuse.

Good software craftsmanship is not only about solving a problem effectively; it is also about maintainability, durability, performance and architecture. Tests can help you with this. They force you to slow down and think.

If you start writing your tests and you have trouble with it, this may be an indication that your implementation can be improved. Furthermore, your tests make you think about input and output, corner cases and dependencies. So do you think that you understand all aspects of that super method you wrote that can handle everything? Write tests for it and better code is guaranteed.
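
For example (a deliberately naive Python sketch, not taken from the article), simply trying to write specs for the corner cases shows where a "handles everything" method is underspecified:

# A naive parser that "works" for the happy path.
def parse_birthdate(value):
    day, month, year = (int(part) for part in value.split("-"))
    return day, month, year

# Writing the specs forces decisions the implementation quietly skips:
def test_garbage_input_raises():
    try:
        parse_birthdate("not-a-date")
        assert False, "should this raise or return None? the test makes us decide"
    except ValueError:
        pass

def test_impossible_dates_are_currently_accepted():
    # 32-13-2015 parses "successfully" - writing this test exposes the gap.
    assert parse_birthdate("32-13-2015") == (32, 13, 2015)

if __name__ == "__main__":
    test_garbage_input_raises()
    test_impossible_dates_are_currently_accepted()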

Test Driven Development even helps you optimize your code before you write it, but that is another discussion.

3. It saves time, really

The number one excuse for not writing tests is that you do not have time for it, or that your client does not want to pay for it. Writing tests can indeed cost you some time, even if you are using boilerplate code elimination frameworks like Mox.

However, if I ask you whether you would make other design choices if you had the chance (and time) to start over, you would probably say yes. A total codebase refactoring is a ‘no go’ because you cannot oversee which parts of your application will fail. If you still accept the refactoring challenge, it will at least give you a lot of headaches and cost you a lot of time – time that could have been spent writing the tests. But you had no time for writing tests, right? So your crappy implementation stays.

Dilbert bugfix

A bug can always be introduced, even in well-refactored code. How many times did you say to yourself, after a day of hard work, that you spent 90% of your time finding and fixing a nasty bug? You want to write cool applications, not fix bugs.
When you have tested your code well, 90% of the bugs you introduce are caught by your tests. Phew, that saved the day. You can focus on writing cool stuff. And tests.

In the beginning, writing tests can take up more than half of your time, but when you get the hang of it, it becomes second nature. It is important that you write code for the long term. As an application grows, it really pays off to have tests. They save you time, and developing becomes more fun because you are not being blocked by hard-to-find bugs.

4. Self-updating documentation

Writing clean, self-documenting code is one of the main things we adhere to – not only for yourself, especially when you have not seen the code for a while, but also for your fellow developers. We only write comments if a piece of code is particularly hard to understand. Whatever style you prefer, it has to be clear in some way what the code does.

  // Beware! Dragons beyond this point!

Some people like to read the comments, some read the implementation itself, and some read the tests. What I like about the tests, for example when you are using a framework like Jasmine, is that they give a structured overview of all of a method's features. A separate documentation file can be as structured as you want, but the main issue with documentation is that it is never up to date. Developers do not like to write documentation, they forget to update it when a method signature changes, and eventually they stop writing docs altogether.

Developers do not particularly like writing tests either, but tests at least serve more purposes than docs. If you use the test suite as documentation, your documentation is always up to date with no extra effort!
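
In Python you get a similar structured overview from descriptive test names. Here is a hedged sketch with a hypothetical slugify helper – run it with pytest and the test names read like a small feature list:

import re

def slugify(title):
    # Turns an article title into a URL slug.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

class TestSlugify:
    def test_lowercases_the_title(self):
        assert slugify("Hello World") == "hello-world"

    def test_replaces_punctuation_with_dashes(self):
        assert slugify("C'est la vie!") == "c-est-la-vie"

    def test_strips_leading_and_trailing_dashes(self):
        assert slugify("  spaced out  ") == "spaced-out"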

5. It is fun

Nowadays there is no separation between testers and developers: the developers are the testers. People who write good tests are also the best programmers. Actually, your test is also a program. So if you like programming, you should like writing tests.
The reason writing tests may feel non-productive is that it gives you the idea that you are not producing something new.

Is the build red? Fix it immediately!

However, with the modern software development approach, your tests should be an integrated part of your application. The tests can be executed automatically using build tools like Grunt and Gulp. They may run in a continuous integration pipeline via Jenkins, for example. If you are really cool, a new deploy to production is automatically done when the tests pass and everything else is ok. With tests you have more confidence that your code is production ready.

A lot of measurements can be generated as well, like coverage and mutation testing, giving the OCD-oriented developers a big smile when everything is green and the score is 100%.

If the test suite fails, fixing it is the first priority, to keep the codebase in good shape. It takes some discipline, but once you get used to it, you have more fun developing new features and making cool stuff.

Neo4j: From JSON to CSV to LOAD CSV via jq

Mark Needham - Sun, 07/26/2015 - 00:05


In my last blog post I showed how to import a Chicago crime categories & sub categories JSON document using Neo4j’s cypher query language via the py2neo driver. While this is a good approach for people with a developer background, many of the users I encounter aren’t developers and favour using Cypher via the Neo4j browser.

If we’re going to do this we’ll need to transform our JSON document into a CSV file so that we can use the LOAD CSV command on it. Michael pointed me to the jq tool which comes in very handy.

To recap, this is a part of the JSON file:

{
    "categories": [
        {
            "name": "Index Crime",
            "sub_categories": [
                {
                    "code": "01A",
                    "description": "Homicide 1st & 2nd Degree"
                },
            ]
        },
        {
            "name": "Non-Index Crime",
            "sub_categories": [
                {
                    "code": "01B",
                    "description": "Involuntary Manslaughter"
                },
            ]
        },
        {
            "name": "Violent Crime",
            "sub_categories": [
                {
                    "code": "01A",
                    "description": "Homicide 1st & 2nd Degree"
                },
            ]
        }
    ]
}

We want to get one row for each sub category which contains three columns – category name, sub category code, sub category description.

First we need to pull out the categories:

$ jq ".categories[]" categories.json
 
{
  "name": "Index Crime",
  "sub_categories": [
    {
      "code": "01A",
      "description": "Homicide 1st & 2nd Degree"
    },
  ]
}
{
  "name": "Non-Index Crime",
  "sub_categories": [
    {
      "code": "01B",
      "description": "Involuntary Manslaughter"
    },
  ]
}
{
  "name": "Violent Crime",
  "sub_categories": [
    {
      "code": "01A",
      "description": "Homicide 1st & 2nd Degree"
    },
  ]
}

Next we want to create a row for each sub category with the category alongside it. We can use the pipe function to combine the two selectors:

$ jq ".categories[] | {name: .name, sub_category: .sub_categories[]}" categories.json
 
{
  "name": "Index Crime",
  "sub_category": {
    "code": "01A",
    "description": "Homicide 1st & 2nd Degree"
  }
}
...
{
  "name": "Non-Index Crime",
  "sub_category": {
    "code": "01B",
    "description": "Involuntary Manslaughter"
  }
}
...
{
  "name": "Violent Crime",
  "sub_category": {
    "code": "01A",
    "description": "Homicide 1st & 2nd Degree"
  }
}

Now we want to un-nest the sub category:

$ jq ".categories[] | {name: .name, sub_category: .sub_categories[]} | [.name, .sub_category.code, .sub_category.description]" categories.json
 
[
  "Index Crime",
  "01A",
  "Homicide 1st & 2nd Degree"
]
 
[
  "Non-Index Crime",
  "01B",
  "Involuntary Manslaughter"
]
 
[
  "Violent Crime",
  "01A",
  "Homicide 1st & 2nd Degree"
]

And finally let’s use the @csv filter to generate CSV lines:

$ jq ".categories[] | {name: .name, sub_category: .sub_categories[]} | [.name, .sub_category.code, .sub_category.description] | @csv" categories.json
"\"Index Crime\",\"01A\",\"Homicide 1st & 2nd Degree\""
"\"Index Crime\",\"02\",\"Criminal Sexual Assault\""
"\"Index Crime\",\"03\",\"Robbery\""
"\"Index Crime\",\"04A\",\"Aggravated Assault\""
"\"Index Crime\",\"04B\",\"Aggravated Battery\""
"\"Index Crime\",\"05\",\"Burglary\""
"\"Index Crime\",\"06\",\"Larceny\""
"\"Index Crime\",\"07\",\"Motor Vehicle Theft\""
"\"Index Crime\",\"09\",\"Arson\""
"\"Non-Index Crime\",\"01B\",\"Involuntary Manslaughter\""
"\"Non-Index Crime\",\"08A\",\"Simple Assault\""
"\"Non-Index Crime\",\"08B\",\"Simple Battery\""
"\"Non-Index Crime\",\"10\",\"Forgery & Counterfeiting\""
"\"Non-Index Crime\",\"11\",\"Fraud\""
"\"Non-Index Crime\",\"12\",\"Embezzlement\""
"\"Non-Index Crime\",\"13\",\"Stolen Property\""
"\"Non-Index Crime\",\"14\",\"Vandalism\""
"\"Non-Index Crime\",\"15\",\"Weapons Violation\""
"\"Non-Index Crime\",\"16\",\"Prostitution\""
"\"Non-Index Crime\",\"17\",\"Criminal Sexual Abuse\""
"\"Non-Index Crime\",\"18\",\"Drug Abuse\""
"\"Non-Index Crime\",\"19\",\"Gambling\""
"\"Non-Index Crime\",\"20\",\"Offenses Against Family\""
"\"Non-Index Crime\",\"22\",\"Liquor License\""
"\"Non-Index Crime\",\"24\",\"Disorderly Conduct\""
"\"Non-Index Crime\",\"26\",\"Misc Non-Index Offense\""
"\"Violent Crime\",\"01A\",\"Homicide 1st & 2nd Degree\""
"\"Violent Crime\",\"02\",\"Criminal Sexual Assault\""
"\"Violent Crime\",\"03\",\"Robbery\""
"\"Violent Crime\",\"04A\",\"Aggravated Assault\""
"\"Violent Crime\",\"04B\",\"Aggravated Battery\""

The only annoying thing about this output is that all the double quotes are escaped. We can sort that out by passing the ‘-r’ flag when we call jq:

$ jq -r ".categories[] | {name: .name, sub_category: .sub_categories[]} | [.name, .sub_category.code, .sub_category.description] | @csv" categories.json
"Index Crime","01A","Homicide 1st & 2nd Degree"
"Index Crime","02","Criminal Sexual Assault"
"Index Crime","03","Robbery"
"Index Crime","04A","Aggravated Assault"
"Index Crime","04B","Aggravated Battery"
"Index Crime","05","Burglary"
"Index Crime","06","Larceny"
"Index Crime","07","Motor Vehicle Theft"
"Index Crime","09","Arson"
"Non-Index Crime","01B","Involuntary Manslaughter"
"Non-Index Crime","08A","Simple Assault"
"Non-Index Crime","08B","Simple Battery"
"Non-Index Crime","10","Forgery & Counterfeiting"
"Non-Index Crime","11","Fraud"
"Non-Index Crime","12","Embezzlement"
"Non-Index Crime","13","Stolen Property"
"Non-Index Crime","14","Vandalism"
"Non-Index Crime","15","Weapons Violation"
"Non-Index Crime","16","Prostitution"
"Non-Index Crime","17","Criminal Sexual Abuse"
"Non-Index Crime","18","Drug Abuse"
"Non-Index Crime","19","Gambling"
"Non-Index Crime","20","Offenses Against Family"
"Non-Index Crime","22","Liquor License"
"Non-Index Crime","24","Disorderly Conduct"
"Non-Index Crime","26","Misc Non-Index Offense"
"Violent Crime","01A","Homicide 1st & 2nd Degree"
"Violent Crime","02","Criminal Sexual Assault"
"Violent Crime","03","Robbery"
"Violent Crime","04A","Aggravated Assault"
"Violent Crime","04B","Aggravated Battery"

Excellent. The only thing left is to write a header and then direct the output into a CSV file and get it into Neo4j:

$ echo "category,sub_category_code,sub_category_description" > categories.csv
$ jq -r ".categories[] |
         {name: .name, sub_category: .sub_categories[]} |
         [.name, .sub_category.code, .sub_category.description] |
         @csv " categories.json >> categories.csv
$ head -n10 categories.csv
category,sub_category_code,sub_category_description
"Index Crime","01A","Homicide 1st & 2nd Degree"
"Index Crime","02","Criminal Sexual Assault"
"Index Crime","03","Robbery"
"Index Crime","04A","Aggravated Assault"
"Index Crime","04B","Aggravated Battery"
"Index Crime","05","Burglary"
"Index Crime","06","Larceny"
"Index Crime","07","Motor Vehicle Theft"
"Index Crime","09","Arson"
Then we can load the CSV file into Neo4j with the following Cypher:

LOAD CSV WITH HEADERS FROM "file:///Users/markneedham/projects/neo4j-spark-chicago/categories.csv" AS row
MERGE (c:CrimeCategory {name: row.category})
MERGE (sc:SubCategory {code: row.sub_category_code})
ON CREATE SET sc.description = row.sub_category_description
MERGE (c)-[:CHILD]->(sc)

And that’s it!
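
As a footnote, if jq isn't available, the same flattening can be done with a few lines of Python's standard library – a sketch that assumes the categories.json layout shown above:

import csv
import json

with open("categories.json") as json_file:
    document = json.load(json_file)

with open("categories.csv", "w") as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(["category", "sub_category_code", "sub_category_description"])
    # One row per sub category, with the parent category name alongside it.
    for category in document["categories"]:
        for sub_category in category["sub_categories"]:
            writer.writerow([category["name"],
                             sub_category["code"],
                             sub_category["description"]])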

Categories: Programming

Android: Custom ViewMatchers in Espresso

Xebia Blog - Fri, 07/24/2015 - 16:03

Somehow it seems that testing is still treated like an afterthought in mobile development. The introduction of the Espresso test framework in the Android Testing Support Library improved the situation a little bit, but the documentation is limited and it can be hard to debug problems. And you will run into problems, because testing is hard to learn when there are so few examples to learn from.

Anyway, I recently created my first custom ViewMatcher for Espresso and I figured I would like to share it here. I was building a simple form with some EditText views as input fields, and these fields should display an error message when the user entered an invalid input.

Android TextView with Error Message

Android TextView with error message

In order to test this, my Espresso test enters an invalid value in one of the fields, presses "submit" and checks that the field is actually displaying an error message.

@Test
public void check() {
  Espresso
      .onView(ViewMatchers.withId((R.id.email)))
      .perform(ViewActions.typeText("foo"));
  Espresso
      .onView(ViewMatchers.withId(R.id.submit))
      .perform(ViewActions.click());
  Espresso
      .onView(ViewMatchers.withId((R.id.email)))
      .check(ViewAssertions.matches(
          ErrorTextMatchers.withErrorText(Matchers.containsString("email address is invalid"))));
}

The real magic happens inside the ErrorTextMatchers helper class:

public final class ErrorTextMatchers {

  /**
   * Returns a matcher that matches {@link TextView}s based on text property value.
   *
   * @param stringMatcher {@link Matcher} of {@link String} with text to match
   */
  @NonNull
  public static Matcher<View> withErrorText(final Matcher<String> stringMatcher) {

    return new BoundedMatcher<View, TextView>(TextView.class) {

      @Override
      public void describeTo(final Description description) {
        description.appendText("with error text: ");
        stringMatcher.describeTo(description);
      }

      @Override
      public boolean matchesSafely(final TextView textView) {
        return stringMatcher.matches(textView.getError().toString());
      }
    };
  }
} 

The main details of the implementation are as follows. We make sure that the matcher will only match children of the TextView class by returning a BoundedMatcher from withErrorText(). This makes it very easy to implement the matching logic itself in BoundedMatcher.matchesSafely(): simply take the getError() method from the TextView and feed it to the next Matcher. Finally, we have a simple implementation of the describeTo() method, which is only used to generate debug output to the console.

In conclusion, it turns out to be pretty straightforward to create your own custom ViewMatcher. Who knew? Perhaps there is still hope for testing mobile apps...

You can find an example project with the ErrorTextMatchers on GitHub: github.com/smuldr/espresso-errortext-matcher.

Some Questions About The Pomodoro Technique

Making the Complex Simple - John Sonmez - Thu, 07/23/2015 - 15:00

In this episode, I answer a question about the Pomodoro Technique. Full transcript: John:               Hey, John Sonmez from simpleprogrammer.com. I got this question from Kent about the Pomodoro Technique. As you might know I am a big proponent of it. I follow that technique. I have kind of my own version of this with—using the KanbanFlow […]

The post Some Questions About The Pomodoro Technique appeared first on Simple Programmer.

Categories: Programming

Neo4j: Loading JSON documents with Cypher

Mark Needham - Thu, 07/23/2015 - 07:15

One of the most commonly asked questions I get asked is how to load JSON documents into Neo4j and although Cypher doesn’t have a ‘LOAD JSON’ command we can still get JSON data into the graph.

Michael shows how to do this from various languages in this blog post and I recently wanted to load a JSON document that I generated from Chicago crime types.

This is a snippet of the JSON document:

{
    "categories": [
        {
            "name": "Index Crime", 
            "sub_categories": [
                {
                    "code": "01A", 
                    "description": "Homicide 1st & 2nd Degree"
                }
            ]
        }, 
        {
            "name": "Non-Index Crime", 
            "sub_categories": [
                {
                    "code": "01B", 
                    "description": "Involuntary Manslaughter"
                }
            ]
        }, 
        {
            "name": "Violent Crime", 
            "sub_categories": [
                {
                    "code": "01A", 
                    "description": "Homicide 1st & 2nd Degree"
                }
            ]
        }
    ]
}

We want to create the following graph structure from this document:

(a CrimeCategory node connected by a CHILD relationship to each of its SubCategory nodes)

We can then connect the crimes to the appropriate sub category and write aggregation queries that drill down from the category.

To do this we’re going to have to pass the JSON document to Neo4j via its HTTP API rather than through the browser. Luckily there are drivers available for {insert your favourite language here} so we should still be good.

Python is my current goto language so I’m going to use py2neo to load the data in.

Let’s start by writing a simple query which passes our JSON document in and gets it straight back. Note that I’ve updated my Neo4j password to be ‘foobar’ – replace that with your equivalent if you’re following along:

import json
from py2neo import Graph, authenticate
 
# replace 'foobar' with your password
authenticate("localhost:7474", "neo4j", "foobar")
graph = Graph()
 
with open('categories.json') as data_file:
    json = json.load(data_file)
 
query = """
RETURN {json}
"""
 
# Send Cypher query.
print graph.cypher.execute(query, json = json)
$ python import_categories.py
   | document
---+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 1 | {u'categories': [{u'name': u'Index Crime', u'sub_categories': [{u'code': u'01A', u'description': u'Homicide 1st & 2nd Degree'}, {u'code': u'02', u'description': u'Criminal Sexual Assault'}, {u'code': u'03', u'description': u'Robbery'}, {u'code': u'04A', u'description': u'Aggravated Assault'}, {u'code': u'04B', u'description': u'Aggravated Battery'}, {u'code': u'05', u'description': u'Burglary'}, {u'code': u'06', u'description': u'Larceny'}, {u'code': u'07', u'description': u'Motor Vehicle Theft'}, {u'code': u'09', u'description': u'Arson'}]}, {u'name': u'Non-Index Crime', u'sub_categories': [{u'code': u'01B', u'description': u'Involuntary Manslaughter'}, {u'code': u'08A', u'description': u'Simple Assault'}, {u'code': u'08B', u'description': u'Simple Battery'}, {u'code': u'10', u'description': u'Forgery & Counterfeiting'}, {u'code': u'11', u'description': u'Fraud'}, {u'code': u'12', u'description': u'Embezzlement'}, {u'code': u'13', u'description': u'Stolen Property'}, {u'code': u'14', u'description': u'Vandalism'}, {u'code': u'15', u'description': u'Weapons Violation'}, {u'code': u'16', u'description': u'Prostitution'}, {u'code': u'17', u'description': u'Criminal Sexual Abuse'}, {u'code': u'18', u'description': u'Drug Abuse'}, {u'code': u'19', u'description': u'Gambling'}, {u'code': u'20', u'description': u'Offenses Against Family'}, {u'code': u'22', u'description': u'Liquor License'}, {u'code': u'24', u'description': u'Disorderly Conduct'}, {u'code': u'26', u'description': u'Misc Non-Index Offense'}]}, {u'name': u'Violent Crime', u'sub_categories': [{u'code': u'01A', u'description': u'Homicide 1st & 2nd Degree'}, {u'code': u'02', u'description': u'Criminal Sexual Assault'}, {u'code': u'03', u'description': u'Robbery'}, {u'code': u'04A', u'description': u'Aggravated Assault'}, {u'code': u'04B', u'description': u'Aggravated Battery'}]}]}

It’s a bit ugly but we can see that everything’s there! Our next step is to extract each category into its own row. We can do this by accessing the ‘categories’ key in our JSON document and then calling the UNWIND function which allows us to expand a collection into a sequence of rows:

query = """
WITH {json} AS document
UNWIND document.categories AS category
RETURN category.name
"""
$ python import_categories.py
   | category.name
---+-----------------
 1 | Index Crime
 2 | Non-Index Crime
 3 | Violent Crime

Now we can create a node for each of those categories. We’ll use the MERGE command so that we can run this script multiple times without ending up with repeat categories:

query = """
WITH {json} AS document
UNWIND document.categories AS category
MERGE (:CrimeCategory {name: category.name}) 
"""

Let’s quickly check those categories were correctly imported:

match (category:CrimeCategory)
return category


Looking good so far – now for the sub categories. We’re going to use the UNWIND function to help us out here as well:

query = """
WITH {json} AS document
UNWIND document.categories AS category
UNWIND category.sub_categories AS subCategory
RETURN category.name, subCategory.code, subCategory.description
"""
$ python import_categories.py
    | category.name   | subCategory.code | subCategory.description
----+-----------------+------------------+---------------------------
  1 | Index Crime     | 01A              | Homicide 1st & 2nd Degree
  2 | Index Crime     | 02               | Criminal Sexual Assault
  3 | Index Crime     | 03               | Robbery
  4 | Index Crime     | 04A              | Aggravated Assault
  5 | Index Crime     | 04B              | Aggravated Battery
  6 | Index Crime     | 05               | Burglary
  7 | Index Crime     | 06               | Larceny
  8 | Index Crime     | 07               | Motor Vehicle Theft
  9 | Index Crime     | 09               | Arson
 10 | Non-Index Crime | 01B              | Involuntary Manslaughter
 11 | Non-Index Crime | 08A              | Simple Assault
 12 | Non-Index Crime | 08B              | Simple Battery
 13 | Non-Index Crime | 10               | Forgery & Counterfeiting
 14 | Non-Index Crime | 11               | Fraud
 15 | Non-Index Crime | 12               | Embezzlement
 16 | Non-Index Crime | 13               | Stolen Property
 17 | Non-Index Crime | 14               | Vandalism
 18 | Non-Index Crime | 15               | Weapons Violation
 19 | Non-Index Crime | 16               | Prostitution
 20 | Non-Index Crime | 17               | Criminal Sexual Abuse
 21 | Non-Index Crime | 18               | Drug Abuse
 22 | Non-Index Crime | 19               | Gambling
 23 | Non-Index Crime | 20               | Offenses Against Family
 24 | Non-Index Crime | 22               | Liquor License
 25 | Non-Index Crime | 24               | Disorderly Conduct
 26 | Non-Index Crime | 26               | Misc Non-Index Offense
 27 | Violent Crime   | 01A              | Homicide 1st & 2nd Degree
 28 | Violent Crime   | 02               | Criminal Sexual Assault
 29 | Violent Crime   | 03               | Robbery
 30 | Violent Crime   | 04A              | Aggravated Assault
 31 | Violent Crime   | 04B              | Aggravated Battery

Let’s give sub categories the MERGE treatment too:

query = """
WITH {json} AS document
UNWIND document.categories AS category
UNWIND category.sub_categories AS subCategory
MERGE (c:CrimeCategory {name: category.name})
MERGE (sc:SubCategory {code: subCategory.code})
ON CREATE SET sc.description = subCategory.description
MERGE (c)-[:CHILD]->(sc)
"""

And finally let’s write a query to check what we’ve imported:

match (category:CrimeCategory)-[:CHILD]->(subCategory)
return *

Something I hadn’t realised before running this query is that some sub categories sit under multiple categories, which is quite an interesting insight. The final Python script is available on github – any questions let me know.

Categories: Programming

The Best Productivity Book for Free

image"At our core, Microsoft is the productivity and platform company for the mobile-first and cloud-first world." -- Satya Nadella

We take productivity seriously at Microsoft. Ask any Softie. I never have a lack of things to do, or too much time in my day, and I can't ever make "too much" impact.

To be super productive, I've had to learn hard-core prioritization techniques, extreme energy management, stakeholder management, time management, and a wealth of productivity hacks to produce better, faster results.

We don’t learn these skills in school.  But if we’re lucky, we learn from the right mentors and people all around us, how to bring out our best when we need it the most.

Download the 30 Days of Getting Results Free eBook

You can save years of pain for free:

30 Days of Getting Results Free eBook

There’s always a gap between books you read and what you do in the real world. I wanted to bridge this gap. I wanted 30 Days of Getting Results to be raw and real to help you learn what it really takes to master productivity and time management so you can survive and thrive with the best in the world.

It’s not pretty.  It’s super effective.

30 Days of Getting Results is a 30 Day Personal Productivity Improvement Sprint

I wrote 30 Days of Getting Results using a 30 Day Sprint. Each day for that 30 Day Sprint, I wrote down the best information I learned from the school of hard knocks about productivity, time management, work-life balance, and more.

For each day, I share a lesson, a story, and an exercise.

I wanted to make it easy to practice productivity habits.

Agile Results is a Fire Starter for Personal Productivity

The thing that’s really different about Agile Results as a time management system is that it’s focused on meaningful results.  Time is treated as a first-class citizen so that you hit your meaningful windows of opportunity, and get fresh starts each day, each week, each month, each year.  As a metaphor, you get to be the author of your life and write your story forward.

For years, I’ve received emails from people around the world about how 30 Days of Getting Results was a breath of fresh air for them.

It helped them find their focus, get more productive, enjoy what they do, renew their energy, and spend more time in their strengths and their passions, while pursuing their purpose.

It’s helped doctors, teachers, students, lawyers, developers, grandmothers, and more.

Learn a New Language, Change Careers, or Start a Business

You can use Agile Results to learn better, faster, and deeper because it helps you think better, feel better, and take better action.

You can use Agile Results to help you learn a new language, build new skills, learn an instrument, or whatever your heart desires.

I used the system to accidentally write a book in a month.

I didn’t set out to write a book. I set out to share the world’s best insight and action for productivity and time management. I wrote for 20 minutes each day, during that month, to share the best lessons and the best insights I could with one purpose:

Help everyone thrive in work and life.

Over the coming months, I had more and more people ask for a book version. As much as they liked the easy to flip through Web pages, they wanted to consume it as an eBook. So I turned 30 Days of Getting Results into a free eBook and made that available.

Here's the funny part:

I forgot I had done that.

The Accidental Free Productivity Book that Might Just Change Your Life

One day, I was having a conversation with one of my readers, and they said that I should sell 30 Days of Getting Results as a $30 work book. They liked it much more than the book, Getting Results the Agile Way. They found it to be more actionable and easier to get started, and they liked that I used the system as a way to teach the system.

They said I should make the effort to put it together as a PDF and sell it as a workbook. He said people would want to pay for it because it’s high-value, real-world training, and that it was better than any live training he had ever taken (and he had taken a lot).

I got excited by the idea, and it made perfect sense. After all, wouldn’t people want to learn something that could impact every single day of their lives, and help them achieve more in work and life and help them adapt and compete more effectively in our ever-changing world?

I went to go put it together, and I had already done it.

Set Your Productivity on Fire

When you’re super productive, it’s easy to forget some of the things you create because they so naturally flow from spending the right time, on the right things, with the right energy. You’ll naturally leave a trail of results from experimenting and learning.

Whether you want to be super productive, or do less, but accomplish more, check out the ultimate free productivity guide:

30 Days of Getting Results Free eBook

Share it with friends, family, colleagues, and whoever else you want to have an unfair advantage in our hyper-competitive world.

Lifting others up, lifts you up in the process.

If you have a personal story of how 30 Days of Getting Results has helped you in some way, feel free to share it with me.  It’s always fun to hear how people are using Agile Results to take on new challenges, re-invent their productivity, and operate at a higher level.

Or simply get started again … like a fresh start, for the first time, full of new zest to be your best.

Categories: Architecture, Programming

Human Variation Introduction - New Lecture Posted

10x Software Development - Steve McConnell - Tue, 07/21/2015 - 17:58

In this week's lecture (https://cxlearn.com) I introduce the topic of human variation. I start by describing the general phenomenon of 10x variation. I briefly overview the research on 10x. I describe the problems that 10x variation presents for research in software engineering. I go into the specific examples of the Chrysler C3 project and the New York Times Chief Programmer Team project. And I summarize a few of the software development issues that are strongly affected by human variation.  

Lectures posted so far include:  

0.0 Understanding Software Projects - Intro
     0.1 Introduction - My Background
     0.2 Reading the News
     0.3 Definitions and Notations 

1.0 The Software Lifecycle Model - Intro
     1.1 Variations in Iteration 
     1.2 Lifecycle Model - Defect Removal
     1.3 Lifecycle Model Applied to Common Methodologies
     1.4 Lifecycle Model - Selecting an Iteration Approach  

2.0 Software Size
     2.05 Size - Comments on Lines of Code
     2.1 Size - Staff Sizes 
     2.2 Size - Schedule Basics 
     2.3 Size - Debian Size Claims (New) 

3.0 Human Variation - Introduction (New)

Check out the lectures at http://cxlearn.com!

Understanding Software Projects - Steve McConnell

 

Parallax image scrolling using Storyboards

Xebia Blog - Tue, 07/21/2015 - 07:37

Parallax image scrolling is a popular concept that is being adopted by many apps these days. It's the small attention to details like this that can really make an app great. Parallax scrolling gives you the illusion of depth by letting objects in the background scroll slower than objects in the foreground. It has been used in the past by many 2d games to make them feel more 3d. True parallax scrolling can become quite complex, but it's not very hard to create a simple parallax image scrolling effect on iOS yourself. This post will show you how to add it to a table view using Storyboards.

NOTE: You can find all source code used by this post on https://github.com/lammertw/ParallaxImageScrolling.

The idea here is to create a UITableView with an image header that has a parallax scrolling effect. When we scroll down the table view (i.e. swipe up), the image should scroll with half the speed of the table. And when we scroll up (i.e. swipe down), the image should become bigger so that it feels like it's stretching while we scroll. The latter is not really a parallax scrolling effect but commonly used in combination with it. The following animation shows these effects:


But what if we want a "Pull down to Refresh" effect and need to add a UIRefreshControl? Well, then we just drop the stretch effect when scrolling up:  


And as you might expect, the variation with Pull to Refresh is actually a lot easier to accomplish than the one without.

Parallax Scrolling Libraries

While you can find several Objective-C or Swift libraries that provide parallax scrolling similar to what is shown here, you'll find that it's not that hard to create these effects yourself. Doing it yourself has the benefit of customizing it exactly the way you want it, and of course it will add to your experience. Plus it might be less code than integrating with such a library. However, if you need exactly what such a library provides, then using it might work better for you.

The basics

NOTE: You can find all the code of this section at the no-parallax-scrolling branch.

Let's start with a simple example that doesn't have any parallax scrolling effects yet.


Here we have a standard UITableViewController with a cell containing our image at the top and another cell below it with some text. Here is the code:

class ImageTableViewController: UITableViewController {

  override func numberOfSectionsInTableView(tableView: UITableView) -> Int {
    return 2
  }

  override func tableView(tableView: UITableView, numberOfRowsInSection section: Int) -> Int {
    return 1
  }

  override func tableView(tableView: UITableView, cellForRowAtIndexPath indexPath: NSIndexPath) -> UITableViewCell {
    var cellIdentifier = ""
    switch indexPath.section {
    case 0:
      cellIdentifier = "ImageCell"
    case 1:
      cellIdentifier = "TextCell"
    default: ()
    }

    let cell = tableView.dequeueReusableCellWithIdentifier(cellIdentifier, forIndexPath: indexPath) as! UITableViewCell

    return cell
  }

  override func tableView(tableView: UITableView, heightForRowAtIndexPath indexPath: NSIndexPath) -> CGFloat {
    switch indexPath.section {
    case 0:
      return UITableViewAutomaticDimension
    default: ()
      return 50
    }
  }

  override func tableView(tableView: UITableView, estimatedHeightForRowAtIndexPath indexPath: NSIndexPath) -> CGFloat {
    switch indexPath.section {
    case 0:
      return 200
    default: ()
      return 50
    }
  }

}

The only thing of note here is that we're using UITableViewAutomaticDimension for automatic cell heights determined by constraints in the cell: we have a UIImageView with constraints to use the full width and height of the cell and a fixed aspect ratio of 2:1. Because of this aspect ratio, the height of the image (and therefore of the cell) is always half of the width. In landscape it looks like this:


We'll see later why this matters.

Parallax scrolling with Pull to Refresh

NOTE: You can find all the code of this section at the pull-to-refresh branch.

As mentioned before, creating the parallax scrolling effect is easiest when it doesn't need to stretch. Commonly you'll only want that if you have a Pull to Refresh. Adding the UIRefreshControl is done in the standard way so I won't go into that.

Container view
The rest is also quite simple. With the basics from the previous section as our starting point, what we need to do first is add a UIView around our UIImageView that acts as a container. Since our image will change its position while we scroll, we can no longer use it to calculate the height of the cell. The container view will get exactly the constraints that our image view had: use the full width and height of the cell and have an aspect ratio of 2:1. Also make sure to enable Clip Subviews on the container view so that the image view is clipped by it.

Align Center Y constraint
The image view, which is now inside the container view, keeps its aspect ratio constraint and uses the full width of the container view. For the y position we'll add an Align Center Y constraint to vertically center the image within the container.

Parallax scrolling using constraint
When we run this code now, it will still behave exactly as before. What we need to do is make the image view scroll with half the speed of the table view when scrolling down. We can do that by changing the constant of the Align Center Y constraint that we just created. First we need to connect it to an outlet of a custom UITableViewCell subclass:

class ImageCell: UITableViewCell {
  @IBOutlet weak var imageCenterYConstraint: NSLayoutConstraint!
}

When the table view scrolls down, we need to lower the Y position of the image by half the amount that we scrolled. To do that we can use scrollViewDidScroll and the content offset of the table view. Since our UITableViewController already adheres to the UIScrollViewDelegate, overriding that method is enough:

override func scrollViewDidScroll(scrollView: UIScrollView) {
  imageCenterYConstraint?.constant = min(0, -scrollView.contentOffset.y / 2.0) // only when scrolling down so we never let it be higher than 0
}

We're left with one small problem. The imageCenterYConstraint is connected to the ImageCell that we created, while the scrollViewDidScroll method is in the view controller. So what's left is to create an imageCenterYConstraint property in the view controller and assign it when the cell is created:

weak var imageCenterYConstraint: NSLayoutConstraint?

override func tableView(tableView: UITableView, cellForRowAtIndexPath indexPath: NSIndexPath) -> UITableViewCell {
  var cellIdentifier = ""
  switch indexPath.section {
  case 0:
    cellIdentifier = "ImageCell"
  case 1:
    cellIdentifier = "TextCell"
  default: ()
  }

  // the new part of code:
  let cell = tableView.dequeueReusableCellWithIdentifier(cellIdentifier, forIndexPath: indexPath) as! UITableViewCell
  if let imageCell = cell as? ImageCell {
    imageCenterYConstraint = imageCell.imageCenterYConstraint
  }

  return cell
}

That's all we need to do for our first variation of the parallax image scrolling. Let's go on with something a little more complicated.

Parallax scrolling without Pull to Refresh

NOTE: You can find all the code of this section at the no-pull-to-refresh branch.

When starting from the basics, we need to add a container view again like we did in the Container view paragraph from the previous section. The image view needs some different constraints though. Add the following constraints to the image view:

  • As before, keep the 2:1 aspect ratio
  • Add a Leading Space and Trailing Space of 0 to the Superview (our container view) and set the priority to 900. We will break these constraints when stretching the image because the image will become wider than the container view. However we still need them to determine the preferred width.
  • Align Center X to the Superview. We need this one to keep the image in the center when we break the Leading and Trailing Space constraints.
  • Add a Bottom Space and Top Space of 0 to the Superview. Create two outlets to the cell class ImageCell like we did in the previous section for the center Y constraint. We'll call these bottomSpaceConstraint and topSpaceConstraint. Also assign these from the cell to the view controller like we did before so we can access them in our scrollViewDidScroll method.

The result: we now have all the constraints we need to do the effects for scrolling up and down.

Scrolling down
When we scroll down (swipe up) we want the same effect as in our previous section. Instead of having an 'Align Center Y' constraint that we can change, we now need to do the following:

  • Set the bottom space to minus half of the content offset so it will fall below the container view.
  • Set the top space to plus half of the content offset so it will be below the top of the container view.

With these two calculations we effectively make the image view scroll at half the speed of the table view.

bottomSpaceConstraint?.constant = -scrollView.contentOffset.y / 2
topSpaceConstraint?.constant = scrollView.contentOffset.y / 2

Scrolling up
When the table view scrolls up (swipe down) the container view moves down. What we want here is for the image view to stick to the top of the screen instead of moving down as well. All we need for that is to set the constant of the topSpaceConstraint to the content offset. That means the height of the image will increase, and because of our 2:1 aspect ratio, the width of the image will grow as well. This is why we had to lower the priority of the Leading and Trailing constraints: the image no longer fits inside the container and breaks those constraints.

topSpaceConstraint?.constant = scrollView.contentOffset.y

We're left with one problem now. When the image sticks to the top while the container view goes down, the image falls outside the container view. And since we had to enable Clip Subviews for scrolling down, the image gets clipped.

We can't see the top of the image since it's outside the container view. So what we need is to clip when scrolling down and not clip when scrolling up. We can only do that in code so we need to connect the container view to an outlet, just as we've done with the constraints. Then the final code in scrollViewDidScroll becomes:

override func scrollViewDidScroll(scrollView: UIScrollView) {
  if scrollView.contentOffset.y >= 0 {
    // scrolling down
    containerView.clipsToBounds = true
    bottomSpaceConstraint?.constant = -scrollView.contentOffset.y / 2
    topSpaceConstraint?.constant = scrollView.contentOffset.y / 2
  } else {
    // scrolling up
    topSpaceConstraint?.constant = scrollView.contentOffset.y
    containerView.clipsToBounds = false
  }
}

Conclusion

So there you have it. Two variations of parallax scrolling without too much effort. As mentioned before, use a dedicated library if you have to, but don't be afraid that it's too complicated to do it yourself.

Additional notes

If you've seen the source code on GitHub you might have noted a few additional things. I didn't want to mention in the main body of this post to prevent any distractions but it's important to mention them anyway.

  • The aspect ratio constraints need to have a priority lower than 1000. Set them to 999 or 950 or something (make sure they're higher than the Leading and Trailing Space constraints that we set to 900 in the last section). This is because of an issue related to cells with dynamic height (using UITableViewAutomaticDimension) and rotation. When the user rotates the device, the cell gets its new width while still having its previous height. The new height calculation is not yet done at the beginning of the rotation animation, and at that moment the 2:1 aspect ratio cannot hold, which is why we cannot make it required (1000). Right after the new height is calculated, the aspect ratio constraint kicks back in. The state in which the aspect ratio constraint cannot hold does not even seem to be visible, so don't worry about your cell looking strange. Leaving the priority at 1000 only seems to generate an error message about the constraint, after which everything continues as expected.
  • Instead of assigning the outlets from the ImageCell to new variables in the view controller you may also create a scrollViewDidScroll in the cell, which is then being called from the scrollViewDidScroll from your view controller. You can get the cell using cellForRowAtIndexPath. See the code on GitHub to see this done.

Neo4j 2.2.3: neo4j-import – Encoder StringEncoder[2] returned an illegal encoded value 0

Mark Needham - Tue, 07/21/2015 - 07:11

I’ve been playing around with the Chicago crime data set again while preparing for a Neo4j webinar next week and while running the import tool ran into the following exception:

Importing the contents of these files into tmp/crimes.db:
Nodes:
  /Users/markneedham/projects/neo4j-spark-chicago/tmp/crimes.csv
 
  /Users/markneedham/projects/neo4j-spark-chicago/tmp/beats.csv
 
  /Users/markneedham/projects/neo4j-spark-chicago/tmp/primaryTypes.csv
 
  /Users/markneedham/projects/neo4j-spark-chicago/tmp/locations.csv
Relationships:
  /Users/markneedham/projects/neo4j-spark-chicago/tmp/crimesBeats.csv
 
  /Users/markneedham/projects/neo4j-spark-chicago/tmp/crimesPrimaryTypes.csv
 
  /Users/markneedham/projects/neo4j-spark-chicago/tmp/crimesLocationsCleaned.csv
 
Available memory:
  Free machine memory: 263.17 MB
  Max heap memory : 3.56 GB
 
Nodes
[*>:17.41 MB/s-------------------------|PROPERTIES(3)=|NODE:3|LABEL SCAN----|v:36.30 MB/s(2)===]  3MImport error: Panic called, so exiting
java.lang.RuntimeException: Panic called, so exiting
	at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.assertHealthy(AbstractStep.java:200)
	at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.await(AbstractStep.java:191)
	at org.neo4j.unsafe.impl.batchimport.staging.ProcessorStep.receive(ProcessorStep.java:98)
	at org.neo4j.unsafe.impl.batchimport.staging.ProcessorStep.sendDownstream(ProcessorStep.java:224)
	at org.neo4j.unsafe.impl.batchimport.staging.ProcessorStep.access$400(ProcessorStep.java:42)
	at org.neo4j.unsafe.impl.batchimport.staging.ProcessorStep$Sender.send(ProcessorStep.java:250)
	at org.neo4j.unsafe.impl.batchimport.LabelScanStorePopulationStep.process(LabelScanStorePopulationStep.java:60)
	at org.neo4j.unsafe.impl.batchimport.LabelScanStorePopulationStep.process(LabelScanStorePopulationStep.java:37)
	at org.neo4j.unsafe.impl.batchimport.staging.ProcessorStep$4.run(ProcessorStep.java:120)
	at org.neo4j.unsafe.impl.batchimport.staging.ProcessorStep$4.run(ProcessorStep.java:102)
	at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:237)
Caused by: java.lang.IllegalStateException: Encoder StringEncoder[2] returned an illegal encoded value 0
	at org.neo4j.unsafe.impl.batchimport.cache.idmapping.string.EncodingIdMapper.encode(EncodingIdMapper.java:229)
	at org.neo4j.unsafe.impl.batchimport.cache.idmapping.string.EncodingIdMapper.put(EncodingIdMapper.java:208)
	at org.neo4j.unsafe.impl.batchimport.NodeEncoderStep.process(NodeEncoderStep.java:77)
	at org.neo4j.unsafe.impl.batchimport.NodeEncoderStep.process(NodeEncoderStep.java:43)
	... 3 more

I narrowed the problem down to a specific file, and from tracing the code I learned that this exception happens when we’ve ended up with a node that doesn’t have an id.

I guessed that this might be due to there being an empty column somewhere in my CSV file so I did a bit of grepping:

$ grep -rn  "\"\"" tmp/locations.csv
tmp/locations.csv:11:"",Location

We can now narrow down the import to just the one line and see if we still get the exception:

$ cat foo.csv
id:ID(Location),:LABEL
"",Location
 
$ ./neo4j-community-2.2.3/bin/neo4j-import --into tmp/foo --nodes foo.csv
Importing the contents of these files into tmp/foo:
Nodes:
  /Users/markneedham/projects/neo4j-spark-chicago/foo.csv
 
Available memory:
  Free machine memory: 2.22 GB
  Max heap memory : 3.56 GB
 
Nodes
Import error: Encoder StringEncoder[2] returned an illegal encoded value 0

Yep, same error. Now we can clean up our CSV file and try again:

$ grep -v  "\"\"" foo.csv > fooCleaned.csv
 
# I put in a few real records so we can see them import
$ cat fooCleaned.csv
id:ID(Location),:LABEL
"RAILROAD PROPERTY",Location
"NEWSSTAND",Location
"SCHOOL, PRIVATE, BUILDING",Location
 
$ ./neo4j-community-2.2.3/bin/neo4j-import --into tmp/foo --nodes fooCleaned.csv
Importing the contents of these files into tmp/foo:
Nodes:
  /Users/markneedham/projects/neo4j-spark-chicago/fooCleaned.csv
 
Available memory:
  Free machine memory: 1.23 GB
  Max heap memory : 3.56 GB
 
Nodes
[*>:??-------------------------------------------------|PROPE|NODE:7.63 MB-----------------|LA] 10k
Done in 110ms
Prepare node index
[*DETECT:7.63 MB-------------------------------------------------------------------------------]   0
Done in 60ms
Calculate dense nodes
[*>:??-----------------------------------------------------------------------------------------]   0
Done in 10ms
Relationships
[*>:??-----------------------------------------------------------------------------------------]   0
Done in 11ms
Node --> Relationship
[*v:??-----------------------------------------------------------------------------------------] 10k
Done in 1ms
Relationship --> Relationship
[*>:??-----------------------------------------------------------------------------------------]   0
Done in 11ms
Node counts
[>:|*COUNT:76.29 MB----------------------------------------------------------------------------] 10k
Done in 46ms
Relationship counts
[*>:??-----------------------------------------------------------------------------------------]   0
Done in 12ms
 
IMPORT DONE in 1s 576ms. Imported:
  3 nodes
  0 relationships
  3 properties

Sweet! We’re back in business.

Categories: Programming

Easier Auth for Google Cloud APIs: Introducing the Application Default Credentials feature.

Google Code Blog - Mon, 07/20/2015 - 19:27

Originally posted to the Google Cloud Platform blog

When you write applications that run on Google Compute Engine instances, you might want to connect them to Google Cloud Storage, Google BigQuery, and other Google Cloud Platform services. Those services use OAuth2, the global standard for authorization, to help ensure that only the right callers can make the right calls. Unfortunately, OAuth2 has traditionally been hard to use. It often requires specialized knowledge and a lot of boilerplate auth setup code just to make an initial API call.

Today, with Application Default Credentials (ADC), we're making things easier. In many cases, all you need is a single line of auth code in your app:

Credential credential = GoogleCredential.getApplicationDefault();

If you're not already familiar with auth concepts, including 2LO, 3LO, and service accounts, you may find this introduction useful.

ADC takes all that complexity and packages it behind a single API call. Under the hood, it makes use of:

  • 2-legged vs. 3-legged OAuth (2LO vs. 3LO) -- OAuth2 includes support for user-owned data, where the user, the API provider, and the application developer all need to participate in the authorization dance. Most Cloud APIs don't deal with user-owned data, and therefore can use much simpler two-party flows between the API provider and the application developer.
  • gcloud CLI -- while you're developing and debugging your app, you probably already use the gcloud command-line tool to explore and manage Cloud Platform resources. ADC lets your application piggyback on the auth flows in gcloud, so you only have to set up your credentials once.
  • service accounts -- if your application runs on Google App Engine or Google Compute Engine, it automatically has access to the built-in "service account", that helps the API provider to trust that the API calls are coming from a trusted source. ADC lets your application benefit from that trust.

You can find more about Google Application Default Credentials here. This is available for Java, Python, Node.js, Ruby, and Go. Libraries for PHP and .Net are in development.
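
For example, here is a hedged Python sketch using the oauth2client and google-api-python-client libraries (the BigQuery call and the project id are just placeholders for illustration):

from oauth2client.client import GoogleCredentials
from googleapiclient import discovery

# Picks up credentials from gcloud, a service account key, or the
# Compute Engine metadata server - whichever the environment provides.
credentials = GoogleCredentials.get_application_default()

# Use the credentials with any Cloud API client, e.g. BigQuery.
bigquery = discovery.build("bigquery", "v2", credentials=credentials)
datasets = bigquery.datasets().list(projectId="my-project-id").execute()  # placeholder project id
print(datasets)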

Categories: Programming

MongoDB and WTFs and Anger

Eric.Weblog() - Eric Sink - Mon, 07/20/2015 - 19:00

Recently, Sven Slootweg (joepie91) published a blog entry entitled Why you should never, ever, ever use MongoDB. It starts out with the words "MongoDB is evil" and proceeds to give a list of negative statements about same.

I am not here to respond to each of his statements. He labels them as "facts", and some (or perhaps all) of them surely are. In fact, for now, let's assume that everything he wrote is correct. My point here is not to say that the author is wrong.

Rather, my point here is that this kind of blog entry tells me very little about MongoDB while it tells me a great deal about the emotions of the person who wrote it.

Like I said, it may be true that every WTF the author listed is correct. It is also true that some software has more WTFs than others.

I'm not a MongoDB expert, but I've been digging into it quite a bit, and I could certainly make my own list of its WTFs. And I would also admit that my own exploration of Couchbase has yielded fewer of those moments. Therefore, every single person on the planet who chooses MongoDB instead of Couchbase is making a terrible mistake, right?

Let me briefly shift to a similar situation where I personally have a lot more knowledge: Microsoft SQL Server vs PostgreSQL. For me, it is hard to study SQL Server without several WTF moments. And while PostgreSQL is not perfect, I have found that a careful study there tends to produce more admiration than WTFs.

So, after I discovered that (for example) SQL Server has no support for deferred foreign keys, why didn't I write a blog entry entitled "Why you should never, ever, ever use SQL Server"?

Because I calmed down and looked at the bigger picture.

I think I could make an entirely correct list of negative things about SQL Server that is several pages long. And I suppose if I wanted to do that, and if I were really angry while I was writing it, I would include only the facts that support my feelings, omitting anything positive. For example, my rant blog entry would have no reason to acknowledge that SQL Server is the most widely used relational database server in the world. These kinds of facts merely distract people from my point.

But what would happen if I stopped writing my rant and spent some time thinking about the fact I just omitted?

I just convinced myself that this piece of software is truly horrible, and yet, millions of people are using it every day. How do I explain this?

If I tried to make a complete list of theories that might fit the facts, today's blog entry would get too long. Suffice it to say this: Some of those theories might support an anti-Microsoft rant (for example, maybe Microsoft's field sales team is really good at swindling people), but I'm NOT going to be able to prove that every single person who chose SQL Server has made a horrible mistake. There is no way I can credibly claim that PostgreSQL is the better choice for every single company simply because I admire it. Even though I think (for example) that SQL Server handles NULL and UNIQUE in a broken way, there is some very large group of people for whom SQL Server is a valid and smart choice.

So why would I write a blog entry that essentially claims that all SQL Server users are stupid when that simply cannot be true? I wouldn't. Unless I was really angry.

MongoDB is indisputably the top NoSQL vendor. It is used by thousands of companies who serve millions of users every day. Like all young software serving a large user base, it has bugs and flaws, some of which are WTF-worthy. But it is steadily getting better. Any discussion of its technical deficiencies which does not address these things is mostly just somebody venting emotion.

 

Released Today: Visual Studio 2015, ASP.NET 4.6, ASP.NET 5 & EF 7 Previews

ScottGu's Blog - Scott Guthrie - Mon, 07/20/2015 - 16:14

Today is a big day with major release announcements for Visual Studio 2015, Visual Studio 2013 Update 5, and .NET Framework 4.6. All these releases have been covered in great detail on Soma’s Blog, the Visual Studio Blog, and the .NET Blog.

Join us online for the Visual Studio 2015 Release Event, where you can see Soma, Brian Harry, Scott Hanselman, and many others demo new Visual Studio 2015 features and technologies. This year, in a new segment called “In The Code”, we share how a team of Microsoft engineers created a real app in 3 days. There will be opportunities along the way to interact in live Q&A with the team on subjects such as Agile development, web and cloud development, cross-platform mobile dev and much more. 

In this post I’d like to specifically talk about some of the ground we have covered in ASP.NET and Entity Framework.  In this release of Visual Studio, we are releasing ASP.NET 4.6, updating our Visual Studio Web Development Tools, and updating the latest beta release of our new ASP.NET 5 framework.  Below are details on just a few of the great updates available today:

ASP.NET Tooling Improvements

Today’s VS 2015 release delivers some great updates for web development.  Here are just a few of the updates we are shipping in this release:

JSON Editor

JSON has become a first class experience in Visual Studio 2015 and we are now giving you a great editor to allow you to maintain your JSON content.  With support for JSON Schema validation, intellisense, and SchemaStore.org, writing and producing JSON content has never been easier.  We’ve also added intellisense support for bower.json and package.json files for bower and npm package manager use.

HTML Editor Updates

Our HTML editor received a lot of attention in this update.  We wanted to deliver an editor that kept up with HTML 5 standards and provided rich support for popular new frameworks and libraries.  We previously shipped the Bootstrap responsive web framework with our ASP.NET templates, and we are now providing intellisense for its classes, with an indicator icon to show that they are Bootstrap CSS classes.


This helps you distinguish the classes that you wrote in your project, like the page-inner class above, from the Bootstrap classes marked with the B icon.

We are also keeping up with the emerging web components standard, with support for the import link that web components use to import markup.


We are also providing intellisense for AngularJS directives and attributes with an appropriate Angular icon so you know you’re triggering AngularJS functionality.

JavaScript Editor Improvements

With the VS 2015 release we are introducing support for AngularJS structures including controllers, services, factories, directives and animations.  There is also support for the new ECMAScript 6 features such as classes, arrow functions, and template strings. We are also bringing a navigation bar to the editor to help you navigate between the major elements of your JavaScript.  With JSDoc support to deliver intellisense, JavaScript development gets easier.

ReactJS Editor Support

We spent some time with the folks at Facebook to make sure that we delivered first class capabilities for developers using their ReactJS framework.  With appropriate syntax highlighting and intellisense for React methods, developers should be very comfortable building React applications with the new Visual Studio:

Support for JavaScript package managers and task runners like Grunt and Gulp

JavaScript and modern web development techniques are the new recommended way to build client-side code for your web application.  We support these tools and programming techniques with our new Task Runner Explorer that executes grunt and gulp tasks.  You can open this tool window with the Ctrl+Alt+Backspace hotkey combination.


Execute any of the tasks defined in your gruntfile.js or gulpfile.js by right-clicking on the task name in the left panel and choosing “Run” from the context menu that appears.  You can even use this context menu to attach grunt or gulp tasks to project build events in Visual Studio like “After Build” as shown in the figure above.  Every time the .NET code in your web project finishes compiling, the ‘build’ task from the gruntfile.js will be executed.

Combined with the intellisense support for JavaScript and JSON editors, we think that developers wanting to use grunt and gulp tasks will really enjoy this new Visual Studio experience.  You can add grunt and gulp tasks with the newly integrated npm package manager capabilities.  When you create a package.json file in your web project, we will install and upgrade local copies of all packages referenced.  Not only do we deliver syntax highlighting and intellisense for package.json terms, we also provide package name and version lookup against the npmjs.org gallery.


The bower package manager is also supported with great intellisense, syntax highlighting and the same package name and version support in the bower.json file that we provide for package.json.


These improvements in managing and writing JavaScript configuration files and executing grunt or gulp tasks bring a new level of functionality to Visual Studio 2015 that we think web developers will really enjoy.

ASP.NET 4.6 Runtime Improvements

Today’s release also includes a bunch of enhancements to ASP.NET from a runtime perspective.

HTTP/2 Support

Starting with ASP.NET 4.6 we are introducing support for the HTTP/2 standard.  This new version of the HTTP protocol delivers true multiplexing of requests and responses between browser and web server.  Taking advantage of this exciting update is as easy as enabling SSL in your web projects, which immediately improves your ASP.NET application’s responsiveness.


With SSL enabled (which is a requirement of the HTTP/2 protocol), IISExpress on Windows 10 will begin interacting with the browser using the updated protocol.  The difference between the protocols is clear.  Consider the network performance presented by Microsoft Edge when requesting the same website without SSL (and receiving HTTP/1.x) and with SSL to activate the HTTP/2 protocol:


Both samples are showing the default ASP.NET project template’s home page.  In both scenarios the HTML for the page is retrieved in line 1.  In HTTP/1.x on the left, the first six elements are requested and grey bars indicate that the browser is waiting to request the last two elements.  In HTTP/2 on the right, all eight page elements are loaded concurrently, with no waiting.

Support for the .NET Compiler Platform

We now support the new .NET compilers provided in the .NET Compiler Platform (codenamed Roslyn).  These compilers allow you to access the new language features of Visual Basic and C# throughout your Web Forms markup and MVC view pages.  Our markup can look much simpler and more readable with new language features like string interpolation:

Instead of building a link in Web Forms like this:

  <a href="/Products/<%: model.Id %>/<%: model.Name %>"><%: model.Name %></a>

We can deliver a more readable piece of markup like this:

  <a href="<%: $"/Products/{model.Id}/{model.Name}" %>"><%: model.Name %></a>

We’ve also bundled the Microsoft.CodeDom.Providers.DotNetCompilerPlatform NuGet package to enable your Web Forms assets to compile significantly faster without requiring any changes to your code or project.

Async Model Binding for Web Forms

Model binding was introduced for Web Forms applications in ASP.NET 4, and we introduced async methods in .NET 4.5.  We heard your requests to be able to execute your model binding methods on a Web Form asynchronously with the new language features.  Our team has made this as easy as adding an async=”true” attribute to the @Page directive and returning a Task from your model binding methods:

    public async Task<IEnumerable<Product>> myGrid_GetData()
    {
      var repo = new Repository();
      return await repo.GetAll();
    }

We have a blog post with more information and tips about this feature on our MSDN Web Development blog.

ASP.NET 5

I introduced ASP.NET 5 back in February and shared in detail what this release would bring. I’ll reiterate just a few high-level points here; check out my post Introducing ASP.NET 5 for a more complete rundown. 

ASP.NET 5 works with .NET Core as well as the full .NET Framework to give you greater flexibility when hosting your web apps. With ASP.NET MVC 6 we are merging the complementary features and functionality from MVC, Web API, and Web Pages. With ASP.NET 5 we are also introducing a new HTTP request pipeline based on our learnings from Katana which enables you to add only the components you need with an opt-in strategy. Additionally, included in this release are multiple development features for improved productivity and to enable you to build better web applications. ASP.NET 5 is also open source. You can find us on GitHub, view and download the code, submit changes, and track when changes are made.   

The ASP.NET 5 Beta 5 runtime packages are in preview and not recommended for use in production, so please continue using ASP.NET 4.6 for building production grade apps. For details on the latest ASP.NET 5 beta enhancements added and issues fixed, check out the published release notes for ASP.NET 5 beta 5 on GitHub. To get started with ASP.NET 5, get the docs and tutorials on the ASP.NET site.

To learn more and keep an eye on all updates to ASP.NET, check out the WebDev blog and read along with the tutorials and documentation at www.asp.net/vnext.

Entity Framework

With today’s release, we not only have an update to Entity Framework 6 that primarily includes bug fixes and community contributions, but we have also released a preview version of Entity Framework 7. Keep reading for details.

Entity Framework 6.x

Visual Studio 2015 includes Entity Framework 6.1.3. EF 6.1.3 primarily focuses on bug fixes and community contributions; you can see a list of the changes in the EF 6.1.3 announcement blog post. The Entity Framework 6.1.3 runtime is included in a number of places in this release. In EF 6.1.3, when you create a new model using the Entity Framework Tools in a project that does not already have the EF runtime installed, the runtime is automatically installed for you. Additionally, the runtime is pre-installed in new ASP.NET projects, depending on the project template you select.


To learn more and keep an eye on all updates to Entity Framework, check out the ADO.NET blog.

Entity Framework 7

Entity Framework 7 is in preview and not yet ready for production. This new version of Entity Framework enables new platforms and new data stores. Universal Windows Platform, ASP.NET 5, and traditional desktop applications can now use EF7. EF7 can also be used in .NET applications that run on Mac and Linux. Visual Studio 2015 includes an early preview of the EF7 runtime that is installed in new ASP.NET 5 projects. 


For more information on EF7, check out the “What is EF7 all about” page on GitHub.

Summary

Today’s Visual Studio release is a big one that we are proud to share with you all. Thank you for your continued support by providing feedback on the interim releases (CTPs, Preview, RC).  We are really looking forward to seeing what you build with it.

Hope this helps,

Scott

P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me @scottgu

Categories: Architecture, Programming

You’ve Just Been Laid Off From Your Programming Job. Now What?

Making the Complex Simple - John Sonmez - Mon, 07/20/2015 - 16:00

Don’t worry. It happens to the best of us. We’ve all been “laid off” at some point in our lives—well, at least most of us. Maybe your employer went through a “workforce reduction” and randomly selected people to be laid off—you just got unlucky. Or perhaps it wasn’t so much luck, but your tendency to […]

The post You’ve Just Been Laid Off From Your Programming Job. Now What? appeared first on Simple Programmer.

Categories: Programming

R: Bootstrap confidence intervals

Mark Needham - Sun, 07/19/2015 - 20:44

I recently came across an interesting post on Julia Evans’ blog showing how to generate a bigger set of data points by sampling the small set of data points that we actually have using bootstrapping. Julia’s examples are all in Python so I thought it’d be a fun exercise to translate them into R.

We’re doing the bootstrapping to simulate the number of no-shows for a flight so we can work out how many seats we can overbook the plane by.

We start out with a small sample of no-shows and work off the assumption that it’s ok to kick someone off a flight 5% of the time. Let’s work out how many people that’d be for our initial sample:

> data = c(0, 1, 3, 2, 8, 2, 3, 4)
> quantile(data, 0.05)
  5% 
0.35

0.35 people! That’s not a particularly useful result, so we’re going to resample the initial data set 10,000 times, taking the 5%ile each time, and see if we come up with something better:

We’re going to use the sample function with replacement to generate our resamples:

> sample(data, replace = TRUE)
[1] 0 3 2 8 8 0 8 0
> sample(data, replace = TRUE)
[1] 2 2 4 3 4 4 2 2

Now let’s write a function to do that multiple times:

library(ggplot2)
 
bootstrap_5th_percentile = function(data, n_bootstraps) {
  return(sapply(1:n_bootstraps, 
                function(iteration) quantile(sample(data, replace = TRUE), 0.05)))
}
 
values = bootstrap_5th_percentile(data, 10000)
 
ggplot(aes(x = value), data = data.frame(value = values)) + geom_histogram(binwidth=0.25)

2015 07 19 18 05 48

So this visualisation is telling us that we can oversell by 0-2 people but we don’t know an exact number.

Let’s try the same exercise but with a bigger initial data set of 1,000 values rather than just 8. First we’ll generate a distribution (with a mean of 5 and standard deviation of 2) and visualise it:

library(dplyr)
 
df = data.frame(value = rnorm(1000,5, 2))
df = df %>% filter(value >= 0) %>% mutate(value = as.integer(round(value)))
ggplot(aes(x = value), data = df) + geom_histogram(binwidth=1)

2015 07 19 18 09 15

Our distribution seems to have a lot more values around 4 & 5 whereas the Python version has a flatter distribution – I’m not sure why that is, so if you have any ideas let me know. In any case, let’s check the 5%ile for this data set:

> quantile(df$value, 0.05)
5% 
 2

Cool! Now at least we have an integer value rather than the 0.35 we got earlier. Finally let’s do some bootstrapping over our new distribution and see what 5%ile we come up with:

resampled = bootstrap_5th_percentile(df$value, 10000)
byValue = data.frame(value = resampled) %>% count(value)
 
> byValue
Source: local data frame [3 x 2]
 
  value    n
1   1.0    3
2   1.7    2
3   2.0 9995
 
ggplot(aes(x = value, y = n), data = byValue) + geom_bar(stat = "identity")

2015 07 19 18 23 29

‘2’ is by far the most popular 5%ile here although it seems weighted more towards that value than with Julia’s Python version, which I imagine is because we seem to have sampled from a slightly different distribution.

Categories: Programming

How To Get Smarter By Making Distinctions

"Whatever you do in life, surround yourself with smart people who'll argue with you." -- John Wooden

There’s a very simple way to get smarter.

You can get smarter by creating categories.

Not only will you get smarter, but you’ll also be more mindful, and you’ll expand your vocabulary, which will improve your ability to think more deeply about a given topic or domain.

In my post, The More Distinctions You Make, the Smarter You Get, I walk through the ins and outs of creating categories to increase your intelligence, and I use the example of “fat.”   I attempt to show how “Fat is bad” isn’t very insightful, and how by breaking “fat” down into categories, you can dive deeper and reveal new insight to drive better decisions and better outcomes.

In this post, I’m going to walk this through with an example, using “security” as the topic.

The first time I heard the word “security”, it didn’t mean much to me, beyond “protect.”

The next thing somebody taught me, was how I had to focus on CIA:  Confidentiality, Integrity, and Availability.

That was a simple way to break security down into meaningful parts.

And then along came Defense in Depth.   A colleague explained that Defense in Depth meant thinking about security in terms of multiple layers:  Network, Host, Application, and Data.

But then another colleague said, the real key to thinking about security and Defense in Depth, was to think about it in terms of people, process, and technology.

As much as I enjoyed these thought exercises, I didn’t find them actionable enough to actually improve software or application security.  And my job was to help Enterprise developers build better Line-Of-Business applications that were scalable and secure.

So our team went to the drawing board to map out actionable categories to take application security much deeper.

Right off the bat, just focusing on “application” security vs. “network” security or “host” security helped us get more specific and make security more tangible and more actionable from a Line-of-Business application perspective.

Security Categories

Here are the original security categories that we used to map out application security and make it more actionable:

  1. Input and Data Validation
  2. Authentication
  3. Authorization
  4. Configuration Management
  5. Sensitive Data
  6. Session Management
  7. Cryptography
  8. Exception Management
  9. Auditing and Logging

Each of these buckets helped us create actionable principles, patterns, and practices for improving security.

Security Categories Explained

Here is a brief description of each application security category:

Input and Data Validation
How do you know that the input your application receives is valid and safe? Input validation refers to how your application filters, scrubs, or rejects input before additional processing. Consider constraining input through entry points and encoding output through exit points. Do you trust data from sources such as databases and file shares?

Authentication
Who are you? Authentication is the process by which an entity proves its identity to another entity, typically through credentials, such as a user name and password.

Authorization
What can you do? Authorization is how your application provides access controls for resources and operations.

Configuration Management
Who does your application run as? Which databases does it connect to? How is your application administered? How are these settings secured? Configuration management refers to how your application handles these operational issues.

Sensitive Data
How does your application handle sensitive data? Sensitive data refers to how your application handles any data that must be protected either in memory, over the network, or in persistent stores.

Session Management
How does your application handle and protect user sessions? A session refers to a series of related interactions between a user and your Web application.

Cryptography
How are you keeping secrets (confidentiality)? How are you tamper-proofing your data or libraries (integrity)? How are you providing seeds for random values that must be cryptographically strong? Cryptography refers to how your application enforces confidentiality and integrity.
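
As one small illustration of the “cryptographically strong random values” point, here is a sketch using Python’s standard-library secrets module; the token lengths and use cases are assumptions made up for the example:

# Sketch: unguessable values (session identifiers, password-reset tokens)
# should come from a cryptographically strong source, not from a
# general-purpose pseudo-random generator meant for simulation.
import secrets

session_token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoded
reset_code = secrets.token_hex(16)         # 32 hexadecimal characters
print(session_token, reset_code)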

Exception Management
When a method call in your application fails, what does your application do? How much do you reveal? Do you return friendly error information to end users? Do you pass valuable exception information back to the caller? Does your application fail gracefully?
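
To make “fail gracefully” concrete, here is a small, hypothetical Python sketch; the order-processing function, logger name, and response shape are invented for the example. The idea is to log the full details to a secured log and return only a friendly, non-revealing message to the caller:

# Sketch: structured exception handling that does not leak internals.
import logging
import uuid

logger = logging.getLogger("orders")

def place_order(request_data):
    try:
        return process_order(request_data)  # hypothetical business logic
    except Exception:
        # Full stack trace and context go to the (secured) log, tagged with a
        # correlation id so support staff can find the details later.
        error_id = uuid.uuid4().hex
        logger.exception("order processing failed (error_id=%s)", error_id)
        # The caller sees only a friendly message and a reference, not the details.
        return {"error": "We could not process your order.", "reference": error_id}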

Auditing and Logging
Who did what and when? Auditing and logging refer to how your application records security-related events.

As you can see, just by calling out these different categories, you suddenly have a way to dive much deeper and explore application security in depth.

The Power of a Security Category

Let’s use a quick example.  Let’s take Input Validation.

Input Validation is a powerful security category, given how many software security flaws, vulnerabilities, and attacks (including buffer overflows) stem from a lack of input validation.

But here’s the interesting thing.   After quite a bit of research and testing, we found a powerful security pattern that could help more applications stand up to more security attacks.  It boiled down to the following principle:

Validate for length, range, format, and type.

That’s a pithy, but powerful piece of insight when it comes to implementing software security.

And, when you can’t validate the input, make it safe by sanitizing the output.  And along these lines, keep user input out of the control path, where possible.
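
As a hypothetical illustration of that principle, here is what validating a user-supplied quantity for length, range, format, and type might look like in Python; the field name, length limit, and range are assumptions made up for the example:

# Sketch: constrain input by length, format, type, and range before using it.
import re

def validate_quantity(raw):
    # Length: reject oversized input before doing anything else.
    if len(raw) > 6:
        raise ValueError("input too long")
    # Format: whitelist of digits only, rather than a blacklist of "bad" patterns.
    if not re.fullmatch(r"\d+", raw):
        raise ValueError("quantity must be a whole number")
    # Type: convert to the expected type.
    value = int(raw)
    # Range: enforce the business rule.
    if not 1 <= value <= 1000:
        raise ValueError("quantity must be between 1 and 1000")
    return value

Each check maps onto one word of the principle, which is part of what makes the category actionable.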

All of these insights flow from just focusing on Input Validation as a security category.

Threats, Attacks, Vulnerabilities, and Countermeasures

Another distinction our team made was to think in terms of threats, attacks, vulnerabilities, and countermeasures.  We knew that threats could be intentional and malicious (as in the case of attacks), but they could also be accidental and unintended.

We wanted to identify vulnerabilities as weaknesses that could be addressed in some way.

We wanted to identify countermeasures as the actions to take to help mitigate risks, reduce the attack surface, and address vulnerabilities.

Just by chunking up the application security landscape into threats, attacks, vulnerabilities, and countermeasures, we empowered more people to think more deeply about the application security space.

Security Vulnerabilities Organized by Security Categories

Using the security categories above, we could easily focus on finding security vulnerabilities and group them by the relevant security category.

Here are some examples:

Input/Data Validation

  • Using non-validated input in the Hypertext Markup Language (HTML) output stream
  • Using non-validated input used to generate SQL queries
  • Relying on client-side validation
  • Using input file names, URLs, or user names for security decisions
  • Using application-only filters for malicious input
  • Looking for known bad patterns of input
  • Trusting data read from databases, file shares, and other network resources
  • Failing to validate input from all sources including cookies, query string parameters, HTTP headers, databases, and network resources

Authentication

  • Using weak passwords
  • Storing clear text credentials in configuration files
  • Passing clear text credentials over the network
  • Permitting over-privileged accounts
  • Permitting prolonged session lifetime
  • Mixing personalization with authentication

Authorization

  • Relying on a single gatekeeper
  • Failing to lock down system resources against application identities
  • Failing to limit database access to specified stored procedures
  • Using inadequate separation of privileges

Configuration Management

  • Using insecure administration interfaces
  • Using insecure configuration stores
  • Storing clear text configuration data
  • Having too many administrators
  • Using over-privileged process accounts and service accounts

Sensitive Data

  • Storing secrets when you do not need to
  • Storing secrets in code
  • Storing secrets in clear text
  • Passing sensitive data in clear text over networks

Session Management

  • Passing session identifiers over unencrypted channels
  • Permitting prolonged session lifetime
  • Having insecure session state stores
  • Placing session identifiers in query strings

Cryptography

  • Using custom cryptography
  • Using the wrong algorithm or a key size that is too small
  • Failing to secure encryption keys
  • Using the same key for a prolonged period of time
  • Distributing keys in an insecure manner

Exception Management

  • Failing to use structured exception handling
  • Revealing too much information to the client

Auditing and Logging

  • Failing to audit failed logons
  • Failing to secure audit files
  • Failing to audit across application tiers

Threats and Attacks Organized by Security Categories

Again, using our security categories, we could then group threats and attacks by relevant security categories.

Here are some examples of security threats and attacks organized by security categories:

Input/Data Validation

  • Buffer overflows
  • Cross-site scripting
  • SQL injection
  • Canonicalization attacks
  • Query string manipulation
  • Form field manipulation
  • Cookie manipulation
  • HTTP header manipulation

Authentication

  • Network eavesdropping
  • Brute force attacks
  • Dictionary attacks
  • Cookie replay attacks
  • Credential theft

Authorization

  • Elevation of privilege
  • Disclosure of confidential data
  • Data tampering
  • Luring attacks

Configuration Management

  • Unauthorized access to administration interfaces
  • Unauthorized access to configuration stores
  • Retrieval of clear text configuration secrets
  • Lack of individual accountability

Sensitive Data

  • Accessing sensitive data in storage
  • Accessing sensitive data in memory (including process dumps)
  • Network eavesdropping
  • Information disclosure

Session Management

  • Session hijacking
  • Session replay
  • Man-in-the-middle attacks

Cryptography

  • Loss of decryption keys
  • Encryption cracking

Exception Management

  • Revealing sensitive system or application details
  • Denial of service attacks

Auditing and Logging

  • User denies performing an operation
  • Attacker exploits an application without trace
  • Attacker covers his tracks

Countermeasures Organized by Security Categories

Now here is where the rubber really meets the road.  We could group security countermeasures by security categories to make them more actionable.

Here are example security countermeasures organized by security categories:

Input/Data Validation

  • Do not trust input
  • Validate input: length, range, format, and type
  • Constrain, reject, and sanitize input
  • Encode output

Authentication

  • Use strong password policies
  • Do not store credentials
  • Use authentication mechanisms that do not require clear text credentials to be passed over the network
  • Encrypt communication channels to secure authentication tokens
  • Use HTTPS only with forms authentication cookies
  • Separate anonymous from authenticated pages

Authorization

  • Use least privilege accounts
  • Consider granularity of access
  • Enforce separation of privileges
  • Use multiple gatekeepers
  • Secure system resources against system identities

Configuration Management

  • Use least privileged service accounts
  • Do not store credentials in clear text
  • Use strong authentication and authorization on administrative interfaces
  • Do not use the Local Security Authority (LSA)
  • Avoid storing sensitive information in the Web space
  • Use only local administration

Sensitive Data

  • Do not store secrets in software
  • Encrypt sensitive data over the network
  • Secure the channel

Session Management

  • Partition site by anonymous, identified, and authenticated users
  • Reduce session timeouts
  • Avoid storing sensitive data in session stores
  • Secure the channel to the session store
  • Authenticate and authorize access to the session store

Cryptography

  • Do not develop and use proprietary algorithms (XOR is not encryption. Use platform-provided cryptography)
  • Use the RNGCryptoServiceProvider method to generate random numbers
  • Avoid key management. Use the Windows Data Protection API (DPAPI) where appropriate
  • Periodically change your keys

Exception Management

  • Use structured exception handling (by using try/catch blocks)
  • Catch and wrap exceptions only if the operation adds value/information
  • Do not reveal sensitive system or application information
  • Do not log private data such as passwords

Auditing and Logging

  • Identify malicious behavior
  • Know your baseline (know what good traffic looks like)
  • Use application instrumentation to expose behavior that can be monitored

As you can see, the security countermeasures can easily be reviewed, updated, and moved forward, because the actionable principles are well organized by the security categories.

There are many ways to use category creation to get smarter and get better results.

In the future, I’ll walk through how we created an Agile Security approach, using categories.

Meanwhile, check out my post on The More Distinctions You Make, the Smarter You Get to gain some additional insights into how to use empathy and creating categories to dive deeper, learn faster, and get smarter on any topic you want to take on.

Categories: Architecture, Programming

We Help Our Customers Transform

"Innovation—the heart of the knowledge economy—is fundamentally social." -- Malcolm Gladwell

I’m a big believer in having clarity around what you help your customers do.

I was listening to Satya Nadella’s keynote at the Microsoft Worldwide Partner Conference, and I like how he put it so simply, that we help our customers transform.

Here’s what Satya had to say about how we help our customers transform their business:

“These may seem like technical attributes, but they are key to how we drive business success for our customers, business transformation for our customers, because all of what we do, collectively, is centered on this core goal of ours, which is to help our customers transform.

When you think about any customer of ours, they're being transformed through the power of digital technology, and in particular software.

There isn't a company out there that isn't a software company.

And our goal is to help them differentiate using digital technology.

We want to democratize the use of digital technology to drive core differentiation.

It's no longer just about helping them operate their business.

It is about them excelling at their business using software, using digital technology.

It is about our collective ability to drive agility for our customers.

Because if there is one truth that we are all faced with, and our customers are faced with, it's that things are changing rapidly, and they need to be able to adjust to that.

And so everything we do has to support that goal.

How do they move faster, how do they interpret data quicker, how are they taking advantage of that to take intelligent action.

And of course, cost.

But we'll keep coming back to this theme of business transformation throughout this keynote and throughout WPC, because that's where I want us to center in on.

What's the value we are adding to the core of our customer and their ability to compete, their ability to create innovation.

And anchored on that goal is our technical ambition, is our product ambition.”

Transformation is the name of the game.

You Might Also Like

Satya Nadella is All About Customer Focus

Satya Nadella on a Mobile-First, Cloud-First World

Satya Nadella on Empower Every Person on the Planet

Satya Nadella on Everyone Has To Be a Leader

Satya Nadella on How the Key To Longevity is To Be a Learning Organization

Satya Nadella on Live and Work a Meaningful Life

Satya Nadella on The Future of Software

Categories: Architecture, Programming

Satya Nadella on a Mobile-First, Cloud-First World

You hear Mobile-First, Cloud-First all the time.

But do you ever hear it really explained?

I was listening to Satya Nadella’s keynote at the Microsoft Worldwide Partner Conference, and I like how he walked through how he thinks about a Mobile-First, Cloud-First world.

Here’s what Satya had to say:

“There are a couple of attributes.

When we talk about Mobile-First, we are talking about the mobility of the experience.

What do we mean by that?

As we look out, the computing that we are going to interface with, in our lives, at home and at work, is going to be ubiquitous.

We are going to have sensors that recognize us.

We are going to have computers that we are going to wear on us.

We are going to have computers that we touch, computers that we talk to, the computers that we interact with as holograms.

There is going to be computing everywhere.

But what we need across all of this computing, is our experiences, our applications, our data.

And what enables that is in fact the cloud acting as a control plane that allows us to have that capability to move from device to device, on any given day, at any given meeting.

So that core attribute of thinking of mobility, not by being bound to a particular device, but it's about human mobility, is very core to our vision.

Second, when we think about our cloud, we think distributed computing will remain distributed.

In fact, we think of our servers as the edge of our cloud.

And this is important, because there are going to be many legitimate reasons where people will want digital sovereignty, people will want data residency, there is going to be regulation that we can't anticipate today.

And so we have to think about a distributed cloud infrastructure.

We are definitely going to be one of the key hyper-scale providers.

But we are also going to think about how do we get computing infrastructure, the core compute, storage, network, to be distributed throughout the world.

These may seem like technical attributes, but they are key to how we drive business success for our customers, business transformation for our customers, because all of what we do, collectively, is centered on this core goal of ours, which is to help our customers transform.”

That’s a lot of insight, and very well framed for creating our future and empowering the world.

You Might Also Like

Microsoft Explained: Making Sense of the Microsoft Platform Story

Satya Nadella is All About Customer Focus

Satya Nadella on Empower Every Person on the Planet

Satya Nadella on Everyone Has To Be a Leader

Satya Nadella on How the Key To Longevity is To Be a Learning Organization

Satya Nadella on Live and Work a Meaningful Life

Satya Nadella on The Future of Software

Categories: Architecture, Programming

Empower Every Person on the Planet to Achieve More

It’s great to get back to the basics, and purpose is always a powerful starting point.

I was listening to Satya Nadella’s keynote at the Microsoft Worldwide Partner Conference, and I like how he walked through the Microsoft mission in a mobile-first, cloud-first world.

Here’s what Satya had to say:

“Our mission:  Empowering every person and every business on the planet to achieve more.

(We find that by going back into our history and re-discovering that core sense of purpose, that soul ... a PC in every home, democratizing client/server computing.)

We move forward to a Mobile-First, Cloud-First world.

We care about empowerment.

There is no other ecosystem that is primarily, and solely, built to help customers achieve greatness.

We are focused on helping our customers achieve greatness through digital technology.

We care about both individuals and organizations.  That intersection of people and organizations is the cornerstone of what we represent as excellence.

We are a global company.  We want to make sure that the power of technology reaches every country, every vertical, every organization, irrespective of size.

There will be many goals.

What remains constant is this sense of purpose, the reason why this ecosystem exists.

This is a mission that we go and exercise in a Mobile-First, Cloud-First world.”

If I think back to why I originally joined Microsoft, it was to empower every person on the planet to achieve more.

And the cloud is one powerful enabler.

You Might Also Like

Satya Nadella is All About Customer Focus

Satya Nadella on Everyone Has To Be a Leader

Satya Nadella on How the Key To Longevity is To Be a Learning Organization

Satya Nadella on Live and Work a Meaningful Life

Satya Nadella on The Future of Software

Categories: Architecture, Programming

R: Blog post frequency anomaly detection

Mark Needham - Sat, 07/18/2015 - 00:34

I came across Twitter’s anomaly detection library last year but hadn’t yet had a reason to take it for a test run, so having got my blog post frequency data into shape I thought it’d be fun to run it through the algorithm.

I wanted to see if it would detect any periods of time when the number of posts differed significantly – I don’t really have an action I’m going to take based on the results; it’s curiosity more than anything else!

First we need to get the library installed. It’s not on CRAN so we need to use devtools to install it from the GitHub repository:

install.packages("devtools")
devtools::install_github("twitter/AnomalyDetection")
library(AnomalyDetection)

The expected data format is two columns – one containing a time stamp and the other a count. e.g. using the ‘raw_data’ data frame that is in scope when you add the library:

> library(dplyr)
> raw_data %>% head()
            timestamp   count
1 1980-09-25 14:01:00 182.478
2 1980-09-25 14:02:00 176.231
3 1980-09-25 14:03:00 183.917
4 1980-09-25 14:04:00 177.798
5 1980-09-25 14:05:00 165.469
6 1980-09-25 14:06:00 181.878

In our case the timestamps will be the start date of a week and the count the number of posts in that week. But first let’s get some practice calling the anomaly function using the canned data:

res = AnomalyDetectionTs(raw_data, max_anoms=0.02, direction='both', plot=TRUE)
res$plot

2015 07 18 00 09 22

From this visualisation we learn that we should expect both high and low outliers to be identified. Let’s give it a try with the blog post publication data.

We need to get the data into shape so we’ll start by getting a count of the number of blog posts by (week, year) pair:

> df %>% sample_n(5)
                                                           title                date
1425                            Coding: Copy/Paste then refactor 2009-10-31 07:54:31
783  Neo4j 2.0.0-M06 -> 2.0.0-RC1: Working with path expressions 2013-11-23 10:30:41
960                                        R: Removing for loops 2015-04-18 23:53:20
966   R: dplyr - Error in (list: invalid subscript type 'double' 2015-04-27 22:34:43
343                     Parsing XML from the unix terminal/shell 2011-09-03 23:42:11
 
> byWeek = df %>% 
    mutate(year = year(date), week = week(date)) %>% 
    group_by(week, year) %>% summarise(n = n()) %>% 
    ungroup() %>% arrange(desc(n))
 
> byWeek %>% sample_n(5)
Source: local data frame [5 x 3]
 
  week year n
1   44 2009 6
2   37 2011 4
3   39 2012 3
4    7 2013 4
5    6 2010 6

Great. The next step is to translate this data frame into one containing a date representing the start of that week and the number of posts:

> data = byWeek %>% 
    mutate(start_of_week = calculate_start_of_week(week, year)) %>%
    filter(start_of_week > ymd("2008-07-01")) %>%
    select(start_of_week, n)
 
> data %>% sample_n(5)
Source: local data frame [5 x 2]
 
  start_of_week n
1    2010-09-10 4
2    2013-04-09 4
3    2010-04-30 6
4    2012-03-11 3
5    2014-12-03 3

We’re now ready to plug it into the anomaly detection function:

res = AnomalyDetectionTs(data, 
                         max_anoms=0.02, 
                         direction='both', 
                         plot=TRUE)
res$plot

2015 07 18 00 24 20

Interestingly I don’t seem to have any low end anomalies – there were a couple of really high frequency weeks when I first started writing and I think one of the other weeks contains a New Year’s Eve when I was particularly bored!

If we group by month instead only the very first month stands out as an outlier:

data = byMonth %>% 
  mutate(start_of_month = ymd(paste(year, month, 1, sep="-"))) %>%
  filter(start_of_month > ymd("2008-07-01")) %>%
  select(start_of_month, n)
res = AnomalyDetectionTs(data, 
                         max_anoms=0.02, 
                         direction='both',       
                         #longterm = TRUE,
                         plot=TRUE)
res$plot

2015 07 18 00 34 02

I’m not sure what else to do as far as anomaly detection goes but if you have any ideas please let me know!

Categories: Programming

Episode 232: Mark Nottingham on HTTP/2

Stefan Tilkov talks to Mark Nottingham, chair of the IETF (Internet Engineering Task Force) HTTP Working Group and Internet standards veteran, about HTTP/2, the new version of the Web’s core protocol. The discussion provides a glimpse behind the process of building standards. Topics covered include the history of HTTP versions, differences among those versions, and […]

Categories: Programming