Subscribe to Methods & Tools
if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!
Software Development Blogs: Programming, Software Testing, Agile Project Management
I truly believe that a lightweight, pragmatic approach to software architecture is pivotal to successfully delivering software, and that it can complement agile approaches rather than compete against them. After all, a good architecture enables agility and this doesn't happen by magic. But the people in our industry often tend to have a very different view. Historically, software architecture has been a discipline steeped in academia and, I think, subsequently feels inaccessible to many software developers. It's also not a particularly trendy topic when compared to [microservices|Node.js|Docker|insert other new thing here].
I've been distilling the essence of software architecture over the past few years, helping software teams to understand and introduce it into the way that they work. And, for me, the absolute essence of software architecture is about building firm foundations; both in terms of the team that is building the software and for the software itself. It's about technical leadership, creating a shared vision and stacking the chances of success in your favour.
I'm delighted to have been invited back to ASAS 2014 and my opening keynote is about what a software team can do in order to create those firm foundations. I'm also going to talk about some of the concrete examples of what I've done in the past, illustrating how I apply a minimal set of software architecture practices in the real world to take an idea through to working software. I'll also share some examples of where this hasn't exactly gone to plan too! I look forward to seeing you in a few weeks.
My Software architecture vs code talk has evolved considerably over the past year and I've presented a number of iterations at events in Europe and the US. If you're interested, thanks to the nice people behind the GOTO conferences, you can now watch the video of the version that I presented at GOTO Chicago 2014.
It needs a little updating (isn't that always the case!), but I've moved the example software guidebook (previously an appendix in my Software Architecture for Developers book) into a separate free and open source book on Leanpub.
techtribes.je is a side-project of mine to create a content aggregator for the tech, IT and digital sector in Jersey, Channel Islands. The code behind the techtribes.je website is open source and available on GitHub. The source for the software guidebook is also open source and available on GitHub.
The techtribes.je software guidebook is based upon the concept of a software guidebook as described in my Software Architecture for Developers book; the software guidebook is a lightweight, pragmatic way to document the "big picture" of a software system. In essence, it's my simplified version of many "software architecture document" templates you'll find out there on the web.
techtribes.je - Software Guidebook is available to download for free from Leanpub. I hope you find it useful.
There's been some interesting discussion over the past few days about Leanpub, both on Twitter and blogs. Jurgen Appelo posted Why I Don't Use Leanpub and Peter Armstrong responded. I think the biggest selling points of Leanpub as a publishing platform from an author's perspective may have been lost in the discussion. So, here's my take on why I use Leanpub for Software Architecture for Developers.
Some history
I pitched my book idea to a number of traditional publishing companies in 2008 and none of them were very interested. "Nice idea, but it won't sell" was the basic summary. A few years later I decided to self-publish my book instead and I was about to head down the route of creating PDF and EPUB versions using a combination of Pages and iBooks Author on the Mac. Why? Because I love books like Garr Reynolds' Presentation Zen and I wanted to do something similar. At first I considered simply giving the book away for free on my website but, after Googling around for self-publishing options, I stumbled across Leanpub. Despite the Leanpub bookstore being fairly sparse at the start of 2012, the platform piqued my interest and the rest is history.
The headline: book creation, publishing, sales and distribution as a service
I use Leanpub because it allows me to focus on writing content. Period. The platform takes care of creating and selling e-books in a number of different formats. I can write some Markdown, sync the files via Dropbox and publish a new version of my book within minutes.
Typesetting and layout
I frequently get asked for advice about whether Leanpub is a good platform for somebody to write a book. The number one question to ask is whether you have specific typesetting/layout needs. If you want to produce a "Presentation Zen" style book or if having control of your layout is important to you, then Leanpub isn't for you. If, however, you want to write a traditional book that mostly consists of words, then Leanpub is definitely worth taking a look at.
Leanpub uses a slightly customised version of Markdown, which is a super-simple language for writing content. Here's an example of a Markdown file from my book, and you can see the result in the online sample of my book. Leanpub does allow you to tweak things like PDF page size, font size, page breaking, section numbering, etc but you're not going to get pixel perfect typesetting. I think that Leanpub actually does a pretty fantastic job of creating good looking PDF, EPUB and MOBI format ebooks based upon the very minimal Markdown. This is especially true when you consider the huge range of ebook reader software across PCs, Macs, Android devices, Apple devices, Kindles, etc. Plus the readers themselves can mess with the fonts/font sizes too.
It's like building my own server at Rackspace versus using a "Platform as a Service" such as Cloud Foundry. You need to make a decision about the trade-off between control and simplicity/convenience. Since authoring isn't my full-time job and I have lots of other stuff to be getting on with, I'm more than happy to supply the content and let Leanpub take care of everything else for me.
Toolchain
My toolchain as a Leanpub author is incredibly simple: Dropbox and Mou. From a structural perspective, I have one Markdown file per essay and that's basically it. Leanpub does now provide support for using GitHub to store your content and I can see the potential for a simple Leanpub-aware authoring tool, but it's not rocket science. And to prove the point, a number of non-technical people here in Jersey have books on Leanpub too (e.g. Thrive with The Hive and a number of books by Richard Rolfe).
Iterative and incremental delivery
Before starting, I'd already decided that I'd like to write the book as a collection of short essays and this was cemented by the fact that Leanpub allows me to publish an in-progress ebook. I took an iterative and incremental approach to publishing the book. Rather than starting with essay number one and progressing in order, I tried to initially create a minimum viable book that covered the basics. I then fleshed out the content with additional essays once this skeleton was in place, revisiting and iterating upon earlier essays as necessary. I signed up for Leanpub in January 2012 and clicked the "Publish" button four weeks later. That first version of my book was only about ten pages in length but I started selling copies immediately.
Variable pricing and coupons
Another thing that I love about Leanpub is that it gives you full control over how you price your book. The whole pricing thing is a balancing act between readership and royalties, but I like that I'm in control of this. My book started out at $4.99 and, as content was added, that price increased. The book currently has a minimum price of $20 and a recommended price of $30. I can even create coupons for reduced price or free copies too. There's some human psychology that I don't understand here, but not everybody pays the minimum price. Far from it, and I've had a good number of people pay more than the recommended price too. Leanpub provides all of the raw data, so you can analyse it as needed.
An incubator for books
As I've already mentioned, I pitched my book idea to a bunch of regular publishing companies and they weren't interested. Fast-forward a few years and my book is currently the "bestselling" book on Leanpub this week, fifth by lifetime earnings and twelfth in terms of number of copies sold. I've used quotes around "bestselling" because Jurgen did. ;-)
In his blog post, Peter Armstrong emphasises that Leanpub is a platform for publishing in-progress ebooks, especially because you can publish using an iterative and incremental approach. For this reason, I think that Leanpub is a fantastic way for authors to prove an idea and get some concrete feedback in terms of sales. Put simply, Leanpub is a fantastic incubator for books. I know of a number of books that were started on Leanpub and have since been taken on by traditional publishing companies. I've had a number of offers too, including some for commercial translations. Sure, there are other ways to publish in-progress ebooks, but Leanpub makes this super-easy and the barrier to entry is incredibly low.
The future for my book?
What does the future hold for my book then? I'm not sure that electronic products are ever really "finished" and, although I consider my book to be "version 1", I do have some additional content that is being lined up. And when I do this, thanks to the Leanpub platform, all of my existing readers will get the updates for free.
I've so far turned down the offers that I've had from publishing companies, primarily because they can't compete in terms of royalties and I'm unconvinced that they will be able to significantly boost readership numbers. Leanpub is happy for authors to sell their books through other channels (e.g. Amazon) but, again, I'm unconvinced that simply putting the book onto Amazon will yield an increased readership. I do know of books on the Kindle store that haven't sold a single copy, so I take "Amazon is bigger and therefore better" arguments with a pinch of salt.
What I do know is that I'm extremely happy with the return on my investment. I'm not going to tell you how much I've earned, but a naive calculation of $17.50 (my royalty on a $20 sale) x 4,600 (the total number of readers) is a little high but gets you into the right ballpark. In summary, Leanpub allows me to focus on content, takes care of pretty much everything else and gives me an amazing author royalty as a result. This is why I use Leanpub.
After a lovely summer (mostly) spent in Jersey, September is right around the corner and is shaping up to be a busy month. Here's a list of the events where you'll be able to find me.
It's going to be a fun month and besides, I have to keep up my British Airways frequent flyer status somehow, right? ;-)
A few people have recently asked me for a poster/cheat sheet/quick reference of the C4 model that I use for communicating and diagramming software systems. You may have seen an old copy floating around the blog, but I've made a few updates and you can grab the new version from http://static.codingthearchitecture.com/c4.pdf (PDF, A3 size).
I was reading Dan North Visits Osper and was pleasantly surprised to see Dan mention CRC modelling. CRC is a great technique for the process of designing software, particularly when used in a group/workshop environment. It's not a technique that many people seem to know about nowadays though.
A Google search will yield lots of good explanations on the web, but basically CRC is about helping to identify the classes needed to implement a particular feature, use case, user story, etc. You basically walk through the feature from the start and whenever you identify a candidate class, you write the name of it on a 6x4 index card, additionally annotating the card with the responsibilities of the class. Every card represents a separate class. As you progress through the feature, you identify more classes and create additional cards, annotating the cards with responsibilities and also keeping a note of which classes are collaborating with one another. Of course, you can also refactor your design by splitting classes out or combining them as you progress. When you're done, you can dry-run your feature by walking through the classes (e.g. A calls B to do X, which in turn requests Y from C, etc).
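To make the mechanics a little more concrete, here is a minimal sketch of CRC cards as a data structure, written in Java. The CrcCard and CrcSession names are invented for this illustration and don't come from any tool; the main method simply replays the "A calls B to do X, which in turn requests Y from C" dry-run.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// One index card: a candidate class, its responsibilities and its collaborators.
// (Hypothetical type, for illustration only.)
final class CrcCard {
    final String name;
    final List<String> responsibilities = new ArrayList<>();
    final Set<String> collaborators = new LinkedHashSet<>();

    CrcCard(String name) {
        this.name = name;
    }
}

// The set of cards laid out on the table during a design session.
final class CrcSession {
    private final Map<String, CrcCard> cards = new LinkedHashMap<>();

    // Returns the existing card with this name, or writes out a new one.
    CrcCard card(String name) {
        return cards.computeIfAbsent(name, CrcCard::new);
    }

    // Records that 'from' asks 'to' to do something: the collaboration is noted
    // on the caller's card and the responsibility on the callee's card.
    void collaborate(String from, String to, String responsibility) {
        card(from).collaborators.add(to);
        CrcCard callee = card(to);
        if (!callee.responsibilities.contains(responsibility)) {
            callee.responsibilities.add(responsibility);
        }
    }

    // One line per card, in the order the cards were created.
    String describe() {
        StringBuilder out = new StringBuilder();
        for (CrcCard card : cards.values()) {
            out.append(card.name)
               .append(" | responsibilities: ").append(card.responsibilities)
               .append(" | collaborators: ").append(card.collaborators)
               .append('\n');
        }
        return out.toString();
    }
}

class CrcDemo {
    public static void main(String[] args) {
        CrcSession session = new CrcSession();
        session.collaborate("A", "B", "do X");      // A calls B to do X
        session.collaborate("B", "C", "provide Y"); // B requests Y from C
        System.out.print(session.describe());
    }
}
```

Real CRC sessions work better with physical cards, of course; the point of the sketch is only that each card carries three things (a name, responsibilities and collaborators) and that walking through a feature is what fills them in.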
Much of what you'll read about CRC on the web discusses how the technique is useful for teaching OO design at the class level, but I like using it at the component level when faced with architecting a software system given a blank sheet of paper. The same principles apply, but you're identifying components (or services, microservices, etc) rather than classes. When you've done this for a number of significant use cases, you end up with a decent set of CRC cards representing the core components of your software system.
From this, you can start to draw some architecture diagrams using something like my C4 model. Since the cards represent components, you can simply lay out the cards on paper, draw lines between them and you have a component diagram. Each of those components needs to be running in an execution environment (e.g. a web application, database, mobile app, etc). If you draw boxes around groups of components to represent these execution environments, you have a containers diagram. Step up one level further and you can create a simple system context diagram to show how your system interacts with the outside world. My Simple Sketches for Diagramming your Software Architecture article provides more information about the C4 model and the resulting diagrams, but hopefully you get the idea.
CRC then ... yes, it's a great technique for collaborative design, particularly when applied at the component level. And it's a nice starting point for creating software architecture diagrams too.
In Diagramming Spring MVC webapps, I presented an approach that allows you to create a fairly comprehensive model of a software system in code. It starts with you creating a simple base model that includes software systems, people and containers. With this in place, all of the components can then be automatically populated into the model via a scan of the compiled Java code. This is all based upon Software architecture as code.
Once you have a model to work with, it's relatively straightforward to visualise it via a number of views. In the Spring PetClinic example, three separate views (one each of a system context, containers and components view) are sufficient to show everything. With larger software systems, however, this isn't the case.
As an example, here's what a single component diagram for the web application of my techtribes.je system looks like.
Yup, it's a mess. The components around the left, top and right edges are Spring MVC controllers, while those in the centre are the core components. There are clearly three hotspots here - the LoggingComponent, ActivityComponent and ContentSourceComponent. The reason for the first should be obvious, in that almost all components use the LoggingComponent. The latter two are used by all controllers, simply because some common information is displayed on the header of all pages on the website. I don't mind excluding the LoggingComponent from the view, but I'd quite like to keep the other two. That aside, even excluding the ActivityComponent and ContentSourceComponent doesn't actually solve the problem here. The resulting diagram is still a mess because it's showing far too much information. Instead, another approach is needed.
With this in mind, what I've done instead is use a programmatic approach to create a number of views for the techtribes.je web application, one per Spring MVC controller. The code looks like this.
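The idea can be sketched roughly as follows. This is an illustration rather than the original listing: Component, ViewFactory and ExampleController are hypothetical names, not the API of any real tool, and the sketch assumes that the view for a controller should contain everything reachable from that controller, minus an exclusion list (here, the LoggingComponent).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model type: a named component and the components it uses.
final class Component {
    final String name;
    final boolean controller;
    final Set<Component> uses = new LinkedHashSet<>();

    Component(String name, boolean controller) {
        this.name = name;
        this.controller = controller;
    }
}

final class ViewFactory {
    // Builds one view per controller. Each view holds the names of the controller
    // and every component reachable from it, skipping anything in 'excluded'.
    static Map<String, Set<String>> viewsPerController(List<Component> components,
                                                       Set<String> excluded) {
        Map<String, Set<String>> views = new LinkedHashMap<>();
        for (Component component : components) {
            if (!component.controller) {
                continue;
            }
            Set<String> visible = new LinkedHashSet<>();
            Deque<Component> toVisit = new ArrayDeque<>();
            toVisit.push(component);
            while (!toVisit.isEmpty()) {
                Component current = toVisit.pop();
                // Skip excluded components and anything already in the view.
                if (excluded.contains(current.name) || !visible.add(current.name)) {
                    continue;
                }
                current.uses.forEach(toVisit::push);
            }
            views.put(component.name, visible);
        }
        return views;
    }
}

class ViewDemo {
    public static void main(String[] args) {
        Component logging = new Component("LoggingComponent", false);
        Component activity = new Component("ActivityComponent", false);
        Component controller = new Component("ExampleController", true); // made-up name
        activity.uses.add(logging);
        controller.uses.add(activity);
        controller.uses.add(logging);

        Map<String, Set<String>> views = ViewFactory.viewsPerController(
                List.of(logging, activity, controller), Set.of("LoggingComponent"));
        views.forEach((name, contents) -> System.out.println(name + " -> " + contents));
    }
}
```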
The result is a larger number of simple diagrams, but I think that the trade-off is worth it. It's a much better way to navigate a large model.
And here's an example component diagram that focusses on a single Spring MVC controller.
The JSON representing the techtribes.je model can be found on GitHub and you can copy-paste it into my (still in-progress) diagramming tool if you'd like to explore the model yourself. I'm still experimenting with much of this but I really like the opportunities provided by having the software architecture model in code. This really is "software architecture for developers". :-)
Simon recently talked about the gap between Software Architecture and Code and how to close this with architecturally-evident coding. He's also creating tools to allow Software Architecture to be expressed as code.
If you're working on a greenfield project then including annotations to help with navigation is a great solution but what if you've inherited a large system with a model-code gap? Or if you only realise, sometime into a project, that you lack a model to help you understand its growing complexity? Well, Simon also had some thoughts on scanning Spring annotations to provide this data. This works quite well and it got me thinking about other artifacts in code that can be extracted for these diagrams.
(In more formal terms - for those of you that like to quote ISO42010 - we are trying to extract architectural elements from code that can be displayed within architectural views. Of course the elements may be from a variety of differing abstractions, levels and granularity and therefore need to be placed within differing views.)
So what can we extract from a current/legacy system to give us a view into it? Some suggestions include:
Annotations: As already suggested, dependency injection systems such as Spring provide some annotations that can be extracted to give a basic, high level model. Annotations are also present in Java EE applications and other enterprise frameworks.
XML DI Configuration files: Many (legacy) Spring projects use XML configuration files to define beans. Having scanned a few examples, this seems to create a relatively low level model which would need some manual tweaking after generation. With a sensible naming convention for beans you can produce models for a desired abstraction. The bean properties indicate the connections between these elements.
Module Bundling Systems: Modular systems such as OSGi define bundles of components and services including lifecycle and service registry. The deployment information should provide a high level overview.
Packages: If you have used 'package-by-component' then your packages will relate one-to-one with your components. The links between components should be identifiable by the links between the classes within them (the has-a relationships). If you have package-by-layer then this is much harder or impossible to use. Experience tells me that most real-world systems are actually a combination of the two so you should have some useful information.
Class Names: It's very common for class names to contain a strong indication of their role, e.g. XyzService, XyzConnector, XyzDao, XyzFacade etc. Scanning for known patterns should identify the element names and roles.
Interfaces and class hierarchies: If you implement interfaces (or extend base classes) then the interfaces used may show the abstraction level and type, e.g. implementing Service, DAO, Connector, Repository etc.
Delegation or Library Dependency: Shared delegates used by a set of classes/functions may indicate their purpose, e.g. components delegating to a database utility might indicate a DAO component, or using a CORBA utility might indicate a service. This is likely to be time consuming as you need to identify and scan for each delegate you are interested in.
Comments/Javadoc/JSDoc/NDoc/Doclets: Comments and javadoc style API comments can provide a large amount of meta-information about a class or package. In fact, many UML modeling tools enrich code using custom comments and tags. This has the advantage of not affecting the compiled code or introducing library dependencies but may not be consistently used.
Tests: Tests can provide a lot of meta-data about your system. Unit tests tend to be concentrated around important classes and often construct entire components to test. Simply extracting the names of classes that are directly tested will produce a useful list of components. The higher level system tests will reveal the important services.
Build Systems: Build systems such as ant, maven, NuBuild etc. all have hooks into the code base for building and deployment. A simple extraction of the build targets will give you the deployment modules (which is a very helpful view for operations teams). This may give you the required information for a Containers view.
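As a rough sketch of the class-name suggestion, the snippet below scans simple class names for a handful of role suffixes. NameScanner is a hypothetical helper, and the suffix list is only the set of examples mentioned above (Service, Connector, Dao, Facade); a real codebase will have its own conventions and will need its own list.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that guesses an element's role from its class name.
final class NameScanner {
    // A name qualifies if it ends in one of the known role suffixes
    // and has at least one character before the suffix.
    private static final Pattern ROLE_SUFFIX =
            Pattern.compile("^(\\w+?)(Service|Connector|Dao|Facade)$");

    // Maps each matching class name to the role implied by its suffix.
    // Names that match no suffix are left out of the result.
    static Map<String, String> rolesOf(List<String> simpleClassNames) {
        Map<String, String> roles = new LinkedHashMap<>();
        for (String name : simpleClassNames) {
            Matcher matcher = ROLE_SUFFIX.matcher(name);
            if (matcher.matches()) {
                roles.put(name, matcher.group(2));
            }
        }
        return roles;
    }
}

class NameScannerDemo {
    public static void main(String[] args) {
        List<String> names = List.of("XyzService", "XyzDao", "XyzFacade", "StringUtils");
        NameScanner.rolesOf(names)
                   .forEach((name, role) -> System.out.println(name + " looks like a " + role));
    }
}
```

The class names would normally come from a scan of the source tree or the compiled classes; the list is hard-coded here only to keep the sketch self-contained. As with the other techniques, expect to filter and correct the output by hand.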
What do you think? Of course all of the above is very dependent on your codebase but if none of the above works then you have to question the quality and structure of the code! The data extracted may need filtering and manual correction as it won't give you exactly what you want. You might consider creating structurizr annotations using an initial scan and then maintaining them. One of my tasks for the next few weeks is to try this out on some legacy codebases I maintain.
What other ways of identifying architectural elements can you think of?
Regular readers will know that I'm a big fan of pictures, especially as a mechanism for communicating the structure of software systems. To this end, I sometimes introduce myself as somebody who teaches people how to draw pictures. This is often said with a smile, because on the face of it this is exactly what I appear to be doing. But there's less truth to this than there initially appears.
Something that we do on the full 2-day training course and the 1-day sketching workshop is to get people drawing some pictures to communicate the structure of a software system. This is either a solution to the financial risk system or their own software. Over the years, I've done this for thousands of people and pretty much everybody struggles to draw something that effectively communicates the software. The same challenges crop up again and again.
The diagrams that are produced during the initial 90-minute timebox are usually fairly confusing and exhibit problems ranging from a lack of clarity around the abstractions and notation used through to diagrams that show too little or too much detail. This is an aside, but the majority of the diagrams are informal "boxes and lines" rather than UML.
After some reviews and feedback, I introduce people to my C4 model. In a nutshell, it's all about thinking of a software system as being made up of containers (web applications, databases, mobile devices, etc), each of which contains components, which in turn are made up of classes. There are some variations of this depending on the technology you're using, but that's basically it. With this structure in mind, you can then draw a diagram at each level in turn, which leads to a system context diagram, a container diagram, one or more component diagrams and (optionally) some class diagrams. If you want a slightly more detailed introduction to C4, take a look at Simple sketches for diagramming your software architecture.
Back to the workshops, and iteration two is another 90-minute timebox in which teams can redraw their diagrams. I don't actually mandate that people should follow my C4 approach, but most do, and what happens next is really interesting. The conversations change. In stark contrast to the first timebox, the conversations are much more technical. Gone are the discussions about what to include on each diagram and instead the focus is on the technical aspects of the software. Instead of "what should we show on this diagram?", the questions I hear are more along the lines of "what are the responsibilities of this thing?" and "how does this thing talk to that thing?". In essence, the discussion returns to be about software design, which is exactly where it should be.
There are a number of ways in which you can communicate the architecture of a software system, so I don't want to pitch my C4 approach as "the one true way". What I do want to say is that providing even a little guidance in this area can go a long way. I ask people for their thoughts after the second timebox and the consensus is that the diagrams are easier to draw and comprehend because of the "framework" (their words, not mine) that's been provided. All I'm doing is providing some constraints that people might want to work within, and this frees them up to focus on what's important again.
Back to that comment about introducing myself as somebody who teaches people how to draw pictures. It turns out that this isn't actually what I do. I don't really care so much *how* people draw pictures. What I do care about is that they use a well-understood, consistent and meaningful set of *abstractions* to think about and therefore describe their software. It turns out that modelling our software isn't dead after all. ;-)
If you want evidence that the software development industry is susceptible to fashion, just go and take a look at all of the hype around microservices. It's everywhere! For some people microservices is "the next big thing", whereas for others it's simply a lightweight evolution of the big SOAP service-oriented architectures that we saw 10 years ago "done right". I do like a lot of what the current microservice architectures are doing, but it's by no means a silver bullet. Okay, I know that sounds obvious, but I think many people are jumping on them for the wrong reason.
I often show this slide in my conference talks, and I've blogged about this before, but basically there are different ways to build software systems. On the one side we have traditional monolithic systems, where everything is bundled up inside a single deployable unit. This is probably where most of the industry is. Caveats apply, but monoliths can be built quickly and are easy to deploy, but they provide limited agility because even tiny changes require a full redeployment. We also know that monoliths often end up looking like a big ball of mud because of the way that software often evolves over time. For example, many monolithic systems are built using a layered architecture, and it's relatively easy for layered architectures to be abused (e.g. skipping "around" a service to call the repository/data access layer directly).
On the other side we have service-based architectures, where a software system is made up of many separately deployable services. Again, caveats apply but, if done well, service-based architectures buy you a lot of flexibility and agility because each service can be developed, tested, deployed, scaled, upgraded and rewritten separately, especially if the services are decoupled via asynchronous messaging. The downside is increased complexity because your software system now has many more moving parts than a monolith. As Robert says, the complexity is still there, you're just moving it somewhere else.
There is, of course, a mid-ground here. We can build monolithic systems that are made up of in-process components, each of which has an explicit well-defined interface and set of responsibilities. This is old-school component-based design that talks about high cohesion and low coupling, but I usually sense some hesitation when I talk about it. And this seems odd to me. Before I explain why, let me quote something from a blog post that I read earlier this morning about the rationale behind a team adopting a microservices approach. When we started building Karma, we decided to split the project into two main parts: the backend API, and the frontend application. The backend is responsible for handling orders from the store, usage accounting, user management, device management and so forth, while the frontend offers a dashboard for users which accesses this API. Along the way we noticed that if the whole backend API is monolithic it doesn't work very well because everything gets entangled.
The blog post also mentions scaling, versioning and multiple languages/frameworks as other reasons to choose microservices. Again, there are no silver bullets here, everything is a trade-off. Anyway, "everything getting entangled" is not a reason to switch from monoliths to microservices. If you're building a monolithic system and it's turning into a big ball of mud, perhaps you should consider whether you're taking enough care of your software architecture. Do you really understand what the core structural abstractions are in your software? Are their interfaces and responsibilities clear too? If not, why do you think moving to a microservices architecture will help? Sure, the physical separation of services will force you to not take some shortcuts, but you can achieve the same separation between components in a monolith. A little design thinking and an architecturally-evident coding style will help to achieve this without the baggage of going distributed.
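To illustrate what that separation might look like inside a single deployable unit, here is a small sketch in Java. The names (UsageAccounting, InMemoryUsageAccounting, Dashboard) are made up for this example, loosely borrowing the "usage accounting" and "dashboard" responsibilities from the quote above; none of it is taken from a real codebase.

```java
import java.util.HashMap;
import java.util.Map;

// The component's explicit interface: the only type other components depend on.
interface UsageAccounting {
    void record(String deviceId, long bytes);

    long totalFor(String deviceId);
}

// The implementation detail. In a real codebase this would live in the component's
// own package with package-private visibility, so nothing else could reach around
// the interface.
final class InMemoryUsageAccounting implements UsageAccounting {
    private final Map<String, Long> totals = new HashMap<>();

    @Override
    public void record(String deviceId, long bytes) {
        totals.merge(deviceId, bytes, Long::sum);
    }

    @Override
    public long totalFor(String deviceId) {
        return totals.getOrDefault(deviceId, 0L);
    }
}

// A different component that only knows about the interface.
final class Dashboard {
    private final UsageAccounting usage;

    Dashboard(UsageAccounting usage) {
        this.usage = usage;
    }

    String summary(String deviceId) {
        return deviceId + ": " + usage.totalFor(deviceId) + " bytes";
    }
}

class MonolithDemo {
    public static void main(String[] args) {
        UsageAccounting usage = new InMemoryUsageAccounting();
        usage.record("device-1", 100);
        usage.record("device-1", 50);
        System.out.println(new Dashboard(usage).summary("device-1"));
    }
}
```

Everything still runs in one process and ships as one unit, but the dependency between the two components goes through a narrow, named interface, which is the same boundary you would need to find before splitting anything out into a separate service.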
Many of the teams I've spoken to are building monolithic systems and don't want to look at component-based design. The mid-ground seems to be a hard-sell. I ran a software architecture sketching workshop with a team earlier this year where we diagrammed one of their software systems. The diagram started as a strictly layered architecture (presentation, business services, data access) with all arrows pointing downwards and each layer only ever calling the layer directly beneath it. The code told a different story though and the eventual diagram didn't look so neat anymore. We discussed how adopting a package by component approach could fix some of these problems, but the response was, "meh, we like building software using layers".
It seems as if teams are jumping on microservices because they're sexy, but the design thinking and decomposition strategy required to create a good microservices architecture are the same as those needed to create a well structured monolith. If teams find it hard to create a well structured monolith, I don't rate their chances of creating a well structured microservices architecture. As Michael Feathers recently said, "There's a bit of overhead involved in implementing each microservice. If they ever become as easy to create as classes, people will have a freer hand to create trouble - hulking monoliths at a different scale.". I agree. A world of distributed big balls of mud worries me.
Following on from my previous post (Software architecture as code) where I demonstrated how to create a software architecture model as code, I decided to throw together a quick implementation of a Spring component finder that could be used to (mostly) automatically create a model of a Spring MVC web application. Spring has a bunch of annotations (e.g. @Controller, @Component, @Service and @Repository) and these are often/can be used to signify the major building blocks of a web application. To illustrate this, I took the Spring PetClinic application and produced some diagrams for it. First is a context diagram.
Next up are the containers, which in this case are just a web server (e.g. Apache Tomcat) and a database (HSQLDB by default).
And finally we have a diagram showing the components that make up the web application. These, and their dependencies, were found by scanning the compiled version of the application (I cloned the project from GitHub and ran the Maven build).
Here is the code that I used to generate the model behind the diagrams.
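The general technique can be sketched in plain Java as follows. This is not the original code: the annotations below are stand-ins for Spring's stereotypes so that the example compiles without Spring on the classpath, ComponentFinder is a hypothetical name, the three annotated classes are illustrative only, and the sketch assumes dependencies can be read from field types rather than from a full classpath scan.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Stand-ins for Spring's stereotype annotations (assumption: retained at runtime).
@Retention(RetentionPolicy.RUNTIME) @interface Controller {}
@Retention(RetentionPolicy.RUNTIME) @interface Service {}
@Retention(RetentionPolicy.RUNTIME) @interface Repository {}

// Illustrative application classes.
@Repository class OwnerRepository {}
@Service class ClinicService { OwnerRepository owners; }
@Controller class OwnerController { ClinicService clinic; }
class StringHelper {} // not annotated, so not a component

final class ComponentFinder {
    static boolean isComponent(Class<?> type) {
        return type.isAnnotationPresent(Controller.class)
                || type.isAnnotationPresent(Service.class)
                || type.isAnnotationPresent(Repository.class);
    }

    // Maps each annotated class to the annotated classes it holds as fields.
    static Map<String, Set<String>> scan(List<Class<?>> candidates) {
        Map<String, Set<String>> model = new LinkedHashMap<>();
        for (Class<?> type : candidates) {
            if (!isComponent(type)) {
                continue;
            }
            Set<String> dependencies = new LinkedHashSet<>();
            for (Field field : type.getDeclaredFields()) {
                if (isComponent(field.getType())) {
                    dependencies.add(field.getType().getSimpleName());
                }
            }
            model.put(type.getSimpleName(), dependencies);
        }
        return model;
    }
}

class ComponentFinderDemo {
    public static void main(String[] args) {
        Map<String, Set<String>> model = ComponentFinder.scan(List.of(
                OwnerController.class, ClinicService.class,
                OwnerRepository.class, StringHelper.class));
        model.forEach((component, uses) -> System.out.println(component + " uses " + uses));
    }
}
```

A real finder would discover the candidate classes by scanning the compiled application rather than taking a hard-coded list, and would also need to follow interfaces through to their implementations.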
The resulting JSON representing the model was then copy-pasted across into my simple (and very much in progress) diagramming tool. Admittedly the diagrams are lacking on some details (i.e. component responsibilities and arrow annotations, although those can be fixed), but this approach proves you can expend very little effort to get something that is relatively useful. As I've said before, it's all about getting the abstractions right.
If you've been following the blog, you will have seen a couple of posts recently about the alignment of software architecture and code. Software architecture vs code talks about the typical gap between how we think about the software architecture vs the code that we write, while An architecturally-evident coding style shows an example of how to ensure that the code does reflect those architectural concepts. The basic summary of the story so far is that things get much easier to understand if your architectural ideas map simply and explicitly into the code.
Regular readers will also know that I'm a big fan of using diagrams to visualise and communicate the architecture of a software system, and this "big picture" view of the world is often hard to see from the thousands of lines of code that make up our software systems. One of the things that I teach people during my sketching workshops is how to sketch out a software system using a small number of simple diagrams, each at a separate level of abstraction. This is based upon my C4 model, which you can find an introduction to at Simple sketches for diagramming your software architecture. The feedback from people using this model has been great, and many have a follow-up question of "what tooling would you recommend?". My answer has typically been "Visio or OmniGraffle", but it's obvious that there's an opportunity here.
Representing the software architecture model in code
I've had a lot of different ideas over the past few months for how to create, what is essentially, a lightweight modelling tool and for some reason, all of these ideas came together last week while I was at the GOTO Amsterdam conference. I'm not sure why, but I had a number of conversations that inspired me in different ways, so I skipped one of the talks to throw some code together and test out some ideas. This is basically what I came up with...
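To give a flavour of what describing a system as code can look like, here is a minimal sketch. It is not the code from the post: the class names (`Element`, `Relationship`, `Container`, `SoftwareSystem`), the `addContainer` and `uses` methods, and the container names, technologies and relationships are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model classes, written for this sketch only.
class Element {
    final String name;
    final String description;
    final List<Relationship> relationships = new ArrayList<>();

    Element(String name, String description) {
        this.name = name;
        this.description = description;
    }

    // Records a directed relationship from this element to another one.
    void uses(Element destination, String description) {
        relationships.add(new Relationship(this, destination, description));
    }
}

class Relationship {
    final Element source;
    final Element destination;
    final String description;

    Relationship(Element source, Element destination, String description) {
        this.source = source;
        this.destination = destination;
        this.description = description;
    }
}

class Container extends Element {
    final String technology;

    Container(String name, String description, String technology) {
        super(name, description);
        this.technology = technology;
    }
}

class SoftwareSystem extends Element {
    final List<Container> containers = new ArrayList<>();

    SoftwareSystem(String name, String description) {
        super(name, description);
    }

    // Containers are only created through their owning system,
    // which is one small way that code can constrain the model.
    Container addContainer(String name, String description, String technology) {
        Container container = new Container(name, description, technology);
        containers.add(container);
        return container;
    }
}

class ModelSketch {
    static SoftwareSystem build() {
        SoftwareSystem system = new SoftwareSystem("techtribes.je", "Content aggregator for the local tech community");
        // Container names, technologies and relationships are placeholders (assumptions).
        Container web = system.addContainer("Web Application", "Serves the website", "Web server");
        Container updater = system.addContainer("Content Updater", "Aggregates content", "Standalone Java application");
        Container database = system.addContainer("Database", "Stores content", "Database");
        web.uses(database, "Reads from");
        updater.uses(database, "Writes to");
        return system;
    }

    public static void main(String[] args) {
        for (Container container : build().containers) {
            System.out.println(container.name + " [" + container.technology + "]");
            for (Relationship relationship : container.relationships) {
                System.out.println("  " + relationship.description + " -> " + relationship.destination.name);
            }
        }
    }
}
```

Because the model is just objects, it can be checked into version control next to the code it describes, and rules such as "a container always belongs to a software system" are enforced by the types.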
It's a description of the context and container levels of my C4 model for the techtribes.je system. Hopefully it doesn't need too much explanation if you're familiar with the model, although there are some ways in which the code can be made simpler and more fluent. Since this is code though, we can easily constrain the model and version it. This approach works well for the high-level architectural concepts because there are very few of them, plus it's hard to extract this information from the code. But I don't want to start crafting up a large amount of code to describe the components that reside in each container, particularly as there are potentially lots of them and I'm unsure of the exact relationships between them.
Scanning the codebase for components
If your code does reflect your architecture (i.e. you're using an architecturally-evident coding style), the obvious solution is to just scan the codebase for those components, and use those to automatically populate the model. How do we signify what a "component" is? In Java, we can use annotations...
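As a rough sketch of the idea, a marker annotation plus a reflective check is enough to tell components apart from ordinary classes. The annotation name `@ArchitecturalComponent`, its `description` attribute and the example types below are hypothetical. To keep the example dependency-free, the candidate classes are simply passed in rather than discovered by scanning the classpath.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker annotation; the name and attribute are assumptions.
@Retention(RetentionPolicy.RUNTIME) // keep it in the bytecode and visible to reflection
@Target(ElementType.TYPE)
@interface ArchitecturalComponent {
    String description() default "";
}

@ArchitecturalComponent(description = "Provides access to tweets")
interface TweetComponent { }

@ArchitecturalComponent(description = "Retrieves information from GitHub")
interface GitHubConnector { }

// Not annotated, so it is not treated as a component.
class StringHelper { }

class ComponentFinderSketch {
    // Returns only those candidates that carry the marker annotation.
    static List<Class<?>> findComponents(Class<?>... candidates) {
        List<Class<?>> found = new ArrayList<>();
        for (Class<?> candidate : candidates) {
            if (candidate.isAnnotationPresent(ArchitecturalComponent.class)) {
                found.add(candidate);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        for (Class<?> component : findComponents(TweetComponent.class, GitHubConnector.class, StringHelper.class)) {
            ArchitecturalComponent marker = component.getAnnotation(ArchitecturalComponent.class);
            System.out.println(component.getSimpleName() + ": " + marker.description());
        }
    }
}
```

The `RUNTIME` retention policy matters here: with the default retention, the annotation would be recorded in the class file but would not be visible through reflection.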
Identifying those components is then a matter of scanning the source or the compiled bytecode. I've played around with this idea on and off for a few months, using a combination of Java annotations along with annotation processors and libraries including Scannotation, Javassist and JDepend. The Reflections library on Google Code makes this easy to do, and now I have a simple Java program that looks for my component annotation on classes in the classpath and automatically adds those to the model. As for the dependencies between components, again this is fairly straightforward to do with Reflections. I have a bunch of other annotations too, for example to represent dependencies between a component and a container or software system, but the principle is still the same - the architecturally significant elements and their dependencies can mostly be embedded in the code.
Creating some views
The model itself is useful, but ideally I want to look at that model from different angles, much like the diagrams that I teach people to draw when they attend my sketching workshop. After a little thought about what this means and what each view is constrained to show, I created a simple domain model to represent the context, container and component views...
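A minimal sketch of such a domain model might look like the following. The type names and, in particular, the rules about which kinds of element each view accepts are my assumptions for illustration; they are not taken from the actual code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical element kinds, loosely following the levels of the C4 model.
enum ElementKind { PERSON, SOFTWARE_SYSTEM, CONTAINER, COMPONENT }

class ModelElement {
    final String name;
    final ElementKind kind;

    ModelElement(String name, ElementKind kind) {
        this.name = name;
        this.kind = kind;
    }
}

// A view is a constrained selection of elements from the model.
abstract class View {
    private final List<ModelElement> elements = new ArrayList<>();

    // Each concrete view states which kinds of element it is allowed to show.
    abstract boolean permits(ElementKind kind);

    void add(ModelElement element) {
        if (!permits(element.kind)) {
            throw new IllegalArgumentException(
                    element.kind + " is not allowed on a " + getClass().getSimpleName());
        }
        elements.add(element);
    }

    List<ModelElement> getElements() {
        return Collections.unmodifiableList(elements);
    }
}

// Assumption: a context view shows only people and software systems.
class ContextView extends View {
    @Override
    boolean permits(ElementKind kind) {
        return kind == ElementKind.PERSON || kind == ElementKind.SOFTWARE_SYSTEM;
    }
}

// Assumption: a container view adds containers, but never components.
class ContainerView extends View {
    @Override
    boolean permits(ElementKind kind) {
        return kind != ElementKind.COMPONENT;
    }
}

// Assumption: a component view may show any kind of element.
class ComponentView extends View {
    @Override
    boolean permits(ElementKind kind) {
        return true;
    }
}
```

Putting the constraint in `add` means a diagram cannot accidentally mix levels of abstraction: trying to place a component on a `ContextView` fails immediately instead of producing a confusing picture.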
Again, this is all in code so it's quick to create, versionable and very customisable.
Exporting the model
Now that I have a model of my software system and a number of views that I'd like to see, I could do with drawing some pictures. I could create a diagramming tool in Java that reads the model directly, but perhaps a better approach is to serialize the object model out to an external format so that other tools can use it. And that's what I did, courtesy of the Jackson library. The resulting JSON file is over 600 lines long (you can see it here), but don't forget most of this has been generated automatically by Java code scanning for components and their dependencies.
Visualising the views
All of the C4 model Java code is open source and sitting on GitHub. This is only a few hours of work so far and there are no tests, so think of this as a prototype more than anything else at the moment. I really like the simplicity of capturing a software architecture model in code, and using an architecturally-evident coding style allows you to create large chunks of that model automatically. This also opens up the door to some other opportunities such as automated build plugins, lightweight documentation tooling, etc. Caveats apply with the applicability of this to all software systems, but I'm excited at the possibilities. Thoughts?
Okay, this is the separate blog post that I referred to in Software architecture vs code. What exactly do we mean by an "architecturally-evident coding style"? I built a simple content aggregator for the local tech community here in Jersey called techtribes.je, which is basically made up of a web server, a couple of databases and a standalone Java application that is responsible for actually aggregating the content displayed on the website. You can read a little more about the software architecture at techtribes.je - containers. The following diagram is a zoom-in of the standalone content updater application, showing how it's been decomposed.
This diagram says that the content updater application is made up of a number of core components (which are shown on a separate diagram for brevity) and an additional four components - a scheduled content updater, a Twitter connector, a GitHub connector and a news feed connector. This diagram shows a really nice, simple architecture view of how my standalone content updater application has been decomposed into a small number of components. "Component" is a hugely overloaded term in the software development industry, but essentially all I'm referring to is a collection of related behaviour sitting behind a nice clean interface.
Back to the "architecturally-evident coding style" and the basic premise is that the code should reflect the architecture. In other words, if I look at the code, I should be able to clearly identify each of the components that I've shown on the diagram. Since the code for techtribes.je is open source and on GitHub, you can go and take a look for yourself as to whether this is the case. And it is ... there's a je.techtribes.component package that contains sub-packages for each of the components shown on the diagram. From a technical perspective, each of these is simply a Spring Bean with a public interface and a package-protected implementation. That's it; the code reflects the architecture as illustrated on the diagram.
So what about those core components then? Well, here's a diagram showing those.
Again, this diagram shows a nice simple decomposition of the core of my techtribes.je system into coarse-grained components. And again, browsing the source code will reveal the same one-to-one mapping between boxes on the diagram and packages in the code. This requires conscious effort to do but I like the simple and explicit nature of the relationship between the architecture and the code.
When architecture and code don't match
The interesting part of this story is that while I'd always viewed my system as a collection of "components", the code didn't actually look like that. To take an example, there's a tweet component on the core components diagram, which basically provides CRUD access to tweets in a MongoDB database. The diagram suggests that it's a single black box component, but my initial implementation was very different. The following diagram illustrates why.
My initial implementation of the tweet component looked like the picture on the left - I'd taken a "package by layer" approach and broken my tweet component down into a separate service and data access object. This is your stereotypical layered architecture that many (most?) books and tutorials present as a way to build (e.g.) web applications. It's also pretty much how I've built most software in the past too and I'm sure you've seen the same, especially in systems that use a dependency injection framework where we create a bunch of things in layers and wire them all together. Layered architectures have a number of benefits but they aren't a silver bullet.
This is a great example of where the code doesn't quite reflect the architecture - the tweet component is a single box on an architecture diagram but implemented as a collection of classes across a layered architecture when you look at the code. Imagine having a large, complex codebase where the architecture diagrams tell a different story from the code. The easy way to fix this is to simply redraw the core components diagram to show that it's really a layered architecture made up of services collaborating with data access objects. The result is a much more complex diagram but it also feels like that diagram is starting to show too much detail.
The other option is to change the code to match my architectural vision. And that's what I did. I reorganised the code to be packaged by component rather than packaged by layer. In essence, I merged the services and data access objects together into a single package so that I was left with a public interface and a package protected implementation. Here's the tweet component on GitHub.
But what about...
Again, there's a clean simple mapping from the diagram into the code and the code cleanly reflects the architecture. It does raise a number of interesting questions though.
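To make the "package by component" idea concrete, here is a simplified sketch rather than the actual techtribes.je code. The method names, the in-memory DAO and the factory are assumptions. Everything lives in one file so that it compiles standalone; in a real codebase `TweetComponent` would be the only `public` type in its package.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The component's public face. In a real codebase this would be declared
// `public` in its own file; it is the only type that consumers should see.
interface TweetComponent {
    void store(String twitterId, String text);
    List<String> getTweets(String twitterId);
}

// Everything below is package-private: an internal detail of the component.
interface TweetDao {
    void insert(String twitterId, String text);
    List<String> findByTwitterId(String twitterId);
}

// In-memory stand-in for a database-backed DAO, used to keep the sketch runnable.
class InMemoryTweetDao implements TweetDao {
    private final Map<String, List<String>> tweetsById = new HashMap<>();

    @Override
    public void insert(String twitterId, String text) {
        tweetsById.computeIfAbsent(twitterId, id -> new ArrayList<>()).add(text);
    }

    @Override
    public List<String> findByTwitterId(String twitterId) {
        // Return a copy so that callers cannot modify the stored tweets.
        return new ArrayList<>(tweetsById.getOrDefault(twitterId, Collections.<String>emptyList()));
    }
}

// The "service" layer still exists, but only inside the component.
class TweetComponentImpl implements TweetComponent {
    private final TweetDao dao;

    TweetComponentImpl(TweetDao dao) {
        this.dao = dao;
    }

    @Override
    public void store(String twitterId, String text) {
        if (text == null || text.isEmpty()) {
            throw new IllegalArgumentException("Tweet text is required");
        }
        dao.insert(twitterId, text);
    }

    @Override
    public List<String> getTweets(String twitterId) {
        return dao.findByTwitterId(twitterId);
    }
}

// Stand-in for the dependency injection configuration that wires the component together.
class TweetComponentFactory {
    static TweetComponent create() {
        return new TweetComponentImpl(new InMemoryTweetDao());
    }
}

class TweetComponentDemo {
    public static void main(String[] args) {
        TweetComponent tweets = TweetComponentFactory.create();
        tweets.store("example", "Hello from the tweet component");
        System.out.println(tweets.getTweets("example"));
    }
}
```

Consumers only ever hold a `TweetComponent`, so they cannot reach the DAO directly, and swapping `InMemoryTweetDao` for a different storage implementation is a change confined to the component. The `main` method also exercises the component the way a black-box component test would, through its interface only.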
This is still a layered architecture, it's just that the layers are now a component implementation detail rather than being first-class architectural building blocks. And that's nice, because I can think about my components as being my architecturally significant structural elements and it's these building blocks that are defined in my dependency injection framework. Something I often see in layered architectures is code bypassing a services layer to directly access a DAO or repository. These sorts of shortcuts are exactly why layered architectures often become corrupted and turn into big balls of mud. In my codebase, if any consumer wants access to tweets, they are forced to use the tweet component in its entirety because the DAO is an internal implementation detail. And because I have layers inside my component, I can still switch out my tweet data storage from MongoDB to something else. That change is still isolated.
Component testing vs unit testing
Ah, unit testing. Bundling up my tweet service and DAO into a single component makes the resulting tweet component harder to unit test because everything is package protected. Sure, it's not impossible to provide a mock implementation of the MongoDBTweetDao but I need to jump through some hoops. The other approach is to simply not do unit testing and instead test my tweet component through its public interface. DHH recently published a blog post called Test-induced design damage and I agree with the overall message; perhaps we are breaking up our systems unnecessarily just in order to unit test them. There's very little to be gained from unit testing the various sub-parts of my tweet component in isolation, so in this case I've opted to do automated component testing instead where I test the component as a black-box through its component interface. MongoDB is lightweight and fast, with the resulting component tests running acceptably quickly for me, even on my ageing MacBook Air. I'm not saying that you should never unit test code in isolation, and indeed there are some situations where component testing isn't feasible. For example, if you're using asynchronous and/or third party services, you probably do want the ability to provide a mock implementation for unit testing. The point is that we shouldn't blindly create designs where everything can be mocked out and unit tested in isolation.
Food for thought
The purpose of this blog post was to provide some more detail around how to ensure that code reflects architecture and to illustrate an approach to do this. I like the structure imposed by forcing my codebase to reflect the architecture. It requires some discipline and thinking about how to neatly carve-up the responsibilities across the codebase, but I think the effort is rewarded. It's also a nice stepping stone towards micro-services. My techtribes.je system is constructed from a number of in-process components that I treat as my architectural building blocks. The thinking behind creating a micro-services architecture is essentially the same, albeit the components (services) are running out-of-process. This isn't a silver bullet by any means, but I hope it's provided some food for thought around designing software and structuring a codebase with an architecturally-evident coding style.
I presented two talks last week with the title "Software architecture vs code" - first as the opening keynote for the inaugural Software Design and Development conference and also the next day as a regular conference session at GOTO Chicago. Videos from both should be available at some point and the slides are available now. The talk itself seems to polarise people, with responses ranging from "Without a doubt, Simon delivered one of the best keynotes I have seen. I got a lot from it, with plenty 'food for thought' moments." through to "hmmm, meh".
Separating software architecture from code
The basic premise of the talk is that the architecture and code of a software system never quite match up. The traditional way to communicate the architecture of a software system is with diagrams based upon a number of views ... a logical view, a functional view, a module view, a physical view, etc, etc. Philippe Kruchten's 4+1 model is an example often cited as a starting point for such approaches. I've followed these approaches in the past myself and, although I can get my head around them, I don't find them an optimal way to describe a software system. The "why?" has taken me a while to figure out, but the thing I dislike is the way in which you get an artificial separation between the architecture-related views (logical, module, functional, etc) and the code-related views (implementation, design, etc). I don't like treating the architecture and the code as two separate things, but this seems to be the starting point for many of the ways in which software systems are communicated/documented. If you want a good example of this, take a look at the first chapter of "Software Architecture in Practice" where it describes the relationship between modules, components, and component instances. It makes my head hurt.
The model-code gap
This difference between the architecture and code views is also exaggerated by what George Fairbanks calls the "model-code gap" in his book titled "Just Enough Software Architecture" (highly recommended reading, by the way). George basically says that your architecture models will include abstract concepts (e.g. components, services, modules, etc) but the code usually doesn't reflect this. This matches my own experience of helping people communicate their software systems ... people will usually draw components or services, but the actual implementation is a bunch of classes sitting inside a traditional layered architecture. Actually, if I'm being honest, this matches my own experience of building software myself because I've done the same thing! :-)
The intersection of software architecture and code
My approach to all of this is to ensure that the architecture and code views of a software system are one and the same thing, albeit from different levels of abstraction. In other words, my primary focus when describing a software system is the static structure, which ranges from code (classes) right up through components and containers. I model this with my C4 approach, which recognises that software developers are the primary stakeholders in software architecture. Other views of the software system (deployment, infrastructure, etc) slot into place really easily when you understand the static structure.
To put this all very simply, your code should reflect the architecture diagrams that you draw. If your diagrams include abstract concepts such as components, your code should reflect this. If the diagrams and code don't line up, you have to question the value of the diagrams because they're creating a fantasy and there's little point in referring to them.
Challenging the traditional layered approach
This deserves a separate blog post, but something I also mentioned during the talk was that teams should challenge the traditional layered architecture and the way that we structure our codebase. One way to achieve a nice mapping between architecture and code is to ensure that your code reflects the abstract concepts shown on your architecture diagrams, which can be achieved by writing components rather than classes in layers. Another side-effect of changing the organisation of the code is less test-induced design damage. The key question to ask here is whether layers are architecturally significant building blocks or merely an implementation detail, which should be wrapped up inside of (e.g.) components. As I said, this needs a separate blog post.
Thoughts?
As I said, the slides are here. Aligning the architecture and the code raises a whole bunch of interesting questions but provides some enormous benefits for a software development team. A clean mapping between diagrams and code makes a software system easy to explain, the impact of change becomes easier to understand and architectural refactorings can seem much less daunting if you know what you have and where you want to get to. I'm interested in your thoughts on things like the following:
Convincing people to structure the code underlying their monolithic systems as a bunch of collaborating components seems to be a hard pill to swallow, yet micro-service architectures are going to push people to reconsider how they structure a software system, so I think this discussion is worth having. Thoughts?