Subscribe to Methods & Tools
if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!
Software Development Blogs: Programming, Software Testing, Agile Project Management
Subscribe to Methods & Tools
if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!
In response to my System Context diagram as code post yesterday was this question:
@simonbrown why is that information not already in the system's code?— Nat Pryce (@natpryce) January 12, 2015
I've often asked the same thing and, if the code is the embodiment/implementation of the architecture, this information really should be present in the code. But my experience suggests this is rarely the case.System context
My starting point for describing a software system is to draw a system context diagram. This shows the system in question along with key user types (e.g. actors, roles, personas, etc) and system dependencies.
I should be able to get a list of user roles from the code. For example, many web applications will have some configuration that describes the various user roles, Active Directory groups, etc and the parts of the web application that they have access too. This will differ from codebase to codebase and technology to technology, but in theory this information is available somewhere.
The key system dependencies are a little harder to extract from a codebase. Again, we can scrape security configuration to identify links to systems such as LDAP and Active Directory. We could also search the codebase for links to known libraries or APIs, and make the assumption that these are a system dependencies. But what about those system interactions that are done by copying a file into a network share? I know this sounds archaic, but it still happens. Understanding inbound dependencies is also tricky, especially if you don't keep track of your API consumers.Containers
The next level in my C4 model is a container diagram, which basically shows the various web applications, mobile apps, databases, file systems, standalone applications, etc and how they interact to form the overall software system. Again, some of this information will be present, in one form or another, in the codebase. For example, you could scrape this information out of an IDE such as IntelliJ IDEA (i.e. modules) or Visual Studio (i.e. projects). The output from build scripts for code (e.g. Ant, Maven, MSBuild, etc) and infrastructure (e.g. Puppet, Chef, Vagrant, Docker, etc) will probably result in deployable units, which can again be identified and this information used to create the containers model.Components
The third level of the C4 model is components (or modules, services, layers, etc). Since even a relatively small application may consist of a large number of components, this is a level that we certainly want to automate. But it turns out that even this is tricky. Usually there's a lack of an architecturally-evident coding style, which means you get a conflict between the software architecture model and the code. This is particularly true in older systems where the codebase lacks modularity and looks like a sea of thousands of classes interacting with one another. As Robert Annett suggests, there are a number of strategies that we can use to identify "components" from a codebase though; including annotations/attributes, packaging conventions, naming conventions, module systems (e.g. OSGi), library dependencies and so on.Auto-generating the software architecture model
Ultimately, I'd like to auto-generate as much of the software architecture model as possible from the code, but this isn't currently realistic. Why?January 13, 2015
We face two key challenges here. First of all, we need to get people thinking about software architecture once again so that they are able to think about, describe and discuss the various structures needed to reason about a large and/or complex software system. And secondly, we need to find a way to get these structures into the codebase. We have a way to go but, in time, I hope that the thought of using Microsoft Visio for drawing software architecture diagrams will seem ridiculous.
As I said in Resolving the conflict between software architecture and code, my focus for this year is representing a software architecture model as code. In Simple Sketches for Diagramming Your Software Architecture, I showed an example System Context diagram for my techtribes.je website.
It's a simple diagram that shows techtribes.je in the middle, surrounded by the key types of users and system dependencies. It's your typical "big picture" view. This diagram was created using OmniGraffle (think Microsoft Visio for Mac OS X) and it's exactly that - a static diagram that needs to be manually kept up to date. Instead, wouldn't it be great if this diagram was based upon a model that we could better version control, collaborate on and visualise? If you're not sure what I mean by a "model", take a look at Models, sketches and everything in between.
This is basically what the aim of Structurizr is. It's a way to describe a software architecture model as code, and then visualise it in a simple way. The Structurizr Java library is available on GitHub and you can download a prebuilt binary. Just as a warning, this is very much a work in progress and so don't be surprised if things change! Here's some Java code to recreate the techtribes.je System Context diagram.
Don't worry, there will eventually be an API for uploading software architecture models and the diagrams will get some styling, but it proves the concept. What we have then is an API that implements the various levels in my C4 software architecture model, with a simple browser-based rendering tool. Hopefully that's a nice simple introduction of how to represent a software architecture model as code, and gives you a flavour for the sort of direction I'm taking it. Having the software architecture as code provides some interesting opportunities that you don't get with static diagrams from Visio, etc and the ability to keep the models up to date automatically by scanning the codebase is what I find particularly exciting. If you have any thoughts on this, please do drop me a note.
Eoin Woods (co-author of the Software Systems Architecture book) and I presented a session at the Software Architect 2014 conference titled Models, sketches and everything in between, where we discussed the differences between diagrams and models for capturing and communicating the software architecture of a system.Just the mention of the word "modelling" brings back horrible memories of analysis paralysis for many software developers. And, in their haste to adopt agile approaches, weâve seen countless software teams who have thrown out the modelling baby with the process bathwater. In extreme cases, this has led to the creation of software systems that really are the stereotypical "big ball of mud". In this session, Simon and Eoin will discuss models, sketches and everything in between, providing you with some real world advice on how even a little modelling can help you avoid chaos.
It was a very fun session to do and I'd recommend taking a look if you're interested in describing/communicating the software architecture of your system. Enjoy!
I attended a fantastic talk about big data visualisation at the YOW! 2014 conference in Sydney last month (slides), where Doug Talbott talked about how to understand and visualise large quantities of data. One of the things he mentioned was Shneiderman's mantra:Overview first, zoom and filter, then details-on-demand
Leaving aside the thorny issue of how teams structure their software systems as code, one of the major problems I see teams having with software architecture is how to think about their systems. There are various ways to do this, including a number of view catalogs (e.g. logical view, design view, development view, etc) and I have my C4 model that focuses on the static structure of a software system. If you inherit an existing codebase and are asked to create a software architecture model though, where do you start? And how to people start understanding the model as quickly as possible so they can get on with their job?
Shneiderman's mantra fits really nicely with the C4 model because it's hierarchical.
Overview first (context and container diagrams)
My starting point for understanding any software system is to draw a system context diagram. This helps me to understand the scope of the system, who is using it and what the key system dependencies are. It's usually quick to draw and quick to understand.
Next I'll open up the system and draw a diagram showing the containers (web applications, mobile apps, standalone applications, databases, file systems, message buses, etc) that make up the system. This shows the overall shape of the software system, how responsibilities have been distributed and the key technology choices that have been made.Zoom and filter (component diagrams)
As developers, we often need more detail, so I'll then zoom into each (interesting) container in turn and show the "components" inside it. This is where I show how each application has been decomposed into components, services, modules, layers, etc, along with a brief note about key responsibilities and technology choices. If you're hand-drawing the diagrams, this part can get a little tedious, which is why I'm focussing on creating a software architecture model as code, and automating as much of this as possible.Details on demand (class diagrams)
Optionally, I might progress deeper into the hierarchy to show the classes* that make up a particular component, service, module, layer, etc. Ultimately though, this detail resides in the code and, as software developers, we can get that on demand.Understanding a large and/or complex software system
Next time you're asked to create an architecture model, understand an existing system, present an system overview, do some software archaeology, etc, my advice is to keep Shneiderman's mantra in mind. Start at the top and work into the detail, creating a story that gets deeper into the detail as it progresses. The C4 model is a great way to do this and if you'd like an introduction to it (with example diagrams), you can take a look at Simple Sketches for Diagramming Your Software Architecture on the new Voxxed website.
* this assumes an OO language like Java or C#, for example
So, 2015 ... happy new year! 2014 was a busy year with workshops, conferences and consulting gigs in countries ranging from Iceland to Australia. I'd like to say a huge thank you to everybody who made 2014 so much fun.Software architecture vs code
One of the things that I spent a good chunk of time on during 2014 was the conflict between software architecture and code. I've written about this before, but you will have seen this in action if the code for your software system doesn't reflect the architecture diagrams you have on the wall. If you've not seen it, my closing keynote from the ABB DevDay conference in KrakÃ³w, Poland last September provides a good summary of this.
What I'm really interested in is how we can solve this problem. And that's really where my focus is going to be this year, by taking my C4 software architecture model and representing it as code. I already have some experimental code and tooling that you can find at structurizr.com, but I'm going to be enhancing and expanding this over the coming weeks and months. I want to get people thinking about how to appropriately structure their codebase, understanding that there are different strategies for modularity and adopting, what George Fairbanks calls, an architecturally-evident coding style. I also want to provide tooling that helps people create software architecture models and keep them up to date, ideally based upon the real code and with as much automation as possible. To give you an example, here's a post about diagramming Spring MVC webapps.
I'll be posting updates on the blog, but if you want to hear me talk about this, I'll be at the following conferences over the next few months.
As a final note, my Software Architecture for Developers ebook is only $10 until the end of this week.
All the best for 2015.
One of the core concepts in the Software Architecture for Developers course is that the Quality Attributes (non-functional requirements) need to be understood in order to provide foundations for a system's architecture. It's no good building a system that fulfills its user's functional requirements if these are delivered incorrectly. Consider the embedded software in a pacemaker. It may correctly analyse the rhythm of the patient's heart and conclude that a shock is required but if this is performed at the wrong time (possibly due to jitter in the response) then it may kill the patient.
Discovering that critical quality attributes are not being met can require a complete system redesign e.g. modifying an asynchronous system to be synchronous. Therefore the early identification of key Quality Attributes is important to drive your design and in the selection of tools and technologies.
However I've often had difficulties getting course attendees to identify specific attributes, as opposed to generic ones, for a case study. For example, most people will identify performance as important but struggle to go beyond this to consider trade-offs between, say, throughput and jitter.
Therefore, in the last couple of courses, I have expanded the identification of Quality Attributes to include a very brief (and lightweight) Quality Attribute Workshop for our case study.
The Software Engineering Institute has a description of how to perform a Quality Attribute Workshop which includes a full process and template set. While excellent (and a core part of their ATAM architecture evaluation process) this is too involved for a short training course. We therefore just performed the 'Identification of Architectural Drivers' and review steps.
Importantly the SEI also provides a very useful tool for the identification of Quality Attributes - a taxonomy. This is not just a list of attributes with a detailed description, it actually breaks down attributes from the generic to the specific. Take, for example, the following diagram for performance:
(Performance Taxonomy Extracted from Barbacci, Mario; Klein, Mark; Longstaff, Thomas; & Weinstock, Charles. Quality Attributes (CMU/SEI-95-TR-021 ). Software Engineering Institute, Carnegie Mellon University, 1995.)
The Quality Attributes are broken down under the 'Concerns' branch. For example, in the case study used the 'Response Window' is an important metric which needs analysis.
The 'Factors' branch, lists properties of the system that can impact the concerns. In our case study the 'Arrival Pattern' and 'Execution Time' are both important factors that need to be considered.
Lastly the 'Methods' branch lists tools/theories that can be used to analyse the concerns.
This diagram is useful for identification as it encourages the reader to consider all the aspects of the attribute in question and the measurable specifics for it. Without this taxonomy it is common to hear comments such as "it has to run quick enough" but with the taxonomy the analysis becomes much more detailed and useful.
However there is a danger, particularly with using a general, external taxonomy. My observation is that once provided with a taxonomy the participants tend to stick very closely to it and forget out the Quality Attributes NOT listed on it. For example the SEI list does not include Usability attributes or anything covering Internationalisation/Localisation. In response to this I'd suggest creating your own domain specific taxonomy. For example, if you work on retail websites you'll want more focus on usability and less on safety criticality.Conclusion
I have found lightweight Quality Attribute Workshops to be a very effective way of identifying Quality Attributes in a short space of time, particularly if you use a Taxonomy to focus the participants. However you must be careful to not become blinkered by what it lists. Therefore I'd suggest you create your own taxonomy, specific to your domain.
I'm just back from the YOW! conference tour in Australia (which was amazing!) and I presented this as the closing slide for my Agility and the essence of software architecture talk, which was about how to create agile software systems in an agile way.
You will have probably noticed that software architecture sketches/diagrams form a central part of my lightweight approach to software architecture, and I thought this slide was a nice way to summarise the various things that diagrams and the C4 model enable, plus how this helps to do just enough up front design. The slides are available to view online/download and hopefully one of the videos will be available to watch after the holiday season.
There is currently a strong trend for microservice based architectures and frequent discussions comparing them to monoliths. There is much advice about breaking-up monoliths into microservices and also some amusing fights between proponents of the two paradigms - see the great Microservices vs Monolithic Melee. The term 'Monolith' is increasingly being used as a generic insult in the same way that 'Legacy' is!
However, I believe that there is a great deal of misunderstanding about exactly what a 'Monolith' is and those discussing it are often talking about completely different things.
A monolith can be considered an architectural style or a software development pattern (or anti-pattern if you view it negatively). Styles and patterns usually fit into different Viewtypes (a viewtype is a set, or category, of views that can be easily reconciled with each other [Clements et al., 2010]) and some basic viewtypes we can discuss are:
A monolith could refer to any of the basic viewtypes above.
If you have a module monolith then all of the code for a system is in a single codebase that is compiled together and produces a single artifact. The code may still be well structured (classes and packages that are coherent and decoupled at a source level rather than a big-ball-of-mud) but it is not split into separate modules for compilation. Conversely a non-monolithic module design may have code split into multiple modules or libraries that can be compiled separately, stored in repositories and referenced when required. There are advantages and disadvantages to both but this tells you very little about how the code is used - it is primarily done for development management.
For an allocation monolith, all of the code is shipped/deployed at the same time. In other words once the compiled code is 'ready for release' then a single version is shipped to all nodes. All running components have the same version of the software running at any point in time. This is independent of whether the module structure is a monolith. You may have compiled the entire codebase at once before deployment OR you may have created a set of deployment artifacts from multiple sources and versions. Either way this version for the system is deployed everywhere at once (often by stopping the entire system, rolling out the software and then restarting).
A non-monolithic allocation would involve deploying different versions to individual nodes at different times. This is again independent of the module structure as different versions of a module monolith could be deployed individually.
A runtime monolith will have a single application or process performing the work for the system (although the system may have multiple, external dependencies). Many systems have traditionally been written like this (especially line-of-business systems such as Payroll, Accounts Payable, CMS etc).
Whether the runtime is a monolith is independent of whether the system code is a module monolith or not. A runtime monolith often implies an allocation monolith if there is only one main node/component to be deployed (although this is not the case if a new version of software is rolled out across regions, with separate users, over a period of time).
Note that my examples above are slightly forced for the viewtypes and it won't be as hard-and-fast in the real world.
Be very carefully when arguing about 'Microservices vs Monoliths'. A direct comparison is only possible when discussing the Runtime viewtype and properties. You should also not assume that moving away from a Module or Allocation monolith will magically enable a Microservice architecture (although it will probably help). If you are moving to a Microservice architecture then I'd advise you to consider all these viewtypes and align your boundaries across them i.e. don't just code, build and distribute a monolith that exposes subsets of itself on different nodes.
For my final trip of the year, I'm heading to Australia at the end of this month for the YOW! 2014 series of conferences. I'll be presenting Agility and the essence of software architecture in Melbourne, Brisbane and Sydney. Plus I'll be running my Simple sketches for diagramming your software architecture workshop in Melbourne and Sydney. I can't wait; see you there!
If youâre working in an agile software development team at the moment, take a look around at your environment. Whether itâs physical or virtual, thereâs likely to be a story wall or Kanban board visualising the work yet to be started, in progress and done. Visualising your software development process is a fantastic way to introduce transparency because anybody can see, at a glance, a high-level snapshot of the current progress.
As an industry, weâve become adept at visualising our software development process over the past few years â however, it seems weâve forgotten how to visualise the actual software that weâre building. Iâm not just referring to post-project documentation. This also includes communication during the software development process. Agile approaches talk about moving fast, and this requires good communication, but itâs surprising that many teams struggle to effectively communicate the design of their software.
ã¢ããªã·ãã¯ããã¡ã ããã¨ãã£ã¦ããã¤ã¯ããµã¼ãã¹ãè§£æ±ºçã«ãªãããã§ã¯ãªã ã½ããã¦ã§ã¢éçºæ¥çã¯æµè¡ã«å·¦å³ãããããã¨ããè¨¼æ ã«ãä»ãã¤ã¯ããµã¼ãã¹ãããããã¨ããã§å¤§é¨ãããã¦ãã¾ããâæ¬¡ã®å¤§ãã¼ã âã ã¨æãäººãããã§ããããã¾ããï¼10å¹´åã«âä¸åºæ¥âã¨è¦ãªããããããªï¼å¤§åã®SOAããµã¼ãã¹æåã¢ã¼ããã¯ãã£ãåã«è»½éåãã¦é²åãããã®ã ã¨æããäººãããã§ããããç§ã¯ç¾å¨ã®ãã¤ã¯ããµã¼ãã¹ã¢ã¼ããã¯ãã£ã«é¢ãã¦ã¯å¥½æçã«è¦ã¦ãã¾ããããããã ããã¨ãã£ã¦ãã®ã¢ã¼ããã¯ãã£ã¯æ±ºãã¦ä¸è½è¬ã§ã¯ããã¾ãããè¨ãã¾ã§ããªããã¨ããããã¾ããããå¤ãã®äººãééã£ãçç±ã§ãã¤ã¯ããµã¼ãã¹ã«é£ã³ä»ãã¦ããããã«æããã®ã§ãã
A lightweight approach to software architecture is pivotal to successfully delivering software, and it can complement agile approaches rather than compete against them. After all, a good architecture enables agility and this doesn't happen by magic. "Software Architecture for Developers" is a practical and pragmatic guide to lightweight software architecture. You'll learn:
I'm excited to be working with Parleys on this and I think they have an amazing platform for delivering online training. If you're thinking about creating an online course, I recommend taking a look at Parleys. The tooling behind the scenes used to put the course together is incredible. Many thanks to Carlo Waelens and the Parleys team for everything over the past few months - I hope this is the start of something big for you.
I know there's demand for a hard-copy of the regular version, so I'll be doing this early next year, probably as a print-on-demand book from somewhere like Lulu, CreateSpace, etc.
Daniel Bryant, Simon and I recently had a discussion about how to represent system communication with external APIs. The requirement for integration with external APIs is now extremely common but it's not immediately obvious how to clearly show them in architectural diagrams.
How to Represent an External System?
The first thing we discussed was what symbol to use for a system supplying an API. Traditionally, UML has used the Actor (stick man) symbol to represent a "user or any other system that interacts with the subject" (UML Superstructure Specification, v2.1.2). Therefore a system providing an API may look like this:
I've found that this symbol tends to confuse those who aren't well versed in UML as most people assume that the Actor symbol always represents a *person* rather than a system. Sometimes this is stereotyped to make it more obvious e.g.
However the symbol is very powerful and tends to overpower the stereotype. Therefore I prefer to use a stereotyped box for an external system supplying an API. Let's compare two context diagrams using Boxes vs Stick Actors.
In which diagram is it more obvious what are systems or people?
Note that ArchiMate has a specific symbol for Application Service that can be used to represent an API:
(Application Service notation from the Open Group's ArchiMate 2.1 Specification)
An API or the System that Supplies it?
Whatever symbol we choose, what we've done is to show the *system* rather than the actual API. The API is a definition of a service provided by the system in question. How should we provide more details about the API?
There are a number of ways we could do this but my preference is to give details of the API on the connector (line connecting two elements/boxes). In C4 the guidelines for a container diagram includes listing protocol information on the connector and an API can be viewed as the layer above the protocol. For example:
Multiple APIs per External System
Many API providers supply multiple services/APIs (I'm not referring to different operations within an API but multiple sets of operations in different APIs, which may even use different underlying protocols.) For example a financial marketplace may have APIs that do the following:
Two of the services use the same protocol (xml over HTTP) but have very different content and use. One of the APIs is used to constantly supply information after user subscription (market data) and the last service involves the user supplying all the information with no acknowledgment (although it should reconcile at EOD).
There are multiple ways of showing this. We could:
Some examples are below:
A Single Service element and simple description
In the above diagram the containers are stating what they are using but contain no information about how to use the APIs. We don't know if it is a single API (with different operations) or anything about the mechanisms used to transport the data. This isn't very useful for anyone implementing a solution or resolving operational issues.
Single, Service box with descriptive connectors
In this diagram there is a single, service box with descriptive connectors. The above diagram shows all the information so is much more useful as a diagnostic or implementation tool. However it does look quite crowded.
Services/APIs shown as separate boxes
Here the external system has its services/APIs shown as separate boxes. This contains all the information but might be mistaken as defining the internal structure of the external system. We want to show the services it provides but we know nothing about the internal structure.
Using Ports to Represent APIs
In the above diagram the services/APIs are shown as 'ports' on the external system and the details have been moved into a separate key/table. This is less likely to be mistaken as showing any internal structure of the external service. (Note that I could have also shown outgoing rPorts from the Brokerage System.)
This final diagram is using a UML style interface provider and requirer. This is a clean diagram but requires the user to be aware of what the cup and ball means (although I could have explained this in the key).
Any of these solutions could be appropriate depending on the complexity of the API set you are trying to represent. I'd suggest starting with a simple representation (i.e. fully labeled connections) and moving to a more complex one if needed BUT remember to use a key to explain any elements you use!
I had the pleasure of delivering the closing keynote at the DevDay 2014 conference in Krakow, Poland last month. It's a one day event, with a bias towards the .NET platform, and one of my favourite conferences from this year. Beautiful city, fantastic crowd and top-notch hospitality. If you get the chance to attend next year, do it!
If you missed it, you can find videos of the talks to watch online. Here's mine called Software architecture vs code. It covers the conflict between software architecture and code, how we can resolve this, the benefits of doing so, fishing and a call for donations to charity every time you write public class without thinking. Enjoy!
p.s. I've written about some of these same topics on the blog ... for example, Modularity and testability and Software architecture vs code. My Structurizr project is starting to put some of this into practice too.
I'll be in Iceland next month for the Agile Iceland 2014 conference, which I'm really looking forward to as everybody tells me that Iceland is a fantastic country to visit. While in Iceland, I'll also be running my 1-day software architecture sketching workshop on the 6th of November. If you're interested in learning how to communicate the design of your software in a simple yet effective way without using lots of complex UML diagrams, please do join me. Everybody who attends will also get a copy of my Software Architecture for Developers ebook too. :-)
Here's a little snippet that my class really picked up on yesterday. During the training course, we get people into groups and ask them to design a solution based upon some simple requirements. The deliverable is "one or more diagrams to describe your solution". Aside from answering a few questions about the business domain and the environment, that's pretty much all the guidance that groups get.
As you can probably imagine, the resulting diagrams are all very different. Some are very high-level, others very low-level. Some show static structure, others show runtime and behavioural views. Some show technology, others don't. Without a consistent approach, these differing diagrams make it hard for people to understand the solutions being presented to them. But furthermore, the differing diagrams make it really hard to compare solutions too.
As I've said before, I don't actually teach people to draw pictures. What I do instead is to teach people how to think about, and therefore describe, their software using a simple set of abstractions. This is my C4 model. With these abstractions in place, groups then redraw their diagrams. Despite the notations still differing between the groups, the solutions are much easier to understand. The solutions are much easier to compare too, because of the consistency in the way they are being described. A common set of abstractions is much more important than a common notation.
I've been writing blog posts covering a number of topics over the past few months; from the conflict between software architecture and code and architecturally-evident coding styles through to representing a software architecture model as code and how microservice architectures can easily turn into distributed big balls of mud. The common theme running throughout all of them is structure, and this in turn has a relationship with testability.
The TL;DR version of this post is: think about modularity, think about how you structure your code, think about the options you have for testing your code and stop making everything public.1. The conflict between software architecture and code
I've recently been talking a lot about the disconnect between software architecture and code. George Fairbanks calls this the "model-code gap". It basically says that the abstractions we consider at the architecture level (components, services, modules, layers, etc) are often not explicitly reflected in the code. A major cause is that we don't have those concepts in OO programming languages such as Java, C#, etc. You can't do public component X in Java, for example.2. The "unit testing is wasteful" thing
Hopefully, we've all see the "unit testing is wasteful" thing, and all of the follow-up discussion. The unfortunate thing about much of the discussion is that "unit testing" has been used interchangeably with "TDD". In my mind, the debate is about unit testing rather than TDD as a practice. I'm not a TDDer, but I do write automated tests. I mostly write tests afterwards. But sometimes I write them beforehand, particularly if I want to test-drive my implementation of something before integrating it. If TDD works for you, that's great. If not, don't worry about it. Just make sure that you *do* write some tests. :-)
There are, of course, a number of sides to the debate, but in TDD is dead. Long live testing. (ignore the title), DHH makes some good points about the numbers and types of tests that a system should have. To quote (strikethrough mine):I think that's the direction we're heading. Less emphasis on unit tests, because we're no longer doing test-first as a design practice, and more emphasis on, yes, slow, system tests. (Which btw do not need to be so slow any more, thanks to advances in parallelization and cloud runner infrastructure).
The type of software system you're building will also have an impact on the number and types of tests. I once worked on a system where we had a huge number of integration tests, but very few unit tests, primarily because the system actually did very little aside from get data from a Microsoft Dynamics CRM system (via web services) and display it on some web pages. I've also worked on systems that were completely the opposite, with lots of complex business logic.
There's another implicit assumption in all of this ... what's the "unit" in "unit testing"? For many it's an isolated class, but for others the word "unit" can be used to represent anything from a single class through to an entire sub-system.3. The microservices hype
Microservices is the new, shiny kid in town. There *are* many genuine benefits from adopting this style of architecture, but I do worry that we're simply going to end up building the next wave of distributed big balls of mud if we're not careful. Technologies like Spring Boot make creating and deploying microservices relatively straightforward, but the design thinking behind partitioning a software system into services is still as hard as it's ever been. This is why I've been using this slide in my recent talks.
Uncle Bob Martin posted Microservices and Jars last month, which touches upon the topic of building monolithic applications that do have a clean internal structure, by using the concept of separately deployable units (e.g. JARs, DLLs, etc). Although he doesn't talk about the mechanisms needed to make this happen (e.g. plugin architectures, Java classloaders, etc), it's all achievable. I rarely see teams doing this though.
Structuring our code for modularity at the macro level, even in monolithic systems, provides a number of benefits, but it's a simple way to reduce the model-code gap. In other words, we structure our code to reflect the structural building blocks (e.g. components, services, modules) that we define at the architecture level. If there are "components" on the architecture diagrams, I want to see "components" in the code. This alignment of architecture and code has positive implications for explaining, understanding, maintaining, adapting and working with the system.
It's also about avoiding big balls of mud, and I want to do this by enforcing some useful boundaries in order to slice up my thousands of lines of code/classes into manageable chunks. Uncle Bob suggests that you can use JARs to do this. There are other modularity mechanisms available in Java too; including SPI, CDI and OSGi. But you don't even need a plugin architecture to build a structured monolith. Simply using the scoping modifiers built in to Java is sufficient to represent the concept of a lightweight in-process component/module.Stop making everything public
We need to resist the temptation to make everything public though, because this is often why codebases turn into a sprawling mass of interconnected objects. I do wonder whether the keystrokes used to write public class are ingrained into our muscle memory as developers. As I said during my closing session at DevDay in Krakow last week, we should make a donation to charity every time we type public class without thinking about whether that class really needs to be public.
A simple way to create a lightweight component/module in Java is to create a public interface and keep all of the implementation (one or more classes) package protected, ensuring there is only one "component" per package. Here's an example of a such a component, which also happens to be a Spring Bean. This isn't a silver bullet and there are trade-offs that I have consciously made (e.g. shared domain classes and utility code), but it does at least illustrate that all code doesn't need to be public. Proponents of DDD and ports & adapters may disagree with the naming I've used but, that aside, I do like the stronger sense of modularity that such an approach provides.Testability
And now you have some options for writing automated tests. In this particular example, I've chosen to write automated tests that treat the component as a single thing; going through the component API to the database and back again. You can still do class-level testing too (inside the package), but only if it makes sense and provides value. You can also do TDD; both at the component API and the component implementation level. Treating your components/modules as black boxes results in a slightly different testing pyramid, in that it changes the balance of class and component tests.
A microservice architecture will likely push you down this route too, with a balanced mix of low-level class and higher-level service tests. Of course there is no "typical" shape for the testing pyramid; the type of system you're building will determine what it looks like. There are many options for building testable software, but neither unit testing or TDD are dead.
In summary, I'm looking for ways in which it we can structure our code for modularity at the macro-level, to avoid the big ball of mud and to shrink the model-code gap. I also want to be able to automatically draw some useful architecture diagrams based upon the code. We shouldn't blindly be making everything public and writing automated tests at the class level. After all, there are a number of different approaches that we can take for all of this, and the modularity you choose has an implication on the number and types of tests that you write. As I said at the start; think about modularity, think about how you structure your code, think about the options you have for testing your code and stop making everything public. Designing software requires conscious effort. Let's not stop thinking.