Once upon a time, diagrams were drawn, sent round for discussion and then built by hardware teams.

It was a slow process and, while the diagrams themselves were useful and generated valuable discussion, ultimately they would be stuck inside MS Word documents. Once built, they would not be touched until there was a problem.

We now have a world that can use configuration files via APIs to programmatically construct infrastructure, which has reduced construction time to minutes rather than months. Infrastructure as software (or more accurately as configuration) is key to this stage of evolution, with tools such as Ansible, Terraform, Chef, Puppet, Saltstack and others.

The configuration has effectively become a contract with the cloud, and that contract needs to be shared with stakeholders, such as security people, architects and project managers, as it defines data use, flow and costs.

Our problem now is that this configuration is stuck inside the developer’s source code repository.

Whilst we can share these files, the stakeholders from the other domains need skill and patience to work out what’s gone on. Thus, the actuality of deployment is effectively hidden from the rest of the organisation.

What is needed is a way to share the design, costs and interfaces with the wider IT organisation.

It needs to be represented in a way that is easy for humans to understand and, ideally, that makes it easy to gather feedback.

A Modest Proposal

To help with increasing understanding, I suggest introducing a modern version of traditional system diagrams.

They should represent system configuration, be sharable among the broader stakeholder group, and be used to create orchestration scripts for straight-through, speedy deployment. Effectively, this takes infrastructure as code and represents it visually in order to crowd-source improvements.

The consumers would be security teams, fellow developers and engineers, and even regulators and new roles such as microservice coordination.

This proposal would also make an important contribution to scalability and to the success of DevOps and agility.

Diagrams of this type can allow simpler construction and a level of implicit configuration. The current situation needs smart people at the top of their game to keep the lights on, but there are only a finite number of smart people to go around. Some aid to construction would be welcomed by many teams that struggle with the technology overall, especially in keeping costs down.

The Different Parts

So how do we represent a system in a way that is simple, comprehensive and intuitive? One that needs to combine infrastructure, services, user software and system software with configuration? Moreover, how do we include innovations like containers, lambda or function-as-a-service computing?

The approach taken is to split architecture components and concepts into categories as follows:-

Hosts
Software
Services
Clusters
Networks
Network equipment
Data paths
Locations

Each category holds a set of real components that we might understand, like a two-socket server being a type of host or a database being a type of service.

Components of different types can be held together in a catalogue for convenience. For example, all the components that a cloud provider might offer (such as hosts or services) would go into a single catalogue. The catalogue can then be maintained, distributed and included in projects. Switching to a different provider means using a different catalogue.

To describe a general architecture, we use an abstract catalogue that contains generic components that can’t be turned into a real system. When we are ready to commit to a provider and have a notion of the sizes we want, there is a process to convert generics into specifics. The specifics belong to a provider's catalogue and are based on the real models that provider offers.
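The conversion from generics to specifics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Component` fields, the catalogue contents (`ABSTRACT`, `PROVIDER_X`, `vm.small` and so on) and the `specialise` function are hypothetical names invented here, not part of any real catalogue format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    name: str                      # catalogue entry name
    category: str                  # one of the categories above, e.g. "host"
    generic: Optional[str] = None  # abstract component this entry realises, if any


# Hypothetical abstract catalogue: generic components that cannot be deployed.
ABSTRACT = {
    "virtual-host": Component("virtual-host", "host"),
    "database": Component("database", "service"),
}

# Hypothetical provider catalogue: real models mapped back to the generics.
PROVIDER_X = {
    "vm.small": Component("vm.small", "host", generic="virtual-host"),
    "vm.large": Component("vm.large", "host", generic="virtual-host"),
    "managed-sql": Component("managed-sql", "service", generic="database"),
}


def specialise(design, catalogue, choices=None):
    """Convert a list of generic components into provider-specific model names."""
    choices = choices or {}
    specifics = []
    for generic in design:
        candidates = [c.name for c in catalogue.values() if c.generic == generic.name]
        if not candidates:
            raise LookupError(f"no model for {generic.name!r} in this catalogue")
        # Use the designer's explicit choice, else the first matching model.
        chosen = choices.get(generic.name, candidates[0])
        if chosen not in candidates:
            raise ValueError(f"{chosen!r} does not realise {generic.name!r}")
        specifics.append(chosen)
    return specifics


design = [ABSTRACT["virtual-host"], ABSTRACT["database"]]
print(specialise(design, PROVIDER_X, {"virtual-host": "vm.large"}))
```

Switching provider then amounts to passing a different catalogue to the same function, while the abstract design stays unchanged.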

We use the abstract catalogue in describing the type of components, starting with hosts.

Hosts

The host is a set of processors, memory and storage that is able to run a stack of software and attach to a network.

Hosts range in size and power from large physicals down to just enough cycles to run individual functions. The types are as follows:-

Physical host - Whole stack on a physical server
Virtual host - Whole stack on a hosted, smaller, relocatable virtual server
Scaling host - A type of virtual host that can dynamically change in size and configuration while deployed, depending on its workload.
Cluster - Container or pod platform - A platform of one or more physical machines that is able to run a software stack, typically in the form of containers. Multiple machines are often used to maintain high availability and scalability across multiple availability zones or locations. However, the physical nature of the platform and its operational details are not known to the stack; it just looks like a host. In its purest form, it is a contract relationship.
Lambda or function platform - A platform that is able to run code functions or lambdas: the smallest useful part of an application, triggered by an external event such as the arrival of data or a time mark. Like a container platform, it may consist of multiple machines in different locations.

As an abstract component, a host gives no suggestion of what it will finally be, which helps us defer the decision until later or change from one provider to another.

The host is the most customisable component, as it allows the designer to run any valid software that has been organised into a stack. The stack includes the operating system.

The graphic of a host is shown below. It shows a software stack, host storage, the host icon, the instance name and a type description.

Host
Figure: Representing a host with its stack of software and local storage

The storage can be any non-volatile technology, such as NVMe or spinning disk, and can be dragged out from a list. The persistence is tied to the life of the host; remove the host and the storage will be lost. If longer term persistence is required, then an external storage facility should be used, provided as a service and given a name (covered below).

The software box holds a stack of software, starting from the operating system and following the list to layer software sequentially. This allows control of any side effects that may be caused or desired by the complete ensemble. Each item uses a packaging method supported by the host type, typically Linux RPMs, DEB packages or Docker images.

The host types, storage and software are defined by the catalogue.

Containers, Host Clusters and Lambda

In our system of diagrams, hosts are contracts to execute software: an agreement with the platform provider to run a server of a particular type and configuration. A host is a logical representation that may ultimately be a single machine, but it is entirely possible that it will be several machines or a highly available compute service. Hence the emphasis on multiple types of hosting, which can adapt as compute alternatives evolve.

Container platforms differ from other hosts in the packaging of their software: only container formats are allowed.

Some container platforms, like Kubernetes, require a more complex representation in the diagrams as there are more parameters to configure or multiple locations to consider. We will cover the specific detail in another article.

The same is true for lambda platforms: they are a type of host, but only compatible packaging can be added to the software stack.
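The packaging rule for the different host types can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the host type names, the packaging labels (including `function-zip`) and the `ALLOWED_PACKAGING` table stand in for whatever a real catalogue would define.

```python
from dataclasses import dataclass, field

# Hypothetical catalogue data: packaging formats accepted by each host type.
ALLOWED_PACKAGING = {
    "physical": {"rpm", "deb", "docker"},
    "virtual": {"rpm", "deb", "docker"},
    "container-platform": {"docker"},        # only container formats
    "function-platform": {"function-zip"},   # only function packaging
}


@dataclass(frozen=True)
class Package:
    name: str
    packaging: str  # e.g. "rpm", "deb", "docker"


@dataclass
class Host:
    name: str
    host_type: str
    stack: list = field(default_factory=list)  # ordered, lowest layer first

    def add(self, package):
        """Append a package to the stack if the host type accepts its format."""
        allowed = ALLOWED_PACKAGING[self.host_type]
        if package.packaging not in allowed:
            raise ValueError(
                f"{self.host_type} hosts cannot run {package.packaging} packages"
            )
        self.stack.append(package)
        return self  # allow chained calls to build the stack in order


web = Host("web-1", "virtual").add(Package("nginx", "rpm"))
pods = Host("pods", "container-platform").add(Package("orders-api", "docker"))
print([p.name for p in web.stack], [p.name for p in pods.stack])
```

Adding an RPM to `pods` would raise a `ValueError`, which is the kind of check a diagram editor could run as software is dropped onto a host.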

Connecting to a Network

Hosts and Services are always connected to networks.

Sometimes, you don’t care about the details and you need nothing special. The convention is that every point in the design is able to communicate with every other point; they are completely routable. At other times, a specific type of network is offered in the catalogue and can be chosen. For example, 10 Gb and 1 Gb Ethernet may be offered and the design may want to use a specific type or even both.

A network component appears as an icon, a description and a name, with dropping regions above and below to place hosts and services. Anything dropped into the regions is associated with that network device. So, if one set of hosts needs to be attached to a fast network but others can be slower, two different network types would be placed on the design and the hosts placed accordingly. The faster network technology may be more expensive, so by utilising both, a cost saving may be gained.

By default, a network line is drawn from the component to the network device, which aids visualisation and makes it look like a network diagram. This can be used to make the diagram more readable or to emphasise aspects of the design.

Figure: Diagram of two hosts (with storage and software) attached to a network (Standard Network) from above. The space below is empty but can fit more hosts

More Networking

Regardless of how the lines are drawn (or not, they can be suppressed), it is all one connected (or routable) network; any device can talk to any other device.

We do this by network names. By default every network grouping has the same name and thus they are routed together.

If, however, you want to split the network into separate, disconnected parts, give the networks different names. Traffic will only propagate between network groups that have the same name.
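The naming rule is simple enough to express as data. The Python sketch below assumes a hypothetical `attachments` mapping from each component to the network names it is attached to, and it assumes that a host on two networks does not forward traffic between them.

```python
from collections import defaultdict

# Hypothetical design data: each component lists the network names it attaches to.
attachments = {
    "web-1": {"front"},
    "app-1": {"front", "back"},
    "db-1": {"back"},
    "batch-1": {"isolated"},
}


def network_groups(attachments):
    """Group components by network name; same name means routed together."""
    groups = defaultdict(set)
    for component, names in attachments.items():
        for name in names:
            groups[name].add(component)
    return dict(groups)


def can_talk(attachments, a, b):
    """True when two components share at least one network name."""
    return bool(attachments[a] & attachments[b])


print(network_groups(attachments))
print(can_talk(attachments, "web-1", "app-1"))  # share the "front" network
print(can_talk(attachments, "web-1", "db-1"))   # no network name in common
```

Giving every network the same name collapses the groups into one, which matches the default of a single routable network.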

To reconnect groups with control devices like firewalls or load balancers, we can add network components that represent these devices. The visualisation will then show this by drawing a ‘jumper’ between the networks and the device in question. These components can be used to connect separate networks internally or to attach your design to the outside world.

Figure: Diagram of a firewall or control point implementing a security group, attached to a standard network

In this diagram, a user is connected to a ‘Standard Network’ via a firewall or control point, specifically implementing a Security Group. It behaves just like a firewall with IP connections, with source and destination points. The security group can be configured in the design by filling in a properties table that belongs to that component.
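As a rough illustration of how such a properties table might be evaluated, here is a Python sketch using the standard `ipaddress` module. The `Rule` fields (source, destination, port, protocol) and the example addresses are assumptions; the actual columns of the property table are not specified in this post.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source: str        # source range in CIDR notation
    destination: str   # destination range in CIDR notation
    port: int
    protocol: str = "tcp"


def allows(rules, src_ip, dst_ip, port, protocol="tcp"):
    """Return True if any rule in the table permits the connection."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    return any(
        src in ipaddress.ip_network(rule.source)
        and dst in ipaddress.ip_network(rule.destination)
        and port == rule.port
        and protocol == rule.protocol
        for rule in rules
    )


# Hypothetical table: anyone may reach the 10.0.1.0/24 network on HTTPS only.
security_group = [Rule("0.0.0.0/0", "10.0.1.0/24", 443)]

print(allows(security_group, "203.0.113.7", "10.0.1.10", 443))
print(allows(security_group, "203.0.113.7", "10.0.1.10", 22))
```

The first call is permitted by the single rule and the second is not, because no rule opens port 22.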

Figure: Property table for the security group

Services

As a complement to hosts, we introduce services.

Whereas hosts are platforms to run software and are customisable, services are complete and defined, implementing a single set of specialist functionality. They have network addresses, possibly implementing RESTful interfaces, RPC or another API. A common example is the DB server offered as a service. You don’t know the patch level or how big it is, but you do know it stores your data and you may be sharing it with others. Another example is a storage device, offering NFS or SMB.

In other regards, a service looks a bit like a host and so is represented as one but without the ability to customise its composition. In the diagram below, services with AWS icons are shown attached to a network, which is shared with a host attached above.

Figure: Diagram of VM connected to three Amazon services: RDS, S3 and SES. Services are treated like hosts but can’t be loaded with software

Tiers

Traditional architecture diagrams typically have a way to separate elements of complexity into layers or tiers. There might be a database tier, a computation tier and a presentation or web tier, so that each section can be more straightforward. It is still desirable to offer this functionality, so a way is needed to collect diagram components together into groups. This is done with a layering construct that builds up a set of vertical regions; grouping components in these regions keeps the design clean.

The diagram below shows three layers: a compute layer with a host and file server on a network, another with two hosts, and a third with a pair of hosts attached to a network and to SAN fibre networks.

Figure: Three tiers: two computes and database

Data Flow

With hosts, services and networks to connect them, we have the fundamentals to describe an architecture topology, able to cope with traditional physical computing, together with the newer (or rediscovered) compute paradigms of containers and functions.

In order to impart more meaning, we can overlay a set of lines showing the flow of data from one component to another. Each line shows direction and protocol, which is useful for configuring firewalls or for showing how the components interact. An example is a web request over HTTP.

Superimposed lines can also be used to express other meanings, such as bulk data paths for replication or grouping of components when they need to follow a specific order when being built.
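One use of directed, protocol-labelled flows is to find the traffic that must pass through a control device. The Python sketch below is an assumption-laden illustration: the `Flow` record, the component names and the `network_of` mapping are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    source: str    # component the data leaves
    target: str    # component the data arrives at
    protocol: str  # e.g. "HTTP"


# Hypothetical design data: which named network each component sits on.
network_of = {"user": "internet", "web-1": "dmz", "app-1": "dmz", "db-1": "data"}

flows = [
    Flow("user", "web-1", "HTTP"),
    Flow("web-1", "app-1", "HTTP"),
    Flow("app-1", "db-1", "SQL"),
]


def crossing_flows(flows, network_of):
    """Flows whose endpoints sit on differently named networks.

    These are the flows that would need a firewall or similar device to carry them.
    """
    return [f for f in flows if network_of[f.source] != network_of[f.target]]


for flow in crossing_flows(flows, network_of):
    print(f"{flow.source} -> {flow.target} over {flow.protocol}")
```

Here the flow between `web-1` and `app-1` stays inside one network, while the other two cross a network boundary and would each need an opening.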

Network devices
Figure: User to GSS, then to F5 load balancer. The coloured arrows represent data flow, the wide arrow band information flow between the two GSS devices

Data Flow Diagrams

Once data flow has been established, it is also possible to express the information as a Data Flow Diagram rather than the host-oriented model. In that visualisation, data is shown moving between processes, and logically significant network detail can be shown as a grouping of processes and flows.

No one visualisation wins overall; however, if the configuration data is represented as a data graph, not an illustration or picture, it is possible to emulate one representation with another. This is the case with the host platform diagram; it is possible to dispense with the host altogether and redraw with simple nodes and edges like a data flow, which is useful in some contexts. However, by keeping the anchor to how a system will be executed, it is possible to make more detailed performance and deployment decisions.
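To make the idea of one data graph with several views concrete, here is a minimal Python sketch. The `design` structure, the process names and both view functions are hypothetical; they only show that a host-oriented view and a data flow view can be derived from the same underlying data.

```python
# Hypothetical design data: hosts with their software, plus flows between processes.
design = {
    "hosts": {
        "web-1": ["nginx"],
        "app-1": ["orders-api"],
        "db-1": ["postgres"],
    },
    "flows": [
        ("nginx", "orders-api", "HTTP"),
        ("orders-api", "postgres", "SQL"),
    ],
}


def data_flow_view(design):
    """Drop the hosts: processes become nodes and flows become edges."""
    nodes = sorted(p for stack in design["hosts"].values() for p in stack)
    return {"nodes": nodes, "edges": list(design["flows"])}


def host_view(design):
    """Keep the hosts: each flow is drawn between the hosts running its processes."""
    host_of = {p: h for h, stack in design["hosts"].items() for p in stack}
    edges = [(host_of[s], host_of[t], proto) for s, t, proto in design["flows"]]
    return {"nodes": sorted(design["hosts"]), "edges": edges}


print(data_flow_view(design))
print(host_view(design))
```

Neither view is stored; both are computed, so a change to the design data shows up in every representation.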

Annotations

To help set the context of the designs and to provide additional information to humans, one can annotate the design with free text. Annotations take the form of headings, labels and text boxes for individual components or groups of components, and are anchored to parts of the diagram. The diagram can then be moved around whilst the text sticks to the components that need explanation.

Name Services

If you provide a service to others, you will want to advertise yourself. This means taking the API entry points (IP address and port pairs) and registering them against a domain name, at least. With clusters, such as Kubernetes, there is more active management of the mapping from services to domain names. The cluster makes dynamic decisions about this mapping, which is implemented by the catalogue's supporting code using the abstraction of the diagramming standard.
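The basic registration step can be pictured with a toy registry in Python. This is a sketch only: `NameRegistry` and its methods are hypothetical and do not represent DNS, Kubernetes or any catalogue code; the domain and addresses are examples.

```python
from collections import defaultdict


class NameRegistry:
    """Toy name service mapping a domain name to (IP address, port) entry points."""

    def __init__(self):
        self._records = defaultdict(set)

    def register(self, domain, ip, port):
        """Advertise an entry point under a domain name."""
        self._records[domain].add((ip, port))

    def deregister(self, domain, ip, port):
        """Withdraw an entry point, for example when a cluster moves a service."""
        self._records[domain].discard((ip, port))

    def resolve(self, domain):
        """Return the current entry points for a domain, in a stable order."""
        return sorted(self._records.get(domain, set()))


registry = NameRegistry()
registry.register("orders.example.com", "10.0.1.10", 443)
registry.register("orders.example.com", "10.0.1.11", 443)
registry.deregister("orders.example.com", "10.0.1.10", 443)
print(registry.resolve("orders.example.com"))
```

A cluster that manages the mapping dynamically would, in effect, be calling `register` and `deregister` on the designer's behalf as workloads move.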

Next Steps

This post defines the core of the System Garden diagram system; in a second part, we will round off the model to make it more practical. By using the model, one can create a new type of technology asset: the diagram is data. It is an asset that can be transformed into multi-purpose visualisations and drive installations. It can be used as a template for system dashboards or as evidence for security assessment. The data can be used for financial forecasting and logistical planning.

Most importantly though, it is a basis for describing configurations that can be used for many cloud providers. As clouds become more complex and agile deployment becomes difficult for many stakeholders to understand, using pictures makes the system much easier to create and understand.

About the Author

Architecture, Strategy, Innovation. Follow me on Twitter @nigelstuckey

System Garden

Agile Infrastructure for Enterprise DevOps Design from diagrams, document and deploy to your cloud.
systemgarden.com, Twitter @systemgarden

How to Graphically Represent Software Defined Systems I