Making good software is hard work, a lot of it, and the work is ever-evolving. For practical reasons, nearly every piece of software we use today is put together by assembling components sourced from somewhere else. The sourced components may be freely available on the Internet, or acquired by purchasing a license. Whether automated or not, the collection of sources from which software components are drawn, and the way they are tied together, represents its supply chain. This supply chain can be very wide and deep, and teams might not have a clear idea of exactly how many component sources are involved in making their software.
Let’s look at an expanded picture, a simplified view of a supply chain network.
From top to bottom, the first consideration relates to the users, who typically don’t have any idea what is behind the software serving them. This may be software installed on the users’ computers. More likely, however, users are accessing remote services through software on their computer, a web browser for example. To keep things simple, let us not make a distinction between software running exclusively locally, which is vanishingly rare, and software that interacts with services on the web. Apple popularised the use of the word app in place of application when they introduced their iPhone platform. Such usage is now commonplace, and even companies like Microsoft now often refer only to app (and not application), so let’s use the term app. Let’s describe each level a little bit more.
Level | Description | Control by the Organisation |
---|---|---|
0 | Build & deploy software | The team is in charge. Software (app) is deployed locally or on servers in the data centre or in the Cloud. Using automation where possible. |
1 | Create or configure | Teams write new software, or configure vendor software to meet the organisation’s needs. Using automation where possible. |
2 | Direct dependencies | The supply sources are usually directly referred to inside scripts and build configuration files. The teams tend to know these, to a greater degree. With large software, the components here might be sourced from colleagues in different teams or even divisions, think large corporations. |
3 | First Transitive dependencies | Quite often components transitively pull in others. The teams tend to be less aware of these, a second place where surprises may come from. These sources are quite often not well known or understood. |
4, 5 | Second Transitive dependencies | The second transitive dependency level (and beyond) tends to be opaque, if not totally overlooked, by teams. Only when trouble strikes do people become aware of their presence. |
A few illustrations
Take a wildly popular piece of software, which is installed on users’ computers and may be used to edit documents from remote locations: Microsoft Excel. Do you have any idea what a supply chain for Excel might look like? We can get a glimpse of some of it by looking at the Third Party Notices link.
Let’s take another example. Suppose you’ve heard lots of good things about React Native, the promise of a single codebase for three platforms: web, iOS, and Android. What’s not to like? Give it a go? Right off the bat, before you write anything custom, the toolkit creates a sample project and downloads the basic dependencies for you. Your blank project now contains a node_modules folder with over 500 sub-folders! What is in there, do you know? Ever tried to look and understand them all?
There is a wild profusion of software frameworks and toolkits written in JavaScript, Python, Java, Ruby, PHP, .NET, Scala, far too many to list. If your teams write custom software building on popular frameworks or toolkits, your Software Supply Chain might be a sprawling web of dependencies that you’re not fully aware of. This could be a blessing as long as you are delivering value effectively and efficiently. However, it might also be hiding some cumbersome burden or vulnerabilities that you ought to proactively manage and contain. Once you are facing trouble, it is always far too late and expensive.
Spring Framework (Spring) has been very popular with Java developers for two decades and counting. It has become a de facto standard for writing Java applications. The Spring ecosystem is now a wide array of specialised frameworks that cater to various scenarios, from microservices to big data to API design and many more. Apps built with Spring automatically benefit from lots of functionality such as database access, authentication, internationalisation, etc. Java features wired into the app typically ship as Level 2 dependencies; some of these are specified by the developer, while others are pulled in as Level 3 and 4 dependencies via Spring. To run a Java app, you need an engine called a virtual machine, JVM for short, which understands the operating system it is installed on and can interact with it. Level 4 and 5 (and beyond) dependencies are typically present at the JVM layer. The picture is fairly similar for .NET apps.
What about security, ransomware?
Each grey block in the diagram depicted earlier could be holding both pleasant and unpleasant surprises unless you are able to ascertain what is in it. As bad actors seek to cause harm, they typically try to break into the software supply chain. One notorious recent example is the SolarWinds breach, in which the bad actors cunningly managed to inject malware between Level 1 and Level 0 of the vendor’s software supply chain. Companies got infected by buying a legitimate, genuinely verified software package! It was a very successful supply chain attack, of a kind so unusual that it went unnoticed for possibly years.
What about broken software builds?
Every now and then, teams get unexpected disruption in their software construction process due to a supply chain issue. A good but fairly forgotten example is left-pad, which broke numerous teams’ automated builds. It wasn’t the only instance, but that was when I first thought about writing this article, and I kept postponing its draft till now. Left-pad is a good example of a Level 4 dependency for NodeJS apps (built with JavaScript). Most developers were not even aware of its existence; then one day its developer decided to remove it from its well-known public place, the NPM repository. Suddenly lots of startups and even scaled-up companies saw their JavaScript apps fail to build. The story can be read here and here.
Whether publicised or not, give or take every month an organisation somewhere is impacted by some vulnerability in its software supply chain.
What about software bloat?
What might start as a clever convenience (just fire up a tool and get a baseline app running in no time) may actually be hiding unnecessary bloat yielding unneeded complexity, a ballast that will linger on for a very long time. In the case of a React Native app, are you sure you need all the components that it automatically installs in your app? What about a frontend app that uses Bootstrap, a popular web frontend framework: have you grabbed the whole thing and the kitchen sink though you’re not using most of it? Say you’ve initiated your app from a nice blog post you’ve read, where the author illustrated lots of concepts in their example: do you need them all? Are all of those things ending up in your app and probably staying there forever?
Software is very much a liability. Think not “when you build it they will come”; think instead “when you build it you maintain it, and it will cost you more time and effort than you think”. The software industry could make people believe that getting started is the most difficult part of the journey, which may be why an overwhelming number of publications focus on getting started, the hello world. Starting something new always sounds exciting, and everyone wants a piece of that. However, try looking for How to Keep Going and Stay Sane: not many references can be found. Your app may be depending on a clever component that does magic for you. Too bad if the developer has long since decided he wanted to be closer to nature, became a hermit, and now lives in a remote mountainous area without electricity or running water: you’re stuck with that magic dust component. Did anyone really need to depend on left-pad? Take a look at the excerpt below and see for yourself: it simply adds some whitespace (or another filler character) to the left of a word (or a chain of characters) until it reaches a given length, that’s all! Why should this be allowed to break someone’s code, at lots of startups and even scale-up companies?
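To give a feel for how small such a component is, here is a minimal Python sketch of what a left-padding helper does. It is not the original JavaScript source of left-pad; the function name `left_pad` and its parameters are chosen here purely for illustration.

```python
def left_pad(text, length, fill=" "):
    """Pad `text` on the left with `fill` until it is `length` characters long."""
    text = str(text)
    if len(fill) != 1:
        raise ValueError("fill must be a single character")
    missing = length - len(text)
    # Text that is already long enough is returned unchanged.
    return fill * max(missing, 0) + text


if __name__ == "__main__":
    print(left_pad("42", 5, "0"))
```

Python's built-in `str.rjust(length, fill)` already covers the same need, which is rather the point: a helper this small rarely justifies an external dependency.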
This is not to discredit any individual; it simply illustrates how an innocuous choice (“let’s use this handy little tool”) can end up everywhere without anyone paying proper attention to it. That’s the SSC issue in a nutshell.
So, how well do you know your Software Supply Chain (SSC)?
- Does your team or organisation fully understand all the levels present in your SSC? How many levels can you and your teams relate to?
- Have you ever considered auditing what goes in?
- Have you figured out how to inspect and catalogue your SSC?
- Do you think you are totally prepared against disruption in your SSC? Is it protected against cyberattacks, ransomware? How can you tell?
- Are you actively managing your SSC, or would you rather leave things up to chance?
- What about resilience, what if parts of your SSC are vulnerable, have you worked out how to survive a sudden change of licensing or availability in critical elements of your SSC?
- Do you think being in the Cloud provides a sufficient guarantee that you are covered? Has anyone checked?
These are some of the questions that executives could be asking. There are a number of different ways of building resilience into the company’s SSC, and a proactive approach to addressing it will go a long way.
What can you do to improve control of your SSC?
Virtually every company today is a software company in some ways. Companies large and small will benefit from getting to grips with their SSC. You want to understand what you’re exposed to, where the biggest risks lie and what you should do to mitigate them. There is no one-size-fits-all. Very large companies likely have processes in place and would know how to manage their assets. Big Tech such as Microsoft, Google, Amazon and Apple most certainly have the resources to manage their SSC properly. The rest of the business world are not trillion-dollar valuation companies; they need to fend for themselves and find solutions. Here are some ways to get started.
Start at the bottom, with the software
A good place to start might be to try and gain some understanding of your particular app’s SSC situation. This doesn’t need to be too difficult, as many software development tools exist. In your situation, you might even recall some past incident where teams struggled with a technical issue and someone came forward and said “oh, it was just such and such component that we frankly shouldn’t even have, that was the root cause”. So someone in your team knew about a potential risk; maybe they were simply not empowered to make the change, or business owners didn’t prioritise some proposed improvement: “we don’t have time for ‘fancy architecture’ stuff, no need for refactoring as you call it, we just need to ship even more new features and faster!” There exist toolkits for every modern software development platform. Modern software development workbenches such as Eclipse, Netbeans, those from JetBrains (IntelliJ, Rider, PyCharm, …) or Microsoft (Visual Studio, Visual Studio Code), all offer ways to analyse dependencies. Use these tools to create a map of your environment. Make diagrams and dive into them as a team. Challenge everything you see in there: each item’s presence should be unambiguously justified, otherwise it should be marked for (eventual) removal. This exercise might reveal directly actionable and beneficial items.
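As a rough illustration of what such a map can look like, the sketch below uses Python's standard `importlib.metadata` module to read the requirements declared by installed packages, then walks them breadth-first from the dependencies you name yourself. It is a simplified sketch, not a replacement for a proper dependency analysis tool: the function names are made up for this example, optional extras are ignored, and version constraints are dropped. A depth of 1 roughly corresponds to Level 2 in the table above, depth 2 to Level 3, and so on.

```python
import re
from collections import deque
from importlib import metadata


def normalise(name):
    """Normalise a package name so 'Foo_Bar' and 'foo-bar' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_graph():
    """Map each installed distribution to the package names it declares as requirements."""
    graph = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue
        deps = set()
        for req in dist.requires or []:
            if re.search(r"extra\s*==", req):
                continue  # optional extras are ignored in this sketch
            match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", req)
            if match:
                deps.add(normalise(match.group(0)))
        graph[normalise(name)] = deps
    return graph


def dependency_levels(graph, direct):
    """Breadth-first walk: direct dependencies get depth 1, their dependencies depth 2, etc."""
    levels = {}
    queue = deque((normalise(name), 1) for name in direct)
    while queue:
        name, depth = queue.popleft()
        if name in levels:
            continue  # already reached by a path that is at least as short
        levels[name] = depth
        for dep in graph.get(name, ()):
            queue.append((dep, depth + 1))
    return levels
```

Calling `dependency_levels(installed_graph(), ["requests"])`, for instance, would list `requests` at depth 1 and whatever it pulls in at deeper levels, assuming `requests` is installed in the environment being inspected. Anything that appears at depth 3 or more and that nobody on the team can explain is a good candidate for the "challenge everything" discussion.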
Look for and remove dead code
You want to look for “dead code” and get rid of it; if nothing else, this will reduce obvious bloat and remove some unnecessary risks. Dead code is code that is present in your app or system but is never visited by any of the usages you have for it. You need the source code to find it. This may be a legacy from smart-Johnny-who-became-a-hermit: Johnny is no longer reachable and nobody ever dares to touch that piece of code. Don’t keep such a thing around; if it’s not COBOL then you can probably look at it. Incidentally, COBOL software engineering roles pay handsomely for a reason: it’s legacy code that needs attention now, and the competency has long been on the wane. For apps written with JavaScript, Google made a tool called Closure that proved handy for lots of teams. There are many tools for detecting dead code. A couple of examples: for a .NET app, a tool called NDepend might help; for a Java app, this blog post can be useful on how to do it with Netbeans.
Harden your production systems
Cybersecurity takes dead code removal one step further: it’s no longer about source code but about deployed software. The term hardening refers to the practice of removing all pieces of software that are not explicitly known to be necessary to run a business, remedying well-known vulnerabilities such as weak authentication and authorisation, removing all test data, changing predictably weak configurations, and applying protective measures based on records in well-kept vulnerability databases. Organisations would normally have security teams in place, and they ensure that these activities take place. It is part of the security assurance practice. It gives no guarantee that vulnerable software won’t get deployed, only that just the strictly necessary and properly configured software will be present on production systems. This is by far not the full gamut of security controls necessary, just a small part of it.
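One small part of hardening, checking what is installed against what is explicitly approved, can be sketched as follows. This assumes a Python runtime and a hypothetical, team-maintained list of approved package names; both function names are invented for this illustration, and real hardening covers operating system packages, services and configuration as well.

```python
from importlib import metadata


def installed_names():
    """Names of all Python distributions installed in the current environment."""
    names = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            names.add(name)
    return names


def unapproved_packages(installed, approved):
    """Return installed package names that are absent from the approved list."""
    approved_names = {name.lower() for name in approved}
    return sorted(name for name in installed if name.lower() not in approved_names)
```

Running `unapproved_packages(installed_names(), approved)` on a production image gives a starting list of items that either need a justification or need to go.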
Practice regular code refactoring
As a frame of mind, consider something else entirely. Ever admired beautifully kept botanical gardens, with pristine walkways and stunning plants and trees? This doesn’t happen by chance; someone has been regularly trimming and pruning plants and trees. If they only built a garden and then did nothing for a long time, nature would increasingly take over and it would stop being beautiful. A similar pattern exists with software. Never-trimmed and unmaintained software increases exposure to security threats, hinders the organisation’s ability to execute, and can have an impact on the people who work there as well. It is healthy to regularly analyse and restructure code, looking to reduce complexity and make it more maintainable. This will help your organisation become ever more responsive to changes in the business environment. This practice works best if an active architecture discipline works in lockstep with software development processes.
As you analyse your SSC you might also discover your own flavours of left-pad; start isolating them and prepare to remove them. Code refactoring offers a good way to remove such dependencies, though it may take several iterations to achieve it fully. Martin Fowler wrote a whole book on refactoring, for a very good reason. In case someone wonders, Martin Fowler is a brilliant tech mind who’s been guiding developers since at least the early 2000s, when I initially came across his work.
Protect your SSC against security vulnerabilities
There are numerous tools for checking software quality and improving security in the SSC. One reputed tool is Snyk, which offers both open-source tools and online services. Such tools are typically integrated into the software build pipelines or may be used to protect asset repositories. Ensure you only source software from well-known secure repositories. In some instances, you might want to curate your own secure-software-asset repositories; all the major Cloud providers (AWS, Microsoft, Google, Oracle) offer solutions. Additionally, moving your code to those types of vendors, and perhaps using online software development workbenches, could all help to improve your practice. This Microsoft article also gives some tips. It is reported that the SolarWinds breach stemmed from an improperly configured CI/CD environment. CI/CD stands for Continuous Integration / Continuous Deployment, the set of automated processes whereby code is taken from source repositories and software is built and deployed on testing and/or asset repository systems. If true, one can argue that had teams at SolarWinds not allowed weak/default credentials to persist in their CI/CD systems, a massive and very damaging attack could have been prevented.
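One simple control that build pipelines can apply when sourcing components is to compare each downloaded artifact against a checksum recorded when the component was first vetted. The sketch below shows the idea with Python's standard `hashlib`; the function names and the notion of a "pinned" digest kept by the team are assumptions made for this illustration. Note the limits: this only detects tampering between the moment the digest was recorded and the download, and it would not catch malware injected upstream before the vendor published the package, as reportedly happened in the SolarWinds case.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path, pinned_hex):
    """True if the file's SHA-256 digest equals the digest recorded when it was vetted."""
    return hmac.compare_digest(sha256_of(path), pinned_hex.strip().lower())
```

A pipeline step would call `matches_pinned_digest` for every downloaded artifact and fail the build on the first mismatch. Many package managers offer an equivalent built-in mechanism through lock files, which is usually preferable to a home-grown script.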
Adopt architecture-led SSC management
One can argue that SSC management is indeed an architecture governance issue, that’s where it belongs and can have the most impactful benefit. As the organisation improves at pro-actively managing its SSC, it should discover more opportunities to optimise things even further. As stated earlier, software is a liability, you build it you maintain it (the gorgeous botanical garden doesn’t keep itself squeaky clean and beautiful). Since liability speaks well to number crunching, getting it under control thus makes a lot of sense.
Is it possible to avoid such liability entirely? Is that a reasonable expectation? Is it possible to reduce the liability, the blast radius? Well, there are options. You might have heard about No-code? Low-code? These can start to look attractive. Have you wondered whether that’s something for your organisation? Those who have been in the industry long enough will probably be reminiscing about how “Microsoft Access was going to obsolete software development”. That’s fair, but hang on for a moment. Let’s first understand them by looking at another parallel notion, accommodation for people.
As people in modern society, we can achieve our accommodation goals in many ways. When we are temporarily travelling to some place, say a foreign city, we often only need to carry luggage holding the most essential items, and we stay at a hotel. Depending on our budget, the hotel may provide a luxurious range of services, a 5-star hotel, or as little as just a place to sleep and take a shower, maybe a 2-star or 3-star hotel or a hostel. Aside from our clothes and strictly essential belongings, we don’t own anything that we find and use in a hotel facility; they are all services for a fee (often included in the nightly rate). A hotel room can be equated to the software-as-a-service (SaaS) notion of No-code, but with a twist: with No-code, you are going in for the duration of a service or some form of sustained commitment. In some instances we might be planning to stay somewhere a lot longer than just a couple of nights, and it might then be nicer to rent an apart-hotel or a semi-furnished apartment. In this case, the level of service is lower than that of a hotel: the place has adequate furniture, and regularly replaced clean bedding and towels are available, but we need to do the groceries and cook for ourselves. We might bring more belongings with us, though not an entire household. An apart-hotel room or a semi-furnished apartment could be equated to the SaaS notion of No-code or, most often, Low-code. When we plan to stay somewhere for many years, there is a whole bunch of options, ranging from a totally empty apartment to a whole house, or even a plot of land where we can obtain a construction permit and build the house of our dreams.
The long-stay case is akin to when we build and run our own software, of which variations also exist: the rented apartment with a blend of SaaS and own code, the bought apartment with a combination of Platform-as-a-Service (PaaS), SaaS and own code, all the way through to managed Data Centre facilities, rented or bought.
Whatever the case, renting a hotel room, renting an apart-hotel or semi-furnished apartment, or building your own house, you don’t try to do absolutely everything by yourself, do you? That leading-edge security system at the million-dollar mansion was probably set up and is run by professional companies, the utilities are supplied by a local provider. So you do rent some services, it is always going to be a question of mix-n-match.
After such a large detour, we can examine our organisation’s situation and find different models that can apply. It is hopefully now clear that we don’t get rid of the problems entirely, we simply shift responsibilities to where they will be best addressed. We delegate some tasks to parties that we trust, hence remove such concerns from our daily operations so as to achieve more focus. This also implies strategic thinking, you are going in for the longer term (contrary to the hotel room paradigm), there are pre-requisites, some pre-existing systems, and data that you want to rely upon. With software, you want to take charge of the levers that best arm you for properly steering your business, forge appropriate relationships where you can delegate valuable tasks and maintain good standing. It is not the aim of this article to dive into the current flurry of vendor low-code and no-code offerings and their nuances. There is no silver bullet, just carefully calculated and measured moves.
In summary
As a business starts and scales up, it may go from renting a simple hotel room, or working out of someone’s bedroom, through to building its own corporate tower blocks on its own ground, and a mix of both is inevitable at a certain scale. Salesforce started in a bedroom and now has shining towers it is rightly very proud of. This is not to suggest that your organisation’s journey could be anything like that. Instead, this is an invitation to get to grips with the software that powers your business and to validate that everything is truly fit for purpose; that’s what control of the SSC can also help you achieve. Is Salesforce moving all employees to their big tower? Of course not. Do they own several different buildings across the US and around the world? Surely, yes. At any given time, do they also have people working from some hotels or temporarily rented facilities? Most probably. That is another way of considering how an organisation could mix and match solutions to fit its needs.
This has been a broad sweep of an important concern for business today, the Software Supply Chain, the SSC. Not all organisations have sufficient awareness to realise how important it is for them, or whether they should worry about it at all. Again, take another topic, Covid-19. People living outside of China will remember seeing TV footage of agents in full body protection clothing spraying the streets of Wuhan. How many looked at that and thought “ah, strange things happening in China!”? Fast forward to now, hasn’t that proved to be a global concern? There is something similar with software: when we read or hear about ransomware attacks or broken systems, we probably tend to brush that aside and think “not here though”. Well, are you so sure? It’s better to ascertain where the organisation stands; then there may be things you can do to increase your odds of coming out at the better end of it. Raising some awareness of such issues is what this article aimed to achieve.