The importance of “Below the Fold” SBOMs

 

If you ask someone on your engineering team to name something in your SBOM, you might hear the names of a handful of well-known open source libraries, especially if they have been in the news lately due to a serious vulnerability. These components are typically the ones they use day in and day out as part of the code they write for your product. They’ll likely be written in the programming language your team uses and managed by the repository manager they use as well. In the newspaper business, “Above the Fold” is the place where important and critical news of the day is reported, while the news “Below the Fold” is perhaps perceived as less important or, at least, less flashy. That said, 99% of the news in the newspaper is “Below the Fold”.

 

In the world of SBOMs we find the same distinction in the types of SBOMs we build and manage. We often strive to make a SBOM for the dependencies our code touches directly, while neglecting or ignoring the foundations under our product: the operating system, containers and infrastructure pieces.

 

While there are many historical reasons for this, the days when these layers could be ignored are long over.

 

What are you less likely to hear about?

 

Just as important but often neglected are what I call “Below the Fold” components: the infrastructure and operating systems that your entire stack is built upon. They are often treated as an opaque box, with little understanding of their internal dependencies or even their existence.

 

Important components for your stack such as Operating Systems, Databases (you may have multiple optimized for different purposes), Message Queues or Monitoring/Observability components may be indispensable but are often ignored.

 

The inverse pyramid of SBOM knowledge

 

When you create or receive a SBOM for a product, the layers your product developers work on will be concentrated at the top of the stack with less and less knowledge as you get closer to the bottom. You might even see some components from each layer mixed in your final SBOM, but that is not a sign of completeness. Your SBOM might be 80% complete at the top level, but only 1% complete at the lowest. 

 

MORE TRACKING

Your Product

Infrastructure and System Services Layer

Operating System Layer

LESS TRACKING

 

The complexity of a modern software product means that multiple teams assemble layers of software, network together services and download container images, all while expecting that “someone, somewhere” will be keeping an eye on things. If you are reading this, you might be that someone!

 

Who selects and builds the Operating System layer?

 

Traditionally, an IT person or System Administrator was responsible for selecting, provisioning and updating the operating systems that products ran on. As time went on, this task became something that developers took on. As the march toward containers and Cloud Computing continued, it became easy to grab an operating system image from the Internet and move on. The component you depend on might even recommend an image name from a container service. This one component might bring in hundreds or thousands of other open source components silently. Your product may depend on more than one image, selected or created by different people at different times. Some may be kept up to date, others are a “one and done” and left in place with no updates or fixes even if serious vulnerabilities are discovered.

 

As attacks on components increase, concerns around poisoned components and legal requirements for accurate and complete SBOMs also increase. The need to track and manage these layers is becoming more important.

 

Why do people neglect the operating system layer?

 

Much of the industry has moved away from a dedicated job title and role (System Administrator) while, at the same time, systems have become more complex and expectations around security and SBOM management have increased.

 

Often there is a belief that Firewalls or Network scanners are performing this task, especially since teams may see some of the same open source component names, CVEs and devices in the reports these tools generate.

 

Network scanning tools give only a thin slice of the open source in use at these layers. These results will be concentrated on services that advertise their existence or explicitly transmit information about their name and version when prompted. This is a very small percentage of the actual number of components to be managed.

 

Additionally, containers and other operating systems are seen as opaque boxes. “That’s Linux” or “That’s our Postgres Server” and tracked as a single line item if they are even tracked at all. There may be a belief that a vendor or open source project will be sending out updates or alerts, even if this is not the case.
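
To make the opaque box point concrete, here is a minimal sketch of how a single “That’s Linux” line item expands into individual packages. It assumes a Debian-style image, where the package database is a text file of `Package:` / `Version:` stanzas separated by blank lines. The sample text, the package versions and the function name `list_packages` are made up for illustration; a real image would typically contain hundreds of entries.

```python
# Illustrative sample in the Debian package database stanza format.
# The three stanzas and their versions are made up for this sketch.
SAMPLE_STATUS = """\
Package: openssl
Status: install ok installed
Version: 3.0.11-1

Package: zlib1g
Status: install ok installed
Version: 1:1.2.13.dfsg-1

Package: old-tool
Status: deinstall ok config-files
Version: 0.9-2
"""


def list_packages(status_text):
    """Return (name, version) pairs for packages marked as installed."""
    packages = []
    for stanza in status_text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines start with whitespace; skip them here.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if fields.get("Status", "").endswith(" installed"):
            packages.append((fields["Package"], fields["Version"]))
    return packages


for name, version in list_packages(SAMPLE_STATUS):
    print(f"{name} {version}")
```

Each of those lines is a component with its own license, vulnerabilities and end of life date, which is why tracking the whole image as one entry hides so much.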

 

Even if you are doing container scanning, you might not be receiving active alerts, depending on the tools used to scan the image before deployment. The depth of analysis might not be appropriate for the current threats, and many times the alerts are ignored because they are going to the wrong people.

 

What serious concerns come from the operating system layer?

 

Depending on your industry and distribution model, you will be concerned about a few different things that a SBOM will help you identify and manage. These include:

 

  • Vulnerabilities, CVEs and Security issues
  • Open Source License Compliance
  • Poisoned Components and Malware
  • Supportability and End of Life issues for OSS Components

 

By creating SBOMs for these layers, you will be in a better place to manage each of these concerns.

 

Where are we going with Below the Fold SBOMs?

 

As teams look lower in the stack, I expect to see a few trends occur. The first is that tools will appear that help teams perform better analysis and SBOM creation for the lower levels. These might be part of your existing SCA toolset, or may be from a separate vendor with a concentration on container and infrastructure experience. It is okay to mix and match tools to get a better result!

 

SCA products performing Binary analysis of existing images and binaries will help do spot checks for the Vulnerability of the Day, especially before vendors and OSS projects have their SBOMs created and shared.

 

There is movement to use Hardened and Minimal images. These are images specifically designed to be as secure as possible, published often with vulnerabilities fixed and in some cases, with unneeded functionality removed in order to reduce the attack surface. This will help, but you will still find cases where your teams have built their own images, or are layering open source components on these hardened images. 

 

Additionally, scanners to help understand the interactions and full service dependencies of a software system are appearing. These help collect and visualize the network and data paths between different silos. By getting a complete understanding of all the containers, services and teams involved with running your product, you will have a more complete and useful set of SBOMs.

 

Who Manages the security for each layer?

 

It’s important to understand who is responsible for the security response for each layer. Do you have a DevSecOps team who can handle it all or do you need to pull in a classic IT team to help with operating system and VM issues? It is important to examine every container, layer and service and explicitly assign it to a security response team. Doing this over a holiday weekend due to a high severity vulnerability is a recipe for disaster.

 

Questions to Ask Your Team and Yourself

 

  • Do we know all the containers our system depends on?
  • Where did these images come from, and what is the update process?
  • Are these images being scanned and managed by a SCA product?
  • Can we produce a SBOM for these images?
  • Do we depend on any “real” server hardware with an Operating system on it?
  • Do we use Virtual Machines (VMs) anywhere?
  • Where did those images come from, and what is the update process?
  • What infrastructure components like Databases and Queues do we use?
  • Do they have their own operating system / container to manage?
  • Are there any temporary systems that are still running? Something for a single customer or integration?
  • What is our process to keep our SBOMs up to date for everything we discovered in this review?
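
One way to keep the answers to these questions from getting lost is to record them per asset and list what is still unanswered. The sketch below is only an illustration of that idea: the `Asset` fields, the `review_gaps` helper and the inventory entries are all hypothetical and would need to be adapted to however your organization tracks its systems.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    kind: str                 # "container", "vm", "server" or "service"
    source: str = ""          # where the image or install came from
    update_process: str = ""  # how it gets patched, if at all
    has_sbom: bool = False
    owner: str = ""           # team responsible for security response


def review_gaps(asset):
    """Return the review questions this asset cannot answer yet."""
    gaps = []
    if not asset.source:
        gaps.append("unknown origin")
    if not asset.update_process:
        gaps.append("no update process")
    if not asset.has_sbom:
        gaps.append("no SBOM")
    if not asset.owner:
        gaps.append("no security owner")
    return gaps


# Made-up inventory for illustration.
INVENTORY = [
    Asset("api-container", "container", source="internal build",
          update_process="weekly rebuild", has_sbom=True, owner="platform team"),
    Asset("reporting-db", "service", source="public image"),
    Asset("customer-x-integration", "vm"),
]

for asset in INVENTORY:
    gaps = review_gaps(asset)
    if gaps:
        print(f"{asset.name}: {', '.join(gaps)}")
```

The forgotten single-customer VM is the kind of entry that tends to show every gap at once.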

 

Wrapping Up

As the legal requirements for complete and sharable SBOMs become more stringent, the need to examine the lower foundations of your software stack becomes more important. You should bring up these concerns with your software team and discover the layers and containers that should be brought under management. You may find varying levels of security and compliance even in the same team. Putting pressure to produce SBOMs on suppliers of foundational components can help you get a better understanding of your complete supply chain and dependencies list. Bring in the right tools for each layer and make sure that the results are viewed by a team with the responsibility to protect the company based on the results of these scans.

 

As said before, while sometimes you need to read the first page, 99% of the news is in the rest of the paper!

 

Are you cleaning your sidewalks fast enough?

Companies all look the same when the weather is good.

Most days it doesn’t matter. Rain, sunshine, wind. These might affect other parts of your business but not the safety of the path out front. People don’t notice too much, maybe there’s a small crack, a little trash in a corner, but it’s passable, clear and not a danger to people walking by.

Then the snowstorm hits….

The next day, do people see a clear sidewalk or a dangerous icy patch?

Are you running business as usual, but not fixing a clear danger out front?

What does this tell your customers or potential customers about how much you care? Is it telling them that you are understaffed or failing? What other health and safety issues are you neglecting? Will you blame your users for not being safe if they get hurt?

This is the case with CVEs and other vulnerabilities. When the CVE of the day gets publicized, are your response plans nimble? Can you patch a library and get a new release out quickly? Can you publish a VEX document saying “we’re not affected”? Is it always an emergency, or do you have a playbook at the ready?

As a user, which of your vendors was a month late getting you an update? Which one never patched at all? Did you read about a breach in the news but not get clear alerts from them in the first place?

Your business might still be open after the storm, but your users will see how you handle the icy patches in the meantime….

What 20 years of “stolen” snippets teaches about managing AI generated code

 

For many years I led a team whose job it was to track down who actually wrote the software that major corporations were shipping in their products. As part of this review, we created Software Bills of Materials (SBOMs) containing lists of the large open source libraries most people think of when they think of SBOMs, but also information concerning bundles of individual source code files and even snippets of cut and pasted code. 

 

In the last two cases, developers had gone out to the internet, found a blog or internet forum containing code that solved their problem and cut and pasted it into their code base. They often asked a search engine a question and found a block of code created by a human that almost exactly answered that question.  With a quick cut and paste and some minor editing the developer would quickly be on their way with the programming problem solved.

 

The first major issue with these types of cut and pastes is the lack of compliance with open source licensing that this third party code was released under. The original authors published this code with the expectation that these licenses would be read, respected and complied with.

 

In the vast majority of cases, the code was brought over without any embedded licensing information. It was either silently copied in or preceded by a developer comment jokingly disclosing something like “Stolen from blog xyz!”. You might say the second case at least gave a little credit to the original author, though the lawyers involved would always get upset.

 

Additionally, the quality and security of this code could be questionable. Typically the code was created to demonstrate a point, or to provide simple guidance on how to do a task, but lacked security guardrails or error handling.

 

Through a lot of scanning and human analysis, many of these source code snippets were brought into compliance and safety while others required removal and an independent re-write in order to clear an open source license violation or quality problem.

 

Why does this happen?

 

Developers are under intense time pressure and will use any tools available in order to complete the work they have in front of them. Combine this with a common lack of understanding of open source licensing, a lack of source code origin tagging, and the difficulty of discovering and managing this code, and the path of least resistance is to let this third party code sit silently. It stays there until an event like an M&A transaction, a lawsuit or a contractual scanning requirement causes it to be discovered, often at great cost to fix.

 

Policies were often in place, but a paper process with no teeth and without tooling to enforce it is pretty much worthless.

 

Additionally, the original code often lacked inline copyright and license statements, making it difficult for the average software developer to know what was expected of them.

 

Looking at how many developers interact with code from the developer help forum “Stack Overflow” shows us that licenses are confusing and that developers don’t know or care enough to look for them. When a developer cuts and pastes code from Stack Overflow, the code by default comes under a Copyleft style license, the Creative Commons Attribution-ShareAlike license (CC BY-SA). See https://stackoverflow.com/help/licensing

 

This can cause headaches later on as the team tries to understand how this license affects their use of the snippet, attempts to relicense the code by contacting the original author, or somehow decides the code is “not protectable”, which can be a difficult and risky process itself.

 

This type of cut and paste and then legal review can cause a lot of wrangling and debate. “Why did they post this if they didn’t want me to use it?” or “Are we REALLY going to get sued?” or “Everybody ELSE is doing it, why shouldn’t we?”

 

These questions are very much in line with the concerns and questions teams are having around AI generated source code.

 

How is it similar to the AI Code Generation problem right now?

 

The vast majority of code generated by AI/LLM code generators is emitted with no tagging or information tracking its origin. This is unlikely to change in the near future. Additionally, there is often a claim by the companies generating this code that the emitted code is not under an open source license. This code is being generated in large quantities at all levels of an organization with no tracking, and with policies that, even if in place, are at odds with reality. For example, a policy that says “You will not use AI/LLM to generate source code for this project” is unlikely to be respected unless a clear, believable reason is communicated to each developer personally. A policy slide in a PowerPoint on a shared drive someplace might as well not exist.

 

Is there an AI Code Generation Problem?

 

We don’t know and that is a tough place to be in. Companies who are in the business of selling AI Code generation are claiming there is no open source licensing problem. Content creators in other industries like Newspapers, Books and Cinema are filing lawsuits against similar claims in non-source code related fields of use. The next few years will be interesting. 

 

From a security perspective, there is concern and some evidence that AI generated code may be of lower quality than human generated code, especially in areas where the developer requesting help lacks expertise.

 

What can be done to manage this problem?

 

 

  • Policies don’t often work – but are important

 

Policies around Open Source compliance and security work are important but are only given as much respect as they earn. A one-off policy that is buried, is too harsh, or that has no dedicated time on the roadmap and timetable is destined to be ignored either out of ignorance or perhaps the hope that the policy will change in the future or that the developer will be long gone before it’s an issue. Your Legal team might know about the policy but developers will almost never know. Think about the impacts of Bans! It is important to explain why a policy exists and how to ask for permission to change it or get permission for a variance.

 

 

  • What doesn’t work?

 

Expecting developers to add comments that are not present in the original code is very difficult. It’s a great requirement to have in your policy and guidelines, but you should expect that most people will not follow the request. This is especially difficult if the developer’s IDE itself is generating the code inline. It may be helpful to push back on the dev tool manufacturers to request tagging or inline comments for generated code, but it does seem that this is against current trends.

 

 

  • If the concern is Quality, require tools to enforce quality

 

Whether the vulnerable code is written by your developer, generated by an AI tool, or brought in by an open source project, scanners exist to help discover vulnerable code. They do require expertise and time to run and analyze results, BUT this time should be brought into the calculation as part of the decision to introduce code from the outside in the first place.

 

 

  • Tools might help, if they are used promptly

 

Whether it’s SCA scanners that do Snippet Matching to find unlabeled licensed code, or SAST/DAST scanners to discover vulnerable code, tools exist to help solve these problems. They are often left to the end, when it’s too late to do anything, or they overwhelm the user with perceived false positives and a large number of real issues to resolve. It takes a trained eye to understand these results and make the required security changes. There is no easy button.
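
As a small illustration of running a check early rather than at the end, the sketch below looks for the kind of origin hints described earlier, such as a “Stolen from blog xyz!” comment or a Stack Overflow link. It is a rough heuristic and not a substitute for the Snippet Matching a real SCA tool performs; the pattern list and the function name `find_origin_hints` are assumptions made for this example.

```python
import re

# Phrases and hosts that suggest a block of code was pasted from elsewhere.
# This list is an assumption for the sketch and will miss silent copies.
HINT = re.compile(
    r"(stolen|copied|taken|borrowed|adapted)\s+from|stackoverflow\.com",
    re.IGNORECASE,
)

SAMPLE = """\
def parse(value):
    # Stolen from blog xyz!
    return value.strip()

# see https://stackoverflow.com/help/licensing
def other():
    return 1
"""


def find_origin_hints(source_text):
    """Return (line number, line) pairs that hint at pasted code."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        if HINT.search(line):
            hits.append((number, line.strip()))
    return hits


for number, line in find_origin_hints(SAMPLE):
    print(f"line {number}: {line}")
```

A check like this could run on every commit, so the question of where the code came from is asked while the developer still remembers the answer.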

 

 

  • Make it clear if there are areas where NO risks can be taken

 

You may have extremely important areas of your codebase where you can take no risks. This might be your core engine, core forecasting model or “secret sauce” that you may wish to wall off from AI generated code. This is a perfect place to have an explicit policy communicated to anyone with check in privileges to this area. A file in this directory containing policies and expectations can be helpful as well.
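
A policy file in a directory becomes more useful when tooling can find it. The sketch below shows one possible approach: given a changed file, walk up the directory tree until a policy file is found or the repository root is reached. The file name `AI_CODE_POLICY.md` and the function `find_policy` are hypothetical, and the demo builds a throwaway directory tree so that it can run anywhere.

```python
import tempfile
from pathlib import Path

POLICY_FILENAME = "AI_CODE_POLICY.md"  # hypothetical name for this sketch


def find_policy(changed_file, repo_root):
    """Return the nearest policy file above changed_file, or None."""
    repo_root = Path(repo_root).resolve()
    directory = Path(changed_file).resolve().parent
    while True:
        candidate = directory / POLICY_FILENAME
        if candidate.is_file():
            return candidate
        # Stop at the repository root (or the filesystem root as a fallback).
        if directory == repo_root or directory == directory.parent:
            return None
        directory = directory.parent


# Demo on a temporary directory tree: core/ is walled off, docs/ is not.
with tempfile.TemporaryDirectory() as root:
    engine = Path(root) / "core" / "engine"
    engine.mkdir(parents=True)
    (Path(root) / "core" / POLICY_FILENAME).write_text("No AI generated code here.\n")
    protected = find_policy(engine / "model.py", root)
    unprotected = find_policy(Path(root) / "docs" / "readme.txt", root)
    print(protected.name if protected else "no policy")
    print(unprotected.name if unprotected else "no policy")
```

A pre-commit hook or review bot could use a lookup like this to remind the author of the policy whenever a protected area is touched.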

 

 

  • Understand that some areas might be more likely to have problems

 

If you are using AI to generate code for Linux Device Drivers or to create functionality to integrate with a well-known open source project, you may find yourself in a situation where the model gives you EXACTLY what you are asking it for. If these areas of functionality are highly likely to be licensed under the terms of the General Public License (GPL) or other copyleft/viral licenses, you may get back source code that sure looks like the GPL licensed ecosystem you asked for. This is an area to perform deep scans on, and one where you should be prepared to release your own code to the community under a strong open source license.

 

 

  • It’s hard to slice and dice the law and ethics

 

It can be confusing to employees why you respect some open source licenses, but not others. Or why you do in certain areas but not others. Keep abreast of changes in the industry, especially around licensing and indemnification. Some AI vendors may indemnify your use of their code generation tools but there may be hoops to jump through in order to make sure you are covered. What are the limits of this coverage if it exists?

 

Where do we go from here?

 

The genie is out of the bottle: AI generated code is going into organizations’ codebases every day. Ignoring its existence is unlikely to be the best long term strategy. It will find its way into your codebase, and be in a form that will make it very difficult to track. If problems arise, it will be difficult to remove and replace. Through education, thoughtful and well communicated policies and proper tooling, your organization can use the current crop of AI tools while being prepared for changes in the licensing, regulatory and community aspects of their use. Watch how parallel industries deal with this issue and be prepared to quickly communicate policy and legal changes to your developers. Communicate the risks so that developers can better use their judgment while building your products. The next few years will be a time of serious change in the software industry. Understanding how you use these new tools, and the potential impact on your bottom line, can help you best manage the impact these tools have on your software.

 

The S in SBOM should stand for Sharable! 


 

What serious issues besides vulnerabilities and compliance issues will keep you from sharing your SBOM?

 

You are going to find yourself in a jam. An urgent request for a Software Bill of Materials (SBOM) is going to come across your desk. Thinking that it’s a simple “push a button, get a report” process you’re going to ask an engineer to “push a button and get you a report”.

 

When you look at that report, your first SBOM, you are going to find a long list of problems that you won’t want to share with your customers or the government or whoever else is demanding a SBOM ASAP!

 

Let’s talk about what problems you’ll encounter.

 

In a previous article “Your first SBOM is going to stink. Don’t panic, get started fixing it!” I discussed common mechanical problems in getting to your first complete SBOM. These included Completeness, Depth, Unremediated Vulnerabilities, Open Source License Violations, and Over Delivery.

 

As part of the process, you will clear out vulnerabilities and open source license violations but along the way you will find other issues that may prevent you from sharing your SBOM.

 

These include Commercial Components with licensing problems, information you can’t share due to NDAs, architecture issues you don’t want exposed, export control issues, competitor’s technology, perceived over dependence on open source, and dependence on old code or technology.

 

Commercial Components and Technology

While much of the discussion around SBOMs has revolved around Open Source, there is still a large dependence on Commercial technology in the modern software stack. Most Software Composition Analysis (SCA) tools do not discover or report on Commercial components. If the contract or legislation your SBOM is meant to satisfy requires a list of “all third party software” or explicitly calls out Commercial components as reportable, you will need to discover and add these components to your SBOM. This is typically a manual process.
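
Because this step is manual, it helps to keep the hand-curated entries in a form that can be merged with scan results every time the SBOM is regenerated. The sketch below assumes both lists are simple dictionaries with `name`, `version` and `license` keys; that shape, the `origin` field, the sample components and the function name `merge_components` are illustrative assumptions rather than the format of any particular SBOM standard or tool.

```python
# Components found by an SCA scan (made-up sample data).
SCANNED = [
    {"name": "json-lib", "version": "2.4.0", "license": "MIT"},
]

# Commercial components recorded by hand, for example from procurement
# or contract records (made-up sample data).
MANUAL = [
    {"name": "vendor-pdf-engine", "version": "7.1", "license": "Commercial"},
    {"name": "json-lib", "version": "2.4.0", "license": "MIT"},
]


def merge_components(scanned, manual):
    """Combine scan results with hand-curated entries, without duplicates."""
    merged = {}
    for component in scanned:
        key = (component["name"], component["version"])
        merged[key] = dict(component, origin="sca-scan")
    for component in manual:
        key = (component["name"], component["version"])
        # Keep the scanned entry when both sources list the same component.
        merged.setdefault(key, dict(component, origin="manual"))
    return sorted(merged.values(), key=lambda c: c["name"])


for component in merge_components(SCANNED, MANUAL):
    print(component["name"], component["version"], component["origin"])
```

Recording where each entry came from makes it easier to see which parts of the SBOM depend on someone remembering to update a list.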

 

These commercial components often require payment or contract terms in order to be used. It is important to confirm that the usage is being tracked and properly paid for under the terms of the agreement. It is not uncommon to find commercial code that should have been removed at the end of a contract term still being used and shipped in a codebase. This may require re-licensing for continued use or rework to remove.

 

Additionally, you may find yourself with a conflict between the requirement to produce a complete SBOM and a Non-Disclosure Agreement with your technology vendor. It is important to understand any restriction on your ability to tell others about your usage of this commercial technology. It is also important to understand how to resolve any conflicts between a requirement to share everything and an agreement to keep some items confidential.

 

Competitors’ Technology or Relationship Disclosure

The larger your organization, the more likely that products have had a long lifetime, possibly being brought in through mergers and acquisitions or through technology partnerships. The technology stack that was appropriate a decade ago may no longer be politically or commercially appropriate now. While reviewing your SBOM you may find indicators of components or technology coming from current competitors. It is very important to understand the licensing framework that this technology requires. Are there restrictions on using this technology? Has the cost become prohibitive? Will this potentially public use cause marketing, business or legal issues for your organization?

 

Will sharing of this SBOM alert competitors to your technology usage or business relationships?

It is common for business relationships to be confidential or at least kept quiet. Through this sharing of a SBOM, will these relationships become public? For example, if your new release swaps out one technology (commercial or open source) for another, will this be a surprise to your own vendors or community? Is your executive management aware of this? Are there additional NDAs or agreements that could be put into place in order to limit sharing of this information? Are you prevented from sharing this relationship information in your SBOM? How can you legally satisfy both sides?

 

Export Licenses and Review

The shipping of certain technology may require export licenses or compliance review depending on your location and the type of technology being used. By creating a SBOM you may become aware of technology that requires this process and review. It is important to bring in experts who are aware of export restrictions and requirements, especially if compliance steps may have been missed. Additionally, your organization may already be producing this required documentation, which can be helpful in producing a complete SBOM. SCA scans often miss the commercial (and sometimes open source) components that may require export review. This is a manual review step when producing your SBOM.

 

Perceived Overdependence on Open Source Technology or Engines

Even though 90% of a project’s technology stack is typically open source, it can sometimes be a surprise to customers or the public to see the actual contents of the SBOM. If a company has made a big deal about “their secret sauce” or cutting edge technology, but the SBOM shows that it’s based heavily on an Open Source engine, you may find yourself with a marketing problem. If your organization is concerned about a customer asking “Why are we paying so much for this if it’s just rebranded Open Source?” it’s important to get ahead of the question and have a well thought out response to change this impression. This might also be a time to look at your public and financial support of the open source you depend on. If you are heavily dependent on certain open source technology, are you a financial supporter or major code contributor to the project? “Of course we are heavily dependent on it, we’re a major financial and code contributor to the project” can be a great response to this objection!

 

Dependence on Old code or Technologies

Before SBOMs, many commercial products were Closed Boxes where customers had little insight into how and when they were put together. Once a SBOM is shared, it will be clear which years the majority of the project was built in, which programming languages are being used, and which technology choices can be second guessed.

 

If no new components have been added in years, you may have to manage customer expectations about perceived lack of development. If the technology stack is based on a programming language or architecture that is out of the mainstream or has become passé, you also may find yourself with a customer management problem. It is important to get an understanding of what objections customers may have when they read your SBOM for the first time based on age, architecture or design choices. These choices are often set in stone though, and management may need to provide staff with clear and concise responses to prevent customer concern.

 

Sharing what you Know

The process of getting to a Sharable SBOM requires attention to detail, coordination of multiple teams, and an understanding of how technological decisions can be impacted by legal, commercial and political concerns. It’s important to be aware of how SBOM creation and dissemination is occurring in your organization and to make sure that the business needs of the organization are reflected in the SBOM process. By being aware of these issues, you can prevent legal and commercial issues, as well as ethical breaches when teams find themselves stuck between two contrary requirements. A little early planning goes a long way to showing that you are aware and in control of your technology and will continue to be a good partner going forward.

 

 

 

 

How Falsehoods, Folklore and Foul-ups hurt your SBOM

 

 

In a perfect world, the software components our teams select are flawless. They have no software vulnerabilities, have an appropriate Open Source license, and are well maintained by a loving team forever. Unfortunately, we don’t live in that world and have to manage software that is built and selected by humans with differing levels of knowledge, webs of complexity, and physical and emotional lifetimes. These can lead to Open Source license violations, software vulnerabilities and a general feeling of unease in your customers when they read your Software Bill of Materials (SBOM). Let’s take a look at some common problems in your SBOM and what caused them.

 

Foul-up: Someone claims “It’s public domain” or “It’s Open Source!”

There are two common mistakes that developers make when first encountering open source software licenses. The first is misusing the term “Public Domain” to mean that something is Open Source. The second is believing that the term Open Source gives them carte blanche to use the component any way they want with few or no restrictions. While there are open source components that are considered “Public Domain”, many times this term is being misused in place of the phrase “Open Source”. Additionally, “Open Source” is an umbrella term covering licenses with obligations ranging from “releasing all your source code” to almost no requirements at all. I have often seen the wildly inaccurate statement “This code is public domain under the terms of the General Public License”, which should be your sign to run, not walk, away from that component!

The fallout from this misunderstanding is that the actual open source licensing of these components is not visible until a SBOM is created and reviewed (often by customers!). This shows up in your SBOM as open source components with licenses that are contrary to your license policy (if you have one) or that cause a serious license conflict. Examples of this would be components licensed under the General Public License (GPL) in a distributed closed source application like a smartphone or desktop app, or components with an Affero General Public License (AGPL) in a SaaS app.
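
The two examples above can be expressed as a very small policy check. This is a sketch under stated assumptions: the `REVIEW_REQUIRED` table is a hypothetical policy written for illustration and is not legal advice, the component names are made up, and the license strings are SPDX-style identifiers that a real SBOM may or may not use consistently.

```python
# Hypothetical policy: licenses that need legal review, by distribution model.
REVIEW_REQUIRED = {
    "distributed": {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only"},
    "saas": {"AGPL-3.0-only"},
}

# Made-up components as they might appear in a SBOM.
COMPONENTS = [
    {"name": "image-codec", "license": "GPL-3.0-only"},
    {"name": "json-helper", "license": "MIT"},
    {"name": "report-engine", "license": "AGPL-3.0-only"},
]


def flag_conflicts(components, distribution_model):
    """Return names of components whose license needs review for this model."""
    flagged_licenses = REVIEW_REQUIRED[distribution_model]
    return [c["name"] for c in components if c["license"] in flagged_licenses]


print("distributed:", flag_conflicts(COMPONENTS, "distributed"))
print("saas:", flag_conflicts(COMPONENTS, "saas"))
```

The same component list produces different findings depending on how the product is delivered, which is why the distribution model belongs in the policy and not in a developer’s head.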

 

Foul-up: “I didn’t know I needed permission and/or read the company’s license policy”

A similar mistake is caused by a developer or team member not knowing about your organization’s open source policy or guidelines. Introducing new open source or commercial components without understanding the impact of their inclusion can lead to serious and expensive problems. Periodic training (starting at onboarding and continuing on in the future) is important to teach best practices and prevent license violations as well as help reduce vulnerabilities and other security exposures. Additionally, having a centralized Open Source Portal with training and policy documents and scanning using Software Composition Analysis (SCA) software is a great way to prevent and catch problems as they occur.

 

Folklore: “We don’t ship software so we don’t need to worry about Open Source licenses”

Another problem occurs when licenses are simply not checked at all before selecting a component. This often happens when a developer misunderstands the impact of software distribution models on open source licenses. While it is true that many open source license obligations only come into effect when the software is shipped, there are many open source and commercial licenses that are in effect no matter what the distribution model is.

An important consideration is that distribution models change. While you might not be distributing the software today, you may find yourself supporting on-premises installations for a large customer. Now suddenly every component you have built into your application is distributed and its open source license obligations must be complied with.

Even the most SaaS-focused company may find itself distributing utilities, onboarding tools and integrations. Misusing a SaaS open source policy for these projects can cause serious license violation surprises.

 

Foul-up: “It doesn’t have any vulnerabilities….right now”

When a component is first selected by a developer it is less likely to have known vulnerabilities (CVEs) associated with it. Typically vulnerabilities are discovered as time goes on, after security researchers get access to the component version in question.

A SCA tool is a great way to keep on top of the vulnerabilities that occur in the components you select. Make sure you are examining both the top level as well as the transitive dependencies.

Even without a SCA tool, if your versions are years out of date, you almost certainly have vulnerabilities present in your dependencies.

 

Foul-up: “I didn’t know I needed to check the component’s transitive dependencies”

While your developers may be aware of the license and vulnerability status of the components they select, they may be unaware of the sub-components brought in automatically by the repository management system (e.g. NPM or Maven). These components, called transitive dependencies, are just as important as the top level component, but are often overlooked or ignored. Your ability to fix any problems with these components is limited (i.e. you are unlikely to be able to fix a problem faster or better than the parent open source project itself!) but you may face pressure from your customers to fix these problems ASAP. By scanning at component selection time, you can get ahead of core license conflicts and ancient unmanaged vulnerabilities. By running a SCA system with alerting, you can get warned of changes in vulnerability status and plan required upgrades as necessary. Additionally, you can help your upstream open source projects by making them aware of issues they may not have the technology or knowledge to manage.

 

Foul-up: “I thought my SCA tool handled Commercial Vulnerabilities”

If you are using any commercial dependencies, you may be surprised that most SCA scanners do not report on their vulnerabilities/CVEs and many are unable to report on even the Open Source components used by those commercial dependencies. This is a large hole in many companies’ open source management processes. It is one important reason to request your vendors’ SBOMs and review their contents not just at ingestion time, but continuously if possible.

 

Folklore: “I got it from the Internet, I must be able to use it”

No news is NOT good news. Another common source of unexpected SBOM churn is code downloaded from web pages, forums or project documentation sites. Any time you take software or source code from another place, you need to know what permissions you have to use it. A clear open source license or permission statement is typically required in order to legally use this source code. If components appear in your SBOM with unknown, blank or generic sounding license metadata, it is important to dive deeper into what permissions actually exist for this component. If the original author has not put any in place, it may be helpful to suggest to them a license that works well for you (perhaps an Apache 2.0 or MIT license). Be aware that the author may select a license that is incompatible with your business model or desires and may even require commercial payment or forbid your use!

While it’s always best to know your obligations before you build something into your application, clean up work is always required.

 

Code from forums like Stack Overflow often has published license guidelines that may be a surprise to you. For example, source code from Stack Overflow is licensed under the Creative Commons Attribution-ShareAlike 4.0 license (CC BY-SA 4.0) by default (though individual authors may override that). See https://stackoverflow.com/legal/terms-of-service#licensing and https://meta.stackexchange.com/questions/347758/creative-commons-licensing-ui-and-data-updates for more information.

 

Foul-up: “I’m sure someone is still maintaining this component!”

Deciding to incorporate an open source component into your project is the beginning of a long term relationship. While you are not explicitly owed anything by the open source maintainer, there is often an expectation that vulnerabilities will be fixed, features will be worked on and the project will maintain some sort of line of communication with the community.  In some cases projects die. This might be due to burnout, change in employment, lack of interest or even the maintainer dying!

When selecting a component, you do not want to select a component that is already dead. Look at release cadence, open issues, known vulnerabilities and indicators of response from the maintainer. If a project is already dead or dying, don’t select it unless you are able to shoulder the maintenance of the component for as long as you use it.

Periodically check your components’ health status. If projects die you should replace them or take over maintenance if possible.
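
One of the health signals mentioned above, release cadence, is easy to check mechanically. The following is a minimal sketch, assuming you can obtain the last release date for each component (by hand or from your SCA tool). The component names, dates and the two year threshold are hypothetical and should be tuned to your own risk tolerance.

```python
from datetime import date


def looks_unmaintained(last_release, today, max_age_days=730):
    """Flag a component whose most recent release is older than max_age_days."""
    return (today - last_release).days > max_age_days


# Hypothetical release dates gathered by hand or from your SCA tool.
last_releases = {
    "active-lib": date(2023, 9, 1),
    "quiet-lib": date(2019, 3, 15),
}
today = date(2023, 10, 15)

stale = sorted(name for name, released in last_releases.items()
               if looks_unmaintained(released, today))
print(stale)  # ['quiet-lib']
```

A stale release date is only a prompt for a closer look. Some small, finished libraries are perfectly healthy without frequent releases.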

 

Humans are making these decisions, remember them!

Humans are making the decisions that impact your product every day and it’s important to understand the thinking that goes into these decisions. By understanding the reasons why problems occur you can better put in policies, procedures and products that help prevent serious impacts and future mistakes. Overarching policies that don’t take into account the human factor are doomed to seeing the same mistakes being made over and over again.

Humans make mistakes, but also learn from their mistakes. Talk to the developers and the development teams about what needed to be fixed. Make it part of your retrospectives as well as release checklists. SCA tools allow policies to be enforced automatically, so try to automate as many of these as you can, but also understand the limitations of current tools.

Onboarding training and yearly training should go beyond simplistic “GPL bad! Upgrade often!” slides.  

The more your SBOMs are living documents, the healthier they will be. Your first experience reviewing a SBOM may be scary but with time and experience your company will be healthier and safer. Good luck!

 

Your open source project will outlive you! Will the Future be able to use it? [updated Oct 2023]

[Image: tombstone saying "in loving memory"]

Every decade or so the technology world gets punched in the face by a problem requiring poring through massive amounts of code written well before many of us were born.

In the late 1990s it was the Y2K problem when two digit years were no longer sufficient.

 

In the 2008 financial crisis COBOL-based systems required hand editing in order to change state employee salaries en masse.

 

In 2020 the pandemic-related economic impact required bank and employment systems’ code to be modified, and many of those systems were again written in COBOL!

 

In 2038 we’ll have the Y2K38 problem, when Unix time stored as a signed 32-bit number will overflow, causing software issues akin to Y2K.
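
The arithmetic behind the Unix time overflow can be checked directly. This is a small sketch, assuming time is stored as a signed 32-bit count of seconds since 1 January 1970 UTC.

```python
from datetime import datetime, timedelta, timezone

# Assumption: time is stored as a signed 32-bit count of seconds
# since the Unix epoch (1970-01-01 00:00:00 UTC).
MAX_INT32 = 2**31 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Last moment a signed 32-bit counter can represent.
last_moment = EPOCH + timedelta(seconds=MAX_INT32)

# One second later the counter wraps around to the most negative value,
# which such a system would interpret as a date in 1901.
wrapped_moment = EPOCH + timedelta(seconds=-(2**31))

print(last_moment.isoformat())     # 2038-01-19T03:14:07+00:00
print(wrapped_moment.isoformat())  # 1901-12-13T20:45:52+00:00
```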

 

While many of these systems were closed source and proprietary, open source systems are starting to dominate the software landscape. Much like your grandparents’ hammer, these examples show us that useful tools will almost always outlive the person who first selected or created them.

 

Besides the question of maintainability and programmer experience with very old languages like COBOL, questions of intellectual property and software licensing will complicate the usability of open source software over the next 50 years and beyond.

Think about Open Source Licensing!

As part of software due diligence (when a company purchases another company and confirms that the source code they are buying is correctly owned, licensed and documented) I have had the experience of trying to track down the ownership and licensing of code decades old. This type of software archeology requires access to archives of source code, books, magazines, blogs and other places that programmers have published software over the last 60 years! In many cases the true origin of some source code is lost to time, or can be only partially known.

 

Questions such as “Who wrote this?”, “Did they expect others to freely use this source?”, or “Does a commercial company own this?” are common.

 

It is important to make these types of answers clear for those who come after us.

 

The most important of these is to specify an open source license for the code you are publishing, even if it’s just a “single page” or block of code. If it’s worth putting on the Internet, it’s worth telling people what its license is. There are many suggestions on how to label code to make the copyright and license clear, but I strongly suggest that each file contain a copyright statement and at least an SPDX license identifier (see https://spdx.org/licenses/).
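
As an illustration, a file header carrying both pieces of information can be as short as two comment lines, and a few lines of code are enough to check that a file has them. This is a minimal sketch: the author name is hypothetical, the MIT identifier is just an example, and the helper name `read_header` is my own.

```python
# Copyright 2023 Example Author (hypothetical name, for illustration only)
# SPDX-License-Identifier: MIT

import re

# Matches the short-form tag; the identifier itself should come from
# the SPDX license list (https://spdx.org/licenses/).
SPDX_TAG = re.compile(r"SPDX-License-Identifier:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)
COPYRIGHT = re.compile(r"Copyright\s+(\(c\)\s*)?\d{4}", re.IGNORECASE)


def read_header(text, max_lines=20):
    """Return (license_id, has_copyright) found in the first few lines of a file."""
    head = "\n".join(text.splitlines()[:max_lines])
    match = SPDX_TAG.search(head)
    license_id = match.group(1) if match else None
    return license_id, bool(COPYRIGHT.search(head))


sample = "# Copyright 2023 Example Author\n# SPDX-License-Identifier: MIT\n\nprint('hello')\n"
print(read_header(sample))            # ('MIT', True)
print(read_header("print('hello')"))  # (None, False)
```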

 

This allows someone in the distant future to know who wrote the source code and what the obligations are even if only a single file remains.

 

You may decide to change the license after you die to something less restrictive or even dedicate it to the Public Domain or Open Source equivalent like the Creative Commons 0 license (CC0). See https://creativecommons.org/public-domain/cc0/

Who do you depend on?

Document all your third party dependencies, including dependencies of dependencies. There is no guarantee that any of our current repository managers will still be working decades in the future, but your code may be. By listing these dependencies, you help the Future build and run your code.
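
How you capture this list depends on your ecosystem. As one hedged example, assuming a Python project on Python 3.8 or later, the standard library can snapshot every installed package together with the requirements it declares. The function name and output layout below are my own choices, not a standard format.

```python
from importlib import metadata


def snapshot_dependencies():
    """List every installed distribution with its version and declared requirements."""
    snapshot = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:  # skip broken or partial installs with no metadata
            continue
        snapshot.append({
            "name": name,
            "version": dist.version,
            # `requires` is None when a package declares no dependencies.
            "requires": sorted(dist.requires or []),
        })
    return sorted(snapshot, key=lambda entry: entry["name"].lower())


if __name__ == "__main__":
    for entry in snapshot_dependencies():
        print(f'{entry["name"]}=={entry["version"]}')
        for requirement in entry["requires"]:
            print(f"    requires: {requirement}")
```

Checking the resulting text file into your repository next to the source gives future readers a record of the dependencies that does not rely on any repository manager still being online.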

 

In a similar vein, your build system and running environment should be documented as well. For example, if you depend on a certain make system or database to be installed, call these out in separate build and running environment documentation. 

 

Some projects take care to store away copies of all the source code, tools and environments that they require in order to build and run their projects. A copy of the source code and binaries that are downloaded through repository managers like Maven or NPM may be the only way future generations will be able to build and run your software.

Till Death Do Us Part

In the short term, understand that source code is considered property. What happens after you die should be clearly specified. In most places the ownership of your code will pass on to your heirs, but possibly with complicated and divided ownership.  Do you want anyone in particular to be the new code owner (or someone outside of your family)? In this case your will (or related documents) should make this clear.

 

While everyone should have the permissions available under your open source license for the duration of your copyright, you may wish your heir to have the ability to “own” the code just like you do.

 

By making the ownership clear, they will then have the ability to change the license for your code, just like you likely do now. This means they may have the permission to also sell commercial licenses to this code, or change the open source license of the project. An open source project with multiple contributors has additional concerns about ownership. You may need to make clear dividing lines between projects you own outright as opposed to projects you contribute to, or have others contribute to.

 

Explicit ownership, licensing and future plans should be clear for all resources including source code, images, art work, sounds, documentation and anything else created by humans for the project.

 

Do you want to change the license after your death, or after a certain amount of time?  Make these changes clear as well. Some may want to open previously closed source, or change to a Creative Commons Zero (CC0) license or Public domain declaration.

 

Similar questions may come up in the case of divorce. While it’s often clear who owns code you write when “on the clock at work”, the code you write at home may have complex ownership issues.

 

Go beyond the code!

An additional thing to consider is account access, logins, domain names and payments.

Typically we tell everyone to keep their account information private and secure. This may be at odds with your desire to keep your project going even after your death.

Keeping a list of domain names, third party services and other account information related to your project can help the heirs to your code keep the project going.

Bear in mind that, after your death, certain accounts may be locked, go away or may be controlled by people other than the code owner. Since these may be considered property, clarity around transfer of ownership is extremely important if you wish to keep the project moving forward.

 

As with most things involving intellectual property and life events, it is best to consult a lawyer to understand your best options.

Specifying A Successor for your Online Accounts

 

Some online accounts allow you to specify a Successor or have a specific feature for handing over inactive accounts. For example, GitHub has a Deceased User Policy which allows “next of kin, a pre-designated successor, or other authorized individual (which could include a collaborator or business partner)” to get access to your account after you die.

 

For more information see: https://docs.github.com/en/site-policy/other-site-policies/github-deceased-user-policy

 

GitHub allows you to appoint a “successor”, which makes it easier for that person to legally access the account information. This transfer of accounts is unlikely to affect the legal ownership of the source code and project’s intellectual property. See https://docs.github.com/en/enterprise-cloud@latest/account-and-profile/setting-up-and-managing-your-personal-account-on-github/managing-access-to-your-personal-repositories/maintaining-ownership-continuity-of-your-personal-accounts-repositories#about-successors

 

Google has a similar set of features called “Inactive Account Manager” to allow someone to take control of your Google Gmail and other Google products after your death. See https://support.google.com/accounts/answer/3036546

 

Rest in Peace!

A little care and effort in the present can save the community a significant amount of time in the future. By specifying a license, documenting project dependencies, and clearly transferring ownership you can make sure your code stands the test of time.



Your first SBOM is going to stink. Don’t panic, get started fixing it!

Your first SBOM is going to stink. That means you need to get started now so you can fix it up enough to share it.

 

It seems like everybody but you is showing off their shiny new SBOM. You know you have to get started but you’re worried about what you’re going to find. I’m here to tell you that your first SBOM is going to stink; everybody’s does. If they tell you that it came out perfectly they’re either lying or their SBOM is woefully incomplete. So let’s rip off the Band-Aid, get our scanning tools warmed up and work on getting a SBOM produced that you can stand behind and that won’t embarrass you or get you in trouble.

There are a few common areas that SBOMs will have problems in. These include Completeness, Depth, Unremediated Vulnerabilities, Open Source License Violations, and Over Delivery.

Each of these areas can cause rework, missed deadlines, loss of sales and even legal problems. 

The last thing you want is to deliver a product or a SBOM to your customer and have a previously unknown set of vulnerability and license compliance issues be sent back over to you with a timetable for resolution not of your own setting.

 

Let’s first talk about completeness

What I mean by completeness is that you examined all the code bases that are part of your project, you used a scanning tool that was capable of generating SBOM information for the type of libraries and artifacts you depend on, and that your Software Composition Analysis (SCA) tool is configured correctly in order to produce SBOM information from whatever repository manager you are using. It’s common to get a short SBOM when the SCA tooling is unable to discover the open source in use due to lack of scanning ability or misconfiguration of the tool.

 

What are some questions you can use to gauge completeness?

Do you see artifacts for both your front end and back end in the SBOM or SBOMs? For example, do you see software components written in JavaScript if that is what you are using for your web app’s front end?

Are you seeing a good list of Java components if you are using Java and Maven for your back end?

Bear in mind, you may have open source components in use that are automatically put on your SBOM through the use of a repository manager, and also have artifacts that are not managed by a repository manager that have to be manually incorporated into a SBOM. For example, you might have explicitly copied the source code for a component into your codebase, or you might load the library from a remote web location. Both of these cases require manual effort in order to have an accurate SBOM.

Additionally, ask developers to list some of the large open source dependencies they are aware of. Do you see them in your SBOM? If not, this is a very helpful indication that underscanning of some type is occurring.
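
The developer cross-check described above can be sketched in a few lines. This example assumes a CycloneDX-style JSON SBOM, where components sit in a top-level `components` array with a `name` field. The component names and the function name are hypothetical.

```python
import json


def missing_expected_components(sbom, expected_names):
    """Return the expected component names that do not appear in a CycloneDX-style SBOM."""
    present = {component.get("name", "").lower()
               for component in sbom.get("components", [])}
    return sorted(name for name in expected_names if name.lower() not in present)


# A tiny hand-written SBOM fragment; a real one would come from your SCA tool.
sbom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "spring-core", "version": "5.3.30"},
    {"type": "library", "name": "jackson-databind", "version": "2.15.2"}
  ]
}
""")

# Names your developers listed from memory (hypothetical examples).
expected = ["spring-core", "jackson-databind", "react", "lodash"]

print(missing_expected_components(sbom, expected))  # ['lodash', 'react']
```

In this made-up case both missing names are front end libraries, which would suggest the JavaScript side of the project was never scanned.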

 

Over Delivery in your SBOM

The completeness issue is closely related to the Over Delivery issue. Sometimes your team will generate an SBOM that contains far more information than is appropriate for your individual application. This may be because the projects or directories to scan have been over specified. It may also be because you are using a repository technique called a monorepo, which may contain many sub projects unrelated to the bill of materials that you are expected to deliver. There may be a large directory of third-party artifacts that are required to run every single application in your company, but the project you are concerned about right now only requires a small percentage of those artifacts. Getting a SBOM for a small part of your monorepo may require advanced scanning techniques, such as runtime analysis, in order to best cut away unrelated disclosures.

You may find that there are multiple old distributions of your software checked into scan directories, wildly inflating the artifact count and including artifacts from long dead versions of the application you’re scanning. This may require pruning or excluding directories scanned by your SCA tooling. An indicator of this issue is seeing many copies of the same set of open source libraries differing only in version numbers. The names of the directories that these artifacts are seen in can also hint that this is the issue (e.g. /OldReleases/ or /PreviousVersions).
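
The “same library, many versions” indicator is easy to look for once the SBOM is loaded. This is a minimal sketch over a simplified component list. The `name`, `version` and `path` fields, the sample data and the threshold of three versions are all assumptions for illustration.

```python
from collections import defaultdict


def components_with_many_versions(components, threshold=3):
    """Group SBOM components by name and report names seen in `threshold` or more versions."""
    versions = defaultdict(set)
    for component in components:
        versions[component["name"]].add(component["version"])
    return {name: sorted(found) for name, found in versions.items()
            if len(found) >= threshold}


# Hypothetical scan results: the same library picked up from old release folders.
components = [
    {"name": "commons-io", "version": "2.4", "path": "/OldReleases/1.0/lib"},
    {"name": "commons-io", "version": "2.6", "path": "/OldReleases/2.0/lib"},
    {"name": "commons-io", "version": "2.11.0", "path": "/app/lib"},
    {"name": "guava", "version": "32.1.2-jre", "path": "/app/lib"},
]

print(components_with_many_versions(components))
# {'commons-io': ['2.11.0', '2.4', '2.6']}
```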

You may have scanned test or customer data directories that are part of the QA process and are unrelated to the running of your application or may even be inappropriately disclosing customer relationships or data.

You may have scanned artifacts that are related to the building and development environment of your application, which also may be out of scope for your SBOM delivery (though bear in mind, in the future, build and test environments are likely going to be required as part of SBOM deliveries!)

 

Is your SBOM Deep Enough?

Another very common underdelivery in SBOMs is not scanning “deep enough”, or ignoring Transitive Dependencies. Transitive Dependencies are the dependencies of the dependencies you explicitly request. For example, you might depend on Component A, which in turn depends on Components B, C and D. These three dependencies might not show up explicitly in your Repository Manager configuration files but are resolved at build time and downloaded silently and automatically in the background. Depending on what SCA tool you use, and what settings you have turned on in that tool, you may find yourself not getting a complete list of required third party dependencies. Counting transitive dependencies may increase your visible use of open source anywhere from 2X to 10X!
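
To make the Component A example concrete, here is a minimal sketch of how a resolver walks a dependency graph. The graph is hand-written, and Component E (a dependency of C) is a hypothetical extra level added to show that the walk continues past the first layer.

```python
def resolve_transitive(direct, graph):
    """Return every dependency reachable from the direct ones, excluding the direct ones."""
    seen = set(direct)
    stack = list(direct)
    while stack:
        current = stack.pop()
        for child in graph.get(current, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return sorted(seen - set(direct))


# Component A is the only dependency declared in the build file.
graph = {
    "A": ["B", "C", "D"],
    "C": ["E"],  # hypothetical deeper level
}
direct = ["A"]

transitive = resolve_transitive(direct, graph)
print(transitive)  # ['B', 'C', 'D', 'E']
print(len(direct), "declared,", len(direct) + len(transitive), "actually used")
```

One declared dependency turns into five components in use, which is the kind of gap a shallow scan leaves in your SBOM.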

 

Have you Resolved All(?) Your Vulnerabilities?

Now that you have a complete SBOM you will need to examine it for security and compliance problems. Top of mind for many organizations is the vulnerability status of each of the third party dependencies in their SBOM. There are many philosophies and more and more legal requirements in terms of defining how to resolve these vulnerabilities. The simplest process is to update all components so that there are no known vulnerabilities visible in the SCA scan. This may be difficult to get done in a timely manner, or may be impossible due to lack of available fixes. That said, many customers are going to expect a CVE-free SBOM even when it is not possible to deliver one.

Other philosophies of vulnerability management include performing runtime or reachability analysis. This means a SCA or similar tool will attempt to see if vulnerable components or buggy subcomponents are actually used or reached during the running of the application. A successful resolution of a vulnerability can be a statement that this vulnerability is not valid for your use case since that code is never used or reached during runtime.

Delivering a Clean SBOM may be possible with additional information explaining why known vulnerabilities do not affect your application. This may be due to not being in reachable code, not valid due to your runtime environment, or due to not being valid vulnerabilities in the first place. This is often the beginning of a discussion with your customer who may have additional questions or even pushback on your opinion. A common way of delivering this information is through a manual spreadsheet or through the use of a VEX document. See https://cyclonedx.org/capabilities/vex/ for more information on VEX.
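
As a rough idea of what such a statement looks like, the sketch below builds a single “not affected” entry in the style of a CycloneDX VEX document. The field names (`analysis`, `state`, `justification`, `affects`) follow my reading of the CycloneDX vulnerability schema and should be verified against the specification for the version you use. The component reference and the justification text are hypothetical.

```python
import json


def not_affected_statement(cve_id, component_ref, justification, detail):
    """Build one VEX-style entry saying a known CVE does not affect the product."""
    return {
        "id": cve_id,
        "analysis": {
            "state": "not_affected",
            "justification": justification,
            "detail": detail,
        },
        "affects": [{"ref": component_ref}],
    }


vex = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "vulnerabilities": [
        not_affected_statement(
            cve_id="CVE-2023-38545",
            # Hypothetical component reference for illustration.
            component_ref="pkg:generic/curl@8.3.0",
            justification="code_not_reachable",
            detail="The application never enables a SOCKS5 proxy, "
                   "so the vulnerable code path is not used.",
        )
    ],
}

print(json.dumps(vex, indent=2))
```

Whatever format you choose, the useful part is the pairing of a specific vulnerability with a specific component and a reason a customer can evaluate.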

 

Have you Resolved Any Open Source License Violations?

It is very common to see a large number of Open Source License Violations when running a SCA scan for the first time on a codebase. Some distribution models are more affected by license issues than others. For example, if you are distributing an application to end users or delivering a piece of hardware, there are many open source licenses you need to comply with.

If you are running a piece of software under a Software as a Service (SaaS) model, there is likely no classic distribution of software, so many of the open source licenses will have no compliance requirements (with some notable exceptions like the AGPL license!).

In the distribution model, are you paying attention to any embedded Operating Systems like Linux?  If you are an IoT or embedded device product, this is extremely important to get correct. 

The most serious license violations are typically issues like GPL violations, where your organization is not complying with the terms of the General Public License (e.g. not sharing your application source code when making a distribution).

Your organization should create Open Source License Use Policies for each of your distribution models and use cases. In many cases your SCA tool can help with the enforcement of these policies and create reports of policy violations.
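
A per-distribution-model policy can be expressed as plain data that a script or SCA tool then enforces. The following is a minimal sketch only. The policy contents, component names and function name are hypothetical, and a real policy needs legal review.

```python
# Hypothetical policy: which SPDX license IDs are denied per distribution model.
# These entries illustrate the idea only; a real policy needs legal review.
POLICY = {
    "distributed": {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only"},
    "saas": {"AGPL-3.0-only"},
}


def policy_violations(components, distribution_model):
    """Return (name, license) pairs whose SPDX license ID is denied for this model."""
    denied = POLICY[distribution_model]
    return [(c["name"], c["license"]) for c in components if c["license"] in denied]


components = [
    {"name": "libfoo", "license": "MIT"},
    {"name": "libbar", "license": "GPL-3.0-only"},
    {"name": "libbaz", "license": "AGPL-3.0-only"},
]

print(policy_violations(components, "distributed"))
# [('libbar', 'GPL-3.0-only'), ('libbaz', 'AGPL-3.0-only')]
print(policy_violations(components, "saas"))
# [('libbaz', 'AGPL-3.0-only')]
```

The same component list produces different results under each model, which is why a single company-wide policy is rarely enough.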

Other issues are not creating license notice files, not putting copyright statements in about boxes, and other required attributions.

There may be other legal requirements (that technically may not be open source requirements) but are discovered during this analysis phase. These may be restrictions on certain types of commercial or business use, quasi-commercial terms, or even advertising requirements!

Additionally, you may find Commercial components embedded in your product which contain their own SBOMs and open source usage that may not be discoverable through the use of SCA tools. You in turn may have an open source license compliance, vulnerability and SBOM management conversation with your upstream vendor in order to be compliant with your downstream vulnerability and open source license compliance needs.

Open Source and Commercial Legal compliance is a complex topic and is worth the time to understand what is appropriate for your business and distribution model. Explicit legal advice is often warranted!

 

Putting it All Together

Once you have started using SCA tools, reviewing your SBOMs and then delivering them, you will start exercising a business process that makes future SBOM delivery easier. One of the biggest causes of stress around SBOM creation and delivery is the fear of the unknown and the lack of knowledge on how to deal with problems. This is a perfect time to create an Open Source Program Office (OSPO) or at least a working group with similar knowledge and responsibilities. Building institutional knowledge on tooling, vulnerability management, open source license compliance and SBOM requirements goes a long way to making your business able to deal with the current and future regulations and contractual obligations regarding SBOMs. Good luck, and get started!

Curl is seen everywhere except your SBOM. Why is it missing even though you use it?

What is curl?

curl is an open source command line tool and embeddable library for transferring data over a network. It is one of the most popular and well known open source projects and has over 20 billion installations according to its author Daniel Stenberg. It’s licensed under the curl license, which is similar to the MIT license. Its latest version is 8.4.0 as of October 10, 2023 and it’s hosted on the web at https://curl.se/

 

Why are we talking about it?

Recently a high severity vulnerability was reported in the project. This vulnerability is tracked in the National Vulnerability Database using the ID CVE-2023-38545. See https://curl.se/docs/CVE-2023-38545.html  

 

There was a lot of chatter in the lead up to the public release of the vulnerability details, but in the end it affected fewer configurations than the early buzz warranted. That said, much like the log4j vulnerability a few years ago, it could have been possible to have a very serious zero day vulnerability in a widely used open source component.

 

Let’s just look for curl in our SBOM and get on with our day!

Unfortunately, it’s not that easy for components like curl. There are many components that are easily found with scanners and Software Composition Analysis (SCA) tools, but curl is not one of them. It is not easily found due to the programming language it is written in and the languages that are often used to embed it into larger projects.

 

Why does SCA have trouble finding curl?

The most common SCA scanning products traditionally look at information provided by repository managers like Maven or NPM. Repository Managers are tools for automatically downloading and installing open source libraries as part of the build process. SCA tools reformat this repository information into the traditional SBOM formats. Additionally, these SCA tools will add information such as known vulnerabilities, project health information and updated license metadata. Some languages such as Java, JavaScript, Python and Go have popular repository managers, so their SBOMs are easily created using quick, lightweight SCA scanning.

 

On the other hand, languages such as C and C++ do not commonly use repository managers to handle their third-party dependencies. This means it requires much deeper, slower and sometimes human-based analysis in order to discover and manage third-party dependencies. Right now it is very difficult, if not impossible, to get SBOMs when scanning C and C++ applications, especially when they are built from source code.

 

curl and libcurl are very often compiled into C and C++ projects and unless a human explicitly puts them in a SBOM you will not know that they are in use. 

 

Where might curl be hiding?

 

As mentioned before curl is an immensely popular and successful open source project and is embedded in untold thousands of commercial and open source components. It’s also embedded multiple times in almost every Operating System! Let’s walk through some of the most common places you will find curl.

 

Operating Systems: curl is embedded in almost every operating system. Updates to fix the curl vulnerability will almost certainly be released for currently supported versions of these operating systems. Older versions of OSes will likely remain unpatched and potentially vulnerable.

Do you have dedicated devices with operating systems that do not get updated? Do any require manual intervention to upgrade?

 

IoT devices: It is extremely common for IoT devices to have dependencies like libcurl in order to download system updates or other network operations. If no upgrades are available, it may be worth a conversation with the IoT vendor to understand their current SBOM and patch process.

 

Virtual Machines (VMs): VMs are a way of packaging up an entire operating system and a set of applications in order to run multiple virtual computers on a single piece of hardware. A VM looks like a real computer running a standard operating system and will likely have multiple copies of curl and libcurl bundled with the OS, libraries and running applications. You will be unlikely to receive a SBOM for a VM and the applications inside of it. The OS will have one set of dependencies, the required system level services will have another and finally the application will have its own independent SBOM. All of which should be reviewed and updated as needed. If no SBOM is available, use this exercise as the push to make one. 

 

Containers: Containers are a special lightweight method of running applications bundled with all their dependencies. While they are different from a Virtual Machine, it may be helpful to think of them like a VM. It is very common to see curl or libcurl as dependencies in a container, and in fact, this is one of the places where we will see curl automatically discovered and put on a SBOM through container scanning using tools like Syft and Grype (https://github.com/anchore/syft and https://github.com/anchore/grype ). Even if you see one or more copies of curl mentioned in your container’s SBOM, there are likely many other undisclosed copies of curl hiding in the operating system and applications running in the container. The curl seen in the SBOM is likely a system level service explicitly requested by the person who put the container together, and these container scans may only be looking at top level components.

 

Command line tools and Scripts: It is very common for applications to make external calls to command line tools, like the curl command line, in order to perform updates or remote download functions. These dependencies are often overlooked when putting together a SBOM and are almost never found through SCA scanning.

 

Commercial Products and their OSS dependencies: A commercial product may or may not have a SBOM or open source license disclosure. If it does, take a look for curl, libcurl or daniel@haxx.se in the SBOM or open source license file. Again, any disclosed curl may only be one of many actual curl dependencies in a large project. 

 

Open Source Projects: It is very common to see other open source projects use curl for internal network communication and downloads. Sometimes these projects will disclose their use of curl in a SBOM or Open Source license file, but in many cases they will not let end users know.

 

Wrappers for curl in other ecosystems: It is very common for other programming language ecosystems to create “wrappers” for curl in the native programming language and ship a compiled version of curl or libcurl to provide the actual functionality. If you are using a language like Java, Python, Go, etc. and you see curl mentioned as an open source project name, this project is likely a wrapper from a different group that either depends on a local version of curl or bundles an independent version of curl. These might require separate upgrades for each wrapper, and for each system level installation of curl. 

 

Strings that indicate that curl is being used in a product

If you see these strings in a SBOM or open source license file, they are great indicators that curl is being used in a product or project.

curl

libcurl

daniel@haxx.se

https://curl.se/libcurl/

http://curl.haxx.se/libcurl/
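
A search for these strings can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for an SCA scan: the function name find_curl_indicators is an assumption, matching is a simple case-insensitive substring test, and a word such as "curly" would produce a false positive.

```python
# Indicator strings from the list above, lowercased for case-insensitive matching.
INDICATOR_STRINGS = (
    "curl",
    "libcurl",
    "daniel@haxx.se",
    "curl.se/libcurl",
    "curl.haxx.se/libcurl",
)


def find_curl_indicators(text):
    """Return (line_number, matched_indicators) for each line that mentions curl."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        matched = [s for s in INDICATOR_STRINGS if s in lowered]
        if matched:
            hits.append((line_number, matched))
    return hits


if __name__ == "__main__":
    # Hypothetical excerpt from an open source license disclosure file.
    sample = "Apache Commons\nlibcurl 7.88.1\nCopyright (c) Daniel Stenberg, <daniel@haxx.se>\n"
    for line_number, matched in find_curl_indicators(sample):
        print(line_number, matched)
```

The same function can be pointed at the text of a SBOM, a license file or individual source files.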

 

Questions to ask your team to help uncover usage of curl

  • Are we doing any automatic downloads in our product? What tools do we use?
  • Does the system patch or upgrade or update itself? What library is it using to do so?
  • Do the manual or installation instructions mention curl as a dependency or pre-condition for use of the project? 
  • Is the curl RPM (or equivalent) required to install or build the project?
  • Does the product do web scraping or downloading of web site resources? If so, what library is used to perform these functions?
  • Do we see curl or libcurl or any variation of that name in our SBOM or license disclosures?
  • Do we see the email daniel@haxx.se anywhere in our license disclosures?
  • What do we see if we grep for libcurl, daniel@haxx or other curl strings in our codebase?
  • Does our product require a container to run in? Have we run a SCA scan of the container?

 

Use this experience to understand what visibility you are getting with your current SBOMs and SCA scans

Every day we get a better understanding of our use of open source and third party software through the use of SBOMs and SCA scanning. There is still a long way to go before we get complete visibility of every product’s SBOM though. This is due to the newness of this process, the complexity of how software is packaged and delivered, and the limitations of current SCA products. curl is used everywhere, but due to how it is packaged and the programming language ecosystems it is used in, it (and other C/C++ dependencies) is not showing up in the SBOMs we review to keep our companies and projects safe and updated. Use the questions in this guide and the areas where tools like curl might be found to help understand the current weaknesses in SBOM completeness and to get ahead of the next vulnerability!

 

You are the dog that caught the car: Handling the SBOM you asked for!

 

 

We all think we have more time.

 

One way or another, you are soon going to receive an email telling you that the Software Bill of Materials (The SBOM!) that you asked for is ready for you. Maybe it’s coming from a vendor, maybe it’s an internal project, maybe it’s from your own team. You suddenly have a very important document to review, and it’s hard to even know where to begin. 

 

A big cause of paralysis in the security world is not knowing what to do next, especially with a seemingly buzzword heavy issue like this. Certainly everyone else knows what they are doing, but where do I even start?

 

I’m going to walk you through the basics of SBOMs, how to view them, some good first questions and where to go next.

 

What did you just receive?

 

A Software Bill of Materials (SBOM) is a listing of the third party software components that a software project uses in order to function. Typically this list of components consists of open source packages and libraries, but may also contain commercially licensed components and possibly components with licenses that are neither Open Source nor Commercial and may require further review.

 

We use this list of components to understand the software dependencies of this project, use it to identify potential security vulnerabilities, find end of life and unsupported software components, discover components that can be supported with money or software contributions, as well as discover other architecture or support issues.

 

This listing should include at least the name of the software component, its version and possibly information about the license it is released under. Beyond these basics, you may find additional pieces of information for these components such as Project URL, description, known vulnerabilities, etc. Since exchange formats and OSS scanners are still being defined, you may find yourself with varying levels of disclosure with varying levels of data quality and completeness.

 

The first thing to get a handle on is what type of SBOM you have received. There are a few different file formats and mechanisms for SBOM sharing, and more appearing every day.

 

What type of formats might you have received?

 

There are three main formats for SBOM sharing right now: CycloneDX, SPDX and free text files of varying complexity and origin. Typically, these files are designed to be both human and machine readable, though it seems like the machines often have an easier time of it!

 

CycloneDX (https://cyclonedx.org/) is a file format created by the OWASP Foundation. You will know you have a CycloneDX file if your partner tells you that is the format they will be giving you, or possibly if the file is named bom.json or bom.xml, or its name ends with .cdx.json or .cdx.xml.

 

SPDX (https://spdx.dev/) is a file format created through the Linux Foundation. You will know you have a SPDX file if your partner tells you that is the format they will be giving you, or if the file name ends with .spdx, .spdx.json or .spdx.rdf.xml.

 

Free Text, CSV or Excel Files are traditional text or spreadsheet files that contain SBOM information in one-off or less common SBOM formats. They may be created by a tool or a human and are often designed for human review instead of computer processing. 
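
The file name hints above can be turned into a small first-pass check. The sketch below is only a guess based on the file name; the function name guess_sbom_format and the .txt and .xlsx extensions are assumptions added for illustration, and a renamed file will defeat it.

```python
import os


def guess_sbom_format(filename):
    """Guess the SBOM format from the file name alone, using the hints above."""
    name = os.path.basename(filename).lower()
    if name in ("bom.json", "bom.xml") or name.endswith((".cdx.json", ".cdx.xml")):
        return "CycloneDX"
    if name.endswith((".spdx", ".spdx.json", ".spdx.rdf.xml")):
        return "SPDX"
    # .txt and .xlsx are assumed here in addition to the CSV and Excel files named above.
    if name.endswith((".csv", ".xls", ".xlsx", ".txt")):
        return "free text or spreadsheet"
    return "unknown"


if __name__ == "__main__":
    for example in ("bom.json", "product.spdx.json", "components.csv", "report.pdf"):
        print(example, "->", guess_sbom_format(example))
```

When the guess comes back as unknown, asking your partner which format they sent is still the most reliable route.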

 

All of these file formats will contain information about the software components in use. Many of the files will be “self documenting”, meaning they will have Field Names (like Component Name or Version) near the data you are reading, or in a traditional spreadsheet format will have Column Names for each piece of data. 

 

In a JSON, XML or free text file, component data often is spread out over multiple lines of text.

 

In a spreadsheet, each row is often a single component, where each column is the component’s metadata (e.g. name, version, etc…)
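
For the spreadsheet case, a minimal Python sketch of reading one component per row might look like the following. The column names Component Name, Version and License are hypothetical; a real file will use whatever headers its author chose.

```python
import csv
import io


def read_csv_sbom(text):
    """Return one dict per component row, keyed by the column names in the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


if __name__ == "__main__":
    # Hypothetical CSV export with one component per row.
    sample = "Component Name,Version,License\ncurl,8.4.0,curl\nopenssl,3.0.11,Apache-2.0\n"
    for component in read_csv_sbom(sample):
        print(component["Component Name"], component["Version"], component["License"])
```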

 

How to view and process the SBOM

 

The easiest way to get insights from the SBOM you just received is to run it through a SBOM scanner tool like Bomber. Bomber is a free and open source tool that can provide information about known vulnerabilities and license information for the open source components found in the supplied SBOM. Bomber can handle CycloneDX files in either JSON or XML format, SPDX SBOMs in JSON format, as well as Syft JSON SBOM files. If you have a file in a different format, you can use a free tool to convert it to one of these, or ask your partner to resubmit it in a format you can handle.

 

See https://github.com/devops-kung-fu/bomber for installation and usage information. If using command line tools is new to you, this might be a perfect time to call one of your developers to work together.

 

Examining a SBOM file by hand (if not using a tool like Bomber)

While using a tool is much easier, it is possible to examine the SBOM files using a text editor and picking it apart by hand.

 

When looking at the JSON or XML files themselves in a text editor, you can find the component name, URL, version information and license information. For example, in CycloneDX the following tags are found near the information of interest:

 

“name”   (the component name)

“version”  (the component version)

“bom-ref” (an identifier for the component within the SBOM, often a package URL or similar locator for the component in question) 

“license”  (The license or license options for the component)

 

The license tag may be after a stretch of “hashes” or IDs used to describe the files that make up the component. 
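
If you have a CycloneDX file in JSON format, the same fields can be pulled out with a short script. The following is a minimal sketch, assuming the usual layout of a top-level "components" list with an optional "licenses" list on each component. The function name list_components and the sample data are made up for illustration, and nested components are ignored.

```python
import json


def list_components(sbom_text):
    """Return name, version, bom-ref and license labels for each top-level component."""
    bom = json.loads(sbom_text)
    rows = []
    for component in bom.get("components", []):
        licenses = []
        for entry in component.get("licenses", []):
            license_info = entry.get("license", {})
            # An entry may carry an SPDX "id", a free-text "name", or an "expression".
            label = license_info.get("id") or license_info.get("name") or entry.get("expression")
            if label:
                licenses.append(label)
        rows.append({
            "name": component.get("name"),
            "version": component.get("version"),
            "bom-ref": component.get("bom-ref"),
            "licenses": licenses,
        })
    return rows


if __name__ == "__main__":
    # Hypothetical one-component SBOM used only to exercise the function.
    sample = json.dumps({
        "bomFormat": "CycloneDX",
        "components": [{
            "type": "library",
            "name": "struts2-core",
            "version": "2.3.31",
            "bom-ref": "pkg:maven/org.apache.struts/struts2-core@2.3.31",
            "licenses": [{"license": {"id": "Apache-2.0"}}],
        }],
    })
    for row in list_components(sample):
        print(row)
```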

 

Using this information, you can do a web search to find vulnerability information. For example, if you found Struts 2.3.31, you could search for “Struts 2.3.31 cve” and find that this version of the component is affected by the vulnerability known as CVE-2017-5638 (see https://nvd.nist.gov/vuln/detail/CVE-2017-5638 ).

 

CycloneDX

 

For a deeper description of the CycloneDX SBOM format see https://cyclonedx.org/guides/sbom/OWASP_CycloneDX-SBOM-Guide-en.pdf

 

SPDX

 

For a deeper description of the SPDX SBOM format see https://spdx.github.io/spdx-spec/v2.3/

 

Many of these SBOM documents can be read in a standard text file viewer, or in a worst case, a word processor application. If the file is jumbled together or is in one long line, you will want to find a more powerful text viewer that can better handle line breaks or special characters. Free tools like Visual Studio Code ( https://code.visualstudio.com/ ) can view Text, JSON and XML files. You may need to reformat the text if it is all in one line or jumbled together. In Visual Studio Code, open the Command Palette and select Format Document. The file should now be more readable to a human.
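
If an editor is not at hand, a JSON SBOM that arrives as one long line can also be reformatted with a few lines of Python. This is a minimal sketch using only the standard json module; the function name pretty_print_json is an assumption.

```python
import json


def pretty_print_json(text):
    """Return the same JSON document re-indented with two spaces per level."""
    return json.dumps(json.loads(text), indent=2)


if __name__ == "__main__":
    one_line = '{"name":"curl","version":"8.4.0"}'
    print(pretty_print_json(one_line))
```

The content is unchanged; only the line breaks and indentation are added.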

 

There exist JSON and XML file viewers which can make these files prettier to see and more useful to search or view.

 

Additional utilities are being released to support the use and viewing of SBOM documents in CycloneDX and SPDX formats.

 

A CSV or .XLS document can be opened in a spreadsheet application like Excel or Open Office.

 

Now That You Can View the SBOM, What Are You Looking For?

One thing you can do is put it in a drawer! The very act of asking for an SBOM does a lot to kick the vendor into managing their third party risk. While this process works best if you ask them questions or give some pushback, asking for an SBOM allows them to say internally “our customer is asking for this, we need to do SBOM generation, SCA scanning, OSS patch management, etc.”

 

That said, you have it, let’s go get some value out of it!

 

This might be the point to bring in a developer if you are not familiar with open source libraries. There are a few things you can do on your own or you might find it is helpful to work together to understand the SBOM you just received.

 

There are a few questions we use SBOMs to help answer when looking at a piece of software:

 

  • Does the SBOM seem legitimate? Can you view it, read through it, see real data?
  • Is it recently created? When was it generated? What version of the project was scanned? (e.g. is the SBOM wildly out of date?)
  • Does the SBOM only contain “Top Level Dependencies” or does it include the dependencies of those dependencies, also known as Transitive Dependencies? This could be a difference of 3-10 times the number of actual dependencies seen!
  • Can you find a few well known open source components and check their versions against the National Vulnerability Database (NVD)? ( https://nvd.nist.gov/vuln/search )
  • Are there well known vulnerabilities in the codebase? (e.g. old versions of Log4j, OpenSSL, Apache HTTP Server, Apache Tomcat, Apache Struts)
  • Does the list of component versions seem “too old”? Are all reported vulnerabilities from years ago? (e.g. CVE-2017-5638 in Struts)
  • Are there Open Source Licenses that might cause a problem for you? Do you see licenses like the General Public License or Affero General Public License which might be contrary to your company’s license policy? (This can be complicated since some parts of your company may happily use GPL software in Linux Operating Systems but may forbid it in distributed applications.)
  • Does it seem complete? Is it missing important information like version information? 
  • What software languages are seen? Do you see what you expect? Java libraries? NPM libraries? Is something missing?
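
Several of these questions can be given a rough first answer by a script. The following minimal Python sketch assumes a CycloneDX JSON file with the usual "metadata", "components" and "dependencies" sections. The function name summarize_sbom and the sample data are made up for illustration, and a "dependencies" section being present only suggests, not proves, that transitive dependencies were recorded.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def summarize_sbom(sbom_text, now=None):
    """Return a few basic sanity-check facts about a CycloneDX JSON SBOM."""
    bom = json.loads(sbom_text)
    components = bom.get("components", [])
    now = now or datetime.now(timezone.utc)

    # How old is the SBOM? metadata.timestamp is an ISO 8601 string when present.
    age_days = None
    timestamp = bom.get("metadata", {}).get("timestamp")
    if timestamp:
        created = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        if created.tzinfo is None:
            created = created.replace(tzinfo=timezone.utc)  # assume UTC when no zone is given
        age_days = (now - created).days

    # Which ecosystems are present? The package URL starts with "pkg:<type>/".
    ecosystems = Counter()
    for component in components:
        purl = component.get("purl", "")
        if purl.startswith("pkg:"):
            ecosystems[purl[4:].split("/", 1)[0]] += 1

    return {
        "component_count": len(components),
        "age_days": age_days,
        "missing_versions": [c.get("name") for c in components if not c.get("version")],
        "ecosystems": dict(ecosystems),
        "has_dependency_graph": bool(bom.get("dependencies")),
    }


if __name__ == "__main__":
    # Hypothetical two-component SBOM used only to exercise the function.
    sample = json.dumps({
        "metadata": {"timestamp": "2024-01-01T00:00:00Z"},
        "components": [
            {"name": "log4j-core", "version": "2.14.1",
             "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"},
            {"name": "left-pad", "purl": "pkg:npm/left-pad"},
        ],
    })
    print(summarize_sbom(sample))
```

A very small component count, a long list of missing versions or an ecosystem you expected but do not see are all good prompts for a conversation with whoever produced the SBOM.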

 

 

 

Pushing back or Requesting More Information

After processing or examining the SBOM you may have some questions or feedback for the team that supplied it to you. Typically you might request more information about the most severe security vulnerabilities or license issues found in the report. There’s a lot of discussion about how customers and suppliers can best work together to share and respond to SBOM and vulnerability questions. In general, especially if this is your first experience with SBOMs, you might find the most value in letting your supplier know you’ve run the SBOM through a vulnerability tool and you have some questions about what you are seeing. The idea is to gently (and perhaps later on, not so gently) work with your partner to reduce exposure to known vulnerabilities, and to get them to better explain to their customers or end users why they are or are not affected. As you get more experience, you may find that providing 3 to 5 clear concerns can help your partner start to get a handle on your expectations, as well as chip away at the worst problems. For example, if you see that the software contains old high severity vulnerabilities in Log4j, curl, OpenSSL, etc., this might be a sign that they have not been using SCA scanning or good vulnerability management practices.
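
Narrowing a long scanner report down to a handful of concerns can be as simple as sorting by severity. The sketch below assumes each finding is a dictionary with a numeric "score" field (for example a CVSS base score); the function name top_concerns, the field names and the sample findings are all placeholders, not real vulnerability data.

```python
def top_concerns(findings, limit=5):
    """Return the `limit` findings with the highest severity score, worst first."""
    return sorted(findings, key=lambda finding: finding["score"], reverse=True)[:limit]


if __name__ == "__main__":
    # Placeholder findings; the identifiers and scores are invented for illustration.
    findings = [
        {"component": "component-a", "id": "EXAMPLE-1", "score": 5.3},
        {"component": "component-b", "id": "EXAMPLE-2", "score": 9.8},
        {"component": "component-c", "id": "EXAMPLE-3", "score": 7.5},
        {"component": "component-d", "id": "EXAMPLE-4", "score": 3.1},
    ]
    for finding in top_concerns(findings, limit=3):
        print(finding["id"], finding["component"], finding["score"])
```

Severity alone is a blunt instrument, so it is worth reading the short list by hand before sending it to your supplier.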

 

Feedback from the Supplier

 

In general, throwing a list of hundreds of problems back to a vendor will not be well received, especially if you are new to SBOM reviews. That said, getting feedback on 5-10 of the worst of the worst can give you a good sense of whether they are managing their supply chain well. Many vulnerabilities may be present in an open source library, but not affect the software as you use it. A company should be able to clearly explain why they think they are not affected. “Trust me Bro!” is usually not a satisfying answer though. There should be clear explanations. For example, a good answer might be something like “This reported CVE only affects this component when run under the Windows operating system, and in this case we are using Linux”. 

 

As time goes on the SBOM you receive from this supplier should contain fewer vulnerabilities, a more complete listing of third party dependencies, as well as explanations on why potential vulnerabilities seen in the codebase are not valid for their current usage.

 

 

Keep Requesting High Quality SBOMs

 

As mentioned before, one of the best side effects of requiring a SBOM to be delivered to you is that the team responsible for creating the SBOM will now put in place Software Composition Analysis (SCA) scanning tools, CVE/vulnerability patch management, and processes to create, fix and deliver up to date SBOM information to you. A better understood product is a more secure product. The more SBOMs you see, the more that quality issues will pop out to you. Keep reviewing and keep giving and demanding strong feedback!



Open Source License Location Alignment Chart

 

Where’s the open source license?

Lawful Good: SPDX Identifier at the top of each file
Lawful Neutral: in a LICENSE file at the top level of the source tree
Lawful Evil: at the bottom of each file

Neutral Good: on the project’s home page
Neutral Neutral: on the project’s Wikipedia page
Neutral Evil: as a reply to a GitHub issue asking for the license text

Chaotic Good: available as output of a python script
Chaotic Neutral: author states no license applies since code was written in a country with no copyright law
Chaotic Evil: in a scanned image in a TIFF file only found on the WayBack Machine