What is a GitHub Gist?
A GitHub “Gist” is a webpage used to share snippets of code or a single file, up to 1MB of source or text. It is commonly used to share examples, small utilities or documentation. Similar sites are commonly referred to as “pastebins”.
Why would you use the code from one?
It’s common for a developer to search for code to implement a small procedure, for example, “reverse the letters in a string” or “check for the existence of a file on disk”. These small pieces of code are not seen as large or complicated enough to be named and hosted as an open source project or have an entire web site devoted to them.
More complicated code is likely to be found as well. Single page utilities or programs can be encountered, often with embedded documentation in comments, or on a web page pointing to the Gist.
You may also see Gists used to host data listings or research. Common examples of this are ID numbers of products or the code used to demonstrate a bug or exploit.
Why does it need a license?
In a nutshell, you often need a software license to use other people’s source code or resources in your project. A license (commercial or open source) explains the conditions under which you may use that software; without one, you likely do not have permission to use it.
The question of what requires a license, and how large or complex a piece of source code must be before it does, is beyond the scope of this blog entry. In general, source code cut and pasted from somewhere else may require a license in order to be legally used.
How do Gist authors show off their licenses?
Unfortunately most Gist authors FAIL to clearly show what license is attached to the source code they are sharing.
The most helpful and accurate way for a Gist author to declare their license is to put the license text in the source on the Gist itself. Typically this would be at the top of the file in a comment block and would contain the copyright date and owner if required by the license.
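As a rough sketch of what such a header could look like, here is a small Python file carrying the full MIT license text in a comment block at the top. The year and the name “Jane Example” are placeholders invented for illustration, as is the reverse_string function; the SPDX-License-Identifier line is an optional, widely used shorthand and not something this post requires.

```python
# Copyright (c) 2024 Jane Example  (placeholder year and owner)
# SPDX-License-Identifier: MIT
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.


def reverse_string(text):
    """Return the characters of text in reverse order."""
    return text[::-1]
```

Because the license travels inside the file, anyone who copies the snippet copies the terms along with it.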
You may also see a short one-line comment such as:
This code is licensed under the terms of the MIT license
This is helpful but not complete: you don’t have a copyright date or copyright owner, but you at least have a general idea of the licensing style the author prefers.
In some cases the Gist author places a note somewhere in their GitHub site or homepage that declares the default license for their Gists. They may use text such as “The default license for all public Gists I publish is the following:” and then put the name or text of the license.
While this is far better than nothing (and does reduce perceived clutter in published Gists) it does break the connection between the source and the license that it was published under. It makes it harder for a consumer of that source to bring along the accurate licensing information when using the code.
What if it doesn’t have a license and you want to use it?
If a file, snippet or other content does not have a declared license, it is good practice to reach out to the original author and ask what license the content is available under. You may want to suggest a license that you are comfortable with in order to short-circuit any back and forth. For example, you may want to send an email like so:
Dear Jeff,
I found the code you published on your GitHub Gist site at https://gist.github.com/jeff-luszcz/c470d282599ea42424b976c673d7c115
It currently does not appear to have a license associated with it. I would like to use this code but only if it has an open source license. Would you be able to let me know what license this code is under? I’m a big fan of the MIT license if you are looking for suggestions. (See https://choosealicense.com/licenses/mit/ )
Thanks!
While the author does not owe you a response, don’t be surprised if you get a helpful answer in a couple of days or so.
How should you preserve or document licenses to the code you use from a Gist?
As a developer, one of the best things you can do for yourself and for others who use your code is to document the licensing and origin of all third party code you use. This helps you legally share the projects you build, and also helps your users comply with open source licenses and stay ahead of security problems in the third party code you selected.
The best time to document things is when you have the data initially right in front of you.
If need be, cut and paste the licensing and origin information and attach it to the code you are using. For example:
# code snippet taken from https://gist.github.com/jeff-luszcz/c470d282599ea42424b976c673d7c115
# licensed under an MIT license as per the information on http://www.example.com/jeff-gists-license-info which says:
# All my gist code is licensed under the terms of the MIT license
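If you record this information often, a small helper can keep the comment format consistent. The following is a minimal sketch and not a standard tool: the function name attribution_header and its parameters are made up for illustration, and it simply reproduces the layout of the comment above.

```python
def attribution_header(source_url, license_name, evidence_url, evidence_quote,
                       comment_prefix="# "):
    """Build a comment block recording where a snippet came from and its license.

    All parameter names are hypothetical; adjust the wording to your project.
    """
    lines = [
        f"code snippet taken from {source_url}",
        f"licensed under the {license_name} license as per the information "
        f"on {evidence_url} which says:",
        evidence_quote,
    ]
    # Prefix every line so the block can be pasted directly above the snippet.
    return "\n".join(comment_prefix + line for line in lines)


header = attribution_header(
    source_url="https://gist.github.com/jeff-luszcz/c470d282599ea42424b976c673d7c115",
    license_name="MIT",
    evidence_url="http://www.example.com/jeff-gists-license-info",
    evidence_quote="All my gist code is licensed under the terms of the MIT license",
)
print(header)
```

Passing a different comment_prefix (for example "// ") lets the same helper produce headers for other languages.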
What are some caveats or warnings about using code from Gists?
One of the questions people often have about code snippets is “Where did this code come from originally?” Is the code in the Gist a bug fix to some Apache code? Is it originally from the Linux kernel and copied out as an example? Is it something from a commercial SDK that wasn’t publicly available on the net?
If so, the original author’s licensing may have been stripped (often accidentally, but sometimes purposely) by the person who published the Gist.
It may be difficult to track this information down, and it may not be something the original author can even remember. Through the use of software composition analysis (SCA) tools or source code fingerprinting databases, you may discover earlier origins of code. In that case you may need to update your licensing or remediate (fix, remove, etc.) the snippet.
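To give a feel for the fingerprinting idea, here is a deliberately simplified sketch. It assumes a fingerprint is just a SHA-256 hash of the snippet with whitespace normalized, and that KNOWN_ORIGINS is a tiny hypothetical index. Real SCA tools and fingerprint databases use far more robust matching, so treat this as an illustration of the concept and not of how any particular product works.

```python
import hashlib


def fingerprint(source):
    """Hash a snippet after collapsing whitespace and dropping blank lines."""
    normalized_lines = []
    for line in source.splitlines():
        collapsed = " ".join(line.split())
        if collapsed:
            normalized_lines.append(collapsed)
    normalized = "\n".join(normalized_lines)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical index mapping fingerprints to a known origin and license.
KNOWN_ORIGINS = {
    fingerprint("def reverse_string(text):\n    return text[::-1]\n"):
        "example-project (MIT)",
}


def lookup_origin(source):
    """Return the recorded origin for a snippet, or None if it is unknown."""
    return KNOWN_ORIGINS.get(fingerprint(source))


# The same code with different spacing still matches the recorded origin.
copied = "def reverse_string(text):\n\n        return   text[::-1]"
print(lookup_origin(copied))
print(lookup_origin("print('hello')"))
```

An exact hash like this only survives changes in whitespace; renamed variables or reordered lines would defeat it, which is one reason dedicated tools exist.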
Wrap up and next steps
If you get in the habit of documenting the licensing of all third party content you use, including snippets, you will find yourself in a much better position in the future when your code is used by other people or projects. It is always easier to get licensing right at creation time than to go back later and become a license or source code detective.