Using Submodules for Shared Code
Our UI team is constantly juggling projects — both new and existing — and how we handle shared dependencies between these projects is critical to keeping up the pace of our development. The go-to solution for handling these dependencies? Submodules. Here’s why:
Intro to Submodules
Git submodules are repositories embedded in other repositories. Say you have a local repository that you are currently working in but need to use the code from your other repositories. Submodules are a way to include the external codebase of another repository into your local repository. By using submodules, you no longer have to import and publish shared code every time you make some changes, thereby speeding up development. For comparison, think about a folder system on your computer. You could have a folder existing deep within other folders on your computer, but instead of having to dig through those folders every time to access it, you can create a shortcut on your desktop to access it immediately. Submodules act as that shortcut.
How does one use and maintain submodules?
To add a submodule to your local repository, use `git submodule add <repository> [path]`. The submodule’s content then lives in your local repository’s folder structure and can be referenced directly from your local repository’s code. The submodule is still treated as a separate repository, so any changes you make to its code need to be committed within the submodule first. The local repository stores only the commit hash of each submodule (plus its URL and path in `.gitmodules`), not the submodule’s code.
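As a minimal sketch of the above, the script below builds two throwaway repositories and adds one to the other as a submodule. The names `shared-ui`, `app`, and `vendor/shared-ui` are hypothetical, `git init -b` assumes Git 2.28 or newer, and the `protocol.file.allow` override is only there because the demo clones from a local path, which recent Git versions restrict for submodules by default.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; every repository name below is hypothetical.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# 1. A stand-in for the repository that holds the shared code.
git init -q -b main "$work/shared-ui"
echo 'export const greeting = "hi";' > "$work/shared-ui/index.js"
git -C "$work/shared-ui" add index.js
git -C "$work/shared-ui" commit -q -m "Initial shared code"

# 2. The local repository that will consume the shared code.
git init -q -b main "$work/app"
cd "$work/app"

# 3. Add the submodule. The -c override is needed only because this
#    demo uses a local path instead of a real remote URL.
git -c protocol.file.allow=always submodule add "$work/shared-ui" vendor/shared-ui
git commit -q -m "Add shared-ui submodule"

# 4. The parent records a commit hash (a mode 160000 entry), not the files.
git ls-tree HEAD vendor/shared-ui

# 5. The URL and path are kept in .gitmodules.
cat .gitmodules
```

The `git ls-tree` line is the hash the local repository stores for the submodule; the files under `vendor/shared-ui` belong to the submodule’s own repository.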
Say somebody else just pushed changes to both the local repository and a submodule repository. After pulling the local repository, run `git submodule update` to synchronize your submodules to the committed hashes. This checks out, in each submodule, the specific hash recorded in the local repository. Note, however, that the local repository records a commit for each submodule, not a branch. This means that each submodule will be in a detached HEAD state after running the update command. So, before you continue coding, be sure to check out the correct branch within each submodule.
Lastly, to initialize a freshly cloned repository that uses submodules, run `git submodule init` after `git clone`. This registers the submodules listed in `.gitmodules` in your local git configuration. You will then need to run `git submodule update` to clone the submodules and populate them with content. The two steps can be combined as `git submodule update --init`.
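The clone, init, and update sequence can be sketched end to end in a sandbox. Everything named here (`shared-ui`, `app`, `clone`, the `main` branch) is a hypothetical stand-in, `git init -b` assumes Git 2.28 or newer, and the `protocol.file.allow` override is only needed because the demo clones from local paths.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Setup (hypothetical names): a shared repository and an app that embeds it.
git init -q -b main "$work/shared-ui"
echo 'shared styles' > "$work/shared-ui/styles.txt"
git -C "$work/shared-ui" add styles.txt
git -C "$work/shared-ui" commit -q -m "Initial shared code"
git init -q -b main "$work/app"
git -C "$work/app" -c protocol.file.allow=always submodule add "$work/shared-ui" shared-ui
git -C "$work/app" commit -q -m "Add shared-ui submodule"

# A teammate clones the app: the shared-ui directory exists but is empty.
git clone -q "$work/app" "$work/clone"
cd "$work/clone"

# Register the submodules listed in .gitmodules in the local git config...
git submodule init
# ...then clone them and check out the recorded commits.
git -c protocol.file.allow=always submodule update

# The submodule is now on a detached HEAD, so symbolic-ref finds no branch.
git -C shared-ui symbolic-ref -q HEAD || echo "shared-ui is on a detached HEAD"

# Check out a branch before committing any new work in the submodule.
git -C shared-ui checkout -q main
git -C shared-ui symbolic-ref --short HEAD
```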
Are Submodules the Only Way?
There are many other ways to manage shared code. Here’s a comparison of those methods:
- Imported libraries: This is the approach most familiar to many developers. It uses separate codebases, which means maintaining different versions and dependencies. Any changes to shared code need to be snapshotted or published before they can be used anywhere else. While some developers prefer this more manual approach, it takes more time and is more tedious than working with submodules.
- Monorepos: This is when you have one “monolithic” repository that contains every piece of code that you have. The advantage over imported libraries is that changes made to shared code do not need to be published anywhere; all projects in the monorepo can use the new changes immediately. The problem, though, is that every commit you make is a commit to the whole repository, even if you only modified one project. The git history quickly becomes long and messy, making it hard to untangle which changes were made for which features. Every project in the codebase also has to be versioned together unless more complicated tooling is built to manage it otherwise.
- Submodules: Given the above options, submodules offer the best of both worlds. You can have separate versioning while also having the shared code in the same folder structure as your local repository. It looks like everything lives in one repository, but each submodule keeps its own git history and can be versioned independently, because each local repository can point to a different hash of its submodules. Submodules do add some maintenance complexity of their own, but in the long run they will save you significant amounts of time.
How We Use Submodules at Tapad
At Tapad, we have one main submodule that contains our UI framework built on top of Angular, our SCSS framework, documentation, and the build process with its related scripts. Other shared code is broken out by business area or functionality so that we can keep our feature sets organized. Our local repositories are individual UIs that share the same components and styles and have very limited communication, if any, across applications. Each local repository has the main submodule and a variety of other submodules depending on what is required.
The workflow we have implemented when working with submodules is an extension of git-flow.
The steps for creating a new feature are:
- Create a feature branch in the local repository and all submodules that will be affected.
- Implement the feature and make the requisite commits in the affected repositories.
- Open a pull request in each repository separately. (Unfortunately, GitHub does not have good tooling for reviewing submodule changes.)
- After the code is approved, merge each feature branch into the development branch for the local repository and all submodules.
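The feature steps above can be sketched in a sandbox with one local repository and one submodule. The repository name `shared-ui`, the branch names `develop` and `feature/button`, and the file `button.txt` are all hypothetical; the pull request step happens outside git and appears only as a comment. The sketch also shows one step the list leaves implicit: after committing inside a submodule, the local repository has to record the new hash with `git add <submodule path>`.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Setup (hypothetical names): both repositories start on a develop branch.
git init -q -b develop "$work/shared-ui"
echo v1 > "$work/shared-ui/button.txt"
git -C "$work/shared-ui" add button.txt
git -C "$work/shared-ui" commit -q -m "Initial shared code"
git init -q -b develop "$work/app"
cd "$work/app"
# The -c override is needed only because the demo clones from a local path.
git -c protocol.file.allow=always submodule add "$work/shared-ui" shared-ui
git commit -q -m "Add shared-ui submodule"

# 1. Create the feature branch in the local repository and the affected submodule.
git checkout -q -b feature/button
git -C shared-ui checkout -q -b feature/button

# 2. Commit inside the submodule first, then record its new hash in the parent.
echo v2 > shared-ui/button.txt
git -C shared-ui commit -q -am "Restyle button"
git add shared-ui
git commit -q -m "Use restyled button"

# 3. Pull requests are opened and reviewed per repository (not shown).

# 4. Merge each feature branch into develop, submodule first.
git -C shared-ui checkout -q develop
git -C shared-ui merge -q feature/button
git checkout -q develop
git merge -q feature/button

# The parent's recorded hash now matches the submodule's develop branch.
git submodule status
```

Both merges here are fast-forwards. If a merge creates a new merge commit in the submodule, the local repository needs one more `git add` and commit to point at it.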
When cutting a release, it is important to have all the submodules in a state that is easy to return to in case of hotfixes. The steps for this are:
- In the local repository and each submodule, merge the development branch into the production branch.
- Tag all the commits with semantic versioning.
- Push everything, including the tags.
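A release along these lines might look like the sketch below. The branch names `develop` and `production`, the version `v1.1.0`, and the bare repositories standing in for remotes are all assumptions made for the demo. The submodule is pushed before the local repository so that the hash the local repository records already exists on the submodule’s remote.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Setup (hypothetical names): bare repositories play the role of remotes.
git init -q -b production "$work/seed"
echo v1 > "$work/seed/button.txt"
git -C "$work/seed" add button.txt
git -C "$work/seed" commit -q -m "Initial shared code"
git clone -q --bare "$work/seed" "$work/shared-ui.git"
git init -q --bare "$work/app.git"
git init -q -b production "$work/app"
cd "$work/app"
git remote add origin "$work/app.git"
# The -c override is needed only because the demo clones from a local path.
git -c protocol.file.allow=always submodule add "$work/shared-ui.git" shared-ui
git commit -q -m "Add shared-ui submodule"

# Unreleased work sitting on develop in both repositories.
git checkout -q -b develop
git -C shared-ui checkout -q -b develop
echo v2 > shared-ui/button.txt
git -C shared-ui commit -q -am "Restyle button"
git add shared-ui
git commit -q -m "Use restyled button"

# 1. Merge develop into production, submodule first.
git -C shared-ui checkout -q production
git -C shared-ui merge -q develop
git checkout -q production
git merge -q develop

# 2. Tag both repositories with a semantic version.
git -C shared-ui tag -a v1.1.0 -m "shared-ui 1.1.0"
git tag -a v1.1.0 -m "app 1.1.0"

# 3. Push branches and tags, submodule first.
git -C shared-ui push -q origin production v1.1.0
git push -q origin production v1.1.0
```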
In those rare situations where a hotfix is needed 😀, the steps are:
- In the local repository, create a hotfix branch off of the production branch.
- Run `git submodule update`, which checks out in each submodule the commit recorded by the production branch (this commit should be tagged).
- In each affected submodule, create a hotfix branch off of that commit.
- Implement the hotfix and make the requisite commits.
- Merge the hotfix branches to their production and development branches.
- Tag all the changed production branches.
- Push everything, including the tags.
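The hotfix steps can be sketched the same way. The names (`shared-ui`, `hotfix/1.0.1`, `v1.0.0`, `v1.0.1`, `card.txt`) are hypothetical, and the sandbox has no remotes, so the merge back into development and the final push appear only as comments. The key moment is `git submodule update`, which moves the submodule from its in-progress development commit back to the commit that was released.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Setup (hypothetical names): v1.0.0 is released from production...
git init -q -b production "$work/shared-ui"
echo v1 > "$work/shared-ui/button.txt"
git -C "$work/shared-ui" add button.txt
git -C "$work/shared-ui" commit -q -m "Initial shared code"
git -C "$work/shared-ui" tag -a v1.0.0 -m "shared-ui 1.0.0"
git init -q -b production "$work/app"
cd "$work/app"
# The -c override is needed only because the demo clones from a local path.
git -c protocol.file.allow=always submodule add "$work/shared-ui" shared-ui
git commit -q -m "Add shared-ui submodule"
git tag -a v1.0.0 -m "app 1.0.0"
# ...and develop has since moved on in both repositories.
git -C shared-ui checkout -q -b develop
echo wip > shared-ui/card.txt
git -C shared-ui add card.txt
git -C shared-ui commit -q -m "Start card component"
git checkout -q -b develop
git add shared-ui
git commit -q -m "Track card work"

# 1. Create the hotfix branch off production in the local repository.
git checkout -q -b hotfix/1.0.1 production

# 2. Put the submodule back on the commit production recorded.
git -c protocol.file.allow=always submodule update
git -C shared-ui describe --exact-match   # names the release tag at that commit

# 3. Branch off that commit inside the affected submodule.
git -C shared-ui checkout -q -b hotfix/1.0.1

# 4. Fix and commit in the submodule, then record the new hash in the parent.
echo v1-fixed > shared-ui/button.txt
git -C shared-ui commit -q -am "Fix button"
git add shared-ui
git commit -q -m "Hotfix: pick up fixed button"

# 5. Merge into production and tag, submodule first.
git -C shared-ui checkout -q production
git -C shared-ui merge -q hotfix/1.0.1
git -C shared-ui tag -a v1.0.1 -m "shared-ui 1.0.1"
git checkout -q production
git merge -q hotfix/1.0.1
git tag -a v1.0.1 -m "app 1.0.1"

# 6. The hotfix branches are then merged into develop and everything is
#    pushed, tags included (omitted: this sandbox has no remotes).
```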
We have built some additions on top of the submodules system to improve our workflow. We created scripts to manage our boilerplate local code so that when we need to create a new project, we can get a UI up and running easily. We have also written scripts to synchronize npm dependencies and their versions. Each submodule contains a `package.json`-like dependencies map that gets merged with the local repository’s dependencies map to generate the local repository’s `package.json`. We do this so that we know which npm dependencies each submodule requires.
In conclusion…
By using submodules to manage shared code within our UI projects, we are able to work quickly and efficiently. We can work on multiple UI projects at the same time while adding new features to the same shared codebase. We have also built tooling around the submodules to help us maintain synchronized npm dependencies across all of our UI projects.
This post also appears on our new Medium blog, here.