Countless bytes have been wasted on tutorials and Internet debates (i.e. flame wars) on which Git workflow is the best, when to merge or rebase, how many lines of code per commit make for the best review experience, and so on.
What I’ll attempt to do in this post, in the fewest bytes possible, is describe a simple, orthogonal Git workflow, designed for projects with a semi-regular release cadence, and built around a pre-release feature freeze.
This is not intended to be the end-all guide to workflow nirvana, but rather a collection of idioms that have been applied successfully throughout the lifecycles of various projects.
Additionally, the final two chapters contain advice and pointers on merging strategies and commit standards. If all that sounds interesting, please read on.
Using Gitflow as a starting point, we simplify certain concepts and entirely discard others. Thus, we specify two main, permanent branches:
The master branch represents code released to production (for releases with final release tags) and staging (for releases tagged -rc). We’ll talk more about release tags in a second, but it’s important to understand that the tip of master should always be pointed to by a tag, -rc or otherwise.
Figure 1.0 - The master branch tagged at various points.
Commits are never made against master directly, but are rather made as part of other branches, and merged into master when we wish to deploy a new tag. Merge strategies are described below, and apply to all code moved between branches.
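The rule that the tip of master is always tagged can be checked mechanically. The following is a minimal sketch, run in a throwaway repository; the sandbox path, identity and tag name are illustrative assumptions rather than part of the workflow.

```shell
set -e
# Throwaway sandbox repository; every name below is illustrative.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example" && git config user.email "example@example.com"
git commit -q --allow-empty -m "Initial commit"
git tag v.0.1.0-rc1

# List the tags pointing at the tip of master; an empty result means the rule is broken.
tags="$(git tag --points-at master)"
if [ -n "$tags" ]; then
    echo "master is tagged: $tags"
else
    echo "master has untagged commits" >&2
fi
```

A check like this could run in CI or a hook, although where to enforce it is a project decision.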
The develop branch is the integration point for all new features which will eventually make their way into master.
Figure 1.1 - The develop branch, merging back into master.
As with master, no commits should be made against develop directly, but should rather be part of ephemeral feature branches. The use of such branches is described below.
While the above two branches are permanent (i.e. should never be removed), they only serve as integration points for code built in ephemeral, or temporary branches. Of these we have two, each serving a distinctly different purpose, and having different semantics.
Feature Branches (feature/)
Feature branches are where most of the work in a project happens, and are always opened against, and merged back into, the develop branch. What constitutes a feature is fairly broad, but essentially covers any code that is not a bugfix for an issue that exists in current master.
Figure 2.0 - Branching and merging feature branches.
Feature branch names follow a naming convention of feature/XXX_yyy, where XXX refers to the ticket number opened against the work (if any), while yyy is a short, all-lowercase, dash-separated description of the work done. A (perhaps contrived) example would be:
git checkout develop      # This will be the base branch for our feature.
git pull origin develop   # Always a good idea to branch off the latest changes (assumes a remote named origin).
git checkout -b feature/45_implement-flux-capacitor
The rules behind merging of feature branches back into develop are project-specific, but most teams would have the code go through peer review and possibly a CI pass before merging. However, it is intended that projects implement a semi-regular (or at least predictable) release schedule, in which case features that are intended to appear in the upcoming release will have to be merged into develop before the feature freeze starts.
Once the feature freeze starts and develop is merged back into master and tagged as an -rc, the team is free to merge feature branches into develop.
While most rules are meant to be broken, the ones described above (as loosely defined as they are) fit into the versioning strategies employed, and as such will benefit by being followed as closely as possible.
Certain fixes, however, cannot wait for the next release, or are designed to fix breaking issues present in the master branch. For those, we have the following.
Bugfix Branches (bugfix/)
Bugfix branches are intended to contain the bare minimum amount of code required for fixing an issue present on the master branch, and as such are always opened against, and merged back into, master.
Figure 2.1 - Merging bugfix branches between master and develop.
Naming conventions and code acceptance rules are identical to those for feature branches, apart from the bugfix/ prefix applied. Bugfixes are not subject to feature freezes or release schedules.
For bugs that appear on both master and develop, the bugfix branch may, optionally, be merged into develop as well, which has the additional benefit of reducing divergence between the two branches. Why does this matter? Read below.
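The bugfix flow described above could look roughly like the following sketch, run in a throwaway repository. The ticket number, branch description, file name and commit message are made up for illustration.

```shell
set -e
# Throwaway sandbox repository with the two permanent branches.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
git commit -q --allow-empty -m "Initial commit"
git branch develop

# Open the bugfix branch against master (ticket number and description are made up).
git checkout -q -b bugfix/46_fix-flux-overflow master
echo "fix" > flux.txt
git add flux.txt
git commit -q -m "Flux: Fix overflow in capacitor charge"

# Merge the fix back into master...
git checkout -q master
git merge -q bugfix/46_fix-flux-overflow

# ...and, optionally, into develop to keep the two branches from diverging.
git checkout -q develop
git merge -q bugfix/46_fix-flux-overflow
```

In this sandbox both merges are fast-forwards, since neither branch has moved since the bugfix branch was opened.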
Versioning and Tagging Schemes
So now you have a bunch of code on develop waiting to be released. How do we go about doing that? Imagine the following, two-week (i.e. ten working day) release schedule.
Days 1 - 6: Feature Development
Cycle starts, with feature development commencing immediately. Features are opened against develop, peer-reviewed, tested, and eventually merged back into develop according to the release manager/team lead/maintainer’s directions. Large features ready for merging towards the end of the window may be left un-merged in order to better test and/or avoid any latent issues.
Days 7 - 9: Feature Freeze/Pre-Release Bugfixing
Merge window closes, with any features left unmerged making their way into the next cycle’s release. This is also called a “feature freeze”.
The develop branch is merged into master, and an -rc tag corresponding to the next feature version is created. So, for instance, if the last version tagged against master was v.2.9.3, this tag is to be v.2.10.0-rc1. This tag is then pushed to a staging server and tested by all means available.
Any bugs we inevitably find are fixed in bugfix branches opened against master, and merged as soon as the fixes have been verified on the branches themselves. A subsequent -rc (e.g. v.2.10.0-rc2) release is tagged whenever we wish to push a new, fixed version to staging.
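As a rough sketch, the feature freeze and the release candidates that follow it might translate into the commands below. Everything runs in a throwaway repository; empty commits stand in for real work, and the bugfix branch name and commit messages are invented for the example.

```shell
set -e
# Throwaway sandbox repository; empty commits stand in for real work.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
git commit -q --allow-empty -m "Initial commit"
git tag v.2.9.3                                        # last released version
git checkout -q -b develop
git commit -q --allow-empty -m "Flux: Add capacitor"   # stands in for merged features

# Feature freeze: merge develop into master and tag the first release candidate.
git checkout -q master
git merge -q develop
git tag v.2.10.0-rc1

# A bug found on staging is fixed in a bugfix branch opened against master.
git checkout -q -b bugfix/47_fix-capacitor-leak master
git commit -q --allow-empty -m "Flux: Fix capacitor leak"
git checkout -q master
git merge -q bugfix/47_fix-capacitor-leak
git tag v.2.10.0-rc2

# With a remote configured, the new tag would then be published, e.g.:
#   git push origin v.2.10.0-rc2
```

How a pushed tag ends up on the staging server depends on your deployment tooling and is outside the scope of this sketch.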
Day 10: Release Day
Release day! Hopefully we’ve had enough time to thoroughly test the new version, and as such are ready to tag and push a final version, v.2.10.0, to production. We make another round of testing on production and get ready for the next cycle (or release drinks).
We will eventually find bugs in production that weren’t uncovered by our testing on either feature branches or master. The strategy we follow differs slightly depending on which phase of the next cycle we’re on.
Before the Feature Freeze
Since master is still in a pristine state, merging bugfixes back into master is a simple matter of opening a bugfix branch, merging that in, and tagging a new bugfix release version (e.g. v.2.10.1 for the above example) as soon as we’re ready to push to production.
Figure 3.0 - Tagging a bugfix before the feature freeze.
After the Feature Freeze
The situation is slightly complicated by the fact that master now contains code that we’re not ready to release to production, and as such cannot be tagged directly. However, the workflow for opening a bugfix branch remains the same, as the issue will most likely exist in master, even with the additions from develop.
The most elegant way of solving this issue is opening a new “release” branch against the last stable tag, which will serve as the integration point for all relevant bugfix branches. The naming convention we’ll use for this branch is the major version for the release we’re branching off, i.e.
Once the branch has been created, we’re free to merge in all relevant bugfix branches, test locally, and tag the new version against this branch.
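A minimal sketch of the release-branch approach follows, run in a throwaway repository. The branch name release-2 is a hypothetical placeholder, and the sketch assumes the bugfix branch is cut from the release branch so that it carries none of the unreleased code; both are assumptions of this example, as are the ticket number and commit messages.

```shell
set -e
# Throwaway sandbox repository; empty commits stand in for real work.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
git commit -q --allow-empty -m "Initial commit"
git tag v.2.10.0                                   # last stable release
git commit -q --allow-empty -m "Flux: Add turbo mode"
git tag v.2.11.0-rc1                               # master now holds unreleased code

# Hypothetical branch name; substitute whatever your naming convention dictates.
release_branch="release-2"

# Open the release branch against the last stable tag, not the tip of master.
git checkout -q -b "$release_branch" v.2.10.0

# Assumption: the bugfix branch starts from the release branch.
git checkout -q -b bugfix/48_fix-time-drift
git commit -q --allow-empty -m "Flux: Fix time drift"

# Merge the fix into the release branch and tag the bugfix release there.
git checkout -q "$release_branch"
git merge -q bugfix/48_fix-time-drift
git tag v.2.10.1
```

The resulting v.2.10.1 tag contains the fix but none of the commits that only exist in the release candidate.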
Figure 3.1 - Tagging a bugfix after feature freeze.
This is the only case where a branch other than master is tagged, and as such constitutes an extraordinary measure.
A Note About Versioning
Both situations above require us to tag new release versions. Normally, we’d tag an initial -rc version, after which we’d push to staging and test. Whether this is necessary or not for bugfixes is debatable, and is left to be decided on a case-by-case basis. However, the convention of -rc to staging, final version to production, remains constant in all situations.
Countless debates exist on rebasing vs. merging, and whether to squash commits or not. Realizing that, in most cases, personal preference plays the largest role in choosing a strategy, the following sections may appear debatable, so please, take them with a grain of salt and apply them as needed. However, we’ll try to provide as much rationale as possible, while exploring alternatives in order to better understand the reasons behind our choices.
It is also important to understand that the following sections only apply to public code, i.e. anything that has been pushed to a remote. Nobody but you knows whether you squashed your 15 commits into 1 just before you pushed your code to a public repository somewhere. Regardless of the above, it often pays off to use the same strategies both offline and online, for reasons explained below.
General rules that apply to all strategies: merge with fast-forward, avoid squashing, avoid rebasing.
Merging Between Branches
Merging code between develop and various feature branches, as well as between develop and master, is one of the most common day-to-day operations, so let’s cover each case individually.
Once a feature has been peer reviewed and tested as a unit, and provided the feature freeze window is still open, a feature may be merged back into develop.
Choosing to merge instead of rebasing is based on the following rationale: the state of the work in the feature branch is a direct result of the point in develop it was branched off. For long-lived feature branches, this meta-information is an important aspect of understanding the design choices behind the feature work.
Additionally, rebasing disrupts the linear nature of history, that is to say, commits may appear to be behind ones that were made further in the past, but which were rebased into develop afterwards. This makes reasoning about the history harder (for instance, when wanting to bisect based on the knowledge that develop was in a “good” state on some specific date).
Rebasing may also lead to the loss of information concerning how a feature evolved in time, especially when a feature had to be refactored in response to changes made in develop (more on how these changes are brought into the feature branch in the following chapter).
In most cases, we’re only really concerned with the latest version of a feature. A commit introducing some code that is superseded by a following commit in the same feature branch may appear to be irrelevant since it never really touches upon the state of develop at the time of merging.
However, such information is important in a historical sense. It may be that the code was refactored in response to an outside event, such as a different feature being merged or product decisions being made behind the scenes. Such information may be relevant in the future, even if the code itself was never strictly part of any release.
The general idea is that anything done with intent, that is to say, manually, should be preserved in the state in which it was made.
Merging develop into a Feature Branch
In several cases, you may need to synchronize your feature branch with develop, for instance, when fixing merge conflicts.
Again, choosing to merge instead of rebasing is based on the general idea that actions with intent should be preserved. This is especially true when working on a public branch, but holds for private branches as well.
Imagine the following scenario: You’re working on a feature branch for implementing image-uploading functionality in a CMS product. The functionality is close to complete, when a refactor of the underlying Image class is merged into develop.
You, of course, can no longer merge your code as-is, and will have to change it in response to the refactor. You’re given two choices: either merge develop into the feature branch, fix any conflicts, and continue to add commits for refactoring any remaining functionality, or rebase the feature branch on top of develop and make it appear as if you were working with the refactored code from the beginning.
There are several reasons why merging provides benefit, especially in the long-term. One is, of course, that your refactor may be relevant to someone (including yourself) in the future. It may be as a pointer for refactoring other, similar features, or may help when attempting to debug issues that did not exist prior to the refactor.
Another, perhaps more esoteric reason is: fixing merge conflicts can go wrong. You may accidentally choose the wrong part of a conflict, or not merge the changes correctly, or add a typo somewhere that would not exist otherwise. When rebasing, it will appear as if these errors were part of the original design. When merging, these errors will appear as part of the merge commit, and as such can be traced back with greater ease.
The proliferation of merge commits is the most common reason for choosing to rebase rather than merge, but cases like this demonstrate the value of preserving merge commits, both for their content and as meta-information: this was the point where you needed to refactor your feature; this is the point you merged your feature into develop.
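The image-upload scenario can be replayed in a throwaway repository to see what the merge commit preserves. This is only a sketch: the branch name, file names and commit messages are invented, and the two branches touch different files so that no conflict needs resolving.

```shell
set -e
# Throwaway sandbox repository; file names and messages are invented.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop

# Start the feature, then let develop move on underneath it (the refactor).
git checkout -q -b feature/49_image-upload
echo "upload" > upload.txt && git add upload.txt
git commit -q -m "Upload: Add image upload form"
git checkout -q develop
echo "refactored" > image.txt && git add image.txt
git commit -q -m "Image: Refactor class internals"

# Synchronize by merging develop into the feature branch instead of rebasing.
git checkout -q feature/49_image-upload
git merge -q --no-edit develop

# The merge commit records exactly when the refactor reached the feature.
git log --oneline --merges
```

Any follow-up commits adapting the feature to the refactor would then sit after that merge commit, which keeps the cause and the reaction visible in the history.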
The same general rules for merging between feature branches and develop apply here as well: merge with fast-forward, do not squash.
It may be that, due to bugfix branches being applied to master alone and not develop, the two will diverge. This, however, should not complicate matters much, as in most cases develop is a strict super-set of master. Merging bugfix branches on both master and develop can help alleviate any future problems, and is the preferred strategy.
Again, the same rules as with merging between feature branches and develop apply. As stated above, we should also merge all bugfixes into develop, even when the bug no longer applies, in order to eliminate divergence.
Notes About Fast-Forwarding and Squashing Commits
When merging features or bugfixes, we choose to fast-forward our branch relative to the base branch, for various reasons, most notably, the fact that feature and bugfix branches are intended to be ephemeral, and can (and should) be pruned regularly. The reason why we treat these branches as such is related to how we treat commits, and is explained further below.
Choosing not to fast-forward makes bisecting and reasoning about the history harder, while providing dubious benefits, especially since pull requests (on, for example, GitHub or Bitbucket) continue to exist even if the underlying branch has been deleted.
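One way to make the fast-forward rule explicit is the --ff-only flag, which makes Git refuse the merge if a merge commit would be needed. The sketch below runs in a throwaway repository; the branch name, file and commit message are illustrative, and using --ff-only rather than the default merge behaviour is a choice made for this example.

```shell
set -e
# Throwaway sandbox repository; names are illustrative.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop

# A feature branch that is already up to date with develop.
git checkout -q -b feature/50_add-thumbnails
echo "thumbs" > thumbs.txt && git add thumbs.txt
git commit -q -m "Image: Add thumbnail generation"

# --ff-only refuses to merge unless develop can simply be moved forward.
git checkout -q develop
git merge -q --ff-only feature/50_add-thumbnails

# The feature branch is ephemeral, so prune it once merged.
git branch -q -d feature/50_add-thumbnails
```

If develop has moved on in the meantime, the merge is refused and the feature branch has to be synchronized with develop first.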
It may appear, from the above, that the most important aspects of our workflow lie within our branching and merging strategies. However, this is not entirely true.
The smallest monad in any Git repository is the commit, which also makes it the most important aspect of our workflow. Maintaining a clean history depends largely on the quality of each individual commit pushed, and keeping the quality consistent is hard and requires buy-in from every individual team member.
Squashing commits is the antithesis of maintaining consistent quality – why would you squash commits that have been prepared with such diligence? Several other reasons apply, as explained in the following sections.
The following chapters outline several rules on creating good commits.
Naming and Messages
Perhaps the easiest rule to implement, and the one providing the most benefits for the least amount of effort, is standardizing on naming conventions for commit messages. The advice below echoes conventions followed by quite a few large repositories, including the Git repository for the Linux kernel itself, but is nevertheless worth repeating:
Commit titles should be prepended with the file name or subsystem they affect, be written in the imperative starting with a verb, and be up to 60 characters in length.
So, applying the aforementioned rules, we have two examples:
The Get method of the Image class now fetches files asynchronously
Image: Refactor method “Get” for asynchronous operation
The reasons are manifold: prepending the name of the subsystem helps in understanding where the work is happening at a glance. Using the imperative and starting with a verb makes the title read naturally when you mentally place the following sentence before it: “applying this commit will…”. Lastly, the choice of limiting the title to 60 characters may appear archaic, but it helps in being more terse.
All commits should be accompanied by a commit message (separated from the title by two consecutive newlines), ideally containing the rationale behind the changes made within the commit, but minimally the name of the ticket this work is attached to, which will most certainly be useful to you at some point in the future. For example:
Image: Refactor method “Get” for asynchronous operation
Fetching images from the remote image repository is now asynchronous, in order to allow for multiple images to download concurrently. This change does not affect the user-facing API or functionality in any way.
Using a standard syntax for relating commits to ticket numbers helps with finding them using git log --grep.
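As a small sketch of how this might look in practice, the commit below carries a title, a body and a ticket reference, and is then found again with git log --grep. The "Ticket: #45" line is a made-up convention for this example; any syntax works as long as it is applied consistently.

```shell
set -e
# Throwaway sandbox repository.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

# Each -m flag becomes its own paragraph, separated from the others by a blank line.
# The "Ticket: #45" line is a made-up convention for this sketch.
git commit -q --allow-empty \
    -m "Image: Refactor method \"Get\" for asynchronous operation" \
    -m "Fetching images from the remote image repository is now asynchronous." \
    -m "Ticket: #45"

# Find every commit attached to ticket 45.
git log --oneline --grep="Ticket: #45"
```

Note that the pattern is a plain substring match here, so it would also match a hypothetical ticket #450.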
What to Commit and When
We don’t always have the ability or knowledge to foresee the final, completed state of work needed in order to implement a feature or fix a bug. As such, most work is driven by whatever idea we have about the code at the moment, and can therefore change rapidly.
The standard rule for choosing what to include in a commit is this: every commit should represent a single, individually reversible change to the codebase. That is to say, related work, work that builds on top of itself in the same branch, should be part of the same commit.
As an example, in the course of implementing the asynchronous image operations described above, you find a bug in the same file but a different, unrelated method.
This bugfix and the feature work done should appear in two separate commits, for the simple reason that we should be able to revert a buggy feature without sacrificing unrelated bugfixes made in the course of building that feature.
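To illustrate why the separation pays off, the sketch below commits a feature change and an unrelated bugfix to the same file separately, then reverts only the feature. The file contents, method names and commit messages are invented stand-ins for real code.

```shell
set -e
# Throwaway sandbox repository; file contents are stand-ins for real code.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

# One file with two unrelated "methods", far enough apart that their changes do not overlap.
printf '%s\n' "Get: sync" "a" "b" "c" "d" "e" "f" "Resize: broken" > image.txt
git add image.txt && git commit -q -m "Image: Add Get and Resize methods"

# Commit 1: the feature work.
sed 's/^Get: sync$/Get: async/' image.txt > image.tmp && mv image.tmp image.txt
git commit -q -am "Image: Refactor method \"Get\" for asynchronous operation"
feature_commit="$(git rev-parse HEAD)"

# Commit 2: the unrelated bugfix found along the way.
sed 's/^Resize: broken$/Resize: fixed/' image.txt > image.tmp && mv image.tmp image.txt
git commit -q -am "Image: Fix method \"Resize\""

# Reverting the feature leaves the bugfix intact.
git revert --no-edit "$feature_commit" >/dev/null
cat image.txt
```

Had both changes been squashed into a single commit, the same revert would have taken the Resize fix out along with the feature.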
The tools we use will largely affect what our commits look like: GitHub now allows for better control over reviewing specific commits. Gerrit allows commits to be grouped into patch-sets, which can be reviewed and reworked as separate entities (which would usually either require a rebase or a new pull request). Other tools only allow reviewing the latest version of a branch as a whole.
Pushing for clear boundaries between commits, especially in the face of ever-changing requirements, and the fact that in most cases, you’d only ever revert an entire pull request/branch and not the individual commits themselves, may appear to be a losing battle.
The easiest way to deal with these issues is at the time of review: if the commits are too big (over a couple hundred lines of code) and do not appear cohesive, reviewing the code is that much harder, and will eventually lead to inferior code quality and/or bugs falling through the cracks.
Various concepts have been presented, some harder to implement than others. If there is one take-away, please allow it to be this: it’s better to be consistent than to be correct, and it’s better to be simple than comprehensive.
Rules that are not orthogonal to one another are harder to implement and follow consistently, so keep that in mind when choosing which battles to fight.
The graphics in this post have been generated using Grawkit, an AWK script which generates git graphs from command-line descriptions.