Rewriting history

Aug 21, 2023

The ability to rewrite history is arguably the most powerful feature of Git. But it's also the biggest cause of issues, particularly when working within a team.

Luckily, there are ways to do it effectively. If you understand the implications of rewriting history and learn how to recover when things go wrong, you won't cause trouble for yourself or your team.


Three levels

Let's say we have an uncommitted change in our working directory that we'd like to add to a previous commit. How would we go about doing this?

Level 1: Resetting

If we wanted to add the change to the latest commit in a branch, a simple approach would be to undo the latest commit, but keep its changes in the working directory:
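A minimal sketch of that reset, assuming a throwaway repository whose commits A to D each add a file named after the commit (the repository layout, file names and demo identity are made up for illustration):

```shell
set -eu
# Demo identity so commits work on a machine without Git configured
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with commits A, B, C and D
repo=$(mktemp -d) && cd "$repo" && git init -q
for c in A B C D; do
  echo "$c" > "$c.txt"; git add "$c.txt"; git commit -q -m "$c"
done

echo "fix" >> D.txt        # the uncommitted change we want to fold into D

# Move the branch back to C; D's changes stay in the working directory.
# Committing everything afterwards would produce the new commit E.
git reset -q HEAD~1
```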

The command above resets the current branch (and HEAD [1]) to the previous commit C. Changes from commit D are preserved and now appear as uncommitted. If we now commit all changes, we have a new commit E that includes all changes from D plus the uncommitted change we had at the beginning.

Commit D still exists in the repository, but it's no longer part of the current branch.

Level 2: Amending

Amending the latest commit is so common that there's a more straightforward way of doing it:
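A minimal sketch of an amend, assuming the same kind of throwaway repository with commits A to D (file names and demo identity are made up for illustration). The --no-edit flag is used here only so the sketch runs without opening an editor:

```shell
set -eu
# Demo identity so commits work on a machine without Git configured
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with commits A, B, C and D
repo=$(mktemp -d) && cd "$repo" && git init -q
for c in A B C D; do
  echo "$c" > "$c.txt"; git add "$c.txt"; git commit -q -m "$c"
done
d_before=$(git rev-parse HEAD)   # hash of the original commit D

echo "fix" >> D.txt              # the uncommitted change
git add D.txt                    # stage it first

# Replace D with a new commit E that also contains the staged change.
# Drop --no-edit to review the prepopulated message in the editor.
git commit -q --amend --no-edit
```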

Assumes changes have been previously staged (with Add)

The command above not only requires fewer steps than Level 1, but it also prepopulates the commit message for the new commit E with the commit message from D [2].

Level 3: Rebasing

What if we wanted to amend a commit other than the latest one? We could use the technique from Level 1 and reset to an earlier commit (e.g. HEAD~2). But then we'd need to redo two or more commits manually, which starts to get cumbersome. Rebase to the rescue!

As a first step, we create a new commit with only the uncommitted change:
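A minimal sketch of this step, assuming a throwaway repository with commits A to D where the pending change belongs to commit C (file names and demo identity are made up for illustration):

```shell
set -eu
# Demo identity so commits work on a machine without Git configured
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with commits A, B, C and D
repo=$(mktemp -d) && cd "$repo" && git init -q
for c in A B C D; do
  echo "$c" > "$c.txt"; git add "$c.txt"; git commit -q -m "$c"
done

echo "fix" >> C.txt              # the uncommitted change, meant for commit C
git add C.txt

# HEAD~1 is commit C in this demo; a commit hash works just as well
git commit -q --fixup=HEAD~1

git log -1 --format=%s           # prints: fixup! C
```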

The command above creates a new commit with the message in a special format. This will be used by Git in the next step:
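A minimal sketch of the rebase, assuming a throwaway repository with commits A to D plus a fixup commit E that targets C (file names and demo identity are made up for illustration). Setting GIT_SEQUENCE_EDITOR=true is a shortcut for this sketch only: it accepts the to-do list unchanged, standing in for saving and closing the editor:

```shell
set -eu
# Demo identity so commits work on a machine without Git configured
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with commits A, B, C, D and the fixup commit E
repo=$(mktemp -d) && cd "$repo" && git init -q
for c in A B C D; do
  echo "$c" > "$c.txt"; git add "$c.txt"; git commit -q -m "$c"
done
echo "fix" >> C.txt && git add C.txt
git commit -q --fixup=HEAD~1     # E, with the message "fixup! C"

# Reapply C, D and E on top of B (HEAD~3), letting Git arrange the to-do list.
# Omit GIT_SEQUENCE_EDITOR=true to review the list in the editor.
GIT_SEQUENCE_EDITOR=true git rebase -q --interactive --autosquash HEAD~3
```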

The command above opens a text editor with the Rebase to-do [3]. Because we told Git to reapply the commits on top of HEAD~3 (i.e. B), only C, D and E appear on the list. But they're not in their original order: Git has already reordered them for us according to our previous instructions. It has also changed the action associated with E to indicate it needs to be combined with C (and keep the commit message from C). If we save the file and close the text editor, the Rebase is executed:

Implications

Rewriting commits is completely safe, as long as you stick to local commits (i.e. commits that haven't been pushed to a remote). However, there are a few situations where it's acceptable to rewrite remote commits [4] that haven't been publicly shared [5]. In those cases, it's important to understand how to synchronize the local and the remote branches after the rewrite.

Local-remote synchronization

If you're used to always doing a Pull before a Push, you'll need to be careful. Let's imagine we do a Pull just after amending a remote commit:

In this case, you'll end up with “duplicate” commits. This is because Pull does a Fetch & Merge by default, so both D and E (as well as the merge commit F) will end up in your commit history. What a mess!

The right way to proceed here is to skip the Pull altogether and use Push with the special flag --force-with-lease:
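A minimal sketch of that push, assuming a local bare repository standing in for the remote and a clone whose commits A to D have already been pushed (paths, file names and demo identity are made up for illustration):

```shell
set -eu
# Demo identity so commits work on a machine without Git configured
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# A bare repository plays the role of the remote
base=$(mktemp -d)
git init -q --bare "$base/remote.git"
git clone -q "$base/remote.git" "$base/work"
cd "$base/work"
for c in A B C D; do
  echo "$c" > "$c.txt"; git add "$c.txt"; git commit -q -m "$c"
done
git push -q -u origin HEAD       # D is now a "remote commit"

echo "fix" >> D.txt && git add D.txt
git commit -q --amend --no-edit  # the local branch now ends in E instead of D

# Overwrite the remote branch, but only if it still points where our
# remote-tracking branch says it does (nobody else pushed in the meantime)
git push -q --force-with-lease
```

Unlike a plain --force, the lease makes the push fail if the remote branch has moved since the last fetch, so someone else's commits are not silently overwritten.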

Consuming a force-Push

Occasionally, you might need to update your local branch to reflect the state of a remote branch after a force-Push is done by an external agent (which could be yourself working on another machine). The solution is to Fetch & Reset:

git fetch
git reset --hard @{upstream}

The commands above completely replace the local branch with the latest version of the remote branch (or upstream).
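A minimal sketch of the whole scenario, assuming a local bare repository as the remote and two clones standing in for two machines (the names laptop and desktop, the paths and the demo identity are made up for illustration):

```shell
set -eu
# Demo identity so commits work on a machine without Git configured
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

base=$(mktemp -d)
git init -q --bare "$base/remote.git"
git clone -q "$base/remote.git" "$base/laptop"
cd "$base/laptop"
for c in A B C D; do
  echo "$c" > "$c.txt"; git add "$c.txt"; git commit -q -m "$c"
done
git push -q -u origin HEAD
git clone -q "$base/remote.git" "$base/desktop"   # second machine, has D

# The laptop rewrites D and force-pushes the result
echo "fix" >> D.txt && git add D.txt
git commit -q --amend --no-edit
git push -q --force-with-lease

# The desktop discards its copy of D and adopts the rewritten branch
cd "$base/desktop"
git fetch -q
git reset -q --hard @{upstream}
```

Because the reset is hard, any uncommitted work or unpushed commits on the second machine would be dropped from the branch, so this only fits when there is nothing local worth keeping.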

Recovery

When you rewrite a commit, the original commit isn't discarded immediately [6]. As a result, you can generally go back to it easily using Reset:

git reset --hard HEAD@{1}

The command above uses the special revision HEAD@{1} [7], which means “the commit previously referenced by HEAD”. But you could also use a commit hash (although it would require you to make a note of it, which isn't practical) or any other revision. If you need to undo something more complex, like a Rebase, you should use the Reflog to find the right revision:

git reflog
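A minimal sketch of a recovery, assuming a throwaway repository with commits A to D where D has just been amended (file names and demo identity are made up for illustration):

```shell
set -eu
# Demo identity so commits work on a machine without Git configured
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with commits A, B, C and D
repo=$(mktemp -d) && cd "$repo" && git init -q
for c in A B C D; do
  echo "$c" > "$c.txt"; git add "$c.txt"; git commit -q -m "$c"
done
d=$(git rev-parse HEAD)          # hash of the original commit D

# Rewrite D, then change our mind
echo "fix" >> D.txt && git add D.txt
git commit -q --amend --no-edit

git reflog -n 2                  # HEAD@{0} is the amend, HEAD@{1} the original D
git reset -q --hard HEAD@{1}     # the branch points at the original D again
```

After a Rebase the Reflog holds one entry per step, so the right entry is usually further down than HEAD@{1}; reading the list first is safer than guessing the index.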

Being able to undo a rewrite is a game-changer. It encourages you to experiment and practice without the fear of messing up.

[1] HEAD is a special reference that normally points to the current branch and, through it, to the latest commit in that branch.

[2] You can tell Git to reuse the message from the commit you're amending, without opening the editor, by adding the --no-edit flag.

[3] In the Rebase to-do, commits appear from oldest to newest.

[4] For the record, all rewrite operations are done on local commits (because of the distributed nature of Git, where every repository has all the commits). As such, “remote commits” refers to local commits that have already been pushed to a remote.

[5] I consider commits to have been publicly shared when they become available to people who weren't involved in producing them. For example, a commit in a remote branch only becomes publicly available when it's merged to main or it has a pull request opened for it.

[6] If you're curious about how Git does garbage collection, see the documentation for git gc.

[7] If you're feeling adventurous and want to know more about revisions, see the official documentation.
