Effective Git
Published: 2021-11-26
Tagged: git learning software work
Git isn't easy. The first, nerve-wracking months of my first job were made even more intense by having to "commit" and "push" and fix "merge conflicts", and... ugh! Now, almost nine years later, I still occasionally find myself in a sticky situation, but it's more of a fun exercise to break the rhythm of the day than anything else.
I wrote this guide because I see many developers, from baby-faced new grads to (fewer) grizzled grey-beards, struggling like I once did. That seems like such a horrible waste because each of us has a limited amount of quality focus time each day. But nobody has the time to learn all of git -- all those subcommands and flags would take weeks to study and practice. However, power laws apply (almost) everywhere, so there has to be a small slice of git, say 20%, that produces 80% of the results. What is it?
Based on my experience, that 20% is:
- Understanding what git really is. Because understanding what all those commits and branches are doing makes it clear what every git command will do, allowing you to get what you want faster or avoid costly mistakes.
- Learning the dozen plus two commands that are the most useful. Some of them you will run hundreds of times a month, so small efficiencies in their use add up quickly. Others you will use a few times a year, but each one can save you hours of work.
Let's dive in!
What is git, really?
Git records snapshots of changes performed on a directory of files. These snapshots are called commits. Each one (apart from the very first) references at least one previous commit, creating a chain of changes from oldest to newest. Multiple commits can reference a single one, which allows us to branch off small sub-chains and merge them later. This whole structure ends up looking like a tree.
To help navigate this tree, git supports creating labels that point at specific commits. The first type are called branches. Branches always point at the tip of a chain of commits. When you initialize a new repository and create the first commit, git creates a "master" branch that points to this commit. And when you create another commit, git updates the current branch to point at it.
How does git know what the current branch is? After all, multiple branches can point at the same commit. The answer is that there's a special label called HEAD that always points at the current branch. So if you create a new branch and check it out, git will update HEAD to point to it. And when you create a new commit now, the new branch will be updated to point to it, and HEAD follows along because it points at that branch.
The second type are called tags. Tags are like branches except that they are not updated when new commits are created. In other words, they are static, always pointing at the same commit, which makes them great for labeling versions of software.
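To make the labels concrete, here is a minimal sketch you can run in a scratch directory. The file name `notes.txt`, the branch name `feature`, and the tag `v0.1` are made up for the demo, and the branch is pinned to `master` only so the output matches the text above.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name, whatever init.defaultBranch says
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo one > notes.txt
git add notes.txt
git commit -q -m "first commit"
first=$(git rev-parse HEAD)

git tag v0.1                              # lightweight tag on the first commit
git checkout -q -b feature                # create a branch and point HEAD at it
head_branch=$(git symbolic-ref --short HEAD)

echo two >> notes.txt
git commit -q -am "second commit"

feature_tip=$(git rev-parse feature)      # moved forward with the new commit
master_tip=$(git rev-parse master)        # unchanged: master was not the current branch
tag_target=$(git rev-parse v0.1)          # unchanged: tags stay where you put them
git log --oneline --decorate              # shows which labels sit on which commit
```

After the second commit, only `feature` (and HEAD through it) has moved; `master` and `v0.1` still point at the first commit.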
Git-the-CLI is used to manipulate the tree and its labels. When you check out a branch, git makes the filesystem directory look like the latest commit on that branch. And when you commit changes, git writes information about the state of the filesystem directory to the tree (updating the branch, too).
That's git in a nutshell.
It also does a few other helpful things like downloading commits from remote repositories or merging conflicting files, but at the end of the day, you're just manipulating a tree.
Now, let's go one level of abstraction deeper. You're actually manipulating three trees: the repository (tree of commits), the filesystem directory (working directory), and the staging area tree. Understanding how these relate to each other will help you understand what git commands do.
The working directory is the collection of all tracked and untracked files in a directory managed by git (ie. a directory with a .git subdirectory). Tracked files are those that git "knows about", ones that have been staged or committed earlier. Untracked files are new files.
The staging area is a temporary tree that sits between the working directory and the repository. It's useful to think of it as a sort of scratchpad, a place for incrementally adding changes until you have exactly the ones you want to create a new commit. I suspect that without this feature, we would all be committing unwanted files or omitting important ones.
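Here is a small sketch of a change travelling through the three trees, watched with `git status --short`. The file names are hypothetical. In the short format, the first column describes the staging area and the second column the working directory.

```shell
set -e
repo=$(mktemp -d)                # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "add tracked.txt"

echo "version two" > tracked.txt # the change exists only in the working directory
echo "new" > untracked.txt       # git does not know about this file yet
before=$(git status --short)     # " M tracked.txt" and "?? untracked.txt"

git add tracked.txt              # copy the change into the staging area
staged=$(git status --short)     # "M  tracked.txt" and "?? untracked.txt"

git commit -q -m "update tracked.txt"   # write the staging area to the repository
after=$(git status --short)      # only "?? untracked.txt" is left
printf '%s\n---\n%s\n---\n%s\n' "$before" "$staged" "$after"
```

Notice that the untracked file never moves: nothing was told to read it, so no tree other than the working directory knows about it.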
Let's stop here.
At this point, you should be able to answer these two questions some of the time:
- Will the command I am about to run read from or write to a tree?
- Which tree will it read from or write to?
With practice, you will get the right answer more and more often and you will find yourself getting into fewer complicated situations. And then, when you do, these questions will help guide you out without much stress.
Alright, let's move on to the most useful commands, beginning with all those that read from a tree.
Reading from the Tree
- git show
git show (.|commit|branch-name|tag)
- Print the contents (changes) of the commit indicated by the argument. Note that git treats `.` as a path, so `git show .` shows the latest commit on the current branch, limited to files under the current directory. Run from the repository root, that is equivalent to `git show HEAD`. Useful for inspecting changes and tying them to a person/time/branch.
git show (commit|branch-name|tag):file-name
- Print the full contents of `file-name` at a specific commit.
- git log
git log --oneline -n5
- Print up to 5 commits leading up to the current commit, using a compact format of `short-hash (optional branch/tag) commit message`. Useful for quickly orienting yourself after a pull or checking out a new branch.
git log (file-name)
- Print the commit history of a specific file. Great for finding out who touched it last or seeing how a piece of code evolved over time.
git log --grep (search-string)
- Print only the commits that contain `search-string` in the commit message. Another useful investigation tool for understanding changes.
- A related flag is `-S`, which prints the commits whose changes add or remove an occurrence of `search-string`. In other words, it searches the contents of the changes rather than the commit message.
git log (branch1)..(branch2)
- Print the set of commits in `branch2` minus the set of commits in `branch1`. Useful for understanding the difference between branches, especially master, eg. `git log master..HEAD` will print all commits on the current branch that are not merged into master. `git log origin/master..master` will print all commits on local master that are not present on the remote master branch.
git log (branch1)...(branch2) --left-right
- Print the set of commits unique to each branch minus the commits they share. This is useful for comparing two feature branches, and the `--left-right` flag makes the output more explicit about which branch each change belongs to. Rarely used, but useful when you need it.
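The two-dot and three-dot ranges are easier to tell apart on a tiny history. This is a sketch with made-up commit messages and a hypothetical `feature` branch that diverges from `master` after one shared commit. The `%m` placeholder in the format string prints the left/right mark.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name, whatever init.defaultBranch says
git config user.name "Example Dev"
git config user.email "dev@example.com"

git commit -q --allow-empty -m "base"           # shared by both branches
git checkout -q -b feature
git commit -q --allow-empty -m "feature work"   # only on feature
git checkout -q master
git commit -q --allow-empty -m "master work"    # only on master

only_feature=$(git log --format=%s master..feature)   # in feature, not in master
only_master=$(git log --format=%s feature..master)    # in master, not in feature
# Three dots: commits unique to either side; "<" marks master, ">" marks feature
both_sides=$(git log --left-right --format='%m %s' master...feature)
printf '%s\n' "$both_sides"
```

The shared "base" commit shows up in none of the three results, which is exactly what makes these ranges handy for comparing branches.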
- git diff
git diff
- Print the differences in tracked files between the working directory and staging area.
git diff --cached
- Print the differences between the files in the staging area and the last commit. Supremely useful for checking what changes are going to be included when you run `git commit`. I use it to review my work before I make any commit and it has saved me from countless little errors like typos, missing files, revealing secrets, etc.
git diff (branch1) (branch2)
OR git diff (branch1)..(branch2)
- Print the differences between the tips of `branch1` and `branch2`. Useful for seeing the exact differences between a feature branch and master.
git diff (branch1)...(branch2)
- Print the differences between the tip of `branch2` and the closest common ancestor of both `branch1` and `branch2`. Useful for comparing feature branches.
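For the first two forms, a quick sketch shows which pair of trees each one compares. The file `file.txt` and its contents are invented for the demo: one line gets staged, another is left in the working directory only.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'alpha\n' > file.txt
git add file.txt
git commit -q -m "add file.txt"

printf 'alpha\nbeta\n' > file.txt
git add file.txt                           # "beta" is now in the staging area
printf 'alpha\nbeta\ngamma\n' > file.txt   # "gamma" exists only in the working directory

unstaged=$(git diff)                       # working directory vs staging area: +gamma
staged=$(git diff --cached)                # staging area vs last commit: +beta
printf '%s\n\n%s\n' "$unstaged" "$staged"
```

If you ran `git commit` at this point, only the "beta" line would go in, which is why checking `git diff --cached` first is such a cheap safety net.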
- git blame
git blame (file-name)
- Print the contents of `file-name`, annotating each line with `short-hash author timestamp line-contents`. When combined with `git show` and `git log`, serves as a powerful tool for investigating git history.
git blame (file-name) -L10,20
- Print the contents of `file-name`, annotating each line as described above, but only print lines 10 through 20, making it handy for large files.
- git tag
- Creating tags is covered in the "Writing to the Tree" section below.
git tag -l
- List all tags in the repository.
git tag -l (pattern)
- Print all tags that match `pattern`, where `pattern` can be any valid shell wildcard pattern, eg. `git tag -l 'v1.*'` will print `v1.0`, `v1.1`, `v1.1.0`, etc.
- git reflog
- Like `git log`, but prints the history of pointers like HEAD and branches (tags are not recorded by default). In other words, this logs every time `git checkout` is run (because the HEAD pointer is moved) and every time you make a new commit (because the branch pointer is moved).
- Useful for debugging `git rebase` or `git reset` problems because if HEAD ever pointed at a commit, you can find the hash in the reflog.
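As a sketch of that rescue scenario, the snippet below "loses" a commit with a hard reset and then finds it again through the reflog. The commit messages are made up, and `HEAD@{1}` simply means "where HEAD pointed one move ago".

```shell
set -e
repo=$(mktemp -d)                       # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1              # "second" is no longer reachable from any branch
git reflog -n 3                         # but the reflog still remembers it

recovered=$(git rev-parse 'HEAD@{1}')   # the commit HEAD pointed at before the reset
git reset -q --hard "$recovered"        # move the branch back to it
```

In a real mess the entry you want may be further back than `HEAD@{1}`, so read the `git reflog` output first and pick the hash from there.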
- git checkout
git checkout -b (new-branch-name)
- Create a new branch and move HEAD to point to it.
git checkout (branch-name)
- Move HEAD to the tip of `branch-name` and make the working directory look like the latest commit on that branch. If you have uncommitted changes that the checkout would overwrite, it will print an error and abort the checkout, making it a safe command.
git checkout (branch-name) (file-name)
- Without moving HEAD, restore `file-name` from the tip of `branch-name` to the working directory (and the staging area). Unlike switching branches, this overwrites any uncommitted changes to that file without asking.
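Here is a minimal sketch of restoring a single file from another branch. The branch `experiment` and the file `settings.txt` are hypothetical, and the `--` separator is my addition: it is optional here, but it removes any ambiguity between branch names and file names.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name, whatever init.defaultBranch says
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo stable > settings.txt
git add settings.txt
git commit -q -m "stable settings"

git checkout -q -b experiment
echo "experimental tweak" > settings.txt
git commit -q -am "experimental settings"

git checkout -q master -- settings.txt    # take master's version of one file
branch=$(git symbolic-ref --short HEAD)   # still "experiment": HEAD did not move
content=$(cat settings.txt)               # "stable"
status=$(git status --short)              # "M  settings.txt": restored file is already staged
```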
Writing to the Tree
- git add
git add (file-name)
- Adds `file-name` to the staging area, preparing it to be written to the tree. Accepts globs, allowing it to add multiple files at once.
git add -i
- `-i` enables interactive mode. This can be a very powerful tool because, if you commit the shortcuts to muscle memory, you can very quickly add/update/remove/patch/etc. any number of files in the staging area.
- git commit
- Adding this for completeness' sake: the command takes the files in the staging area and creates a new commit on the current branch, then updates the current branch (and HEAD with it) to point to the new commit.
- git reset
- This command can read from and write to both the working directory and the tree, so while it doesn't strictly fit in this section, this is the best place to describe it.
- I've found `git reset` to be useful for undoing additive commands like `git commit` and `git add`. Rarely, but it's also come in handy to reset the repository to a clean, known state after a messed up rebase or merge operation.
- On top of that, I feel like this command really allowed me to understand the tree-like nature of git. Just playing around with it a few times showed me how pointers/branches move and how it affects the output of `git status`.
git reset --soft (commit|branch-name|tag-name)
- Move the branch that HEAD is pointing to to the specified `commit|branch-name|tag-name`. This does not modify the staging area, which still looks like the commit HEAD pointed at before the move. So if you run `git status` after this command, the staging area will look like the commit you just moved from. In other words, you just undid a `git commit` operation.
git reset --mixed (commit|branch-name|tag-name)
- This is the default `git reset` operation.
- Does everything `--soft` does plus populates the staging area with the contents of `commit|branch-name|tag-name`. If you ran `git status`, you would only see changes in the working directory -- the working directory would look like the commit you moved from. For example, if you ran `git reset HEAD~2`, you would have undone both the `git commit` and `git add` operations of the last two commits.
git reset --hard (commit|branch-name|tag-name)
- This is unsafe to run, meaning it will overwrite files in your working directory so you could potentially lose work. Tread carefully.
- Does everything `--mixed` does plus populates the working directory with the contents of `commit|branch-name|tag-name`. If you ran `git status` now, it would print `nothing to commit, working tree clean` (assuming you have no untracked files, which `git reset` leaves alone).
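The three modes are easiest to compare side by side. This sketch undoes the same commit three times and records what `git status --short` reports each time. The file name and contents are invented; the first status column is the staging area, the second is the working directory.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "one"
echo "second version" > file.txt
git commit -q -am "two"
two=$(git rev-parse HEAD)

git reset -q --soft HEAD~1        # move the branch only
soft=$(git status --short)        # "M  file.txt": the change is still staged

git reset -q --hard "$two"        # back to the starting point
git reset -q --mixed HEAD~1       # move the branch and reset the staging area
mixed=$(git status --short)       # " M file.txt": the change is only in the working directory

git reset -q --hard "$two"        # back to the starting point again
git reset -q --hard HEAD~1        # move the branch, reset staging area and working directory
hard=$(git status --short)        # empty: the change is gone from all three trees
content=$(cat file.txt)           # "one"
```

Only the `--hard` run actually destroyed anything in the working directory, and even then the commit itself can still be reached by its hash, as the resets back to `$two` show.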
- git rebase
- Take a string of commits and apply them on top of a branch. Keep in mind though that if the branch exists on a remote repository, you will have to force-push it up after rebasing, creating confusion for everyone else. So use this only on branches nobody else is using for best results.
- I've found it useful in two cases, one pretty much daily, and the other occasionally. Let's start with the frequent one.
git rebase -i HEAD~2
- `HEAD~2` tells git to rebase onto the commit two steps back from the current one (its grandparent), so the last two commits are up for editing. Increase that number and you'll go farther back. The `-i` flag opens an interactive prompt that allows you to specify what to do with each commit in the selection: edit the message, drop or squash the commit, stop for making changes, etc. This is super useful when you're working on your own branch and preparing to make a pull request because it gives you an opportunity to combine/split commits, write nice messages, etc.
git rebase (master|other-branch)
- This will take all the commits on the current branch that are not yet on master and apply them on top of master. The current branch will remain a separate branch from master though, so you can continue working on it. This is great for catching up with changes on master or merging feature branches together.
- Special note: I rarely, if ever, use `git merge`. There's nothing wrong with it, but the only time I think `git merge` is useful is when you merge branches into master. And since that's handled by the software running the remote repository like GitHub, GitLab, Bitbucket, etc., I don't do it on my local machine.
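Below is a sketch of the "catching up with master" case. The branch and file names are hypothetical. The interactive variant is left out because it needs an editor, but the mechanics of replaying commits are the same.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name, whatever init.defaultBranch says
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"

git checkout -q master
echo master > master.txt
git add master.txt
git commit -q -m "master work"            # master has moved on without us

git checkout -q feature
old_tip=$(git rev-parse HEAD)
git rebase -q master                      # replay "feature work" on top of master
new_tip=$(git rev-parse HEAD)
parent=$(git rev-parse HEAD~1)            # now the tip of master
git log --oneline
```

The replayed commit gets a new hash because its parent changed, which is why a branch that was already pushed needs a force-push after rebasing.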
- git cherry-pick
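- Take one or more commits and apply them to the current branch. Both cherry-pick and rebase create new commits with new hashes. The difference is that cherry-pick copies the commits you pick onto the current branch and leaves the source branch untouched, while rebase replays the current branch's own commits on top of another branch and moves the current branch to the result.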
git cherry-pick (commit-hash)
- Take `commit-hash` and apply the changes to the current branch. Very handy for moving small changes, like bug-fixes, from one branch to another.
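A sketch of the bug-fix case: a hypothetical `feature` branch holds a finished fix and some unfinished work, and only the fix is copied over to `master`. All file names and messages are made up.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name, whatever init.defaultBranch says
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > app.txt
git add app.txt
git commit -q -m "base"

git checkout -q -b feature
echo fix > fix.txt
git add fix.txt
git commit -q -m "bug fix"
fix=$(git rev-parse HEAD)
echo wip > wip.txt
git add wip.txt
git commit -q -m "unfinished work"

git checkout -q master
echo master > master.txt
git add master.txt
git commit -q -m "master work"

git cherry-pick "$fix" >/dev/null         # copy only the bug fix onto master
picked=$(git rev-parse HEAD)              # a new commit, not the original one
```

The unfinished work stays on `feature`, and `feature` itself is not modified in any way.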
- git tag
- Tags come in two flavors: lightweight and annotated. Lightweight tags are simple labels pointing at a specific commit. Annotated tags do that plus include creation timestamps, tagger name and email, a message, and optionally a GPG signature. The former are useful for local, temporary tags while the latter are for permanent tags made for others--like indicating release versions.
git tag (tag-name) (commit-hash)
- Create a lightweight tag `tag-name` attached to `commit-hash`.
git tag -a (tag-name) (commit-hash)
- Create an annotated tag `tag-name` attached to `commit-hash`. It will open a prompt to enter your message. Use the `-u` or `-s` flag if you want to add a GPG signature. Don't forget to add `--tags` to `git push` to upload the newly created tag to the remote!
- git revert
git revert (commit-hash)
- Create a commit that undoes the changes of `commit-hash`. Useful for rolling back changes that have already been merged to remote branches because it won't break the normal `git pull` workflow for others.
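To see that a revert adds history instead of rewriting it, here is a small sketch with a hypothetical `config.txt`. The `--no-edit` flag just accepts the default commit message so no editor opens.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo good > config.txt
git add config.txt
git commit -q -m "good config"
echo "bad value" > config.txt
git commit -q -am "bad config"
bad=$(git rev-parse HEAD)

git revert --no-edit "$bad" >/dev/null    # new commit that undoes "bad config"
content=$(cat config.txt)                 # "good"
count=$(git rev-list --count HEAD)        # 3: the bad commit is still in the history
git log --oneline
```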
The One that Doesn't Fit Above
- git stash
- Git stash is useful for saving changes when changing branches. It's actually rare that I get to start working on a branch and continue until I finish without having to switch to other work. Also handy for doing some experimental changes, then saving them for the future.
git stash
- Create a snapshot of the working directory and staging area and save it, then restore the working directory to the commit that HEAD is pointing to. Add `-u` to include untracked files as well. Use it when you need to quickly clear up any work in progress without losing it.
git stash pop
- Take the most recently saved snapshot from the stash and apply it to the branch you're currently on. The reverse of `git stash`.
git stash list
- View all the saved changes in the stash. Notice how they are addressed in the stash, eg. `stash@{2}`. This allows you to retrieve changes from further back in time, allowing you to stash multiple different sets of changes.
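A round trip through the stash looks roughly like this. The file names are invented, and `-u` is used so the untracked file gets tucked away too.

```shell
set -e
repo=$(mktemp -d)                              # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo v1 > file.txt
git add file.txt
git commit -q -m "v1"

echo "work in progress" > file.txt             # modified tracked file
echo scratch > new.txt                         # untracked file

git stash -q -u                                # save both and clean the working directory
clean=$(git status --short)                    # empty: nothing left to commit
stashes=$(git stash list | wc -l | tr -d ' ')  # 1 entry, addressed as stash@{0}

git stash pop -q                               # bring the work back and drop the entry
restored=$(cat file.txt)                       # "work in progress"
remaining=$(git stash list)                    # empty again
```

Between the stash and the pop you are free to check out another branch, fix whatever interrupted you, and come back.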