This page contains some notes that I took while I was learning git.

TODO

This page needs a full rewrite.

Setting mtimes on files to their last commit time

If you want to set the modification time of each file in your working directory to the time of the last commit where it was changed, run this in your git repo:

$ git ls-tree -r -z --name-only HEAD | xargs -0 -I {} sh -c \
       'touch --date="$(git log -1 --pretty=format:%aD HEAD -- "$1")" "$1"' sh {}

git ls-tree lists every tracked file, xargs runs one shell per file, and touch sets that file's mtime to the author date that git log reports for it.

Setting mtimes on files to their first commit (creation) time

$ git ls-tree -r -z --name-only HEAD | xargs -0 -I {} sh -c \
       'touch --date="$(git log --reverse --pretty=format:%aD HEAD -- "$1" | head -1)" "$1"' sh {}

The only trick is that git applies -1 before --reverse: commit limiting happens before ordering, so "-1 --reverse" selects the single newest commit and then reverses that one-element list. You need to output all commits in reverse order, then use head to select the first one.

Culling a single commit way back in history

Would this work to blow away a commit and rewrite all the history since then to make it as if it was never made in the first place?

<spearce> bronson: if you are doing that on a large scale, consider git-filter-branch.
<gitster> I'd probably do the laziest thing --- branch off of the bottom of the chain, "read-tree -m -u" from the top commit, and commit that result.  That will squash all the intermediate step.
<bronson> That sounds like the easiest.
* gitster expected to hear "Yuck, that's too low level and dirty".
<bronson> Well, I'm having a hard time wrapping my head around filter-branch.
<wagle_home> bronson, try out the examples in the man page..  it makes a lot more sense after that
<wagle_home> of course, i figured it out while sitting in the emergency room right after having bashed my head against the pavement..  that might have helped
<bronson> gitster: wait, doesn't that squash all the more recent commits into one?
<bronson> wagle_home: I'll try bashing the head as a last resort.
<wagle_home> good plan..  8)
<gitster> My "bottom" and "top" refer to the bottom and top of the range you want to get rid of.
 So if you have ---o---A---B---C---D...---X---Y---Z, and then C...X are cruft,
 then I would (1) git checkout B; git read-tree -m -u X; git commit
<bronson> Oh, right.  Then Y and Z come along with X.
 That makes sense.
<gitster> to have another history that goes ---o---A---B---X' (where X and X' has the same tree), and then
 "rebase --onto HEAD X $(whatever the branch that has Z at its tip)"
 which would make the branch ---o---A---B---X'--Y'--Z'
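
The recipe from the chat can be tried on a throwaway repository. This is only a sketch under assumptions: the branch name "work", the one-file-per-commit history and the Example identity are all made up for illustration, and tags A through Z stand in for the commits in gitster's diagram.

```shell
set -e
# Hypothetical identity so the commits below succeed on a fresh machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/work   # hypothetical branch name

# Build the history A---B---C---D---X---Y---Z, tagging each commit.
for c in A B C D X Y Z; do
    echo "$c" > "file-$c"
    git add "file-$c"
    git commit -q -m "$c"
    git tag "$c"
done

# (1) Start from B, load X's tree into the index and working tree,
#     and commit it as X' (same tree as X, parent is B).
git checkout -q B
git read-tree -m -u X
git commit -q -m "X' (squash of C..X)"

# (2) Replay everything after X (that is, Y and Z) on top of X'.
git rebase -q --onto HEAD X work

history=$(git log --format=%s work | tr '\n' ' ')
echo "$history"
```

The old commits C, D and X are still reachable through their tags here; in a real repository they would only linger in the reflog.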

Recovering Disk Space

Even though you've removed objects from your commit history, they're still taking up space on disk.  Here's how to remove it:

Git Recover Storage
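
The linked page is not reproduced here. One commonly used sequence (not necessarily what that page describes) is to expire the reflogs and then garbage-collect with immediate pruning. The sketch below runs it on a throwaway repository; the file names, the "cruft" branch and the Example identity are made up. Be careful: this permanently discards unreachable objects.

```shell
set -e
# Hypothetical identity so the commits below succeed on a fresh machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo keep > keep.txt
git add keep.txt
git commit -q -m "keep"
main=$(git symbolic-ref --short HEAD)

# Create a commit on a side branch, then throw the branch away.
git checkout -q -b cruft
echo "large unwanted content" > cruft.txt
git add cruft.txt
git commit -q -m "cruft"
blob=$(git rev-parse HEAD:cruft.txt)
git checkout -q "$main"
git branch -q -D cruft

# The objects are unreachable now, but the reflog still points at them.
git reflog expire --expire=now --all   # forget where HEAD used to be
git gc -q --prune=now                  # repack, delete unreachable objects

if git cat-file -e "$blob" 2>/dev/null; then still_there=yes; else still_there=no; fi
echo "blob still present: $still_there"
```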

Support

http://git.or.cz/ http://git.or.cz/gitwiki/GitCommunity

http://vger.kernel.org/vger-lists.html#git

  1. git on irc.freenode.net

In General

svn commands generally apply only to the current directory and all its subdirectories, git commands generally apply to the entire repo (right?).

svn hook scripts <-> git update hooks -- [1]

git only tracks unique content, and only files can have unique content. git, like cvs, does not track directories. If you need an "empty" directory in your project, you need to add an empty ".gitignore" file to it.

git treats all files as opaque binary blobs. By default it does not adjust line endings. All decent developer tools work with all 3 line ending styles anyway so this is not as big a problem as it was for CVS back in 1995. (Later versions of git did add optional line-ending conversion through the core.autocrlf setting and the text attribute in .gitattributes, but nothing is converted unless it is configured.)

All state is stored in the .git directory in the root directory. It does not store any information in deeper directories (i.e. no CVS or .svn directories).

git Terminology

HEAD
the latest revision on the current branch
master
the canonical branch in your repository. This is only by convention.
origin
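the default name git gives to the remote repository that you cloned from. Its branches show up locally as remote-tracking branches such as origin/master, which git fetch and git pull update. (In git before 1.5, origin was instead a local branch that mirrored the remote master branch.)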
porcelain
calls meant for regular git use. Many porcelain calls are just shell scripts that call plumbing calls underneath.
plumbing
low-level calls that you should rarely if ever need to call in regular git usage. Will generally take obscure arguments and produce machine-readable results.
remote
While "remote" usually means "not in this computer", with git it means "Not a part of this repository." You can pull changes from a remote repository that lives elsewhere on your local filesystem.

General Functionality

svn up/update -- git pull Bring upstream changes into your repo.

svn co/checkout -- git clone Create a local copy of an upstream repo.

svn ci/checkin -- git commit -a -- commit changes in your working copy to the repository

New files must first be added with git add; git commit -a then picks up modifications and deletions of files git already tracks. Unlike CVS/svn/p4, plain git commit does not automatically commit changes in the tree. You must explicitly tell it the changes that you want with 'git add'. (Until git 1.4 you'd mark modified files for submission with git-update-index. That functionality has been rolled into git add, and git-update-index is no longer needed for everyday use.)
A git commit is actually two steps: update the index with the changes in the working directory, then commit the index to the repository. You don't need to know this except during strange merge situations. See [2] for more information.

svn st/status -- git status -- tell what files have been modified

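svn revert -- git checkout -- FILE -- return a local file to the state it has in the index (use git checkout HEAD -- FILE for the last committed state). git checkout -f with no file name discards all local changes to tracked files.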

If you have a branch of the same name as a file, git will switch the entire working directory to the branch rather than checking out the file. To force git to treat names as files, precede them with "--".

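?? -- git clean -d -f -- Remove all untracked files and directories from the working copy. Current versions of git refuse to clean anything without -f (unless clean.requireForce is set to false); use -n first for a dry run. Note that by default git clean doesn't remove files named in .gitignore; specify -x to remove ALL untracked files.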

So, to return the working directory to its pristine state, run "git clean -d -x -f; git checkout -f". (git reset --hard has the same effect as git checkout -f here.)
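
Here is a sketch of that cleanup on a throwaway repository, using git reset --hard for the second step. The file names and the Example identity are made up for illustration.

```shell
set -e
# Hypothetical identity so the commit below succeeds on a fresh machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo 'build/' > .gitignore
echo original > tracked.txt
git add .gitignore tracked.txt
git commit -q -m "initial"

# Dirty the tree: edit a tracked file, add an untracked file
# and an ignored build directory.
echo edited > tracked.txt
echo scratch > untracked.txt
mkdir build
echo obj > build/out.o

git clean -q -d -x -f   # remove untracked AND ignored files and directories
git reset -q --hard     # restore tracked files to their state in HEAD

status=$(git status --porcelain --ignored)
echo "remaining changes: [$status]"
```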

svn add -- git add -- add a file to the working copy, gets added to repo on checkin

svn rm FILE -- rm FILE -- delete a file from the working copy, gets deleted from repo on checkin.

git doesn't strictly require you to run the "git rm" command. If you delete a file from your working directory, "git commit -a" will notice this and record the deletion. In current versions, git rm FILE removes the file from both the working directory and the index, while git rm --cached FILE removes it from the index only and leaves your working copy alone. -f is only needed to override the safety check when the file has uncommitted modifications.

svn mv -- git mv -- Move or rename a file

svn diff -- git diff -- show changes between local file and repo

supports --color to display the diff in color

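svn cat -- git show REV:PATH -- show contents of a file as of the given revision (git fetch, by contrast, downloads objects from a remote repository)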

git ls-files: list the files in various areas of version control. You can add --ignored to show ignored files, --unmerged, --deleted, --modified, --others (files in the working directory that are not under revision control), or --cached (show files under revision control, the default). --stage shows each entry's mode, object name, and stage number.

New Repositories

Before starting work with git, make sure your identities are correct. These will be saved for all time in the log info so you don't get to go back and fix it (except in trivial cases). Show your identities using:

git var -l

(apparently this doesn't work outside of a repository anymore. Run the Creating step first, then run git-var from inside the new repo).

Creating

Go to the root of the project you want to import and run:

cd my-existing-dir
git init        # older versions of git called this git init-db
git add .
git commit

Why did we not use git-update-index? Because it's not needed when adding files, only when updating them (and in current git, git add handles both cases). You can use 'git commit -a' if you need the CVS finger feel of checking in every change in your working directory. It feels weird for a day or two but, once you get used to marking each change explicitly, I think you'll start to find it strange that every change in your working directory would be committed!

Your project is now under git revision control.

Fetching

-- git clone REPO -- create a new local repository that is an exact duplicate of the remote. git supports a number of different protocols:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linus-2.6
git clone rsync://rsync.kernel.org/pub/scm/linux/kernel/git/bcollins/ubuntu-2.6.git ubuntu-2.6

TODO: git update-index, how is this used in the real world?

Specifying Revisions

git has a number of commands that allow you to specify revisions (git log, git branch, etc).

You can append ^ to any of these specifications to refer to the named revision's parent. For instance, HEAD names the latest checkin, so HEAD^ names the second-to-latest checkin, HEAD^^ the third-to-latest, etc. HEAD~4 is the same as saying HEAD^^^^ (note tilde instead of caret). Some revisions have more than one parent (when merging), so follow a caret with an integer to disambiguate. HEAD^1 refers to the first parent (same as HEAD^), HEAD^2 refers to the second parent, i.e. the branch that was merged in.
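
To see the caret and tilde forms in action, here is a small sketch on a throwaway repository. The commit messages, the "side" branch, the subject helper and the Example identity are all made up for illustration.

```shell
set -e
# Hypothetical identity so the commits below succeed on a fresh machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
for n in 1 2 3 4 5; do
    echo "$n" > n.txt
    git add n.txt
    git commit -q -m "commit $n"
done

# Helper (hypothetical): print the subject line of a revision.
subject() { git log -1 --format=%s "$1"; }

subject HEAD      # commit 5
subject HEAD^     # commit 4
subject HEAD^^    # commit 3
subject HEAD~4    # commit 1

# Make a merge commit so that HEAD has two parents.
main=$(git symbolic-ref --short HEAD)
git checkout -q -b side HEAD~2
echo s > side.txt
git add side.txt
git commit -q -m "side work"
git checkout -q "$main"
git merge -q --no-ff -m "merge side" side

subject HEAD^1    # commit 5  (first parent: the branch we were on)
subject HEAD^2    # side work (second parent: the branch merged in)
```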

HEAD
the head of the current branch
branch name
Specifies the latest revision on the named branch.
commit id
Each commit is given a unique hash. Specify the hash (or enough of the hash to be unique, at least 4 characters) to refer to that commit.
file name
Specifies all commits that contain changes to the named file. If a command takes either a branch or a file name, the branch name takes precedence. If you specify files, you can precede them with '--' to ensure they won't be interpreted as branch names.
tag
If you use git tag to name a particular revision, you can refer to it by that name from then on.

Viewing History

svn log -- git log, git show -- print revision history

-- git log -p -- show revision history in patch form (each commit is followed by its own patch). You can specify a branch or multiple files and see the merged revision history for all files. Precede the files with "--" so they don't accidentally get interpreted as a branch name.

you can specify a range like REV1..REV2. If REV2 is omitted it defaults to the current rev. For instance, "git log v2.6.12.." would show all changes from v2.6.12 to now.

git log release..test -- (assuming test was branched from release) show changes to the test branch that have not been merged back to the release branch
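
A quick sketch of the two-dot range on a throwaway repository. The branch names release and test come from the line above; the commit messages, file names and Example identity are made up.

```shell
set -e
# Hypothetical identity so the commits below succeed on a fresh machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/release
echo base > f.txt
git add f.txt
git commit -q -m "base"

# Two commits on test that release does not have.
git checkout -q -b test
echo t1 >> f.txt
git commit -q -a -m "test work 1"
echo t2 >> f.txt
git commit -q -a -m "test work 2"

# One commit on release that test does not have.
git checkout -q release
echo r > r.txt
git add r.txt
git commit -q -m "release fix"

only_test=$(git log --format=%s release..test)      # on test, not on release
only_release=$(git log --format=%s test..release)   # on release, not on test
echo "$only_test"
echo "$only_release"
```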

git rev-list produces a human readable complete log of changes. Linus uses it to produce the full (--pretty) and short (--pretty=short) changelogs for the kernel.

TODO: git whatchanged

Branching

git has one of the easiest branching models of any SCM. The intent, like Subversion, is to make branching so easy that you'll do it all the time. You should do most of your local work in a branch so that changes to head don't get in your way. When you're done, merge your branch into head, push it to a public repo, and delete the branch.

All git projects start with a branch called "master". (I don't know why "git branch" doesn't show you this until you create an alternative branch; seems like a bug to me.)

git branch -- list all the branches in the current repo

?? -- git checkout BRANCH -- switch the working directory to another branch (changes all tracked files to match the branch and doesn't affect other files). The checkout will fail if you have uncommitted changes that the switch would overwrite. Specify -m to perform a 3-way merge instead (svn automatically merges, right?).

git branch NAME [START] -- creates a branch named NAME, starting at the given START point (either another branch name, a tag name, or a commit ID). If omitted, the new branch is created from the current branch. This only affects the repo. If you want to start using the branch in your working directory, you must call git checkout.

git checkout -b NAME [START] -- same as git branch NAME START; git checkout NAME (branches and checks out the new branch in one operation).

git branch -d -- delete a branch. Git tries to be careful about ensuring no changesets are lost so it won't let you delete a branch unless all of its commits have been merged into its upstream branch or, if it has none, into the current branch. Pass -D to tell git that you know what you're doing and that it should delete the branch anyway.

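svn switch -- git checkout -m BRANCH -- move to the named branch, keeping uncommitted changes in the working directory. (Use git checkout -f BRANCH to switch to the named branch and abandon uncommitted changes. git reset --hard BRANCH does not switch branches; it moves the current branch to point at BRANCH.)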
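
The branch commands above fit together roughly like this. It is a sketch on a throwaway repository; the "topic" branch, the file name and the Example identity are made up.

```shell
set -e
# Hypothetical identity so the commits below succeed on a fresh machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo base > f.txt
git add f.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)

git branch topic          # create the branch; the working directory is unchanged
git checkout -q topic     # now actually switch to it
echo work >> f.txt
git commit -q -a -m "topic work"
git checkout -q "$main"

# -d refuses while topic has commits that are not merged anywhere.
if git branch -d topic 2>/dev/null; then refused=no; else refused=yes; fi
echo "delete refused before merge: $refused"

git merge -q topic        # a fast-forward in this case
git branch -q -d topic    # succeeds now that the work is merged
```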

Special Branches

master
This is the default branch created by git init (formerly git init-db). It typically contains the canonical representation of the source code. Development is typically done on branches and then merged back to master.
origin
If your repository was cloned from a remote, origin/master is a pristine copy of the remote repository's master branch (older versions of git kept it in a local branch named origin). Never commit to this.

Remote Branches

git allows you to easily track any number of upstream branches in your repository. In this example, we'll track Linus's master branch in a local branch named "linus".

First, create the .git/remotes/linus file:

URL: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Pull: master:linus

Then create the linus branch:

$ git branch linus

Now any time you want to update the linus branch, just run git fetch. You can merge the changes from there into your own branches.

$ git fetch linus

You want your master branch to always track Linus's master branch, yet still contain the changes that you make. When Linus changes his tree, you want to apply your changes on top of his new master branch. Git calls this rebasing.
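
In current versions of git the same setup is usually done with git remote add instead of hand-editing .git/remotes, and the fetched branch shows up as linus/master rather than a local branch named linus. The sketch below uses a local throwaway repository as a stand-in for Linus's tree; the paths, the "mywork" branch, the file names and the Example identity are all made up.

```shell
set -e
# Hypothetical identity so the commits below succeed on a fresh machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

top=$(mktemp -d)

# Hypothetical upstream repository standing in for Linus's tree.
git init -q "$top/upstream"
cd "$top/upstream"
git symbolic-ref HEAD refs/heads/master
echo v1 > kernel.txt
git add kernel.txt
git commit -q -m "upstream 1"

# My repository: register the remote, fetch it, and branch from it.
git init -q "$top/mine"
cd "$top/mine"
git remote add linus "$top/upstream"
git fetch -q linus                        # creates linus/master
git checkout -q -b mywork linus/master
echo local > local.txt
git add local.txt
git commit -q -m "my change"

# Upstream moves on.
(cd "$top/upstream" && echo v2 >> kernel.txt && git commit -q -a -m "upstream 2")

# Pick up the new upstream commit and replay my change on top of it.
git fetch -q linus
git rebase -q linus/master

history=$(git log --format=%s | tr '\n' ' ')
echo "$history"
```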

Merging

All merges end with committing the new changes unless you specify --no-commit.

svn merge -- git pull REPO BRANCH -- merge the changes from the named BRANCH into the current branch and working directory and, if successful, commit the changes (unless --no-commit). REPO gives the repository where the branch lives; specify a dot to indicate the current repository. If the repo doesn't have any branches yet, specify 'master'. git pull is the equivalent of "git fetch" followed by "git merge FETCH_HEAD", and it operates on the entire project, not just the current subdirectory.

-- git fetch REPO REMOTE-BRANCH:LOCAL-BRANCH -- fetches the remote branch without merging. If LOCAL-BRANCH already exists, git fetch will fast-forward it to match the remote (so git requires reasonable merge histories). If not, it will be created.

Never work on a branch created by git fetch. Instead, create a new local branch, fetch the changes from the remote repo as usual, then pull the changes into your local branch.

'+REMOTE-BRANCH' when fetching: a leading + on the refspec tells git fetch to update the local branch even when the update is not a fast-forward.

TODO: talk about rebasing [3] (if the master branch has moved on since you forked your branch from it, you'll probably want to rebase your branch so you can keep closer to it).

Use git ls-files --unmerged to tell what files still have unresolved conflicts.

Handling Patches

Even if everybody used git, a lot of development would still involve sending patchfiles to each other. These commands make working with patchfiles easier.

Moving Patches by Mail

git format-patch origin -- package up all my changes

Each change is in a separate file. You can turn this into an mbox with a simple 'cat *.patch > file.mbox' (older versions of git named the files *.txt). You can also push each message straight up to your IMAP server. It's a good idea to glance at it before sending because, once you send it out, you can't ever get it back.

This creates a series of patches that you can then post on a web site, send to an email list, etc. Even if everybody on the list is using git, rather than telling readers how to clone your tree, generally it's easier just to include the patches with your message. And it also doesn't lock you into a single SCM.

git imap-send -- dump my changes into an imap drafts folder.

You can fill in your return address, the to address, the server, etc using a config file.

Finally, this is how to apply the diffs:

git am 000* -- apply

You should always use git-am. git-applymbox only exists because Linus doesn't want to update his scripts. :)
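
A round trip through format-patch and am looks roughly like this. It is a sketch on throwaway repositories: the author and receiver directories, the "start" tag, the commit messages and the Example identity are made up, and the patches travel through a directory rather than by mail.

```shell
set -e
# Hypothetical identity so the commits below succeed on a fresh machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

top=$(mktemp -d)

# Author side: a base commit plus two changes to send out.
git init -q "$top/author"
cd "$top/author"
echo base > f.txt
git add f.txt
git commit -q -m "base"
git tag start
echo one >> f.txt
git commit -q -a -m "first change"
echo two >> f.txt
git commit -q -a -m "second change"

# One numbered patch file per commit made after 'start'.
git format-patch -q -o "$top/patches" start

# Receiver side: a copy of the project that only has the base commit.
git clone -q "$top/author" "$top/receiver"
cd "$top/receiver"
git reset -q --hard start

# Apply the patches in order, recreating each commit.
git am -q "$top/patches"/000*

history=$(git log --format=%s | tr '\n' ' ')
echo "$history"
```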

Other Patch Commands

git diff-tree

git apply

Tagging

svn tag -- git tag -- Name the current revision.

git tag -l PAT -- list all tags matching PAT. If PAT is omitted, all tags are listed.

git tag NAME [REV] -- tags the given revision (HEAD if not specified) with the given name. If a tag already exists by that name, you must specify -f to override it.

git tag -d NAME -- deletes the named tag.

git tags only name a particular revision, not a full branch. The easiest way to check that revision out is to place it on a new branch: git checkout -b linux-2.6.17-branch linux-2.6.17
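
The tag commands above, tried on a throwaway repository. This is a sketch: the tag names v1.0 and scratch, the branch name v1.0-branch and the Example identity are made up.

```shell
set -e
# Hypothetical identity so the commits below succeed on a fresh machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo 1 > f.txt
git add f.txt
git commit -q -m "first"
echo 2 >> f.txt
git commit -q -a -m "second"

git tag v1.0 HEAD^                # tag an older revision by name
git tag scratch                   # tag HEAD
git tag -d scratch > /dev/null    # and delete that tag again

tags=$(git tag -l 'v1*')          # list the tags matching a pattern
echo "$tags"

# Check the tagged revision out by putting it on a new branch.
git checkout -q -b v1.0-branch v1.0
at_tag=$(git log -1 --format=%s)
echo "$at_tag"
```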

Tags and push/pull: By default, git pulls tags from the remote repository if they are related to any branch heads that you are tracking. You can specify --tags to get all tags (maybe someone referred you to a revision marked by a tag whose branch has been deleted; you would need to run git pull --tags to retrieve the extra tags). Git doesn't push tags by default. You must name them on the command line. You can also push all the tags in your repository using git-push --tags or git-push all, but that's a sloppy way of going about it.

TODO: describe more about tags. Like, what's the difference between a lightweight tag and a heavyweight one?

Signing

git is very good about maintaining a cryptographic history of all changes made. If you want to change a revision in a git archive, you must remove all revisions after that, make the change, then replay the revisions. If you don't have the keys used to sign the revisions you deleted, then you're stuck.

To alter a checkin once someone else has signed a revision based on that checkin requires breaking the SHA1 hash. So, if your repo uses signed checkins, you can verify that all previous checkins are exactly what the author intended.

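Use -s on git tag to create a GPG-signed tag, and -S (capital) on git commit to GPG-sign a commit. Note that git commit -s (lowercase) only adds a Signed-off-by line to the commit message; it is not a cryptographic signature.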

Nice Tricks

this article is great.

git grep -- searches for the given string. By default searches all tracked files in the current branch. Specify another branch or revision to only search in that.

gitk -- graphical git browser. accepts git specifications, so to see all changes under drivers in the last two weeks, run gitk --since="2 weeks ago" drivers/. Control-minus reduces the font size, control-+ increases it.

git tar-tree REV [LEADING] > FILE -- tars up the specified tree. No need for manual steps, no need to clean your directory of intermediate files. REV can be a revname, a tag, a branch, etc. LEADING gives the directory/directories that should contain the git contents. By default the output goes to stdout so you'll probably want to redirect it to a file. git tar-tree has since been replaced by git archive: git archive --format=tar --prefix=LEADING/ REV > FILE.

git-svnimport -- git is very good at importing cvs and svn repos, but see this for the dangers involved. Thanks to the cryptographic security, you must get everything right the first time. There's no going back and fixing things. See the git site for more.

How do I separate the committer of a patch and the author of a patch? Either pass --author="Name <email>" to git commit, or set the GIT_AUTHOR_NAME and GIT_AUTHOR_EMAIL environment variables before committing.

.gitignore: if you have files that you want to remain in your working directory without warnings, but never check into your repository, name them in a .gitignore file one per line. Works just like .cvsignore. Supports shell-globbing wildcards (*.ps). You can also name the files in .git/info/exclude (but since this file isn't checked into the repository it can't be shared via patches). See EXCLUDE PATTERNS in the help for git ls-files.
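
A small sketch of .gitignore patterns at work, using git ls-files to show which files git considers untracked and which it considers ignored. The file names are made up for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# One pattern per line, shell-glob style.
printf '%s\n' '*.ps' '*.o' > .gitignore
touch paper.tex paper.ps main.o

# Untracked files that are NOT ignored (what git status would report).
untracked=$(git ls-files --others --exclude-standard)
# Untracked files that match an ignore pattern.
ignored=$(git ls-files --others --ignored --exclude-standard)

echo "untracked:"
echo "$untracked"
echo "ignored:"
echo "$ignored"
```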

How do I send a patch to somebody?

git format-patch -k ${commit before you changed anything}

Publish your Repository

publishing

Convert From Subversion

Svn Import   (and a failed idea: svn-copy)

Working with Subversion Upstream

git-svn

"Missing" Functionality

By default, git does not adjust your line endings. Your editor should be able to work with whatever line endings it's given. (Conversion is available as an opt-in through core.autocrlf and .gitattributes.)

git does not offer CVS-style keyword expansion, apart from the limited $Id$ substitution enabled by the ident attribute in .gitattributes. If you really want this sort of brain damage, you can do your own keyword expansion with the hook scripts. Warning: it's not easy to get 100% right (and that right there is a good reason why it should never be part of an SCM).

To Look At

http://primates.ximian.com/~federico/misc/git-cheat-sheet.txt