Working with Git

From genomewiki

Git structure

Definitions

central repository – The main repository we all share. This is where we push files for everyone to use. Contains a history of everyone’s commits that have been pushed.

local repository – Your main repository. This is where you push files from. Contains a history of all your commits for each branch.

branch – A version of the files stored in your repository. You can have many branches in your local repository, each with its own history of commits. Your main branch is typically called master.

staging area – The place where git keeps track of things you have added but not yet committed.

working directory – Your actual files in your ~/kent/ directory.

Terms that can refer to multiple things

history – both your local repository and the central repository have a history. This history contains all the commits you have done as well as any merges, including a ‘git pull’ (this is why you see "Merge branch 'master' of …" in git log).

HEAD – names the current branch you are on, typically master. It is often convenient to use HEAD instead of having to name the specific branch you are on.

master – the main branch of either the central repository or the local repository. Can be used to point to the last commit ID of your master branch.

Synonyms

central repository = shared repository = origin/master


local repository = your repository = master


staged = staging area = staged changes = cached = commit list = index


working directory = sandbox

The commands

Making changes

'git add fileName'

Use this command to tell git that you are interested in "saving" this file in your local repository. You can do multiple git adds to a file as you change it. This is a great way to save a file that you are making lots of edits to before making a final commit. Adding a file does not change your history.

'git rm fileName'

Removes a file from your working directory and staging area. After you do a 'git rm fileName' you still need to do a 'git commit fileName' in order to remove it from your repository. Use -r to remove a directory and its files recursively.

'git commit -m "some message" fileName(s)'

Commits the specified file(s) to your local history. If you do not specify a file, it will commit everything in your staging area (i.e. anything to which you have done a git add or git rm, etc). If you try to commit a file that is not already in your local repository without adding it first, git commit will complain.
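As a minimal sketch of the three commands above, the following runs them in a throwaway repository. The temporary directory, the file name notes.txt and the user identity are made up for illustration and are not part of the kent tree.

```shell
set -e
# Scratch repository in a temporary directory (hypothetical, not ~/kent).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first draft" > notes.txt
git add notes.txt                       # stage the file; history is unchanged
echo "second draft" >> notes.txt
git add notes.txt                       # stage the newer version as well
git commit -q -m "Add notes" notes.txt  # record it in the local history

git rm -q notes.txt                     # gone from working directory and staging area
git commit -q -m "Remove notes"         # no file named: commits everything staged

commit_count=$(git rev-list --count HEAD)
echo "commits in history: $commit_count"
```

Note that the second 'git add' is what makes the "second draft" line part of the first commit; the removal only becomes part of the history with the second commit.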

Fixing un-wanted changes

It is possible to fix unwanted changes in three areas: your working directory, your staging area and your repository (i.e. your commit history). It is important to never modify commits that are already in the central repository (i.e. ones you pulled from others or ones you have pushed). Once a commit is in the shared repository, you have to do some very complicated maneuvers to un-do it, which are not covered here.

'git checkout HEAD fileName'

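Resets that one file in your working directory (and in your staging area) back to the version in HEAD. To reset your whole working directory, give a directory instead of a file, e.g. 'git checkout HEAD -- .' from the top of your repository. Note that 'git checkout HEAD' with no path at all does not discard your edits.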

'git commit --amend -m "some message" fileName'

Replaces the most recent commit in your history with a new commit that also includes the specified file. Can also be used just to re-do the commit message. Only do this if the commit has not been pushed to the central repository.
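The two fixes above can be tried out safely in a scratch repository. This is only a sketch: the temporary directory, notes.txt, the identity and the deliberately misspelled commit message are all invented for the example.

```shell
set -e
# Scratch repository in a temporary directory (hypothetical, not ~/kent).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "good line" > notes.txt
git add notes.txt
git commit -q -m "Add notse"            # typo in the message on purpose

echo "unwanted edit" >> notes.txt
git checkout HEAD -- notes.txt          # throw the edit away; file matches HEAD again

git commit -q --amend -m "Add notes"    # replace the last commit to fix its message
last_message=$(git log -1 --format=%s)
commit_count=$(git rev-list --count HEAD)
echo "$commit_count commit(s), newest message: $last_message"
```

The history still holds a single commit afterwards, because the amended commit takes the place of the original rather than being added on top of it.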

'git reset --[option] HEAD'

Your main friend for backing out changes. There are three useful options for this command: --soft, --mixed (default) and --hard. Here is what each of the options does (an X in a column means that the option resets that particular thing back to the commit you name, which in the command above is HEAD):

            HEAD of local     staging     working
            repository        area        directory
--hard           X               X            X
--mixed          X               X
--soft           X

  git reset --soft HEAD   # does nothing

Note that the '--soft' option in this case doesn't actually do anything since you are resetting the HEAD of your local repository back to itself.

The '--soft' option is useful if you are trying to back out the very last commit.

Only do this if it has not been pushed to the central repository.

To back your very last commit out use:

  git reset --soft HEAD^  

^ is short for ^1 which is the first parent. This will leave your staging area and working directory unchanged.

 git reset --mixed HEAD

--mixed clears the staging area.

 git reset --hard HEAD

--hard resets the working directory as well as the staging area. WARNING: any uncommitted changes in your working directory are lost.
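To see the difference between the three options, the sketch below applies them one after another in a scratch repository and records what is left staged or modified after each step. The directory, file name and identity are assumptions made for the example.

```shell
set -e
# Scratch repository in a temporary directory (hypothetical, not ~/kent).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"
echo "two" >> notes.txt
git commit -q -m "Second commit" notes.txt

git reset -q --soft HEAD^       # back out the last commit; its change stays staged
staged_after_soft=$(git diff --cached --name-only)

git reset -q --mixed HEAD       # clear the staging area; the edit stays in the file
staged_after_mixed=$(git diff --cached --name-only)
unstaged_after_mixed=$(git diff --name-only)

git reset -q --hard HEAD        # WARNING: the uncommitted edit is lost here
content_after_hard=$(cat notes.txt)

echo "staged after --soft:    $staged_after_soft"
echo "staged after --mixed:   $staged_after_mixed"
echo "modified after --mixed: $unstaged_after_mixed"
echo "file after --hard:      $content_after_hard"
```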

Getting information about commits and your staging area

'git show [--stat] commitID[:path/fileName]'

One of the more useful commands. Since git keeps track of files, trees and commits by taking the entire contents of all the files and creating a SHA1 name of them, you can use git show to see exactly what was committed. You can do this for an entire commit, or you can specify just one file of a commit. The '--stat' option will suppress the actual changes and instead list the files that were committed.
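A short sketch of both forms, run in a scratch repository (the directory, notes.txt and the identity are made up for the example):

```shell
set -e
# Scratch repository in a temporary directory (hypothetical, not ~/kent).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

git show --stat HEAD                       # the commit plus the list of files in it
file_at_head=$(git show HEAD:notes.txt)    # one file exactly as it was committed
echo "notes.txt at HEAD contains: $file_at_head"
```

Here HEAD stands in for the commitID; any commit ID or abbreviation works the same way.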

'git diff'

A multipurpose command that lets you see the diffs between many different areas. Here are some useful options:

'git diff' - see changes in your working directory that you haven't added/removed.

'git diff --cached' - see changes that are added/removed but not committed.

'git diff HEAD' - see changes between your working directory and HEAD.

'git diff --stat' - see just list of files that have changed.

'git diff fileName' - see changes that have happened to a specific file or directory.

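'git diff commitID commitID' - see differences between two commits. In the equivalent two-dot form, 'git diff commitID..commitID', a commitID omitted on one side has the same effect as using HEAD instead. 'git diff commitID' with a single commit compares your working directory with that commit.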
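The first three forms are easy to mix up, so here is a sketch that sets up one staged change and one unstaged change and asks each form which files it sees. The file names a.txt and b.txt, the temporary directory and the identity are invented for the example.

```shell
set -e
# Scratch repository in a temporary directory (hypothetical, not ~/kent).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "alpha" > a.txt
echo "beta" > b.txt
git add a.txt b.txt
git commit -q -m "Add a and b"

echo "staged change" >> a.txt
git add a.txt                              # a.txt: added but not committed
echo "unstaged change" >> b.txt            # b.txt: edited but not added

unstaged=$(git diff --name-only)           # working directory vs staging area
staged=$(git diff --cached --name-only)    # staging area vs HEAD
all_changes=$(git diff --name-only HEAD)   # working directory vs HEAD

echo "git diff sees:          $unstaged"
echo "git diff --cached sees: $staged"
git diff --stat HEAD                       # summary of both changes
```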

'git log [path/fileName]'

Shows the history (commits/merges) of your entire local repository or just for one file.

The listing is in reverse chronological order.

Ways to limit the output:

-10 means stop after listing 10 commits
--after=<date> 
--before=<date>
--author=<pattern>
--no-merges

You can use --stat to see what files were changed in each commit listed.
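As an illustration, the sketch below builds a three-commit history in a scratch repository and then limits the log in a few of the ways listed above. The directory, notes.txt, the commit messages and the author name are assumptions for the example.

```shell
set -e
# Scratch repository in a temporary directory (hypothetical, not ~/kent).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3; do
    echo "change $n" >> notes.txt
    git add notes.txt
    git commit -q -m "Change $n"
done

git log -2 --no-merges --stat                  # two newest commits and their files
git log --author="Example" --oneline notes.txt # one file, filtered by author pattern

two_newest=$(git log -2 --format=%s)           # subjects only, newest first
echo "$two_newest"
```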

'git blame fileName'

Details who edited each line of a file. Useful option: -L <start>,<end> or -L <start>,+<offset> - prints the blame for only those lines.
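A small sketch of the line-range option, using a five-line file in a scratch repository (the directory, notes.txt and the identity are made up). Both commands below cover lines 2 through 4, since the offset counts lines starting at <start>.

```shell
set -e
# Scratch repository in a temporary directory (hypothetical, not ~/kent).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf '%s\n' a b c d e > notes.txt
git add notes.txt
git commit -q -m "Add notes"

by_range=$(git blame -L 2,4 notes.txt)    # lines 2 to 4, given as start,end
by_offset=$(git blame -L 2,+3 notes.txt)  # three lines starting at line 2
echo "$by_range"
```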

'git status'

Gives a report on the status of tracked and untracked files. Will also show how far ahead of the central repository you are since your last 'git fetch' or 'git pull'.

Keeping up with the central repository

'git pull'

Pulls in commits from the central repository made by others. 'git pull' (and 'git merge') will fail if:

a) you have ANY staged files. --> To get around this either commit your changes or un-stage your changes (see 'git reset').

b) you have local uncommitted changes in your working directory that overlap with files that git pull/git merge may need to update. --> To get around this use 'git stash' (see below).

'git stash'

Stashes away any changes in your staging area and working directory. Very useful if you are working on something and want to pull in the most recent changes. You can use it to resolve situation "b" above like so:

git pull
...
file foobar not up to date, cannot merge.
git stash
git pull
git stash pop

Note that you have to do a 'git stash pop' to get your half-baked changes back into your working directory and index. You may need to resolve conflicts if you pulled in someone else's changes to the same file you were working on. You should not use 'git stash' to store changes long-term; instead use a separate branch.
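The sequence above can be rehearsed without touching the real central repository. The sketch below is only an approximation of that setup: it builds a bare repository called central.git as a stand-in, plus two clones named mine and theirs, all in a temporary directory. Those names, the nine-line foobar file and the identity are assumptions for the example. The two edits touch different lines of foobar, so 'git stash pop' merges them without a conflict.

```shell
set -e
work=$(mktemp -d) && cd "$work"
# Helper (hypothetical) that gives a scratch repository a commit identity.
ident() {
    git -C "$1" config user.name "Example User"
    git -C "$1" config user.email "user@example.com"
}

# Stand-in central repository with one file, and two clones of it.
git init -q seed && ident seed
printf '%s\n' 1 2 3 4 5 6 7 8 9 > seed/foobar
git -C seed add foobar
git -C seed commit -q -m "Add foobar"
git clone -q --bare seed central.git
git clone -q central.git mine && ident mine
git clone -q central.git theirs && ident theirs

# Someone else changes the first line of foobar and pushes it.
printf '%s\n' one 2 3 4 5 6 7 8 9 > theirs/foobar
git -C theirs commit -q -m "Edit first line" foobar
git -C theirs push -q origin HEAD

# Meanwhile you have a half-baked, uncommitted edit to the last line.
cd mine
printf '%s\n' 1 2 3 4 5 6 7 8 nine > foobar

git pull -q 2>/dev/null || echo "pull refused: foobar has local changes"
git stash -q          # put the half-baked edit aside
git pull -q           # now the pull goes through
git stash pop -q      # bring the edit back on top of the pulled changes
cat foobar
```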

Resolving conflicts: This works much as it does in CVS. To resolve a conflict from a git pull or git merge, edit the file in your working directory, then 'git add' it and 'git commit'. You can use 'git diff' to see the changes that need to be resolved. Git will not let you commit until you have resolved your conflicts.

'git fetch'

Brings the most recent changes from the central repository into your local repository, without merging those changes with your local repository. It is useful if you want to see the changes others have been making and how they will affect your local repository before doing a 'git merge'.

'git merge'

Used to merge your local repository with the central repository after doing a 'git fetch'. Note that you may run into the same problems as 'git pull' if you have staged changes or git thinks there will be a conflict with your working directory (see above). Is also used to merge branches in your local repository.

'git push'

Pushes your changes out to the central repository. Will error if you aren't up to date with everyone else's changes. To fix this, do a 'git pull'.
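Here is a sketch of the fetch, merge and push cycle, again against a stand-in for the central repository rather than the real one. The names central.git, mine and theirs, the file names and the identity are all invented for the example, and '@{u}' is used to mean "the branch this one tracks" so that the sketch does not depend on what the main branch is called.

```shell
set -e
work=$(mktemp -d) && cd "$work"
# Helper (hypothetical) that gives a scratch repository a commit identity.
ident() {
    git -C "$1" config user.name "Example User"
    git -C "$1" config user.email "user@example.com"
}

# Stand-in central repository and two clones of it.
git init -q seed && ident seed
echo "base" > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "Base commit"
git clone -q --bare seed central.git
git clone -q central.git mine && ident mine
git clone -q central.git theirs && ident theirs

# Someone else pushes a new file.
echo "theirs" > theirs/theirs.txt
git -C theirs add theirs.txt
git -C theirs commit -q -m "Add theirs.txt"
git -C theirs push -q origin HEAD

# You commit a different file locally, so the two histories diverge.
cd mine
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "Add mine.txt"

git push -q origin HEAD 2>/dev/null || echo "push rejected: not up to date"
git fetch -q                                    # download, but do not merge
incoming=$(git log --format=%s 'HEAD..@{u}')    # what a merge would bring in
echo "incoming: $incoming"
git merge -q -m "Merge central changes" '@{u}'  # merge what was fetched
git push -q origin HEAD                         # succeeds now
```

Doing the fetch and merge as two steps gives the same end result as 'git pull', with a chance to look at the incoming commits in between.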

Other good things to know

Notation for specifying ancestors.

 X^1 means the first parent of X.  X can be HEAD, a commit ID, a branch, or a tag. 
 Merges have 2 or more parents. They would be referred to as X^1 and X^2.
 You can therefore specify any ancestor of X using this notation.
 X^2^1 would refer to the first parent of the second parent of X.
 ^ alone means ^1.  So a bunch of changes without merging would be:
 X^1^1^1 == X^^^ == X~3.
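The equivalence of the three spellings can be checked with 'git rev-parse', which prints the commit ID a name resolves to. The sketch below does this on a four-commit history in a scratch repository; the directory, notes.txt, the commit messages and the identity are made up for the example.

```shell
set -e
# Scratch repository in a temporary directory (hypothetical, not ~/kent).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3 4; do
    echo "change $n" >> notes.txt
    git add notes.txt
    git commit -q -m "Change $n"
done

long_form=$(git rev-parse 'HEAD^1^1^1')   # first parent, three times over
carets=$(git rev-parse 'HEAD^^^')         # same thing, ^ meaning ^1
tilde=$(git rev-parse 'HEAD~3')           # same thing, counted
third_ancestor=$(git log -1 --format=%s 'HEAD~3')
echo "HEAD~3 is the commit with message: $third_ancestor"
```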

Note that because HEAD is a variable which points to the most recent commit ID, you can also just specify a commit ID instead of "HEAD" in any command. Similarly, you can do a commit ID~1 or commit ID~2, just like you would do with HEAD.

Several commands that change things (for example git add, git rm, git commit, git push and git fetch) accept --dry-run, which shows what would happen without actually doing it.

You can use 27f3e63 as an abbreviation for 27f3e639263d5bb0c018d5ec7cf6633c2ccd7e07. You can use just the first 7 or so characters as an abbreviated hash ID in any of the git commands. As long as the abbreviated hash ID is unique, git will expand it for you. Of course an abbreviation that was unique at one time is not guaranteed to be unique in the future.
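To see the expansion at work, the sketch below asks git for an abbreviated ID of a commit and then resolves that abbreviation back to the full ID. It runs in a scratch repository; the directory, notes.txt and the identity are assumptions for the example.

```shell
set -e
# Scratch repository in a temporary directory (hypothetical, not ~/kent).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

full=$(git rev-parse HEAD)              # full commit ID
short=$(git rev-parse --short HEAD)     # abbreviation that git considers unique
expanded=$(git rev-parse "$short")      # git expands it back to the full ID
echo "$short expands to $expanded"
```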

Useful resources

Git Community Book

Git User Manual

Git Man Pages