Working with Git

From genomewiki
Revision as of 22:04, 10 August 2010 by Marygoldman

Git structure

Definitions

central repository – The main repository we all share. This is where we push files for everyone to use. Contains a history of everyone’s commits that have been pushed.

local repository – Your main repository. This is where you push files from. Contains a history of all your commits for each branch.

branch – A line of development in your repository: a set of files together with its own history of commits. You can have many branches in your local repository. Your main branch is typically called master.

staging area – The place where things you have added but not yet committed live.

working directory – Your actual files in your ~/kent/ directory.

Terms that can refer to multiple things

history – both your local repository and the central repository have a history. This history contains all the commits you have done as well as any merges, including a ‘git pull’ (this is why you see "Merge branch 'master' of …" in git log).

master – the main branch of either the central repository or the local repository. Can be used to point to the last commit ID of your master branch.

HEAD – best to think of HEAD as a variable. It points to the last commit ID of the branch you are working on, which in most cases is the master branch. This is convenient because you don't have to remember where you are. You should use 'HEAD' by default rather than 'master'.
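To see HEAD and master behaving like variables that hold a commit ID, here is a minimal sketch. It works in a scratch repository made with mktemp rather than in ~/kent/, and the user name, e-mail and file name are made up for the illustration.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                        # scratch directory, not your real ~/kent/
git init -q
git symbolic-ref HEAD refs/heads/master  # make sure the first branch is called master

echo "one" > notes.txt
git add notes.txt
git commit -q -m "first commit"

head_id=$(git rev-parse HEAD)                    # commit ID that HEAD points to
master_id=$(git rev-parse master)                # commit ID that master points to
current_branch=$(git symbolic-ref --short HEAD)  # branch that HEAD follows
echo "HEAD=$head_id master=$master_id branch=$current_branch"
```

While you are on master, both names resolve to the same commit ID.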

Synonyms

central repository = shared repository = origin/master


local repository = your own repository = master


staged = staging area = staged changes = cached = commit list = index


working directory = sandbox

The commands

Making changes: 'git add', 'git rm', 'git commit'

'git add fileName'

Use this command to tell git that you are interested in "saving" this file in your local repository. You can do multiple git adds to a file as you change it. It is important to do this after every change you make to the file in your working directory. This is a great way to save a file that you are making lots of little edits to before making a final commit. Adding a file does not change your history.

'git rm fileName'

Removes a file from your working directory and staging area. Like 'git add', the removal still needs to be committed. Use -r to recursively remove a directory and the files in it.

'git commit -m "some message" fileName'

Commits the file(s) to your local history. If you do not specify a file, it will commit everything in your staging area (i.e. anything you have done a 'git add' to), which is not necessarily recommended. If you name a file that git already tracks but you have not yet added your latest changes to it, the commit will pick them up automatically; a brand-new file still needs a 'git add' first.
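The add/commit cycle described above can be tried out in a scratch repository. This is only a sketch: the file names and the identity are hypothetical, and it runs in a temporary directory instead of ~/kent/.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                          # scratch directory
git init -q

echo "version 1" > tracked.txt
git add tracked.txt                        # put the file in the staging area
staged=$(git diff --cached --name-only)    # what is staged right now
git commit -q -m "add tracked.txt"

# Change a file git already tracks and commit it by name, without 'git add'.
echo "version 2" >> tracked.txt
git commit -q -m "update tracked.txt" tracked.txt
commits=$(git rev-list --count HEAD)

# A file git has never seen cannot be committed by name alone.
echo "new" > untracked.txt
if git commit -q -m "try a new file" untracked.txt 2>/dev/null; then
    new_file_committed=yes
else
    new_file_committed=no
fi
echo "staged=$staged commits=$commits new_file_committed=$new_file_committed"
```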

Fixing un-wanted changes: 'git commit --amend', 'git reset', 'git checkout'

'git commit --amend -m "some message" fileName'

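Replaces the most recent commit in your history with a new commit that includes these changes. In effect, you overwrite the last commit with this new commit. Can also be used just to re-do the commit message. Only amend commits that have not been pushed to the central repository.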
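A small sketch of what amending does to the history, again in a scratch repository with made-up file names and commit messages: the number of commits stays the same, but the last commit gets a new ID, message and content.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                          # scratch directory
git init -q

echo "a" > report.txt
git add report.txt
git commit -q -m "first commit"

echo "b" >> report.txt
git commit -q -m "secnd commit, with a typo" report.txt
before_id=$(git rev-parse HEAD)

# Fix the message and fold one more edit into that same commit.
echo "c" >> report.txt
git commit -q --amend -m "second commit" report.txt
after_id=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)         # still two commits in the history
subject=$(git log -1 --format=%s)          # message of the amended commit
echo "count=$count subject=$subject"
```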

'git reset --[option] HEAD'

Your main friend for backing out changes. There are three useful options for this command: --soft, --mixed (default) and --hard. Here is what each option does (an X in a column means that the option resets that particular thing back to the HEAD of your local repository):

             HEAD of local      staging      working
             repository         area         directory
--hard           X                 X             X
--mixed          X                 X
--soft           X

Note that the '--soft' option in this case doesn't actually do anything since you are resetting the HEAD of your local repository back to itself. The '--soft' option is only useful if you are trying to back out the very last commit you made that has not been pushed to the central repository. To back your very last commit out use:

  'git reset --soft HEAD~1'

where "1" is the number of commits you want to go back. You can go farther back with ~2, ~3, etc. (to save typing, some people write HEAD^, which is the same as HEAD~1). You can also do a 'git reset --[option] HEAD^' with --mixed or --hard; just keep in mind that --mixed removes that commit's changes from your local repository and the staging area, while --hard removes them from your local repository, the staging area and your working directory (see table above).
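The three reset options can be compared side by side in a scratch repository. This is a sketch with a hypothetical file.txt, and it checks after each reset which of the three areas still holds the change.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                               # scratch directory
git init -q

echo "line 1" > file.txt
git add file.txt
git commit -q -m "commit 1"
echo "line 2" >> file.txt
git commit -q -m "commit 2" file.txt

# --soft: the last commit is backed out, its change stays staged and on disk.
git reset -q --soft HEAD~1
soft_commits=$(git rev-list --count HEAD)
soft_staged=$(git diff --cached --name-only)

# --mixed (default): the staging area is reset too, the edit stays on disk.
git reset -q --mixed HEAD
mixed_staged=$(git diff --cached --name-only)
mixed_worktree=$(git diff --name-only)

# --hard: the working directory is reset as well, so the edit is gone.
git reset -q --hard HEAD
hard_lines=$(grep -c "line" file.txt)
echo "soft_staged=$soft_staged mixed_worktree=$mixed_worktree hard_lines=$hard_lines"
```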

'git checkout HEAD fileName'

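Resets the working directory copy (and the staged copy) of that one file back to HEAD. To reset every file, give a directory instead of a file, for example 'git checkout HEAD .' from the top of your working directory; 'git checkout HEAD' with no path leaves your changes in place.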

-- Note that because HEAD is a variable which points to the most recent commit ID, you can also just specify a commit ID instead of "HEAD". Similarly, you can do a commit ID~1 or commit ID~2, just like you would do with HEAD.
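Here is a sketch of backing out edits with 'git checkout HEAD', first for one file and then for everything under the current directory by passing '.' as the path. The files a.txt and b.txt are hypothetical and live in a scratch repository.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                 # scratch directory
git init -q

echo "committed" > a.txt
echo "committed" > b.txt
git add a.txt b.txt
git commit -q -m "two files"

# Scribble over both files without committing.
echo "scratch" > a.txt
echo "scratch" > b.txt

git checkout -q HEAD a.txt        # put back just a.txt
a_after=$(cat a.txt)
b_before=$(cat b.txt)             # b.txt still has the scratch edit

git checkout -q HEAD -- .         # put back everything under this directory
b_after=$(cat b.txt)
echo "a_after=$a_after b_before=$b_before b_after=$b_after"
```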

Getting information about commits and your staging area: 'git show', 'git diff', 'git log', 'git status'

'git show [--stat] commitID[:path/fileName]'

One of the more useful commands. Git names files, trees and commits by the SHA1 of their entire contents, so you can use 'git show' to see exactly what was committed. You can do this for an entire commit, or you can specify just one file of a commit. Also note that, as with 'git reset', you can substitute HEAD for the commitID and can also do things like HEAD~3. The '--stat' option will suppress the actual changes and instead list the files that were committed.
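For example, 'git show' can print one file as it was in an older commit, or just list the files in a commit. The sketch below assumes a scratch repository and a hypothetical docs/readme.txt.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                                   # scratch directory
git init -q

mkdir docs
echo "hello" > docs/readme.txt
git add docs/readme.txt
git commit -q -m "add readme"
echo "hello again" > docs/readme.txt
git commit -q -m "update readme" docs/readme.txt

old_content=$(git show HEAD~1:docs/readme.txt)      # the file one commit back
new_content=$(git show HEAD:docs/readme.txt)        # the file as of the last commit
stat_output=$(git show --stat --format=%s HEAD)     # subject line plus list of files
echo "$stat_output"
```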

'git diff'

A multipurpose command that lets you see the diffs between many different areas. Here are some useful options:

'git diff' - see changes in your working directory that you haven't added/removed.

'git diff --cached' - see changes that are added/removed but not committed.

'git diff HEAD' - see changes between your working directory and HEAD.

'git diff --stat' - see just list of files that have changed.

'git diff fileName' - see changes that have happened to a specific file or directory.

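'git diff commitID commitID' - see differences between two commits. With the 'commitID..commitID' form, omitting the commitID on one side has the same effect as using HEAD instead; 'git diff commitID' on its own compares your working directory against that commit.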
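To see which area each form of 'git diff' looks at, here is a sketch with one staged change and one unstaged change. The file names are hypothetical, and --name-only is used so that only the file names are printed.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                                  # scratch directory
git init -q

echo "one" > staged.txt
echo "one" > unstaged.txt
git add staged.txt unstaged.txt
git commit -q -m "start"

echo "two" >> staged.txt
git add staged.txt                                 # this change is staged
echo "two" >> unstaged.txt                         # this change is not

worktree_only=$(git diff --name-only)              # changes not yet added
staged_only=$(git diff --cached --name-only)       # changes added, not committed
against_head=$(git diff --name-only HEAD)          # everything that differs from HEAD
echo "worktree_only=$worktree_only staged_only=$staged_only"
```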

'git log [path/fileName]'

Shows the history (commits/merges) of your entire local repository or just for one file. Ways to limit the output: -<n> (limit to the last n commits), --after=<date>, --before=<date>, --author=<pattern>, --no-merges. You can use 'git log --stat' to see a list of commits along with what files were committed.
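A few of these limits in action, as a sketch in a scratch repository with three made-up commits. The --format=%s option is an addition here to keep the output down to one subject line per commit.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                                   # scratch directory
git init -q

for n in 1 2 3; do
    echo "$n" >> log.txt
    git add log.txt
    git commit -q -m "commit $n"
done

last_two=$(git log -2 --format=%s)                  # only the two newest commits
by_author=$(git log --author="Example" --format=%s | grep -c "commit")
file_history=$(git log --no-merges --format=%s -- log.txt | grep -c "commit")
echo "$last_two"
```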

'git status'

Gives a report on the status of tracked and untracked files. Will also show how far ahead of the central repository you are since your last 'git fetch' or 'git pull'.

Keeping up with the central repository: 'git pull', 'git stash', 'git fetch', 'git merge', 'git push', 'git blame'

'git pull'

Pulls in commits from the central repository made by others. 'git pull' (and 'git merge') will fail if:

a) you have ANY staged files. --> To get around this either un-stage your changes with a 'git reset --mixed' or commit your changes.

b) you have local uncommitted changes in either your working directory or staging area that overlap with files that git pull/git merge may need to update. --> To get around this use 'git stash' (see below).

'git stash' -- stashes away any changes in your staging area and working directory. Very useful if you are working on something and want to pull in the most recent changes. You can use it to resolve situation "b" above like so:

'git pull'
...
file foobar not up to date, cannot merge.
'git stash'
'git pull'
'git stash pop'

Note that you have to do a 'git stash pop' to get your half-baked changes back into your working directory and index. You may need to resolve conflicts if you pulled in someone else's changes to the same file you were working on. You should not use 'git stash' to store changes long-term; instead use a separate branch.
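The stash/pull/pop sequence can be rehearsed without touching the real central repository. The sketch below builds a hypothetical central.git with two clones, alice and bob, all in a temporary directory; alice and bob edit different lines of foobar, so the 'git stash pop' applies cleanly.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                          # scratch directory

# Stand-in for the central repository, plus two clones of it.
git init -q seed
( cd seed &&
  printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > foobar &&
  git add foobar && git commit -q -m "initial foobar" )
git clone -q --bare seed central.git
git clone -q central.git alice
git clone -q central.git bob

# alice changes the first line and pushes.
( cd alice &&
  sed 's/line 1/line 1 (alice)/' foobar > foobar.tmp && mv foobar.tmp foobar &&
  git commit -q -m "alice edits line 1" foobar &&
  git push -q origin HEAD )

# bob has a half-baked, uncommitted edit to the last line of the same file.
cd bob
sed 's/line 5/line 5 (bob)/' foobar > foobar.tmp && mv foobar.tmp foobar

if git pull -q 2>/dev/null; then first_pull=ok; else first_pull=failed; fi
git stash -q                               # tuck the edit away
git pull -q                                # now the pull goes through
git stash pop -q                           # bring the edit back
result=$(cat foobar)
echo "first_pull=$first_pull"
echo "$result"
```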

Resolving conflicts: This is very similar to CVS. To resolve a conflict from a 'git pull' or 'git merge', edit the file in your working directory, 'git add' it and then run 'git commit'. You can use 'git diff' to see the changes that need to be resolved. Git will not let you commit until you have resolved your conflicts.
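Here is a sketch of a conflict and its resolution. To keep it short, it merges a local branch (hypothetically named other) instead of pulling from the central repository; the edit, 'git add' and 'git commit' steps are the same either way.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                                    # scratch directory
git init -q

echo "shared line" > conflict.txt
git add conflict.txt
git commit -q -m "base"
main_branch=$(git symbolic-ref --short HEAD)

# Two branches change the same line in different ways.
git checkout -q -b other
echo "their version of the line" > conflict.txt
git commit -q -m "their change" conflict.txt
git checkout -q "$main_branch"
echo "my version" > conflict.txt
git commit -q -m "my change" conflict.txt

if git merge -q other >/dev/null 2>&1; then merge_state=clean; else merge_state=conflict; fi
unmerged=$(git diff --name-only --diff-filter=U)     # files that still need resolving

# Git refuses to commit while the conflict is unresolved.
if git commit -q -m "too early" >/dev/null 2>&1; then early_commit=allowed; else early_commit=refused; fi

# Resolve by editing the file, then add and commit.
echo "my version combined with theirs" > conflict.txt
git add conflict.txt
git commit -q -m "merge other, resolving conflict.txt"
second_parent=$(git rev-parse HEAD^2)                # the merge records the other branch
other_tip=$(git rev-parse other)
echo "merge_state=$merge_state unmerged=$unmerged early_commit=$early_commit"
```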

'git fetch'

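Downloads new commits from the central repository without touching your working directory or your branches. The most recently fetched commit is recorded in a separate reference called FETCH_HEAD, which tracks the central repository. A 'git pull' is a 'git fetch' followed by a 'git merge' of FETCH_HEAD into HEAD. After a fetch, 'git status' shows how far your branch is ahead of or behind the central repository, and 'git diff HEAD FETCH_HEAD' shows what others have changed, which is useful if you think other people are working on something similar. If you have uncommitted changes, you may still have to 'git stash' them before merging the fetched changes in.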

'git push'

Pushes your changes out to the central repository. Will error if you aren't up to date with everyone else's changes. To fix this, do a 'git pull'.
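The rejected push and its fix can be rehearsed against a stand-in central repository. In this sketch central.git and the clones alice and bob are hypothetical; the --no-rebase and --no-edit flags on 'git pull' are additions so that the script runs unattended (newer versions of git otherwise ask how to reconcile the two histories, or open an editor for the merge message).

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                          # scratch directory

# Stand-in for the central repository, plus two clones of it.
git init -q seed
( cd seed && echo "base" > base.txt && git add base.txt && git commit -q -m "base" )
git clone -q --bare seed central.git
git clone -q central.git alice
git clone -q central.git bob

# alice gets her commit into the central repository first.
( cd alice && echo "alice" > alice.txt && git add alice.txt &&
  git commit -q -m "alice adds a file" && git push -q origin HEAD )

# bob commits without having alice's change, so his push is refused.
cd bob
echo "bob" > bob.txt
git add bob.txt
git commit -q -m "bob adds a file"
if git push -q origin HEAD 2>/dev/null; then first_push=ok; else first_push=rejected; fi

git pull -q --no-rebase --no-edit          # merge in alice's commit
git push -q origin HEAD                    # now the push is accepted

central_tip=$(git --git-dir=../central.git rev-parse HEAD)
bob_tip=$(git rev-parse HEAD)
echo "first_push=$first_push"
```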

'git blame fileName'

Shows who last edited each line of a file. Useful option: -L <start>,<end> or -L <start>,+<offset>, which prints the blame for only those lines.
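Both ways of writing the -L range are sketched below on a hypothetical four-line poem.txt in a scratch repository. Each command blames lines 2 and 3 only.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                                   # scratch directory
git init -q

printf 'first\nsecond\nthird\nfourth\n' > poem.txt
git add poem.txt
git commit -q -m "add poem"

range_output=$(git blame -L 2,3 poem.txt)           # from line 2 to line 3
offset_output=$(git blame -L 2,+2 poem.txt)         # two lines starting at line 2
range_lines=$(echo "$range_output" | grep -c "")
offset_lines=$(echo "$offset_output" | grep -c "")
echo "$range_output"
```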


Other good things to know:

Several commands (for example 'git add', 'git rm', 'git commit', 'git push' and 'git fetch') accept --dry-run to show what would happen if you ran them for real.
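As a sketch of a dry run, 'git add --dry-run' reports the file it would add but leaves the staging area alone. The file names are hypothetical and the repository is a scratch one.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                                      # scratch directory
git init -q

echo "tracked" > old.txt
git add old.txt
git commit -q -m "start"

echo "new" > new.txt
dry_output=$(git add --dry-run new.txt)                # says what would be added
staged=$(git diff --cached --name-only)                # nothing was actually staged
untracked=$(git ls-files --others --exclude-standard)  # new.txt is still untracked
echo "$dry_output"
```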

You can use just the first 7 or so characters of a hash ID as a partial hash ID in any of the above commands. As long as the partial hash ID is unique, git will expand it for you.
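A quick sketch of a partial hash ID being expanded. It uses 'git rev-parse --short=7' to produce the short form (an assumption for the illustration; typing the first characters by hand works the same way) in a scratch repository with a made-up file.

```shell
set -eu
# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
cd "$(mktemp -d)"                                 # scratch directory
git init -q

echo "hi" > hash.txt
git add hash.txt
git commit -q -m "hash demo"

full_id=$(git rev-parse HEAD)                     # the complete hash ID
short_id=$(git rev-parse --short=7 HEAD)          # a unique prefix, 7 or more characters
expanded=$(git rev-parse "$short_id")             # git expands it back to the full ID
subject=$(git log -1 --format=%s "$short_id")     # the prefix works in other commands too
echo "short=$short_id full=$expanded"
```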