Version Control with Git and GitHub

Git and GitHub provide functionality for version control. This allows you to work on new versions of a program or code without having to modify an existing functional copy: the functional copy can be left untouched while a new feature is being developed.

Git is the underlying tool that implements version control locally. Using Git to manage different versions of your files makes it easy to implement changes and to recover previous copies if a mistake is made: you can return to a previous version of a single file or of the whole project directory. Git is also useful for collaborative work, as multiple users can work on different aspects of a project simultaneously. Git was developed by the people who maintain the Linux kernel so that hundreds or more people could edit the code simultaneously. Our use is a little more subdued, but helpful nonetheless.

An extensive resource for Git is the book Pro Git by Scott Chacon and Ben Straub, which is freely available online. The Atlassian tutorial is also a good resource and reference.

GitHub (www.github.com) is a mostly free service that allows you to host your Git repositories on the internet. This enables you to access your projects from any computer. GitHub also offers some additional features, such as source code viewers, webpage hosting, and something of a social network for programmers and techies. The computational tools developed at the Mouse Imaging Centre are hosted on our GitHub page.



Basic Usage

Version control with Git and GitHub is organized around repositories. The repository hosted on GitHub is the remote copy, conventionally named origin. To work with a repository on your computer, you clone it, which creates a local copy. Local copies can exist on multiple machines. Files and directories are modified in the local copy, but changes made locally are not immediately reflected in the remote copy: they first need to be tracked, committed, and pushed to the remote before they show up on GitHub. Conversely, to keep a local copy up to date with the origin, new changes in the remote copy need to be pulled into the local repository.

Creating Repositories

New repositories can be created either remotely or locally.

A new remote repository can be created directly on GitHub. In order to work with this repository, a local copy needs to be cloned. The clone is created using the repository ID, i.e. the clone address shown on the repository's GitHub page:

git clone [repo ID]

For example, using the SSH form of the address:

git clone git@github.com:username/repo_name.git
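Cloning works the same way for any address Git understands, including a path on disk. The following is a minimal sketch that uses a throwaway bare repository as a stand-in for the GitHub copy, so it can be tried without an account; the directory names remote.git and local are made up for illustration.

```shell
set -e
tmp=$(mktemp -d)

# Stand-in for the GitHub copy: an empty bare repository on disk.
git init -q --bare "$tmp/remote.git"

# Clone it; the second argument names the local directory to create.
# Git warns that the repository is empty, which is expected here.
git clone -q "$tmp/remote.git" "$tmp/local"

# The clone remembers where it came from under the name "origin".
origin_url=$(git -C "$tmp/local" config --get remote.origin.url)
echo "$origin_url"
```

With a real GitHub repository, the only difference is the address passed to git clone.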


A repository can also be created locally before it is tied to an origin on Github. To create a repository from an existing local directory, use

git init

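To host this new local repository on GitHub, it needs to be linked to a remote repository (created empty on GitHub beforehand) and pushed. Version control already works locally without this step, and at least one commit must exist before there is anything to push. To do so: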

git remote add origin git@github.com:username/repo_name
git push -u origin master
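The whole sequence, from an existing directory to a first push, can be sketched as follows. This is an illustration under assumptions: a local bare repository stands in for the empty repository on GitHub, and the file name, user name, and e-mail address are placeholders.

```shell
set -e
tmp=$(mktemp -d)

# Stand-in for an empty repository created on GitHub.
git init -q --bare "$tmp/remote.git"

# An existing project directory with one file in it.
mkdir "$tmp/project"
cd "$tmp/project"
echo "# My project" > README.md

git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
git config user.name "Example User"        # placeholder identity for commits
git config user.email "user@example.com"

# At least one commit must exist before there is anything to push.
git add README.md
git commit -q -m "Add README"

# Link the local repository to the remote and push the master branch.
# -u records origin/master as the upstream, so a plain "git push" works later.
git remote add origin "$tmp/remote.git"
git push -q -u origin master

upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')
echo "$upstream"
```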


When working with RStudio, repositories can be created using an RStudio Project.

Making Changes

Changes are made in the local copy of a repository. In order to push these changes to the remote copy, they first have to be staged/tracked:

git add [filename1] [filename2]

To track every change made in the local repository at once, use the flag -A:

git add -A

Once changes have been tracked, they are committed:

git commit -m "Commit message"

Changes cannot be committed without a message. It is always good practice to indicate what changes have been made in the present commit.

Once changes are committed, they are pushed from the local copy to the remote copy:

git push
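As a rough illustration of the track, commit, push cycle, the sketch below makes two commits, one with a single named file and one with -A, and then pushes both. The bare repository standing in for GitHub, the file names, and the identity settings are all assumptions made for the example.

```shell
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"       # stand-in for the GitHub copy
git clone -q "$tmp/remote.git" "$tmp/local"
cd "$tmp/local"
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
git config user.name "Example User"        # placeholder identity for commits
git config user.email "user@example.com"

echo "x <- 1" > analysis.R
echo "remember to plot x" > notes.txt

# Track and commit one file only; notes.txt stays untracked for now.
git add analysis.R
git commit -q -m "Add analysis script"

# Track everything else that changed, then commit again.
git add -A
git commit -q -m "Add project notes"

# Send both commits to the remote copy.
git push -q origin master

git log --oneline
```

Keeping each commit to one logical change, as here, makes the history easier to read and to undo.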


At any point, the status of your local repository can be checked using

git status

This tells you:

  1. The name of the current branch
  2. Files that have been modified
  3. Changes that are or aren't staged for the next commit, along with any untracked files

In order to see what changes have been made to tracked files but not yet staged, use

git diff
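To see how git status and git diff respond to staging, here is a small sketch in a throwaway repository. The file names and identity settings are placeholders; --short and --staged are standard options of git status and git diff.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name "Example User"        # placeholder identity for commits
git config user.email "user@example.com"

printf 'line one\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Modify the tracked file and create a new, untracked one.
printf 'line two\n' >> notes.txt
echo "scratch" > scratch.txt

git status --short    # notes.txt is listed as modified, scratch.txt as untracked (??)
git diff              # shows the added "line two"; untracked files are not included

git add notes.txt
unstaged=$(git diff)            # empty: the change is now staged
staged=$(git diff --staged)     # the same change, seen from the staging area
echo "$staged"
```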


Suggestions for working with commits: First think of something that needs to be fixed, or a feature that needs to be implemented. Second, do the work to implement. Third, test that the changes you made are functional. Finally, track it and commit it.

Good practices for committing files:

  • Don't include files that are derived from other files in the repository (e.g. figures, markdowns, etc.)
  • Avoid committing binary files or really big files. Git works best with text files.
  • For big data files that are changing, track a text-based version, or use a separate Git repository for the data


Files in your project directory that should not be tracked are listed, by name or pattern, in a file called .gitignore. Each subdirectory in a repository can have its own .gitignore file.

To stop tracking a file without deleting it locally, use

git rm --cached [filename]

Once this change is committed and pushed, the file is removed from the current version on GitHub but stays on disk. If the filename is listed in the .gitignore file, the file won't be tracked going forward.
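A short sketch of untracking a derived file and ignoring it from then on. The file names (plot.R, figure.png) and identity settings are invented for the example, and the push to GitHub is left out.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name "Example User"        # placeholder identity for commits
git config user.email "user@example.com"

echo "plot(1:10)" > plot.R
echo "pretend image data" > figure.png     # a derived file that should not be tracked
git add -A
git commit -q -m "Add script and figure"

# Stop tracking the figure but keep it on disk, and ignore it in future.
git rm -q --cached figure.png
echo "figure.png" > .gitignore
git add .gitignore
git commit -q -m "Stop tracking derived figure"

tracked=$(git ls-files)
echo "$tracked"       # lists .gitignore and plot.R
ls figure.png         # the file itself is still there
```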


If the remote copy of your repository is ahead of the local copy, you need to pull the remote copy. On the master branch, this is done with

git pull origin master

Or simply

git pull


Note that git pull is shorthand for two steps, downloading the new commits from the remote and then merging them into the current branch:

git fetch
git merge
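The two steps can be observed separately. In the sketch below, two clones of a local bare repository (a stand-in for GitHub) play the role of two machines; the directory names alice and bob and the identity settings are made up. The remote and branch are spelled out as origin and origin/master to make explicit what is fetched and what is merged.

```shell
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"
git --git-dir="$tmp/remote.git" symbolic-ref HEAD refs/heads/master

# First machine: create a commit and push it.
git clone -q "$tmp/remote.git" "$tmp/alice"
cd "$tmp/alice"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity for commits
git config user.email "user@example.com"
echo "v1" > data.txt
git add data.txt
git commit -q -m "Add data"
git push -q origin master

# Second machine: clone, then fall behind when the first one pushes again.
git clone -q "$tmp/remote.git" "$tmp/bob"
echo "v2" >> data.txt
git commit -q -am "Update data"
git push -q origin master

cd "$tmp/bob"
git fetch -q origin                 # downloads the new commit; working files unchanged
before=$(wc -l < data.txt)
git merge -q origin/master          # brings the local master up to date
after=$(wc -l < data.txt)
echo "lines before merge: $before, after merge: $after"
```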


Branches

When working with Git and GitHub, separate lines of development, termed branches, can be created to develop a specific feature of a project or to work on a bug without modifying the existing stable version. The default branch in Git has traditionally been called master (repositories created on GitHub more recently name it main instead). It is generated automatically when a new repository is created. Like the master branch, other branches have both a local and a remote copy (origin/[branch name]), and changes that are made on the local copy have to be pushed to the remote copy of the branch. Changes created within a branch stay in that branch until the branch is merged back into the master.

To review existing branches:

git branch

To create a new branch:

git branch [branch name]

Switch between branches with

git checkout [branch name]

To push the branch to Github:

git push origin [branch name]

When work on a branch is finalized, it can be merged into the master copy of the repository. This is done by merging the local branch into the local master and then pushing the local copy of the master to the remote. To merge a branch back into the master, switch to the master branch and execute

git merge [branch name]

The changes then need to be pushed to the remote.
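The branch workflow described above can be sketched end to end in a throwaway repository. The branch name new-feature, the file names, and the identity settings are invented for the example, and the final push is omitted because no remote is set up here.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
git config user.name "Example User"        # placeholder identity for commits
git config user.email "user@example.com"

echo "stable code" > app.txt
git add app.txt
git commit -q -m "Initial stable version"

# Create a branch, switch to it, and commit work there.
git branch new-feature
git checkout -q new-feature
echo "feature code" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Back on master the new file does not exist yet.
git checkout -q master
ls

# Merge the finished branch into master; feature.txt now appears.
git merge -q new-feature
ls
```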


The converse can be done if changes were made to the master that are not on a given branch. Then the master can be merged into the branch by switching to the branch locally and running

git merge master


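Merging branches may result in a conflict if the same part of a file has been changed in different ways on the two branches. Git then pauses the merge and marks the conflicting sections in the file. The conflict must be resolved by editing the file, tracking it with git add, and committing before the merge is complete.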
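A conflict is easy to provoke on purpose, which is a safe way to practise resolving one. In this sketch, both branches change the same line of a file; the file name, its contents, the branch name feature, and the identity settings are all made up for illustration.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
git config user.name "Example User"        # placeholder identity for commits
git config user.email "user@example.com"

printf 'colour = blue\n' > config.txt
git add config.txt
git commit -q -m "Add config"

# Change the same line differently on two branches.
git checkout -q -b feature
printf 'colour = green\n' > config.txt
git commit -q -am "Use green"
git checkout -q master
printf 'colour = red\n' > config.txt
git commit -q -am "Use red"

# The merge stops and reports a conflict; "|| true" keeps the script running.
git merge feature || true
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted: $conflicted"

# Resolve by writing the intended content, then track and commit.
printf 'colour = green\n' > config.txt
git add config.txt
git commit -q -m "Merge feature, keep green"
```

In practice the resolution step means opening the file, choosing between the sections Git has marked, and removing the markers by hand.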

Once the purpose of a branch has been fulfilled, the branch can be deleted. To delete locally:

git branch -d [branch name]

To delete a remote branch:

git push origin --delete [branch name]
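Both deletions can be tried against a local stand-in for GitHub. This is a sketch with an invented branch name (old-feature), placeholder identity settings, and a bare repository on disk acting as the remote.

```shell
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"       # stand-in for the GitHub copy
git clone -q "$tmp/remote.git" "$tmp/local"
cd "$tmp/local"
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
git config user.name "Example User"        # placeholder identity for commits
git config user.email "user@example.com"

echo "code" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git push -q origin master

# Create a branch and publish it.
git checkout -q -b old-feature
git push -q origin old-feature
git checkout -q master

# Delete locally; -d only succeeds when the branch has no unmerged work.
git branch -d old-feature

# Delete the copy on the remote.
git push -q origin --delete old-feature

git branch -a         # old-feature is no longer listed
```

Note that a branch cannot be deleted while it is checked out, which is why the sketch switches back to master first.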


References

This content is based on the following references. More information and help can be found there.
