This short tutorial gets you started with GIT from zero. It is intended for a non-technical audience that wants to adopt GIT as an efficient way to track a project. Here are some slides on the topic: https://users.aalto.fi/~eglerean/git.pdf
What is GIT?
- GIT is a version control system: a simple way to put a time-stamp on your files so that you can keep track of all the changes.
- A copy of the files is stored remotely on the GIT repository server, so GIT can also function as a backup for your files.
- You can always revert to a previous version of a file.
- GIT was born to track code for software projects, so it works best with simple text files. It also works with binary files.
- GIT forces you to add a comment every time you want to back up the status of your project. If you write meaningful comments, it will be easier to track changes.
- GIT was born for working with collaborators; however, the two people you collaborate with the most are your past self and your future self.
Why use GIT?
Some motivating scenarios
- You are a programmer: most likely you are working in a team and/or releasing software for public use. Then you need GIT to keep colleagues and users updated on changes and new features.
- You want to become a programmer: there is no better CV than showing that you can actually write code and solve problems. Companies look at your GitHub profile to see how you code. Start using GIT for all your school homework or class projects and, with little extra effort, you will build up a large programming portfolio.
- You are not a programmer, you do not want to become a programmer, but you are working on a project: even if you work alone, you are still collaborating with your past self and with your future self. GIT keeps a status of your files, your notes and your project so that you don’t have to rely on file names such as projectReport_final_verylastversion_thisisthefinal.doc.
- You are a (data) scientist: it doesn’t matter if you code or not, you need a lab notebook to keep track of everything that happens throughout your project, from the ethics permit application, to piloting, to the final analysis and writing the paper. GIT can become your digital lab notebook.
GIT in practice part 1: set up your project
This guide is for Linux/Mac users: it requires the use of the terminal to fully understand the steps involved. Once you have understood what happens under the hood, you can switch to a graphical user interface if you prefer.
- Create an account on GitHub, Bitbucket or any other GIT hosting service (for my colleagues and students, click here).
If you deal with sensitive data, make your project private (e.g. by asking your IT team to set up a git repository for you).
- Set up SSH keys.
SSH keys are just a way to avoid typing passwords. From experience, using GIT via SSH is i) faster, ii) less prone to server limitations (important when uploading large files) and iii) more secure. Check whether you already have SSH keys on your computer; if not, generate a public/private SSH key pair. Follow the steps here: https://help.github.com/articles/generating-an-ssh-key/
Then upload the public key to the website of the GIT repository, usually under your account settings. Then open the terminal and validate the key by typing something like:
ssh -T git@github.com
For my colleagues and students:
ssh -T firstname.lastname@example.org
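The key steps above can be sketched as follows, assuming OpenSSH is installed. The key type (ed25519), the scratch folder and the label are illustrative choices, not part of the original instructions; for real use you would normally accept the default location under ~/.ssh and protect the key with a passphrase.

```shell
set -eu
# List the public keys you already have, if any.
ls ~/.ssh/*.pub 2>/dev/null || echo "no existing public keys found"

# Generate a demo key pair in a scratch folder, so nothing in ~/.ssh is touched.
# -N "" sets an empty passphrase (demo only), -C adds a label to the key.
keydir=$(mktemp -d)
ssh-keygen -q -t ed25519 -N "" -C "demo key for git" -f "$keydir/id_ed25519_demo"

# The .pub file is what you paste into the website; the other file stays private.
cat "$keydir/id_ed25519_demo.pub"
```

Only the content of the .pub file is ever uploaded; the file without the extension is the private key and should never leave your machine.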
- Create a new repository from the web interface.
Give it a meaningful name (imagine you want to share it with others). Do not use spaces or hyphens. With GITLAB, if you have run out of personal projects, create the project under the group.
- Set up the local copy of the repository on your machine.
From the web interface, copy the SSH string for your project (something like: email@example.com:username/projectname.git). Open a terminal, go to the folder where you want your project to live and type:
mkdir projectname
cd projectname
git init
git remote add origin firstname.lastname@example.org:username/projectname.git
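To try this step without touching a real server, here is a small sketch that uses a local folder as a stand-in for the remote repository. The names remote_demo.git and the temporary folder are assumptions made for the demo; with a real project you would paste the SSH string copied from the web interface instead.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the server-side repository (normally created from the web interface).
git init -q --bare remote_demo.git

# The steps from the tutorial, one command per line.
mkdir projectname
cd projectname
git init -q
git remote add origin "$work/remote_demo.git"

# Check where "origin" points.
git remote -v
```

The name origin is only a conventional nickname for the remote address, so that later commands do not need the full string.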
- Create a file called README.md and store it in the projectname folder. This is your main project file. I use it to let others know what is inside this project and where to look for references. I also use it to write down experiment choices, parameters used in the analysis and todo lists. The extension .md stands for “markdown”, a formatting language to set things like headers, lists etc. The file README.md is displayed by default on your project GIT page, so it is nice to add some formatting. Here is a simple template to copy and paste:
# Project main title
*A small one-line subtitle*

Here a link with a picture. The picture is stored in a subfolder called figures.
[ ![NAME FOR THE LINK](figures/demo_figure.png) ](http://the-actual-link-whatever-it-is.com)

Code released under [MIT License](https://en.wikipedia.org/wiki/MIT_License) (see LICENSE file).

## WHAT IS IT
Describe what is it about

## WHAT IS WHERE
Describe organization of subfolders and files

## HOW TO INSTALL
A manual, if it's a software.

## HOW TO CITE
A link to the publication

### 15/07/2015 TODO
A simple todo list for your future self.
- Add the file to the repository.
Tell GIT that we want to keep track of this file. This is done only once per file with the command:
git add README.md
- Commit the file to the repository.
Tell GIT that we want to permanently store the current version of the file, with the command:
git commit -m "A meaningful message that explains what you did" README.md
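The add and commit steps can be tried in a scratch folder with the sketch below. The folder, the user name and the e-mail address are placeholders for the demo; the commit here takes everything staged with git add, which for a single file is equivalent to naming the file.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q projectname
cd projectname
# Identity recorded in each commit; set per repository so the demo
# does not depend on your global settings.
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "# Project main title" > README.md
git status --short          # README.md is listed as untracked ("??")
git add README.md
git commit -q -m "Add README with project title"
git log --oneline           # one commit, with your message
```

Running git status before and after is a cheap way to confirm that GIT saw the file and that nothing is left uncommitted.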
- Synchronize your local copy of the repository with the remote one
This is where the file is uploaded to the remote repository and backed up. Use the command:
git push -u origin master
After this first push, it is usually enough to just type:
git push
- Check that the web repository has the new file
Hurray! You have a fully functioning repository
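The difference between the first push and the later ones can be seen in this sketch, which again uses a local folder (remote_demo.git, a made-up name) in place of a real server. The branch is explicitly named master here only to match the commands in this tutorial; your installation may use another default name.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare remote_demo.git          # stand-in for the remote server
git init -q projectname
cd projectname
git symbolic-ref HEAD refs/heads/master      # name the branch master, as in the tutorial
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/remote_demo.git"
echo "# Project main title" > README.md
git add README.md
git commit -q -m "Add README"

# First push: -u links the local master branch to origin/master ...
git push -q -u origin master

# ... so that afterwards a plain "git push" knows where to go.
echo "More notes" >> README.md
git commit -q -m "Add notes to README" README.md
git push -q
```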
GIT in practice part 2: working on an existing repository
You are all set up and just want to start adding files, editing them and keeping track of them.
- Before doing anything, make sure your local repository is up to date with the remote one.
Open the terminal and go to the local GIT folder for the project. To get a simple summary of the local status, just type:
git status
To see if there are any differences between the local and the remote version, type:
git remote update
git diff origin/master
Update the local version with the remote one:
git pull
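Here is a sketch of that check in a sandbox, under the assumption that a colleague pushed a change while you were away. All folder names (seed, remote_demo.git, mine, colleague) and identities are invented for the demo.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Demo scaffolding: a "remote" with one commit, and two local copies of it.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "first line" > seed/notes.txt
git -C seed add notes.txt
git -C seed -c user.name="Demo User" -c user.email="demo@example.com" commit -q -m "Start notes"
git clone -q --bare seed remote_demo.git
git clone -q remote_demo.git mine
git clone -q remote_demo.git colleague

# A colleague pushes a change while you are away.
cd colleague
echo "line from colleague" >> notes.txt
git -c user.name="Colleague" -c user.email="colleague@example.com" commit -q -m "Add a line" notes.txt
git push -q

# The steps from the tutorial, run in your own copy.
cd "$work/mine"
git status                 # summary of the local state
git remote update          # fetch news from the remote without touching your files
git diff origin/master     # how your files differ from the remote version
git pull -q                # bring the local copy up to date
```

Note that git remote update only downloads information; your files change only when you run git pull.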
- Add a new file. Create a new file or copy an existing file into a folder (or subfolder) of your local GIT repository and tell GIT that you want to track this file (just once per file):
git add subfolder/name_of_file.txt
- Commit the file. Tell GIT to store the current version of the file in your local repository.
git commit -m "useful message here" subfolder/name_of_file.txt
- Push the file to the remote repository:
git push
- Editing. When editing an existing file, you just need to run git commit and git push. Before committing you can always check the latest changes to the local file by running:
git diff name_of_file.txt
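A minimal sketch of the edit cycle, in a scratch repository with placeholder names and identity. The push is left out because this sandbox has no remote.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q projectname
cd projectname
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "first draft" > name_of_file.txt
git add name_of_file.txt
git commit -q -m "Add first draft"

# Edit the file, then look at the change before committing it.
echo "a new paragraph" >> name_of_file.txt
git diff name_of_file.txt          # added lines are marked with "+"
git commit -q -m "Add a new paragraph" name_of_file.txt
git diff name_of_file.txt          # prints nothing: no uncommitted changes are left
```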
- Delete. Remove a file with
git rm filename
git commit -m "deleting because..." filename
git push
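Deleting a file does not erase its history. The sketch below, with made-up file names and identity and without a remote, removes a file and then reads its old content back from the previous commit.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q projectname
cd projectname
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "old notes" > old_notes.txt
git add old_notes.txt
git commit -q -m "Add old notes"

# Remove the file and record the removal.
git rm -q old_notes.txt
git commit -q -m "Delete old_notes.txt because it is no longer needed"

# The file is gone from the folder, but earlier versions are still stored.
git log --oneline -- old_notes.txt
git show HEAD~1:old_notes.txt      # content as it was one commit ago
```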
- Re-start from scratch or get somebody else’s code
Sometimes your local configuration gets messy, sometimes you change machine and need to pick up from where you left off. Go to an empty folder and just type
git clone email@example.com:username/projectname
This will create a subfolder called projectname and will download all remote files into it. The same command can be used to download someone else’s git repository. In the latter case, however, you might not be able to push changes unless the repository owner gives you write access. For these cases you can also “fork” somebody else’s project. Just visit an existing project and you will see the fork button.
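Cloning can also be tried locally. In this sketch the folder projectname.git plays the role of the remote repository; it and the other names are assumptions for the demo, and with a real project you would pass the SSH string instead of a path.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Demo scaffolding: a stand-in remote that already contains one commit.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "# Project main title" > seed/README.md
git -C seed add README.md
git -C seed -c user.name="Demo User" -c user.email="demo@example.com" commit -q -m "Add README"
git clone -q --bare seed projectname.git

# Start from scratch in an empty folder.
mkdir fresh_start
cd fresh_start
git clone -q "$work/projectname.git"
ls projectname                     # the files from the remote
git -C projectname log --oneline   # together with the full history
```

The clone already has origin configured, so git pull and git push work without any further set-up.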
In general, search the web for the error message you get: there is almost always an answer. If your local configuration gets messy, it is often easier to start from scratch with git clone (see above). Sometimes the remote repository has been modified (e.g. via the web interface) while the local repository also has new commits. This happens especially when collaborating with others. In that case git push is rejected, with a message that you first need to integrate the remote changes. Fetch them with git remote update, then merge them into your local copy with:
git merge origin/master
- Open the conflicting files. Conflicts will look like:
<<<<<<< HEAD
code in your version that is not in the remote version
=======
code in the remote version that is not in your local version
>>>>>>> origin/master
- Remove the “<<<<<<<”, “=======” and “>>>>>>>” lines and edit the file so that it contains the version you want to keep. Then mark the file as resolved with git add, and run git commit and git push again.
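The whole conflict scenario can be reproduced safely in a sandbox. In this sketch two local copies (mine and colleague, both invented names) change the same line; the "resolution" simply overwrites the file with the agreed text, where in practice you would edit it by hand.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Demo scaffolding: a "remote" with one commit, and two local copies of it.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "deadline: undecided" > seed/notes.txt
git -C seed add notes.txt
git -C seed -c user.name="Demo User" -c user.email="demo@example.com" commit -q -m "Start notes"
git clone -q --bare seed remote_demo.git
git clone -q remote_demo.git mine
git clone -q remote_demo.git colleague

# The colleague changes the line and pushes first.
cd colleague
echo "deadline: Friday" > notes.txt
git -c user.name="Colleague" -c user.email="colleague@example.com" commit -q -m "Set deadline to Friday" notes.txt
git push -q

# You change the same line locally; your push is rejected.
cd "$work/mine"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "deadline: Monday" > notes.txt
git commit -q -m "Set deadline to Monday" notes.txt
git push -q 2>/dev/null || echo "push rejected: the remote has changes you do not have"

# Fetch and merge; the merge stops because both sides edited the same line.
git remote update
git merge origin/master || echo "merge stopped: conflict in notes.txt"
cat notes.txt                      # shows the conflict markers

# Resolve: write the agreed content, mark as resolved, commit and push.
echo "deadline: Monday (agreed with colleague)" > notes.txt
git add notes.txt
git commit -q -m "Merge remote changes, keep Monday deadline"
git push -q
```

If the two sides had edited different files or different parts of the same file, git merge would have combined them on its own, with no markers to clean up.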
GIT dump of some advanced commands
From https://help.github.com/articles/changing-a-remote-s-url/: sometimes you have an existing local repository and want to back it up to a new remote repository. First create an empty remote repository at the new location, and then:
cd existing_repo
git remote set-url origin URLOFNEWREPO
git push -u origin --all
git push -u origin --tags
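A sandbox version of this move, where two local folders (old_remote.git and new_remote.git, both made-up names) stand in for the old and the new server location:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare old_remote.git            # where the project used to be backed up
git init -q --bare new_remote.git            # the new, still empty remote
git init -q existing_repo
cd existing_repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/old_remote.git"
echo "some work" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git tag v0.1

# Point origin to the new location, then push all branches and all tags.
git remote set-url origin "$work/new_remote.git"
git remote -v
git push -q -u origin --all
git push -q -u origin --tags
```

Tags are pushed separately because --all covers branches only.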