Git
We will use git to manage lab and program submissions in this course. At minimum, this allows our TAs to quickly access your assignments that you “submit” on the server, and allows you to get some basic practice with an extremely important tool in the software development world. You can also choose to use git more extensively to manage your work in this class.
Git basics
Git is a version control system, meaning that it is meant to keep track of changes to code. It’s especially useful for collaborating, but it can also be used when working by yourself so that you can keep a history of your changes and why you made them.
Git is not GitHub. GitHub is a service that hosts many git repositories. Other similar services include GitLab and BitBucket. In this course, we will not cover hosting services.
Other version control systems include Mercurial and Subversion. Because these are version control systems, they are comparable to git. In this course, we will focus on git, not other version control systems.
Git repositories
The basic organizational unit in git is called a repository, or a
repo for short. As the name implies, it is a place to store things. In
practice, a repository is a directory in your filesystem. Any subdirectories
are also part of the repository. For example, at some point during this class,
your csci112_spring2024
directory might look like this:
csci112_spring2024/
├── classwork
│ ├── week1
│ │ └── fri
│ │ ├── autograder.txt
│ │ └── example.txt
│ └── week2
│ └── mon
│ ├── autograder.txt
│ ├── hello_world.c
│ └── out.txt
└── labs
└── lab1
├── autograder.txt
├── lab1.c
└── temp.txt
This means that the csci112_spring2024
directory contains two subdirectories,
classwork
and labs
, each of which can contain their own subdirectories, and
inside those subdirectories are various files.
If we make csci112_spring2024
a git repository, all of the subdirectories and
files inside of it are part of the repository.
Git commits
The snapshots of our repository that we use git to record are called commits. We make a commit every time make a change that we would like to record. Although we may only change one file, a commit is actually a snapshot of the entire directory at the time that we make the commit. In practice, making a commit looks like this:
git commit -m "<Put a description of the changes made in this commit in the quotes>"
Staging in git (git add
)
Git includes an extra step to making a commit that many other version control systems do not: staging. In order to make a commit, we first must stage our changes: that is, tell git which files were changed and should thus be included in the next commit. We can either stage files by name:
git add file1.txt
or we can stage all files in the repository that have changed:
git add -A
Note that you must stage your files before you can commit them!
Setup for assignment submission
For this course, we’re going to create a git repository and use it to tag the version of our programs that we would like to be graded.
Creating a git repository
To start, create a directory. For this course, your directory should be called
csci112_spring2024
, so we will use that as an example. You should already
have this directory, but if you didn’t, you would create it using mkdir csci112_spring2024
.
Navigate into the directory:
cd csci112_spring2024
Initialize a git repository inside the directory:
git init
You will see the following message. You can ignore the first part about hints:
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
Initialized empty Git repository in /home/<your_netid>/csci112_spring2024/.git/
But notice the last line: Initialized empty Git repository in /home/<your_netid>/csci112_spring2024/.git/
.
csci112_spring2024
directory is now a git repository!
Try running git status
to see the status of your repository. You should see
something like this:
[g73x595@csci112 csci112_spring2024]$ git status
On branch master
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
classwork/
nothing added to commit but untracked files present (use "git add" to track)
If you just created your csci112_spring2024
directory for the first time and it
is empty, you will need to create a file something for git to track. You could do this by opening a new file
with vim
, putting some text in, and saving it. Or you can copy the following:
echo "Hello, world!" > example.txt
(The echo
command prints whatever you specify to the screen; the >
redirects that output to a file instead of the screen.) You may want to check
that the file is there by running ls
.
Now, we can stage all of your files by running
git add -A
The -A
is for “all”. Now, if you run git status
, you should see:
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: classwork/week1/fri/autograder.txt
new file: classwork/week1/fri/example.txt
new file: classwork/week2/mon/autograder.txt
new file: classwork/week2/mon/hello_world.c
new file: classwork/week2/mon/out.txt
This means that all of the files in your repository have been staged.
Now, we are finally ready to commit. We also need to provide a description of
this commit, called a commit message, which we specify with the -m
option
followed by our message in quotes. For example, we might do:
git commit -m "Add previous classworks"
And then we see a message like this:
[master (root-commit) 2c82f20] Add previous classworks
5 files changed, 30 insertions(+)
create mode 100644 classwork/week1/fri/autograder.txt
create mode 100644 classwork/week1/fri/example.txt
create mode 100644 classwork/week2/mon/autograder.txt
create mode 100644 classwork/week2/mon/hello_world.c
create mode 100644 classwork/week2/mon/out.txt
Running git status
again shows that we have no unsaved changes:
On branch master
nothing to commit, working tree clean
We can run
git log
to see a history of all of our commits. In this case, it looks something like
commit 9a8c77c95593a302436dfb9509663658271f43e9 (HEAD -> master)
Author: fangtian.zhong <fangtian.zhong@montana.edu>
Date: Sun Aug 20 18:40:50 2024 -0600
Add previous classworks
Tagging a commit
If we want to label a commit because it is important or we want to be able to access it easily, we can give it a tag. In this class, you will use git tags to indicate which version of your code we should grade.
To create a tag, run git tag <tagname>
after you have committed your changes.
You can run git log
to check that your tag was associated with the right
commit. For example, we might run
git tag example_tag
and then
git log
and see
commit 2c82f20e9b41b4fe09b6c142d381dce32f95abe3 (HEAD -> master, tag: example_tag)
Author: fangtian.zhong <fangtian.zhong@montana.edu>
Date: Thu Jan 26 19:55:31 2024 -0700
Add example.txt file
Note the tag at the end of the first line!
You’ll notice that if you make a new commit and try to tag that commit with a
tag tagname
that has already been used, you’ll get this message:
fatal: tag '<tagname>' already exists
To solve this, first delete the tag
git tag -d <tagname>
and then tag as normal
git tag <tagname>
You can tag a previous commit by referencing its commit hash. For example, if I
run git log
and see something like this:
[g73x595@csci112 csci112_spring2024]$ git log
commit 71203455057227331ac8384f3f2c2a1910ef5094 (HEAD -> master)
Author: fangtian.zhong <fangtian.zhong@montana.edu>
Date: Fri Feb 10 09:08:12 2024 -0700
Add a example file
commit aa08bd749ec35221abd90ce0908fa7279714e3c1 (origin/master)
Author: fangtian.zhong <fangtian.zhong@montana.edu>
Date: Wed Feb 8 16:38:34 2024 -0700
Update
commit 3601b167efb9a1c6ed3f384c55a720c740f4ab07
Author: fangtian.zhong <fangtian.zhong@montana.edu>
Date: Thu Jan 26 20:19:37 2024 -0700
lab1
commit 5170dff5f9795001a837407645ed55cae24e43d9
Author: fangtian.zhong <fangtian.zhong@montana.edu>
Date: Thu Jan 26 20:18:53 2024 -0700
Update
and I want to tag the commit with the comment “lab1” with the tag lab1
, I could run
git tag lab1 3601b167efb9a1c6ed3f384c55a720c740f4ab07
tl;dr
Once you have a version of your lab or program that you would like to “submit”, (i.e., be the one we grade if you don’t update it), do this:
git add -A
git commit -m "<description of your changes, or just name of assignment>"
git tag <specified tag for assignment>
If the tag already exists, do git tag -d <specified tag for assignment>
before the third command above.
Common issues
Nested repositories
In general, you do not want to have nested git repositories. This is what
happens if you run git init
in one directory, and then run it again in one of
the subdirectories of that directory later. You’ll know if you have nested
repositories if git log
looks different in different directories of your
repository.
Let’s get a better understanding of what this means with an example. Suppose that
you have one directory called directory
, and inside it you had two
directories called subdir1
and subdir2
, and maybe those have a few files in
them. Visually, that looks like this:
├── directory
│ ├── subdir1
│ │ ├── file1.txt
│ └── subdir2
│ ├── file2.txt
│ └── file3.txt
If at some point you had run git init
when you were in directory
, then
directory
(and everything in it, including the subdirectories) is now being
tracked as a git repository. This means that there is a .git
directory inside
directory
, so the file structure actually looks like this:
├── directory
| ├── .git
│ ├── subdir1
│ │ ├── file1.txt
│ └── subdir2
│ ├── file2.txt
│ └── file3.txt
(Note that we don’t see .git
unless we look for hidden files, for example by
running ls -a
when we’re in the directory
directory.)
Now suppose we navigate to subdir1
and run git init
. This creates another
(nested) repository inside subdir1
, so there is another .git
directory
under subdir1
, and subdir1
is a new git repository with its own commit
history.
├── directory
| ├── .git
│ ├── subdir1
│ │ ├── .git
│ │ ├── file1.txt
│ └── subdir2
│ ├── file2.txt
│ └── file3.txt
But the whole idea of a repository is that it tracks everything inside of it.
So now the directory
repository is tracking file1.txt
and so is the
subdir1
repository, but there’s just the one file and so it really only makes
sense to track it once.
If you realize you have created a nested repository, make sure you understand
the level that you really want your repository to exist, and then delete the
unneeded one by deleting the associated .git
repository. The command to do
this will be different depending on where you currently are. If you’re in
directory
, it would be rm -rf subdir1/.git
. -r
says delete all
subdirectories recursively, and -f
says force the deletions without checking
whether you want to delete each file. (There are a lot of them in there.) Note
that this deletes the whole commit history and can’t be undone, so make sure
you delete the correct .git
directory.
Optional: more advanced git
Adding a remote to your repository
If you would like to connect your csci112_spring2024
repository to GitHub,
do the following steps. You must choose whether to use HTTPS or SSH to
communicate with GitHub. Either is fine, but HTTPS is simpler, so I recommend
that.
- Create a new repository on GitHub.com. Do not initialize the new repository with README, license, or gitignore files. Please make this repository private so that I can reuse assignments for this course without the solutions being available online! Copy the remote url with https in it.
- On the server, navigate to your
csci112_spring2024
repository and rungit remote add origin <REMOTE_URL>
- Now run
git push --set-upstream origin master
You may need to generate and enter a personal access token here.
Everything in your repository is now stored on the connected GitHub repository as well! But updates are not automatic. After any new commits, if you would like to send them to the GitHub repository, run
git push
Cloning that repo to your local computer
To copy over the files in your csci112_spring2024
directory (and now on
GitHub as well) to a local computer, clone the GitHub repository.
In a terminal window, enter
git clone <REMOTE_URL>
This will create a directory called csci112_spring2024
that holds everything
that was in the GitHub repo. If you make an update there (by pushing from the
server) and want to get that update on your computer, navigate into the
csci112_spring2024
directory and run git pull
.
Additional resources
You can watch the lecture and/or read the text from the MIT Missing Semester lecture on git.
You can look at the git tutorial from the software carpentries program, which aims to teach researchers basic computing skills.