Git & GitHub for Beginners
In today’s collaborative world, it is essential to understand the need for a version control system (VCS). A VCS is a revision system that tracks all your changes. Imagine that you and a few other developers are working on a web application, updating HTML, CSS & JS files separately. Without a VCS, integrating the bits of code means sitting side by side and comparing notes on what each of you has done: what to add, what to remove, and so on. A version control system exists precisely for this purpose & Git is one such program. GitHub is a hosting service built on top of Git.
A Tinge of History
The Linux kernel is an open source software project created by Linus Torvalds. From 1991 to 2002, changes to the kernel were passed around as patches and archived files. In 2002, the project began using a proprietary DVCS called BitKeeper. In 2005, BitKeeper stopped being available to the kernel developers free of charge, so Linus & the Linux development community had to come up with a tool of their own. The requirements were pretty simple:
- Simple design
- Strong support for non-linear development (thousands of parallel branches)
- Fully distributed
- Able to handle large projects like the Linux kernel efficiently (speed and data size)
Thus Git was born in 2005!
When you are working with Git, you need to be aware of the following terms:
- Repository or Repo: This is the folder where your project resides, together with its history. This folder can be on your machine or on a remote Git host like GitHub. The default branch of a repository is usually called master.
- Branch: A branch is an independent line of development that starts from a commit in your repo (conceptually, a copy of master at that point). Generally, when multiple people are working on the same project, they branch out, make changes to the files, and then merge back into the master branch. This makes things easy for the developers as well as for Git.
- Checkout: This command is used to switch between branches of a repo.
- Clone: When you want to create a copy of a repo, you clone it. This copies the repo into a local location you provide.
- Commit: Once you are done with your changes and have tested the code sufficiently on your machine, you save the code to the Git repo. This creates a snapshot of your files.
- Push: Push uploads your commits to a remote repository. Let's say you have made changes to your files locally and committed them. The commit happens in your local repo. If you want to send these changes to a remote repo like GitHub, you use a push.
- Pull: When you want to update your local files with the latest changes from the remote repo, you run a pull. This is more of an update command.
- Add: If you would like to add a new file you have created to Git, you let Git know by running the add command. Yes, Git doesn’t automatically add the files you have created. This command is used to stage a file before a commit.
- Remove: This removes a file/folder from your local Git repo. You can then commit and push the change to remove the file/folder from your remote repo too.
How Git Works
By now, you should have an idea of what Git is. Let's take a closer look at what is happening under the hood.
Imagine our project has 3 files: A, B and C. Our initial commit checks in all the files. Now we update File A & File C and check in the changes. As you can see from the image below, File A is tracked as A1 (a newer version of File A) & File C as C1. This is the second revision of the file system, the first being the initial commit.
What happened here? When you asked Git to check in the changes, Git created a snapshot of your project (in this case, the folder & the 3 files inside it). The snapshot is a mini file system that tells Git the state of each file when a certain commit occurred. To put it crudely, these details are stored as a new entry in the Git database.
Unlike other version control systems, which store data as a list of file-based changes across commits, Git stores a snapshot of all the files. To make this more efficient, if a file has not changed, Git doesn’t store it again; the new snapshot simply points to the copy Git already has.
Now, if you refer to the image above, you can see how the files get stored in Git. If the postfix number (A1, A2, ...) does not change, that means the file has not changed during that check-in/revision.
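If you would like to see the snapshot idea for yourself, here is a minimal sketch. It assumes Git is already installed (setup is covered in the next section), and the scratch folder, file contents and identity values are placeholders made up for illustration. It commits files A, B and C, changes only A and C, and then compares the object IDs that each snapshot records.

```shell
# Hypothetical scratch repo; the folder and the identity values are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Your Name"

# First commit: files A, B and C.
echo "a" > A; echo "b" > B; echo "c" > C
git add A B C
git commit -q -m "Initial commit"

# Second commit: only A and C change.
echo "a1" > A; echo "c1" > C
git commit -q -a -m "Update A and C"

# Ask Git which stored object each snapshot uses for a changed and an unchanged file.
a_old=$(git rev-parse HEAD~1:A); a_new=$(git rev-parse HEAD:A)
b_old=$(git rev-parse HEAD~1:B); b_new=$(git rev-parse HEAD:B)
echo "A: $a_old -> $a_new"
echo "B: $b_old -> $b_new"
```

The two IDs printed for B should be identical, which is Git reusing the stored copy of the unchanged file, while the IDs for A differ.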
Set up Git
Now let's set up Git and have some fun!! You can download Git (not GitHub) from here. First, let's understand what Git is, and then let's bring GitHub into the picture. Download the setup applicable to your operating system. The setup is fairly simple. Once you are done, you can validate the installation as follows:
Windows (Method 1) : Right click on any folder. You should see a list of Git options attached to your right click menu (like Git Bash, git diff).
Windows (Method 2) : Open the prompt/Git Bash from Start and run git --version
Mac : Open a terminal and run git --version
Awesome!! Now we are all set up to get rolling with Git!
Creating a Repository Locally
PS : There are a lot of GUI based git tools out there which will do the same tasks we do but via an interface. I want to encourage the usage of command line based interaction so that as UI developers, we are not really dependent on a UI.
First, we will take a look at things locally, then deal with remote Git hosts like GitHub. So, let's create a new folder and call it myRepo. Then,
Windows : Right click on the folder & select 'Git Bash'
Mac : Open a terminal and cd to the path.
Create a couple of files inside this directory.
Windows : Right click on the folder > New > Text Document. Name it file1.txt & similarly create one more, file2.txt
Mac : Open a terminal and cd inside myRepo. Then run touch file1.txt & touch file2.txt
You should see 2 files when you run the list directory command, ls.
PS: Windows users, please make sure you are running the commands via Git Bash, as some of the commands will not work in the Prompt.
Let's initiate a Git repo here. This will tell Git to start tracking changes to the folder. Run git init. You should see the following message
Initialized empty Git repository in /path/to/myRepo/.git/
.git is a hidden folder that contains all the metadata Git needs to understand what's happening inside this repo. It's best to leave this folder alone.
Next, let's ask Git what the status of the repo is. Run git status. You should see something like this
Git knows that there are 2 files and that they are not tracked. The last line is a hint on what needs to be done next. Now, let's add these files to our repo. Run git add . (the dot at the end indicates all the files inside the current folder). Then run git status again to check the status. You should see something like this
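The steps so far can be replayed in one go with a sketch like the one below. The temporary parent folder is an assumption made so that the script can run anywhere; git status --short is simply a compact form of git status.

```shell
# Hypothetical scratch location for myRepo.
workdir=$(mktemp -d)
mkdir "$workdir/myRepo"
cd "$workdir/myRepo"

touch file1.txt file2.txt
git init -q

# Before staging: both files are untracked (marked "??").
before=$(git status --short)
echo "$before"

git add .

# After staging: both files are marked "A" (added, waiting for a commit).
after=$(git status --short)
echo "$after"
```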
Managing files & Git
Let's open up file1.txt in your favorite text editor. I prefer the vim editor (kind of old school). And then let's make some changes. Add the below 2 lines of text
Save the file and come back to the prompt. Now let's run git status. You will see something like this
Ahhh!! Git found out! It says that you have modified file1.txt. Sweet right?
Let's see what changes were tracked. Run git diff master & Bam!!
This command gives us a list of the changes we have made to the file. You can see that we are comparing our file with the master branch. It is always a good practice to view the difference before committing.
First we need to add the changes to Git (stage the files) and then commit. We can do both steps in one command: git commit -a -m "updated file1.txt". Here -a stages every tracked file that has been modified or deleted (it does not pick up brand new files), -m is the commit message flag, and the text in quotes after -m is the commit message.
You should see something like this
Run git status as a sanity check to verify that everything is committed.
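Here is a small sketch of the edit, diff and commit cycle. It assumes a baseline commit already exists on master (git diff master needs one to compare against), and the file contents and identity values are placeholders.

```shell
# Hypothetical scratch repo with one baseline commit on master.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "you@example.com"
git config user.name "Your Name"
echo "first line" > file1.txt
git add file1.txt
git commit -q -m "Initial commit"

# Modify the tracked file and look at the change before committing.
echo "second line" >> file1.txt
git --no-pager diff master

# -a stages the modified tracked file, -m supplies the commit message.
git commit -q -a -m "updated file1.txt"

# Nothing should be left to commit.
remaining=$(git status --short)
echo "pending changes: [$remaining]"
```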
Let's do something really crazy and see what Git does. Create a file called file3.txt. Run git status
Then let's delete this file: rm file3.txt. Now when you run git status, it will respond as if file3.txt never existed! This shows that unless you add a file to the Git repo, it will not be tracked. So, create file3.txt again and add it to the current repo.
Windows : Create New File > file3.txt
Mac : touch file3.txt
Then run git add file3.txt and then git commit -a -m "Added file3.txt". Now for fun, let's run git log --stat. This will show the complete history
PS : Green ++ indicates addition & Red -- indicates removal
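The untracked-file experiment can be sketched like this. The scratch repo, its baseline commit and the identity values are assumptions added so the script is self-contained.

```shell
# Hypothetical scratch repo with one committed file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Your Name"
echo "hello" > file1.txt
git add file1.txt
git commit -q -m "Initial commit"

# A new file shows up as untracked...
touch file3.txt
git status --short

# ...and once deleted, Git has nothing to report about it.
rm file3.txt
after_delete=$(git status --short)
echo "after delete: [$after_delete]"

# Create it again, add it, commit it, and look at the history.
touch file3.txt
git add file3.txt
git commit -q -m "Added file3.txt"
git --no-pager log --stat
```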
We have seen how Git manages the files when we are working on the same branch. Let's create a new branch, make changes & bring them back to master. This is going to be a typical workflow when you are collaborating with a team. There will be a base code, you will branch out, make changes to the files & then merge them back into master.
So, let's create a new branch. Run git branch myApp.r2. This will create a new branch. Now, let's switch to that branch: run git checkout myApp.r2. We are in the myApp.r2 branch. When you create a new branch, it starts from master’s latest commit, so all the files in that commit are available on the branch. Let's verify that: run git ls-tree -r myApp.r2. You should see something like this
This shows the list of files in your branch. Now let's update file3.txt. Open file3.txt and add the below line
Hi, I am file3.txt from myApp.r2 branch.
Save the file and run git status to see if the file is modified. Now, let's compare file3.txt with the base copy in the current branch
Cool, so we added a new line to file3.txt (captain obvious!!). Let's commit these changes: git commit -a -m "updated file3.txt".
Now let's compare this file with master and see the difference
Awesome!! Now we know how to branch, make changes to our branch and then commit these changes.
PS : When you have switched to a certain branch and navigated to the repo via Finder/Explorer, you will see only the files as they exist on the current branch. When you switch to master, you will see only the files as they exist on master. This is how Git works with its stored snapshots!
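As a rough sketch of the branching steps above: the scratch repo, the baseline content of file3.txt and the identity values are assumptions; the branch name and the added line come from the walkthrough.

```shell
# Hypothetical scratch repo with file3.txt committed on master.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "you@example.com"
git config user.name "Your Name"
echo "base" > file3.txt
git add file3.txt
git commit -q -m "Added file3.txt"

# Create the branch, switch to it, and list the files it starts with.
git branch myApp.r2
git checkout -q myApp.r2
git ls-tree -r --name-only myApp.r2

# Change the file on the branch and commit.
echo "Hi, I am file3.txt from myApp.r2 branch." >> file3.txt
git commit -q -a -m "updated file3.txt"

# Compare the two branches, then switch back: master still has the old content.
git --no-pager diff master myApp.r2 -- file3.txt
git checkout -q master
on_master=$(cat file3.txt)
echo "file3.txt on master: $on_master"
```

After the last checkout, the working folder shows master's copy of file3.txt, without the new line.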
Now, our myApp.r2 development is completed. We want to merge our changes with the master.
First let's see how many branches exist. Run git branch -v. This should list 2 branches, master and myApp.r2, & their latest commits. Now, let's merge the changes. Let's move to the master branch with git checkout master, then run git rebase myApp.r2. This will take the changes from myApp.r2 and append them to master without a merge commit. If you want to perform a merge instead, you can run git merge myApp.r2. More on rebase vs merge here.
Now, open file3.txt and you can see the changes in your master!
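Here is a hedged sketch of bringing the branch back into master using git merge. The scratch repo and its contents are placeholders. One thing worth noticing: because master has no commits of its own since the branch was created, Git can simply move master forward (a fast-forward), so no merge commit appears in this case either.

```shell
# Hypothetical scratch repo: one commit on master, one more on myApp.r2.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "you@example.com"
git config user.name "Your Name"
echo "base" > file3.txt
git add file3.txt
git commit -q -m "Added file3.txt"
git checkout -q -b myApp.r2
echo "Hi, I am file3.txt from myApp.r2 branch." >> file3.txt
git commit -q -a -m "updated file3.txt"

# List the branches and their latest commits.
git branch -v

# Move to master and merge the branch in.
git checkout -q master
git merge -q myApp.r2

# master now points at the same commit as myApp.r2.
git --no-pager log --oneline
cat file3.txt
```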
So, what did we do all this while?
We created a local repository and created, updated & deleted files. Then we staged these files by running git add or git commit -a ... and committed them to our repository.
Now, if you want this repository to be hosted remotely so that everyone can access it, you need to find a “Git hoster”, ergo GitHub!!
It can’t get simpler than this. The only difference between what we have done so far and what we are going to do from now on with GitHub is that our repository was local, but now it will be hosted remotely.
We will clone a repository from GitHub, branch out, make changes, commit to our local repository, push to the GitHub remote repository and then merge the branch with master. Let's get cracking!
GitHub Account & Local setup
Quick & Dirty
- Step 1 : Navigate to GitHub.com and create an account
- Step 2 : You may need to verify your email as well.
- Step 3 : We need the GitHub app/setup for us to manage the repositories from our machine. Navigate to Downloads, & download the setup.
- Step 4 : Once the setup is completed, you can launch GitHub application & login with your details
- Step 5 : Now, you can see a list of projects that are there in your repository.
Create & Manage GitHub repository
Let's create a new GitHub repository. Navigate to the GitHub create page and fill in the repo name. The description is not mandatory, but advisable. You can create private repos in case you need to (these originally required a paid plan; GitHub now also offers them on free accounts).
Next let’s check ‘Initialize this repository with a README’. Click on Add .gitignore and select Node. Then click on Add a license & select MIT. .gitignore is a config file that lets you list the files you don’t want to push to the remote repo, like log files or node_modules. This is helpful when your project is really huge and you are creating a lot of branches.
So, let's create a new repo called myNodeApp & fill it in like this
- The top bar will show the description of the Repo. You can update it accordingly.
- Row 2 shows the status of your repository, like no of commits, contributors etc. This is a quick overview of your repo’s status.
- On the right hand panel, you will see a few tabs. You can switch through each of them and see what they have to offer. This is pretty straightforward if you have worked on any team projects before.
- The settings at the bottom right let you rename, configure & delete your repo. Use this section with caution.
- The bottom right section lists your Git URL, the address of your remote repo, and a download button that will download the repo.
- And finally the page center, which will list all the code or rather the files and folders.
Cloning & Working with repos
Now, let's get this repo onto our machine and start working on top of it. I will use command line interaction like earlier. Alternatively, you can use the GitHub client (which we downloaded earlier) to do the same.
Navigate to your favorite folder, where you would like to save the remote repo. Then navigate to your repo page & copy the clone URL from the right hand side panel. Now, open Terminal/Git Bash and run git clone <<url to repo>>. In my case it will be git clone https://github.com/arvindr21/myNodeApp.git. Your output should be like
Now let's add a few changes to the repo. Let's make a simple Node app. For that you need Node installed. You can check this for a quick start.
Then let's run npm init. Hit return until you are back at the prompt, like this
Save the file. Go back to Terminal/Git Bash & run node app.js. You should see a simple Hello Git!! message show up.
Awesome! Our dev is completed and our code is ready to be committed.
Committing to GitHub
Now, let's run git status. You should see something like this
Let's stage these files by running git add . & then git commit -m "Added package.json & app.js". Now our local repo is updated with these changes. Next, let's push these changes to our remote repo (the GitHub repo). Run git push origin master. Here origin is the name of our remote repo (it is named origin by default) & master is the name of the branch you want to push the changes to.
When you run this command, the prompt will ask for your GitHub credentials (for HTTPS pushes, GitHub now expects a personal access token in place of the account password) and then boom!!
Our code is checked into the remote repo! Let's navigate to our repo page on GitHub. Now, you can see the updated code.
You can see that the number of commits is now 2.
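If you want to rehearse the clone, commit and push cycle without touching GitHub, the sketch below uses a local bare repository as a stand-in for the remote. That stand-in, the folder names, the file contents and the identity values are all assumptions made for illustration; with a real GitHub repo you would clone its URL instead.

```shell
# A local bare repo plays the role of the GitHub repo (an assumption for this sketch).
base=$(mktemp -d)
git init -q --bare "$base/remote.git"

# Clone it; the clone gets a remote named origin automatically.
# (Git may warn that the cloned repository is empty; that is expected here.)
git clone -q "$base/remote.git" "$base/myNodeApp"
cd "$base/myNodeApp"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "you@example.com"
git config user.name "Your Name"

# Placeholder files standing in for the real package.json and app.js.
echo '{}' > package.json
echo 'console.log("Hello Git!!");' > app.js

git add .
git commit -q -m "Added package.json & app.js"

# origin is the remote, master is the branch being pushed.
git push -q origin master

# The remote's master now matches our local commit.
remote_head=$(git --git-dir="$base/remote.git" rev-parse master)
echo "remote master is at $remote_head"
```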
GitHub Branching & Merging
This is exactly the same as what we have done locally. So, let's assume that our release 1 is completed & we are starting release 2. Now we need a fresh branch to work on those changes. Let's create a new branch.
Go back to Terminal/Prompt and run git branch to see the list of branches. Now let's create one locally and then push it to the remote repo. Run git checkout -b release2. This will create a new branch and switch to it. Now let's push this to the remote repo. Run git push origin release2.
You can go to your GitHub repo page and see
Now, let's do some real work. Let's add a Node package: run npm install express --save-dev (you can learn more about Node here). This will add the express package to our project. (Don’t worry if you have not run something like this before. This operation may take a couple of minutes.) Now let's run git status. You should see that only the package.json file is updated. But when you run open . on Mac or explorer . on Windows, you will see a folder named node_modules, which was not present earlier. But why isn’t Git tracking this folder?
Because we have added it to the list of items not to track. You can go to your repo, click on the .gitignore file and see that node_modules is listed among other things. If you are familiar with Node, you know this folder is not required to be shipped with the code. Any user can clone your repo and run npm install to install the dependencies listed in your package.json. This saves space as well as branching and cloning/downloading time.
Now let's commit these changes to our release2 branch. Run git add . and then git commit -m "Added express package". It should typically respond with
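To see the ignore rule in action without installing anything, here is a minimal sketch. The one-line .gitignore and the fake node_modules content are placeholders; the Node template that GitHub generates lists many more patterns.

```shell
# Hypothetical scratch repo with a one-line .gitignore.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Your Name"
printf 'node_modules/\n' > .gitignore
git add .gitignore
git commit -q -m "Add .gitignore"

# Simulate what npm install leaves behind: a node_modules folder and a package.json.
mkdir -p node_modules/express
echo "placeholder" > node_modules/express/index.js
echo '{}' > package.json

# Only package.json is reported; node_modules is skipped.
status=$(git status --short)
echo "$status"

# Ask Git which rule is responsible for ignoring the file.
git check-ignore -v node_modules/express/index.js
```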
[release2 e08b15c] Added express package
1 file changed, 4 insertions(+), 1 deletion(-)
Let's push it to the remote branch. Run git push origin release2.
Awesome!! Now let's merge the release2 branch with master. For that we need to switch to master and then run the merge command.
Switch to master : git checkout master
Merge with master : git merge release2
package.json | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
Neat! Now let's push the changes to the remote repo. Run git push origin master. After you input your GitHub credentials, you should see a message like this
Total 0 (delta 0), reused 0 (delta 0)
152969c..e08b15c master -> master
That's it, your branches are merged & saved to the remote repo. You can navigate to your GitHub repo in a browser & check out the newly added changes to the package.json file.
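The whole release2 round trip can be rehearsed offline with the sketch below. As an assumption for illustration, a local bare repository stands in for GitHub, and package.json is changed by hand instead of through npm; the folder names, file contents and identity values are placeholders.

```shell
# A local bare repo plays the role of the GitHub repo (an assumption for this sketch).
base=$(mktemp -d)
git init -q --bare "$base/remote.git"
# (Git may warn that the cloned repository is empty; that is expected here.)
git clone -q "$base/remote.git" "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "you@example.com"
git config user.name "Your Name"

# Release 1 lives on master.
echo '{"name": "myNodeApp"}' > package.json
git add package.json
git commit -q -m "Release 1"
git push -q origin master

# Create the release2 branch, change package.json, commit, push the branch.
git checkout -q -b release2
echo '{"name": "myNodeApp", "note": "placeholder release 2 change"}' > package.json
git commit -q -a -m "Added express package"
git push -q origin release2

# Merge release2 into master and push master.
git checkout -q master
git merge -q release2
git push -q origin master

# List the branches the stand-in remote now has.
git --git-dir="$base/remote.git" branch -v
```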
There may be times when you will see a conflict with files. Git is graceful enough to append markers in your code/files to indicate the location of conflict. But do review the code thoroughly before committing the resolved files.
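To get a feel for what a conflict looks like, the sketch below forces one on purpose. The file name notes.txt, the branch name feature and all contents are made up for illustration. Both branches change the same line, so the merge stops and Git writes the markers into the file.

```shell
# Hypothetical scratch repo where two branches edit the same line.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "you@example.com"
git config user.name "Your Name"
echo "original" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo "from feature" > notes.txt
git commit -q -a -m "feature edit"

git checkout -q master
echo "from master" > notes.txt
git commit -q -a -m "master edit"

# The merge stops with a conflict; "|| true" keeps the script going.
git merge feature || true

# Git has marked the conflicting region inside the file.
cat notes.txt
markers=$(grep -c '^<<<<<<<' notes.txt)

# Resolve by writing the content you actually want, then stage and commit.
echo "from master and feature" > notes.txt
git add notes.txt
git commit -q -m "Merge feature, resolving notes.txt"
```

The lines between <<<<<<< and ======= come from the branch you are on, and the lines between ======= and >>>>>>> come from the branch being merged in.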
GitHub Pull request
So far, we have worked on a repo created by us. If you are working on someone else’s repo, the process is much the same, except that you usually push your changes to your own copy (fork) or branch and then open a pull request describing the changes you have made. The repo owner receives the pull request, and it is up to them to accept the changes or decline the request. The remaining aspects are the same.
So this was a very basic overview of Git & GitHub. You can explore other options & features at GitHub.
Thanks for reading! Do comment.