Book Image

Version Control with Git and GitHub

By : Alex Magana, Joseph Muli
Book Image

Version Control with Git and GitHub

By: Alex Magana, Joseph Muli

Overview of this book

<p>Introduction to Git and GitHub begins with setting up and configuring Git on your computer along with creating a repository and using it for exercises throughout the book. With the help of multiple activities, you’ll learn concepts that show various stages of a file—from when it is untracked to when it is set for tracking under version control. As you make your way through the chapters, you’ll learn to navigate through the history of a repository, fetch and deliver code to GitHub, and undo code changes. </p><p> </p><p>The first half of the book ends with you learning to work with branches, storing and retrieving changes temporarily, and merging the desired changes into a repository. </p><p> </p><p>In the second half, you’ll learn about forking as part of a collaborative workflow. You’ll also address modularity and duplication through submodules, tracing and rectifying faulty changes, and maintaining repositories. </p><p> </p><p>By the end of this book, you will have learned how to effectively deploy applications using GitHub.</p>
Table of Contents (8 chapters)

Defining Version Control

Version control is aimed at supporting the tracking of changes to a file, that is, the reversal of changes made to a file and the annotation of changes introduced to a code base. Prior to the inception of version control software, version control itself took an approach that was contrived and agreed upon by a team of developers working on a code base.

To introduce and roll out a change to a product, a developer would, for instance, retrieve an archived version corresponding to the version in the production environment. They would proceed to make and test the change. To deploy the new version, a version number would be assigned to the release, and notes detailing a change would be annotated alongside the release. To revert changes, you would be required to do the following:

  • Match the release notes to specific files.
  • Establish the actual changes introduced in the respective files.
  • Revert the changes and deploy the rectified version.

Software developers' building products need to coordinate concurrent pieces of work to achieve an effective product development flexibility. 

Consider a team working on a bus ticket sale platform for the city of Tunis needs to roll out a mobile app for ticket sales. In this project, the developers may split the work into the following categories: 

  • User authentication
  • Ticket purchase 

This work can be assigned to different members of the development team. Each member can focus their efforts on a task and share the work through a central repository on GitHub. Each feature can be rolled out incrementally in bits, before being tested and merged to the production environment for use by the residents of Tunis. 

Documentation maintainers use version control for collaboration on shared documents. A documentation maintainer allows for the review of proposed changes from concerned stakeholders, after which the final version of a document can be released for use by an organization. 

Software documentation may be maintained by the personnel responsible for managing information resources of an organization using version control. For example, a repository may be used to archive documents that are no longer in use. 

This process would be more challenging in the event that documentation material such as release notes were lacking from a project. As you may have established by now, this is a process marred with frustrations, stress, and inefficacies.

The advent of version control software was instigated by a need to address the issues that plagued the incorporation of the change and release of software. Version control software has seen an evolution of three generations.

In the first generation, version control software utilized a locking mechanism on files to allow a change to occur on the file. A file could only be worked on by one individual at a given point in time. The lock put in place on a file would be lifted once an individual who was working on the said file had persisted their changes by way of commits. Change management was conducted by managing a history on each file. During this period, SCCS (Source Code Control System) and RCS (Revision Control System) were the common version control software in use.

The second generation was characterized by the use of a merge-before-commit mechanism to support concurrent editing of a file by multiple users. To incorporate changes into a file, you would be are required to merge changes made by others to the same file. Once done, you would then proceed to commit your change to the file of interest. This generation of software introduced the use of centralized repositories. Developers were able to access the code base remotely and collaborate with other developers from a shared repository. Additionally, the unit of change was tracked as the change on a set of files in lieu of a single file.

The third generation is decentralized in nature. Each developer obtains a copy of the repository. Changes to the remote repository are introduced by way of a merge. To allow multiple individuals to work on the same file, a commit-before-merge mechanism is used. Here, you make changes to the local repository. To incorporate changes made to the remote repository, you commit the local changes, after which you are able to merge changes made by other individuals. This will be demonstrated in later sections. Git is a third-generation version control tool.

Version control provides an integral part of work, that is, change management. Git and GitHub, as you'll see in this book, provide tools that allow both teams and individuals to effect change in the book of work in a fast and effective manner. This is achieved through the facilities of division of work and integration of change provided by Git and GitHub.

Applications of Version Control

Version control is applicable to both technical and non-technical pieces of work. It lends itself to software development projects just as well as it does to projects that require teams to collaborate on the formulation and usage of documents. A good example to consider is when a building's construction plans may be shared and collaboratively contrived by a team of architects. Such a project could be managed using Git and GitHub to share documents and plans.

Common Terminologies

Let's take a look at some of the common terms that you will come across in this book:

Repository

A unit of storage and change tracking that represents a directory whose contents are tracked by Git.

Branch

A version of a repository that represents the current state of the set of files that constitute a repository.

In a repository, there exists a default or main branch that represents the single source of truth.

Master

The default or main branch. A version of the repository that is considered the single source of truth.

To use the analogy of a river, the master is the main stream of a river. Other branches depart from the main stream, just like a distributary would, but instead of not returning to the main stream, the branches rejoin the main stream, just like a tributary would. This process of joining the main stream is referred to as merging.

Reference

A Git ref or reference is a name corresponding to a commit hash. The references are stored in a files in the .git/refs directory, of a repository.

HEAD

A reference to the most recent commit on a branch. The most recent commit is commonly referred to as the tip of the branch.

Working Tree

This refers to the section in which we view and make changes to the files in a branch. The files that are changed are then moved to a staging area once they are ready for a commit.

Index

This is an area where Git holds files that have been changed, added, or removed in readiness for a commit. It's a staging area from where you commit changes.

Commit

This is an entry into Git's history that represents a change made to a set of files at a given point in time. A commit references the files that have been added to the index and updates the HEAD to point to the new state of the branch.

Merge

Using the analogy of a river, a merge refers to the process through which a tributary joins the main river.

In Git, a merge is the process of incorporating changes from one branch to another. Here, the branch bringing in changes is the tributary, whereas the branch receiving the changes is the main stream of a river.

Workflows

Workflows refer to the approach a team takes to introduce changes to a code base. A workflow is characterized by a distinct approach in the usage of branches (or lack thereof) to introduce changes into a repository.

Gitflow workflow

This uses two branches: master and develop. The master branch is used to track release history, while the develop branch is used to track feature integration into the product.

Centralized workflow

This approach uses the master branch as the default development branch. The changes are committed to the master branch. It's a suitable workflow for small size teams and teams transitioning from Apache Subversion. In Apache Subversion, the trunk is the equivalent of the master branch.

Feature branch workflow

In this workflow, feature development is carried out in a dedicated branch. The branch is then merged to the master once the intended changes are approved.

Forking Workflow

In this approach, the individual seeking to make a change to a repository, makes a copy of the desired repository in their respective GitHub account. The changes are made in the copy of the source repository and then it's merged to the source repository through a pull request.