Book Image

Git Version Control Cookbook

Book Image

Git Version Control Cookbook

Overview of this book

Table of Contents (19 chapters)
Git Version Control Cookbook
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Index

Git's objects


Now that you know Git stores every commit as a full tree state or snapshot, let's look closer at the object's Git store in the repository.

Git's object storage is a key-value storage, the key being the ID of the object and the value being the object itself. The key is an SHA-1 hash of the object, with some additional information such as size. There are four types of objects in Git, branches (which are not objects, but are important), and the special HEAD pointer that refers to the branch/commit currently checked out. The four object types are as follows:

  • Files, or blobs as they are also called in the Git context

  • Directories, or trees in the Git context

  • Commits

  • Tags

We will start by looking at the most recent commit object in the repository we just cloned, keeping in mind that the special HEAD pointer points to the branch currently checked out.

Getting ready

To view the objects in the Git database, we first need a repository to be examined. For this recipe, we will clone an example repository located here:

$ git clone https://github.com/dvaske/data-model.git
$ cd data-model

Now you are ready to look at the objects in the database, we will start by looking first at the commit object, then the trees, the files, and finally the branches and tags.

How to do it...

Let's take a closer look at the object's Git stores in the repository.

The commit object

The special Git object HEAD always points to the current snapshot/commit, so we can use that as a target for our request of the commit we want to have a look at:

$ git cat-file -p HEAD
tree 34fa038544bcd9aed660c08320214bafff94150b
parent a90d1906337a6d75f1dc32da647931f932500d83
author Aske Olsson <[email protected]> 1386933960 +0100
committer Aske Olsson <[email protected]> 1386941455 +0100

This is the subject line of the commit message

It should be followed by a blank line then the body, which is this text. Here you can have multiple paragraphs etc. and explain your commit. It's like an email with subject and body, so get people's attention in the subject

The cat-file command with the -p option pretty prints the object given on the command line; in this case, HEAD, which points to master, which in turn points to the most-recent commit on the branch.

We can now see the commit object, consisting of the root tree (tree), the parent commit object's ID (parent), author and timestamp information (author), committer and timestamp information (committer), and the commit message.

The tree object

To see the tree object, we can run the same command on the tree, but with the tree ID (34fa038544bcd9aed660c08320214bafff94150b) as the target:

$ git cat-file -p 34fa038544bcd9aed660c08320214bafff94150b 
100644 blob f21dc2804e888fee6014d7e5b1ceee533b222c15    README.md
040000 tree abc267d04fb803760b75be7e665d3d69eeed32f8    a_sub_directory
100644 blob b50f80ac4d0a36780f9c0636f43472962154a11a    another-file.txt
100644 blob 92f046f17079aa82c924a9acf28d623fcb6ca727    cat-me.txt
100644 blob bb2fe940924c65b4a1cefcbdbe88c74d39eb23cd    hello_world.c

We can also specify that we want the tree object from the commit pointed to by HEAD, by specifying git cat-file -p HEAD^{tree}, which would give the same results as the previous one. The special notation HEAD^{tree} means that from the reference given, (HEAD) recursively dereferences the object at the reference until a tree object is found. The first tree object is the root tree object found from the commit pointed to by the master branch, which is pointed to by HEAD. A generic form of the notation is <rev>^<type> and will return the first object of <type> searching recursively from <rev>.

From the tree object, we can see what it contains: file type/permissions, type (tree/blob), ID, and pathname:

Type/

Permissions

Type

ID/SHA-1

Pathname

100644

blob

f21dc2804e888fee6014d7e5b1ceee533b222c15

README.md

040000

tree

abc267d04fb803760b75be7e665d3d69eeed32f8

a_sub_directory

100644

blob

b50f80ac4d0a36780f9c0636f43472962154a11a

another-file.txt

100644

blob

92f046f17079aa82c924a9acf28d623fcb6ca727

cat-me.txt

100644

blob

bb2fe940924c65b4a1cefcbdbe88c74d39eb23cd

hello-world.c

The blob object

Now, we can investigate the blob (file) object. We can do it using the same command, giving the blob ID as target for the cat-me.txt file:

$ git cat-file -p 92f046f17079aa82c924a9acf28d623fcb6ca727

This is the content of the file: "cat-me.txt."

Not really that exciting, huh?

This is simply the content of the file, which we will also get by running a normal cat cat-me.txt command. So, the objects are tied together, blobs to trees, trees to other trees, and the root tree to the commit object, all by the SHA-1 identifier of the object.

The branch

The branch object is not really like any other Git objects; you can't print it using the cat-file command as we can with the others (if you specify the -p pretty print, you'll just get the commit object it points to):

$ git cat-file master
usage: git cat-file (-t|-s|-e|-p|<type>|--textconv) <object>
   or: git cat-file (--batch|--batch-check) < <list_of_objects>

<type> can be one of: blob, tree, commit, tag.
...
$ git cat-file -p master
tree 34fa038544bcd9aed660c08320214bafff94150b
parent a90d1906337a6d75f1dc32da647931f932500d83
...

Instead, we can take a look at the branch inside the .git folder where the whole Git repository is stored. If we open the text file .git/refs/heads/master, we can actually see the commit ID the master branch points to. We can do this using cat as follows:

$ cat .git/refs/heads/master
34acc370b4d6ae53f051255680feaefaf7f7850d 

We can verify that this is the latest commit by running git log -1:

$ git log -1
commit 34acc370b4d6ae53f051255680feaefaf7f7850d
Author: Aske Olsson <[email protected]>
Date:   Fri Dec 13 12:26:00 2013 +0100

    This is the subject line of the commit message
...

We can also see that HEAD is pointing to the active branch by using cat with the .git/HEAD file:

$ cat .git/HEAD
ref: refs/heads/master

The branch object is simply a pointer to a commit, identified by its SHA-1 hash.

The tag object

The last object to be analyzed is the tag object. There are three different kinds of tags: a lightweight (just a label) tag, an annotated tag, and a signed tag. In the example repository, there are two annotated tags:

$ git tag
v0.1
v1.0

Let's take a closer look at the v1.0 tag:

$ git cat-file -p v1.0
object 34acc370b4d6ae53f051255680feaefaf7f7850d
type commit
tag v1.0
tagger Aske Olsson <[email protected]> 1386941492 +0100

We got the hello world C program merged, let's call that a release 1.0

As you can see, the tag consists of an object, which in this case is the latest commit on the master branch, the object's type (both, commits, and blobs and trees can be tagged), the tag name, the tagger and timestamp, and finally a tag message.

How it works...

The Git command git cat-file -p will pretty print the object given as an input. Normally, it is not used in everyday Git commands, but it is quite useful to investigate how it ties together the objects. We can also verify the output of git cat-file, by rehashing it with the Git command git hash-object; for example, if we want to verify the commit object at HEAD (34acc370b4d6ae53f051255680feaefaf7f7850d), we can run the following command:

$ git cat-file -p HEAD | git hash-object -t commit --stdin
34acc370b4d6ae53f051255680feaefaf7f7850d

If you see the same commit hash as HEAD pointing towards you, you can verify whether it is correct with git log -1.

There's more...

There are many ways to see the objects in the Git database. The git ls-tree command can easily show the contents of trees and subtrees and git show can show the Git objects, but in a different way.

See also

  • For further information about Git plumbing, see Chapter 11, Git Plumbing and Attributes, almost at the end of this book.