What's inside .git ?
Jan 24, 2026
Last weekend, I was curious about what lies
inside .git - the directory Git creates on git init.
I did some hands-on exploration and felt the findings were interesting enough to write down.
Let’s go!
Create a directory and initialize git
$ mkdir git-play
$ cd git-play
$ git init
.git is created with the following:
objects/, refs/, HEAD, config, hooks/, description, info/
At this stage, all of them are almost empty.
Let’s create some files inside our git-play directory, which Git will track.
$ echo "hello" > a.txt
$ mkdir src
$ echo "world" > src/b.txt
$ echo "india" > src/c.txt
Current structure of git-play directory we created:
.
├── a.txt
└── src
    ├── b.txt
    └── c.txt
.git is still in the same state. It has no idea about the files we created.
The interesting part begins. Run:
$ git add .
Broadly, Git does the following:
- Takes the content of a.txt and prepends a header, blob <content-size-bytes>\0, to the raw content: blob <content-size-bytes>\0<content-raw-bytes>. Let’s call this final-content.
- Hashes the final-content using SHA-1 to get a 40-character hash: ce013625030ba8dba906f756967f9e9ca394464a
- Compresses the final-content with zlib and stores it on disk at .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a. This stored object is called a blob.
So, .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a holds the blob for a.txt.
Note: The first two characters of the generated hash (“ce”, in this case) are used as the subdirectory name in .git/objects/, and the remaining 38 characters as the file name.
Similarly, blobs are created for b.txt and c.txt and stored in objects/ under paths derived from their hashes.
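To make the three steps concrete, here is a minimal Python sketch of them. It is my own illustration rather than Git's actual code, and the function names (build_blob, object_id, loose_object_path) are made up for this example. It assumes a.txt contains "hello" followed by the newline that echo writes.

```python
import hashlib
import zlib


def build_blob(content: bytes) -> bytes:
    """Return header + raw content, i.e. what the text calls final-content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return header + content


def object_id(final_content: bytes) -> str:
    """SHA-1 of header + content, as 40 hex characters."""
    return hashlib.sha1(final_content).hexdigest()


def loose_object_path(oid: str) -> str:
    """First two hex characters name the subdirectory, the rest the file."""
    return f".git/objects/{oid[:2]}/{oid[2:]}"


final_content = build_blob(b"hello\n")  # echo "hello" appends a newline
oid = object_id(final_content)
compressed = zlib.compress(final_content)  # the bytes that would go on disk

print(oid)                     # ce013625030ba8dba906f756967f9e9ca394464a
print(loose_object_path(oid))
```

The printed hash is the same one git hash-object reports for a.txt, which suggests the header-plus-content recipe is all there is to a blob's identity: the file name plays no part in it.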
Let’s run some commands to inspect the objects created:
$ find .git/objects -type f
.git/objects/f4/c5f046f89dd127301687b856cbe109ec46f311
.git/objects/cc/628ccd10742baea8241c5924df992b5c019f71
.git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
$ git hash-object a.txt
ce013625030ba8dba906f756967f9e9ca394464a
$ git cat-file -p cc628ccd10742baea8241c5924df992b5c019f71
world
$ git cat-file -t cc628ccd10742baea8241c5924df992b5c019f71
blob
$ git cat-file -s cc628ccd10742baea8241c5924df992b5c019f71
6
The above commands show that cc628ccd10742baea8241c5924df992b5c019f71:
- Is the object for b.txt (its content is “world”)
- Has the object type “blob”
- Has a size of 6 bytes (“world” + “\n”).
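Roughly, this is what git cat-file has to do under the hood: decompress the stored bytes, split off the header, and read the type and size from it. Here is a small Python sketch of that idea. The function parse_loose_object is a hypothetical name, and the "stored" bytes are built in memory as a stand-in for the real file under .git/objects/.

```python
import zlib


def parse_loose_object(compressed: bytes) -> tuple[str, int, bytes]:
    """Split a zlib-compressed loose object into (type, size, body)."""
    raw = zlib.decompress(compressed)
    header, _, body = raw.partition(b"\x00")  # header ends at the first NUL
    obj_type, size_text = header.decode("ascii").split(" ")
    size = int(size_text)
    if size != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, size, body


# Stand-in for the bytes Git stored for b.txt's blob.
stored = zlib.compress(b"blob 6\x00world\n")

obj_type, size, body = parse_loose_object(stored)
print(obj_type)  # comparable to `git cat-file -t`
print(size)      # comparable to `git cat-file -s`
print(body)      # comparable to `git cat-file -p`, but as raw bytes
```

In a real repository you could pass the bytes of a file from .git/objects/ instead of the in-memory stand-in, as long as the object has not been moved into a packfile.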
git add . also creates a .git/index file.
The index is basically the staging area: the next snapshot Git intends to commit.
For each staged path, it records the hash of the corresponding blob stored earlier in objects/:
$ git ls-files --stage
100644 ce013625030ba8dba906f756967f9e9ca394464a 0 a.txt
100644 cc628ccd10742baea8241c5924df992b5c019f71 0 src/b.txt
100644 f4c5f046f89dd127301687b856cbe109ec46f311 0 src/c.txt
As a summary, git add . created:
- Three blob objects in objects/ - for a.txt, src/b.txt, and src/c.txt.
- An index file - which has the hashes of those three objects.
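The index itself is a binary file, so cat won't show much. As a small peek inside, here is a hedged Python sketch that reads only its 12-byte header: the signature "DIRC", a version number, and the number of entries, all big-endian. The function name read_index_header is my own, and the sample bytes are synthetic rather than taken from the repo above.

```python
import struct


def read_index_header(data: bytes) -> tuple[int, int]:
    """Return (version, entry_count) from the first 12 bytes of an index file."""
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a Git index file")
    return version, entry_count


# Synthetic header for illustration: version 2 with three entries.
# With a real repository, open(".git/index", "rb").read() could be passed instead.
sample = struct.pack(">4sII", b"DIRC", 2, 3)
version, entry_count = read_index_header(sample)
print(version, entry_count)  # 2 3
```

The entries after the header carry more than the mode, hash, and path that git ls-files --stage prints (file timestamps and sizes, for example), which this sketch deliberately skips.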
Next, run:
$ git commit
Broadly, Git does the following:
1. Focuses on the index and splits the paths by /:
   - Top level: a.txt
   - Inside src/: b.txt and c.txt
   It starts from the deepest directory and creates a tree object for src/.

Previously, we created objects of type blob; here, the object being created has the type tree.

2. Content for src/’s tree object:
   100644 blob <hash b.txt> b.txt
   100644 blob <hash c.txt> c.txt
3. Serializes this content to get binary data.
4. Prepends the tree header tree <size>\0 (like we did for blob): tree <size>\0<binary-data-from-step-3>. Let’s call it final-tree-content.
5. Hashes this final-tree-content using SHA-1 to get a 40-character hash: 3026ac80f0b9b131294ce17794dbe4ef60f4f180. Let’s call it H1.
6. Compresses the final-tree-content with zlib and stores it on disk at .git/objects/30/26ac80f0b9b131294ce17794dbe4ef60f4f180.

Now, Git prepares the root tree object:

7. Content for the root tree object:
   100644 blob <hash a.txt> a.txt
   040000 tree <H1> src
8. Same as steps 3 to 6 (serialize, header, hash, compress + store).
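The serialize, header, and hash steps can be sketched in Python as below. This is an approximation under a few assumptions of mine: the helper names are invented, entries are simply sorted by name (Git's real ordering rule treats directories slightly differently), and the file contents are the ones we wrote with echo. Two details the pretty-printed listing hides: in the binary form each entry is "<mode> <name>\0" followed by the 20 raw hash bytes (the word blob/tree is not stored), and a directory's mode is written as 40000 without the leading zero.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Hash of a blob: SHA-1 over 'blob <size>\\0' + content."""
    header = b"blob " + str(len(content)).encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def serialize_tree(entries: list[tuple[str, str, str]]) -> bytes:
    """entries are (mode, name, hex id); returns tree header + binary body."""
    body = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        # "<mode> <name>\0" followed by the 20 raw bytes of the hash
        body += mode.encode() + b" " + name.encode() + b"\x00" + bytes.fromhex(oid)
    return b"tree " + str(len(body)).encode() + b"\x00" + body


# Deepest directory first: src/, then the root that refers to it.
src_tree = serialize_tree([
    ("100644", "b.txt", blob_id(b"world\n")),
    ("100644", "c.txt", blob_id(b"india\n")),
])
h1 = hashlib.sha1(src_tree).hexdigest()

root_tree = serialize_tree([
    ("100644", "a.txt", blob_id(b"hello\n")),
    ("40000", "src", h1),  # directories use mode 40000 in the binary form
])
h2 = hashlib.sha1(root_tree).hexdigest()

print(h1)
print(h2)
```

If the sketch mirrors Git's format faithfully, the two printed values should line up with the tree hashes Git produced for this repo. The bottom-up order matters because the root tree's content includes the hash of the src/ tree.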
$ find .git/objects -type f
.git/objects/6f/a5e8ac9ec1172d18aedd33f231a276dfb2a1ad
.git/objects/87/20a7c4e6911e2ae618f237d87302959f60f438
.git/objects/30/26ac80f0b9b131294ce17794dbe4ef60f4f180
.git/objects/f4/c5f046f89dd127301687b856cbe109ec46f311
.git/objects/cc/628ccd10742baea8241c5924df992b5c019f71
.git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
The top 3 are new objects.
- 2 of them are the tree objects we just discussed.
- 1 is an object of type commit - we’ll discuss it shortly.
Look back at steps 2 and 7: with access to just the root tree object's hash, Git can recover this much info:
root-tree (H2)
├─ a.txt → blob A
└─ src/ → src-tree (H1)
   ├─ b.txt → blob B
   └─ c.txt → blob C
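Here is a small Python sketch of that traversal. It uses an in-memory dictionary as a stand-in for .git/objects/ (no compression, to keep it short), and every function name in it is my own invention. Starting from nothing but the root tree's hash, it recovers every path and the blob behind it.

```python
import hashlib


def store_object(store: dict[str, bytes], obj_type: str, body: bytes) -> str:
    """Add header + body to the in-memory store, keyed by its SHA-1."""
    data = obj_type.encode() + b" " + str(len(body)).encode() + b"\x00" + body
    oid = hashlib.sha1(data).hexdigest()
    store[oid] = data
    return oid


def tree_body(entries) -> bytes:
    """Binary tree body: '<mode> <name>\\0' + 20 raw hash bytes per entry."""
    return b"".join(
        mode.encode() + b" " + name.encode() + b"\x00" + bytes.fromhex(oid)
        for mode, name, oid in entries
    )


def parse_tree(body: bytes):
    """Yield (mode, name, hex id) for each entry of a tree body."""
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\x00", space)
        yield (body[pos:space].decode(),
               body[space + 1:nul].decode(),
               body[nul + 1:nul + 21].hex())
        pos = nul + 21  # skip the NUL and the 20 hash bytes


def walk(store: dict[str, bytes], tree_id: str, prefix: str = ""):
    """Yield (path, blob id) for every file reachable from a tree."""
    body = store[tree_id].partition(b"\x00")[2]  # drop the 'tree <size>\0' header
    for mode, name, oid in parse_tree(body):
        if mode == "40000":  # a sub-tree: descend into it
            yield from walk(store, oid, prefix + name + "/")
        else:
            yield prefix + name, oid


store: dict[str, bytes] = {}
blob_a = store_object(store, "blob", b"hello\n")
blob_b = store_object(store, "blob", b"world\n")
blob_c = store_object(store, "blob", b"india\n")
src = store_object(store, "tree", tree_body(
    [("100644", "b.txt", blob_b), ("100644", "c.txt", blob_c)]))
root = store_object(store, "tree", tree_body(
    [("100644", "a.txt", blob_a), ("40000", "src", src)]))

paths = dict(walk(store, root))
for path, oid in paths.items():
    print(path, oid)
```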
Let’s discuss the commit object we mentioned above.
9. After creating the tree objects, Git finally creates a commit object.
10. Content for the commit object:
    tree <root-tree-hash>
    author <name> <email> <timestamp>
    committer <name> <email> <timestamp>

    commit-message
Note that the commit object stores the “root tree hash”; as we already discussed, Git can use it to traverse all the other tree and blob objects.
Other info, like a GPG signature, can also exist as content in a commit object. I'm excluding those on purpose.
11. Prepends a header, commit <size>\0: commit <size>\0<commit-content>. Let’s call it final-commit-content.
12. Same steps for final-commit-content: SHA-1 hash, compress + store in objects/.
The hash generated here is what we call the commit ID / hash.
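As a rough Python sketch of how the commit content and its hash come together (again my own illustration, with a made-up build_commit helper): the header lines are followed by a blank line and then the message. The author identity below is a placeholder I invented, so the resulting hash will not match the one in this post; the commit hash depends on every byte, including name, email, and timestamp.

```python
import hashlib


def build_commit(tree_id: str, author: str, committer: str,
                 message: str, parents: tuple[str, ...] = ()) -> bytes:
    """Return 'commit <size>\\0' + content (final-commit-content in the text)."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parents]  # absent for the first commit
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    # A blank line separates the header lines from the message.
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")
    return b"commit " + str(len(body)).encode() + b"\x00" + body


# Placeholder identity (hypothetical); the tree hash is the root tree from this post.
identity = "Example Author <author@example.com> 1768416905 +0530"
final_commit = build_commit(
    "6fa5e8ac9ec1172d18aedd33f231a276dfb2a1ad", identity, identity, "commit-message")
commit_id = hashlib.sha1(final_commit).hexdigest()

print(final_commit.partition(b"\x00")[2].decode())
print(commit_id)
```

Because the timestamp is part of the hashed content, committing the exact same tree a second later would give a different commit ID.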
Let’s take a closer look at the three new objects created (two trees and one commit):
$ git cat-file -t 3026ac80f0b9b131294ce17794dbe4ef60f4f180
tree
$ git cat-file -p 3026ac80f0b9b131294ce17794dbe4ef60f4f180
100644 blob cc628ccd10742baea8241c5924df992b5c019f71 b.txt
100644 blob f4c5f046f89dd127301687b856cbe109ec46f311 c.txt
$ git cat-file -t 6fa5e8ac9ec1172d18aedd33f231a276dfb2a1ad
tree
$ git cat-file -p 6fa5e8ac9ec1172d18aedd33f231a276dfb2a1ad
100644 blob ce013625030ba8dba906f756967f9e9ca394464a a.txt
040000 tree 3026ac80f0b9b131294ce17794dbe4ef60f4f180 src
$ git cat-file -t 8720a7c4e6911e2ae618f237d87302959f60f438
commit
$ git cat-file -p 8720a7c4e6911e2ae618f237d87302959f60f438
tree 6fa5e8ac9ec1172d18aedd33f231a276dfb2a1ad
author Prakhar Pratyush <redacted@gmail.com> 1768416905 +0530
committer Prakhar Pratyush <redacted@gmail.com> 1768416905 +0530
commit-message
Look at the contents: they match what we discussed in steps 2, 7, and 10.
- The commit step also updates .git/refs/heads/main to store the commit hash just created.
As a summary, git commit:
- Created two tree objects and one commit object in objects/.
- Updated .git/refs/heads/main to store the commit hash. HEAD points to it.
- The commit object stores the “root tree hash”; Git uses it to traverse all the other tree and blob objects.
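To show what “HEAD points to it” means on disk, here is a hedged Python sketch. HEAD normally holds a line like "ref: refs/heads/main", and the file it names holds the commit hash. The function resolve_head is a name I made up, the directory is a throwaway imitation of .git built just for the example, and the sketch ignores cases such as refs stored in a packed-refs file.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: Path) -> str:
    """Follow HEAD to the commit hash it ultimately names."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Symbolic ref such as "ref: refs/heads/main": read the file it names.
        return (git_dir / head[len("ref: "):]).read_text().strip()
    return head  # a detached HEAD stores the commit hash directly


# Throwaway directory that mimics the two files involved.
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "HEAD").write_text("ref: refs/heads/main\n")
    (git_dir / "refs" / "heads" / "main").write_text(
        "8720a7c4e6911e2ae618f237d87302959f60f438\n")
    resolved = resolve_head(git_dir)

print(resolved)  # 8720a7c4e6911e2ae618f237d87302959f60f438
```

So a branch is little more than a small text file containing one commit hash, and committing just rewrites that file.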
So, using a commit hash:
- inspect the commit object to get the root tree hash
- walk the root tree to get the state of the repo at that point in time
Sounds great?
Knowing this much is already great, and you can stop reading here, but I’ll go one step further to explain:
“what happens when we edit a few files, then git add . + git commit”
Let’s go…
Edit the content of a.txt and src/c.txt using vim (or any editor you use).
Nothing changes in .git yet.
Now, run:
$ git add .
$ find .git/objects -type f
.git/objects/6f/a5e8ac9ec1172d18aedd33f231a276dfb2a1ad
.git/objects/f4/c5f046f89dd127301687b856cbe109ec46f311
.git/objects/87/20a7c4e6911e2ae618f237d87302959f60f438
.git/objects/30/26ac80f0b9b131294ce17794dbe4ef60f4f180
.git/objects/cc/628ccd10742baea8241c5924df992b5c019f71
.git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
.git/objects/85/7e3e796f99f781a7077b7025bf2732e844723f
.git/objects/8e/06e2658cf6fa022744f59386dc8a3b8af56eb3
The last 2 are new entries. The new blobs for a.txt and src/c.txt are stored at those two locations.
$ git cat-file -p 857e3e796f99f781a7077b7025bf2732e844723f
hello edited
$ git cat-file -t 857e3e796f99f781a7077b7025bf2732e844723f
blob
$ git cat-file -p 8e06e2658cf6fa022744f59386dc8a3b8af56eb3
india edited
$ git cat-file -t 8e06e2658cf6fa022744f59386dc8a3b8af56eb3
blob
git add . also updates the index.
$ git ls-files --stage
100644 857e3e796f99f781a7077b7025bf2732e844723f 0 a.txt
100644 cc628ccd10742baea8241c5924df992b5c019f71 0 src/b.txt
100644 8e06e2658cf6fa022744f59386dc8a3b8af56eb3 0 src/c.txt
- Hashes for a.txt and src/c.txt are updated to point to the new blobs. The old ones still exist in objects/, but the index (staging area) only contains the new ones.
- The hash for src/b.txt is left unchanged.
As a summary, git add .:
- Created two new blob objects in objects/ - for a.txt and src/c.txt.
- Updated the index file - the hashes of the two new blob objects replace the older ones.
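One way to picture this is to model the index as a plain mapping from path to blob hash. The Python sketch below does that; it is a simplification of mine (the real index is a binary file with more fields), stage is a hypothetical function name, and I am assuming the edited files contain exactly "hello edited" and "india edited" plus a trailing newline.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Hash of a blob: SHA-1 over 'blob <size>\\0' + content."""
    header = b"blob " + str(len(content)).encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def stage(index: dict[str, str], worktree: dict[str, bytes]) -> dict[str, str]:
    """Toy model of `git add .`: map every path to the hash of its content."""
    updated = dict(index)
    for path, content in worktree.items():
        updated[path] = blob_id(content)
    return updated


before = stage({}, {
    "a.txt": b"hello\n",
    "src/b.txt": b"world\n",
    "src/c.txt": b"india\n",
})
after = stage(before, {
    "a.txt": b"hello edited\n",      # assumed edited content
    "src/b.txt": b"world\n",         # untouched file
    "src/c.txt": b"india edited\n",  # assumed edited content
})

changed = sorted(path for path in after if after[path] != before[path])
print(changed)  # ['a.txt', 'src/c.txt']
```

The untouched file hashes to the same value as before, so its index entry stays put and no new blob is needed for it.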
Run:
$ git commit
Broadly, Git does the following:
- Builds tree objects (bottom up - for src/, then root).
- Content for src/’s tree object:
  100644 blob <hash b.txt> b.txt
  100644 blob <new-hash c.txt> c.txt
- Same process: serialize + add tree header + SHA-1 + zlib compress + store.
- New tree object created in objects/ - let’s say its hash is H3.
- Content for the root tree object:
  100644 blob <new-hash a.txt> a.txt
  040000 tree <H3> src
- Same process: serialize + add tree header + SHA-1 + zlib compress + store.
- New tree object created in objects/ - let’s say its hash is H4.
- Also, one commit object is created (we’ll see what it contains).
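A quick way to convince yourself of the object counts is to replay both snapshots against a toy object store. The Python sketch below is hard-coded to this post's layout (a.txt plus src/b.txt and src/c.txt); the helper names are invented, and the edited contents are my assumption of what the files hold. It leaves out commit objects and compression.

```python
import hashlib


def put(store: dict[str, bytes], obj_type: str, body: bytes) -> str:
    """Store header + body under its SHA-1; identical content lands on the same key."""
    data = obj_type.encode() + b" " + str(len(body)).encode() + b"\x00" + body
    oid = hashlib.sha1(data).hexdigest()
    store[oid] = data
    return oid


def entry(mode: str, name: str, oid: str) -> bytes:
    """One binary tree entry: '<mode> <name>\\0' + 20 raw hash bytes."""
    return mode.encode() + b" " + name.encode() + b"\x00" + bytes.fromhex(oid)


def write_snapshot(store: dict[str, bytes], a: bytes, b: bytes, c: bytes) -> str:
    """Store blobs, then the src/ tree, then the root tree; return the root id."""
    src = put(store, "tree",
              entry("100644", "b.txt", put(store, "blob", b))
              + entry("100644", "c.txt", put(store, "blob", c)))
    return put(store, "tree",
               entry("100644", "a.txt", put(store, "blob", a))
               + entry("40000", "src", src))


store: dict[str, bytes] = {}
root1 = write_snapshot(store, b"hello\n", b"world\n", b"india\n")
count_after_first = len(store)  # 3 blobs + 2 trees

root2 = write_snapshot(store, b"hello edited\n", b"world\n", b"india edited\n")
new_objects = len(store) - count_after_first  # 2 blobs + 2 trees

print(count_after_first, new_objects)  # 5 4
```

The blob for b.txt is shared between both snapshots, while editing c.txt forces a new src/ tree, and a new src/ tree (plus the edited a.txt) forces a new root tree. A change always ripples upward along its path to the root.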
$ find .git/objects -type f
.git/objects/c9/42f928dc1c54e9a4efca59e4a1536279d4e6d0
.git/objects/43/33a3d6ed2bd0ca235bdc4144d8216c8c16b109
.git/objects/f7/dffbc10cf4a75f03058c6ae296da99a2f43cb7
.git/objects/6f/a5e8ac9ec1172d18aedd33f231a276dfb2a1ad
.git/objects/f4/c5f046f89dd127301687b856cbe109ec46f311
.git/objects/87/20a7c4e6911e2ae618f237d87302959f60f438
.git/objects/30/26ac80f0b9b131294ce17794dbe4ef60f4f180
.git/objects/cc/628ccd10742baea8241c5924df992b5c019f71
.git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
.git/objects/85/7e3e796f99f781a7077b7025bf2732e844723f
.git/objects/8e/06e2658cf6fa022744f59386dc8a3b8af56eb3
Top 3 are new objects. Let’s take a closer look.
$ git cat-file -t f7dffbc10cf4a75f03058c6ae296da99a2f43cb7
tree
$ git cat-file -p f7dffbc10cf4a75f03058c6ae296da99a2f43cb7
100644 blob cc628ccd10742baea8241c5924df992b5c019f71 b.txt
100644 blob 8e06e2658cf6fa022744f59386dc8a3b8af56eb3 c.txt
$ git cat-file -t c942f928dc1c54e9a4efca59e4a1536279d4e6d0
tree
$ git cat-file -p c942f928dc1c54e9a4efca59e4a1536279d4e6d0
100644 blob 857e3e796f99f781a7077b7025bf2732e844723f a.txt
040000 tree f7dffbc10cf4a75f03058c6ae296da99a2f43cb7 src
$ git cat-file -t 4333a3d6ed2bd0ca235bdc4144d8216c8c16b109
commit
$ git cat-file -p 4333a3d6ed2bd0ca235bdc4144d8216c8c16b109
tree c942f928dc1c54e9a4efca59e4a1536279d4e6d0
parent 8720a7c4e6911e2ae618f237d87302959f60f438
author Prakhar Pratyush <redacted@gmail.com> 1768588668 +0530
committer Prakhar Pratyush <redacted@gmail.com> 1768588668 +0530
commit message
Note the “parent” entry in the commit object: it holds the hash of the previous commit.
As a summary, git commit:
- Created two tree objects and one commit object in objects/.
- Updated .git/refs/heads/main to store the new commit hash. HEAD points to it.
- The new commit has a reference to the older one - “parent”.
- The commit object stores the new “root tree hash”.
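The “parent” entries are what turn individual commits into a history. As a sketch of how a log-style walk could follow them, the Python below parses the two commit contents shown in this post (with the blank line before the message restored) and walks from the newest commit back to the first. parse_commit and history are names I made up, and the walk only follows the first parent.

```python
def parse_commit(text: str) -> dict:
    """Split commit content into header fields and the message."""
    headers, _, message = text.partition("\n\n")
    commit = {"parents": [], "message": message.strip()}
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)  # a merge commit has several
        else:
            commit[key] = value
    return commit


def history(commits: dict[str, str], start: str):
    """Yield commit ids from `start` back to the root via first parents."""
    current = start
    while current is not None:
        yield current
        parents = parse_commit(commits[current])["parents"]
        current = parents[0] if parents else None


# Commit contents as printed by `git cat-file -p` in this post.
commits = {
    "8720a7c4e6911e2ae618f237d87302959f60f438": (
        "tree 6fa5e8ac9ec1172d18aedd33f231a276dfb2a1ad\n"
        "author Prakhar Pratyush <redacted@gmail.com> 1768416905 +0530\n"
        "committer Prakhar Pratyush <redacted@gmail.com> 1768416905 +0530\n"
        "\n"
        "commit-message\n"
    ),
    "4333a3d6ed2bd0ca235bdc4144d8216c8c16b109": (
        "tree c942f928dc1c54e9a4efca59e4a1536279d4e6d0\n"
        "parent 8720a7c4e6911e2ae618f237d87302959f60f438\n"
        "author Prakhar Pratyush <redacted@gmail.com> 1768588668 +0530\n"
        "committer Prakhar Pratyush <redacted@gmail.com> 1768588668 +0530\n"
        "\n"
        "commit message\n"
    ),
}

log = list(history(commits, "4333a3d6ed2bd0ca235bdc4144d8216c8c16b109"))
for commit_id in log:
    print(commit_id, parse_commit(commits[commit_id])["tree"])
```

Each commit in the walk carries its own root tree hash, so every step back in history is also a complete snapshot of the repo at that point.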
That’s it.
Hopefully this makes clear:
- How Git manages everything within .git
- What terms like tree, blob, object, hash ID, etc. mean, and their roles.
After reading this, do some Google/LLM searching about how git status, git diff, etc. work - they’ll easily make sense!
Thanks for reading :)