A Brief Introduction to Git Data Structures

TLDR

  • The .git directory is the core of a Git repository; deleting it removes all local version history and leaves only the working-tree files.
  • Git objects (Blob, Tree, Commit) are all stored in the objects directory using SHA-1 hashes.
  • Branches and Tags are essentially just pointers to specific Commit objects, stored in the refs directory.
  • The logs directory records the change history of HEAD and branches, allowing for recovery after accidental operations via git reflog.
  • The index file is a binary file that records the snapshot of the staging area after git add.
  • The HEAD file records the pointer to the current branch or Commit.

Directory Structure Analysis

hooks

Stores various custom scripts that are automatically triggered at specific moments during Git operations (such as commit, push, merge). Suitable for running automated tests or code style checks.

info

Stores auxiliary information, where the exclude file is used to define local exclusion rules.

  • When this is useful: When a developer needs to ignore specific files but the rule should not be committed and shared with the team, put it in info/exclude instead of .gitignore.

logs

Records the update history of references (such as branch, HEAD), used for tracking change records.

  • When this is useful: When git reset --hard or git rebase -i appears to lose Commits, git reflog shows the history recorded in logs/HEAD, so you can find the earlier hash and restore it.
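To make the contents of logs/HEAD more concrete, here is a minimal Python sketch that parses one reflog line. It assumes the usual layout of old hash, new hash, identity, timestamp, and timezone, followed by a tab and the message. The names ReflogEntry and parse_reflog_line, and the sample line, are hypothetical and exist only for illustration.

```python
import re
from typing import NamedTuple


class ReflogEntry(NamedTuple):
    old: str        # hash the ref pointed to before the update
    new: str        # hash the ref points to after the update
    who: str        # "Name <email>" of whoever moved the ref
    timestamp: int  # seconds since the Unix epoch
    tz: str         # timezone offset such as "+0800"
    message: str    # e.g. "reset: moving to HEAD~1"


# Assumed layout: "<old> <new> <name> <email> <timestamp> <tz>\t<message>"
_LINE = re.compile(
    r"^([0-9a-f]{40,64}) ([0-9a-f]{40,64}) (.*) (\d+) ([+-]\d{4})(?:\t(.*))?$"
)


def parse_reflog_line(line: str) -> ReflogEntry:
    """Split one line of a reflog file into its fields."""
    match = _LINE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"unrecognised reflog line: {line!r}")
    old, new, who, timestamp, tz, message = match.groups()
    return ReflogEntry(old, new, who, int(timestamp), tz, message or "")


# Hypothetical sample line (the hashes are placeholders, not real objects).
sample = (
    "a" * 40 + " " + "b" * 40
    + " Jane Doe <jane@example.com> 1722400000 +0800\treset: moving to HEAD~1"
)
entry = parse_reflog_line(sample)
print(entry.old, "->", entry.new, "|", entry.message)
```

The old field of an entry is the hash to return to when undoing that operation, for example with git reset --hard followed by that hash.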

objects

Stores all Git data objects (Blob, Tree, Commit).

  • Structure: For loose (not yet packed) objects, the first two characters of the 40-character SHA-1 hash serve as the directory name, and the remaining 38 characters serve as the filename.
  • Object Types:
    • Blob object: Stores the actual content of a file.
    • Tree object: Stores the mapping between the directory structure and the SHA-1 of files.
    • Commit object: Stores commit information (including the SHA-1 of the Tree, the SHA-1 of each parent Commit, the author, and the message).
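The naming scheme above can be sketched in a few lines of Python. For a Blob, Git hashes a short header ("blob", a space, the content length, and a NUL byte) followed by the file content. The function names blob_object_id and loose_object_path are hypothetical helpers, and the sketch assumes a repository that uses SHA-1 (the default).

```python
import hashlib


def blob_object_id(content: bytes) -> str:
    """Compute the SHA-1 that Git assigns to a blob with this content."""
    header = b"blob %d\0" % len(content)  # "blob <size>\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Map an object hash to its path relative to the .git directory."""
    return f"objects/{object_id[:2]}/{object_id[2:]}"


object_id = blob_object_id(b"hello world\n")
print(object_id)
print(loose_object_path(object_id))
```

The result should match what git hash-object reports for a file with the same content. The file stored at that path is the zlib-compressed header plus content, not the raw file.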

refs

Stores the pointers for branches and Tags; each file contains the hash of the Commit it points to (for an annotated Tag, the hash of a tag object that in turn points to the Commit).

  • heads: Stores local branches.
  • remotes: Stores remote-tracking branches (the local record of where each remote branch pointed at the last fetch).
  • tags: Stores Tags.
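Because each ref is just a small text file, reading one takes only a few lines. The following Python sketch uses two hypothetical helpers, read_ref and list_local_branches. It only looks at loose ref files; Git may also keep refs in a packed-refs file, which this sketch deliberately ignores.

```python
from pathlib import Path
from typing import Dict, Optional


def read_ref(git_dir: Path, ref_name: str) -> Optional[str]:
    """Return the hash stored in a loose ref such as "refs/heads/main"."""
    ref_file = git_dir / ref_name
    if not ref_file.is_file():
        return None  # missing, or stored in packed-refs (not handled here)
    return ref_file.read_text().strip()


def list_local_branches(git_dir: Path) -> Dict[str, str]:
    """Map each loose local branch name to the hash it points to."""
    heads = git_dir / "refs" / "heads"
    return {
        path.relative_to(heads).as_posix(): path.read_text().strip()
        for path in sorted(heads.rglob("*"))
        if path.is_file()  # branch names like "feature/x" are nested dirs
    }
```

For example, read_ref(Path(".git"), "refs/heads/main") returns the hash that the main branch points to, assuming that branch exists as a loose ref.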

File Structure Analysis

COMMIT_EDITMSG

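Records the message of the most recent Commit. Git writes the message to this file whenever it opens the editor, for example during git commit, git commit --amend, or when concluding a merge after conflict resolution.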

config

Stores repository-specific settings in the same format as the global .gitconfig; values set here override the global ones.

index

A binary file that records the staging area: the file snapshot from the latest Commit plus any changes staged via git add.

HEAD

Stores the currently checked-out branch name or Commit HASH. If it points to a branch, the content format is ref: refs/heads/branch-name; otherwise (a detached HEAD) the file contains the Commit HASH itself.
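The two possible forms of the HEAD file can be told apart with a simple prefix check. This is a minimal Python sketch; parse_head is a hypothetical helper name, and the hash used in the example is a placeholder.

```python
from typing import Tuple


def parse_head(content: str) -> Tuple[str, str]:
    """Classify the content of the HEAD file.

    Returns ("ref", "refs/heads/<branch>") when HEAD points to a branch,
    or ("detached", "<hash>") when it holds a commit hash directly.
    """
    text = content.strip()
    prefix = "ref: "
    if text.startswith(prefix):
        return ("ref", text[len(prefix):])
    return ("detached", text)


print(parse_head("ref: refs/heads/branch-name\n"))
print(parse_head("c" * 40 + "\n"))  # placeholder hash, not a real commit
```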

ORIG_HEAD

Records the state of HEAD before performing destructive operations (such as git reset, git merge), providing a recovery path.

FETCH_HEAD

Records what the most recent git fetch retrieved: one line per fetched branch, giving the Commit HASH and where it came from.

  • When this is useful: After executing git fetch, you can check this file to see the latest state of the remote branches. An example line looks like this:
3b3a827b86d264f9c81bc77ef6e0e3df5e302ae8 not-for-merge branch 'main' of http://127.0.0.1/wing/Project
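A line like the one above can be split into its parts with a short parser. This is a sketch under the assumption that each line holds a hash, an optional not-for-merge marker, and a free-text description, separated by tabs or spaces; FetchHeadEntry and parse_fetch_head_line are hypothetical names.

```python
from typing import NamedTuple


class FetchHeadEntry(NamedTuple):
    object_id: str    # hash of the fetched commit
    for_merge: bool   # False when the line carries "not-for-merge"
    description: str  # e.g. "branch 'main' of <url>"


def parse_fetch_head_line(line: str) -> FetchHeadEntry:
    """Split one FETCH_HEAD line into hash, merge flag, and description."""
    parts = line.strip().split(None, 1)  # hash, then everything else
    if not parts:
        raise ValueError("empty FETCH_HEAD line")
    object_id = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    marker = "not-for-merge"
    if rest.startswith(marker):
        return FetchHeadEntry(object_id, False, rest[len(marker):].lstrip())
    return FetchHeadEntry(object_id, True, rest)


line = (
    "3b3a827b86d264f9c81bc77ef6e0e3df5e302ae8 not-for-merge "
    "branch 'main' of http://127.0.0.1/wing/Project"
)
print(parse_fetch_head_line(line))
```

A line without the marker is the one a plain git pull would merge, which is why the flag is kept as a separate field here.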

The Essence of Branches

As can be seen from Git's data structure, branches and Tags are merely pointers to specific Commit objects. The checked-out branch moves forward with every new Commit, while a Tag stays fixed on one point in history. By following the parent hashes recorded in each Commit object, Git is able to reconstruct the complete history graph.
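The way history falls out of parent links can be illustrated with a small Python sketch. It assumes the decompressed text of each Commit object is already available in a dictionary keyed by hash; that in-memory store, the short ids c1 to c3, and the helpers parent_ids and walk_history are hypothetical stand-ins, not how Git itself is implemented.

```python
from typing import Dict, Iterator, List


def parent_ids(commit_text: str) -> List[str]:
    """Collect the "parent <hash>" header lines of a commit object."""
    parents = []
    for line in commit_text.split("\n"):
        if line == "":
            break  # a blank line separates the headers from the message
        key, _, value = line.partition(" ")
        if key == "parent":
            parents.append(value)
    return parents


def walk_history(start: str, commits: Dict[str, str]) -> Iterator[str]:
    """Yield every commit reachable from start by following parents."""
    seen = set()
    stack = [start]
    while stack:
        commit_id = stack.pop()
        if commit_id in seen:
            continue
        seen.add(commit_id)
        yield commit_id
        stack.extend(reversed(parent_ids(commits[commit_id])))


# Hypothetical store: short fake ids instead of real 40-character hashes.
commits = {
    "c1": "tree t1\nauthor A <a@example.com> 0 +0000\n\nfirst\n",
    "c2": "tree t2\nparent c1\nauthor A <a@example.com> 0 +0000\n\nsecond\n",
    "c3": "tree t3\nparent c2\nauthor A <a@example.com> 0 +0000\n\nthird\n",
}
print(list(walk_history("c3", commits)))
```

A branch in this picture is nothing more than a name for the start argument: moving the branch means changing which hash the walk begins from.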

TIP

If you discard Commits using git reset --hard, their objects do not immediately disappear from the objects folder. Until Git garbage-collects them, they can still be located via git reflog and recovered.

Change Log

  • 2024-07-31 Initial document creation.
  • 2024-09-20 Removed the description of a .gitconfig file in the repository root directory: Git does not read such a file, so it cannot be used to put configuration under version control.