Deep dive into Git

I really think this is the most important one: when you work on a software project, you need to master, or at least really grasp, the basic concepts of Git. So without further ado, I am going to teach you how Git works, along with some tips and tricks.

What is Git? It’s a content tracker, like others before it (SVN and …). The feature that distinguishes Git from the others is that it is a distributed version control system: the Git “way of thinking”.

How Git works underneath

Git uses three parts for versioning:

%%{init: {'theme': 'dark', 'themeVariables': {'darkMode': true}, "flowchart" : { "curve" : "stepBefore" } } }%%

flowchart LR
    subgraph repo["./personnal_project"]

        subgraph workdir["Working directory"]
            file["index.html"]
        end

        subgraph index["Index"]
            style index stroke-dasharray: 5 5
        end

        subgraph local["Local history"]
            style local stroke-dasharray: 5 5
        end
    end

Both the local history and the index are stored in the .git folder. But what are these three components?

  1. Working directory: this is the folder you work in, with all your files. On their own, these files have nothing to do with Git.
  2. Index: this area is for carefully selecting what you want to include in the follow-up commit (your next step in the history). It is the staging area, and it is transitory.
  3. Local history: this is where Git stores the commit history.

To create a Git repo, you can use the git init -b main command.

To add a file to the index, use git add <file>, or stage the whole project with git add .

The next step is to commit to the history with git commit -m "commit message".

If you have already used Git, here are the typical steps you know:

%%{init: {'theme': 'dark', 'themeVariables': {'darkMode': true}, "flowchart" : { "curve" : "stepBefore" } } }%%
sequenceDiagram
    workingdir->>index: git add index.html
    index->>localhistory: git commit -m "add index of the website"

Without -m, Git will ask you for a message in your default editor.
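The same sequence can be tried end to end in a throwaway directory. This is a minimal sketch, assuming Git 2.28 or newer (needed for git init -b); the temporary directory and the "Demo User" identity are placeholders, not part of the original project.

```shell
set -e
repo=$(mktemp -d)                 # throwaway directory standing in for your project
cd "$repo"
git init -q -b main               # create the repo with "main" as the first branch
git config user.name "Demo User"  # placeholder identity, stored in .git/config only
git config user.email "demo@example.com"

touch index.html                  # working directory: a plain file
git add index.html                # index: stage the file for the next commit
staged=$(git diff --cached --name-only)      # what the next commit will contain
git commit -q -m "add index of the website"  # local history: record the commit
commit_count=$(git rev-list --count HEAD)

echo "staged: $staged"
echo "commits in history: $commit_count"
```

Running git status between each step is a good way to watch the file move from the working directory to the index and then into the history.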

Git also has the move (mv) and remove (rm) commands. With remove, you can unstage a file using the git rm --cached <file> command.

To see the commit history, you can use git log or git log --oneline, then use git show <commit-id> to see the changes and message of a specific commit:

$ git log --oneline
1170a59 (HEAD -> main) First file of the website

$ git show 1170a59
commit 1170a594c97a7c4b4bfff77ad35230b81f833a41 (HEAD -> main)
Author: Guillaume Dorschner <guillaume.dorschner@icloud.com>
Date:   Thu Mar 26 19:56:41 2026 +0100

    First file of the website
diff --git a/index.html b/index.html
new file mode 100644
index 0000000..e69de29

You can use the --follow flag (git log --follow <file>) to track a specific file throughout the history, including across renames.

git diff displays the changes between your working directory and the staging area. You can also see the differences between the staging area and the last commit with git diff --cached.

You can also view the differences between two commits, just like diff in bash:

git diff <A-commit> <B-commit>

Structure

your_project
├── .git                    # hidden folder generated by git
│   ├── config
│   ├── description
│   ├── HEAD
│   ├── hooks
│   │   ├── commit-msg.sample
│   │   ├── post-update.sample
│   │   ├── pre-commit.sample
│   │   ├── pre-merge-commit.sample
│   │   ├── pre-push.sample
│   │   └── pre-rebase.sample
│   ├── index
│   ├── info
│   │   └── exclude
│   ├── logs
│   │   ├── HEAD
│   │   └── refs
│   │       └── heads
│   │           └── main
│   ├── objects
│   │   ├── 0a
│   │   │   └── 5568b3eb72786b7d025f317905c26d9b2a59ce
│   │   ├── 11
│   │   │   └── e937b1c0756a158c66bb67e04e0a3646a9253c
│   │   ├── info
│   │   └── pack
│   └── refs
│       ├── heads
│       └── tags
│ # the rest of the files are your working dir
└── index.html

Objects

Git stores data as a collection of objects and uses packfiles to compress and store them efficiently; more information is available in the Git documentation.

Git uses a key-value storage model:

  • Key: a cryptographic hash (historically SHA-1, now optionally SHA-256)
  • Value: the content of the object

Each object is identified by the hash of its content. This guarantees content integrity and means identical content is stored only once. Objects are stored under .git/objects/ using their hash:

  • The first 2 characters form the directory name (this improves filesystem efficiency, since too many files in the same directory tend to slow the filesystem down).
  • The remaining characters form the filename.
├── f9
│   └── abef842f9a4b17b614de945456da38a5d2009d
├── fb
│   ├── 2d94894b96a39f2504d6a8152150c6c1de5b1d
│   └── 7ac4652aae96dac4968621df8a5833fdae95b2
├── fc
│   └── 4d19bb2e66650d4f3dbe7f6907eb3ba0bcdd09

These are called loose objects. Each loose object is compressed individually with zlib (the DEFLATE algorithm).
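To make the key-value idea concrete, the key of a blob can be recomputed by hand. The sketch below assumes a SHA-1 repository format and that the sha1sum tool (GNU coreutils) is installed; on macOS, shasum would be the closest equivalent. The key is the hash of a small header, "blob <size>" followed by a NUL byte, plus the content.

```shell
set -e
# Key computed by Git for a file whose content is "hello" plus a newline.
git_hash=$(printf 'hello\n' | git hash-object --stdin)

# Same key by hand: SHA-1 over "blob 6\0" + content ("hello\n" is 6 bytes long).
manual_hash=$(printf 'blob 6\0hello\n' | sha1sum | cut -d' ' -f1)

echo "git hash-object: $git_hash"
echo "manual sha1:     $manual_hash"
```

Because the key depends only on the content, two files with the same bytes always map to the same object, whatever their names are.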

Packfiles

When a repository grows, this storage becomes inefficient. To solve this, Git uses packfiles.

Packfiles combine objects into a single .pack file and apply compression and delta encoding.

Git has four types of objects:

  • Blobs

    Hold a file’s data, without any associated metadata (not even its name).

  • Trees

    Record the contents of a single level of the directory hierarchy, e.g. they list files and subtrees.

  • Commits

    Metadata for each change introduced into the repo, including author, committer, date and message. A commit points to a tree object and to zero or more parent commits (none for the first commit, several for a merge), like a linked list.

  • Tags

    A name linked to a commit.
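The object types listed above can be inspected with git cat-file. This is a small sketch in a throwaway repository (the directory, identity and file content are placeholders; Git 2.28 or newer is assumed for git init -b). An annotated tag is used because only annotated tags are stored as tag objects.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git config user.name "Demo User"; git config user.email "demo@example.com"

echo "<h1>Hello</h1>" > index.html
git add index.html
git commit -q -m "Initial commit"
git tag -a v1.0 -m "first release"          # annotated tag: creates a tag object

commit_type=$(git cat-file -t HEAD)             # type of the object HEAD points to
tree_type=$(git cat-file -t 'HEAD^{tree}')      # the tree recorded by that commit
blob_type=$(git cat-file -t HEAD:index.html)    # the content of index.html
tag_type=$(git cat-file -t v1.0)                # the annotated tag itself

echo "$commit_type $tree_type $blob_type $tag_type"
git cat-file -p HEAD            # prints the tree, author, committer and message
git cat-file -p 'HEAD^{tree}'   # prints the entries: mode, type, hash, file name
```

Note that the file name only appears in the tree listing, never in the blob.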

You can think of the git structure like this:

graph TD
%%{init: {'theme': 'dark', 'themeVariables': {'darkMode': true}, "flowchart" : { "curve" : "stepBefore" } } }%%
    commit["commit 1492<br>Initial commit"]
    tree@{ shape: tri, label: "tree 8675309"}

    blob1["blob dead23<br>Four score and seven ..."]
    blob2["blob feeb1e<br>Mary had a little lamb"]

    tag{"tag v1.0<br>object: 2504624"}
    branch("master")

    tag --> commit
    branch --> commit
    commit --> tree
    tree --> blob1
    tree --> blob2

    style branch fill:#ffee7e,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe
    style tag fill:#0171bc,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe
    style commit fill:#b2d4eb,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe
    style tree fill:#b2d4eb,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe
    style blob1 fill:#80b8de,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe
    style blob2 fill:#80b8de,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe

If you copy a folder to another location, the folder structure inside the copy remains the same as in the original, so the same tree object is reused instead of being stored twice.


A/
    file1.txt
    file2.txt
    Subfolder/
        file3.txt

B/
    file1.txt
    file2.txt
    Subfolder/
        file3.txt
    A/
        file1.txt
        file2.txt
        Subfolder/
            file3.txt
    
graph TD
%%{init: {'theme': 'dark', 'themeVariables': {'darkMode': true}, "flowchart" : { "curve" : "stepBefore" } } }%%
    tree1["blob dead23<br> ..."]
    tree2["blob feeb1e<br>Mary had a little lamb"]


    style tree1 fill:#b2d4eb,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe
    style tree2 fill:#b2d4eb,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe
    style blob1 fill:#80b8de,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe
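The reuse of a tree for a copied folder can be checked directly. The sketch below is a simplified version of the A and B layout (file contents are made up, and the repository is a throwaway one): it copies A into B and compares the hashes of the two tree objects.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git config user.name "Demo User"; git config user.email "demo@example.com"

mkdir -p A/Subfolder
echo "one"   > A/file1.txt
echo "two"   > A/file2.txt
echo "three" > A/Subfolder/file3.txt
mkdir B
cp -R A B/A                       # copy the whole folder A inside B

git add .
git commit -q -m "A and a copy of A"

tree_a=$(git rev-parse HEAD:A)        # tree object for A/
tree_copy=$(git rev-parse HEAD:B/A)   # tree object for B/A/
echo "A:   $tree_a"
echo "B/A: $tree_copy"
```

Both paths resolve to the same hash, so Git stores a single tree (and a single set of blobs) for the two folders.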

Branch

Branches allow you to develop multiple features simultaneously. You need to know the basic Git workflows; the most widely used is definitely git-flow.

  • Branches can also represent specific releases, so your customers can stick to specific versions.
  • Branches can also be tied to the work of an individual developer (used in small projects).

You can name branches like this: feat/great-button, fix/bug-hot-reload, chore/fix-typos. With such a convention you can select branches more easily, for example git show-branch 'fix/*'.

To switch branches, use git switch <branch-name>. HEAD refers to the latest commit on the current branch. You can also move to a previous commit using a relative reference, for example git switch --detach dev~5 to go back 5 commits. Note that doing this will detach HEAD, meaning HEAD points directly to that specific commit instead of a branch. To navigate between commits you can use ^, which refers to a parent: on a merge commit, ^2 goes to the second parent (the commit that was merged in), whereas ~2 goes to the grandparent commit.
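The difference between ^ and ~ is easiest to see on a small history with one merge. The sketch below builds commits A, B, C and a merge M in a throwaway repository (commit names and the feat branch are illustrative; Git 2.28 or newer is assumed), then resolves the relative references.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git config user.name "Demo User"; git config user.email "demo@example.com"

git commit -q --allow-empty -m "A"; a=$(git rev-parse HEAD)
git switch -q -c feat
git commit -q --allow-empty -m "C"; c=$(git rev-parse HEAD)
git switch -q main
git commit -q --allow-empty -m "B"; b=$(git rev-parse HEAD)
git merge -q --no-ff -m "M" feat           # M has two parents: B (main) and C (feat)

first_parent=$(git rev-parse HEAD^1)       # the branch we merged into  -> B
second_parent=$(git rev-parse HEAD^2)      # the branch that was merged -> C
grandparent=$(git rev-parse HEAD~2)        # first parent, twice        -> A

git switch -q --detach HEAD~2              # HEAD now points at commit A directly
head_state=$(git rev-parse --abbrev-ref HEAD)   # "HEAD" means detached
git log --oneline --graph --all
```

git switch main brings you back from the detached state to the branch.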

So what is the difference between a branch and a tag?

A branch moves forward with every new commit. Tags are linked to a specific commit and are immutable. They are used to mark a specific point in the history, like a version release. Don’t give a tag the same name as a branch, it’s not a good idea.

Create branches

You can list your branches with git branch. To create branches:

  • git branch <new-branch-name> <starting-point-branch>: create a new branch
  • git checkout -b <new-branch-name>: create and switch to the new branch

View branches

  • git branch: local branches
  • git branch -r: remote branches
  • git branch -a: all branches

Show branch

  • git show-branch
  • git show-branch <name>

Delete a branch

  • git branch -d <branch-name>: delete a local branch; use -D to force the deletion of a branch that has not been merged

Restore files

This can be quite handy to revert files or to get specific files from a commit:

git restore --source feat/button~4 -- index.js # -- tells Git that what follows is a path, not an option

For example, if you just deleted (rm) a file you still need, you can recover it:

git restore --source HEAD -- file.js
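Here is the recovery scenario as a runnable sketch, in a throwaway repository with a made-up file.js (git restore needs Git 2.23 or newer; git init -b needs 2.28).

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git config user.name "Demo User"; git config user.email "demo@example.com"

echo "console.log('hi')" > file.js
git add file.js
git commit -q -m "add file.js"

rm file.js                              # oops, the file is gone from the working directory
git restore --source HEAD -- file.js    # bring it back from the last commit
restored=$(cat file.js)
echo "recovered content: $restored"
```

Only committed (or staged) content can be recovered this way; changes that were never added to Git are lost.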

Merge

Merging lets you integrate a feature branch into another branch:

git switch main # move to the branch you want the merge to occur on
git merge <feat/button>

I recommend keeping your working directory clean when you start merging branches.

If you merge branches that changed the same parts of the same files, a conflict will appear. Git then needs your intervention, since it doesn’t understand your code, it only manages it. Git marks the conflict with a three-way diff (conflict resolution markers). It looks like this:

<!DOCTYPE html>
<html lang="en">
  <body>
    <h1>Welcome to My Website</h1>

    <<<<<<< HEAD
    <p>This paragraph was edited locally in your branch.</p>
    =======
    <p>This paragraph was edited remotely in the other branch.</p>
    >>>>>>> feature-branch
  </body>
</html>

Use git diff --check to make sure all the markers are deleted and you have finished resolving the conflict.

If you messed up the resolution, you can start over on a file with git checkout -m <file>, or cancel the whole merge with git merge --abort.

If you have already committed the merge, use the reference made for that purpose: git reset --hard ORIG_HEAD resets to the state before the merge.
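To practise without risk, a conflict can be provoked on purpose. The following sketch (throwaway repository, illustrative branch name and content) edits the same line on two branches, observes the conflict markers, then cancels the merge.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git config user.name "Demo User"; git config user.email "demo@example.com"

echo "<p>original</p>" > index.html
git add index.html; git commit -q -m "base"

git switch -q -c feature-branch
echo "<p>edited in feature-branch</p>" > index.html
git commit -q -am "feature edit"

git switch -q main
echo "<p>edited in main</p>" > index.html
git commit -q -am "main edit"

# git merge exits with a non-zero status when it stops on a conflict.
if git merge feature-branch >/dev/null 2>&1; then
  merge_result=clean
else
  merge_result=conflict
fi

marker_lines=$(grep -c '^<<<<<<<' index.html)   # opening conflict markers in the file
git diff --check || true                        # should report the leftover markers
git merge --abort                               # back to the state before the merge
status_after=$(git status --porcelain)          # empty output means a clean tree
echo "$merge_result, $marker_lines marker(s), status after abort: '$status_after'"
```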

Merge strategies

Already up to date

%%{ init : { "theme" : "default", "flowchart" : { "curve" : "linear" }}}%%

flowchart RL
    main((main))
    a1((A))
    a2((B))
    a3((C))
    b1((A))
    b2((B))
    b3((C))
    merge((merge))

merge --> a3 --> a2 --> a1 --> main
merge --> b3 --> b2 --> b1 --> main

Fast-forward merge

Before

%%{ init : { "theme" : "default", "flowchart" : { "curve" : "linear" }}}%%

flowchart RL
    main((main))
    a1((A))
    a2((B))
    a3((C))
    merge((merge))

main --> a1
merge --> a3
a3 --> a2 --> a1

style main fill:#ffee7e,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe
style merge fill:#ffee7e,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe

After

%%{ init : { "theme" : "default", "flowchart" : { "curve" : "linear" }}}%%

flowchart RL
    main((main))
    a1((A))
    a2((B))
    a3((C))
    merge((merge))

main --> a3
merge --> a3
a3 --> a2 --> a1

style main fill:#ffee7e,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe
style merge fill:#ffee7e,stroke:#1e1e20,stroke-width:2px,color:#e0f2fe
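The before/after pictures can be reproduced with a few commands. In this sketch (throwaway repository; the branch is called feat here, an illustrative name), main has no new commit of its own, so the merge only moves the main pointer forward.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git config user.name "Demo User"; git config user.email "demo@example.com"

git commit -q --allow-empty -m "A"
git switch -q -c feat
git commit -q --allow-empty -m "B"
git commit -q --allow-empty -m "C"
feat_tip=$(git rev-parse HEAD)

git switch -q main
git merge -q --ff-only feat      # refuses to merge if a fast-forward is impossible
main_tip=$(git rev-parse HEAD)
merge_commits=$(git rev-list --count --merges HEAD)   # number of merge commits

echo "main: $main_tip"
echo "feat: $feat_tip"
echo "merge commits created: $merge_commits"
```

If you prefer to always record a merge commit, git merge --no-ff does the opposite.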

There are other types of merge:

  • Resolve
  • Octopus
  • Ort (merge-ort)

Merge Base

This command finds the best common ancestor of two branches, which is the commit Git uses as the base of the merge.

git merge-base original-branch new-branch

Rewrite commit history

Now I will explain rebase. This command is really useful for editing and simplifying your current history. I recommend using it only on your local or feature branches: if others work on the same branch, they will have to deal with complex conflicts. With an interactive rebase (git rebase -i) you can:

  • squash commits
  • delete commits
  • move commits
  • rename commits (reword their message)

Bisect

Did you know Git can search for the problem for you? With bisect, Git runs a binary search to find the commit responsible for the new bug you have found.

git bisect start

git bisect good # the commit currently checked out does not have the bug
git bisect bad  # the commit currently checked out has the bug

Use git bisect reset to exit bisect mode and git bisect visualize --pretty=oneline to see the commits that remain to be tested.

Even better, you can tell Git which command to run as the test:

git bisect run <command_to_test> # exit code 0 means good; 1 to 127 means bad, except 125 which means "skip this commit"
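An automated bisect can be rehearsed on a fake history. In the sketch below, everything is made up for the demonstration: eight commits in a throwaway repository, a status.txt file that says "ok" until commit 6 introduces the "bug", and a grep command playing the role of the test.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git config user.name "Demo User"; git config user.email "demo@example.com"

for i in 1 2 3 4 5 6 7 8; do
  if [ "$i" -ge 6 ]; then echo "bug" > status.txt; else echo "ok" > status.txt; fi
  echo "$i" > version.txt
  git add .
  git commit -q -m "commit $i"
  if [ "$i" -eq 6 ]; then first_bad=$(git rev-parse HEAD); fi
done

root=$(git rev-list --max-parents=0 HEAD)     # the very first commit, known to be good
git bisect start HEAD "$root"                 # arguments: bad commit, then good commit
git bisect run sh -c 'grep -q ok status.txt'  # exit 0 = good, exit 1 = bad
found=$(git rev-parse refs/bisect/bad)        # the commit bisect ended on
git bisect reset                              # leave bisect mode, back on main

echo "introduced the bug: $first_bad"
echo "found by bisect:    $found"
```

With eight commits Git only needs a handful of test runs; the saving grows quickly with the size of the history.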

Blame

You find a typo and want to identify who introduced that atrocity. Use:

git blame <file>

This shows, line by line, the commit and author responsible for the current content of the file.

-S

More powerful than git blame, the -S option of git log lets you search the history for changes that added or removed a specific string. For regex-based searches, use -G.

Example: find all commits that added or removed the string TODO:

git log -S "TODO" <file>
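The sketch below shows which commits -S reports on a tiny made-up history (throwaway repository, hypothetical notes.txt): one commit adds a TODO, one edits another line, one removes the TODO.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git config user.name "Demo User"; git config user.email "demo@example.com"

printf 'line one\n' > notes.txt
git add notes.txt; git commit -q -m "create notes"
printf 'line one\nTODO: finish\n' > notes.txt
git commit -q -am "add TODO"
printf 'line 1\nTODO: finish\n' > notes.txt
git commit -q -am "reword first line"       # the TODO line is untouched here
printf 'line 1\n' > notes.txt
git commit -q -am "resolve TODO"

hits=$(git log -S "TODO" --format=%s -- notes.txt)
hit_count=$(git log -S "TODO" --oneline -- notes.txt | wc -l | tr -d ' ')
echo "$hits"
echo "commits reported: $hit_count"
```

The commit that only reworded the first line is not listed, because the number of occurrences of TODO did not change in it.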

Git vs others

Unlike the others, Git doesn’t store changes as a series of delta steps but as a snapshot of all files at a given point. This is more powerful: once you clone the repo, you have access to all the commits and the file history, and you don’t have to ask the server for a specific version (CVS and SVN store deltas step by step and require a connection).

.gitignore

The .gitignore file lets you specify folders or files that Git should ignore. Example:

# example of a comment (comments must be on their own line)
# a folder pattern ends with a /
debug/
# globbing
*.log
# macOS specific
.DS_Store
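Pattern              Example
*.log                .log, file.log, debug/file.log
*.txt                file.txt
*.md                 test.md
folder/**/file       folder/test/file, folder/image/file
photo?.png           photo1.png, photo7.png
!documentation.md    ⚠️ this won’t ignore the file even if it matches another pattern

Note that [ ] matches a single character, so a pattern such as *.[txt|md] does not match file.txt; use one pattern per extension.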
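Patterns are easy to get wrong, so it helps to test them. This sketch (throwaway repository, made-up file names) writes a small .gitignore and asks Git which untracked files are still visible.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main

cat > .gitignore <<'EOF'
# ignore the debug folder and every log file...
debug/
*.log
# ...except this one
!important.log
EOF

mkdir debug
touch app.log important.log debug/trace.txt index.html

# Untracked files that are NOT ignored (what git status would show).
git ls-files --others --exclude-standard
visible_count=$(git ls-files --others --exclude-standard | wc -l | tr -d ' ')

# git check-ignore -v prints the rule responsible for ignoring a path.
git check-ignore -v app.log debug/trace.txt
```

A global excludes file on your machine can hide more files than shown here; the count assumes none of these names are covered by it.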

But if you have an exception that applies to you and only you, add it to the .git/info/exclude file. It works just like .gitignore but won’t be pushed to the repo; the pattern syntax is the same.

Tips and tricks

For those who learn by doing, check out this cool website: Learn Git Branching.

Did you put the cart before the horse ?

Sometimes I create a new branch <feat/awesome-A> and forget to pull the code from dev… If you don’t have any commits yet, you can:

git switch dev
git pull
git checkout -B <feat/awesome-A>

This will reset the <feat/awesome-A> branch to the new version of dev.

Wrong move, you lost your code? Don’t worry

git reflog is your friend. Find the commit you want to go back to in that list and reset to it (for example: git reset --hard e870e41).
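Here is a rehearsal of that rescue, in a throwaway repository with a made-up work.txt. The sketch throws a commit away with a hard reset, then uses the reflog entry HEAD@{1} (where HEAD was just before the last move) to get it back.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git config user.name "Demo User"; git config user.email "demo@example.com"

git commit -q --allow-empty -m "first"
echo "precious work" > work.txt
git add work.txt
git commit -q -m "precious work"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1        # wrong move: the commit and work.txt disappear
git reflog -n 3                   # every position HEAD has been in, newest first

git reset -q --hard "HEAD@{1}"    # go back to where HEAD was before the reset
recovered=$(git rev-parse HEAD)
echo "lost:      $lost"
echo "recovered: $recovered"
```

You can also copy the short hash shown by git reflog instead of using the HEAD@{n} notation.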

Refactor branch

You made far too many changes in every direction on a branch, and before opening the PR you see your 20 commits named wip, tmp, foo, toto, ikd? So you decide to do something about it. Here is how to refactor it all cleanly.

  1. First, we are going to back up your branch, just in case ;)

    git branch <backup/your-branch> <your-branch>
  2. Reset your branch to the dev branch; your changes stay in the working directory, ready to be committed again

     git reset origin/dev
  3. Commit all your work in concise, clean commits. I would suggest using VS Code to select your files or lines more easily.
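The three steps above can be sketched on a toy branch. Everything here is illustrative: a throwaway repository, a local dev branch standing in for origin/dev (there is no remote in this demo), and a feat/messy branch with three wip commits.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b dev
git config user.name "Demo User"; git config user.email "demo@example.com"
git commit -q --allow-empty -m "base"

git switch -q -c feat/messy
echo "a1" > a.txt; git add a.txt; git commit -q -m "wip"
echo "b1" > b.txt; git add b.txt; git commit -q -m "tmp"
echo "a2" > a.txt; git commit -q -am "foo"

git branch backup/feat-messy feat/messy   # 1. backup, just in case
git reset -q dev                          # 2. drop the commits, keep the files
git add a.txt; git commit -q -m "add a"   # 3. recommit in clean steps
git add b.txt; git commit -q -m "add b"

clean_count=$(git rev-list --count dev..HEAD)
backup_count=$(git rev-list --count dev..backup/feat-messy)
new_tree=$(git rev-parse 'HEAD^{tree}')
old_tree=$(git rev-parse 'backup/feat-messy^{tree}')
echo "commits: $backup_count before, $clean_count after"
```

Comparing the two tree hashes is a quick way to confirm that the cleaned-up branch contains exactly the same files as the backup.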

Configuration

You can configure Git at several levels, listed here in order of priority:

  1. .git/config
  2. ~/.gitconfig
  3. /etc/gitconfig
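The priority order can be observed without touching your real configuration. In this sketch, a temporary directory plays the role of your home folder, and the g helper (a hypothetical wrapper, not a Git command) runs git with HOME and XDG_CONFIG_HOME pointing there. The email addresses are placeholders.

```shell
set -e
fake_home=$(mktemp -d)            # stand-in for ~, so the real ~/.gitconfig is untouched
repo="$fake_home/project"
mkdir "$repo"; cd "$repo"

# Hypothetical helper: run git as if $fake_home were the home directory.
g() { env HOME="$fake_home" XDG_CONFIG_HOME="$fake_home/.config" git "$@"; }

g init -q -b main
g config --global user.email "global@example.com"   # written to $fake_home/.gitconfig
before=$(g config user.email)                       # only the global value exists

g config user.email "local@example.com"             # written to .git/config
after=$(g config user.email)                        # the repository value wins

echo "before: $before"
echo "after:  $after"
g config --show-origin --get-all user.email         # each value with the file it comes from
```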

Example of a configuration file:

[user]
    name = Guillaume Dorschner

[init]
    defaultBranch = main

[pull]
    rebase = false

[push]
    autoSetupRemote = true

[url "git@github.com:"]
    insteadOf = https://github.com/

[core]
    excludesFile = ~/.gitignore_global

[filter "lfs"]
    required = true
    clean = git-lfs clean -- %f
    smudge = git-lfs smudge -- %f
    process = git-lfs filter-process

[include]
    path = ~/.gitconfig-personal

[includeIf "gitdir:~/work/"]
    path = ~/.gitconfig-work

You can even have a specific config for work and put any other settings there; for me, it is only the email.

Work:

[user]
    email = guillaume.dorschner@company.com

Personal:

[user]
    email = guillaume.dorschner@personal.com

I sometimes use a Mac, so I really need to stop tracking .DS_Store. You can put other patterns in this ~/.gitignore_global file:

.DS_Store

Lots of the information came from the Git documentation and O’Reilly’s Version Control with Git; the former was a great help.