Lesson: Why git exists

The problem without version control

Three things that have happened to most people who write code, edit documents, or work with design files:

The first: a file called report-v3-FINAL-actually-final-revised.docx sitting in a folder, next to seventeen other versions, and nobody remembers which one is the one that was approved.

The second: a project folder copied to a thumb drive on Friday afternoon for a backup, then on Monday morning the team discovers a bug shipped in the meantime, and nobody can remember whether the Friday copy contains the bug or pre-dates it.

The third: two collaborators editing the same file at the same time, both saving over each other’s work, and the morning meeting becoming a forensic exercise in who-changed-what-and-when.

These are real problems that pre-date software entirely. They show up in legal contracts, in design files, in spreadsheets, in cooking recipes shared between family members. Every system that involves more than one person, more than one moment in time, and more than one revision will eventually hit these problems.

Version control is the discipline of solving them properly.

What version control actually is

Version control is a system that records changes to a file or a set of files over time so that any specific version can be recalled later.

That is the formal definition, and it sounds dry. But notice what it implies: the system is not just storing the current state of a file. It is storing the history of changes that produced the current state. That history is itself a useful artifact. It answers questions like:

What did this file look like last Tuesday?
Who made the change that introduced this bug?
What did this paragraph say before someone revised it?
Can we get back to the version that worked, even after three days of modifications?

These questions cannot be answered by looking at the current state of a file. They require a record of the past. Version control is the system that maintains that record.

The mental model: snapshots over time

If you remember nothing else from this lesson, remember this: git stores snapshots of project state over time.

A snapshot is a complete picture of every file in the project at a specific moment. Not a description of what changed, but a complete picture. Imagine a row of polaroid photos of your desk, taken at different points in your workday. Each photo is a snapshot. The photos together tell the story of how the desk evolved.

When you make a change in a project tracked by git and you tell git to save a checkpoint, git takes a snapshot of every file it is tracking at that moment. Not just the files you changed. Every tracked file. Then it stores that snapshot with a label (a unique ID), a timestamp, the author, and a note you write about what changed.

The next time you make a change and save another checkpoint, git takes another snapshot. Now there are two snapshots. The two snapshots together form a tiny history.

Over a month of work, you might have hundreds of snapshots. Each one is recoverable. Any of them can be inspected. Any of them can be restored as the current state of the project.

This is what people mean when they describe git as a time-travel capability for code. The metaphor is accurate: any earlier state of the project is reachable, inspectable, and if needed restorable as the current state. A bug introduced last Tuesday can be traced back to the snapshot before it appeared. A decision made three weeks ago can be revisited in the context of what the project looked like at the time.

This is the mental model that makes every later git command make sense. Commit means make a snapshot. Checkout means load a snapshot. Diff means compare two snapshots. Branch means start a parallel sequence of snapshots. Merge means combine two parallel sequences. None of these commands is mysterious if you start from the snapshot model.
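To make the snapshot model concrete, here is a minimal Python sketch. It is a toy, not how git is implemented: the names ToyRepo, Snapshot, commit, checkout, and diff are hypothetical, and the label is a plain list position rather than a real commit ID. It only illustrates that a commit records a full picture of the project, and that a diff is something computed afterwards by comparing two pictures.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """A complete picture of every file at one moment, plus a note."""
    message: str
    files: dict  # path -> full file content


@dataclass
class ToyRepo:
    """Hypothetical toy model; real git records far more than this."""
    history: list = field(default_factory=list)

    def commit(self, working_files: dict, message: str) -> int:
        # Commit: record a full copy of the project, not a list of changes.
        self.history.append(Snapshot(message, dict(working_files)))
        return len(self.history) - 1  # position in history, used as a label

    def checkout(self, label: int) -> dict:
        # Checkout: load an earlier snapshot as the current project state.
        return dict(self.history[label].files)

    def diff(self, old: int, new: int) -> dict:
        # Diff: computed on demand by comparing two complete snapshots.
        a, b = self.history[old].files, self.history[new].files
        return {
            path: (a.get(path), b.get(path))
            for path in sorted(a.keys() | b.keys())
            if a.get(path) != b.get(path)
        }


repo = ToyRepo()
work = {"README": "hello\n", "app.py": "print('v1')\n"}
first = repo.commit(work, "Initial version")

work["app.py"] = "print('v2')\n"
second = repo.commit(work, "Update greeting")

changed = repo.diff(first, second)  # only app.py differs between the two
restored = repo.checkout(first)     # the project as it was at the first commit
```

Notice that nothing in the toy stores "what changed". The difference between two checkpoints is derived from the snapshots whenever someone asks for it.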

A note for experienced developers

If you have used git before, you may know that git is actually more clever than this. It does not naively duplicate every file on every commit, but the optimization is not what you might expect. Identical files across snapshots are stored ONCE via content-addressed deduplication: if a README file is unchanged across ten commits, it lives in the object database one time and all ten snapshots point to the same blob. Pack files add delta compression as a disk-space optimization at the storage layer, but that is housekeeping; the base storage model is still snapshots, not diffs. Treating git as snapshots-with-clever-storage is correct. Treating it as a diff-tracking system leads to confusion later, and it is a common mental-model trap that experienced developers carry forward from older version-control systems like Subversion.
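For readers who want to see why identical files cost nothing extra, the sketch below computes a blob ID the way git does in a repository using the default SHA-1 object format: a hash of a short header plus the file content. The BlobStore class is a hypothetical stand-in for the object database, not git's actual code, and repositories created with the SHA-256 object format would use a different hash.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """ID git gives to file content in a SHA-1 repository."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class BlobStore:
    """Hypothetical stand-in for git's object database."""

    def __init__(self):
        self.objects = {}  # blob ID -> content

    def put(self, content: bytes) -> str:
        blob_id = git_blob_id(content)
        # Same content gives the same ID, so it is only ever stored once.
        self.objects.setdefault(blob_id, content)
        return blob_id


store = BlobStore()
readme = b"# My project\n"

# Ten snapshots that all contain the same unchanged README.
snapshots = [{"README": store.put(readme)} for _ in range(10)]

unique_ids = {snap["README"] for snap in snapshots}
# All ten snapshots share one ID, and the store holds a single object.
```

Because the ID depends only on the content, an unchanged file needs no special bookkeeping: saving it again simply lands on the object that is already there.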

Distributed vs centralized

Version control has existed for decades. Earlier systems like Concurrent Versions System (CVS) and Subversion (SVN) are centralized: one central server holds the canonical history of the project. To check out the project, you connect to the server. To save a checkpoint, you connect to the server. The server is the single source of truth, and if it goes down, nobody can save checkpoints or consult the history until it comes back.

Git is distributed. There is no required central server. Every developer who clones the project gets a complete copy of the full project history. They can make commits, browse history, and create branches without connecting to anyone. When they want to share their work with others, they push to a shared repository (often hosted on GitHub, GitLab, or a self-hosted server), but the shared repository is not architecturally privileged. It is just another copy.

This matters for three reasons.

Offline work. A developer on a plane or in a rural area can commit work, browse history, and create branches without an internet connection. Centralized systems need a connection to the server for these operations.

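Resilience. If the shared server goes down or its data is lost, every full clone still holds a complete copy of the history and can be used to restore it. In a centralized system, if the server loses its data and there is no separate backup, the history is gone.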

Branching. In git, a branch is a lightweight pointer into a history that each developer already holds in full, so creating one is cheap and local. Centralized systems traditionally treated branches as expensive operations, which discouraged using them. Git treats branches as the default unit of work.
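A rough way to see why branching is cheap: in git, a branch is a name that points at one commit. The sketch below models that with two dictionaries. The commit IDs (c1, c2, ...) and the helper names create_branch and commit_on are made up for illustration; real git uses hash-based IDs and stores more in each commit.

```python
# Hypothetical toy history: commit ID -> parent commit ID (None for the first).
commits = {"c1": None, "c2": "c1", "c3": "c2"}

# A branch is a name that points at one commit.
branches = {"main": "c3"}


def create_branch(name: str, start_from: str) -> None:
    # No files or snapshots are copied; only one small pointer is added.
    branches[name] = branches[start_from]


def commit_on(branch: str, new_id: str) -> None:
    # A new commit records its parent, then the branch pointer moves forward.
    commits[new_id] = branches[branch]
    branches[branch] = new_id


create_branch("experiment", "main")
commit_on("experiment", "c4")
# "main" still points at c3; "experiment" now points at c4, whose parent is c3.
```

Creating the branch touched nothing but a single entry in a small table, which is why it needs no server and no waiting.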

Git did not invent distributed version control, but it became the dominant version control system in the 2010s for these reasons, plus an excellent open-source implementation and the rise of GitHub (which we cover in L6). For practical purposes today, version control and git are almost synonymous in software development.

Practically, the distributed model breaks down into two parts that matter operationally: your local repository is your safety net, where you experiment, save checkpoints, and recover from mistakes; the remote repository (a shared copy hosted somewhere accessible to the team) is your bridge to the team, where you share work and integrate others’ changes. The local-remote distinction is one of the first things you will internalize in L2, and it shows up again in L6 (collaboration) and L8 (remotes and forks).
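The local and remote roles can be sketched in the same toy style. MiniRepo and its methods are hypothetical names, and the push here is heavily simplified: it assumes the remote's history is an exact prefix of the local one, which is the easy case. The point is only that a clone carries the whole history, commits happen locally, and sharing is a separate, explicit step.

```python
import copy


class MiniRepo:
    """Hypothetical model: a repository is just its list of snapshots."""

    def __init__(self, history=None):
        self.history = list(history or [])

    def clone(self) -> "MiniRepo":
        # A clone receives the complete history, not only the latest state.
        return MiniRepo(copy.deepcopy(self.history))

    def commit(self, snapshot: dict) -> None:
        # Purely local: no other repository is contacted.
        self.history.append(copy.deepcopy(snapshot))

    def push(self, remote: "MiniRepo") -> None:
        # Share work by sending the snapshots the remote does not have yet.
        # Simplification: assumes remote.history is a prefix of self.history.
        missing = self.history[len(remote.history):]
        remote.history.extend(copy.deepcopy(missing))


shared = MiniRepo([{"app.py": "v1"}])
local = shared.clone()

local.commit({"app.py": "v2"})  # safe to do with no connection at all
local.commit({"app.py": "v3"})
before_push = len(shared.history)  # the shared copy has not changed yet

local.push(shared)  # now the team can see both new snapshots
```

Until the push, every experiment and every mistake stays on your machine, which is what makes the local repository a safety net.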

What this means for your work

You are about to learn a system that:

Lets you experiment without fear, because you can always recover an earlier version
Lets you collaborate with others without overwriting their work
Lets you understand the history of how a project evolved
Lets you trace back why a change was made and by whom
Lets you ship code, documents, or files with confidence that you can roll back if something goes wrong

For new developers, this is the foundation. Every workflow in the rest of this track builds on git fluency. The investment in learning the snapshot mental model now pays off across every later lesson.

For experienced developers, the rest of Phase 1 may feel like review. You can skim the WHY sections and focus on the HOW. The most valuable parts for you will come in Phase 2 (where we cover collaboration patterns) and Phase 3 (where we cover workflow tradeoffs that are not obvious from documentation alone).

For development managers and technical product managers, Phase 3 will be the most directly useful for understanding the team-process implications of workflow choices. But the snapshot mental model in this lesson is what makes the rest of the track legible. If you understand snapshots, you understand why a hotfix branch is a hotfix branch and why a release tag is a release tag.

Three signs a project needs version control

Some projects do not need version control. A single-author shopping list does not. A static photo album does not. So when does a project cross the threshold where version control becomes useful?

Sign one: more than one person edits the project. Two people editing a shared file without version control will overwrite each other eventually. The collision might be quick (same file at the same time) or slow (one person edits, the other edits the next day not realizing the first edit happened). Either way, work gets lost.

Sign two: the project evolves over time and earlier versions might matter. A blog post that gets revised twice and never looked at again does not need version control. A contract that gets revised twelve times with each version negotiated with the other party does. The question is whether you might ever want to recover an earlier state.

Sign three: the consequences of a mistake are non-trivial. A typo in a shopping list costs almost nothing to fix. A typo in a production system that shipped overnight can be expensive, and without an earlier version to return to, slow to undo. When the cost of getting it wrong is high, the value of being able to roll back is high.

Most software projects hit all three signs almost immediately. Many non-software projects do too: legal contracts, large design files, research papers with multiple authors, even spreadsheets used by finance teams.

A useful frame for managers and technical product managers

If you are a development manager or a technical product manager, here is the frame that makes the rest of this track useful even if you never run a git command yourself.

There is a cost to not having version control that does not appear on any expense report. It is the mental overhead of manual coordination. Two engineers without version control spend cognitive cycles tracking who has the latest version of which file, whose edits to merge into whose copy, and which version actually shipped. That cost compounds as the team grows: a four-person team without version control can lose a meaningful fraction of every day to coordination instead of work. Over a year, the unbilled cost of “no version control” can easily exceed the cost of the tooling itself. Most experienced engineering teams have internalized this so deeply that they forget it was ever a question; for non-engineering stakeholders, the cost is invisible because it never gets billed, only suffered.

Git is collaboration infrastructure. When an engineering team picks a particular git workflow (we will cover three canonical ones in L9), they are making a choice about how the team collaborates. The choice affects release cadence, rollback options, incident response, and how aggressive the team can be about parallel work.

A team using a workflow called “Trunk-Based Development” can ship many small changes per day. A team using a workflow called “GitFlow” ships in larger, more deliberate batches. Neither is wrong. They serve different team sizes, release cadences, and risk tolerances.

The reason this matters for non-engineering stakeholders: when you understand the git workflow your team is using, you understand why your engineering team’s release cadence looks the way it does. You can ask better questions during planning. You can spot mismatches between the workflow and the team’s actual needs. You can speak the same language as your engineers when discussing release strategy, hotfix response, and feature deployment.

You do not need to write code to benefit from this track. You need to understand the mental model. That starts here, with snapshots.

What is next in Phase 1

Phase 1 covers the foundations of solo git workflow. By the end of L4, you will be able to:

Initialize a git repository and make your first commits (L2)
Write commit messages that communicate the WHY of changes (L3)
Recover safely from mistakes using the reflog as a safety net (L4)

Together with this lesson, that makes four lessons, and they give you a confident solo workflow. You can use git for your own projects with no collaborators involved, and that alone is valuable. Phase 2 will add collaboration patterns. Phase 3 will add team workflow choices. Phase 4 will add the multi-agent angle (working with AI agents on parallel branches), which is where this track covers material no other curriculum has.

But for now, the foundation is the mental model: snapshots over time, recallable at any time. The next lesson, L2, makes those snapshots real with actual commands.

Remember this if nothing else

Git stores snapshots. Every other command is just navigating those snapshots.

If that sentence is the only thing you carry forward from L1, you have what you need to make sense of every command in the rest of this track.