Track, branch, merge, and manage code revisions with Git, the free and open source distributed version control system. Through a series of step-by-step tutorials, this practical guide quickly takes you from Git fundamentals to advanced techniques, and provides friendly yet rigorous advice for navigating Git's many functions. You'll learn how to work with everything from small to very large projects with speed and efficiency. In this third edition, authors Prem Kumar Ponuthorai and Jon Loeliger break down Git concepts using a modular approach. You'll start with the basics and fundamental philosophy of Git, followed by intermediate commands to help you efficiently supplement your daily development workflow. Finally, you'll learn advanced Git commands and concepts to understand how Git works under the hood.
A very good, comprehensive textbook on Git. For my purposes it even has too much detail. However, the book touches on all the relevant topics and explains them really well. Of course, the last part is very technical, covering almost DevOps topics like repo management. [Several commands are nowadays almost useless given the IDE extensions that represent logs and history graphically.]
NOTES: Git is a distributed version control system (VCS) designed to track changes in source code during software development. It was created by Linus Torvalds in 2005 and has since become one of the most popular VCS tools in the software development industry.
The main purpose of Git is to enable collaboration among developers working on the same codebase by providing features such as version tracking, branching, merging, and conflict resolution. It allows multiple developers to work on the same project simultaneously, keeping track of their changes and ensuring a smooth integration of their work.
- Repository (Repo): A repository is a central place where all the project files, folders, and version history are stored. It can be local (on your computer) or remote (hosted on a server like GitHub, GitLab, or Bitbucket).
- Commit: A commit represents a snapshot of the changes made to the files in the repository at a specific point in time. Each commit is accompanied by a commit message that describes the changes made.
- Branch: A branch is an independent line of development in Git. Developers can create branches to work on new features or bug fixes without affecting the main codebase. Branches can later be merged back into the main branch (usually called “master” or “main”).
- Merge: Merging is the process of integrating changes from one branch into another. When a feature or bug fix is complete on a separate branch, it can be merged back into the main branch to make those changes part of the main project.
- Pull Request (PR): In remote repositories like GitHub, a pull request is a way to propose changes made in one branch to be merged into another branch. It allows for code review and collaboration among team members before merging the changes.
- Clone: Cloning a repository means creating a copy of the entire repository on your local machine. This allows you to work on the project locally and push your changes back to the remote repository when ready.
- Push/Pull: Pushing refers to sending your local commits to a remote repository, while pulling refers to retrieving changes from the remote repository to update your local copy.
Git provides a powerful and flexible environment for version control, making it an essential tool for software development teams to collaborate efficiently and manage code effectively. Its popularity is largely due to its speed, distributed nature, and robust branching and merging capabilities.
Git and GitHub are related but distinct entities that work together to facilitate version control and collaboration in software development.
1. Git is a distributed version control system. It is the core technology that allows developers to track changes to their code, create branches, merge changes, and collaborate effectively with others. Git is a command-line tool that runs on your local machine and operates directly with your local repositories.
2. GitHub is a web-based platform that provides hosting services for Git repositories. It acts as a remote hosting service for Git repositories, allowing developers to store their repositories in the cloud. GitHub provides a user-friendly interface that makes it easier to manage Git repositories, collaborate with others, and perform various tasks related to version control.
The relationship between Git and GitHub can be summarized as follows:
- Developers use Git on their local machines to interact with the local repositories, creating commits, branches, and performing version control operations.
- When developers want to collaborate with others or make their code accessible to a wider audience, they can push their local Git repository to GitHub, making it available to others to clone, fork, and contribute to.
GitHub provides additional features on top of Git, such as:
- Remote Repository Hosting: GitHub offers a place to store your Git repositories in the cloud, making it easy to share your code with others and collaborate with a team.
- Pull Requests and Code Review: GitHub allows developers to propose changes from one branch to another through pull requests. This enables code review and collaboration before merging changes.
- Issue Tracking: GitHub provides issue tracking capabilities, allowing developers to manage and discuss bugs, feature requests, and other tasks related to the project.
- Project Management: GitHub provides tools for project management, including project boards, milestones, and labels, helping developers organize and track their work.
It's important to note that while GitHub is the most well-known and widely used hosting service for Git repositories, there are other platforms like GitLab and Bitbucket that offer similar services. These platforms also provide additional features to enhance collaboration and project management with Git repositories.
Git Characteristics:
- Git stores revisions as snapshots. It does not track revisions as a series of modifications (deltas); instead, each commit takes a snapshot of the state of the repository at a specific point in time.
- Git is enhanced for local development: you work on a copy of the repository on your local development machine (a local repo or a clone of a remote repo on a Git server).
- Git is definitive: its commands are explicit, and it waits for you to provide instructions. For example, it does not automatically sync changes from your local repository.
- Git is designed to bolster nonlinear development: it allows work to diverge and proceed in parallel alongside the main line of development.
A Git repository is a key-value database containing all the information needed to retain and manage the revisions and history of files in a project. The index (a.k.a. staging area) is stored as binary data; its content is temporary and describes the structure of the entire repository at a specific moment in time. It is a dynamic stage between your project's working directory (filesystem) and the repository's object store (repo commit history), and it allows a separation between incremental development steps and the committal of those changes. Git is a content-addressable storage system: the object store holds every object it generates as a key-value pair, where the key is a unique name produced by applying the SHA1 hash function to the object's content.
Git's content tracking is manifested in two critical ways that differ fundamentally from almost all other version control systems:
1. Git's object store is based on the hashed computation of the contents of its objects, not on the file or directory names from the user's original file layout.
2. Git's internal database efficiently stores every version of every file, not their differences as files go from one revision to the next.
When Git places a file into the object store, it does so based on the hash of the data (file content) and not on the name of the file (file metadata). In fact, Git does not track file or directory names, which are associated with files in secondary ways. The data is stored as a blob object in the object store. Again, Git tracks content instead of files.
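As a rough illustration of hashing by content, the sketch below computes a blob's name from its bytes alone. It assumes the default SHA1 object format, in which Git hashes a short "blob <size>" header, a NUL byte, and then the raw content. The helper name git_blob_id is made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA1 name of a blob holding `content` (hypothetical helper)."""
    # Git hashes a small header plus the raw bytes; the file name is not involved.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two "files" with the same bytes get the same name, whatever they are called.
first_id = git_blob_id(b"test content\n")
second_id = git_blob_id(b"test content\n")
changed_id = git_blob_id(b"test content!\n")

print(first_id)                # d670460b4b4aece5915caf5c68d12f560a9fe3e4
print(first_id == second_id)   # True
print(first_id == changed_id)  # False
```

The first printed value should match what git hash-object reports for a file containing the same bytes.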
If two separate files have exactly the same content, whether in the same or different directories, Git stores only a single copy of that content as a blob within the object store. Git computes the hash code of each file only according to its content, determines that the files have the same SHA1 values and thus the same content, and places the blob object in the object store indexed by that SHA1 value. Both files in the project, regardless of where they are located in the user's directory structure, use that same object for content.
Because Git uses the hash of a file's complete content as the name for that file, it must operate on each complete copy of the file. It cannot base its work or its object store entries on only part of the file's content or on the differences between two revisions of that file. Using the earlier example of two separate files having exactly the same content, if one of those files changes, Git computes a new SHA1 for it, determines that it is now a different blob object, and adds the new blob to the object store. The original blob remains in the object store for the unchanged file to use.
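The two-files example above can be sketched with a toy object store. This is only a model, not Git's implementation: the ObjectStore class, the file paths, and the file contents are all invented for illustration, and the store is a plain in-memory dictionary.

```python
import hashlib


class ObjectStore:
    """Toy content-addressable store: each key is the hash of its value."""

    def __init__(self):
        self.objects = {}

    def put_blob(self, content: bytes) -> str:
        header = b"blob " + str(len(content)).encode("ascii") + b"\0"
        key = hashlib.sha1(header + content).hexdigest()
        # Storing identical content again lands on the same key: no second copy.
        self.objects[key] = content
        return key


store = ObjectStore()

# Hypothetical project: two paths whose content is byte-for-byte identical.
paths = {
    "docs/LICENSE": store.put_blob(b"MIT License\n"),
    "src/LICENSE": store.put_blob(b"MIT License\n"),
}
blobs_before = len(store.objects)  # one blob shared by both paths

# One of the files changes: a new blob is added, the old one stays in place.
paths["src/LICENSE"] = store.put_blob(b"MIT License (modified)\n")
blobs_after = len(store.objects)

print(blobs_before, blobs_after)
```

The unchanged path still points at the original blob, which is why the original object has to remain in the store.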
Even though Git stores the complete content of every version of every file directly in its object store, it is not inefficient, because it uses zlib to compress each object before storing it.
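A small sketch of the compression step, using Python's zlib module. The sample content is invented and deliberately repetitive, so the saving shown here is larger than a typical source file would give.

```python
import zlib

# Made-up, highly repetitive file content for illustration.
content = b"the same line repeated\n" * 200

# The object as hashed: header, NUL byte, then the raw content.
raw_object = b"blob " + str(len(content)).encode("ascii") + b"\0" + content

compressed = zlib.compress(raw_object)  # what would be written to disk
restored = zlib.decompress(compressed)  # what is read back

print(len(raw_object), len(compressed))
print(restored == raw_object)
```

Compression is lossless, so the hash of the restored object still matches its name in the store.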
Branches
A branch allows the user to launch a separate line of development within a software project. When you create a branch, you are creating a fork from a specific state of the project's timeline. This allows development to progress in multiple directions simultaneously. Think of it as time travel, where you have the ability to create alternate parallel timelines from a single starting point. A branch also gives you the ability to create different versions of a project. Often, a branch can be reconciled and merged with other branches to combine divergent efforts.
Creating branches in Git is considered a lightweight and inexpensive operation. This is because a branch is just a pointer to a specific commit object in a Git repository. Git allows many branches, and thus many different lines of development within a repository can exist simultaneously at any given moment. Moreover, Git has first-rate support for merges between branches. As a result, most Git users make routine use of branches and are naturally encouraged to do so frequently. In this chapter, we will take a top-down approach to thinking about how branches function in Git by looking at how developers maintain multiple lines of development within a project.
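To see why a branch is cheap, it can help to model it as nothing more than a name mapped to a commit ID. The sketch below is a simplified model under that assumption. The commit IDs (c1, c2, ...) and both helper functions are hypothetical.

```python
# Hypothetical history: each commit ID maps to the list of its parents.
parents = {"c1": [], "c2": ["c1"], "c3": ["c2"]}

# A branch is just a named pointer to one commit.
refs = {"refs/heads/main": "c3"}


def create_branch(refs, name, start_commit):
    """Creating a branch writes one pointer; no files or commits are copied."""
    refs["refs/heads/" + name] = start_commit


def commit_on(refs, parents, branch, new_commit):
    """A new commit records the old tip as its parent, then the pointer moves."""
    ref = "refs/heads/" + branch
    parents[new_commit] = [refs[ref]]
    refs[ref] = new_commit


create_branch(refs, "feature", refs["refs/heads/main"])
commit_on(refs, parents, "feature", "c4")

print(refs)  # main still points at c3; feature has moved on to c4
```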
Commits
A commit is a snapshot capturing the current state of a repo at a moment in time; it is a single atomic changeset with respect to the previous state. Git uses a commit as a means to record changes to a repo. When you make a commit, Git takes a snapshot of the current state of the index and stores it in the object store. The snapshot does not contain a copy of every file and directory in the index. Instead, when you create a new commit, Git compares the current state of the index to the previous commit snapshot and derives a list of affected files and directories. Based on this list, Git creates new blob objects for any file that has changed and new tree objects for any directory that has changed, and it reuses any blob or tree object that has not changed.
Git internally maintains the following symrefs automatically for particular reasons:
- HEAD: Always refers to the most recent commit on the current branch. When you change branches, Git automatically updates HEAD to refer to the new branch's latest commit.
- ORIG_HEAD: Certain operations, such as merge and reset, record the previous version of HEAD in ORIG_HEAD just prior to adjusting it to a new value. You can use ORIG_HEAD to recover or revert to the previous state or to make a comparison.
- FETCH_HEAD: When remote repositories are used, git fetch records the heads of all branches fetched in the file .git/FETCH_HEAD. FETCH_HEAD is a shorthand for the head of the last branch fetched and is valid only immediately after a fetch operation. Using this symref, you can find the HEAD of commits from git fetch even if an anonymous fetch that doesn't specifically name a branch is used.
- MERGE_HEAD: When a merge is in progress, the tip of the other branch is temporarily recorded in the symref MERGE_HEAD. In other words, MERGE_HEAD is the commit that is being merged into HEAD.
- CHERRY_PICK_HEAD: When cherry-picking is used via the git cherry-pick command, the CHERRY_PICK_HEAD symref records the commits you have selected for the intended operation.
All of these symbolic references are managed by the low-level plumbing command git symbolic-ref.
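How HEAD follows the current branch can be sketched by modelling refs as a dictionary. When a branch is checked out, HEAD holds a line of the form "ref: refs/heads/<branch>" rather than a commit ID. Everything else below (the commit IDs and the resolve helper) is invented for illustration.

```python
# Hypothetical ref contents; c3 and c4 stand in for full commit IDs.
ref_files = {
    "HEAD": "ref: refs/heads/main",
    "refs/heads/main": "c3",
    "refs/heads/feature": "c4",
}


def resolve(ref_files, name, max_depth=5):
    """Follow "ref: ..." indirections until a commit ID is reached."""
    value = ref_files[name]
    for _ in range(max_depth):
        if not value.startswith("ref: "):
            return value
        value = ref_files[value[len("ref: "):]]
    raise ValueError("too many levels of symbolic refs")


on_main = resolve(ref_files, "HEAD")

# Switching branches only rewrites what HEAD points at.
ref_files["HEAD"] = "ref: refs/heads/feature"
on_feature = resolve(ref_files, "HEAD")

print(on_main, on_feature)
```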
Index
The index, a.k.a. the staging area, can be regarded as a cache of the current state of the working directory. It is used to stage or collect alterations to any files as a final step before the commit. Git classifies files into three categories:
- Tracked: any file already in the repo or staged in the index.
- Ignored: a file that must be explicitly declared invisible or ignored in the repo. A .gitignore file can contain a list of filename patterns that specify which files to ignore.
- Untracked: any file not found in either of the previous two categories.
Managing files with Git involves tracking changes, committing them, branching, merging, and more.
1. Initialization and Cloning:
- Initialize a new Git repository: “git init”
- Clone an existing repository: “git clone <repository-url>”
2. Staging and Committing:
- Stage changes for commit: “git add <file>”
- Commit staged changes: “git commit -m "Commit message"”
- Automatically stage and commit all changes to tracked files, including modifications and deletions, but not new files: “git commit -am "Your commit message here"”
3. Viewing Changes:
- View changes between working directory and staging area: “git diff”
- View changes in the staging area: “git diff --staged”
- View commit history: “git log”
4. Branching:
- Create a new branch: “git branch <branch-name>”
- Switch to a branch: “git checkout <branch-name>”
- Create and switch to a new branch: “git checkout -b <branch-name>”
- Delete a branch: “git branch -d <branch-name>”
5. Merging and Rebasing:
- Merge changes from one branch into another: “git merge <branch-name>”
- Rebase changes from one branch onto another: “git rebase <branch-name>”
6. Remote Repositories:
- Add a remote repository: “git remote add <name> <url>”
- Push changes to a remote repository: “git push <remote> <branch>”
- Pull changes from a remote repository: “git pull <remote> <branch>”
- Fetch changes from a remote repository: “git fetch <remote>”
7. Resolving Conflicts:
- Resolve merge conflicts: Manually edit conflicted files, then “git add” and “git commit”
8. Ignoring Files:
- Create a “.gitignore” file to specify files/folders to be ignored
9. Undoing Changes:
- Discard local changes: “git checkout -- <file>”
- Unstage a file: “git reset HEAD <file>”
- Amend the last commit: “git commit --amend”
10. Tags:
- Create a tag: “git tag <tag-name>”
- Push tags to remote repository: “git push --tags”
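The tracked, ignored, and untracked categories from the Index notes can be sketched as a small classifier. This is a loose approximation: real .gitignore patterns support directory rules, negation, and more, while this sketch only matches the file's base name against shell-style patterns. The function name and sample paths are invented.

```python
from fnmatch import fnmatchcase


def classify(path, index_paths, ignore_patterns):
    """Return "tracked", "ignored", or "untracked" for a path (simplified)."""
    # A file already in the index is tracked, even if a pattern would match it.
    if path in index_paths:
        return "tracked"
    name = path.rsplit("/", 1)[-1]
    if any(fnmatchcase(name, pattern) for pattern in ignore_patterns):
        return "ignored"
    return "untracked"


index_paths = {"main.py", "README"}  # hypothetical index contents
ignore_patterns = ["*.log", "*.pyc"]  # hypothetical .gitignore lines

for path in ["main.py", "build/debug.log", "notes.txt"]:
    print(path, "->", classify(path, index_paths, ignore_patterns))
```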
Merges
A merge unifies two or more commit histories of branches. It must occur in a single repo; that is, all the branches to be merged must be present in the same repository.
Create a Branch: Before you can merge changes, you typically start by creating a new branch. This branch could be for a specific feature, bug fix, or any other purpose.
Make Changes in the Source Branch: Once you have your branch, you make changes (add, edit, delete files) in this branch.
Commit Changes in the Source Branch: As you make changes, you commit them to your source branch using git commit.
Switch to the Target Branch: When you're ready to integrate the changes from the source branch into the target branch, you switch to the target branch using git checkout.
Merge the Source Branch: With the target branch as the current branch, you initiate the merge using git merge <source-branch>. This command combines the changes from the source branch into the target branch.
Resolve Conflicts (if Any): If there are conflicting changes between the source and target branches (i.e., changes to the same lines of code), Git will pause the merge and ask you to resolve the conflicts manually. You do this by editing the conflicting files, marking them as resolved with git add, and then completing the merge with git commit.
Commit the Merge: If you had to resolve conflicts, you finalize the merge by committing the changes with git commit; when a three-way merge has no conflicts, Git creates the commit for you. Either way, the result is a new commit that represents the merged state.
View the Merge Commit: The merge commit will have two parent commits: one from the target branch and one from the source branch. This creates a history that shows when the merge happened and which branches were involved.
Push the Merged Changes: After the merge is complete and committed, you can push the merged changes to a remote repository using git push.
Fast-Forward Merge: When you want to merge a feature branch back into the main branch (e.g., main or master) and there are no new commits in the main branch since you created the feature branch, Git performs a "fast-forward" merge. This essentially moves the main branch pointer forward to the tip of the feature branch, incorporating all its changes.
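The fast-forward condition can be sketched as an ancestry check: if the target branch's tip is an ancestor of the source branch's tip, the target pointer can simply move forward. The commit graph and both function names below are hypothetical.

```python
def is_ancestor(parents, ancestor, commit):
    """True if `ancestor` is reachable from `commit` by following parent links."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return False


def try_fast_forward(refs, parents, target, source):
    """Move `target` to the tip of `source` if no merge commit is needed."""
    if is_ancestor(parents, refs[target], refs[source]):
        refs[target] = refs[source]
        return True
    return False


# Hypothetical history: feature was created at c2 and gained c3 and c4.
parents = {"c1": [], "c2": ["c1"], "c3": ["c2"], "c4": ["c3"], "d1": ["c2"]}

refs = {"main": "c2", "feature": "c4"}
fast_forwarded = try_fast_forward(refs, parents, "main", "feature")

# If main had moved on to d1 in the meantime, fast-forward is not possible.
diverged_refs = {"main": "d1", "feature": "c4"}
diverged_result = try_fast_forward(diverged_refs, parents, "main", "feature")

print(fast_forwarded, refs["main"])
print(diverged_result, diverged_refs["main"])
```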
Regular Merge (Three-Way Merge): This is the default merge strategy. When there are new commits in both the source and target branches, Git creates a new merge commit that combines the changes from both branches. If there are conflicting changes, Git will pause the merge and prompt you to resolve the conflicts.
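The idea behind a three-way merge can be sketched at whole-file granularity. For each path, compare both sides against their common ancestor (the base): a side that did not change yields to the side that did, and a conflict arises only when both changed in different ways. Git itself merges at line level within files, so this is a deliberate simplification. The function name and sample data are invented.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots (path -> content) against their common ancestor."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o  # both sides agree (or neither changed)
        elif o == b:
            result = t  # only their side changed
        elif t == b:
            result = o  # only our side changed
        else:
            conflicts.append(path)  # both changed differently
            continue
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"a.txt": "1", "b.txt": "1", "c.txt": "1"}
ours = {"a.txt": "2", "b.txt": "1", "c.txt": "2"}
theirs = {"a.txt": "1", "b.txt": "3", "c.txt": "4"}

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)
print(conflicts)
```

In this example a.txt and b.txt merge cleanly, while c.txt is reported as a conflict for manual resolution.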
Rebase and Merge: Instead of creating a merge commit, this strategy "replays" the commits from the source branch onto the target branch. It produces a linear commit history, but use it cautiously as it rewrites history and can cause conflicts. It's often used for keeping feature branches up to date with the main branch before merging.
Squash and Merge: This strategy combines all commits from the source branch into a single commit in the target branch. It's useful for cleaning up feature branches and providing a concise history, but it can make tracking individual changes more difficult.
The “git diff” command shows the differences between two states of a repository: for example, between the working directory and the staging area (index), between the staging area and the most recent commit, or between two commits or branches. It is extremely useful for reviewing changes before committing them, understanding modifications between different branches, and tracking the history of changes.
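The kind of output “git diff” produces can be approximated with Python's difflib module. This is only a stand-in: Git uses its own diff machinery and adds extra header lines. The file name and both versions of its content are made up.

```python
import difflib

# Hypothetical content of notes.txt as staged in the index...
index_version = ["line one\n", "line two\n"]
# ...and as it currently stands in the working directory.
working_version = ["line one\n", "line 2\n", "line three\n"]

diff = list(
    difflib.unified_diff(
        index_version,
        working_version,
        fromfile="a/notes.txt",
        tofile="b/notes.txt",
    )
)

# Lines starting with "-" were removed, lines starting with "+" were added.
print("".join(diff), end="")
```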