Conflicts

Merging branches with git merge or git rebase is unavoidable when many developers work on the same project. Git usually handles this well and combines changes from different branches automatically. However, when two branches change the same lines in different ways, Git cannot decide which version to keep and reports a so-called merge conflict. Conflicts come in various forms.

Let’s walk through a simple example. We have a basic R script, goodmorning.R, that prints "Good morning sunshine!":

goodmorning <- function(name) {
  sprintf("Good morning %s!", name)
}

goodmorning("sunshine")

This script has been committed to branch master. Next, we create and check out a new branch called next using git checkout -b next and change the script as follows in a new commit:

goodmorning <- function(name) {
  sprintf("Good morning %s!", name)
}

goodmorning("honey")

Back on branch master (git checkout master), we change the script as follows and commit the changes:

goodmorning <- function(name) {
  sprintf("Good morning %s!", name)
}

goodmorning("my dear")

The commits on branch next and on branch master both changed the same last line, each using a different name. On branch master, running git diff HEAD~1 shows:

diff --git a/goodmorning.R b/goodmorning.R
index 9d09f7d..2e2777d 100644
--- a/goodmorning.R
+++ b/goodmorning.R
@@ -2,5 +2,5 @@ goodmorning <- function(name) {
   sprintf("Good morning %s!", name)
 }
 
-goodmorning("sunshine")
+goodmorning("my dear")

Likewise, on branch next, running git diff HEAD~1 shows:

diff --git a/goodmorning.R b/goodmorning.R
index 9d09f7d..6f90c28 100644
--- a/goodmorning.R
+++ b/goodmorning.R
@@ -2,5 +2,5 @@ goodmorning <- function(name) {
   sprintf("Good morning %s!", name)
 }
 
-goodmorning("sunshine")
+goodmorning("honey")

Trying to merge or rebase the commit on branch next onto master results in a conflict. On branch master, running git merge next produces the following output:

Auto-merging goodmorning.R
CONFLICT (content): Merge conflict in goodmorning.R
Automatic merge failed; fix conflicts and then commit the result.

git status reveals the following:

On branch master
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Unmerged paths:
  (use "git add <file>..." to mark resolution)

    both modified:   goodmorning.R

no changes added to commit (use "git add" and/or "git commit -a")
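The situation above can be recreated in a scratch directory. The following is a minimal sketch, not the author's exact session: it assumes git is installed, uses a placeholder identity, and introduces a hypothetical helper write_script to produce the three versions of goodmorning.R. It then lists the unmerged files and aborts the merge again.

```shell
#!/bin/sh
# Sketch: recreate the conflict in a throw-away repository, inspect it, abort.
# Assumptions: git is installed; name and e-mail below are placeholders.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: write goodmorning.R with the given name in the last line.
write_script() {
  printf '%s\n' \
    'goodmorning <- function(name) {' \
    '  sprintf("Good morning %s!", name)' \
    '}' \
    '' \
    "goodmorning(\"$1\")" > goodmorning.R
}

write_script "sunshine"
git add goodmorning.R
git commit -q -m "Add goodmorning script"

git checkout -q -b next
write_script "honey"
git commit -q -am "Greet honey"

git checkout -q master
write_script "my dear"
git commit -q -am "Greet my dear"

# The merge stops with a conflict and a non-zero exit status.
merge_status=0
git merge next || merge_status=$?

# List only the files that are still unmerged.
unmerged=$(git diff --name-only --diff-filter=U)
echo "unmerged: $unmerged"

# Give up on the merge and return to the state before it started.
git merge --abort
after_abort=$(git status --porcelain)
last_line=$(tail -n 1 goodmorning.R)
echo "last line after abort: $last_line"
```

Because the merge is aborted at the end, the working tree is back at the last commit on master, which makes the sketch safe to run repeatedly while experimenting.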

Opening goodmorning.R in a text editor shows the following content:

goodmorning <- function(name) {
  sprintf("Good morning %s!", name)
}

<<<<<<< HEAD
goodmorning("my dear")
=======
goodmorning("honey")
>>>>>>> next

We immediately see the markers that Git added:

  • <<<<<<< HEAD
  • =======
  • >>>>>>> next

The lines between <<<<<<< HEAD and ======= come from the current HEAD (here, the tip of master). They conflict with the lines from branch next, which sit between ======= and >>>>>>> next.
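In larger files the markers are easy to overlook. As a rough sketch, a plain grep can list every line that still carries one. The sample file is created inside the script only to keep the example self-contained, and the pattern assumes Git's default seven-character markers at the start of a line.

```shell
#!/bin/sh
# Sketch: find lines that still contain conflict markers.
# The sample file stands in for a conflicted goodmorning.R.
sample=$(mktemp)
cat > "$sample" <<'EOF'
goodmorning <- function(name) {
  sprintf("Good morning %s!", name)
}

<<<<<<< HEAD
goodmorning("my dear")
=======
goodmorning("honey")
>>>>>>> next
EOF

# A marker is seven identical characters at the start of a line,
# followed by a space or by the end of the line.
pattern='^(<<<<<<<|=======|>>>>>>>)( |$)'

marker_lines=$(grep -n -E "$pattern" "$sample")   # line number and marker
marker_count=$(grep -c -E "$pattern" "$sample")   # number of marker lines

echo "$marker_lines"
echo "marker lines found: $marker_count"
```

Inside a repository, git diff --check serves a similar purpose: it warns when changes introduce conflict markers or whitespace errors.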

To resolve the conflict we need to

  1. Decide how to deal with the conflict in file goodmorning.R: which line should be used, or how the lines should be combined. We then edit the file accordingly and remove all conflict markers.

  2. Save the file(s) and stage them using git add goodmorning.R.

  3. Once all conflicts have been resolved, saved and staged, conclude the merge using git commit.

  4. If we run into problems and want to give up on the merge, git merge --abort returns to the state from before the merge started.

Following step 1, we change the file as follows:

goodmorning <- function(name) {
  sprintf("Good morning %s!", name)
}

goodmorning("my dear, honey")
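With the file edited, the remaining steps are staging and committing. The sketch below runs the whole example end to end in a scratch repository so that it can be tried safely. It assumes git is installed, uses a placeholder identity and a hypothetical helper write_script, and passes --no-edit to git commit so that no editor opens for the merge message.

```shell
#!/bin/sh
# Sketch: create the conflict, resolve it by hand, and conclude the merge.
# Assumptions: git is installed; name and e-mail below are placeholders.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: write goodmorning.R with the given name in the last line.
write_script() {
  printf '%s\n' \
    'goodmorning <- function(name) {' \
    '  sprintf("Good morning %s!", name)' \
    '}' \
    '' \
    "goodmorning(\"$1\")" > goodmorning.R
}

write_script "sunshine"
git add goodmorning.R
git commit -q -m "Add goodmorning script"

git checkout -q -b next
write_script "honey"
git commit -q -am "Greet honey"

git checkout -q master
write_script "my dear"
git commit -q -am "Greet my dear"

# The merge is expected to stop with a conflict.
git merge next || true

# Step 1: replace the conflicted content with the combined version (no markers).
write_script "my dear, honey"

# Step 2: stage the resolved file.
git add goodmorning.R

# Step 3: conclude the merge, keeping the prepared merge message.
git commit -q --no-edit

git log --oneline --graph
```

The resulting commit has two parents, the previous tip of master and the tip of next, which is what marks it as a merge commit in the log.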

Practice

Let’s do some exercises!

The Rules of Git-Club

In the previous chapters we have seen how branches can be merged together using git merge and git rebase. We have also mentioned that we typically use a production branch called master. However, what if a project grows in size and in the number of developers and data scientists? First, let’s take a look at the most common requirements/rules of our Git version control system. Like the chapter Code Style, this part could also be named Repository Style. Follow these seven rules to live a happier Git life:

  1. Each commit should be as atomic as possible and have a message which describes its purpose as exactly as possible. Use commit messages like Fix division by null numeric issue during optimization instead of Fix issue, or worse, Fix stuff. Check the diff to ensure that you only changed what the message suggests. Split commits that serve multiple purposes.

  2. Make sure that each commit/state, at least on your production branch, passes all defined checks and tests. This makes it easy to undo changes whenever needed without risking a hiccup in your test procedures.

  3. Clean up your repository regularly. Make sure that the master branch is easily readable and ends up as one straight line, as illustrated at https://git-scm.com/book/en/v1/Git-Branching-Rebasing.

  4. Do not rebase or force-push commits that you have pushed to a public repository. If you follow that guideline, you’ll be fine. If you don’t, people will hate you, and you’ll be scorned by friends and family.

  5. For bigger projects it makes sense to have a development branch named e.g. dev, next or similar.

  6. Rebase/merge changes as soon as possible. Do not leave ghost branches around for a long time. First come, first served: if two collaborators push conflicting commits to a branch, the one who pushes later needs to fix the conflicts. This should also encourage developers to push as quickly as possible.

  7. Make your Git tree as simple as possible. Use git rebase on local branches as often as necessary and avoid unnecessary merges which complicate your tree structure. If merges are necessary after longer development efforts, use pull requests and clean up unwanted branches once you are sure that the merge and changes were successful.
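The idea of a straight-line history on master can be tried out in a scratch repository. The following is a minimal sketch under stated assumptions: git is installed, the identity is a placeholder, and the branch name topic and the files a.txt and b.txt are made up for illustration. The topic branch is purely local and never pushed, so rebasing it does not conflict with the rule about published commits.

```shell
#!/bin/sh
# Sketch: keep master linear by rebasing a local branch before merging it.
# Assumptions: git is installed; identity, branch and file names are made up.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.name "Example User"
git config user.email "user@example.com"

echo "first" > a.txt
git add a.txt
git commit -q -m "Add a.txt"

# Local work on a topic branch that has not been pushed anywhere.
git checkout -q -b topic
echo "topic work" > b.txt
git add b.txt
git commit -q -m "Add b.txt"

# Meanwhile master moves on.
git checkout -q master
echo "second" >> a.txt
git commit -q -am "Extend a.txt"

# Replay the local topic commit on top of the current master ...
git checkout -q topic
git rebase -q master

# ... so that master can be fast-forwarded without creating a merge commit.
git checkout -q master
git merge -q --ff-only topic

# Remove the branch once its work is part of master.
git branch -q -d topic

merge_commits=$(git rev-list --count --merges HEAD)
total_commits=$(git rev-list --count HEAD)
echo "merge commits: $merge_commits, total commits: $total_commits"
git log --oneline --graph
```

The --ff-only flag makes git merge refuse to do anything other than a fast-forward, which is a cheap safeguard when the goal is a history without merge commits.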