gource-organisation

Creating an organisation-wide gource animation

Preparations before creating logs

Adding a .mailmap file

In most repositories there are commits that, to git, seem to be made by different authors, even though they are the same person. This happens when there are multiple email addresses or names used for the same person.

To tell git (and thus Gource) who is who, a .mailmap file can be added to the root of each repository.

In its simplest form, each line in this file maps an email address to the name that should be shown for its author.

For example, the following line maps the email address [email protected] to the name “Zé Povinho”:

Zé Povinho <[email protected]>

Now, regardless of which name is used in the actual commits, the name for this email address will always be the same.

But what if the same person has multiple email addresses?

It is also possible to map an email address to a different name and email address.

For example, the following line maps [email protected] to the name “Zé Povinho” and email address “[email protected]”:

Zé Povinho <[email protected]> <[email protected]>

Now, any commit made with [email protected] (regardless of the name used) will be listed as a commit from Zé Povinho <[email protected]>.
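
Both forms can be tried out without touching a real project. The following is a minimal sketch that builds a throwaway repository and asks git check-mailmap how two identities resolve. The example.com addresses and the variable names are made up for illustration; they are not the addresses used elsewhere in this document.

```shell
# Scratch repository, so nothing in a real project is touched.
demo_dir="$(mktemp -d)"
git -C "${demo_dir}" init --quiet

# Hypothetical addresses (example.com), chosen only for this demonstration.
cat > "${demo_dir}/.mailmap" <<'EOF'
Zé Povinho <ze@example.com>
Zé Povinho <ze@example.com> <ze.old@example.com>
EOF

# Simple form: the name is replaced, the address stays the same.
simple="$(git -C "${demo_dir}" check-mailmap 'zepovinho <ze@example.com>')"
# Extended form: both the name and the address are replaced.
extended="$(git -C "${demo_dir}" check-mailmap 'ze-povinho <ze.old@example.com>')"

echo "${simple}"
echo "${extended}"

rm -rf "${demo_dir}"
```

Both echo lines print the same canonical identity, which is the point of the mapping.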

Creating a .mailmap file

In order to know which email addresses and names to map, we first need to know which email addresses and names there are.

For a single repository, this can be done by listing all commits, only showing the email addresses and names: git log --format='%aN <%aE>'. This will give a (possibly very) long list, with a lot of duplicates. To only see each entry once, the output can be piped to sort -u:

git log --format='%aN <%aE>' | sort -u

In order to get all authors from all repositories, a for loop can be used, sorting all output and removing duplicates afterwards:

for sDir in ./repos/*/;do 
    git -C "${sDir}" log --format='%an <%ae>';
done | sort -u

If your setup includes directories that are not git repositories, the above command will output errors for those directories.

fatal: not a git repository (or any of the parent directories)

A check can be added to the loop to make sure git log is only called on repositories:

for sDir in ./repos/*/;do
    # rev-parse exits non-zero outside a repository, so test its exit status directly
    if git -C "${sDir}" rev-parse --is-inside-work-tree >/dev/null 2>&1;then
        git -C "${sDir}" log --format='%aN <%aE>';
    fi;
done | sort -u

This will output something like:

Zé Povinho <[email protected]>
ze-povinho <[email protected]>
Zé Povinho <[email protected]>
zepovinho <[email protected]>
potherca <[email protected]>
Ben Peachey <[email protected]>
Ben Peachey <[email protected]>
Ben Peachey <[email protected]>
Potherca <[email protected]>
Potherca <[email protected]>
AZ <[email protected]>
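
The list shows which identities exist, but not how often each one is used. When picking a canonical name and address, a commit count per identity can help. The sketch below assumes the same ./repos layout as the loops above; count_author_commits is a hypothetical helper name, not something git provides.

```shell
# Count commits per recorded author identity across all repositories in a directory.
# count_author_commits is a hypothetical helper; the directory defaults to ./repos.
count_author_commits() {
    local sReposDir="${1:-./repos}"
    local sDir
    for sDir in "${sReposDir}"/*/; do
        # Skip anything that is not a git repository.
        if git -C "${sDir}" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
            # %an/%ae give the name and address as recorded, ignoring any mailmap.
            git -C "${sDir}" log --format='%an <%ae>'
        fi
    done | sort | uniq -c | sort -rn
}
```

Calling count_author_commits ./repos prints one line per identity, most frequently used first. The lowercase %an and %ae placeholders are used on purpose here, so that an already present .mailmap does not hide the raw identities.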

To create a .mailmap file, you will need to decide which name and email address should be used for each author. Next, map all the other email addresses to the chosen name and email address.

Using the list given above as an example, let's assume we want to use the name “Zé Povinho” with email address [email protected], “Ben Peachey” with [email protected], and we don’t know (or care) who “AZ” is.

That would give us the following .mailmap file (any line starting with a # is a comment):

# ==============================================================================
# The .mailmap feature is used to coalesce together commits by the same person
# in the (short)log, where their name and/or email address was spelled differently.
#
# In the simple form, each line in the file consists of the canonical real name
# of an author, whitespace, and an email address used in the commit (enclosed by
# < and >) to map to the name.
#
#       Proper Name <[email protected]>
#
# Other examples:
#
# <[email protected]> <[email protected]>               # replace only email
# Proper Name <[email protected]> <[email protected]>   # replace both name and email
# ==============================================================================

# ==============================================================================
# Canonical Email Addresses
#
# These are all contributors to this repository as they SHOULD be listed.
# ------------------------------------------------------------------------------
Zé Povinho      <[email protected]>
Ben Peachey     <[email protected]>
# ==============================================================================

# ==============================================================================
# Other Email addresses
# ------------------------------------------------------------------------------
Zé Povinho      <[email protected]>    <[email protected]>
Zé Povinho      <[email protected]>    <[email protected]>
Ben Peachey     <[email protected]>          <[email protected]>
Ben Peachey     <[email protected]>          <[email protected]>
Ben Peachey     <[email protected]>          <[email protected]>
Ben Peachey     <[email protected]>          <[email protected]>
# ==============================================================================

# ==============================================================================
# Unknown user
# ------------------------------------------------------------------------------
Unknown         <[email protected]>       <[email protected]>
# ==============================================================================

#EOF

Checking the .mailmap file

The mailmap can be validated with git check-mailmap. To quote from the git-check-mailmap manual:

For each contact, a single line is output, terminated by a newline. If the name is provided or known to the mailmap, “Name <user@host>” is printed; otherwise only “<user@host>” is printed.

However, just feeding the mailmap to git check-mailmap will trigger an error fatal: unable to parse contact.

All comments and empty lines need to be filtered out for the command to understand the mailmap. As the list can be quite long, sort -u is used to remove duplicates.

grep -v '#' .mailmap | grep '<' | git check-mailmap --stdin | sort -u

For the example mailmap given above, this will output:

Ben Peachey <[email protected]>
Unknown <[email protected]>
Zé Povinho <[email protected]>

If any entry is not as expected, the mailmap needs to be fixed until all is well.
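
The mapping can also be tried against real commit history before the file is copied anywhere. The sketch below relies on git's mailmap.file configuration option, which loads an additional mailmap from a given path. preview_mailmap is a hypothetical helper name, and its two arguments (the mailmap file and one repository) are assumptions about how it would be called.

```shell
# preview_mailmap is a hypothetical helper: $1 = mailmap file, $2 = repository.
preview_mailmap() {
    local sMailmap
    # Use an absolute path, so it stays valid after `git -C` changes directory.
    sMailmap="$(cd "$(dirname "$1")" && pwd)/$(basename "$1")"
    # %an/%ae: identity as recorded in the commit.
    # %aN/%aE: identity after the mailmap has been applied.
    git -C "$2" -c mailmap.file="${sMailmap}" \
        log --format='%an <%ae> => %aN <%aE>' | sort -u
}
```

For example, preview_mailmap .mailmap repos/repo-name prints every recorded identity next to the identity it resolves to. Any line where both sides are unexpectedly identical points at an address the mailmap does not cover yet.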

Using the .mailmap file

Although Gource will use a .mailmap file when one is present, there is no --mailmap flag to point Gource at a specific mailmap file. Instead, the file needs to be copied into the root of each repository.

For a single repository this would be a simple cp:

cp .mailmap repos/repo-name/

However, to do this for all repositories at once, find can be used. The -mindepth 1 option keeps the ./repos/ directory itself out of the results:

find ./repos/ -mindepth 1 -maxdepth 1 -type d -exec cp .mailmap {} \;
Explain Shell command [![explainshell copy-mailmap](/gource-organisation/explainshell.copy-mailmap.png)](https://explainshell.com/explain?cmd=find+.+-type+d+-maxdepth+1+-exec+cp+.mailmap+%7B%7D+%5C%3B)
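
If ./repos/ also holds directories that are not git repositories, the find command copies the file into those as well. As an alternative, the following sketch only copies into directories that contain a .git entry. copy_mailmap is a hypothetical helper name, and the defaults for its arguments are assumptions based on the layout used in this document.

```shell
# copy_mailmap is a hypothetical helper:
#   $1 = mailmap file (default: .mailmap)
#   $2 = directory holding the repositories (default: ./repos)
copy_mailmap() {
    local sMailmap="${1:-.mailmap}"
    local sReposDir="${2:-./repos}"
    local sDir
    for sDir in "${sReposDir}"/*/; do
        # Only copy into directories with a .git entry, i.e. repository roots.
        if [ -e "${sDir}.git" ]; then
            cp "${sMailmap}" "${sDir}"
        fi
    done
}
```

Running copy_mailmap without arguments behaves like the find command above, minus the non-repository directories. Bare repositories have no .git entry and are skipped by this check.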