#61864 closed defect (bug) (fixed)
Meta: Ascribe commits to the proper person and name.
Reported by: | dmsnell | Owned by: | dmsnell |
---|---|---|---|
Milestone: | 6.7 | Priority: | normal |
Severity: | normal | Version: | trunk |
Component: | Administration | Keywords: | has-patch |
Focuses: | Cc: |
Description
Many commits in git end up with invalid committer information, be it the name, email, or both. This happens for a number of reasons.
- The committer has changed their name.
- The committer has changed their email or used a different one in the past.
- An automated system recorded the wrong email address.
- The username mapping script from subversion double-encodes the UTF-8 bytes of a committer's name from profiles.wordpress.org and corrupts names with non-US-ASCII characters.
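The double-encoding failure in the last item can be reproduced in a few lines. This is a minimal sketch, assuming the mapping script misreads the UTF-8 bytes as Latin-1 before encoding them again; the ticket does not say which intermediate encoding is involved, and the sample name is hypothetical.

```python
def double_encode(name: str) -> str:
    """Simulate the corruption: take the UTF-8 bytes of a name and
    misread each byte as one Latin-1 character (assumed intermediate encoding)."""
    return name.encode("utf-8").decode("latin-1")


def repair(corrupted: str) -> str:
    """Undo the corruption by reversing the misread step."""
    return corrupted.encode("latin-1").decode("utf-8")


original = "José Example"  # hypothetical committer name
corrupted = double_encode(original)
print(corrupted)                      # JosÃ© Example
print(repair(corrupted) == original)  # True
```

Names made only of US-ASCII characters pass through unchanged, which would explain why only some committers are affected.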
This obscures authorship for the affected committers and may erroneously maintain the wrong attribution for their work, even perpetuating names they have chosen to change.
Because the record in subversion only reports usernames, it does not have this problem, but the git side communicates identity and should either be fixed or have a mechanism to resolve these issues.
git itself offers one way to fix this via the .mailmap file, which I propose introducing into the repo. This file allows a git repository to hold a list of aliases for names and emails such that all commits attributed to any of the aliases resolve to the preferred identity. This provides control to each committer over their own representation, and fixes a number of software mistakes that cloud the history.
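To make the alias idea concrete, here is a rough sketch of how a mailmap lookup could work. It is not git's implementation: it handles only the common line forms, ignores trailing comments, compares case-insensitively, and lets the last matching line win, all of which are choices of this sketch. The names `Entry`, `parse_mailmap`, and `resolve` are hypothetical, and the second sample line uses made-up addresses.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    proper_name: Optional[str]   # None means "keep the recorded name"
    proper_email: Optional[str]  # None means "keep the recorded email"
    commit_name: Optional[str]   # None means "match on email alone"
    commit_email: str


# "Name <email>", optionally followed by a second "Name <email>" pair.
_LINE = re.compile(r"^([^<]*?)\s*<([^>]*)>\s*(?:([^<]*?)\s*<([^>]*)>)?$")


def parse_mailmap(text: str) -> List[Entry]:
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = _LINE.match(line)
        if match is None:
            continue  # skip anything this simplified parser cannot read
        name1, email1, name2, email2 = match.groups()
        if email2 is None:
            # Single pair: replace only the name for this commit email.
            entries.append(Entry(name1 or None, None, None, email1))
        else:
            entries.append(Entry(name1 or None, email1, name2 or None, email2))
    return entries


def resolve(entries: List[Entry], name: str, email: str) -> Tuple[str, str]:
    """Return the preferred (name, email) for a recorded identity."""
    same_email = [e for e in entries if e.commit_email.lower() == email.lower()]
    exact = [e for e in same_email
             if e.commit_name is not None and e.commit_name.lower() == name.lower()]
    email_only = [e for e in same_email if e.commit_name is None]
    candidates = exact or email_only  # a name-and-email match is more specific
    if not candidates:
        return name, email
    chosen = candidates[-1]
    return chosen.proper_name or name, chosen.proper_email or email


MAILMAP = """\
Aaron D. Campbell <aaroncampbell@git.wordpress.org> aaroncampbell <aaroncampbell@602fd350-edb4-49c9-b593-d223f7449a82>
New Name <new@example.org> <old@example.org>
"""

entries = parse_mailmap(MAILMAP)
print(resolve(entries, "aaroncampbell",
              "aaroncampbell@602fd350-edb4-49c9-b593-d223f7449a82"))
print(resolve(entries, "Old Name", "old@example.org"))
```

Because the mapping is applied when history is displayed, commits themselves are not rewritten; an identity with no matching line is returned unchanged.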
Change History (16)
This ticket was mentioned in PR #7180 on WordPress/wordpress-develop by @dmsnell.
4 weeks ago
#1
- Keywords has-patch added
@peterwilsoncc commented on PR #7180:
4 weeks ago
#2
I'm asking around to see if there is a way to automate this to avoid it falling out of date as contributors are added to the commit group. I think it's a nice to have rather than a blocker but let's see if it proves possible in the short term.
3 weeks ago
#3
> a way to automate this
I'm sure some automation will be possible, but also I think that raises a lot of ambiguous questions, such as:
- how do we automatically know that a new email represents the same identity as an existing one?
- if an email comes over with a different name, how do we know it's the right name?
- if the name comes over with something like Mike Adams (mdawaffe), how do we know if we should or shouldn't remove the parentheses?
Given that this is a display-only thing I would imagine we can circumvent a lot of maintenance and complexity by allowing contributors to govern their own commits. That's a nice thing about this file: if I did something wrong they can correct it; if they change their name or email, they can correct it, and they won't worry about their changes being automatically wiped out by some process that was unaware.
Happy to see automation; I'm just sharing my skepticism that it's going to be as valuable or effective in practice as it might seem at the outset, mainly because we don't have any real indication if a new identity represents an existing one or not.
This ticket was mentioned in Slack in #core by dmsnell. View the logs.
3 weeks ago
@joemcgill commented on PR #7180:
3 weeks ago
#5
I like this. I think we should preemptively add a {wp-username}@602fd350-edb4-49c9-b593-d223f7449a82 entry for all committers, since it seems like this format is being recorded randomly.
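Generating those entries could look something like the sketch below. It assumes each committer's preferred address is {wp-username}@git.wordpress.org, which matches the sample line quoted in this thread but is not confirmed for every committer. The function name `guid_entries` and its input shape are hypothetical.

```python
from typing import Iterable, List, Tuple

# The identifier that appears after "@" in the affected commits (from this ticket).
GUID = "602fd350-edb4-49c9-b593-d223f7449a82"


def guid_entries(committers: Iterable[Tuple[str, str]]) -> List[str]:
    """Build one mailmap line per (display_name, wp_username) pair.

    Assumption: the preferred email is <wp_username>@git.wordpress.org.
    """
    lines = []
    for display_name, username in sorted(committers, key=lambda c: c[1].lower()):
        lines.append(
            f"{display_name} <{username}@git.wordpress.org> <{username}@{GUID}>"
        )
    return lines


for line in guid_entries([("Aaron D. Campbell", "aaroncampbell")]):
    print(line)
```

The committer list itself would still have to come from somewhere, so this only covers the mechanical part of the suggestion.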
@peterwilsoncc commented on PR #7180:
3 weeks ago
#6
> Happy to see automation; I'm just sharing my skepticism that it's going to be as valuable or effective in practice as it might seem at the outset, mainly because we don't have any real indication if a new identity represents an existing one or not.
I thought about this overnight and agree.
It's worth putting in the file initially and subsequently working on improving the maintenance approach. I agree with @joemcgill that including everyone with the @GUID initially would be helpful.
@peterwilsoncc commented on PR #7180:
3 weeks ago
#8
@dmsnell Was the name @git... name @guid format intentional, or is the second name redundant? E.g.
Aaron D. Campbell <aaroncampbell@git.wordpress.org> aaroncampbell <aaroncampbell@602fd350-edb4-49c9-b593-d223f7449a82>
Lazyweb: I could ~probably~ research this myself but while you are around... :)
3 weeks ago
#9
@peterwilsoncc it could probably be removed. let me redo the last commit. that matches more tightly than just the email, but I don't suppose these would ever need that level of specificity
3 weeks ago
#10
okay the GUID replacements have been updated in fff1c51 to only match on the email address and not also the name. this reduces the number of added lines to the mailmap file.
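for reference, the change amounts to dropping the second name from each line so the entry matches on the commit email alone. a rough sketch of that transformation, not the actual script used for the commit (the helper name `drop_commit_name` is made up):

```python
import re

# Matches "Proper Name <proper@email> Commit Name <commit@email>".
# The commit name must start with a non-space character, so lines that
# already lack a commit name do not match and are left alone.
_FOUR_FIELD = re.compile(
    r"^(?P<proper>[^<]*<[^>]*>)\s*[^<\s][^<]*?\s*(?P<commit><[^>]*>)\s*$"
)


def drop_commit_name(line: str) -> str:
    """Rewrite a name-and-email match into an email-only match."""
    match = _FOUR_FIELD.match(line)
    if match is None:
        return line  # comments, blank lines, and other forms pass through
    return f"{match.group('proper').strip()} {match.group('commit')}"


before = (
    "Aaron D. Campbell <aaroncampbell@git.wordpress.org> "
    "aaroncampbell <aaroncampbell@602fd350-edb4-49c9-b593-d223f7449a82>"
)
print(drop_commit_name(before))
```

an email-only entry applies to every commit recorded with that address, whatever name was attached to it.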
#11
3 weeks ago
- Owner set to dmsnell
- Resolution set to fixed
- Status changed from new to closed
In 58899:
Trac ticket: Core-61864.
From time to time a new commit will appear from an existing committer with a different name or email address (or both) than the one already on record. This occurs because of changing names and changing emails and because of mistakes. Additionally, the svn-to-git process double-encodes names from profiles.wordpress.org, causing corruption in names with non-US-ASCII characters.
This patch introduces a .mailmap file to alias committers so that:
The .mailmap file is a standard git configuration.