Make WordPress Core

Opened 4 weeks ago

Closed 3 weeks ago

Last modified 11 days ago

#61864 closed defect (bug) (fixed)

Meta: Ascribe commits to the proper person and name.

Reported by: dmsnell
Owned by: dmsnell
Milestone: 6.7
Priority: normal
Severity: normal
Version: trunk
Component: Administration
Keywords: has-patch
Focuses:
Cc:

Description

Many commits in git end up with invalid committer information, be it the name, email, or both. This happens for a number of reasons.

  • The committer has changed their name.
  • The committer has changed their email or used a different one in the past.
  • An automated system recorded the wrong email address.
  • The username mapping script from subversion double-encodes the UTF-8 bytes of a committer's name from profiles.wordpress.org and corrupts names with non-US-ASCII characters.
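
The double-encoding in the last item can be reproduced in a few lines. This is only a sketch: it assumes the UTF-8 bytes are misread as Latin-1 before being stored as UTF-8 again, which the ticket does not confirm, and the function names `double_encode` and `repair` are made up for illustration.

```python
def double_encode(name: str) -> str:
    """Simulate the bug: UTF-8 bytes misread as Latin-1 (an assumption),
    producing a string that is then stored as UTF-8 a second time."""
    return name.encode("utf-8").decode("latin-1")


def repair(mangled: str) -> str:
    """Undo one round of double encoding.

    If the input does not look double-encoded, it is returned unchanged.
    """
    try:
        return mangled.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return mangled


original = "Adam Zieliński"
mangled = double_encode(original)

print(repr(mangled))    # the corrupted form of the name
print(repair(mangled))  # the original name again
```

Names made up only of US-ASCII characters pass through `double_encode` unchanged, which is consistent with the corruption affecting only names with non-US-ASCII characters.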

This obscures authorship for the affected committers and may erroneously maintain the wrong attribution for their work, even perpetuating names they have chosen to change.

Because the record in subversion only reports usernames, it does not have this problem. The git side, however, communicates identity and should either be fixed or have a mechanism to resolve these issues.

git itself offers one way to fix this via the .mailmap file, which I propose introducing into the repo. This file allows a git repository to hold a list of aliases for names and emails such that all commits attributed to any of the aliases resolve to the preferred identity. This provides control to each committer over their own representation, and fixes a number of software mistakes that cloud the history.
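
To illustrate the alias mechanism, here is a simplified Python sketch of how a mailmap lookup behaves. It is not git's implementation: the parser only covers the common entry forms, and the `example.com` identities and the helper names `parse_mailmap` and `resolve` are hypothetical.

```python
import re

# One or two "Name <email>" groups; either name may be empty.
_ENTRY = re.compile(
    r"^\s*(?P<pname>[^<>]*?)\s*<(?P<pemail>[^<>]+)>"
    r"(?:\s*(?P<cname>[^<>]*?)\s*<(?P<cemail>[^<>]+)>)?\s*$"
)


def parse_mailmap(text):
    """Return (commit_name, commit_email, proper_name, proper_email) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = _ENTRY.match(line)
        if m is None:
            continue  # ignore lines this sketch does not understand
        pname = m["pname"] or None
        cname = m["cname"] or None
        if m["cemail"] is None:
            # "Proper Name <commit@email>": only the name is replaced.
            entries.append((None, m["pemail"].lower(), pname, None))
        else:
            entries.append((cname, m["cemail"].lower(), pname, m["pemail"]))
    return entries


def resolve(entries, name, email):
    """Map a recorded (name, email) to the preferred identity, if any."""
    best = None
    for cname, cemail, pname, pemail in entries:
        if cemail != email.lower():
            continue
        if cname is not None and cname.lower() != name.lower():
            continue
        # A name+email match is more specific than an email-only match.
        if best is None or cname is not None:
            best = (pname, pemail)
    if best is None:
        return name, email
    pname, pemail = best
    return pname or name, pemail or email


# Hypothetical entries, for illustration only.
MAILMAP = """
# Preferred identity first, recorded alias second.
Jane Doe <jane@example.com> <jane.old@example.com>
Jane Doe <jane@example.com> jdoe <jdoe@users.example.com>
"""

entries = parse_mailmap(MAILMAP)
print(resolve(entries, "J. Doe", "jane.old@example.com"))
print(resolve(entries, "Someone Else", "else@example.com"))
```

In a real checkout, git performs this lookup itself; `git check-mailmap` reports the mapped identity for a given name and email.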

Change History (16)

This ticket was mentioned in PR #7180 on WordPress/wordpress-develop by @dmsnell.


4 weeks ago
#1

  • Keywords has-patch added

Trac ticket: Core-61864.

From time to time a new commit will appear from an existing committer which has a different name or email address (or both) than an existing name or email address. This occurs because of changing names and changing emails and because of mistakes. Additionally, the svn-to-git process double-encodes names from profiles.wordpress.org, causing corruption in names with non-US-ASCII characters.

This patch introduces a .mailmap file to alias committers so that:

  • All contributions for a given person are shown for that person.
  • Committers will be able to control or fix the display of their own name.

The .mailmap file is a standard git configuration.

@peterwilsoncc commented on PR #7180:


4 weeks ago
#2

I'm asking around to see if there is a way to automate this to avoid it falling out of date as contributors are added to the commit group. I think it's a nice to have rather than a blocker but let's see if it proves possible in the short term.

@dmsnell commented on PR #7180:


3 weeks ago
#3

> a way to automate this

I'm sure some automation will be possible, but also I think that raises a lot of ambiguous questions, such as:

  • how do we automatically know that a new email represents the same identity as an existing one?
  • if an email comes over with a different name, how do we know it's the right name?
  • if the name comes over with something like Mike Adams (mdawaffe), how do we know whether we should or shouldn't remove the parentheses?

Given that this is a display-only thing I would imagine we can circumvent a lot of maintenance and complexity by allowing contributors to govern their own commits. That's a nice thing about this file: if I did something wrong they can correct it; if they change their name or email, they can correct it, and they won't worry about their changes being automatically wiped out by some process that was unaware.

Happy to see automation; I'm just sharing my skepticism that it's going to be as valuable or effective in practice as it might seem at the outset, mainly because we don't have any real indication if a new identity represents an existing one or not.

This ticket was mentioned in Slack in #core by dmsnell.


3 weeks ago

@joemcgill commented on PR #7180:


3 weeks ago
#5

I like this. I think we should preemptively add a {wp-username}@602fd350-edb4-49c9-b593-d223f7449a82 entry for all committers, since it seems like this format is being recorded randomly.

@peterwilsoncc commented on PR #7180:


3 weeks ago
#6

> Happy to see automation; I'm just sharing my skepticism that it's going to be as valuable or effective in practice as it might seem at the outset, mainly because we don't have any real indication if a new identity represents an existing one or not.

I thought about this overnight and agree.

It's worth putting in the file initially and subsequently working on improving the maintenance approach. I agree with @joemcgill that including everyone with the @GUID initially would be helpful.

@dmsnell commented on PR #7180:


3 weeks ago
#7

Good idea @joemcgill and @peterwilsoncc - I've added them in 3229de6

@peterwilsoncc commented on PR #7180:


3 weeks ago
#8

@dmsnell Was the "name @git... name @guid" format intentional, or is the second name redundant? E.g.:

Aaron D. Campbell <aaroncampbell@git.wordpress.org> aaroncampbell <aaroncampbell@602fd350-edb4-49c9-b593-d223f7449a82>

Lazyweb: I could ~probably~ research this myself but while you are around... :)

@dmsnell commented on PR #7180:


3 weeks ago
#9

@peterwilsoncc It could probably be removed; let me redo the last commit. Including the name matches more tightly than the email alone, but I don't suppose these entries would ever need that level of specificity.

@dmsnell commented on PR #7180:


3 weeks ago
#10

Okay, the GUID replacements have been updated in fff1c51 to only match on the email address and not also the name. This reduces the number of added lines to the mailmap file.
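
As a rough illustration of the email-only form, the sketch below generates one GUID entry per committer. The GUID and the `@git.wordpress.org` address pattern come from the entry quoted earlier in the thread. The helper name `guid_entries` and the shape of its input are assumptions, and the exact lines in the committed file may differ.

```python
# GUID as quoted in this ticket.
SVN_GUID = "602fd350-edb4-49c9-b593-d223f7449a82"


def guid_entries(committers: dict[str, str]) -> list[str]:
    """Build email-only mailmap lines from {username: display name}.

    Each line maps <username@GUID> to the preferred identity without
    also requiring the recorded name to match.
    """
    lines = []
    for username, display_name in sorted(committers.items()):
        proper = f"{display_name} <{username}@git.wordpress.org>"
        lines.append(f"{proper} <{username}@{SVN_GUID}>")
    return lines


# Sample input taken from the entry quoted earlier in the thread.
committers = {"aaroncampbell": "Aaron D. Campbell"}
for line in guid_entries(committers):
    print(line)
```

Because the entry matches on the email address alone, one line covers every name that might have been recorded alongside that GUID address.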

#11 @dmsnell
3 weeks ago

  • Owner set to dmsnell
  • Resolution set to fixed
  • Status changed from new to closed

In 58899:

Meta: Add .mailmap to ascribe git commits to proper author.

From time to time a new commit will appear from an existing committer which has a different name or email address (or both) than an existing name or email address. This occurs because of changing names and changing emails and because of mistakes. Additionally, the svn-to-git process double-encodes names from profiles.wordpress.org, causing corruption in names with non-US-ASCII characters.

This patch introduces a .mailmap file to alias committers so that:

  • All contributions for a given person are shown for that person.
  • Committers will be able to control or fix the display of their own name.

The .mailmap file is a standard git configuration.

Developed in https://github.com/wordpress/wordpress-develop/pull/7180
Discussed in https://core.trac.wordpress.org/ticket/61864

Fixes #61864.

#12 @peterwilsoncc
3 weeks ago

  • Milestone changed from Awaiting Review to 6.7

@dmsnell commented on PR #7180:


3 weeks ago
#14

Oops, sorry @peterwilsoncc if I missed your updates when I merged. I'll follow up with your suggestions, but may not get to that today.

#15 @peterwilsoncc
3 weeks ago

In 58900:

Meta: Tidy up and update .mailmap.

Updates the name mappings to sort display names ascii-alphabetically and to ascribe commits to updated usernames.

Accounts without a separate display name remain listed in the footer of the file.

Props dmsnell, jorbin.
See #61864.
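
The ordering described in this commit can be sketched as follows, assuming "ascii-alphabetically" means plain byte order (so uppercase letters sort before lowercase ones). The sample pairs and the helper name `tidy_order` are hypothetical, and the commit does not say which tool was used to sort the file.

```python
def tidy_order(entries: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Order (display_name, username) pairs as the commit describes.

    Accounts with a separate display name come first, sorted by the
    bytes of the display name; accounts whose display name is just the
    username are kept together at the end (the "footer").
    """
    def by_bytes(pair: tuple[str, str]) -> bytes:
        return pair[0].encode("utf-8")

    named = [p for p in entries if p[0] != p[1]]
    bare = [p for p in entries if p[0] == p[1]]
    return sorted(named, key=by_bytes) + sorted(bare, key=by_bytes)


# Hypothetical sample data.
sample = [
    ("carol", "carol"),
    ("adam example", "adam"),
    ("Zoe Example", "zoe"),
    ("Bea Example", "bea"),
]
for display_name, username in tidy_order(sample):
    print(display_name, username)
```

Byte order is deterministic across locales, which keeps diffs to the file stable no matter who re-sorts it.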

#16 @dmsnell
11 days ago

In 58941:

Add Adam Zieliński to the mailmap file.

Adds Adam Zieliński's display name to the .mailmap file.

Developed in https://github.com/wordpress/wordpress-develop/7204
Discussed in https://core.trac.wordpress.org/ticket/61864

Follow-up to [58899].

Props dmsnell, zieladam.
See #61864.
