Make WordPress Core

Opened 2 years ago

Last modified 4 months ago

#52099 reopened enhancement

Sitemaps "Last Modified" (lastmod) tag

Reported by: junaidbhura Owned by:
Milestone: 6.3 Priority: normal
Severity: normal Version: 5.5
Component: Sitemaps Keywords: has-patch
Focuses: Cc:

Description

Sitemaps currently only support the "Location" tag (loc). This ticket adds support for the "Last Modified" tag.
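For illustration only, here is a minimal Python sketch of what a sitemap entry looks like with and without the proposed tag. This is not the WordPress implementation: `build_url_entry` is a hypothetical helper name and the example URL is made up. The timestamp format follows the W3C Datetime format that the sitemaps protocol expects for lastmod.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from typing import Optional


def build_url_entry(loc: str, modified_gmt: Optional[datetime] = None) -> str:
    """Render one <url> entry; <lastmod> is only emitted when a date is known.

    Hypothetical helper for illustration, not part of WordPress.
    `modified_gmt` is assumed to be a timezone-aware datetime.
    """
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    if modified_gmt is not None:
        # Normalize to UTC and format as W3C Datetime.
        stamp = modified_gmt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S+00:00")
        ET.SubElement(url, "lastmod").text = stamp
    return ET.tostring(url, encoding="unicode")


if __name__ == "__main__":
    # Current behaviour: loc only.
    print(build_url_entry("https://example.com/hello-world/"))
    # Proposed behaviour: loc plus lastmod.
    print(build_url_entry(
        "https://example.com/hello-world/",
        datetime(2020, 12, 1, 10, 0, 0, tzinfo=timezone.utc),
    ))
```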

This is how it works:

1. Posts

This is probably the easiest - it just takes the post_modified_gmt value of the post and creates a lastmod tag.

2. Taxonomies

It gets the latest modified post in the taxonomy and creates a lastmod tag based on its last modified value.

3. Users

It gets the latest modified post by a user and creates a lastmod tag based on its last modified value.

4. Indices

Sitemap indices / indexes work in this way:

  1. If it's a post index - get the last modified date of the most recently updated post in the post type
  2. If it's a taxonomy index - get the last modified date of the most recently updated post which is associated with any term in the taxonomy
  3. If it's a user index - get the last modified date of the most recently updated post of the post type "post" - since all posts are associated with users
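The rules above can be sketched with plain in-memory data. The following Python is only a rough model of the logic described in this ticket, not the code in the patch: the `Post` class and all function names are hypothetical, and a real implementation would run database queries instead of scanning a list. It also assumes that the per-user value considers every post by that user regardless of post type, which the description does not state explicitly.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Optional, Set


@dataclass
class Post:
    """Hypothetical stand-in for a post row; modified_gmt must be timezone-aware."""
    post_type: str
    author: str
    modified_gmt: datetime
    terms: Dict[str, Set[str]] = field(default_factory=dict)  # taxonomy -> term slugs


def _latest(posts: Iterable[Post]) -> Optional[datetime]:
    # Most recent modification date, or None when there are no matching posts.
    return max((p.modified_gmt for p in posts), default=None)


def post_lastmod(post: Post) -> datetime:
    # 1. Posts: the post's own modified date.
    return post.modified_gmt


def term_lastmod(posts: List[Post], taxonomy: str, term: str) -> Optional[datetime]:
    # 2. Taxonomies: latest modified post assigned to the term.
    return _latest(p for p in posts if term in p.terms.get(taxonomy, set()))


def user_lastmod(posts: List[Post], author: str) -> Optional[datetime]:
    # 3. Users: latest modified post by the user (assumption: any post type).
    return _latest(p for p in posts if p.author == author)


def post_index_lastmod(posts: List[Post], post_type: str) -> Optional[datetime]:
    # 4.1 Post index: latest modified post of the post type.
    return _latest(p for p in posts if p.post_type == post_type)


def taxonomy_index_lastmod(posts: List[Post], taxonomy: str) -> Optional[datetime]:
    # 4.2 Taxonomy index: latest modified post with any term in the taxonomy.
    return _latest(p for p in posts if p.terms.get(taxonomy))


def user_index_lastmod(posts: List[Post]) -> Optional[datetime]:
    # 4.3 User index: latest modified post of the post type "post".
    return post_index_lastmod(posts, "post")


def to_lastmod(dt: Optional[datetime]) -> Optional[str]:
    # Format as W3C Datetime in UTC; None means "omit the tag".
    if dt is None:
        return None
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S+00:00")
```

Returning `None` when nothing matches lets the caller skip the lastmod tag entirely instead of emitting a misleading date.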

Attachments (2)

52099.diff (12.1 KB) - added by junaidbhura 2 years ago.
52099.2.diff (12.1 KB) - added by junaidbhura 2 years ago.


Change History (18)

@junaidbhura
2 years ago

#1 This ticket was mentioned in PR #822 on WordPress/wordpress-develop by junaidbhura.
2 years ago

  • Keywords has-unit-tests added

#2 github-actions[bot] commented on PR #822:
2 years ago

Hi @junaidbhura! 👋

Thank you for your contribution to WordPress! 💖

It looks like this is your first pull request to wordpress-develop. Here are a few things to be aware of that may help you out!

No one monitors this repository for new pull requests. Pull requests must be attached to a Trac ticket to be considered for inclusion in WordPress Core. To attach a pull request to a Trac ticket, please include the ticket's full URL in your pull request description.

Pull requests are never merged on GitHub. The WordPress codebase continues to be managed through the SVN repository that this GitHub repository mirrors. Please feel free to open pull requests to work on any contribution you are making.

More information about how GitHub pull requests can be used to contribute to WordPress can be found in this blog post.

Please include automated tests. Including tests in your pull request is one way to help your patch be considered faster. To learn about WordPress' test suites, visit the Automated Testing page in the handbook.

If you have not had a chance, please review the Contribute with Code page in the WordPress Core Handbook.

The Developer Hub also documents the various coding standards that are followed.

Thank you,
The WordPress Project

#3 @junaidbhura
2 years ago

Hey @peterwilsoncc do you think you could take a look at this and let me know what you think?

https://github.com/WordPress/wordpress-develop/pull/822

@junaidbhura
2 years ago

#4 @MadtownLems
2 years ago

I believe that this was intentionally left in plugin territory, at least partially because "last modified" for a post can be much more complicated than this when you consider dynamic content.

If a page's primary function is to embed a YouTube playlist, Twitch channel, etc., the page gets new content whenever those things update.

The same is true for pages that pull content from other sources, such as a third party calendaring system, RSS feeds, or even just using the Latest Posts block.

Furthermore, most search engines don't actually consume these other tags.

Here's an excerpt from the blog post announcing the new functionality (https://make.wordpress.org/core/2020/07/22/new-xml-sitemaps-functionality-in-wordpress-5-5/):

"The sitemaps protocol specifies a certain set of supported attributes for sitemap entries. Of those, only the URL (loc) tag is required. All others (e.g. changefreq and priority) are optional tags in the sitemaps protocol and not typically consumed by search engines, which is why WordPress only lists the URL itself. Developers can still add those tags if they really want to."

#5 @pbiron
2 years ago

  • Version changed from 5.6 to 5.5

Version 0.2.0 of the core sitemaps feature plugin was the last one to include support for lastmod.

All support for lastmod was removed in "Remove all traces of lastmod" (PR 145).

You can check out that PR, which contains links to the issues and slack conversations around that.

In addition to what @MadtownLems mentions, in large sites lastmod can be expensive to compute.

#6 @junaidbhura
2 years ago

  • Resolution set to wontfix
  • Status changed from new to closed

Didn't realise that it was purposely left out - thanks for pointing this out. I'll close this ticket out!

#7 @desrosj
2 years ago

  • Milestone Awaiting Review deleted

#8 @ocean90
23 months ago

#53740 was marked as a duplicate.

#9 @swissspidy
23 months ago

  • Focuses performance added
  • Keywords needs-patch added; has-patch has-unit-tests removed
  • Milestone set to Future Release
  • Resolution wontfix deleted
  • Status changed from closed to reopened

Reopening after talking to @garyillyes regarding #53740.

While Google does not currently use <lastmod>, other search engines consume it to schedule crawls more effectively, saving resources and decreasing load on sites.

It's true that we removed lastmod originally to keep things simple and performant, but perhaps there is some middle ground where we can add it without much overhead.

For example, adding lastmod for posts is probably trivial, but for other entries and especially the homepage it might not be, and we could consider those cases plugin territory.

Expensive queries could be cached via wp_cache_add or similar to ensure there's no impact on larger sites.
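As a rough illustration of the caching idea, the Python sketch below computes an expensive lastmod value once and reuses it until it is invalidated. It is not WordPress code: `ObjectCache`, `cached_lastmod`, and the cache key are hypothetical names, and the `add` method only imitates the add-if-absent behaviour that wp_cache_add is documented to have.

```python
from typing import Any, Callable, Dict, Optional


class ObjectCache:
    """Tiny in-memory cache used only for this illustration."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)

    def add(self, key: str, value: Any) -> bool:
        # Add-if-absent: never overwrite an existing entry.
        if key in self._store:
            return False
        self._store[key] = value
        return True

    def delete(self, key: str) -> None:
        self._store.pop(key, None)


def cached_lastmod(cache: ObjectCache, key: str, compute: Callable[[], str]) -> str:
    """Return the cached value, running the expensive `compute` only on a miss."""
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.add(key, value)
    return value


if __name__ == "__main__":
    cache = ObjectCache()

    def expensive_query() -> str:
        # Placeholder for a slow "latest modified post in taxonomy" query.
        return "2020-12-05T10:00:00+00:00"

    print(cached_lastmod(cache, "sitemap_lastmod:taxonomy:category", expensive_query))
    # When a post in the taxonomy is saved, the entry would be invalidated:
    cache.delete("sitemap_lastmod:taxonomy:category")
```

The trade-off is staleness: unless the entry is deleted when relevant content changes, the sitemap may report an outdated lastmod.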

#10 @mukesh27
12 months ago

Hi @junaidbhura, do you want to create a new pull request?

#11 @flixos90
11 months ago

  • Focuses performance removed

Removing the performance focus here since this isn't a performance enhancement but rather a new sitemaps feature.

#12 @junaidbhura
10 months ago

Hey @mukesh27 I've just merged the master branch into the branch. Could you please take a look and share your initial thoughts, and we'll take it from there?

https://github.com/WordPress/wordpress-develop/pull/822

#13 @joostdevalk
4 months ago

Could we please have another look at this patch? Adding lastmod would really make a difference in crawl efficiency for several search engines.

In fact, as I understand from hearing @garyilyes speak about it on "Search off the record", Google doesn't use it because it's "unreliable". What if we made it reliable by actually doing it right? :)

For now, the pull above seems like a very good first step.

#14 @SergeyBiryukov
4 months ago

  • Keywords has-patch added; needs-patch removed
  • Milestone changed from Future Release to 6.3

#15 @aristath
4 months ago

Bing is soon rolling out a change which will make more effective use of lastmod (more info: https://blogs.bing.com/webmaster/february-2023/The-Importance-of-Setting-the-lastmod-Tag-in-Your-Sitemap)
With that in mind, adding this info to our sitemaps will reduce server loads and improve the project's sustainability.
I'm sure other search engines will follow Bing - but even if they don't, Google is not the only search engine out there... There are plenty of other crawlers and they do use it.

#16 @fabricecanel
4 months ago

Fabrice from Bing: thanks for considering this sitemap change. This change will really help, on a massive scale, to optimize the crawling of all WordPress sites, leading to cost savings and an improvement in the freshness and completeness of indexed content. Through upcoming testimonials and sharing of data, we aim to demonstrate the benefits and encourage other search engines to leverage lastmod too. It's important to keep in mind that sitemaps are a daily task; therefore, the adoption of IndexNow by WordPress (https://core.trac.wordpress.org/ticket/52900) is highly recommended to further improve crawl efficiency and ensure content freshness. I updated the IndexNow feature request, as the code is already prepared and this is truly a game changer.
