Make WordPress Core

Opened 3 years ago

Last modified 7 weeks ago

#53170 new enhancement

Reduce the transient size of WordPress Events and News feeds

Reported by: lucasbustamante Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 5.7.1
Component: Widgets Keywords:
Focuses: performance, sustainability Cc:

Description

If we export the database created in a clean WordPress installation that blocks external connections with WP_HTTP_BLOCK_EXTERNAL, the exported file size is just 34kb.

Without blocking external connections, the exported SQL file size is 746kb.

This whopping difference of 712kb comes from WordPress Events and News feeds: https://github.com/WordPress/WordPress/blob/master/wp-admin/includes/dashboard.php#L1490

That code downloads these feeds and stores them as transients in the database.

This seems to be unnecessary, given that the content that is actually displayed in the widget is just the event/news name and a link:

https://i.imgur.com/0vWVwE0.jpg

With this ticket, I would like to suggest reducing the transient size of the WordPress News and Events widget feed to the minimum possible.

Attachments (2)

wordpress-feed.jpg (65.2 KB) - added by lucasbustamante 3 years ago.
WordPress News and Events widget, that consumes the feed
feed.jpg (89.0 KB) - added by lucasbustamante 3 years ago.


Change History (14)

@lucasbustamante
3 years ago

WordPress News and Events widget, that consumes the feed

#1 @lucasbustamante
3 years ago

Essentially, 95% of the size of this feed comes from summary/description. The show_summary argument in the code that consumes this feed is always false. So the feed could just not include the summary there.
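
A minimal sketch of that idea, in Python for illustration only (it is not WordPress code). It assumes the bulky text sits in the `description` and `content:encoded` elements of each RSS item; the sample feed, its example.org URLs, and the `strip_summaries` helper are all made up for this example, and the size reduction it prints says nothing about the real feed.

```python
import xml.etree.ElementTree as ET

CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"
# Elements assumed to carry the bulky summary text in an RSS 2.0 item.
BULKY_TAGS = {"description", "{%s}encoded" % CONTENT_NS}

# Hypothetical two-item feed; the URLs and filler text are invented.
FILLER = "Lorem ipsum dolor sit amet. " * 40
SAMPLE_FEED = """<rss version="2.0" xmlns:content="{ns}">
<channel>
<title>Example News</title>
<item>
<title>First post</title>
<link>https://example.org/news/first-post/</link>
<description>{filler}</description>
<content:encoded>{filler}</content:encoded>
</item>
<item>
<title>Second post</title>
<link>https://example.org/news/second-post/</link>
<description>{filler}</description>
</item>
</channel>
</rss>""".format(ns=CONTENT_NS, filler=FILLER)


def strip_summaries(feed_xml):
    """Return the feed XML with summary/content elements removed from each item."""
    root = ET.fromstring(feed_xml)
    # Copy both sequences first so removal does not disturb iteration.
    for item in list(root.iter("item")):
        for child in list(item):
            if child.tag in BULKY_TAGS:
                item.remove(child)
    return ET.tostring(root, encoding="unicode")


slim_feed = strip_summaries(SAMPLE_FEED)
print(len(SAMPLE_FEED), "->", len(slim_feed), "characters")
```

Titles and links survive untouched, which is all the widget displays.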

#2 @lucasbustamante
3 years ago

I've also benchmarked this on a clean WordPress with external requests enabled and the "WordPress News and Events" widget removed with remove_meta_box('dashboard_primary', get_current_screen(), 'side');.

The resulting database export is 76kb in size.

The events portion of that widget is fairly small. The bulk of the data comes from the news items shown below it, whose payload includes an unnecessary excerpt of each article, while the widget only displays the title and the link.
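
To put the three measurements from this ticket side by side, here is a small arithmetic sketch in Python. It assumes the three exports are directly comparable and that each size difference can be attributed to a single cause, which the benchmarks suggest but do not prove. The variable names are made up for illustration.

```python
# Export sizes reported in this ticket, in kb.
BLOCKED_EXTERNAL = 34   # clean install with WP_HTTP_BLOCK_EXTERNAL
DEFAULT_INSTALL = 746   # clean install, external requests allowed
WIDGET_REMOVED = 76     # external requests allowed, dashboard_primary removed

# Everything that external requests add to the export.
external_total = DEFAULT_INSTALL - BLOCKED_EXTERNAL   # 712

# Portion that disappears when the News and Events widget is removed.
widget_share = DEFAULT_INSTALL - WIDGET_REMOVED        # 670

# External data that remains even without the widget.
other_external = WIDGET_REMOVED - BLOCKED_EXTERNAL     # 42

print("external data total:", external_total, "kb")
print("attributable to the widget:", widget_share, "kb")
print("other external data:", other_external, "kb")
print("widget share of the whole export: {:.1%}".format(widget_share / DEFAULT_INSTALL))
```

Under those assumptions, the widget accounts for 670 of the 712 kb added by external requests.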

#3 follow-up: @Mte90
3 years ago

I checked the WordPress admin dashboard with the widget on the latest dev version.

There is a transient _transient_dash_v2_88ae138922fe95674369b1cb3d215a2b with this content:

<div class="rss-widget"><ul><li><a class='rsswidget' href='https://wordpress.org/news/2021/05/episode-9-the-cartography-of-wordpress/'>WP Briefing: Episode 9: The Cartography of WordPress</a></li><li><a class='rsswidget' href='https://wordpress.org/news/2021/05/dropping-support-for-internet-explorer-11/'>Dropping support for Internet Explorer 11</a></li></ul></div><div class="rss-widget"><ul><li><a class='rsswidget' href='https://wptavern.com/my-codeless-website-app-detects-site-builder-tools?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=my-codeless-website-app-detects-site-builder-tools'>WPTavern: My Codeless Website App Detects Site Builder Tools</a></li><li><a class='rsswidget' href='https://wptavern.com/bricks-laying-down-a-foundation-in-the-wordpress-page-builder-market?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=bricks-laying-down-a-foundation-in-the-wordpress-page-builder-market'>WPTavern: Bricks: Laying Down a Foundation in the WordPress Page Builder Market</a></li><li><a class='rsswidget' href='https://wordpress.org/news/2021/05/episode-9-the-cartography-of-wordpress/'>WordPress.org blog: WP Briefing: Episode 9: The Cartography of WordPress</a></li></ul></div>

I also tested with the latest version, so I don't know which transient you have that holds all this data.

#4 in reply to: ↑ 3 @lucasbustamante
3 years ago

Replying to Mte90:

Howdy! Thanks for looking into this.

The transients are:

  • _site_transient_community-events-05ed3f1295b0676bee065151acc5ed4a
  • _transient_feed_d117b5738fbd35bd8c0391cda1f2b5d9

I will attach a 3-minute video demonstrating the issue.

#5 @lucasbustamante
3 years ago

I've hosted the video on Youtube, as it is bigger than the max 10MB allowed for an attachment here: https://www.youtube.com/watch?v=fRzGoIIWwnQ

#6 follow-up: @Mte90
3 years ago

Thanks, now it is clearer: the guilty transient is the _transient_feed_ one, which includes the whole feed.

This is the file that saves the feed in a transient: https://github.com/WordPress/wordpress-develop/blob/2382765afa36e10bf3c74420024ad4e85763a47c/src/wp-includes/class-wp-feed-cache-transient.php and this is the code that triggers saving it to the cache: https://github.com/WordPress/wordpress-develop/blob/32151af6e47e0fda4c7ac1796eb8ac8916817517/src/wp-includes/feed.php#L805

Thinking about this, I am wondering whether stripping the content from every transient generated by the feed support is a good thing. Other people may use it for other purposes, so one approach would be to drop the content only for specific feeds.

This can impact the dashboard widget, the RSS widget, and the RSS block.

Looking at these, only the block uses the excerpt from the RSS feed, so that content needs to be kept there, but it is not required in the others.

#7 in reply to: ↑ 6 @lucasbustamante
3 years ago

@Mte90

Nice findings! The biggest impact seems to come from the "Dashboard Widget", since it's rendered on the dashboard screen of the administration panel, while the "RSS Block" would probably only load the feed if you add an "RSS block" to a page and render it on the front-end.

By looking at this line https://github.com/WordPress/WordPress/blob/master/wp-admin/includes/dashboard.php#L1510, the feed comes from: https://wordpress.org/news/feed/

I agree with you: just removing the content from the feed that URL returns does not seem like a good solution, as it might break implementations that rely on that URL returning the feed with the content. The same applies to the current feed transient.

So I wonder if it would make sense to add a new endpoint, such as https://wordpress.org/news/feed?minimal, that would serve the feed without the content, to be stored as a separate transient feed_minimal for use everywhere except the "RSS block".

That way we keep the current URL behavior as it is, so as not to break any implementation that might rely on that URL returning the content, but we add a ?minimal parameter to fetch a smaller version of the feed for the places where the content is not necessary, such as the Dashboard Widget, which should probably be responsible for 99.9% of the hits to that URL.

Last edited 3 years ago by lucasbustamante

#8 follow-up: @Mte90
3 years ago

Seems a good solution, but it means we need to open a ticket at https://meta.trac.wordpress.org/ to make that change. Done this way it will be backward compatible, but all previous WordPress versions will keep using the other endpoint.

#9 in reply to: ↑ 8 @lucasbustamante
3 years ago

Replying to Mte90:

The "Dashboard Widget" also consumes this feed https://planet.wordpress.org/feed/ on line https://github.com/WordPress/WordPress/blob/master/wp-admin/includes/dashboard.php#L1543

What do you think about first mapping exactly the content used by the "Dashboard Widget", like:

  • News feed: Title, Link, Location, Date/Time, last 3 items only
  • Planet feed: Title, Link, last 5 items only

Once we know for sure exactly what the "Dashboard Widget" needs to work properly, we open a ticket at Meta to allow a URL to feed only the necessary information to that widget, to make it more efficient. The feed for the "Dashboard Widget" would probably need to be stored as a dedicated transient only for that widget, as opposed to the shared transient with the RSS feeds. I think this makes more sense, as the RSS widgets are far less used anyway.
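
As a rough illustration of what such a dedicated payload could look like, here is a Python sketch that projects parsed feed items down to the fields and item counts listed above. It is not WordPress or Meta code. The `WIDGET_NEEDS` table, the lowercase field names, the function names, and the assumption that items arrive newest-first are all hypothetical.

```python
import json

# Hypothetical per-feed requirements, following the list in this comment.
WIDGET_NEEDS = {
    "news": {"fields": ["title", "link", "location", "date"], "limit": 3},
    "planet": {"fields": ["title", "link"], "limit": 5},
}


def project_items(items, fields, limit):
    """Keep only the wanted fields of the first `limit` items (assumed newest-first)."""
    return [
        {field: item[field] for field in fields if field in item}
        for item in items[:limit]
    ]


def build_widget_payload(feeds):
    """Build the compact JSON string a dedicated widget cache entry could hold."""
    payload = {
        name: project_items(feeds.get(name, []), spec["fields"], spec["limit"])
        for name, spec in WIDGET_NEEDS.items()
    }
    return json.dumps(payload, separators=(",", ":"))


# Made-up input: each item carries a long description the widget never shows.
example_feeds = {
    "planet": [
        {
            "title": "Post %d" % n,
            "link": "https://example.org/post-%d/" % n,
            "description": "long excerpt " * 100,
        }
        for n in range(10)
    ],
}
print(build_widget_payload(example_feeds))
```

Storing only this projection under its own cache key would leave the shared feed cache untouched for consumers that still need the full content.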

Last edited 3 years ago by lucasbustamante

@lucasbustamante
3 years ago

#10 @lucasbustamante
3 years ago

I think we can narrow down the purpose of this ticket to improve the "Dashboard Widget" only.

The essential problem is that the "Dashboard Widget" was poorly implemented at the time because Meta lacked endpoints that provide only the data it needed. It ended up fetching ~700kb of information through the existing endpoints, instead of the Meta endpoints being tweaked to return only the necessary information.

Given that this widget is enabled by default on all WordPress installations, I think it deserves more love and to be made more efficient - I think that narrowing the scope of this ticket to improve only that Widget, making sure it consumes and caches from an efficient endpoint, might be a good effort.

#11 @lucasbustamante
3 years ago

It's also worth noting that the developer who implemented the "Dashboard Widget" probably made a conscious decision to use the existing RSS feed, knowing it would only need a small fraction of the data, in order to make one request and cache the data for both the "Dashboard Widget" and the "RSS Widget/Block".

However, given that the "RSS Widget/Block" is rarely used, this optimization brings way more harm than benefits, since it has to fetch the huge feed on every request. In that sense, the "Dashboard Widget" should have its own transient, to store only the information it needs.

#12 @Mte90
7 weeks ago

  • Focuses sustainability added

As this involves all websites, it can have an impact on the DB.

Maybe it makes sense to save just the HTML already generated in the transient.
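
A minimal Python sketch of that approach, for illustration only: render the list markup once and cache just that string with an expiry, similar in shape to the dash_v2_ transient quoted in comment 3. The in-memory `_cache` dict stands in for transient storage, and the function names, the cache key, and the 12-hour lifetime are assumptions made for this example.

```python
import time
from html import escape

_cache = {}  # stand-in for transient storage: key -> (expires_at, html)


def render_widget_html(items):
    """Render title/link pairs as the list markup the widget displays."""
    links = "".join(
        "<li><a class='rsswidget' href='{}'>{}</a></li>".format(
            escape(item["link"], quote=True), escape(item["title"])
        )
        for item in items
    )
    return '<div class="rss-widget"><ul>{}</ul></div>'.format(links)


def get_cached_html(key, fetch_items, ttl_seconds=12 * 3600, now=None):
    """Return cached HTML, or fetch and render once the entry is missing or expired."""
    now = time.time() if now is None else now
    entry = _cache.get(key)
    if entry is not None and entry[0] > now:
        return entry[1]
    html = render_widget_html(fetch_items())
    _cache[key] = (now + ttl_seconds, html)
    return html


def example_fetch():
    # Hypothetical fetcher; real code would download and parse the feed here.
    return [{"title": "Example post", "link": "https://example.org/post/"}]


print(get_cached_html("dashboard_news_html", example_fetch))
```

With this shape the full feed only exists in memory while rendering, and the stored value is a few hundred bytes of markup.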
