WordPress.org

Make WordPress Core

Opened 4 months ago

Last modified 2 weeks ago

#42254 assigned defect (bug)

Duplicate news entries in Events & News dashboard widget

Reported by: iandunn Owned by: Iceable
Milestone: 5.0 Priority: low
Severity: minor Version: 4.8
Component: Administration Keywords: good-first-bug has-patch has-unit-tests
Focuses: Cc:

Description

The first item in the News widget is always the latest post from the wordpress.org/news blog. The next 3 posts are selected from the planet.wordpress.org feed, which itself includes the wordpress.org/news blog.

This can lead to duplicate entries:

https://cldup.com/R67ghZqMRh.png

One potential solution is to search the list of Planet feed items, to check ifany of them match the current News item. If one does, it would be removed, so that the next oldest item takes its place.

Reported by @sergeybiryukov on Slack https://wordpress.slack.com/archives/C02RQBWTW/p1507073219000123

Attachments (5)

42254.diff (3.5 KB) - added by Iceable 4 months ago.
Replaces wp_widget_rss_output() for the dashboard's news widgets with its own source code, modified to detect and skip duplicate entries
42254.2.diff (6.1 KB) - added by Iceable 4 months ago.
42254.3.diff (110.2 KB) - added by birgire 4 months ago.
42254.4.diff (111.8 KB) - added by birgire 4 months ago.
42254.5.diff (111.3 KB) - added by Iceable 2 months ago.
Refreshed for 5.0

Download all attachments as: .zip

Change History (21)

#1 @birgire
4 months ago

Each feed is fetched and displayed separately with [src]:

foreach ( $feeds as $type => $args ) {
    $args['type'] = $type;
    echo '<div class="rss-widget">';
    wp_widget_rss_output( $args['url'], $args );
    echo "</div>";
}

where wp_widget_rss_output() contains the fetch_feed and the display part. See:

https://developer.wordpress.org/reference/functions/wp_widget_rss_output/

To be able to display unique feed items, it looks like one would have to skip the handy wp_widget_rss_output() for a custom version of it to handle the combined feed items and display.

Alternative is output buffering to mod the HTML - not keen on that one.


#2 @Mte90
4 months ago

Probably the best way is change the RSS that are used to avoid conflicts.

@Iceable
4 months ago

Replaces wp_widget_rss_output() for the dashboard's news widgets with its own source code, modified to detect and skip duplicate entries

#3 @Iceable
4 months ago

  • Keywords has-patch added; needs-patch removed

To reproduce the issue (and test a solution) when there is no duplicates in the last 3 posts from planet.wordpress.org, you can have it display more than 3 by editing line 1348.
https://core.trac.wordpress.org/browser/tags/4.8.2/src/wp-admin/includes/dashboard.php#L1348.

The output of this widget is cached, so one may also need to skip the cached output by commenting out lines 983 to 988 for testing.
https://core.trac.wordpress.org/browser/tags/4.8.2/src/wp-admin/includes/dashboard.php#L983

The patch above replaces wp_widget_rss_output() with its own source code as suggested by @birgire , modified to detect and skip duplicate entries.

(This is my very first patch so please bear with me and let me know if I'm doing anything wrong :) )

#4 @birgire
4 months ago

Thanks @Iceable and congratulation with your first patch.

If this is the way to go (as there might be other approaches, as mentioned by @Mte90 ) then I wonder if it would make sense to abstract the main part of the wp_widget_rss_output() into a function or functions to reduce the duplicated code? So wp_widget_rss_output() would be a wrapper. Or introduce some new input arguments to alter the output type of that function to make it easier to work with?

#5 @Iceable
4 months ago

Agreed on keeping it dry.

It would broaden the scope of this ticket, but would make sense if wp_widget_rss_output() itself was able to avoid duplicates.

I'd say introducing this ability directly in this function with a new argument sounds good - if we can do it without affecting other areas where it is used.

I just realised wp_widget_rss_output() only takes one feed at a time (and is currently called once for each feed).
So abstracting the main part to reduce duplicate code and creating a new function to display the result from multiple feeds without duplicates sounds good.

Changing the feeds in use would easily get rid of the issue without affecting anything else, but why just circumvent the issue if we can effectively improve and future-proof the function that handles this?

Last edited 4 months ago by Iceable (previous) (diff)

@Iceable
4 months ago

#6 @Iceable
4 months ago

42254.2.diff moves the main part of wp_widget_rss_output() into a new function called wp_widget_rss_get_entries() which returns the feed's entries in an array.

The dashboard widget uses the new wp_widget_rss_get_entries() function to get the entries from the two feeds and only handles the output with custom code to skip duplicates.

Dry-ier than the first patch and wp_widget_rss_output() 's behavior and output should not be affected, though it might need testing if this is decided to be the way to go. (See notes in ticket:42254#comment:3 for testing)

Last edited 4 months ago by Iceable (previous) (diff)

@birgire
4 months ago

#7 @birgire
4 months ago

  • Keywords has-unit-tests added

@Iceable Here's a first pass for adding tests.

In 42254.3.diff

  • added a test for wp_widget_rss_get_entries().
  • added a test for wp_widget_rss_output().
  • added a test for wp_dashboard_primary_output() to make sure there are no duplicated entries.
  • used local test feed (xml) files. Trimmed the Planet feed down to 10 items.
  • changed wp_widget_rss_get_entries() to support the 'items' argument.
  • changed wp_dashboard_primary_output() to check for an empty entries array.
  • changed wp_widget_rss_output() to check for an empty entries array.
  • changed __() in one place, as recommended by phpcs.
  • fixed some indices in wp_widget_rss_get_entries().

There are no tests for these dashboard functions, so I created a new test class for the dashboard.php include.

There also seems to be no tests for the core fetch_feed() function, so I tried to find a way to let it use local xml files, instead of running external feed requests.

The fetch_feed() is a wrapper for a SimplePie feed object. Instead of trying to mock the SimplePie class, I used the wp_feed_options hook to let the object use raw data, via the SimplePie::set_raw_data() method.

See e.g. http://simplepie.org/api/class-SimplePie.html

The SimplePie::set_feed_url() method overrides any raw data and fetch_feed() calls it with $feed->set_feed_url( $url ). Note that we can't use $feed->set_feed_url( null ), because it will set the feed url as http://?#, thus overriding the raw data. So the workaround here was to directly null the public feed_url property with $feed->feed_url = null. Then I was able to load the local test xml files via:

$feed->set_raw_data( file_get_contents( DIR_TESTDATA . '/feeds/news/' . $xml_files[ $url ] ) );

where we disable caching with $feed->enable_cache( false );

I considered adding the xml files into a method, to get rid of file_get_contents(), but this way made the test class much more readable, not having bunch of XML data in there.

I also made some adjustments to wp_widget_rss_get_entries(), so it supports the items argument, instead of returning all feed entries (they could be many).

Instead I added a margin of 5 entries (just an example) within wp_dashboard_primary_output(), to be able to handle possible duplicated entries.

I trimmed the Planet feed test file to include 10 items. (might clean it better for non-ascii?)

The Planet test feed (first 5 items):

Dev Blog: 2017 WordPress Survey and WordCamp US
https://wordpress.org/news/2017/10/2017-wordpress-survey-and-wordcamp-us/

WPTavern: WordPress 4.9 Will Support Shortcodes and Embedded Media in the Text Widget	
https://wptavern.com/wordpress-4-9-will-support-shortcodes-and-embedded-media-in-the-text-widget
	
WPTavern: WPWeekly Episode 292 Recap of WooConf and CaboPress
https://wptavern.com/wpweekly-episode-292-recap-of-wooconf-and-cabopress
	
WPTavern: Goodnight Firebug
https://wptavern.com/goodnight-firebug
	
WPTavern: WordPress 4.9 Beta 4 Removes Try Gutenberg Call to Action
https://wptavern.com/wordpress-4-9-beta-4-removes-try-gutenberg-call-to-action


The News test feed (first 5 items):

2017 WordPress Survey and WordCamp US
https://wordpress.org/news/2017/10/2017-wordpress-survey-and-wordcamp-us/
	
WordPress 4.9 Beta 4
https://wordpress.org/news/2017/10/wordpress-4-9-beta-4/
	
WordPress 4.9 Beta 3
https://wordpress.org/news/2017/10/wordpress-4-9-beta-3/

WordPress 4.9 Beta 2
https://wordpress.org/news/2017/10/wordpress-4-9-beta-2/

WordPress 4.9 Beta 1
https://wordpress.org/news/2017/10/wordpress-4-9-beta-1/


We note that the first item is (link) duplicated, i.e. contained in both feeds.

We want to avoid displaying duplicated items in wp_dashboard_primary_output().

So the expected output of wp_dashboard_primary_output() for (1 News item + 3 Planet items), is to contain:

2017 WordPress Survey and WordCamp US
https://wordpress.org/news/2017/10/2017-wordpress-survey-and-wordcamp-us/

WPTavern: WordPress 4.9 Will Support Shortcodes and Embedded Media in the Text Widget	
https://wptavern.com/wordpress-4-9-will-support-shortcodes-and-embedded-media-in-the-text-widget
	
WPTavern: WPWeekly Episode 292 Recap of WooConf and CaboPress
https://wptavern.com/wpweekly-episode-292-recap-of-wooconf-and-cabopress
	
WPTavern: Goodnight Firebug
https://wptavern.com/goodnight-firebug

instead of:

2017 WordPress Survey and WordCamp US
https://wordpress.org/news/2017/10/2017-wordpress-survey-and-wordcamp-us/

Dev Blog: 2017 WordPress Survey and WordCamp US
https://wordpress.org/news/2017/10/2017-wordpress-survey-and-wordcamp-us/

WPTavern: WordPress 4.9 Will Support Shortcodes and Embedded Media in the Text Widget	
https://wptavern.com/wordpress-4-9-will-support-shortcodes-and-embedded-media-in-the-text-widget
	
WPTavern: WPWeekly Episode 292 Recap of WooConf and CaboPress
https://wptavern.com/wpweekly-episode-292-recap-of-wooconf-and-cabopress
	

The SimplePie also supports fetching multiple feeds, thus also fetch_feed(), but I'm not yet convinced that this would be the way to go here.

There's a link in the inline docs for fetch_feed() for:

http://simplepie.org/wiki/faq/typical_multifeed_gotchas

ps: this link is not parsed correctly in the DevHub, I've reported it recently here on Meta trac:

Last edited 4 months ago by birgire (previous) (diff)

#8 @Iceable
4 months ago

@birgire

The SimplePie also supports fetching multiple feeds, thus also fetch_feed(), but I'm not yet convinced that this would be the way to go here.

I'm not convinced either: merging feeds would not allow to get the desired number of items from each (1 from wordpress.org/news and 3 from planet.wordpress.org). Or would it?


42254.3.diff looks good to me, just one tiny error in wp-includes/widgets.php on line 1392:
$rss->get_error_message() should be $rss_entries->get_error_message() instead:

if ( is_wp_error( $rss_entries ) ) {
	if ( is_admin() || current_user_can( 'manage_options' ) ) {
		echo '<p><strong>' . __( 'RSS Error:' ) . '</strong> ' . $rss_entries->get_error_message() . '</p>';
	}
	return;
}

The margin of 5 entries seems a bit arbitrary indeed, but now thinking about it:

  • It is better than my previous patch that just grabbed as many as available; which was of course much more than necessary.
  • The margin of 5 - when actually just 1 should be enough - allows some room in case the $feeds and 'items' argument come to change in the future. In such case I doubt we'll ever need more than 5.

Maybe we could even go as far as to calculate the necessary offset based on the 'items' arguments in $feeds? (Would it be useful to go this far?)

I only gave it a quick look so I'm not uploading another patch right now. I may have a bit more time for another look over the week end. Not sure I'll have much more to add though. :)

To be honest I am not very experienced with unit tests yet so I don't have much of a feedback about these except that they look pretty fine to me. Just one other small thing though: I think includes-dashboard.php should be named includesDashboard.php to be consistent with the other files in tests/phpunit/tests/admin/

Last edited 4 months ago by Iceable (previous) (diff)

#9 @birgire
4 months ago

Thanks for the feedback @Iceable

Good catch, regarding the $rss_entries->get_error_message().

Regarding the offset, I wrongly recalled from memory, that both items counts were filterable, but I checked again and it's only true for the second feed, via the dashboard_secondary_items filter. The first one is fixed as 1.

So I think we could just use:

$args['items'] = $args['items'] + 1; 

as you mentioned ... but if both were filterable, then we could have used e.g.:

min( feed_1_items_count, feed_2_items_count )

instead of the compromise of $args['items'] = $args['items'] + 5;

The WordPress Code Standard via phpcs suggested using lower-case.php kind of file name
and I noticed the same for files in e.g. tests/phpunit/tests/customize/. I will have to look further into that.

Regarding the multi-feed feature, there are also some gotchas and we also don't want to merge the items.

#10 @Iceable
4 months ago

Thinking about it, the necessary offset to handle possible duplicates is actually the number of items already displayed. Since we keep them in an array already to check for duplicate, we can conveniently count them.
So basically:

$args['items'] = $args['items'] + count( $displayed_links );

This will also work in case the 'items' args values change or if another feed is added in the future. I'm not sure this will ever happen, but I like to make things as flexible as possible :)

You are absolutely right about the filename, my bad. Files with camelCase names seem much older, and those named with snake-case seem to be the most recent ones.
And conforming to the standards > trying to be consistent with old conventions!

Last edited 4 months ago by Iceable (previous) (diff)

@birgire
4 months ago

#11 @birgire
4 months ago

@Iceable I agree, great suggestion using the count( $displayed_links ) to handle the offset for each feed.

My previous suggestion of using the offset of 1, was meant for the second feed, not both, as one could easily read it ;-)

I updated 42254.4.diff with $rss_entries->get_error_message() and count( $displayed_links ).

I also made the test_wp_dashboard_primary_output_should_not_have_duplicated_entries more explicit.

#12 @Iceable
2 months ago

  • Keywords needs-refresh added

The latest patch no longer applies correctly, I'll be happy to refresh it if this ticket gets some attention :)

This ticket was mentioned in Slack in #core by iceable. View the logs.


2 months ago

#14 @SergeyBiryukov
2 months ago

  • Milestone changed from Future Release to 5.0

@Iceable
2 months ago

Refreshed for 5.0

#15 @Iceable
2 months ago

  • Keywords needs-refresh removed

#16 @DrewAPicture
2 weeks ago

  • Owner set to Iceable
  • Status changed from new to assigned

Assigning to mark the good-first-bug as "claimed".

Note: See TracTickets for help on using tickets.