Opened 5 weeks ago
Last modified 13 days ago
#62029 new defect (bug)
Do not bloat Debug Log with Scraping home page… when auto upgrade is OK
Reported by: | georgwordpress | Owned by: | |
---|---|---|---|
Milestone: | 6.7 | Priority: | normal |
Severity: | normal | Version: | |
Component: | Upgrade/Install | Keywords: | has-patch |
Focuses: | Cc: |
Description
Since wp 6.6 with every automatic plugin update/upgrade the WP Debug log is bloated with
“scraping home page… “
followed by the whole html body text.
[29-Aug-2024 22:24:01 UTC] Automatic updates starting... [29-Aug-2024 22:24:02 UTC] Automatic plugin updates starting... [29-Aug-2024 22:24:02 UTC] Upgrading plugin 'advanced-custom-fields-pro'... [29-Aug-2024 22:24:05 UTC] Plugin 'advanced-custom-fields-pro' has been upgraded. [29-Aug-2024 22:24:07 UTC] Scraping home page... [29-Aug-2024 22:24:07 UTC] '<!doctype html> <html lang="de"> <head> ... [29-Aug-2024 22:24:07 UTC] The update for 'advanced-custom-fields-pro' has no fatal errors. [29-Aug-2024 22:24:07 UTC] Automatic plugin updates complete. [29-Aug-2024 22:24:07 UTC] Automatic updates complete.
As a solution 2 Options are possible:
Option 1:
Do not fill Debug Log with html body when there is no error.
Option 2:
Introduce a new filter to handle scraping home page output in debug log.
Attachments (1)
Change History (17)
#2
@
5 weeks ago
- Component changed from General to Upgrade/Install
- Milestone changed from Awaiting Review to 6.7
This ticket was mentioned in PR #7359 on WordPress/wordpress-develop by mslinnea.
4 weeks ago
#4
- Keywords has-patch added
Simplifies debug code to output "true" and not the entire body html on failure.
#5
@
4 weeks ago
Tweaked this a bit at Contributor Day at WCUS 24, and we kind of found out a way to replicate this.
One way to do is if you have a template redirect to lets say another page/website, and it will log the html in the debug log.
However another thing we found is that when you update two plugins that have fatal errors, it enables the maintenance mode, and the html content of the maintenance mode is being logged in the error log.
I'll try to test it more so I can find a stable way to replicate it. It may be possible that the checks for enabling/disabling maintenance mode in load.php may not be properly getting the state of the maintenance mode.
I've managed to replicate it on my setup, however @afragen was not able to replicate it in his setup. I think this one needs more testing.
@georgwordpress would it be possible to give more details on your applications and more steps to replicate that issue.
I've also had a look at the existing PR https://github.com/WordPress/wordpress-develop/pull/7359 and I think it needs rework.
#6
@
3 weeks ago
My report was not complete:
Although no errors were detected and updates of all plugins were successful, during every plugin update the body of the maintenance page is written to the log.
I'll put together more information later.
Update:
Yes - the log is at another place - wp-config.php:
define( 'WP_DEBUG', true); #define( 'WP_DEBUG_LOG', true ); define( 'WP_DEBUG_LOG', ABSPATH .'my_error.log' ); define( 'WP_DEBUG_DISPLAY', false); @ini_set( 'display_errors', 0 );
Update 2:
Changed to standard log
define( 'WP_DEBUG_LOG', true );
=> Result is the same.
This is not a standard theme.
Next I will test another wp installation ...
OK - also with a standard cheme - the same logging.
But I have a Idea - checking ...
#7
@
3 weeks ago
I thought I would get a quicker result if I continued to strip down the existing installation.
But so far no improvement.
- deactivate all plugins
- standard wp log file
- no customized maintenance
I keep only one plugin activated - a plugin which can manual trigger the cron event wp_version_check.
So - first I changed the version number of this active plugin to a lower number.
After the backend reported the availability of a new version I trigger the cron event wp_version_check.
The Result:
The Plugin is updated without Errors.
But the full html body of the maintenance screen is scraped into the log.
I also changed to another cron plugin - but result was the same.
Then I tried another trigger with a small php file in root directory:
require( dirname(__FILE__) . '/wp-load.php' ); wp_maybe_auto_update();
Now the result was:
[19-Sep-2024 06:50:50 UTC] Automatic updates starting... [19-Sep-2024 06:50:50 UTC] Automatic plugin updates starting... [19-Sep-2024 06:50:50 UTC] Upgrading plugin 'leira-cron-jobs'... [19-Sep-2024 06:50:52 UTC] Plugin 'leira-cron-jobs' has been upgraded. [19-Sep-2024 06:50:52 UTC] 'leira-cron-jobs/leira-cron-jobs.php' is inactive and will not be checked for fatal errors. [19-Sep-2024 06:50:54 UTC] Scraping home page... [19-Sep-2024 06:50:54 UTC] '###### wp_scraping_result_start:45c085e5500d698b70ea24fc53edde59 ###### true ###### wp_scraping_result_end:45c085e5500d698b70ea24fc53edde59 ###### ' [19-Sep-2024 06:50:54 UTC] The update for 'leira-cron-jobs' has no fatal errors. [19-Sep-2024 06:50:54 UTC] Automatic plugin updates complete. [19-Sep-2024 06:50:54 UTC] Automatic updates complete.
No scraping of full html body - but during the update the plugin becomes deactivated (see log)?
And becomes not activited again after the end of the update process?
What is the best way for manual testing and forcing automatic updates?
#8
@
3 weeks ago
@stoyangeorgiev:
".. if you have a template redirect to lets say another page/website .."
Yes - that is the reason in my use case.
My htaccess allows only some query arguments - other queries are redirected to the same url without query arguments.
After adding in my htaccess file a new condition before rewriting:
RewriteCond %{QUERY_STRING} !^wp_scrape_key=
the wp log is now as expected after a successful update.
#9
@
3 weeks ago
And now I had seen the Rollback Auto Update function in action:
I build a plugin with a unkown function => fatal error in the plugin init function.
Now WP detect this buggy plugin update and made a rollback.
So - the main issue is not the bloated debug log.
If in some wp installations the Loopback request is redirected - there will be no error detection of the updated buggy plugin.
Or the other way round:
if the automatic wp updater did not find his needle ('###### wp_scraping_result_start:') then this is only the result of the redirected loopback request - in this constellation it does not matter whether the update is buggy or not.
As @stoyangeorgiev already mentioned:
The PR is missleading https://github.com/WordPress/wordpress-develop/pull/7359/files
This ticket was mentioned in PR #7435 on WordPress/wordpress-develop by @georgwordpress.
2 weeks ago
#10
improve logging when wp_scraping_result is not found
Trac ticket: https://core.trac.wordpress.org/ticket/62029
This ticket was mentioned in PR #7436 on WordPress/wordpress-develop by @georgwordpress.
2 weeks ago
#11
improve logging when wp_scraping_result is not found
Trac ticket: https://core.trac.wordpress.org/ticket/62029
@georgwordpress commented on PR #7359:
2 weeks ago
#12
@mslinnea Thank You for the PR.
The problem with the bloated log would be solved with this PR.
But I think the intention of the original WP code was to output the JSON object between the two needles.
In the event of an error, the JSON object contains detailed information about the last error that occurred - so it can be very helpful. This information is lost through this PR.
The original WP code, on the other hand, has problems if the wp_scraping_result is not there at all.
But missing the scraping result does not mean that an update caused an error.
I have provided my own PR.
However, I have no experience with the workflow on Wordpress.org.
How to continue if two PRs exist?
2 weeks ago
#13
@Georg-Git I'll close this in favor of your change. Thanks for the additional context!
Closed in favor of https://github.com/WordPress/wordpress-develop/pull/7436
#14
@
2 weeks ago
To reproduce the bug:
1.) Testenvironment:
1.1) I prepared a simple plugin using the action 'template_redirect'
to redirect any request with queries to the same URL without queries - if no user is logged in.
(see attached file to this ticket)
1.2) And I used another plugin to start the standard wp cron job which is doing the wp autoupdate of the plugin:
https://de.wordpress.org/plugins/leira-cron-jobs/
2.) Testing:
2.1) Reduce the version of the Cron Jobs plugin by editing the header in plugin file.
2.2) Open in wp backend the plugins page. When a update of the Cron Jobs plugin is indicated:
2.3) In the cron job plugin I started the cron job 'wp_version_check' manual
2.4) after the update of the plugin => check wp debug log.
#15
@
2 weeks ago
What makes the reproducibility of problems with the automatic updater so difficult:
The current cookies are sent with the loopback request.
When detecting errors after editing PHP files by the user this makes sense.
But the automatic update of plugins and its error detection is about a process that is started in the background by a cron job.
Usually anonymous - so the loopback request on the homepage would also be executed as an anonymous user.
However, if it is a standard WordPress installation, wp-cron.php is called by every user request.
In this case, the has_fatal_error() function of the wp-automatic-updater and it's loopback request uses the cookies from the user request that triggered the cron job.
As a result, WordPress always behaves slightly differently with this loopback request, depending on user rights - and in an other way than accessed anonymously.
The test sequence described above only works if the cron job is triggered by an anonymous user or a server cron job.
Some plugins can also trigger a cron job anonymously, like the Plugin 'Cron Jobs' described above.
#16
@
13 days ago
Summup:
The detection of errors when auto-updating of plugins is based on the analysis of a loopback request to the homepage.
This loopback request is carried out with special parameters so that when the homepage is requested, WordPress not only generates the normal HTML text, but also appends the following Ascii text to the end of the body:
###### wp_scraping_result_start:e10adc3949ba59abbe56e057f20f883e ###### true ###### wp_scraping_result_end:e10adc3949ba59abbe56e057f20f883e ######
(see wp-includes/load.php)
You will find the scraping result in the middle of two needles:
- 'true' if no error occurred
- a JSON object with the result of the error_get_last() function
The result of the loopback request is then checked by the has_fatal_error() function.
(see wp-admin/includes/class-wp-automatic-updater.php)
The function currently only checks one special case:
is_wp_error( $response ) then 'true' is returned.
If not, the following code for logging relies on the needle 'wp_scraping_result_start' being present in the response.
The absence of this needle therefore results in the entire html body of the loopback request being written to the log.
The complete absence of this needle is not further documented in the log.
Further on the return code from has_fatal_error() then only depends on whether the key 'type' is present in the scraping result. If the scraping result is completely missing, this key is of course not present and has_fatal_error() therefore returns false.
While testing, @stoyangeorgiev and I discovered that there may be some WP setups possible where the loopback request does not contain a scraping result:
1.) redirects to other servers or redirects with removal of query parameters
ToDo:
1.1) has_fatal_error() must recognize if scraping needle is missing and document it in the log
1.2) the WP Health Check could be optimized:
In order to detect errors when auto-updating plugins, not only a loopback request must work in general, but the loopback with the special scraping parameters must contain a scraping result as a response in the body.
If not, the auto-update function cannot detect errors.
Additionally, @stoyangeorgiev identified another possible bug during testing:
2.) "when you update two plugins that have fatal errors"
As result the log is also cluttered up because of point 1.1, but much more important it is an indication that error detection when updating multiple plugins can still lead to problems.
There is already a special note about this in the source code of the function update():
/* * Avoids a race condition when there are 2 sequential plugins that have * fatal errors. It seems a slight delay is required for the loopback to * use the updated plugin code in the request. This can cause the second * plugin's fatal error checking to be inaccurate, and may also affect * subsequent plugin checks. */ sleep(2);
Sounds like there is some room for improvement.
How to proceed?
On point 1.1) The present PR fixes the bug that causes the log to clutter up.
On point 1.2) I can open a new ticket: Enhancement of Health Check
Regarding point 2.) problems when updating several plugins, I would open a new ticket for further investigations.
@stoyangeorgiev what do you think?
Hi there, welcome to WordPress Trac! Thanks for the ticket.
Looking at [58128] / #58281, it appears that the intention was to output
true
in the error log in case of no fatal errors:However, the current code can indeed output the whole HTML body if
###### wp_scraping_result_start:
is not found. Moving to 6.7 for investigation.