Make WordPress Core

Opened 13 years ago

Closed 6 years ago

#20973 closed task (blessed) (fixed)

date_i18n() produces invalid output for shorthand formats

Reported by: rarst's profile Rarst Owned by: pento's profile pento
Milestone: 5.1 Priority: normal
Severity: normal Version: 3.4
Component: Date/Time Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

date_i18n() function relies on parsing passed format to make adjustments. However shorthand formats are not handled and produce invalid output.

Example:

var_dump( date_i18n( 'Y-m-d\TH:i:sP', time() ) ); // 2012-06-15T13:34:03+03:00 << ok
var_dump( date_i18n( DATE_W3C, time() ) ); // 2012-06-15T13:34:03+03:00 << ok
var_dump( date_i18n( 'c', time() ) ); // 2012-06-15T13:34:03+00:00 << broken time zone!

Hook-level fix:

add_filter( 'date_i18n', 'fix_c_time_format', 10, 3 );

function fix_c_time_format( $date, $format, $timestamp ) {

    if ( 'c' == $format )
        $date = date_i18n( DATE_W3C, $timestamp );

    return $date;
}

See Why time functions show invalid time zone when using 'c' time format?

Possibly related (can't say for sure from description) #13538

Attachments (12)

functions.php.patch (2.9 KB) - added by raubvogel 12 years ago.
Fix for date_i18n() (new version)
functions.php.2.patch (2.8 KB) - added by raubvogel 12 years ago.
Fix for date_i18n() (new version 2)
Unpack_shorthand_date_formats.patch (667 bytes) - added by Rarst 12 years ago.
Unpack shorthand date formats in a form date_i18n() can understand
functions.php.3.patch (648 bytes) - added by raubvogel 12 years ago.
Same patch as Rarst’s but "optimized"
Unpack_shorthand_date_formats1.patch (744 bytes) - added by Rarst 12 years ago.
Changed to preg_replace to ignore escaped shorthands
functions.php.4.patch (7.4 KB) - added by raubvogel 12 years ago.
date_i18n() completely rewritten
tests.php (1.3 KB) - added by raubvogel 12 years ago.
Tests for date_i18n()
Unpack_shorthand_date_formats2.patch (749 bytes) - added by Rarst 7 years ago.
Refreshed patch with lookbehind regex that correctly matches at start of string.
20973-with-unit-tests.patch (1.4 KB) - added by pbearne 7 years ago.
patch with unit tests
Fixed shorthand date formats in date_i18n.patch.txt (2.1 KB) - added by Rarst 7 years ago.
Adjusted unit tests and added tests for handling escaping.
Fixed shorthand date formats in date_i18n.patch (3.3 KB) - added by Rarst 7 years ago.
Fixed whitespace, getting used to new workflow :)
Fixed_shorthand_date_format_test_.patch (715 bytes) - added by Rarst 6 years ago.

Download all attachments as: .zip

Change History (69)

#1 @toscho
13 years ago

  • Cc info@… added

#2 @Rarst
13 years ago

Actually scratch time() from example for more accurate:

var_dump( date_i18n( 'Y-m-d\TH:i:sP' ) );
var_dump( date_i18n( DATE_W3C ) );
var_dump( date_i18n( 'c' ) ); // broken time zone!

#3 @stephenh1988
13 years ago

  • Cc stephen@… added

#4 @Rarst
12 years ago

  • Cc contact@… added

Small correction - proper constant for c replacement is DATE_ISO8601, not DATE_W3C (different time zone formatting).

[update] Hm, nope DATE_W3C weirdly seems to be more consistent with c than DATE_ISO8601. While either time zone format is actually ok for standard.

Last edited 12 years ago by Rarst (previous) (diff)

#5 @mintindeed
12 years ago

  • Cc gabriel.koen@… added

#6 @Jayjdk
12 years ago

  • Cc kontakt@… added

#7 @raubvogel
12 years ago

  • Resolution set to duplicate
  • Status changed from new to closed

Duplicate of #25555.

#8 follow-up: @toscho
12 years ago

  • Resolution duplicate deleted
  • Status changed from closed to reopened

I think this is the bug for the general issue while get_comment_time('c') is just one function showing the side effects. So we should focus on date_i18n(), not on get_comment_time().

#9 in reply to: ↑ 8 @raubvogel
12 years ago

Replying to toscho:
You are absolutely right. Therefore I had provided a patch for date_i18n() in #25555. Needs of course validation, but should work now. (That’s why I closed #20973.)

Last edited 12 years ago by raubvogel (previous) (diff)

#10 follow-up: @helen
12 years ago

If #25555 is a duplicate, then it should be closed. Generally speaking, newer tickets are closed as the duplicates.

#11 @raubvogel
12 years ago

#25555 was marked as a duplicate.

#12 in reply to: ↑ 10 @raubvogel
12 years ago

Replying to helen:
Thanks for the advice!

#13 @raubvogel
12 years ago

  • Keywords has-patch added

#14 follow-up: @Rarst
12 years ago

Patch does not seem to fix the issue for me (var_dump( date_i18n( 'c' ) ); still produces invalid result). Could you elaborate a bit on it? I have trouble following what are you trying to do with time zone.

As per my description of the issue I think this is the problem with WP not recognizing shorthand time format, rather than with applying time zone (which works fine for other formats).

#15 follow-up: @Rarst
12 years ago

Also note date_timestamp_set() used in your patch is PHP >= 5.3.0, maybe more.

@raubvogel
12 years ago

Fix for date_i18n() (new version)

#16 in reply to: ↑ 14 @raubvogel
12 years ago

Replying to Rarst:
There seems to be another issue, but fixed it in a new patch. However

date_i18n( 'c', current_time( 'timestamp' ) )

already worked. Now

date_i18n( 'c' )

produced the same output. I’m afraid now that behaviour of date_i18n() has changed to much …

#17 in reply to: ↑ 15 @raubvogel
12 years ago

Replying to Rarst:
Thanks, You are right. In the new patch I use only functions that are PHP <= 5.2 (I hope).

#18 @raubvogel
12 years ago

What I have done in the patch: Wordpress calls date_timezone_set to the UTC/GMT globally. Therefore date() (which is used in date_i18n()) and similar functions don’t care about your time zone set in the Wordpress settings. My patch basically makes a call like date() but cares about the time zone (take a look at PHP’s DateTime class). While doing that no time zone conversations are done!

@raubvogel
12 years ago

Fix for date_i18n() (new version 2)

#19 follow-up: @Rarst
12 years ago

Good news, bad news... :)

Latest patch does fix timezone issue in c format. But only that.

It treats symptom rather than underlying issue. For example in another shorthand r names of day and month are not being correctly localized for same reasons:

var_dump( date_i18n( DATE_RFC2822 ) ); // Пт, 11 Окт 2013 23:04:43 +0300 << localized
var_dump( date_i18n( 'r' ) ); // Fri, 11 Oct 2013 23:04:43 +0000 << stays English

I think we need a patch that fixes all of this, rather than treating this case by case. Meaning digging up all shorthands WP doesn't understand and converting them to formats it does.

Last edited 12 years ago by Rarst (previous) (diff)

@Rarst
12 years ago

Unpack shorthand date formats in a form date_i18n() can understand

#20 in reply to: ↑ 19 ; follow-up: @raubvogel
12 years ago

Replying to Rarst:
Tested

var_dump( date_i18n( DATE_RFC2822 ) ); // string 'Sa, 12 Okt 2013 00:43:18 +0200' (length=30)
var_dump( date_i18n( 'r' ) ); // 'Sat, 12 Oct 2013 00:43:18 +0200' (length=31)

Worked with my patch. Cannot find an issue :-( Used functions.php.2.patch

Your patch is elegant, but covers only 'c', 'r'. There are many more cases http://www.php.net/manual/de/function.date.php, right?

Last edited 12 years ago by raubvogel (previous) (diff)

#21 in reply to: ↑ 20 ; follow-up: @Rarst
12 years ago

Replying to raubvogel:

Tested

var_dump( date_i18n( DATE_RFC2822 ) ); // string 'Sa, 12 Okt 2013 00:43:18 +0200' (length=30)
var_dump( date_i18n( 'r' ) ); // 'Sat, 12 Oct 2013 00:43:18 +0200' (length=31)

Worked with my patch. Cannot find an issue :-( Used functions.php.2.patch

Yes, the timezone works with your patch. See how day and month are different? This is because of same issue with parsing formats, which your patch doesn't address because it only tries to fix timezone.

Your patch is elegant, but covers only 'c', 'r'. There are many more cases http://www.php.net/manual/de/function.date.php, right?

Yes, however it seems other (not compound) formats are all (?) already supported by date_i18n(). As above I have two issues left here - is "unpacking" them good enough of a solution and doing bit of a testing to be sure which other formats might need to be handled.

#22 follow-up: @Rarst
12 years ago

I've looked through all the formats listed in date() documentation and hadn't noticed any other with format parsing issues.

#23 in reply to: ↑ 21 @raubvogel
12 years ago

Replying to Rarst:
Ok, You are absolutely right. Your patch should do the stuff we need. I will upload a patch based on Your’s which has a bit more performance.

#24 in reply to: ↑ 22 @raubvogel
12 years ago

Replying to Rarst:
I have taken a look too. And I think You are right – could not find any issues too.

@raubvogel
12 years ago

Same patch as Rarst’s but "optimized"

#25 follow-up: @Rarst
12 years ago

It's not a given that compound format is present in format input by itself. Can be part of it, thus my patch using replacement rather than comparison.

Also I forgot to account for it possibly being escaped and need to rewrite that...

@Rarst
12 years ago

Changed to preg_replace to ignore escaped shorthands

#26 in reply to: ↑ 25 @raubvogel
12 years ago

Replying to Rarst:
Oh, You are right, again :D I read the PHP doc wrong -.-

@raubvogel
12 years ago

date_i18n() completely rewritten

#27 @raubvogel
12 years ago

I have completely rewritten date_i18n() now. I did tests by

// day
var_dump( 'day: ' . date_i18n( '\d:d | \D:D | \j:j | \l:l | \N:N | \S:S | \w:w | \z:z' ) );
// week
var_dump( 'week: ' . date_i18n( '\W:W' ) );
// month
var_dump( 'month: ' . date_i18n( '\F:F | \m:m | \M:M | \n:n | \t:t' ) );
// year
var_dump( 'year: ' . date_i18n( '\L:L | \o:o | \Y:Y | \y:y' ) );
// time
var_dump( 'time: ' . date_i18n( '\a:a | \A:A | \B:B | \g:g | \G:G | \h:h | \H:H | \i:i | \s:s | \u:u' ) );
// timezone
var_dump( 'timezone: ' . date_i18n( '\e:e | \I:I | \O:O | \P:P | \T:T | \Z:Z' ) );
// full date/time
var_dump( 'full date/time: ' . date_i18n( '\c:c | \r:r | \U:U' ) );

// unix time
var_dump( 'unix time: ' . time() );

// escape test
var_dump( 'escape test: ' . date_i18n( 'D | \D | \\D | \\\D | \\\\D | \\\\\D | \\\\\\D' ) );
// escape test with PHP’s date()
var_dump( 'escape test date(): ' . date ( 'D | \D | \\D | \\\D | \\\\D | \\\\\D | \\\\\\D' ) );

The outputs are (German localization)

string 'day: d:14 | D:Mo | j:14 | l:Montag | N:1 | S:th | w:1 | z:286' (length=61)
string 'week: W:42' (length=10)
string 'month: F:Oktober | m:10 | M:Okt | n:10 | t:31' (length=45)
string 'year: L:0 | o:2013 | Y:2013 | y:13' (length=34)
string 'time: a:am | A:AM | B:012 | g:1 | G:1 | h:01 | H:01 | i:17 | s:40 | u:000000' (length=76)
string 'timezone: e:Europe/Berlin | I:1 | O:+0200 | P:+02:00 | T:CEST | Z:7200' (length=70)
string 'full date/time: c:2013-10-14T01:17:40+02:00 | r:Mo, 14 Okt 2013 01:17:40 +0200 | U:1381706260' (length=93)
string 'unix time: 1381706260' (length=21)
string 'escape test: Mo | D | D | \Mo | \Mo | \D | \D' (length=45)
string 'escape test date(): Sun | D | D | \Sun | \Sun | \D | \D' (length=55)

As you can see translation for 'S' (and 'e' too?) is still missing (needs extension of WP_Locale class). I think date_i18n() needs more testing. If you compare the patch with the old date_i18n(), you will see that there are no regular expressions any more. Some other improvements should boost performance too. By reviewing the the old date_i18n() I recognised that escaping of literals in the date format string was not handled properly (regex are wrong). Can some please check this too? What do you think about my patch?

#28 @raubvogel
12 years ago

It’s true: the unpatched date_i18n() does not handle escaped literals properly. Test

// escape test
var_dump( 'escape test: ' . date_i18n( 'F | \F | \\F | \\\F | \\\\F | \\\\\F | \\\\\\F' ) );
// escape test with PHP’s date()
var_dump( 'escape test date(): ' . date ( 'F | \F | \\F | \\\F | \\\\F | \\\\\F | \\\\\\F' ) );

results in (German localization)

string 'escape test: Oktober | F | F | \October | \October | \F | \F' (length=60)
string 'escape test date(): October | F | F | \October | \October | \F | \F' (length=67)

'F' is translated to "Oktober" but '
\F' is translated to "\October" (should be "\Oktober"). Therefore the unpatched date_i18n() has a further issue …

@Rarst: In Your patch

$dateformatstring = preg_replace( "/([^\\\])c/", "\\1" . DATE_W3C, $dateformatstring ); 
$dateformatstring = preg_replace( "/([^\\\])r/", "\\1" . DATE_RFC2822, $dateformatstring );

causes wrong escaping too (it is the same regex as in the original date_i18n).

#29 follow-up: @Rarst
12 years ago

I would suggest to open new ticket about slashing issues.

Complete rewrite of function is unlikely to be accepted as solution to smaller issues, without major review and testing by core developers.

#30 in reply to: ↑ 29 @raubvogel
12 years ago

  • Keywords dev-feedback added

Replying to Rarst:
I will mark the ticket as "dev-feedback". I do not exactly know the procedure to get a patch accepted. Therefore a trusted developer should review the ticket/patch, right?

Anyway, the original date_i18n() is not implemented very well – there are the mentioned bugs, the bad performance (regex), few comments and it’s not very readable. So rewriting this function is necessary in my opinion.

Someone really should review my code and fix possible issues. I will create further tests for my patch …

Edit: Fixing the slashes issues in original date_i18n() might be bad because one needs more complex regex (something like /(?<!\\)(?:\\\\)*D/ which has negative look-behind).

Last edited 12 years ago by raubvogel (previous) (diff)

@raubvogel
12 years ago

Tests for date_i18n()

#31 @raubvogel
12 years ago

I added a test for date_i18n(). Seems to be good so far.

I will create a further patch that will handle the case 'S' …

Edit: 'S' should not be translated. It’s just the English ordinal suffix (st, nd, rd or th) for the day of the month (simple translation is not enough, so it has to be implemented completely new. See http://en.wikipedia.org/wiki/Ordinal_indicator)

Last edited 12 years ago by raubvogel (previous) (diff)

#32 @raubvogel
12 years ago

  • Keywords needs-testing added
  • Version 3.4 deleted

#33 @raubvogel
12 years ago

  • Version set to 3.4

#34 @raubvogel
12 years ago

  • Severity changed from normal to major
  • Summary changed from date_i18n() produces invalid output for shorthand formats to date_i18n() is wrong for certain formats, escapes sequences and daylight saving time

Beside the format and escaping issue I found another issue maybe:

/*
* Test is called in time zone Europe/Berlin at a date where daylight saving time is off!
* (Daylight saving time in Europe/Berlin was switched off at 2013-10-27)
*/
// results 2013-10-29T22:15:03+01:00 while daylight saving time (Europe/Berlin) is off; result is correct
echo(date_i18n('Y-m-d\TH:i:sP', 1383084903));
// results 2013-10-11T19:37:20+01:00 while daylight saving time (Europe/Berlin) is on; result is wrong! should be +02:00!
echo(date_i18n('Y-m-d\TH:i:sP', 1381520240));

As you can see the time zone offset is sometimes wrong. Actually it depends on the current daylight saving time.

Is issue comes from wordpress\wp-includes\functions.php:127 in date_i18n():

$date_object = date_create( null, $timezone_object );

The date object created with the current time stamp ("null = now"). This is wrong. It has to be created with the time stamp from date_i18n() parameter $unixtimestamp.

For now we have 3 issues at date_i18n() which is annoying (at least for my German blogs). I set the severity of this ticket to "major" and changed the summary according the 3 issues.

I tested the daylight saving time issue with my patch. It worked! I think some core developer really should have a look at this ticket. And it would be nice if other people could test my patch. Thx!

#35 follow-up: @Rarst
12 years ago

Please stop inflating scope of this ticket. If there are more issues in the function - properly report as new tickets.

This here is about shorthand formats and just them.

#36 @Rarst
12 years ago

  • Severity changed from major to normal
  • Summary changed from date_i18n() is wrong for certain formats, escapes sequences and daylight saving time to date_i18n() produces invalid output for shorthand formats

#37 in reply to: ↑ 35 @raubvogel
12 years ago

Ok, so I inflated the list of tickets: new ticket is #25768.

#39 @smerriman
9 years ago

This bug has been around for at least 4 years now. Is there any chance at least the fix for 'c' could be added to core? Wanting to output an ISO 8601 date string for a post is very common (structured data, metadata, etc), and it's strange that in order to do so anywhere within WordPress, you need to use 'Y-m-d\TH:i:sP' instead of 'c' (and only after lots of debugging / searching, as the Wordpress documentation itself says to use 'c':

https://codex.wordpress.org/Formatting_Date_and_Time

#40 @ocean90
9 years ago

  • Keywords needs-unit-tests added

See also https://github.com/WordPress/twentyseventeen/issues/127

Unpack_shorthand_date_formats1.patch looks like a simple fix for this. But I think we should replace it with 'Y-m-d\TH:i:sP' and not with the value of the constants.

#41 @johnbillion
7 years ago

  • Keywords dev-feedback needs-testing removed
  • Milestone changed from Awaiting Review to 5.0
  • Owner set to johnbillion
  • Status changed from reopened to reviewing

#43 @Rarst
7 years ago

#43493 was marked as a duplicate.

@Rarst
7 years ago

Refreshed patch with lookbehind regex that correctly matches at start of string.

@pbearne
7 years ago

patch with unit tests

#44 @pbearne
7 years ago

  • Keywords has-unit-tests added; needs-unit-tests removed

@Rarst
7 years ago

Adjusted unit tests and added tests for handling escaping.

@Rarst
7 years ago

Fixed whitespace, getting used to new workflow :)

#45 @johnbillion
7 years ago

  • Resolution set to fixed
  • Status changed from reviewing to closed

In 43434:

Date/Time: Add support for the c and r shorthand formats in date_i18n().

Props Rarst, pbearne

Fixes #20973

#46 @johnbillion
7 years ago

  • Keywords needs-unit-tests added; has-patch has-unit-tests removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopening to improve the tests.

#47 @Rarst
7 years ago

The problem with merged test is that timezone change got lost in rewrite. Without WP set to non-UTC time zone the bug is invisible (because wrong output happens to be same as right output) and test doesn't test.

#48 @SergeyBiryukov
7 years ago

In 43562:

Tests: Avoid a race condition in test_date_i18n_handles_shorthand_formats() by using a delta for comparing timestamps.

See #20973, #38381.

#49 @johnbillion
7 years ago

  • Milestone changed from 5.0 to 5.1

#50 @pento
6 years ago

@johnbillion: Are you planning on writing more unit tests for this?

#51 @Rarst
6 years ago

I will take care of the test, just wandered away while 5.0 was happening and my stuff was on hold. :)

#52 @pento
6 years ago

Awesome, thanks @Rarst!

#53 @pento
6 years ago

  • Owner changed from johnbillion to Rarst
  • Status changed from reopened to assigned

#54 @pento
6 years ago

  • Type changed from defect (bug) to task (blessed)

Converting to a task, as it's just waiting for unit tests.

#55 @Rarst
6 years ago

  • Keywords has-patch has-unit-tests added; needs-unit-tests removed

#56 @pento
6 years ago

  • Owner changed from Rarst to pento

#57 @pento
6 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 44710:

Tests: Fix the tests for the c and r formats in date_i18n().

To test the date_i18n() output correctly, the tests added in [43434] need to set a non-UTC timezone.

Props Rarst.
Fixes #20973.

Note: See TracTickets for help on using tickets.