WordPress.org

Make WordPress Core

Opened 3 years ago

Last modified 16 months ago

#20973 reopened defect (bug)

date_i18n() produces invalid output for shorthand formats

Reported by: Rarst Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 3.4
Component: Date/Time Keywords: has-patch dev-feedback needs-testing
Focuses: Cc:

Description

date_i18n() function relies on parsing passed format to make adjustments. However shorthand formats are not handled and produce invalid output.

Example:

var_dump( date_i18n( 'Y-m-d\TH:i:sP', time() ) ); // 2012-06-15T13:34:03+03:00 << ok
var_dump( date_i18n( DATE_W3C, time() ) ); // 2012-06-15T13:34:03+03:00 << ok
var_dump( date_i18n( 'c', time() ) ); // 2012-06-15T13:34:03+00:00 << broken time zone!

Hook-level fix:

add_filter( 'date_i18n', 'fix_c_time_format', 10, 3 );

function fix_c_time_format( $date, $format, $timestamp ) {

    if ( 'c' == $format )
        $date = date_i18n( DATE_W3C, $timestamp );

    return $date;
}

See Why time functions show invalid time zone when using 'c' time format?

Possibly related (can't say for sure from description) #13538

Attachments (7)

functions.php.patch (2.9 KB) - added by raubvogel 17 months ago.
Fix for date_i18n() (new version)
functions.php.2.patch (2.8 KB) - added by raubvogel 17 months ago.
Fix for date_i18n() (new version 2)
Unpack_shorthand_date_formats.patch (667 bytes) - added by Rarst 17 months ago.
Unpack shorthand date formats in a form date_i18n() can understand
functions.php.3.patch (648 bytes) - added by raubvogel 17 months ago.
Same patch as Rarst’s but "optimized"
Unpack_shorthand_date_formats1.patch (744 bytes) - added by Rarst 17 months ago.
Changed to preg_replace to ignore escaped shorthands
functions.php.4.patch (7.4 KB) - added by raubvogel 17 months ago.
date_i18n() completely rewritten
tests.php (1.3 KB) - added by raubvogel 17 months ago.
Tests for date_i18n()

Download all attachments as: .zip

Change History (44)

comment:1 @toscho3 years ago

  • Cc info@… added

comment:2 @Rarst3 years ago

Actually scratch time() from example for more accurate:

var_dump( date_i18n( 'Y-m-d\TH:i:sP' ) );
var_dump( date_i18n( DATE_W3C ) );
var_dump( date_i18n( 'c' ) ); // broken time zone!

comment:3 @stephenh19883 years ago

  • Cc stephen@… added

comment:4 @Rarst2 years ago

  • Cc contact@… added

Small correction - proper constant for c replacement is DATE_ISO8601, not DATE_W3C (different time zone formatting).

Version 0, edited 2 years ago by Rarst (next)

comment:5 @mintindeed2 years ago

  • Cc gabriel.koen@… added

comment:6 @Jayjdk20 months ago

  • Cc kontakt@… added

comment:7 @raubvogel17 months ago

  • Resolution set to duplicate
  • Status changed from new to closed

Duplicate of #25555.

comment:8 follow-up: @toscho17 months ago

  • Resolution duplicate deleted
  • Status changed from closed to reopened

I think this is the bug for the general issue while get_comment_time('c') is just one function showing the side effects. So we should focus on date_i18n(), not on get_comment_time().

comment:9 in reply to: ↑ 8 @raubvogel17 months ago

Replying to toscho:
You are absolutely right. Therefore I had provided a patch for date_i18n() in #25555. Needs of course validation, but should work now. (That’s why I closed #20973.)

Last edited 17 months ago by raubvogel (previous) (diff)

comment:10 follow-up: @helen17 months ago

If #25555 is a duplicate, then it should be closed. Generally speaking, newer tickets are closed as the duplicates.

comment:11 @raubvogel17 months ago

#25555 was marked as a duplicate.

comment:12 in reply to: ↑ 10 @raubvogel17 months ago

Replying to helen:
Thanks for the advice!

comment:13 @raubvogel17 months ago

  • Keywords has-patch added

comment:14 follow-up: @Rarst17 months ago

Patch does not seem to fix the issue for me (var_dump( date_i18n( 'c' ) ); still produces invalid result). Could you elaborate a bit on it? I have trouble following what are you trying to do with time zone.

As per my description of the issue I think this is the problem with WP not recognizing shorthand time format, rather than with applying time zone (which works fine for other formats).

comment:15 follow-up: @Rarst17 months ago

Also note date_timestamp_set() used in your patch is PHP >= 5.3.0, maybe more.

@raubvogel17 months ago

Fix for date_i18n() (new version)

comment:16 in reply to: ↑ 14 @raubvogel17 months ago

Replying to Rarst:
There seems to be another issue, but fixed it in a new patch. However

date_i18n( 'c', current_time( 'timestamp' ) )

already worked. Now

date_i18n( 'c' )

produced the same output. I’m afraid now that behaviour of date_i18n() has changed to much …

comment:17 in reply to: ↑ 15 @raubvogel17 months ago

Replying to Rarst:
Thanks, You are right. In the new patch I use only functions that are PHP <= 5.2 (I hope).

comment:18 @raubvogel17 months ago

What I have done in the patch: Wordpress calls date_timezone_set to the UTC/GMT globally. Therefore date() (which is used in date_i18n()) and similar functions don’t care about your time zone set in the Wordpress settings. My patch basically makes a call like date() but cares about the time zone (take a look at PHP’s DateTime class). While doing that no time zone conversations are done!

@raubvogel17 months ago

Fix for date_i18n() (new version 2)

comment:19 follow-up: @Rarst17 months ago

Good news, bad news... :)

Latest patch does fix timezone issue in c format. But only that.

It treats symptom rather than underlying issue. For example in another shorthand r names of day and month are not being correctly localized for same reasons:

var_dump( date_i18n( DATE_RFC2822 ) ); // Пт, 11 Окт 2013 23:04:43 +0300 << localized
var_dump( date_i18n( 'r' ) ); // Fri, 11 Oct 2013 23:04:43 +0000 << stays English

I think we need a patch that fixes all of this, rather than treating this case by case. Meaning digging up all shorthands WP doesn't understand and converting them to formats it does.

Last edited 17 months ago by Rarst (previous) (diff)

@Rarst17 months ago

Unpack shorthand date formats in a form date_i18n() can understand

comment:20 in reply to: ↑ 19 ; follow-up: @raubvogel17 months ago

Replying to Rarst:
Tested

var_dump( date_i18n( DATE_RFC2822 ) ); // string 'Sa, 12 Okt 2013 00:43:18 +0200' (length=30)
var_dump( date_i18n( 'r' ) ); // 'Sat, 12 Oct 2013 00:43:18 +0200' (length=31)

Worked with my patch. Cannot find an issue :-( Used functions.php.2.patch

Your patch is elegant, but covers only 'c', 'r'. There are many more cases http://www.php.net/manual/de/function.date.php, right?

Last edited 17 months ago by raubvogel (previous) (diff)

comment:21 in reply to: ↑ 20 ; follow-up: @Rarst17 months ago

Replying to raubvogel:

Tested

var_dump( date_i18n( DATE_RFC2822 ) ); // string 'Sa, 12 Okt 2013 00:43:18 +0200' (length=30)
var_dump( date_i18n( 'r' ) ); // 'Sat, 12 Oct 2013 00:43:18 +0200' (length=31)

Worked with my patch. Cannot find an issue :-( Used functions.php.2.patch

Yes, the timezone works with your patch. See how day and month are different? This is because of same issue with parsing formats, which your patch doesn't address because it only tries to fix timezone.

Your patch is elegant, but covers only 'c', 'r'. There are many more cases http://www.php.net/manual/de/function.date.php, right?

Yes, however it seems other (not compound) formats are all (?) already supported by date_i18n(). As above I have two issues left here - is "unpacking" them good enough of a solution and doing bit of a testing to be sure which other formats might need to be handled.

comment:22 follow-up: @Rarst17 months ago

I've looked through all the formats listed in date() documentation and hadn't noticed any other with format parsing issues.

comment:23 in reply to: ↑ 21 @raubvogel17 months ago

Replying to Rarst:
Ok, You are absolutely right. Your patch should do the stuff we need. I will upload a patch based on Your’s which has a bit more performance.

comment:24 in reply to: ↑ 22 @raubvogel17 months ago

Replying to Rarst:
I have taken a look too. And I think You are right – could not find any issues too.

@raubvogel17 months ago

Same patch as Rarst’s but "optimized"

comment:25 follow-up: @Rarst17 months ago

It's not a given that compound format is present in format input by itself. Can be part of it, thus my patch using replacement rather than comparison.

Also I forgot to account for it possibly being escaped and need to rewrite that...

@Rarst17 months ago

Changed to preg_replace to ignore escaped shorthands

comment:26 in reply to: ↑ 25 @raubvogel17 months ago

Replying to Rarst:
Oh, You are right, again :D I read the PHP doc wrong -.-

@raubvogel17 months ago

date_i18n() completely rewritten

comment:27 @raubvogel17 months ago

I have completely rewritten date_i18n() now. I did tests by

// day
var_dump( 'day: ' . date_i18n( '\d:d | \D:D | \j:j | \l:l | \N:N | \S:S | \w:w | \z:z' ) );
// week
var_dump( 'week: ' . date_i18n( '\W:W' ) );
// month
var_dump( 'month: ' . date_i18n( '\F:F | \m:m | \M:M | \n:n | \t:t' ) );
// year
var_dump( 'year: ' . date_i18n( '\L:L | \o:o | \Y:Y | \y:y' ) );
// time
var_dump( 'time: ' . date_i18n( '\a:a | \A:A | \B:B | \g:g | \G:G | \h:h | \H:H | \i:i | \s:s | \u:u' ) );
// timezone
var_dump( 'timezone: ' . date_i18n( '\e:e | \I:I | \O:O | \P:P | \T:T | \Z:Z' ) );
// full date/time
var_dump( 'full date/time: ' . date_i18n( '\c:c | \r:r | \U:U' ) );

// unix time
var_dump( 'unix time: ' . time() );

// escape test
var_dump( 'escape test: ' . date_i18n( 'D | \D | \\D | \\\D | \\\\D | \\\\\D | \\\\\\D' ) );
// escape test with PHP’s date()
var_dump( 'escape test date(): ' . date ( 'D | \D | \\D | \\\D | \\\\D | \\\\\D | \\\\\\D' ) );

The outputs are (German localization)

string 'day: d:14 | D:Mo | j:14 | l:Montag | N:1 | S:th | w:1 | z:286' (length=61)
string 'week: W:42' (length=10)
string 'month: F:Oktober | m:10 | M:Okt | n:10 | t:31' (length=45)
string 'year: L:0 | o:2013 | Y:2013 | y:13' (length=34)
string 'time: a:am | A:AM | B:012 | g:1 | G:1 | h:01 | H:01 | i:17 | s:40 | u:000000' (length=76)
string 'timezone: e:Europe/Berlin | I:1 | O:+0200 | P:+02:00 | T:CEST | Z:7200' (length=70)
string 'full date/time: c:2013-10-14T01:17:40+02:00 | r:Mo, 14 Okt 2013 01:17:40 +0200 | U:1381706260' (length=93)
string 'unix time: 1381706260' (length=21)
string 'escape test: Mo | D | D | \Mo | \Mo | \D | \D' (length=45)
string 'escape test date(): Sun | D | D | \Sun | \Sun | \D | \D' (length=55)

As you can see translation for 'S' (and 'e' too?) is still missing (needs extension of WP_Locale class). I think date_i18n() needs more testing. If you compare the patch with the old date_i18n(), you will see that there are no regular expressions any more. Some other improvements should boost performance too. By reviewing the the old date_i18n() I recognised that escaping of literals in the date format string was not handled properly (regex are wrong). Can some please check this too? What do you think about my patch?

comment:28 @raubvogel17 months ago

It’s true: the unpatched date_i18n() does not handle escaped literals properly. Test

// escape test
var_dump( 'escape test: ' . date_i18n( 'F | \F | \\F | \\\F | \\\\F | \\\\\F | \\\\\\F' ) );
// escape test with PHP’s date()
var_dump( 'escape test date(): ' . date ( 'F | \F | \\F | \\\F | \\\\F | \\\\\F | \\\\\\F' ) );

results in (German localization)

string 'escape test: Oktober | F | F | \October | \October | \F | \F' (length=60)
string 'escape test date(): October | F | F | \October | \October | \F | \F' (length=67)

'F' is translated to "Oktober" but '
\F' is translated to "\October" (should be "\Oktober"). Therefore the unpatched date_i18n() has a further issue …

@Rarst: In Your patch

$dateformatstring = preg_replace( "/([^\\\])c/", "\\1" . DATE_W3C, $dateformatstring ); 
$dateformatstring = preg_replace( "/([^\\\])r/", "\\1" . DATE_RFC2822, $dateformatstring );

causes wrong escaping too (it is the same regex as in the original date_i18n).

comment:29 follow-up: @Rarst17 months ago

I would suggest to open new ticket about slashing issues.

Complete rewrite of function is unlikely to be accepted as solution to smaller issues, without major review and testing by core developers.

comment:30 in reply to: ↑ 29 @raubvogel17 months ago

  • Keywords dev-feedback added

Replying to Rarst:
I will mark the ticket as "dev-feedback". I do not exactly know the procedure to get a patch accepted. Therefore a trusted developer should review the ticket/patch, right?

Anyway, the original date_i18n() is not implemented very well – there are the mentioned bugs, the bad performance (regex), few comments and it’s not very readable. So rewriting this function is necessary in my opinion.

Someone really should review my code and fix possible issues. I will create further tests for my patch …

Edit: Fixing the slashes issues in original date_i18n() might be bad because one needs more complex regex (something like /(?<!\\)(?:\\\\)*D/ which has negative look-behind).

Last edited 17 months ago by raubvogel (previous) (diff)

@raubvogel17 months ago

Tests for date_i18n()

comment:31 @raubvogel17 months ago

I added a test for date_i18n(). Seems to be good so far.

I will create a further patch that will handle the case 'S' …

Edit: 'S' should not be translated. It’s just the English ordinal suffix (st, nd, rd or th) for the day of the month (simple translation is not enough, so it has to be implemented completely new. See http://en.wikipedia.org/wiki/Ordinal_indicator)

Last edited 17 months ago by raubvogel (previous) (diff)

comment:32 @raubvogel17 months ago

  • Keywords needs-testing added
  • Version 3.4 deleted

comment:33 @raubvogel17 months ago

  • Version set to 3.4

comment:34 @raubvogel16 months ago

  • Severity changed from normal to major
  • Summary changed from date_i18n() produces invalid output for shorthand formats to date_i18n() is wrong for certain formats, escapes sequences and daylight saving time

Beside the format and escaping issue I found another issue maybe:

/*
* Test is called in time zone Europe/Berlin at a date where daylight saving time is off!
* (Daylight saving time in Europe/Berlin was switched off at 2013-10-27)
*/
// results 2013-10-29T22:15:03+01:00 while daylight saving time (Europe/Berlin) is off; result is correct
echo(date_i18n('Y-m-d\TH:i:sP', 1383084903));
// results 2013-10-11T19:37:20+01:00 while daylight saving time (Europe/Berlin) is on; result is wrong! should be +02:00!
echo(date_i18n('Y-m-d\TH:i:sP', 1381520240));

As you can see the time zone offset is sometimes wrong. Actually it depends on the current daylight saving time.

Is issue comes from wordpress\wp-includes\functions.php:127 in date_i18n():

$date_object = date_create( null, $timezone_object );

The date object created with the current time stamp ("null = now"). This is wrong. It has to be created with the time stamp from date_i18n() parameter $unixtimestamp.

For now we have 3 issues at date_i18n() which is annoying (at least for my German blogs). I set the severity of this ticket to "major" and changed the summary according the 3 issues.

I tested the daylight saving time issue with my patch. It worked! I think some core developer really should have a look at this ticket. And it would be nice if other people could test my patch. Thx!

comment:35 follow-up: @Rarst16 months ago

Please stop inflating scope of this ticket. If there are more issues in the function - properly report as new tickets.

This here is about shorthand formats and just them.

comment:36 @Rarst16 months ago

  • Severity changed from major to normal
  • Summary changed from date_i18n() is wrong for certain formats, escapes sequences and daylight saving time to date_i18n() produces invalid output for shorthand formats

comment:37 in reply to: ↑ 35 @raubvogel16 months ago

Ok, so I inflated the list of tickets: new ticket is #25768.

Note: See TracTickets for help on using tickets.