WordPress.org

Make WordPress Core

Opened 2 years ago

Last modified 8 months ago

#20973 reopened defect (bug)

date_i18n() produces invalid output for shorthand formats

Reported by: Rarst Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 3.4
Component: Date/Time Keywords: has-patch dev-feedback needs-testing
Focuses: Cc:

Description

date_i18n() function relies on parsing passed format to make adjustments. However shorthand formats are not handled and produce invalid output.

Example:

var_dump( date_i18n( 'Y-m-d\TH:i:sP', time() ) ); // 2012-06-15T13:34:03+03:00 << ok
var_dump( date_i18n( DATE_W3C, time() ) ); // 2012-06-15T13:34:03+03:00 << ok
var_dump( date_i18n( 'c', time() ) ); // 2012-06-15T13:34:03+00:00 << broken time zone!

Hook-level fix:

add_filter( 'date_i18n', 'fix_c_time_format', 10, 3 );

function fix_c_time_format( $date, $format, $timestamp ) {

    if ( 'c' == $format )
        $date = date_i18n( DATE_W3C, $timestamp );

    return $date;
}

See Why time functions show invalid time zone when using 'c' time format?

Possibly related (can't say for sure from description) #13538

Attachments (7)

functions.php.patch (2.9 KB) - added by raubvogel 9 months ago.
Fix for date_i18n() (new version)
functions.php.2.patch (2.8 KB) - added by raubvogel 9 months ago.
Fix for date_i18n() (new version 2)
Unpack_shorthand_date_formats.patch (667 bytes) - added by Rarst 9 months ago.
Unpack shorthand date formats in a form date_i18n() can understand
functions.php.3.patch (648 bytes) - added by raubvogel 9 months ago.
Same patch as Rarst’s but "optimized"
Unpack_shorthand_date_formats1.patch (744 bytes) - added by Rarst 9 months ago.
Changed to preg_replace to ignore escaped shorthands
functions.php.4.patch (7.4 KB) - added by raubvogel 9 months ago.
date_i18n() completely rewritten
tests.php (1.3 KB) - added by raubvogel 9 months ago.
Tests for date_i18n()

Download all attachments as: .zip

Change History (44)

comment:1 toscho2 years ago

  • Cc info@… added

comment:2 Rarst2 years ago

Actually scratch time() from example for more accurate:

var_dump( date_i18n( 'Y-m-d\TH:i:sP' ) );
var_dump( date_i18n( DATE_W3C ) );
var_dump( date_i18n( 'c' ) ); // broken time zone!

comment:3 stephenh19882 years ago

  • Cc stephen@… added

comment:4 Rarst18 months ago

  • Cc contact@… added

Small correction - proper constant for c replacement is DATE_ISO8601, not DATE_W3C (different time zone formatting).

[update] Hm, nope DATE_W3C weirdly seems to be more consistent with c than DATE_ISO8601. While either time zone format is actually ok for standard.

Last edited 18 months ago by Rarst (previous) (diff)

comment:5 mintindeed15 months ago

  • Cc gabriel.koen@… added

comment:6 Jayjdk12 months ago

  • Cc kontakt@… added

comment:7 raubvogel9 months ago

  • Resolution set to duplicate
  • Status changed from new to closed

Duplicate of #25555.

comment:8 follow-up: toscho9 months ago

  • Resolution duplicate deleted
  • Status changed from closed to reopened

I think this is the bug for the general issue while get_comment_time('c') is just one function showing the side effects. So we should focus on date_i18n(), not on get_comment_time().

comment:9 in reply to: ↑ 8 raubvogel9 months ago

Replying to toscho:
You are absolutely right. Therefore I had provided a patch for date_i18n() in #25555. Needs of course validation, but should work now. (That’s why I closed #20973.)

Last edited 9 months ago by raubvogel (previous) (diff)

comment:10 follow-up: helen9 months ago

If #25555 is a duplicate, then it should be closed. Generally speaking, newer tickets are closed as the duplicates.

comment:11 raubvogel9 months ago

#25555 was marked as a duplicate.

comment:12 in reply to: ↑ 10 raubvogel9 months ago

Replying to helen:
Thanks for the advice!

comment:13 raubvogel9 months ago

  • Keywords has-patch added

comment:14 follow-up: Rarst9 months ago

Patch does not seem to fix the issue for me (var_dump( date_i18n( 'c' ) ); still produces invalid result). Could you elaborate a bit on it? I have trouble following what are you trying to do with time zone.

As per my description of the issue I think this is the problem with WP not recognizing shorthand time format, rather than with applying time zone (which works fine for other formats).

comment:15 follow-up: Rarst9 months ago

Also note date_timestamp_set() used in your patch is PHP >= 5.3.0, maybe more.

raubvogel9 months ago

Fix for date_i18n() (new version)

comment:16 in reply to: ↑ 14 raubvogel9 months ago

Replying to Rarst:
There seems to be another issue, but fixed it in a new patch. However

date_i18n( 'c', current_time( 'timestamp' ) )

already worked. Now

date_i18n( 'c' )

produced the same output. I’m afraid now that behaviour of date_i18n() has changed to much …

comment:17 in reply to: ↑ 15 raubvogel9 months ago

Replying to Rarst:
Thanks, You are right. In the new patch I use only functions that are PHP <= 5.2 (I hope).

comment:18 raubvogel9 months ago

What I have done in the patch: Wordpress calls date_timezone_set to the UTC/GMT globally. Therefore date() (which is used in date_i18n()) and similar functions don’t care about your time zone set in the Wordpress settings. My patch basically makes a call like date() but cares about the time zone (take a look at PHP’s DateTime class). While doing that no time zone conversations are done!

raubvogel9 months ago

Fix for date_i18n() (new version 2)

comment:19 follow-up: Rarst9 months ago

Good news, bad news... :)

Latest patch does fix timezone issue in c format. But only that.

It treats symptom rather than underlying issue. For example in another shorthand r names of day and month are not being correctly localized for same reasons:

var_dump( date_i18n( DATE_RFC2822 ) ); // Пт, 11 Окт 2013 23:04:43 +0300 << localized
var_dump( date_i18n( 'r' ) ); // Fri, 11 Oct 2013 23:04:43 +0000 << stays English

I think we need a patch that fixes all of this, rather than treating this case by case. Meaning digging up all shorthands WP doesn't understand and converting them to formats it does.

Last edited 9 months ago by Rarst (previous) (diff)

Rarst9 months ago

Unpack shorthand date formats in a form date_i18n() can understand

comment:20 in reply to: ↑ 19 ; follow-up: raubvogel9 months ago

Replying to Rarst:
Tested

var_dump( date_i18n( DATE_RFC2822 ) ); // string 'Sa, 12 Okt 2013 00:43:18 +0200' (length=30)
var_dump( date_i18n( 'r' ) ); // 'Sat, 12 Oct 2013 00:43:18 +0200' (length=31)

Worked with my patch. Cannot find an issue :-( Used functions.php.2.patch

Your patch is elegant, but covers only 'c', 'r'. There are many more cases http://www.php.net/manual/de/function.date.php, right?

Last edited 9 months ago by raubvogel (previous) (diff)

comment:21 in reply to: ↑ 20 ; follow-up: Rarst9 months ago

Replying to raubvogel:

Tested

var_dump( date_i18n( DATE_RFC2822 ) ); // string 'Sa, 12 Okt 2013 00:43:18 +0200' (length=30)
var_dump( date_i18n( 'r' ) ); // 'Sat, 12 Oct 2013 00:43:18 +0200' (length=31)

Worked with my patch. Cannot find an issue :-( Used functions.php.2.patch

Yes, the timezone works with your patch. See how day and month are different? This is because of same issue with parsing formats, which your patch doesn't address because it only tries to fix timezone.

Your patch is elegant, but covers only 'c', 'r'. There are many more cases http://www.php.net/manual/de/function.date.php, right?

Yes, however it seems other (not compound) formats are all (?) already supported by date_i18n(). As above I have two issues left here - is "unpacking" them good enough of a solution and doing bit of a testing to be sure which other formats might need to be handled.

comment:22 follow-up: Rarst9 months ago

I've looked through all the formats listed in date() documentation and hadn't noticed any other with format parsing issues.

comment:23 in reply to: ↑ 21 raubvogel9 months ago

Replying to Rarst:
Ok, You are absolutely right. Your patch should do the stuff we need. I will upload a patch based on Your’s which has a bit more performance.

comment:24 in reply to: ↑ 22 raubvogel9 months ago

Replying to Rarst:
I have taken a look too. And I think You are right – could not find any issues too.

raubvogel9 months ago

Same patch as Rarst’s but "optimized"

comment:25 follow-up: Rarst9 months ago

It's not a given that compound format is present in format input by itself. Can be part of it, thus my patch using replacement rather than comparison.

Also I forgot to account for it possibly being escaped and need to rewrite that...

Rarst9 months ago

Changed to preg_replace to ignore escaped shorthands

comment:26 in reply to: ↑ 25 raubvogel9 months ago

Replying to Rarst:
Oh, You are right, again :D I read the PHP doc wrong -.-

raubvogel9 months ago

date_i18n() completely rewritten

comment:27 raubvogel9 months ago

I have completely rewritten date_i18n() now. I did tests by

// day
var_dump( 'day: ' . date_i18n( '\d:d | \D:D | \j:j | \l:l | \N:N | \S:S | \w:w | \z:z' ) );
// week
var_dump( 'week: ' . date_i18n( '\W:W' ) );
// month
var_dump( 'month: ' . date_i18n( '\F:F | \m:m | \M:M | \n:n | \t:t' ) );
// year
var_dump( 'year: ' . date_i18n( '\L:L | \o:o | \Y:Y | \y:y' ) );
// time
var_dump( 'time: ' . date_i18n( '\a:a | \A:A | \B:B | \g:g | \G:G | \h:h | \H:H | \i:i | \s:s | \u:u' ) );
// timezone
var_dump( 'timezone: ' . date_i18n( '\e:e | \I:I | \O:O | \P:P | \T:T | \Z:Z' ) );
// full date/time
var_dump( 'full date/time: ' . date_i18n( '\c:c | \r:r | \U:U' ) );

// unix time
var_dump( 'unix time: ' . time() );

// escape test
var_dump( 'escape test: ' . date_i18n( 'D | \D | \\D | \\\D | \\\\D | \\\\\D | \\\\\\D' ) );
// escape test with PHP’s date()
var_dump( 'escape test date(): ' . date ( 'D | \D | \\D | \\\D | \\\\D | \\\\\D | \\\\\\D' ) );

The outputs are (German localization)

string 'day: d:14 | D:Mo | j:14 | l:Montag | N:1 | S:th | w:1 | z:286' (length=61)
string 'week: W:42' (length=10)
string 'month: F:Oktober | m:10 | M:Okt | n:10 | t:31' (length=45)
string 'year: L:0 | o:2013 | Y:2013 | y:13' (length=34)
string 'time: a:am | A:AM | B:012 | g:1 | G:1 | h:01 | H:01 | i:17 | s:40 | u:000000' (length=76)
string 'timezone: e:Europe/Berlin | I:1 | O:+0200 | P:+02:00 | T:CEST | Z:7200' (length=70)
string 'full date/time: c:2013-10-14T01:17:40+02:00 | r:Mo, 14 Okt 2013 01:17:40 +0200 | U:1381706260' (length=93)
string 'unix time: 1381706260' (length=21)
string 'escape test: Mo | D | D | \Mo | \Mo | \D | \D' (length=45)
string 'escape test date(): Sun | D | D | \Sun | \Sun | \D | \D' (length=55)

As you can see translation for 'S' (and 'e' too?) is still missing (needs extension of WP_Locale class). I think date_i18n() needs more testing. If you compare the patch with the old date_i18n(), you will see that there are no regular expressions any more. Some other improvements should boost performance too. By reviewing the the old date_i18n() I recognised that escaping of literals in the date format string was not handled properly (regex are wrong). Can some please check this too? What do you think about my patch?

comment:28 raubvogel9 months ago

It’s true: the unpatched date_i18n() does not handle escaped literals properly. Test

// escape test
var_dump( 'escape test: ' . date_i18n( 'F | \F | \\F | \\\F | \\\\F | \\\\\F | \\\\\\F' ) );
// escape test with PHP’s date()
var_dump( 'escape test date(): ' . date ( 'F | \F | \\F | \\\F | \\\\F | \\\\\F | \\\\\\F' ) );

results in (German localization)

string 'escape test: Oktober | F | F | \October | \October | \F | \F' (length=60)
string 'escape test date(): October | F | F | \October | \October | \F | \F' (length=67)

'F' is translated to "Oktober" but '
\F' is translated to "\October" (should be "\Oktober"). Therefore the unpatched date_i18n() has a further issue …

@Rarst: In Your patch

$dateformatstring = preg_replace( "/([^\\\])c/", "\\1" . DATE_W3C, $dateformatstring ); 
$dateformatstring = preg_replace( "/([^\\\])r/", "\\1" . DATE_RFC2822, $dateformatstring );

causes wrong escaping too (it is the same regex as in the original date_i18n).

comment:29 follow-up: Rarst9 months ago

I would suggest to open new ticket about slashing issues.

Complete rewrite of function is unlikely to be accepted as solution to smaller issues, without major review and testing by core developers.

comment:30 in reply to: ↑ 29 raubvogel9 months ago

  • Keywords dev-feedback added

Replying to Rarst:
I will mark the ticket as "dev-feedback". I do not exactly know the procedure to get a patch accepted. Therefore a trusted developer should review the ticket/patch, right?

Anyway, the original date_i18n() is not implemented very well – there are the mentioned bugs, the bad performance (regex), few comments and it’s not very readable. So rewriting this function is necessary in my opinion.

Someone really should review my code and fix possible issues. I will create further tests for my patch …

Edit: Fixing the slashes issues in original date_i18n() might be bad because one needs more complex regex (something like /(?<!\\)(?:\\\\)*D/ which has negative look-behind).

Last edited 9 months ago by raubvogel (previous) (diff)

raubvogel9 months ago

Tests for date_i18n()

comment:31 raubvogel9 months ago

I added a test for date_i18n(). Seems to be good so far.

I will create a further patch that will handle the case 'S' …

Edit: 'S' should not be translated. It’s just the English ordinal suffix (st, nd, rd or th) for the day of the month (simple translation is not enough, so it has to be implemented completely new. See http://en.wikipedia.org/wiki/Ordinal_indicator)

Last edited 9 months ago by raubvogel (previous) (diff)

comment:32 raubvogel9 months ago

  • Keywords needs-testing added
  • Version 3.4 deleted

comment:33 raubvogel9 months ago

  • Version set to 3.4

comment:34 raubvogel8 months ago

  • Severity changed from normal to major
  • Summary changed from date_i18n() produces invalid output for shorthand formats to date_i18n() is wrong for certain formats, escapes sequences and daylight saving time

Beside the format and escaping issue I found another issue maybe:

/*
* Test is called in time zone Europe/Berlin at a date where daylight saving time is off!
* (Daylight saving time in Europe/Berlin was switched off at 2013-10-27)
*/
// results 2013-10-29T22:15:03+01:00 while daylight saving time (Europe/Berlin) is off; result is correct
echo(date_i18n('Y-m-d\TH:i:sP', 1383084903));
// results 2013-10-11T19:37:20+01:00 while daylight saving time (Europe/Berlin) is on; result is wrong! should be +02:00!
echo(date_i18n('Y-m-d\TH:i:sP', 1381520240));

As you can see the time zone offset is sometimes wrong. Actually it depends on the current daylight saving time.

Is issue comes from wordpress\wp-includes\functions.php:127 in date_i18n():

$date_object = date_create( null, $timezone_object );

The date object created with the current time stamp ("null = now"). This is wrong. It has to be created with the time stamp from date_i18n() parameter $unixtimestamp.

For now we have 3 issues at date_i18n() which is annoying (at least for my German blogs). I set the severity of this ticket to "major" and changed the summary according the 3 issues.

I tested the daylight saving time issue with my patch. It worked! I think some core developer really should have a look at this ticket. And it would be nice if other people could test my patch. Thx!

comment:35 follow-up: Rarst8 months ago

Please stop inflating scope of this ticket. If there are more issues in the function - properly report as new tickets.

This here is about shorthand formats and just them.

comment:36 Rarst8 months ago

  • Severity changed from major to normal
  • Summary changed from date_i18n() is wrong for certain formats, escapes sequences and daylight saving time to date_i18n() produces invalid output for shorthand formats

comment:37 in reply to: ↑ 35 raubvogel8 months ago

Ok, so I inflated the list of tickets: new ticket is #25768.

Note: See TracTickets for help on using tickets.