#24121 closed defect (bug) (duplicate)
Blank title caused by PHP 5.4 htmlspecialchars() changes
Reported by: | trevHCS | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 3.5.1 |
Component: | Formatting | Keywords: | |
Focuses: | Cc: |
Description
Due to changes in PHP 5.4 within the htmlspecialchars() function, non UTF-8 characters in a post title will cause said title to go blank.
This is similar behaviour to ticket ID #23688 except:
- That ticket affected the body of the post not the title.
- This may require a slightly different solution.
- The affected code is in two separate scripts.
Scenario:
- You add / edit a post and give it a title containing "You’re"
- You save the post and it appears on the site correctly.
- However, the admin -> post screen loses the title due to the ’
- Any further updates will lose the title from the public blog.
The offending character in this case is ’, a fancy quote mark, but any non UTF-8 character will do the same, e.g. the Euro symbol.
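The blank title can be reproduced outside WordPress. This is a minimal sketch, assuming PHP 5.4 or later and assuming the title is stored as Windows-1252, where ’ is the single byte 0x92. The variable names are illustrative only and do not come from WordPress.

```php
<?php
// Assumption: the title was stored as Windows-1252, where ’ is the byte 0x92.
$legacy_title = "You\x92re";

// Treated as UTF-8, 0x92 is an invalid byte sequence, so the whole
// string is rejected and an empty string comes back.
$escaped = htmlspecialchars( $legacy_title, ENT_COMPAT, 'UTF-8' );

var_dump( $escaped === '' ); // bool(true)
```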
Problem: This occurs in edit-form-advanced.php around line 331 where it says:
<?php echo esc_attr( htmlspecialchars( $post->post_title ) ); ?>
Suggested solutions: My reading of the code is that esc_attr() does basically the same thing in this case as htmlspecialchars() so perhaps removing htmlspecialchars would work?
If not, a solution similar to the one in that other ticket could be used, probably something like the line below. See also the notes in the other ticket about normalising blog_charset.
<?php echo esc_attr( htmlspecialchars( $post->post_title, ENT_SUBSTITUTE, get_option( 'blog_charset' ) ) ); ?>
I have tested with the alternative ENT_DISALLOWED but that seems to cause blank titles too.
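The difference between the flags can be checked in isolation. This is a rough sketch rather than a proposed patch: it again assumes a Windows-1252 title, uses illustrative variable names, and hard-codes 'ISO-8859-1' where the real code would presumably read the blog_charset option.

```php
<?php
// Assumption: a Windows-1252 title where ’ is the single byte 0x92.
$legacy_title = "You\x92re";

// ENT_SUBSTITUTE swaps the invalid byte for U+FFFD instead of blanking the string.
$substituted = htmlspecialchars( $legacy_title, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8' );

// ENT_DISALLOWED only covers code points that are invalid for the document type;
// an invalid byte sequence still yields an empty string.
$disallowed = htmlspecialchars( $legacy_title, ENT_QUOTES | ENT_DISALLOWED, 'UTF-8' );

// With a single-byte charset every byte is acceptable, so nothing is lost.
$single_byte = htmlspecialchars( $legacy_title, ENT_QUOTES, 'ISO-8859-1' );

var_dump( $substituted === "You\xEF\xBF\xBDre" ); // bool(true)
var_dump( $disallowed === '' );                   // bool(true)
var_dump( $single_byte === $legacy_title );       // bool(true)
```

Note that the sketch combines ENT_SUBSTITUTE with ENT_QUOTES. Passing ENT_SUBSTITUTE on its own replaces the default flags, so htmlspecialchars() itself would no longer escape quotes; in the suggested line that job would fall to the surrounding esc_attr() call.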
Finally - I wasn't 100% sure if this should be a new bug or related to the previous ticket, but as that one is old I didn't want this important problem to be missed as it affects the very nature of blog publishing.
Change History (4)
#2
11 years ago
- Cc trevattdp@… added
That does make more sense now - was never 100% sure about the UTF-8 cause.
After doing more tests, I can confirm that this and the linked post content problem occur when the database uses something like "latin1_swedish_ci" as the table collation. One of the blogs we run has "utf8_general_ci" and does not suffer from this problem.
As for page encoding, that seems to be set to ISO-8859-1 in the database 'options' table on those non UTF-8 blogs.
So, at a guess, both problems will affect older blogs created before 'DB_CHARSET' in wp-config became utf8 by default?
#3
11 years ago
- Keywords needs-patch removed
- Milestone Awaiting Review deleted
- Status changed from new to closed
> I wasn't 100% sure if this should be a new bug or related to the previous ticket, but as that one is old I didn't want this important problem to be missed as it affects the very nature of blog publishing.
#23688 appears to be a manifestation of the same bug and is assigned to the 3.6 milestone. Let's continue the discussion there.
Just for the record: "non UTF-8 character" is misleading. ’ or € are part of UTF-8, they just have to be encoded correctly. Does your blog run with a legacy encoding like ISO-8859-1?
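To illustrate the point about encoding, here is a small sketch comparing the same characters as UTF-8 byte sequences and as a lone legacy byte. The Windows-1252 byte value and the variable names are assumptions made for the example.

```php
<?php
// ’ (U+2019) and € (U+20AC) written out as their UTF-8 byte sequences.
$utf8_title = "You\xE2\x80\x99re \xE2\x82\xAC5";

// The same quote as a lone Windows-1252 byte, which is not valid UTF-8.
$legacy_title = "You\x92re";

// preg_match() with the /u modifier only succeeds on valid UTF-8 subjects.
$utf8_ok   = ( 1 === preg_match( '//u', $utf8_title ) );
$legacy_ok = ( 1 === preg_match( '//u', $legacy_title ) );

// Correctly encoded characters pass through htmlspecialchars() untouched.
$escaped = htmlspecialchars( $utf8_title, ENT_QUOTES, 'UTF-8' );

var_dump( $utf8_ok, $legacy_ok, $escaped === $utf8_title );
```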