Make WordPress Core

Opened 17 years ago

Closed 12 years ago

Last modified 12 years ago

#7559 closed defect (bug) (invalid)

strip_tags() breaks category names with left angle brackets

Reported by: squirreling's profile squirreling Owned by: ryan's profile ryan
Milestone: Priority: normal
Severity: minor Version: 2.6
Component: Taxonomy Keywords: needs-patch gsoc
Focuses: Cc:

Description

If you create a category named "<something", the category name doesn't show up on any of the category listings. If you create a category named "some<thing" the category name shows up as "some".

Attachments (5)

7559.diff (1.3 KB) - added by slushpilejs 16 years ago.
rescued lone less than sign in pre_term_name from strip_tags(), separated out pre_term_name filters
patch.diff (374 bytes) - added by MattyRob 16 years ago.
Picture 1.png (39.3 KB) - added by MattyRob 16 years ago.
wp-inc.default-filters.diff (1.0 KB) - added by MattyRob 16 years ago.
7559.2.diff (1.0 KB) - added by Denis-de-Bernardy 16 years ago.

Download all attachments as: .zip

Change History (49)

#1 @matt
17 years ago

  • Priority changed from low to high

#2 @ryan
17 years ago

strip_tags() is the culprit

#3 @ryan
17 years ago

  • Component changed from General to Taxonomy
  • Owner anonymous deleted

#4 @rmccue
16 years ago

  • Summary changed from creating a category name starting with a less than angle bracket causes category name not to show up to strip_tags() breaks category names with left angle brackets

@slushpilejs
16 years ago

rescued lone less than sign in pre_term_name from strip_tags(), separated out pre_term_name filters

#5 @slushpilejs
16 years ago

  • Keywords has-patch needs-testing added
  • Owner set to slushpilejs
  • Status changed from new to assigned

#6 @mattyrob
16 years ago

Will this patch also work for lone greater than symbols?

Maybe we should HTML encode them both (&lt; and &gt;) in category names.

#7 @slushpilejs
16 years ago

  • Owner changed from slushpilejs to ryan
  • Status changed from assigned to new

Yes, wp_specialchars encodes the lone greater than symbol. This merely dodges the broken behaviour of strip_tags.

#8 @DD32
16 years ago

  • Milestone changed from 2.7 to 2.7.1

#9 @MattyRob
16 years ago

Can't we just encode special symbols when categories are created? Only issue here is how to rectify existing category names.

#10 @chineseleper
16 years ago

I'm also having a problem with XML comments in titles, see #8981. It would be great if WP could use it's own strip_tags function which doesn't remove xml comments and keeps angle brackets.

#11 @mattyrob
16 years ago

  • Keywords tested developer-feedback added; category bracket special character needs-testing removed

#12 @mattyrob
16 years ago

  • Keywords 2nd-opinion added; developer-feedback removed

I've tested the patch I submitted 2 months ago in the spirit of the 24-hour has-patch marathon and it works for me. Maybe get someone else to check it before committing :-)

#13 @Denis-de-Bernardy
16 years ago

  • Keywords needs-testing added; tested 2nd-opinion removed
$slug = wp_specialchars($category_nicename);

won't we end up with special chars in the slug?

#14 @mattyrob
16 years ago

I've checked the database and there are no special characters in the slug in the terms table.

Weirdly though the fix appears somewhat erratic, it works fine for > but not at all for <. Is there some JS in use on the admin Categories page?

#15 @Denis-de-Bernardy
16 years ago

  • Keywords needs-patch added; has-patch removed

let's mark as needs patch then.

imo, there is no point to add the call to the slug, since that would get sanitized beforehand.

#16 @Denis-de-Bernardy
16 years ago

  • Milestone changed from 2.7.2 to 2.8

#17 @MattyRob
16 years ago

  • Keywords has-patch 2nd-opinion added; needs-patch needs-testing removed

Back Again :-)

I've checked the slug out more and it is already being sanitised. The problem with the initial patch was that the un-encoded variable is still passed further down the script in the patched file. An updated patch is attached.

There is another by sending $name rather than $cat_name to the wp_insert_term() function.

@MattyRob
16 years ago

#18 @Denis-de-Bernardy
16 years ago

  • Milestone 2.8 deleted
  • Resolution set to worksforme
  • Status changed from new to closed

say, can you confirm the issue? because, currently, there is a filter on:

pre_term_name

which does what you want.

closing as invalid, since it seems fixed in trunk... please reopen with feedback if it's not.

#19 @MattyRob
16 years ago

  • Keywords commit added
  • Resolution worksforme deleted
  • Status changed from closed to reopened

Updated to latest trunk to test and it still breaks:

Categories with 'less than' symbols in the cat name have all text after this symbol removed so 'some<thing' becomes 'some' and '<some' becomes null with the category slug becoming an integer.

The patch I submitted still fixes this issue.

#20 @MattyRob
16 years ago

Should have said, trunk version was 11111

#21 @DD32
16 years ago

  • Milestone set to 2.8

#22 @Denis-de-Bernardy
16 years ago

confirmed with foo'bar<foo"

patch is broken.

#23 @Denis-de-Bernardy
16 years ago

  • Keywords needs-patch added; has-patch 2nd-opinion commit removed

@MattyRob
16 years ago

#24 @MattyRob
16 years ago

  • Keywords has-patch 2nd-opinion commit added; needs-patch removed

Denis,

Can you check the patch again please. I tested with:

  • foo'bar<foo"
  • foo<bar>
  • '"bar"<foo<bar>'

And all three worked on my patched trunk [11270], screen grab attached

#25 @Denis-de-Bernardy
16 years ago

  • Keywords needs-patch added; has-patch commit removed

my bad sorry. I was trying to apply it to wp-includes/taxonomy.php instead of wp-admin/includes/taxonomy.php

one slight issue I ran into: it allowed to create a category called foo<br"\!'bar?#, but it denied the same for tags and link categories. a filter should likely apply to all terms on a pre_term_name hook or something like that.

#26 @MattyRob
16 years ago

You may well be right but a patch along those lines is beyond my understanding of the WordPress core files :-(

#27 @MattyRob
16 years ago

Hold the phone! I may be smarter than I thought!!

I've messed about with default-filters in wp-includes and this patch works for categories, tags and link names. The down side is that embedded HTML is now escaped into the name.

#28 @Denis-de-Bernardy
16 years ago

yeah, something like that. I'll give it a shot tomorrow.

#29 @Denis-de-Bernardy
16 years ago

  • Keywords 2nd-opinion removed

I've just uploaded a slightly modified version of your patch. (removed the strip_tags() call as it becomes useless.)

There is a genuine bug in there before any of it can be committed. If you create * foo'bar<foo" * foo<bar> * '"bar"<foo<bar>' as a tag, that tag can be added in the post editor using the autocomplete, but it then won't display correctly in the tag editor.

a few checks to run with * foo'bar<foo" * foo<bar> * '"bar"<foo<bar>':

  • can it be added as a post category/does it display correctly in the editor? in the post?
  • can it be added as a post tag with/without autocomplete/does it display correctly in the editor? in the post?
  • can it be added as a link category/does it display correctly in the editor? in the links widget?

#30 @MattyRob
16 years ago

Done some testing and the results are as follows:

  • Post Category addition, display in editor and display in actual post (in default theme) all works fine.
  • Post tag addition works fine, addition to a post in editor and display in editor is flaky probably down to JS encoding of the tag, display in front side post works well (provided you can tag in editor)
  • Addition editing and display as links all works fine

So, the problems all seem to be in the "Post Tags" section of the post editor now.

#31 @hakre
16 years ago

looks like a simple output problem to me. < must be converted to entity when the output should be preserved wihtin (x)html. patches should reflect that instead of patching here and there on good luck.

#32 @ryan
16 years ago

  • Priority changed from high to normal

#33 @MattyRob
16 years ago

The remaining issue seems to be down to the jQuery processing of tags added on the post screen. I'm not great with javascript or jQuery so someone else needs to tae a look. I suspect the problem is in the handling of special characters (<">' etc) in either post.js or tags.js contained in wp-admin/js.

#34 @Denis-de-Bernardy
16 years ago

  • Milestone changed from 2.8 to 2.9

Punting per discussion during the WP meet-up...

#35 follow-up: @hakre
16 years ago

Category names should not be allowed to have < or > chars in it. What about filtering the input and removing those values?

#36 in reply to: ↑ 35 @MattyRob
16 years ago

Replying to hakre:

Category names should not be allowed to have < or > chars in it. What about filtering the input and removing those values?

After the ticket has been logged and open for 10 months I think it's a little late to imply that the ticket is not valid and that the code should simply not allow angle brackets in category names!

The first question posed would be "Why not?" I think putting such symbols into category names, link names and tags is totally acceptable and should not be blocked or filtered, but should be properly handled and escaped.

#37 follow-up: @hakre
16 years ago

I just wanted to add my own opinion, please do not feel offended about that. It was just a suggestion, not the way to go. I should have been written that more clearly.

WordPress is not that strict about escaping and encoding and proper handling of such Issues so it *might* be a good Idea to just reduce the number of allowed chars for things like tags, categories, names, slugs etc. pp.. This is the thought behind what I've been written.

It's totally clear that this has implications on a current usage with those chars in those names.

#38 in reply to: ↑ 37 ; follow-up: @MattyRob
16 years ago

Replying to hakre:

I just wanted to add my own opinion, please do not feel offended about that. It was just a suggestion, not the way to go. I should have been written that more clearly.

Okay, fair point and I apologise if I came across a bit prickly!

WordPress is not that strict about escaping and encoding and proper handling of such Issues so it *might* be a good Idea to just reduce the number of allowed chars for things like tags, categories, names, slugs etc. pp.. This is the thought behind what I've been written.

It's totally clear that this has implications on a current usage with those chars in those names.

While one cannot argue that WordPress is fabulous at providing a rich blogging environment it does lack somewhat in documentation and in direction decisions for small but user important issues (like this one). If these characters are deemed inappropriate for category names etc, then why is the ticket still open after 10 months? If this is agreed to be an issue in the software why no update from the WP meet-up?

#39 in reply to: ↑ 38 @Denis-de-Bernardy
16 years ago

Replying to MattyRob:

If these characters are deemed inappropriate for category names etc, then why is the ticket still open after 10 months?

One word: needs-patch. Every patch provided either fails to fix the issue, or fixes it in an erroneous way, or ends up introducing problems in the tag picker.

#40 @markjaquith
15 years ago

  • Milestone changed from 2.9 to Future Release

needs-patch isn't one word. :-p

Punting to future.

#41 @wojtek.szkutnik
15 years ago

  • Keywords gsoc added

#42 @kevinB
15 years ago

  • Cc kevinB added

#43 @aaronholbrook
12 years ago

  • Resolution set to invalid
  • Status changed from reopened to closed

This no longer appears relevant, the problem as described works as expected.

#44 @dd32
12 years ago

  • Milestone Future Release deleted
Note: See TracTickets for help on using tickets.