#8446 closed defect (bug) (fixed)
Using Slug-Data in CSS Classes does not work
Reported by: | lilyfan | Owned by: | westi |
---|---|---|---|
Milestone: | 2.8.1 | Priority: | high |
Severity: | major | Version: | 2.7 |
Component: | I18N | Keywords: | has-patch tested commit developer-feedback |
Focuses: | Cc: |
Description
post_class() function outputs category/tag specific classes using its slug (introduced at [8641]).
But, allowed characters are different between slug and css class name.
(css class name can use [-_a-zA-Z0-9] and non ascii unicod chars)
A "%" character is allowed for slug, but not for css name.
Therefore, invalid class name will be created.
Eg: A user creates "携帯" category (Japanese word for "mobile"), the category slug will be "%e6%90%ba%e5%b8%af" and post_class() outputs 'class="post hentry category-%e6%90%ba%e5%b8%af"'.
This is an invalid class name.
Avoid this situation, escaping "%" character with backslash "\": category-\%e6\%90\%ba\%e5\%b8\%af"
This class name is valid. But this seems dirty.
I suggest using class ID insted of class slug.
Attachments (7)
Change History (84)
#1
follow-up:
↓ 2
@
16 years ago
Named classes are easier to style against than id based ones as they are blog install independent
I suspect we should be attribute_escaping the data though so as to ensure the html is valid
#2
in reply to:
↑ 1
@
16 years ago
Replying to westi:
I suspect we should be attribute_escaping the data though so as to ensure the html is valid
Scratch that comment .. it already is valid.
#3
@
16 years ago
- Component changed from General to Template
- Milestone changed from 2.7 to 2.8
- Severity changed from blocker to major
Unicode is valid, so why not urldecode() then escape with attribute_escape() ? Punting to 2.8.
#4
follow-up:
↓ 5
@
16 years ago
- Milestone changed from 2.8 to 2.7.1
- Severity changed from major to blocker
Named classes are easier to style against than id
I agree this, but I think that avoiding invalid css class name is prior than easy-to-use.
Current WP 2.7 RC2 creats invalid css names at non-ascii languages such as Japanese, Chinese.
This is a blocker ticket to relase 2.7!!
Unicode is valid, so why not urldecode() then escape with attribute_escape()
attribute_escape() does NOT escape '%' sign. This solution is not good for this problem.
A new function to escape for css name is needed.
#5
in reply to:
↑ 4
@
16 years ago
Replying to lilyfan:
Unicode is valid, so why not urldecode() then escape with attribute_escape()
attribute_escape() does NOT escape '%' sign. This solution is not good for this problem.
A new function to escape for css name is needed.
Yes but you wouldn't have any %'s as it would have been urldecoded first.
#7
@
16 years ago
i'd like to see both in: ids and slugs.
i doubt that category-\%e6\%90\%ba\%e5\%b8\%af is a valid class name in or at least it is as valid as category-%e6%90%ba%e5%b8%af, because % must be escaped with their UNICODE hex representation that would be escaped with
then but (oh my gosh), must be followed by a space, tab linebreak and the such. this does all not sound practicable to me, because it will actually break the class identifier then.
anyway, the suggested is misleading. for classnames, unicode letters can be used. that is no problem. it looks like the $cat->slug contains an url-encoded slug that is not valid UTF-8 any longer (as in the meaning of a classname).
since the slug comes out of the database, the problem already resides in the database, that is a local install.
for current bleeding, i see no escape routines in any way. so i see no need for a patch. what must be patched then is the routine, that stores the slug in the database, that is the category editor in the ACP. but this will be a larger patchset, because it means that slugs can be stored in unicode more generously and (potentially) must be properly url-encoded later on. who knows on how many places (!).
in this case it looks like the user did enter a specific slug, to get a nice URL. therefore, a plugin that hooks into post_class could do the job. since this is not unicode (as far as I assume, i do not write japanese), i have not idea on how actually to encode the string back to a readable representation. but it can be done with that hook.
@
16 years ago
extended patch, incl. a new function to filter classname(s), adding valid classes for tags and categories, ids preserved in their own classes.
#9
@
16 years ago
after reading into IRIs and CSS specs, seeing the tests i might be possible to use the IRI encoded slugs, convert them into UTF-8 with php (4/5) and then use the unicode representation with all control characters xml-encoded (removed?) and havin ' and " xml encoded.
#10
@
15 years ago
I am currently working on a new patch that will properly filter utf-8 classnames. it takes some time to research and actually code it, because the pcre-unicode support is not really helping here and it is not compatible with php 4.3 at all.
this patch should enable users to use IRI encoded slugs in their UTF-8 representation as classnames.
#11
follow-up:
↓ 13
@
15 years ago
- Component changed from Template to i18n
- Milestone changed from 2.7.2 to 2.8
- Owner changed from anonymous to nbachiyski
probably related: #3727
#12
@
15 years ago
New Patch added. clean_css_classnames now fully handles utf8 encoded strings as slugs. I have not checked with non latin1-existing chars from the utf8 range, but will give a try. there might be the need to urldecode $tag->slug first, but this can be applied fast.
#13
in reply to:
↑ 11
@
15 years ago
Replying to Denis-de-Bernardy:
probably related: #3727
Don't think so. With that Bug a clear statement from core-devs is just missing what MUST be done on that part of code. Not related to CSS classnames.
#14
@
15 years ago
Fixed the patch to handle IRI encoded Slugs (see comment 12). I have tested this now with the japanese Word for mobile "携帯" and it works really nice. Now Classnames can be used properly in the UTF-8 space, meaning you can create semantics in your native language. this does work for most languages, because wordpress repplaces some latin1 chars like ä ü ö with a u o, you can not use propper semantics for german for example. but japanese and other non-latin1-supported languages should work flawlessly.
#16
@
15 years ago
I did some testing. Added CSS to my theme that contains the japanese mobile category:
/* special markers */ .category-携帯 h2 { background-color:#000; color:#fff; } .category-携帯 h2 a { color:#fff !important; }
and then opened an article from that category. the headline of a post is inverted then (white on black instead of black on white).
firefox 3, chrome 1, opera 9 and ie 6.
#18
follow-up:
↓ 21
@
15 years ago
That's a lot of work for a class. Why do we need split_utf8()? It creates and returns an array that is ignored by seems_utf8(), it's only consumer.
#20
@
15 years ago
thanks for taking the time to take a look into it.
yes we need split_utf8(), it is not only used in seems_utf8():
$partsUtf8 = split_utf8($class);
it is used in seems_utf8() to not have code-duplication. you can see that a lot of the code from both functions is the same (well seems_utf8() is not that improved as split_utf8() but both have the same logic).
split_utf8() is used to validate single byte chars to macht the CSS definition since multi byte utf8 sequnces are per-se valid for the css classname. The following fragement shows that:
104 $partsUtf8 = split_utf8($class); 105 $count = count($partsUtf8); 106 if (is_array($partsUtf8) && $count != strlen($class)) { // has multi byte chars 107 108 // filter invalid chars per entity, only single chars need to me checked 109 for ($i=0; $i < $count; $i++) { 110 if (strlen($partsUtf8[$i]) == 1) { 111 $partsUtf8[$i] = preg_replace('|[^_a-zA-Z0-9\0-\177-]|', '_', $partsUtf8[$i]); 112 } 113 } 114 $class = implode($partsUtf8); // rebuild class string 115 }
#21
in reply to:
↑ 18
@
15 years ago
Replying to ryan:
That's a lot of work for a class. Why do we need split_utf8()? It creates and returns an array that is ignored by seems_utf8(), it's only consumer.
Thats only for not having code duplication. both functions can contain the same code having seems_utf8() return a true at the end while split_utf8() already has the splitted byte squences.
split_utf8() is then used later on, seems you have overlooked that.
#22
@
15 years ago
this one might be a good candidate for 2.9 early, with a patch that addresses all of the potential class name problems (the body tags' class, the post's class, probably some category/tag/post/page classes in various template tags, etc.)
#23
@
15 years ago
I think I'd rather just lose all slugs in classes. seems_utf8() is slow. I'd rather not use it every time body_class(), post_class(), comment_class(), etc. run.
#25
follow-up:
↓ 26
@
15 years ago
- Milestone changed from 2.8 to 2.9
in this case we can safely move to 2.9. the feature is seldom used anyway, and it only annoys users who worry about their site outputting valid xhtml code.
#26
in reply to:
↑ 25
;
follow-up:
↓ 28
@
15 years ago
- Milestone changed from 2.9 to 2.8
Replying to Denis-de-Bernardy:
[...] the feature is seldom used anyway, and it only annoys users who worry about their site outputting valid xhtml code.
the feature used is called CSS and it is used in a thousand themes. especially the current theme projects (modular theming solutions, frameworks etc.) are heavily using CSS class names.
this is _not_ about valid xhtml code.
this is about having slugs of posts as css classes properly utf-8 encoded. so if your language is not within the ascii 7bit range you can properly use semantic CSS files. that is mostly important for languages beyond latin-1. infact, latin-1 languages are really degraded in that case by wordpress.
because your decision is based on misleading stuff (seldom use, xhtml validity), i reverted it.
#27
@
15 years ago
+1 for having that in for 2.8 and then taking a look if it is properly working. I do not think it really is slow. to prevent your fears, seems_utf8() can be left as it is in current trunk (without any optimizations - i've said nothing) and split_utf8() is only used by clean_css_classnames() then. sounds promising?
#28
in reply to:
↑ 26
@
15 years ago
Replying to hakre:
your decision is based on misleading stuff (seldom use, xhtml validity)
It's only ever used by authors who customize their CSS in the first place, since it's based on the slug. I know this for a fact, since I've several plugins that output css classes based on a sanitized version of the title (essentially, s/a-z0-9i). For what it's worth, I've a meaningful number of customers as a sample, many of which customize some CSS here and there... They never use these classes in practice. :-)
I'm 100% for switching to IDs instead, personally.
#30
in reply to:
↑ 29
@
15 years ago
Replying to hakre:
Let me guess, most of them are from the western world.
true. but that has nothing to do with the fact that they're not using the feature in the first place. ;-)
#31
follow-up:
↓ 32
@
15 years ago
I have problems with making assumptions based on own criteria. Even if only 1% of WordPress users are actually Authors of their CSS, well, ... .
As this was requested in trac I took quite some time and analysis to find a good solution. The patch provides correctly encoded classnames based on WP Post Slugs. This really comes in handy. It provides IDs as well. This a solution that works very broad and is standard compliant.
As for the generality of a clean_css_classnames() function, it must deal with all possibility according to the definitions. split_utf8() is a helper function sothat clean_css_classnames() can do its job.
you can for shure in contrast reduce wordpress to ascii 7bit. but i assume, this is not intended by the wordpress core devs. so if the way leads to utf-8 and you need a function to clean css classnames, there it is. or should this run into the same dilemma as with clean_url()?.
#32
in reply to:
↑ 31
@
15 years ago
I think you're misunderstanding Ryan's comment (and mine, since I'm on the same wavelength as he is on this point): better a fast site with IDs used than a slow one with valid string class names.
#33
@
15 years ago
then you need to reduce the classes of tags etc. to ids as well. the clean_css_classnames() function is for general usage. if you only built them upon numerical values, there must be no need to validate against utf-8 mutlibyte sequences.
#35
@
15 years ago
I'll leave it up to you to bump to 2.9 or future. It seems clear to me that Ryan won't commit the current solution -- he'd rather have a fast blog, and rightly so.
#36
@
15 years ago
- Milestone changed from 2.8 to 2.9
My suggestion: Fix for 2.8 is to replace the slugs with IDs.
Additionally for 2.8 I make a patch of the seems_utf8() function for improvements.
For 2.9 or later a new ticket can be opened.
#37
@
15 years ago
- Keywords has-patch 2nd-opinion added; needs-patch removed
seems_utf8() improvements have their own ticket: #9692
#40
@
15 years ago
- Milestone changed from 2.8 to 2.9
The problem for me of switching to ids from slugs is that IDs are fragile and slugs are not.
If someone has safe slugs then they should be in the classes so that themes can be written against them.
If we need an extra column in the db to store the calculated safe slug for css class names then so be it but putting the IDs in just encourages people to code against them and they are not guaranteed to be consistent accross multiple installs where you might want to use the same css.
#41
@
15 years ago
the suggestion to use ids here was a compromise to find a solution to get the bug out. it is cheap to achieve but i must admit, it is not as nice as having the slugs data.
if weste see's another way of having the slugs working, i am open to go that route. if this is a feature for 2.9 then it needs a new ticket. because this one is reporting a bug with the current 2.7 implementation which should be fixed in 2.8 because the current slug implementation for css classes just does not work properly.
#42
@
15 years ago
- Milestone changed from 2.9 to 2.8
I need to put this on the 2.8 list for having the actual bug removed. The bug is:
using slugs data as part of css classnames does not work.
#45
@
15 years ago
- Summary changed from post_class() outputs invalid css class to Using Slug-Data in CSS Classes does not work
I will take a look on the two and might extend the patch.
#46
@
15 years ago
there also might be some kind of get_term_class(), too (not entirely sure, of its existence or its name).
#48
@
15 years ago
Trio of get_comment_class, get_post_class and get_body_class fixed for accidently using incompatbile values in css-class-names.
Additionally some of the new functions also benefits from the touchup to not mix multiple class names in one array entry. see #9465 and [10876]. Some more of those fixed now in this patch.
#51
@
15 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
This changeset has introduced a PHP notice that prevents IIS from loading wp-admin when debug is enabled.
PHP Notice: Undefined property: stdClass::$user_id in E:\webpage\svn\trunk\wp-includes\comment-template.php on line 297
I'm not sure, but possibly $classes[] = 'comment-author-' . $user->user_id; should be $classes[] = 'comment-author-' . $comment->user_id;
Attaching patch to that effect.
#53
@
15 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
I'd really like to get the slugs back into core.. Using/exposing ID's is plain stupid.
Not only are using id's deprecated (In the sense that the UI doesnt expose it), but that they're also fragile like westi said.
I'd like to see either:
- "Unsafe" slugs added back
- non-english users will just have to deal with a defect..
- safe slugs back in
if ( !preg_match('|^[a-z0-9]|i', slug) ) $class[] = $slug;
#54
@
15 years ago
Fine, patch it up. Don't forget to allow dashes. Might be better to use ID if slug contains '%' and slug otherwise.
#55
@
15 years ago
- Owner changed from nbachiyski to westi
- Status changed from reopened to accepted
Working on this now.
As I said above, using IDs is pretty useless when you want your theme to be portable between installs.
I think we need a sanitisation function that gets it right for ascii slugs and then allow plugins to filter them until we get a better performant solution in the future.
#57
@
15 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
could we rename that to esc_css_class or even esc_css? or at least sanitize_css_classname (without a typo :-))?
#58
@
15 years ago
- Resolution set to fixed
- Status changed from reopened to closed
I see no typo ;-)
I would like to keep it as sanitise really rather than escape because it is santisation not escaping..
I don't think we need it really short as it's not gonna be used all over the place like those other ones are.
#59
@
15 years ago
It's not a typo in the UK. :-)
But to be consistent with "sanitize" elsewhere in WP, it should probably use a "z."
#60
@
15 years ago
err... http://www.google.com/search?q=sanitise -- even the UK folks agree that it's with a z.
#61
follow-up:
↓ 63
@
15 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
the reasoning behind esc_css is that it could also be used to sanitize (with a "z") a css ID.
#62
follow-up:
↓ 64
@
15 years ago
- Cc dkikizas@… added
This may very well be a stupid question, but I’ll ask it.
Does the CSS class name have to come either from the slug or from the term ID?
Why not get it from the category/tag *name*?
#63
in reply to:
↑ 61
@
15 years ago
Replying to Denis-de-Bernardy:
the reasoning behind esc_css is that it could also be used to sanitize (with a "z") a css ID.
It should not be used for a ID as ID's have different rules from classes.
ID sanitisation would need to be much tighter than this can be.
Working on a better name and more documentation improvements as I write this.
#64
in reply to:
↑ 62
;
follow-up:
↓ 66
@
15 years ago
Replying to demetris:
This may very well be a stupid question, but I’ll ask it.
Does the CSS class name have to come either from the slug or from the term ID?
Why not get it from the category/tag *name*?
For now the slugs are a good source for this data because they are close to the sanitised value as an endpoint as they have been reduced down to ascii + %encoding of other chars. We can easily strip out the %encoded ones and return a simple ascii version for a good 80% of names.
When we have a better solution for speedily producing names in other character sets we can revisit the source data for the sanitisation.
#66
in reply to:
↑ 64
;
follow-up:
↓ 68
@
15 years ago
Replying to westi:
Replying to demetris:
This may very well be a stupid question, but I’ll ask it.
Does the CSS class name have to come either from the slug or from the term ID?
Why not get it from the category/tag *name*?
For now the slugs are a good source for this data because they are close to the sanitised value as an endpoint as they have been reduced down to ascii + %encoding of other chars. We can easily strip out the %encoded ones and return a simple ascii version for a good 80% of names.
When we have a better solution for speedily producing names in other character sets we can revisit the source data for the sanitisation.
Aren’t names, tag names for example, already sanitized?
I’m insisting just to make sure we don’t miss something obvious.
Before asking I did a quick test:
- I removed the new sanitizing function.
- I replaced
$tags->slug
with$tags->name
.
Result:
I got nice UTF-8 CSS class names in my HTML source, in all the scripts of the world, and then I used the UTF-8 class names in my stylesheet and everything worked OK. (A few replacements will be needed, obviously, e.g. replacing spaces with hyphens for the CSS class names.)
If this works with at least UTF-8, then we have UTF-8 and its perfect subset ASCII already covered, which is a huge improvement over what WP2.7 did.
#67
@
15 years ago
Not changing this any more for 2.8
Please raise a new ticket for enhancements over the current behaviour.
Fix for that issue needs to cover all charsets to be safe.
#68
in reply to:
↑ 66
@
15 years ago
Replying to demetris:
Aren’t names, tag names for example, already sanitized?
I’m insisting just to make sure we don’t miss something obvious.
see the ticket's description for a counter example.
#69
@
15 years ago
@westi:
I did not say this should be changed any more for 2.8.
Also, your current patch with the sanitizer is clearly better that what 2.7 did (littering the document with unusable class names).
However, while offering an option that only works with ASCII characters might be acceptable for Sandbox, which is just a third-party theme, it should be unacceptable for the core of WordPress.
So, this should not be viewed as something that can be enhanced, but as a bug that should be fixed.
@Denis:
Up to when I asked my question, there were two alternatives discussed:
- Getting class names from term slugs (what Sandbox and WP 2.7 do)
- Getting class names from term IDs (what lilyfan suggested)
I asked about the feasibility of a *third* option:
- Getting class names from term *names*
#70
@
15 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
If you are really after a solution (not creating more and more defects) you can find a lot of information here:
http://codex.wordpress.org/User:Hakre/slugs_to_css_classnames (Slugs to CSS Classnames).
Within one of my previous patches you find a function that properly handles UTF-8 css classnames. That is working perfectly for any language (!). The Argument against using the patch was the overhead of an additional function.
Now an additional function has been added for filtering / hooking. So the argumentation starts to get looped.
If you are after solving this problem without any defects, please consider to look into one of my previous patches.
Using the ID Values (numerical) was getting the bug (it's not a defect, don't be so ASCII 7bit centric and even there, it had flaws) out. If the old code would have considered the standards, we would not have this anyway.
So after all this discussion, i must add some more input: Consider having both in: ID and Slugs, the IDs marked as such as well as the slugs properly written (means: UTF-8 encoded).
If you feel that my code is an overhead, then consider putting the functions in sothat other can use it in case this comes in handy. That is valid for the "this thene does it that way"-argument.
As said, I put a lot of effort into it to really solve that problem. ID is an easy solution with no overhead. But if anyone wants to have Slugs in as well then it should done right.
Imagine: If you filter slugs and the full slug is iri-encoded, duplicates are possible. With IDs you find at least a workaround. That comes in handy in the concrete installation.
As well as IDs, Slugs are still user generated values. The argumentation that Slugs will work better is a limited argument. Anyway it is a valid one but only if Slugs are transposed the right way into XHTML/UTF-8.
No-one needs a buggy implementation with defects. My 2 cents.
#71
@
15 years ago
Now an additional function has been added for filtering / hooking. So the argumentation starts to get looped.
There wasn't any issue with adding an extra function. The argument was that the functions you used/introduced were expensive functions (timewise or memory usage).
I'm all for supporting internationals, but not at a huge expense to english users, Internationalisation will always use CPU cycles, the aim is to reduce the impact.
#72
@
15 years ago
- Resolution set to fixed
- Status changed from reopened to closed
As dd32 has said the issue was with the expense of the functions you were using to achieve the solution.
As I believe I said earlier in this ticket for full international support we may need to do some caching of the safe value as an extra column in the db.
For 2.8 this issue is closed and please do not re-open this ticket.
Please open a new ticket for improvements on this solution.
#73
@
15 years ago
what about having IDs in? 'tag-id-' . $id and the like? then it will be possible to improve the functionality in 2.8 later on more flawlessly. IDs won't change (and solve for those who have currently problems without _any_ overhead for english (as argumented) and the then overhead currently is not me to blame because it is not my func :).
#75
@
15 years ago
- Cc westonruter added
- Milestone changed from 2.8 to 2.8.1
- Resolution fixed deleted
- Status changed from closed to reopened
Two if
statements are missing curly braces in the patch: http://core.trac.wordpress.org/attachment/ticket/8446/8446.2.patch
See patch: http://core.trac.wordpress.org/attachment/ticket/8446/post-template.php.patch
This is in the get_body_class()
function.
Use category-100 instead of category-myslug in get_post_class(). (Ditto for tags.)