Opened 14 years ago
Closed 12 years ago
#14225 closed defect (bug) (wontfix)
Use NCRs instead of HTML entities in Twenty Ten
Reported by: | peaceablewhale | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 3.0 |
Component: | Bundled Theme | Keywords: | has-patch |
Focuses: | Cc: |
Description
The Twenty Ten theme currently uses HTML named entities to represent some Unicode characters. This breaks the page when it is served as "application/xhtml+xml". To avoid that and make Twenty Ten compatible with both the HTML and XHTML syntaxes of HTML5, numeric character references (NCRs) should be used instead.
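As a rough illustration of the failure mode described here, the following Python sketch feeds three fragments to a plain XML parser with no DTD. The helper name parses_as_xml and the sample strings are invented for this example and are not taken from the theme; browsers use their own XML parsers, so this only approximates what happens under application/xhtml+xml.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup):
    """Return True if a DTD-less XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Undefined named entities are a well-formedness error in XML.
        return False


# Hypothetical fragments, similar in spirit to theme output.
named = "<p>Older posts &raquo;</p>"        # HTML named entity
numeric = "<p>Older posts &#187;</p>"       # numeric character reference
predefined = "<p>Tags &amp; categories</p>" # one of XML's built-in entities

for fragment in (named, numeric, predefined):
    print(fragment, "->", parses_as_xml(fragment))
```

Only the fragment using the named entity should be rejected; the numeric reference and the built-in &amp; need no DTD.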
Attachments (1)
Change History (16)
#3
@
14 years ago
In general, we don't code for serving as application/xhtml+xml, but rather as text/html.
As far as entities are concerned the named entities are fine in XML as long as the parser used can cope with them:
#7
@
14 years ago
To reword what I think the OP meant: as the (expected) HTML5 doctype only supports five named entities, all (other?) named entities should be converted to numerical entities, for anyone who wants to create an off-shoot of TwentyTen that uses the HTML5 doctype.
However, the W3C say that Content SHOULD use the hexadecimal form of character escapes rather than the decimal form when there are both, so rather than converting &larr; to the decimal &#8592; as per the patch, it SHOULD be converted to the hexadecimal &#x2190;, with other entities converted accordingly.
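For reference, the decimal and hexadecimal forms of a named entity can be derived from its code point. This is a small Python sketch using the standard library's HTML 4.01 entity table; the function name ncr_forms is made up for illustration and is unrelated to the attached patch.

```python
from html import unescape
from html.entities import name2codepoint


def ncr_forms(name):
    """Return the (decimal, hexadecimal) NCRs for an HTML 4.01 entity name."""
    code_point = name2codepoint[name]
    return f"&#{code_point};", f"&#x{code_point:X};"


decimal, hexadecimal = ncr_forms("larr")
print(decimal, hexadecimal)

# Both forms decode to the same character as the named entity.
print(unescape(decimal) == unescape(hexadecimal) == unescape("&larr;"))
```

Both references identify the same character; the hexadecimal one simply matches the U+2190 notation used in Unicode charts.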
#8
@
14 years ago
To correct myself:
In HTML5, parsed as text/html, all named entities are predefined and valid.
However, just as XHTML 1.0 Strict MAY be (and usually is) parsed as text/html and not application/xhtml+xml, it's possible to write HTML5 in a polyglot form, such that should it be parsed with an XML parser (as application/xhtml+xml) it would be valid XHTML5.
There is no formal DTD for XHTML5, and although you could provide reference to an external DTD for adding named entities, browsers do not universally make use of them for their parsers, meaning it's basically not an option, as, say, &middot;, &hellip; or &raquo; may not be recognised.
The recommendation by the WHATWG for producing HTML5 documents capable of being parsed as XML for XHTML5 is to use numerical entities, except for the 5 implicit named entities that are safe (&amp;, &lt;, &gt;, &quot; and &apos;).
Changing all named entities to their hexadecimal equivalents across all of core, not just Twenty Ten, has no negative impacts (save .po strings changing), as browsers back to at least IE5.5 (and maybe earlier) cope with hexadecimal character references fine. In the meantime, it's a future-proofing fix that will greatly aid those wanting to output their sites as application/xhtml+xml, without having to raise individual issues such as #16049!
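A conversion along these lines could be sketched as follows in Python. This is only an illustration of the idea, not the attached patch: the names to_hex_ncrs, XML_PREDEFINED and ENTITY_RE are hypothetical, the entity table is the standard library's HTML 4.01 list, and the regex does not try to skip scripts, comments or CDATA sections.

```python
import re
from html.entities import name2codepoint

# The five entities an XML parser understands without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches references such as &raquo; but not numeric ones like &#187;.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_hex_ncrs(markup):
    """Replace known named entities with hexadecimal NCRs."""

    def replace(match):
        name = match.group(1)
        # Leave the XML-safe entities and any unknown names untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return f"&#x{name2codepoint[name]:X};"

    return ENTITY_RE.sub(replace, markup)


sample = "&laquo; Older &amp; newer &raquo; &hellip; &middot;"
print(to_hex_ncrs(sample))
```

Leaving unknown names untouched is a deliberate choice in this sketch, so that a typo in an entity name stays visible rather than being silently altered.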
Why does this break the page when serving as XHTML?
AFAIK XHTML supports the same named entities as HTML4, which would include &raquo;: http://www.w3.org/TR/html401/sgml/entities.html
Changing this makes it less clear what is being done.