Opened 7 years ago

Closed 7 years ago

#3406 closed defect (bug) (wontfix)

Use HTML4 instead of XHTML1

Reported by: TedNelson
Owned by:
Milestone:
Priority: normal
Severity: normal
Version:
Component: Template
Keywords:
Focuses:
Cc:

Description

Please use HTML4 instead of XHTML1 in the output from WordPress blogs. The browser used by the majority of my readers doesn't support XHTML1, and is having to rely on error handling to handle your output.

Change History (14)

comment:1 technosailor 7 years ago

  • Resolution set to wontfix
  • Status changed from new to closed

The problem is that most modern browsers (and thus most WP users) can parse XHTML1, and in fact XHTML1 is the de facto standard on the internet.

Code is Poetry. I think I speak for all WP devs when I say this won't be changed.

comment:2 link92 7 years ago

The whole issue is that it _isn't_ parsed as XHTML 1. It's parsed as HTML.

comment:3 technosailor 7 years ago

Let's take the argument out of the arena of speculation and talk facts. What browser are your users using?

comment:4 link92 7 years ago

It's irrelevant - any UA which follows the HTTP specification will parse it as a text/html document, that is, as an HTML document. For a UA to parse it as XHTML, it MUST be sent as application/xhtml+xml.
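
A minimal sketch of that point, assuming a simplified user agent that picks its parser purely from the Content-Type header. The function name `parser_for` and the short list of XML media types are illustrative assumptions, not taken from any real browser:

```python
# Hypothetical helper: which parser would a spec-following UA pick?
# Assumption: the decision depends only on the media type in Content-Type.
XML_MEDIA_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parser_for(content_type: str) -> str:
    """Return 'xml', 'html', or 'other' for a Content-Type header value."""
    # Drop parameters such as "; charset=UTF-8" and normalise case.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in XML_MEDIA_TYPES:
        return "xml"
    if media_type == "text/html":
        return "html"
    return "other"


# An XHTML 1.0 doctype in the body changes nothing here:
# the same bytes sent as text/html go to the HTML parser.
print(parser_for("text/html; charset=UTF-8"))
print(parser_for("application/xhtml+xml"))
```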

comment:5 Nazgul 7 years ago

Some background info on XHTML browser support:
http://en.wikipedia.org/wiki/XHTML#Adoption

comment:6 Viper007Bond 7 years ago

Themes are independent of WordPress. You can make/use an HTML 4.01 theme without any problems. ;)

comment:7 link92 7 years ago

Viper007Bond, you can't. There are places where " />" is hardcoded (a simple search-and-replace in an output buffer won't help, because it would also rewrite valid attribute values such as title="bleh />").
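
A small sketch of why a blanket replacement is unsafe, assuming the output buffer is post-processed with a plain string replace. The function name `naive_html4` and the sample markup are hypothetical:

```python
def naive_html4(markup: str) -> str:
    """Naively turn XHTML-style ' />' into '>' everywhere in the buffer."""
    return markup.replace(" />", ">")


# The title attribute legitimately contains the characters " />".
sample = '<img src="a.png" title="bleh />" />'
converted = naive_html4(sample)

# The tag ending is fixed, but the attribute value has been altered too.
print(converted)  # <img src="a.png" title="bleh>">
```

The replacement cannot tell a tag ending from the same characters inside a quoted attribute value, so doing this correctly needs something that tracks tag and attribute boundaries rather than raw text.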

comment:8 Viper007Bond 7 years ago

Oh, right, right, generated content. Forgot about that. D'oh!

Well anyway, this is 2006. Time for people to stop using Netscape or whatever as their browser.

comment:9 follow-up: lachlanhunt 7 years ago

The problem is that WordPress does not use a proper XML pipeline to ensure that the content is well-formed. It uses string-based processing to concatenate strings of markup together, regexes to find and replace all sorts of things, and other nasty little hacks all over the place. So much so, that even claiming that WordPress supports XHTML is absolutely bogus - it doesn't and it shouldn't pretend to.

The reality is, even if all browsers started supporting application/xhtml+xml tomorrow and didn't have any major issues with it (like they currently do), it would be impractical to change the MIME type of WordPress from text/html because it only takes one small error for the result to be fatal. I know there are many authors out there that are diligent enough to ensure their own markup is well formed, but for the average user, the tool must do that for them.
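
To illustrate how a single error becomes fatal under XML parsing, here is a sketch using Python's standard library as a stand-in for a browser: `xml.etree.ElementTree` plays the strict XML parser and `html.parser.HTMLParser` plays the forgiving one. The sample markup and helper names are assumptions made for the example:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Mismatched tags: </em> closes before </strong>.
broken = "<p><em>Hello <strong>world</em></strong></p>"
fixed = "<p><em>Hello <strong>world</strong></em></p>"


def is_well_formed(markup: str) -> bool:
    """True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Forgiving parser that simply records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
collector.feed(broken)
collector.close()

print(is_well_formed(broken))  # strict parser rejects the whole document
print(is_well_formed(fixed))
print(collector.tags)          # forgiving parser carries on regardless
```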

It's such a fundamental flaw in the way WordPress has been built that the only sane choice is to give up the myth that WordPress is even capable of producing XHTML; it's a tag-soup CMS only. However, I don't expect this bug to be fixed; I knew it never would be before it was even filed.

comment:10 in reply to: ↑ 9 Viper007Bond 7 years ago

Replying to lachlanhunt:

even claiming that WordPress supports XHTML is absolutely bogus

That's funny seeing as how all my WordPress powered sites are 100% XHTML valid...

And it's not a horrible idea to switch to XML powered themes, although that's definitely WordPress 3.0 material...

comment:11 in reply to: ↑ description JeremyVisser 7 years ago

Replying to TedNelson:

Please use HTML4 instead of XHTML1 in the output from WordPress blogs. The browser used by the majority of my readers doesn't support XHTML1, and is having to rely on error handling to handle your output.

Ted, it's not WordPress' fault if it is causing problems for your viewers. Are the devs going to completely change the default behaviour of the WP core (and thus break most themes) just to stop some issues with your viewers' browsers? Probably not. I hate to be like this, but this sort of stuff has seemed to work with other WP users for years now.

It's not WordPress' fault if they're using crappy non-standards-compliant browsers that don't support XHTML. Why fix it for their sake when browsers that fully support XML parsing already exist? Why not put a Get Firefox or Get Opera link on your blog?

comment:12 follow-up: benjaminhawkeslewis 7 years ago

Viper007Bond:

A couple points about this. First, XHTML conformance is not a matter of mere validity and, second, I suspect Lachlan regards XHTML 1.0 served as text/html as tag soup since it has no specified handling beyond browsers copying each others’ error recovery (see RFC 2854).

I’m going to go out on a limb here and guess that www.viper007bond.com is one of your WordPress sites? One of the things I’ve learned when dealing with tag soup systems is never to judge a site’s validity by its homepage. And sure enough, if we visit your penultimate post we find it fails validation with two errors. If you were serving that page as application/xhtml+xml, so that Firefox used its XHTML parser rather than its tag soup parser, you’d see a yellow screen of death instead of your page because your tags are mismatched.

It doesn’t reflect especially poorly on you that this happens (and I don’t claim to be any paragon of marking up myself). It’s a natural consequence of the fact that WordPress is not designed from the ground-up to emit valid XHTML, but instead to belt out broken markup that mostly renders okay thanks to browsers’ forgiving error handling.

JeremyVisser:

Most (if not all) WordPress blogs would break very visibly, just like Viper007Bond’s, if they were served as application/xhtml+xml, forcing browsers to use XML parsing rather than tag soup parsing. Indeed our ability to serve XHTML as text/html at all depends on browsers being “crappy” and “non-standards-compliant”, since if they had complied with the HTML specification they would interpret minimized XHTML tags ending /> as really ending at the / and print the > as part of the text content, due to the SGML declaration for HTML 4.

comment:13 benjaminhawkeslewis 7 years ago

  • Resolution wontfix deleted
  • Status changed from closed to reopened

Given the understandable concerns about breaking existing templates, how about just giving authors the option of either emitting pseudo-XHTML tag soup or the option of switching to HTML 4 by introducing an HTML 4 mode into WordPress? What exactly would be the downside of that?

Better yet, if you want to step boldly into the XHTML future, you could do it properly by serializing to HTML 4 or XHTML served as application/xhtml+xml depending on the Accept header. I suggest a reasonable method of content negotiation using the Accept header at:

http://lists.w3.org/Archives/Public/www-html/2006Nov/0044.html

At the very least, you should start serving XHTML with the correct MIME type of application/xhtml+xml to supporting user agents. This is what the authors of Appendix C intended, not that you should continue to inflict tag soup on all comers. You'll lose incremental rendering in Firefox, but that's a small price to pay for progress, isn't it? If it forces you to impose better checks on validity and conformance, so much the better.
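
One possible shape for such negotiation, sketched under stated assumptions: this is not the method from the linked message, and the names `parse_accept` and `choose_content_type` are hypothetical. The rule assumed here is deliberately conservative: serve application/xhtml+xml only when the client lists it explicitly with a quality value at least as high as the one it gives text/html, and fall back to text/html otherwise.

```python
from typing import Dict


def parse_accept(header: str) -> Dict[str, float]:
    """Parse an Accept header into {media_type: q}, defaulting q to 1.0."""
    prefs: Dict[str, float] = {}
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_content_type(accept_header: str) -> str:
    """Pick the MIME type to serve; text/html is the safe default."""
    prefs = parse_accept(accept_header)
    # A bare */* does not count as explicit XHTML support.
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", prefs.get("*/*", 0.0))
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"


print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
print(choose_content_type("image/gif, */*"))
```

A response chosen this way should also carry a `Vary: Accept` header so that caches do not hand the XML version to a client that did not ask for it.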

comment:14 in reply to: ↑ 12 Viper007Bond 7 years ago

  • Resolution set to wontfix
  • Status changed from reopened to closed

Something as large as this shouldn't be discussed via Trac. Take it up on wp-hackers.

Replying to benjaminhawkeslewis:
And sure enough, if we visit your penultimate post we find it fails validation with two errors. If you were serving that page as application/xhtml+xml, so that Firefox used its XHTML parser rather than its tag soup parser, you’d see a yellow screen of death instead of your page because your tags are mismatched.

A result of my stupidity / a plugin.
