Opened 19 years ago

Closed 19 years ago

#1180 closed defect (bug) (fixed)

special character reduction on category names misses sharp-s

Reported by: gbhugo Owned by: ryan
Milestone: Priority: normal
Severity: minor Version: 1.5
Component: Administration Keywords:
Focuses: Cc:


The szlig char "ß" used in German isn't reduced to its base form "s" when used in a category name, but is kept in its UTF-8 state. This breaks category names for URLs etc.

Change History (5)

#1 @gbhugo
19 years ago

  • Patch set to No

#2 @ryan
19 years ago

If you are using UTF-8, the category slug should contain %c3%9f in place of ß. The URI should not break.
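
For illustration only, here is a minimal Python sketch of the percent-encoding described above. It is not the WordPress code; the function name `percent_encode_utf8` and the choice of which ASCII characters pass through unchanged are assumptions made for this example.

```python
def percent_encode_utf8(text: str) -> str:
    """Percent-encode every character outside a small ASCII slug alphabet.

    Hypothetical helper for illustration; the set of characters left
    unencoded (ASCII letters, digits, '-' and '_') is an assumption.
    """
    out = []
    for ch in text:
        if ch.isascii() and (ch.isalnum() or ch in "-_"):
            # Plain slug characters are kept as they are.
            out.append(ch)
        else:
            # Everything else becomes one %xx group per UTF-8 octet.
            out.extend(f"%{octet:02x}" for octet in ch.encode("utf-8"))
    return "".join(out)


print(percent_encode_utf8("ß"))
```

Under these assumptions, "ß" encodes to two octets, 0xC3 and 0x9F, which gives the `%c3%9f` form mentioned in the comment.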

#3 @gbhugo
19 years ago

It does - but shouldn't it degrade to "s", much like "ä" degrades to "a"? I thought that "ß" would be reduced to the base character in the same way as accented and dotted vowels are.

The main problem with the %-encoding in URIs is that those URIs break when used in JavaScript in many browsers. Ok, this is a bug in those browsers, but I thought this could be solved in the same elegant way as the same problem with the accented chars :-)

#4 @ryan
19 years ago

Yes, degrading it is fine. I just wanted to make sure the UTF-8 octet generation code wasn't also broken.

ß wasn't initially added since it's not technically an accented letter with a canonical decomposition that reduces it to a Basic Latin letter plus an accent. According to the UCD, ß is a ligature of U+017F LATIN SMALL LETTER LONG S and U+0073 LATIN SMALL LETTER S. I don't always know what the accepted decomposition is for a letter that doesn't have a canonical decomp listed in the UCD, so I skip it until someone sets me straight.

So, if "s" is an accepted decomposition, let's go ahead and add it.
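
A rough Python sketch of the distinction, for illustration only and not the WordPress implementation: accented letters can be reduced by dropping the combining marks from their canonical decomposition, while "ß" has no decomposition in the UCD and needs an explicit table entry. The names `EXTRA_FOLDS` and `reduce_to_base` are hypothetical, and mapping "ß" to "s" simply follows the proposal in this ticket.

```python
import unicodedata

# Explicit folds for letters without a canonical decomposition.
# Assumption: "ß" maps to "s", as proposed in this ticket.
EXTRA_FOLDS = {"\u00df": "s"}


def reduce_to_base(text: str) -> str:
    """Reduce accented letters to their base letters (hypothetical helper)."""
    out = []
    for ch in text:
        if ch in EXTRA_FOLDS:
            out.append(EXTRA_FOLDS[ch])
            continue
        # NFD splits e.g. "ä" into "a" plus U+0308 COMBINING DIAERESIS;
        # dropping the combining marks leaves the base letter.
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


# "ß" has an empty decomposition field, so normalization alone leaves it as is.
print(repr(unicodedata.decomposition("\u00df")))
print(reduce_to_base("Gr\u00f6\u00dfe"))
```

Without the `EXTRA_FOLDS` entry, "ß" would pass through unchanged, which matches the behaviour reported here.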

#5 @ryan
19 years ago

  • fixed_in_version set to 1.5.1
  • Owner changed from anonymous to rboren
  • Resolution changed from 10 to 20
  • Status changed from new to closed