Ticket #17487 (accepted defect (bug))

Opened 12 months ago

Last modified 6 months ago

Codepage issue with the wp.org Credits API

Reported by: demetris
Owned by: westi
Priority: normal
Milestone: WordPress.org
Component: General
Version:
Severity: normal
Keywords:
Cc: demetris

Description

I was playing with my profile page at wp.org and changed the Name field from:

demetris

to:

demetris (Δημήτρης Κίκιζας)

What the API returns for that is:

demetris (???????? ???????)

It seems the API returns its results in ISO 8859-1. Can we change that to UTF-8?
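
The question marks are what you would expect if the name is forced through a single-byte Latin encoding somewhere along the way. A minimal Python sketch of that effect follows. It assumes the lossy step replaces every unrepresentable character with "?". The real API is PHP and its internals are not shown in this ticket, and `to_latin1_lossy` is a hypothetical name used only for illustration.

```python
def to_latin1_lossy(text: str) -> str:
    """Round-trip text through ISO 8859-1, replacing what does not fit."""
    # Assumption: the lossy step behaves like errors="replace", which
    # substitutes "?" for every character outside ISO 8859-1.
    return text.encode("latin-1", errors="replace").decode("latin-1")


name = "demetris (Δημήτρης Κίκιζας)"
print(to_latin1_lossy(name))  # demetris (???????? ???????)
```

Western European names such as "José" would pass through unchanged, since their characters exist in ISO 8859-1, which may be why the problem is easy to miss.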

Attachments

reemplazos.php Download (8.4 KB) - added by bi0xid 12 months ago.
Replace LATIN1 -> UTF-8

Change History

  • Owner set to westi
  • Status changed from new to assigned
  • Milestone changed from Awaiting Review to WordPress.org
  • Status changed from assigned to accepted

I spent a good amount of time looking into this and it is not an easy fix at the moment.

We have a mix of UTF-8 and latin1 tables on WP.org, and the API is serving data from a latin1 table.

We can revisit this later, but for now I have changed the API to return the user name instead when we detect a name affected by this.
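
The ticket does not say how the detection works, so the following is only a sketch of one possible fallback rule, written in Python rather than the API's own PHP. It assumes a name counts as "affected" when it cannot be represented in ISO 8859-1. `credit_name` and its parameters are hypothetical names.

```python
def credit_name(display_name: str, user_login: str) -> str:
    """Return display_name if it survives ISO 8859-1, else user_login."""
    try:
        display_name.encode("latin-1")
    except UnicodeEncodeError:
        # Hypothetical detection rule: the name would come out mangled,
        # so fall back to the plain user name.
        return user_login
    return display_name


print(credit_name("demetris (Δημήτρης Κίκιζας)", "demetris"))  # demetris
```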

comment:4 follow-up: ↓ 6   bi0xid 12 months ago

I have solved it by changing the tables to UTF-8 and doing a "Search and Replace".

Attached FYI.

Replace LATIN1 -> UTF-8

Could mb_convert_encoding() be used here at all?
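
For reference, a byte-level re-encoding of the kind mb_convert_encoding() performs can repair one common kind of damage: UTF-8 bytes that were read back as ISO 8859-1. The Python sketch below illustrates the idea under that assumption. `repair_mojibake` is a hypothetical helper, not code from this ticket or its attachment.

```python
def repair_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were mistakenly decoded as ISO 8859-1."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage (or already clean): leave it alone.
        return text


print(repair_mojibake("JosÃ©"))             # José
print(repair_mojibake("José"))              # José (already clean)
print(repair_mojibake("???????? ???????"))  # unchanged
```

Note the last case: once characters have been replaced by "?", the original bytes are gone and no re-encoding can bring them back.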

closed #17915 as duplicate

comment:6 in reply to: ↑ 4 ; follow-up: ↓ 7   westi 6 months ago

Replying to bi0xid:

I have solved it by changing the tables to UTF-8 and doing a "Search and Replace".

Attached FYI.

While this might look like it works, it only handles a subset of the common cases and doesn't really resolve the underlying issue.

We have a lot of tables and a lot of data - there have been a lot of accounts created over time on WP.org for the forums and we need to do this in a careful and reliable manner.

This is not the only manifestation of the issue and we need to work carefully to resolve them all - hopefully that will happen soon.

comment:7 in reply to: ↑ 6   bi0xid 6 months ago

Replying to westi:

While this might look like it works, it only handles a subset of the common cases and doesn't really resolve the underlying issue.

I know. It's just a patch for anyone who needs a quick fix while we find a good solution. These are the common cases for Spanish transformations.

Since some bad data was still being returned via the API due to HTML entities, there is now also a remove_accents() call on the names.
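
As a rough illustration of what accent stripping does to such names, here is a Python sketch. It is not WordPress's remove_accents() implementation. It approximates the effect with Unicode decomposition, and it assumes the HTML entities are decoded first. `strip_accents` is a hypothetical name.

```python
import html
import unicodedata


def strip_accents(name: str) -> str:
    """Decode HTML entities, then drop combining accent marks."""
    # Assumption: entities such as &eacute; are decoded before stripping.
    decoded = html.unescape(name)
    # NFKD splits "é" into "e" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", decoded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Jos&eacute; Mu&ntilde;oz"))  # Jose Munoz
```

This kind of transformation only helps Latin-script names. Greek or Cyrillic letters have no ASCII base character to fall back to, so they would still need proper UTF-8 handling.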
