Make WordPress Core

Opened 13 years ago

Closed 8 years ago

Last modified 8 years ago

#16282 closed defect (bug) (wontfix)

PHP catchable error with get_term_link and WP3.1RC2

Reported by: illutic's profile illutic Owned by:
Milestone: Priority: normal
Severity: normal Version: 3.1
Component: Taxonomy Keywords: has-patch needs-unit-tests
Focuses: multisite Cc:

Description

I recently updated my WP Network to 3.1RC2, and suddenly the get_term_link() I was using in the footer gave me this error:

PHP Catchable fatal error: Object of class WP_Error could not be converted to string

It was working fine in 3.0.4

Attachments (7)

16282.patch (553 bytes) - added by SergeyBiryukov 13 years ago.
16282.2.patch (712 bytes) - added by scribu 13 years ago.
Clarify first parameter of get_term_link()
16282.3.patch (939 bytes) - added by scribu 13 years ago.
Further doc refinements
strict.16282.diff (1.3 KB) - added by scribu 13 years ago.
Only allow exact slug matches (untested)
both.16282.diff (641 bytes) - added by scribu 13 years ago.
both.16282.2.diff (775 bytes) - added by scribu 13 years ago.
Actually works
16282.diff (996 bytes) - added by dd32 13 years ago.
POC of querying both old and new forms of the slug

Download all attachments as: .zip

Change History (74)

#1 @SergeyBiryukov
13 years ago

It's not mentioned neither in Codex nor in the inline documentation that get_term_link() can return WP_Error instead of a string. Should we update the docs or change the function?

#2 @nacin
13 years ago

  • Milestone changed from Awaiting Review to 3.1

Per my comment in #15449, I had investigated and found that this function could have returned a WP_Error previously. This is now reflected in the docs in get_category_link() and get_tag_link().

If that is indeed accurate for get_term_link(), then I'm fine with leaving this as is. Though based on the report, this is concerning.

Moving to 3.1 for investigation.

#3 @SergeyBiryukov
13 years ago

Added patch for inline docs.

get_post_format_link() seems a bit inconsistent. If a format does not exist, it returns false.

#4 @nacin
13 years ago

  • Keywords close added; 2nd-opinion removed

I went to go find the other ticket I recently saw on this, but then I realized it was this ticket.

So:

  • Fix the inline docs. get_term_link() has been able to return WP_Error for quite a while.
  • The reporter may wish to look into what the WP_Error is saying, and why the term no longer exists (presumably). Does not look like a core issue.
  • I kind of like get_post_format_link() avoiding WP_Error. That's because it might be used in a theme for one of the defined formats, but it might not yet exist as a term since no posts have been assigned to it. Easier to return nothing rather than fatal. It also doesn't take a term, but rather a specific format -- so it could be considered at a higher level in the API.

Therefore:

  • I'm committing the patch here and suggest we close as fixed.

#5 @nacin
13 years ago

(In [17341]) Correct the @return for get_term_link. It always has been able to return WP_Error. props SergeyBiryukov, see #16282.

#6 @scribu
13 years ago

  • Resolution set to fixed
  • Status changed from new to closed

Agreed.

#7 @illutic
13 years ago

  • Keywords needs-patch added; close removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

As I discussed with nacin on twitter: this issue is not resolved yet, so reopening ticket.

WP_Error returns 'Empty term' with get_term_link() and I'm 100% sure the term does exist. Downgrading to 3.0.4 made all the links reappear too.

The patch that was submitted doesn't work (currently install is on 3.1RC3 and get_term_link still doesn't work).

#8 @SergeyBiryukov
13 years ago

  • Keywords reporter-feedback added

The patch just clarifies inline docs for the function.

Not sure how to reproduce the issue. Exact steps would be helpful.

#9 @SergeyBiryukov
13 years ago

Might be related to #13081.

#10 @illutic
13 years ago

  • Keywords dev-feedback added; reporter-feedback removed

All I did was use get_term_link inside the footer of my theme, ie: get_term_link('Overijssel','provincie')

#11 @nacin
13 years ago

Agreed with re-opening for investigation. I feel something's going on here.

#12 @scribu
13 years ago

The problem comes most likely comes from get_term_by().

illutic, it would be really helpful in this case if you would let one of us do some debugging on your setup.

My email is mail @ scribu.net

Last edited 13 years ago by scribu (previous) (diff)

#13 @SergeyBiryukov
13 years ago

What code do you use for registering the provincie taxonomy?

#14 @illutic
13 years ago

@SergeyBiryukov, I'm using the CustomPress plugin for it (http://premium.wpmudev.org/project/custompress).

@scribu, no problem. I'll email you the info right away.

#15 @scribu
13 years ago

The problem was caused by the new sanitize_title_for_query().

I replaced this:

echo get_term_link( __( 'België', 'wpo' ),'provincie' );

with this:

echo get_term_link( get_term_by( 'name', __( 'België', 'wpo' ), 'provincie' ) );

and it works.

The __() is used in order for the WPML plugin to work.

Even if sanitize_title_for_query() weren't introduced, get_term_by('slug', ...) could still return incorrect results, if the slug was modified manually, for example.

Suggest closing as 'worksforme'.

#16 @scribu
13 years ago

Or 'invalid', or 'fixed' or whatever. :)

#17 @scribu
13 years ago

  • Keywords close added; needs-patch dev-feedback removed

@scribu
13 years ago

Clarify first parameter of get_term_link()

#18 @scribu
13 years ago

  • Keywords has-patch added

I suppose further clarification of the docs on get_term_link() wouldn't hurt. See 16282.2.patch.

@scribu
13 years ago

Further doc refinements

#19 @scribu
13 years ago

  • Keywords commit added; close removed

#20 follow-up: @illutic
13 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

Odd, I alsy tried it with using the slug (as below), but that gave me the same results.

echo get_term_link('belgie','provincie');

Also tried it with Nederland but same results..

Oh well, I'm happy it's resolved and it works now :)

#21 @scribu
13 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Let's update the documentation so that other people don't fall into this trap.

#22 @nacin
13 years ago

Are we looking at a backwards incompatible change with sanitize_title_for_query?

#23 @scribu
13 years ago

No, as explained, get_term_link() only supports passing an actual term slug, not a term name.

The fact that afterwards get_term_by('slug', ...) calls sanitize_title_for_query() internally is a different matter. I would say it causes more harm than good.

#24 in reply to: ↑ 20 @scribu
13 years ago

Replying to illutic:

Odd, I alsy tried it with using the slug (as below), but that gave me the same results.

echo get_term_link('belgie','provincie');

I would wager that has something to do with the way the WPML plugin handles terms.

#25 @SergeyBiryukov
13 years ago

Would it make any sense to get a term by name if it was not found by slug? If it was working in 3.0, I suspect more people can encounter this issue after upgrading to 3.1.

#26 follow-up: @scribu
13 years ago

If it was working, it was working by accident. We should discourage passing a name to get_term_link().

I would go even one step further and not call sanitize_title_for_query() at all. That way, at least it would fail consistently.

@scribu
13 years ago

Only allow exact slug matches (untested)

#27 in reply to: ↑ 26 @nacin
13 years ago

Replying to scribu:

If it was working, it was working by accident. We should discourage passing a name to get_term_link().

I would go even one step further and not call sanitize_title_for_query() at all. That way, at least it would fail consistently.

That would be even worse for back compat, no?

#28 @scribu
13 years ago

Sadly yes, plus I also found get_term_by() is still used in WP_Query.

Hence, I maintain we go with 16282.3.patch.

#29 @dd32
13 years ago

My vote is striping out sanitize_title_for_query() from get_term_by(). In this case it's encoding the characters instead of converting them to what we use in slugs (ie. 'België' to 'belgi%c3%ab' )

Ideally, sanitize_title_for_query() should be replaced with sanitize_title() in that spot, This is the same function which sanitizes Term names into Term Slugs, so it'll have the desired effect of converting 'België' to 'belgie'.

This doesn't change the fact the example code was 'broken' to start with, as if the slug differs from the sanitized-title it'll not work, But this is a regression with a simple and (IMO, more correct) solution than that is currently in core.

#30 @SergeyBiryukov
13 years ago

sanitize_title() makes sense for me. It was used before [16832].

#31 @scribu
13 years ago

Ideally, sanitize_title_for_query() should be replaced with sanitize_title() in that spot, This is the same function which sanitizes Term names into Term Slugs, so it'll have the desired effect of converting 'België' to 'belgie'.

sanitize_title() makes sense for me. It was used before [16832].

The correct solution would be to have two versions of get_term_by(): one for WP_Query, that uses sanitize_title_for_query() and one for get_term_link() which uses sanitize_title() nothing.

For more details on why this would be necessary, see #9591.

Last edited 13 years ago by scribu (previous) (diff)

#32 @dd32
13 years ago

The correct solution would be to have two versions of get_term_by(): one for WP_Query, that uses sanitize_title_for_query() and one for get_term_link() which uses sanitize_title() nothing.

After reading that ticket earlier, It just made me even more confused why you replaced it with sanitize_title_for_query() in that spot.

In the case of Term slugs, µ is sanitized to u (as sanitize_title does on pre_term_save), not sanitized to '%c2%b5' like sanitize_title_for_query().

Perhaps, it's just that sanitize_title_for_query() needs to use remove_accents() as well.

What does that changeset actually allow? Looking at that ticket, that commit was after it was decided it was 3.2 material, and being that it's caused this regression..

#33 @scribu
13 years ago

What does that changeset actually allow?

That series of changesets allows adding characters to remove_accents() without causing 404s for old slugs that have those characters.

Looking at that ticket, that commit was after it was decided it was 3.2 material, and being that it's caused this regression..

That is not true. Mark changed the milestone after [16832] had gotten in:

http://core.trac.wordpress.org/ticket/9591#comment:51

#34 @scribu
13 years ago

Sorry, yes, [16832] was after that.

#35 @scribu
13 years ago

Even so, I don't think it would be a good idea to go messing around with it at this point, as there have been no regressions reported since it went in.

#36 @dd32
13 years ago

as there have been no regressions reported since it went in.

Except this ticket of course :)

#37 @markjaquith
13 years ago

Well ... except for this ticket! [16832] went in after the milestone was changed, and it breaks the way people were using get_term_link() (which apparently wasn't documented sufficiently to justify breaking these uses). This seems like a regression to me.

Our term API is horrible, but it's what we have. We can do a nice object-oriented API for a future version of WP that makes people be explicit about whether they're asking for a term by name, slug, ID, etc. But until then, we should avoid breaking our existing API.

#38 @nacin
13 years ago

  • Keywords revert added; has-patch commit removed

I'm really tired of discovering changesets at this stage in the cycle. Time to revert this.

#39 @markjaquith
13 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

(In [17357]) Revert [16832]. see #9591. fixes #16282

#40 @scribu
13 years ago

Ok, but can we still get 16282.3.patch in?

#41 @scribu
13 years ago

Also, I thought 3 core developers were needed to make a commit, not 2. Or are we not in RC mode anymore?

I ask because this revert _will_ cause nasty errors for other users upgrading from WP 3.0.

#42 @scribu
13 years ago

I take that back. DD32 was right. The revert does not affect WP_Query, since the slug is URL-encoded.

#43 @scribu
13 years ago

Eating my hat now...

#44 @scribu
13 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Say I have a category with the slug 'șir', which I brought over from WP 3.0.

Going to /category/șir will work, as noted above.

Doing get_term_link('șir', 'category'); wont' work, because the slug isn't URL-encoded this time.

So, there still is cause for concern.

Last edited 13 years ago by scribu (previous) (diff)

#45 @scribu
13 years ago

It boils down to this:

Which should work:

get_term_link('Term name with an áccent', 'category');

Motivation: backwards compatibility with an incorrect usage, due to ambiguity in the documentation.

Or this:

get_term_link('șir', 'category');

Motivation: backwards compatibility with previously valid data.

Last edited 13 years ago by scribu (previous) (diff)

@scribu
13 years ago

#46 @scribu
13 years ago

  • Keywords has-patch added; revert removed

I guess with Sergey's idea of also using get_term_by('name', ...) in get_term_link() would make both work.

See both.16282.diff.

@scribu
13 years ago

Actually works

#47 @dd32
13 years ago

Ok, So the root of the cause here, Is that in <= 3.0, some slugs have been stored as URLEncoded, unlike 3.1 which is currently applying it correcty (Due to a bug in sanitize_title, not recognising all accents - can we just replace any non a-z characters already after sanitizing?)

This results in sanitize_title_for_query() actually working for those old slugs in 3.1. What it does not allow for, is Terms added in 3.1 to be queried correctly, Which is why scibu has added the extra get_term_by('name'..

As a result, get_term_by() -cannot- use either sanitize_title() OR sanitize_title_for_query(), It must use bot for 100% backwards compatibility. attachment:both.16282.2.diff handles the fact that the database may contain incorrect data at a high level (ie. In the calling function if by slug didnt work, try it by name. It doesnt count for the fact that the slug may have been sanitized wrong, and therefor take care of it in get_term_by()).

We can now ignore get_term_link() as that's not a problem, mearly showing a problem deeper in the Taxonomy API - attachment:both.16282.2.diff will allow get_term_link to work, but it does not help with the root cause which could show up in any function which uses get_term_by('slug',..).

The problem is the fact that the input Sanitization for get_term_by() has changed, in a backwards incompatible way, we can use sanitize_title() for terms added in 3.1, where as, terms added before 3.1 might not be found using that (If they include characters which have accents which were not being removed) and sanitize_title_for_query() would need to be used instead, but once again, it's cat and mouse, as the latter cannot query the former, and the former cannot query the latter.

So, Test case. Created a term 'șir' in 3.0 sanitized format, Created a term 'uș' in 3.1 sanitized format, test code:

var_dump( get_term_by('slug', 'șir', 'category') );
var_dump( get_term_by('slug', 'uș', 'category') );

Under 3.1, that causes the first to fail(3.0 format), the 2nd to succeed(3.1 format). Previous to the revert it caused the first to succeed(3.0 format), and the 2nd to fail(3.1 format).

So this brings up the question, If a slug is being requested, we sanitize the input to match what we call a "slug", in doing so we've allowed internationalised slugs to be requested which aids non-english coders.

The solutions here are:

  • Require the exact slug format to be passed, This means no accented characters, This means that the input will differ depending on when the term was created (ie. 'sir' in 3.1, or '%c8%99ir' in 3.0)
  • Make terms created in previous version inaccessible
  • Query for both Old and New taxonomy slugs
    • attachment:16282.diff as an example of this, a POC which alows both test cases above to pass
    • This results in up to 2 queries per taxonomy request,
      • ascii-only requests will result in 1 query
      • international characters with a slug created in 3.1 will result in 1 query if found, else 2
      • international characters with a slug created prior to 3.1 will result in 2 queries
  • store the slug somehow differently or update previous terms (Using the alias functionality(term_group) could allow for that, but that's just complicating things)

We need proper unit tests for Taxonomy queries, This would've shown up in a proper test case scenario.

@dd32
13 years ago

POC of querying both old and new forms of the slug

#48 @dd32
13 years ago

This sort of sanitizing problem can happen whenever we fix a bug in sanitize_title() or change the way it behaves, It should affect anything which relies upon the return value..

#49 @scribu
13 years ago

A great description of the issue, Dion.

I would just like to add that the same applies to post slugs, not just terms.

#50 @dd32
13 years ago

I'd also like to add, That this doesn't affect publically facing links, Regardless of when the term was created, the public link will always use that slug, which keeps the sanitization of when it was added, which allows it to be queried correctly through a permalink.

#51 @scribu
13 years ago

  • Keywords commit added

Although the issue is complex, I think both.16282.2.diff covers all the bases for WP 3.1.

Let me know if I'm wrong.

#52 @dd32
13 years ago

I'm of the mind that we should use sanitize_title() still, It's how the slugs are currently sanitized.

We cannot retain 100% backwards compatibility here for get_term_by('slug', take this for example:

var_dump( get_term_by('slug', 'ńaș', 'category') );

Using sanitize_title(), if it's created in 3.1, it'll be accessible (the slug is 'nas'). Using sanitize_title_for_query() it'll be inaccessible (it makes the slug '%c5%84a%c8%99').

Lets take that same term created in 3.0 however, and request it using 'ńaș'

  • using sanitize_title() will fail ('nas')
  • using sanitize_title_for_query() will fail ('%c5%84a%c8%99')

Reason being, the slug in 3.0 is: na%c8%99 (The n is converted, the s isnt)

So ultimately, the exact slug will have to be requested, You cant use either one or the other, It needs to stay with the current sanitization functions employed by core.

As for changing get_term_link() to accept a name instead, Seems like the "best" bet here, But thats an enhancement over core functionality, Anyone relying on that is ultimately relying on a bug/feature that was never supposed to be.

#53 follow-up: @scribu
13 years ago

The reason I don't think sanitize_title() should be used is because it would prevent modifications to remove_accents() again.

#54 in reply to: ↑ 53 ; follow-up: @dd32
13 years ago

Replying to scribu:

The reason I don't think sanitize_title() should be used is because it would prevent modifications to remove_accents() again.

Except it doesn't help. Slugs are stored as sanitize_title() in the db.. If it gets stored using one sanitizer, you cant just use a different sanitizer for querying.

It'd be a question of if slugs should be stored as sanitize_title() or as URLEncoded.

Eitherway, IMO, This is now out of 3.1 territory, Adding a by name is an enhancement, attempting to handle 100% backwards compatibility here seems like a can of worms.

#55 in reply to: ↑ 54 @scribu
13 years ago

Replying to dd32:

Except it doesn't help. Slugs are stored as sanitize_title() in the db.. If it gets stored using one sanitizer, you cant just use a different sanitizer for querying.

Actually, for querying, you shouldn't need a sanitizer at all. It was used as a sort of "smart" 404 detection, for example accessing the same category using any of these urls:

http://example.com/category/accent
http://example.com/category/accént
http://example.com/category/Accént

#56 @dd32
13 years ago

Actually, for querying, you shouldn't need a sanitizer at all.

In this case I'm refering to get_term_by('slug' querying. WP_Query is a separate beast which is separate to this discussion.

#57 @scribu
13 years ago

Well, since get_term_by('slug', ...) would be called by the developer explicitly, I don't see why they would expect it to be run through the sanitizer again, except in the case of get_term_link(), due to ambiguity in it's docs.

#58 @scribu
13 years ago

I only now saw that get_term_by() has a $filter argument, which should be exactly what we need.

#59 @westi
13 years ago

Before we change this code which should write some unit-tests around this behaviour so that we can easily repeatably test this on old and new versions of WordPress

#60 @hakre
13 years ago

It's just a smell, but probably helpful to know:

In case you're getting a 404 error, please test if the Normalize URLs plugin is solving the issue.

#61 @scribu
13 years ago

  • Keywords needs-unit-tests added; commit removed
  • Milestone changed from 3.1 to Future Release

Since the revert doesn't cause the 404s I was afraid of, and due to lack of unit tests, I think it's best we leave this as is for now.

#62 @dd32
13 years ago

Marked #16385 as duplicate:

I have in index.php the line 6:
<a href="<?php echo get_category_link(get_theme_mod('featured')); ?>"><?php echo cat_id_to_name(get_theme_mod('featured')); ?></a>

In Wordpres 3.0.4 works perfect, but in WordPress 3.1 RC3:
Catchable fatal error: Object of class WP_Error could not be converted to string in C:\wamp\www\wordpress\wp-content\themes\cvasi-magazine\index.php on line 6

#63 @hakre
13 years ago

Reference: #16464, #16717

#64 @jeremyfelt
10 years ago

  • Component changed from Multisite to Taxonomy
  • Focuses multisite added
  • Severity changed from major to normal

#65 @boonebgorges
8 years ago

In 34628:

Add test for get_term_by( 'slug' ) behavior with accented chars.

When passing a value of 'slug' to wp_insert_term() that contains non-ASCII
characters, WordPress converts accented characters to non-accented versions.
(See sanitize_title() and remove_accents().) The same conversion happens
when fetching when fetching a term using a slug. In this way, it's possible to
create a term and fetch it using the same accented string, even though the slug
is actually stored with non-accented characters.

See #16282.

#66 follow-up: @boonebgorges
8 years ago

  • Milestone Future Release deleted
  • Resolution set to wontfix
  • Status changed from reopened to closed

Given that five years have passed since the slug-storage mechanism changed from URL-encoding to accent-stripping, I don't think we can make any changes now. And given that no one else seems to have reported this issue in the last five years (though see [19444] for page slugs), I think we are pretty safe to close this ticket.

[34628] adds a test that describes the way things currently work, to prevent against future regressions.

#67 in reply to: ↑ 66 @SergeyBiryukov
8 years ago

Replying to boonebgorges:

Given that five years have passed since the slug-storage mechanism changed from URL-encoding to accent-stripping, I don't think we can make any changes now.

To clarify, non-Latin slugs are still URL-encoded, and can only contain up to 33 characters: comment:18:ticket:10483.

Note: See TracTickets for help on using tickets.