#6387 closed task (blessed) (fixed)
Duplicate post Tags UI for other taxonomies
Reported by: | tellyworth | Owned by: | |
---|---|---|---|
Milestone: | 2.8 | Priority: | normal |
Severity: | normal | Version: | |
Component: | General | Keywords: | blessed has-patch has-unit-tests |
Focuses: | Cc: |
Description
This duplicates the Tags postbox UI (with the Add button and X to remove) for all non-hierarchical post taxonomies.
No doubt js experts will find it crude in spots but it works. I've left in two register_taxonomy() calls that I used for testing, lines 209 and 210 of edit-form-advanced.php. They're useful for testing this patch but should obviously be removed for production.
There's no real code for fetching or saving taxonomies other than post_tag. That should probably be included but requires some careful fitting with get_tags_to_edit() and friends so I've left it out for now. I added a quick filter as a crude fetch hook but it should probably be replaced with something more solid.
Attachments (9)
Change History (34)
#1
@
17 years ago
The new patch has some better code for saving and fetching. Note that, while wp_insert_post() still supports the tags_input item for compatibility, the new UI code now passes in $_POSTtax_input? instead.
The temporary test taxonomies are still there but moved to the bottom of taxonomy.php.
#2
@
17 years ago
Would love if the get and set of post terms dealt with IDs instead of names since we're in the code.
#9
@
16 years ago
Instead of passing a taxonomy arg to a tag function, we should just use the generic term function. That tag functions are convenience wrappers but if we're passing a taxonomy to them they are fairly pointless.
@
16 years ago
Lose show_message(). Fix notices. Use generic terms funcs instead of passing tax to tag funcs.
This ticket was mentioned in PR #5373 on WordPress/wordpress-develop by @dmsnell.
5 months ago
#13
- Keywords has-patch has-unit-tests added
Trac ticket: Core-60698
(previously Core-60841)
Motivated by the need to properly transform HTML named character references (like &) I found the need for a new semantic class which can efficiently perform search and replacement of a set of static tokens. Existing patterns in the codebase are not sufficient for the HTML need, and I suspect there are other use-cases where this class would help.
---
In #6387 I have built a spec-compliant HTML5 text decoder utilizing this token map. The performance of the new decoder is approximately 20% slower than calling html_entity_decode()
directly, except that it properly decodes what PHP can't. In fact, the performance bottleneck in that PR comes from converting into UTF-8 the sequence of digits in numeric character references, not in looking up named character references.
---
This proposal is adding a new class WP_Token_Map
providing at least two methods for normal use:
contains( $token )
returns whether the passed string is in the set.read_token( $text, $offset = 0, $skip_bytes )
indicates if the character sequence starting at the given offset in the passed string forms a token in the set, and if so, returns the replacement for that token. It also sets &$skip_bytes to the length of the token so that calling code .
It also provides utility functions for pre-computing these classes, as they are designed for relatively-static cases where the actual code is intended to be generated dynamically, but stay static over time. For example, HTML5 defines the set of named character references and indicates that the list shall not change or be expanded. HTML5 spec. Precomputing can save on the startup-up cost of building the optimized lookup tables.
WP_Token_Map::from_array( array $mappings )
generates a new token map from the given array of whose keys are tokens and whose values are the replacements.to_array()
dumps the set of mapping into an array suitable for passing back intofrom_array()
.WP_Token_Map::from_precomputed_table( ...$table )
instantiates a token set from a precomputed table, skipping the computation for building the table and sorting the tokens.precomputed_php_source_table()
generates PHP source code which can be loaded with the previous static method for maintenance of the core static token sets.
#### Other potential uses
- Converting smilies like :)
- Converting emoji sequences like :happy:
- Determining if a given verb/action is allowed in an API call.
5 months ago
#14
Some CI jobs report that some static test helpers are private and it can’t access them.
5 months ago
#15
I've been reviewing the implementation and I have a concern. As far as I can tell, case-insensitive behavior is unreliable on some string strings, specifically multibyte strings with upper/lowercase variants like ü.
Have you considered this?
Yes, absolutely.
I did add case insensitivity during the testing cycle kind of as an afterthought, but I like having it. It's intended to aid in the purpose for which the class was made, which is focusing on ASCII casing. This may sound strange, but almost every basic casing functions are primarily if not entirely ASCII-oriented, because proper case-folding is a much more complicated topic, and no, mb_strtoupper()
is not enough, at all.
So I think the expectation that this works for ü
is more subjective. My gut feeling is that we can clarify this in documentation and let it be. There's no way this is going to be properly detecting more complicated case variants, but then again, nothing in this domain does. Otherwise we can remove it, but I like how the option gives us the ability to implement functions like WP_HTML_Processor::is_special()
without allocating a new string and upper-casing it.
@jonsurrell commented on PR #5373:
5 months ago
#16
It seems fine to me if we don't support case insensitive non-ascii token matching as long as we're clear about it.
5 months ago
#17
It seems fine to me if we don't support case insensitive non-ascii token matching as long as we're clear about it.
Ha! I went to add these and I already noted this in the function docblock.
'case-insensitive' to ignore ASCII case or default of 'case-sensitive'
Text issues are apparently always on my mind. Let me know if you still think this is unclear. I'm worried that if we try and add more documentation when it's already there right at the cursor when calling the function, then it won't be any more helpful, just more noise.
@jonsurrell commented on PR #5373:
5 months ago
#18
I already noted this in the function docblock … Let me know if you still think this is unclear.
I think this is fine. I was searching for words like "mb" or "multibyte", but it seems like ASCII is correct 👍
I did notice a changelog entry on the PHP documentation page that makes me wonder if we might need to support PHP versions that have varying behavior on non-ASCII treatment. I wasn't able to trigger any differing behavior myself, but do you know what this is about?
8.2.0
Case conversion no longer depends on the locale set with `setlocale()`. Only ASCII characters will be converted.
That seems like exactly what we want, but I've looked around PHP source, the changelog, etc. and I haven't managed to find exactly what changed or examples of the different.
5 months ago
#19
I did notice a changelog entry on the PHP documentation page that makes me wonder if we might need to support PHP versions that have varying behavior on non-ASCII treatment. I wasn't able to trigger any differing behavior myself, but do you know what this is about?
I've struggled to test this with the HTML API. in effect, some system locales could cause, I believe, a few characters to be case-folded differently, but this is such a broken system I'm not of the opinion that it's a big risk here. I'm willing to let this sit as a hypothetical bug until we have a clearer understanding of it.
5 months ago
#20
One suggestion that can be done in a follow up, I think automating pulling https://html.spec.whatwg.org/entities.json and checking to see if the autogenerated file has been changed would be beneficial. Can likely be done on a weekly schedule and only in trunk.
this is a good suggestion, @aaronjorbin - and thanks for the review! (I'll be addressing the other feedback inline where you left it).
my one main reluctance to automate the download is that the specification requires that the list never change, and automating it only adds a point of failure to the process, well, a point of failure plus some latency.
do you think that it's worth adding this additional complexity in order to change what is defined to not change? I believe that a wide cascade of failures would occur around the internet at large if we found new character references in HTML.
I'm happy to add this if there are strong feelings about it, though if that's the case, maybe we could consider it a follow-up task so as not to delay this functionality? a separate task where we can discuss the merits of the automation?
5 months ago
#21
I think we can add a new token-map group. It also might be worth putting in the html-api group since this is where the bigger effort lies
@aaronjorbin following the html5lib test suite, I added the wpTokenMap
test module to the @group html-api-token-map
group. happy to change if this isn't idea; I was looking for more examples and didn't see. it can also be set as a @package
and @subpackage
but I was looking for @subgroup
and didn't find any
5 months ago
#22
@aaronjorbin following the html5lib test suite, I added the
wpTokenMap
test module to the@group html-api-token-map
group. happy to change if this isn't idea; I was looking for more examples and didn't see. it can also be set as a@package
and@subpackage
but I was looking for@subgroup
and didn't find any
@subgroup isn't an annotation that is a part of PHPUnit. I think this group makes sense, thanks!
5 months ago
#23
One suggestion that can be done in a follow up, I think automating pulling https://html.spec.whatwg.org/entities.json and checking to see if the autogenerated file has been changed would be beneficial. Can likely be done on a weekly schedule and only in trunk.
this is a good suggestion, @aaronjorbin[[Image(chrome-extension://hgomfjikakokcbkjlfgodhklifiplmpg/images/wp-logo.png)]] - and thanks for the review! (I'll be addressing the other feedback inline where you left it).
my one main reluctance to automate the download is that the specification requires that the list never change, and automating it only adds a point of failure to the process, well, a point of failure plus some latency.
do you think that it's worth adding this additional complexity in order to change what is defined to not change? I believe that a wide cascade of failures would occur around the internet at large if we found new character references in HTML.
I'm happy to add this if there are strong feelings about it, though if that's the case, maybe we could consider it a follow-up task so as not to delay this functionality? a separate task where we can discuss the merits of the automation?
Upon reading https://github.com/whatwg/html/blob/main/FAQ.md#html-should-add-more-named-character-references I withdrawal this suggestion. I do think it might make sense to link that and mention that the entities json should never change in the readme for others that come across this with a similar thinking as me.
5 months ago
#24
I do think it might make sense to link that and mention that the entities json should never change in the readme for others that come across this with a similar thinking as me.
Thanks again @aaronjorbin!
It's already mentioned at the top of the README, in the character reference class (and as well in the autogenerator script where it comes from), the PR description, and the Trac ticket description.
Is that good enough or was there somewhere else you wanted it? Maybe the way I wrote it wasn't as direct or clear as it should be?
save and fetch code added