#20103 closed defect (bug) (fixed)
Rewrite get_themes() and other enemies of sanity
Reported by: | nacin | Owned by: | |
---|---|---|---|
Milestone: | 3.4 | Priority: | normal |
Severity: | normal | Version: | 1.5 |
Component: | Themes | Keywords: | has-patch |
Focuses: | Cc: |
Description
get_themes() is a memory hog (#11214). The data it returns, and how it calculates it, is terribly designed (#15858, #12275, others) and inflexible (#19816, others). It is not possible to translate this data. Its aspects are poorly named (#11215). get_themes() keys things by theme name, rather than slug, which the theme editor and other areas rely on. Indeed, get_current_theme() and the corresponding option in the DB stores the name.
The functions that scan (get_themes(), search_theme_directories()) are stuck in a PHP4 world. They use globals and arrays, and one took your lunch money. They implement unnecessary loops and directory scans, making for very lame performance.
With get_themes(), everything is loaded at once, as there is no opportunity for lazy-loading, even though lists of files are only needed for the theme editor and often all we want is the current theme's information.
Almost none of this is even remotely cacheable, certainly not sanely.
Attempts to work with this API usually lead to disaster. The theme editor is a textbook example. It breaks on basically anything other than A-Z (again relying on names rather than slugs), and does some really evil things. I cleaned up what I could two years ago during my GSoC project, but because get_themes() makes most of MU look good, little of it could flow back into core.
Over the next few days, I will be attaching some patches that rewrite large swaths of this aging area of core.
Attachments (7)
Change History (115)
#3
@
13 years ago
Bad branching in 20103.diff. Adjusted (and the whole process made clearer) in search_theme_directories.diff.
#8
@
13 years ago
WP_Themes_List_Table::search_theme() does a sanitize_title_with_dashes() on every tag of every theme.
This is not necessary. Tags are already hyphen-delimited, or we should be doing that in get_themes() or get_theme_data() to begin with.
In terms of performance, it's terrible. With the WP.com themes directory loaded up, search_theme() takes up more than 50% of processing time. 40% of processing time is spent in sanitize_title_with_dashes().
Simply removing the sanitization (and slightly better branching) results in a tremendous speed-up. In my working patch, it works out to be 17% of the processing time with a page that is twice as fast.
#9
@
13 years ago
- Milestone changed from Awaiting Review to 3.4
Introducing the WP_Theme class
As the ticket promises, WP_Theme replaces get_themes() and other enemies of sanity. Each theme is its own object, decorated on construction and packed with 21 main public methods, four public static methods, and a few internal helpers. Full support for multiple theme roots and directories of themes within a theme root. The handling for broken-ness and odd situations is smoother than get_themes().
array wp_get_themes( $args = array() );
To fetch a array of WP_Theme objects, call wp_get_themes(). wp_get_themes( $args ) takes three arguments that filter the results: errors, allowed, and blog_id.
- errors (Default: FALSE) - Filters themes based on whether they have errors. True returns only themes with errors, False only themes without errors, Null if all themes should be returned.
- allowed (Default: NULL) - Filters themes based on whether they are allowed on the site or network. (Multisite only.) Valid options are 'network' (allowed on the network), 'site' (allowed on the site specified by the blog_id argument), true (allowed at either level), false (not allowed).
- blog_id (Default: current) - Controls which site should be checked for the allowed filter. Only works, of course, in multisite.
WP_Theme wp_get_theme( $stylesheet = null, $theme_root = null );
To fetch a single theme, call wp_get_theme(). This will return a single WP_Theme object for the requested theme.
It takes two argumements: $stylesheet and $theme_root. If $stylesheet is omitted, the current stylesheet (via get_stylesheet()) will be used. If the $theme_root is omitted, get_raw_theme_root($stylesheet) will be used. Thus, to get the current theme, you may call wp_get_theme() with no arguments.
wp_get_theme() replaces get_theme(). get_theme() previously used get_current_theme() to get the name-keyed theme array from get_themes(). Not only are we no longer juggling arrays keyed by the theme name, we're not even processing every theme into memory to get a single WP_Theme object. Nice.
Handling Errors
Currently, search_theme_directories() and get_themes() will build a separate array of broken themes, $wp_broken_themes. This array is then returned by get_broken_themes().
With WP_Theme, errors like no template, invalid parent themes, or unreadable stylesheets can be nicely caught. Errors are populated into an errors property, accessible via WP_Theme->errors(). It is either null or no errors, or a WP_Error object. wp_get_themes( array( 'errors' => true ) ) enables one to loop through and pull out error messages and codes from a WP_Error object. In some cases, the current theme could 'error' but it still works, and we have access to what went wrong.
Network Support
WP_Theme encapsulates the logic for allowed themes on a network, with static methods to fetch all allowed themes on a network, themes allowed at the site level, and a merge of both lists (all themes allowed for a single site).
Each theme object then has a simple is_allowed( $check = 'both', $blog_id = 0 ) method to allow for checking whether it is allowed.
The okay-named function get_site_allowed_themes() and the terribly named function wpmu_get_blog_allowedthemes() are both deprecated in favor of static class methods.
Outputting Headers
What started me on WP_Theme was that get_themes() could not be internationalized in any sane way (#15858). The data usually came out of get_theme_data() already cleaned and even marked up (such as with wptexturize), making translation difficult or impossible. Since the data was keyed by name, changing that value would break quite a bit. As get_themes() already stored a lot of data in memory (and blows up memcached on WP.com, as they try to store it persistently), adding more keys didn't make sense.
mixed WP_Theme->get( $header )
mixed WP_Theme->display( $header, $markup = true, $translate = true )
WP_Theme has two main method for gathering header data: get() and display(). Once you call get(), the data has already been sanitized, either by strip_tags(), kses, or esc_url, depending on the header. (Tags are returned as an array.) By calling display(), the data is translated, and then marked up. (Even Tags — the array is translated for known tags.) The textdomain is automatically loaded if necessary.
display() returns data. It does not echo it. It is formatting the data for display, and would roughly be analogous to get_bloginfo( 'name', 'display' ).
get('Author') returns Andrew Nacin. display('Author') marks that up with <a> tags and a link to http://nacin.com (being AuthorURI).
display() also allows you to decline markup and translation with additional arguments, but these are mainly used for backwards compatibility or to allow fields to be searched through without Author HTML and the like.
Private methods sanitize_header(), translate_header(), and markup_header() support get() and display() in their endeavors.
Helpers
WP_Theme has a number of public methods that corresponds to the global function acting on the active theme. These include:
bool WP_Theme->is_child_theme()
string WP_Theme->get_stylesheet()
string WP_Theme->get_template()
string WP_Theme->get_stylesheet_directory()
string WP_Theme->get_template_directory()
string WP_Theme->get_stylesheet_directory_uri()
string WP_Theme->get_template_directory_uri()
string WP_Theme->get_theme_root()
string WP_Theme->get_theme_root_uri()
bool WP_Theme->load_textdomain()
array WP_Theme->get_page_templates()
In some cases (notably, get_page_templates()), the global function now wraps the method.
Screenshots
string function get_screenshot( $uri = 'relative' )
int function get_screenshot_count()
array function get_screenshots()
3.4 means multiple screenshots. WP_Theme has you covered. Three public methods provide the screenshot URL (absolute URL, or relative to the stylesheet directory), the total count of screenshots, and an array of each screenshot filename.
Template and Stylesheet Files
array WP_Theme->get_files( $type = null )
get_themes() stored the template and stylesheet files in the array, which resulted in longer processing times and, especially since absolute paths were added in 2.9, an explosion in memory usage.
WP_Theme handles traversing the PHP and CSS files theme directories on demand. As these are only used in the theme editor (and only for the current theme), this is a pretty huge performance and memory boost.
Full Backwards Compatibility
WP_Theme is 100% backwards compatible. Because it implements ArrayAccess, you can take a theme object and look for keys like 'Template Files', 'Stylesheet', and 'Version' — just as they were returned by get_themes(). WP_Theme also includes a magic getter, handling all properties that were returned by current_theme_info() (which returned stdClass in 3.3). This enables us to pass a fully decorated WP_Theme object as an existing action argument, without plugins choking because they expected an array or an older object.
Indeed, get_themes() now returns an array of WP_Theme objects, because it can. get_theme() returns a WP_Theme object, as does current_theme_info(). (All are deprecated, as is get_broken_themes().)
If you don't believe me on backwards compatibility, head over to the Theme Editor, which passes around theme names like they're crack. This file is the only area that hasn't been updated to rely on WP_Theme; it still uses get_themes() and the array it returns for each theme. Everything works.
The Parent Theme
A reference to a WP_Theme object for a parent theme is stored in the parent property, and accessible with the parent() method. Want a theme's parent name? $theme->parent() && $theme->parent()->display('Name').
Caching and Performance
Okay, so, this is sweet. After profiling get_themes() and WP_Theme with Xdebug and KCacheGrind, a few functions have been sped up, in some cases pretty significantly. This includes search_theme_directories() and theme searching, which was a drag for no real reason. Searching is now improved significantly, and is implemented as a class method to allow for better speeds. (Now that things are "ready," I hope to go back and look for further improvements.)
Once sorting was fixed, two things stood out as weighing down processing the most: Filesystem reads, and sanitization of headers. With WP_Theme, both are non-persistently cached, internally, to ensure serious speed. No style.css is ever read twice, and no header is ever sanitized twice. Same goes for identifying screenshots and PHP/CSS files — everything gets cached.
Non-persistence is a must for themes because someone might make an FTP change then refresh the page. BUT, that doesn't mean a particular installation shouldn't be able to opt into persistence.
When the themes cache bucket is persistent (there's a filter), WP_Theme gets smart about it. To prevent thousands of cache sets to update the headers cache, it sanitizes every header once the first one is requested and adds it to cache. Additionally, search_theme_directories() is cached.
With persistent caching, it is possible to hit themes.php, and even do a search, and avoid a single filesystem or kses operation. The 150-something themes in the WP.com themes repository sings in this situation.
What's Next?
- Commit this bad boy.
- If object cache drop-ins had a wp_cache_get_non_persistent_groups() function, we could scrap that filter and make it much cleaner. (It's a rough implementation and could use some tweaks anyway.)
- Some of the filesystem functions — specifically, the underlying scandir() static method and get_files() — don't quite have their arguments or loops ironed out yet. Once the theme editor is converted over, those things should come into place nicely. Code review appreciated.
#10
follow-up:
↓ 11
@
13 years ago
Could we take this opportunity to find better names for get_template() / get_stylesheet() and co.?
#11
in reply to:
↑ 10
@
13 years ago
Replying to scribu:
Could we take this opportunity to find better names for get_template() / get_stylesheet() and co.?
We could indeed! It was just not a priority during the construction. And, for my sanity, it was easiest to code it in the existing paradigm.
#14
@
13 years ago
I actually looked at the patch now. It implements ArrayAccess and everything!
We can figure out the naming after the initial commit.
#18
@
13 years ago
- Cc JustinSainton added
I don't often profess my love for other men, but when I do, it's for nacin.
#21
@
13 years ago
Related tickets I could find:
- #17597, register_theme_directory() does not work for files outside WP_CONTENT_DIR (WP_PLUGIN_DIR does not need to be within WP_CONTENT_DIR, for example). get_theme_root() always prepends WP_CONTENT_DIR, but get_raw_theme_root() sometimes returns an absolute path instead of the relative-to-content path as documented. The patch here tries to fix this but not correctly. Coming up with a better solution.
- #16744, theme editor does not work when the parent theme is in a different theme root. Should be easy to fix with WP_Theme.
- #10067 (closed as duplicate), theme_basename() function. This is superseded by the power available here.
If you can find any other tickets related to the theme or theme root API, please shout.
#25
in reply to:
↑ 23
@
13 years ago
Replying to SergeyBiryukov:
Commenting in #18501. Turns out that line of code is dead code — the variable it sets up is never used. I missed it in my audit, but removing it now.
#26
@
13 years ago
Pretty nice class and +1 for translatable theme headers.
@Nacin: Some Questions...
I only briefly looked over the code
- Couldn't - inside
function offsetGet( $offset )
- the default be set toreturn $this->display( $offset );
(instead ofnull
), so we'd have the ability to add (for e.g.)DataBaseVersion
to the header (we still have no update/upgrade hook for themes)? - Extending the switch for Author & Description inside
function sanitize_header( $header, $value )
-wp_kses()
with the allowed tags ofcite
for addresses andstrike
for update notes wouldn't hurt. I'd also love to seesub/sup
and maybe evenu
be available. - For cases similar to this one (filtering page templates) it would be nice to have a filter for the returned list of themes. Currently it's only in use to build the templates drop-down for the post edit/new screen.
#27
@
13 years ago
I have two issues with the class:
- Theme editor can't find the theme – "The requested theme does not exist."
- themes.php: The current theme display is broken: http://cl.ly/EaoG
Also the patch doesn't apply cleanly anymore.
#29
@
13 years ago
- Cc gary@… added
On the naming issue, could we fix the non-existent word "stylesheet"? It has, and always was two words, as per the CSS abbreviation, and the names on the corresponding W3C documents.
After a very quick look through the patch, I see $stylesheet
only as a private property, or as the argument names / variables inside functions, so they could be switched to $style_sheet
without fuss.
Any standalone function like get_stylesheet()
that remains after scribu has had his suggestion of renaming them, could stay as it was, but the methods introduced in the class could be renamed as get_style_sheet()
. Obviously Stylesheet
array keys would stay as they are too, unless you wanted to allow both versions.
I suspect the suggestion will be disregarded on grounds of introducing inconsistency, but since you're re-writing a hefty chunk of the internals, now is the most opportune moment to suggest improvements to the internals to match the naming conventions in the coding standards.
#30
@
13 years ago
ocean90's difficulties were a result of r20001. See #17597. Once we have an idea there for how to proceed, 20103.3.diff will land.
Given the universal (in WordPress) usage of stylesheet to refer to a particular theme directory (and not necessarily CSS files), we're not going to change it to style_sheet.
#4
@
13 years ago
get_theme_updates() could use some work. I don't like how it assigns a property directly to WP_Theme — WP_Theme has no public properties, only read-only back compat ones.
It should instead wrap wp_get_themes( array( 'updates' => true ) )
. WP_Theme should set up an update property to automatically pull from the site transient. We can cache the site transient value as a static variable in WP_Theme->update(), to prevent calling get_site_transient() over and over.
The structure ends up being backwards compatible so we could easily continue to use get_theme_updates() rather than wp_get_themes(), but since method access will be preferred to property access, we should probably switch to wp_get_themes().
#5
@
13 years ago
In general, update-related code needs an audit. wp_theme_update_row() will also need work for sure.
#11
@
13 years ago
[20126] had haunted me previously, but I never found the bug and chalked it up to a caching quirk. [20132] is a nice performance improvement for a site like WP.com, which might have 500 themes but only a minority of them are for public consumption. Assuming there are no get_themes() calls, [20132] limits the amount of header sanitization and cache_get calls taking place to the themes that will actually be displayed.
#19
@
13 years ago
Related to "list broken themes on network/themes.php...": #20099
The theme editor still use get_themes():
$ ack ' get_themes()' wp-admin/theme-editor.php 47:$themes = get_themes();
#21
@
13 years ago
[20151] and [20152] together mean big performance improvements in multisite.
Because errors were displayed on themes.php to super admins, when they visited themes.php for a site, every installed theme needed to be instantiated as a WP_Theme (to check for errors). [20151] moved this to the network admin, where every theme needs to be instantiated anyway. [20152] means that only allowed themes get instantiated on themes.php for a site. End result is very nice.
The theme editor still use get_themes()
Yeah, I haven't had the chance to go through it and update it to use the new API. You will find some Edit links to it broken (as they were updated to use the slug), but the editor itself works. If someone wishes to take a crack at it, I would be interested to hear your feedback about the API.
#23
follow-up:
↓ 25
@
13 years ago
20103.4.diff improves get_theme_root_uri():
- Short-circuits if only one directory is added (as we do elsewhere)
- Allows a theme_root to be passed, which would then avoid the call to get_raw_theme_root(), which in turn will look up the theme root from the transient. If you already have this information, passing it can save some execution, especially if this is called for many themes (such as a list of screenshots on themes.php).
Additionally, WP_Theme->get_theme_root_uri() will now call get_theme_root_uri(), therefore triggering the theme_root_uri filter. This is the only helper method in WP_Theme that wraps its function counterpart. This is because it forms the basis for all other URIs returned by the theme object. Of all the data points, it's the one we do want to be filterable.
#34
@
13 years ago
- Resolution set to fixed
- Status changed from new to closed
New tickets for any bug reports or enhancements.
20103.diff rewrites search_theme_directories(). It uses scandir() rather than opendir/readdir, eliminates unnecessary loops in favor of file_exists() checks, takes maximum indentation from 9 tabs down to about 5, and makes some attempt to improve variable names.
The average time it took for search_thmee_directories() to run over wp-content/themes/ (where /themes/ was a checkout of http://wpcom-themes.svn.automattic.com) went from to 0.02042368 to 0.00315408 seconds, an 85% speed-up.