Opened 6 years ago
Closed 6 years ago
#46133 closed defect (bug) (fixed)
Parse blocks to generate the excerpt
Reported by: | gziolo | Owned by: | pento |
---|---|---|---|
Milestone: | 5.2 | Priority: | normal |
Severity: | normal | Version: | 5.0 |
Component: | Formatting | Keywords: | has-patch has-unit-tests |
Focuses: | Cc: |
Description
Moved from GitHub: https://github.com/WordPress/gutenberg/pull/11704
Originally authored by @pento:
Now that https://github.com/WordPress/gutenberg/pull/11141 has landed, we can use the same logic when generating an excerpt, which allows us to deprecate get_dynamic_blocks_regex()
.
By using a list of allowed blocks, we can avoid the other issue that strip_dynamic_blocks()
is intended to address, when a dynamic block can call get_the_excerpt()
, causing an infinite loop.
Out of interest, https://github.com/WordPress/gutenberg/commit/198c82f0323a433eaea33f63a3fb94866be5ba32 is an alternate method, which just tweaks strip_dynamic_blocks()
to use the parser, but I think this is the better method.
It should be recreated in core based on the referenced Pull Request.
Attachments (3)
Change History (15)
#2
@
6 years ago
- Keywords needs-patch needs-unit-tests added
- Milestone changed from Awaiting Review to 5.2
- Version set to 5.0
#3
@
6 years ago
- Component changed from General to Formatting
- Keywords has-patch has-unit-tests dev-feedback added; needs-patch needs-unit-tests removed
@gziolo @pento It's unclear to me if this requires further action. It seems that this issue was fixed in [43884], though no unit tests were added for this.
In my testing, the unit tests from the unmerged GB-11704 PR pass. Removing the [https://core.trac.wordpress.org/browser/tags/5.1/src/wp-includes/formatting.php#L3689 line that strips blocks for excerpts in core causes the tests fail.
If my assessment is true, let's add these tests and this can be closed.
#4
@
6 years ago
Thanks for the follow up, @desrosj.
The existing functionality is incomplete, it doesn't deal with dynamic blocks nested within whitelisted blocks.
46133.2.diff modifies the test to demonstrate the problem.
This ticket was mentioned in Slack in #core by jeffpaul. View the logs.
6 years ago
#7
@
6 years ago
What's the way forward here, can we skip blocks that have nested blocks? Or perhaps have a whitelist for what blocks can be nested?
Ideally we are looking for some text that is relevant to the post content. If no text is found, we can perhaps use an image. But some posts may not have either (a gallery post?). In these edge cases it would be acceptable to have an empty excerpt imho.
Another possible way might be to look at the block content (HTML) without rendering the blocks and "extract" the text?
#8
@
6 years ago
In 46133.3.diff:
- Add a (non-filterable)
$allowed_inner_blocks
list. Same as the default$allowed_blocks
exceptcore/columns
. - When rendering the allowed blocks, exclude blocks that have non-whitelisted or nested inner blocks.
- Add a helper function to render whitelisted blocks from inside the columns block.
- Fix the tests to match the output from get_the_excerpt() (has some "unusual" left over white space).
#9
@
6 years ago
Add a (non-filterable)
$allowed_inner_blocks
list. Same as the default$allowed_blocks
exceptcore/columns
.
There is going to be core/group
block added to Gutenberg after 5.2 release. We might want to include it in the whitelist as well together with core/columns
.
#10
@
6 years ago
It looks like this diff covers the case where you have Columns block nested inside another Columns block. It would be good to add a test to ensure it doesn't regress in the future and to confirm my assumption :)
#11
@
6 years ago
@gziolo thanks for the review! :)
There is going to be core/group block added to Gutenberg after 5.2 release
Yeah, thinking we will need to keep enhancing the way we auto-generate excerpts for posts and include all blocks that may have relevant text.
The challenge is that we only want "static text" and need to exclude dynamic blocks as they may cause infinite loop if trying to generate an excerpt, and in most cases they are not particularly relevant to the rest of the post.
Ideally we should parse all "text containing blocks" regardless if they are wrapped in "container" blocks. The current patch only looks one level deep in all whitelisted blocks, and has an exception for core/columns
where it looks inside each column. This can be enhanced and perhaps can be done better. Planning to look at it again for 5.3.
This comment from GitHub seems to be important: