Opened 7 weeks ago
Last modified 5 weeks ago
#64944 accepted defect (bug)
Generated Excerpts - Missing white space when stripping <br>s generated in paragraph block, verse block, etc.
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | 7.1 | Priority: | normal |
| Severity: | normal | Version: | 5.0 |
| Component: | Formatting | Keywords: | has-patch has-unit-tests |
| Focuses: | | Cc: | |
Description (last modified by )
When using the Verse block (or similarly Paragraph block and shift + enter spacing), <br>s are added in the block's content between lines. When WordPress generates an excerpt (when no custom excerpt is set), these <br>s are stripped along with other HTML tags. This often creates excerpts with missing spaces between words.
Consider the common poetry formatting where multiple lines exist in a paragraph block to represent a stanza.
On the code side of the editor this looks like:
<!-- wp:verse -->
<pre class="wp-block-verse">this is<br>a verse block<br>it has<br>the same issues</pre>
<!-- /wp:verse -->
or
<!-- wp:paragraph -->
<p>This is a poem<br>using shift+space<br>Inside a paragraph block<br>for good stanza formatting</p>
<!-- /wp:paragraph -->
When WP generates an excerpt based off the post content this ends up as:
"this isa verse blockit hasthe same issues"
or
"This is a poemusing shift+spaceInside a paragraph blockfor good stanza formatting"
This shows up often in excerpts generated from content corresponding to poetry, song lyrics, or other similar formats. When excerpts are used in any context (post previews, email subject descriptions, etc.) these missing white spaces obviously look horrible.
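The concatenation can be reproduced outside WordPress. The following is a minimal sketch in plain PHP, in which `strip_tags()` stands in for the tag-stripping step (an assumption made for illustration; the full WordPress call chain does more than this):

```php
<?php
// Plain PHP, no WordPress required. strip_tags() removes <br> without
// leaving any whitespace behind, so the adjacent words fuse together.
$verse = '<pre class="wp-block-verse">this is<br>a verse block<br>it has<br>the same issues</pre>';

$stripped = strip_tags( $verse );

echo $stripped, PHP_EOL; // this isa verse blockit hasthe same issues
```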
To Reproduce (recently tested in WP Playground on 6.9):
- Create a new post using a Paragraph or Verse block. For the Verse block, standard enter to add new lines will repro the issue. For the Paragraph block, use shift + enter to create new lines within the block.
- Do not create a custom excerpt.
- Publish the post.
- Run get_the_excerpt for the post.
- Verify that there are no spaces between the last words of one line and first words of the next.
How to fix?
I am uncertain about the best approach to resolve this. Some notes:
wp_trim_excerpt - calls get_the_content when no excerpt text is passed to it. Later calls wp_trim_words
wp_trim_words - calls wp_strip_all_tags and later creates a $words_array using preg_split on the "/[\n\r\t ]+/" pattern.
wp_strip_all_tags - removes script and style blocks in a preg_replace, then strips all remaining tags with strip_tags. Later, if $remove_breaks is true, it replaces '/[\r\n\t ]+/' patterns with spaces. In the current chain $remove_breaks is false, so this doesn't happen here, and the preg_split noted above in wp_trim_words is what finds this whitespace.
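The chain described in these notes can be modeled with a short sketch. The `sketch_*` functions below are hypothetical, simplified stand-ins written for this illustration, not the core implementations; they only keep the steps relevant to this ticket.

```php
<?php
/**
 * Simplified stand-in for wp_strip_all_tags(). The real function also
 * removes <script> and <style> blocks before stripping tags.
 */
function sketch_strip_all_tags( string $text, bool $remove_breaks = false ): string {
	$text = strip_tags( $text );
	if ( $remove_breaks ) {
		$text = preg_replace( '/[\r\n\t ]+/', ' ', $text );
	}
	return trim( $text );
}

/** Simplified stand-in for the word-splitting step in wp_trim_words(). */
function sketch_split_words( string $text ): array {
	$text = sketch_strip_all_tags( $text ); // $remove_breaks stays false here.
	return preg_split( "/[\n\r\t ]+/", $text, -1, PREG_SPLIT_NO_EMPTY );
}

$with_br      = sketch_split_words( '<p>This is a poem<br>using shift+enter</p>' );
$with_newline = sketch_split_words( "<p>This is a poem\nusing shift+enter</p>" );

print_r( $with_br );      // "poemusing" is a single element.
print_r( $with_newline ); // "poem" and "using" are separate elements.
```

A literal newline survives tag stripping and is caught by the split pattern, while a `<br>` is gone before the split ever runs. That is why only the `<br>` case fuses words.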
One thought: if wp_strip_all_tags also handled <br>s in the $remove_breaks block, AND that handling moved before the tag stripping, that seems like a potential general improvement. If the goal is to replace breaks with spaces, then <br>s should be considered there. However, $remove_breaks is not set when wp_trim_words calls wp_strip_all_tags, and it may not make sense to set it there.
Another thought, would it make sense for wp_trim_words to replace <br>s with spaces before calling wp_strip_all_tags ? Those spaces would then be caught by the pattern in the preg_split creating the $words_array.
I am attaching a diff for the latter. <br> tags are stripped without preserving spacing, causing words to concatenate (e.g., ‘thisexample’). This replaces <br> with a space before tag stripping to preserve word boundaries.
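As a rough illustration of the latter approach, here is a sketch in plain PHP. The helper name `sketch_br_to_space` and its regex are assumptions made for this example; the pattern in the attached diff may differ.

```php
<?php
/**
 * Hypothetical helper: turn <br>, <br/>, <br /> (any case, with or without
 * attributes) into a single space so that stripping tags cannot fuse words.
 */
function sketch_br_to_space( string $text ): string {
	return preg_replace( '/<br\b[^>]*>/i', ' ', $text );
}

$content = '<pre class="wp-block-verse">this is<br>a verse block<br />it has<BR>the same issues</pre>';

$text  = sketch_br_to_space( $content );   // Runs before tag stripping.
$text  = trim( strip_tags( $text ) );      // Stand-in for wp_strip_all_tags().
$words = preg_split( "/[\n\r\t ]+/", $text, -1, PREG_SPLIT_NO_EMPTY );

echo implode( ' ', $words ), PHP_EOL; // this is a verse block it has the same issues
```

Because the existing split pattern collapses runs of whitespace, a `<br>` that already sits next to a space or newline does not produce an empty word.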
Attachments (1)
Change History (8)
This ticket was mentioned in PR #11352 on WordPress/wordpress-develop by Addison-Stavlo.
7 weeks ago
#1
- Keywords has-unit-tests added
Ensures words around br tags are not concatenated together during wp_trim_words by replacing br tags with a space. This is done just before all tags are stripped and will ensure the words are actually separated when the $words_array is generated.
These br tags are common in the core block editor, as they can appear in Paragraph blocks (shift+enter for spacing), Verse blocks, and likely more. For content written in forms similar to poetry or song lyrics, excerpts generated from the content stick words together, e.g. "Line one<br>line two" becomes "Line oneline two". This PR aims to resolve the problem at the source, in wp_trim_words.
Trac ticket: https://core.trac.wordpress.org/ticket/64944
## Use of AI Tools
AI assistance: Yes
Tool(s): Cursor
Model(s): Composer 2
Used for: assistance with initial code investigation, assistance with generating regex pattern, initial test suggestions, and general writing directed by me. Placement of and suggested fix made by me, test reviewed and edited by me.
#2
7 weeks ago
- Component changed from General to Formatting
- Description modified (diff)
- Version 6.9 deleted
(This can happen with blocks since WordPress 5.0, and it was possible earlier when adding br tags manually within the Code/Text view of the classic editor.)
#3
6 weeks ago
- Keywords needs-testing added
- Milestone changed from Awaiting Review to 7.1
- Version set to 5.0
Moving to 7.1 as we have a patch ready to be tested.
#5
5 weeks ago
Reproduction Report and Patch Testing
Description
This report validates whether the issue of missing white space when stripping <br>s generated in the Paragraph block, Verse block, etc. can be reproduced.
Environment
- WordPress: 7.1-alpha-62161-src
- PHP: 8.2.28
- Server: nginx/1.29.0
- Database: mysqli (Server: 8.4.5 / Client: mysqlnd 8.2.28)
- Browser: Chrome 145.0.0.0
- OS: Windows 10/11
- Theme: Twenty Twenty-Five 1.4
- MU Plugins: None activated
- Plugins:
  - Test Reports 1.2.1
Actual Results
Created a new post using Verse block, and published the post. Ran get_the_excerpt for the post.
No spaces between the last words of one line and first words of the next. ✅
Supplemental Artifacts
BEFORE
Created verse block
Calling get_the_excerpt for the post shows no spaces in between.
AFTER Patch:
Spaces visible between the last words of one line and first words of the next ✅
This ticket was mentioned in Slack in #core-test by gaisma22.
5 weeks ago
#7
5 weeks ago
- Keywords needs-testing removed
Patch Testing Report
Patch Tested: https://github.com/WordPress/wordpress-develop/pull/11352
Environment
- WordPress: 7.0-beta6-62085-src
- PHP: 8.3.30
- Server: nginx/1.29.7
- Database: MySQL 8.4.8
- Browser: Brave
- OS: Ubuntu 24.04
- Theme: Twenty Twenty-Five 1.4
- MU Plugins: None
- Plugins: None
Steps Taken
- Created a new post using a Verse block with three lines separated by Enter.
- Published without a custom excerpt.
- Checked the generated excerpt via the REST API at /wp-json/wp/v2/posts/6. Before patch: Words from adjacent lines were stuck together with no spaces, e.g. "this is line onethis is line two"
- Created a new post with the same steps after applying PR #11352.
- Checked /wp-json/wp/v2/posts/9. After patch: Words are correctly separated by spaces, e.g. "this is line one this is line two this is line three"
✅ Patch is solving the problem
Expected Result
When WordPress generates an automatic excerpt from a post using a Verse block or Paragraph block with shift+enter line breaks, words from adjacent lines should be separated by spaces.
Additional Notes
- Bug confirmed on WordPress 7.0-beta6. The br tags in Verse block content were stripped without preserving word boundaries, causing words from adjacent lines to concatenate in the generated excerpt.
- Removing needs-testing as the patch resolves the issue on WordPress 7.0-beta6-62085-src.





<br> tags are stripped without preserving spacing, causing words to concatenate (e.g., ‘thisexample’). This replaces <br> with a space before tag stripping to preserve word boundaries. Note that tags are already stripped just after this, and newlines, returns, spaces, etc. are all later used to create the $words_array here. This helps retain the expected generated excerpt behavior when using Verse and Paragraph (with shift+enter) blocks.