Make WordPress Core

Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#27160 closed task (blessed) (fixed)

Paste from Word in TinyMCE 4.0

Reported by: azaozz Owned by: azaozz
Milestone: 3.9 Priority: normal
Severity: normal Version: 3.9
Component: TinyMCE Keywords: needs-testing
Focuses: Cc:

Description

The 'paste' plugin has changed in TinyMCE 4.0. The "Paste from Word" button and popup are gone. Now the editor checks all pasted content and when it detects HTML produced by Word, it filters it automatically.

Some of the plugin's settings have changed too: 'paste_remove_spans' and 'paste_strip_class_attributes' are not used any more.

This ticket is for testing and eventually adjusting the filtering of the HTML done on pasting.

Attachments (2)

27160.patch (27.3 KB) - added by azaozz 11 years ago.
27160.1.patch (26.9 KB) - added by azaozz 11 years ago.


Change History (24)

#1 @azaozz
11 years ago

Additional filtering of the pasted HTML can be done either in the paste_preprocess and paste_postprocess callbacks, which can be set in the init object, or on the 'PastePreProcess' and 'PastePostProcess' TinyMCE events.
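
A minimal sketch of the callback route, under stated assumptions: the object name initSettings is hypothetical, and the (plugin, args) signature with args.content (an HTML string) and args.node (a DOM element) is assumed from the TinyMCE 4 paste plugin rather than confirmed in this ticket. The two example filters are illustrative only.

```javascript
// Sketch only: `initSettings` is a hypothetical name for the object passed to tinymce.init().
const initSettings = {
  plugins: 'paste',

  // Assumed signature (plugin, args): args.content holds the pasted HTML as a string.
  paste_preprocess: function (plugin, args) {
    // Example filter: drop HTML comments before the content is inserted.
    args.content = args.content.replace(/<!--[\s\S]*?-->/g, '');
  },

  // Assumed signature (plugin, args): args.node is a DOM element wrapping the pasted content.
  paste_postprocess: function (plugin, args) {
    // Example filter: remove tabindex from every pasted element.
    Array.prototype.forEach.call(
      args.node.querySelectorAll('[tabindex]'),
      function (el) { el.removeAttribute('tabindex'); }
    );
  }
};

// Event-based equivalent, assuming `editor` is a TinyMCE 4 editor instance:
// editor.on('PastePreProcess', function (event) { /* edit event.content */ });
// editor.on('PastePostProcess', function (event) { /* edit event.node */ });
```

The preprocess hook works on a string, so it suits regex-style cleanup; the postprocess hook works on parsed nodes, so it suits attribute and element changes.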

This ticket was mentioned in IRC in #wordpress-dev by azaozz.


11 years ago

#3 @nacin
11 years ago

This is very cool.

#4 follow-up: @netweb
11 years ago

I've got nothing of value to add to this ticket except to say this is awesome.

Probably my first, last and only Word 2013 to WordPress posts.

  • 'Table of Contents' anchors match and work perfectly eg <a name="_Toc251709783"></a>
  • Tables work perfectly with <table>, <thead>, <tbody>, <tr>, <td> etc, even cells left blank are handled perfectly.
  • Imported and converted PDF -> Word 2013 -> WordPress post, all good
  • Unordered Lists are fine, using <ul><li></li></ul>
  • Ordered lists: I had one that didn't work and an awesome one that did, kicking off with <ol start="3"><li>
  • 140 page Word 2013 Document ~28,000 words to WordPress, not a single invalid HTML element...
  • 80 page PDF opened in Word 2013, copy and paste to WordPress post, all good
  • 1,000 rows by 30 column CSV (~65k words) open in Excel 2013, copy and paste to WP post, no problem
  • ~250 rows x 15 column Access 2013 to WP post, wow, still no problems

I started off small, then threw anything I could at WordPress. Apart from formatting errors where images in the original document don't become part of the WordPress post via copy and paste, I could not fault a thing.

I'd never recommend that people write in Word, Excel, or Access and then copy and paste to WordPress, but if someone told me they did, I would say "Isn't it an awesome WYSIWYG experience?"

@azaozz Anything I missed or you need a follow up on let me know.

#5 in reply to: ↑ 4 @azaozz
11 years ago

Replying to netweb:

That's great @netweb, thanks for the comprehensive testing.

Did some non-Word tests. All works well except now classes and empty spans are not removed. All other tag attributes are kept too. We may want to do some additional filtering to remove IDs and classes, and perhaps empty spans (empty meaning <span></span>, no attributes and no content).
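
The additional filtering suggested above could be sketched as a plain string filter. This is an illustration, not the code from any patch on this ticket: the function name is hypothetical, and it relies on regular expressions, which only handle simple quoted attributes.

```javascript
// Hypothetical helper: removes id/class attributes and empty, attribute-free spans.
function stripIdsClassesAndEmptySpans(html) {
  return html
    // Visit each opening tag and drop quoted id="..." / class="..." attributes inside it.
    .replace(/<[a-z][^>]*>/gi, function (tag) {
      return tag.replace(/\s(?:id|class)\s*=\s*(?:"[^"]*"|'[^']*')/gi, '');
    })
    // Drop spans that have no attributes and no content.
    .replace(/<span><\/span>/gi, '');
}
```

Because attributes are stripped first, a span whose only attribute was a class or id, and which has no content, is also removed by the second step.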

#6 @kirasong
11 years ago

This is great! We'll just want to be sure we communicate to users clearly that it's the case, so that support doesn't get hit by more confused users than necessary.

#7 @nacin
11 years ago

  • Type changed from enhancement to task (blessed)

All works well except now classes and empty spans are not removed. All other tag attributes are kept too. We may want to do some additional filtering to remove IDs and classes, and perhaps empty spans (empty meaning <span></span>, no attributes and no content).

Leaving open for this. azaozz, feel free to close or spin this off into a new ticket if appropriate.

This ticket was mentioned in IRC in #wordpress-ui by avryl.


11 years ago

#9 @kirasong
11 years ago

  • Owner set to azaozz
  • Status changed from new to assigned

This ticket was mentioned in IRC in #wordpress-dev by nacin.


11 years ago

@azaozz
11 years ago

#11 @azaozz
11 years ago

In 27160.patch: latest (dev) version of the 'paste' plugin, some additional filtering of the paste content. The filtering removes all HTML id, class and tabindex attributes by default and filters the inline styles on all elements allowing only font-weight, font-style, and color.

@azaozz
11 years ago

#12 @azaozz
11 years ago

In 27160.1.patch:

  • Bring back removal of classes.
  • Also remove IDs and tabindex.
  • Filter all inline styles, allowing only font-weight, font-style, and color. This affects only pasting in non-WebKit browsers.
  • Introduce wp_paste_filter TinyMCE setting that turns off the extra filtering.
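
The inline-style allow-list described above can be sketched as follows. This is not the code from 27160.1.patch: the names ALLOWED_STYLES, filterStyleValue and filterInlineStyles are hypothetical, and only double-quoted style attributes are handled.

```javascript
// Properties the ticket says are kept; everything else is dropped.
const ALLOWED_STYLES = ['font-weight', 'font-style', 'color'];

// Hypothetical helper: keeps only allowed declarations from a style attribute value.
function filterStyleValue(styleText) {
  return styleText
    .split(';')
    .map(function (decl) { return decl.trim(); })
    .filter(function (decl) {
      const prop = decl.split(':')[0].trim().toLowerCase();
      return decl.includes(':') && ALLOWED_STYLES.includes(prop);
    })
    .join('; ');
}

// Hypothetical helper: rewrites the style attribute of every opening tag in an HTML string.
function filterInlineStyles(html) {
  return html.replace(/<[a-z][^>]*>/gi, function (tag) {
    return tag.replace(/\sstyle\s*=\s*"([^"]*)"/gi, function (match, styleText) {
      const kept = filterStyleValue(styleText);
      // Remove the attribute entirely when no declaration survives.
      return kept ? ' style="' + kept + '"' : '';
    });
  });
}
```

Matching on the exact property name means background-color is dropped even though it contains the word color.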

#13 @azaozz
11 years ago

On further testing: the paste events now fire when copying and pasting inside the editor, and even when dragging selected text or images. So our own classes and IDs would be removed by the above filters. As the filters are mostly meant for copying from a web page and pasting into the editor, it is better to remove the filters.

#14 @azaozz
11 years ago

The paste plugin was updated in 4.0.21. It has a new setting, paste_webkit_styles, that specifies which inline style properties should be retained in WebKit. It's currently set to font-weight font-style color. This can probably be reduced to just color, as font-weight and font-style should be expressed with <b>, <i>, <strong> or <em> elements.
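
As a configuration sketch, the reduction suggested above might look like this in the init object. The object name pasteSettings is hypothetical; the setting name and both values are the ones quoted in this comment.

```javascript
// Sketch only: `pasteSettings` is a hypothetical name for part of the TinyMCE init object.
const pasteSettings = {
  plugins: 'paste',
  // Current value per the comment above: 'font-weight font-style color'.
  // Proposed reduction: keep only color when pasting in WebKit browsers.
  paste_webkit_styles: 'color'
};
```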

#15 follow-up: @kirasong
11 years ago

azaozz: Does that mean we can't filter anything extra without killing our styles/markup, or am I misunderstanding, and there's a workaround you're planning?

#16 in reply to: ↑ 15 @azaozz
11 years ago

A workaround would be for the 'paste' plugin to set a context for every "cleanup". Internally it looks at the context in most cases, but that is not passed to the callbacks/custom events. This will probably be added, but I'm not sure it will make it into 3.9.

This ticket was mentioned in IRC in #wordpress-dev by nacin.


11 years ago

#18 @nacin
11 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

Per IRC, calling this fixed. New tickets for new issues or improvements.

My understanding is: It works as designed right now, and we can't make any further changes without breaking internal copy/pastes.

#19 @nacin
11 years ago

In 28091:

Add an explanation for 'Paste from Word' in the 'Paste from Text' dialog.

fixes #27777. see #27160.

#20 @awoz
11 years ago

  • Version set to 3.9

Gentlemen,

I see that this bug is fixed/closed, but there appears to be a regression on our WP 3.9 site.

My client normally used MS Word 2007 to copy/paste content into pages. This has worked in the past. Now the paste operation drops all/most formatting and color.

I can confirm that copy/paste from a .doc that is opened with LibreOffice DOES work, all formatting is retained properly.

#21 @awoz
11 years ago

FYI. After checking the TinyMCE website and pasting from Word 2007, it appears that core TinyMCE 4.0 has the regression.

Sorry about the false alarm for WP 3.9. Otherwise a nicely done integration.

-cheers
