Make WordPress Core

Opened 8 years ago

Last modified 5 years ago

#36818 new enhancement

Export filter for post meta

Reported by: justinbusa
Owned by:
Milestone:
Priority: normal
Severity: normal
Version:
Component: Export
Keywords: has-patch dev-feedback
Focuses:
Cc:

Description

It would be handy if we had a filter for modifying post meta before it is written to an export file.

Our plugin stores serialized arrays in post meta that get corrupted from time to time during the export/import process. The attached patch/filter would allow us to store the data differently in an export file to prevent that from happening.

Attachments (2)

export.diff (762 bytes) - added by justinbusa 8 years ago.
fastlinethemes.wordpress.2016-05-16.xml (13.4 KB) - added by justinbusa 8 years ago.
Export where the _fl_builder_data postmeta does not import.

Change History (14)

@justinbusa
8 years ago

#1 @justinbusa
8 years ago

  • Keywords has-patch dev-feedback added

#2 @dd32
8 years ago

Can you provide some details of how and why the values get corrupted? It seems to me that fixing the underlying bug, which would fix this for everyone, would be the best course of action.

#3 @netweb
8 years ago

  • Keywords reporter-feedback added; dev-feedback removed

Related, possible duplicate: #33461

#4 @justinbusa
8 years ago

Hey @dd32, thanks for following up on this!

TBH, I've dug through this more than I'd like to admit and am having a really tough time finding out what is causing the corruption of the serialized meta values.

At first, I thought it had something to do with the importer, but after further testing, I can't rule out the export either.

To give you some context as to how the data is built/saved...

The arrays are built based on user input (such as text or HTML entered in an editor). Once the arrays are ready to be saved/serialized, we do that using the update_metadata function (which serializes them for us).

We're not really doing anything special and are letting WordPress handle the heavy lifting when it comes to serializing, saving, exporting, and importing.

My guess is that it's some sort of encoding or special character that's changing during the transaction and breaking the length of the serialized data.

I've tried base64 encoding the exported meta and decoding it on import, and that fixed the issue. However, I'm not sure whether that's a proper solution.

I've attached a copy of an export with this issue. The post meta works when pulled from the database and unserialized on the source site, but it won't import into another site.

Any ideas you may have regarding the corruption of serialized data during export/import would be extremely helpful as I've been spinning my wheels on this one for a while.

Thanks!

@justinbusa
8 years ago

Export where the _fl_builder_data postmeta does not import.

#5 @justinbusa
8 years ago

  • Keywords dev-feedback added; reporter-feedback removed

#6 @dd32
8 years ago

@justinbusa Would you be able to supply an example of the data that was being lost?

It's entirely possible that it's related to multibyte characters, though. PHP has a long history of not being able to properly deserialize some strings containing multibyte characters when they're created on another system.
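
To illustrate why multibyte characters matter here: PHP's serialized string token has the form s:<length>:"<value>";, and the length is a byte count, not a character count. The Python sketch below models only that one token. The helper names are hypothetical and this is not the PHP or WordPress code. It shows how re-encoding the same visible text changes the byte count, so the stored prefix no longer matches.

```python
def php_serialize_str(value: str, encoding: str = "utf-8") -> bytes:
    """Model PHP's s:<byte length>:"<value>"; token for a single string."""
    raw = value.encode(encoding)
    return b's:%d:"%s";' % (len(raw), raw)


def php_unserialize_str(token: bytes, encoding: str = "utf-8") -> str:
    """Parse one string token; raise ValueError if the length prefix is wrong."""
    if not token.startswith(b"s:"):
        raise ValueError("not a string token")
    length_text, sep, rest = token[2:].partition(b":")
    if not sep or not length_text.isdigit():
        raise ValueError("malformed length prefix")
    length = int(length_text)
    # Expect: opening quote, exactly `length` bytes, closing quote, semicolon.
    if rest[:1] != b'"' or rest[1 + length:] != b'";':
        raise ValueError("length prefix does not match payload")
    return rest[1:1 + length].decode(encoding)


text = "café"  # 4 characters, but 5 bytes in UTF-8
token = php_serialize_str(text)  # b's:5:"caf\xc3\xa9";'
assert php_unserialize_str(token) == text

# Assumed corruption for illustration: the bytes are transcoded from UTF-8
# to Latin-1 somewhere in transit, but the "5" prefix is left untouched.
transcoded = token.decode("utf-8").encode("latin-1")
try:
    php_unserialize_str(transcoded, "latin-1")
except ValueError as err:
    print("unserialize failed:", err)
```

Nothing changes visibly in the text ("café" still reads the same), which would be consistent with an export that looks intact but fails to unserialize.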

#7 @justinbusa
8 years ago

Definitely! What's the best way to go about that? An SQL export? I've attached the .xml export previously.

#8 @justinbusa
8 years ago

Also, it being related to multibyte characters sounds very possible. Nothing is _visibly_ changing in the export. If that is the case, do you know of anything we could do to prevent those from breaking the serialized data?

#9 @dd32
8 years ago

@justinbusa Sorry! My apologies, I didn't see that you'd added the export file.

#10 @justinbusa
8 years ago

I did some more testing on this and it looks like there are a number of issues that could cause serialized data to be corrupted. I haven't been able to pinpoint any consistent patterns, but it does seem like these can happen from time to time...

  1. The WXR_Parser_Regex class adds newlines to the import data (it also rtrims each import line). See #32320.
  2. The simplexml_* functions appear to be breaking serialized data when passed through them. I don't have an example on hand, but I've seen this in my previous testing. I'm guessing that might have to do with multibyte characters.
  3. Mixed newline characters seem to be breaking serialized data now and then.
  4. PHP in general is not properly handling multibyte characters as you mentioned. This would explain the random weirdness that can occur.
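
As a rough illustration of items 1 and 3, the Python sketch below models a single PHP serialized string token (s:<byte length>:"<value>";) and passes it through a hypothetical line-based reader that trims trailing whitespace and normalizes newlines. The reflow_lines function is a stand-in written for this example, not the actual WXR_Parser_Regex logic.

```python
import re


def prefix_matches(token: bytes) -> bool:
    """Return True if an s:<n>:"<value>"; token's byte count equals <n>."""
    m = re.fullmatch(rb's:(\d+):"(.*)";', token, re.DOTALL)
    return m is not None and int(m.group(1)) == len(m.group(2))


def reflow_lines(data: bytes) -> bytes:
    """Hypothetical line-based reader: split on any newline style, strip
    trailing whitespace from each line, and rejoin with LF."""
    return b"\n".join(line.rstrip() for line in data.splitlines())


# A value with trailing spaces and a CRLF line ending (20 bytes in total).
value = b"line one  \r\nline two"
token = b's:%d:"%s";' % (len(value), value)

print(prefix_matches(token))                # intact token
print(prefix_matches(reflow_lines(token)))  # after trimming and CRLF -> LF
```

The reflowed value is three bytes shorter (two trailing spaces and one carriage return), while the prefix still says 20, so a strict unserializer would reject it even though the text looks the same on screen.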

Given the number of ways serialized data can break on export/import, it seems like finding a better way to store it might be the solution. I've never had an issue with saving/pulling from the database, but export/import has been full of them.

In my testing, base64 encoding/decoding seems to solve the issues. Do you think that would be something to explore for exporting/importing serialized data in core, or is that too far-fetched? I'm no base64 expert, so I don't know what the ramifications of that could be (if any).
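
A minimal sketch of why base64 might help, written in Python rather than PHP and using hypothetical helper names: standard base64 output is plain ASCII with no spaces or newlines, so whitespace trimming, newline normalization, and re-encoding between ASCII-compatible charsets leave it unchanged. The "mangling" step below is an assumption chosen for illustration, not a reproduction of the importer.

```python
import base64


def encode_meta_for_export(serialized: bytes) -> str:
    """Wrap serialized bytes in base64 before writing them to the export."""
    return base64.b64encode(serialized).decode("ascii")


def decode_meta_on_import(exported: str) -> bytes:
    """Recover the original serialized bytes from the exported text."""
    return base64.b64decode(exported)


# A serialized-style token containing multibyte characters, trailing
# spaces, and a CRLF line ending.
value = "héllo  \r\nwörld".encode("utf-8")
serialized = b's:%d:"%s";' % (len(value), value)

exported = encode_meta_for_export(serialized)

# Assumed transit damage: per-line trimming, LF normalization, and a
# Latin-1 round trip. None of these alter pure-ASCII base64 text.
mangled = "\n".join(line.rstrip() for line in exported.splitlines())
mangled = mangled.encode("latin-1").decode("latin-1")

assert decode_meta_on_import(mangled) == serialized
```

The known trade-offs are that base64 makes the payload about a third larger and that the meta value is no longer human-readable in the export file.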

As for the original filter request, the importer does have a filter for modifying post meta before it's saved to the database, so it would be nice if we had one to modify it on export as well.

Thanks again for your help on this!

#11 @dd32
8 years ago

For a lot of people the answer is just "Don't use complex objects when not needed", but that doesn't really fly for your scenario.

I'm not sure of the best option here, though, so paging @rmccue: what do you think we should/could do here to make imports more foolproof?

#12 @justinbusa
8 years ago

Thanks! I definitely get that. We used to store data in a custom table, but then we weren't compatible with revisions and export/import (among other things), so we switched to serialized post meta. It has worked extremely well for us except for this one issue. Ryan, I'd love to hear your thoughts :)
