Opened 3 years ago
#55128 new defect (bug)
REST API /media leaves holes in the result, making it virtually impossible to paginate through them
| Reported by: | frankieandshadow | Owned by: | |
|---|---|---|---|
| Milestone: | Awaiting Review | Priority: | normal |
| Severity: | normal | Version: | 5.9 |
| Component: | REST API | Keywords: | |
| Focuses: | rest-api | Cc: | |
Description
A request to the REST API for, say, /media?per_page=20 can unexpectedly return any number of results up to 20, including an empty array, even when there are more pages to follow.
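To illustrate, here is a minimal TypeScript sketch of the symptom, using the global fetch API against a hypothetical site at example.com (fetchMediaPage is my own helper name, not anything from WordPress; the later sketches reuse it):

```ts
// Hypothetical helper: fetch one page of the /media collection and compare
// what came back against what the page size promised.
async function fetchMediaPage(page: number, perPage = 20) {
  const res = await fetch(
    `https://example.com/wp-json/wp/v2/media?page=${page}&per_page=${perPage}`
  );
  const items: unknown[] = await res.json();
  const totalPages = Number(res.headers.get("X-WP-TotalPages"));
  // Expected: items.length === perPage on every page but the last.
  // Observed: anywhere from 0 to perPage items, even when page < totalPages,
  // because attachments of unpublished posts are dropped after the query.
  console.log(`page ${page}: ${items.length} of ${perPage} items`);
  return { items, totalPages };
}
```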
This happens when some of the media library entries are attached to unpublished posts, and the media were added within the post rather than directly to the media library.
Now it may be reasonable to omit these when not authenticated, but not by post-processing the array after it has been fetched from the database, as seems to be happening currently.
Doing that makes it extremely difficult to paginate through them, or to choose an appropriate page size. This is compounded by the fact that the missing images are more likely to be near the beginning of the results, as the most recent uploads are the ones least likely to have been published yet.
Consider just displaying a grid of the thumbnails, as obtained from the API, 20 at a time. You ask for a page of 20 and get 5. OK, let's fetch another page before returning anything to the client: this time you get 18, so you take the first 15 of those and, together with the first 5, deliver 20 to the client. The user clicks "show more". Where do I start? I can't start on page 3, because there were results left over on page 2. Nor can I set an appropriate offset, because I have no idea where the gaps are: offset apparently _includes_ the missing entries, and there is no information about where they might be. In short, using offset is not possible.
The only reliable solution is to start from the beginning every time and skip however many results have already been delivered to the client (sketched below). This defeats the point of pagination, and it gets slower and slower as the user asks for more.
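A sketch of that workaround, reusing the hypothetical fetchMediaPage() helper above; every "show more" re-walks the collection from page 1 and discards what the client already has:

```ts
// O(n) requests per click: restart from page 1, skip what was already
// delivered, then collect the next `wanted` visible items.
async function showMoreFromScratch(alreadyDelivered: number, wanted: number) {
  const out: unknown[] = [];
  let skipped = 0;
  let page = 0;
  let totalPages = Infinity; // learned from the first response
  while (out.length < wanted && ++page <= totalPages) {
    const result = await fetchMediaPage(page);
    totalPages = result.totalPages;
    for (const item of result.items) {
      if (skipped < alreadyDelivered) { skipped++; continue; }
      if (out.length < wanted) out.push(item);
    }
  }
  return out;
}
```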
Alternatively, I could deliver a variable number of images, more than requested, up to some maximum (used as per_page), and work entirely in whole pages, noting the page-number high-water mark on each request and each time fetching as many pages as needed to reach some minimum number of images (see the sketch below). This doesn't require starting again each time, but if there are many unpublished images it can still issue hundreds of API requests that each return an empty array, which is also very slow. It also makes it impossible to support a UI where the user specifies how many images to retrieve at once.
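A sketch of that high-water-mark approach, again using the hypothetical fetchMediaPage() helper:

```ts
// Remember the last page fetched; keep fetching whole pages until some
// minimum number of new images turns up. If many attachments are hidden,
// this can chew through a long run of pages that each yield nothing.
async function showMoreByPages(lastPage: number, minWanted: number) {
  const out: unknown[] = [];
  let page = lastPage;
  let totalPages = Infinity;
  while (out.length < minWanted && page < totalPages) {
    page++;
    const result = await fetchMediaPage(page);
    totalPages = result.totalPages;
    out.push(...result.items);
  }
  // Caller stores the returned lastPage as the new high-water mark; the
  // client may get more images than it asked for (or fewer, at the end).
  return { images: out, lastPage: page };
}
```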
Or I could work with per_page=1, which lets me use the page number where I would have liked to use offset, but that is also very slow: at least 20 requests, and probably many more to skip past unpublished images, where I would expect to need only one request (sketched below).
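A sketch of the per_page=1 approach; the stopping condition assumes the collection endpoint responds with an error status for a page number beyond the last page:

```ts
// One request per image: with per_page=1 the page number acts as an
// offset that counts hidden entries too, so it stays consistent between
// calls. Pages whose single entry is hidden return an empty array.
async function showMoreOneByOne(lastPage: number, wanted: number) {
  const out: unknown[] = [];
  let page = lastPage;
  while (out.length < wanted) {
    page++;
    const res = await fetch(
      `https://example.com/wp-json/wp/v2/media?page=${page}&per_page=1`
    );
    if (!res.ok) break; // assumed: error status past the last page
    const items: unknown[] = await res.json();
    out.push(...items); // zero items if this entry is hidden, else one
  }
  return { images: out, lastPage: page };
}
```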
And either way it is much harder to implement than it should be, as the obviously intended implementation is to set offset to the number of images already retrieved, set per_page to a number up to 100, and just fetch that page.
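For comparison, a sketch of that intended one-request implementation, which the holes in the results make incorrect:

```ts
// One request per "show more": skip what has already been shown, fetch the
// next batch. Correct only if the API returned densely packed pages, which
// is exactly what this ticket reports it does not do.
async function showMoreIntended(delivered: number, wanted: number) {
  const url =
    `https://example.com/wp-json/wp/v2/media` +
    `?offset=${delivered}&per_page=${Math.min(wanted, 100)}`;
  const res = await fetch(url);
  return (await res.json()) as unknown[];
}
```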