Opened 4 years ago
Last modified 4 years ago
#51092 new feature request
Create a JSON schema for Privacy and Other Related Disclosures
Reported by: | carike | Owned by: | |
---|---|---|---|
Milestone: | Future Release | Priority: | normal |
Severity: | normal | Version: | |
Component: | Privacy | Keywords: | needs-privacy-review 2nd-opinion |
Focuses: | rest-api | Cc: | |
Description (last modified by )
Background:
The Disclosures Tab is an initiative that is underway in the Core Privacy Team.
The aim is to help site owners / admins better understand what information their site (plugins, themes and Core) collects, where the information is stored and where it is sent - and in particular, who it is shared with.
We hope to help site owners / admins make more informed privacy choices (e.g. when choosing which plugin to install) and to better understand their risk profile when it comes to privacy.
For the most part, the actual "controlling" is planned for a sibling plugin, the Permissions Tab, which is not currently intended to be merged into Core, as this will contain more advanced settings.
You can read more about the various privacy initiatives here: https://make.wordpress.org/core/2020/08/19/minutes-core-privacy-meeting-19-august-2020/
The Challenge:
Free-form disclosures in the readme.txt would create a lot of additional work for the plugins review team.
Moreover, they make it nearly impossible to compare plugins, or to use the information in any sort of automated process.
The Disclosures Tab seeks to standardize the way that plugin authors, theme authors and Core can disclose privacy and other related concerns to site owners / admins, by creating quasi-"headers" and limiting the acceptable values for each.
The Solution:
Each plugin, theme and core component can have a file called disclosures.json that could be read by Core (and Meta) using relatively simple REST API functionality.
In its current form, the JSON schema does not set any fields as "required".
As a URL is not one of the six data types supported by JSON, URL fields have been typed as "string"s.
The format for internal URLs has been set to "uri-reference" to allow for relative URLs.
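To illustrate the difference between the two formats, here is a minimal Python sketch. It is not how Core would validate URLs (the ticket leaves that to PHP); the function names are hypothetical and the checks are deliberately rough. In particular, "uri" is approximated as "has a scheme and a host", which suits http(s) links but is stricter than RFC 3986.

```python
from urllib.parse import urlsplit


def looks_like_uri(value):
    """Approximate the "uri" format: an absolute URL with a scheme and a host."""
    try:
        parts = urlsplit(value)
    except ValueError:  # raised for malformed input such as a broken IPv6 host
        return False
    return bool(parts.scheme) and bool(parts.netloc)


def looks_like_uri_reference(value):
    """Approximate "uri-reference": absolute URLs and relative references both pass."""
    if not value or any(ch.isspace() for ch in value):
        return False
    try:
        urlsplit(value)
    except ValueError:
        return False
    return True
```

Under these assumptions, an external Privacy Policy link such as https://example.com/privacy passes both checks, while an internal endpoint such as /wp-json/example-plugin/v1/items passes only the "uri-reference" check.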
Items are not marked as "uniqueItems" because we would rather warn (after validation in PHP) than reject the file because of duplicates.
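The "warn rather than reject" approach to duplicates could look roughly like the sketch below. It is written in Python purely for illustration (the ticket plans this check in PHP), and the function name and message wording are assumptions.

```python
from collections import Counter


def duplicate_warnings(section, key, values):
    """Return one warning per value that appears more than once in an array."""
    counts = Counter(values)
    return [
        f"{section}.{key}: '{value}' is listed {count} times"
        for value, count in counts.items()
        if count > 1
    ]


# A file with a repeated license URL still loads; it only produces a warning.
warnings = duplicate_warnings(
    "licenses",
    "code",
    ["https://example.com/gpl", "https://example.com/mit", "https://example.com/gpl"],
)
```

Because the schema itself does not set "uniqueItems", a file like this stays valid and the duplicate is surfaced to the author separately.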
Scope:
This ticket proposes the schema.
[] will be created for the validation of the schema by Core (particularly the URLs using PHP).
[] will be created for internationalization (WP CLI and WordPress.org).
#51156 creates developer documentation.
#51144 proposes a UI for site-level privacy disclosures and related settings.
{ "$schema": "https://core.trac.wordpress.org/ticket/51092", "$id": "https://example.com/to.be.filled.in.later.disclosures.json", "description": "The vision of the Disclosures Tab is for site administrators to understand their site's privacy risk profile and to make more informed privacy-related choices as a result. The mission of the Disclosures Tab is to help site administrators understand what information their site collects, where it is stored and where it is sent - and in particular, with whom it is shared.", "type": "object", "properties": { "info": { "description": "This section provides information to help identify the code.", "type": "object", "properties": { "component": { "description": "One of the following values: plugin, theme, or the specific Core component (e.g. avatar).", "type": "string" }, "slug": { "description": "The slug, if the code relates to a plugin or a theme.", "type": "string" }, "version": { "description": "Which version of disclosures.json this represents for the individual component.", "type": "string" }, "since": { "description": "The plugin or theme's version number, or the Core version, if the component is a Core component, which introduced the current version of this disclosures.json file. I.e. this should represent the since value.", "type": "string" } } },
"licenses": { "description": "This section contains more information about licensing.", "type": "object", "properties": { "code": { "description": "A comma-separated list of URLs linking to the licenses that apply to the use of this component (plugin, theme, or Core component).", "type": "array", "items": { "type": "string", "format": "uri" } }, "localAssets": { "description": "A comma-separated list of URLs linking to the licenses that apply to the use of each asset that has been included locally. This should include the license of any bundled libraries, as well as the licenses of any images, fonts, etc.", "type": "array", "items": { "type": "string", "format": "uri" } }, "remoteAssets": { "description": "A comma-separated list of URLs linking to the licenses that apply to the use of each asset that is accessed remotely. This should include the licenses of any external libraries, as well as the licenses of any images, fonts, etc.", "type": "array", "items": { "type": "string", "format": "uri" } } } },
"external": { "description": "This section provides more information relating to the Privacy Policies of the external network sites being called.", "type": "object", "properties": { "PHP": { "description": "A comma-separated list of URLs linking to the respective Privacy Policies of the sites to which the external network calls are being made in PHP.", "type": "array", "items": { "type": "string", "format": "uri" } }, "JavaScript": { "description": "A comma-separated list of URLs linking to the respective Privacy Policies of the sites to which the external network calls are being made in JavaScript.", "type": "array", "items": { "type": "string", "format": "uri" } }, "CSS": { "description": "A comma-separated list of URLs linking to the respective Privacy Policies of the sites to which the external network calls are being made in CSS.", "type": "array", "items": { "type": "string", "format": "uri" } } } },
"terms": { "description": "This section contains more information about third party terms and conditions that may apply to use of the software.", "type": "object", "properties": { "SaaS": { "description": "A comma-separated list of URLs linking to the Terms of Service of any instances of Software as a Service.", "type": "array", "items": { "type": "string", "format": "uri" } }, "externalAPIs": { "description": "A comma-separated list of URLs linking to the Terms of Service of any external API being used.", "type": "array", "items": { "type": "string", "format": "uri" } }, "remoteAssets": { "description": "A comma-separated list of URLs linking to the Terms of Service that apply to the use of each remote asset. This relates to the use of CDNs for images, fonts, etc.", "type": "array", "items": { "type": "string", "format": "uri" } }, "registration": { "description": "A comma-separated list of URLs linking to the Terms of Service that apply to any accounts that need to be registered in order to be able to make use of this component's code.", "type": "array", "items": { "type": "string", "format": "uri" } } } },
"openWeb": { "description": "Details about mechanisms that allow others to obtain information from the site without browsing the website's front end.", "type": "object", "properties": { "apiEndpoints": { "description": "A comma-separated list of relative URLs for any internal API endpoints that are created by the code.", "type": "array", "items": { "type": "string", "format": "uri-reference" } }, "feeds": { "description": "A comma-separated list of relative URLs for any internal feeds that are created by the code.", "type": "array", "items": { "type": "string", "format": "uri-reference" } } } },
"clientSide": { "type": "object", "properties": { "setsCookiesPHP": { "description": "The names of any cookies that have been set using PHP.", "type": "array", "items": { "type": "string" } }, "setsCookiesJavaScript": { "description": "The names of any cookies that have been set using JavaScript.", "type": "array", "items": { "type": "string" } }, "usesLocalStorage": { "description": "Whether or not the code makes use of local storage.", "type": "boolean" } } },
"communication": { "description": "This section provides more information about how the software communicates with external parties.", "type": "object", "properties": { "email": { "type": "object", "properties": { "sends": { "description": "Whether or not the code sends e-mails.", "type": "boolean" }, "subscribed": { "description": "Whether e-mails are only sent to users that have subscribed for that particular e-mail (e.g. a newsletter).", "type": "boolean" } } } } },
"database": { "description": "This section contains information about how the software interacts with the site's database (MySQL or MariaDB).", "type": "object", "properties": { "writesToDB": { "description": "Whether or not the code writes to the database.", "type": "object", "properties": { "auto": { "type": "array", "items": [ { "description": "Whether or not the code writes to the database in relation to information that is not explicitly input by a user.", "type": "boolean" } ], "additionalItems": false }, "manual": { "type": "array", "items": [ { "description": "Whether or not the code writes information to the database that was explicitly input by the user.", "type": "boolean" } ], "additionalItems": { "description": "A comma-separated list of capabilities that authorize a user to write information to the database within the code.", "type": "string" } } } }, "CPT": { "description": "Whether the component creates any Custom Post Types.", "type": "object", "properties": { "auto": { "type": "array", "items": [ { "description": "Whether or not the code automatically creates any Custom Post Types without user intervention.", "type": "boolean" } ], "additionalItems": { "description": "The names of any Custom Post Types that are created automatically by the code without user intervention.", "type": "string" } }, "manual": { "type": "array", "items": [ { "description": "Whether or not the code allows for users to generate Custom Post Types.", "type": "boolean" } ], "additionalItems": { "description": "A comma-separated list of capabilities that authorize a user to create Custom Post Types within the code.", "type": "string" } } } }, "customTables": { "description": "Whether or not the code creates any custom tables in the database.", "type": "object", "properties": { "auto": { "type": "array", "items": [ { "description": "Whether or not custom tables are automatically created by the code without user intervention.", "type": "boolean" } ], "additionalItems": { "description": "The names of any custom tables that are automatically created by the code without user intervention.", "type": "string" } }, "manual": { "type": "array", "items": [ { "description": "Whether or not the code allows the user to create any custom tables.", "type": "boolean" } ], "additionalItems": { "description": "A comma-separated list of capabilities that authorize a user to create custom tables within the code.", "type": "string" } } } } } },
"otherStorage": { "description": "Provides more information about where information may be stored, other than the database.", "type": "object", "properties": { "writesToFiles": { "description": "A comma-separated list of file types the code writes to (e.g. .txt).", "type": "array", "items": { "type": "string" } }, "fileStructure": { "description": "Whether or not the code makes changes to the website's file structure.", "type": "object", "properties": { "auto": { "description": "Whether or not the code makes changes, or is capable of making changes, to the website's file structure that are not explicitly initiated by a user. This should not include files that are added directly from the repository, or in the original .zip file.", "type": "boolean" }, "manual": { "description": "Whether or not the code makes changes, or is capable of making changes, to the website's file structure that are explicitly initiated by the user. This should not include files that are added directly from the repository, or in the original .zip file.", "type": "boolean" } } } } },
"automation": { "description": "Provides more information with regards to action taken by the code without user input.", "type": "object", "properties": { "cron": { "description": "Whether the code makes use of scheduled tasks that do not require user input.", "type": "boolean" } } },
"ppi": { "description": "Whether or not the code stores any Protected Personal Information.", "type": "boolean" }, "compatibility": { "description": "Indicates whether or not the code is compatible with Privacy Tools.", "type": "object", "properties": { "ppiExport": { "description": "Does the developer, in good faith, consider the code to be compatible with the PPI Export Tool in WordPress?", "type": "array", "items": [ { "type": "boolean" } ], "additionalItems": false }, "ppiErasure": { "description": "Does the developer, in good faith, consider the code to be compatible with the PPI Erasure Tool in WordPress?", "type": "array", "items": [ { "type": "boolean" } ], "additionalItems": false }, "consentAPI": { "description": "Does the developer, in good faith, consider the code to be compatible with the WordPress Consent API?", "type": "array", "items": [ { "type": "boolean" } ], "additionalItems": false }, "disclosuresTab": { "description": "Does the developer, in good faith, consider the code to be compatible with the Disclosures Tab?", "type": "array", "items": [ { "type": "boolean" } ], "additionalItems": false }, "permissionsTab": { "description": "Does the developer, in good faith, consider the code to be compatible with the Permissions Tab?", "type": "array", "items": [ { "type": "boolean" } ], "additionalItems": false } } },
"monetization": { "type": "object", "description": "This section provides more information about monetization practices. It is included to help facilitate transparency and fair business dealings. Please note that disclosure here does not relieve a developer from any specific obligations that they may have under applicable statutes.", "properties": { "upsells": { "description": "More information about upselling in the code.", "type": "array", "items": [ { "description": "Does this code promote a paid version, or extensions, or other products or services from the same author(s)?", "type": "boolean" } ], "additionalItems": { "description": "A comma-separated list of URLs linking to the Terms of Service that apply to any paid version, or extension, or other products or services from the same author(s).", "type": "string", "format": "uri" } }, "donations": { "description": "More information about donations that are facilitated by the code.", "type": "array", "items": [ { "description": "Does this code contain any request to donate, or information on how to donate, to the plugin or its developer(s)?", "type": "boolean" } ], "additionalItems": { "description": "A comma-separated list of URLs linking to the Terms of Service that apply to the platform being used to facilitate donations.", "type": "string", "format": "uri" } }, "backLinks": { "description": "More information about the code requesting credit.", "type": "array", "items": [ { "description": "Does this code contain or generate, or ask the site owner / admin for permission to generate, backlinks?", "type": "boolean" } ], "additionalItems": { "type": "string", "format": "uri" } }, "affiliates": { "description": "More information about affiliate networks that are promoted by the code.", "type": "array", "items": [ { "description": "Does this code contain, or generate affiliate links - i.e. links from which the author may receive conditional compensation, whether in money, or in kind?", "type": "boolean" } ], "additionalItems": { "description": "A comma-separated list of URLs linking to the Terms of Service that apply to affiliate networks being promoted by the code.", "type": "string", "format": "uri" } }, "advertising": { "description": "More information about advertising that is facilitated by the code.", "type": "array", "items": [ { "description": "Does the code contain, or generate promotions or recommendations for any products or services not directly under the control of the author(s), for which the author(s) receive any compensation, whether in money, or in kind?", "type": "boolean" } ], "additionalItems": { "description": "A comma-separated list of URLs linking to the Terms of Service that apply to any products or services that are being advertised by the code.", "type": "string", "format": "uri" } } } } } }
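For orientation, a disclosures.json file written against this schema might look like the sketch below. Every value is hypothetical (the slug "example-plugin", the example.com URLs, the cookie name), and only a handful of the optional sections are filled in. The short Python helper, whose name is also an assumption, shows how the "boolean first, details after" arrays are meant to be read.

```python
import json

# Hypothetical disclosures.json content for an imaginary plugin.
EXAMPLE = """
{
  "info": {"component": "plugin", "slug": "example-plugin", "version": "1.0", "since": "2.3.0"},
  "external": {"PHP": ["https://example.com/privacy"]},
  "openWeb": {"apiEndpoints": ["/wp-json/example-plugin/v1/items"]},
  "clientSide": {"setsCookiesPHP": ["example_session"], "usesLocalStorage": false},
  "database": {"writesToDB": {"auto": [false], "manual": [true, "manage_options"]}},
  "ppi": true,
  "compatibility": {"ppiExport": [true], "ppiErasure": [false]},
  "monetization": {"upsells": [true, "https://example.com/terms"]}
}
"""


def split_flag(entry):
    """Split a [flag, detail, ...] array into its boolean flag and the remaining details."""
    flag, *details = entry
    return flag, details


doc = json.loads(EXAMPLE)
writes_manually, capabilities = split_flag(doc["database"]["writesToDB"]["manual"])
```

Here `writes_manually` is the yes/no answer and `capabilities` holds the capability names that follow it in the array.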
Acknowledgements:
Thanks to Timothy for suggesting that we use a .json file instead of plugin and theme headers.
Thanks to Swissspidy for information on i18n - see comment below.
Thanks to Apedog for suggesting improved phrasing for the "external" property's description.
Change History (36)
This ticket was mentioned in Slack in #core by carike. View the logs.
4 years ago
This ticket was mentioned in Slack in #polyglots by carike. View the logs.
4 years ago
#21
@
4 years ago
I think it'd probably be better to put the schema in either a GitHub repo or at least a PR against wordpress/develop. In its current form, providing comments on it is really difficult.
#22
follow-up:
↓ 24
@
4 years ago
- Keywords 2nd-opinion added
For the most part, the actual "controlling" is planned for a sibling plugin, the Permissions Tab, which is not currently intended to be merged into Core.
Two things:
- Not sure what "controlling" means in this context? Control what exactly?
- Thinking this is not a good idea. A "feature" is either in core or it isn't. Adding "something" to core that will need a plugin to become useful is not the way to go imho.
Each plugin, theme and core component can have a file called disclosures.json
This makes sense for themes and plugins but perhaps not for core? In this context what is a "core component" and why do "all" core components need separate files? WP core is not built from "independent components".
...using relatively simple REST API functionality
Why this should be available (only?) through the API and not as a simple text or HTML? How is this going to be used in core? Seems the "application design" part of this is not formulated (yet).
(Also a quick note: multiple code examples in the ticket description are hard to follow and not particularly useful.)
#23
@
4 years ago
- Focuses privacy removed
- Type changed from enhancement to feature request
On second thought, this is (inseparable) part of #51144, perhaps can be closed as duplicate and handled there. The whole feature still needs designing.
#24
in reply to:
↑ 22
;
follow-up:
↓ 27
@
4 years ago
Replying to TimothyBlynJacobs:
I think it'd probably be better to put the schema in either a GitHub repo or at least a PR against wordpress/develop. In its current form, providing comments on it is really difficult.
Thank you for the advice. Working on it :)
Replying to azaozz:
For the most part, the actual "controlling" is planned for a sibling plugin, the Permissions Tab, which is not currently intended to be merged into Core.
Two things:
- Not sure what "controlling" means in this context? Control what exactly?
- Thinking this is not a good idea. A "feature" is either in core or it isn't. Adding "something" to core that will need a plugin to become useful is not the way to go imho.
The best way I can explain this is to compare it to Site Health.
Some Site Health features enhance what is already in Core, but they are not necessary for a not-insignificant number of users, or more useful for a limited amount of time, so they are intentionally kept in the plugin.
The Core implementation works - and is useful - in its own right, but it is augmented by additional (debugging) features in a plugin.
The aim is to decide what constitutes sane defaults for Core and then to provide extended options via a Permissions plugin.
In this case, the Core functionality is more focused on awareness / education.
The aim is to help site owners / admins understand their privacy risk profile.
Advanced options may need some sandboxing capabilities (to enforce site owner / admin choices) that would not be ideal for Core, but may be in high demand as a plugin.
Each plugin, theme and core component can have a file called disclosures.json
This makes sense for themes and plugins but perhaps not for core? In this context what is a "core component" and why do "all" core components need separate files? WP core is not built from "independent components".
To create a disclosures.json file for Core would be a large undertaking (which would include documenting all external references in the code). Breaking it up into the various components makes it more achievable and also makes it more useful to the site owner / admin because it is explained in more manageable chunks.
...using relatively simple REST API functionality
Why this should be available (only?) through the API and not as a simple text or HTML? How is this going to be used in core? Seems the "application design" part of this is not formulated (yet).
Free-form disclosures would not be as useful as standardized disclosures, which allow site owners / admins to compare plugins / themes' privacy practices.
A JSON format helps with standardization and can be easier for developers.
The REST API is used to validate the JSON data and to make it accessible in a format that is more friendly towards site owners and admins.
(Also a quick note: multiple code examples in the ticket description are hard to follow and not particularly useful.)
These are not multiple examples. They are intended to be a single schema. I just split them into separate blocks because on the laptop, the vertical scroll bar is at the bottom.
I am working on getting a copy up on GitHub though, as per Timothy's suggestion, as I understand that it can be hard to read here.
Replying to azaozz:
On second thought, this is (inseparable) part of #51144, perhaps can be closed as duplicate and handled there. The whole feature still needs designing.
This ticket is intended for designing.
However, we need input from a very diverse group of people for this initiative and they do not want to read through a lot of items that do not apply to them, hence the separate (but highly related) tickets to allow for multiple working groups.
#25
follow-up:
↓ 26
@
4 years ago
Since yesterday, I have thought a lot about your questions, @azaozz :)
This ticket wasn't initially meant to outline a needs-analysis-of-sorts. We kind of took that for granted after previous tickets, I think. This is not ideal and it is something we could fix here, or elsewhere.
So, let's look at why the Privacy Policy initiative was not as successful as it could have been.
Please keep in mind that these are my opinions, influenced by others on the Privacy Team (and with particular acknowledgement of xkon).
wpdirectory.net shows that wp_add_privacy_policy_content() is being used in 243 plugin extensions.
There are currently 57,243 plugins in the repository.
So why was the uptake for this not higher?
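To put the two figures above side by side, the uptake works out to well under one percent. The variable names below are only illustrative.

```python
plugins_using_function = 243      # plugins calling wp_add_privacy_policy_content()
plugins_in_repository = 57_243    # total plugins in the repository

uptake_percent = round(plugins_using_function / plugins_in_repository * 100, 2)
print(f"{uptake_percent}% of plugins")  # about 0.42%
```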
UI
- Well, currently the privacy policies are hidden. So the first issue is that there is no (website admin) user-side UI, either in Core or in the Repo;
- Furthermore, the privacy policies are walls of text that are not really useful to site owners / admins / those managing the repositories.
Content
- The content is free-form, which would make it hard to compare if all the necessary info had been included;
- However, because it is aimed at providing legalese, in practice, there is likely very little difference between the content for an SEO plugin and one that changes the appearance of the admin screen.
So, what do we need to do?
- Create a standardized template for plugin, theme authors and Core to use;
- Have this template focus on practical concerns (mostly those that will be directly relevant to data mapping and inventory techniques), instead of on legalese;
- Make the information visible to site owners and admins in order to assess their privacy risk profiles. This would include an admin UI (perhaps with tabs / expandable and collapsible sections that would make it easier for the average admin to read), as well as a repo tab (which would require a meta ticket after we have finalized the format).
#26
in reply to:
↑ 25
@
4 years ago
- Version trunk deleted
Replying to carike:
Since yesterday, I have thought a lot about your questions, @azaozz :)
Great! Me too :)
This ticket wasn't initially meant to outline a needs-analysis-of-sorts. We kind of took that for granted after previous tickets, I think. This is not ideal and it is something we could fix here, or elsewhere.
Yes, I understand. In practical terms Trac is the place to develop software. This ticket talks about specifying a format for some data but not about how that data is going to be used. This is generally a wrong-way-to-do-things when developing software. The data structure/format can easily be determined once the usage and the purpose of that data are known. Hence I think this ticket should be closed in favor of #51144 that talks about the new WP feature that will eventually use this data.
So, let's look at why the Privacy Policy initiative was not as successful as it could have been.
Sounds good. Such analysis would be great imho. Thinking this ticket is not the right place though, how about we start it in #core-privacy in Slack and continue in a make/core blog post?
Anyway, let me quickly respond to some of your questions:
currently the privacy policies are hidden.
That's technically incorrect. The "privacy policy guide" (containing all policies) is accessible from the privacy policy settings screen.
So the first issue is that there is no (website admin) user-side UI
Again, technically incorrect. See above. Admins can access the privacy policy guide from the privacy settings screen. This guide (and the suggested text for the policy) is used very rarely, usually not more than once per year. There is also a mechanism to alert the site owners when some of the text changes.
...either in Core or in the Repo;
Sorry, not sure what you mean by "in the Repo". Site owners/admins generally do not use the WP Trac/GitHub?
Furthermore, the privacy policies are walls of text that are not really useful to site owners / admins / those managing the repositories.
If I remember right the decision at the time was based on the fact that the person(s) writing a website's privacy policy assume certain legal responsibilities. These responsibilities are different in different jurisdictions. The WordPress developers cannot "write" a default privacy policy, each site owner will have to do that (or hire a lawyer) in order to comply with the different laws and regulations.
Nowadays it may be possible to "standardize" the text needed for the privacy policy, however I don't think this will be particularly useful. Composing a legal document is the responsibility of the site owner. The data they need has to be provided by WP and (ideally) by themes and plugins. Guessing what data exactly each plugin and theme has to provide would not be a good idea imho.
#27
in reply to:
↑ 24
@
4 years ago
Replying to carike:
The best way I can explain this is to compare it to Site Health.
Some Site Health features enhance what is already in Core, but they are not necessary for a not-insignificant number of users, or more useful for a limited amount of time, so they are intentionally kept in the plugin.
The Core implementation works - and is useful - in its own right, but it is augmented by additional (debugging) features in a plugin.
The aim is to decide what constitutes sane defaults for Core and then to provide extended options via a Permissions plugin.
Still thinking I may be misunderstanding or missing something here, but don't see the similarities. Do you mean the proposed structured privacy data format should be filterable and changeable by plugins? If yes, thinking that from software development point of view that's not a good idea. Each change of the "structure" will bring back-compat concerns, etc.
To create a disclosures.json file for Core would be a large undertaking (which would include documenting all external references in the code). Breaking it up into the various components makes it more achievable and also makes it more useful to the site owner / admin because it is explained in more manageable chunks.
Don't think there are many "external references" in core. In fact think there is only one: to wordpress.org. This, of course, depends on the definition of "external references", could you explain/share it for clarity's sake.
Creating multiple hard-coded files for different "components" doesn't seem to make sense because:
- Nearly all WP components do not access external resources. The only exception is Install/Update that accesses only one resource: the API on wordpress.org. In addition this is a connection between the hosting company's servers and wordpress.org's API that doesn't contain any "personal information". It is still questionable if this has anything to do with user privacy.
- Having multiple (hard-coded json, xml, html, text, etc.) files would generally be harder to maintain and use.
Free-form disclosures would not be as useful as standardized disclosures, which allow site owners / admins to compare plugins / themes' privacy practices.
I think this is incorrect. Who will bear the legal responsibilities for deciding what is legally required, what is not, and what is "perhaps nice to have"? A pre-determined format for "standardized disclosures" doesn't seem possible from legal point of view.
A JSON format helps with standardization and can be easier for developers.
True, however see above.
The REST API is used to validate the JSON data and to make it accessible in a format that is more friendly towards site owners and admins.
Sorry but not following here.
- What data has to be validated, where and when, and what does it have to be validated against?
- What is the purpose of using the REST API to output a static file? This seems like a bad software design?
These are not multiple examples. They are intended to be a single schema. I just split them into separate blocks because on the laptop, the vertical scroll bar is at the bottom.
I am working on getting a copy up on GitHub though, as per Timothy's suggestion, as I understand that it can be hard to read here.
Right, thanks. Yeah, the "proper way" would be to either make a patch and upload to this ticket, or make a PR as suggested by TimothyBlynJacobs.
This ticket is intended for designing.
However, we need input from a very diverse group of people for this initiative...
Hmm, thinking that there are generally two groups of people that are/should be involved with this feature.
- People that have legal experience, are interested in web privacy, and have an understanding of how software is developed (or are willing to learn).
- WP developers (designers, coders, etc.) that can "design and create" the feature after the requirements are established by the above group of people.
#28
follow-up:
↓ 29
@
4 years ago
What data has to be validated, where and when, and what does it have to be validated against?
This ticket is creating a JSON schema. Plugins can then provide privacy data that must validate against that schema. That turns an unstructured data format into a structured data format that can be validated.
What is the purpose of using the REST API to output a static file? This seems like a bad software design?
It likely wouldn't be outputting a static file since some strings will need to be translated.
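As a rough illustration of what "validate against that schema" means in practice, the Python sketch below checks a small subset of the keywords used in the ticket ("type", "properties", "items", "additionalItems"). It is an assumption-laden toy, not the validator Core would use: "format" is ignored, and the function name and error wording are made up for the example.

```python
def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value conforms."""
    python_types = {"object": dict, "array": list, "string": str, "boolean": bool}
    expected = schema.get("type")
    if expected in python_types and not isinstance(value, python_types[expected]):
        return [f"{path}: expected {expected}"]

    errors = []
    if isinstance(value, dict):
        # Nothing is "required" in the proposed schema, so only present keys are checked.
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors += validate(value[key], sub_schema, f"{path}.{key}")
    elif isinstance(value, list):
        items = schema.get("items", {})
        for index, element in enumerate(value):
            if isinstance(items, list):
                # Tuple form: positional schemas first, then "additionalItems".
                if index < len(items):
                    sub_schema = items[index]
                else:
                    sub_schema = schema.get("additionalItems", {})
            else:
                sub_schema = items
            if sub_schema is False:
                errors.append(f"{path}[{index}]: unexpected extra item")
            elif isinstance(sub_schema, dict):
                errors += validate(element, sub_schema, f"{path}[{index}]")
    return errors


# A fragment of the schema from the ticket description.
SCHEMA = {
    "type": "object",
    "properties": {
        "ppi": {"type": "boolean"},
        "compatibility": {
            "type": "object",
            "properties": {
                "ppiExport": {"type": "array", "items": [{"type": "boolean"}], "additionalItems": False}
            },
        },
    },
}

problems = validate({"ppi": "yes", "compatibility": {"ppiExport": [True, True]}}, SCHEMA)
```

With this input, `problems` lists the wrong type for "ppi" and the extra array item in "ppiExport", which is the kind of feedback a plugin author would get for a malformed file.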
#29
in reply to:
↑ 28
@
4 years ago
Replying to TimothyBlynJacobs:
This ticket is creating a JSON schema. Plugins can then provide privacy data that must validate against that schema.
Yeah, I understand. The problem is that having such (fixed) schema in WP doesn't seem to be possible from a laws/regulations compliance point of view. I.e. the author(s) of the schema may carry legal responsibility for items that are included or not included there.
What is the purpose of using the REST API to output a static file? This seems like a bad software design?
It likely wouldn't be outputting a static file since some strings will need to be translated.
Hmm, the way I see it the data will not include any translatable strings. It would be "just data" used to build some UI. The (numerous) code examples in the ticket description don't seem to serve the purpose well and will need changing (separate data from view, etc.).
Even if there are translatable strings, the data is "static" by nature, why would there be a REST API end point for such data? :)
#30 (follow-up: ↓ 31) 4 years ago
Yeah, I understand. The problem is that having such (fixed) schema in WP doesn't seem to be possible from a laws/regulations compliance point of view. I.e. the author(s) of the schema may carry legal responsibility for items that are included or not included there.
I'll leave that for the privacy team. My point about using JSON schema is it provides us a good way to validate that if we do want structured data, which I think is the privacy team's goal.
Hmm, the way I see it the data will not include any translatable strings. It would be "just data" used to build some UI.
I think they probably would as mentioned in comment:3.
Even if there are translatable strings, the data is "static" by nature, why would there be a REST API end point for such data? :)
Why should it not be? If it is made available over the REST API, it is easier for other consumers to access that data. It'd also make it easier for a React powered front-end. It being available over the REST API also doesn't preclude it from being accessed in different ways as well.
#31 (in reply to: ↑ 30) 4 years ago
Replying to TimothyBlynJacobs:
I'll leave that for the privacy team. My point about using JSON schema is it provides us a good way to validate that if we do want structured data, which I think is the privacy team's goal.
Right. This question was raised when the initial discussion happened, around two years ago if I'm not mistaken, and I don't think there's been a clear answer yet.
Why should it not be? If it is made available over the REST API, it is easier for other consumers to access that data. It'd also make it easier for a React powered front-end. It being available over the REST API also doesn't preclude it from being accessed in different ways as well.
I'm still thinking we're talking apples and oranges here :)
- Where does the data that needs validation come from? Static file(s), one supplied by core and (eventually) a few supplied by plugins.
- When should the validation happen? On every request of... what? Or once after a plugin is installed and then the result is saved in the DB? Or.... how is that going to work efficiently?
- What happens when the validation fails? The plugin supplying the data is... rejected (deleted, disabled, or... rejected how)?
- Does it make sense for such validation to be in core at all, or maybe better to be on accepting plugins to the plugins directory, or..?
- What happens when the schema needs to be changed? Re-validation?
Generally (continuously) validating static, non-editable files in core seems... unwise?
Also, looking through the code examples in the ticket description, quite a bit of the data seems "sensitive", i.e. only admins should be able to see it. So at best this should be a page under the Plugins and Themes menu items in wp-admin accessible only to site admins, or perhaps a "More Info" link for each plugin and theme. For security reasons this data (as proposed above) should never be freely accessible. The point is: whether this should be available through REST API should be decided after the implementation details and UI are ready, not before.
#32 (follow-up: ↓ 33) 4 years ago
Right. This question was raised when the initial discussion happened, around two years ago if I'm not mistaken, and I don't think there's been a clear answer yet.
As I understand it, the point is to create a superset of facts about how a plugin handles user data, makes external API requests and other privacy related info. That way, individual plugins can be created for different privacy laws. The Core part is more about standardizing a data format so that plugins, and perhaps Core, can implement functionality based on the laws of the region the site adheres to. As well as making sure that data is disclosed to the site owner in an easy to understand way.
This can really only work if the standard is in Core to give the best potential at plugin adoption.
The current system we have in place is freeform, and I don't think has proven to be very successful. Plugin authors aren't lawyers, but are practically being asked to write up "legal" privacy policy information that the site user then needs to figure out a way to cobble into a legal document of their own.
By making the privacy data as fact based as possible, it reduces the burden on plugin authors who want to provide this information. It'd also allow for plugins or other tools to compile comprehensive privacy policies and other documents based on the structured information.
I'm still thinking we're talking apples and oranges here :)
Probably :) My comments are really only focussed on the technical side assuming this is the feature set we want to implement.
Where does the data that needs validation come from? Static file(s), one supplied by core and (eventually) a few supplied by plugins.
As you mentioned, Core is pretty simple. As I understand it, the main audience here is plugin developers. So eventually data can be displayed in the admin and on the WordPress.org plugin page.
When should the validation happen? On every request of... what? Or once after a plugin is installed and then the result is saved in the DB? Or.... how is that going to work efficiently?
For Core, it could be validated when the data needs to be accessed on the privacy page. For .org, I imagine it'd validate when zips are built.
What happens when the validation fails? The plugin supplying the data is... rejected (deleted, disabled, or... rejected how)?
I think this is mainly for @carike. But I think the idea is just that the plugin would show as having incomplete or invalid privacy disclosures. I don't think the idea currently is, and probably never would be, for Core to completely forbid plugins from operating unless they have complete disclosure data. I imagine there would probably be plugins that do implement something like that.
Does it make sense for such validation to be in core at all, or maybe better to be on accepting plugins to the plugins directory, or..?
I think it is still necessary to have in Core to handle the plugins that don't live in the WordPress.org directory. Having a JSON Schema in Core also gives us versioning tied to WordPress releases.
What happens when the schema needs to be changed? Re-validation?
If we do need to cache validation status, yeah we could re-validate that in a fairly straightforward way I think.
Generally (continuously) validating static, non-editable files in core seems... unwise?
Which files would that be? I don't think we'd need to do that for Core's privacy disclosures. Just plugins, and I don't think it'd need to be continuous. And if we need to implement caching, I don't think it'd be that complex.
quite a bit of the data seems "sensitive", i.e. only admins should be able to see it.
In what way? As I understand it, the idea is that this data would be displayed publicly on WordPress.org.
So at best this should be a page under the Plugins and Themes menu items in wp-admin accessible only to site admins, or perhaps a "More Info" link for each plugin and theme.
I think it would be guarded the same way we guard the Privacy Settings page already.
The point is: whether this should be available through REST API should be decided after the implementation details and UI are ready, not before.
How is the REST API not a part of the implementation discussion? Ignoring the REST API until the last second and seeing it as merely a simple data transport mechanism is how we continuously get into trouble.
#33 (in reply to: ↑ 32) 4 years ago
Replying to TimothyBlynJacobs:
As I understand it, the point is to create a superset of facts about how a plugin handles user data, makes external API requests and other privacy related info.
Right, this should include only privacy related data that is considered public.
The Core part is more about standardizing a data format so that plugins, and perhaps Core, can implement functionality based on the laws of the region the site adheres to. As well as making sure that data is disclosed to the site owner in an easy to understand way.
Right again, the data should be disclosed only to the site owner(s) on a per-site basis.
This can really only work if the standard is in Core to give the best potential at plugin adoption.
The current system we have in place is freeform, and I don't think has proven to be very successful. Plugin authors aren't lawyers, but are practically being asked to write up "legal" privacy policy information that the site user then needs to figure out a way to cobble into a legal document of their own.
Right. This ensures each site owner can decide (or hire a lawyer if needed) what their Privacy Policy should contain, and bear the legal responsibility for it.
By making the privacy data as fact based as possible, it reduces the burden on plugin authors who want to provide this information.
Yeah, perhaps. Looking at the examples above, a lot of points are not particularly clear, but thinking this can be improved?
It'd also allow for plugins or other tools to compile comprehensive privacy policies and other documents based on the structured information.
Wrong. Privacy policies cannot be compiled by "(other) tools". They have to be written by people or businesses/companies who will be legally responsible for the content.
As you mentioned, Core is pretty simple. As I understand it, the main audience here is plugin developers. So eventually data can be displayed in the admin and on the WordPress.org plugin page.
Yeah, this is a good idea. However the example "schema" above has a lot of things that don't seem "privacy related", needs more work.
For Core, it could be validated when the data needs to be accessed on the privacy page. For .org, I imagine it'd validate when zips are built.
So we will need to maintain/sync two different "schemas", one in core and another on wp.org. Then plugins will be "forced" to include a (json formatted) file that will have to contain all the "required fields" or will be marked as "failed", even when they do not contain any user privacy related stuff? That seems... not ideal.
What happens when the validation fails? The plugin supplying the data is... rejected (deleted, disabled, or... rejected how)?
I think this is mainly for @carike. But I think the idea is just that the plugin would show as having incomplete or invalid privacy disclosures. I don't think the idea currently is, and probably never would be, for Core to completely forbid plugins from operating unless they have complete disclosure data. I imagine there would probably be plugins that do implement something like that.
Yeah, this needs more thinking imho. The majority of plugins have nothing to do with user privacy.
I think it is still necessary to have in Core to handle the plugins that don't live in the WordPress.org directory. Having a JSON Schema in Core also gives us versioning tied to WordPress releases.
Right, the question of syncing the schema between wp.org and core...
What happens when the schema needs to be changed? Re-validation?
If we do need to cache validation status, yeah we could re-validate that in a fairly straightforward way I think.
Even if not cached, the validation will (likely) fail every time the schema is updated. Then all existing plugins will "fail"...
Generally (continuously) validating static, non-editable files in core seems... unwise?
Which files would that be?
The static json files supplied by plugins. But yeah, probably not a huge deal if these are not going to be accessed often. As far as I see it, on most sites these might be accessed 1-2 times per year, or less :)
quite a bit of the data seems "sensitive", i.e. only admins should be able to see it.
In what way? As I understand it, the idea is that this data would be displayed publicly on WordPress.org.
Making the data supplied by plugins "public" on a specific site will at least disclose which plugins that site is using. This in itself can be seen as a "privacy breach", can be used for "fingerprinting", the plugin's versions will probably be "guessable" from the data, etc. :)
So at best this should be a page under the Plugins and Themes menu items in wp-admin accessible only to site admins, or perhaps a "More Info" link for each plugin and theme.
I think it would be guarded the same way we guard the Privacy Settings page already.
Right, so the data contained in the plugin's json files would be "private" (on a per site basis) and only site owners will be able to see it? (Only the site owners will need to see it anyways as it is intended for creating a Privacy Policy). Or am I reading this wrong?
The point is: whether this should be available through REST API should be decided after the implementation details and UI are ready, not before.
How is the REST API not a part of the implementation discussion? Ignoring the REST API until the last second and seeing it as merely a simple data transport mechanism is how we continuously get into trouble.
It's not that it is not a part of it but... Would you add an end point to output /readme.txt or /license.txt? Does it make sense from "restful" point of view? What's the point of having that in the REST API (considering that this data would be very rarely accessed and used only by site owners/users with the highest permissions).
As far as I understand it the (compiled) data from all the plugins json files can be outputted by the REST API, in case a plugin might want to replace the (proposed) page in wp-admin (instead of extending it), but... In the end this is the same as outputting all the data for the Comments page for example, just because a plugin might eventually decide to replace it? Seems WP may get there one day but...?
#34 4 years ago
Wrong. Privacy policies cannot be compiled by "(other) tools". They have to be written by people or businesses/companies who will be legally responsible for the content.
There are services like this that already exist, for instance iubenda. A tool like that could consume the data over the REST API and provide much more accurate data as to what the site does with the data.
However the example "schema" above has a lot of things that don't seem "privacy related", needs more work.
I agree.
So we will need to maintain/sync two different "schemas", one in core and another on wp.org.
I imagine .org would use the schema in its WordPress install and we wouldn't be breaking BC, just adding new features or changing their format in BC ways so it being on trunk wouldn't be an issue.
Then plugins will be "forced" to include a (json formatted) file that will have to contain all the "required fields" or will be marked as "failed", even when they do not contain any user privacy related stuff? That seems... not ideal.
Something for @carike. I imagine they'd say "Privacy information not available". Which doesn't seem too bad. The minimal set of fields that you'd probably want to provide for a plugin that has zero privacy impact is probably something like:
{ "ppiExport": true, "ppiErasure": true, "consentAPI": true, "disclosuresTab": true, "permissionsTab": true }
I don't think that is too much of a burden for plugin authors to explicitly declare that they don't need to implement those features and I don't really see how else we could do it short of code analysis, which would get us a lot less accurate data.
The majority of plugins have nothing to do with user privacy.
Definitely, but I think there are still quite a number. Particularly of the most popular plugins.
- WP Http 10k: https://wpdirectory.net/search/01EHHMBF85N0WSNBAVWNBG32P0
- Cookies 3k: https://wpdirectory.net/search/01EHHMCYPW7MXY0HS07S4XNXCG
- User Meta 4.5k: https://wpdirectory.net/search/01EHHMN2EV0QYPW0RXQV26AKNA
And I think the ones that couldn't possibly have any privacy impact will be evident from the description. For the ones where it isn't so clear, the ability to say no this plugin doesn't contact any external APIs, etc... would be a good thing for those plugins I think.
Even if not cached, the validation will (likely) fail every time the schema is updated. Then all existing plugins will "fail"...
Why?
If it is from a technical implementation perspective, I imagine a function signature like wp_get_plugin_privacy_data( $plugin, $force_revalidate = false ): array|WP_Error. If the plugin's privacy data has changed or the version of the schema is newer, we'd revalidate before returning that data.
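As a rough sketch of that revalidation rule (in Python for brevity, not the PHP Core would use), the cached result can be keyed on a hash of the disclosures file plus the schema version. Everything here is hypothetical: the `SCHEMA_VERSION` constant, the in-memory `_cache`, the stand-in `validate` function, and the extra `raw` parameter that stands in for reading the plugin's disclosures.json.

```python
import hashlib

SCHEMA_VERSION = 2  # hypothetical: bumped whenever the schema changes
_cache = {}         # plugin slug -> (file hash, schema version, result)


def validate(raw):
    """Stand-in for real schema validation (assumption): non-empty data passes."""
    return {"valid": bool(raw.strip())}


def get_plugin_privacy_data(plugin, raw, force_revalidate=False):
    """Return the cached result unless the file or the schema version changed."""
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    cached = _cache.get(plugin)
    if (
        not force_revalidate
        and cached is not None
        and cached[0] == digest
        and cached[1] == SCHEMA_VERSION
    ):
        return cached[2]
    # Either nothing is cached, the data changed, the schema is newer,
    # or the caller asked for a forced revalidation.
    result = validate(raw)
    _cache[plugin] = (digest, SCHEMA_VERSION, result)
    return result
```

With something along these lines, validation would run once per plugin update or schema change rather than on every request.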
If it is from a perspective of making changes to WordPress' schema we'd make any changes backward compatible, the same way we currently do. I don't think it would be acceptable for there to be BC breaks there, nor do I imagine why we'd need them. Fields aren't currently marked as required and if a new format is necessary, this can be accommodated in the schema definition.
Making the data supplied by plugins "public" on a specific site will at least disclose which plugins that site is using. This in itself can be seen as a "privacy breach", can be used for "fingerprinting", the plugin's versions will probably be "guessable" from the data, etc. :)
Where would that be disclosed? There would be a machine readable .json file in the plugin directory's folder, but you'd need to know the site is running that plugin beforehand. It is also already trivial to detect because of readme files, version history, etc... And is already possible using sites like Built With.
Right, so the data contained in the plugin's json files would be "private" (on a per site basis) and only site owners will be able to see it? (Only the site owners will need to see it anyways as it is intended for creating a Privacy Policy). Or am I reading this wrong?
Yep! That matches my understanding.
It's not that it is not a part of it but... Would you add an end point to output /readme.txt or /license.txt? Does it make sense from "restful" point of view?
I'd like to yeah. You can use api.wordpress.org for .org hosted plugins, but for non .org plugins it makes retrieving that data impossible. We now have a plugins endpoint that returns the plugin header information, but that is limited.
What's the point of having that in the REST API (considering that this data would be very rarely accessed and used only by site owners/users with the highest permissions).
We have a settings endpoint and a plugins endpoint that are only accessible to administrators. I'd also wager for most WordPress sites the admin is the only user on the whole install :)
As far as I understand it the (compiled) data from all the plugins json files can be outputted by the REST API, in case a plugin might want to replace the (proposed) page in wp-admin (instead of extending it), but... In the end this is the same as outputting all the data for the Comments page for example, just because a plugin might eventually decide to replace it? Seems WP may get there one day but...?
I don't really get the resistance to making versioned, structured data that is at least in part dynamic available over a tool that is designed for doing that.
As a whole, IMO we should be thinking about how new features can integrate with the REST API from the outset of how that feature is being designed. It makes implementation a lot simpler that way and, as everything in WP-Admin is moving to a React powered interface, it is necessary at this point.
In terms of use cases for Core, if we made this available in Gutenberg when editing the Privacy Policy page similar to some of the initial mockups for how that page could work in the Classic Editor, making that available over REST would vastly simplify the implementation.
The same is true for plugin authors who are building tools. And as I mentioned earlier, I think this would be great functionality for external systems like Iubenda.
I also do think there is privacy data that would make sense to make public, for instance this could serve as the source of truth for cookies. That would be useful to access on the front-end to build a cookie consent screen.
If any of these values (e.g. URLs) would need to be translatable, which they probably do, the tooling in WP-CLI and WordPress.org needs to be updated accordingly.