Add telemetry (aka usage data collection) as an opt-in feature in core
|Reported by:||mor10||Owned by:|
Many discussions around changes, additions, and removal of features in WordPress core run into the same problem: We don't have the necessary data to know how end-users are interacting with the application and its features.
To solve this problem I propose the adoption of an opt-in telemetry feature in WordPress core that collects anonymized data on feature and functionality use. This is in line with what major software providers do, and it is a feature most users will be familiar with.
Implementation and activation
- The opt-in selector for the feature should be surfaced on first install, or when the site is updated to the first version of WordPress that contains the feature.
- For new installs the opt-in question should appear on the 5-minute install page along with "Allow search engines to index the site" or similar.
- For upgrades, the opt-in question should be revealed in a dedicated modal.
- The feature should be disabled by default, so that the admin has to make an active choice to participate.
- The feature should be controllable at any time through a dedicated section under Settings->General.
- It is possible the best way to make users feel this feature is not a Trojan horse is to ship it as a plugin that auto-installs on opt-in and auto-uninstalls on opt-out.
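As a rough illustration of the activation rules above, here is a minimal sketch of the opt-in state. It is written in Python for brevity rather than as core code, and the class and method names are hypothetical. It assumes a three-state setting: not yet asked, opted in, or opted out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TelemetrySetting:
    """Hypothetical stored setting. None means the admin has not been asked yet."""

    opted_in: Optional[bool] = None

    def should_prompt(self) -> bool:
        # Show the opt-in question (install page or upgrade modal) only once.
        return self.opted_in is None

    def is_collecting(self) -> bool:
        # Disabled by default: only an explicit "yes" enables collection.
        return self.opted_in is True

    def set_choice(self, opted_in: bool) -> None:
        # Called from the install page, the upgrade modal, or the settings screen.
        self.opted_in = opted_in
```

Keeping "not asked" separate from "declined" means an upgrade can show the modal exactly once without ever treating silence as consent.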
For sites that have opted in, some core data should always be collected, including but not limited to:
- Number of themes and plugins installed
- Frequency of use of specific views (Settings, Customizer, etc.)
- Current version
- Update status
- Locale (generalized to country)
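To make the list above concrete, here is a sketch of what a core report could look like. All field names are assumptions for illustration, as is representing update status as a boolean. The locale is reduced to its country part before it enters the report, and locales without a country part yield no country at all.

```python
from typing import Dict, Optional


def generalize_locale(locale: str) -> Optional[str]:
    """Reduce a locale such as 'nb_NO' to its country code ('NO')."""
    parts = locale.split("_")
    if len(parts) >= 2 and len(parts[1]) == 2 and parts[1].isalpha():
        return parts[1].upper()
    return None  # e.g. 'ja' carries no country information


def build_core_report(
    theme_count: int,
    plugin_count: int,
    view_counts: Dict[str, int],
    wp_version: str,
    update_available: bool,
    locale: str,
) -> dict:
    """Assemble the always-collected data points (hypothetical field names)."""
    return {
        "theme_count": theme_count,
        "plugin_count": plugin_count,
        "view_counts": dict(view_counts),
        "wp_version": wp_version,
        "update_available": update_available,
        # Only the generalized country is kept, never the full locale string.
        "country": generalize_locale(locale),
    }
```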
In addition, it should be possible to push custom queries to opted-in sites to test for specific interactions, for example how many users click the Underline button in TinyMCE. I'm not sure exactly what the best approach is here, but this is one idea: the feature queries a centralized service on a weekly or monthly basis to get instructions on what type of data is currently being collected.
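A minimal sketch of that idea, leaving out the network call itself. The instruction format (a JSON document with a "metrics" list), the event names, and the weekly interval are all assumptions for illustration; no such service or format exists yet.

```python
import json
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, Optional, Set

REFRESH_INTERVAL = timedelta(days=7)  # assumed weekly refresh


def is_refresh_due(last_fetch: Optional[datetime], now: datetime,
                   interval: timedelta = REFRESH_INTERVAL) -> bool:
    """True if instructions were never fetched or the interval has elapsed."""
    return last_fetch is None or now - last_fetch >= interval


def parse_instructions(raw: str) -> Set[str]:
    """Extract the requested event names from a hypothetical JSON document."""
    data = json.loads(raw)
    return {
        metric["event"]
        for metric in data.get("metrics", [])
        if isinstance(metric.get("event"), str)
    }


class EventCounter:
    """Counts only the interactions the current instructions ask for."""

    def __init__(self, requested: Iterable[str]) -> None:
        self.requested = set(requested)
        self.counts: Counter = Counter()

    def record(self, event: str) -> None:
        if event in self.requested:  # everything else is ignored
            self.counts[event] += 1
```

With instructions such as `{"metrics": [{"event": "tinymce.underline.click"}]}`, a click on Underline is counted while a click on Bold is never recorded, so only what is currently requested is collected.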
The decision on what data is collected should be made by a committee, based on the currently active tickets that require user data.
Anonymity and transparency
A core requirement for the success of this feature is that data collection must be 100% anonymized. No data collected can be traced back to an individual user. Ideally the feature itself will be built in such a way that even accidental collection of personal data is impossible.
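One possible way to guard against accidental collection is a strict allow-list applied just before anything leaves the site. The sketch below is an assumption about how that could look, with hypothetical field names. It narrows what can be sent but does not by itself guarantee anonymity.

```python
# Hypothetical allow-list: field name -> exact type the value must have.
ALLOWED_FIELDS = {
    "theme_count": int,
    "plugin_count": int,
    "wp_version": str,
    "update_available": bool,
    "country": str,
}


def sanitize_report(report: dict) -> dict:
    """Keep only allow-listed fields whose values have the expected type."""
    clean = {}
    for key, expected_type in ALLOWED_FIELDS.items():
        value = report.get(key)
        # Exact type match, so a bool is not accepted where an int is expected.
        if type(value) is expected_type:
            clean[key] = value
    return clean
```

Anything not on the list, for example an email address that a buggy collector added by mistake, is dropped rather than transmitted.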
At any time, information about what data is being collected should be available to end-users both on a dedicated page on WordPress.org and through the setting in admin.
All data collected should be made public, both to ensure transparency through scrutiny and so that anyone can put it to use.
Practical way forward
To prove the viability of this feature I propose a slow, incremental deployment: start with the collection of a few uncontroversial data points, such as the current language setting, the number of themes and plugins, and one UI interaction that needs testing. Once this MVP has proven itself effective, a larger-scale testing program can be shipped.