Telemetry poll results and their implications
Our previous post about the prospect of introducing telemetry turned out to be pretty divisive. Apart from the sensitive nature of the topic itself, poor wording and poll design mistakes made it look more evil than we ever planned it to be. Still, we were able to collect feedback. Thanks to all who participated! Now let’s discuss the results and what they mean for the telemetry module design.
Opt-in vs. opt-out
This is the most divisive issue. We worded the question as “Would you participate in data sharing”, with the idea that if an overwhelming majority of people were ready to do it, then making it opt-out would be acceptable to the community.
What we learnt is that roughly half the people don’t want to share any data. That part also met the most vocal opposition on the forums.
If you allow me a rant, those answers were most definitely sent from a browser that sends telemetry by default and doesn’t make it easy to disable.
Our community is actually important to us: we aren't going to do things a lot of people oppose unless we absolutely have to (and telemetry isn't one of those things).
However, I'd like to reiterate that the stats collection thing is meant as a way to hear from the silent majority—the people who don't have time to actively talk on the forums and channels and tell everyone what they are using, what works for them and what doesn't. Automatic stats collection is a zero-effort way to share that information and influence the project direction.
If you want to be heard, please consider enabling stats collection when it's added to the image.
Identification method
Most participants chose a random UUID over a hash of the hardware configuration. One person also noted that it should be changeable. Well, it's not like we have a way to prevent you from changing it.
However, that comment made me think that a random, changeable UUID indeed makes sense, since it allows a more flexible notion of the "same installation". One hardware box can be moved to a different place and given a completely different role, and logically this should be considered a different installation.
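To make the idea concrete, here is a minimal sketch of a random, changeable installation ID. Nothing here is decided: the function names and the file location are made up for illustration, and the real module may look completely different.

```python
import tempfile
import uuid
from pathlib import Path


def load_or_create_installation_id(path: Path) -> str:
    """Return the stored installation ID, creating a random one if absent."""
    if path.exists():
        return path.read_text().strip()
    new_id = str(uuid.uuid4())  # random, carries no hardware information
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(new_id + "\n")
    return new_id


def reset_installation_id(path: Path) -> str:
    """Discard the old ID so the box counts as a new installation."""
    path.unlink(missing_ok=True)  # requires Python 3.8 or later
    return load_or_create_installation_id(path)


if __name__ == "__main__":
    # Hypothetical file name; a temporary directory keeps the demo harmless.
    with tempfile.TemporaryDirectory() as tmp:
        id_file = Path(tmp) / "installation-id"
        first = load_or_create_installation_id(id_file)
        # The ID is stable until the user decides to reset it.
        assert first == load_or_create_installation_id(id_file)
        assert first != reset_installation_id(id_file)
```

Unlike a hardware hash, the ID follows the role rather than the box: repurpose the machine, reset the ID, and it is a new installation.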
Sharing installation data
Most people don't mind sharing the running VyOS version and a list of installed images. However, very few people want to share their geographic location. This is puzzling to me, but we have to respect that choice.
Sharing configuration data
Most people don't mind sharing a list of configured features. To my surprise, a few people are ready to share their complete anonymised configs—I thought no one would.
Sharing hardware data
Apparently hardware is the one thing people have the least hesitation about sharing.
Resource consumption isn't a controversial subject either.
Stats dump format
We said up front that we are not going to keep the data to ourselves: it will be shared with the community for exploration and analysis. When it comes to the dump format, though, there is no clear winner.
We wanted to hear from people before even starting to design the stats collection feature, so we haven't decided how we will store that information internally. However, we clearly need to consider making it easy to export the data in different formats.
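As a rough illustration of what "easy to export in different formats" could mean, the sketch below keeps records as plain dictionaries and renders them as either JSON or CSV. The field names and sample records are invented for the example; the actual storage format and schema are not decided.

```python
import csv
import io
import json

# Invented sample records; the real schema does not exist yet.
RECORDS = [
    {"installation_id": "00000000-0000-4000-8000-000000000001", "version": "1.3", "cpu_count": 4},
    {"installation_id": "00000000-0000-4000-8000-000000000002", "version": "1.2", "cpu_count": 2},
]


def to_json(records):
    """Render the records as a pretty-printed JSON array."""
    return json.dumps(records, indent=2, sort_keys=True)


def to_csv(records):
    """Render the records as CSV, one column per field seen in any record."""
    fieldnames = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    # restval fills in fields that a particular record does not have.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_json(RECORDS))
    print(to_csv(RECORDS))
```

If the internal representation stays this simple, adding another dump format is just one more small function.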
Ok, and now what?
Now we know that many people oppose opt-out collection (despite its provably anonymous and public nature), so the implementation will be opt-in. We need to think of good places to ask the user to opt in other than the installer, since many image flavors don't use an installer. One possible option is the "add system image" upgrade script.
Since more detailed options like location and RAM usage breakdown by process are not universally accepted, there will be configuration options for enabling and disabling them. There will also be op mode commands to display the data that would be sent.
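Here is one way the per-category switches and the "show what would be sent" command could fit together. This is only a sketch under assumptions: the setting names, the category names, and both functions are hypothetical and do not correspond to any existing VyOS configuration syntax or op mode command.

```python
import json

# Hypothetical settings; everything is off until the user opts in.
DEFAULT_SETTINGS = {
    "enabled": False,
    "share_version": True,
    "share_hardware": True,
    "share_location": False,
    "share_process_memory": False,
}


def build_report(collected, settings):
    """Return only the categories the user agreed to share, or None if disabled."""
    if not settings.get("enabled", False):
        return None
    report = {}
    for category, value in collected.items():
        # A category without an explicit switch is never sent.
        if settings.get(f"share_{category}", False):
            report[category] = value
    return report


def preview(collected, settings):
    """Text an op mode command could print: exactly what would be sent."""
    report = build_report(collected, settings)
    if report is None:
        return "Stats collection is disabled; nothing would be sent."
    return json.dumps(report, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Invented sample data for the demo.
    collected = {"version": "1.3", "hardware": {"cpu_count": 4}, "location": "XX"}
    print(preview(collected, DEFAULT_SETTINGS))
    print(preview(collected, dict(DEFAULT_SETTINGS, enabled=True)))
```

The point of building the preview from the same function that builds the report is that what you see is guaranteed to be what gets sent.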
In any case, this is obviously not a high priority feature, so don't expect it real soon. There's a lot of work to do before we can call the 1.3 release stable, so real work on stats reporting is unlikely to start before 1.3 enters a beta phase.
If you are interested in working on stats collection, feel free to discuss your design and implementation ideas in Phabricator, of course. And if you had any doubts about whether the stats collection code will be open source: it will.
Here is a link to the full report: https://myquests.typeform.com/report/Dxb7dP/lMbO1FE9wR8wYa1Q
Thanks for participating in our research!