[HN Gopher] A study of data collection by Android devices
___________________________________________________________________
A study of data collection by Android devices
Author : dede4metal
Score : 80 points
Date : 2021-10-15 09:37 UTC (13 hours ago)
(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)
| tmsbrg wrote:
| I was curious about LineageOS, so I checked and found:
|
| "On all of the other handsets the Google Play Services and Google
| Play store system apps send a considerable volume of data to
| Google, the content of which is unclear, not publicly documented
| and Google confirm there is no opt out from this data collection.
| LineageOS collects no data beyond this data collected by Google
| and so is perhaps the next most private choice after /e/OS."
|
| So the problem is Play Services and the Play store. Note that
| /e/OS uses MicroG[0] to replace Play Services so you can still
| use the many Android apps that require it. It's a really cool
| project and I think it's amazing how someone is like "Damn, I
| need a Google account for this app? I'll just write my own
| account manager to replace Google's!". Really that's the spirit
| of open source, and I hope more people become empowered to be
| able to solve problems that way in the future.
|
| I think you can also use MicroG rather than OpenGapps on
| LineageOS, though I haven't tried and haven't read anything about
| it. Does anyone have some info about this setup?
|
| [0] https://edevelopers-blog.medium.com/microg-what-you-need-
| to-...
| rashil2000 wrote:
| You can find all the required info here -
|
| https://lineage.microg.org
| jqpabc123 wrote:
| _Does anyone have some info about this setup?_
|
| Yes, you can use MicroG with LineageOS. If this totally solved
| the privacy issue, e/OS would be pointless.
|
| GApps is a big upper level privacy problem that microG can
| solve but this is not the only issue. By default, AOSP itself
| also sends personally identifying info to Google servers
| through low level system calls. The e/OS fork is needed in
| order to remove this.
| chuckee wrote:
| > By default, AOSP itself also sends personally identifying
| info to Google servers through low level system calls.
|
| Can you explain more about this? What kind of information is
| sent? And does LineageOS not disable this?
| jqpabc123 wrote:
| No, LineageOS does not disable this.
|
| Personal opinion but the LineageOS project seems more
| concerned with security than privacy. The idea of Google
| hijacking your privacy for profit doesn't seem to really
| bother them too much. Their web site and wiki repeatedly
| address security but rarely privacy.
|
| As for explaining further, I will defer to the document
| below from the e foundation.
|
| https://e.foundation/wp-content/uploads/2020/09/e-state-
| of-d...
| aaaxyz wrote:
| By default lineageOS comes with neither Google play services
| nor microg. The choice of what to install is left to the user.
|
| MicroG is a bit more complicated to install since it requires
| package signature spoofing, which official lineage builds don't
| enable since it's seen as a security vulnerability.
| 9387367 wrote:
| Good discussion of /e/ here:
| https://community.e.foundation/t/divestos-vs-e-os-security-a...
| zibzab wrote:
| My thoughts on this. TLDR version: some are probably innocent but
| some are worse than presented here.
|
| > Key findings from the study:
|
| >
|
| > With the exception of e/OS, all of the handset manufacturers
| examined collect a list of all the apps installed on a handset.
| This is potentially sensitive information since it can reveal
| user interests, e.g., a mental health app, a Muslim prayer app, a
| gay dating app, a Republican news app. There is no opt out from
| this data collection.
|
| This happens when looking for updates anyway
|
| > The Xiaomi handset sends details of all the app screens viewed
| by a user to Xiaomi, including when and how long each app is
| used. This reveals, for example, the timing and duration of phone
| calls. The effect is akin to the use of cookies to track people's
| activity as they move between web pages. This data appears to be
| sent outside Europe to Singapore.
|
| Could this be "standard" app analytics? Not saying it's okay, but
| this is the norm these days.
|
| > On the Huawei handset the Swiftkey keyboard sends details of
| app usage over time to Microsoft. This reveals, for example, when
| a user is writing a text, using the search bar, searching for
| contacts.
|
| Your custom wordlist is on the cloud with them, so they can see
| much more than that.
|
| > Samsung, Xiaomi, Realme and Google collect long-lived device
| identifiers, e.g., the hardware serial number, alongside user-
| resettable advertising identifiers. This means that when a user
| resets an advertising identifier the new identifier value can be
| trivially re-linked back to the same device, potentially
| undermining the use of user-resettable advertising identifiers.
|
| This is probably a major GDPR issue.
|
| Edit: could also be for guarantee reasons. The big question is if
| this is ever used for advertising.
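|
| A minimal sketch of why that re-linking is trivial, assuming the
| collector stores each report as a (serial, advertising ID) pair. The
| record layout, the link_ad_ids helper and all identifier values are
| hypothetical and not taken from the study:

```python
from collections import defaultdict


def link_ad_ids(events):
    """Group advertising IDs by the long-lived identifier sent with them."""
    by_device = defaultdict(list)
    for serial, ad_id in events:
        # Keep each advertising ID once, in the order it was first seen.
        if ad_id not in by_device[serial]:
            by_device[serial].append(ad_id)
    return dict(by_device)


# Hypothetical telemetry records: (hardware serial, advertising ID).
events = [
    ("SERIAL-A", "ad-1111"),
    ("SERIAL-B", "ad-2222"),
    ("SERIAL-A", "ad-1111"),
    ("SERIAL-A", "ad-3333"),  # same device after the user reset the ad ID
]

print(link_ad_ids(events))
```

| Grouping by the serial puts the old and the new advertising ID under
| the same device, so the reset hides nothing from whoever holds both.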
|
| > Third-party system apps, e.g., from Google, Microsoft, LinkedIn
| and Facebook, are pre-installed on most of the handsets and
| silently collect data, with no opt out.
|
| This is horrible!! I specifically avoid using any Facebook
| services and we all know about their shadow profiles for users
| who don't own a FB account.
|
| (But what kind of data do the apps have access to? In theory they
| are never used and have no privileges?)
|
| > There may exist a data ecosystem where data collected from a
| handset by different companies is shared/linked. Notably, the
| privacy focused e/OS variant of Android was observed to transmit
| essentially no data.
|
| We need more openness here.
| retSava wrote:
| > but this is the norm these days
|
| Norm as in normal, but I'd not say norm as in what most people
| expect and would accept if they knew.
|
| And norms change, let's not accept a "normal" as the de facto
| "this is how it should be" but instead work towards a better
| norm.
|
| That's the hard part, the easy part of how to do this is left
| as an exercise for the reader.
| illwrks wrote:
| Wow. As a Xiaomi phone user, and having not read the article
| yet, does it mention if that level of tracking happens on
| Global Rom (Android One) versions of their phones?
| gpas wrote:
| I loved my phone until I looked at nextdns logs and noticed
| it's reaching out to Xiaomi owned servers roughly every 30
| minutes. It's definitely tracking something. Now nextdns has
| a dedicated Xiaomi block list so I should be ok, but who
| knows?
|
| My bad for not checking the lineageos compatibility list
| before buying an unsupported device.
|
| Global MIUI 12.5.3 on a note 8 pro.
| rhn_mk1 wrote:
| > This happens when looking for updates anyway
|
| But it doesn't have to. My Debian system does not send a list
| of installed packages in order to get updates. It queries the
| list of available ones.
| black3r wrote:
| > This happens when looking for updates anyway
|
| I would still expect to have an opt-out of "looking for updates".
| Especially since this study suggests that this data is not
| collected by Play Store (Google) only, but also by the device
| manufacturer (possibly by their own store app which nobody
| really uses)
| srg0 wrote:
| LWN is a nice site, but to save you a couple of clicks, this is
| the original post by Trinity College Dublin:
|
| https://www.tcd.ie/news_events/articles/study-reveals-scale-...
|
| And this is the paper it talks about (PDF):
|
| https://www.scss.tcd.ie/Doug.Leith/Android_privacy_report.pd...
|
| "Key findings from the study:
|
| - With the exception of e/OS, all of the handset manufacturers
| examined collect a list of all the apps installed on a handset.
| This is potentially sensitive information since it can reveal
| user interests, e.g., a mental health app, a Muslim prayer app, a
| gay dating app, a Republican news app. There is no opt out from
| this data collection.
|
| - The Xiaomi handset sends details of all the app screens viewed
| by a user to Xiaomi, including when and how long each app is
| used. This reveals, for example, the timing and duration of phone
| calls. The effect is akin to the use of cookies to track people's
| activity as they move between web pages. This data appears to be
| sent outside Europe to Singapore.
|
| - On the Huawei handset the Swiftkey keyboard sends details of
| app usage over time to Microsoft. This reveals, for example, when
| a user is writing a text, using the search bar, searching for
| contacts.
|
| - Samsung, Xiaomi, Realme and Google collect long-lived device
| identifiers, e.g., the hardware serial number, alongside user-
| resettable advertising identifiers. This means that when a user
| resets an advertising identifier the new identifier value can be
| trivially re-linked back to the same device, potentially
| undermining the use of user-resettable advertising identifiers.
|
| - Third-party system apps, e.g., from Google, Microsoft, LinkedIn
| and Facebook, are pre-installed on most of the handsets and
| silently collect data, with no opt out.
|
| - There may exist a data ecosystem where data collected from a
| handset by different companies is shared/linked. Notably, the
| privacy focused e/OS variant of Android was observed to transmit
| essentially no data."
| nervuri wrote:
| > - With the exception of e/OS, all of the handset
| manufacturers examined collect a list of all the apps installed
| on a handset.
|
| /e/OS is no exception. I looked at the requests made by its
| "Apps" app. Every time it checks for updates, it tells the
| server what applications you have installed. These requests are
| made with a User-Agent header revealing your device model,
| build ID and Android version. Installed languages are also sent
| via the Accept-Language header. And there is no option to
| disable update checks; the closest you can get is to set the
| interval to monthly.
|
| Contrast that with F-Droid, which downloads the package index
| in advance (like apt does), so it doesn't need to send the
| server a list of installed apps in order to check for updates.
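|
| A rough sketch of that client-side check, assuming the index is a
| simple name-to-version mapping. The package names, the dotted version
| scheme and the find_updates helper are made up for illustration; the
| real F-Droid and apt index formats are richer than this:

```python
from __future__ import annotations


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '1.10.2' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def find_updates(installed: dict[str, str], index: dict[str, str]) -> dict[str, str]:
    """Return {package: newer_version} using only local data.

    `index` stands for the full catalogue downloaded from the server. It
    is the same file for every client, so fetching it does not reveal
    which packages this particular device has installed.
    """
    updates = {}
    for package, current in installed.items():
        available = index.get(package)
        if available is not None and parse_version(available) > parse_version(current):
            updates[package] = available
    return updates


# Hypothetical local state and downloaded index.
installed_apps = {"org.example.notes": "1.2.0", "org.example.maps": "3.0.1"}
downloaded_index = {
    "org.example.notes": "1.10.0",
    "org.example.maps": "3.0.1",
    "org.example.chat": "0.9.0",
}

print(find_updates(installed_apps, downloaded_index))
```

| The comparison happens entirely on the device; the only thing the
| server sees is a request for the whole index.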
| smoldesu wrote:
| I am curious, do iPhones not send a list of opened apps (a la
| MacOS) back to Apple periodically? I was under the impression
| that most phone vendors would collect statistics like that.
| dartharva wrote:
| >We find that the Samsung, Xiaomi, Huawei and Realme Android
| variants all transmit a substantial volume of data to the OS
| developer (i.e. Samsung etc) and to third-party parties that have
| pre-installed system apps (including Google, Microsoft, Heytap,
| LinkedIn, Facebook).
|
| Of course they do. That's the whole reason they're selling
| high-capacity hardware for cheap: they more than make up for their
| foregone profits with user data and third-party partnerships.
|
| That's why you should always flash a custom ROM whenever you buy
| a "value for money" Android phone; never stay on the vendor's OS.
| Thankfully, except Samsung and Huawei, most other Android device
| manufacturers aren't actively working on locking down their
| firmware against customization and appear tolerant as of yet. You
| can even choose not to install Google services on your phone,
| although it would make using it normally a hassle.
| gigel82 wrote:
| I wish someone did this for Windows. Couldn't find anything so
| started setting up myself (using 2 VirtualBox VMs, internal
| networking and mitmproxy).
|
| I can see the data with that setup, but it's way too much to
| parse by a (single) human. I'll collect just the URLs for now, to
| at least update my PiHole config to block what isn't needed for
| Windows Update.
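|
| One way the "collect just the URLs" step could look, as a sketch: it
| reduces captured URLs to unique hostnames and drops the ones on an
| allowlist. The URLs, the allowlist and the function names are
| placeholders, not real Windows endpoints or a mitmproxy API:

```python
from urllib.parse import urlsplit


def hostnames(urls):
    """Return the sorted, de-duplicated hostnames found in `urls`."""
    hosts = set()
    for url in urls:
        host = urlsplit(url).hostname  # lower-cased, without the port
        if host:
            hosts.add(host)
    return sorted(hosts)


def blocklist(urls, allowed):
    """Hostnames seen in the capture that are not on the allowlist."""
    return [host for host in hostnames(urls) if host not in allowed]


# Placeholder capture; real entries would come from the proxy's log.
captured = [
    "https://telemetry.example.com/v1/collect?id=123",
    "https://updates.example.net:443/manifest.json",
    "https://TELEMETRY.example.com/v1/ping",
    "http://ads.example.org/pixel.gif",
]
needed_for_updates = {"updates.example.net"}

for host in blocklist(captured, needed_for_updates):
    print(host)
```

| The output is one hostname per line, which is easy to review by hand
| before feeding it to a DNS blocker.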
| jccalhoun wrote:
| So what is this data for? Because the ads I get are still nearly
| completely irrelevant. Last week youtube showed me ads in Spanish
| which I have zero knowledge of and ads for a company in an
| entirely different state. If they can't use all this data to know
| that I don't understand Spanish or what state I live in then what
| good is all this data?
| lopis wrote:
| Like in all analytics, not all data collected is used. In fact,
| probably 99% of data ever collected is never used.
| Companies just want to preemptively collect it in case they
| need it in the future.
| marginalia_nu wrote:
| A lot of this seems to be a mechanism to soothe anxieties
| about whether you are on the right track when taking a risk.
| Feels good to have a nice graph that points upward. In the
| past you would have seen an oracle or an astrologer to get
| reassurance.
|
| People do it as well, they gather tons of statistics on
| themselves like how many steps they've walked and how many
| glasses of water they've had and how many hours they've
| slept, which for the most part is completely non-actionable
| information and your body is much better at telling you if
| you are feeling well rested than your spreadsheet is. You get
| a pretty graph for sure, but it ultimately doesn't say
| anything you didn't already know.
|
| I guess you could convince yourself this is science, but in
| science the hypothesis precedes the experiment. If you gather
| tons of arbitrary data and go digging for correlations, you
| will find them, but anything interesting you find is most
| likely going to be spurious relationships and other
| statistical aberrations.
|
| It's data dredging, not science.
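|
| A small simulation of that effect, as a sketch: every series below is
| pure noise, yet the strongest correlation found grows with the number
| of metrics searched. The sample size, metric counts and seed are
| arbitrary choices for illustration:

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_spurious_correlation(n_metrics, n_samples, seed=0):
    """Largest |r| between one noise series and `n_metrics` other noise series."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_metrics):
        metric = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(target, metric)))
    return best


for n_metrics in (1, 10, 1000):
    print(n_metrics, round(strongest_spurious_correlation(n_metrics, 20), 2))
```

| Because every run reuses the same seed, the larger searches include
| the smaller ones, so the reported maximum can only grow as more
| metrics are tried, even though no real relationship exists.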
| zibzab wrote:
| It could very much be that this data is not used for
| advertising and is mostly just really horribly implemented dev
| analytics.
|
| Having read the report, I can't find any smoking guns about
| _uses_ of the data. But we really don't know at this point.
| jqpabc123 wrote:
| _But we really don't know at this point._
|
| We do know ... or at least we should.
|
| Google has said it receives tens of thousands of "geofence"
| and "keyword" warrants each year looking to identify anyone
| within a certain geographic area at a particular time or
| anyone who searched for a particular keyword.
|
| There are 3 pertinent points here:
|
| 1) Google can absolutely identify you personally; otherwise,
| the warrants would be useless.
|
| 2) The authorities are searching info from your phone without
| probable cause (aka "fishing").
|
| 3) Innocent people have been convicted for being in the wrong
| place at the wrong time.
|
| https://techcrunch.com/2021/08/19/google-geofence-warrants/
| hulitu wrote:
| It is not used only for ads. It is also sold to other companies
| and 3 letter agencies.
| greenyoda wrote:
| Big discussion of the original source a few days ago:
| https://news.ycombinator.com/item?id=28830328
___________________________________________________________________
(page generated 2021-10-15 23:02 UTC)