https://zdimension.fr/whats-in-my-location-history/

What's in my Location History?

Posted Jun 30, 2024 By zdimension
11 min read

I've been using Google Maps' location history feature for a few years now. It's a bit creepy to think about how much data Google has about me, but it's also a goldmine for data analysis. Kindly enough, Google allows me to download all of my precious data in machine-readable format using Takeout:

[Image: Google Takeout UI with the Location History item]

This gives us a bunch of JSON files:

    $ ls -R
    .:
    Records.json  'Semantic Location History'  Settings.json  'Timeline Edits.json'

    './Semantic Location History':
    2012  2015  2017  2019  2021  2023
    2014  2016  2018  2020  2022  2024

    './Semantic Location History/2012':
    2012_SEPTEMBER.json

    './Semantic Location History/2014':
    2014_APRIL.json  2014_DECEMBER.json  2014_MAY.json

    ...

First look at the data

That first Records.json file here is almost a gigabyte for my account. Not surprising, given that it contains 10-odd years of location data. It contains every single "phone home" location data packet that my phone has sent to Google.

Unprocessed data

Here's a packet from it:

    {
      "latitudeE7": 460734187,
      "longitudeE7": 67312122,
      "accuracy": 3198,
      "activity": [
        {
          "activity": [
            {
              "type": "STILL",
              "confidence": 100
            }
          ],
          "timestamp": "2014-04-28T12:34:46.342Z"
        }
      ],
      "source": "CELL",
      "deviceTag": 803364441,
      "timestamp": "2014-04-28T12:30:55.424Z"
    }

This one doesn't contain much information because it's quite old; it was sent from the phone I had back then (a budget Wiko Lenny) and it seems like they didn't measure as much stuff as they do now.
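The `E7` suffix on the coordinate fields means they are stored as integers: degrees multiplied by 10^7. As a minimal sketch (not necessarily how the script behind this post does it), a raw record can be turned into plain degrees and a timezone-aware timestamp like this. The helper name `decode_record` and the output keys are made up for illustration.

```python
from datetime import datetime


def decode_record(record: dict) -> dict:
    """Convert one raw Records.json entry into friendlier Python values.

    `decode_record` and the output keys are hypothetical names,
    not part of the Takeout format.
    """
    return {
        # E7 fields are integer degrees scaled by 10**7
        "lat": record["latitudeE7"] / 1e7,
        "lon": record["longitudeE7"] / 1e7,
        # Older packets may lack some fields, so use .get() for optional ones
        "accuracy": record.get("accuracy"),
        "source": record.get("source"),
        # Timestamps are ISO 8601 with a trailing "Z" (UTC)
        "time": datetime.fromisoformat(record["timestamp"].replace("Z", "+00:00")),
    }


old_packet = {
    "latitudeE7": 460734187,
    "longitudeE7": 67312122,
    "accuracy": 3198,
    "source": "CELL",
    "timestamp": "2014-04-28T12:30:55.424Z",
}
decoded = decode_record(old_packet)
print(decoded["lat"], decoded["lon"], decoded["time"].isoformat())
```

Replacing the trailing `Z` by hand keeps the sketch working on Python versions older than 3.11, where `datetime.fromisoformat` does not accept it.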
Here's a more recent packet:

    {
      "latitudeE7": 506444929,
      "longitudeE7": 30775562,
      "accuracy": 22,
      "altitude": 76,
      "verticalAccuracy": 3,
      "activity": [
        {
          "activity": [
            {
              "type": "ON_FOOT",
              "confidence": 80
            },
            {
              "type": "WALKING",
              "confidence": 80
            },
            {
              "type": "IN_VEHICLE",
              "confidence": 6
            },
            {
              "type": "ON_BICYCLE",
              "confidence": 6
            },
            {
              "type": "IN_ROAD_VEHICLE",
              "confidence": 6
            },
            {
              "type": "IN_RAIL_VEHICLE",
              "confidence": 6
            },
            {
              "type": "RUNNING",
              "confidence": 2
            },
            {
              "type": "UNKNOWN",
              "confidence": 0
            }
          ],
          "timestamp": "2024-03-21T17:42:26.290Z"
        }
      ],
      "source": "WIFI",
      "deviceTag": 123456,
      "platformType": "ANDROID",
      "activeWifiScan": {
        "accessPoints": [
          {
            "mac": "xxxxxx237234444",
            "strength": -80,
            "isConnected": true,
            "frequencyMhz": 0
          },
          {
            "mac": "xxxxxx237222816",
            "strength": -85,
            "frequencyMhz": 0
          },
          {
            "mac": "xxxxxx559068245",
            "strength": -88,
            "frequencyMhz": 0
          },
          {
            "mac": "xxxxxx759298638",
            "strength": -89,
            "frequencyMhz": 0
          }
        ]
      },
      "osLevel": 34,
      "serverTimestamp": "2024-03-21T17:44:37.220Z",
      "deviceTimestamp": "2024-03-21T17:44:38.992Z",
      "batteryCharging": false,
      "formFactor": "PHONE",
      "timestamp": "2024-03-21T17:42:29.183Z"
    }

It only contains data processed locally on the phone. We can see it tried to guess what I was doing, and also sent a list of nearby Wi-Fi networks.

Processed data

The aptly named Semantic Location History folder contains the Google-processed data. Notably, it contains both place visits, which describe places I was at, and activity segments, which describe things I did over a period of time.
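Each processed entry is an object with a single top-level key, either `placeVisit` or `activitySegment`, so the two kinds are easy to tell apart. Here is a small sketch of that split, assuming the entries of a monthly file have already been loaded into a Python list (how they are wrapped at the top level of the file is not covered here). The function names are hypothetical.

```python
from datetime import datetime


def split_timeline(objects):
    """Separate processed entries into place visits and activity segments."""
    visits, segments = [], []
    for obj in objects:
        if not isinstance(obj, dict):
            continue  # skip anything that is not a JSON object
        if "placeVisit" in obj:
            visits.append(obj["placeVisit"])
        elif "activitySegment" in obj:
            segments.append(obj["activitySegment"])
    return visits, segments


def duration_hours(entry):
    """Length of a visit or segment in hours, from its `duration` block."""
    def parse(ts):
        # Timestamps are ISO 8601 with a trailing "Z" (UTC)
        return datetime.fromisoformat(ts.replace("Z", "+00:00"))

    span = entry["duration"]
    delta = parse(span["endTimestamp"]) - parse(span["startTimestamp"])
    return delta.total_seconds() / 3600


# Two trimmed-down entries using the timestamps from this section
entries = [
    {"placeVisit": {"duration": {"startTimestamp": "2024-03-20T18:36:21Z",
                                 "endTimestamp": "2024-03-21T17:42:24Z"}}},
    {"activitySegment": {"activityType": "WALKING",
                         "duration": {"startTimestamp": "2024-03-21T17:42:24Z",
                                      "endTimestamp": "2024-03-21T17:54:24Z"}}},
]
visits, segments = split_timeline(entries)
print(len(visits), len(segments), round(duration_hours(segments[0]) * 60))
```

Keeping visits and segments in separate lists makes later aggregation simpler, since the two kinds of entries carry different fields.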
Here are two packets from the same time period as the recent raw packet above:

    {
      "placeVisit": {
        "location": {
          "latitudeE7": 506443503,
          "longitudeE7": 30774789,
          "placeId": "ChIJD-fhtSEqw0cRkCuIhwJPMx8",
          "address": "147 Rue du Ballon, 59110 La Madeleine, France",
          "name": "Nexedi",
          "semanticType": "TYPE_UNKNOWN",
          "sourceInfo": {
            "deviceTag": 850787795
          },
          "locationConfidence": 46.95392,
          "calibratedProbability": 46.95392
        },
        "duration": {
          "startTimestamp": "2024-03-20T18:36:21Z",
          "endTimestamp": "2024-03-21T17:42:24Z"
        },
        "placeConfidence": "MEDIUM_CONFIDENCE",
        "visitConfidence": 99,
        "otherCandidateLocations": [
          {
            "latitudeE7": 506442688,
            "longitudeE7": 30772597,
            "placeId": "ChIJt-gKoQwqw0cR_BiE7FjE1sM",
            "address": "34 Av. Verdi, 59110 La Madeleine, France",
            "semanticType": "TYPE_SEARCHED_ADDRESS",
            "locationConfidence": 45.081074,
            "calibratedProbability": 45.081074
          },
          {
            "latitudeE7": 506443473,
            "longitudeE7": 30774441,
            "placeId": "ChIJW0n_oQwqw0cRLSCoT36NyAY",
            "address": "147 Rue du Ballon, 59110 La Madeleine, France",
            "semanticType": "TYPE_UNKNOWN",
            "locationConfidence": 7.4105234,
            "calibratedProbability": 7.4105234
          },
          {
            "latitudeE7": 506445560,
            "longitudeE7": 30775986,
            "placeId": "ChIJy6IzmAwqw0cRyUwotTJRo8M",
            "address": "48 Av. Louise, 59110 La Madeleine, France",
            "semanticType": "TYPE_UNKNOWN",
            "locationConfidence": 0.20785488,
            "calibratedProbability": 0.20785488
          },
          "/* many others */"
        ],
        "editConfirmationStatus": "NOT_CONFIRMED",
        "locationConfidence": 47,
        "placeVisitType": "SINGLE_PLACE",
        "placeVisitImportance": "MAIN"
      }
    }
    {
      "activitySegment": {
        "startLocation": {
          "latitudeE7": 506452913,
          "longitudeE7": 30765453,
          "sourceInfo": {
            "deviceTag": 850787795
          }
        },
        "endLocation": {
          "latitudeE7": 506525419,
          "longitudeE7": 30805372,
          "sourceInfo": {
            "deviceTag": 850787795
          }
        },
        "duration": {
          "startTimestamp": "2024-03-21T17:42:24Z",
          "endTimestamp": "2024-03-21T17:54:24Z"
        },
        "distance": 973,
        "activityType": "WALKING",
        "confidence": "HIGH",
        "activities": [
          {
            "activityType": "WALKING",
            "probability": 97.82699942588806
          },
          {
            "activityType": "IN_TRAM",
            "probability": 0.5118888337165117
          },
          {
            "activityType": "IN_PASSENGER_VEHICLE",
            "probability": 0.27286421973258257
          },
          {
            "activityType": "CYCLING",
            "probability": 0.17150864005088806
          },
          {
            "activityType": "IN_TRAIN",
            "probability": 0.1608503283932805
          },
          {
            "activityType": "IN_SUBWAY",
            "probability": 0.10802004253491759
          },
          {
            "activityType": "IN_BUS",
            "probability": 0.08674889104440808
          },
          {
            "activityType": "RUNNING",
            "probability": 0.07946699624881148
          },
          {
            "activityType": "IN_FERRY",
            "probability": 0.04058652848470956
          },
          {
            "activityType": "SAILING",
            "probability": 0.005038841118221171
          },
          {
            "activityType": "MOTORCYCLING",
            "probability": 0.0035849196137860417
          },
          {
            "activityType": "SKIING",
            "probability": 0.0032918462238740176
          },
          {
            "activityType": "FLYING",
            "probability": 9.03989166545216E-4
          }
        ],
        "waypointPath": {
          "waypoints": [
            {
              "latE7": 506444816,
              "lngE7": 30776507
            },
            {
              "latE7": 506459426,
              "lngE7": 30749950
            },
            {
              "latE7": 506460113,
              "lngE7": 30750701
            },
            {
              "latE7": 506521911,
              "lngE7": 30811212
            },
            {
              "latE7": 506525421,
              "lngE7": 30805346
            }
          ],
          "source": "INFERRED",
          "roadSegment": [
            {
              "placeId": "ChIJG9zengwqw0cRnoAeiorlC5U",
              "duration": "8s"
            },
            {
              "placeId": "ChIJHYTkuwwqw0cRwMDf0gLaefA",
              "duration": "71s"
            },
            "/* many others */"
          ],
          "distanceMeters": 1175.9853099670709,
          "travelMode": "WALK",
          "confidence": 0.9999485583582982
        },
        "simplifiedRawPath": {
          "points": [
            {
              "latE7": 506459770,
              "lngE7": 30751483,
              "accuracyMeters": 6,
              "timestamp": "2024-03-21T17:45:19.542Z"
            }
          ]
        }
      }
    }

Here, the gremlins inside Google's servers have managed to deduce from the periodic location data sent by my phone that I was at the office for almost a full day (I was sleeping in the office bedroom that week), and that I then walked somewhere. Indeed, I was on my way to the sixth edition of the Rust Lille meetup!

Adding Google Fit to the mix

I also happen to use Google Fit on my phone to track my physical activity. Fit uses the phone's internal accelerometer and step counter to more accurately track the details of my walks. It uses the GPS like Maps does, but more processing is done locally, and the sensors help make more sense of the recorded data.

I downloaded the Fit data from Takeout; this time it's in a different format:

    $ ls | head
    2018-10-28T14_30_31.734+01_00_PT22M11.176S_Marche_.tcx
    2018-10-31T11_06_19.466+01_00_PT13M22.611S_Marche_.tcx
    2018-11-01T19_24_54.616+01_00_PT20M9.73S_Marche a_.tcx
    2018-11-03T09_59_04.525+01_00_PT23M3.316S_Marche a.tcx
    2018-11-03T11_05_37.903+01_00_PT23M23.153S_Marche_.tcx
    ...

TCX is an open format for fitness data.
It's a bit more concise than the Location History data:

[TCX listing: the XML markup was lost in extraction. Surviving values, in order: 2024-03-21T17:44:50.622Z; 0.0; 16.5991829017131; 50.645694732666016, 3.0756449699401855; 71.0; 1157.9140959867575; 50.65315246582031, 3.0800328254699707; 85.20000457763672; 1157.9140959867575; 1157.9140959867575; 675.65; 40.936883866013424; Active; Manual; Google Fit; 0, 0, 0, 0; en; 000-00000-00]

Already there are small differences between Maps and Fit. Maps recorded a single walk activity, but Fit split it into two. The previous Fit activity isn't shown above, but I can tell you Fit thinks I walked 1193 meters while Maps says 1175 meters. We'll compare both of their datasets below.

Processing it with Python

I wrote a simple Python script that parses the JSON, aggregates the data in multiple fun ways, and outputs it to JSON files I can use as charts on this blog (try viewing the page source!).

Measuring my walking speed

The most obvious measure for a dataset is often the average. As a disclaimer, I filtered out walks that indicated a walking speed of more than 15 km/h (I'm not that good of an athlete) or less than 1 km/h (I'm not that slow of a walker). Binned and histogrammed, the data looks like this:

[Chart: histogram of walking speeds]

The average measured walking speed is 4.01 km/h (median 4.03 km/h), which is close enough to the 5 km/h figure that is often used as a rule of thumb for the walking speed of an adult human. High-passing above 2 or 3 km/h (which are realistic speeds for walking) would bring the average closer to 5 km/h.

The data isn't perfect by any means; we can see a long tail of fast walks on the right that we can attribute to me running to catch my bus, and a few slow walks on the left that are probably me walking around the house.
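The filtering and averaging described above can be sketched in a few lines. This is an illustration under assumptions, not the script used for the chart: each walk is assumed to be a `(distance_in_meters, duration_in_seconds)` pair, and the function names are made up. The 1 km/h and 15 km/h bounds are the ones quoted in the text.

```python
from statistics import mean, median

# Plausibility bounds from the text: drop anything outside 1 to 15 km/h
MIN_KMH = 1.0
MAX_KMH = 15.0


def speed_kmh(distance_m: float, duration_s: float) -> float:
    """Average speed of one walk in km/h."""
    return (distance_m / 1000) / (duration_s / 3600)


def walking_speed_stats(walks):
    """Filter implausible walks, then return count, mean and median speed."""
    speeds = [speed_kmh(d, t) for d, t in walks if t > 0]
    kept = [v for v in speeds if MIN_KMH <= v <= MAX_KMH]
    if not kept:
        return {"count": 0, "mean": None, "median": None}
    return {"count": len(kept), "mean": mean(kept), "median": median(kept)}


# Made-up sample walks: (meters, seconds)
sample = [
    (1000, 900),   # 4 km/h
    (1200, 720),   # 6 km/h
    (500, 360),    # 5 km/h
    (5000, 600),   # 30 km/h, dropped: too fast to be a walk
    (10, 600),     # 0.06 km/h, dropped: too slow
]
print(walking_speed_stats(sample))
```

Raising `MIN_KMH` to 2 or 3 is the "high-pass" mentioned above; it removes slow outliers and pushes the mean up.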
On top of that, the data only covers the times when I had my phone on me, which is most of the time but not always. Plus, it's only what the algorithms have deduced from periodic position data!

Measuring my walking distance over time

Grouping walks by month and summing the distance gives us this lovely chart that I've labelled with what I was doing at the time:

[Chart: walking distance over time, Maps vs. Fit]

I started using Fit in November of 2018, so there is no Fit data before that. Here, I've grouped batches of 12 months by the academic year they belong to, instead of the calendar year. The rationale is that for those years, my life was mostly structured by the academic year, from September to August.

I walked a lot in high school, a bit less in prep school, and hit an all-time low in my second year because of the first COVID lockdown, which made us attend the last 3-4 months of class from home (admittedly, I was also skipping a lot of classes at the end of the year).

Interestingly, Maps and Fit agree for the Prep. Y1 part, but start disagreeing more and more as time passes. I've manually compared a few bits of data from both (comparing everything is... a bit hard) and it seems like most of the divergence can be explained by the fact that Maps is unable to measure what I'd call "internal walks", e.g. walking in my apartment, because the GPS location doesn't vary much. Fit, on the other hand, can measure those because it uses the phone's accelerometer and step counter in addition to the GPS. This hypothesis is supported by the fact that in first year I lived in a very small apartment, and then in second year there's the lockdown, and I start walking a lot in a bigger apartment (but one that's still not big enough for Maps to notice my movements).

Enter third year and it goes back up a bit, well, until the next lockdown at the end of October, and this time the whole year was done remotely.
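The month and academic-year groupings behind this chart can be sketched as follows. This assumes each walk is a `(start_datetime, distance_in_meters)` pair; the function names and the `"2019-2020"` label format are illustrative choices, and the only rule taken from the text is that an academic year runs from September to August.

```python
from collections import defaultdict
from datetime import datetime


def month_key(ts: datetime) -> str:
    """Calendar month label, e.g. '2019-09'."""
    return ts.strftime("%Y-%m")


def academic_year(ts: datetime) -> str:
    """Academic year label, September to August, e.g. '2019-2020'."""
    start = ts.year if ts.month >= 9 else ts.year - 1
    return f"{start}-{start + 1}"


def distance_by(walks, key):
    """Sum walked meters per group, where `key` maps a datetime to a label."""
    totals = defaultdict(float)
    for ts, meters in walks:
        totals[key(ts)] += meters
    return dict(totals)


# Made-up sample walks around an academic-year boundary
walks = [
    (datetime(2019, 8, 31, 18, 0), 1000.0),  # still academic year 2018-2019
    (datetime(2019, 9, 1, 8, 0), 500.0),     # first day of 2019-2020
    (datetime(2020, 3, 15, 12, 0), 250.0),   # same academic year, next calendar year
]
print(distance_by(walks, month_key))
print(distance_by(walks, academic_year))
```

Passing the grouping function as a parameter lets the same summing code serve both the monthly and the yearly views.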
Maps and Fit again disagree quite a lot for that third year, and I remember that I was often walking in circles in my apartment during that year to help me think.

Fourth and fifth year were pretty normal; around halfway through fifth year I started interning, and then this year, working (remotely, so again, disagreement between Maps and Fit).

We can get a bit more granular and look at the monthly distance walked. Here I've labelled specific periods of time that explain some of the peaks and valleys:

[Chart: monthly distance walked]

Tags: Programming, Data Science

This post is licensed under CC BY 4.0 by the author.