[HN Gopher] Google Analytics alternative that protects your data...
___________________________________________________________________
Google Analytics alternative that protects your data and your
customers' privacy
Author : doener
Score : 165 points
Date : 2023-05-07 08:57 UTC (14 hours ago)
(HTM) web link (matomo.org)
(TXT) w3m dump (matomo.org)
| jacooper wrote:
| Beware, Matomo by default isn't very privacy friendly, you will
| need the GDPR banner for any advanced features.
|
| If you want a GDPR compliant analytics you have to disable many
| of its flagship features, or use something else like Plausible,
| designed to work with no consent.
| viraptor wrote:
| You need a GDPR banner for sharing information with third
| parties. Why would you need one for self-hosted Matomo?
| jeroenhd wrote:
| You'll need consent for any data collection not essential to
| your site's functionality, even if you host stuff yourself.
|
| If all you do is collect how often your pages are being
| visited then you're not collecting any PII and you don't need
| a banner, but if you're tracking visitors based on unique
| identifiers (cookies, IP addresses, etc.) you'll need to get
| consent first.
| jacooper wrote:
| It's not only that, it's also about the data collected.
|
| Real time data like visitor data and heatmaps aren't allowed,
| and IP tracking isn't allowed either.
|
| Because matomo can be very powerful, more powerful than
| Ganalytics.
|
| You can let it assign a unique id to every visitor if it
| visited a subdomain and logged in, so you can know exactly who
| each visitor is on all of your sites.
|
| https://matomo.org/faq/how-to/how-do-i-configure-matomo-with...
| BaudouinVH wrote:
| https://umami.is/ does the same, free tier available.
| marban wrote:
| C'mon there's like a thousand threads on these already.
| throwaway2056 wrote:
| In all these threads, there is never a project manager from a
| large establishment telling
|
| - Thanks. we will migrate - we did and it was <good/bad>
|
| More or less everyone is after $.
| gerenuk wrote:
| Usermaven.com does the same and covers product insights as well.
| Free tier is pretty generous (1M events per month).
| prithsr wrote:
| Thanks for sharing this! Just got it set up on one of my
| domains and very pleased.
| nologic01 wrote:
| Ideally the different self-hosted web stacks would have built-in
| analytics that would not have to hit the client with javascript.
| But they don't, or if they do each has its own inconsistent
| approach as to what data is collected and how it is presented. So
| the second best if you care about your users' privacy (and, if
| applicable, your own commercial or institutional privacy) is
| something like matomo.
| earth2mars wrote:
| Their usage of word "On-Premise" instead of "on prem" or "on
| premises"!!!
| ezekg wrote:
| The terms on-premises and on-premise and on-prem are synonymous
| within enterprise lingo.
| TechBro8615 wrote:
| The one that grinds my gears is "bottoms up" instead of
| "bottom up."
| colesantiago wrote:
| Let's all just stop tracking all together.
|
| We don't need tracking at all and bloating up and slowing down
| websites.
| jeroenhd wrote:
| I wonder if someone's made an AdNauseam for tracking libraries
| yet. Going on the defensive clearly doesn't work, some more
| offensive action is required.
|
| Send a whole bunch of plausible events, pretending to click
| every link, changing your identifiers and stuff like resolution
| every time, make it impossible to determine what data is real
| and what isn't. Bonus points for leaving websites alone if they
| don't load tracking scripts until you've consented.
|
| We can't stop trackers, but we can try to make them useless.
| Even if they filter out such tracking they shouldn't be able to
| figure out what data was real, making their tracking attempts
| worthless.
| jhpacker wrote:
| I believe AdNauseam uses EasyList, so if it doesn't include
| the EasyPrivacy part of that (which contains the trackers) by
| default it seems like it would be easy to add.
|
| That said, I don't think this is an effective strategy at
| all. Safari has placed a big giant hole in tracking (like 20%
| of users) and lots of sites are still proceeding like nothing
| has changed. Google referrer spam was run at mega-scale
| dumping billions (at least) of spam hits into millions of
| profiles and didn't affect tracking efforts.
|
| A plugin run by .0001% of users or whatever that adds in a
| bunch of slop to the numbers just makes more analysts pull
| out their hair rather than leading to change.
| nicbou wrote:
| This would be nice. I don't track users on my personal blog. I
| don't give a flying duck about what people do there.
|
| However I make money from running a website which is really
| useful to a lot of people. I absolutely need to know what works
| and what doesn't. I can't write and edit in the dark, possibly
| missing by a mile what my readers really need. It would be like
| flying a plane without instruments.
|
| For instance, I see that a lot of people use the search to find
| a single guide, which should definitely be linked on the home
| page. Without basic tracking, I wouldn't even know which pages
| are important to my users.
|
| There are many gaping holes in your website that you could be
| completely blind to without a basic sense of what your users do
| on your website.
|
| I also caught many illegal copies of my website through
| referrer tracking. Three of them were phishing websites, and I
| got them shut down.
|
| So there are many legitimate reasons to have basic traffic
| counters, and you can have those while respecting your users'
| privacy and following the spirit and the letter of GDPR.
| ZacnyLos wrote:
| More alternatives: https://european-alternatives.eu/alternative-to/google-analy...
| iLoveOncall wrote:
| Half of the "alternatives" there are dead (website is offline)
| and the rest is not free. Those are hardly alternatives, more
| like band-aid solutions.
| paulcole wrote:
| A paid alternative to a free service is still an alternative.
|
| They may not be alternatives you like, but they are
| alternatives. And for many people a paid option may be better
| once they start looking at why the free thing is free.
| hobo_mark wrote:
| I'm going through the list at random (I'm on the market for
| such a service) and none of them appear offline so far.
| iLoveOncall wrote:
| Yeah you're right actually. From my phone most of them
| appeared offline but from my computer, on the same network,
| they're all fine.
| graftak wrote:
| This website is off to a bad start when the first item says
|
| > Because it does not use cookies there is no need to show
| cookie banner for this service.
|
| which is a blatant lie/misinformation. The 'cookie law' has
| nothing to do with the actual use of cookies.
| jhpacker wrote:
| To defend this site, that is the claim of the vendor and I
| wouldn't expect a site that focuses on listing EU
| alternatives to be critically evaluating a claim like that
| which hasn't been explicitly nay-sayed by any regulatory
| agency. Plausible uses a visitor id based upon a hashed +
| salted user agent plus IP address where the salt is rotated
| daily. The choice of whether consent is required for that is
| for the individual implementing site to make up their mind
| upon, but I don't think the vendor claim is unreasonable.
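|
| The scheme described above (hash of a daily-rotated salt plus IP
| plus user agent) can be sketched roughly as follows. This is only
| an illustration of the idea, not Plausible's actual code: the
| function names, the in-memory salt store, SHA-256 and the 16
| character truncation are all assumptions made for the example.

```python
import hashlib
import secrets
from datetime import date

# Hypothetical in-memory salt store: at most one random salt, for the current day.
_daily_salts = {}


def salt_for(day):
    """Return the salt for `day`, creating it on first use and dropping older ones."""
    if day not in _daily_salts:
        _daily_salts.clear()  # forgetting old salts makes older IDs unlinkable
        _daily_salts[day] = secrets.token_bytes(16)
    return _daily_salts[day]


def visitor_id(ip, user_agent, day):
    """Hash salt + IP + user agent; the raw IP and user agent are not stored."""
    digest = hashlib.sha256()
    digest.update(salt_for(day))
    digest.update(ip.encode())
    digest.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    digest.update(user_agent.encode())
    return digest.hexdigest()[:16]


today = date(2023, 5, 7)
tomorrow = date(2023, 5, 8)
id_today = visitor_id("203.0.113.7", "Mozilla/5.0", today)
id_again = visitor_id("203.0.113.7", "Mozilla/5.0", today)
id_tomorrow = visitor_id("203.0.113.7", "Mozilla/5.0", tomorrow)
```

| Within one day the same visitor maps to the same ID, so visits can
| be counted; once the salt is rotated and discarded, the previous
| day's IDs can no longer be recomputed or linked.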
|
| A similar (but better, IMHO) site that focuses just on
| analytics is: https://newmetrics.io/
| algustionesa wrote:
| I have reviewed many Google Analytics replacements in terms of
| features and capabilities. Matomo may be suitable for you, but
| its data presentation is not user-friendly. If you only need
| basic metrics to track, there are many alternatives that present
| analytics data more clearly. For more information, see
| https://algustionesa.com/google-analytics-alternatives/.
| herunan wrote:
| Why is this at the top of HN?
| viraptor wrote:
| I'm guessing because of this:
|
| https://blog.google/products/marketingplatform/analytics/pre...
|
| > All standard Universal Analytics properties will stop
| processing new hits on July 1, 2023.
| EGreg wrote:
| So what is the main difference between Universal Analytics
| and Google Analytics 4?
|
| We currently use Google Analytics to understand how users
| move through our app. We also used Matomo (previously Piwik)
| 5 years ago.
|
| Now Google Analytics on iOS will stop working for users
| unless they update our app? It doesn't seem to say anything:
| https://developers.google.com/analytics/devguides/collection...
| devjab wrote:
| Analytics are widely used in communication departments in
| European enterprise, and where that previously was very often
| Google Analytics, it's hard to use it because of Google's
| inability/unwillingness to change their enterprise targeting
| business model to be GDPR compliant. I'm not personally
| convinced you really need an analytics tool in most European
| communications departments. As long as saying something like
| that is akin to heresy, however, I think it's safe to say that
| a lot of people are interested in alternatives to Google
| Analytics.
|
| It's likely not just in Europe anymore. Privacy seems to be a
| trend that is on the increase everywhere. But as I understand
| it, things move to the top of HN if they are interesting to a
| lot of people, and privacy is interesting to a lot of people
| these years. Not just to the "nerds" either, at least I tend to
| see more and more discussion on it outside of tech circles. In
| the EU specifically you do have the very real "motivation" of
| dropping Google Analytics because using it puts you in the
| lovely area of breaking the law.
| haunter wrote:
| Google bad
| zichy wrote:
| It is really easy to protect everyone's privacy by not using
| advanced analytics platforms at all.
| jmduke wrote:
| A while back I built out a quick guide comparing all of these
| alternatives, because the core value prop was pretty similar and
| it was annoying to compare between pricing plans. (My personal
| vote goes to Fathom.)
|
| https://buttondown.email/comparison-guides/google-analytics-...
| berkle4455 wrote:
| Stay far away from fathom. Bro culture bullshit at the worst.
| Don't believe a word they say.
| skilled wrote:
| Fathom is run by some goofy marketer who has openly slandered
| (on HN) other analytics products in this space. Sadly, can't
| support anyone who does that. They're not open-source either.
| 6ak74rfy wrote:
| Last I checked, Fathom's open source product hadn't been
| updated for a couple of years. So, I switched to Plausible
| which is more reasonably updated.
| graeme wrote:
| How does Plausible compare to Google's Universal Analytics?
| And are there any SEO effects?
|
| GA4 migration seems not aimed at general users, so I'm
| looking at alternatives. Ideally could import my data.
| gcanyon wrote:
| "GA's interface is complex and confusing, especially for basic
| use cases."
|
| As I said in another comment, it's been eight years since I
| used that accursed interface, and I'll be ready to try it again
| once the flashbacks go away.
| jhpacker wrote:
| Matomo is decent, but my main issue with it is the performance
| when run at any sort of scale. It's PHP/MySQL, which is nice for
| ease of self-hosting, but it means a lot of things need to be
| pre-calculated. Most of the newer and more performant GA
| alternatives out there are using things like ClickHouse.
|
| ClickHouse: Piwik PRO, Plausible, PostHog, Yandex, Cloudflare
|
| Snowflake: Amplitude, Piano, Snowplow
|
| SingleStore: Fathom
|
| I've written a book on the subject including evaluating the 15
| most widely used options: https://gaalternatives.guide
| KronisLV wrote:
| > Matomo is decent, but my main issue with it is the
| performance when run at any sort of scale. It's PHP/MySQL,
| which is nice for ease of self-hosting, but it means a lot of
| things need to be pre-calculated.
|
| I've never actually run into performance issues, neither when
| using it in production professionally, nor for my self-hosted
| sites (with Matomo always running on-prem). I'd say the
| performance of PHP and MySQL/MariaDB is most likely decent as
| long as you don't go too far into specialized workloads, for
| example log aggregation/tracing; though even some APM solutions
| like Apache Skywalking also support using traditional RDBMSes
| for this purpose as well:
| https://skywalking.apache.org/docs/main/v9.0.0/en/setup/back...
|
| That said, I can't help but to wonder at what actual scale
| (number of logged events/second, given certain hardware) you'd
| run into issues. Luckily, because adding basic analytics is
| usually quite easy, testing this for your own workloads
| shouldn't be out of the question - then you can let the data
| speak for itself.
| jhpacker wrote:
| The performance issues aren't with the measurement requests
| but with reporting.
|
| When I eval'd it for my book last fall there were big delays
| in reporting waiting for segments and then also issues with
| custom reports. I think they have changed the default
| behavior to get around some of the former, but with MySQL
| it's always going to be tough for larger queries.
|
| (if there's any performance issue on the measurement side it
| has more to do with the JavaScript payload because they
| include a lot in their standard JS bundle).
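|
| As a rough illustration of what "pre-calculated" reporting means
| here: raw hits are rolled up once into small per-day totals, and
| reports read the rollup instead of rescanning every hit. The data
| and function names below are invented for the example; this is
| not Matomo's archiving code.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw hit log: (ISO timestamp, page path).
raw_hits = [
    ("2023-05-06T09:15:00", "/pricing"),
    ("2023-05-06T18:40:00", "/pricing"),
    ("2023-05-07T08:57:00", "/"),
    ("2023-05-07T10:02:00", "/pricing"),
]


def roll_up(hits):
    """Collapse raw hits into (day, path) -> count, done once per archive run."""
    daily = Counter()
    for timestamp, path in hits:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        daily[(day, path)] += 1
    return daily


def report(daily, path):
    """Answer a per-page report from the rollup instead of the raw hits."""
    return {day: n for (day, p), n in sorted(daily.items()) if p == path}


daily_counts = roll_up(raw_hits)
pricing_report = report(daily_counts, "/pricing")
```

| The trade-off is that any question the rollup was not built for
| (a new segment, a custom report) means going back to the raw
| hits, which is where a row-oriented database tends to be slow.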
| [deleted]
| preinheimer wrote:
| I wish some of the privacy focused GA alternatives had SOC 2
| reports, or ISO 27001. We're working towards our first SOC 2,
| which makes it hard to incorporate anything without one into our
| product.
|
| On prem is a lot of work, and not something i want to approach
| lightly.
| npace12 wrote:
| why? Having gone through a few SOC-2s, I don't see any value in
| it other than it being a racket.
| ian0 wrote:
| Having gone through ISO 27001 and PCI DSS level 2 I kind of
| assumed all of these security focussed compliance standards
| are just that. Anyone have any exceptions?
| vlovich123 wrote:
| Yes it's a huge racket that likely does little to solve the
| problems it was enacted to prevent. But have you tried making
| deals with large SOC2 companies without your own
| certification?
| jhpacker wrote:
| Piwik Pro is SOC2 certified.
| TekMol wrote:
| I tried Matomo.
|
| Self hosting is easy. That's a plus.
|
| I also like the interface. Took a while to get used to it but
| after that, I liked it even better than Google Analytics.
|
| But one problem that seems insurmountable is that it tries to be
| clever. And while trying, it messes up your data.
|
| If you have pages with a parameter in the querystring that is
| called "q", Matomo does not count those as pageviews. It tries to
| be clever and only count those as "searches". Probably because
| many site searches use a parameter "q" for what the user is
| searching for.
|
| Even if a page is a search result page, it should be counted as a
| pageview.
|
| The problem gets even worse when you have users bookmarking pages
| with a "q" parameter. Then things get really messy when you try
| to understand which pages users use, where they come from etc.
|
| I have searched a lot, but have found no way to disable this
| "cleverness". And no way to retroactively fix the data.
| ethor wrote:
| Just disable website search or change the search parameter
| inside website settings to stop Matomo from interpreting the
| 'q' parameter as search.
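|
| A minimal sketch of the behaviour being discussed, assuming a
| tracker that treats a hit as a site search when the URL carries a
| configured search parameter. The function and its parameter are
| hypothetical and do not come from Matomo.

```python
from urllib.parse import parse_qs, urlsplit


def classify_hit(url, search_params=frozenset({"q"})):
    """Return "search" if the URL has a configured search parameter, else "pageview"."""
    query = parse_qs(urlsplit(url).query)
    if any(name in query for name in search_params):
        return "search"
    return "pageview"


default_result = classify_hit("https://example.com/items?q=red+shoes")
disabled_result = classify_hit("https://example.com/items?q=red+shoes", frozenset())
wordpress_result = classify_hit("https://example.com/?s=hello", frozenset({"s"}))
```

| With an empty parameter set (site search disabled) the same URL
| is counted as an ordinary pageview, and changing the set to "s"
| would match a WordPress-style search URL instead.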
| ehnto wrote:
| It is a bad default, but nice that it is configurable.
| SquareWheel wrote:
| That's an odd choice as WordPress, which is by far the most
| popular CMS, uses ?s= as a search query.
|
| I would expect those pages to be included in the data. They
| could offer some sort of segmentation if they think they
| can separate out searches, though.
| dfsl wrote:
| I have integrated my site with matomo.
|
| The matomo analytics are captured and stored on-premises on my
| server (nothing goes to the cloud).
|
| Performance is good with my configuration. You can see page
| performance for yourself by loading this page:
| https://freesoftware.life/how-to-install-kubuntu-23-04/
| riogordo2go wrote:
| I'm using the matomo self hosted version and like it overall. I
| love you can track all outbound clicks without having to
| specifically add Dom elements to outbound links to make this
| possible. Unfortunately matomo is blocked just like Google
| Analytics by every ad/tracking blocker. Doesn't matter if you
| host it yourself and only track global stats vs tracking users
| across the web like GA does. The only solution seems to be
| writing your own analytics.
| belorn wrote:
| At this point in history, tracking on the web is no longer a
| trusted activity where people can assume that the person behind
| the tracking is doing it for benevolent purposes. It's the same
| thing with email and spam, especially when attachments are
| involved.
|
| Writing your own analytics can give some additional benefits in
| that you are only collecting what you need while taking into
| consideration your users' needs. I expect however that in time
| browsers will block more and more by default, similar to how
| email clients and services have progressed in their arms race
| with spam.
| teekert wrote:
| Is it also blocked when you don't even enable cookies? You
| lose some accuracy, but clients can't prevent ending up in
| your logs and they have to share some info with the server.
| chpatrick wrote:
| You can usually rename the tracker to something that's not on
| the blocklist.
| riogordo2go wrote:
| That used to work but current block filters analyse js
| variables and url parameters and are much harder to
| circumvent.
| jeroenhd wrote:
| Why spend time undermining people's preferences?
| RHSeeger wrote:
| If I build a web site, and it is my preference to know what
| pages get clicks on what elements (presumably, so I can make
| my site better)... whose preference gets priority; mine or my
| users? It's not as black and white as your question makes it
| sound.
| gumby wrote:
| The users have the ultimate authority whether you like it
| or not: they don't have to read your whole page, they don't
| have to look at that image (or even load it), they don't
| even have to go to your site if their friends tell them not
| to.
|
| It's like going to pee when an ad appeared on TV back when
| TV was a thing. The broadcaster and advertiser had no
| control.
|
| I am sympathetic to your desire (I assume _your_ desire
| comes from a good place),* but at the end of the day I
| think we want to live in a world where the people are the
| important part.
|
| * in my experience the best sales people really do believe
| the prospective customer _does_ want what they are selling,
| be it pantyhose, homeopathic drugs, or specially formulated
| window washing fluid.
| gregmac wrote:
| It kind of _is_ black and white, from technology point of
| view.
|
| You, the website owner, can control what your server does
| in response to HTTP requests a client makes. You control
| what data is sent, and under what conditions you'll send
| that data (ie: presence of a valid session cookie, correct
| username/password, cryptographically signed request, etc).
|
| I, the user owning a computer, get to control what my
| computer does. I run a web browser, and can choose what
| happens in response to data your site sends me via HTTP.
|
| Most notably, your site can send some javascript, but my
| computer doesn't _have_ to run it. My computer can also
| selectively block what it does, including limiting its
| access to initiate web requests to other sites.
|
| Anything beyond this is artificial, such as laws like DMCA
| or CFAA.
| RHSeeger wrote:
| Your response seems to completely miss the point of the
| thread you're replying to. The discussion in question
| was, effectively
|
| >>> You can write your own code to gather statistics
|
| >> You should respect your user's desires and not gather
| statistics
|
| > The users aren't the only ones with desires
|
| Sure, whether or not you "can" do it is black and white
| (and a game of whack-a-mole many times), but whether or
| not you "should" do it is very much a gray area.
| EGreg wrote:
| I don't get it, how can they stop you from recording this
| on your own server?
|
| Are you talking about CNAME cloaking? Pretty sure Apple
| only cares if one specific server gets all the CNAMEs. It
| doesn't block CNAMEs in general.
| RHSeeger wrote:
| I thought that was the whole point of what was being
| said; that things like metrics (what on the page gets
| clicked on) are getting blocked. Bear in mind, I'm not
| just talking about what pages get loaded. There's more to
| "clicked on the page" than just page loading.
| jhpacker wrote:
| ITP now also degrades first party server-set cookies to 7
| days where the first part of the IPs don't match. So if
| you're using CNAMEs for your measurement and you have
| a.a.x.x and b.b.x.x it will downgrade.
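|
| The rule as described (compare the first half of the two IPv4
| addresses, cap the cookie at 7 days on a mismatch) could be
| sketched like this. It is only an illustration of the comment
| above for IPv4; the function names are made up, and the actual
| WebKit logic may differ in its details.

```python
import ipaddress

SEVEN_DAYS = 7 * 24 * 60 * 60  # cap in seconds


def same_first_half(ip_a, ip_b):
    """True if two IPv4 addresses share their first two octets (the same /16)."""
    network = ipaddress.ip_network(f"{ip_a}/16", strict=False)
    return ipaddress.ip_address(ip_b) in network


def effective_max_age(requested, page_ip, setter_ip):
    """Keep the requested cookie lifetime on a match, otherwise cap it at 7 days."""
    if same_first_half(page_ip, setter_ip):
        return requested
    return min(requested, SEVEN_DAYS)


one_year = 365 * 24 * 60 * 60
matched = effective_max_age(one_year, "203.0.113.7", "203.0.9.9")
capped = effective_max_age(one_year, "203.0.113.7", "198.51.100.2")
```

| Here a one-year cookie survives when both servers sit in
| 203.0.x.x, and is cut to 7 days when the measurement endpoint
| resolves to 198.51.x.x.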
| EGreg wrote:
| Link?
| jhpacker wrote:
| https://github.com/WebKit/WebKit/pull/5347
| riogordo2go wrote:
| Because I think most people who use something like ublock
| don't want to see ads or have their privacy violated by being
| followed around the web using third party trackers.
|
| A site owner observing some general, anonymized stats like
| visitor and page count, which outbound links are clicked, os,
| screen size, time on page and what have you is quite
| different. I understand a blocker must go all the way and
| cannot distinguish between these cases. Hence my effort to
| find an alternative.
| nolok wrote:
| Most people who are against trackers are not against the
| website they visit getting valuable information about which
| page they use or not, or the order in which they use each
| page to figure out which path work or not, etc ...
|
| They are against the website choosing to not pay for it and
| instead getting it for free in exchange for giving all that
| data to a 3rd party (like GA / Google), who then uses it for
| its own purpose.
|
| Doesn't mean no people are against that first scenario too,
| but then they better not make an account, visit several pages
| in a row on the same website or want to use a cart, or
| essentially anything beyond a static website.
|
| Both scenarios are widely different, and convincing people on
| both sides (even both extremes) of that line that the line
| doesn't exist is one of the greatest and most successful
| tricks tracking companies have played.
| gumby wrote:
| An ad/tracking blocker _could_ discriminate between
| privacy-protecting trackers and spyware, but it would not
| be worth the time in practice.
|
| Such a distinction would need an option and have to be on
| by default. Most people use the "out of the box" config, so
| only a few people (like me) would enable honest tracking.
|
| The blockers would have to keep up with this option to make
| sure the thing they allow hadn't switched to evil mode.
|
| And so on. Basically another case where bad actors like
| google poisoned the well.
| soared wrote:
| > tracking users across the web like GA does
|
| What does this mean?
| riogordo2go wrote:
| At least in prior Google Analytics versions, a third party
| cookie was used, giving the possibility to link you to every
| site that implements Google Analytics. But Google explicitly
| states not to do this, so you are correct in calling me out
| here.
| jhpacker wrote:
| GA4 still uses the doubleclick cookie. It also encourages
| the use of Google Signals and runs measurement requests off
| of the main google.com domain to help it track users based
| upon their Google login.
| nolok wrote:
| If site A and site B both use GA, then GA tracks them across
| both internally for their stats (and it helps google in
| figuring out the same user has interest A and interest B).
|
| Matomo promises to not do the same link across properties on
| their cloud hosted version.
| stoicjumbotron wrote:
| Thoughts on Microsoft Clarity? https://clarity.microsoft.com
| victor106 wrote:
| I think we learnt enough when big tech offers something for
| "free" and when they call it "absolutely" free it just means
| you are absolutely the product.
|
| So thanks but No thanks
| hodgesrm wrote:
| Clarity is awesome--the metrics and the way it combines a
| visualization of our site with user session data is amazing. It
| shows you the actual locations on the page that users visit as
| well as the path they follow to get there. The insights are far
| more actionable than Google Analytics from my experience. (We
| use both.)
|
| p.s., Under the covers Clarity runs on ClickHouse.
| gcanyon wrote:
| I just want analytics that don't require a Ph.D. in obscure user
| interfaces to get anything out of them. TBF, I haven't used GA in
| 8 years, maybe it's gotten better -- but I still have flashbacks.
| pastage wrote:
| Matomo is not trivial to run on prem; there is a lot of stuff that
| does not work on larger installs unless you do lots of manual
| optimization, and what those optimizations are is not obvious. The
| problems only show after some time when you have to redo reports
| for multiyear periods, or handle a hug of death.
|
| That said people love analytics, it is a powerful tool.
| viraptor wrote:
| Any links? I'm assuming you mean something more than the
| periodic rollups?
| RobotToaster wrote:
| Worth noting that it seems to be only "open core", there's a
| bunch of paywalled features that I presume aren't open source.
| https://matomo.org/pricing/
| johndhi wrote:
| I'm not an engineer. Can someone please specifically explain to
| me how this protects data and privacy more than Google?
|
| Does it use cookies and browser storage?
| jenadine wrote:
| One thing is that it doesn't send any data about your users to
| Google.
| hrpnk wrote:
| Rudderstack claims to be a GA alternative [1] and accepts server-
| side data allowing this to be a 1st party integration skipping
| the consent complexity. Any thoughts on this one? It also made it
| to the Thoughtworks Tech Radar [2].
|
| [1] https://www.rudderstack.com/replace-google-analytics-4-guide...
| [2] https://www.thoughtworks.com/en-us/radar/platforms/ruddersta...
| encoderer wrote:
| We (cronitor.io) have a really great all-in-one solution to
| analytics and website monitoring with a generous free tier.
|
| https://cronitor.io/real-user-monitoring
| jpalomaki wrote:
| You can also run Matomo without tracking Javascript and instead
| feed in log files [1]. This works with the Cloudfront log files
| (and many others).
|
| [1] https://matomo.org/faq/general/requirements-for-log-
| analytic...
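|
| To give a feel for log-based analytics in general, here is a
| small sketch that counts successful page requests from access
| log lines. It assumes the common Apache/nginx log layout (not
| the CloudFront format) and is not Matomo's importer; the regex
| and sample lines are made up for the example.

```python
import re
from collections import Counter

# Matches the start of a Common Log Format line: client, timestamp, request, status.
LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

sample_lines = [
    '203.0.113.7 - - [07/May/2023:08:57:01 +0000] "GET /guide HTTP/1.1" 200 5120',
    '203.0.113.9 - - [07/May/2023:08:58:12 +0000] "GET /guide HTTP/1.1" 200 5120',
    '203.0.113.7 - - [07/May/2023:08:59:30 +0000] "GET /missing HTTP/1.1" 404 312',
    'not a log line',
]


def count_pages(lines):
    """Count successful GET requests per path, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match and match["method"] == "GET" and match["status"].startswith("2"):
            counts[match["path"]] += 1
    return counts


page_counts = count_pages(sample_lines)
```

| Nothing runs in the visitor's browser in this approach, so ad
| blockers do not affect it, but bots and cached responses will
| skew the numbers unless they are filtered separately.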
| IanCal wrote:
| > Google Analytics alternative that protects your data and your
| customers' privacy
|
| It's not your data, this is data about me.
|
| > Your customers will love you because their valuable personal
| data is protected.
|
| I guarantee you your customers will not love you for tracking
| them.
| igor47 wrote:
| Your customers might love you for making a better website, and
| this is hard to do without feedback, of which analytics is one
| kind.
| nmstoker wrote:
| Haven't used it in six years but back in the days it was Piwik it
| was ideal: easy to set up locally, a good range of features and a
| friendly community (v. responsive to an upgrade issue we
| experienced but apart from that everything worked exactly as
| expected).
| acidburnNSA wrote:
| Back in my day it was awstats. Still works great. I have 18
| years of data.
|
| https://www.awstats.org/
| jhpacker wrote:
| Loved AWStats! Still can be useful -- but bots, client side
| caching, CDNs, and did I mention bots..? have made the data
| hard to rely on for much. A while ago I switched from AWStats
| to GoAccess (https://goaccess.io/) for this kind of thing. I
| prefer its interface, and it's way way faster to churn
| through big log files (C vs. Perl).
___________________________________________________________________
(page generated 2023-05-07 23:00 UTC)