[HN Gopher] Matomo: Open-source analytics platform
       ___________________________________________________________________
        
       Matomo: Open-source analytics platform
        
       Author : AbdHicham
       Score  : 166 points
       Date   : 2021-01-05 20:47 UTC (1 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | viraptor wrote:
       | I've tried to use it and... they could work on improving the
       | installation guide. I gave up after an hour or so of failing to
       | provision the database and create configuration in a way that
       | gets me to the initial setup screen. I was using their docker
       | setup for this.
       | 
       | It's kind of my job to take random apps, deploy them and manage
       | properly, so I'm not a clueless user here. I could press on and
       | figure it out with more time, and I understand they'd be happy
       | with people using the cloud offering / paid support instead. But
       | I also feel like a working docker-compose (or comparable) setup
       | is table stakes these days for an open-source service.
       | 
       | See Loki+Grafana for a good example:
       | https://grafana.com/docs/loki/latest/installation/docker/#in... -
       | it's not a production setup, but it's a valid "play around with
       | it in 2min" setup.
        
         | typhonius wrote:
         | I've found that using geerlingguy's Ansible roles for MySQL,
         | Nginx, and PHP, most random PHP applications can be deployed
         | with default configuration. I've had them in production with
         | Matomo for the past year or so and had no problems so far.
         | 
         | A lot of the challenges faced with a 'from scratch' install
         | will revolve around which PHP version and extensions to install
         | and how to get Nginx to talk to FPM. Neither of which are
         | trivial for someone wanting to test/evaluate without much prior
         | knowledge.
        
         | crstin wrote:
          | I did some research a while ago and found that
         | https://github.com/crazy-max/docker-matomo dockerizes it the
         | best.
        
         | cxcorp wrote:
          | It was pretty easy to get up and running with Docker Compose:
          | 
          |     version: '3'
          |     services:
          |       mysql:
          |         image: mysql:8
          |         environment:
          |           MYSQL_ROOT_PASSWORD: root
          |           MYSQL_DATABASE: matomo
          |       matomo:
          |         image: matomo:4
          |         ports:
          |           - 4000:80
         | 
         | This lets me in at localhost:4000 and I just enter "mysql" as
         | the DB host, "root" as the username & password and "matomo" as
         | the database name, and it's basically done.
         | 
         | Of course, I probably have to point it out or someone else
         | will, that it's a bad idea to be using the MySQL root user,
         | instead of creating a user with the rights that Matomo needs:
         | https://matomo.org/faq/how-to-install/faq_23484/
        
       | kooparse wrote:
       | It's pretty easy to make your own analytics. It's not famous, but
       | I open-sourced mine; it's called Bast, written mostly in Rust,
       | and it's easy to deploy it. https://github.com/kooparse/bast
        
         | marvinblum wrote:
         | I did the same for my Go library [0], but I don't think it's
         | "pretty easy" if you want to do it right. Especially filtering
         | out bots is a constant hassle, it needs to be tested,
         | maintained, just like any other software. So, paying something
         | like $4 is worth it, if you don't want to think about it as
         | much.
         | 
         | You have a very good looking UI there. I really love the
         | simplicity.
         | 
         | [0] https://github.com/pirsch-analytics/pirsch
        
       | hertzrat wrote:
       | I used matomo once for a basic Wordpress blog (with cloudron) but
       | for some reason it led to my site being flagged as distributing
       | malware and I vanished from search engines. Apparently there is a
       | Microsoft form you need to fill out to get unflagged where you
       | explain what data your site is collecting but I just took the
       | site offline because I was too busy to dig into it. Extremely
       | annoying since the entire idea is to collect as little personal
       | information as you can. It wasn't Matomo's fault; it's somebody's
       | dysfunctional web crawler bot auto-generating reports.
        
       | mox1 wrote:
       | Formerly known as Piwik, been around for a long time.
        
         | AbdHicham wrote:
         | Thank you :) I didn't know that, found it here as well
         | https://piwik.com/
        
       | junon wrote:
       | No screenshots, no examples in the readme, builds failing.
       | 
       | Nah.
        
         | tadzik_ wrote:
         | There's some of that on https://matomo.org/, not sure why OP
         | linked to github instead.
        
       | kerng wrote:
       | I switched to it about a year ago to get rid of Google Analytics.
       | Quite happy with the decision.
        
         | XCSme wrote:
         | Where are you hosting it and how much do you pay for it? Do you
         | have a lot of visitors?
        
           | input_sh wrote:
           | Any cheap Hetzner dedicated (~30EUR/month) can handle
           | tracking thousands of daily visits without breaking a sweat.
           | 
           | If we're talking about 10s of thousands, you're gonna need to
           | invest in some SSD (EUR50-70/month probably).
           | 
           | If we're talking about dozens of sites, some of which have
           | millions of yearly visitors (and a bunch of plugins and
           | reports that need to be generated), then you're gonna
           | encounter some issues and have to spend a considerable amount
           | of time optimizing every part of it, and hardware cost will
           | rise to a hundred or two per month.
        
             | y42 wrote:
             | To add another level: To track millions of requests per day
             | for a couple of sites, it will probably cost around 1k - 5k
             | per month, hosting on AWS. It's advised to use not the
              | cheapest hoster here, because any little outage directly
             | affects every site that implements your tracking
             | technology.
        
               | zerkten wrote:
               | > It's advised to use not the cheapest hoster here,
               | because any little outage directly affects every site
               | that implements your tracking technology.
               | 
               | I guess it depends on how essential your tracking is, and
               | how you've implemented it. It shouldn't be added in a way
               | that can take out your site unless there is some business
               | critical reason to track.
               | 
               | Then I'd ask, just how critical is the tracking? If
               | losing a few hours of data is going to throw off your
               | product development, do you have enough data to be making
               | decisions? My experience is that bugs and
               | misconfiguration of experiments is common in most orgs,
               | so even if the system is up to capture all data, product
               | managers check an experiment a week later to find they
               | have only 50% of the data.
        
               | celsoazevedo wrote:
               | While it makes sense for some projects to use AWS, Azure,
               | Google Cloud, etc, you could track the same number of
               | requests on Digital Ocean, Vultr or Linode reliably and
               | for less money.
        
         | FriedrichN wrote:
         | My anecdotal experience is that people liked it better than GA.
         | I also really like how easily you can extend it. In one case (a
         | webshop) we added the currently logged in user so we can track
         | what they look for specifically so we can improve the search
         | and categories.
        
       | samuell wrote:
       | Installed it on some websites recently. Quite pleased with the UI
       | and the functionality. Pretty spot on in terms of the right level
       | of information about users - providing useful info without
       | stalking people inappropriately. The new version (4.0) seems to
       | improve on some earlier stability issues.
        
         | XCSme wrote:
         | Is the performance better than Google Analytics?
        
           | samuell wrote:
            | It is definitely snappier, yes. There are some sub-second
            | loading times, which should not be surprising, but not the
            | sometimes multi-second lags I have seen in Google Analytics.
        
       | robertlagrant wrote:
       | Would recommend Countly. Not affiliated.
        
       | nagbava wrote:
       | About data protection and GDPR, a good thing with Matomo is that,
       | if configured properly, it can be used without requiring to
       | collect the user's consent (since Matomo doesn't use the data for
       | its own purpose). Of course there is less information collected
       | but at least you don't have to display a form as soon as a user
       | enters your website.
       | 
       | The French data protection authority issued a piece of code (JS)
       | which must be used to avoid collecting the user's consent. I
       | don't know about other data protection authorities in the EU but
       | it shouldn't be much different.
        
         | iamacyborg wrote:
         | > it can be used without requiring to collect the user's
         | consent (since Matomo doesn't use the data for its own purpose)
         | 
         | This is not how the GDPR works. If you are collecting personal
         | data, or if you are dropping analytics cookies on someone's
         | device, you need consent. No ifs or buts.
        
           | arp242 wrote:
            | This is not how the GDPR works. It lays out several legal
            | bases for the collection of personal information, of which
           | consent is one. There are others as well.
           | 
           | I'd have to re-read it to be sure about analytics cookies,
           | but I don't think it says a whole lot about that off-hand.
            | That is the ePrivacy directive.
        
           | nagbava wrote:
           | You should apply to the CNIL since you seem to know GDPR
           | better than they do. (https://www.cnil.fr/fr/cookies-
           | solutions-pour-les-outils-de-...)
           | 
            | I never said no personal data were collected but, _if
            | configured properly_, the processing of data falls within the
           | legitimate interest basis.
        
             | iamacyborg wrote:
             | I believe that page may be out of date, or they've updated
             | their github repo prematurely.
             | 
             | https://github.com/LINCnil/Guide-RGPD-du-
             | developpeur/commit/...
             | 
             | /edit Ignore me. I appear to have misunderstood the changes
             | when I last read this. Sorry
        
             | nagbava wrote:
             | It is true that an opt-out system must be installed on the
             | website (Matomo gives that piece of code) but - as noted on
              | the github link you posted - that is very different from an
             | opt-in system (which is the standard GDPR requirement).
        
               | lmkg wrote:
               | GDPR does not require consent. If you use consent, then
                | it must be freely-given, but you can often use a different
               | legal basis when processing personal data.
               | 
               | The ePrivacy Directive requires consent for reading or
               | writing from a terminal device. This includes anything
               | with cookies, even if they're not personal data. While
               | the ePD refers to GDPR for its definition of consent, it
               | is a separate piece of legislation and many things that
               | are true about GDPR are not true about ePD (such as being
               | able to invoke Legitimate Interest instead of consent).
        
               | nagbava wrote:
               | Sure, but when it comes to cookies, consent is almost
                | always required on the GDPR basis (other legal bases
                | rarely work).
               | 
               | You're right to point to e-privacy, to which consent is
               | central. But the latest draft of its new version states
                | that (art. 8): _1. The use of processing and storage
                | capabilities of terminal equipment and the collection of
                | information from end-users' terminal equipment, including
                | about its software and hardware, other than by the
                | end-user concerned shall be prohibited, except on the
                | following grounds: [...] (d) it is necessary for audience
                | measuring, provided that such measurement is carried out
                | by the provider of the information society service
                | requested by the end-user or by a third party, or by
                | third parties jointly, on behalf of the one or more
                | providers of the information society service provided that
                | conditions laid down in Article 28, or where applicable
                | Article 26, of Regulation (EU) 2016/679 are met_
               | 
               | So Matomo can still do without the user consent (from
               | what I understand, the relation between GDPR and
               | e-privacy is no easy business).
        
               | iamacyborg wrote:
               | > So Matomo can still do without the user consent (from
               | what I understand, the relation between GDPR and
               | e-privacy is no easy business).
               | 
               | It also depends on the jurisdiction. For example the ICO
               | has been clear that using a cookie based analytics tool
               | requires a GDPR level of consent, without exceptions.
        
               | lmkg wrote:
               | > when it comes to cookies, consent is almost always
               | required
               | 
               | We are in agreement. It seems I wasn't clear enough in my
               | original post, but this is my overall point. GDPR doesn't
               | require consent, but consent is required because of ePD.
               | 
               | > latest draft of its new version
               | 
               | > So Matomo can still do without the user consent
               | 
               | The new draft is not law yet. It's been 6 months away
               | from passing for several years now. In the meantime,
               | fines are still being issued under the existing law.
               | Google got fined a hundred million euro last month in
               | France, and that fine was _very specifically_ ePD and
                | _not_ GDPR for a variety of reasons.
        
       | bravura wrote:
       | 29 euros a month for the managed, on-cloud version. Does anyone
       | know inexpensive Google Analytics alternatives for small sites,
       | that are hosted for you?
        
         | marvinblum wrote:
         | https://pirsch.io/
         | 
         | $4/month if you pay annually or $6 to pay monthly, but free
         | during beta.
         | 
         | We are actively working on it right now, but the core is
         | working well and is open-source: https://github.com/pirsch-
         | analytics/pirsch
        
       | ddevault wrote:
       | Open source does not make it okay. Do not spy on people. It's
       | just that simple.
        
         | XCSme wrote:
         | So, you shouldn't be allowed to know the conversion rate on
         | your page?
        
           | ddevault wrote:
           | It's easy enough to run $nconversions / $nrequests without
           | spying on anyone.
        
             | XCSme wrote:
             | But $nrequests is not accurate as it can be 10x or 100x
             | more than the number of visitors, so your conversion rate
             | will be 10x or 100x off.
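
The gap XCSme describes can be shown with a small worked example. This is only a sketch under stated assumptions: the `Request` record, the `visitor_id` field, and the `/thanks` conversion path are hypothetical, and how to obtain a visitor identifier without tracking people is exactly what the thread is arguing about.

```python
from collections import namedtuple

# Hypothetical, simplified log record: one row per HTTP request.
# visitor_id is an assumption; plain server logs do not provide it.
Request = namedtuple("Request", ["visitor_id", "path"])


def conversion_rates(requests, conversion_path):
    """Return (per_request_rate, per_visitor_rate) for a list of requests."""
    converted = {r.visitor_id for r in requests if r.path == conversion_path}
    visitors = {r.visitor_id for r in requests}
    per_request = len(converted) / len(requests)
    per_visitor = len(converted) / len(visitors)
    return per_request, per_visitor


# Made-up traffic: two visitors, ten asset/page requests each,
# and one of them reaches the conversion page.
sample = []
for visitor in ("a", "b"):
    sample += [Request(visitor, f"/asset/{i}") for i in range(10)]
sample.append(Request("a", "/thanks"))

per_request, per_visitor = conversion_rates(sample, "/thanks")
print(f"per request: {per_request:.3f}, per visitor: {per_visitor:.3f}")
```

With this made-up data one visitor in two converts, yet dividing by the 21 raw requests gives a rate roughly ten times lower, which is the kind of distortion the comment is pointing at.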
        
         | hertzrat wrote:
         | What do you use to track whether your site is growing or
         | shrinking in popularity? Server logs? If so, I'm curious
         | whether not being able to filter out bot visits is a problem
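
As a rough illustration of the bot-filtering question, here is a minimal sketch that drops probable bots from access log lines by inspecting the User-Agent. It assumes the common "combined" log layout where the User-Agent is the last quoted field; the `BOT_MARKERS` list and the sample lines are made up, and real bot detection needs a maintained list and more signals than this.

```python
import re

# Assumed marker substrings; illustrative only, not a complete list.
BOT_MARKERS = ("bot", "crawler", "spider", "curl", "wget")

# Captures the final quoted field of a log line, assumed to be the User-Agent.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def is_probable_bot(log_line):
    """Guess whether a log line came from a bot, based on its User-Agent."""
    match = UA_PATTERN.search(log_line)
    if match is None:
        return True  # no User-Agent field at all: treat as suspicious
    user_agent = match.group(1).lower()
    if user_agent in ("", "-"):
        return True
    return any(marker in user_agent for marker in BOT_MARKERS)


def human_lines(log_lines):
    """Keep only the lines that do not look like bot traffic."""
    return [line for line in log_lines if not is_probable_bot(line)]


sample_log = [
    '203.0.113.5 - - [05/Jan/2021:20:47:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) Firefox/84.0"',
    '198.51.100.7 - - [05/Jan/2021:20:47:01 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.9 - - [05/Jan/2021:20:47:02 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" '
    '"curl/7.68.0"',
]

kept = human_lines(sample_log)
print(len(kept), "of", len(sample_log), "lines kept")
```

Bots that send a browser-like User-Agent pass straight through a filter like this, which is part of why the problem is described elsewhere in the thread as a constant hassle.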
        
       | XCSme wrote:
       | Is 1.7k open issues something to worry about or is it normal for
       | a project of this size?
       | 
       | PS: I have also been building something similar, but not
       | completely open-source: https://www.usertrack.net
        
         | zufallsheld wrote:
         | I also compare open issues to closed issues. If the numbers are
         | roughly the same (or the open issues bigger than closed) I'd
         | say that's a problem.
        
         | yread wrote:
         | I think it's more a consequence of not tidying up. There are
         | irrelevant issues open from 2008. On the other hand I prefer
          | 1.7k issues to a project where they aggressively auto-close
         | tickets
        
           | mgkimsal wrote:
           | if projects are given the option to 'auto-close' tickets
           | (which honestly, I don't mind - having loads of open stuff
           | can hamper finding more recent/useful info), wouldn't it be
            | helpful to have a filter to view 'auto-closed' tickets vs
           | 'closed' tickets?
        
         | Findus23 wrote:
         | While I agree that 1.7k issues are a lot, also keep in mind
         | that there are 9.6k closed issues and they are bugs and feature
         | requests over 11 years that never get closed as there might be
         | someone else coming across some feature request one day who
         | wants to implement it.
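
To put the two figures quoted in this subthread side by side, here is a tiny sketch of the open-versus-closed comparison. The function name is hypothetical and the counts are the rounded numbers from the comments (1.7k open, 9.6k closed), not live data.

```python
def open_issue_share(open_issues, closed_issues):
    """Fraction of all issues ever filed that are still open."""
    return open_issues / (open_issues + closed_issues)


# Rounded counts taken from the comments above.
share = open_issue_share(1700, 9600)
print(f"open share: {share:.1%}")
```

By that measure roughly 15% of all issues are open, well short of the "roughly the same" level that zufallsheld's rule of thumb treats as a warning sign.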
        
           | monkin wrote:
            | Fun fact that should be pointed out: most of those issues and
            | PRs come from core team members. There is almost no
           | community around it, but looking at these numbers you get the
           | impression that it is otherwise.
        
         | AbdHicham wrote:
         | It's normal for a project of that size, never had any major
         | issues with it, it's in active development so it's good to see
         | issues are being reported
        
         | AbdHicham wrote:
          | Which parts of userTrack are open source? You say it's not
          | completely open-source.
        
           | XCSme wrote:
           | I say it's not completely open-source mainly because it's not
           | free and people usually expect open-source to equal free.
           | 
           | Once you purchase it you get full access to the original
           | server-side code (PHP, MySQL).
           | 
           | For the client-side part you only get the bundled JS/HTML/CSS
           | (the original client-side source code is TypeScript, React),
           | mostly because otherwise I would have to provide all the
           | build tools and document better the code, tooling, building,
           | releasing, etc.
        
             | AbdHicham wrote:
             | I understand now, thank you for clarifying :)
        
             | viraptor wrote:
             | > people usually expect open-source to equal free.
             | 
              | Open source has a specific meaning: the software is
              | released under an open source license
              | (https://opensource.org/licenses). For example, if you pay
             | enough, you get ms windows source as well - that doesn't
             | make it "not completely open source". Your project doesn't
             | seem to be open source at all.
        
               | mgkimsal wrote:
               | I think the term 'shared source' was coined to describe
               | that particular business model (under certain conditions,
               | the code is shared, and perhaps modifications may be
               | allowed in some scenarios, but no redistribution).
        
               | XCSme wrote:
               | Sorry for the misunderstanding then, I might be using the
               | wrong terminology.
               | 
               | I have seen many other products that are marketed as
               | "open source" because you get the source code after you
               | purchase it, so it is literally "open source", but not
               | "open-source" as in released under an open-source
               | license.
               | 
                | I am personally not marketing userTrack as open-source and
               | I will stop using similar terms if other people do have a
               | strong opinion about what "open-source" actually means.
        
             | ffpip wrote:
             | Did you change the license? I think I stumbled upon your
              | website in an HN comment a few months ago. Wasn't it free?
             | 
             | Great product and an excellent demo!
        
               | XCSme wrote:
               | Thank you! userTrack was never free, but I did change the
               | pricing model from lifetime, to yearly to now being one-
               | time payment + yearly payments for updates.
               | 
               | I would love to make userTrack free if I can find a
               | sustainable way to work on it. Most other similar open-
               | source software offers a "hosted" version to get revenue,
               | but my goal is to promote decentralization and self-
               | hosting in general, so me focusing on the hosted version
               | would go against my goal and beliefs. I really want to
                | see a future where any non-technical person can choose a
               | few products and have them running on their own
               | VPS/server in a few clicks. This would have many
               | advantages for the clients AND for the developers:
               | 
                | * Clients pay a lot less for products
               | 
               | * Developers must focus more on product and performance,
               | leading to higher quality products.
               | 
               | * Hugely increased privacy for the average internet user
               | and for the own data of the client using the product
               | 
               | * Better performance (each client has their own server so
               | it is more likely to have more resources)
               | 
               | * Better latencies (each client can choose to use/host
               | their product on a local datacenter)
               | 
               | * Better data transparency, easier migrations and fewer
               | vendor lock-ins (if you own the server and the data on it
               | you can most likely always export it in some form)
               | 
               | I think there are many other advantages for both
               | companies and clients. The current SaaS environment makes
               | it really easy for companies to ask huge amounts of money
               | for services just because they want to, as the client has
               | no real alternative unless he is really technical and can
               | spend days installing and maintaining a self-hosted
               | software that rarely gets updated.
        
               | ffpip wrote:
               | > userTrack was never free...
               | 
               | Sorry if it seemed like I was complaining about the
               | pricing change. I was just wondering whether I remembered
               | it correctly from here
               | (https://news.ycombinator.com/item?id=24207129)
               | 
               | It's a great product. People will pay for it.
        
               | XCSme wrote:
               | No worries, I was just making the history of the pricing
               | structure clear.
               | 
               | Thank you for the kind words, I do love working on this
               | project and I hope to be able to continue working on it.
               | Existing customers absolutely love it and keep
               | recommending but I am still struggling with finding a
               | pricing structure that makes sense for everyone.
               | 
               | I do hope that one day I will find a way to make
               | userTrack free for everyone, but looking at Matomo,
               | making it open-source seems to drastically slow the
               | development of a project as there are so many people
                | involved and so many more decisions to be made. Apart
               | from that I would still have to earn a living somehow,
               | but if I get a job and keep userTrack open-source I won't
               | be able to spend too much energy on maintaining it and I
               | hate not being able to make a product as good as it can
               | be.
        
             | Macha wrote:
             | From your license agreements (this language appears in all
             | 3):
             | 
             | You are NOT allowed to:
             | 
             | Redistribute in any way any of the userTrack files or any
             | parts of the userTrack's source code (with the exception of
             | the public tracker JavaScript files that have to be
             | included on your site).
             | 
             | Install userTrack on someone else's server.
             | 
             | Continue using userTrack or offering userTrack access to
             | others after this license agreement has been voided (either
             | via a refund, license period expiration or legal action).
             | 
             | This is not open source (or even "fair code" as redis etc
             | advocate for). Providing the source but under a license
             | like this is usually referred to as visible source or
             | shared source
        
               | XCSme wrote:
               | You are correct, I did confuse the terms "visible source"
               | with "open source".
               | 
               | The way userTrack is currently distributed is as any
               | other digital product (you pay for it and you are not
               | allowed to sell or redistribute copies of it) with the
               | mention that the server-side code is un-compiled and un-
               | obfuscated so you can transparently see what it does, how
               | it does it and change it if you want.
               | 
               | I am not sure that fully open-sourcing it is the way to
               | go as I've seen so many projects die or disappear because
               | the maintainers didn't have a lot of incentives to keep
               | improving it or simply no longer had time to work on it.
               | I also think that it's fair to pay for something that
               | brings value to you also knowing that by paying for it
               | you support its further development.
        
       | RocketSyntax wrote:
       | maybe website analytics is more appropriate? the word analytics
       | is taking on a new meaning these days
        
       | solarkraft wrote:
       | I used it a bunch back when it was Piwik (why did they change the
       | name, it was great!) and have been quite satisfied.
       | 
       | Yet, though at least it isn't cloud based, it's still quite scary
       | what kinds of things it will tell you about your visitors.
        
         | brnt wrote:
         | I'll bite: what kinds of things did you learn about your
         | visitors?
        
       | cmg wrote:
       | Worth noting that they also have a WordPress plugin that handles
       | all of the installation and related setup:
       | https://wordpress.org/plugins/matomo/
       | 
       | I use it on a personal project site and it works very well.
        
       | [deleted]
        
       | pachico wrote:
       | I just wished it used Clickhouse as persistence layer.
        
       | [deleted]
        
       | villgax wrote:
       | A startup I worked with used Piwik (now Matomo) & defrauded
       | investors with fake visits by tampering with the DB. Any VC
       | should actively be involved & knowledgeable of analytics that is
       | presented to you in order to avoid being ripped off.
        
         | viraptor wrote:
         | I'm not sure if this is specific to Matomo. You could use
         | headless agents over web proxies around the world to inflate
         | Google Analytics as well. It just costs less to do it in your
         | own DB.
        
       | andrewzah wrote:
       | I used to use matomo but I found goatcounter to be simpler to set
       | up and maintain. I also like the UX more [0].
       | 
       | [0]: https://stats.arp242.net/
        
       | francoisp wrote:
       | a slightly different angle, also opensource: mautic
        
       | deepstack wrote:
       | Does anyone know a Node alternative to GA? Matomo is great, but
       | for something like GA, Node would be the best option for
       | handling large amounts of HTTP requests.
        
         | codyogden wrote:
         | I started using Umami (Node/Next) to replace GA on my major
         | site (70k/30days). It provides the data I care about seeing,
         | and nothing more. Public preview: https://analytics.kbg.rip
         | 
         | Project: https://umami.is
        
           | AbdHicham wrote:
           | That's also a really cool alternative.
        
           | fbnlsr wrote:
           | This looks really good. Kudos!
        
           | pabe wrote:
           | I'd also recommend this one. Relatively "basic" data but it's
           | GDPR compliant and easy to install and update. Big thanks to
           | the author!
        
         | XCSme wrote:
         | I use a PHP analytics platform on a shared VPS hosting and it
         | can track 1M+ monthly visits without any issues.
         | 
          | Why would node be able to handle so many more HTTP requests
         | than Apache or Nginx? I think the throughput is mostly dictated
         | by implementation.
        
         | looperhacks wrote:
         | Why would you need node to handle "large amounts" of http
         | requests?
        
           | capableweb wrote:
           | Of course you don't have to have nodejs to handle large
           | amounts of http requests, if you spend enough time you can
           | get any language/framework to handle the amount you need :)
           | 
           | But, seems that at least in the TechEmpower framework
           | benchmarks, es4x (JS) ends up on position 9 while the closest
           | PHP framework ends up at 13. Now it's just a small benchmark
           | with specific tests, but I do think it's easier to make
            | NodeJS handle large amounts of requests than PHP. Although
           | again, you can definitely do large amounts of requests with
           | PHP too. I've spent about 5 years on each, found that getting
           | good performance out of V8 is easier than out of PHP.
        
         | Volrath89 wrote:
         | I've been using Plausible, and don't miss a thing about GA so
         | far.
        
       | rilut wrote:
       | Has anyone evaluated Matomo vs Countly vs PostHog?
        
       | Galanwe wrote:
       | As a non-web developer, I always wondered:
       | 
       | Are these alternatives fully able to replace Google analytics?
       | 
       | I sort of thought Google analytics would tell you more about your
       | visitors since with Google cookies, they could map them to other
       | visited websites, centers of interest, age group, etc.
       | 
       | Are you losing all that when switching to a less intrusive
       | analytics platform such as this, or is Google analytics not
       | leveraging their ability to disclose more about the visitors?
        
         | dustinmoris wrote:
         | Due to the number of people blocking Google Analytics with
         | browser extensions, Pi-Holes and other tools, I find GA
         | increasingly lacking good analytics.
        
           | wartijn_ wrote:
           | I assume most tools will block Matomo as well. I know uBlock
           | Origin with the default blocklist does.
        
             | mattmcknight wrote:
             | This is why you self host it, to avoid a third party
             | cookie.
        
               | y42 wrote:
               | That still doesn't help if you consider GDPR and similar
               | rules, at least in the EU.
        
               | XCSme wrote:
               | I think that using a self-hosted platform where you don't
               | store any PII or cookies allows you to store visitor
               | statistics without explicit consent.
        
               | mattmcknight wrote:
               | I meant it helps with uBlock type rules against domains.
        
               | vntok wrote:
               | In some countries (eg France), there are exemptions for
               | tracking purposes if the tracking is done only to the
               | benefit of the site's editor.
        
         | jeroenhd wrote:
         | Google Analytics tells you more about your audience because it
         | stalks people across the web. Matomo can never provide that
         | without having a broad range of websites from which you collect
         | data and writing custom code to annotate visitors with your own
         | interest tags.
         | 
         | Matomo purely tracks analytics: who visited what page, for how
         | long, from what device, from what location, from what inbound
         | website, and what outbound links did they click. It also
         | provides a log of pages requested per session so you can
         | analyze people's flows through your website.
         | 
         | It's certainly not a replacement for Google Analytics if you
         | use it to collect background information on your visitors. Even
         | though Google's information is very broad (you mostly get
         | ranges and the interests aren't that reliable), some marketeers
         | use it to make decisions about their marketing strategies.
         | Matomo won't help you there; your alternative would probably be
         | Facebook or another big tech tracking solution.
         | 
         | It does provide a replacement for the type of tracking that I
         | personally find acceptable, assuming the IP addresses are
         | anonymized sufficiently. Matomo recommends shortening IP
         | addresses to /16 after analysis, which I consider good enough,
         | but that's a setting administrators can change.
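
A minimal sketch of what shortening an IPv4 address to /16 means in practice, using Python's standard `ipaddress` module. It only illustrates the masking step and is not Matomo's own code; `anonymize_ipv4` is a hypothetical helper name.

```python
import ipaddress


def anonymize_ipv4(ip: str, prefix_len: int = 16) -> str:
    """Zero every bit after the first prefix_len bits of an IPv4 address."""
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network.network_address)


print(anonymize_ipv4("203.0.113.42"))      # 203.0.0.0
print(anonymize_ipv4("203.0.113.42", 24))  # 203.0.113.0
```

With a /16 mask, only the first two octets survive, so all visitors in the same 65,536-address block become indistinguishable by IP.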
        
           | iamacyborg wrote:
           | > Google Analytics tells you more about your audience because
           | it stalks people across the web.
           | 
           | That data is mostly garbage and only getting worse.
        
           | XCSme wrote:
           | What information exactly does GA tell you from stalking
           | people across the web? I don't think Google sharing with you
           | accurate information about the people visiting other websites
           | would be completely legal (GDPR). Where exactly do you find
           | this information in the dashboard?
        
             | marvinblum wrote:
             | Even if they don't share it with you, they do it for
             | themselves.
        
               | XCSme wrote:
               | Yes, that was my point: they probably collect this data for
               | themselves, to improve their own services, but as a
               | webmaster you don't get all this data yourself.
        
             | iamacyborg wrote:
             | It's under Audience -> Demographics.
             | 
             | It's off by default. Turning it on gives you basic
             | demographic data but also means you consent to sharing your
             | GA data with Google to use for advertising purposes.
        
             | kbelder wrote:
             | Importantly, it's always bucketed. You can get a breakdown
             | for a group of visitors, but not the demographics for an
             | individual.
        
             | JW_00000 wrote:
             | Demographic data like: age (in buckets of 10 years),
             | gender, household income for some countries, whether you're
             | a parent [1] + Interests/"Affinities" [2]. I think these
             | are derived from your Google search history and the sites
             | you visit.
             | 
             | You do need to explicitly enable this in the GA dashboard,
             | and ask users' consent under the GDPR.
             | 
             | [1] https://support.google.com/google-ads/answer/2580383?hl=en
             | [2] https://support.google.com/google-ads/answer/2497941?hl=en
        
         | marvinblum wrote:
         | Hey there, we are working on Pirsch [0] (another GA
         | alternative).
         | 
         | Whether you can replace GA depends on your needs. GA collects
         | more personal data, so you get better insight into your
         | audience. This is important if you do online marketing and
         | like to see how well your campaigns perform. GA does track
         | visitors across days and you can therefore see if someone
         | came back after a week and made a purchase.
         | 
         | In case you don't do that or are simply not interested in
         | specifics, all the alternatives are good enough right now, I
         | think. You can still tell how visitors navigate your page, what
         | content they visit most and all that stuff. We are currently
         | thinking about what we can add to gain more insight for
         | businesses, without invading privacy as Google does.
         | 
         | [0] https://pirsch.io/
        
         | johnchristopher wrote:
         | The real kick is when you link both Google Ads and Google
         | Analytics:
         | https://blog.littledata.io/2019/02/25/why-link-google-ads-ad...
         | 
         | Something you can't do when leaving google land.
        
       | l1am0 wrote:
       | I have used Matomo for years now and it works quite reliably. (A
       | few automatic updates failed, but nothing serious.)
       | 
       | The only thing that bothered me is that most ad blockers block
       | Matomo as well. I built a little script to circumvent that;
       | you might find it handy as well:
       | https://gumroad.com/l/matomo_circumvent_adblock
       | 
       | I use it on my website. Check if your ad blocker is capable of
       | blocking it: https://simon-frey.com
        
         | l1am0 wrote:
         | FYI: The gumroad link is to a support license.
         | 
         | You purchase a support license to help me to continue working
         | on MCAB. MCAB itself is Open Source and can be found on Github:
         | https://github.com/simonfrey/matomo_circumvent_adblock
        
         | vntok wrote:
         | Unfortunately your script still calls a third party domain,
         | which is trivial to block using a generic AdBlock/uBlock rule.
         | Instead, you should host the matomo script (under a different
         | filename of course) on your own domain. That way it won't be as
         | easily blocked.
         | 
         | I go as far as to send all the tracking parameters through a
         | custom server script before they are proxied to GA and Matomo.
         | That way, I can change the script and parameter names at will,
         | making them much more difficult to block. For example, Matomo-
         | related blocking rules are as follows:
         | 
         | /matomo-tracking.
         | 
         | /matomo.js$domain=~github.com
         | 
         | /matomo.php
         | 
         | /matomo/*$domain=~github.com|~matomo.org|~wordpress.org
         | 
         | /piwik-$domain=~github.com|~matomo.org|~piwik.org|~piwik.pro|piwikpro.de
         | 
         | /piwik.$image,script,domain=~matomo.org|~piwik.org|~piwik.pro|piwikpro.de
         | 
         | /piwik.*/ping?
         | 
         | /piwik.js
         | 
         | /piwik.php
         | 
         | /piwik/*$domain=~github.com|~matomo.org|~piwik.org|~piwik.pro
         | 
         | /piwik1.
         | 
         | /piwik2.js
         | 
         | /piwik_
         | 
         | /piwikapi.js
         | 
         | /piwikC_
         | 
         | /piwikTracker.
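
A rough sketch of the parameter-renaming proxy idea described above, under stated assumptions: the short public names (`s`, `u`, `t`) and the endpoint URL are hypothetical placeholders, while `idsite`, `rec`, `url` and `action_name` are parameters of Matomo's Tracking HTTP API. It only builds the upstream URL; actually forwarding the request is left out.

```python
from urllib.parse import urlencode

# Hypothetical mapping from neutral public names to Matomo Tracking API names.
PARAM_MAP = {"s": "idsite", "u": "url", "t": "action_name"}

# Placeholder endpoint (assumption); point this at the real Matomo instance.
MATOMO_ENDPOINT = "https://matomo.example.org/matomo.php"


def build_upstream_url(public_params: dict) -> str:
    """Translate renamed parameters and build the URL the proxy would request."""
    # Unknown parameters are dropped rather than forwarded.
    upstream = {PARAM_MAP[k]: v for k, v in public_params.items() if k in PARAM_MAP}
    upstream["rec"] = "1"  # the Tracking API requires rec=1 to record a hit
    return f"{MATOMO_ENDPOINT}?{urlencode(upstream)}"


print(build_upstream_url({"s": "1", "u": "https://example.com/", "t": "Home"}))
```

Because the browser only ever sees the public names and a first-party path, none of the path-based rules listed above would match; changing `PARAM_MAP` renames everything in one place.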
        
           | l1am0 wrote:
           | If someone decides to explicitly block my Matomo tracking
           | server, I am fine with that.
           | 
           | I experimented with tracking on the same site and the
           | overhead is not worth it for me. A central solution for all
           | my projects works quite reliably.
        
             | vntok wrote:
             | Sure, if someone explicitly blocks you and you alone,
             | that's fine. The problem is you getting blocked
             | generically, because you're using the same scripts or
             | patterns as everyone else, such that there exists a very
             | wide and generic block rule in uBlock Origin or some other
             | filter that happens to apply to your own domain. That's
             | unacceptable and worth fighting against.
        
               | l1am0 wrote:
               | But that is exactly what the script is doing. It changes
               | the structure to avoid being blocked generically.
        
               | vntok wrote:
               | Using a third-party domain means you're already blocked
               | in most cases involving adblockers and non-standard CDNs.
        
               | l1am0 wrote:
               | Oh thanks. That is an insight I did not have so far!
        
         | huhtenberg wrote:
         | Unless I am missing something, that's trivial to block.
         | 
         | Any tracker can be made to work around ad blockers by making
         | callbacks to the site itself and having a small shim there that
         | forwards these pingbacks to the actual tracking service. But
         | even then they still can be blocked based on the request
         | contents.
         | 
         | PS. Here's how your website looks in Firefox -
         | https://i.imgur.com/uFKEB4X.jpg. That's with uBlock off. No
         | console errors.
        
           | l1am0 wrote:
           | Weird that you don't see any text. In FF on Linux it works as
           | expected. What system are you on?
        
             | huhtenberg wrote:
             | Windows
        
         | lightswitch05 wrote:
         | It's pretty disgusting to track people against their consent,
         | even more so to circumvent their protections against tracking.
         | I added your domain to my blocklist:
         | https://github.com/lightswitch05/hosts/commit/bb2cd77c9ec028...
        
           | vntok wrote:
           | It's pretty disgusting to access creators' content for free
           | while blocking their attempts to monetize it.
        
             | [deleted]
        
             | necovek wrote:
             | Why not block access to the content then? You can't watch
             | Netflix streams without paying for them, that's trivial to
             | implement.
             | 
             | Ah, right, creators want their content to show up for my
             | search keywords, Google won't let them have pages only
             | visible to Google bots (though even that is changing with
             | the rise of paywalled sites), and they want the money from
             | that same Google showing ads from their ad network.
             | 
             | Google initially promised to deliver a search for the open
             | web unencumbered. It has become a sort of paywall itself
             | (accept our ads or our search results will be useless
             | pointing you to pages that only work if you have ads
             | enabled).
             | 
             | Sure, it would be fair if they hadn't pushed out the
             | competition acting entirely differently ("we have no ads",
             | "our ads are clearly marked" to current "see if you can
             | tell a difference between an ad and your search results").
        
           | samjmck wrote:
           | Ad blocking != block tracking. If you don't want to get
           | tracked, turn on Do Not Track in your browser. Matomo and
           | most other privacy focused analytics scripts respect that
           | setting.
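
A minimal sketch of what honoring Do Not Track can look like on the server side, assuming the browser sends the `DNT: 1` request header. This is not Matomo's implementation; `should_track` is a hypothetical helper, and the headers are assumed to arrive as a plain dict.

```python
def should_track(headers: dict) -> bool:
    """Return False when the request carries the header `DNT: 1`."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"


print(should_track({"DNT": "1"}))              # False
print(should_track({"User-Agent": "Mozilla"})) # True
```

A missing header or `DNT: 0` is treated as no objection here; a stricter site could choose to treat a missing header as a refusal instead.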
        
             | lightswitch05 wrote:
             | While I agree that is the proper solution, most analytics
             | do not respect the Do Not Track header. Beyond it being
             | mostly ignored, Safari (which currently has 20% global
             | browser share) removed support for Do Not Track in 12.1. So
             | even though Matomo might respect the header request, there
             | is no way for me to send that header on many of my devices.
             | Blocking is the only solution left to me to 'opt out' of
             | tracking regardless of the good intentions of Matomo.
        
             | l1am0 wrote:
             | +1 I set Matomo to respect Do Not Track and you can opt out
             | of the Tracking in my Privacy Settings
        
             | dastx wrote:
             | Sure, so I have tracking protection too, through both uBlock
             | Origin and Firefox's tracking protection feature. Yet here
             | you are, bypassing my tracking protections.
        
               | samjmck wrote:
               | If you're using Firefox tracking protection (which I'm
               | guessing uses DNT as well), then Matomo by default does
               | not track you though. So no, your tracking protections
               | aren't being bypassed.
        
             | arp242 wrote:
             | Do-Not-Track is pointless and dead. Pretty much none of the
             | trackers that _actually_ matter pay one iota of attention
             | to it.
        
           | aorth wrote:
           | It's your call really, but a website owner tracking you with
           | their own software on their own Matomo instance is not the
           | problem. This is essentially the same as monitoring website
           | logs... that's not disgusting at all.
        
             | Mediterraneo10 wrote:
             | When the GDPR was entering into force, I remember some
             | speculating that monitoring Apache logs could violate it,
             | since the user has not consented to having their personal
             | details (i.e. the IP address) processed. What was the final
             | consensus reached on this?
        
             | lightswitch05 wrote:
             | I think grouping server-side tracking with JavaScript based
             | tracking is an oversimplification. JavaScript tracking is
             | much more invasive and can access significantly more data.
             | From something as straightforward as fingerprinting to
             | potentially even more invasive data such as geo-location,
             | battery status, webcam, microphone - you name it. Server
             | access logs aren't going to track my eyes.
             | 
             | I think we can all agree there are different levels of
             | acceptable tracking and use of that data- but the degrees
             | of acceptance are going to be different depending on the
             | user and service. I don't consider bypassing my
             | restrictions to run unauthorized code to be an acceptable
             | tracking method, and it raises serious concerns about how the
             | data will then be used.
        
               | throwaway894345 wrote:
               | OTOH JavaScript tracking is an easy way to filter out a
               | lot of the bots. I use a little bit of JS-based tracking
               | for exactly this reason, but I'm not extracting anything
               | that wouldn't show up in server logs (eventually I also
               | want to get some "time spent on page" metric so I have
               | some idea how useful my blog posts are: are people
               | clicking and leaving right away or are they sticking
               | around to read?). You pretty much need JS for this. In any
               | case, web analytics like these aren't "tracking"; you're
               | looking at user behavior on your own site; not trying to
               | follow them around the Internet or otherwise identify
               | them.
        
               | arp242 wrote:
               | Anyone _can_ do all sorts of things. I can punch anyone I
               | see on the street in the face. Doesn't mean they're
               | actually doing it.
               | 
               | Now, I have a vested interest in this as I work on one of
               | those tracking tools, but it actually collects _less_
               | data than those Apache access_logs that people have been
               | keeping for 25 years. Plus, the JS is unminified and
               | easily examinable if you want (as is the HTTP request),
               | so you also have more insight in what is being collected
               | exactly.
               | 
               | "It's using JavaScript" and "it _can_ do [..]" are
               | massive red herrings; browsers are actually fairly
               | sandboxed and there are millions upon millions of lines
               | of code on your computer that _can_ do much more than
               | JavaScript inside a webpage.
        
               | lightswitch05 wrote:
               | > I can punch anyone I see on the street in the face.
               | 
               | Yes, and then you would be charged with assault. It is
               | great that you work on a tool that respects people's
               | privacy. I suppose I failed to put an emphasis on trust.
               | With server side logs, less trust is required because
               | there is less that can be done. Paired with VPN, I can
               | have reasonable belief that server side logging is not
               | logging anything unreasonable and it does not require
               | trust that they are not fingerprinting me. As you say,
               | just because someone can do something doesn't mean they
               | will - but trust is required, especially if there are no
               | repercussions if that trust is violated.
        
       | matomo-report wrote:
       | I have set up a load balancer with 3 instances of Matomo connected
       | to one MySQL database to handle tracking on a website with around
       | 7000 visits a day. It could all probably be handled by just one
       | instance but that is sort of the standard setup we have for
       | things.
       | 
       | Matomo is very comparable to Google Analytics in terms of
       | reports. Matomo has some things that seem a little easier to get
       | to, like visitor flows.
       | 
       | However, Matomo seems to just give up on big-data, complex
       | reports. Similar reports in Google Analytics take a long time to
       | complete, 10 to 40 seconds, but they at least complete
       | eventually.
        
       | sneak wrote:
       | Does anyone know how to get it to show the full referer URL? I
       | use a self hosted Matomo and it only shows the referer domain.
        
         | XCSme wrote:
         | I don't think that's always possible as it's a browser security
         | limitation. The referring domain can decide to pass on only the
         | domain, not the full URL.
         | 
         | https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Re...
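
A simplified sketch of how the `Referrer-Policy` header on the linking site determines what an analytics tool can see, covering only a few policy values and only the cross-origin case. `referer_for` is a hypothetical helper; real browsers also strip credentials and send nothing on an HTTPS-to-HTTP downgrade under the strict policies, which this sketch ignores.

```python
from urllib.parse import urlsplit


def referer_for(page_url: str, policy: str):
    """Approximate the Referer sent on a cross-origin link under a given policy."""
    parts = urlsplit(page_url)
    origin = f"{parts.scheme}://{parts.netloc}/"
    if policy == "no-referrer":
        return None  # no Referer header at all
    if policy in ("origin", "strict-origin-when-cross-origin"):
        return origin  # only scheme and host, never the path or query
    if policy == "unsafe-url":
        return page_url.split("#", 1)[0]  # full URL without the fragment
    raise ValueError(f"policy not covered by this sketch: {policy}")


print(referer_for("https://news.example.com/item?id=1", "strict-origin-when-cross-origin"))
```

Under `origin` or `strict-origin-when-cross-origin`, the receiving site only ever gets the referring domain, so no setting on the analytics side can recover the full URL.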
        
       ___________________________________________________________________
       (page generated 2021-01-06 23:05 UTC)