Reason #25 why I love gopher
Thursday Dec 27 10:39:11 2012
Hello again gopherites! It's nice to be back on my phlog. I
haven't been on gopher much over the past couple of months, but
I'll talk about that in my next post.
Anyway, back to the subject at hand. The other day I was reading
Hacker News [0] when I noticed an article titled "You're not
anonymous. I know your name, email, and company." [1] I decided to
check it out and see what it had to say. I usually enjoy reading
articles whose subject is privacy, so I figured it would be a good
read.
The article explains how companies track users across websites
so that if a user visits site A and enters personal information,
site B will know all of the information entered on site A, just so
long as B is part of the same "information sharing network".
This is disturbing for many reasons, but my biggest fears are
privacy related. If companies don't need to have any relationship
with a user, but can still access their personal information, that
seems like a breach of "reasonable expectations of privacy".
Unfortunately, there is no law that guarantees such privacy. As long as
the user agrees to such rules, this type of practice is almost
inevitable.
Immediately I thought, "This is one of the reasons why gopher is so
great: there's no tracking. Something like this could never be done
in gopher." But the more I thought about it, the more I wondered
just how traceable gopher really is.
Gopher is much less trackable than http, mainly due to the
simplicity of the protocol. A gopher request sends no other data
than the page the client requested. An http request sends loads of
extra information, such as accepted formats and languages. That
doesn't even consider the numerous other things one can learn about
the system via Javascript, Flash, and Java. If someone collects all
this data and records it, they can use these browser fingerprints
to track users across sessions; users don't even need to be logged
in for the website to know who they are. The EFF has a great example
of browser fingerprinting in action. You can check it out at
panopticlick.eff.org [2]. (For what it's worth, all of the clients
on my laptop say they are unique.)
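To make the contrast concrete, here is a minimal Python sketch of
everything a plain RFC 1436 client puts on the wire. The function
names and the UTF-8 encoding are assumptions made for illustration;
the point is only that the request is the selector plus CRLF, with
no headers for a server to fingerprint.

```python
import socket


def build_gopher_request(selector: str) -> bytes:
    """Return the complete bytes a plain gopher client sends."""
    # The entire request is the selector followed by CRLF.
    # Assumption: selectors are encoded as UTF-8 (RFC 1436 predates it).
    return selector.encode("utf-8") + b"\r\n"


def fetch(host: str, selector: str = "", port: int = 70,
          timeout: float = 10.0) -> bytes:
    """Send one request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_gopher_request(selector))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: response complete
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch("gopher.example.org", "/phlog") (a hypothetical host)
would send exactly the bytes "/phlog" followed by CRLF and nothing
else.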
But are gopher clients really invulnerable to fingerprinting? Well,
the answer is more of an "almost" than a definite "yes".
Gopher clients do have several idiosyncrasies that make them
distinctive. The original gopher client from UMN is the only one to
try to use gopher+ rather than the standard gopher defined in RFC
1436. A gopher server could easily detect this and know the user is
using UMN's client. Also, only a few clients request caps.txt,
which makes such a request easy to narrow down to a few clients. Kim
Holviala outlined this in the gopher mailing list [3]. (Kim also
noted that gopher servers are much easier to detect [4].)
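The two tells described above could be checked on the server side
with something like the following Python sketch. It is a rough
heuristic, not anything taken from the mailing list posts: the
function name is made up, it assumes a gopher+ request ends in a tab
followed by "+", "!" or "$", and it assumes caps.txt is requested
with the selector "caps.txt" or "/caps.txt". A plain search whose
query happens to start with one of those characters would be
misclassified.

```python
def fingerprint_request(raw: bytes) -> set:
    """Return a set of hints about the client that sent this request."""
    # Only the first line matters; drop the trailing CR if present.
    line = raw.split(b"\n", 1)[0].rstrip(b"\r")
    fields = line.split(b"\t")
    selector = fields[0]
    hints = set()

    # Assumption: a gopher+ request carries an extra tab-separated field
    # beginning with "+", "!" or "$" after the selector.
    if len(fields) > 1 and fields[-1][:1] in (b"+", b"!", b"$"):
        hints.add("gopher+")

    # Assumption: caps-aware clients ask for a selector named caps.txt.
    if selector.lstrip(b"/") == b"caps.txt":
        hints.add("caps.txt")

    return hints
```

An empty set means the request looks like plain RFC 1436 gopher,
which is exactly what most clients send.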
Even if the client a person is using can be determined, they are
still in very little danger: since gopher sends practically no
information along with its requests, clients cannot be mapped to a
specific person.
While gopher clients alone cannot reveal a person's identity, I
would hardly call gopher an anonymous protocol. Gopher uses TCP as
its transport protocol, so obviously all servers get an IP address
of the client they're talking to. Unless the user in question is
using a gopher proxy, the IP will be for their home connection.
Quite a bit of info can be gleaned from an IP address.
Even an individual (not a business which has money to spend) can
get several important facts from an IP address. First, just
using a whois search, you can find out who the IP address is
registered to. This is usually the ISP of the user. The next step is
geolocation. IP addresses cannot be reliably mapped to mailing
addresses, but some commercial solutions are fairly accurate and
can determine the country and city of the ISP that registered the IP
address. Unfortunately, I'm too cheap, so I only found two free
solutions for this [5] [6]. The first, hostip.info, didn't have any
info about my IP address at all (I guess that's a good thing?), so
I wouldn't consider it as a useful option. The second one I found
was MaxMind's free GeoLite City geolocation databases (they also
provide commercial, more accurate databases). These don't have a
public API, so I couldn't test them, but they are free to download.
I would imagine that they offer the same or slightly better accuracy
than the hostip.info databases.
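The whois step mentioned above needs nothing more than a TCP
connection to port 43. Here is a rough Python sketch under a few
assumptions: the function names are made up, whois.iana.org is used
as the starting server (it normally points to the regional registry
that holds the actual record), and the field names in the parser are
only common examples, since every registry formats its output
differently.

```python
import socket


def whois_query(query: str, server: str = "whois.iana.org",
                timeout: float = 10.0) -> str:
    """Send one whois query (RFC 3912) and return the raw text reply."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        # A whois request is just the query followed by CRLF.
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def extract_fields(response: str,
                   wanted=("refer", "organisation", "orgname",
                           "netname", "country")) -> dict:
    """Pick a few 'key: value' lines out of a whois reply."""
    found = {}
    for line in response.splitlines():
        # Skip comment lines and anything that is not 'key: value'.
        if line.startswith(("%", "#")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key = key.strip().lower()
        # Keep only the first occurrence of each wanted field.
        if key in wanted and key not in found:
            found[key] = value.strip()
    return found
```

If the reply contains a "refer" field, repeating the query against
that server usually gives the record that names the ISP.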
If all the information we can gather is accurate, we will know the
user's ISP, country, approximate city, and probably time zone.
Combined with the fact that there are very few people who browse
gopher, it might just be feasible to track gopher users, or at
least narrow down possible suspects more accurately than one's
initial predictions might have accounted for.
Is this really that much of a problem? I doubt it. It's still
_much_ harder to pull this off than it would be over http, and plus,
where's the motivation? There are no commercial benefits to
tracking gopher users, so why should anyone even bother?
The lesson here is to be wary of what your tech is broadcasting
behind your back. Also, gopher rules.
(HTM) [0] Hacker News Article
(HTM) [1] "You're not anonymous. I know your name, email, and company." - 42 Floors
(HTM) [2] Panopticlick - How Unique - and Trackable - Is Your Browser?
(HTM) [3] Gopher mailing list - Fingerprinting gopher clients
(HTM) [4] Gopher mailing list - Fingerprinting gopher servers
(HTM) [5] hostip.info - free IP geolocation
(HTM) [6] MaxMind - Free and commercial IP geolocation