[HN Gopher] Tell HN: Startups harvesting GitHub commit emails fo...
       ___________________________________________________________________
        
       Tell HN: Startups harvesting GitHub commit emails for marketing
       purposes
        
       I recently received an email starting like this:  > Hi fouric,  > I found
       your email from one of your GitHub repository while checking for a
       solution.  > Being in the software engineering industry, I am
       reaching out for your valuable inputs on our product called
       "Language Lens" [...]  The email attached to my GitHub account
       isn't exposed in my profile - the only way this individual would
       have gotten my email from GitHub would have been to scrape my
       commit messages.  Needless to say, I didn't request or consent to
       this.  All y'all might want to make sure that the email attached to
       your commits is something you're not afraid to receive spam like
       this at.
        
       Author : fouric
       Score  : 137 points
       Date   : 2022-04-10 15:27 UTC (7 hours ago)
        
       | badrabbit wrote:
       | Doesn't everyone else fake their git email?
        
       | forgotmypw17 wrote:
       | I wish GitHub would allow me to specify additional emails for the
       | activity graph, because mine looks pretty much empty because
       | there's no way I'm going to put my email address into my commits.
        
       | DiabloD3 wrote:
       | Fun fact: Anyone who spams my private, other private, or business
       | email address is going to find their domain name and IP range
       | submitted to the various spam blackhole lists.
       | 
       | Let's call this "Github roulette".
        
       | newaccount74 wrote:
       | I think it's nice that people have their real email in git
       | commits. Gives it a personal touch when you are looking through
       | commit logs. Some clients even display gravatar profile photos...
       | 
       | I understand and respect that some people may prefer to stay
       | anonymous. But if you are hiding your email just to fight spam, I
       | don't think it's worth it. I prefer being easily reachable and
       | dealing with a few spam emails that get through the filter.
        
         | worik wrote:
         | If you are committing to a github repository, then you are not
         | hard to reach.
        
       | jasonlotito wrote:
       | It's interesting because years ago, I remember people saying this
       | was one of the perks of doing open source. You'd have your email
       | address in the code and people would reach out to you for these
       | various things. Times have changed.
        
         | 0des wrote:
         | Grey beard here, lemme toss out my 2 cents.
         | 
         | I use bogus emails like no-reply@localhost. I quite literally
         | don't give a single care in the world what someone wants to say
         | to me, or why. I'd rather select the times when I accept
         | inbound comms. This is a big reason I do not carry a phone, and
         | use one maybe once a month to organize the next cycle with my
         | coconspirators.
         | 
         | I don't dislike people at all, but to contact me after the
         | effort is taken to not be contacted is rude. I wish there were
         | a license that could be used seriously that consists of:
         | 
         | This is forkware
         | 
         | Don't bother me
         | 
         | I'm a nice guy and enjoy sharing. Let's take doritos as an
         | example. I'll share my doritos with you, but I'm not interested
         | in the pleasantries like your analysis of why modern doritos
         | are terrible little crunchy cardboard chunks that mouthfuck
         | your tastebuds into submission. I even don't want to know about
         | the ones I agree with, like how a modern mountain dew most
         | deliciously complements the retro doritos if you can still find
         | them.
         | 
         | I don't mind if you use my code, I don't care if you take my
         | name off it, or put it on, or write it on your arm, I don't
         | care. Go sell it to facebook if that makes your day. If I
         | change my mind I'll put a license on the next version, but I
         | haven't reached that point yet, so the obese and overbearing
         | MIT license it is, until then.
         | 
         | But know this: Don't you even think about contacting me for
         | anything. I am not a business, I have a fondness for the idea
         | that the byproduct of my struggles may help a stranger without
         | my knowledge. I prefer to live while littering artifacts of
         | what I make so that maybe someone else stuck in the same spot
         | can get some relief.
         | 
         | Why do people think that for me to give away some code I now
         | have to have a support staff of one?
         | 
         | You've got the entire game fucked up. Don't @ me.
        
           | xwowsersx wrote:
           | > I'll share my doritos with you, but I'm not interested in
           | the pleasantries like your analysis of why modern doritos are
           | terrible little crunchy cardboard chunks that mouthfuck your
           | tastebuds into submission.
           | 
           | > But know this: Don't you even think about contacting me for
           | anything.
           | 
           | This got angry and weird real quick.
        
             | 0des wrote:
             | Eh sorry, currently on a long flight. I'm sure you're
             | alright.
             | 
             | But don't you email me, you nice and wonderful person.
        
             | [deleted]
        
           | jotm wrote:
           | Yeah, just use throwaway emails everywhere.
           | 
           | Though I'd say people will assume you're poor at "online
           | presence" or something before assuming you don't want to be
           | contacted.
        
           | majkinetor wrote:
           | And yet, you can't blame one for trying. Who gives a shit
           | about what some random grey beard person on the Internet
           | wants - if you don't want to get contacted, do not leave
           | artifacts behind :)
        
           | jasonlotito wrote:
           | It's interesting that the two comments to my comment referred
           | to support.
           | 
           | I remember decades ago the email address in the code being
           | the "resume" for people. People using the Linux mailing list
           | for example to "harvest" emails to contact those people about
           | employment.
           | 
           | I think it's fair to say if you share your email address out
           | in an open source project and give away its code, it is an
           | open invitation to contact the person. After all, the purpose
           | of the email address is generally for contacting people.
           | 
           | If you don't want to be contacted, you already have a
           | solution: don't use a real email address. But if you are
           | putting your email address out there in public these days,
           | the _ONLY_ reason you are doing that is to be contacted.
        
         | fxtentacle wrote:
         | Yes, the expectation nowadays is that every free open source
         | project comes with professional 24/7 support.
        
           | jasonlotito wrote:
           | That's not a new thing. That's been going on for decades. I
           | also wasn't even referring to support, but job offers.
        
       | renewiltord wrote:
       | Yep, and I think it's perfectly reasonable. I'm always happy to
       | receive email on my GitHub for this reason.
       | 
       | If enough people mark as spam it'll go to spam for everyone else
       | so it's pretty self-correcting.
        
       | [deleted]
        
       | ffhhj wrote:
       | Is there an email address that would make a spammer get in
       | trouble? They might call the Github "police" on themselves.
        
       | C4K3 wrote:
       | This has been going on for a long time. There was a company
       | called geekedin that scraped data off public git repositories
       | (among other things) and who then had their database leaked back
       | in 2016.
       | 
       | https://www.troyhunt.com/8-million-github-profiles-were-leak...
        
       | nabaraz wrote:
       | I was getting quite a lot of spam too. So, I ended up creating a
       | brand new email address and rewriting name and email with:
       | git filter-branch -f --env-filter \
       | "GIT_AUTHOR_NAME='Newname'; GIT_AUTHOR_EMAIL='newemail'; \
       | GIT_COMMITTER_NAME='committed-name';
       | GIT_COMMITTER_EMAIL='committed-email';" HEAD
        
         | throwaway892238 wrote:
         | You need to ask GitHub support to reindex your repo or the old
         | address may still show up on the contributors page
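
The filter quoted above rewrites every commit on the branch, including other contributors' commits. A minimal sketch of a narrower variant follows, which only touches commits whose author or committer email matches the old address. The function name `rewrite_commit_email` and its arguments are hypothetical, and `FILTER_BRANCH_SQUELCH_WARNING` is set only to silence the deprecation notice that recent Git versions print for `filter-branch`.

```shell
# Sketch: rewrite only the commits that carry the old email, on all refs.
# rewrite_commit_email and its arguments are hypothetical names.
rewrite_commit_email() {
    repo=$1 old_email=$2 new_name=$3 new_email=$4
    (
        cd "$repo" || exit 1
        # Pass the values through the environment so the filter can read them.
        OLD_EMAIL=$old_email NEW_NAME=$new_name NEW_EMAIL=$new_email \
        FILTER_BRANCH_SQUELCH_WARNING=1 \
        git filter-branch -f --env-filter '
            if [ "$GIT_AUTHOR_EMAIL" = "$OLD_EMAIL" ]; then
                GIT_AUTHOR_NAME=$NEW_NAME
                GIT_AUTHOR_EMAIL=$NEW_EMAIL
            fi
            if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]; then
                GIT_COMMITTER_NAME=$NEW_NAME
                GIT_COMMITTER_EMAIL=$NEW_EMAIL
            fi
        ' -- --all
    )
}
```

Usage would look like `rewrite_commit_email path/to/repo old@example.com "New Name" new@example.com`, followed by a force push. Commit ids change, so this is only reasonable on repositories where rewriting history is acceptable.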
        
       | dustyharddrive wrote:
       | A lot of these spammers start with a repo's stargazer list, so
       | I've been slowly unstarring everything -- if enough people do
       | this GitHub will have to fight at least that kind of scraping. I
       | recommend making a special email alias for git you can rotate, or
       | at least send to a folder you don't check often.
       | 
       | Does anyone here think an open source blocklist for domains
       | associated with this developer-targeted spam would be useful?
        
         | zhfliz wrote:
         | it's impossible for github to block this.
         | 
         | you just need to do a git clone and you get all the email
         | addresses in the commit history of a repository.
         | 
         | unless you're talking about your publicly displayed email
         | address (visible to logged-in users), which I believe is not
         | shown by default and can be disabled easily.
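
To see what a clone actually exposes, one option is to list the addresses stored in your own repository's history. This is a minimal sketch; `list_commit_emails` is a hypothetical helper name, and it assumes the argument is a path to a local clone with at least one commit.

```shell
# Sketch: print every distinct author and committer email in a local clone.
# list_commit_emails is a hypothetical helper name.
list_commit_emails() {
    # %ae is the author email, %ce the committer email; --all covers every ref.
    git -C "$1" log --all --format='%ae%n%ce' | sort -u
}
```

Running `list_commit_emails path/to/clone` on your own repositories before publishing them shows which addresses anyone with read access can recover.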
        
       | stevebmark wrote:
       | This isn't news, recruiters have been doing this for years. You
       | can tell what languages and technologies people know from their
       | commits. It's a great idea for recruiting companies. Your Github
       | commit email is public, there's not much to consent to.
        
       | leros wrote:
       | This has been going on for a while. GitHub has been
       | programmatically scraped by multiple companies that feed into
       | databases that feed into other databases.
       | 
       | Lots of companies are scraping data and lots of companies are
       | buying, aggregating, and selling data to each other.
        
       | slenk wrote:
       | There are tools out there that facilitate this:
       | https://github.com/paulirish/github-email
        
         | LunaSea wrote:
         | Developed by a Googler, surprise, surprise.
        
           | renewiltord wrote:
           | Googler: Produce some amount of code
           | 
           | Anti-Googler: Produce prodigious amounts of comments on the
           | Internet about Google and approximately zero code
        
             | rhizome wrote:
             | Pro-Googler: changes the subject to attack critics
        
               | renewiltord wrote:
               | Haha, I'm not pro-Google (read my past comments). Just
               | tired of the repetitive nonsense here.
        
           | 0des wrote:
           | Oh you done did it now
        
           | ls_waiting wrote:
           | The about section is: "Get a GitHub user's email. "smiley
           | sunglasses face emoji" Use this responsibly." Is the smiley
           | face for the first sentence? (look how cool and clever I was
           | to find this data) Or the second sentence? (let's all have a
           | laugh that there's actually a responsible usage for this
           | data.)
        
           | angrais wrote:
           | The real concern should be: Developed as an npm package.
           | 
           | I mean, really? It's a shell script wrapped as a package to
           | gain traction. What has the state of the Dev world become,
           | indeed ...
        
             | skinnymuch wrote:
             | I was 99% sure you were exaggerating in a classic HN sort
             | of cliched way. I thought it's prob in JS code that could
             | be done as shell but this lets it work on Windows too (at
             | least when it came out).
             | 
             | But no. This is actually just a small shell script as you
             | said. Insane!
        
       | Stampo00 wrote:
       | Github provides an email address for every user to obscure their
       | real email address as well as filtering to reject commits that
       | use your real email address. Unfortunately, you must opt in to
       | use this.
       | 
       | https://docs.github.com/en/account-and-profile/setting-up-an...
        
         | jart wrote:
         | And it makes me so unhappy to see them polluting my commit
         | logs. I won't work with people on open source and run their
         | code on my computer unless I know who they are. Getting
         | recruiter spam is a good problem to have.
        
           | 0des wrote:
           | Maybe some day you'll reconsider. Some of us just don't want to
           | be bothered but don't mind sharing. Check my other reply for
           | context.
        
             | [deleted]
        
           | vvillena wrote:
           | Relying on the commit log email for checking someone's
           | identity is probably a really bad idea. Public-key signed
           | commits are the better tool for that.
        
           | blueflow wrote:
           | In a world where activists launch shitstorms against you on
           | twitter if they don't like something you do or say, there is
           | much reason to never go public with your real name.
           | 
           | It's not even about what you do, worse, it's what they believe
           | you did. And this is a factor you cannot possibly control.
        
             | skinnymuch wrote:
             | Can you give some examples of this happening to "nobodies"
             | in the sense of [internet] fame or clout? Also not if the
             | media riled things up too. I'm sure this stuff happens but
             | I'm not around the places people keep saying this happens
             | (usually Twitter). I've only seen examples of people who
             | were internet famous or some other reasons that wouldn't
             | overlap with a casual GitHub user.
        
           | sodality2 wrote:
           | >I won't work with people on open source and run their code
           | on my computer unless I know who they are.
           | 
           | I can make a free email very easily. Having a "real" gmail in
           | the logs means nothing.
           | 
           | >Getting recruiter spam is a good problem to have.
           | 
           | Easy for you to say. Not everyone wants that. Sign up for
           | your own recruiters then...
        
             | [deleted]
        
         | jotm wrote:
         | It was enabled by default when I made an account (~1 year ago)
        
       | ttyprintk wrote:
       | It might also come from PGP-signed commits, since emails
       | associated with keys are public knowledge. Fairly easy way to
       | filter for good programmers, if you ask me.
        
         | arjvik wrote:
         | You're saying the presence of PGP-signed commits is correlated
         | with programmer skill?
         | 
         | I have never signed my commits because I see absolutely no
         | reason to (my repositories are all small personal projects that
         | nobody uses), and I haven't gotten around to setting up a
         | proper key management workflow with my password manager. Maybe
         | I need to start signing them!
        
           | 0des wrote:
           | I sign mine because though I do not wish to be contacted, I
           | want to be able to verify that what I published came from the
           | online identity I am using at that time for that project. I
           | don't share keys, I don't want to be contacted, I also don't
           | want to be impersonated, which is easy in Git
        
           | encryptluks2 wrote:
           | You can create a key specifically for GitHub and use your no-
           | reply address on the key. That is what I do.
        
         | Etheryte wrote:
         | Fairly easy way to end up right in the spam folder.
        
       | thenerdhead wrote:
       | Report them as spam so their email provider loses reputation and
       | they get warned to stop these practices.
        
       | mherrmann wrote:
       | And now even more startups will do it. However, as others have
       | pointed out, GitHub offers a way to anonymize your email address.
        
       | throwaway892238 wrote:
       | Startups (and scammers) also harvest e-mails from HN and phish
       | them. I posted a unique e-mail address here once and haven't
       | stopped getting all kinds of scams and spam to it.
        
       | ghoomketu wrote:
       | This is quite common unfortunately and has been used by billion
       | dollar companies like Airbnb to get traction(1). Back in the days
       | this was just plain old spamming but nowadays it is called growth
       | hacking
       | 
       |  _Craigslist growth hack --very early on, anyone who listed on
       | Airbnb could cross-list on Craigslist with one click, Airbnb
       | helped by filling out all Craigslists forms with a 'bot.' The
       | hack required some technical gnarl to perform, but it was perfect
       | for this early stage. The team also appeared to look for all
       | listings of vacation properties being listed on Craigslist, and
       | emailed the owners to list also on Airbnb. Yes, it's spam. But
       | yes, it worked._
       | 
       | (1)
       | https://www.linkedin.com/pulse/20140918020352-142089-airbnb-...
        
         | iamacyborg wrote:
         | > nowadays it is called growth hacking
         | 
         | Nowadays it's called illegal. At least in the EU.
        
           | mosselman wrote:
           | Do you have a source for this?
        
             | repox wrote:
             | > Do you have a source for this?
             | 
             | https://gdpr.eu/
             | 
             | It's quite comprehensive, but see article 4 on consent.
        
               | frereubu wrote:
               | They'd probably try to rely on legitimate interest rather
               | than consent, but it would depend to a large extent on
               | how personalised and automated the messages were.
        
               | yakak wrote:
               | Good luck with that. Most legitimate interests require
               | the reason to be defensive or a relationship to exist
               | where there's an expectation they would need to process
               | your data. I am legitimately interested in spamming you
               | is not really a defense.
        
               | rat9988 wrote:
               | Legitimate interest is not vague enough to be relied on.
        
         | 0des wrote:
         | Why is every growth hacker I meet IRL an ex-SEO guy who is
         | convinced I want to buy their NFT bs?
        
         | neilv wrote:
         | > _The team also appeared to look for all listings of vacation
         | properties being listed on Craigslist, and emailed the owners
         | to list also on Airbnb. Yes, it's spam. But yes, it worked._
         | 
         | That was arguably expressly disallowed on CraigsList since very
         | early on. (Post pages would have a checkbox for whether people
         | could contact you for other purposes, and displayed/scraped
         | posts would indicate how the poster answered. I don't see how
         | "growth hackers" could've missed that.)
         | 
         | CL was built on a Californian flavor of warm-fuzzy, and it was
         | a shame to see Californian-style startups then abuse that.
         | (Well, when CL grew mainstream, the anonymity and hookups
         | attracted sketchiness, but was another corruption.)
        
           | scrose wrote:
           | Funnily enough, Airbnb also expressly stated that you were
           | not allowed to scrape their listings early on and went to
           | (moderate) lengths to make it difficult to do so, even when
           | they were breaking, and/or empowering people, to break the
           | laws in given areas.
           | 
           | Everyone wants to keep their data secret, but no one wants to
           | respect others wishes.
        
             | nowherebeen wrote:
             | Same thing with Facebook. They scrape everyone's website
             | but if you open the console on their website: a big legal
             | disclaimer says it's their code. The only website I have
             | ever encountered that does that. Even Google doesn't do it.
             | The hypocrisy is astounding.
        
               | registeredcorn wrote:
               | Wow. I thought you were joking, but nope. That appears to
               | be real. https://i.imgur.com/jCtREao.png
        
               | [deleted]
        
               | Leherenn wrote:
               | It doesn't say you can't copy anything from their
               | website, it says don't copy paste code here because
               | you'll end up pwning yourself, with a link to "selfxss".
               | That's an anti-scam message, doesn't sound that bad to
               | me.
        
               | nowherebeen wrote:
               | It seems like they recently changed the message. It was
               | much more aggressive in the past. Something along the
               | lines of reading it could be considered copyright
               | infringement.
        
               | charcircuit wrote:
               | Can you share it?
        
               | terinjokes wrote:
               | The only thing that seems to have changed in that
               | screenshot is that the word "STOP" is not printed in red.
               | Otherwise it's the same message I've recalled seeing for
               | years.
               | 
               | Discord also has a similar message in their console.
        
       | sshine wrote:
       | I do get offended when I receive unsolicited email, but only
       | because the ads are always so bad.
        
       | m_ke wrote:
       | Seems like for me they have moved on from emails to phone calls.
       | I get 2-3 daily calls from people who claim they're calling me
       | because I didn't reply to their email pitch.
        
       | paradite wrote:
       | Your email is public on your GitHub profile:
       | https://github.com/fouric
       | 
       | And it's the same one in one of the commits:
       | https://github.com/fouric/lightning-cd/commit/db619ad363227e...
       | 
       | Anyway, do you have a problem with people reaching out to you via
       | email that you leave in the commit, or specifically automated
       | scraping of email from commits? I think the former is fine and by
       | design.
        
         | moffkalast wrote:
         | I wonder why emails on commits were ever mandated, instead of
         | just the username. At least they didn't demand a phone number
         | and address, the dimwits.
        
           | encryptluks2 wrote:
           | GitHub uses it to associate signed commits with your user.
           | Not sure why they can't just verify the GPG key on your
           | account though. I think it is fair to say to them the more
           | personal data you share the better.
        
             | forgotpwd16 wrote:
             | GitHub allows you to use username@users.noreply.github.com
             | as commit email and even has an option to block pushes of
             | commits that use your actual email.
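
Setting that address for one repository could look like the sketch below. `set_noreply_email` is a hypothetical helper name, and the address format is taken from the comment above; GitHub may assign some accounts an address with a numeric id prefix, so check your account's email settings for the exact value.

```shell
# Sketch: point a repository's commit email at a GitHub noreply address.
# set_noreply_email is a hypothetical helper; the username is a placeholder.
set_noreply_email() {
    repo=$1 username=$2
    # Per-repository setting; add --global to git config to apply it everywhere.
    git -C "$repo" config user.email "${username}@users.noreply.github.com"
}
```

This only affects future commits. Addresses already recorded in history stay there unless the history is rewritten.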
        
               | encryptluks2 wrote:
               | Yes, but they shouldn't need any email for commits.
        
           | unfunco wrote:
           | They're not mandated by Git, you can do:
           | git commit --author "moffkalast <>"
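
Note that `--author` overrides only the author field. Git records the committer separately, from `user.name` and `user.email` or the `GIT_COMMITTER_*` environment variables, so a real address can still end up in the commit. A small sketch for checking both fields on the latest commit (the helper name `show_commit_idents` is hypothetical):

```shell
# Sketch: show the author and committer identities of the latest commit.
# show_commit_idents is a hypothetical helper name.
show_commit_idents() {
    git -C "$1" log -1 --format='author=%an <%ae>%ncommitter=%cn <%ce>'
}
```

If the committer line still shows an address you want to keep private, changing `user.email` for the repository before committing covers both fields.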
        
       | zeta0134 wrote:
       | Say I don't care about the sanctity of my commit history, and
       | wish to scrub my personal email from all of my public
       | repositories. How would I go about doing this without losing the
       | commits themselves?
        
         | xahrepap wrote:
         | I've found scripts online to help me change the committer on
         | entire repos when I realized I had used the wrong config.
         | Something like the answer here:
         | https://stackoverflow.com/questions/2919878/git-rewrite-prev...
         | 
         | Then you have to force push.
         | 
         | Note, you will still have the same number of commits in the
         | same order with the same diffs, but they will be new commit
         | ids.
         | 
         | You're "rewriting" your history. Which in the case you outline
         | is perfectly fine. But note, anyone who's pulled down the
         | branches you're changing will need to reset rather than pull
         | your changes otherwise git will attempt to MERGE the new
         | history with the old one.
        
         | Arubis wrote:
         | You'd have to rewrite the commit history for each of those
         | repos and force push those changes.
         | https://github.com/newren/git-filter-repo would be a decent
         | starting point.
        
       | [deleted]
        
       | nanidin wrote:
       | Your email address is an address that you published in the
       | commit, and you published it in a public place. You can't really
       | fault someone for trying to get in contact with you via an
       | address you published in a public place.
       | 
       | The message they sent you sounds like a consulting opportunity
       | waiting to happen!
        
         | vorpalhex wrote:
         | Spam is spam. Making a commit is not authorization to spam me.
        
           | pc86 wrote:
           | Publishing an email address publicly could reasonably be seen
           | as consent to receive email at that address. GitHub,
           | including commits and those messages, is mostly public
           | (unless you make the repo private, which is free). There are
           | many, many other free and paid alternatives to GitHub.
           | 
           | I agree with you that spam is spam but at the same time the
           | pearl clutching around "omg _commit messages_ " is kind of
           | silly.
        
           | [deleted]
        
           | nanidin wrote:
           | I would argue that there is a difference between low effort
           | bulk spam and what OP received. The message OP received
           | sounds like a fairly tailored request for expert advice,
           | which is a great way to make $100+/hr to answer questions.
           | 
           | If you only want to receive messages from approved contacts,
           | then use whitelists or stick to Facebook Messenger et al.
        
       | [deleted]
        
       | franciscop wrote:
       | This has been happening for a while, and I guess the solution is
       | the same as always? Mark them as SPAM on your email provider, and
       | hopefully the rest of us won't even receive it.
       | 
       | Now if you are in Europe/related, you could try to go the GDPR
       | way, which TBH I don't know at all what it entails here.
        
       | dpedu wrote:
       | Spammers have been harvesting email addresses written on the web
       | in plain text since the dawn of the web. We've all seen older
       | websites where the author obfuscates their email address like
       | "author [at] domain.com" or "my email is firstname at my domain"
       | or even use an image in place of actual text. I could say that my
       | email is my hacker news username for both the name and domain, on
       | the dot io TLD. A bot couldn't scrape that, but you could figure
       | it out.
        
       ___________________________________________________________________
       (page generated 2022-04-10 23:01 UTC)