[HN Gopher] Hidden GitHub commits and how to reveal them
       ___________________________________________________________________
        
       Hidden GitHub commits and how to reveal them
        
       Author : chuckhend
       Score  : 66 points
       Date   : 2024-02-23 15:41 UTC (7 hours ago)
        
 (HTM) web link (neodyme.io)
 (TXT) w3m dump (neodyme.io)
        
       | Sohcahtoa82 wrote:
       | This highlights why it's so important that any secret that gets
       | committed _must_ be rotated. Simply removing it from the git
       | history isn't enough, because it can still linger; it's just
       | harder to find.
        
         | belinder wrote:
         | While I agree that you should rotate accidentally exposed
         | secrets, it should be noted that you can remove old history
         | from git reflog by expiring it.
         | 
         | I see git reflog kinda like an OS recycle bin
        
           | Kluggy wrote:
           | How do you do that once it's on GitHub's servers?
        
             | latchkey wrote:
             | [deleted]
        
               | javawizard wrote:
               | That link provides instructions on how to access commits
               | from the reflog of a GitHub repository, not how to expire
               | or otherwise delete them.
        
             | KTibow wrote:
             | Sounds like it happens automatically:
             | https://stackoverflow.com/questions/56020314/how-
             | frequently-...
        
               | mxmlnkn wrote:
               | I think this isn't true anymore since at least the
               | introduction of the unwanted GitHub activity view:
               | https://docs.github.com/en/repositories/viewing-activity-
               | and...
        
           | dunham wrote:
           | That's only client side, and you also need to "gc" it to get
           | rid of it, or it will still be in .git/objects and can be
           | retrieved via something like `git cat-file`.
        
             | coldpie wrote:
             | We're getting a little off-topic, but even git-add will put
             | it in the object store without even committing! I once
             | saved my boss's bacon with that. He had git-added a
             | presentation file he'd been working on to commit it, but
             | accidentally nuked his changes with "git reset --hard"
             | before committing. He mentioned his mistake in chat, and we
             | were able to recover the lost object by sorting files in
             | the object directory by last-modified and cat-filing it
             | back out by that ID. He bought me a beer for that after
             | work that day. Good times.
             | 
             | Read gitcore-tutorial(7), folks. You too might save
             | someone's bacon, some day.
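
The recovery described above (sort the object directory by modification time, then read the object back out) can be sketched in Python without calling git at all. This is a minimal sketch under stated assumptions: the repository uses SHA-1 object IDs (40 hex characters), the object is still loose rather than packed, and the helper names `recent_loose_objects` and `read_loose_object` are hypothetical, not part of git.

```python
import zlib
from pathlib import Path


def recent_loose_objects(git_dir, limit=10):
    """Return (object_id, type, size) for the most recently written loose objects."""
    objects_dir = Path(git_dir) / "objects"
    # Loose objects live at objects/<first 2 hex chars>/<remaining 38 hex chars>.
    # Assumption: SHA-1 repository; the "pack" and "info" directories are skipped
    # because they do not match the two-character pattern.
    candidates = [
        p for p in objects_dir.glob("[0-9a-f][0-9a-f]/*")
        if p.is_file() and len(p.name) == 38
    ]
    candidates.sort(key=lambda p: p.stat().st_mtime, reverse=True)

    results = []
    for path in candidates[:limit]:
        # A loose object is zlib-compressed "<type> <size>\0<content>".
        raw = zlib.decompress(path.read_bytes())
        header, _, _body = raw.partition(b"\x00")
        obj_type, size = header.decode().split(" ")
        results.append((path.parent.name + path.name, obj_type, int(size)))
    return results


def read_loose_object(git_dir, object_id):
    """Return the raw content of one loose object, without its header."""
    path = Path(git_dir) / "objects" / object_id[:2] / object_id[2:]
    raw = zlib.decompress(path.read_bytes())
    _header, _, body = raw.partition(b"\x00")
    return body
```

Calling `recent_loose_objects(".git")` lists the newest candidates; a lost file staged with git add would show up as a `blob`, and `read_loose_object` returns its bytes, roughly what `git cat-file -p <id>` prints for a blob.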
        
         | sammorrowdrums wrote:
         | Full disclosure, I work for GitHub, but push protection from
         | Secret Scanning is awesome for this because your nearly leaked
         | secret doesn't make it to the remote, and it gives you
         | instructions on how to fix your local repo!
        
           | lol768 wrote:
           | Why does GitHub provide no way for a repository administrator
           | to self-service a git gc? I seem to recall reading a blog
           | post that suggested GitHub had invested a bunch of
           | engineering resource in making cleaning up unreachable
           | objects much more scalable.
        
         | arein3 wrote:
         | Is it committed or pushed?
         | 
         | If I commit something locally, reset it, and push something
         | else to the remote, does it leave a trace?
        
           | msm_ wrote:
           | It is still in your _local_ repository, but it's not pushed
           | to the remote repository. So forensics on your local
           | machine may reveal it (probably until you do git gc, but I'm
           | not an expert on git forensics) but it's safe otherwise.
        
       | Okx wrote:
       | If you've inadvertently committed, say, copyrighted material to
       | GitHub, and want to fully erase it, is there a way? Other than
       | contacting GitHub as this article mentions.
       | 
       | Even if you contact them, GitHub says[1] that they will not
       | remove "non-sensitive data", but makes no reference to
       | copyrighted material.
       | 
       | [1] https://docs.github.com/en/authentication/keeping-your-
       | accou...
        
         | lima wrote:
         | There isn't, you need to contact them so they can delete the
         | offending objects.
        
         | kevingadd wrote:
         | If it's a copyright violation (be sure that it ACTUALLY is!)
         | they will remove content in response to a DMCA request, but any
         | forks will only be removed if you manually find them and issue
         | a request for each fork. This isn't very useful if you
         | accidentally uploaded your own copyrighted material though,
         | since that's not a violation you could issue a notice for.
        
           | Okx wrote:
           | Can you DMCA yourself for someone else's copyrighted
           | material? That's what I'm talking about here.
        
             | kevingadd wrote:
             | You have to be the copyright holder or their
             | representative, so no, it would technically be illegal to
             | DMCA yourself for violating someone else's IP. If you asked
             | github support nicely they might help, though.
        
       | funyug wrote:
       | Is this an issue with Git or only with GitHub? If it is only a
       | GitHub issue, I won't use it anymore for personal projects.
        
         | Denvercoder9 wrote:
         | It's not really an issue, it's just that the assumption that
         | removing a commit from the history actually deletes it is not
         | correct. That holds for both Git and GitHub, and probably most
         | other Git hosts.
         | 
         | Also in general, don't assume that you can remove _anything_
         | from the internet once it has been published.
        
           | IshKebab wrote:
           | It is an issue. It means there's no way to _actually_ delete
           | commits from a GitHub repo.
           | 
           | And it is a GitHub issue. If you were self-hosting you could
           | just run `git prune`, `git gc`, `git repack`, or whatever
           | the magic command is.
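
For a repository you control, the sequence hinted at in this subthread is usually: expire the reflog, then garbage-collect with immediate pruning. The sketch below wraps those two git commands in Python. The wrapper name `purge_unreachable` and its `dry_run` flag are hypothetical; the git subcommands and flags themselves are standard. It assumes nothing else (another branch, a tag, a stash) still references the unwanted commits, and it does nothing about copies held by a remote host or by other clones.

```python
import subprocess

# The two commands, in the order they need to run.
CLEANUP_COMMANDS = [
    # Drop every reflog entry, so old branch tips stop counting as reachable.
    ["git", "reflog", "expire", "--expire=now", "--all"],
    # Delete unreachable objects immediately instead of after the grace period.
    ["git", "gc", "--prune=now"],
]


def purge_unreachable(repo_path, dry_run=True):
    """Print the cleanup commands (dry_run=True) or run them in repo_path."""
    for command in CLEANUP_COMMANDS:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if git exits non-zero.
            subprocess.run(command, cwd=repo_path, check=True)
    return CLEANUP_COMMANDS
```

The default is a dry run because the real run is destructive: once the objects are pruned, the recycle-bin behaviour of the reflog is gone for good.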
        
             | biftek wrote:
             | If your remote is publicly accessible (GitHub or not),
             | anyone could have cloned it while the sensitive data was
             | there, and no magic command will make that go away.
        
         | kevincox wrote:
         | Mostly a Git issue. In general Git won't remove old data pushed
         | to remotes, except perhaps when the host runs a garbage
         | collection.
         | 
         | GitHub does exacerbate it a little by providing APIs that list
         | commits that are no longer in the history. But there are
         | other ways to get this info, such as brute-forcing short
         | prefixes of commit hashes.
         | 
         | But really this is another case of the general problem that
         | once you publish information you can't unpublish it. If you
         | push a secret to a repo you can't 100% reliably clean it up.
         | You should assume that everyone with the repo took a copy.
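
To see why brute-forcing short hash prefixes is feasible, it helps to count the search space. The sketch below only enumerates candidate prefixes and makes no network requests. The prefix length of 4 in the usage note is an assumption for illustration; the comment above does not say how short a prefix a given host accepts.

```python
from itertools import product

HEX_DIGITS = "0123456789abcdef"


def search_space(length):
    """Number of distinct hex prefixes of the given length."""
    return len(HEX_DIGITS) ** length


def candidate_prefixes(length):
    """Yield every hex prefix of the given length, in lexicographic order."""
    for digits in product(HEX_DIGITS, repeat=length):
        yield "".join(digits)
```

With a 4-character prefix, `search_space(4)` is 65536, which is small enough to enumerate exhaustively. Each extra character multiplies the space by 16.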
        
       | semiquaver wrote:
       | You don't even need the pushes API to see commits that were force
       | pushed away. You can get the head of any branch at a given time
       | using `gitrevisions` [1] syntax any place that you would normally
       | put a branch or commit.
       | 
       | For example, to see the state of the cpython main branch on
       | January 1, we can ask for `main@{2024-01-01}`:
       | 
       | https://github.com/python/cpython/tree/main@{2024-01-01}
       | 
       | This does not walk the commit history, but instead the server-
       | side reflog, so it's immune to force pushing and can only be
       | avoided by GC of the reflog or repo. Definitely contact GH
       | support if you pushed something you shouldn't have.
       | 
       | [1] https://git-scm.com/docs/gitrevisions
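
The URL shape in the comment above can be built programmatically. This is a small sketch: the function name `tree_url_at` is hypothetical, and it only assembles the string in the same form as the cpython example. Whether the host resolves the revision depends on its server-side reflog, as the comment notes.

```python
from datetime import date


def tree_url_at(owner, repo, branch, when):
    """Build a GitHub tree URL using the gitrevisions `<ref>@{<date>}` syntax."""
    # Doubled braces are literal braces in an f-string, giving e.g. main@{2024-01-01}.
    revision = f"{branch}@{{{when.isoformat()}}}"
    return f"https://github.com/{owner}/{repo}/tree/{revision}"


print(tree_url_at("python", "cpython", "main", date(2024, 1, 1)))
# prints https://github.com/python/cpython/tree/main@{2024-01-01}
```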
        
       ___________________________________________________________________
       (page generated 2024-02-23 23:01 UTC)