[HN Gopher] PyPy has moved to Git, GitHub
       ___________________________________________________________________
        
       PyPy has moved to Git, GitHub
        
       Author : lumpa
       Score  : 137 points
       Date   : 2024-01-01 20:06 UTC (2 hours ago)
        
 (HTM) web link (www.pypy.org)
 (TXT) w3m dump (www.pypy.org)
        
       | nu11ptr wrote:
       | I used to use Mercurial as well and greatly preferred it, but for
       | better or worse, Git won. I started using Git several years ago
       | and haven't looked back.
       | 
       | No matter what people might say, I think this stuff matters for
       | contributors and users who might be looking at your project, and
       | git/github is the typical expectation. This is likely the right
       | decision, as they are now ubiquitous.
        
         | vasco wrote:
         | Same story for us: we started with Mercurial many years ago, but
         | eventually the tooling around git and just "using the standard"
         | was too big to ignore, and we migrated along with a bunch of
         | other CI/CD and DevX improvements. Mercurial was cool, but its
         | weaker tool support caused friction: Jenkins, for example, had
         | to "pull" 3x for Mercurial versus 1x for git, which it supports
         | natively. With many little things like that, just using git
         | generally saved us a bunch of work.
        
         | lelandfe wrote:
         | If you liked the interface of Mercurial, there is
         | https://foss.heptapod.net/mercurial/hg-git
        
           | jcranmer wrote:
           | I've used that in the past, but it doesn't really work that
           | well on large projects--it basically works by keeping a hg
           | and a git version of the same repository, and storing a
           | mapping between the two, which scales really poorly with
           | multi-million commit repositories.
           | 
           | What I really want is something that will let me use the
           | interface of hg's power tools (revsets, phases, changeset
           | evolution) on an existing git repository.
        
         | lowbloodsugar wrote:
         | That Torvalds's second biggest creation is now most closely
         | associated with a Microsoft company gives me feelings.
        
           | akerl_ wrote:
           | Me too. It feels great to see such a clear example of
           | building a successful business on an open source tool and
           | ecosystem. Git is massively popular and actively used, GitHub
           | built a huge community for discovering and interacting with
           | open source projects, and since the acquisition, Microsoft-
           | owned GitHub has continued improving their platform without
           | breaking interop with the open spec.
           | 
           | Everybody wins.
        
       | 8organicbits wrote:
       | > the script properly retained the issue numbers
       | 
       | Oh that's quite helpful. I was worried about how lossy the
       | migration would be.
        
       | aeurielesn wrote:
       | Why do people like Mercurial branches? Was it revamped? I hated
       | it when I used it.
       | 
       | By all means, I prefer Git branches.
        
         | KwanEsq wrote:
         | I mean you can't really compare them since git doesn't even
         | _have_ branches as Mercurial understands them. git's branches
         | would perhaps better be called twigs in comparison. git's
         | lightweight branches better map to Mercurial's topics or
         | bookmarks, though neither perfectly. And Mercurial has even
         | lighter weight branches since you can just make a new head by
         | committing without having to name anything, and it won't yell
         | at you about a detached head like git will.
        
         | notPlancha wrote:
         | PyPy described the following in the FAQ:
         | 
         | > The difference between git branches and named branches is not
         | that important in a repo with 10 branches (no matter how big).
         | But in the case of PyPy, we have at the moment 1840 branches.
         | Most are closed by now, of course. But we would really like to
         | retain (both now and in the future) the ability to look at a
         | commit from the past, and know in which branch it was made.
         | Please make sure you understand the difference between the Git
         | and the Mercurial branches to realize that this is not always
         | possible with Git-- we looked hard, and there is no built-in
         | way to get this workflow.
         | 
         | > Still not convinced? Consider this git repo with three
         | commits: commit #2 with parent #1 and head of git branch "A";
         | commit #3 with also parent #1 but head of git branch "B". When
         | commit #1 was made, was it in the branch "A" or "B"? (It could
         | also be yet another branch whose head was also moved forward,
         | or even completely deleted.)
         | 
         | In this post they say that "Github notes solves much of point
         | (1): the difficulty of discovering provenance of commits,
         | although not entirely"
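
A minimal sketch of the three-commit example from the FAQ quote above, using a toy in-memory commit graph rather than a real repository. The `Commit` class, the `hg_branch` field, and the choice of "default" as the recorded branch of commit #1 are illustrative assumptions, not anything taken from PyPy's migration script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    id: int
    parents: tuple = ()
    hg_branch: str = ""  # Mercurial-style: branch name stored in the commit


def ancestors(commits, head):
    """Return the ids of every commit reachable from head, head included."""
    seen, stack = set(), [head]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(commits[cid].parents)
    return seen


def git_branches_containing(commits, refs, cid):
    """Git-style answer: every branch ref whose history reaches the commit."""
    return sorted(
        name for name, head in refs.items() if cid in ancestors(commits, head)
    )


# Commit #2 and commit #3 both have commit #1 as their parent.
commits = {
    1: Commit(1, (), hg_branch="default"),  # "default" is an assumption
    2: Commit(2, (1,), hg_branch="A"),
    3: Commit(3, (1,), hg_branch="B"),
}
refs = {"A": 2, "B": 3}  # git branches are just movable pointers to heads

print(git_branches_containing(commits, refs, 1))  # both refs reach commit #1
print(commits[1].hg_branch)  # the recorded name survives ref moves
```

In this model the git-style query can only say which refs currently reach commit #1, while the recorded field answers where it was made, which is the distinction the FAQ is drawing.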
        
           | dgfitz wrote:
           | The question in your example seems odd to me. It can be
           | interpreted as either 2 OR 3 unique branches depending on how
           | you read it.
           | 
           | There is either base branch A whose current head is commit #2
           | / branch B with head of commit #3.
           | 
           | OR
           | 
           | Commit #1 is branch "default", commit #2 is branch "A" with
           | parent commit #1, and commit #3 is branch "B" with parent
           | also commit #1.
           | 
           | Consider your same example with forking instead of branching,
           | how would the issue be resolved?
        
         | jcranmer wrote:
         | There are benefits to having branches be an inherent property
         | of a commit as opposed to the Git model of a dynamic property
         | of the graph.
         | 
         | Suppose I have a branch A with three commits, and then I make
         | another branch B on top of that with another few commits. The
         | Git model essentially says that B consists of all commits that
         | are an ancestor of B that aren't the ancestor of any other
         | branch. But now I go and rebase A somewhere else--and as a
         | result, B suddenly grew several extra commits on its branch
         | because those commits are no longer on branch A. If I want to
         | rebase B on the new A, well, those duplicated commits will
         | cause me some amount of pain, pain that would go away if only
         | git could remember that some of those commits are really just
         | the old version of A.
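
The effect described above can be reproduced on a toy graph. This is only a sketch of the "commits on a branch are those reachable from it and from no other branch" reading given in the comment; the commit names (`m1`, `a1`, `b1`, and so on), the extra `main` branch, and the `own_commits` helper are all hypothetical.

```python
def ancestors(parents, head):
    """Return every commit reachable from head, head included."""
    seen, stack = set(), [head]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def own_commits(parents, refs, branch):
    """Commits reachable from `branch` but from no other branch ref."""
    others = set()
    for name, head in refs.items():
        if name != branch:
            others |= ancestors(parents, head)
    return ancestors(parents, refs[branch]) - others


# main: m1 <- m2; A: three commits on top of m1; B: two commits on top of A.
parents = {
    "m1": [], "m2": ["m1"],
    "a1": ["m1"], "a2": ["a1"], "a3": ["a2"],
    "b1": ["a3"], "b2": ["b1"],
}
refs = {"main": "m2", "A": "a3", "B": "b2"}
before = own_commits(parents, refs, "B")

# Rebase A onto m2: new commits are created and only the ref A moves.
parents.update({"a1'": ["m2"], "a2'": ["a1'"], "a3'": ["a2'"]})
refs["A"] = "a3'"
after = own_commits(parents, refs, "B")

print(sorted(before))
print(sorted(after))
```

After the rebase, the old `a1`, `a2`, and `a3` are reachable only through B, so under this definition they now count as B's commits.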
        
           | sampo wrote:
           | > If I want to rebase B on the new A, well, those duplicated
           | commits will cause me some amount of pain
           | 
           | Not really. Git will recognize commits that produce an
           | identical diff and skip them. Your only pain will be that for
           | each skipped commit, you will see a notification line in the
           | output of your `git rebase`:
           | 
           |     warning: skipped previously applied commit <hash>
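
A rough sketch of the idea behind that skipping, assuming a simplified stand-in for `git patch-id`: hash each diff while ignoring whitespace and hunk line numbers, then skip any commit whose hash already appears upstream. The `patch_id` and `plan_rebase` helpers and the sample diffs are hypothetical, and real Git's normalization differs in detail.

```python
import hashlib


def patch_id(diff_text):
    """Hash a diff, ignoring whitespace and hunk line numbers (simplified)."""
    kept = []
    for line in diff_text.splitlines():
        if line.startswith("@@"):  # hunk headers only carry line numbers
            continue
        kept.append("".join(line.split()))  # drop all whitespace
    return hashlib.sha1("\n".join(kept).encode()).hexdigest()


def plan_rebase(to_replay, upstream):
    """Split (name, diff) commits into those to pick and those to skip."""
    applied = {patch_id(diff) for _, diff in upstream}
    picked, skipped = [], []
    for name, diff in to_replay:
        (skipped if patch_id(diff) in applied else picked).append(name)
    return picked, skipped


# The same change at two different positions in the file.
old = "@@ -10,3 +10,4 @@\n def f():\n+    log()\n     return 1\n"
new = "@@ -42,3 +42,4 @@\n def f():\n+    log()\n     return 1\n"
other = "@@ -1,2 +1,3 @@\n+import os\n import sys\n"

picked, skipped = plan_rebase([("a1", old), ("b1", other)], [("a1'", new)])
print(picked, skipped)
```

Note the limit of this approach: if the rebased version of A was edited along the way, the diffs no longer match and the old commits are replayed, which is where the conflicts mentioned in the parent comment would come from.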
        
       | TheRealPomax wrote:
       | Better late than never. Here's hoping that means things like `pip
       | publish` are back on the table, too.
        
         | simonw wrote:
         | Are you confusing PyPI and PyPy? Easily done!
        
       | judge2020 wrote:
       | Git should be a pretty easy-to-federate system, at least in terms
       | of mimicking pull requests. Is there anything that tries to do
       | so? Gitea?
        
         | aquatica wrote:
         | iirc GitLab is trying to federate using ActivityPub
         | 
         | Edit: yes,
         | https://docs.gitlab.com/ee/architecture/blueprints/activity_...
        
         | matrss wrote:
         | ForgeFed sounds promising: https://forgefed.org/
        
         | RheingoldRiver wrote:
         | Codeberg is also working on federating, or maybe they already
         | do. My experience using them was quite unpleasant, though,
         | they're very feature-incomplete.
        
       | sidkshatriya wrote:
       | I've been using git happily for many years. Strangely enough the
       | provenance of a commit, i.e. which branch a commit originally
       | came from, has not really mattered to me very much. Mercurial
       | provides this and they are using `git notes` to add this
       | provenance metadata to each commit during migration to git.
       | 
       | I would have thought I'd need this much more, but I have not. In
       | plain git I'll just `git log` and grep for the commit in case I
       | want to make sure a commit is available in a certain branch.
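
For the "is this commit available in that branch" check, one alternative to grepping `git log` is `git merge-base --is-ancestor`, which exits with 0 when the first commit is an ancestor of the second and 1 when it is not. The wrapper below is a small sketch; the function names and the `repo` parameter are assumptions made for illustration.

```python
import subprocess


def is_ancestor_cmd(commit, branch, repo="."):
    """Build the git command that tests whether commit is reachable from branch."""
    return ["git", "-C", repo, "merge-base", "--is-ancestor", commit, branch]


def is_on_branch(commit, branch, repo="."):
    """Return True if commit is an ancestor of branch in the given repository."""
    result = subprocess.run(is_ancestor_cmd(commit, branch, repo), capture_output=True)
    if result.returncode in (0, 1):  # 0: ancestor, 1: not an ancestor
        return result.returncode == 0
    # Any other exit status signals an error, such as an unknown revision.
    raise RuntimeError(result.stderr.decode(errors="replace").strip())
```

A call such as `is_on_branch("abc123", "main")` would need a real repository and real revision names; `git branch --contains <commit>` gives the complementary listing of every branch that reaches a commit.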
        
         | g-b-r wrote:
         | The point is giving branches a meaning (e.g. "implementation of
         | this feature") and being able to at least keep the information
         | that such commit was part of that (well at least that's why I'd
         | want Mercurial's named branches, I'm not sure that's how this
         | project used them)
        
         | heads wrote:
         | When I want to see inside a piece of software I look for (1)
         | the source code; (2) the git-blame; (3) the code review for
         | significant commits. I have never wanted to see into the
         | history before that point, namely how the developer drafted and
         | polished their idea prior to the final code review approval.
         | 
         | What practical use case am I missing out on when these work-in-
         | progress draft commits are lost? I can't see one.
        
         | eikenberry wrote:
         | Wouldn't good merge commit conventions work to preserve as much
         | of this sort of information as desired? All the commits of the
         | branch contained in it with the merge commit message
         | preserving that info.
        
       | csk111165 wrote:
       | But what about compatibility, is it fully compatible with Git??
       | How will the contribution workflow change?
        
       | SushiHippie wrote:
       | > Open Source has become synonymous with GitHub, and we are too
       | small to change that.
       | 
       | It's kind of sad that this is true.
       | 
       | I'm guilty myself, I contribute to projects on GitHub more often
       | than on any other platform.
       | 
       | And when I search for open source projects the first page I use
       | is GitHub.
        
         | enriquto wrote:
         | Be part of the solution, not part of the problem. You can use
         | some other forge and keep an up-to-date github repository as a
         | read-only front.
        
           | SushiHippie wrote:
           | I don't host my own repositories on GitHub, I host a gitea
           | instance myself.
        
           | clintonb wrote:
           | What exactly is the problem?
        
         | politelemon wrote:
         | I will admit, I've often searched for $project_name github to
         | get to their repositories. It shouldn't matter but it's just a
         | force of habit now.
         | 
         | That said I do feel some joy when I see a project on Gitlab and
         | am happy to contribute there, eg FDroid.
        
         | gdevenyi wrote:
         | A lot of this is SourceForge's fault.
         | 
         | They had a sizable lead and completely bungled it.
        
           | bastawhiz wrote:
           | It's not just that they dropped the ball, they actively
           | sabotaged whatever goodwill they had built by adding malware
           | to software. Not only was this a massive hassle, it ruined
           | the reputation of lots of FOSS projects with folks who just
           | wanted to use some of the most popular consumer-ish open
           | source software like Filezilla.
           | 
           | While SF was crapping where they eat, GitHub built a lot of
           | trust and goodwill with a lot of people.
        
         | NooneAtAll3 wrote:
         | I found Codeberg to be a good substitute
        
       | atticora wrote:
       | > foss.heptapod.net is not well indexed in google/bing/duckduckgo
       | search, so people find it harder to search for issues in the
       | project.
       | 
       | SEO : WWW structure :: gravity : orbital mechanics
        
         | Cupprum wrote:
         | What does this mean?
        
           | atticora wrote:
           | The web is shaped by the needs of the indexes as the solar
           | system is shaped by gravity.
        
       | bluish29 wrote:
       | If you are confused (like me) that this was about PyPI (Python
       | packages repository) then no. It is about a project called PyPy
       | (one can argue it is bad name) that is an implementation of
       | python interpreter but without cpython. Instead they rely on a
       | JIT compiler. And it is syntax compatible but if your code uses
       | any library or method relying on C extensions then you are out of
       | luck (Goodbye NumPy.. etc).
       | 
       | Edit: They have C-layer emulation, but I don't know its
       | limitations or current status, but you can use those libraries
       | [1][2]
       | 
       | [1] https://www.pypy.org/posts/2018/09/inside-cpyext-why-
       | emulati...
       | 
       | [2] https://pythoncapi.readthedocs.io/cpyext.html
        
         | giovannibajo1 wrote:
         | to be fair, PyPy predates PyPI
        
           | d-cc wrote:
           | Now all we need is a PiPy and we'll have all the pies
        
             | dpflan wrote:
             | PiPi would complete the set but that would be (doubly)
             | irrational...
        
               | cdjk wrote:
               | I think you mean transcendental.
        
               | dpflan wrote:
               | True, perhaps that would be how the pie would taste:
               | truly transcendental, but I don't think it could ever be
               | finished.
        
         | d-cc wrote:
         | THANK YOU. This was my first reaction as well.
        
         | pletnes wrote:
         | They do support numpy. The pypy name predates pypi. You're off
         | on multiple details here.
        
         | pythux wrote:
         | Nit. I believe you can use numpy and at least some other
         | libraries relying on native extensions but performance might
         | vary: https://doc.pypy.org/en/latest/faq.html#should-i-install-
         | num...
        
         | CogitoCogito wrote:
         | > one can argue it is bad name
         | 
         | Given the way that pypy is implemented, I think the name is
         | quite clever really.
        
         | fumeux_fume wrote:
         | The packages repo is known as PyPI (like Py P.I.), not PyPi.
        
         | Waterluvian wrote:
         | PyPy being a Python JIT written in Python with an ouroboros as
         | a logo is pretty much the perfect name.
        
       | 1letterunixname wrote:
       | Speaking of git, for mega monorepo performance, we're gonna need
       | synthetic FSes and SCM-integrated synthetic checkouts. Sapling
       | (was hg in the past but was forked and reworked extensively) will
       | be able to do this if EdenFS will ever be released, but Git will
       | need something similar. This will require a system agent running
       | with a caching overlay fs that can grab and cache bits on-the-
       | fly. Yes, it's slightly slower than having contents already, but
       | there is no way to checkout a 600+ GiB repo on a laptop with a
       | 512 GiB SSD.
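
The fetch-on-first-access idea can be sketched in a few lines. This is not how EdenFS or any real virtual filesystem is implemented; `LazyCheckout`, the `fetch_blob` callable, and the in-memory `remote` dictionary standing in for a server are all hypothetical.

```python
class LazyCheckout:
    """Toy checkout that downloads file contents only when they are first read."""

    def __init__(self, fetch_blob):
        self._fetch_blob = fetch_blob  # hypothetical callable: path -> bytes
        self._cache = {}               # local cache of already fetched files
        self.fetches = 0               # number of round trips to the remote

    def read(self, path):
        if path not in self._cache:
            # First access: pay the network cost once and remember the result.
            self._cache[path] = self._fetch_blob(path)
            self.fetches += 1
        return self._cache[path]


# A dictionary stands in for the remote repository contents.
remote = {
    "src/main.py": b"print('hi')\n",
    "assets/big.bin": b"\0" * 1024,
}
checkout = LazyCheckout(remote.__getitem__)

checkout.read("src/main.py")
checkout.read("src/main.py")  # served from the cache, no second fetch
print(checkout.fetches)
```

Files that are never read, such as `assets/big.bin` here, never occupy local disk, which is the property that makes an oversized repository usable on a small SSD.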
        
         | filmgirlcw wrote:
         | That already exists. It's called Scalar[1]. It has been built
         | into Git since October 2022[2], dates back to 2020[3], and is
         | the spiritual successor of something Microsoft was using as far
         | back as 2017[4].
         | 
         | 1. https://git-scm.com/docs/scalar
         | 
         | 2. https://github.blog/2022-10-13-the-story-of-scalar/
         | 
         | 3. https://devblogs.microsoft.com/devops/introducing-scalar/
         | 
         | 4. https://devblogs.microsoft.com/bharry/the-largest-git-
         | repo-o...
        
           | aseipp wrote:
           | Scalar explicitly does not implement the virtualized
           | filesystem the OP is referring to. The original Git VFS for
           | Windows that Microsoft designed did in fact do this, but as
           | your third link notes, Microsoft abandoned that in favor of
           | Scalar's totally different design which explicitly was about
           | scaling repositories _without_ filesystem virtualization.
           | 
           | There's a bunch of related features they added to Git to
           | achieve scalability without virtualization, including the
           | Scalar daemon which does background monitoring and
           | optimization. Those are all useful and Scalar is a welcome
           | addition. But the need for a virtual filesystem layer for
           | large-scale repositories is still a very real one. There are
           | also some limitations with Git's existing solutions that
           | aren't ideal; for example Git's partial clones are great but
           | IIRC can only be used as a "cone" applied to the original
           | filesystem hierarchy. More generalized designs would allow
           | mapping arbitrary paths in the original repository to any
           | other path in the virtual checkout, and synchronizing between
           | them. Tools like Josh can do this today with existing Git
           | repositories[1].
           | 
           | The Windows Git repo that was referenced isn't even that big
           | at 300GB, either. That's well within the realm of single
           | machine stuff. Game studios regularly have repositories that
           | exist at multi-terabyte size, and they have also converged on
           | similar virtualization solutions. For example, Destiny 2 uses
           | a "virtual file synchronization" layer called VirtualSync[2]
           | that reduced the working size of their checkouts by over 98%,
           | multiple terabytes of savings per person. And in a twist of
           | fate, VirtualSync was implemented thanks to a feature called
           | "ProjFS" that Microsoft added to Windows... which was
           | motivated originally by the Git VFS for Windows they
           | abandoned!
           | 
           | [1] https://github.com/josh-project/josh
           | 
           | [2] https://www.gdcvault.com/play/1027699/Virtual-Sync-
           | Terabytes...
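
To make the "map arbitrary paths to other paths in the virtual checkout" idea concrete, here is a small sketch of projecting repository paths through a prefix mapping. It does not use Josh's actual filter syntax; the `project` function, the mapping format, and the sample paths are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def project(repo_paths, mapping):
    """Map repository paths into a virtual checkout; unmapped paths are omitted.

    `mapping` goes from a source directory in the repository to the directory
    it should appear under in the checkout. Returns {virtual_path: repo_path}.
    """
    view = {}
    for path in repo_paths:
        p = PurePosixPath(path)
        for src, dst in mapping.items():
            try:
                rel = p.relative_to(src)  # raises ValueError if not under src
            except ValueError:
                continue
            view[str(PurePosixPath(dst) / rel)] = path
            break  # first matching prefix wins
    return view


repo_paths = [
    "libs/json/parse.c",
    "apps/editor/main.c",
    "apps/game/huge.bin",
]
mapping = {"libs/json": "vendor/json", "apps/editor": "src"}

for virtual, original in project(repo_paths, mapping).items():
    print(virtual, "<-", original)
```

Keeping the reverse lookup (virtual path back to repository path) in the result is what would let a tool synchronize edits made in the checkout back to the original locations.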
        
       | throwawaaarrgh wrote:
       | Every provider out there can talk a standard Git protocol, but
       | all the features that don't have a standard Git protocol become a
       | proprietary API. I think if Git (or a project like it) made a
       | standard protocol/data format for all the features of a SCM, then
       | all those providers could adopt it, and we could start moving
       | away from GitHub as the center of the known universe. If we don't
       | make a universal standard (and implementation) then it'll remain
       | the way it is today.
        
       | juped wrote:
       | This is a tragic, wrongheaded move, and I say that as a big Git
       | enthusiast (but a Github hater, to be fair...)
       | 
       | I don't think PyPy gains anything from this, not even a reduction
       | in the annoying messages that have been psychologically torturing
       | the maintainers. If anything, you're just opening yourself up to
       | more common and frequent low-investment pestering.
        
       ___________________________________________________________________
       (page generated 2024-01-01 23:01 UTC)