[HN Gopher] PyPy has moved to Git, GitHub
___________________________________________________________________
PyPy has moved to Git, GitHub
Author : lumpa
Score : 137 points
Date : 2024-01-01 20:06 UTC (2 hours ago)
(HTM) web link (www.pypy.org)
(TXT) w3m dump (www.pypy.org)
| nu11ptr wrote:
| I used to use Mercurial as well and greatly preferred it, but for
| better or worse, Git won. I started using Git several years ago
| and haven't looked back.
|
| No matter what people might say, I think this stuff matters for
| contributors and users who might be looking at your project, and
| git/github is the typical expectation. This is likely the right
| decision, as they are now ubiquitous.
| vasco wrote:
| Same story for us, started with mercurial many years ago,
| eventually the tooling around git and just "using the standard"
| was too big to ignore and we migrated along with a bunch of
| other CI/CD and DevX improvements. Mercurial was cool, but
| lacking support meant little things like Jenkins having to
| "pull" 3x instead of 1x for git natively. Many of these little
| things meant that just using git generally saved us a bunch of
| work.
| lelandfe wrote:
| If you liked the interface of Mercurial, there is
| https://foss.heptapod.net/mercurial/hg-git
| jcranmer wrote:
| I've used that in the past, but it doesn't really work that
| well on large projects--it basically works by keeping a hg
| and a git version of the same repository, and storing a
| mapping between the two, which scales really poorly with
| multi-million commit repositories.
|
| What I really want is something that will let me use the
| interface of hg's power tools (revsets, phases, changeset
| evolution) on an existing git repository.
| lowbloodsugar wrote:
| That Torvalds's second biggest creation is now most closely
| associated with a Microsoft company gives me feelings.
| akerl_ wrote:
| Me too. It feels great to see such a clear example of
| building a successful business on an open source tool and
| ecosystem. Git is massively popular and actively used, GitHub
| built a huge community for discovering and interacting with
| open source projects, and since the acquisition, Microsoft-
| owned GitHub has continued improving their platform without
| breaking interop with the open spec.
|
| Everybody wins.
| 8organicbits wrote:
| > the script properly retained the issue numbers
|
| Oh that's quite helpful. I was worried about how lossy the
| migration would be.
| aeurielesn wrote:
| Why do people like Mercurial branches? Were they revamped? I hated
| them when I used them.
|
| By all means, I prefer Git branches.
| KwanEsq wrote:
| I mean you can't really compare them since git doesn't even
| _have_ branches as Mercurial understands them. git's branches
| would perhaps better be called twigs in comparison. git's
| lightweight branches better map to Mercurial's topics or
| bookmarks, though neither perfectly. And Mercurial has even
| lighter weight branches since you can just make a new head by
| committing without having to name anything, and it won't yell
| at you about a detached head like git will.
| notPlancha wrote:
| Pypy described the following in the FAQ:
|
| > The difference between git branches and named branches is not
| that important in a repo with 10 branches (no matter how big).
| But in the case of PyPy, we have at the moment 1840 branches.
| Most are closed by now, of course. But we would really like to
| retain (both now and in the future) the ability to look at a
| commit from the past, and know in which branch it was made.
| Please make sure you understand the difference between the Git
| and the Mercurial branches to realize that this is not always
| possible with Git-- we looked hard, and there is no built-in
| way to get this workflow.
|
| > Still not convinced? Consider this git repo with three
| commits: commit #2 with parent #1 and head of git branch "A";
| commit #3 with also parent #1 but head of git branch "B". When
| commit #1 was made, was it in the branch "A" or "B"? (It could
| also be yet another branch whose head was also moved forward,
| or even completely deleted.)
|
| In this post they say that "Github notes solves much of point
| (1): the difficulty of discovering provenance of commits,
| although not entirely"
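The FAQ's three-commit example can be modeled in a few lines. This is a toy sketch, not Git's or Mercurial's actual storage format: the names `parents`, `git_heads` and `hg_branch` are made up for illustration, and placing commit #1 on a branch called `default` in the Mercurial-style table is an assumption the FAQ does not state.

```python
# Toy model of the FAQ's three-commit repository.
parents = {1: [], 2: [1], 3: [1]}  # commit -> list of parent commits

# Git-style: a branch is only a movable pointer to a head commit.
git_heads = {"A": 2, "B": 3}

# Mercurial-style: every commit records the named branch it was made on.
# (Putting commit #1 on "default" is an assumption for this example.)
hg_branch = {1: "default", 2: "A", 3: "B"}


def ancestors(commit):
    """Return the commit itself plus everything reachable via parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def git_branches_containing(commit):
    """Names of branches whose head can reach the commit, sorted."""
    return sorted(
        name for name, head in git_heads.items() if commit in ancestors(head)
    )


print(git_branches_containing(1))  # ['A', 'B']
print(hg_branch[1])                # default
```

With heads alone, commit #1 belongs to both "A" and "B", so the question "which branch was it made on?" has no single answer. The per-commit label is the extra information the FAQ wants to keep.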
| dgfitz wrote:
| The question in your example seems odd to me. It can be
| interpreted as either 2 OR 3 unique branches depending on how
| you read it.
|
| There is either base branch A whose current head is commit #2
| / branch B with head of commit #3.
|
| OR
|
| Commit #1 is branch "default", commit #2 is branch "A" with commit
| #1 as its parent, and commit #3 is branch "B", also with commit #1
| as its parent.
|
| Consider your same example with forking instead of branching,
| how would the issue be resolved?
| jcranmer wrote:
| There are benefits to having branches be an inherent property
| of a commit as opposed to the Git model of a dynamic property
| of the graph.
|
| Suppose I have a branch A with three commits, and then I make
| another branch B on top of that with another few commits. The
| Git model essentially says that B consists of all commits that
| are an ancestor of B that aren't the ancestor of any other
| branch. But now I go and rebase A somewhere else--and as a
| result, B suddenly grew several extra commits on its branch
| because those commits are no longer on branch A. If I want to
| rebase B on the new A, well, those duplicated commits will
| cause me some amount of pain, pain that would go away if only
| git could remember that some of those commits are really just
| the old version of A.
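The rebase scenario described above can be sketched with a small graph model. Everything here is hypothetical: the commit names (`m1`, `a1`, `b1`, and the primed copies such as `a1'`), the extra `main` branch, and the helper `exclusive_commits` are assumptions chosen to mirror the description, not Git's real data structures.

```python
def ancestors(parents, commit):
    """Return the commit plus everything reachable through parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def exclusive_commits(parents, heads, branch):
    """Commits reachable from `branch` but from no other branch head."""
    others = set()
    for name, head in heads.items():
        if name != branch:
            others |= ancestors(parents, head)
    return sorted(ancestors(parents, heads[branch]) - others)


# main: m1 -> m2; branch A: three commits on m1; branch B: two more on top of A.
parents = {
    "m1": [], "m2": ["m1"],
    "a1": ["m1"], "a2": ["a1"], "a3": ["a2"],
    "b1": ["a3"], "b2": ["b1"],
}
heads = {"main": "m2", "A": "a3", "B": "b2"}
print(exclusive_commits(parents, heads, "B"))  # ['b1', 'b2']

# Rebase A onto m2: new copies are created and A's head moves to them.
parents.update({"a1'": ["m2"], "a2'": ["a1'"], "a3'": ["a2'"]})
heads["A"] = "a3'"
print(exclusive_commits(parents, heads, "B"))  # ['a1', 'a2', 'a3', 'b1', 'b2']
```

Nothing about B changed, yet B's set of "own" commits grew from two to five, because the old A commits are no longer reachable from any other head.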
| sampo wrote:
| > If I want to rebase B on the new A, well, those duplicated
| commits will cause me some amount of pain
|
| Not really. Git will recognize commits that produce an
| identical diff and skip them. Your only pain will be that for
| each skipped commit, you will see a notification line in the
| output of your `git rebase`:
|
|     warning: skipped previously applied commit <hash>
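The idea of recognizing "the same change" regardless of where it sits in history can be sketched as a hash over the diff with position-dependent lines removed. This is a deliberately simplified stand-in, not the algorithm behind `git patch-id` or `git rebase`: the function `simple_patch_id`, the helper `make_diff`, and the sample diffs are all assumptions for illustration.

```python
import hashlib


def simple_patch_id(diff_text):
    """Hash a unified diff, ignoring blob ids, hunk positions and whitespace."""
    kept = []
    for line in diff_text.splitlines():
        if line.startswith("index ") or line.startswith("@@"):
            continue  # these lines change when the commit is replayed elsewhere
        kept.append("".join(line.split()))  # strip all whitespace
    return hashlib.sha1("\n".join(kept).encode("utf-8")).hexdigest()


def make_diff(index_line, hunk_header, added_line):
    """Build a small unified diff for the hypothetical file f.py."""
    return "\n".join([
        "diff --git a/f.py b/f.py",
        index_line,
        "--- a/f.py",
        "+++ b/f.py",
        hunk_header,
        " x = 1",
        "+" + added_line,
        " z = 3",
    ])


# The same change before and after being replayed on a different base.
original = make_diff("index 1111111..2222222 100644", "@@ -10,2 +10,3 @@", "y = 2")
rebased = make_diff("index 3333333..4444444 100644", "@@ -42,2 +42,3 @@", "y = 2")

print(simple_patch_id(original) == simple_patch_id(rebased))  # True
```

A tool that compares such ids can tell that a commit's change is already present upstream and skip it, which is the behavior the warning message reports.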
| TheRealPomax wrote:
| Better late than never. Here's hoping that means things like `pip
| publish` are back on the table, too.
| simonw wrote:
| Are you confusing PyPI and PyPy? Easily done!
| judge2020 wrote:
| Git should be a pretty easy-to-federate system, at least in terms
| of mimicking pull requests. Is there anything that tries to do
| so? Gitea?
| aquatica wrote:
| iirc GitLab is trying to federate using ActivityPub
|
| Edit: yes,
| https://docs.gitlab.com/ee/architecture/blueprints/activity_...
| matrss wrote:
| ForgeFed sounds promising: https://forgefed.org/
| RheingoldRiver wrote:
| Codeberg is also working on federating, or maybe they already
| do. My experience using them was quite unpleasant, though,
| they're very feature-incomplete.
| sidkshatriya wrote:
| I've been using git happily for many years. Strangely enough the
| provenance of a commit, i.e. which branch a commit originally came
| from, has not really mattered to me very much. Mercurial provides
| this and they are using `git notes` to add this provenance meta-
| data to each commit during migration to git.
|
| I would have thought I'd need this much more, but I have not. In
| plain git I'll just `git log` and grep for the commit in case I
| want to make sure a commit is available in a certain branch.
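Both checks mentioned above can be scripted. The sketch below wraps two existing Git commands, `git merge-base --is-ancestor` and `git notes show`, using Python's `subprocess`. The function names are hypothetical, and it assumes any provenance notes live under Git's default notes ref; the thread does not say which ref or note format the PyPy migration used.

```python
import subprocess


def is_ancestor_cmd(commit, branch):
    """Command that exits 0 if `commit` is reachable from `branch`, 1 if not."""
    return ["git", "merge-base", "--is-ancestor", commit, branch]


def notes_cmd(commit):
    """Command that prints the note attached to `commit` (default notes ref)."""
    return ["git", "notes", "show", commit]


def commit_is_on_branch(commit, branch, repo="."):
    """Return True if `branch` contains `commit`; raise on other Git errors."""
    result = subprocess.run(is_ancestor_cmd(commit, branch), cwd=repo)
    if result.returncode not in (0, 1):
        raise subprocess.CalledProcessError(result.returncode, result.args)
    return result.returncode == 0


def provenance_note(commit, repo="."):
    """Return the note text for `commit`, or None if Git reports a failure."""
    result = subprocess.run(
        notes_cmd(commit), cwd=repo, capture_output=True, text=True
    )
    return result.stdout.strip() if result.returncode == 0 else None


# Example usage inside a clone (commit id and branch name are placeholders):
#   commit_is_on_branch("abc1234", "main")
#   provenance_note("abc1234")
```

The first helper answers "is this commit available on that branch?" without grepping `git log`; the second reads back whatever metadata was stored as a note.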
| g-b-r wrote:
| The point is giving branches a meaning (e.g. "implementation of
| this feature") and being able to at least keep the information
| that such commit was part of that (well at least that's why I'd
| want Mercurial's named branches, I'm not sure that's how this
| project used them)
| heads wrote:
| When I want to see inside a piece of software I look for (1)
| the source code; (2) the git-blame; (3) the code review for
| significant commits. I have never wanted to see into the
| history before that point, namely how the developer drafted and
| polished their idea prior to the final code review approval.
|
| What practical use case am I missing out on when these work-in-
| progress draft commits are lost? I can't see one.
| eikenberry wrote:
| Wouldn't good merge commit conventions work to preserve as much
| of this sort of information as desired? All the commits of the
| branch contained in it with the merge commit message
| preserving that info.
| csk111165 wrote:
| But what about compatibility, is it fully compatible with Git??
| How will the contribution work flow change?
| SushiHippie wrote:
| > Open Source has become synonymous with GitHub, and we are too
| small to change that.
|
| It's kind of sad that this is true.
|
| I'm guilty myself, I contribute to projects on GitHub more often
| than on any other platform.
|
| And when I search for open source projects the first page I use
| is GitHub.
| enriquto wrote:
| Be part of the solution, not part of the problem. You can use
| some other forge and keep an up-to-date github repository as a
| read-only front.
| SushiHippie wrote:
| I don't host my own repositories on GitHub, I host a gitea
| instance myself.
| clintonb wrote:
| What exactly is the problem?
| politelemon wrote:
| I will admit, I've often searched for $project_name github to
| get to their repositories. It shouldn't matter but it's just a
| force of habit now.
|
| That said I do feel some joy when I see a project on Gitlab and
| am happy to contribute there, eg FDroid.
| gdevenyi wrote:
| A lot of this is SourceForge's fault.
|
| They had a sizable lead and completely bungled it.
| bastawhiz wrote:
| It's not just that they dropped the ball, they actively
| sabotaged whatever goodwill they had built by adding malware
| to software. Not only was this a massive hassle, it ruined
| the reputation of lots of FOSS projects with folks who just
| wanted to use some of the most popular consumer-ish open
| source software like Filezilla.
|
| While SF was crapping where they eat, GitHub built a lot of
| trust and goodwill with a lot of people.
| NooneAtAll3 wrote:
| I've found Codeberg to be a good substitute.
| atticora wrote:
| > foss.heptapod.net is not well indexed in google/bing/duckduckgo
| search, so people find it harder to search for issues in the
| project.
|
| SEO : WWW structure :: gravity : orbital mechanics
| Cupprum wrote:
| What does this mean?
| atticora wrote:
| The web is shaped by the needs of the indexes as the solar
| system is shaped by gravity.
| bluish29 wrote:
| If you are confused (like me) that this was about PyPI (Python
| packages repository) then no. It is about a project called PyPy
| (one can argue it is bad name) that is an implementation of the
| Python interpreter that does not use CPython. Instead they rely on a
| JIT compiler. And it is syntax compatible but if your code uses
| any library or method relying on C extensions then you are out of
| luck (Goodbye NumPy.. etc).
|
| Edit: They have C-layer emulation, but I don't know its
| limitations or current status, but you can use those libraries
| [1][2]
|
| [1] https://www.pypy.org/posts/2018/09/inside-cpyext-why-
| emulati...
|
| [2] https://pythoncapi.readthedocs.io/cpyext.html
| giovannibajo1 wrote:
| to be fair, PyPy predates PyPI
| d-cc wrote:
| Now all we need is a PiPy and we'll have all the pies
| dpflan wrote:
| PiPi would complete the set but that would be (doubly)
| irrational...
| cdjk wrote:
| I think you mean transcendental.
| dpflan wrote:
| True, perhaps that would be how the pie would taste:
| truly transcendental, but I don't think it could ever be
| finished.
| d-cc wrote:
| THANK YOU. This was my first reaction as well.
| pletnes wrote:
| They do support numpy. The pypy name predates pypi. You're off
| on multiple details here.
| pythux wrote:
| Nit. I believe you can use numpy and at least some other
| libraries relying on native extensions but performance might
| vary: https://doc.pypy.org/en/latest/faq.html#should-i-install-
| num...
| CogitoCogito wrote:
| > one can argue it is bad name
|
| Given the way that pypy is implemented, I think the name is
| quite clever really.
| fumeux_fume wrote:
| The packages repo is known as PyPI (like Py P.I.), not PyPi.
| Waterluvian wrote:
| PyPy being a Python JIT written in Python with an ouroboros as
| a logo is pretty much the perfect name.
| 1letterunixname wrote:
| Speaking of git, for mega monorepo performance, we're gonna need
| synthetic FSes and SCM-integrated synthetic checkouts. Sapling
| (was hg in the past but was forked and reworked extensively) will
| be able to do this if EdenFS is ever released, but Git will
| need something similar. This will require a system agent running
| with a caching overlay fs that can grab and cache bits on-the-
| fly. Yes, it's slightly slower than having contents already, but
| there is no way to checkout a 600+ GiB repo on a laptop with a
| 512 GiB SSD.
| filmgirlcw wrote:
| That already exists. It's called Scalar[1] and it has been
| built into Git since October 2022[2], dates back to 2020[3] and
| is the spiritual successor of something Microsoft was using as
| far back as 2017[4].
|
| 1. https://git-scm.com/docs/scalar
|
| 2. https://github.blog/2022-10-13-the-story-of-scalar/
|
| 3. https://devblogs.microsoft.com/devops/introducing-scalar/
|
| 4. https://devblogs.microsoft.com/bharry/the-largest-git-
| repo-o...
| aseipp wrote:
| Scalar explicitly does not implement the virtualized
| filesystem the OP is referring to. The original Git VFS for
| Windows that Microsoft designed did in fact do this, but as
| your third link notes, Microsoft abandoned that in favor of
| Scalar's totally different design which explicitly was about
| scaling repositories _without_ filesystem virtualization.
|
| There's a bunch of related features they added to Git to
| achieve scalability without virtualization, including the
| Scalar daemon which does background monitoring and
| optimization. Those are all useful and Scalar is a welcome
| addition. But the need for a virtual filesystem layer for
| large-scale repositories is still a very real one. There are
| also some limitations with Git's existing solutions that
| aren't ideal; for example Git's partial clones are great but
| IIRC can only be used as a "cone" applied to the original
| filesystem hierarchy. More generalized designs would allow
| mapping arbitrary paths in the original repository to any
| other path in the virtual checkout, and synchronizing between
| them. Tools like Josh can do this today with existing Git
| repositories[1].
|
| The Git for Windows that was referenced isn't even that big
| at 300GB, either. That's well within the realm of single
| machine stuff. Game studios regularly have repositories that
| exist at multi-terabyte size, and they have also converged on
| similar virtualization solutions. For example, Destiny 2 uses
| a "virtual file synchronization" layer called VirtualSync[2]
| that reduced the working size of their checkouts by over 98%,
| multiple terabytes of savings per person. And in a twist of
| fate, VirtualSync was implemented thanks to a feature called
| "ProjFS" that Microsoft added to Windows... which was
| motivated originally by the Git VFS for Windows they
| abandoned!
|
| [1] https://github.com/josh-project/josh
|
| [2] https://www.gdcvault.com/play/1027699/Virtual-Sync-
| Terabytes...
| throwawaaarrgh wrote:
| Every provider out there can talk a standard Git protocol, but
| all the features that don't have a standard Git protocol become a
| proprietary API. I think if Git (or a project like it) made a
| standard protocol/data format for all the features of a SCM, then
| all those providers could adopt it, and we could start moving
| away from GitHub as the center of the known universe. If we don't
| make a universal standard (and implementation) then it'll remain
| the way it is today.
| juped wrote:
| This is a tragic, wrongheaded move, and I say that as a big Git
| enthusiast (but a Github hater, to be fair...)
|
| I don't think PyPy gains anything from this, not even a reduction
| in the annoying messages that have been psychologically torturing
| the maintainers. If anything, you're just opening yourself up to
| more common and frequent low-investment pestering.
___________________________________________________________________
(page generated 2024-01-01 23:01 UTC)