[HN Gopher] The Changing "Guarantees" Given by Python's Global I...
       ___________________________________________________________________
        
       The Changing "Guarantees" Given by Python's Global Interpreter Lock
        
       Author : abhi9u
       Score  : 95 points
       Date   : 2023-11-17 13:03 UTC (9 hours ago)
        
 (HTM) web link (stefan-marr.de)
 (TXT) w3m dump (stefan-marr.de)
        
       | dfox wrote:
       | Code that assumes that something is going to be atomic because of
       | the GIL (or any other implementation detail) is simply broken. If
       | you need something to be atomic you should be explicit about that
       | and use a mutex or something.
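
A minimal sketch of the explicit approach suggested above, assuming a shared counter that several threads update. The names `SafeCounter` and `run_workers` are hypothetical and exist only for illustration; `threading.Lock` is the standard-library mutex.

```python
import threading


class SafeCounter:
    """Counter whose increments are protected by an explicit lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._value = 0

    def increment(self) -> None:
        # The whole read-modify-write runs inside the critical section, so
        # correctness does not depend on how the interpreter schedules threads.
        with self._lock:
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def run_workers(counter: SafeCounter, n_threads: int, n_increments: int) -> int:
    """Increment `counter` from several threads and return the final value."""

    def work() -> None:
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, the final count should equal `n_threads * n_increments` regardless of whether the interpreter has a GIL.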
        
         | bheadmaster wrote:
         | True, but when the majority of an ecosystem relies on an
         | implementation detail, that implementation detail becomes a
         | de facto standard.
        
           | darklion wrote:
           | Just because everybody does something the wrong way doesn't
           | somehow make it magically correct.
           | 
           | One study [1] in the US released in 2020 found almost 90% of
           | people admitted to speeding, but I don't think anyone would
           | say that speeding is now approved by the authorities and
           | consequence-free.
           | 
           | [1] https://www.thezebra.com/resources/research/speeding-car-
           | ins...
        
             | bheadmaster wrote:
             | > Just because everybody does something the wrong way
             | doesn't somehow make it magically correct.
             | 
             | Depends on the definition of "correct".
             | 
             | One way to define "correct" is "as defined by the
             | standard". However, a standard serves a purpose - to maintain
             | interoperability between components, so the other, deeper,
             | way to define "correct" is "interoperable with the
             | ecosystem".
             | 
             | The official standard can argue that relying on the
             | implementation detail is incorrect because they didn't
             | specify it explicitly, but they would go against the grain
             | of the rest of the ecosystem. Which reminds me of an old
             | joke:
             | 
             | An old woman is watching the news. She sees a news report
             | saying there is a car driving in the wrong direction on the
             | highway. So the old woman calls up her husband.
             | 
             | Old woman: be careful on the highway dear, there is a crazy
             | driver on the highway driving the wrong way!
             | 
             | Husband: It's not just one car, it's hundreds of them!
        
               | nerdponx wrote:
               | Lest we forget that Python doesn't even have a language
               | standard.
               | 
               | The "standard" is a combination of PEPs, the Python docs,
               | and the CPython implementation.
               | 
               | Implementation details _are_ language features, because
               | implementation details are the standard. Python programs
               | that rely on the GIL are not "wrong". Such programs are
               | relying on clearly-documented features of the system that
               | they use.
               | 
               | Is it "wrong" to use GCC-specific features in a C
               | program, if you know that you only intend to target GCC?
        
               | usrbinbash wrote:
               | > Implementation details are language features, because
               | implementation details are the standard. Python programs
               | that rely on the GIL are not "wrong".
               | 
               | The problem here is: Implementation details are not
               | guaranteed to be stable. They can change with every
               | release.
               | 
               | So if I write code that relies on any particular
               | implementation detail, it may be correct today, and may
               | be wrong two weeks from now, even though I didn't change
               | anything.
        
               | samus wrote:
               | GCC-specific features are explicitly documented in their
               | manual, and they can and should be used if one is fine
               | with one's program being tied to that compiler. Best
               | example: the Linux kernel.
               | 
               | On the other hand, Linux kernel developers have also been
               | burned by GCC changing what it does regarding _undefined
               | behavior_ (the infamous not-so-corner cases of C/C++).
               | They expect GCC to be sort of like an assembler, which
               | does the most straightforward thing in case of
               | ambiguities. Instead, GCC is an optimizing compiler that
               | is expected to deliver high-performance code for general-
               | purpose programs. It treats undefined behavior as
               | opportunities to enact optimizations. What it actually
               | does is also usually documented, but also here different
               | people have different expectations about what that
               | implies.
               | 
               | The GIL is an implementation choice that was put in place
               | by Python's developers for simplicity's sake. It is
               | important to be aware of it because it has huge
               | performance implications, but it is unwise to rely on it
               | for semantics. Especially since it has not exactly been a
               | secret that various parties would eventually like it to
               | be removed. Anyways, the GIL has very little to do with
               | Python's semantics. User code is racy with or without the
               | GIL, which is actually the point of TA.
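
To illustrate the point above that user code can be racy with or without the GIL, here is a small sketch that uses the standard `dis` module to list the bytecode behind `counter += 1`. The helper name `opnames` is hypothetical. Exact opcode names vary between CPython versions, and whether a thread switch can occur between two particular instructions depends on the interpreter version, so treat this as an inspection aid rather than a statement of guarantees.

```python
import dis

counter = 0


def increment() -> None:
    """Looks like one step in source, but compiles to several bytecodes."""
    global counter
    counter += 1


def opnames(func) -> list[str]:
    """Return the names of the bytecode instructions that make up `func`."""
    return [instr.opname for instr in dis.get_instructions(func)]


if __name__ == "__main__":
    # The load of `counter` and the store back to it are separate
    # instructions, so the update is a read-modify-write sequence.
    print(opnames(increment))
```

Because the load and the store are distinct instructions, another thread that updates `counter` in between would have its update overwritten, which is why an explicit lock is the safer choice.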
        
             | hardware2win wrote:
             | Nobody says it is correct
             | 
             | Just hard as hell to change
        
             | gjvc wrote:
             | _Just because everybody does something the wrong way
             | doesn't somehow make it magically correct._
             | 
             | Hyrum's Law says the opposite:
             | 
             | https://www.hyrumslaw.com/
             | 
             | _"With a sufficient number of users of an API, it does
             | not matter what you promise in the contract: all observable
             | behaviors of your system will be depended on by somebody."_
             | 
             | Named after some guy called Hyrum who worked / works at
             | Google.
        
               | verteu wrote:
               | Agreed. It's also "the first rule of kernel
               | maintenance," according to Linus:
               | https://linuxreviews.org/WE_DO_NOT_BREAK_USERSPACE
        
             | marcinzm wrote:
             | > One study [1] in the US released in 2020 found almost 90%
             | of people admitted to speeding, but I don't think anyone
             | would say that speeding is now approved by the authorities
             | and consequence-free.
             | 
             | I would say jaywalking in NYC is a much better example
             | because breaking the law doesn't kill others except in rare
             | cases. Jaywalking is illegal. It's also not enforced for
             | all intents and purposes. Every attempt to enforce it has
             | led to such loud public outcry that it was stopped.
        
             | HDThoreaun wrote:
             | Speeding absolutely is approved by the authorities and
             | consequence free where I live.
        
             | citrin_ru wrote:
             | If 90% are speeding on a given highway / road then the
             | speed limit is likely unreasonably low. Traffic accidents
             | that a speed limit could prevent are typically caused by
             | the fastest 10% of the speed distribution, not by the
             | remaining 90%.
        
           | usrbinbash wrote:
           | Maybe, but "de-facto" and "actual" are still not the same,
           | and if I write code that relies on them being the same, then
           | I have only myself to blame when my code breaks.
           | 
           | E.g., there are C compilers that usually zero most allocated
           | structures. That's an implementation detail however, not a
           | feature of C. C doesn't guarantee you zeroed memory anywhere,
           | and code that assumes otherwise is just one compilation away
           | from a disaster.
        
           | samus wrote:
           | The vendor might not agree with that and change the
           | implementation detail at their convenience. In absence of a
           | prescriptive standard, a reference could be reasonably
           | trusted, but relying on only sort-of-documented
           | implementation details is a kind of tech debt.
        
         | BiteCode_dev wrote:
         | The GIL is a feature. Not using a feature that makes sense in
         | our context is counterproductive.
        
           | nicolaslem wrote:
           | The GIL is an implementation detail of CPython, it's not part
           | of Python the language.
        
             | colanderman wrote:
             | It's been a while since I've followed Python closely but
             | last I heard Python-the-language is _de facto_ defined as
             | "whatever CPython does". Has Python since grown a proper
             | language specification?
        
               | formerly_proven wrote:
               | No.
        
               | dragonwriter wrote:
               | Python doesn't have a full specification, but it does
               | have first party documentation that distinguishes between
               | CPython implementation details and language guarantees,
               | in part to support alternative implementations.
        
               | pdonis wrote:
               | _> Python doesn't have a full specification_
               | 
               | This doesn't count?
               | 
               | https://docs.python.org/3/reference/index.html
        
               | squeaky-clean wrote:
               | The introduction section states that it's not a
               | complete/exact specification. If you're using a Python
               | implementation and have a question, this should answer
               | it. If you're writing a Python implementation and have a
               | question, this may not answer it.
               | 
               | > Consequently, if you were coming from Mars and tried to
               | re-implement Python from this document alone, you might
               | have to guess things and in fact you would probably end
               | up implementing quite a different language. On the other
               | hand, if you are using Python and wonder what the precise
               | rules about a particular area of the language are, you
               | should definitely be able to find them here. If you would
               | like to see a more formal definition of the language,
               | maybe you could volunteer your time -- or invent a
               | cloning machine :-).
        
               | pdonis wrote:
               | _> If you're writing a Python implementation and have a
               | question, this may not answer it._
               | 
               | Depends on the question. If your question is, say, "how
               | should I implement the built-in types", yes, the language
               | reference won't answer that question. But if your
               | question is "what does my implementation have to do to
               | count as an implementation of the Python language", then
               | yes, the language reference _does_ answer that question--
               | since the language reference is what _defines_ the Python
               | language.
               | 
               | To me that _is_ a "specification" of the Python
               | language. It's not a specification of the
               | _implementation_ of the language, but why should that be
               | required for something to be considered a language
               | specification? The whole _point_ is to specify what is
               | required to define the language _without_ specifying
               | every detail of the implementation.
        
               | pdonis wrote:
               | _> if you were coming from Mars and tried to re-implement
               | Python from this document alone, you might have to guess
               | things and in fact you would probably end up implementing
               | quite a different language_
               | 
               | Yes, I know that statement is in the Introduction, but I
               | think it's rather ill-considered. If my implementation is
               | consistent with the language reference, on what grounds
               | would someone claim it was _not_ an implementation of
               | Python but a "different language"?
        
               | uxp8u61q wrote:
               | On the grounds that we live in the real world, not a
               | world where anything written down is a true and complete
               | description of reality. "Technically correct" is
               | sometimes a synonym of "incorrect".
        
               | pdonis wrote:
               | By your criterion, no programming language has a language
               | specification at all. I don't think that's a useful way
               | to look at things.
               | 
               | I would like to see some indication from the people who
               | say that the Python Language Reference I gave a link to
               | does _not_ qualify as a "language specification", of what
               | _would_ qualify. Specific examples would be nice.
        
               | dragonwriter wrote:
               | I would say that that is intended as documentation for
               | users (including implementors of tools targeting the
               | language), not a specification for implementors of
               | Python, but I would agree that it is largely usable in
               | either role; my point in the GP was that something
               | distinct called "the Python Language Specification"
               | doesn't exist, but that a (not necessarily complete, from
               | a language implementor's perspective) specification
               | distinct from the implementation behavior of CPython does
               | effectively exist in the documentation.
        
               | pdonis wrote:
               | _> not a specification for implementors of Python_
               | 
               | I don't see why not. The "Introduction" section
               | specifically mentions different implementations and
               | distinguishes implementation details, which can vary by
               | implementation, from the language reference itself, which
               | defines what every implementation has to meet to be
               | considered an implementation of the Python language.
               | 
               | _> my point in the GP was that something distinct called
               | "the Python Language Specification" doesn't exist_
               | 
               | And that's the point I'm disputing; AFAIK the language
               | reference I linked to _is_ that something distinct, even
               | if it isn't called a "Language Specification" but
               | instead a "Language Reference". Either way it defines
               | what the Python language is.
        
               | dragonwriter wrote:
               | So, we rather explicitly agree that Python has nothing
               | called a "language specification", but that its published
               | first party documentation includes what is, functionally,
               | a specification of the language distinct from CPython
               | implementation details?
               | 
               | Not sure why there is an argument here.
        
               | BerislavLopac wrote:
               | This implies that, if CPython differs from that
               | specification in any way, it is not, in fact, Python.
               | What have I been using all these years, I wonder?
        
               | pdonis wrote:
               | _> we rather explicitly agree that Python has nothing
               | called a "language specification"_
               | 
               | No, we don't. I have already said explicitly that I think
               | the Python Language Reference _is_ such a thing. (I am
               | ignoring quibbles about it being called a "reference"
               | instead of a "specification".) If you think it isn't,
               | why? What _does_ count as a "language specification" in
               | your view? Do you have any specific examples that you can
               | contrast with Python?
        
             | friendzis wrote:
             | AFAIK, there is no Python language specification, therefore
             | the implementation details of CPython IMO _are_ the language.
        
               | pdonis wrote:
               | _> AFAIK, there is no Python language specification_
               | 
               | Yes, there is:
               | 
               | https://docs.python.org/3/reference/index.html
               | 
               | This specification does not mention the GIL anywhere,
               | which means it is, as the GP said, an implementation
               | detail. Other implementations of Python that do _not_
               | have the GIL are still "Python" implementations because
               | they meet this language specification.
        
               | Kranar wrote:
               | That is not a specification. A specification is a
               | prescriptive document of how a system is required to work
               | whereas a reference is a descriptive document of how a
               | system happens to currently work. Python has a reference,
               | in fact it has many references that even contradict one
               | another in subtle ways, but it does not have a
               | specification.
        
               | eesmith wrote:
               | What advantages would there be to that document?
               | 
               | I mean, I can see the point if there are multiple
               | commercial competitors in the market, as there is with
               | C/C++, or if the implementation is proprietary and the
               | users want to avoid vendor lock-in.
               | 
               | But the Minimal BASIC of ANSI X3.60-1978 never did catch
               | on for any of the BASICs I used in the 1990s, and the
               | Full BASIC of ANSI X3.113-1987 was a flop, so clearly
               | it's possible to put a lot of time into a standard only
               | to have it be irrelevant.
        
               | Kranar wrote:
               | I never mentioned any advantage or disadvantage. I am
               | only stating a fact about the current state of Python.
               | 
               | Python does not have a specification or a standards
               | document, it has a reference that describes how Python
               | happens to work and the reference is a great resource for
               | people to familiarize themselves with the language but it
               | should simply be clear that its purpose is to reflect the
               | existing state of the language rather than to specify how
               | Python works.
        
               | pdonis wrote:
               | Then why are different implementations of Python all
               | implementations of Python? What makes them
               | implementations of Python rather than different
               | languages?
               | 
               | I've already given my answer: they all meet the language
               | reference I linked to (yes, the word "reference" appears
               | in its title, not "specification"; that's just another
               | quibble).
               | 
               | What is your answer?
        
               | Kranar wrote:
               | Haha, no they don't all meet the language reference you
               | linked to and if you used them you'd know that!
               | 
               | Python is a lot more like LISP than it is C++. There are
               | many flavors of Python, from MicroPython to GraalPython
               | to Cyston and literally dozens of them. They most
               | certainly do not all meet the language reference you
               | linked to and they all have quirks here and there.
               | 
               | A language does not need a specification in order to
               | exist or to have a name. What matters is that people use
               | it and get work done with it, and ultimately if it looks
               | like a Python and quacks like a Python, then it's fine to
               | call it a Python.
        
               | eesmith wrote:
               | Why then did you think it relevant to point out that
               | Python does not have a prescriptive document if you
               | didn't think it was somehow useful?
               | 
               | Here's another fact: Python doesn't have its own typeface
               | either. Yet that fact is hardly germane to the thread.
               | 
               | A prescriptive document is a formal specification. The
               | Python Language Reference is a less formal specification.
               | It's still a specification.
               | 
               | It is also not a single document, but then again even a
               | formal specification may incorporate other specifications
               | by reference.
        
               | Kranar wrote:
               | Because the original comment was about the
               | appropriateness of writing Python code according to the
               | reference implementation as opposed to writing Python
               | code strictly according to the reference document. Some
               | people are arguing that the reference documentation is in
               | some sense authoritative and the only resource that
               | should be used to define Python's semantics. In
               | particular the issue at hand is whether the GIL is a part
               | of the semantics of a Python program.
               | 
               | As my position is that the CPython implementation is the
               | reference implementation for Python and as the GIL is an
               | integral part of that implementation, then the GIL does
               | form a part of Python's semantics regardless of whether
               | the reference documentation mentions it or not.
               | 
               | The Python reference documentation is not a specification
               | and it's not intended to be one.
        
               | pdonis wrote:
               | _> That is not a specification._
               | 
               | You're quibbling. By your definition, _no_ programming
               | language has a "specification". Every language has
               | implementations that do things that aren't explicitly
               | described in any document.
        
               | Kranar wrote:
               | Plenty of languages have a specification. C++, Java, C#
               | all do.
               | 
               | Some languages do not, such as Python and Rust.
               | 
               | The ISO C++ committee even uses quite strong language
               | about the C++ standard and makes it a point to
               | differentiate between the C++ specification and C++
               | references:
               | 
               | >The standard is not intended to teach how to use C++.
               | Rather, it is an international treaty - a formal, legal,
               | and sometimes mind-numbingly detailed technical document
               | intended primarily for people writing C++ compilers and
               | standard library implementations.
        
               | pdonis wrote:
               | Does every C++ implementation do exactly what is in the
               | C++ language specification, no more, no less?
               | 
               | Same question for Java and C#.
               | 
               | If the answer to all of these questions is "no", as I
               | believe it is, on what grounds do you claim that these
               | languages have a specification, while Python and Rust do
               | not?
        
               | Kranar wrote:
               | Yes the C++ and Java implementations do exactly what is
               | in the language specification. The C++ specification
               | explicitly allows implementations to do more, but they
               | cannot do less. I believe Java has a similar clause but I'm not
               | sure.
               | 
               | The grounds that I claim is that you can read them, here
               | they are:
               | 
               | https://docs.oracle.com/javase/specs/
               | 
               | https://isocpp.org/files/papers/N4860.pdf
               | 
               | Note what the actual C++ standard states, and I quote:
               | 
               | >This document specifies requirements for implementations
               | of the C++ programming language. The first such
               | requirement is that they implement the language, so this
               | document also defines C++. Other requirements and
               | relaxations of the first requirement appear at various
               | places within this document
        
               | pdonis wrote:
               | _> the C++ and Java implementations do exactly what is in
               | the language specification_
               | 
               | The number of Google hits I get when I search on "C++
               | implementations that do not meet the language
               | specification" does not seem to support this claim.
               | 
               | _> Note what the actual C++ standard states_
               | 
               | But your position with respect to Python is that it
               | doesn't matter what the "standard" states because the
               | actual definition of the language is in its reference
               | implementation. Why are you now shifting your ground?
        
               | samus wrote:
               | The C and C++ specifications are rather infamous about
               | being incomplete, i.e., containing features that might
               | lead to _undefined behavior_. On the other hand, Java is
               | quite comprehensive. It is one of the earliest instances
               | of languages specifying a memory model.
        
               | samus wrote:
               | Whether it is a specification or a reference doesn't
               | really matter for this discussion. What matters is that
               | it is the closest thing to a specification that exists
               | for Python. Assuming anything beyond it ties the program
               | to internals of a specific implementation. Internals for
               | which the vendor might have different intentions about
               | keeping stable or changing than users expect.
        
               | Kranar wrote:
               | It's absolutely pertinent to this discussion, this whole
               | discussion is about whether the GIL is part of the
               | semantics of a Python program or not. My position is that
               | it is because the GIL is a part of the CPython reference
               | implementation.
               | 
               | Others object saying that the Python reference document
               | is what specifies its semantics, not the reference
               | implementation.
               | 
               | My position is that both the CPython reference
               | implementation and the reference documentation are valid
               | sources that document Python's semantics and that they
               | both can and should be used.
        
               | samus wrote:
               | Both of the former provide stronger promises about how
               | likely changes in the future are. For the former, user
               | input and coordination is absolutely required.
               | 
               | Implementation details are documented to allow
               | performance optimizations and to give insight into why
               | certain things are how they are. Therefore, users can
               | reasonably expect that the vendor won't cause performance
               | regressions for existing code. However, it is unwise to
               | derive semantics from them, even if they are technically
               | documented in that way.
               | 
               | One of the biggest disadvantages of relying on
               | implementation details is that it makes it way more
               | troublesome for the vendor to maintain and improve the
               | product.
               | 
               | Anyways, the GIL and the presence of possible concurrency
               | bugs are completely orthogonal things as the GIL has
               | always only served to prevent corruption of runtime data
               | structures, not of user code.
        
               | pdonis wrote:
               | _> this whole discussion is about whether the GIL is part
               | of the semantics of a Python program or not. My position
               | is that it is because the GIL is a part of the CPython
               | reference implementation._
               | 
               | And my position is that it is not because there are other
               | implementations of Python that do not have it, and which
               | everybody agrees are implementations of Python.
               | 
               | Not only that, but the very language reference that I
               | linked to explicitly distinguishes CPython implementation
               | details from the language itself. So the Python dev team
               | does not appear to agree with your position.
        
             | nerdponx wrote:
             | There is no language standard. The GIL is a Python language
             | feature _because_ it's a CPython feature.
        
               | eesmith wrote:
               | The GIL is not a required language feature. Neither
               | Jython nor IronPython have a GIL.
               | 
               | What would it mean to have a language standard? A
               | publication from ISO or ECMA?
               | 
               | I ask because the Python Language Reference at
               | https://docs.python.org/3/reference/index.html seems to
               | be a (terse) language standard. Among other things, it
               | highlights some of the things which are implementation
               | defined, rather than language defined.
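
Since the comment above notes that some implementations have no GIL, a program that cares can inspect its runtime instead of assuming. This is a rough sketch: `describe_runtime` is a hypothetical helper, `sys.implementation.name` is standard, and `sys._is_gil_enabled()` is assumed to be available only on newer CPython releases (3.13 and later), so the code probes for it rather than calling it directly.

```python
import sys


def describe_runtime() -> dict:
    """Report which Python implementation is running and, if known, GIL status."""
    info = {"implementation": sys.implementation.name}
    # sys._is_gil_enabled() is only present on newer CPython releases;
    # on anything else we report None rather than guess.
    probe = getattr(sys, "_is_gil_enabled", None)
    info["gil_enabled"] = probe() if callable(probe) else None
    return info


if __name__ == "__main__":
    print(describe_runtime())
```

A result of `None` for `gil_enabled` means only that the runtime does not expose the probe, not that a GIL is absent.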
        
             | BiteCode_dev wrote:
             | Both can be true, especially since the implementation has
             | been the same for 20 years.
        
           | usrbinbash wrote:
           | No it isn't.
           | 
           | It's an implementation detail, and relying on those for
           | functionality is a great way of getting one's code to break.
        
         | o11c wrote:
         | It depends on what kind of "atomic".
         | 
         | Assuming sequential consistency is pretty broken. Assuming
         | acquire/release atomicity is much more reasonable. Assuming "at
         | least relaxed" is outright mandatory.
        
         | blamestross wrote:
         | Most people and scripts using the GIL to ensure safety likely
         | don't even realize it is necessary. That is the hard part of
         | this type of migration. Decades of code written with the
         | implicit assumption of the GIL for all situations.
        
           | Gabrys1 wrote:
           | I don't think people generally write code to use this or
           | another guarantee of the implementation. They write code,
           | they run it, it seems to work (on my machine) and that's it.
           | Later, the implementation changes in a subtle way (or someone
           | tries to run the code in an incompatible version) and the
           | code stops working. Nothing to do with "relying on GIL". The
           | same may apply to relying on dict key order, timing of
           | things, or anything else. That's why we have race conditions,
           | Docker images, freezing dependencies, "only works in IE",
           | etc.
           | 
           | Writing to spec is very rare in my experience. You usually
           | learn the spec once someone tells you your code doesn't work
           | on XYZ.
        
         | mikepurvis wrote:
         | I dunno about that. There's a lot of Python out there with
         | worker threads dumping their output into a shared "results"
         | dict on the assumption that those insert operations are atomic.
        
           | formerly_proven wrote:
           | Relying on implementation behavior, even if it's very visible
           | and hasn't changed for a few decades, is morally wrong and
           | reprehensible. Only the letter of the standard has any
           | bearing on correct reality.
        
             | HDThoreaun wrote:
             | > morally wrong
             | 
             | That's a wild one
        
               | smegsicle wrote:
               | woe is man who loses sight of god and sees only himself
        
               | cozzyd wrote:
               | clearly a user of https://github.com/munificent/vigil
        
               | m463 wrote:
               | lol
               | 
               | from FAQ:
               | 
               | Q: But isn't a language that deletes code crazy?
        
             | Kranar wrote:
             | Python's semantics do not have a standards document or
             | specification. It's based on its reference implementation
             | (CPython) and different variations of Python even have
             | different semantics (Jython, IronPython, PyPy).
             | 
             | The library is documented, and the syntax is also
             | documented, but the semantics themselves are not.
        
               | eesmith wrote:
               | It isn't as dire as you suggest.
               | 
               | The language reference at
               | https://docs.python.org/3/reference/index.html "describes
               | the syntax and "core semantics" of the language.".
               | 
               | There has been a distinction between Python-the-language
               | and CPython-the-implementation ever since JPython back in
               | the 1990s. For example, reference counting is a CPython
               | implementation detail. The reference manual says only:
               | 
               | > Objects are never explicitly destroyed; however, when
               | they become unreachable they may be garbage-collected. An
               | implementation is allowed to postpone garbage collection
               | or omit it altogether -- it is a matter of implementation
               | quality how garbage collection is implemented, as long as
               | no objects are collected that are still reachable.
               | 
               | (Quoting
               | https://docs.python.org/3/reference/datamodel.html )
        
               | Kranar wrote:
               | Absolutely, but note what your very quote says "describes
               | ... the language" rather than "prescribes ... the
               | language".
               | 
               | A standards document or specification's purpose is to
               | "prescribe" how a system shall work in order to be
               | compliant as opposed to a reference which documents how
               | systems currently happen to work.
        
           | eesmith wrote:
           | Python says those insert operations are atomic.
           | https://docs.python.org/3/faq/library.html#what-kinds-of-glo...
           | 
           | For example, the following operations are all atomic (L, L1,
           | L2 are lists, D, D1, D2 are dicts, x, y are objects, i, j are
           | ints):
           | 
           |     L.append(x)
           |     L1.extend(L2)
           |     x = L[i]
           |     x = L.pop()
           |     L1[i:j] = L2
           |     L.sort()
           |     x = y
           |     x.field = y
           |     D[x] = y
           |     D1.update(D2)
           |     D.keys()
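           | 
           | A minimal sketch of the distinction that FAQ entry draws,
           | assuming CPython: a single `D[x] = y` store is listed as
           | atomic, while a read-modify-write such as `D[x] = D[x] + 1`
           | is listed as not atomic, so the counter below takes an
           | explicit lock. The names `results`, `counts`, and `worker`
           | are made up for illustration.

```python
import threading

results = {}              # each worker stores under its own key
counts = {"done": 0}      # shared counter, updated by read-modify-write
counts_lock = threading.Lock()

def worker(n):
    results[n] = n * n    # one D[x] = y store, listed as atomic in the FAQ
    with counts_lock:     # D[x] = D[x] + 1 is listed as NOT atomic, so lock it
        counts["done"] = counts["done"] + 1

threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```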
        
             | mikepurvis wrote:
             | That's helpful that that's explicitly documented as such
             | then, rather than just a "lol GIL" implementation detail
             | that people have been able to take advantage of since
             | forever-- not carrying a documented guarantee like that
             | over into GIL-less python's stdlib would be a nightmare.
        
         | lelanthran wrote:
         | Python doesn't have a standards document. It has a reference
         | implementation that defines the language.
         | 
         | IOW, the python implementation _is_ the standard, and therefore
         | any code that relies on an implementation detail in the
         | reference implementation is, by definition, correct.
        
           | oblvious-earth wrote:
           | > Python doesn't have a standards document. It has a
           | reference implementation that defines the language.
           | 
           | That's not quite strictly true, Python does have a documented
           | "Language Reference":
           | https://docs.python.org/3/reference/index.html
           | 
           | If there is a contradiction between the Language Reference
           | and CPython then one, or both, of them needs to be updated
           | and it's treated on a case by case basis.
           | 
           | If an alternative Python implementation follows the Language
           | Reference but chooses different details outside it, that
           | doesn't stop it from being "Python". Of course practically
           | speaking most alternative implementations are incentivized to
           | closely follow CPython.
        
             | solarkraft wrote:
             | And what about PEPs, which describe (in pretty good detail)
             | what changes will be made?
        
               | samus wrote:
               | They describe large-scale changes that usually also lead
               | to changes of implementation details. But their biggest
               | disadvantage is that they are not kept up to date when
               | things change due to later PEPs or smaller-scale changes
               | interfering. After landing, they are of historical value
               | only.
        
         | citrin_ru wrote:
         | A related xkcd: https://xkcd.com/1172/
         | 
         | For a sufficiently popular project there are often cases where
         | users (other projects) rely on existing observed behavior even
         | if it is not guaranteed by documentation/specification.
        
         | hyperpape wrote:
         | You can say it's broken until you're blue in the face, and
         | honestly, you're not wrong, but it's irrelevant.
         | 
         | If major libraries and user code are constantly incorrect, but
         | work because of an implementation detail, then removing that
         | implementation detail becomes extraordinarily difficult,
         | verging on impossible.
         | 
         | It would be like retrofitting your language to distinguish
         | valid unicode text from arbitrary byte strings, when you'd
         | previously treated them as equivalent.
        
       | coldcode wrote:
       | Why does Python use an un-comparable version number scheme? Not
       | being a Python programmer, comparing version 3.9 to 3.13 seemed
       | bizarre until I caught on.
        
         | BiteCode_dev wrote:
         | Sem ver is pretty standard imo.
        
         | Scarblac wrote:
         | Like in most versioning schemes, the dot isn't meant to be read
         | as decimal point.
        
         | rcxdude wrote:
         | comparing version numbers as tuples and not as decimals is a
         | pretty standard thing. Linux does it as well, for example. I am
         | actually not sure of a project which does something different.
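         | 
         | A small sketch of what "as tuples and not as decimals" means in
         | practice. `parse_version` is a hypothetical helper written for
         | this comment, and it only handles purely numeric parts.

```python
import sys

def parse_version(text):
    """Turn '3.13' or '3.13.0' into a tuple of ints such as (3, 13)."""
    return tuple(int(part) for part in text.split("."))

# Tuples compare element by element, so 3.13 sorts after 3.9.
newer = parse_version("3.13") > parse_version("3.9")

# Reading the same strings as decimals gets the order backwards.
decimal_says_newer = float("3.13") > float("3.9")

# The interpreter exposes its own version in a tuple-like form too.
running_py3 = sys.version_info >= (3, 0)
```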
        
           | irishsultan wrote:
           | TeX and Metafont have version numbers that are approaching pi
           | and e respectively, so the sensible way is to read these as
           | decimals.
        
         | throwaway468234 wrote:
         | Is there any versioning scheme that _is_ comparable? Honest
         | question.
        
           | burkaman wrote:
           | Some languages just use integer versions, like Java.
        
             | fullstop wrote:
             | The early versions did not use integers, for what it's
             | worth.
        
             | josefx wrote:
             | Java dropped the leading 1 with version 5 since Sun decided
             | it would never go for a complete rewrite of the language.
             | Imagine the chaos if Python did that and then pulled the 2
             | to 3 change on a run of the mill update from 213 to 214.
        
               | jcranmer wrote:
               | Not sure that's entirely accurate.
               | 
               | Java 1.2 was branded as "Java 2", so you had the J2SE
               | (Java 2, Standard Edition) and related J2ME and J2EE
               | (Mobile and Enterprise, respectively) platforms. The
               | "Java 2" moniker was dropped in Java 5, which was the
               | largest rewrite of the language since, adding generics,
               | sane memory model, annotations, etc., all in the same
               | language revision.
        
           | jorgemf wrote:
           | I think kotlin is one example. It uses the same idea but it
           | uses powers of 10 for incremental fixes and numbers for 1 to
           | 9 for hotfixes. That's it for the 3rd number, I do not know
           | what will happen when the second number reaches 2 digits. I
           | guess they will do something to make it comparable again.
        
           | nxpnsv wrote:
           | TeX version number asymptotically approaches pi... each new
           | version has another digit, this also makes versions
           | comparable. Clearly this is the better way....
        
           | coldtea wrote:
           | Comparable as what?
           | 
           | 3.10 and 3.9 are perfectly mechanically comparable (meaning
           | one can write a program to deterministically compare them and
           | return their relative order), just not with default numeric
           | ordering (then again they're not numbers, they are composite
           | values that are comprised by numbers) or naive string based
           | ordering.
           | 
           | If we wanted trivially comparable with regular numeric
           | ordering we could have incremental numbers as versions. 1, 2,
           | 3, ...
           | 
           | And if we wanted string ordering (as with usual filesystem
           | listing sorting with no extra flags to treat as numbers), we
           | could have fixed length padded parts: 00001.00045.
           | 
           | Not sure if the latter is used, but some software does use
           | the first.
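           | 
           | A sketch of the fixed-length padding idea, so that plain
           | string sorting gives the right order. `padded` is a
           | hypothetical helper, and the width of 5 is an arbitrary
           | choice matching the 00001.00045 example above.

```python
def padded(version, width=5):
    """Zero-pad every dot-separated part: '3.9' becomes '00003.00009'."""
    return ".".join(part.zfill(width) for part in version.split("."))

versions = ["3.10", "3.9", "3.13", "3.2"]

naive = sorted(versions)                    # plain string ordering
by_padded = sorted(versions, key=padded)    # string ordering on padded parts
```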
        
             | usrbinbash wrote:
             | Or I could just accept that neither numeric, nor string
             | ordering works for semantic versioning, and write a
             | trivially easy piece of code that does the ordering in a
             | context where I expect such a scheme.
             | 
             | > If we wanted trivially comparable with regular numeric
             | ordering we could have incremental numbers as versions. 1,
             | 2, 3, ...
             | 
             | Yes, and then we would be back to the day when the version
             | number gave me zero information about what changed, and how
             | that affects compatibility with existing code.
             | 
             | There is a reason semver is used across the industry by
             | now.
        
               | coldtea wrote:
               | > _Or I could just accept that neither numeric, nor
               | string ordering works for semantic versioning, and write
               | a trivially easy piece of code that does the ordering in
               | a context where I expect such a scheme._
               | 
               | Hence the whole "3.10 and 3.9 are perfectly mechanically
               | comparable (meaning one can write a program to
               | deterministically compare them and return their relative
               | order)" part in my comment you perhaps missed.
               | 
               | > _Yes, and then we would be back to the day when the
               | version number gave me zero information about what
               | changed, and how that affects compatibility with existing
               | code._
               | 
               | Not that it's any better now with semver though: in
               | practice the semver works 95% of the time, and gives just
               | a false sense of comfort at the other 5%. You update, and
               | things still break, despite the semver promise.
        
         | jjoonathan wrote:
         | I won't argue with bizarre, but it's common in version
         | notation.
         | 
         | If it wasn't already common, it would probably become common
         | shortly after the first time a big respectable company hit the
         | "oops what comes after 9" problem and decided on dot-separated-
         | integers rather than significant digits :)
        
           | coldtea wrote:
           | Is it that bizarre?
           | 
           | It's basically semantic versioning, that is a hierarchical
           | split based on levels of change (major = new release with
           | possibly big breaking changes, minor = some incremental
           | update version within the same release) and so on.
           | 
           | Who ever thought version numbers are decimals and why? The
           | "." appears as a separator on all kinds of strings in
           | software (filenames, domains, and IPs probably the most
           | common ones).
        
         | Aurornis wrote:
         | Semantic Versioning is ubiquitous across the modern software
         | industry. It's worth reading about it if you're not familiar:
         | https://semver.org/
         | 
         | Never assume that version numbers are decimal values. It's more
         | obvious when you see the full version triple (3.13.0 for
         | example) that it's not a single number, but the abbreviated
         | version numbers can sometimes look like a decimal value. You
         | should never compare version numbers as decimals in modern
         | software unless you're absolutely sure that's how the project
         | is structured.
        
           | luhn wrote:
           | I know you didn't say it outright, but since it is implied:
           | Python does _not_ use semver, although it does share similar
           | formatting. Semver considers bumps to the second integer a
           | "minor" release with no breaking changes, whereas in Python
           | that indicates a major release that is not backwards
           | compatible.
        
         | coldtea wrote:
         | You mean uncomparable using numeric or string sorting on those
         | strings?
         | 
         | Because otherwise it's perfectly comparable - and quite common.
         | 
         | Most FOSS uses a variant of <major>.<minor>.<patch> (here just
         | major (3) + minor (13), minor meaning "release within the same
         | major Python version").
        
       | zzzeek wrote:
       | these examples of "what one would assume to be atomic" did not
       | seem useful to me, they looked like things that are obviously not
       | threadsafe.
       | 
       | a more interesting example is something like this:
       | 
       |     # setup
       |     l = []
       | 
       |     # thread A
       |     l.extend([1, 2, 3])
       | 
       |     # thread B
       |     l.extend([4, 5, 6])
       | 
       | is the resulting list always within the set of [1,2,3,4,5,6] or
       | [4,5,6,1,2,3] ? or are the two sets of numbers randomly
       | interleaved in the list? or if the GIL is removed does the
       | interpreter segfault (I'm pretty sure this latter will not be the
       | case for GIL removal but I don't understand the gil remove plan
       | very much yet).
       | 
       | Edit: before people jump in and correct how the above is a bad
       | idea anyway, it's not like I'd ever do the above and expect
       | anything but disaster. This is more of a thought experiment to
       | understand what GIL removal is going to do.
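       | 
       | For anyone who wants to poke at it, here is the same thought
       | experiment as a runnable sketch. `extend_with` and `whole_blocks`
       | are names invented here. One run proves nothing about what is
       | guaranteed, it only shows what a given interpreter happened to do.

```python
import threading

l = []

def extend_with(items):
    l.extend(items)    # the argument is a real list, as in the question

a = threading.Thread(target=extend_with, args=([1, 2, 3],))
b = threading.Thread(target=extend_with, args=([4, 5, 6],))
a.start()
b.start()
a.join()
b.join()

# True if each extend landed as one uninterrupted block.
whole_blocks = l in ([1, 2, 3, 4, 5, 6], [4, 5, 6, 1, 2, 3])
print(l, whole_blocks)
```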
        
         | ngoldbaum wrote:
         | It'll be guarded by a fine-grained mutex, so it won't seg
         | fault. They're using a performant mutex based on webkit's
         | wtf::lock.
        
         | OskarS wrote:
         | I'm guessing interleaving is probably not possible because the
         | `extend` call is passed the full list, and it holds the GIL
         | until it's finished. But you'd have to look at the bytecode to
         | be sure, I suppose. If there were three append calls instead of
         | a single extend, I would assume any interleaving would be
         | valid.
         | 
         | > or if the GIL is removed does the interpreter segfault (I'm
         | pretty sure this latter will not be the case for GIL removal
         | but I don't understand the gil remove plan very much yet).
         | 
         | I haven't looked into the plan in detail either, but presumably
         | not, that would be nuts. My understanding is that they're going
         | to replace the GIL with locks on the objects themselves (your
         | list `l` in this case). This is why in all the tests single-
         | threaded performance suffers, you have to take and release more
         | locks if you don't have a GIL, and the objects themselves grow
         | larger as well.
        
           | Gabrys1 wrote:
           | Let's say you're passing 1 million ints to each extend. One
           | may start wondering if those operations are then split into
           | chunks...
        
         | dfox wrote:
         | You can probably trigger the behavior of random interleaving
         | even with GIL if you use something other than list or tuple as
         | an argument to list.extend(). CPython special cases list and
         | tuples and copies the contents directly (list_extend_fast() in
         | listobject.c) while it uses iterator for other types
         | (list_extend_iter()).
         | 
         | Because the iterator can be pretty much arbitrary Python code
         | it seems to be a bad idea to guarantee extend() to be atomic,
         | as you don't want to hold the list mutex while calling out into
         | user code.
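         | 
         | A sketch of how one might provoke that, assuming CPython and
         | using a generator as the argument. The events only make the
         | timing repeatable; `slow_items` and `appender` are made-up
         | names. Where the 99 ends up relative to 1, 2, 3 is exactly the
         | implementation detail in question, so the sketch prints the
         | list rather than promising an order.

```python
import threading

l = []
second_item_requested = threading.Event()
other_thread_done = threading.Event()

def slow_items():
    yield 1
    # This part runs only when extend() asks for the second item.
    second_item_requested.set()
    other_thread_done.wait(timeout=5)   # user code running in the middle of extend()
    yield 2
    yield 3

def appender():
    second_item_requested.wait(timeout=5)
    l.append(99)
    other_thread_done.set()

t = threading.Thread(target=appender)
t.start()
l.extend(slow_items())   # not a list or tuple, so items are pulled one at a time
t.join()
print(l)
```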
        
           | nerdponx wrote:
           | All the more reason to wonder. What's going to be special-
           | cased to become atomic and thread-safe, and what won't be?
           | 
           | A surprising amount of the CPython stdlib is just pure Python
           | code.
        
         | colesbury wrote:
         | You will always get either [1,2,3,4,5,6] or [4,5,6,1,2,3] in
         | the upcoming `--disable-gil` builds of CPython 3.13 and the
         | nogil forks. Most operations on mutable collections hold a per-
         | object lock.
         | 
         | Part of the integration work will be to better document the
         | thread-safety guarantees, but there is still a lot of work to
         | do before we get there.
        
           | HDThoreaun wrote:
           | Every mutable collection needs to lock before write? Sounds
           | slow af
        
             | quietbritishjim wrote:
             | Locks are generally very cheap if they're not contended. Of
             | course it's all relative though!
        
               | gnulinux wrote:
               | They're cheap but they're not free. If you do it at
               | literally every single rw operation at runtime, it's
               | going to add up.
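               | 
               | A rough way to get a feel for "cheap but not free",
               | with a big caveat: this times a Python-level
               | `threading.Lock`, which is an assumption standing in
               | for the C-level per-object locks, so the numbers say
               | nothing direct about the nogil build itself.

```python
import threading
import timeit

lock = threading.Lock()
items = []

def append_plain():
    items.append(1)

def append_locked():
    with lock:            # never contended here: only one thread is running
        items.append(1)

N = 200_000
plain_seconds = timeit.timeit(append_plain, number=N)
locked_seconds = timeit.timeit(append_locked, number=N)
print(f"plain:  {plain_seconds:.4f}s")
print(f"locked: {locked_seconds:.4f}s")
```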
        
             | fbdab103 wrote:
             | Last I looked, the nogil implementation was some 5-10%
             | slower than the current, owing to all of the extra locking.
        
       | continuitylimit wrote:
       | The brief blurb about how GIL came to be, in light of Python's
       | success as a language and a tool, makes me question my s/e belief
       | system. Things like this are like when good things happen to bad
       | people and bad things happen to good people. It makes you
       | question the meaning of it all.
       | 
       | Is there no great architect in the sky? Is there no software god
       | after all, looking down, punishing sloppy engineers and granting
       | blessings to thoughtful engineers? How else to explain this
       | injustice of sloppy engineering eating the world (to say nothing
       | of JavaScript)?
        
         | nerdponx wrote:
         | Of course there isn't. Software is developed by people.
         | 
         | But consider that maybe it wasn't a _sloppy_ design at all. For
         | decades, the explicitly stated philosophy of the CPython
         | development team was to prioritize simplicity of implementation
         | over performance. I don't think anyone ever envisioned Python
         | becoming the wild success that it is today.
         | 
         | That is, the GIL wasn't sloppy at all. It was perfectly
         | reasonable and pragmatic decision that made sense given the
         | tradeoffs of the time.
        
         | mrkeen wrote:
         | It's network effects, all the way down.
         | 
         | Watch people's commentary here when talking about the good and
         | bad of various technologies.
         | 
         | There's no 'bad' tech, just tech with lots of users. Or the
         | tech is good because you can hire for it. Because it has lots
         | of users. Or it has a rich ecosystem, because it has lots of
         | users.
         | 
         | Read the advice given by those who tell you to get to market
         | first instead of polishing the tech.
         | 
         | We're the users of that tech which went to market unpolished
         | and gathered all the users.
        
         | jerf wrote:
         | While there is some truth in what you are saying, you have a
         | common misunderstanding of the situation. Part of the reason
         | the GIL has proved so difficult to remove is that it is
         | actually a _good_ solution. In fact, there have been multiple
         | largely successful attempts to remove it over time over the
         | entire range of aggressiveness from CPython changes to writing
         | an entire JIT stack (PyPy), but it has never gone in to CPython
         | because it would either ruin all existing 3rd party libraries
         | that used C (which is a _lot_ of them), it would diminish
         | performance for an already-slow language, or as in the case of
         | PyPy, it isn't even a "patch" so much as a new project.
         | 
         | Especially when you consider this over the whole of Python's
         | lifespan, which very, very firmly includes many years in which
         | multicore was simply not a thing, followed by some years where
         | it was a thing but it didn't work very well anyhow at the OS
         | level so who cares what Python does with it.
         | 
         | It is not as if, back when it was put in, the choice was either
         | to use a GIL or to correctly write a multithreaded interpreter
         | and fix all the 3rd party libraries at the time for exactly the
         | same cost. The latter option was orders of magnitude more
         | expensive, and harder then than it is now, with better tooling
         | and more collective developer experience. The choice of not
         | using a GIL, rather than being some sort of nirvana that we
         | could just be in if they hadn't chosen poorly 15 years ago,
         | could well have killed the language. We don't really know. I do
         | know that a programming language that just sort of breaks every
         | so often when you use threads and there's absolutely nothing
         | you can do about it from the Python level is not a very
         | appealing proposition and it's hard to know how badly this
         | could have hurt the language.
         | 
         | And Python of all the languages now has a well-justified fear
         | of breaking everything and demanding that everyone upgrade.
         | 
         | So, to put this in a nutshell, if you believe the GIL is simply
         | bad and should never have been an option, you have a very
         | immature understanding of software engineering, especially in
         | the light of being the leader of a very very large community
         | who will be impacted by your decisions. It may not have been
         | the only choice, but it was a good one, and regardless of what
         | decision was made 15 years ago there would be _some_
         | consequence to deal with now. No programming language community
         | can be expected to get everything right in 2003 that the people
         | of 2023 will want any more than we can expect any current
         | programming language to be the perfect programming language of
         | 2043 right this second.
        
           | continuitylimit wrote:
           | Thanks jerf for your thoughtful reply. tbh I was trolling hn
           | for the very first time in 14 years and based on the response
           | I have a natural talent for it. Who knew. The multi-core
           | point is well taken, as it maps to my own professional
           | experience in that transitional era as well.
        
         | bruce343434 wrote:
         | How is this sloppy engineering? And no of course there is no
         | grand architect, orchestrating everything neatly from a central
         | place of command, instead everything is an emergent process
         | including the decisions made by the python team.
         | 
         | What are you even complaining about? What is your point?
        
         | zkldi wrote:
         | Tools win not because they are "better" in some platonic ideal
         | of a programming language but because they are more practical
         | for solving the problems people have.
         | 
         | Python, JS, C, Bash aren't even particularly great at the
         | problems they solve, but they succeed mostly on inertia (it's
         | where all the libraries are, it's what people know) and
         | occupying developer mindshare.
         | 
         | They are full of _obvious_ design mistakes; things that not
         | even the creators of the language (nor any of its users) can
         | defend, yet those languages are used infinitely more than
         | languages that eschew those mistakes. Why? Because they solve
         | problems people have.
         | 
         | If this sounds terrible to you, the good news is that there is
         | a tonne of low-hanging fruit in the programming language design
         | space. Consider that most developers know nothing of sum-types,
         | or eschew the idea of typing entirely. Consider that most
         | developers see no fundamental problem behind having to venv or
         | dockerise software lest it bitrot over a month. Consider that
         | programmers actually use bash.
         | 
         | These terrible, obviously broken tools are somehow the most
         | pragmatic things we actually have. The fruit is low-hanging;
         | the door is wide open, if you wish to grab it.
        
           | amethyst wrote:
           | > Python, JS, C, Bash aren't even particularly great at the
           | problems they solve
           | 
           | I would argue the "problem" that Python really solves is the
           | amount of engineering effort required to read and write code
           | for common software use cases. I.e., its purpose is to help
           | developers write better code faster and easier than other
           | languages, while execution speed has usually had a lower
           | priority. In that framing, it's _great_ at solving the
           | problem of development time and old code being hard to
           | maintain, and that's why so many engineers like myself love
           | it.
        
         | gnulinux wrote:
         | > makes me question my s/e belief system
         | 
         | That's a good thing! Models are fit on data, data doesn't fit
         | to models. This is like when people learn elementary music
         | theory then go analyze some actual composer and it doesn't fit
         | the model at all. Well kiddo the problem is that "music theory"
         | is simply a model, a model people created after training some
         | very limited set of musical data, everything outside of that data
         | will probably behave differently and you'll have to change your
         | models.
         | 
         | If your software engineering model predicts Python would be
         | unsuccessful, but there is evidence that Python is successful,
         | this simply means your software engineering model is
         | unpredictive and therefore must be revised.
        
       | yoyohello13 wrote:
       | It seems like in every python discussion I hear people complain
       | about the GIL.
       | 
       | I'm happy people are working on removing the GIL, but as a
       | professional python dev for about 5 years now I have literally
       | never had a problem where the GIL was a limiter. Although I just
       | make web apps so maybe I'm not the target audience.
        
         | BorgHunter wrote:
         | Conversely, I've worked on backend, data processing-type
         | applications for most of my career, much of it in Java but some
         | (especially recently) in Python, and the GIL is a huge limiting
         | factor for writing efficient, readable Python code. I've had to
         | write very annoying Python code using the multiprocessing
         | library to get around the GIL, and ultimately it works, but
         | it's ugly and clunky and overall just a pain. And remember,
         | I've written a lot of Java code, so I have a high tolerance for
         | pain! But the JVM's concurrency abstractions are actually kind
         | of a joy to use, even if Java the language isn't. Python is the
         | opposite, so if they can shed the GIL and make multithreading
         | viable in Python without forking new processes, that would be a
         | huge win.
        
           | Gabrys1 wrote:
           | I think multiprocessing is quite sensible in Python
           | (compared to async, for example)
        
             | toxik wrote:
             | multiprocessing is almost never a good idea in my
             | experience.
        
         | svara wrote:
         | It's just about what kind of code you write.
         | 
         | If you write a lot of code that parallelizes over data you will
         | hurt all the time because of the GIL.
         | 
         | If you have worker processes that do something on the CPU, and
         | the results need to be collected and processed further in some
         | other process, you now need to pickle the data to copy it
         | around. That can get really slow.
         | 
         | I understand a lot of python users don't do that kind of thing,
         | but it's a real problem. I'm happy that the python community
         | seems to be slowly beginning to take this seriously after
         | decades of just claiming the GIL isn't a real issue.
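         | 
         | A minimal sketch of the copy cost described above. It only
         | approximates what multiprocessing does: the helper name
         | send_like_multiprocessing and the sample rows are made up for
         | illustration, and a real worker pool would also push the bytes
         | through a pipe between processes.

```python
import pickle


def send_like_multiprocessing(obj):
    """Approximate the cost of handing obj to another process.

    Assumption: this mimics only the serialize/deserialize steps inside
    one process; it does not start a worker or use a pipe.
    """
    payload = pickle.dumps(obj)      # work done by the sending process
    rebuilt = pickle.loads(payload)  # work done by the receiving process
    return rebuilt, len(payload)


# Hypothetical result set produced by a CPU-bound worker.
rows = [{"id": i, "value": i * 0.5} for i in range(10_000)]
received, payload_size = send_like_multiprocessing(rows)

# The receiver gets an equal but entirely separate copy of the data.
print(received == rows, received is rows, payload_size)
```

         | The receiver ends up with an equal but separate object, so
         | every hand-off pays for a full serialize and rebuild of the
         | data.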
        
           | quietbritishjim wrote:
           | > If you write a lot of code that parallelizes over data [and
           | that parallelization can't happen by vectorizing your
           | operations in a C extension module that releases the GIL,
           | like numpy] you will hurt all the time because of the GIL.
           | 
           | I've added some text for clarity
        
         | klyrs wrote:
         | In my rapidly-approaching-20 years of python development, I
         | have butted heads with the GIL countless times. Are your web
         | apps single-threaded? Have you looked at scaling your service?
         | Or do you use a web framework that handles that for you behind
         | the scenes?
        
         | bvirb wrote:
         | I've had the same experience mostly writing web apps in Ruby,
         | which locks similarly to Python. Multi-threading is always
         | needed to saturate IO, network, etc, never CPU, so GIL isn't an
         | issue. I always wondered what people were doing that ran into
         | issues.
         | 
         | With Python being the language of choice for ML workloads I
         | guess it's more common to have the CPU be a bottleneck. It
         | seems cool they're making an option to turn it off for those
         | use cases.
         | 
         | It seems like Python could maintain a GIL compatibility option
         | to preserve the current/old behavior for legacy code.
        
       | incomingpain wrote:
       | How many other people knew nothing about the GIL, started using
       | threading, and inevitably found major performance issues that were
       | 100% caused by the GIL?
       | 
       | And then moved to multiprocessing, working around the lack of
       | shared memory with managers?
       | 
       | When/if the GIL goes away, good riddance.
        
       | phkahler wrote:
       | >> I have at least one very concrete example of code that someone
       | assumed to be atomic:
       | 
       | request_id = self._next_id
       | 
       | self._next_id += 1
       | 
       | To think that is thread-safe is just naive. Once you understand
       | the potential problem you can look at the code and ask "why
       | shouldn't this be unsafe?", since there is nothing explicitly
       | preventing the problem. After reading TFA (lazily, I admit) I
       | still don't know why that code is thread-safe with the GIL.
       | still don't know why that code is thread-safe with the GIL.
       | 
       | Python is fun and often forgiving, but a bunch of people who got
       | lucky (because they were never taught about the hazards) are
       | going to learn some new stuff with no-gil. I think it's a long
       | overdue change and worth the (single thread) performance hit and
       | bug surfacing phase.
        
         | toxik wrote:
         | It isn't threadsafe in current Python.
        
         | taway-20230404 wrote:
         | That example is not thread-safe _behavior-wise_ under the GIL.
         | The block is composed of several individual operations, and the
         | interpreter can preempt a thread between any one of them.
         | 
         | On the other hand, it is safe from a memory-access perspective;
         | the read from `self._next_id` will never dereference a
         | partially-mutated invalid pointer or read a partially-mutated
         | value.
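         | 
         | One way to make the read-then-increment explicit rather than
         | relying on the GIL is a plain lock. This is only a sketch: the
         | class name RequestIdAllocator and the helper
         | allocate_from_threads are hypothetical, not taken from the
         | article's code.

```python
import threading


class RequestIdAllocator:
    """Hands out unique, increasing IDs to any number of threads."""

    def __init__(self):
        self._next_id = 0
        self._lock = threading.Lock()

    def allocate(self):
        # The read and the increment run as one unit under the lock, so
        # no other thread can observe the same _next_id in between.
        with self._lock:
            request_id = self._next_id
            self._next_id += 1
        return request_id


def allocate_from_threads(num_threads=8, ids_per_thread=10_000):
    """Allocate IDs from several threads and return all of them."""
    allocator = RequestIdAllocator()
    per_thread = [[] for _ in range(num_threads)]

    def worker(out):
        for _ in range(ids_per_thread):
            out.append(allocator.allocate())

    threads = [threading.Thread(target=worker, args=(out,)) for out in per_thread]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [i for out in per_thread for i in out]
```

         | With the lock in place, every allocated ID is unique no matter
         | where the interpreter switches threads, with or without a GIL.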
        
       | Joel_Mckay wrote:
       | Python's threading and GC model has always been problematic on
       | multi-core architectures.
       | 
       | It shouldn't be anywhere near time-critical and/or low-latency
       | use-cases.
       | 
       | Python is functionally the modern BASIC, and included many of the
       | same design trade-offs for usability.
       | 
       | Don't get mad, it is true... and we know it. =)
        
       | Asooka wrote:
       | Yet more evidence that removing the GIL really ought to bump the
       | major version to Python 4.
        
       ___________________________________________________________________
       (page generated 2023-11-17 23:02 UTC)