[HN Gopher] The Changing "Guarantees" Given by Python's Global I...
___________________________________________________________________
The Changing "Guarantees" Given by Python's Global Interpreter Lock
Author : abhi9u
Score : 95 points
Date : 2023-11-17 13:03 UTC (9 hours ago)
(HTM) web link (stefan-marr.de)
(TXT) w3m dump (stefan-marr.de)
| dfox wrote:
| Code that assumes that something is going to be atomic because of
| the GIL (or any other implementation detail) is simply broken. If
| you need something to be atomic you should be explicit about that
| and use mutex or something.
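|
| A minimal sketch of that advice, assuming a hypothetical shared
| counter (the names `counter`, `counter_lock` and `add_many` are
| illustrative only). The lock makes the read-modify-write a single
| critical section, so the result does not depend on how the
| interpreter schedules threads:

```python
import threading

counter = 0
counter_lock = threading.Lock()  # hypothetical names, for illustration only


def add_many(n):
    """Increment the shared counter n times under an explicit lock."""
    global counter
    for _ in range(n):
        # `counter += 1` is a read, an add and a write; holding the lock
        # keeps other threads from interleaving between those steps.
        with counter_lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```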
| bheadmaster wrote:
| True, but when a majority of the ecosystem is relying on an
| implementation detail, that implementation detail becomes the
| de facto standard.
| darklion wrote:
| Just because everybody does something the wrong way doesn't
| somehow make it magically correct.
|
| One study [1] in the US released in 2020 found almost 90% of
| people admitted to speeding, but I don't think anyone would
| say that speeding is now approved by the authorities and
| consequence-free.
|
| [1] https://www.thezebra.com/resources/research/speeding-car-
| ins...
| bheadmaster wrote:
| > Just because everybody does something the wrong way
| doesn't somehow make it magically correct.
|
| Depends on the definition of "correct".
|
| One way to define "correct" is "as defined by the
| standard". However, a standard serves a purpose - to maintain
| interoperability between components, so the other, deeper,
| way to define "correct" is "interoperable with the
| ecosystem".
|
| The official standard can argue that relying on the
| implementation detail is incorrect because they didn't
| specify it explicitly, but they would go against the grain
| of the rest of the ecosystem. Which reminds me of an old
| joke:
|
|     An old woman is watching the news. She sees a news report
|     saying there is a car driving in the wrong direction on the
|     highway. So the old woman calls up her husband.
|
|     Old woman: Be careful on the highway dear, there is a crazy
|     driver on the highway driving the wrong way!
|
|     Husband: It's not just one car, it's hundreds of them!
| nerdponx wrote:
| Lest we forget that Python doesn't even have a language
| standard.
|
| The "standard" is a combination of PEPs, the Python docs,
| and the CPython implementation.
|
| Implementation details _are_ language features, because
| implementation details are the standard. Python programs
| that rely on the GIL are not "wrong". Such programs are
| relying on clearly-documented features of the system that
| they use.
|
| Is it "wrong" to use GCC-specific features in a C
| program, if you know that you only intend to target GCC?
| usrbinbash wrote:
| > Implementation details are language features, because
| implementation details are the standard. Python programs
| that rely on the GIL are not "wrong".
|
| The problem here is: Implementation details are not
| guaranteed to be stable. They can change with every
| release.
|
| So if I write code that relies on any particular
| implementation detail, it may be correct today, and may
| be wrong two weeks from now, even though I didn't change
| anything.
| samus wrote:
| GCC-specific features are explicitly documented in their
| manual, and they can and should be used if one is fine
| with one's program being tied to that compiler. Best
| example: the Linux kernel.
|
| On the other hand, Linux kernel developers have also been
| burned by GCC changing what it does regarding _undefined
| behavior_ (the infamous not-so-corner cases of C/C++).
| They expect GCC to be sort of like an assembler, which
| does the most straightforward thing in case of
| ambiguities. Instead, GCC is an optimizing compiler that
| is expected to deliver high-performance code for general-
| purpose programs. It treats undefined behavior as
| opportunities to enact optimizations. What it actually
| does is also usually documented, but also here different
| people have different expectations about what that
| implies.
|
| The GIL is an implementation choice that was put in place
| by Python's developers for simplicity's sake. It is
| important to be aware of it because it has huge
| performance implications, but it is unwise to rely on it
| for semantics. Especially since it has not exactly been a
| secret that various parties would eventually like it to
| be removed. Anyways, the GIL has very little to do with
| Python's semantics. User code is racy with or without the
| GIL, which is actually the point of TA.
| hardware2win wrote:
| Nobody says it is correct
|
| Just hard as hell to change
| gjvc wrote:
| _Just because everybody does something the wrong way
| doesn't somehow make it magically correct._
|
| Hyrum's Law says the opposite:
|
| https://www.hyrumslaw.com/
|
| _"With a sufficient number of users of an API, it does
| not matter what you promise in the contract: all observable
| behaviors of your system will be depended on by somebody."_
|
| Named after some guy called Hyrum who worked / works at
| Google.
| verteu wrote:
| Agreed. It's also "the first rule of kernel
| maintenance," according to Linus:
| https://linuxreviews.org/WE_DO_NOT_BREAK_USERSPACE
| marcinzm wrote:
| > One study [1] in the US released in 2020 found almost 90%
| of people admitted to speeding, but I don't think anyone
| would say that speeding is now approved by the authorities
| and consequence-free.
|
| I would say jaywalking in NYC is a much better example
| because breaking the law doesn't kill others except in rare
| cases. Jaywalking is illegal. It's also not enforced for
| all intents and purposes. Every attempt to enforce it has
| led to such loud public outcry that it was stopped.
| HDThoreaun wrote:
| Speeding absolutely is approved by the authorities and
| consequence free where I live.
| citrin_ru wrote:
| If 90% are speeding on a given highway / road then the
| speed limit is likely unreasonably low. Traffic accidents
| which could be prevented by a speed limit are typically
| caused by the fastest 10% of the speed distribution, not by
| the remaining 90%.
| usrbinbash wrote:
| Maybe, but "de-facto" and "actual" are still not the same,
| and if I write code that relies on them being the same, then
| I have only myself to blame when my code breaks.
|
| E.g., there are C compilers that usually zero most allocated
| structures. That's an implementation detail however, not a
| feature of C. C doesn't guarantee you zeroed memory anywhere,
| and code that assumes otherwise is just one compilation away
| from a disaster.
| samus wrote:
| The vendor might not agree with that and change the
| implementation detail at their convenience. In absence of a
| prescriptive standard, a reference could be reasonably
| trusted, but relying on only sort-of-documented
| implementation details is a kind of tech debt.
| BiteCode_dev wrote:
| The gil is a feature. Not using a feature that makes sense in
| our context is counter productive.
| nicolaslem wrote:
| The GIL is an implementation detail of CPython; it's not part
| of Python the language.
| colanderman wrote:
| It's been a while since I've followed Python closely but
| last I heard Python-the-language is _de facto_ defined as
| "whatever CPython does". Has Python since grown a proper
| language specification?
| formerly_proven wrote:
| No.
| dragonwriter wrote:
| Python doesn't have a full specification, but it does
| have first party documentation that distinguishes between
| CPython implementation details and language guarantees,
| in part to support alternative implementations.
| pdonis wrote:
| _> Python doesn't have a full specification_
|
| This doesn't count?
|
| https://docs.python.org/3/reference/index.html
| squeaky-clean wrote:
| The introduction section states that it's not a
| complete/exact specification. If you're using a Python
| implementation and have a question, this should answer
| it. If you're writing a Python implementation and have a
| question, this may not answer it.
|
| > Consequently, if you were coming from Mars and tried to
| re-implement Python from this document alone, you might
| have to guess things and in fact you would probably end
| up implementing quite a different language. On the other
| hand, if you are using Python and wonder what the precise
| rules about a particular area of the language are, you
| should definitely be able to find them here. If you would
| like to see a more formal definition of the language,
| maybe you could volunteer your time -- or invent a
| cloning machine :-).
| pdonis wrote:
| _> If you're writing a Python implementation and have a
| question, this may not answer it._
|
| Depends on the question. If your question is, say, "how
| should I implement the built-in types", yes, the language
| reference won't answer that question. But if your
| question is "what does my implementation have to do to
| count as an implementation of the Python language", then
| yes, the language reference _does_ answer that question--
| since the language reference is what _defines_ the Python
| language.
|
| To me that _is_ a "specification" of the Python
| language. It's not a specification of the
| _implementation_ of the language, but why should that be
| required for something to be considered a language
| specification? The whole _point_ is to specify what is
| required to define the language _without_ specifying
| every detail of the implementation.
| pdonis wrote:
| _> if you were coming from Mars and tried to re-implement
| Python from this document alone, you might have to guess
| things and in fact you would probably end up implementing
| quite a different language_
|
| Yes, I know that statement is in the Introduction, but I
| think it's rather ill-considered. If my implementation is
| consistent with the language reference, on what grounds
| would someone claim it was _not_ an implementation of
| Python but a "different language"?
| uxp8u61q wrote:
| On the grounds that we live in the real world, not a
| world where anything written down is a true and complete
| description of reality. "Technically correct" is
| sometimes a synonym of "incorrect".
| pdonis wrote:
| By your criterion, no programming language has a language
| specification at all. I don't think that's a useful way
| to look at things.
|
| I would like to see some indication from the people who
| say that the Python Language Reference I gave a link to
| does _not_ qualify as "language specification", of what
| _would_ qualify. Specific examples would be nice.
| dragonwriter wrote:
| I would say that that is intended as documentation for
| users (including implementors of tools targeting the
| language), not a specification for implementors of
| Python, but I would agree that it is largely usable in
| either role; my point in the GP was that something
| distinct called "the Python Language Specification"
| doesn't exist, but that a (not necessarily complete, from
| a language implementor's perspective) specification
| distinct from the implementation behavior of CPython does
| effectively exist in the documentation.
| pdonis wrote:
| _> not a specification for implementors of Python_
|
| I don't see why not. The "Introduction" section
| specifically mentions different implementations and
| distinguishes implementation details, which can vary by
| implementation, from the language reference itself, which
| defines what every implementation has to meet to be
| considered an implementation of the Python language.
|
| _> my point in the GP was that something distinct called
| "the Python Language Specification" doesn't exist_
|
| And that's the point I'm disputing; AFAIK the language
| reference I linked to _is_ that something distinct, even
| if it isn't called a "Language Specification" but
| instead a "Language Reference". Either way it defines
| what the Python language is.
| dragonwriter wrote:
| So, we rather explicitly agree that Python has nothing
| called a "language specification", but that its published
| first party documentation includes what is, functionally,
| a specification of the language distinct from CPython
| implementation details?
|
| Not sure why there is an argument here.
| BerislavLopac wrote:
| This implies that, if CPython differs from that
| specification in any way, it is not, in fact, Python.
| What have I been using all these years, I wonder?
| pdonis wrote:
| _> we rather explicitly agree that Python has nothing
| called a "language specification"_
|
| No, we don't. I have already said explicitly that I think
| the Python Language Reference _is_ such a thing. (I am
| ignoring quibbles about it being called a "reference"
| instead of a "specification".) If you think it isn't,
| why? What _does_ count as a "language specification" in
| your view? Do you have any specific examples that you can
| contrast with Python?
| friendzis wrote:
| AFAIK, there is no Python language specification, therefore
| the implementation details of CPython IMO _are_ the language.
| pdonis wrote:
| _> AFAIK, there is no Python language specification_
|
| Yes, there is:
|
| https://docs.python.org/3/reference/index.html
|
| This specification does not mention the GIL anywhere,
| which means it is, as the GP said, an implementation
| detail. Other implementations of Python that do _not_
| have the GIL are still "Python" implementations because
| they meet this language specification.
| Kranar wrote:
| That is not a specification. A specification is a
| prescriptive document of how a system is required to work
| whereas a reference is a descriptive document of how a
| system happens to currently work. Python has a reference,
| in fact it has many references that even contradict one
| another in subtle ways, but it does not have a
| specification.
| eesmith wrote:
| What advantages would there be to that document?
|
| I mean, I can see the point if there are multiple
| commercial competitors in the market, as there is with
| C/C++, or if the implementation is proprietary and the
| users want to avoid vendor lock-in.
|
| But the Minimal BASIC of ANSI X3.60-1978 never did catch
| on for any of the BASICs I used in the 1990s, and the
| Full BASIC of ANSI X3.113-1987 was a flop, so clearly
| it's possible to put a lot of time into a standard only
| to have it be irrelevant.
| Kranar wrote:
| I never mentioned any advantage or disadvantage. I am
| only stating a fact about the current state of Python.
|
| Python does not have a specification or a standards
| document, it has a reference that describes how Python
| happens to work and the reference is a great resource for
| people to familiarize themselves with the language but it
| should simply be clear that its purpose is to reflect the
| existing state of the language rather than to specify how
| Python works.
| pdonis wrote:
| Then why are different implementations of Python all
| implementations of Python? What makes them
| implementations of Python rather than different
| languages?
|
| I've already given my answer: they all meet the language
| reference I linked to (yes, the word "reference" appears
| in its title, not "specification"; that's just another
| quibble).
|
| What is your answer?
| Kranar wrote:
| Haha, no they don't all meet the language reference you
| linked to and if you used them you'd know that!
|
| Python is a lot more like LISP than it is C++. There are
| many flavors of Python, from MicroPython to GraalPython
| to Pyston and literally dozens of them. They most
| certainly do not all meet the language reference you
| linked to and they all have quirks here and there.
|
| A language does not need a specification in order to
| exist or to have a name. What matters is that people use
| it and get work done with it, and ultimately if it looks
| like a Python and quacks like a Python, then it's fine to
| call it a Python.
| eesmith wrote:
| Why then did you think it relevant to point out that
| Python does not have a prescriptive document if you
| didn't think it was somehow useful?
|
| Here's another fact: Python doesn't have its own typeface
| either. Yet that fact is hardly germane to the thread.
|
| A prescriptive document is a formal specification. The
| Python Language Reference is a less formal specification.
| It's still a specification.
|
| It is also not a single document, but then again even a
| formal specification may incorporate other specifications
| by reference.
| Kranar wrote:
| Because the original comment was about the
| appropriateness of writing Python code according to the
| reference implementation as opposed to writing Python
| code strictly according to the reference document. Some
| people are arguing that the reference documentation is in
| some sense authoritative and the only resource that
| should be used to define Python's semantics. In
| particular the issue at hand is whether the GIL is a part
| of the semantics of a Python program.
|
| As my position is that the CPython implementation is the
| reference implementation for Python and as the GIL is an
| integral part of that implementation, then the GIL does
| form a part of Python's semantics regardless of whether
| the reference documentation mentions it or not.
|
| The Python reference documentation is not a specification
| and it's not intended to be one.
| pdonis wrote:
| _> That is not a specification._
|
| You're quibbling. By your definition, _no_ programming
| language has a "specification". Every language has
| implementations that do things that aren't explicitly
| described in any document.
| Kranar wrote:
| Plenty of languages have a specification. C++, Java, C#
| all do.
|
| Some languages do not, such as Python and Rust.
|
| The ISO C++ committee even uses quite strong language
| about the C++ standard and makes it a point to
| differentiate between the C++ specification and C++
| references:
|
| >The standard is not intended to teach how to use C++.
| Rather, it is an international treaty - a formal, legal,
| and sometimes mind-numbingly detailed technical document
| intended primarily for people writing C++ compilers and
| standard library implementations.
| pdonis wrote:
| Does every C++ implementation do exactly what is in the
| C++ language specification, no more, no less?
|
| Same question for Java and C#.
|
| If the answer to all of these questions is "no", as I
| believe it is, on what grounds do you claim that these
| languages have a specification, while Python and Rust do
| not?
| Kranar wrote:
| Yes the C++ and Java implementations do exactly what is
| in the language specification. The C++ specification
| explicitly allows languages to do more, but it can not do
| less. I believe Java has a similar clause but I'm not
| sure.
|
| The grounds that I claim is that you can read them, here
| they are:
|
| https://docs.oracle.com/javase/specs/
|
| https://isocpp.org/files/papers/N4860.pdf
|
| Note what the actual C++ standard states, and I quote:
|
| >This document specifies requirements for implementations
| of the C++ programming language. The first such
| requirement is that they implement the language, so this
| document also defines C++. Other requirements and
| relaxations of the first requirement appear at various
| places within this document
| pdonis wrote:
| _> the C++ and Java implementations do exactly what is in
| the language specification_
|
| The number of Google hits I get when I search on "C++
| implementations that do not meet the language
| specification" does not seem to support this claim.
|
| _> Note what the actual C++ standard states_
|
| But your position with respect to Python is that it
| doesn't matter what the "standard" states because the
| actual definition of the language is in its reference
| implementation. Why are you now shifting your ground?
| samus wrote:
| The C and C++ specifications are rather infamous about
| being incomplete, i.e., containing features that might
| lead to _undefined behavior_. On the other hand, Java is
| quite comprehensive. It is one of the earliest instances
| of languages specifying a memory model.
| samus wrote:
| Whether it is a specification or a reference doesn't
| really matter for this discussion. What matters is that
| it is the closest thing to a specification that exists
| for Python. Assuming anything beyond it ties the program
| to internals of a specific implementation. Internals for
| which the vendor might have different intentions about
| keeping stable or change than users expect.
| Kranar wrote:
| It's absolutely pertinent to this discussion, this whole
| discussion is about whether the GIL is part of the
| semantics of a Python program or not. My position is that
| it is because the GIL is a part of the CPython reference
| implementation.
|
| Others object saying that the Python reference document
| is what specifies its semantics, not the reference
| implementation.
|
| My position is that both the CPython reference
| implementation and the reference documentation are valid
| sources that document Python's semantics and that they
| both can and should be used.
| samus wrote:
| Both of the former provide stronger promises about how
| likely changes in the future are. For the former, user
| input and coordination is absolutely required.
|
| Implementation details are documented to allow
| performance optimizations and to give insight into why
| certain things are how they are. Therefore, users can
| reasonably expect that the vendor won't cause performance
| regressions for existing code. However, it is unwise to
| derive semantics from them, even if they are technically
| documented in that way.
|
| One of the biggest disadvantages of relying on
| implementation details is that it makes it way more
| troublesome for the vendor to maintain and improve the
| product.
|
| Anyways, the GIL and the presence of possible concurrency
| bugs are completely orthogonal things as the GIL has
| always only served to prevent corruption of runtime data
| structures, not of user code.
| pdonis wrote:
| _> this whole discussion is about whether the GIL is part
| of the semantics of a Python program or not. My position
| is that it is because the GIL is a part of the CPython
| reference implementation._
|
| And my position is that it is not because there are other
| implementations of Python that do not have it, and which
| everybody agrees are implementations of Python.
|
| Not only that, but the very language reference that I
| linked to explicitly distinguishes CPython implementation
| details from the language itself. So the Python dev team
| does not appear to agree with your position.
| nerdponx wrote:
| There is no language standard. The GIL is a Python language
| feature _because_ it's a CPython feature.
| eesmith wrote:
| The GIL is not a required language feature. Neither
| Jython nor IronPython have a GIL.
|
| What would it mean to have a language standard? A
| publication from ISO or ECMA?
|
| I ask because the Python Language Reference at
| https://docs.python.org/3/reference/index.html seems to
| be a (terse) language standard. Among other things, it
| highlights some of the things which are implementation
| defined, rather than language defined.
| BiteCode_dev wrote:
| Both can be true, especially since the implementation has
| been the same for 20 years.
| usrbinbash wrote:
| No it isn't.
|
| It's an implementation detail, and relying on those for
| functionality is a great way of getting one's code to break.
| o11c wrote:
| It depends on what kind of "atomic".
|
| Assuming sequential consistency is pretty broken. Assuming
| acquire/release atomicity is much more reasonable. Assuming "at
| least relaxed" is outright mandatory.
| blamestross wrote:
| Most people and scripts using the GIL to ensure safety likely
| don't even realize it is necessary. That is the hard part of
| this type of migration. Decades of code written with the
| implicit assumption of the GIL for all situations.
| Gabrys1 wrote:
| I don't think people generally write code to use this or
| another guarantee of the implementation. They write code,
| they run it, it seems to work (on my machine) and that's it.
| Later, the implementation changes in a subtle way (or someone
| tries to run the code in an incompatible version) and the
| code stops working. Nothing to do with "relying on GIL". The
| same may apply to relying on dict key order, timing of
| things, or anything else. That's why we have race conditions,
| Docker images, freezing dependencies, "only works in IE",
| etc.
|
| Writing to spec is very rare in my experience. You usually
| learn the spec once someone tells you your code doesn't work
| on XYZ.
| mikepurvis wrote:
| I dunno about that. There's a lot of Python out there with
| worker threads dumping their output into a shared "results"
| dict on the assumption that those insert operations are atomic.
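|
| A minimal sketch of one way to collect worker output without
| depending on that assumption: each worker returns its value and
| only the calling thread writes to the dict. The `work` function
| and `items` list are hypothetical stand-ins for a real task.

```python
from concurrent.futures import ThreadPoolExecutor


def work(item):
    # Hypothetical stand-in for a real task.
    return item * item


items = [1, 2, 3, 4]

with ThreadPoolExecutor(max_workers=4) as pool:
    # pool.map() yields results in input order and hands them back to the
    # calling thread, which is the only thread that touches `results`.
    results = dict(zip(items, pool.map(work, items)))

print(results)  # {1: 1, 2: 4, 3: 9, 4: 16}
```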
| formerly_proven wrote:
| Relying on implementation behavior, even if it's very visible
| and hasn't changed for a few decades, is morally wrong and
| reprehensible. Only the letter of the standard has any
| bearing on correct reality.
| HDThoreaun wrote:
| > morally wrong
|
| That's a wild one
| smegsicle wrote:
| woe is man who loses sight of god and sees only himself
| cozzyd wrote:
| clearly a user of https://github.com/munificent/vigil
| m463 wrote:
| lol
|
| from FAQ:
|
| Q: But isn't a language that deletes code crazy?
| Kranar wrote:
| Python's semantics do not have a standards document or
| specification. It's based on its reference implementation
| (CPython) and different variations of Python even have
| different semantics (Jython, IronPython, PyPy).
|
| The library is documented, and the syntax is also
| documented, but the semantics themselves are not.
| eesmith wrote:
| It isn't as dire as you suggest.
|
| The language reference at
| https://docs.python.org/3/reference/index.html "describes
| the syntax and "core semantics" of the language.".
|
| There has been a distinction between Python-the-language
| and CPython-the-implementation ever since JPython back in
| the 1990s. For example, reference counting is a CPython
| implementation detail. The reference manual says only:
|
| > Objects are never explicitly destroyed; however, when
| they become unreachable they may be garbage-collected. An
| implementation is allowed to postpone garbage collection
| or omit it altogether -- it is a matter of implementation
| quality how garbage collection is implemented, as long as
| no objects are collected that are still reachable.
|
| (Quoting
| https://docs.python.org/3/reference/datamodel.html )
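|
| A minimal sketch of what not relying on reference counting can
| look like in practice: a `with` block releases the resource at
| block exit, rather than whenever the object happens to be
| collected. The scratch file path is a hypothetical example.

```python
import os
import tempfile

# Hypothetical scratch file, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# The with-statement closes the file when the block exits, independent of
# when (or whether) the garbage collector reclaims the file object.
with open(path, "w") as f:
    f.write("hello")

print(f.closed)  # True
```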
| Kranar wrote:
| Absolutely, but note what your very quote says "describes
| ... the language" rather than "prescribes ... the
| language".
|
| A standards document or specification's purpose is to
| "prescribe" how a system shall work in order to be
| compliant as opposed to a reference which documents how
| systems currently happen to work.
| eesmith wrote:
| Python says those insert operations are atomic.
| https://docs.python.org/3/faq/library.html#what-kinds-of-
| glo...
|
| For example, the following operations are all atomic (L, L1, L2
| are lists, D, D1, D2 are dicts, x, y are objects, i, j are ints):
|
|     L.append(x)
|     L1.extend(L2)
|     x = L[i]
|     x = L.pop()
|     L1[i:j] = L2
|     L.sort()
|     x = y
|     x.field = y
|     D[x] = y
|     D1.update(D2)
|     D.keys()
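|
| The same FAQ entry also lists operations that are not atomic,
| such as `D[x] = D[x] + 1`. A rough way to see the difference,
| assuming CPython and its `dis` module: the increment compiles to
| a separate read and a later store, leaving room for another
| thread to run in between. Exact opcode names vary between CPython
| versions, so the sketch only compares the instruction lists. The
| helper names `store`, `increment` and `opnames` are illustrative.

```python
import dis


def store(D, x, y):
    D[x] = y          # listed as atomic in the FAQ


def increment(D, x):
    D[x] = D[x] + 1   # a read, an add, then a separate store


def opnames(func):
    """Return the bytecode instruction names for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


store_ops = opnames(store)
increment_ops = opnames(increment)

print(store_ops)
print(increment_ops)
```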
| mikepurvis wrote:
| That's helpful that that's explicitly documented as such
| then, rather than just a "lol GIL" implementation detail
| that people have been able to take advantage of since
| forever-- not carrying a documented guarantee like that
| over into GIL-less python's stdlib would be a nightmare.
| lelanthran wrote:
| Python doesn't have a standards document. It has a reference
| implementation that defines the language.
|
| IOW, the python implementation _is_ the standard, and therefore
| any code that relies on an implementation detail in the
| reference implementation is, by definition, correct.
| oblvious-earth wrote:
| > Python doesn't have a standards document. It has a
| reference implementation that defines the language.
|
| That's not quite strictly true, Python does have a documented
| "Language Reference":
| https://docs.python.org/3/reference/index.html
|
| If there is a contradiction between the Language Reference
| and CPython then one, or both, of them needs to be updated
| and it's treated on a case by case basis.
|
| If an alternative Python implementation follows the Language
| Reference but chooses different details outside it, that
| doesn't stop it from being "Python". Of course practically
| speaking most alternative implementations are incentivized to
| closely follow CPython.
| solarkraft wrote:
| And what about PEPs, which describe (in pretty good detail)
| what changes will be made?
| samus wrote:
| They describe large-scale changes that usually also lead
| to changes of implementation details. But their biggest
| disadvantage is that they are not kept up to date when
| things change due to later PEPs or smaller-scale changes
| interfering. After landing, they are of historical value
| only.
| citrin_ru wrote:
| A related xkcd: https://xkcd.com/1172/
|
| For a sufficiently popular project there are often cases where
| users (other projects) rely on existing observed behavior even
| if it is not guaranteed by documentation/specification.
| hyperpape wrote:
| You can say it's broken until you're blue in the face, and
| honestly, you're not wrong, but it's irrelevant.
|
| If major libraries and user code are constantly incorrect, but
| work because of an implementation detail, then removing that
| implementation detail becomes extraordinarily difficult,
| verging on impossible.
|
| It would be like retrofitting your language to distinguish
| valid unicode text from arbitrary byte strings, when you'd
| previously treated them as equivalent.
| coldcode wrote:
| Why does Python use an un-comparable version number scheme? Not
| being a Python programmer, comparing version 3.9 to 3.13 seemed
| bizarre until I caught on.
| BiteCode_dev wrote:
| Sem ver is pretty standard imo.
| Scarblac wrote:
| Like in most versioning schemes, the dot isn't meant to be read
| as decimal point.
| rcxdude wrote:
| comparing version numbers as tuples and not as decimals is a
| pretty standard thing. Linux does it as well, for example. I am
| actually not sure of a project which does something different.
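|
| A minimal sketch of the tuple comparison, using a hypothetical
| helper `parse_version` that only handles plain dotted integers
| (no suffixes such as "rc1"):

```python
def parse_version(text):
    """Split a dotted version such as "3.13" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Compared as tuples, 3.9 comes before 3.13.
print(parse_version("3.9") < parse_version("3.13"))  # True

# Read as decimals, the order flips.
print(float("3.9") < float("3.13"))  # False
```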
| irishsultan wrote:
| TeX and Metafont have version numbers that are approaching pi
| and e respectively, so the sensible way is to read these as
| decimals.
| throwaway468234 wrote:
| Is there any versioning scheme that _is_ comparable? Honest
| question.
| burkaman wrote:
| Some languages just use integer versions, like Java.
| fullstop wrote:
| The early versions did not use integers, for what it's
| worth.
| josefx wrote:
| Java dropped the leading 1 with version 5 since Sun decided
| it would never go for a complete rewrite of the language.
| Imagine the chaos if Python did that and then pulled the 2
| to 3 change on a run of the mill update from 213 to 214.
| jcranmer wrote:
| Not sure that's entirely accurate.
|
| Java 1.2 was branded as "Java 2", so you had the J2SE
| (Java 2, Standard Edition) and related J2ME and J2EE
| (Mobile and Enterprise, respectively) platforms. The
| "Java 2" moniker was dropped in Java 5, which was the
| largest rewrite of the language since, adding generics,
| sane memory model, annotations, etc., all in the same
| language revision.
| jorgemf wrote:
| I think Kotlin is one example. It uses the same idea but it
| uses powers of 10 for incremental fixes and numbers from 1 to
| 9 for hotfixes. That's it for the 3rd number; I do not know
| what will happen when the second number reaches 2 digits. I
| guess they will do something to make it comparable again.
| nxpnsv wrote:
| TeX version number asymptotically approaches pi... each new
| version has another digit, this also makes versions
| comparable. Clearly this is the better way....
| coldtea wrote:
| Comparable as what?
|
| 3.10 and 3.9 are perfectly mechanically comparable (meaning
| one can write a program to deterministically compare them and
| return their relative order), just not with default numeric
| ordering (then again they're not numbers, they are composite
| values that are comprised by numbers) or naive string based
| ordering.
|
| If we wanted trivially comparable with regular numeric
| ordering we could have incremental numbers as versions. 1, 2,
| 3, ...
|
| And if we wanted string ordering (as with usual filesystem
| listing sorting with no extra flags to treat as numbers), we
| could have fixed length padded parts: 00001.00045.
|
| Not sure if the latter is used, but some software does use
| the first.
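|
| A small sketch of the fixed-length padding idea, assuming
| dotted-integer versions. The helper padded_key and the width of 5 are
| illustrative choices only; they mirror the 00001.00045 example above.

```python
def padded_key(version, width=5):
    """Zero-pad each dotted component so plain string ordering works."""
    return ".".join(part.zfill(width) for part in version.split("."))


versions = ["3.10", "3.9", "3.13", "3.2"]

# Naive string sorting compares character by character and misorders these.
naive = sorted(versions)

# Sorting on the padded form gives the intended release order.
by_padded = sorted(versions, key=padded_key)

print(naive)
print(by_padded)
```

| The padding only works while every component fits in the chosen
| width, which is the usual weakness of this scheme.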
| usrbinbash wrote:
| Or I could just accept that neither numeric, nor string
| ordering works for semantic versioning, and write a
| trivially easy piece of code that does the ordering in a
| context where I expect such a scheme.
|
| > If we wanted trivially comparable with regular numeric
| ordering we could have incremental numbers as versions. 1,
| 2, 3, ...
|
| Yes, and then we would be back to the day when the version
| number gave me zero information about what changed, and how
| that affects compatibility with existing code.
|
| There is a reason semver is used across the industry by
| now.
| coldtea wrote:
| > _Or I could just accept that neither numeric, nor
| string ordering works for semantic versioning, and write
| a trivially easy piece of code that does the ordering in
| a context where I expect such a scheme._
|
| Hence the whole "3.10 and 3.9 are perfectly mechanically
| comparable (meaning one can write a program to
| deterministically compare them and return their relative
| order)" part in my comment you perhaps missed.
|
| > _Yes, and then we would be back to the day when the
| version number gave me zero information about what
| changed, and how that affects compatibility with existing
| code._
|
| Not that it's any better now with semver though: in
| practice the semver works 95% of the time, and gives just
| a false sense of comfort at the other 5%. You update, and
| things still break, despite the semver promise.
| jjoonathan wrote:
| I won't argue with bizarre, but it's common in version
| notation.
|
| If it wasn't already common, it would probably become common
| shortly after the first time a big respectable company hit the
| "oops what comes after 9" problem and decided on dot-separated-
| integers rather than significant digits :)
| coldtea wrote:
| Is it that bizarre?
|
| It's basically semantic versioning, that is a hierarchical
| split based on levels of change (major = new release with
| possibly big breaking changes, minor = some incremental
| update version within the same release) and so on.
|
| Who ever thought version numbers are decimals and why? The
| "." appears as a separator on all kinds of strings in
| software (filenames, domains, and IPs probably the most
| common ones).
| Aurornis wrote:
| Semantic Versioning is ubiquitous across the modern software
| industry. It's worth reading about it if you're not familiar:
| https://semver.org/
|
| Never assume that version numbers are decimal values. It's more
| obvious when you see the full version triple (3.13.0 for
| example) that it's not a single number, but the abbreviated
| version numbers can sometimes look like a decimal value. You
| should never compare version numbers as decimals in modern
| software unless you're absolutely sure that's how the project
| is structured.
| luhn wrote:
| I know you didn't say it outright, but since it is implied:
| Python does _not_ use semver, although it does share similar
| formatting. Semver considers bumps to the second integer a
| "minor" release with no breaking changes, whereas in Python
| that indicates a major release that is not backwards
| compatible.
| coldtea wrote:
| You mean uncomparable using numeric or string sorting on those
| strings?
|
| Because otherwise it's perfectly comparable - and quite common.
|
| Most FOSS uses a variant of <major>.<minor>.<patch> (here just
| major (3) + minor (13), minor meaning "release within the same
| major Python version").
| zzzeek wrote:
| these examples of "what one would assume to be atomic" did not
| seem useful to me, they looked like things that are obviously not
| threadsafe.
|
| a more interesting example is something like this:
|     # setup
|     l = []
|
|     # thread A
|     l.extend([1, 2, 3])
|
|     # thread B
|     l.extend([4, 5, 6])
|
| is the resulting list always within the set of [1,2,3,4,5,6] or
| [4,5,6,1,2,3] ? or are the two sets of numbers randomly
| interleaved in the list? or if the GIL is removed does the
| interpreter segfault (I'm pretty sure this latter will not be the
| case for GIL removal but I don't understand the gil remove plan
| very much yet).
|
| Edit: before people jump in and correct how the above is a bad
| idea anyway, it's not like I'd ever do the above and expect
| anything but disaster. This is more of a thought experiment to
| understand what GIL removal is going to do.
| ngoldbaum wrote:
| It'll be guarded by a fine-grained mutex, so it won't seg
| fault. They're using a performant mutex based on webkit's
| wtf::lock.
| OskarS wrote:
| I'm guessing interleaving is probably not possible because the
| `extend` call is passed the full list, and it holds the GIL
| until it's finished. But you'd have to look at the bytecode to
| be sure, I suppose. If there were three append calls instead of
| a single extend, I would assume any interleaving would be
| valid.
|
| > or if the GIL is removed does the interpreter segfault (I'm
| pretty sure this latter will not be the case for GIL removal
| but I don't understand the gil remove plan very much yet).
|
| I haven't looked into the plan in detail either, but presumably
| not, that would be nuts. My understanding is that they're going
| to replace the GIL with locks on the objects themselves (your
| list `l` in this case). This is why in all the tests single-threaded
| performance suffers: you have to take and release more
| locks if you don't have a GIL, and the objects themselves grow
| larger as well.
| Gabrys1 wrote:
| Let's say you're passing 1 million ints to each extend. One
| may start wondering if those operations are then split into
| chunks...
| dfox wrote:
| You can probably trigger the behavior of random interleaving
| even with GIL if you use something other than list or tuple as
| an argument to list.extend(). CPython special cases list and
| tuples and copies the contents directly (list_extend_fast() in
| listobject.c) while it uses iterator for other types
| (list_extend_iter()).
|
| Because the iterator can be pretty much arbitrary Python code
| it seems to be a bad idea to guarantee extend() to be atomic,
| as you don't want to hold the list mutex while calling out into
| user code.
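|
| A sketch of how the iterator path could interleave, assuming a
| CPython build where extend() consumes a generator one item at a time.
| The generator slow_items and the two Event objects are hypothetical
| scaffolding used only to force a thread switch at a known point; real
| code would hit this by accident rather than by design.

```python
import threading

shared = []
first_item_appended = threading.Event()
other_thread_done = threading.Event()


def slow_items():
    """Hypothetical generator that pauses after its first item is consumed."""
    yield 1
    # Execution resumes here only when extend() asks for the next item,
    # which happens after 1 has already been stored in the list.
    first_item_appended.set()
    # Timeouts keep the sketch from hanging if the assumption is wrong.
    other_thread_done.wait(timeout=5)
    yield 2
    yield 3


def thread_a():
    shared.extend(slow_items())  # iterator path: arbitrary Python code runs


def thread_b():
    first_item_appended.wait(timeout=5)
    shared.extend([4, 5, 6])  # list argument: copied in one step
    other_thread_done.set()


threads = [threading.Thread(target=thread_a), threading.Thread(target=thread_b)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared)
```

| On a build with the GIL, the items from thread B should land between
| the first and second items from thread A, since nothing stops another
| thread from running while the generator body is blocked.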
| nerdponx wrote:
| All the more reason to wonder. What's going to be special-
| cased to become atomic and thread-safe, and what won't be?
|
| A surprising amount of the CPython stdlib is just pure Python
| code.
| colesbury wrote:
| You will always get either [1,2,3,4,5,6] or [4,5,6,1,2,3] in
| the upcoming `--disable-gil` builds of CPython 3.13 and the
| nogil forks. Most operations on mutable collections hold a per-
| object lock.
|
| Part of the integration work will be to better document the
| thread-safety guarantees, but there is still a lot of work to
| do before we get there.
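|
| The thought experiment can be tried directly. This is only a sketch:
| the repeat count of 200 is arbitrary, and a passing run shows that no
| interleaving was observed, not that it is impossible.

```python
import threading


def run_once():
    """Extend one shared list from two threads and return the result."""
    shared = []
    a = threading.Thread(target=shared.extend, args=([1, 2, 3],))
    b = threading.Thread(target=shared.extend, args=([4, 5, 6],))
    a.start()
    b.start()
    a.join()
    b.join()
    return shared


allowed = ([1, 2, 3, 4, 5, 6], [4, 5, 6, 1, 2, 3])

results = [run_once() for _ in range(200)]
unexpected = [r for r in results if r not in allowed]

print(len(results), "runs,", len(unexpected), "outside the two expected orders")
```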
| HDThoreaun wrote:
| Every mutable collection needs to lock before write? Sounds
| slow af
| quietbritishjim wrote:
| Locks are generally very cheap if they're not contended. Of
| course it's all relative though!
| gnulinux wrote:
| They're cheap but they're not free. If you do it at
| literally every single rw operation at runtime, it's
| going to add up.
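|
| A rough way to see the cost of an uncontended lock, using only one
| thread. This is a micro-benchmark sketch, so the numbers depend
| heavily on the machine and the Python version, and it says nothing
| about how the nogil builds implement their per-object locks.

```python
import threading
import timeit

lock = threading.Lock()
counter = {"plain": 0, "locked": 0}


def plain_increment():
    counter["plain"] += 1


def locked_increment():
    # The lock is never contended here: only one thread ever takes it.
    with lock:
        counter["locked"] += 1


N = 100_000
plain_seconds = timeit.timeit(plain_increment, number=N)
locked_seconds = timeit.timeit(locked_increment, number=N)

print(f"plain:  {plain_seconds:.4f}s")
print(f"locked: {locked_seconds:.4f}s")
```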
| fbdab103 wrote:
| Last I looked, the nogil implementation was some 5-10%
| slower than the current, owing to all of the extra locking.
| continuitylimit wrote:
| The brief blurb about how GIL came to be, in light of Python's
| success as a language and a tool, makes me question my s/e belief
| system. Things like this are like when good things happen to bad
| people and bad things happen to good people. It makes you
| question the meaning of it all.
|
| Is there no great architect in the sky? Is there no software god
| after all, looking down, punishing sloppy engineers and granting
| blessings to thoughtful engineers? How else to explain this
| injustice of sloppy engineering eating the world (to say nothing
| of JavaScript)?
| nerdponx wrote:
| Of course there isn't. Software is developed by people.
|
| But consider that maybe it wasn't a _sloppy_ design at all. For
| decades, the explicitly stated philosophy of the CPython
| development team was to prioritize simplicity of implementation
| over performance. I don't think anyone ever envisioned Python
| becoming the wild success that it is today.
|
| That is, the GIL wasn't sloppy at all. It was a perfectly
| reasonable and pragmatic decision that made sense given the
| tradeoffs of the time.
| mrkeen wrote:
| It's network effects, all the way down.
|
| Watch people's commentary here when talking about the good and
| bad of various technologies.
|
| There's no 'bad' tech, just tech with lots of users. Or the
| tech is good because you can hire for it. Because it has lots
| of users. Or it has a rich ecosystem, because it has lots of
| users.
|
| Read the advice given by those who tell you to get to market
| first instead of polishing the tech.
|
| We're the users of that tech which went to market unpolished
| and gathered all the users.
| jerf wrote:
| While there is some truth in what you are saying, you have a
| common misunderstanding of the situation. Part of the reason
| the GIL has proved so difficult to remove is that it is
| actually a _good_ solution. In fact, there have been multiple
| largely successful attempts to remove it over time over the
| entire range of aggressiveness from CPython changes to writing
| an entire JIT stack (PyPy), but it has never gone in to CPython
| because it would either ruin all existing 3rd party libraries
| that used C (which is a _lot_ of them), it would diminish
| performance for an already-slow language, or as in the case of
| PyPy, it isn't even a "patch" so much as a new project.
|
| Especially when you consider this over the whole of Python's
| lifespan, which very, very firmly includes many years in which
| multicore was simply not a thing, followed by some years where
| it was a thing but it didn't work very well anyhow at the OS
| level so who cares what Python does with it.
|
| It is not as if, back when it was put in, the choice was either
| to use a GIL or to correctly write a multithreaded interpreter
| and fix all the 3rd party libraries at the time for exactly the
| same cost. The latter option was orders of magnitude more
| expensive, and harder then than it is now, with better tooling
| and more collective developer experience. The choice of not
| using a GIL, rather than being some sort of nirvana that we
| could just be in if they hadn't chosen poorly 15 years ago,
| could well have killed the language. We don't really know. I do
| know that a programming language that just sort of breaks every
| so often when you use threads and there's absolutely nothing
| you can do about it from the Python level is not a very
| appealing proposition and it's hard to know how badly this
| could have hurt the language.
|
| And Python of all the languages now has a well-justified fear
| of breaking everything and demanding that everyone upgrade.
|
| So, to put this in a nutshell, if you believe the GIL is simply
| bad and should never have been an option, you have a very
| immature understanding of software engineering, especially in
| the light of being the leader of a very very large community
| who will be impacted by your decisions. It may not have been
| the only choice, but it was a good one, and regardless of what
| decision was made 15 years ago there would be _some_
| consequence to deal with now. No programming language community
| can be expected to get everything right in 2003 that the people
| of 2023 will want any more than we can expect any current
| programming language to be the perfect programming language of
| 2043 right this second.
| continuitylimit wrote:
| Thanks jerf for your thoughtful reply. tbh I was trolling hn
| for the very first time in 14 years and based on the response
| I have a natural talent for it. Who knew. The multi-core
| point is well taken, as it maps to my own professional
| experience in that transitional era as well.
| bruce343434 wrote:
| How is this sloppy engineering? And no of course there is no
| grand architect, orchestrating everything neatly from a central
| place of command, instead everything is an emergent process
| including the decisions made by the python team.
|
| What are you even complaining about? What is your point?
| zkldi wrote:
| Tools win not because they are "better" in some platonic ideal
| of a programming language but because they are more practical
| for solving the problems people have.
|
| Python, JS, C, Bash aren't even particularly great at the
| problems they solve, but they succeed mostly on inertia (it's
| where all the libraries are, it's what people know) and
| occupying developer mindshare.
|
| They are full of _obvious_ design mistakes; things that not
| even the creators of the language (nor any of its users) can
| defend, yet those languages are used infinitely more than
| languages that eschew those mistakes. Why? Because they solve
| problems people have.
|
| If this sounds terrible to you, the good news is that there is
| a tonne of low-hanging fruit in the programming language design
| space. Consider that most developers know nothing of sum-types,
| or eschew the idea of typing entirely. Consider that most
| developers see no fundamental problem behind having to venv or
| dockerise software lest it bitrot over a month. Consider that
| programmers actually use bash.
|
| These terrible, obviously broken tools are somehow the most
| pragmatic things we actually have. The fruit is low-hanging;
| the door is wide open, if you wish to grab it.
| amethyst wrote:
| > Python, JS, C, Bash aren't even particularly great at the
| problems they solve
|
| I would argue the "problem" that Python really solves is the
| amount of engineering effort required to read and write code
| for common software use cases. I.e., its purpose is to help
| developers write better code faster and easier than other
| languages, while execution speed has usually had a lower
| priority. In that framing, it's _great_ at solving the
| problem of development time and old code being hard to
| maintain, and that's why so many engineers like myself love
| it.
| gnulinux wrote:
| > makes me question my s/e belief system
|
| That's a good thing! Models are fit on data, data doesn't fit
| to models. This is like when people learn elementary music
| theory then go analyze some actual composer and it doesn't fit
| the model at all. Well kiddo the problem is that "music theory"
| is simply a model, a model people created after training some
| very limited set of musical data, everything outside of that data
| will probably behave different and you'll have to change your
| models.
|
| If your software engineering model predicts Python would be
| unsuccessful, but there is evidence that Python is successful,
| this simply means your software engineering model is
| unpredictive and therefore must be revised.
| yoyohello13 wrote:
| It seems like in every python discussion I hear people complain
| about the GIL.
|
| I'm happy people are working on removing the GIL, but as a
| professional python dev for about 5 years now I have literally
| never had a problem where the GIL was a limiter. Although I just
| make web apps so maybe I'm not the target audience.
| BorgHunter wrote:
| Conversely, I've worked on backend, data processing-type
| applications for most of my career, much of it in Java but some
| (especially recently) in Python, and the GIL is a huge limiting
| factor for writing efficient, readable Python code. I've had to
| write very annoying Python code using the multiprocessing
| library to get around the GIL, and ultimately it works, but
| it's ugly and clunky and overall just a pain. And remember,
| I've written a lot of Java code, so I have a high tolerance for
| pain! But the JVM's concurrency abstractions are actually kind
| of a joy to use, even if Java the language isn't. Python is the
| opposite, so if they can shed the GIL and make multithreading
| viable in Python without forking new processes, that would be a
| huge win.
| Gabrys1 wrote:
| I think multiprocessing is quite sensible in Python
| (comparing to async for an example)
| toxik wrote:
| multiprocessing is almost never a good idea in my
| experience.
| svara wrote:
| It's just about what kind of code you write.
|
| If you write a lot of code that parallelizes over data you will
| hurt all the time because of the GIL.
|
| If you have worker processes that do something on the CPU, and
| the results need to be collected and processed further in some
| other process, you now need to pickle the data to copy it
| around. That can get really slow.
|
| I understand a lot of python users don't do that kind of thing,
| but it's a real problem. I'm happy that the python community
| seems to be slowly beginning to take this seriously after
| decades of just claiming the GIL isn't a real issue.
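|
| A sketch of the copy cost described above. multiprocessing uses
| pickle to move objects between processes; here the round trip is done
| by hand in a single process so the copy is visible. The shape of rows
| is invented for illustration.

```python
import pickle

# Invented sample data standing in for results produced by a worker.
rows = [{"id": i, "values": list(range(10))} for i in range(1000)]

# Roughly what happens when an object is sent to another process:
# it is serialized to bytes on one side ...
payload = pickle.dumps(rows)

# ... and rebuilt as a brand new object on the other side.
received = pickle.loads(payload)

print(len(payload), "bytes serialized")
print(received == rows, received is rows)
```

| The received data is equal to the original but shares nothing with
| it, so every transfer pays for serialization plus a full copy.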
| quietbritishjim wrote:
| > If you write a lot of code that parallelizes over data [and
| that parallelization can't happen by vectorizing your
| operations in a C extension module that releases the GIL,
| like numpy] you will hurt all the time because of the GIL.
|
| I've added some text for clarity
| klyrs wrote:
| In my rapidly-approaching-20 years of python development, I
| have butted heads with the GIL countless times. Are your web
| apps single-threaded? Have you looked at scaling your service?
| Or do you use a web framework that handles that for you behind
| the scenes?
| bvirb wrote:
| I've had the same experience mostly writing web apps in Ruby,
| which locks similarly to Python. Multi-threading is always
| needed to saturate IO, network, etc, never CPU, so GIL isn't an
| issue. I always wondered what people were doing that ran into
| issues.
|
| With Python being the language of choice for ML workloads I
| guess it's more common to have the CPU be a bottleneck. It
| seems cool they're making an option to turn it off for those
| use cases.
|
| It seems like Python could maintain a GIL compatibility option
| to preserve the current/old behavior for legacy code.
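|
| A sketch of the I/O-bound case, where threads help even with a GIL.
| time.sleep stands in for a blocking network or disk call, and fake_io
| is a hypothetical placeholder for real request handling.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id, delay=0.1):
    """Pretend to wait on the network; sleeping releases the GIL."""
    time.sleep(delay)
    return task_id


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fake_io, range(8)))
elapsed = time.perf_counter() - start

# Eight 0.1s waits overlap instead of adding up to 0.8s.
print(results, f"{elapsed:.2f}s")
```

| Replace the sleep with a CPU-heavy pure-Python loop and the overlap
| largely disappears on a GIL build, which is the case the ML and data
| processing comments are describing.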
| incomingpain wrote:
| How many other people knew nothing about the GIL, got to using
| threading, and inevitably found major performance issues that were
| 100% caused by the GIL?
|
| Then moved to multiprocessing and worked around the lack of shared
| memory with managers.
|
| When/if the GIL goes away, good riddance.
| phkahler wrote:
| >> I have at least one very concrete example of code that someone
| assumed to be atomic:
|
| request_id = self._next_id
|
| self._next_id += 1
|
| To think that is thread safe is just naive. Once you understand
| the potential problem you can look at the code and ask "why
| shouldn't this be unsafe?" Since there is nothing explicitly
| preventing the problem. After reading TFA (lazily I admit) I
| still don't know why that code is thread-safe with the GIL.
|
| Python is fun and often forgiving, but a bunch of people who got
| lucky (because they were never taught about the hazards) are
| going to learn some new stuff with no-gil. I think it's a long
| overdue change and worth the (single thread) performance hit and
| bug surfacing phase.
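|
| One way to make the protection explicit is a lock around the
| read-and-increment. This is a sketch only: the class IdAllocator and
| the worker setup are hypothetical and are not the code from the
| article.

```python
import threading


class IdAllocator:
    """Hands out unique, increasing ids to any number of threads."""

    def __init__(self):
        self._next_id = 0
        self._lock = threading.Lock()

    def allocate(self):
        # The read and the increment happen as one unit under the lock.
        with self._lock:
            request_id = self._next_id
            self._next_id += 1
        return request_id


allocator = IdAllocator()
seen = []
seen_lock = threading.Lock()


def worker(count):
    local = [allocator.allocate() for _ in range(count)]
    with seen_lock:
        seen.extend(local)


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(seen), len(set(seen)))
```

| With the lock in place the result does not depend on where the
| interpreter chooses to switch threads, with or without a GIL.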
| toxik wrote:
| It isn't threadsafe in current Python.
| taway-20230404 wrote:
| That example is not thread-safe _behavior-wise_ under the GIL.
| The block is composed of several individual operations, and the
| interpreter can preempt a thread between any one of them.
|
| On the other hand, it is safe from a memory-access perspective;
| the read from `self._next_id` will never dereference a
| partially-mutated invalid pointer or read a partially-mutated
| value.
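|
| The individual operations can be listed with the dis module. A
| sketch, with a hypothetical Client class wrapping the two lines from
| the article; the exact opcode names vary between CPython versions.

```python
import dis


class Client:
    def __init__(self):
        self._next_id = 0

    def allocate(self):
        request_id = self._next_id
        self._next_id += 1
        return request_id


# Collect the bytecode instruction names for the two-line body.
opnames = [ins.opname for ins in dis.get_instructions(Client.allocate)]

print(len(opnames), "instructions")
print(opnames)
```

| The attribute is loaded and stored in separate instructions, so
| another thread can run between the load and the store and hand out
| the same id twice.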
| Joel_Mckay wrote:
| Python's threading and GC model has always been problematic on
| multi-core architectures.
|
| It shouldn't be anywhere near time-critical and/or low-latency
| use-cases.
|
| Python is functionally the modern BASIC, and included many of the
| same design trade-offs for usability.
|
| Don't get mad, it is true... and we know it. =)
| Asooka wrote:
| Yet more evidence that removing the GIL really ought to bump the
| major version to Python 4.
___________________________________________________________________
(page generated 2023-11-17 23:02 UTC)