[HN Gopher] Python stands to lose its GIL, and gain a lot of speed
       ___________________________________________________________________
        
       Python stands to lose its GIL, and gain a lot of speed
        
       Author : bobajeff
       Score  : 161 points
       Date   : 2021-10-17 13:41 UTC (9 hours ago)
        
 (HTM) web link (www.infoworld.com)
 (TXT) w3m dump (www.infoworld.com)
        
       | ahmedfromtunis wrote:
        | In my opinion, there's a time in a language's life when it should
        | slow down the pace of "innovation". Code bases are complex
        | things, and updating and upgrading them constantly just to keep up
        | with the language may be counterproductive.
       | 
       | Python is there now, if you ask me. It should slow down and focus
       | more on "maintenance" stuff with little to no impact on its
       | interface. And maybe work on big projects like multithreading or
        | stronger typing in the background, shipping them when they're fully
        | ready.
        
         | pmlnr wrote:
         | > stronger typing
         | 
          | Please, don't. There are wonderful strongly typed languages out
          | there, so if one wants or needs a strongly typed language, use
          | that, and not Python.
        
           | smitty1e wrote:
           | The gradual typing available now seems suitable. I have
           | written plenty of code in typed contexts and plenty without.
           | Python's "consenting adults" approach seems a win.
           | 
           | Perhaps, without the GIL and with typing information
            | included, additional performance gains will be on offer.
           | 
           | But the "have it your way" nature of Python is a bigger win
           | than either end of the data typing spectrum.
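The "consenting adults" gradual typing described above can be sketched in a few lines: annotations are optional hints for a static checker such as mypy and are not enforced at runtime (the function names here are illustrative):

```python
def greet(name: str) -> str:
    # Annotated: a static checker such as mypy can verify call sites.
    return f"Hello, {name}"

def double(x):
    # Unannotated: the checker treats x as Any, so legacy code still passes.
    return x * 2

# Annotations are not enforced at runtime, so even a call a checker
# would flag still executes:
print(greet(41))   # runs, though mypy would report an int-vs-str error
print(double(3))
```

Typed and untyped code coexist in the same module, which is the "have it your way" point being made.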
        
             | pmlnr wrote:
             | > "have it your way" nature of Python
             | 
             | TMTOWTDI - There's more than one way to do it - is Perl's
             | motto :)
             | 
             | Zen of Python suggests there's just one correct way.
        
         | JacobHenner wrote:
         | Static typing, not strong typing
        
       | gigatexal wrote:
       | These are the kinds of moves Python needs to stay relevant with
       | amazing languages like Go out there that are just so simple yet
       | powerful.
        
       | sys_64738 wrote:
        | Knowledge of the GIL discourages a lot of Python developers from
        | even trying to write multithreaded scripts. Doing away with it will
        | make life harder for many folks, IMO. Have you tried explaining
        | multithreading to some Python scripters?
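For readers new to the topic: the GIL means CPython threads interleave rather than run Python bytecode in parallel, so CPU-bound threading is correct but gives no speedup; removing the GIL is what would change that. A minimal sketch:

```python
import threading

def count_up(n, results, i):
    # A CPU-bound loop. Under the GIL, only one thread executes Python
    # bytecode at a time, so two of these threads are correct but not
    # faster than running the loops sequentially.
    total = 0
    for _ in range(n):
        total += 1
    results[i] = total

results = [0, 0]
threads = [threading.Thread(target=count_up, args=(100_000, results, i))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # [100000, 100000] -- correct, just not parallel
```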
        
       | vram22 wrote:
       | pinch.take(salt)
       | 
       | Friend: I saw this interesting tech article ...
       | 
       | Me: Site?
       | 
       | Friend: mumble corp IT site mumble
       | 
       | Me: Bye
        
       | di wrote:
       | Previous discussion:
       | https://news.ycombinator.com/item?id=28880782
        
         | Animats wrote:
         | Yes. This is the discussion we had yesterday, but with more
         | hype.
        
       | np_tedious wrote:
       | > These changes are major enough that a fair number of existing
       | Python libraries that work directly with Python's internals
       | (e.g., Cython) would need to be rewritten. But the cadence of
       | Python's release schedule just means such breaking changes would
       | need to be made in a major point release instead of a minor one.
       | 
       | Maybe time to rethink?
       | https://www.techrepublic.com/article/programming-languages-w...
       | 
       | If this is as promising as it sounds, it seems Python 4 now has
       | its "thing" and is on the horizon. Or at least may become a
        | serious thing to talk about.
        
         | [deleted]
        
         | gwking wrote:
         | I began using python during the python3.0 betas, and I watched
         | the 2 vs 3 saga from the (unusual?) perspective of a v3
         | hobbyist with no back-compat requirements.
         | 
         | What struck me as most significant was the opportunistic
         | breakage of things not related to the unicode transition. In
         | the many years it took to win people over to v3, they could
         | have marched over all the breaking changes a year at a time.
         | Given that side-by-side installs of python3.x point versions
         | are very functional, with or without venvs, this would have
         | been much more palatable. Perhaps harder than it sounds though.
         | 
         | I attempted a couple of 2to3 translations of open source
         | libraries over the years, with varying degrees of success.
         | Every time I found that most of the changes were easy, but
         | debugging the broken bits was hard due to the sheer volume of
         | source changes. If instead I could have done conversions where
         | there was only a single major semantic change at a time, it
         | would be so much easier to figure out what was going wrong at
         | any given step. Furthermore, I imagine that a single-breaking-
         | change mentality would lead to better documentation on how to
         | transition for each version.
         | 
         | For this reason, I have become rather suspicious of yearly
         | release schedules. Swift is even more frustrating: the version
         | changes are really just dictated by Apple's yearly PR calendar.
         | Some big things get rushed out for WWDC before they are ready,
         | and smaller fixes can get held back until the next year. I
         | would much rather that the language teams just prioritize one
         | thing at a time, release it when it is ready, and foster a
         | community where staying up-to-date on the latest version is
         | easy and desirable (a more complicated story for Apple than for
         | Python I think, due to ABI, OS version, etc).
         | 
         | From past discussions on HN I've gathered that there is such a
         | thing as release fatigue, where developers get irritated when
         | libraries release breaking changes too often. Nevertheless I
         | often wonder if languages and libraries could improve faster by
         | making more breaking changes, one at a time, with robust side-
         | by-side installs to facilitate testing across versions. I wish
         | side-by-side library versions were possible in Python, just to
         | facilitate regression testing.
         | 
         | Bringing this all back to the post, I sincerely hope that if
         | Python 4 is a breaking change to the GIL, that it will be only
         | that.
         | 
         | I'm curious what others think about all this. Thoughts?
        
           | kevin_thibedeau wrote:
           | > If instead I could have done conversions where there was
           | only a single major semantic change at a time,
           | 
           | That was the point of the "from __future__" imports. You
           | could get most of the way toward Python 3 so that 2to3 would
           | be easier to work with and the new semantics could be
           | gradually baked into the code prior to migration.
           | 
           | Python 3 had 25 years of cruft to clean up. They won't have
           | to do that again.
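For reference, the `from __future__` mechanism looked like this: each import switched on one Python 3 behavior in a Python 2 module, so semantics could be adopted one feature at a time (under Python 3 these imports are harmless no-ops):

```python
from __future__ import print_function  # print becomes a function
from __future__ import division        # / becomes true division

print(3 / 2)   # 1.5 (plain Python 2 would print 1)
print(7 // 2)  # 3 -- floor division remains available explicitly
```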
        
           | fractalb wrote:
           | If every release has a single breaking change, then that
           | language is said to be unstable/not-production ready. IMHO,
           | that's not at all an acceptable way of doing point releases.
           | People will just be scared of new releases. No one will adopt
           | a new language version as soon as it releases. Java never
           | breaks backwards compatibility and still there are people
           | running Java 8. Imagine what would happen if every point
           | release carries breaking changes. It makes you feel that the
           | language is not mature, the library ecosystem broken, since
           | you'll have to keep track of version compatibility for each
           | library that you use. It's a nightmare for both library
            | developers and end-users. Few people would like to use such a
            | language.
        
             | nuerow wrote:
             | > _If every release has a single breaking change, then that
             | language is said to be unstable /not-production ready.
             | IMHO, that's not at all an acceptable way of doing point
             | releases._
             | 
             | This.
             | 
             | It makes absolutely no sense to claim that having to deal
             | with a single non-backwards compatible release is somehow
             | worse than having to deal with a sequence of non-backwards
             | compatible releases.
             | 
              | Even though the migration from Python 2 to Python 3 faced
              | some resistance, if anything the decision was totally
              | vindicated.
        
             | ikerdanzel wrote:
              | Python has now made it to #1 over Java. Python breaks
              | things frequently: a lot of libraries need to be "tweaked"
              | to keep working even across incremental changes within 3.x,
              | let alone the big rift from 2 to 3. Although logically few
              | would like a language that breaks, the current market for
              | Python adoption bucks this trend.
        
           | xvland wrote:
           | Idealists and free developers (who have created the majority
           | of the Python interpreter) agree with you.
           | 
           | Corporate developers, who have taken over Python and other
           | people's work, like unnecessary changes, because they get
           | many billable hours of seemingly complex work that can be
           | done on autopilot.
           | 
           | Corporations might even take over more C extensions whose
           | developers are no longer willing to put up with the churn and
           | who have moved to C++ or Java.
           | 
           | In the long run, this is bad for Python. But many developers
           | want to milk the snake until their retirement and don't care
           | what happens afterwards.
        
             | simonw wrote:
             | "Corporate developers, who have taken over Python and other
             | people's work, like unnecessary changes, because they get
             | many billable hours of seemingly complex work that can be
             | done on autopilot."
             | 
             | In my 20+ year career I have never worked with a programmer
             | that matches this description.
             | 
             | Maybe I've got lucky.
        
           | isoprophlex wrote:
            | Oh boy, angry rant incoming. I'll say something petulant and
           | overly dramatic, but I don't like the direction in which
           | python is going, and I'm glad there's finally some news about
           | focus on actual innovation instead of tacking on syntactic
           | cruft.
           | 
           | I want the python that Guido promised me, with 2021
           | performance. I don't want some abhorrent committee-designed
           | piece of middle-of-the-road shitware glue language that I
           | must use because everyone uses it.
           | 
            | I want a language that doesn't spin its single-threaded
           | wheel in a sea of CPU cores, and I want a language that has
           | one obvious way of doing things without needing to grok and
           | parse dumb """clever""" hacks that will only be abused by
            | midlevel programmers to show off how they saved typing a few
           | lines of additional code.
           | 
            | To me, speed + simplicity = ergonomics = joy. I want a new
            | Python 4 to focus exclusively and intensely on performance
            | improvements and ergonomics.
        
             | DangitBobby wrote:
             | What recent changes to the language do you specifically
             | dislike?
        
               | dataflow wrote:
               | Not the parent but := assignment expressions and match
               | expressions are abominations.
        
               | DangitBobby wrote:
               | I'll point out that the walrus operator was actually
               | accepted while Guido was still BDFL (and the vitriol
               | surrounding the decision to include it led directly to
               | him stepping down from the position [1]), so even
               | accepting the fact that it's a poor addition to the
               | language, does not provide support for the statement that
                | "design by committee" has led to poor language design
               | decisions.
               | 
               | 1. https://pythonsimplified.com/the-most-controversial-
               | python-w...
        
               | dataflow wrote:
               | You'll want to post that as a reply to the parent you
               | originally replied to, since I'm not the one who said
               | anything about design-by-committee.
        
               | DangitBobby wrote:
               | They gave no specific criticisms. This thread was born of
               | a request for specific criticisms. When that happens, I
               | try to operate as though the assumptions laid out in the
               | parents hold for the children. I think this makes sense
               | to do, especially when you appeared to step in as a proxy
               | expanding on the parent's opinion. Even if that wasn't
               | your intention, this is a public thread, and the most
               | relevant place to post things as a response to a
               | sentiment in a thread may not be directly to a person who
               | holds that exact sentiment. If you don't take issue with
               | "design by committee" then you need not be concerned. I
               | don't think you think that, and I think no less of you
               | regardless.
        
               | dataflow wrote:
                | I meant it more so that the person whom the reply is
                | actually relevant to might not see it otherwise, but
                | whatever, it's fine with me.
        
               | asah wrote:
               | Disagree: the recent changes are things I put to work
               | immediately and in a large fraction of the code. They're
               | not niche and "should have" been added years ago. If
               | anything, I'm thrilled with the work of the "committee,"
               | whose judgments are better than the result of any
               | individual. Postgres is the same.
               | 
               | Gone are the days when you invest in a platform like
               | python, and they make crazy decisions that kill the
               | platform's future (e.g. perl5). Ignore small syntax stuff
               | like := and focus on the big stuff.
        
               | dataflow wrote:
               | > Disagree: the recent changes are things I put to work
               | immediately and in a large fraction of the code.
               | 
               | That says nothing about their quality. It just says you
               | like them. If you gave me unhealthy food I'd probably eat
               | it immediately too. Doesn't mean I think it's good for
               | me.
               | 
               | > Ignore small syntax stuff like := and focus on the big
               | stuff.
               | 
               | They're not "small" when you immediately start using them
               | in a "large fraction of your code". And a simple syntax
               | that's easy to understand is practically Python's raison
               | d'etre. They added constructs with some pretty darn
               | unexpected meanings into what was supposed to be an
               | accessible language, and you want people to ignore them?
               | I would ignore them in a language like C++ (heck, I would
               | ignore syntax complications in C++ to a large degree),
                | but ignoring features that make _Python_ harder to read?
                | To me that's like putting performance-killing features
                | in C++ and asking people to ignore them. It's not that I
                | _can't_ ignore them--it's that that's not the point.
        
               | DangitBobby wrote:
               | I simply do not understand how the walrus operator is
               | harder to read. Maybe an example?
                | 
                |     my_match = regex.match(foo)
                |     if my_match:
                |         return my_match.groups()
                |     # continues with the now useless my_match in scope
                | 
                | Versus
                | 
                |     if my_match := regex.match(foo):
                |         return my_match.groups()
                |     # continues without useless my_match in scope
               | 
               | How is the second one less readable? Have you ever heard
               | of a real world example of a beginner or literally anyone
               | ever actually expressing confusion over this?
        
               | dataflow wrote:
               | The problem isn't that simple use case. Although even in
               | that case, they already had '=' as an assignment
               | operator, and they could've easily kept it like the
               | majority of other languages do instead of introducing an
               | inconsistency.
               | 
               | The more major problem with the walrus operator is more
               | complicated expressions they made legal with it. Like,
               | could you explain to me why making _these_ legal was a
                | _good_ thing?
                | 
                |     def foo():
                |         return ...
                | 
                |     def bar():
                |         yield ...
                | 
                |     while foo() or (w := bar()) < 10:
                |         # w is in-scope here, but possibly nonexistent!
                |         # Even in C++ it would at least *exist*!
                |         print(w)
                | 
                |     # The variable is still in-scope here, and still *nonexistent*
                |     # Ditto as above, but even worse outside the loop
                |     print(w := w + 1)
               | 
               | If they just wanted your use case, they could've made
               | only expressions of the form 'if var := val' legal, and
               | _maybe_ the same with  'while', not full-blown
               | assignments in arbitrary expressions, which they had
               | (very wisely) prohibited for decades for the sake of
               | readability. And they would've scoped the variable to the
               | 'if', not made it accessible after the conditional. But
               | nope, they went ahead and just did what '=' does in any
               | language, and to add insult to injury, they didn't even
               | keep the existing syntax when it has exactly the same
               | meaning. And it's not like they even added += and -= and
               | all those along with it (or +:= and -:= because
               | apparently that's their taste) to make it more useful in
               | that direction, if they really felt in-expression
               | assignments were useful, so it's not like you get those
               | benefits either.
        
               | DangitBobby wrote:
               | In your example, if you leave out the parentheses around
               | w := bar(), you get "SyntaxError: cannot use assignment
               | expressions with operator" which makes me think it's a
               | bug in the interpreter and not intentionally designed to
               | allow it.
               | 
               | I am baffled to learn that it's kept in scope outside of
               | the statement it's assigned, and I agree it would have a
               | negative impact on readability if used outside of the if
               | statement.
        
               | dataflow wrote:
               | > if you leave out the parentheses around w := bar(), you
               | get "SyntaxError: cannot use assignment expressions with
               | operator" which makes me think it's a bug in the
               | interpreter and not intentionally designed to allow it.
               | 
               | No, I'm pretty sure that's intentional. You want the
               | left-hand side of an assignment to be crystal clear,
               | which "foo() or w := bar()" is not. It looks like it's
               | assigning to (foo() or w).
        
               | DangitBobby wrote:
                | To be clear:
                | 
                |     def thing(): return True
                | 
                |     if thing() or w := "ok":  # SyntaxError: cannot use
                |                               # assignment expressions with operator
                |         pass
                |     print(w)
                | 
                |     . . .
                | 
                |     if thing() or (w := "ok"):
                |         pass
                |     print(w)  # NameError: name 'w' is not defined
               | 
               | The first error makes me think your concern (that w is
               | conditionally undefined) was anticipated and supposed to
               | be guarded against with the SyntaxError. I believe the
               | fact you can bypass it with parentheses is a bug and not
               | an intentional design decision.
        
               | dataflow wrote:
               | Oh I see, you're looking at it from that angle. But no,
               | it's intentional. Check out PEP 572 [1]:
               | 
               | > _The motivation for this special case is twofold.
               | First, it allows us to conveniently capture a "witness"
               | for an any() expression, or a counterexample for all(),
                | for example:_
                | 
                |     if any((comment := line).startswith('#') for line in lines):
                |         print("First comment:", comment)
                |     else:
                |         print("There are no comments")
               | 
               | I have a hard time believing even the authors (let alone
               | you) could tell me with a straight face that that's easy
               | to read. If they really believe that, I... have questions
               | about their experiences.
               | 
               | The beauty of Python...
               | 
               | [1] https://www.python.org/dev/peps/pep-0572/
        
               | DangitBobby wrote:
               | Your new example makes me wonder: if I can intentionally
               | conditionally bring variables into existence with the
               | walrus operator, what's the motivation behind the
               | SyntaxError in my statement above? I maintain my belief
               | that the real issue here is, readability aside, if blocks
               | do not implement a new scope, which has always been a
               | problem in the language. The walrus operator just gives
               | you new ways to trip over that problem.
               | 
               | From the PEP:
               | 
               | > An assignment expression does not introduce a new
               | scope. In most cases the scope in which the target will
               | be bound is self-explanatory: it is the current scope. If
               | this scope contains a nonlocal or global declaration for
               | the target, the assignment expression honors that. A
               | lambda (being an explicit, if anonymous, function
               | definition) counts as a scope for this purpose.
               | 
               | I find this particularly strange and inconsistent:
                | 
                |     lines = ["1"]
                | 
                |     [(comment := line).startswith('#') for line in lines]
                |     print(comment)  # 1
                | 
                |     [x for x in range(3)]
                |     print(x)  # NameError: name 'x' is not defined
        
               | dataflow wrote:
               | > what's the motivation behind the SyntaxError in my
               | statement above?
               | 
               | I'm pretty sure it's what I explained here:
               | https://news.ycombinator.com/item?id=28899404
        
               | DangitBobby wrote:
               | I did not understand what you meant.
        
               | dataflow wrote:
               | I'm saying it's the same reason why (x + y = z) should be
               | illegal even if (x + (y = z)) is legal in any language.
               | It's not specific to Python by any means. The target of
               | an assignment needs to be obvious and not confusing. You
               | don't want x + y to look like it's being assigned to.
        
               | DangitBobby wrote:
               | I see. It has low precedence in the operator hierarchy
                | [1] so
                | 
                |     False or w := 1
                | 
                | Is grouped like so:
                | 
                |     (False or w) := 1
                | 
                | Which is a SyntaxError. That's... not a smart place for
               | it to be in the operator hierarchy. I expected it to be
               | near the very top, like await.
               | 
               | 1. https://docs.python.org/3/reference/expressions.html#o
               | perato...
               | 
               | Edit: 20 minutes later, can't respond.
               | 
               | There are two aspects I have been thinking about while
               | looking at this: Introduction of non-obvious behavior
               | (foot-guns) and readability. Readability is important,
               | but I have been thinking primarily about the foot-gun
               | bits, and you have been emphasizing the readability bits.
               | I can't really accurately assess readability of something
               | until I encounter it in the wild.
        
               | dataflow wrote:
               | If the precedence was higher then you'd get a situation
                | like
                | 
                |     x := 1 if cond else 2
                | 
                | never resulting in x := 2, which is pretty unintuitive.
               | 
               | And you have to realize, even if the precedence works
               | out, nobody is going to remember the full ordering for
               | every language they use. People mostly remember a partial
               | order that they're comfortable with, and the rest they
               | either avoid or look up as needed. Like in C++, I
               | couldn't tell you exactly how (a << b = x ? c : d) groups
               | (though I could make an educated guess), and I don't have
               | any interest in remembering it either.
               | 
               | Ultimately, this isn't about the actual precedence. Even
               | if the precedence was magically "right", it's about
               | readability. It's just not readable to assign to a
               | compound expression, even if the language has perfect
               | precedence.
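For what it's worth, in current CPython `:=` does bind more loosely than the conditional expression, so the whole conditional is evaluated first (parentheses are required to use a bare assignment expression as a statement):

```python
cond = False
# Parsed as x := (1 if cond else 2), not (x := 1) if cond else 2.
(x := 1 if cond else 2)
print(x)  # 2
```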
        
               | [deleted]
        
               | eesmith wrote:
               | While the walrus operator gives a way to see this sort of
               | non-C++ behavior, it's more showing that Python isn't C++
               | than something special about the operator.
               | 
               | Here's another way to trigger the same NameError, via
                | "global":
                | 
                |     import random
                | 
                |     def foo():
                |         return random.randrange(2)
                | 
                |     def bar():
                |         global w
                |         w = random.randrange(20)
                |         return w
                | 
                |     while foo() or (bar() < 10):
                |         print(w)
                | 
                | For even more Python-is-not-C++-fun:
                | 
                |     import re
                | 
                |     def parse_str(s):
                |         def m(pattern):  # I <3 Perl!
                |             nonlocal _
                |             _ = re.match(pattern, s)
                |             return _ is not None
                | 
                |         if m("Name: (.*)$"):
                |             return ("name", _[1])
                |         if m("State: (..) City: (.*)$"):
                |             return ("city", (_[2], _[1]))
                |         if m(r"ZIP: (\d{5})(-(\d{4}))?$"):
                |             return ("zip", _[1] + (_[2] if _[2] else ""))
                |         return ("Unknown", s)
                |         del _  # Remove this line and the function isn't valid Python(!)
                | 
                |     for line in (
                |         "Name: Ernest Hemingway",
                |         "State: FL City: Key West",
                |         "ZIP: 33040",
                |     ):
                |         print(parse_str(line))
        
               | dataflow wrote:
                | Right, I'm _quite_ well-aware of that, but I'm saying
               | this change has made the situation even worse. If they
               | ensured the variables were scoped and actually
               | initialized it'd have actually been an improvement.
        
               | pansa2 wrote:
               | The walrus operator provides no benefit here. `my_match`
               | is still in scope in both cases.
               | 
               | Python's `if` statements do not introduce a new scope.
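A quick illustration of that point: both a plain assignment inside an `if` body and a walrus inside the condition bind in the enclosing function's scope, since only functions (plus classes and comprehensions) introduce scopes in Python:

```python
def check(status):
    if status == 200:
        msg = "ok"             # bound inside the if body
    if (note := "condition"):  # bound by := in the condition
        pass
    # Both names are still visible here. If status != 200, msg was
    # never bound and this raises UnboundLocalError.
    return msg, note

print(check(200))  # ('ok', 'condition')
```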
        
               | DangitBobby wrote:
               | I know they don't, normally. I really thought that was
               | basically the point of the walrus operator to begin with,
               | that the variable was only in scope for the lifetime of
               | the if statement where it's needed. Huge bummer to find
               | out that's not true.
        
               | dataflow wrote:
               | Looks like now you're seeing why it's an abomination ;)
        
               | DangitBobby wrote:
               | IMO the real abomination was already present in the
               | language, which is that if blocks do not introduce new
               | scope. My IDE protects me from the bugs this could easily
               | introduce when I try to use a variable that may not yet
               | be in scope, but it should be detected before runtime.
               | 
               | I will readily admit that the walrus operator doesn't do
               | what I thought it did and I have no interest in whatever
               | utility it provides as it exists today.
        
               | dataflow wrote:
               | > IMO the real abomination was already present in the
               | language, which is that if blocks do not introduce new
               | scope.
               | 
               | Definitely. You would think if they're going to undermine
               | decades of their own philosophy, they would instead
               | introduce variable declarations and actually help
               | mitigate some bugs in the process.
        
               | DangitBobby wrote:
               | I have now heard specific concerns with := that make
               | sense to me, but what about match expressions? What about
               | them do you not like?
        
               | pansa2 wrote:
               | IMO the match statement has some very unintuitive
                | behaviour:
                | 
                |     match status:
                |         case 404:
                |             return "Not found"
                | 
                |     not_found = 404
                |     match status:
                |         case not_found:
                |             return "Not found"
               | 
               | The first checks for equality (`status == 404`) and the
               | second performs an assignment (`not_found = status`).
               | 
               | `not_found` behaving differently from the literal `404`
               | breaks an important principle: "if you see an
               | undocumented constant, you can always name it without
               | changing the code's meaning" [0].
               | 
               | [0] https://twitter.com/brandon_rhodes/status/13602261083
               | 9909990...
        
               | dataflow wrote:
               | Aw dang you spoiled it :-) I was hoping my example would
               | be more fun to work through haha.
        
               | DangitBobby wrote:
               | I see. Is it fair to say your issue with the feature is
               | less about not wanting the feature and more about the
               | implementation details?
        
               | dataflow wrote:
                | What's the output of this program?
                | 
                |     class C(object):
                |         A = 1
                | 
                |     B = 2
                |     x = 3
                |     y = 10
                |     print(x - B)
                |     match y:
                |         case C.A:
                |             print('A')
                |         case B:
                |             print(y - B)
        
               | DangitBobby wrote:
               | Do you not like the idea of pattern matching as a feature
               | or do you not like the implementation details? This kind
               | of seems like another clumsy scoping problem, no?
        
               | dataflow wrote:
               | I would love a good pattern matching feature, but this is
               | not it. And this is a seriously broken design at a
               | fundamental level, not an "implementation detail". I
                | actually have no clue how it's _implemented_ and
                | couldn't care less honestly. I just know it's incredibly
               | dangerous for the user to actually use, and incredibly
               | unintuitive on its face. It's as front-and-center as a
               | design decision could possibly be, I think.
               | 
               | And no, this is not really a scoping issue. Match is
               | literally writing to a variable in one pattern but not
               | the other. A conditional write is just a plain
               | inconsistency.
               | 
               | The sad part is both of these features are stumbling over
               | the fact that Python doesn't have variable
               | declarations/initialization. If they'd only introduced a
               | different syntax for initializations, both of these could
               | have been much clearer.
        
               | DangitBobby wrote:
               | > I actually have no clue how it's implemented and
               | couldn't care less honestly.
               | 
               | I guess I'm not sure where "design" ends and
               | "implementation" begins? To me, how to handle matching on
                | variables that already exist is both, because "pattern
               | matching and destructuring" are the features and how that
               | must work in the context of the actual language is
               | "implementation". It being written in a design doc and
               | having real world consequences in the resulting code
               | doesn't make it not part of the implementation.
               | 
               | Instead of quibbling over terms, I was much more
               | interested in whether you like the idea of pattern
               | matching.
               | 
               | I think not liking the final form a feature takes in the
               | language is fundamentally different from wholesale
               | disliking the direction the language design is going.
        
               | dataflow wrote:
               | Design is the thing the client sees, implementation is
               | the stuff they don't see. In this case the user is the
               | one using match expressions. And they're seeing variables
               | mutate inconsistently. It's practically impossible for a
               | user _not_ to see this, even if they wanted to. Calling
                | that an implementation detail is like calling your car's
               | steering wheel an implementation detail.
               | 
               | But I mean, you can call it that if you prefer. It's just
               | as terrible and inexcusable regardless of its name. And
               | yes, as I mentioned, I would have loved to have a good
               | pattern matching system, but so far the "direction"
               | they're going is actively damaging the language by
               | introducing more pitfalls instead of fixing the existing
               | ones (scopes, declarations, etc.). Just because pattern
               | matching in the abstract could be a feature, that doesn't
               | mean they're going in a good direction by implementing it
               | in a broken way.
               | 
               | I guess like they say, the road to hell is paved with
               | good intentions.
        
               | isoprophlex wrote:
                | The walrus operator is a tired old trope to hate on, but
                | I don't see the point personally. Same goes for the
                | structural pattern matching thing. The tacking on of
                | typing features feels superfluous in a language that's
                | not compiled or even strongly typed.
               | 
                | But for the sake of maximum pedantry let me paste some
                | nitpicky little detail from a somewhat recent syntactic
                | addition:
                | 
                |     >>> def f(a, b, /, **kwargs):
                |     ...     print(a, b, kwargs)
                |     ...
                |     >>> f(10, 20, a=1, b=2, c=3)
                |     10 20 {'a': 1, 'b': 2, 'c': 3}
                | 
                | a and b are used in two ways. Since the parameters to
                | the left of / are not exposed as possible keywords, the
                | parameter names remain available for use in **kwargs.
                | 
                | Jesus fucking hell on a tricycle, so now I have *'s and
                | /'s showing up in function signatures so someone can
                | prematurely optimize the re-use of variable names without
                | breaking backwards compatibility?!
               | 
               | Python is becoming a mockery, dying a death through a
               | thousand little cuts to its ergonomics.
        
               | ptx wrote:
               | I'm sure you're already aware of this example since it's
               | the canonical one, but to me personally the point is very
               | clear: I use regular expressions all the time and always
               | have to write that little bit of boilerplate, which the
               | walrus operator now lets me get rid of.
               | 
               | Avoiding tedious boilerplate by adding nice features like
               | the walrus operator is precisely what lets us avoid
               | "death through a thousand little cuts to its ergonomics",
               | in my view.
               | 
                | Sure, maybe writing
                | 
                |     m = re.match("^foo", s)
                |     if m is not None:
                |         ...
                | 
                | isn't so bad, but in that case maybe writing
                | 
                |     i = 0
                |     while i < len(stuff):
                |         element = stuff[i]
                |         ...
                |         i += 1
               | 
               | wouldn't be so bad, and we could get rid of Python's
               | iterator protocol?
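The boilerplate described above, next to its walrus form (Python 3.8+); a sketch using a hypothetical string `s`:

```python
import re

s = "foobar"

# Without the walrus operator: bind, then test.
m = re.match(r"^foo", s)
if m is not None:
    prefix = m.group(0)

# With it: the binding happens inside the condition itself.
if (m := re.match(r"^foo", s)) is not None:
    prefix = m.group(0)

print(prefix)  # foo
```

Both forms are equivalent; the second simply fuses the assignment and the test into one expression.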
        
               | mixmastamyk wrote:
               | It should have been "if ... as y" and reused existing
               | syntax. I've never seen anyone use the extended variant
               | (multiple assignment) that walrus allows. The extra
                | colons with this and typing make it look like a standard
               | punctuation-heavy language we sought to avoid in the
               | first place.
        
               | dataflow wrote:
               | I think regex matches might be literally the only use
               | case for := that I come across with any kind of
               | nontrivial frequency, and it's only a minor nuisance at
               | that. Certainly nothing to warrant an entirely new yet
               | different syntax for something we already have.
               | 
               | The iterator protocol is _way_ more general than what you
                | have; it's not remotely comparable.
        
               | dragonwriter wrote:
            | AFAIK, the _purpose_ of "/" is so that Python-
            | implemented functions can be fully signature- (and,
            | therefore, also type-) compatible with builtins and
            | C-implemented functions that require positional
            | arguments but do not accept those arguments being passed
            | as keyword arguments.
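A sketch of that compatibility point: with `/`, a pure-Python wrapper (the `my_len` name is made up here) rejects the keyword form exactly as the C-implemented `len` does:

```python
def my_len(obj, /):
    """Pure-Python function with the same positional-only
    signature as the builtin len()."""
    return len(obj)

print(my_len("abcd"))        # 4
try:
    my_len(obj="abcd")       # TypeError, just like len(obj="abcd")
except TypeError as exc:
    print("rejected:", exc)
```

Without the `/`, `my_len(obj="abcd")` would succeed, making the wrapper's signature subtly wider than the builtin's.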
        
               | wenc wrote:
               | > The tacking on of typing features feels superfluous in
               | a language thats not compiled or even strongly typed.
               | 
               | Python has always been strongly typed (Python has strong
               | dynamic typing). Adding typechecks moves it towards being
               | gradually/statically-typed.
        
               | ahmedfromtunis wrote:
                | Not the OP, but some of the late additions to Python
                | were, *in MY very humble opinion*, not very pythonic;
                | just syntactic sugar that means there is now more than
                | one way of doing things.
               | 
               | On that list: the walrus operator and the new switch
               | thing. If I understand them fully and correctly, those
               | two things don't enable developers to do things that were
                | impossible before; instead, they add new ways to do
                | things that were already possible.
               | 
               | That's the Python I know and love.
               | 
               | Of course, this doesn't mean I'll love Python any less,
                | just that I wish there were more focus on stuff that
                | matters, like the topic of this article. Or maybe on
                | making type hinting better.
               | 
               | Again, this is just my opinion.
        
               | DangitBobby wrote:
               | IMO, the "one obvious way to do things" has always been a
               | comforting fiction. There are numerous ways to do
               | everything, the worst offenders forcing people to make
                | tradeoffs between debuggability and readability (i.e.,
                | for loops versus list comprehensions). Many of them are
               | purely about readability (ternary expression versus if
               | blocks) and many of them are about style (ternary
               | expression versus use of or/and short-circuiting). Even
               | so, before the walrus operator, there was never a way to
               | define a variable that only existed in the scope of a
               | particular if statement.
               | 
               | After using pattern matching in Rust and switch
               | statements in JavaScript, I personally am very excited
               | for that addition to Python, but I understand the feature
               | is divisive and will concede it as a matter of opinion.
               | 
               | Edit: turns out the walrus operator does not cause the
               | variable to move out of scope after the if block, which
                | is disappointing. IMO the worse anti-pattern was already
                | part of the language, which is not creating new
               | scopes for if statements.
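The scope behavior in question can be checked in a few lines; both the walrus target and an ordinary assignment inside the `if` survive the block:

```python
# Neither `if` blocks nor the walrus operator introduce a new scope.
if (n := 10) > 5:
    inside = n * 2

print(n)       # 10 -- the walrus target is still bound here
print(inside)  # 20 -- so is the ordinary assignment
```

Both names leak into the enclosing scope, which is what the edit above is lamenting.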
        
         | klyrs wrote:
         | Cython itself should be a relatively simple fix (relative to
         | the difficulty that Cython devs are accustomed to). Libraries
         | that use Cython in a pure way (that is, not fussing with
         | refcounts in hand-written C code) should "just work" after
         | Cython gets updated. It's the poor folk who have done straight
         | C extensions without the benefit of Cython that I'm concerned
         | about.
        
         | dangerbird2 wrote:
         | I'd wonder if it would be easier to introduce a totally new API
         | along the lines of ruby's ractor API[1] that enables thread
         | parallelism while keeping existing Thread behavior identical as
         | with the GIL. Tons of python code relies on threaded code that
         | is thread-safe under the GIL, but would completely blow up if
         | the GIL was naively replaced.
         | 
         | [1] https://docs.ruby-lang.org/en/master/doc/ractor_md.html
        
           | arthurcolle wrote:
           | Ractors don't offer very good performance yet. Better to have
           | it be awesome right off the bat
        
             | dangerbird2 wrote:
              | Yeah, that's what I thought. I think the greatest barrier
             | now is that most multithreaded python code right now is
             | _just barely_ thread-safe, even with the GIL. I
             | occasionally have to remind colleagues that even though the
             | GIL guarantees instructions are atomic, you need to use
             | mutexes and other synchronization primitives to ensure
              | there is no race condition between multiple instructions.
              | I'd imagine this change would be an optional interpreter
             | feature initially, since removing the GIL would break the
             | vast majority of code out in the wild, and it would be much
             | more difficult to create an automated conversion tool like
             | they did with the syntactic changes between 2.7 and 3
        
         | lpapez wrote:
         | A simple solution would be to introduce two new types:
         | ConcurrentThread and ParallelThread. Alias the old Thread to
         | the ConcurrentThread and keep the behaviour. No breaking
          | changes, easy to explain the difference. People who need it
         | can use the new truly parallel version.
        
       | pvg wrote:
       | The other day:
       | 
       | https://news.ycombinator.com/item?id=28880782
        
       | kjeetgill wrote:
       | Anyone know how far along graalpython is? I'd imagine it should
        | be a suitable, if not superior, replacement with about as much
        | effort, right?
        
       | aasasd wrote:
       | I mean, it stands to evolutionary reason that Python shouldn't
       | have a GIL.
        
       | aeturnum wrote:
       | I worked in Python for years and while I suppose I'm glad for any
       | improvement, I have never understood the obsession with true
       | multi-threading. Languages are about trade-offs and Python, again
       | and again, chooses flexibility over performance. It's a good
       | choice! You can see it in how widely Python is used and the
       | diversity of its applications.
       | 
       | Performance doesn't come from any one quality, but from the
       | holistic goals at each level of the language. I think some of the
       | most frustrating aspects from the history of Python have been
       | when the team lost focus on why and how people used the language
       | (i.e. the 2 -> 3 transition, though I have always loved 3). I
       | hope that this is a sensible optimization and not an over-
       | extension.
        
         | darthrupert wrote:
         | Lack of multithreading can easily be a win for a language. A
         | tiny subset of problems really needs it these days and for
         | everything else it's a potential way to either screw things up
          | or make them way more complicated than they need to be.
        
           | cm2187 wrote:
           | > _A tiny subset of problems_
           | 
           | like processing web requests?
        
           | emrah wrote:
           | So if you don't like it or want it, don't use it then? Why
           | does it have to be missing altogether for you to be happy?
        
         | cm2187 wrote:
          | Popularity probably has as much to do, if not more, with ease
          | of access (or lack of alternatives) as with good design of the
          | language. PHP is equally if not more popular than Python.
        
           | aeturnum wrote:
           | I'm not a PHP expert, but I did not know it was also used in
           | data science, game programming, embedded programming and
           | machine learning as Python is. Of course they are both used
           | for web services.
        
         | emerged wrote:
         | From my perspective as a huge Python fan, efficient
         | multithreading is simply the only major thing missing from the
         | language. I would still use C/C++/assembly for bleeding edge
         | performance needs, but efficient multithreading in Python would
         | have me reaching for alternatives far less often.
         | 
         | Basically I love peanut butter ice cream (Python) I'd just like
         | it even more with sprinkles.
        
         | dec0dedab0de wrote:
         | I agree, but I don't do anything that can be split up, and
         | would benefit from sharing memory. That is really the only
          | benefit of removing the GIL. Multiprocessing can do true
          | parallelism, and so can Celery, which even allows you to use
         | multiple computers. The only time that is a pain is when you
         | need to share memory, or I guess maybe if you're low on
         | resources and can't spare the overhead from multiple processes.
         | 
         | I think a JIT would be the best possible improvement for
         | CPython as far as speed is concerned. Though I can imagine
         | there are plenty of people doing processor heavy stuff with c
         | extensions that would benefit from sharing memory. So from
         | their perspective removing the GIL would be a better
         | improvement.
         | 
         | So basically a JIT would help every Python program, and
         | removing the GIL would only help a small subset of Python
         | programs. Though I'm just happy I get to make a living using
         | Python.
        
         | klyrs wrote:
          | Losing the GIL makes the language strictly more flexible.
          | Previous GILectomies tanked performance to an unacceptable
          | degree. In single-threaded code, this one is a moderate
          | performance improvement in some benchmarks, and a small
          | detriment in others -- which is about as close to perfect
         | as one could expect from such a change. That's why people are
         | excited about it.
         | 
         | At a higher level, Python is getting serious about performance.
         | But this gives both flexibility _and_ performance.
        
           | aeturnum wrote:
           | Yah, that's definitely the future I'm hoping for. What I am
           | worried about are the kind of transition issues I mentioned.
           | Python 2 -> 3 strictly made the language more flexible too -
           | but the Python ecosystem is about existing code almost more
           | than the language and I worry that we could find similar
           | problems here. Potential for plenty of growing pains while
           | chasing relatively small gains.
        
             | ynik wrote:
              | In the company I'm working for, we had to spend more
             | engineer time on GIL workarounds (dealing with the extra
             | complexity caused by multiprocessing, e.g. patching C++
             | libraries to put all their state into shared memory) than
             | we needed for the Python 2 -> 3 migration. And we've only
             | managed to parallelize less than half of our workload so
             | far.
             | 
             | Even if this will be a major breaking change to Python,
             | it'll be worth it for us.
        
         | m0zg wrote:
          | One does not preclude the other: the language can be flexible
          | and offer higher concurrency than it does now. My workstation
          | has 64 hyperthreads; Python can use one at a time. That's
          | messed up, since I use it as a general-purpose language.
        
         | didip wrote:
         | This is because Python, by luck, ended up dominating the data
         | science market.
         | 
         | In this market you really want to shuffle tons of data quickly,
         | and that's usually achieved through parallelism.
         | 
          | Python's multiprocessing library does a poor job at that.
        
           | anthk wrote:
           | That's calling C and Fortran in the background, actually.
        
         | amelius wrote:
         | > Performance doesn't come from any one quality, but from the
         | holistic goals at each level of the language.
         | 
         | It starts to become an issue when you have built a few well-
         | performing subsystems and now want them to run together and
         | interact. With the GIL, your subsystems are suddenly _not_
         | performing as well anymore. Without the GIL, you can still get
         | good performance (within limits of course).
         | 
         | Performance referring here to throughput and/or latency
         | (responsiveness).
        
         | tester756 wrote:
         | >again and again, chooses flexibility over performance. It's a
         | good choice! You can see it in how widely Python is used and
         | the diversity of its applications.
         | 
          | What does that mean? How is Python different here from Java/C#?
        
           | aeturnum wrote:
           | I mean, you can modify Python code at runtime if you like.
           | This has a good overview of all the nonsense happening under
           | the hood: http://jakevdp.github.io/blog/2014/05/09/why-
           | python-is-slow/
        
         | cma wrote:
         | There are 64-core, 128-thread prosumer CPUs now and it is only
         | going to go higher. At some point it just becomes necessary.
        
           | turminal wrote:
           | What does a 128 thread python app do better than 128 single
           | threaded ones?
        
             | gypsyharlot wrote:
             | Shared L3 cache.
        
             | jhoechtl wrote:
             | OS overhead of 128 processes is higher than scheduling 128
              | tasks. Varies from OS to OS, but it's especially bad on
             | Windows.
        
               | turminal wrote:
               | Yeah, I know about that argument but it just doesn't make
               | sense to me. Removing the GIL means that 1) you make your
               | language runtime more complex and 2) you make your app
               | more complex.
               | 
               | Is it truly worth it just to avoid some memory overhead?
                | Or is there some other Windows-specific thing that I'm
               | missing here?
        
               | dragonwriter wrote:
               | > Yeah, I know about that argument but it just doesn't
               | make sense to me. Removing the GIL means that 1) you make
               | your language runtime more complex and 2) you make your
               | app more complex.
               | 
                | #2 need not be true; e.g., the approach proposed here is
                | transparent to most Python code and even minimizes impact
                | on C extensions, still exposing the same GIL hook
                | functions which C code would use in the same
                | circumstances, though they have a slightly different
                | effect.
        
             | Redoubts wrote:
             | marshal data
        
               | turminal wrote:
               | Care to elaborate? What does that change for an average
               | webapp?
        
               | yuliyp wrote:
               | Say your webapp talks to a database or a cache. It'd be
               | really nice if you could use a single connection to that
               | database instead of 64 connections. Or if you wanted to
               | cache some things on the web server, it would be nice if
               | you could have 1 copy easily accessible vs needing 64
               | copies and needing to fill those caches 64x as much.
        
               | semiquaver wrote:
               | Unfortunately using a single db/RPC connection for many
               | active threads is not done in any multithreaded system
               | I'm aware of for good reasons. Sharing this type of
               | resource across threads is not safe without expensive and
               | performance-destroying mutexes. In practice each thread
               | needs exclusive access to its own database connection
               | while it is active. This is normally achieved using
               | connection pooling which can save a few connections when
               | some threads are idle, but 1 connection for 64 active web
               | worker threads is not a recipe for a performant web app.
               | If you can point to a multithreaded web app server that
               | works this way I'd be very interested to hear about it.
               | 
               | The idea of a process-local cache (or other data) shared
               | among all worker threads is a different story. I see this
               | as one of the bigger advantages of threaded app servers.
               | However, preforking multiprocess servers can always use
               | shmget(2) to share memory directly with a bit more work.
        
               | cma wrote:
               | That's slower than just doing it single threaded for many
               | use cases.
        
               | [deleted]
        
             | heinrichhartman wrote:
             | No shared memory. To communicate between processes you
             | usually use sockets, to communicate between threads you
             | mutate variables. This is a huge performance difference.
        
           | aeturnum wrote:
           | Yes higher core counts are more and more common, but the
           | language has thirty years of single-threaded path-dependence.
           | Lots of elements of it work the way they do because there was
           | a GIL. I could be wrong, but I am skeptical that Python will
           | ever be the best choice for high performance code. It's
           | always worth improving the speed of code when you can, but
           | more often than not you "get" something for going slower. I
           | hope my worries are wrong and this is actually a free win!
        
       | randtrain34 wrote:
       | Design doc the proposer linked:
       | https://docs.google.com/document/d/18CXhDb1ygxg-YXNBJNzfzZsD...
        
       | lvass wrote:
       | I have seen and written Python code that spawns various threads
       | with shared mutable state. Is it possible that some day the same
          | code would run in parallel? That could be a terrible, very
          | breaking change. I'm not against allowing in-process parallel
       | execution but please let it require a new API.
        
         | ajkjk wrote:
          | Isn't that... already the case? The GIL doesn't prevent thread
          | switching in the middle of Python code.
        
           | lvass wrote:
            | It's concurrent, not parallel. The switch won't happen
            | inside the execution of one opcode, including some
            | dictionary update operations, so it's safe in many cases
            | where parallel execution isn't.
        
             | ajkjk wrote:
             | Yes, single-instruction operations would be fine, but if
             | you're writing multithreaded code you are probably doing
             | things that the GIL doesn't protect all the time. Like
             | dict-updates on classes that implement __set__, or `if not
             | a[x]: a[x] = y` sorts of two-phased checks, or just like,
             | anything else. You can't get very far with global state
             | without reckoning with concurrency, GIL or not.
             | 
             | I assume that a change to relax the GIL will both allow you
             | to opt-out of it, and allow you to use locking versions of
             | primitive data-structures, anyway; it's not like it's going
             | to just vanish overnight with no guardrails.
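A sketch of guarding the two-phase check mentioned above: the check and the write have to sit under one lock, because the GIL only makes single operations atomic, not the pair (names here are hypothetical):

```python
import threading

cache = {}
cache_lock = threading.Lock()

def get_or_compute(key, compute):
    # "check, then write" is a two-phase operation; without the
    # lock another thread can slip in between the two phases.
    with cache_lock:
        if key not in cache:
            cache[key] = compute(key)
        return cache[key]

print(get_or_compute(2, lambda k: k * k))  # 4
```

The same pattern covers `if not a[x]: a[x] = y` and similar check-then-act sequences.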
        
             | colinmhayes wrote:
             | Seems like checking the docs for atomicity of every
             | operation is a huge pain in the ass.
        
               | ajkjk wrote:
               | It really is, they don't make it clear at all. Every time
               | I have to ask the question of "is this atomic under the
               | GIL" I struggle to find the right answer.
        
               | kzrdude wrote:
               | Best to avoid sharing data, and never mutating shared
               | data!
               | 
               | Rust has a great rule: Sharing XOR Mutation.
               | 
               | Python is higher level, so message passing and passing
               | "owned" values between threads is all the more feasible
               | and sensible.
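A sketch of that message-passing style with the standard library's thread-safe `queue.Queue`: the producer hands values over instead of sharing mutable state, with `None` as a hypothetical end-of-stream sentinel:

```python
import queue
import threading

q = queue.Queue()

def producer():
    for i in range(3):
        q.put(i)      # hand each value over; don't share and mutate
    q.put(None)       # sentinel: nothing more is coming

def consumer(results):
    while (item := q.get()) is not None:
        results.append(item * 10)

results = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(results,))
t1.start(); t2.start()
t1.join(); t2.join()

print(results)  # [0, 10, 20]
```

Each value has exactly one owner at a time, so no extra locking is needed around the data itself.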
        
             | ynik wrote:
             | Currently, dictionary updates are atomic only if the keys
             | are primitive types.
             | 
             | `dict.update()` will call methods like `__eq__`, and those
             | methods (if implemented in Python) may temporarily release
             | the GIL.
        
               | kzrdude wrote:
                | Unfortunately the Python docs say that dict.update() is
                | atomic. It's being fixed, though... it came out during
                | these discussions.
        
             | zelphirkalt wrote:
             | It is probably a bad practice to not acquire a mutex for
             | that concurrent dictionary update. The code should be
             | improved in that regard, with or without any potential
             | Python language change.
        
               | lvass wrote:
                | If needed, I'll probably just change it to asyncio;
                | saner than inspecting everything for new parallelism
                | bugs, which can be incredibly subtle.
        
               | ajkjk wrote:
               | If performance isn't hugely important you could make
               | blanket-locking wrappers around common data structures
               | and swap them in-place for all of your global state.
               | 
               | .. but, as I said, removing the GIL will almost certainly
               | be opt-in.
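A sketch of such a blanket-locking wrapper (the `LockedDict` class is invented for illustration): every access serializes on one internal mutex, which is safe but trades away parallelism on that structure:

```python
import threading

class LockedDict:
    """Dict wrapper whose every operation holds one internal lock."""
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def __getitem__(self, key):
        with self._lock:
            return self._data[key]

    def __setitem__(self, key, value):
        with self._lock:
            self._data[key] = value

    def setdefault(self, key, default):
        with self._lock:
            return self._data.setdefault(key, default)

state = LockedDict()
state["hits"] = 1
print(state["hits"])  # 1
```

Swapping such a wrapper in for global dicts keeps existing call sites unchanged while making each operation explicitly atomic.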
        
         | pkulak wrote:
         | That was my first thought as well. I use Python to whip up
         | quick scripts and enjoy not having to worry about shared
         | memory, even when I'm using concurrency. I'd hate to lose that.
        
           | Spivak wrote:
           | Don't you still have to worry about this since OS threads can
           | be arbitrarily preempted?
        
             | pkulak wrote:
              | That's fine. You don't have to worry about pausing; you
              | have to worry about multiple threads getting at some
              | memory at the same time.
        
         | juanbyrge wrote:
         | Translation: I have written buggy, racy software that has
         | specific dependencies on thread timing. Please do not make
         | significant improvements to Python because it will reveal these
         | bugs in my software, and I will be forced to fix the bugs and
         | use proper synchronization.
        
           | [deleted]
        
           | lvass wrote:
           | Is that what you call things that have been working
           | flawlessly and solving people's problems for over 10 years?
           | Is needlessly breaking things that work an improvement to
           | you?
        
             | zelphirkalt wrote:
             | You could probably simply lock the Python version you use
             | for such code. No breakage there. If you must upgrade to a
             | newer Python version, then you will have to repair broken
             | code.
        
               | doubled112 wrote:
               | This went really well for the Python 2 -> 3 upgrade
        
               | lazide wrote:
               | It did buy a decade or so (or more really) - not like the
               | python2 distribution you downloaded and distributed with
               | your program back then is going to get tracked down and
               | shot in the head by Guido anytime soon.
               | 
               | If you're relying on whatever python version is
               | distributed with whatever machine it happens to be on,
               | there are a huge number of problems you're already going
               | to have.
        
           | [deleted]
        
           | lazide wrote:
           | If you make something that works because of an explicit
           | memory and concurrency model (and not like there are other
           | options at the time), it is indeed legit to worry about a
           | major shift to those models that would cause problems.
           | 
           | Even if those changes are better for other ways of solving
           | problems.
        
       | JacobHenner wrote:
       | Source: https://mail.python.org/archives/list/python-
       | dev@python.org/...
        
       | kzrdude wrote:
       | The story of losing the GIL is very popular in the news, and I
       | like it too!
       | 
       | .. but. Let's not count our chickens before they hatch. I'm
       | wondering if the Python dev community will take on this
       | challenge. I hope so, Sam seems to really have put in a lot of
       | effort!
        
       | TekMol wrote:
       | Why is multithreaded performance important? What are the use
       | cases where you cannot run multiple processes to spread your
       | number crunching across CPUs?
       | 
       | I am _praying_ for CPython to become faster. But I need faster
       | singlethreaded performance, so web applications benefit from it.
        
         | chrisseaton wrote:
         | Communicating between processes is more expensive than
         | communicating between threads.
        
           | TekMol wrote:
           | Ok, but what is the use case where you need high bandwidth
           | inter-thread/process communication?
        
             | moron4hire wrote:
             | Games. Simulations. Backend servers for shared editor
             | experiences. Teleconferencing.
        
             | chrisseaton wrote:
             | Except for embarrassingly parallel problems, the trade-off
             | of generating more parallelism is usually needing finer-
             | grained communication. Canonical examples in the literature
             | are matrix multiplication, triangulation, and refinement.
        
             | Spivak wrote:
             | Large shared state is basically always the answer. You can
             | cop-out and say use a database or Redis if that's fast
             | enough but that's just making someone else use many threads
             | with shared memory.
        
             | adgjlsfhk1 wrote:
             | matrix multiply is an obvious one. partial differential
             | equations are another. Sorting is one if you don't care
             | about math.
        
         | VWWHFSfQ wrote:
         | multiple processes use a lot more memory than threads
        
           | TekMol wrote:
           | Can you quantify that?
        
             | SkittyDog wrote:
             | Yes, you can.
        
               | lazide wrote:
               | Even with copy-on-write? Have any actual numbers?
        
               | SkittyDog wrote:
               | Actual work is left as an exercise for the reader ;-)
        
               | lazide wrote:
               | Last I did this, when the processes were fork()'s of the
               | parent (the typical way this was done), memory overhead
               | was minimal compared to threads. A couple %. That was
               | somewhat workload dependent however, if there is a lot of
               | memory churn or data marshaling/unmarshalling happening
               | as part of the workload, they'll quickly diverge and
               | you'll burn a ton of CPU doing so.
               | 
               | Typical ways around that include mmap'ng things or
               | various types of shared memory IPC, but that is a lot of
               | work.
        
               | kaba0 wrote:
               | There are also context switches, much slower IPC, and
               | basically "no native support".
        
             | adgjlsfhk1 wrote:
             | Each Python process requires somewhere around 200MB of
             | memory and 0.1s to do nothing. If you want libraries, it
             | scales from there.
        
               | TekMol wrote:
               | Really? That sounds like an awful lot.
               | 
               | When I execute this:
               | 
               |     python3 -c 'import time; time.sleep(60)'
               | 
               | and then pmap the process ID of that process, I get
               | 26144K. That is 26MB.
               | 
               | As for timing, when I execute this:
               | 
               |     time python3 -c ''
               | 
               | I get 0.02s.
        
               | lazide wrote:
               | Also, generally no one spins up distinct new processes
               | for the 'co-ordinated distinct process work queue' when
               | they can just fork(), which should be way faster. And
               | pretty much every platform uses copy-on-write for
               | fork(), so it also has minimal memory overhead (at
               | least initially).
        
               | jesboat wrote:
               | The problem is (perhaps amusingly) with refcounting. As
               | the processes run, they'll each be doing refcount
               | operations on the same module/class/function/etc objects
               | which causes the memory to be unshared.
        
               | lazide wrote:
               | Only where there is memory churn. If you're in a tight
               | processing loop (checksumming a file? Reading data in
               | and computing something from it?) then the refcounts of
               | most objects from the baseline are never touched.
               | 
               | Also, since the copy on write generally is memory page by
               | memory page, even if you were doing a lot of that, if
               | most of those ref counts are in a small number of pages,
               | it's not likely to really change much.
               | 
               | It would be good to get real numbers here of course. I
               | couldn't find anyone obviously complaining about obvious
               | issues with it in Python after a cursory search though.
        
               | byroot wrote:
               | That was solved years ago by moving the refcounts into
               | different pages: https://instagram-engineering.com/copy-
               | on-write-friendly-pyt...
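A related mitigation that did land in CPython (3.7+) is `gc.freeze()`, aimed at exactly this pre-fork pattern: it moves every object allocated so far into a permanent generation so the cycle collector never rewrites those pages in the children. A sketch, assuming a POSIX system:

```python
import gc
import os

# "Warm" state we want the forked workers to share via copy-on-write.
warm_state = [str(i) for i in range(1000)]

gc.disable()   # avoid a collection between freeze and fork
gc.freeze()    # move existing objects to the permanent generation
print(gc.get_freeze_count() > 0)  # True

if hasattr(os, "fork"):  # POSIX only
    pid = os.fork()
    if pid == 0:
        gc.enable()      # children collect their own garbage
        os._exit(0)
    os.waitpid(pid, 0)
```

Note this tames collector-induced page writes; refcount updates on shared objects can still unshare pages, which is the problem the Instagram post addresses.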
        
               | [deleted]
        
         | kzrdude wrote:
         | Interactive use of Python: plotting, working with data, also
         | would benefit from better multithreading. It's interactive, so
         | it's (a bit) frustrating to wait for it to compute and see that
         | it uses just one thread (the statistics ops are usually well
         | threaded already, but plotting is not).
        
         | lostdog wrote:
         | Here's a use case: I was training a neural net, and wanted to
         | do some preprocessing (similar to image resizing, but without
         | an existing C function). Inputs are batched, so the
         | preprocessing is trivially parallelizable. I tried to
         | multithread it in python, and got no speedup at all.
         | 
         | That was a really sad moment, and I've never felt good about
         | python since.
        
           | isoprophlex wrote:
           | Yeah, I had something similar.
           | 
           | I wanted "as you wait for the GPU to churn through this
           | batch, start reading the next batch from disk & preprocessing
           | it on the CPU"
           | 
           | Getting this to work turned out so ass backwards it made me
           | sad
           | 
           | Also I pity the fool who tries to connect a debugger to code
           | using multiprocessing.Pool()...
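The "read the next batch while the current one computes" pattern can be sketched with a bounded queue and a background thread; `load_batch` and `process` below are stand-ins for the real disk read and GPU step:

```python
import queue
import threading

def load_batch(i):
    return list(range(i, i + 4))  # pretend this reads from disk

def process(batch):
    return sum(batch)             # pretend this is the GPU work

q = queue.Queue(maxsize=2)        # bounded: producer can't run away

def producer(n):
    for i in range(n):
        q.put(load_batch(i))      # blocks when the queue is full
    q.put(None)                   # sentinel: no more batches

threading.Thread(target=producer, args=(3,), daemon=True).start()

results = []
while (batch := q.get()) is not None:
    results.append(process(batch))
print(results)  # [6, 10, 14]
```

Because the loading is I/O-bound, a plain thread overlaps with compute even under the GIL; it's the CPU-bound preprocessing case above that forces you into `multiprocessing.Pool()`.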
        
         | rbjorklin wrote:
         | One thing that comes to mind is for shared connection pools
         | when you have thousands of workers connecting to the same
         | stateful service.
        
           | TekMol wrote:
           | Which use case requires such a setup?
        
             | Spivak wrote:
             | MySQL comes to mind. Unlike Postgres where connections are
             | expensive MySQL encourages loading up the server with
             | hundreds of simultaneous connections per server.
             | 
             | Like anything, it's possible to split this work out to a
             | separate process but the IPC overhead is a lot.
        
             | lazide wrote:
             | Most databases (unfortunately), and since a ton of web apps
             | use databases for state..
             | 
             | That said, many such databases are (or already have) rolled
             | out connection pool proxies for reasons like this, so meh.
        
             | lanstin wrote:
             | Web front end backed by a database with expensive login,
             | so you cache connections. It would be called a two-tier
             | architecture.
        
       | mhh__ wrote:
       | With all the effort that's been put into this how many people
       | just jumped ship to a native/more-native language?
       | 
       | I'm probably biased because I think Python is a hacked together
       | mess but I just don't see what the point is in dragging it
       | around.
        
         | ajkjk wrote:
         | The difference is that it is so much easier, by orders of
         | magnitude, to write code that gets shit done in python than any
         | native language I'm aware of.
        
         | Zababa wrote:
         | > With all the effort that's been put into this how many people
         | just jumped ship to a native/more-native language?
         | 
         | I think Go benefited a lot from that.
        
         | jgb1984 wrote:
         | Very useful contribution, thanks! Mainly to demonstrate your
         | own ignorance on the topic at hand.
        
           | mhh__ wrote:
           | I did ask a question. The second part is me being belligerent
           | but the question was sincere.
           | 
           | I think Python is shit but so is most everything else, I'm
           | interested in whether people jump ship or just work around
            | its issues at scale. I work on a programming language
           | designed to avoid messy python scripts internally, so I am
           | sincerely interested in these decisions.
        
             | mixmastamyk wrote:
             | The question is unanswerable. Python started as a scripting
             | and prototyping language and that hasn't and won't
             | completely change. It's fantastic at what it does from that
             | perspective, late additions of complexity notwithstanding.
        
       | Mikeb85 wrote:
       | I use Ruby and not Python, but I think both have a lot of the
       | same benefits and weaknesses.
       | 
       | IMO, removing the GIL is a major mistake. The GIL is what allows
       | you easy concurrency and to keep the language's 'magic' while
       | ensuring correctness. If you need parallelism, there's processes
       | and probably other tactics (I'm not super up to date on Python
       | things). If you simply remove the GIL you have a bunch of race
       | conditions, so you need a bunch of new language constructs, and
       | it just adds a bunch of complexity to solve problems that don't
       | really need solving.
       | 
       | IMO they should just do what Ruby did with Ractors; basically a
       | cheap alternative to spawning more processes. Rewriting
       | absolutely everything that uses threads to be thread-safe is a
       | waste of time.
        
         | klyrs wrote:
         | It's already easy to write race conditions in Python.
         |     if x in d:
         |         del d[x]
         |     else:
         |         d[x] = True
         | 
         | is a classic example -- if two threads execute that, you
         | can't predict the outcome (but a KeyError is quite likely).
         | 
         | The GIL only protects the CPython virtual machine; it doesn't
         | protect user code. Concurrent code with shared mutable state
         | already needs explicit mutexes.
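A minimal sketch of the explicit mutex that makes the toggle above safe: the whole check-then-act runs inside one critical section, so no thread can delete the key between the `in` test and the `del`.

```python
import threading

d = {}
lock = threading.Lock()
errors = []

def toggle(n):
    for _ in range(n):
        try:
            # Hold the lock across the whole check-then-act.
            with lock:
                if "x" in d:
                    del d["x"]
                else:
                    d["x"] = True
        except KeyError as e:  # would fire without the lock
            errors.append(e)

threads = [threading.Thread(target=toggle, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(errors))  # 0
```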
        
         | intrepidhero wrote:
         | What thread safe code can I write with the GIL that will have a
         | race without it?
         | 
         | I already have to be careful to only write to a shared object
         | from one thread, since I have no guarantees on order of
         | execution.
         | 
         | The main benefit of the GIL, from my recent reading, is that
         | it makes ref counting fast _and_ thread safe. The meat of the
         | proposal is changing ref counting so that it's almost as fast
         | _and_ atomic without the GIL.
        
           | ptx wrote:
           | What about setting a simple boolean flag, e.g. setting
           | "cancelled = True" in the UI thread to cancel an operation in
           | a background thread?
           | 
           | In Java you would have to worry about _safe publication_ to
           | make the change visible to the other thread, but thanks to
           | the GIL changes in Python are always (I think?) made visible
           | to other threads.
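The same cancelled-flag pattern can be written with `threading.Event`, which gives explicit visibility guarantees rather than leaning on the GIL, so it stays correct either way. A sketch:

```python
import threading
import time

cancelled = threading.Event()  # portable alternative to a bare bool

def background():
    while not cancelled.is_set():
        time.sleep(0.001)      # simulate a unit of work

worker = threading.Thread(target=background)
worker.start()

cancelled.set()                # the "UI thread" requests cancellation
worker.join(timeout=5)
print(worker.is_alive())       # False
```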
        
       | dleslie wrote:
       | I wonder how many folks think they're writing thread safe
       | software with ease, and are unaware that they are leaning on the
       | GIL?
       | 
       | Could be that the impact of this change is far broader than just
       | a few key libraries.
        
         | [deleted]
        
         | avianlyric wrote:
         | I don't see how the GIL makes writing thread safe software any
         | easier. The GIL might prevent two Python threads executing
         | simultaneously, but it doesn't change the fact that a Python
         | thread can be preempted, meaning your global state can change
         | at any point during execution without warning.
         | 
         | Most of the issues with multi-threading come from concurrency,
         | not parallelism. The GIL allows concurrency, you just don't get
         | any of the advantages of parallelism, which is normally the
         | reason for putting up with the complexity concurrency creates.
        
           | dehrmann wrote:
           | It's more cpython than the GIL, but it lets you get away with
           | using += and certain dict operations without locks.
        
             | hexane360 wrote:
             | Is this true? It looks like += compiles to four bytecode
             | instructions: two loads, an increment, and a store. It
             | should be possible for a thread to get paused after the
             | load but before the store, resulting in a stale read and
             | lost write.
             | 
             | Some more discussion here:
             | https://stackoverflow.com/questions/1717393/is-the-
             | operator-...
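The opcode sequence can be inspected with the stdlib `dis` module. Exact opcode names vary by CPython version (`INPLACE_ADD` before 3.11, `BINARY_OP` after), but the load and the store are always separate instructions, so a thread can be preempted in between:

```python
import dis

counter = 0

def bump():
    global counter
    counter += 1  # read-modify-write: load, add, store

instructions = [i.opname for i in dis.get_instructions(bump)]
print("STORE_GLOBAL" in instructions)  # True
print(len(instructions) >= 3)          # True
```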
        
               | dehrmann wrote:
               | Maybe it's just certain collection operations, then.
        
               | pansa2 wrote:
               | With the GIL, for an int i, `i += 1` is not thread-safe,
               | but IIRC for a list l, `l += [1]` (i.e. extend) is.
               | 
               | Presumably this patch changes the list implementation in
               | some way so that the extend operation remains thread-safe
               | without the GIL.
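This can be checked empirically, with the caveat that it relies on a GIL-era CPython implementation detail: `list.__iadd__` extends in place inside a single C call, and the store afterwards rebinds the same object, so the interleaving is harmless:

```python
import threading

l = []

def worker(n):
    global l
    for _ in range(n):
        # Under the GIL, the in-place extend is atomic, so no
        # appends are lost even with four threads racing.
        l += [1]

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(l))  # 40000
```

The equivalent loop on an int (`i += 1`) loses updates, because the add produces a new object between the load and the store.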
        
           | stefan_ wrote:
           | From running the same software on two moderately powerful
           | embedded systems, one single-core and one multi-core, the
           | latter is a lot more reliable in immediately exposing races
           | and concurrency issues.
        
           | dleslie wrote:
           | > The GIL might prevent two Python threads executing
           | simultaneously, but it doesn't change the fact that a Python
           | thread can be preempted, meaning your global state can change
           | at any point during execution without warning.
           | 
           | That thread behavior is enough to reduce the likelihood of
           | races and collisions; particularly if the critical sections
           | are narrow.
        
             | laserlight wrote:
             | I wouldn't call it thread-safe when race conditions are
             | possible.
        
               | jldugger wrote:
               | Then we need a term for code where race conditions are
               | possible but rare enough that nobody using the software
               | notices. Thread-timebomb?
        
               | nuerow wrote:
               | > _Then we need a term for when code race conditions are
               | possible but rare enough that nobody using the software
               | notices. thread-timebomb?_
               | 
               | There's already a term for that: not thread-safe.
               | 
               | The definition of thread safety does not include
               | theoretical or practical assessments regarding how
               | frequent a problem can occur. It only assesses whether a
               | specific class of problems is eliminated or not.
        
               | jldugger wrote:
               | >The definition of thread safety does not include
               | theoretical or practical assessments regarding how
               | frequent a problem can occur.
               | 
               | Well, _obviously_.
               | 
               | The challenge I am putting forth on HN is to meaningfully
               | describe _usable_ thread-unsafe software. If you've spent
               | enough time outside university, you'll be aware that
               | there are all kinds of theoretical race conditions that
               | are not triggered in practical use.
        
               | klyrs wrote:
               | If you've worked at industrial scale, you'll be aware
               | that even the most theoretical-seeming race condition
               | will be triggered frequently.
        
               | The_Colonel wrote:
               | That reminds me how I was called to fix some Java
               | service, which was successfully in production for 10
               | years with hardly any incident, but it suddenly started
               | crashing hard, all the time. It was of course a thread
               | safety issue (concurrent non-synchronized access to
               | hashmap) which laid dormant for 10 years only to wreak
               | havoc later.
               | 
               | Nothing obvious changed (it was still running a decade
               | old JRE), perhaps it was a kernel security patch, perhaps
               | a RAM was replaced or even just the runtime data
               | increased/changed in some way which woke up this monster.
        
               | formerly_proven wrote:
               | Heisenbug.
        
               | dkersten wrote:
               | That's not useful. If you have a race condition, you will
               | eventually hit it and when you do, you may get incorrect
               | results or corrupt data. Thread unsafe is thread unsafe,
               | regardless how rare it appears to be.
               | 
               | Also, rare on one computer (or today's computer) might
               | not be rare on another (tomorrows faster one for
               | example).
               | 
               | These types of bugs are also very hard to detect. You
               | might not know your data is corrupted. Reminds me of how
               | bad calculations in Excel have cost companies billions of
               | dollars, except now, the calculations could be "correct"
               | and the error sitting dormant, just waiting for the right
               | timings to happen. Much better to not make assumptions
               | about the safety and think about it up front: if you are
               | using multiple threads, you need to carefully consider
               | your thread safety.
        
             | Brian_K_White wrote:
             | There is no such thing as "likely". A thing is either
             | possible or not possible.
        
               | kingofpandora wrote:
               | Likelihood refers to probability not possibility.
        
               | Brian_K_White wrote:
               | There is no such thing as probability. All there is is
               | possible and not possible.
               | 
               | I don't know how the point of the comment could be
               | missed, but what I am saying is, it is a mistake, a
               | rookie baby not-a-programmer not even any kind of
               | engineer in any field, to even think in those sorts of
               | terms at all. At least not in the platonic ideal worlds
               | of math or code or protocol or systems design or legal
               | documents, etc.
               | 
               | Physical events have probability that is unavoidable. How
               | fast does the gas burn? "Probably this fast"
               | 
               | There is no excuse for any coder to even utter the word
               | "likely".
               | 
               | The ONLY answers to "Is this operation atomic?" or "Is
               | this function correct?" or "Does this cpu perform
               | division correctly?" Is either yes or no. There is no
               | freaking "Most of the time."
               | 
               | "Likely" only exists in the realm of user data and where
               | it is explicitly _created_ as part of an algorythm.
        
             | avianlyric wrote:
             | That just means the GIL is good at hiding concurrency bugs.
             | It doesn't make writing correct code any easier. Arguably
             | you could say it makes writing correct concurrent code
             | harder, because it'll take significantly longer for
             | concurrency bugs to cause errors.
        
           | chacham15 wrote:
           | There are certain classes of errors that it prevents. E.g:
           | 
           | Thread1: a = 0xFFFFFFFF00000000
           | 
           | Thread2: a = 0x00000000FFFFFFFF
           | 
           | One might think that the two possible values of a if those
           | are run concurrently are 0xFFFFFFFF00000000 and
           | 0x00000000FFFFFFFF. But actually 0x0000000000000000 and
           | 0xFFFFFFFFFFFFFFFF are also possible because the load itself
           | isnt atomic.
           | 
           | The GIL (AFAICT) will prevent the latter two possibilities.
        
             | adrian_b wrote:
             | Most CPUs guarantee that aligned loads and stores up to the
             | register size, i.e. now usually up to 64-bit, are atomic.
             | 
             | The compilers also take care to align most variables.
             | 
             | So while your scenario is not impossible, it would take
             | some effort to force "a" to be not aligned, e.g. by being a
             | member in a structure with inefficient layout.
             | 
             | Normally in a multithreaded program all shared variables
             | should be aligned, which would guarantee atomic loads and
             | stores.
        
               | knorker wrote:
               | Well, thread safety is exactly about these cases of
               | "well, it's hardly ever a problem".
               | 
               | Real life bugs have come from misapplication of correct
               | parameters for memory barriers, even on x86. Python GIL
               | removes a whole class of potential errors.
               | 
               | Not that I'm against getting rid of the GIL, but I'm more
               | sceptical that it won't trigger bugs.
               | 
               | Though in my opinion python just isn't a good language
               | for large programs for other reasons. But it'd be nice to
               | be able to multithread some 50 line scripts.
        
             | intrepidhero wrote:
             | But if you're writing to the same object from two different
             | threads you're going to have undefined behavior regardless
             | of the GIL, yes?
        
               | NovemberWhiskey wrote:
               | Not really. If you're doing an atomic write to the same
               | object from two different threads, you're going to have
               | one win the race and the other lose. That may be a bug in
               | your code, but it's not undefined behavior at the
               | language level.
        
               | ajkjk wrote:
               | It prevents classes of errors, such as, as the parent
               | mentioned, non-atomic writes to individual variables.
        
               | fulafel wrote:
               | No. The L in GIL stands for lock. So only the thread that
               | holds it can write or read from the object, and the
               | behavior is well defined at the C level, because C lock
               | acquire and release operations are defined to be memory
               | barriers.
        
               | dkersten wrote:
               | But when each thread reads the variable, you have no
               | control over which value you see, since you don't control
               | when each thread gets to run. So its undefined in the
               | sense that you don't know which values you will get: a
               | thread might get the value it wrote, or the value the
               | other thread wrote. The threads might not get the same
               | value either.
               | 
               | The GIL exists to protect the interpreter's internal
               | data, not your application's data. If you access mutable
               | data from more than one thread, you still need your own
               | synchronisation.
        
               | hexane360 wrote:
               | It depends what you mean by 'undefined behavior'. The GIL
               | makes operations atomic on the bytecode instruction
               | level. Critically, this includes loading and storing of
               | objects, meaning that refcounting is atomic. However,
               | this doesn't extend to most other operations, which
               | generally need to pull an object onto the stack,
               | manipulate it, and store it back in separate opcodes.
               | 
               | So with Python concurrency, you can get unpredictable
               | behavior (such as two threads losing values when
               | incrementing a counter), but not undefined behavior in
               | the C sense, such as use-after-free.
        
             | snek_case wrote:
             | AFAIK CPUs implement atomic load and store instructions and
             | the performance overhead of these is very small compared to
             | something like a software busy lock. So I think it's quite
             | possible to take away the GIL while still making it
             | impossible to load only half of a value.
        
             | ignoramous wrote:
             | Related: _Symmetric Multi-Processor primer for Android_ ,
             | https://developer.android.com/training/articles/smp
             | (although for Android/ARM, it makes for a pretty good read
             | on the topic).
        
         | rectang wrote:
         | I have read many anti-GIL arguments over the years that
         | approach soundness as optional. Is this change going to make a
         | bunch of previously sound code unsound?
        
         | hyperbovine wrote:
         | Conversely, I have found that the GIL makes it unexpectedly
         | easy to write thread-safe software in Python. Compare (in
         | Cython) writing
         | 
         |     with gil:
         |         call_a_method()
         |         print(some_debugging_info)
         | 
         | with all the sit-ups you'd have to do in a "real" concurrent
         | language.
        
           | ynik wrote:
           | The GIL doesn't really help Python code though, because the
           | interpreter may switch threads between any two opcodes.
           | 
           | It only protects the state of the Python interpreter and that
           | of C/Cython extension modules. Though even there, you can
           | have unexpected thread switches, e.g. in Cython `self.obj =
           | None` can result in a thread switch if the value previously
           | stored in `self.obj` had a `__del__` method implemented in
           | Python.
           | 
           | And AFAIK pretty much any Python object allocation can
           | trigger the cycle collector which can trigger `__del__` on
           | (completely unrelated) objects in reference cycles, so it's
           | pretty much impossible to rely on the GIL to keep any non-
           | trivial code block atomic.
        
       | ngrilly wrote:
       | Impressive proposal. That would remove a major limitation of
       | CPython.
        
       ___________________________________________________________________
       (page generated 2021-10-17 23:00 UTC)