[HN Gopher] 5% of 666 Python repos had comma typo bugs (inc V8, ...
___________________________________________________________________
5% of 666 Python repos had comma typo bugs (inc V8, TensorFlow and
PyTorch)
Author : rikatee
Score : 238 points
Date : 2022-01-07 17:00 UTC (6 hours ago)
(HTM) web link (codereviewdoctor.medium.com)
(TXT) w3m dump (codereviewdoctor.medium.com)
| usrbinbash wrote:
| Literally the second item in the "Zen of Python"
| (https://www.python.org/dev/peps/pep-0020/):
|
| _Explicit is better than implicit._
|
| And yet, s = ["one", "two" "three"] will implicitly and silently
| do something, that is probably wrong most of the time.
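|
| A minimal sketch of the behaviour described above, using the same
| list from the comment. The name `intended` is only an illustrative
| assumption for what the author presumably meant to write.

```python
# The missing comma after "two" makes Python merge the adjacent literals.
s = ["one", "two" "three"]
intended = ["one", "two", "three"]

print(len(s), s)                # 2 ['one', 'twothree']
print(len(intended), intended)  # 3 ['one', 'two', 'three']
```

| No error or warning is raised; the list is simply one element short.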
| dpedu wrote:
| Hmmm, it sounds like you're expecting "two" and "three" to be
| separate list elements because of some sort of implicit
| behavior due to being written in a list context. This is the
| opposite of what "Explicit is better than implicit" means.
|
| This is a list and you must explicitly place a comma when you
| want to start a new element in the list. Is there ever a time a
| new element follows a previous one and is NOT separated by a
| comma? No, this is explicit.
|
| Whereas, strings also always concatenate in this manner be it
| in a list context or not. It seems like you're assuming
| behaviors from other languages would be the same in another.
| matsemann wrote:
| No, we don't want it to implicitly be a list item. We want it
| to fail as invalid syntax. If I wanted the two and three
| strings to be combined, I would have /explicitly/ used an
| operator for that. It's the implicit behavior of that which
| is the problem.
| fantod wrote:
| Not to mention the implicit string concatenation that you
| get instead.
| ReleaseCandidat wrote:
| > it sounds like you're expecting "two" and "three" to be
| separate list elements
|
| I'd expect that to be an error.
| lijogdfljk wrote:
| Funny enough, in dynamic languages i expect it to do
| something unexpected and unwanted.
|
| This is why i like Go/Rust. I detest the implicit warts of
| these languages.
| wott wrote:
| It's not related to being dynamic or not, it's a
| syntactical choice: that's also the way to concatenate
| string literals in C.
| ReleaseCandidat wrote:
| Well, there are dynamic languages and dynamic languages.
| There are Python and Ruby and there are Elixir, Erlang
| and Lisps.
| aylmao wrote:
| Ah yes, why would anyone expect lists' main purpose to be
| listing?
|
| Sarcasm aside, I'd assume people primarily _list_ things in
| between [ and ], and _sometimes_ concatenate things in there
| too. The language should err on the side of doing what people
| expect, unless explicitly told not to.
|
| > It seems like you're assuming behaviors from other
| languages would be the same in another.
|
| Rather, I think people expect a language, especially one this
| big and important, to work for them, and not to be designed
| with unergonomic features instead.
| twobitshifter wrote:
| I could see lisp programmers missing the commas out of muscle
| memory
| doubleunplussed wrote:
| Your sarcasm is misplaced. I would prefer a SyntaxError to
| either of the implicit behaviours.
| bokchoi wrote:
| I'm not a python programmer, but the implicit string
| concatenation seems surprising to me.
| Dudeman112 wrote:
| I'm not a python programmer either, but I would be
| _seriously_ annoyed at implicit anything instead of syntax
| error
| kevin_thibedeau wrote:
| It's idiomatic in C.
| rat9988 wrote:
| This is not what implicit is about.
| ianbicking wrote:
| Implicit concatenation sure seems implicit to me
| suifbwish wrote:
| Implicit things are rarely nice in code for production
| environments. It makes bug tracing and security much more
| complicated
| oaiey wrote:
| This is indeed the point. Some use cases are amazing and
| increase quality while others are just pure evil.
| [deleted]
| jstx1 wrote:
| I mean the zen being wrong is kind of a meme at this point. The
| whole "only one obvious way to do it" isn't just false but the
| exact opposite is true. Python is one of the most flexible
| languages with many many ways to do the same thing; more than
| any other language I can think of.
| lenkite wrote:
| Python finally ended up following Perl's TMTOWTDI motto!
| https://en.wikipedia.org/wiki/There%27s_more_than_one_way_to...
| egeozcan wrote:
| > Complex is better than complicated
|
| What? Something being complex is artificial, we try to avoid
| it. Problems can be complicated, we try to simplify them, and
| the more complicated the problem is, the more complex the
| solutions we tend to develop. So comparing them does not make sense?
|
| Or did I always know them wrong?
| stonemetal12 wrote:
| Complex: consisting of many different and connected parts.
|
| Complicated: consisting of many interconnecting parts or
| elements; intricate.
|
| Nothing specifically artificial about either one. Software
| that is well decomposed is Complex (made of many smaller
| connected parts). Software that is poorly decomposed is
| Complicated (made of many smaller interconnected parts).
|
| Connected vs interconnected?
|
| Interconnected: connected at multiple points or levels (aka
| spaghetti code)
| dylan604 wrote:
| Complicated: this mutha is hard all by itself
|
| Complex: we took all of these simple steps, lumped them
| together, now we have this
| hibrass wrote:
| I first encountered the notion of complex/complicated in
| Antifragile I believe, and IIRC it's based on the [Cynefin
| framework](https://en.wikipedia.org/wiki/Cynefin_framework)
| .
|
| My understanding is that:
|
| * Complex domains lend themselves to experimentation and
|   emergent behavior.
| * Complicated domains lend themselves to analysis, expertise,
|   and rule following.
|
| The Wikipedia article offers the domains as containing
| "unknown unknowns" and "known unknowns" respectively.
|
| I'm trying to think how this maps to Python -- the language
| is complicated, while the problems we're solving are
| expected to be complex? Or, maybe, the language lives at
| the boundary between complicated and complex. We push
| complicated procedures into the language, and let the
| programmers deal with complex issues?
| nighthawk454 wrote:
| It's not particularly well-worded. A lot of dictionaries
| list complex/complicated as synonyms.
|
| I always took it to mean 'complex' as in having many
| connected parts, and 'complicated' more as in over-
| complicated or convoluted - the opposite of 'simple'. In
| other words, breaking something complicated into a system
| of intentionally-designed pieces is probably better than a
| chunk of opaque code to brute-force the current case. A
| good system is probably also 'simpler', despite having more
| pieces and interconnects.
| pmarreck wrote:
| Except exit.
|
| I knew Python wasn't for me in my first foray into it when I
| fired its REPL and then went to exit it with control-C or
| whatever and it _literally printed out the right way to do it
| but then didn't do it._ Python was more interested in having
| me do things a certain way _even when it knew what I intended
| to do, just to be a twit_.
| jrockway wrote:
| The REPL prints the value of a variable that you type in.
| exit is a variable, and so the REPL prints its value. If
| you want to run it as a function, you can do that, and
| indeed its string value is a message telling you to do
| that.
|
|     $ python3
|     Python 3.9.2 (default, Feb 28 2021, 17:03:44)
|     [GCC 10.2.1 20210110] on linux
|     Type "help", "copyright", "credits" or "license" for more information.
|     >>> exit
|     Use exit() or Ctrl-D (i.e. EOF) to exit
|     >>> exit.eof
|     'Ctrl-D (i.e. EOF)'
|     >>> exit.name
|     'exit'
|     >>> exit = 42
|     >>> exit
|     42
|     >>> exit()
|     Traceback (most recent call last):
|       File "<stdin>", line 1, in <module>
|     TypeError: 'int' object is not callable
|     >>>
|
| I would have special-cased exit, though.
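|
| The mechanism described above can be sketched with a small class.
| `FakeExit` is a hypothetical stand-in written for illustration, not
| CPython's actual implementation: its repr is the hint text, and only
| calling it ends the session.

```python
class FakeExit:
    """Hypothetical stand-in for the REPL's `exit` object (an assumption)."""

    def __repr__(self):
        # The REPL echoes repr(value) for a bare expression, so typing the
        # name alone would only display this text.
        return "Use exit() or Ctrl-D (i.e. EOF) to exit"

    def __call__(self, code=None):
        # Calling the object is what actually requests an exit.
        raise SystemExit(code)


fake_exit = FakeExit()
print(repr(fake_exit))  # shows the hint; nothing exits

try:
    fake_exit()
except SystemExit:
    print("calling it raised SystemExit")
```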
| animal_spirits wrote:
| Ctrl-c raises a KeyboardInterrupt error, which is useful
| for programs to catch. If you type
|
|     >>> exit
|     Use exit() or Ctrl-D (i.e. EOF) to exit
|
| you will get that response. The goal of this is to
| have the REPL language be exactly the same as the scripting
| language. exit() is supposed to be called as a function to
| make the language more consistent, so just typing `exit`
| will not exit the REPL.
| solox3 wrote:
| Notice that, in the original quote,
|
|     There should be one-- and preferably only one --obvious way to do it.
|
| the author used two different ways of hyphenating (three, if
| you count the whole PEP 20). PEP 20 is clearly not meant to
| be taken as law. Nor PEP 8. Nor PEP 257.
|
| People frequently mistake "one obvious way" with "one way".
| There are lots of ways to iterate through something, for
| example, but there is really one obvious way. And the
| philosophy here still applies: when you read anyone else's
| python code, the obvious way is probably doing the obvious
| thing. I think that is the more appropriate takeaway from PEP
| 20.
| jstx1 wrote:
| > And the philosophy here still applies: when you read
| anyone else's python code, the obvious way is probably
| doing the obvious thing.
|
| I don't get what you mean by this.
|
| When I read someone else's code, what is obvious to me
| isn't necessarily what was obvious to the author. For an
| illustration of this, have a look at the day 1 solution
| thread from this year's Advent of Code -
| https://www.reddit.com/r/adventofcode/comments/r66vow/2021_d...
| (you can search for Python solutions) - and see how many different
| ways there are to solve a fairly straightforward problem.
| srcreigh wrote:
| The author uses "only one" to clarify "one". So obviously
| "one" means at least one.
|
|     There should be at least one-- preferably only one --obvious way to do it.
|
| Kinda funny meta joke considering everybody conflates "one"
| and "only one" to mean the same thing. Preferably there
| would only be one obvious way to describe "one". :p
| bilalq wrote:
| It's not even obvious how to run Python or dependencies in
| the first place. Even putting aside the 2.7/3.x fiasco
| (that still causes problems even today), you're left with
| figuring out wheel vs egg vs easy-install vs setuptools vs
| poetry vs pip vs pip3 vs pip3.7 vs pip3.8 vs piptools vs
| conda vs anaconda vs miniconda vs virtualenv vs pyenv vs
| pipenv vs pyflow.
| dylan604 wrote:
| it's like you read my mind.
| JasonFruit wrote:
| I suspect this comment was an elaborate nerdsnipe.
| dragonwriter wrote:
| > the author used two different ways of hyphenating
|
| No, first, it doesn't use _hyphenating_ at all, it uses
| hyphens as an ASCII approximation for typographical dashes
| used to set off a phrase (a distinct function from
| hyphenation), and, second, in that quote they used one way
| of doing it: "two dashes set closed on the side of the main
| sentence and set open on the side of set-off phrase".
|
| It is an _unusual_ way of doing it--just as with actual
| typographical dashes, setting open or closed symmetrically
| would be more common--but it's not two ways.
|
| EDIT: And the third use (in the heading and later in the
| body) is separating parts where neither is a mid-sentence
| appositive phrase, and uses open-on-both sides. So that's
| not a different way of doing the _same_ thing, it's a
| different way of doing a semantically different thing.
|
| Actually, I think the dash use makes a good illustration of
| how the "it" in "one way to do it" is intended.
| Wowfunhappy wrote:
| > "two dashes set closed on the side of the main sentence
| and set open on the side of set-off phrase".
|
| Eh, I don't think that's the interpretation the author
| was going for. The author wanted to show two different
| ways of approximating a dash, and he had limited options.
|
| If he'd done this-- for example-- he would have been
| showing one way, not two.
|
| If he'd done this --for example-- you would have called
| it "two dashes set open on the side of the main sentence
| and set closed on the side of set-off phrase".
|
| If he'd done this-- for example -- it would have been too
| obvious (on the same line).
|
| I _suppose_ he could have done this-- for example--but I
| still think that would have been too obvious. You're not
| supposed to see it on a first read.
|
| > And the third use (in the heading and later in the
| body) is separating parts where neither is a mid-sentence
| appositive phrase, and uses open-on-both sides. So that's
| not a different way of doing the same thing, it's a
| different way of doing a semantically different thing.
|
| It's a different use of a dash, but it's still a place
| where you'd typically use a dash.
|
| -----
|
| Edit: You know what, thinking about it again--perhaps
| both interpretations are valid. That almost adds to the
| effectiveness of the whole thing.
| xigoi wrote:
| I can think of at least 2 obvious ways to iterate through
| something: for loops and comprehensions.
| voltagedivider wrote:
| You're right that both iterate through something but
| `for` loops and comprehensions aren't used as if they
| were interchangeable.
|
| For example, you'll sometimes see people do bad stuff
| like this:
|
|     >>> lst = []
|     >>>
|     >>> [lst.append(i + i) for i in range(10)]
|     [None, None, None, None, None, None, None, None, None, None]
|     >>>
|     >>> lst
|     [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
|     >>>
|
| When they should be doing this:
|
|     >>> lst = []
|     >>>
|     >>> for i in range(10):
|     ...     lst.append(i + i)
|     ...
|     >>> lst
|     [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
|     >>>
|
| Or just this:
|
|     >>> lst = [i + i for i in range(10)]
|     >>>
|     >>> lst
|     [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
|     >>>
| a_t48 wrote:
| lst = [range(0, 10, 2)]
| orlp wrote:
| That's wrong in multiple ways. You want
|
|     lst = list(range(0, 20, 2))
| a_t48 wrote:
| Ohh, yeah, you're right.
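|
| A short sketch of the two problems pointed out in this exchange. The
| variable names are illustrative assumptions: wrapping a range in
| brackets yields a one-element list, and a stop of 10 ends the
| sequence too early.

```python
wrapped = [range(0, 10, 2)]          # a one-element list holding a range object
wrong_stop = list(range(0, 10, 2))   # right type, but stops at 8
doubled = list(range(0, 20, 2))      # matches [i + i for i in range(10)]

print(len(wrapped))                           # 1
print(wrong_stop)                             # [0, 2, 4, 6, 8]
print(doubled == [i + i for i in range(10)])  # True
```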
| jstx1 wrote:
| The first append version will more often be in a loop.
| It's unlikely that someone will know enough to use
| comprehensions but not enough to still use append.
| voltagedivider wrote:
| Agreed. I've mainly seen the first `append` version in
| code written by people who've just discovered
| comprehensions and code golf.
| dragonwriter wrote:
| To _generate a list/dictionary/generator from an input
| iterable_, you use a comprehension of the appropriate
| type.
|
| To iterate through it _without_ doing one of those
| things, you use a for loop.
|
| In "one obvious way to do it", "it" refers to a concrete
| task; the same is not necessarily intended to be true of
| arbitrarily broad generalizations of _classes_ of tasks.
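|
| The distinction drawn above can be sketched as follows. The `words`
| data and the variable names are assumptions made for illustration.

```python
words = ["one", "two", "three"]

# Building a new collection from an iterable: a comprehension of the
# matching type.
lengths_list = [len(w) for w in words]      # list comprehension
lengths_dict = {w: len(w) for w in words}   # dict comprehension
lengths_gen = (len(w) for w in words)       # generator expression (lazy)

# Iterating only for a side effect: a plain for loop.
for w in words:
    print(w)
```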
| Quekid5 wrote:
| It's sort of like the Unix Philosophy. It sounds good and is
| probably a good thing to strive for generally, but it's
| ultimately pointless when it comes to actually evaluating
| whether approach A is better than approach B.
| fault1 wrote:
| the zen of python was written in the 90s.
|
| from that context it makes sense, because the only goal of
| python in the 1990s was to be more popular than perl, which
| was notorious in having many ways of doing the same thing.
|
| but yeah, python has had significant feature creep over the
| years, it's nowhere near the small clear lang it used to be.
| andi999 wrote:
| And still no expressive switch/case statement, breaking out
| of loops and ending scripts early (for explorative
| programming).
| jstx1 wrote:
| > And still no expressive switch/case statement
|
| There's match/case in 3.10 -
| https://www.python.org/dev/peps/pep-0636/
| gabagool wrote:
| >no expressive switch/case statement
|
| match/case (not a drop in switch statement)
|
| >breaking out of loops
|
| break
|
| >ending scripts early (for explorative programming)
|
| exit() or sys.exit()
| [deleted]
| oblvious-earth wrote:
| It was a meme when Zen was written, the spaces around the em
| dash are handled 3 different ways. Twice in the line you
| abbreviated, removing the joke.
| savant_penguin wrote:
| Matplotlib is an example of a library with at least two
| "correct" ways of plotting
| jstx1 wrote:
| But only one of them is recommended - the one that makes
| less sense.
| jofer wrote:
| How is working with figure and axes objects the one that
| makes less sense?
|
| Is it really that crazy to set up a figure, axes on that
| figure, and plot on the axes, returning an artist object
| for each plotting command?
| fault1 wrote:
| one is more or less based on matlab's plotting
| procedures, the other is an attempt at a cogent OOP
| implementation. However, the OOP
| paradigm just doesn't seem very good for plotting.
|
| Personally, I like plotting in R way better than in
| python. It has a lot better developer UX.
| jstx1 wrote:
| Yes, it is crazy. I guess this isn't really the place for
| it but ... From the official docs:
|
|     The Figure is the final image that may contain 1 or more Axes.
|     The Axes represent an individual plot (don't confuse this with
|     the word "axis", which refers to the x/y axis of a plot).
|
| This is infuriatingly bad and I firmly believe that it
| makes sense only to people who already know how it works.
| There's an image, axes (this word alone is a crime),
| plot, figure... it's like they took a bunch of synonyms
| and arranged them randomly to put together an API.
| dylan604 wrote:
| >axes (this word alone is a crime),
|
| why so? you prefer something like axiis?
| jstx1 wrote:
| See, that's the thing:
|
| > Axes object is the region of the image with the data
| space.
|
| In matplotlib axes is not the plural of axis. It has its
| own meaning specific to the API. And at the same time
| it's the plural form of another word (axis) which is also
| relevant in this context and it sounds almost identical
| when pronounced.
| marcosdumay wrote:
| I dunno. One sets global values everywhere, then collects
| them all into a plot. The other creates a bunch of
| apparently disconnected objects, sets a bunch of
| different attributes on each one, and then gets the plot
| from one of those objects.
|
| If I was designing something like it, I wouldn't
| recommend either. The global one has many fewer WTFs per
| character, but the objects one looks like it works in a
| multithreaded program or that you can create more than
| one plot without displaying them (but I've never tested
| this).
| andi999 wrote:
| Which two ways?
| jstx1 wrote:
| Object-oriented vs Pyplot -
| https://matplotlib.org/matplotblog/posts/pyplot-vs-object-or...
| webmaven wrote:
| _> I mean the zen being wrong is kind of a meme at this
| point. The whole "only one obvious way to do it" isn't just
| false but the exact opposite is true. Python is one of the
| most flexible languages with many many ways to do the same
| thing; more than any other language I can think of._
|
| Not in comparison to Perl, which usually has multiple ways to
| do anything, each 'obvious' to different sets of people (each
| Perl codebase therefore seems to have a distinct dialect
| based on which 'obvious' alternatives are chosen).
|
| The other direction languages can take that is being
| contrasted, is there being one non-obvious way to do
| something.
|
| Python's 'most obvious way' isn't necessarily the
| fastest/most concise/most efficient/scalable/etc. way to do
| something in Python, but it will usually be obvious to most
| Python developers. And although broad styles have certainly
| developed over time (imperative, functional, OO) as Python
| has gained power and flexibility, the dictum still largely
| holds true.
| hnlmorg wrote:
| 10 years ago I'd have agreed with you. But Perl has gone a
| long way in pulling back from some of that insanity while
| Python has been giving C++ a run for its money in terms of
| features.
| onphonenow wrote:
| I'd totally agree - there's been a burst of sort of the
| perl style stuff (:= ?) to gain relatively small wins.
|
| ie, instead of
|
| for line in lines: print(line)
|
| we are supposed to be using
|
| while line := f.readline(): print(line)
|
| I've not been super impressed with this type of thing.
|
| That said, string formatting is better with f strings.
|
| They also rolled back some of the forced breakage from
| trying to force unicode with 3 which made a big
| difference. 3.3 added back u''
|
| Lots of good cleanups lstrip vs removeprefix etc.
|
| Underscores in numeric literals (10000000 vs 10_000_000)
|
| So lots of good stuff still landing.
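|
| A brief sketch of two of the cleanups listed above, assuming Python
| 3.9 or later for `str.removeprefix`. The `host` value is an
| illustrative assumption; it shows why `lstrip` is a poor substitute,
| since it strips a set of characters rather than a prefix.

```python
host = "www.wwdc.com"

# lstrip treats its argument as a set of characters {'w', '.'} and keeps
# stripping until it meets a character outside that set.
print(host.lstrip("www."))        # dc.com

# removeprefix removes the exact prefix once, if present.
print(host.removeprefix("www."))  # wwdc.com

# Underscores in numeric literals are purely visual.
print(10_000_000 == 10000000)     # True
```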
| dragonwriter wrote:
| > ie, instead of
|
| > for line in lines: print(line)
|
| > we are supposed to be using
|
| > while line := f.readline(): print(line)
|
| No, we're not. Walrus, in loops, IME, is more for
| replacing this pattern:
|
|     while True:
|         myvar = get_it()
|         if not ok(myvar):
|             break
|         # code that uses myvar
|
| with this pattern:
|
|     while ok(myvar := get_it()):
|         # code that uses myvar
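|
| A runnable sketch of the two loop patterns, assuming Python 3.8 or
| later. Here `ok` and the in-memory `stream` are hypothetical
| stand-ins for the comment's `ok()` and `get_it()`.

```python
import io


def ok(chunk):
    # Hypothetical predicate: an empty read means end of input.
    return chunk != ""


stream = io.StringIO("abcdefgh")

# Pattern 1: loop-and-a-half with an explicit break.
first = []
while True:
    chunk = stream.read(3)
    if not ok(chunk):
        break
    first.append(chunk)

stream.seek(0)

# Pattern 2: the same loop using an assignment expression.
second = []
while ok(chunk := stream.read(3)):
    second.append(chunk)

print(first)            # ['abc', 'def', 'gh']
print(first == second)  # True
```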
| onphonenow wrote:
| False. I have been harshly attacked here on HN for
| suggesting things like for line in lines - literally been
| called "stupid".
|
| I'm not the only one who looked at the recommended
| examples of the use case here and went, huh?
|
| https://news.ycombinator.com/item?id=17450890
|
| Recommended new way:
|
|     if any(len(longline := line) >= 100 for line in lines):
|         print("Extremely long line:", longline)
|
| Old way:
|
|     for line in lines:
|         if len(line) >= 100:
|             print("Extremely long line:", line)
|             break
|
| I prefer the old way. These were examples in the PEP!
|
| In your example get_it() might be better as a generator
| or iterable. A lot of code looks great if you push that
| type of thing down a bit, and sometimes memory is helped
| as well. Then you iterate over it, for values in get_it.
| This keeps python very natural. You start to get a lot of
| weird line noise type code with := vs the old python
| style which while a bit longer was basically pseudo-code.
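|
| For comparison, a sketch showing that both styles quoted above find
| the same line, plus a generator-based variant in the spirit of the
| last paragraph. The sample `lines` data and the `found_*` names are
| assumptions; Python 3.8 or later is assumed for `:=`.

```python
lines = ["short", "x" * 120, "y" * 150]

# Walrus inside any(): longline keeps the line that made any() stop.
found_walrus = None
if any(len(longline := line) >= 100 for line in lines):
    found_walrus = longline

# Explicit loop with break.
found_loop = None
for line in lines:
    if len(line) >= 100:
        found_loop = line
        break

# Generator expression with next() and a default.
found_next = next((line for line in lines if len(line) >= 100), None)

print(found_walrus == found_loop == found_next)  # True
```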
| oaiey wrote:
| I am a bit in shock. Accidental string concatenation. Python just
| lost a lot of reputation in my brain.
| atleta wrote:
| Not sure if it's irony or not. After all, this is not really
| accidental string concatenation but an easy to make type error
| which can go undetected due to the dynamic typing (and the lack
| of thorough type annotation in most code).
|
| The string concatenation in itself should not be a problem as
| it's really just string constants. (But again, it might be
| irony exactly because of this :) )
| ErikCorry wrote:
| In most languages an array with 3 elements has the same type
| as an array with 2 elements so the type system isn't going to
| warn you about the difference between
|
| ("foo" "bar", "baz")
|
| and
|
| ("foo", "bar", "baz")
| skitter wrote:
| They still tend to differentiate between 2- and 3 element
| tuples (but I agree that the implicit concatenation is
| problematic).
| atleta wrote:
| Fair enough. I was only thinking about the str vs tuple
| case. So when you have 2 elements in the parenthesis.
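|
| A minimal sketch of both cases discussed here; the variable names
| are illustrative assumptions. With two literals the missing comma
| changes the type from tuple to str, while with three it only changes
| the length.

```python
merged = ("foo" "bar")          # parentheses only group: this is a str
pair = ("foo", "bar")           # the comma makes it a 2-tuple

two = ("foo" "bar", "baz")      # still a tuple, but with 2 elements
three = ("foo", "bar", "baz")   # the intended 3-tuple

print(type(merged).__name__, type(pair).__name__)  # str tuple
print(len(two), len(three))                        # 2 3
```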
| oaiey wrote:
| Unfortunately no irony.
|
| I come from a programming platform (C#) where productivity is
| a key element of language design. I highly doubt that Anders
| Hejlsberg would have accepted such an error-prone concept
| like a literal free implicit operator on a key type like
| strings.
| atleta wrote:
| Well, I guess it's true for most languages that productivity
| is intended to be a key element of design. (For python,
| definitely. But I also remember James Gosling saying this
| about Java.) This implicit concatenation seems to come
| (inherited?) from C.
|
| I kind of remembered that some languages do support it for
| breaking strings into multiple lines conveniently. I'm a bit
| surprised that it works even on one line (I've never used it,
| because why would I have), but you're likely to make the
| mistake on multiline statements anyway. I've also checked
| and it doesn't work in java (which I kind of remembered,
| though I mostly do python these days).
| amelius wrote:
| > for breaking strings into multiple lines conveniently
|
| What is inconvenient about just adding a + at the end or
| beginning of the line?
| ErikCorry wrote:
| Misspelling a variable on the lhs of an assignment just causes
| a new variable to be created with the new name. That's a lot
| worse in my book.
| version_five wrote:
| I don't think that's the same kind of thing. Your example is a
| tradeoff that anyone who uses a language that doesn't require
| explicit variable declaration faces, and it's pretty tough to
| argue such languages really shouldn't exist.
|
| Missing an operator resulting in implicit behavior is much
| more subtle and not even obvious behavior. For those who use
| python, it is worse.
| SomeCallMeTim wrote:
| > ...it's pretty tough to argue such languages really
| shouldn't exist.
|
| "Shouldn't exist" is too strong.
|
| Dynamic languages that let you create a new variable via
| assignment _shouldn't be used to create non-trivial
| software._ How about that?
|
| Scripting languages have a place. That place is 100% in
| creating quick-and-dirty scripts and tools. Or in doing
| some kind of one-off data transform (as is common in
| machine learning scenarios). Anything that has a life span
| of two weeks or less, or a code length of fewer than a
| hundred lines? Yeah, script languages rock for that.
|
| Explicit/static typing adds vastly more value to large
| projects than the cost of the overhead. The fact that you
| _can't_ really gain that value in Python means that Python
| should be relegated to quick and dirty scripts.
|
| Same for JavaScript, Ruby, and other completely dynamic
| languages.
|
| You'll note that all of these languages are getting types
| one way or another, meaning that there are a lot of people
| who do recognize their value. Though TypeScript is years
| ahead of the rest in the completeness and sophistication of
| its type system; bugs like the comma bug detailed by OP,
| along with simply _every_ JavaScript "wat" bug, simply
| can't happen in TypeScript in strict mode. And static types
| enable entire other categories of bugs to be detectable
| via a linter as well.
| simonw wrote:
| I've been building non-trivial software in dynamic
| languages for twenty years. They work great.
|
| I'd take a project in a dynamic language with a decent
| test suite over a project without tests in a statically
| typed language any day of the week.
| shultays wrote:
| it's pretty tough to argue such languages really shouldn't
| exist
|
| Well, I agree with OP so that is at least two people. I
| really don't see it as a good trade.
| bilkow wrote:
| Explicit variable declaration is just adding a keyword
| (such as var or let) when you're declaring a new variable
| instead of modifying one.
|
| The cognitive burden of having to memorize and look for
| which variables are new vs which are being modified is
| simply not worth it in my opinion, even for a scripting
| language. Maybe for esolangs, simple math or first time
| learning programming.
|
| In any case, it's a shortcoming of the language (IMO) but
| not a deal breaker. We learn to live with it.
| voltagedivider wrote:
| Isn't that common for all/most languages that don't require
| explicit typing?
| xigoi wrote:
| JavaScript (strict mode) doesn't have explicit typing, but
| it still requires variables to be declared.
| wott wrote:
| Same for Perl.
| samhw wrote:
| It would be impossible in any language that requires either
| explicit typing or some kind of 'let' keyword. (Or, in the
| fringe case, a language like Go which uses a different
| operator for initialisation-plus-assignment.)
| voltagedivider wrote:
| Exactly. That's why I asked about languages that _don't_
| require explicit typing. My point is that it's a feature
| of many languages rather than a Python idiosyncrasy.
| dragonwriter wrote:
| Declaration and explicit typing are logically orthogonal,
| but few if any languages require typing but not
| declaration. Lots require declaration but not typing.
| ReleaseCandidat wrote:
| I'd say unexpected behavior is always worse than expected
| one.
|
| Yes, you'll certainly find somebody who doesn't know what
| 'not statically typed' means, but ... And yes, there are also
| C(++) users, that expect strings to be concatenated like
| that.
| obua wrote:
| You seem to also not know what "not statically typed"
| means. It certainly does not mean "not properly scoped".
| ReleaseCandidat wrote:
| Yes, of course. But you see that no scope keywords exist
| in Python. But there exists `+` to concatenate strings
| (too).
| fragmede wrote:
| Keywords like _namespace_ , no; but functions and classes
| and modules provide for a lot of scoping opportunities.
| ReleaseCandidat wrote:
| The problem is `fop` should be `foo`:
|
|     foo = 5
|     fop = 6
|
| Keywords like `let` solve this problem:
|
|     let foo = 5
|     fop = 6 # error
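|
| In Python terms, the misspelling problem can be sketched like this.
| The function names and the `totl` typo are made up for illustration.
| The buggy version runs without any error because assignment to a new
| name is always allowed; an unused-variable lint check may flag it,
| but the interpreter does not.

```python
def total_buggy(values):
    total = 0
    for v in values:
        # Typo: this binds a brand-new local name instead of updating total.
        totl = total + v
    return total


def total_fixed(values):
    total = 0
    for v in values:
        total = total + v
    return total


print(total_buggy([1, 2, 3]))  # 0
print(total_fixed([1, 2, 3]))  # 6
```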
| deathanatos wrote:
| Not entirely:
|
|     let foo = a();
|     let foo = b(foo);
|     let fop = c(foo);
|     let foo = d(foo);
|
| (Which is valid, e.g., in Rust.)
| TheEzEzz wrote:
| You do get a warning, though. And most Rust projects I've
| seen usually adhere to 0 warnings.
| ErikCorry wrote:
| Or := for declaration like Go and Toit
| ReleaseCandidat wrote:
| Yes, or another symbol instead of `=` for assignment,
| like `<-` (F#)
| oaiey wrote:
| While I agree, this is somehow something I expect. Implicit
| string concatenation without operator or function around it
| sounds just like a terrible idea. It breaks the basic syntax
| concept of `foo X bar`. On the other hand it is probably very
| handy with DSLs and things like that.
| jstx1 wrote:
| That's a complaint against the entire type system, nothing to
| do with misspelling.
| samhw wrote:
| It has nothing to do with the type system? It's an issue
| with implicit declaration. You could very easily require
| explicit declaration while retaining the selfsame type
| system.
| jstx1 wrote:
| Huh, you're right. It would be bizarre to see something
| like this in Python though. I've never even thought of it
| as being implicit declaration.
| ehsankia wrote:
| C/C++ has the exact same thing, no?
| colpabar wrote:
| I was going to comment something like "who would even use
| this?" and then I remembered that I have in fact used that
| feature :) It's a somewhat "nice" way to write long strings and
| keep the code from getting too wide. I never did it inside an
| array, but I found breaking up a long string into smaller ones
| and wrapping them in parens without a comma was convenient, for
| things like error messages.
|
| But that's just what comes with a hyper flexible language like
| python. You can do lots of things in lots of different ways,
| but you can also screw things up just as easily, and your IDE
| won't tell you because technically it's valid code.
| BeetleB wrote:
| Heh. I use it all the time the way you do and didn't realize
| this is alien to many developers (no one in my team ever
| complained about it).
|
| It's common in some languages and used the way you use it. I
| looked in PEP8 and it seems they don't discuss this.
|
| I think it's a perfectly valid use case, but clearly there
| are two camps to this. If this is so contentious, I would
| recommend PEP8 be revised to either explicitly endorse it as
| a way to split long lines or to explicitly discourage it and
| recommend the + operator instead.
| oaiey wrote:
| I completely get that. That is a very nice feature for
| building DSL or libraries with special needs. But it makes
| the overall language very dangerous.
|
| Is this "operator" overloadable on each type in Python?
|
| And that scares me a lot. I think I have to reevaluate my
| position towards Python.
| housecarpenter wrote:
| It's not really an operator. It's part of the syntax of
| string literals. "foo" "bar" is an alternative way of
| writing the string literal "foobar". If foo is not a string
| literal, foo "bar" is invalid syntax.
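|
| This can be checked with the standard `ast` module. A minimal
| sketch, assuming Python 3.8 or later (where literals parse to
| `ast.Constant`); the `"<example>"` filename is arbitrary.

```python
import ast

# Two adjacent literals parse to a single constant node.
tree = ast.parse('"foo" "bar"', mode="eval")
print(type(tree.body).__name__, repr(tree.body.value))  # Constant 'foobar'

# A name followed by a literal does not parse at all.
try:
    compile('foo "bar"', "<example>", "eval")
    name_then_literal_ok = True
except SyntaxError:
    name_then_literal_ok = False

print(name_then_literal_ok)  # False
```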
| oaiey wrote:
| Okay... So it is not an implicit operator. That is good.
| Some small reputation points are regained.
|
| Thanks.
| silisili wrote:
| Why not just use plusses? Or perhaps a join func, which would
| accomplish the same.
|
| I get the use case as you described it, but it just seems
| like minimal effort to accomplish and have some semblance of
| explicit/safety.
| dnautics wrote:
| or if that's the use case, require the whitespace to
| include a \n or \r\n... It's not like python doesn't have
| significant whitespace already.
| ErikCorry wrote:
| That wouldn't fix most of the cases highlighted by the
| tool in the article.
|
| So strange that Python has completely different syntax
| from C, but they chose to copy this obscure syntactic
| feature _even though they have the plus operator on
| strings_.
| TonyRobbins wrote:
| But I think Python is better than C because of its syntax;
| you cannot really compare Python and C! Just my opinion.
| shultays wrote:
| You could have the same behavior by enforcing + operation in
| between
|
|     mylongstring = "hello" +
|                    "world"
|
| No idea if python's way of indentations allows this but
| sounds like it should
| [deleted]
| ReleaseCandidat wrote:
| No, it doesn't:
|
|     mylongstring = ("hello" +
|                     "world")
|
| or, without `+`
|
|     mylongstring = ("hello"
|                     "world")
| fragmede wrote:
| Use \
|
|     mylongstring = "hello " \
|                    "world " \
|                    "my " \
|                    "name " \
|                    "is"
| BeetleB wrote:
| The use of \ is discouraged in Python. From PEP8:
|
| > The preferred way of wrapping long lines is by using
| Python's implied line continuation inside parentheses,
| brackets and braces. Long lines can be broken over
| multiple lines by wrapping expressions in parentheses.
| These should be used in preference to using a backslash
| for line continuation.
| pmontra wrote:
| As a comparison, in Ruby
|
|     puts "a" "b" == "ab" # true
|
| and
|
|     puts "a"
|     "b" == "ab"
|
| prints "a" with "b" == "ab" evaluated to false and discarded.
| This could create bugs as with Python. However
|
|     ["a"
|     "b"] == ["ab"]
|
| is a syntax error at the beginning of the second line. The parser
| expects a ]. It would evaluate to true if it were on one line.
| grey-area wrote:
| In Ruby one too many commas can also cause problems:
|
| # list
|
| list = "a","b",
|
| # function
|
| def foobar
|
| end
|
| => ["a", "b", :foobar]
| asow92 wrote:
| I'm sure the devil is in the details on this bug.
| aeturnum wrote:
| The high-level goals of python end up creating these little
| syntactic landmines that can get even experienced coders. My
| personal nomination for the worst one of these is that having a
| comma after a single value often (depending on the surrounding
| syntax) creates a tuple. It's easy to miss and creates maddening
| errors where nothing works how you expect.
|
| I've moved away from working in Python in general, but I think
| the #1 feature I want in the core of the language is the ability
| to make violating type hints an exception[1]. The core team has
| been slowly integrating type information, but it feels like they
| have _really_ struggled to articulate a vision about what type
| information is "for" in the core ecosystem. I think a little
| more opinion from them would go a long way to ecosystem health.
|
| [1] I know there are libraries that do this, I am not seeking
| recommendations.
| hsbauauvhabzb wrote:
| I'd rather a compile time error over an exception (or both),
| which in many cases can occur. I know mypy does this, maybe I
| should alias python="mypy&&python"
| luhn wrote:
| I feel like it's been pretty clear from day one that type hints
| are meant for static analysis with tools like mypy. It's not
| exclusive to that use and has a lot of other possible
| applications, but the primary goal has always been static analysis.
| aylmao wrote:
| The lack of a static type-system is IMO what makes these one-
| character mistakes very annoying. The compiler can't tell you
| something is wrong, so you're just left to figure out why
| things are broken, just to realize it was the smallest of
| typos.
| tyingq wrote:
| C lets me do this, and doesn't say much about it.
|
|     char ch_arr[3][10] = { "uno", "dos"
|                            "tres" };
| aeturnum wrote:
| I love how simple and forgiving Python is for small projects.
| The "trailing comma creates a tuple" situation comes out of,
| as far as I can tell, a desire to create maximally convenient
| syntax in the scenarios where tuples are intended. I think
| that's great for small code!
|
| I just wish that the core team would take that same zeal for
| a "pythonic" experience with small code and use it to develop
| more scaled-up systems for dealing with larger code bases. My
| idea is to enforce strong pre-conditions on function calls
| using type hints, but I am sure there are other ways to do
| it.
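|
| A rough sketch of what such a pre-condition check could look
| like, using only the standard library. The names enforce_hints
| and repeat are hypothetical, and only plain classes on ordinary
| positional or keyword parameters are checked here:

```python
import functools
import inspect
import typing


def enforce_hints(func):
    """Raise TypeError when an argument violates a plain-class hint."""
    hints = typing.get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            hint = hints.get(name)
            # Parameterized generics such as list[int] have an origin
            # and are skipped in this sketch; so are missing hints.
            is_plain_class = (
                isinstance(hint, type) and typing.get_origin(hint) is None
            )
            if is_plain_class and not isinstance(value, hint):
                raise TypeError(
                    f"{name} must be {hint.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_hints
def repeat(text: str, times: int) -> str:
    return text * times


print(repeat("ab", 2))
```

| With this in place, repeat("ab", "2") raises a TypeError from
| the wrapper before the function body runs.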
| DangitBobby wrote:
| For a language that is so incredibly picky about its
| whitespace rules, it's a little laissez faire on the
| string-concatenation/tuple syntax side. I say this as
| someone who loves python and uses it extensively.
| trulyme wrote:
| The "trailing comma creates a tuple" bug actually comes
| from a disconnect between what people think defines a tuple
| (parentheses) and what really does (comma). I always put
| parentheses around a tuple for clarity.
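|
| A short illustration of that disconnect (load_name is a
| hypothetical function, only there to show the classic slip):

```python
value = 1,        # trailing comma: this is the tuple (1,)
grouped = (1)     # parentheses only group: this is the int 1
explicit = (1,)   # parentheses plus comma, the clearer spelling


def load_name():
    # The stray comma turns the return value into a 1-tuple.
    return "alice",


print(type(value).__name__, type(grouped).__name__, load_name())
```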
| iooi wrote:
| If you use mypy (as anyone should for any non-hobby Python
| usage) then Python has one of the strongest type systems
| available. Optional types, generics, "Any" escape hatches,
| everything you could want.
| aeturnum wrote:
| mypy is a great project and I agree that basically every
| project at scale should use it. However, I think you're
| wrong about the strength of the Python type system and
| what a good type system can "get" you. I think mypy both
| does an amazing job at static checking and that more
| powerful type systems go far beyond static checks and
| into changing how you structure and write code. The
| "structural pattern matching" they just
| introduced[1] is an example of the kind of feature that
| could be usefully expanded by making type a first-class
| part of the Python runtime.
|
| Again - the dynamism of Python means teams can write
| amazing extensions to Python (like mypy), but that isn't
| a replacement for the core team having a plan for how
| they think typing information should be used at runtime.
| Their current answer seems to be "nothing," which
| disappoints me.
|
| [1] https://www.python.org/dev/peps/pep-0622/
| ehsankia wrote:
| A lot of people in this thread are using this to make fun of
| Python, but the exact same issue exists in something like C++,
| here's some I fixed recently:
|
| https://github.com/UWQuickstep/quickstep/pull/9
|
| https://github.com/tensorflow/tensorflow/pull/51578
|
| https://github.com/mono/mono/pull/21197
|
| https://github.com/llvm/llvm-project/pull/335
| aeturnum wrote:
| I didn't understand anyone to be saying that Python is the
| only language to have this flaw.
|
| Also, I personally don't mind this approach to string
| concatenation. I think it's a fine compromise between easy
| formatting and clarity. I was whining about a corner case of
| tuple construction - which as far as I know is not a feature
| of any other language.
| macNchz wrote:
| I've been writing Python professionally full time for 8 years and
| still occasionally make the trailing-comma-tuple mistake. These
| days at least I'll recognize and be able to find it quickly
| rather than wasting time. Can be caught with a linter, but not
| every codebase is readily linted.
| kazinator wrote:
| Not in Lisp! ("foo" "bar") and ("foobar") are lists of length 2
| and 1, respectively.
|
| (Python copies some bad ideas from C. Another one is having to
| _import_ everything you use. It seems that since Python is
| written in C, its designer took it for granted that there will be
| something analogous to #include for using libraries, even
| standard ones that come with the language.)
|
| Implicit string literal catenation is tempting to implement
| because it solves problems like:
|
|     printf("long %s string"
|            "nicely breaks up"
|            "with indentation and all", arg, arg, ...)
|
| and if you're working in a language which has comma separation
| everywhere, you can get away with it easily.
|
| There are other ways to solve it. In TXR Lisp, I allow string
| literals to go across multiple lines with a backslash newline
| sequence. All contiguous unescaped whitespace adjacent to the
| backslash is eaten:
|
|     This is the TXR Lisp interactive listener of TXR 273.
|     Quit with :quit or Ctrl-D on an empty line. Ctrl-X ? for cheatsheet.
|     TXR needs money, so even abnormal exits now go through the gift shop.
|     1> "abcd \ efg"
|     "abcdefg"
|
| If you want a significant space, you can backslash escape it; the
| exact placement is up to you:
|
|     2> "abcd\ \ efg"
|     "abcd efg"
|     3> "abcd \ \ efg"
|     "abcd efg"
|     4> "abcd \ \ efg"
|     "abcd efg"
|     5> "abcd \ \ \ efg"
|     "abcd efg"
| edflsafoiewq wrote:
| The Python certainly looks nicer though.
| [deleted]
| rileymat2 wrote:
| I like imports, it tells me what files symbols are coming from,
| even for built in libraries.
|
| Maybe it is that through my work I use a half dozen languages,
| where it is hard to remember each in detail.
|
| I have also worked on a javascript project where there were no
| imports/requires and the build process created one file. So you
| had to inspect the confusing build script to even know what was
| what.
| kazinator wrote:
| You could fairly easily work with a bunch of .js files that
| get catenated together by using an editor that can jump to a
| definition.
|
| Build processes creating one file is the seven decade norm in
| computing.
|
| Even if you literally don't catenate the .js files into one,
| they get loaded into one running image one way or another.
| stevesimmons wrote:
| I like the explicit nature of Python's imports.
|
| And especially how I can choose the best way to indicate the
| sources of names in my code:
|
|     import time
|     t = time.perf_counter()
|
|     import time, my_module
|     t1 = time.perf_counter()
|     t2 = my_module.perf_counter()
|
|     from time import perf_counter as std_counter
|     from my_module import perf_counter as my_counter
|     t1 = std_counter()
|     t2 = my_counter()
|
|     try:
|         from my_module import perf_counter
|     except ImportError:
|         # Fall back to standard implementation
|         from time import perf_counter
|     t = perf_counter()
|
|     # import time as m
|     import my_module as m
|     t = m.perf_counter()
| kazinator wrote:
| > import perf_counter as my_counter
|
| Yikes; you're renaming/aliasing global identifiers! Just
| no.
| tgv wrote:
| The difference is: in C, it's pretty unlikely someone wants to
| add strings. I suppose it's even illegal in the later C
| versions.
| kazinator wrote:
| It is positively not illegal in any standard version of C
| since ANSI C 89.
|
| It's an essential feature used in all sorts of everyday code.
|
| C99 added printf conversion specifiers that are hidden behind
| macros, and idiomatic usage of them relies on string
| catenation.
|
|     uint32_t x = 0;
|     printf("x = %" PRIx32 "\n", x);
|
| where PRIx32 might expand to "lx" (if uint32_t is the same
| as unsigned long in that compiler).
|
| All sorts of C macrology relies on string catenation. Kernel
| print messages:
|
|     printk(KERN_EMERG "%s: temperature sensor indicates fire!", dev->name);
|                      ^ must not have comma here
| lanstin wrote:
| Interesting. Arguably tho this shows how C is aging. I find
| that PRIx32 a bit ugly.
|
| Although I just had a (logging) use case in go where I
| missed cpp macros - wanted the log statement to get
| something from the file and just had to pass it in as
| another parameter.
| kazinator wrote:
| I have also never used PRI-anything. It's a crime against
| readability.
|
| If I have a uint32_t which needs printing I cast it to
| (unsigned long) and use %lu or %lx. This requires more
| typing in the argument list, but keeps the format string
| tidy. It's important for the format string to be tidy,
| because that's the reason of its existence: to clearly
| and concisely convey the shape of what is being printed.
| tgv wrote:
| I know that. I meant that "abc" + "def" is most likely
| illegal (although "abc" + 'd' is not).
| wott wrote:
| > I meant that "abc" + "def" is most likely illegal
|
| That would be adding 2 pointers, and that's indeed
| illegal.
|
| However, you can subtract them: "abc" - "def" . Now, the
| result is not a pointer any more, it's a ptrdiff_t (an
| integer type), so most compilers will warn if you try to
| assign that to a char *.
| kazinator wrote:
| You started talking about "adding strings" in a thread
| about adjacent literals, without mentioning any +
| operator..
|
| String catenation ("adding") by adjacency (no visible
| operator) is a thing; "add" doesn't imply that we are
| talking about a + operator:
|
|     $ awk 'BEGIN { x = "abc-" 2 + 2 "-def"; print x}'
|     abc-4-def
| Spivak wrote:
| I'm gonna disagree on the import thing. Compared to Ruby where
| requires are magic bags of metaprogramming bullshit, Python is
| much much easier to reason about. It takes some getting used to
| that require 'json' actually adds methods to existing classes.
| kazinator wrote:
| "require 'json'" is just another #include in disguise, and if
| it monkey patches existing classes, it ... probably should
| not exist in any form.
|
| If the language supports json, it should just do that.
|
|     1> #J[1,2,3]
|     #(1.0 2.0 3.0)
|     2> (get-json "[1,2,3,{\"foo\":true}]")
|     #(1.0 2.0 3.0 #H(() ("foo" t)))
|     3> (put-json #(1.0 2.0 t))
|     [1,2,true]t
| Spivak wrote:
| Welcome to Ruby.
|
|     $ irb
|     irb(main):001:0> { hello: "world" }.to_json
|     NoMethodError (undefined method `to_json' for {:hello=>"world"}:Hash)
|     irb(main):002:0> require 'json'
|     irb(main):003:0> { hello: "world" }.to_json
|     => "{\"hello\":\"world\"}"
| kazinator wrote:
| I mean, I understand that classes which are open to
| extension with new methods is useful, and the right way
| to do OOP and all.
|
| If it was CLOS with multiple dispatch, it would be easier
| to swallow. Because it would look like:
|
|     (to-json { hello: "world" }) ;; error: no such function!
|
| Then load the module, and you have a generic to-json
| function now, with a method specialized to handle the
| dictionary object and all. (I still wouldn't want to be
| doing this if it's supposed to be a language built-in).
|
| I regard the ability to add new methods to a class as
| good, but with a valid use case, like extending some
| third party piece with new methods in your own
| application. And the fact of not having to declare
| methods in a class definition, which is cumbersome. Just
| write a new method in that class's file, at the bottom,
| and there it is.
|
| I ideally don't want that third-party piece itself to be
| divided into three pieces that I have to separately load
| to get all of the methods. Or worse, pieces from separate
| third parties that add methods to each other.
|
| I copied a thing or two from Ruby in TXR Lisp. The object
| system has a derived hook, and that was inspired by
| something in Ruby:
|
|     1> (defstruct foo ()
|          (:function derived (super sub)
|            (prinl `derived @super @sub`)))
|     #<struct-type foo>
|     2> (defstruct bar foo)
|     "derived #<struct-type foo> #<struct-type bar>"
|     #<struct-type bar>
|     3> (defstruct xyzzy bar)
|     "derived #<struct-type bar> #<struct-type xyzzy>"
|     #<struct-type xyzzy>
|
| The derived hook is inherited (like any other static
| slot), so it fires in bar also. The function can
| distinguish which class is being derived by the super
| argument.
| Someone wrote:
| You mean
|
|     "long %s stringnicely breaks upwith indentation and all"
|
| ? In my experience, this always gets ugly when you want to
| insert spaces (= about always). Do you put them at the end or
| at the start of each string (apart from the first or last
| string)?
|
| I think scala's _mkString_
| (https://superruzafa.github.io/visual-scala-
| reference/mkStrin...) is the best solution, visually, for such
| things, but unfortunately, it would require hacks in the
| parser to do the concatenation at compile time, where possible.
|
| Scala's multiline strings look nice, too, if you want to insert
| newlines, except for the _stripMargin_ thing
| (https://docs.scala-lang.org/overviews/scala-book/two-
| notes-a...)
| kazinator wrote:
| The spaces aren't the point of the comment; rather that we
| can break the literal into pieces and indent those pieces
| without affecting the contents. In a non-strawman real
| example with real data, of course we include all the
| necessary spaces in the literals. However, this bug is easy
| to make in C; I've seen it numerous times.
| Someone wrote:
| That's precisely my point. This looks nice, but it's too
| easy to forget one of those spaces and too hard to spot
| that.
| NoahTheDuke wrote:
| > Another one is having to import everything you use.
|
| The alternative is what exactly? Have the entire standard
| library exposed at once? Make all modules create non-
| conflicting names for exported objects, so that the json parse
| function has to be called json_parse and the csv parse function
| has to be called csv_parse?
|
| Seems less than ideal to me.
| kazinator wrote:
| That's one way.
|
| If these things are classes in a plain old single-dispatch
| oop system, you can have a json-parser and csv-parser which
| have parse methods.
|
| There could be packages/namespaces. So csv:parse and
| json:parse. These packages are standard and so they just
| exist; nothing to import.
|
| In Python, you cannot use anything without an import! The
| top-level modules (which serve as _de facto_ namespaces)
| themselves are not visible.
|
| Say there is a csv module with a parse. You cannot just do:
|
|     csv.parse(...)
|
| you have to first say
|
|     import csv
|
| This is jaw-droppingly moronic.
| NoahTheDuke wrote:
| > This is jaw-droppingly moronic.
|
| It can be slightly inconvenient but doesn't feel moronic to
| me. It means that except for the built-in functions,
| everything can be traced to either a definition or an
| import. Makes tracking code much easier.
| lanstin wrote:
| It lets you debug. E.g. if they have made a file called
| csv.py in the same directory, then print(csv.__file__)
| will show you this. If they have some weirdly screwed up
| paths with multiple pythons installed and multiple copies
| of the modules etc., same.
|
| I will note Go lang has the same feature carried forward
| from C. It helps a lot in the reading code side of the code
| lifecycle. And Go compiler makes you keep the imports up to
| date, which is good.
| kazinator wrote:
| > It lets you debug.
|
| It lets you debug Python problems which the system
| created in the first place.
|
| > If they have some weirdly screwed up paths with
| multiple pythons installed and multiple copies of the
| modules etc., same.
|
| Doesn't happen in a sane language. Or, even not a sanely
| defined language/implementation.
|
| I can easily have multiple different GCC copies (possibly
| for different processor targets) on the same machine.
| Each one knows where its own files are; an #include
| <stdio.h> compiled with your /path/to/arm-linux-eabi-gcc
| will positively not use your /usr/include/stdio.h, unless
| you explicitly do stupid things, like -I/usr/include on
| the command line.
| lanstin wrote:
| Having everything be imported is what makes the language be
| useable. Especially if you never import * you can easily find
| the definition and meaning of everything you read on the
| screen. A prime example of explicit is better than implicit.
|
| And backslash doesn't let you have the literal obey the proper
| indenting. Might as well use """
| kazinator wrote:
| > _you can easily find the definition and meaning of
| everything you read on the screen_
|
| I don't want to be finding definitions of things that the
| _language_ provides in the code.
|
| Languages that don't work this way have IDE's, editor plug-
| ins or other tools for easily finding the definitions of
| things that are in the language, without hunting for them
| through intermediate definition steps in the same file.
|
| "I've spent all my life in and out of jails, so I expect bars
| on doors and windows ..."
| justsomehnguy wrote:
|     @"
|     here strings in PS are fine for this purpose and
|     even allows whitespace anywhere
|     but because of the latter you can't indent it
|     with your other code
|     "@ -split "`r`n" | % {'<SOL>{0}<EOL>' -f $_ }
|     <SOL> here strings in PS are fine for this purpose and <EOL>
|     <SOL> even allows whitespace anywhere <EOL>
|     <SOL> but because of the latter you can't indent it <EOL>
|     <SOL> with your other code <EOL>
| kazinator wrote:
| I posted a Unix StackExchange answer with some tricks for
| doing this in shell programming, very similar to your <SOL>
| trick.
|
| https://unix.stackexchange.com/questions/76481/cant-
| indent-h...
| wartijn_ wrote:
| I like this. It's clearly meant as marketing for their product,
| but imo the best kind of marketing. They don't just run their
| tool and automatically make tickets, but check for false positive
| and (offer to) make pr's.
|
| It's both good for those projects and for the company that does
| the marketing since they reach their exact target group. Plus it
| gets them on the front page of HN.
| ehsankia wrote:
| A great addition to prune a ton of false-positives is to check
| the length of the strings. Almost always, the intentional
| implicit concats will have a very long string that reaches the
| max line length, whereas the accidental ones are almost always
| very short strings.
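|
| A rough sketch of that heuristic built on the standard
| tokenize module. This is not the article's tool: the function
| name suspicious_concats and the max_len threshold of 20 are
| assumptions, and f-strings are not handled on interpreters
| that tokenize them separately.

```python
import io
import tokenize

# Tokens that may sit between two literals without separating them.
SKIPPABLE = {tokenize.NL, tokenize.COMMENT}


def suspicious_concats(source: str, max_len: int = 20):
    """Yield (line, left, right) for adjacent short string literals.

    max_len is an arbitrary threshold: pairs where either literal is
    longer are assumed to be intentional line wrapping.
    """
    previous = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING:
            if previous is not None:
                longest = max(len(previous.string), len(tok.string))
                if longest <= max_len:
                    yield tok.start[0], previous.string, tok.string
            previous = tok
        elif tok.type not in SKIPPABLE:
            # Any other token (comma, operator, end of statement)
            # means the next literal starts a new expression.
            previous = None


sample = '''
names = ["one", "two" "three"]
message = ("a deliberately long sentence that fills the whole line "
           "and continues on the next one")
'''
findings = list(suspicious_concats(sample))
print(findings)
```

| Here only the "two" "three" pair on line 2 is reported; the
| long wrapped message is left alone.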
| routerl wrote:
| tl;dr: Python concatenates space separated strings, so ['foo'
| 'bar'] becomes ['foobar'], leading to silent bugs due to typos.
|
| I've been bitten by this one at work, and can't help but think it
| is an insane behaviour, given that ['foo' + 'bar'] explicitly
| concatenates the strings, and ['foo', 'bar'] is the much more
| common desired result.
|
| edit: This also applies to un-separated strings, so ['foo''bar']
| also becomes ['foobar']
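|
| The same thing as a runnable snippet (variable names are just
| for illustration):

```python
expected = ["foo", "bar"]
typo = ["foo" "bar"]    # missing comma: one element
glued = ["foo""bar"]    # no whitespace at all: still one element

print(len(expected), len(typo), typo == glued)
print("bar" in expected, "bar" in typo)
```

| The membership test on the last line is the kind of check that
| silently changes its answer.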
| idealmedtech wrote:
| It's a holdover from C, where implicit string literal
| concatenation is very useful in the preprocessor.
| Palomides wrote:
| I assume it's based on the C behavior, where it can be handy
| with macros
|
| I don't think it fits well in python
| pmontra wrote:
| Maybe. We must remember that Python was designed at the very
| end of the 80s so what was normal for developers back then
| could be unexpected nowadays. An example: the self in
| Python's OO is a C pointer to struct of data and function
| pointers. It should be perfectly clear to anybody writing OO
| code in plain C at the time (raising hand.) Five years later
| new OO languages (Java, Ruby) kept self inside the classes
| but hid it in method definitions.
| wartijn_ wrote:
| But Python 3 was designed in the 2000s and had many
| breaking changes. Seems like they could have changed this
| behavior with that version.
| pletnes wrote:
| I assumed it was borrowed from shell, where everything can
| just be put next to each other since it's all text.
| thrdbndndn wrote:
| I luckily never accidentally used this space-concatenation thing,
| but I've been bitten by the fact a=(1) doesn't create 1-element
| tuple multiple times in my early days learning Python.
| onphonenow wrote:
| I still don't understand why it doesn't! So I still get bit
| from time to time.
| scbrg wrote:
| Presumably because parentheses don't really have anything
| to do with tuples, it's commas that do. Parentheses are
| there to help the parser group things in case of ambiguity,
| and to support expressions spanning multiple lines.
| voussoir wrote:
| If a person decides to add parentheses to some booleans or
| arithmetic,
|
|     (4 + 5) * (8 + 2)
|     (this and that) or (theother)
|
| These elements should not become 1-tuples after the
| interior contents are evaluated. I sometimes add
| parentheses even around single variables just for visual
| clarity.
|
| Also, this allows you to do dot-access on int / float
| literals, if you want to
|
|     # doesn't work
|     4.to_bytes(8, 'little')
|     # works
|     (4).to_bytes(8, 'little')
| shoyer wrote:
| Most of the "bugs" caught here (including in TensorFlow and in my
| own project, Xarray) seem to actually be typos in the test
| suite. This is certainly a good catch (and yes, linters should
| check for this!), but seems a little oversold to me.
| jiveturkey wrote:
| nice ad!
| Pensacola wrote:
| Why 666?
| TonyRobbins wrote:
| Thanks for sharing these because if you not share i have to
| manage with this bug. Thank You Again!
| micimize wrote:
| For those looking to avoid this specific problem, there is a
| flake8 rule: https://pypi.org/project/flake8-no-implicit-concat.
|
| More broadly, the https://codereview.doctors makers are making
| the point that their tool caught an easy-to-miss issue that most
| wouldn't think to add a rule for. A bit of an open question to me
| how many of those there really are at the language level, but
| still seems like a neat project.
| pfisherman wrote:
| Ime, Black will add parentheses to clearly and explicitly
| indicate a tuple where there is a trailing comma. Figured this
| out when I made the trailing comma mistake and wondered why
| Black kept reformatting my code.
| trulyme wrote:
| Black rules. I love it that I don't need to have a discussion
| about style with anyone when Black is used on the project.
| oblvious-earth wrote:
| Also all but 1 of the issues they found relates to test code,
| it seems people are a little less careful compared to
| functional code.
|
| Also in terms of mistakes codereviewdoctor twice linked to the
| same issue in their blog
| https://github.com/tensorflow/tensorflow/issues/53636 and
| raised the PR to the wrong project
| https://github.com/tensorflow/tensorflow/pull/53637 (I guess
| Tensorflow vendors Keras, easy mistake)
| sundarurfriend wrote:
| > all but 1 of the issues they found relates to test code, it
| seems people are a little less careful compared to functional
| code.
|
| Also a factor that bugs in functional code are more visible,
| both during development and to users once shipped. So there
| may have been an equal number or more such bugs in the non-
| test code, that just didn't remain in the code base for this
| long.
| thrdbndndn wrote:
| https://github.com/tensorflow/tensorflow/tree/0d8705c82c64df...
|
|     STOP! This folder contains the legacy Keras code which is
|     stale and about to be deleted. The current Keras code lives
|     in github/keras-team/keras. Please do not use the code from
|     this folder.
|
| Yeah, not the most obvious notice.
|
| The fact they didn't find the same mistake(s) in keras-
| team/keras (I assume they scanned, it's one of the most
| popular Python repos) makes me believe these issues have been
| fixed/removed in up-to-date keras repo.
| rikatee wrote:
| once tensorflow pointed to keras-team this happened
|
| https://github.com/keras-team/keras/issues/15854
|
| resulting in
|
| https://github.com/keras-team/keras/pull/15876
| tedmiston wrote:
| The URL in this comment has an incorrect TLD: it should be
| `doctor` (singular).
|
| https://codereview.doctor/
| rikatee wrote:
| there is also https://pypi.org/project/flake8-tuple/
|
| typo in the url (or in HN's markup) btw: it's
| https://codereview.doctor
| prepend wrote:
| This seems like not a big deal. It's a common mistake and is in
| 5% of repos but it's not causing major damage.
|
| And there's no evaluation of importance as to whether these
| instances are in test files or non-critical code. Packages are
| big and can have hundreds or thousands of files.
|
| It could be that if these mattered, they would have been detected
| and fixed.
|
| A good example for unit tests and perhaps checking to see if
| these bugs are covered or not covered.
|
| I like these kinds of analyses but don't like them presented like
| it's some significant failure.
| rikatee wrote:
| yeah the impact varies. the sentry one seems pretty big:
| https://codereviewdoctor.medium.com/5-of-666-python-repos-ha...
|
| test did not work but did not fail either, imagine being that
| dev maintaining the code that the test professes to cover.
| Imagine being the user relying on the feature that test was
| meant to check (if the feature under test actually broke).
| jollybean wrote:
| 5% of 'released' software is quite a lot, more importantly it's
| a class of errors that definitely should not exist. This is a
| 'bug' in the language effectively there just isn't any real
| upside.
|
| Python has a few of these things, which is really sad.
| onphonenow wrote:
| There were proposals to fix some of these but the unicode
| zeal beat out some of the more boring (but I'd say as
| important) cleanups.
| bcrl wrote:
| It's a class of error that would be caught by even the most
| basic testing. A better title for the article is that 5% of
| 666 Python repos have typos that demonstrate the code in them
| that is completely untested. It doesn't matter which language
| it is: untested code is untested code in any language.
| rikatee wrote:
| unfortunately like 10% of the bugs were in the tests
| themselves. e.g., the sentry one
| https://codereviewdoctor.medium.com/5-of-666-python-repos-
| ha...
|
| the tests are only as good as the code they're written
| with, and as good as the code review process they were
| merged under.
| wott wrote:
| I believe that, whenever possible, tests should be
| written in a different language than the one used for the
| code under test (even better, in a dedicated, mostly
| declarative, testing language).
|
| It avoids replicating the same category of errors in both
| the test and the code under test, especially when some
| calculation or some sub-tests generation is made in the
| test.
| bcrl wrote:
| One of the habits I have when writing kernel code is to
| intentionally break code in the kernel to verify that my
| test is checking what I think it's checking. That's
| because of a lesson I learned a long, long time ago after
| someone reviewed my code and caught a problem: when your
| code has security implications, you need to make sure the
| boundary conditions that your tests are supposed to cover
| actually get tested. Having implemented a number of
| syscalls exposed to untrusted userland over the years,
| this habit has saved my bacon several times and avoided
| CVEs.
| geofft wrote:
| The errors were usually in tests themselves. Are you
| arguing that tests need their own tests to test that they
| are testing the right thing? Usually I think people believe
| that tests do not need to be tested and should not be
| tested, i.e., that you measure "100% coverage" against non-
| test code alone.
| samhw wrote:
| I don't think anyone could disagree: you could never
| exceed 0% code coverage if your definition was recursive
| (i.e. included tests, tests-of-tests, tests-of-tests-of-
| tests, ...).
| Cpoll wrote:
| Only if you generate infinite tests, then your coverage
| approaches 0%. But 100% covered code + 0% covered tests =
| ~50% total coverage.
|
| Also, the _obvious_ solution is self-testing code. (Jokes
| aside, structures like code contracts attempt something
| like this).
| jve wrote:
| I checked those 11 links to issues for major software.
| 10 bugs were actually in tests...
| oaiey wrote:
| I do not see this from a verification perspective ... But
| also from a productivity perspective.
| ErikCorry wrote:
| This is understandable since many of those projects are not
| written in python. So the python code in them is only in
| incidental scripts like test harnesses. If V8 was written
| in python then performance would probably not be very good.
| deathanatos wrote:
| 9 out of 10, actually; the Tensorflow links are the same
| link.
| enchiridion wrote:
| I mean, if you're ultimately going to combine the list into a
| string anyway it's no big deal.
|
| Along those lines. I wonder how many of these come from ad-hoc
| file path handling instead of using pathlib.
| karolkozub wrote:
| I really like the idea of automated code review tools that point
| out unusual or suspicious solutions and code patterns. Kind of
| like an advanced linter that looks deeper into the code
| structure. With emerging AI tools like Github Copilot, it seems
| like the inevitable future. Programming is very pattern-oriented
| and even though these kinds of tools might not necessarily be
| able to point out architectural flaws in a codebase, there might
| be lots of low-hanging fruits in this area and opportunities to
| add automated value.
| rak1507 wrote:
| Or people could just write it correctly in the first place!
| Controversial I know! Seems like people would rather half-ass
| things and then let some AI autocorrect fix it up for whatever
| reason rather than doing it properly.
| lumost wrote:
| Consider that you may be describing a compiler. Typos are not
| _generally_ a problem in statically typed languages with
| notable exceptions such as dictionary key lookups etc.
|
| Even without static typing, argument length verification etc.
| can be done with a suitable compiler. In python we are left
| chasing 100% code coverage in unit tests as it's the only way
| to be certain that the code doesn't include a silly mistake.
| samhw wrote:
| I think 100% code coverage is folly. Spreading tests so
| widely near-inevitably means they're also going to be thin.
| In any codebase I'm working on, I would focus my attention on
| testing functions which are either (a) crucially important or
| (b) significantly complex (and I mean real complexity, not
| just the cyclomatic complexity of the control flow inside the
| function itself).
| lumost wrote:
| Fully agree, but I _never_ want to see a missed function
| argument programming error in customer facing code. In
| python you really do need code coverage to achieve this
| goal - static languages have some additional flexibility.
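|
| A small sketch of the failure mode being described, with
| hypothetical function names (send, notify). The call with a
| missing argument sits on a rarely taken branch, so plain Python
| only reports it when that branch actually runs:
|
```python
def send(recipient, body):
    return f"to={recipient} body={body}"


def notify(recipient, urgent=False):
    if urgent:
        # Bug: 'body' is missing. Python only notices when this line runs.
        return send(recipient)
    return send(recipient, "hello")


print(notify("alice"))  # the common path works fine

try:
    notify("alice", urgent=True)
except TypeError as exc:
    print("only found at call time:", exc)
```
|
| A test that covers the urgent branch, or a static checker that
| compares calls against signatures, would surface this before a
| customer does.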
| lanstin wrote:
| Or a rich suite of linters religiously applied. Never
| save a file with red lines in flymake or the equivalent.
| Ed: actually, I am unsure if my current suite would miss
| required parameters. I tend to have defaults for all but
| the first parameter or two, so not a big issue for me I
| guess. I do like a compile time check on stuff tho, one
| of the reasons I am doing more and more tools in Go.
| joatmon-snoo wrote:
| I actually recently joined a startup working on this problem!
|
| One of our products is a universal linter, which wraps the
| standard open-source tools available for different ecosystems,
| simplifies the setup/installation process for all of them, and
| a bunch of other usability things (suppressing existing issues
| so that you can introduce new linters with minimal pain, CI
| integration, and more): you can read more about it at
| http://trunk.io/products/check or try out the VSCode
| extension[0] :)
|
| [0]
| https://marketplace.visualstudio.com/items?itemName=Trunk.io
| rikatee wrote:
| cool product :) is it just linting, or do any of the tools do
| code transformation to offer the fix for the lint failure?
| (code review doctor also offers the fix if you add the github
| PR integration)
| atleta wrote:
| This is basically linting, i.e. code analysis. The techniques
| used might be more current (as they have been evolving, as you
| say, for pattern matching) but linting is just that: a code
| review tool to find usual bugs. (This is what did happen in
| this blog post. It wasn't looking for unusual solutions but
| usual mistakes.) The packaging and form of the feedback also seem
| different, and that in itself may make a lot of difference in
| ease of use and thus adoption.
| joatmon-snoo wrote:
| Admittedly, the difference here is that codereview.doctor
| spent time tuning a custom lint on a variety of repos. In an
| org with a sufficiently large monorepo (or enough repos, but
| I don't really know how the tooling scales there) it's
| possible to justify spending time doing that, but for most
| companies it's one of those "one day we'll get around to it"
| issues.
| rikatee wrote:
| yeah something like sonarqube or https://codereview.doctor (if
| you use GitHub)
| ficklepickle wrote:
| Ironically there are a variety of typos in the article.
|
| A paragraph is repeated and the markdown links at the end are
| broken because there is a space between ] and (.
| codeptualize wrote:
| And then people make fun of JavaScript! (Just joking, I like
| Python, also JS, I guess everything has its quirks, it's a good
| thing we have linters)
| bilalq wrote:
| The whole "666" thing really threw me off. I thought it was some
| Python specific term or something at first glance. They open with
| a sentence that mentions "5% of the 666 Python open source GitHub
| repositories" as though there were only 666 total open source
| Python GH repos. Picking a number with other fun connotations or
| whatever to use as a sample is fine, but without setting that
| context, it was kind of distracting from their main content.
| deathanatos wrote:
| Did you figure out what the context is, and if you did, would
| you mind spelling it out for me? I still haven't figured out
| what correction to make to that sentence to get it to make
| sense.
| rikatee wrote:
| in a blog post about the evils of typos there was a typo!
| classic https://en.wikipedia.org/wiki/Muphry%27s_law ;)
| ffhhj wrote:
| Also this classic:
|
| > Apple I was the first product ever announced by the
| company in 1976. The computer was put on sale for $666.66
| at the time.
|
| https://9to5mac.com/2021/11/25/steve-woz-signs-
| rare-1976-app...
| bilalq wrote:
| They ran their static analyzer over a sample of GH repos.
| They chose 666 as the number for their sample size. That's
| all.
| tyingq wrote:
| Seems expected, as linters can't be sure when it's not
| intentional. Like this request to pylint:
|
| https://github.com/PyCQA/pylint/issues/1589
|
| Is there usually enough context for a linter to make an educated
| guess?
| rikatee wrote:
| can do a good job at allowing long urls for example, but would
| be whack a mole trying to cater for "all" purposeful implicit
| string concatenations
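|
| The trade-off can be seen in a tiny checker. This is only a
| sketch built on the standard tokenize module, not how pylint or
| any other linter implements its check; implicit_concats is a
| made-up name. It flags any plain string literal that directly
| follows another one inside brackets:
|
```python
import io
import tokenize

OPENERS = {"(", "[", "{"}
CLOSERS = {")", "]", "}"}


def implicit_concats(source):
    """Return (row, col) of each plain string literal that directly follows
    another string literal inside brackets (f-strings are ignored on 3.12+)."""
    hits = []
    depth = 0
    prev_was_string = False
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.NL, tokenize.COMMENT):
            continue  # line breaks and comments between literals don't matter
        if tok.type == tokenize.OP and tok.string in OPENERS:
            depth += 1
        elif tok.type == tokenize.OP and tok.string in CLOSERS:
            depth -= 1
        is_string = tok.type == tokenize.STRING
        if is_string and prev_was_string and depth > 0:
            hits.append(tok.start)
        prev_was_string = is_string
    return hits


typo = 's = ["one", "two" "three"]\n'
wrapped_url = 'url = ("http://example.com/"\n       "path")\n'
print(implicit_concats(typo))         # flags the real typo
print(implicit_concats(wrapped_url))  # also flags the deliberate wrap
```
|
| The second call shows the problem: a deliberately wrapped string
| looks the same to the tokenizer as a typo, so any rule that
| separates the two has to rely on heuristics.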
| chrismorgan wrote:
| Splitting long URLs onto multiple lines because you have a
| hard line length limit is _considerably_ more harmful than
| exceeding the length limit in such cases, because you break
| the URL up so that tooling (including language-unaware static
| analysers) can't conveniently access it. (e.g. if you want to
| open the link, you can't just copy it or click on it or
| whatever, but must first join the lines, removing the
| quotation marks.) Any tool that forcibly splits up such lines
| when there is no fundamental hard technical reason why it
| must is, I categorically state, a bad tool.
| mikepurvis wrote:
| I would have thought it would be a no-brainer to just ban it
| and insist on an explicit + operator. I'm pretty surprised that
| issue was so flippantly closed.
| ReleaseCandidat wrote:
| The PR has been merged (for lists and tuples and sets only).
|
| https://github.com/PyCQA/pylint/pull/1655
| thaumasiotes wrote:
| > I would have thought it would be a no-brainer to just ban
| it and insist on an explicit + operator.
|
| Maybe as a matter of linting. As a matter of language design,
| I think + for string concatenation is a big mistake; using
| different symbols for numeric addition and string
| concatenation is something Perl got right.
| mikepurvis wrote:
| Yes, I meant as a matter of linting. I can understand the
| arguments being different for the language as a whole,
| particularly when legacy compatibility is a consideration.
|
| But my impression using pylint is that its default settings
| are wildly opinionated, hence the surprise that this
| wouldn't have fallen under that umbrella.
| tus666 wrote:
| Alternative title: 5% of Python repos have inadequate test
| coverage.
| _dain_ wrote:
| Most of the errors were in the tests themselves.
| ehsankia wrote:
| Nice! Internally we have PCRE support in our code search and I
| regularly run a regex to find and fix these. I've also found a
| ton in open source projects, which I've been trying to fix:
|
| https://github.com/YosysHQ/prjtrellis/pull/176
|
| https://github.com/UWQuickstep/quickstep/pull/9
|
| https://github.com/tensorflow/tensorflow/pull/51578
|
| https://github.com/mono/mono/pull/21197
|
| https://github.com/llvm/llvm-project/pull/335
|
| https://github.com/PyCQA/baron/pull/156
|
| https://github.com/dagwieers/pygments/pull/1
|
| https://github.com/zhuyifei1999/guppy3/pull/12
|
| https://github.com/pyusb/pyusb/pull/277
|
| https://github.com/KhronosGroup/Vulkan-ValidationLayers/pull...
|
| It is indeed a very common mistake in Python, and can be very
| hard to debug. It bit me once and wasted a whole day for me, so
| I've been finding/fixing them ever since trying to save others
| the same pain I went through.
|
| EDIT: I will point out that I've found this error in other non-
| Python code too, such as C++ (see the 2nd PR for example).
|
| Here's the regex for anyone curious:
|
| [([{]\s*\n?(\s*['"](\w)+['"],\n)+(\s*['"]\w+['"]\n)(\s*['"]\w+['"],\n)*
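|
| For anyone who wants to try it, here is a sketch of running that
| regex with Python's re module (the wrapped line is rejoined; the
| helper name find_missing_comma and the sample inputs are
| assumptions for illustration):
|
```python
import re

# The regex from the comment above, with the wrapped line rejoined.
MISSING_COMMA = re.compile(
    r"""[([{]\s*\n?(\s*['"](\w)+['"],\n)+(\s*['"]\w+['"]\n)(\s*['"]\w+['"],\n)*"""
)


def find_missing_comma(source):
    """Return the 1-based line number of the first suspect line, or None."""
    match = MISSING_COMMA.search(source)
    if match is None:
        return None
    # Group 3 is the quoted word that ends its line without a comma.
    return source.count("\n", 0, match.start(3)) + 1


buggy = 'names = [\n    "one",\n    "two"\n    "three",\n]\n'
fixed = 'names = [\n    "one",\n    "two",\n    "three",\n]\n'
print(find_missing_comma(buggy))
print(find_missing_comma(fixed))
```
|
| As written, the pattern only matches one-per-line quoted items
| made of word characters, and it needs at least one correctly
| comma-terminated item before the suspect line, so it will miss
| some cases.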
| arusahni wrote:
| The removal of implicit string concatenation was proposed for
| Py3k[1], but was rejected.
|
| [1] https://www.python.org/dev/peps/pep-3126/
| Wowfunhappy wrote:
| Does Python support the concept of allowing code to opt in to
| new safety features? I can understand rejecting something like
| this for the sake of legacy compatibility (something Python has
| abandoned too readily in the past), but it seems like an option
| --or maybe even a default--might be nice.
|
| I suppose this is also something you could catch with a linter?
| cpeterso wrote:
| Yes: import from __future__
|
| https://docs.python.org/3/library/__future__.html
| Wowfunhappy wrote:
| I'd say that's a "kind of", since it implies the feature
| will eventually become mandatory. I was thinking more along
| the lines of Javascript's 'use strict';
| wodenokoto wrote:
| The rejection notice seems completely counter intuitive to me.
| How is adding a plus "harder" compared to removing a foot gun?
|
| > This PEP is rejected. There wasn't enough support in favor,
| the feature to be removed isn't all that harmful, and there are
| some use cases that would become harder.
| oa2022 wrote:
| This change would break a lot of legacy code for no good
| reason
|
| The most common way to split a string in lines is using this
| concatenation formula.
| [deleted]
| nojs wrote:
| > The most common way to split a string in lines is using
| this concatenation formula.
|
| Is it really? I tend to avoid it in favour of """ or
| '\n'.join(<list of lines>), because it looks like a
| mistake.
|
| Triple quotes are kind of annoying if the string is
| indented, but you can just not indent the string to avoid
| the whitespace.
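|
| For comparison, a sketch of the options mentioned here, all
| producing the same two-line string. The last variant uses
| textwrap.dedent, which is not mentioned in this comment but is a
| common way to keep an indented triple-quoted string; the function
| names are made up:
|
```python
import textwrap


def implicit():
    # Adjacent literals are concatenated by the parser.
    return ("first line\n"
            "second line\n")


def joined():
    return "\n".join(["first line", "second line"]) + "\n"


def triple_unindented():
    return """\
first line
second line
"""


def triple_dedented():
    # textwrap.dedent strips the common leading whitespace.
    return textwrap.dedent("""\
        first line
        second line
        """)


assert implicit() == joined() == triple_unindented() == triple_dedented()
```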
| benesch wrote:
| > This change would break a lot of legacy code for no good
| reason
|
| Preventing a bug that occurs in 5% of observed codebases
| (and anecdotally, happens to me during development all the
| time) seems like about as good as reasons get.
|
| Swapping a perfectly fine print statement for a function,
| on the other hand... that's the breaking change in Py3k
| that's never seemed worth it to me.
| wodenokoto wrote:
| But wasn't this proposal part of the move to python 3?
| strings were broken left and right anyway.
| gsnedders wrote:
| Right, there was lots of deliberate breakage, _and_ this
| is purely syntactic hence the sort of thing 2to3 could
| trivially deal with.
| ehsankia wrote:
| > the sort of thing 2to3 could trivially deal with
|
| 2to3 could also trivially add +, and if anything, that
| would actually help surface these kind of bugs, because
| if you randomly see a + in the middle of your list of
| strings, it's much easier to spot the bug than if there
| was a missing comma.
| wirthjason wrote:
| Ironic to see this today. I spent an hour debugging this very
| same issue this morning.
|
| I was just doing some simple refactoring, changing a hard coded
| string into a parameterized list of f-strings that's filtered and
| joined back into a string.
|
| I'm glad that I had unit tests that caught the problem! I
| couldn't figure out why it was breaking, that comma is very
| devilish to spot with the naked eye. I'm surprised my linters
| didn't catch it either. Maybe time to revisit them.
| Forge36 wrote:
| I wonder if any of the found issues will turn out to be important
| issues.
| titzer wrote:
| Just to be clear, the V8 "bug" was in the test runner code and
| caused mis-parsing of command line options for testing for non-
| SSE hardware. Not exactly a critical bug.
___________________________________________________________________
(page generated 2022-01-07 23:00 UTC)