[HN Gopher] What's Going on with Language Rankings?
___________________________________________________________________
What's Going on with Language Rankings?
Author : coloneltcb
Score : 78 points
Date : 2023-12-14 17:15 UTC (5 hours ago)
(HTM) web link (redmonk.com)
(TXT) w3m dump (redmonk.com)
| memco wrote:
| Curious if the layoffs factor into the decline in a
| statistically significant way? I wouldn't expect that to account
| for all of the changes but maybe that accounts for some.
| jvalencia wrote:
| This is an interesting point. If I were conducting layoffs, the
| seemingly extra things, like open source contributions, may be
| the first to go.
| runevault wrote:
| Even if you aren't explicitly targeting people whose primary
| job is that, some will still likely get laid off. Sheer numbers
| alone could lead to real damage there.
| myth_drannon wrote:
| Shopify as an example is heavily involved in open source
| (mostly Rails) and had large layoffs. I wouldn't be surprised
| if many of the contributors had gone dark.
| serial_dev wrote:
| On the other hand, if I were laid off, I'd pretty quickly clean
| up my open source profile and get to creating some cool open
| source packages to showcase my work.
|
| Though my intuition tells me that the net effect of layoffs is
| still negative.
| kodablah wrote:
| > the data [...] showed a roughly 25% decline in pull requests in
| 1H2023 as compared to 2H2022 PRs [...] largely lacking an
| explanation
|
| May be obvious, but would rather know if you saw PRs-per-repo
| decline. Otherwise, I wonder if some very frequent-PR repos
| (automated or otherwise) had previously skewed numbers. Also
| curious if early numbers for this half can be obtained/compared.
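A minimal sketch of the PRs-per-repo question above, assuming made-up PR counts (the repository names and numbers are hypothetical, not RedMonk's data). It compares the change in total PRs with the median per-repo change, to show how one high-volume repo can dominate the total:

```python
from statistics import median

# Hypothetical PR counts per repository for two half-year periods.
# "bot-repo" stands in for a high-volume, possibly automated repository.
prs_2h2022 = {"repo-a": 40, "repo-b": 55, "repo-c": 30, "bot-repo": 5000}
prs_1h2023 = {"repo-a": 42, "repo-b": 50, "repo-c": 33, "bot-repo": 3700}

def total_change(before, after):
    """Relative change in the total PR count across all repos."""
    return sum(after.values()) / sum(before.values()) - 1

def median_repo_change(before, after):
    """Median of per-repo relative changes, for repos present in both periods.

    Assumes every shared repo had at least one PR in the earlier period.
    """
    shared = before.keys() & after.keys()
    return median(after[r] / before[r] - 1 for r in shared)

print(f"total change:           {total_change(prs_2h2022, prs_1h2023):+.1%}")
print(f"median per-repo change: {median_repo_change(prs_2h2022, prs_1h2023):+.1%}")
```

With numbers like these, the total drops sharply while the typical repo barely moves, which is the kind of skew the comment asks about.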
| rnk wrote:
| GitHub activity is an indirect measure of programming activity.
| I might do many github actions or few, separate from the actual
| use of the language. A lot of activity when learning is
| probably using online websites and online environments. People
| can move to other systems, use git on their own for instance.
|
| They need another datasource. There's another classic, if
| unreliable, approach: survey programmers to see what they are using.
| rezonant wrote:
| > A lot of activity when learning
|
| Oh, this made me connect some dots. At first I figured
| increase in GitHub activity during the pandemic was because
| people were home more and had more time to contribute to open
| source.
|
| But a lot of people took that time to learn software
| development. And a lot of courses will have you publish your
| learning projects to GitHub as a way to learn version control
| and processes (and also to promote awareness of, and analyze,
| their course, since there are often templates involved).
|
| Often this is just creating repos and pushing them, and since
| TFA directly looks at pull requests, perhaps this isn't it,
| unless these courses regularly include lessons on how to do a
| pull request (perhaps from yourself to your own repo).
|
| If that was a significant factor, it would make sense why the
| activity is down.
|
| > They need another datasource.
|
| Agreed here, the source is very imperfect.
| mdgrech23 wrote:
| This reminds me of the time managers tried to judge developer
| productivity based on lines of code changed. The person who ran
| the build script to generate the minified output for production
| kept winning. Mgmt was like why is this person generating 10x
| what you all do. #rockstardevelopers
| lowbloodsugar wrote:
| Or the lines of code committed to mainline, and that one guy
| who pulls from the branch where everyone else did the work.
|
| Don't assume this "oversight" is an accident or result of
| stupidity. In one case it was absolutely used to justify a
| promotion of a yes man.
| justinclift wrote:
| Alternatively: "Dude, your stats say you're only ever
| changing a single line. WHAT are you doing?". ;)
| ActionHank wrote:
| I feel like many, myself included, have moved away from using
| GitHub after they turned the goodwill of open source into
| Copilot. Trampling licenses and reselling open solutions to
| companies just doesn't sit right with many people. I will not
| feed the beast.
| karaterobot wrote:
| > There are languages that were under- or over-represented by
| using public GitHub repos; there are communities that were under-
| or over-represented by looking at discussions happening in Stack
| Overflow.
|
| I'm not sure I understand. Granted, they're never going to get
| perfect data about language rankings. But, why would AI
| assistants affect the rankings of language in a way that makes
| their methodology more unsound? They mention a drop-off in SO tag
| usage, and in GitHub pull requests: that's an indication of
| _something_, but is there any reason to believe it distorts the
| results so as to give undue prominence to one language more than
| another?
|
| Intuitively, I can't think of how. "Undue" is the keyword: even
| if the prominence changed as a result of AI language assistants,
| wouldn't that be an indication that people are just using those
| languages more, with the help of AI assistants?
|
| I'm really just wondering why, given the unavoidable squishiness
| of the methodology, the decrease in _magnitude_ would bother them
| so much. They probably don't see their work as fundamentally
| hand-wavey, and maybe I just do, and that's the problem. I guess
| they're just investigating it, but the results seem to be more or
| less what you'd expect to conclude ab initio. I could be wrong,
| often am.
| bad_user wrote:
| I think this trend favours unpopular languages.
|
| ChatGPT can only provide answers for programming languages and
| libraries that were included in its training. Granted,
| unpopular programming languages have online activity, too, but
| sample size matters.
| rezonant wrote:
| I imagine they're just trying not to tell the wrong story just
| because of a tangential change in how developers use these
| tools, so taking some time to look closely and account for
| things and provide the necessary context for the report is
| sensible. I guess it's just taking longer than they expected
| due to the unexpected GitHub changes.
| stonemetal12 wrote:
| Let's say you are tracking lion populations. You watch a couple
| of watering holes and count the number of lions. From there you
| use stats to make estimates of total number of lions in the
| world. Suddenly your data shows a drop of 30% and is continuing
| to decline. Are lions going extinct, or did they find a new
| watering hole? If they found a new watering hole then you need
| to change your measuring locations to account for the shift.
|
| I don't think software development has started slowing down. So
| the data is indicating that the way they were measuring is
| becoming less useful than it was, and that they will need to
| find new data sources if they are going to continue.
|
| If that really is AI, maybe Copilot and friends will become an
| interesting source of data on programming language use.
| janalsncm wrote:
| A smaller sample size means there is higher variance in the
| results.
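A rough simulation of the sample-size point above, assuming a language with a fixed true share of activity (the 10% share, the sample sizes, and the function name are all illustrative assumptions). It repeatedly samples and measures how much the estimated share varies:

```python
import random
from statistics import pstdev

def share_spread(true_share, sample_size, trials=500, seed=0):
    """Standard deviation of the estimated share across repeated samples."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    estimates = []
    for _ in range(trials):
        # Count how many sampled items belong to the language of interest.
        hits = sum(rng.random() < true_share for _ in range(sample_size))
        estimates.append(hits / sample_size)
    return pstdev(estimates)

for n in (100, 400, 1600):
    print(f"sample size {n:>5}: spread of estimate = {share_spread(0.10, n):.4f}")
```

The spread shrinks roughly with the square root of the sample size, so a shrinking data source makes small differences between languages harder to trust.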
| marcosdumay wrote:
| Hum, I haven't seen that graph for Stack Overflow yet. So, it
| peaked in 2017, but it's widely accepted that the decline is
| caused by ChatGPT?
|
| That doesn't strike me as reasonable analysis.
| tgv wrote:
| If attention shifts a bit to other languages (Nim? Zig? D?
| Clojure? Kotlin?), or if the answer is already in the multitude
| of answered questions, the activity will drop too. Or a
| multitude of not directly obvious other reasons.
| cle wrote:
| > The advent and rise of AI-based code assistants are already
| impacting the data that populates RedMonk's language rankings.
|
| Not only that, they are impacting the actual languages people
| use. Why use some new / esoteric language that an LLM doesn't
| know much about, when you can get the same job done much faster
| using a language that the LLM knows well and can debug?
|
| It's going to get increasingly difficult to bootstrap new
| languages unless they are available in LLMs.
| echelon wrote:
| > _Why use some new / esoteric language that an LLM doesn't
| know much about [...]_
|
| Rust is wildly easy to generate (idiomatically so).
|
| I've been building in our Rust monorepo without LLM tools for
| years and just tried them out yesterday -- I was utterly blown
| away.
|
| Rust might become _more_ popular with LLM accessibility.
| NooneAtAll3 wrote:
| ...why did you decide to start talking about rust out of the
| blue?
|
| do you need help? are you getting paid?
| throwup238 wrote:
| Did you not get your monthly check from big Oxide? You have
| to make sure your DeepState Connect account is up to date.
| evilduck wrote:
| Because it was an anecdote directly relevant to the comment
| it replied to, which happened to be someone using Rust?
|
| Would you have replied the same way had they mentioned the
| ability to use LLMs to generate Python? Would you equally
| imply an accusation of a Python developer of being a shill
| for mentioning it?
| stavros wrote:
| I've been wanting to learn Rust for years but the learning
| curve was too steep, I think LLMs are what will finally do it
| for me.
| lolinder wrote:
| Copilot was key for me for learning Rust. The syntax is
| verbose and unique enough that it was hard to get into from
| nothing without constant syntax errors. Copilot eliminated
| that barrier: I could describe what I wanted in a comment
| and immediately see a (relatively) idiomatic way to do it
| in Rust.
| pharmakom wrote:
| I deeply disagree with this. With Good Old Fashioned Dev Tools
| like Zero TurnAround, ReSharper, PyCharm, Clang tools, etc. the
| tool was specific to the language stack. Esoteric languages
| would simply not have many tools. All the LLM-based tool needs
| is to hoover up some code on GitHub, or even just your other
| project files as context, and it will start generating useful
| stuff. On top of that, translating between languages just
| became a whole lot easier.
| cle wrote:
| What other project files or GitHub code? It's a new language.
| And what does "hoover up" mean? Use the context window for
| language basics? RAG? Retrain the model?
|
| Also the LLM won't have been trained on the vast corpus of
| debugging sessions, blog posts, etc of existing languages.
| I'd expect the quality to be very poor compared to the
| current top players.
|
| (It's effectively the same phenomenon as folks sticking to
| languages with a big hiring pool.)
| travisgriggs wrote:
| I've seen this. I can get quite a bit of decent responses from
| LLMs about Python. But using it for Elixir produces more
| shady/dated answers. And Jetpack Compose, it just completely
| gets weird (and mostly worthless, being either too
| synthetically approximated or dated).
|
| It really illustrates (to me) that LLMs are "Plausibility
| Machines". They produce plausible answers. Plausibility is
| informed by repeated observability over large sets. We assess
| something as plausible when it matches the majority of our
| observations. That's how the parlor trick of seeming to "tell
| the truth" is pulled off. The venn overlap of "plausible" and
| "true" is pretty high. But when "truth" is in flux, you start
| to see the feet of the wizard behind the curtain.
| umanwizard wrote:
| Which LLMs have you tried?
| ideamotor wrote:
| Not just that. When the general public uses LLMs more, this
| will calcify markets for Everything. Try bootstrapping any new
| product, store, idea, art - you name it.
| mattgreenrocks wrote:
| Is there evidence of this happening yet?
|
| It makes sense in theory, but this is the type of thing I
| want to see data on before I consider it a possible problem,
| even.
|
| No sense in worrying about things that you can't control,
| can't prepare for and have not occurred yet, after all. :)
| com2kid wrote:
| > Is there evidence of this happening yet?
|
| Right now, today, when choosing a library, top of my mind
| is "has this been around long enough to be in GPT4's
| training set?"
|
| Because to give up the 5-10x productivity boost GPT allows
| for, a library better be _really_ good.
| ideamotor wrote:
| Yup
| bcrosby95 wrote:
| If businesses en masse stagnate because what LLMs offer
| calcifies, then there will be opportunities to outcompete them
| via knowledge not offered by LLMs.
| zamadatix wrote:
| Using something like e.g. Zig right now is already slower
| because examples are limited, a barrage of common questions
| over the last decade don't exist on a Stack Overflow type site,
| few people are experts in it, and the tooling/ecosystem is
| immature. Is replacing part of this with an LLM really
| impacting anything or just continuing what was already true
| before?
| tryptophan wrote:
| I wonder if this can be turned into an advantage however. Say
| the development team invests time into writing really good
| documentation and examples and then feeds it into LLMs and
| posts a link to a chat bot on their documentation. Much
| easier and faster for that bot to answer the inevitable 1000s
| of easy questions than waiting around for the community to do
| so.
| stusmall wrote:
| > Not only that, they are impacting the actual languages people
| use. Why use some new / esoteric language that an LLM doesn't
| know much about, when you can get the same job done much faster
| using a language that the LLM knows well and can debug?
|
| I don't see how this is anything new. Similar tension has
| always existed slowing adoption of new technology. I've
| resisted picking up new languages because I could implement
| something faster/better in languages I already know. I've been
| skeptical about some because the ecosystem just isn't there
| yet. You said "a language that the LLM knows well and can
| debug" when sometimes the resistance is that the new language
| doesn't have as good tooling/debugging support.
|
| There is nothing new here and if anything LLMs could help. I've
| noticed the community seems pretty understanding/accepting of
| low quality outputs of some of these LLMs. The small corpus of
| a new/esoteric language would mean even lower quality, but I don't think
| that'd be a deal breaker for some folks.
| vintermann wrote:
| From my intuition (and experience) with LLMs, they make more of
| what is in the context. If you use a language that is often
| used for business CRUD software, the assistant will be great at
| giving you business CRUD software, of typical business CRUD
| software quality. Using copilot for work, my experience is that
| it will happily offer ugly but context-plausible solutions
| (that often don't even work), but if you stop and ask "maybe
| there's a better way to do this", it often will suggest good
| simplifications. Especially if you give it a hint ("Maybe it's
| worth trying...")
|
| I haven't really used copilot with, say, Rust or Haskell, but
| if the average quality of code is better in those languages,
| maybe it will take less coaxing to actually suggest good code?
|
| Either way, LLMs are great at generalizing between human
| languages, so I expect they transfer a lot between programming
| languages too. I think your language has to be pretty weird for
| copilot to not get a grip on it with enough context.
| lmm wrote:
| > Not only that, they are impacting the actual languages people
| use. Why use some new / esoteric language that an LLM doesn't
| know much about, when you can get the same job done much faster
| using a language that the LLM knows well and can debug?
|
| Hmm, I can see it going the opposite way - maybe the
| availability of LLMs means that having an active user community
| in your timezone is less important, and makes semi-obscure
| languages more competitive.
| janalsncm wrote:
| It depends. Usually there's other things like thorough
| documentation, bug fixes, and comprehensive libraries that
| accompany popular languages. And LLM support is basically
| only as good as the documentation and online resources.
|
| For me, GPT4 means Rust is approachable, something I would
| never have considered a couple years ago. There's tons of
| documentation online, but it's great to be able to get the
| "vocabulary" of my question from GPT4 before looking it up
| online. Rust compiler errors are pretty good, but an LLM
| really smooths over the rough spots.
| lolinder wrote:
| I've actually been building toy languages for fun (nothing I'm
| ready to share) with GitHub Copilot as an aid, and it's able to
| pick up on my homegrown language just fine. It helps that I
| opted for C-style syntax, but it's able to pick up on semantic
| concepts like effects that don't exist in the mainstream, as
| well as specific syntactic choices like `:` vs `=` for record
| constructors.
|
| Where a new language _will_ be severely disadvantaged is in the
| blank page problem. With a well-known language you can start a
| new project with just a comment and let Copilot get you over
| the initial barrier of just having _something_ down on the
| page. A new language won't work because it needs a fair amount
| of context before it's able to produce correct code.
| ljm wrote:
| Copilot stuff has made some things harder for me to review,
| funnily enough. I don't blame it on the model, but the
| programmer using it with zero scrutiny.
|
| If people doggedly continue down that path they're going to
| stagnate for as long as they assume the AI is always right.
|
| Just because an LLM can cobble together some output doesn't
| mean that the output is instantly trustworthy.
|
| Not to mention languages often optimise for different things.
| If you want to default to JS for everything because the AI can
| auto-complete that more easily, you're locking into a very
| specific operating model.
| goodroot wrote:
| The same is true for database rankings (db-engines).
|
| If entrants are not artificially inflating "organic" signals via
| fake content spam (Twitter/X), then the criteria themselves are
| losing their signal strength (StackOverflow/GitHub).
|
| The diffusion makes it increasingly difficult to understand which
| channels are important and which correlate to strength in the
| market.
|
| Unfortunately, these can be more than vanity metrics.
|
| Some VCs or financial markets may use these as methods towards
| valuation.
| blueflow wrote:
| The decrease of Github activity might be caused by their 2FA
| rollout. Everything that does not have a 2FA workflow (like a git
| mirror bot) got locked out from non-public repositories this
| summer.
| janalsncm wrote:
| Seems reasonable. If this is the case, we shouldn't see a
| decline from summer (post lockout) to now. In fact we should
| see an _uptick_ in the next period as more people get around to
| unlocking their accounts (or at least not a further decline).
| blueflow wrote:
| It was announced that 2FA will be forced for selected
| accounts (which seems to be everyone active) by January, so
| the doom is still impending.
| Havoc wrote:
| It's not just pollution on data side, but it is also messing with
| actual usage.
|
| I was making decent progress on Rust and enjoying that, but have
| reverted back to more python since that is the lingua franca of
| AI.
|
| So how do we rank that? Rust because of technical merit or python
| because real world? That's a distinction you won't catch via Stack
| Overflow scraping.
| araes wrote:
| Just to see what it looked like, plotted the last 5 seasons of
| top language from RedMonk on the StackExchange / Github plot.
| Mostly ad hoc circling of language groups to get an idea of
| movement patterns. Kept changing their chart scaling and naming
| every season... [1]
|
| [1] "RedMonk Programming Language Rankings, 2021-2023",
| https://i.imgur.com/MWhzGC2.jpg.
|
| Individual charts, since the colors get messy.
|
| Q1, 2023:
| https://redmonk.com/sogrady/files/2023/05/lang.rank_.q123.wm...
|
| Q3, 2022:
| https://redmonk.com/sogrady/files/2022/10/lang.rank_.622.png
|
| Q1, 2022:
| https://redmonk.com/sogrady/files/2022/03/lang-rank-0122-wm....
|
| Q3, 2021:
| https://redmonk.com/sogrady/files/2021/08/lang.rank_.0621.pn...
|
| Q1, 2021:
| https://redmonk.com/sogrady/files/2021/03/lang.rank_.0121.wm...
| janalsncm wrote:
| If I could make a friendly suggestion, I think the data would
| be a bit easier to interpret if it were a grouped bar chart (GH
| rank, SO rank) and sorted by one (or the sum of the
| reciprocals).
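One way the suggested ordering could be sketched, assuming hypothetical (GitHub rank, Stack Overflow rank) pairs; the language names and ranks are placeholders, not values from RedMonk's charts. Languages are sorted by the sum of the reciprocals of their two ranks, so a top spot in either source counts heavily:

```python
# Hypothetical (GitHub rank, Stack Overflow rank) pairs; not RedMonk's data.
ranks = {
    "lang-a": (1, 3),
    "lang-b": (2, 1),
    "lang-c": (5, 4),
    "lang-d": (3, 8),
}

def reciprocal_score(gh_rank, so_rank):
    """Sum of reciprocal ranks; higher means more prominent overall."""
    return 1 / gh_rank + 1 / so_rank

# Order languages from most to least prominent by the combined score.
ordered = sorted(ranks, key=lambda name: reciprocal_score(*ranks[name]), reverse=True)

# Each row holds what a grouped bar chart would need: one bar per source.
for name in ordered:
    gh, so = ranks[name]
    print(f"{name}: GH rank {gh}, SO rank {so}, score {reciprocal_score(gh, so):.3f}")
```

The `ordered` list can then drive the x-axis of a grouped bar chart in whatever plotting tool is at hand.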
| lacker wrote:
| This is especially tough for RedMonk's rankings because they
| typically evaluate languages on "lifetime summary" metrics like
| total number of tags on Stack Overflow, or total number of
| projects on GitHub. e.g.:
|
| https://redmonk.com/sogrady/2023/05/16/language-rankings-1-2...
|
| If Stack Overflow becomes more and more a reflection of
| "historical behavior" rather than "modern behavior", this sort of
| metric will become a less useful way to judge modern developer
| usage.
| galkk wrote:
| Idk, I don't feel that languages need to be/can be ranked. Right
| now I don't even believe that much in the right language for the
| job, but believe in "there was good founding engineer/team, they
| picked up what they were familiar with and were able to deliver".
| People > language. When the time comes, another army will rewrite
| in another language/microservice-size/whatever.
| persedes wrote:
| The article is not about ranking the languages per se, but
| about the drop in activity on Stack Overflow likely due to
| ChatGPT and a similar drop in PRs for GitHub.
| serial_dev wrote:
| https://twitter.com/timsneath/status/1734108536987373910
|
| TIOBE
|
| > Does anyone honestly believe that TypeScript is half as popular
| as Prolog or D?
| mminer237 wrote:
| You're probably aware, but that's a totally different ranking.
| On RedMonk's last ranking, TypeScript was one of the most
| popular languages, D was maybe 55% as popular, and Prolog
| didn't even make the list.
| RyanShook wrote:
| I think 2024 is the year we see Python overtake JS in overall
| popularity.
| frou_dh wrote:
| On the other hand, the significance of that diminishes quite a
| bit if it's only the case when JS/TS are separated in the
| ranking.
___________________________________________________________________
(page generated 2023-12-14 23:01 UTC)