[HN Gopher] The computers are fast, but you don't know it
___________________________________________________________________
The computers are fast, but you don't know it
Author : dropbox_miner
Score : 239 points
Date : 2022-06-16 18:24 UTC (4 hours ago)
(HTM) web link (shvbsle.in)
(TXT) w3m dump (shvbsle.in)
| thrwyoilarticle wrote:
| >Optimization 3: Writing your function in pure C++
|
| >double score_array[]
| jiggawatts wrote:
| Something all architecture astronauts deploying microservices on
| Kubernetes should try is benchmarking the latency of function
| calls.
|
| E.g.: call a "ping" function that does no computation using
| different styles.
|
| In-process function call.
|
| In-process virtual ("abstract") function.
|
| Cross-process RPC call in the same operating system.
|
| Cross-VM call on the same box (2 VMs on the same host).
|
| Remote call across a network switch.
|
| Remote call across a firewall and a load balancer.
|
| Remote call across the above, but with HTTPS and JSON encoding.
|
| Same as above, but across Availability Zones.
|
| In my tests these scenarios span a performance range of about a
| factor of one million from fastest to slowest. Languages like
| C++ and Rust
| will inline most local calls, but even when that's not possible
| overhead is typically less than 10 CPU clocks, or about 3
| nanoseconds. Remote calls in the typical case _start_ at around
| 1.5 milliseconds and HTTPS+JSON and intermediate hops like
| firewalls or layer-7 load balancers can blow this out to 3+
| milliseconds surprisingly easily.
|
| To put it another way, a synchronous/sequential stream of remote
| RPC calls in the _typical case_ can only provide about 300-600
| calls per second to a function that does _nothing_. Performance
| only goes downhill from here if the function does more work, or
| calls other remote functions.
|
| Yet, every enterprise architecture you will ever see, without
| exception has layers and layers, hop upon hop, and everything is
| HTTPS and JSON as far as the eye can see.
|
| I see K8s architectures growing side-cars, envoys, and proxies
| like mushrooms, _and then_ having all of that go across external
| L7 proxies ("ingress"), multiple firewall hops, web application
| firewalls, etc...
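| A rough sketch of such a measurement in Python (the
| http://localhost:8000/ping endpoint is hypothetical; point it
| at any echo server you have handy):
|
|     import timeit
|     import urllib.request
|
|     def ping_local():
|         # in-process function call that does no computation
|         return None
|
|     def ping_http():
|         # the "same" call, but over loopback HTTP
|         urllib.request.urlopen("http://localhost:8000/ping").read()
|
|     n_local, n_http = 1_000_000, 1_000
|     t_local = timeit.timeit(ping_local, number=n_local) / n_local
|     t_http = timeit.timeit(ping_http, number=n_http) / n_http
|     print(f"in-process: {t_local * 1e9:8.1f} ns/call")
|     print(f"loopback:   {t_http * 1e6:8.1f} us/call")
|
| Point the HTTP variant at a host behind a firewall and a load
| balancer and the gap gets even wider.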
| hamstergene wrote:
| On mobile devices it is more serious than just bad craftsmanship
| & hurt pride: bad code is short battery life.
|
| Think of a mobile game that could last 8 hours instead of 2 if it
| weren't doing unnecessary linear searches on a timer in JavaScript.
| ben_w wrote:
| There was one place where a coworker had written a function
| that converted data from a proprietary format into a SQL
| database. On some data, this took 20 minutes on the test
| iPhone. The coworker swore blind it was as optimised as
| possible and could not possibly go faster, even though it
| didn't take that long to load from _either_ the original file
| format _or_ the database in normal use.
|
| By the next morning, I'd found it was doing an O(n^2) operation
| that, while probably sensible when the app had first been
| released, was now totally unnecessary and which I could safely
| remove. That alone reduced the 20 minutes to 200 milliseconds.
|
| (And this is despite that coworker repeatedly emphasising the
| importance of making the phone battery last as long as
| possible).
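| (The original code isn't shown, so purely as a hypothetical
| illustration of the pattern: an accidental O(n^2) often looks
| like a membership test against a growing list, and the fix is
| a set.)
|
|     keys = [f"key{i % 5_000}" for i in range(10_000)]
|
|     # quadratic: "in" on a list rescans everything seen so far
|     seen_list = []
|     for k in keys:
|         if k not in seen_list:
|             seen_list.append(k)
|
|     # effectively linear: set membership is O(1) on average
|     seen_set = set()
|     for k in keys:
|         if k not in seen_set:
|             seen_set.add(k)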
| andrewclunn wrote:
| How are we supposed to optimize coding languages when the
| underlying hardware architecture keeps changing? I mean, you
| don't write assembly anymore; you would write in LLVM instead.
| Optimization was done because it was required. It will come back
| when complete commoditization of CPUs occurs. Enforcement of
| standards and consistent targets allows for heavy optimization.
| Just see what
| people are able to do with outdated hardware in the demo and
| homebrew scene for old game consoles! We don't need better
| computers, but so long as we keep getting them, we will get
| unoptimized software, which will necessitate better computers.
| The vicious cycle of consumerism continues.
| wodenokoto wrote:
| Did the author beat pandas group an aggregate by using standard
| Python lists?
| morelisp wrote:
| The main optimization at that stage seems to be preallocating
| the weights. I don't know pandas but such a thing would have
| been possible without dropping any of the linalg libraries I do
| know how to use.
|
| I doubt the author's C++ implementations beat BLAS/LAPACK, but
| since they're not shown I can only guess.
|
| I've done stuff like this before but the tooling is really no
| fun; somewhere between optimizations 2 and 3 I'd just write it
| all in C++.
|
| Changing the interface just to get parallelism out seems not
| great - give it to the user for free if the array is long
| enough - but maybe it was more reasonable for the non-trivial
| real problem.
| BiteCode_dev wrote:
| Most likely a misuse of Pandas. DataFrames are heavy to create,
| but calculations on them are fast if you stay in the numpy world
| and stay vectorized.
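| For illustration, the difference between leaving the numpy
| world and staying in it usually looks something like this
| (the column names and weights are made up):
|
|     import numpy as np
|     import pandas as pd
|
|     df = pd.DataFrame(np.random.rand(1_000_000, 2),
|                       columns=["a", "b"])
|     w = np.array([0.3, 0.7])
|
|     # slow: a Python-level function call per row
|     slow = df.apply(lambda r: r["a"] * w[0] + r["b"] * w[1],
|                     axis=1)
|
|     # fast: one vectorized expression, no per-row Python
|     fast = df["a"] * w[0] + df["b"] * w[1]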
| klyrs wrote:
| Yeah, I'm a bit suspicious that they made two simultaneous
| changes:
|
|     1. remove pandas
|     2. externalize WEIGHTS and don't generate it every run
|
| Point 2 is likely a huge portion of the runtime.
| jjoonathan wrote:
| "It's fast so long as you don't use any of the many parts
| that aren't fast!"
|
| This isn't great.
| BiteCode_dev wrote:
| That's true for everything in computing.
|
| Don't use a hammer as a screwdriver.
|
| I'm not even implying they shouldn't have used pandas for
| this, I'm suggesting they probably wrote the wrong pandas
| code for this.
|
| Pandas is typically 3 times faster than raw Python, not 10
| times slower.
| jjoonathan wrote:
| No, I think it _is_ fair to call out mediocrity, even
| when it tries to pull the "disclaim exactly the set of
| specific applications it gets called out on" trick.
|
| Sure, pandas often beats raw python by a bit, but come
| on, there's so much mediocrity between the two that I
| doubt they even had to cheat to find a situation the
| other way around.
| BiteCode_dev wrote:
| Then I wish I could, at some point in my life, create a
| project that reaches even twice this level of mediocrity.
| cs137 wrote:
| I used to be a hardcore functional programming weenie,
| but over time I realized that doing high-performance
| systems programming in an FP language means writing a
| bunch of non-idiomatic code, to the point that it's worth
| considering C (or C++ for STL only, but not that OOP
| stuff) instead unless you have a good reason (which you
| might) for a nonstandard language.
|
| The problem isn't Python itself. Python has come a long
| way from where it started. The problem is people using
| Python for modules where they actually end up needing,
| say, manual memory management or heterogeneous high
| performance (e.g. Monte Carlo algorithms).
| kzrdude wrote:
| People create accidentally quadratic code all the time.
| It's even easier in pandas because the feature set is so
| huge and finding the right way to do it takes some
| experience (see stackoverflow for a lot of plain loops over
| pandas dataframes).
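| A classic one: growing a DataFrame inside a loop copies all the
| rows accumulated so far on every iteration, which is
| accidentally quadratic; collecting the pieces and concatenating
| once is linear (the chunks here are made up):
|
|     import pandas as pd
|
|     chunks = [pd.DataFrame({"x": range(1_000)})
|               for _ in range(500)]
|
|     # quadratic: each concat copies everything built so far
|     df = pd.DataFrame()
|     for chunk in chunks:
|         df = pd.concat([df, chunk], ignore_index=True)
|
|     # linear: build the list, concatenate once
|     df = pd.concat(chunks, ignore_index=True)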
| [deleted]
| bornfreddy wrote:
| Yes. Apparently they have used Python lists to beat a highly
| optimized library which builds upon numpy, in C. Yeah, right.
|
| Note that I'm not saying that their second version of the code
| wasn't faster, just that this has nothing to do with python vs.
| pandas.
| dfgqeryqe wrote:
| You would think that, wouldn't you? But every time I've
| worked on a Python code base I have torn out Pandas and
| replaced it with simple procedural code, getting at least an
| order of magnitude speedup.
|
| Pandas is spectacularly slow. I don't understand how or why,
| but it is.
| snovv_crash wrote:
| Every time I've used tools that used Pandas things are slow.
| I totally believe this.
| abraxas wrote:
| Yep, many (especially younger) programmers don't get the "feel"
| for how fast things should run and as a result often "optimize"
| things horribly by either "scaling out" i.e. running things on
| clusters way larger than the problem justifies or putting queuing
| in front and dealing with the wait.
| pelorat wrote:
| Maybe, stop using Python for anything but better shell scripts?
| Pretty sure it was invented to be a bash replacement.
| javajosh wrote:
| That's really cool but I somewhat resent the use of percentages
| here. Just use a straight factor or even better just the order of
| magnitude. In this case it's four orders of magnitude of an
| improvement.
| Taywee wrote:
| > It's crazy how fast pure C++ can be. We have reduced the time
| for the computation by ~119%!
|
| The pure C++ version is so fast, it finishes before you even
| start it!
| etaioinshrdlu wrote:
| My entire career, we never optimize code as well as we can, we
| optimize as well as we need to. Obviously the result is that
| computer performance is only "just okay" despite the hardware
| being capable of much more. This pattern repeats itself across
| the industry over decades without changing much.
| pjvsvsrtrxc wrote:
| The problem is that performance for most common tasks that
| people do (e.g. browsing the web, opening a word processor,
| hell, even opening an IM app) has gone from "just okay" to "bad"
| over the past couple of decades despite our computers getting
| many times more powerful across every possible dimension (from
| instructions-per-clock to clock-rate to cache-size to memory-
| speed to memory-size to ...)
|
| For all this decreased performance, what new features do we
| have to show for it? Oh great, I can search my Start menu and
| my taskbar had a shiny gradient for a decade.
| etaioinshrdlu wrote:
| I think a lot of this is actually somewhat misremembering how
| slow computers used to be. We used to use spinning hard
| disks, and we were so often waiting for them to open
| programs.
|
| Thinking about it some more, the iPhone and iPad actually
| come to mind as devices that perform well and are
| practically always snappy.
| mulmboy wrote:
| Doubtful that moving from vectorised pandas & numpy to vanilla
| python is faster unless the dataset is small (sub 1k values) or
| you haven't been mindful of access patterns (that is, you're bad
| at pandas & numpy)
| forinti wrote:
| On a 3GHz CPU, one clock cycle is enough time for light to travel
| only 10cm.
|
| If you hold up a sign with, say, a multiplication, a CPU will
| produce the result before light reaches a person a few metres
| away.
| dragontamer wrote:
| > If you hold up a sign with, say, a multiplication, a CPU will
| produce the result before light reaches a person a few metres
| away.
|
| The latency on multiplication (register input to register
| output) is 5-clock ticks, and many computers are 4GHz or 5GHz
| these days.
|
| 5-clock cycles at 5GHz is 1ns, which is 30-centimeters of light
| travel.
|
| If we include an L1 cache read and an L1 cache write, IIRC it's
| 4 clock cycles for the read + 4 more for the write. So 13 clock
| ticks, which is almost 80 centimeters.
|
| ------------
|
| A DDR4 read plus the L1 cache write will add ~50 nanoseconds
| (~250 cycles) of delay, and we're up to roughly 15 meters.
|
| And now you know why cache exists; otherwise computers would be
| waiting on DDR4 RAM all day rather than doing work.
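| The arithmetic, as a quick back-of-the-envelope sketch (the
| latency figures are the rough numbers quoted above, not
| measurements):
|
|     C_CM_PER_NS = 29.98   # light travel in vacuum, cm per ns
|
|     latencies_ns = {
|         "multiply, ~5 cycles @ 5 GHz": 1.0,
|         "L1 read + multiply + L1 write, ~13 cycles": 2.6,
|         "DDR4 access, ~50 ns": 50.0,
|     }
|
|     for name, ns in latencies_ns.items():
|         cm = ns * C_CM_PER_NS
|         print(f"{name}: light travels {cm:.0f} cm")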
| moonchild wrote:
| > The latency on multiplication (register input to register
| output) is 5-clock ticks
|
| 3
|
| https://www.agner.org/optimize/instruction_tables.pdf
| jiggawatts wrote:
| Back in the days, an integer division took something like
| 46 clocks (original Pentium), and now on Ice Lake it's just
| 12 with a reciprocal throughput of 6. Multiply that by the
| clock speed increase and a modern CPU can "do division"
| about 300-400 times faster than a Pentium could. Then
| multiply _that_ by the number of cores available now versus
| just one core, and that increases to about 2000 times
| faster!
|
| I used to play 3D games on Pentium-based machines and I
| thought of them as a "huge upgrade" from 486, which in turn
| were a huge upgrade from 286, etc...
|
| Now, people with Ice Lake CPUs in their laptops and servers
| complain that things are slow.
| jnordwick wrote:
| Those instruction latencies are in addition to the
| pipeline-created latency. (They are actually the number of
| cycles added to the dependency chain, specifically.) The
| mult port has a small pipeline itself of 3 stages (that's why
| the 3-cycle latency). Intel has a 5-stage pipeline, so the
| minimum latency is going to be 8 for just those two things.
| diroussel wrote:
| That is quite an amazing way to put it.
|
| So the processor in my hand can compute a multiplication faster
| than light can cross the room?
| jiggawatts wrote:
| It can complete _many_ multiplications in that time,
| especially if you factor in parallelism. An 8-core machine
| using AVX-512 could do a few thousand 32-bit multiplications
| in that time. Your GPU can do tens of thousands, maybe
| hundreds of thousands depending on the model.
| generalizations wrote:
| I ran across an animation once that showed graphically the time
| it takes light to travel between the planets and the sun. It's
| weird, but light doesn't seem that fast anymore.
| shultays wrote:
| I feel like it is more that we can't comprehend how big and
| empty space is.
| ineedasername wrote:
| The speed of light has really not kept pace with Moore's Law.
| Engineers have focused overly much on clock speed and
| transistor density and completely ignored C, and it's really
| beginning to show.
| [deleted]
| orzig wrote:
| Another consequence of our society's reduced investment in
| fundamental physics - instead we go ether.
| AnimalMuppet wrote:
| I recently read a science fiction short story on reddit,
| where humans had developed faster-than-light
| communication because they needed to reduce lag in
| networked games.
| scarmig wrote:
| Perhaps the solution is more on the bioengineering side
| of things: make smaller people so they can fit in smaller
| rooms.
| edbaskerville wrote:
| The thing that did for me is realizing that people on
| opposite sides of the United States can't play music together
| if it requires any rhythmic coordination, even with a true
| speed-of-light signal with no other sources of latency.
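| The numbers, roughly (the ~4,000 km coast-to-coast distance and
| the often-cited ~20-30 ms threshold for playing in time are both
| approximations):
|
|     C_KM_PER_MS = 299.79        # speed of light in vacuum
|     coast_to_coast_km = 4_000   # ~New York to Los Angeles
|
|     one_way_ms = coast_to_coast_km / C_KM_PER_MS
|     round_trip_ms = 2 * one_way_ms
|     print(f"one way:    {one_way_ms:.1f} ms")    # ~13 ms
|     print(f"round trip: {round_trip_ms:.1f} ms") # ~27 ms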
| aeonik wrote:
| jamtaba and ninjam are open source solutions to this
| problem.
|
| They let you buffer everyone's playing at a user-specified
| interval, then replay the last measure of music
| to everyone.
|
| It's definitely not the same as live playing, but it's
| still pretty fun, and it actually forces you to get creative
| in different ways.
|
| https://jamtaba-music-web-site.appspot.com/
| dekhn wrote:
| Eventually you can't stuff enough computing in a small area
| (power density). Therefore you have to connect multiple CPUs
| spread out in space. The limit for many supercomputers is about
| how long it takes light or electrical signals to travel about
| 20 meters. Latency to first result is only part of the
| measurement that matters.
| grishka wrote:
| Huh, but then I'm pretty sure that there are some paths inside
| the CPU die that are long enough that speed of light is a
| consideration at these frequencies. Must require a lot of smart
| people to design these things, yet it only takes a bunch of
| junior developers to bog them down.
| google234123 wrote:
| I say we ban all junior devs.
| morelisp wrote:
| Easier to raise the speed of light.
| salawat wrote:
| And risk the universal UB?
|
| No thanks.
| charlie0 wrote:
| I've always been tempted to make things fast, but for what I
| personally do on a day to day basis, it all lands under the
| category of premature optimization. I suspect this is the case
| for 90% of development out there. I will optimize, but only after
| the problem presents itself. Unfortunately, as devs, we need to
| provide "value to the business". This means cranking out features
| quickly rather than as performant as possible and leaving those
| optimization itches for later. I don't like it, but it is what it
| is.
| momojo wrote:
| > for what I personally do on a day to day basis, it all lands
| under the category of premature optimization
|
| Another perspective on premature opt: When my software tool is
| used for an hour in the middle of a 20-day data pipeline, most
| optimization becomes negligible unless it's saving time on the
| scale of hours. And even then, some of my coworkers just shrug
| and run the job over the weekend.
| wildrhythms wrote:
| I agree... for a business "fast" means shipping a feature
| quickly. I have personally seen the convos from upper
| management where they handwave away or even justify making the
| application slower or unusable for certain users (usually
| people in developing countries with crappy devices). Oh it will
| cost +500KB per page load, but we can ship it in 2 weeks?
| Sounds good!
| tiffanyh wrote:
| NIM
|
| NIM should be part of the conversation.
|
| Typically, people trade slower compute time for faster
| development time.
|
| With NIM, you don't need to make that trade-off. It allows you to
| develop in a high-level language but get C-like performance.
|
| I'm surprised it's not more widely used.
| dgan wrote:
| at that point, almost anything compiled will be at least an
| order of magnitude faster than python
| Spivak wrote:
| The dichotomy between "compiled/interpreted" languages is
| completely meaningless at this point. You can argue that
| Python is compiled and Java is interpreted. I mean one of our
| deployment stages is to compile all our Python files.
|
| The thing that makes the difference isn't the compilation
| steps, it's how dynamic the language is and how much behind
| the scenes work has to be done per line and what tools the
| language gives you to express stronger guarantees that can be
| optimized (like __slots__ in Python).
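| A small example of the kind of guarantee meant here: __slots__
| tells CPython the full set of attributes up front, so instances
| skip the per-object __dict__ and attribute access gets cheaper.
|
|     import sys
|
|     class PointDyn:
|         def __init__(self, x, y):
|             self.x, self.y = x, y     # stored in a per-object dict
|
|     class PointSlots:
|         __slots__ = ("x", "y")        # fixed attribute layout
|         def __init__(self, x, y):
|             self.x, self.y = x, y
|
|     p, q = PointDyn(1, 2), PointSlots(1, 2)
|     print(sys.getsizeof(p.__dict__))  # the dict overhead exists
|     print(hasattr(q, "__dict__"))     # False: no dict at all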
| 41b696ef1113 wrote:
| >I'm surprised it's not more widely used.
|
| It's a ~community language without the backing of an 800lb
| gorilla to offer up both financial and cheerleading support.
|
| I love the idea of Nim, but it is in a real chicken-and-egg
| problem where it is hard for me to dedicate time to a language
| I fear will never reach a critical mass.
| sergiotapia wrote:
| I've used Nim for about 2 years now. It's a wonderful language
| but it's desperately lacking a proper web framework and a
| proper ORM. If such a thing existed I would probably drop
| Elixir for Nim career-wise.
| goodpoint wrote:
| It's written Nim, not NIM.
| dragontamer wrote:
| As a hobby, I still write Win32 programs (WTL framework).
|
| It's hilarious how quickly things work these days if you just
| use the 90s-era APIs.
|
| It's also fun to play with ControlSpy++ and see the dozens, maybe
| hundreds, of messages that your Win32 windows receive, and
| imagine all the function calls that occur in a short period of
| time (i.e. moving your mouse cursor over a button and moving it
| around a bit).
| pjvsvsrtrxc wrote:
| Linux windows get just as many (run xev from a terminal and do
| the same thing). Our modern processors, even the crappiest
| Atoms and ARMs, are actually _really, really fast_.
| dragontamer wrote:
| GPUs even faster.
|
| Vega64 can explore the entire 32-bit space roughly 1-thousand
| times per second. (4096 shaders, each handling a 32-bit
| number per clock tick, albeit requiring 16384 threads to
| actually utilize all those shaders due to how the hardware
| works, at 1200 MHz)
|
| One of my toy programs was brute forcing all 32-bit constants
| looking for the maximum amount of "bit-avalanche" in my home-
| brew random number generators. It only takes a couple of
| seconds to run on the GPU, despite exhaustive searching and
| calculating the RNG across every possible 32-bit number and
| running statistics on the results.
| jansommer wrote:
| Win32 really is fast. And with the Tiny C Compiler, a program
| compiles and launches faster than the Win10 calculator app
| takes to start.
| xupybd wrote:
| You also have to optimize for the constraints you have. If you're
| like me then development time is expensive. Is optimizing a
| function really the best use of that time? Sometimes yes, often
| no.
|
| Using Pandas in production might make sense if your production
| system only has a few users. Who cares if 3 people have to wait
| 20 minutes 4 times a year? But if you're public facing and speed
| equals user retention then no way can you be that slow.
| pjvsvsrtrxc wrote:
| > If you're like me then development time is expensive. Is
| optimizing a function really the best use of that time?
| Sometimes yes, often no.
|
| Almost always yes, because software is almost always used many
| more times than it is written. Even if you _doubled_ your dev
| time to only get a 5% increase in runtime speed, that's
| usually worth it!
|
| (Of course, capitalism is really bad at dealing with
| externalities and it makes our society that much worse. But
| that's an argument against capitalism, not an argument against
| optimization.)
| tintor wrote:
| TLDR: How to optimize Python function? Use C++.
| reedjosh wrote:
| Python and Pandas are absolutely excellent until you notice you
| need performance. I say write everything in Python with Pandas
| until you notice something taking 20 seconds.
|
| Then rewrite it with a more performant language or cython hooks.
|
| Developing features quickly is greatly aided by nice tools like
| Python and Pandas. And these tools make it easy to drop into
| something better when needed.
|
| Eat your cake and have it too!
| ineedasername wrote:
| Yes, there have been times that I have called linux command
| line utilities from python to process something rather than do
| it in python.
| muziq wrote:
| This afternoon I was discussing with my boss why issuing two
| 64-byte loads per cycle is pushing it, to the point where L1
| says no.. 400 GB/s of L1 bandwidth is all we have.. Is _all_ we
| have.. I remember when we could move maybe 50 KB/s.. And that
| was more than enough..
| vlovich123 wrote:
| > extra_compile_args = ["-O3", "-ffast-math", "-march=native",
| "-fopenmp"]
|
| > Some say -O3 flag is dangerous but that's how we roll
|
| No. -O3 is fine. -ffast-math is dangerous.
| tomrod wrote:
| Why?
| xcdzvyn wrote:
| It lets the compiler reorder operations in ways that are
| mathematically equivalent but not numerically equivalent (such
| is the nature of FP). It also breaks IEEE 754 compliance.
| TheRealPomax wrote:
| Some really good reasons:
| https://stackoverflow.com/a/22135559/740553
|
| It basically assumes all maths is finite and defined, then
| ignores how floating point arithmetic actually works,
| optimizing based purely on "what the operations suggest
| should work if we wrote them on paper" (alongside using
| approximations of certain functions that are super fast,
| while also being guaranteed inaccurate)
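| For illustration in Python, which follows the same IEEE 754
| rules the unoptimized C++ would: floating-point addition is not
| associative, so a compiler that reassociates (as -ffast-math
| permits) can silently change results, and -ffast-math also
| assumes NaN and infinity never occur.
|
|     # reassociation changes the answer
|     a, b, c = 1e16, -1e16, 1.0
|     print((a + b) + c)   # 1.0
|     print(a + (b + c))   # 0.0
|
|     # NaN-aware checks like this are exactly what
|     # -ffast-math is allowed to "optimize" away in C/C++
|     x = float("nan")
|     print(x != x)        # True under IEEE 754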
| newaccount2021 wrote:
| FirstLvR wrote:
| This is exactly what I was dealing with last year: a particular
| customer came to a meeting with the idea that developers have to
| be aware of making the code inclusive and sustainable... We told
| them that we had to set priorities on performance and on the
| literal result of the operation (a transaction development from
| an integration).
|
| Nothing really happened in the end, but it's a funny story in
| the office.
| bcatanzaro wrote:
| 1. There's no real limit to how slow you can make code. So that
| means there can be surprisingly large speedups if you start from
| very slow code.
|
| 2. But, there is a real limit to the speed of a particular piece
| of code. You can try finding it with a roofline model, for
| example. This post didn't do that. So we don't know if 201ms is
| good for this benchmark. It could still be very slow.
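| For reference, a roofline bound is just min(peak compute,
| memory bandwidth x arithmetic intensity); a sketch with made-up
| hardware numbers:
|
|     PEAK_GFLOPS = 200.0     # hypothetical peak compute
|     BANDWIDTH_GBS = 25.0    # hypothetical DRAM bandwidth
|
|     def roofline_gflops(flops, bytes_moved):
|         intensity = flops / bytes_moved    # FLOP per byte
|         return min(PEAK_GFLOPS, BANDWIDTH_GBS * intensity)
|
|     # e.g. a dot product step: 2 FLOPs per 16 bytes read,
|     # so it is memory bound at ~3 GFLOP/s on these numbers
|     print(roofline_gflops(flops=2, bytes_moved=16))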
| liprais wrote:
| Most likely misused pandas/numpy; as long as you stay in numpy
| land, it is quite fast.
| dfgqeryqe wrote:
| Use C or C++ or Rust or even Java and you don't have to worry
| about any of this. You can just write the obvious thing with
| your normal set of tools and it will be good enough.
| jerf wrote:
| I've been lightly banging the drum the last few years that a lot
| of programmers don't seem to understand how fast computers are,
| and often ship code that is just _miserably_ slower than it needs
| to be, like the code in this article, because they simply don't
| realize that their code _ought_ to be much, much faster. There's
| still a lot of very early-2000s ideas of how fast computers are
| floating around. I've wondered how much of it is the still-
| extensive use of dynamic scripting languages and programmers not
| understanding just _how much_ performance you can throw away how
| quickly with those things. It isn't even just the slowdown you
| get just from using one at all; it's really easy to pile on
| several layers of indirection without really noticing it. And in
| the end, the code seems to run "fast enough" and nobody involved
| really notices that what is running in 750ms really ought to run
| in something more like 200us.
|
| I have a hard time using (pure) Python anymore for any task
| where speed is even _remotely_ a consideration. Not only is
| it slow even at the best of times, but so many of its features
| beg you to slow down even more without thinking about it.
| divan wrote:
| So much this.
|
| I wonder what would be the software engineering landscape today
| if hardware specs were growing like 10% per year...
| lynguist wrote:
| I agree except for the Python bit, which is factually wrong.
|
| Python allows you to program as if you're a jazz pianist. You
| can improvise, iterate and have fun.
|
| And when you've found a solution you just refactor it and use
| numba. Boom, it runs at the same speed as a compiled language.
|
| I once wrote one little program that ran in 24 min without
| numba and ca. 8 seconds with numba.
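| For anyone who hasn't seen it, the numba workflow really is
| just a decorator; a minimal sketch (the function itself is
| arbitrary numeric code, nothing from the article):
|
|     import numpy as np
|     from numba import njit
|
|     @njit
|     def pairwise_sum(a):
|         total = 0.0
|         for i in range(a.shape[0]):      # plain loops are fine,
|             for j in range(a.shape[0]):  # numba compiles them
|                 total += a[i] * a[j]
|         return total
|
|     x = np.random.rand(2_000)
|     pairwise_sum(x)  # first call compiles; later calls are fast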
| willis936 wrote:
| Dozens of instances of a C GUI can launch in the time it
| takes to launch a hello world python program.
| geysersam wrote:
| Yes, but what kind of comparison is that?
|
|     1. How often do you need to execute 1000 GUI instances?
|     2. How often do you need to print "hello world"?
|
| The right tool for the right job.
| willis936 wrote:
| This is a discussion about computers being slow. As in a
| person asks a computer to do something and the human
| waits while the computer does it.
|
| So python isn't the right tool for any job that involves
| human interaction.
| geysersam wrote:
| Nah, that's too general. A _lot_ of website/app backends
| use Django or Fastapi and they work fine. Many more use
| PHP, also not a language famed for extreme performance.
|
| It depends on the application. Personally I wouldn't use
| Python for a GUI (because I'd use JS/TS).
| Nihilartikel wrote:
| Python _is_ slow, but even back in 2006 on a Pentium 4 I
| had no problem using it with PyGame to build a smooth
| 60fps R-Type-style shooter for a coding challenge.
|
| One just has to not do anything dumb in the render loop
| and it's plenty responsive.
|
| Of course, if you're going to interactively process a
| 50 MB CSV or something... But even then pandas is faster.
| munificent wrote:
| I agree 100%. I wish every software engineer would spend at
| least a little time writing some programs in bare C and running
| them to get a feel for how fast a native executable can start
| up and run. It is breathtaking if you're used to running
| scripting languages and VMs.
|
| Related anecdote: My blog used to be written using Jekyll with
| Pygments for syntax highlighting. As the number of posts
| increased, builds got slower and slower. Eventually, it took
| about 20 seconds to refresh a simple text change in a single
| blog post.
|
| I eventually decided to just write my own damn blog engine
| completely from scratch in Dart. Wrote my own template
| language, build graph, and syntax highlighter. By having a
| smart build system that knew which pages actually needed to be
| regenerated based on what data actually changed, I hoped to get
| very fast incremental rebuilds in the common case where only
| text inside a single post had changed.
|
| Before I got the incremental rebuild system working, I worked
| on getting it to just do a full build of the entire blog: every
| post page, pages for each tag, date archives, and RSS support.
| I diffed it against the old blog to ensure it produced the same
| output.
|
| Once I got that working... I realized I didn't even need to
| implement incremental rebuilds. It could build the entire blog
| and every single post from scratch in less than a second.
|
| I don't know how people tolerate slow frameworks and build
| systems.
| throwaway894345 wrote:
| Yeah, I've written static site generators in Go and Rust
| among other languages (it's my goto project for learning a
| new language). Neither needed incremental builds because they
| build instantly. The bottlenecks are I/O.
|
| I've also worked in Python shops for the entirety of my
| career. There are a lot of Python programmers who don't have
| experience with other languages and thus can't quite believe
| how much faster many of them are (100X-1000X sounds fast in the
| abstract, but it's _really, really fast_). I've seen
| engineering months spent trying to get a CPU-bound endpoint
| to finish reliably in under 60s (yes, we tried all of the
| "rewrite the hot path in X" things), while a naive Go
| implementation completed in hundreds of milliseconds.
|
| Starting a project in Python is a great way to paint yourself
| into a corner (unless you have 100% certainty that Python
| [and "rewrite hot path in X"] can handle every performance
| requirement your project will ever have). Yeah, 3.11 is going
| to get a bit faster, but other languages are 100-1000X faster
| --too little, too late.
| michaelchisari wrote:
| I wish product designers took performance into consideration
| when they designed applications. Engineers can optimize until
| their fingers fall off, but if the application isn't designed
| with efficiency in mind ( _and willing to make trade-offs in
| order to achieve that_ ), we'll probably just end up right
| back in the same place.
|
| And a product which is designed inefficiently where the
| engineer has figured out clever ways to get it to be more
| performant is most likely a product that is more complicated
| under the hood than it would be if performance were a design
| goal in the first place.
| latenightcoding wrote:
| off topic but I initially didn't notice your username but the
| second I read "I wrote my own template language in Dart" I
| knew who it was.
| jcelerier wrote:
| > I agree 100%. I wish every software engineer would spend at
| least a little time writing some programs in bare C and
| running them to get a feel for how fast a native executable
| can start up and run. It is breathtaking if you're used to
| running scripting languages and VMs.
|
| Conversely when 99.9% of the software you use in your daily
| life is blazing fast C / C++, having to do anything in other
| stacks is a complete exercise in frustration, it feels like
| going back a few decades in time
| p1esk wrote:
| Conversely when 99.9% of the software you use in your daily
| life is user friendly Python, having to do anything in
| C/C++ is a complete exercise in frustration, it feels like
| going back a few decades in time
| bena wrote:
| I kind of feel both statements.
|
| I like writing things in python. It honestly feels like
| cheating at times. Being able to reduce things down to a
| list comprehension feels like wizardry.
|
| I like having things written in C/C++. Because like every
| deep magic, there's a cost associated with it.
| bayindirh wrote:
| As a person who uses both languages for various needs, I
| disagree. Things which take minutes in optimized C++
| would probably take days in Python, even if I used the
| "accelerated" libraries for the matrix operations and other
| math I implement in C++.
|
| Lastly, people think C++ is not user friendly. No, it
| certainly is. It requires being careful, yes, but a lot of
| things can be done in fewer lines than people expect.
| kaba0 wrote:
| I call bullshit on that. You're either not comparing the
| same thing, or something else is off; C++ is not _that_
| much faster than even Python.
| bb88 wrote:
| Java and Go were both responses to how terrible C++
| actually is. While there are footguns in python, java,
| and go, there are exponentially more in C++.
| bayindirh wrote:
| As a person who wrote Java and loved it (and I still love
| it), I understand where you're coming from, however all
| programming languages thrive in certain circumstances.
|
| I'm no hater of _any_ programming language, but a strong
| proponent of using the right one for the job at hand. I
| write a lot of Python these days, because I neither need
| the speed, nor have the time to write a small utility
| which will help a user with C++. Similarly, I'd rather
| use Java if I'm going to talk with bigger DBs, do CRUD,
| or develop bigger software which is going to be used in
| an enterprise or similar setting.
|
| However, if I'm writing high performance software, I'll
| reach for C++ for the sheer speed and flexibility,
| despite all the possible foot guns and other not-so-
| enjoyable parts, because I can verify the absence of most
| foot-guns, and more importantly, it gets the job done the
| way it should be done.
| rot13xor wrote:
| The biggest weakness of C++ (and C) is non-localized
| behavior of bugs due to undefined behavior. Once you have
| undefined behavior, you can no longer reason about your
| program in a logically consistent way. A language like
| Python or Java has no undefined behavior so for example
| if you have an integer overflow, you can debug knowing
| that only data touched by that integer overflow is
| affected by the bug whereas in C++ your entire program is
| now potentially meaningless.
| bb88 wrote:
| I've seen a lot of bad C++ in my life, and have seen Java
| people write C++ like they would Java.
|
| Writing good C++ is hard. People who think they can write
| good C++ are surprised to learn about certain footguns
| (static initialization before main, exception handling
| during destructors, etc).
|
| I found this reference which I thought was a pretty good
| take on the C++ learning curve.
|
| https://www.reddit.com/r/ProgrammerHumor/comments/7iokz5/
| c_l...
| bayindirh wrote:
| > I've seen a lot of bad C++ in my life, and have seen
| Java people write C++ like they would Java.
|
| Ah, don't remind me Java people write C++ like they write
| Java, I've seen my fair share, thank you.
|
| > Writing good C++ is hard.
|
| I concur, however writing good Java is also hard. e.g.
| Swing has a fixed and correct initialization/build
| sequence, and Java self-corrects if you diverge, but you
| get a noticeable performance hit. Most developers miss
| the signs and don't fix these innocent looking mistakes.
|
| I've learnt C++ first and Java later. I also tend to hit
| myself pretty hard during testing (incl. Valgrind memory
| sanity and Cachegrind hotpath checks), so I don't claim I
| write impeccable C++. Instead I assume I'm worse than
| average and try to find what's wrong vigorously and fix
| them ruthlessly.
| throwaway894345 wrote:
| I've written a whole bunch of all of those languages, and
| they each occupy a different order of magnitude of
| footguns. From fewest to most: Go (1X), Java (10X),
| Python (100X), and C++ (1000X).
| kaba0 wrote:
| Go has much more footguns in my opinion. Just look at the
| recent thread on the topic:
| https://news.ycombinator.com/item?id=31734110
| throwaway894345 wrote:
| I was a C++ dev in a past life and I have no particular
| fondness for Python (having used it for a couple of
| decades), and "friendliness" is a lot more than code
| golf. It's also "being able to understand all of the
| features you encounter and their interactions" as well as
| "sane, standard build tooling" and "good debugability"
| and many other things that C++ lacks (unless something
| has changed recently).
| LoveMortuus wrote:
| I personally find C++ more friendly, just because of the
| formatting that python forces upon you.
|
| But I do have to say that I never managed to really get
| into python, it always just felt like too much of a
| hassle, thus I always avoided it if possible.
| idlehand wrote:
| I didn't like it for years but then I kind of got into it
| for testing out machine learning and I found it kind of
| neat. My biggest gripe is no longer the syntax but the
| slowness, trying to do anything with even a soft
| performance requirement means having to figure out how to
| use a library that calls C to do it for you. Working with
| large amounts of data in native Python is noticeably
| slower than even NodeJS.
| p1esk wrote:
| Which "accelerated" libraries for matrix operations are
| you talking about?
|
| Try writing a matmul operation in C++ and profile it
| against the same thing done in
| Numpy/Pytorch/TensorFlow/Jax. You'll be surprised.
| bayindirh wrote:
| The code I've written and am still working on uses
| Eigen, which TensorFlow also uses for its matrix
| operations, so I'm not far off from these guys in terms
| of speed, if not ahead.
|
| The code I've written can complete 1.7 million
| evaluations per core, per second, on older hardware,
| which is used to evaluate things up to 1e-6 accuracy,
| which is pretty neat for what I'm working on.
|
| [0]:
| https://eigen.tuxfamily.org/index.php?title=Main_Page
| gpm wrote:
| This is because numpy and friends are really good at
| matmul's.
|
| As soon as you step out of the happy path and need to do
| any calculation that isn't at least n^2 work for every
| single python call you are looking at order of magnitude
| speed differences.
|
| Years ago now (so I'm a bit fuzzy on the details) a
| friend asked me to help optimize some python code that
| took a few days to do one job. I got something like a 10x
| speedup using numpy, I got a _further_ 100x speedup (on
| the entire program) by porting one small function from
| optimized numpy to completely naive rust (I'm sure c or
| c++ would have been similar). The bottleneck was
| something like generating a bunch of random numbers,
| where the distribution for each one depended on the
| previous numbers - which you just couldn't represent
| nicely in numpy.
|
| What took 2 days now took 2 minutes, eyeballing the
| profiles I remember thinking you could almost certainly
| get down to 20 seconds by porting the rest to rust.
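| The shape of the problem was roughly this: each draw's
| distribution depends on the previous draw, so there is no single
| numpy call that vectorizes it (illustrative only, not the
| original simulation):
|
|     import numpy as np
|
|     rng = np.random.default_rng(0)
|
|     def sequential_draws(n, scale=1.0):
|         out = np.empty(n)
|         x = 0.0
|         for i in range(n):
|             # the next distribution depends on the previous
|             # value, so every step runs Python bytecode
|             x = rng.exponential(scale + abs(x))
|             out[i] = x
|         return out
|
|     samples = sequential_draws(1_000_000)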
| rubyskills wrote:
| Have you tried porting the problem into postgres? Not all
| big data problems can be solved this way but I was
| surprised what a postgres database could do with 40
| million rows of data.
| gpm wrote:
| I didn't, I don't think using a db really makes sense for
| this problem. The program was simulating a physical
| process to get two streams of timestamps from simulated
| single-photon detectors, and then running a somewhat-
| expensive analysis on the data (primarily a cross
| correlation).
|
| There's nothing here for a DB to really help with, the
| data access patterns are both trivial and optimal. IIRC
| it was also more like a billion rows so I'd have some
| scaling questions (a big enough instance could certainly
| handle it, but the hardware actually being used was a
| cheap laptop).
|
| Even if there was though - I would have been very
| hesitant to do so. The not-a-fulltime-programmer PhD
| student whose project this was really needed to be able
| to understand and modify the code. I was pretty hesitant
| to even introduce a second programming language.
| pjvsvsrtrxc wrote:
| Yes, writing software in C/C++ is harder. It's a darn
| good thing most software is used much more frequently
| than it is written, isn't it?
| tester756 wrote:
| Is performance inversely proportional to dev experience?
|
| because what you wrote could be said about using C++ in the
| context of dev experience
|
| 10 compilers, IDEs, debuggers, package managers
|
| and at the end of the day LLVM takes 30 minutes to compile
| and uses tens of GBs of RAM on average hardware
|
| I don't believe that this is the best we can get.
| kaba0 wrote:
| You haven't touched a C++ toolchain in the last decade,
| have you?
| spc476 wrote:
| TCC is a fast compiler. So fast that, at one time, one
| could use it to boot Linux _from source code!_ But there's
| a downside: the code it produces is slow. There's no
| optimization done. None. So the trade-off seems to be:
| compile fast but get a slow program, or compile slow but
| get a fast program.
| tester756 wrote:
| Is this trade-off actually that binary?
|
| I mean, what if there are features that take a significant %
| of the whole compile time?
|
| What if getting rid of them decreased performance by e.g.
| 4%, but also decreased compilation time by 30%?
|
| Would it be worth it?
| scarmig wrote:
| The trade-off is more of a gradient: e.g. PGO allows an
| instrumented binary to collect runtime statistics and
| then use those to optimize hot paths for future build
| cycles.
| bhauer wrote:
| > _Is performance inversely proportional to dev
| experience?_
|
| No. I feel there is great developer experience in many
| high performance languages: Java, C#, Rust, Go, etc.
|
| In fact, for my personal tastes, I find these languages
| more ergonomic than many popular dynamic languages.
| Though I will admit that one thing that I find ergonomic
| is a language that lifts the performance headroom above
| my head so that I'm not constantly bumping my head on the
| ceiling.
| pjvsvsrtrxc wrote:
| Huh?
|
| "10 compilers, IDEs, debuggers, package managers" what
| are you talking about? (Virtually) No one uses ten
| different tools to build one application. I don't even
| know of any C++-specific package managers, although I do
| know of language-specific package managers for... oh,
| right, most scripting languages. And an IDE _includes_ a
| compiler and a debugger, that's what makes it an IDE
| instead of a text editor.
|
| "and at the end of the day LLVM compiles 30min and uses
| tens of GBs of RAM on average hardware" sure, if you're
| compiling something enormous and bloated... I'm not sure
| why you think that's an argument against debloating?
| optimalsolver wrote:
| >I don't even know of any C++-specific package managers
|
| https://conan.io/
| tester756 wrote:
| >No one uses ten different tools to build one
| application.
|
| I meant you have a lot of choices to make
|
| Instead of having one strong standard which everyone
| uses, you have X of them, which makes changing
| projects/companies harder. But for a solid reason? I
| don't know.
|
| >"and at the end of the day LLVM compiles 30min and uses
| tens of GBs of RAM on average hardware" sure, if you're
| compiling something enormous and bloated... I'm not sure
| why you think that's an argument against debloating?
|
| I know that lines of code in a repo aren't a great way to
| compare these things, but
|
| .NET Compiler Infrastructure:
|
| 20 587 028 lines of code in 17 440 files
|
| LLVM:
|
| 45 673 398 lines of code in 116 784 files
|
| The first one I built (restore+build) in 6mins and it
| used around 6-7GB of RAM
|
| The second I'm not even trying, because the last time I
| tried doing it on Windows it BSODed after using the
| _whole_ RAM (16 GB).
| throwaway894345 wrote:
| I assume the parent was talking about the fragmentation
| in the ecosystem (fair point, especially regarding
| package management landscape and build tooling), but it's
| unclear.
| jcelerier wrote:
| > and at the end of the day LLVM compiles 30min and uses
| tens of GBs of RAM on average hardware
|
| I mean, that's the initial build.
|
| Here's my compile-edit-run cycle in https://ossia.io
| which is nearing 400kloc, with a free example of
| performance profiling thrown in (I haven't found anything
| like this whenever I had to profile python). It's not
| LLVM-sized of course, but it's not a small project either,
| maybe in the medium-low C++ project size:
| https://streamable.com/o8p22f ; pretty much a couple
| seconds at most from keystroke to result, for a complete
| DAW which links against Qt, FFMPEG, LLVM, Boost and a few
| others. Notice also how my IDE kindly informs me of
| memory leaks and other funsies.
|
|     C/C++ Header    2212    29523   17227   200382
|     C++             1381    34060   13503   199259
|
| Here's some additional tooling I'm developing - build
| times can be made as low as a few dozen milliseconds when
| one puts some work into making the correct API and using
| the tools correctly:
| https://www.youtube.com/watch?v=fMQvsqTDm3k
| zozbot234 wrote:
| Please don't write programs in bare C. Use Go if you're
| looking for something very simple and fast-enough for most
| uses; it's even memory safe as long as you avoid shared-state
| concurrency.
| bb88 wrote:
| Please don't write programs in go. Sure it looks awesome on
| the surface but it's a nightmare when you get a null
| pointer panic in a 3rd party library.
|
| Instead use Rust.
|
| See here for more info:
|
| https://getstream.io/blog/fixing-the-billion-dollar-
| mistake-...
| tomcam wrote:
| Or, you know, whatever the fuck language you care to use.
| michaelsshaw wrote:
| Nothing wrong with any of these languages, especially C.
| It's been around since the early 70s and is not going
| anywhere. There's a very good reason it (and to an extent
| C++) is still the default language for doing a lot of
| things: everyone understands it.
| KronisLV wrote:
| C and C++ both have excellent library support, perhaps
| the best interop of any language out there and platform
| support that cannot be beat.
|
| That said, they're also challenging to use for the
| "average" (median) developer who'd end up creating code
| that is error-prone and would probably have memory leaks
| sooner or later.
|
| Thus, unless you have a good reason (of which,
| admittedly, there are plenty) to use C or C++, something
| that holds your hand a bit more might be a reasonable
| choice for many people out there.
|
| Go is a decent choice, because of a fairly shallow
| learning curve and not too much complexity, while having
| good library support and decent platform support.
|
| Rust is a safer choice, but at the expense of needing to
| spend a non-insignificant amount of time learning the
| language, even though the compiler is pretty good at
| being helpful too.
| bb88 wrote:
| > perhaps the best interop of any language out there and
| platform support that cannot be beat.
|
| Disagree here. The C++ ABI has pretty much been terrible
| for the last 20 years.
|
| C is fine in this regard though.
| throwaway894345 wrote:
| > That said, they're also challenging to use for the
| "average" (median) developer who'd end up creating code
| that is error-prone and would probably have memory leaks
| sooner or later.
|
| Many of the most highly credentialed, veteran C
| developers have said they can't write secure C code. Food
| for thought.
|
| > Go is a decent choice, because of a fairly shallow
| learning curve and not too much complexity, while having
| good library support and decent platform support. Rust is
| a safer choice, but at the expense of needing to spend a
| non-insignificant amount of time learning the language,
| even though the compiler is pretty good at being helpful
| too.
|
| Go doesn't have the strongest static guarantees, but it
| does provide a decent amount of static guarantees while
| also keeping the iteration cycle to a minimum. Languages
| like Rust have significantly longer iteration cycles,
| such that you can very likely ship sooner with Go at
| similar quality levels (time savings can go into catching
| bugs, including bugs which Rust's static analysis can't
| catch, such as race conditions). Moreover, I've had a few
| experiences where I got so in-the-weeds trying to pacify
| Rust's borrow-checker that I overlooked relatively
| straightforward bugs that I almost certainly would've
| caught in a less-tedious language--sometimes static
| analysis can be _distracting_ and, in that respect, harm
| quality (I don't think this is a big effect, but it's not
| something I've seen much discussion about).
| _gabe_ wrote:
| > secure C code.
|
| There is insecure code hidden in every project that uses
| any programming language ;)
|
| I get what you're saying here, you're specifically
| talking about security vulnerabilities from memory
| related errors. I honestly wonder how many of these
| security vulnerabilities are truly issues that never
| would have come up in a more "secure" language like Java,
| or if the vulnerabilities would have just surfaced in a
| different manner.
|
| In other words, we're constantly told C and C++ are
| unsafe languages that should never be used, and blah blah
| blah. How much of this is because of the fact that C has
| been around since the 1970s, so its had a lot more time
| to rack up large apps with security vulnerabilities,
| whereas most of the new recommended languages to replace
| C and C++ have been around since the late 90s. In another
| 20 years will we be saying the same thing about java that
| people say about C and C++? And will we be telling people
| to switch to the latest and greatest because Java is
| "unsafe"? Are these errors due to the language, or is it
| because we will always have attackers looking for
| vulnerabilities that will always exist because
| programmers are fallible and write buggy code?
| jcranmer wrote:
| > I honestly wonder how many of these security
| vulnerabilities are truly issues that never would have
| come up in a more "secure" language like Java, or if the
| vulnerabilities would have just surfaced in a different
| manner.
|
| Memory safety vulnerabilities basically boil down to
| following causes: null pointer dereferences, use-after-
| free (/dangling stack pointers), uninitialized memory,
| array out-of-bounds, and type confusion. Now, strictly
| speaking, in a memory-safe language, you're guaranteed
| not to get uncontrollable behavior in any of these cases,
| but if the result is a thrown exception or panic or
| similar, your program is still crashing. And I think for
| your purposes, such a crash isn't meaningfully better
| than C's well-things-are-going-haywire.
|
| That said, use-after-free and uninitialized memory
| vulnerabilities are completely impossible in a GC
| language--you're not going to even get a controlled
| crash. In a language like Rust or even C++ in some cases,
| these issues are effectively mitigated to the point where
| I'm able to trust that it's not the cause of anything I'm
| seeing. Null-pointer dereferences are not effectively
| mitigated against in Java, but in Rust (which has
| nullability as part of the type), it does end up being
| effectively mitigated. This does leave out-of-bounds and
| type confusion as two errors that are not effectively
| mitigated by even safe languages, although they might end
| up being safer in practice.
| nequo wrote:
| > In another 20 years will we be saying the same thing
| about java that people say about C and C++? And will we
| be telling people to switch to the latest and greatest
| because Java is "unsafe"?
|
| As long as the vulnerability types that cause trouble in
| language B are a superset of those that cause trouble in
| language C, it makes sense to recommend moving from B to
| C for safety reasons.
|
| This is true even if there is a language A that is even
| worse and in the absence of language C, we recommended
| moving from A to B. Code written in A will be worse in
| expectation than code written in B than code written in
| C.
| throwaway894345 wrote:
| > There is insecure code hidden in every project that
| uses any programming language ;)
|
| Security isn't a binary :) Two insecure code bases can
| have different degrees of insecurity.
|
| > I honestly wonder how many of these security
| vulnerabilities are truly issues that never would have
| come up in a more "secure" language like Java, or if the
| vulnerabilities would have just surfaced in a different
| manner.
|
| I don't know how memory safety vulns could manifest
| differently in Java or Rust.
|
| > In other words, we're constantly told C and C++ are
| unsafe languages they should never be used and blah blah
| blah. How much of this is because of the fact that C has
| been around since the 1970s, so its had a lot more time
| to rack up large apps with security vulnerabilities
|
| That doesn't address the veteran C programmers who say
| they can't reliably write secure C code (that's _new_
| code, not 50 year old code).
|
| > Are these errors due to the language, or is it because
| we will always have attackers looking for vulnerabilities
| that will always exist because programmers are fallible
| and write buggy code?
|
| A memory safe language can't have memory safety
| vulnerabilities (of course, most "memory safe" languages
| have the ability to opt out of memory safety for certain
| small sections, and maybe 0.5% of code written in these
| languages is memory-unsafe, but that's still a whole lot
| less than the ~100% of C and C++ code).
|
| Of course, there are other classes of errors that Java,
| Rust, Go, etc can't preclude with much more efficacy than
| C or C++, but eliminating entire classes of
| vulnerabilities is a pretty compelling reason to avoid C
| and C++ for a whole lot of code if one can help it (and
| increasingly one can help it).
| AnimalMuppet wrote:
| You've obviously been burned by null pointers (probably
| not just once). And you think they are a problem, and
| you're right. And you think they are a mistake, and you
| could be right about that, too.
|
| But they're not the _only_ problem. Writing async network
| servers can be a problem, too. Go helps a lot with that
| problem. If _for your situation_ it helps more with that
| than it hurts with nulls, then it can be a rational
| choice.
|
| And, don't assume that go must be a bad choice for _all_
| programmers, in _all_ situations. It's not.
| bb88 wrote:
| And it's certainly not perfect in writing async network
| servers. It adds new concurrency bug types:
|
| https://songlh.github.io/paper/go-study.pdf
| tomcam wrote:
| Some of us know how to program. Some of us know the
| fundamentals.
| dylan604 wrote:
| Fewer know both
| kaba0 wrote:
| If Go is "fast enough", then so is Java, C#, JS, Haskell,
| and a litany of other managed languages.
| skybrian wrote:
| I mean, he just explained that after rewriting his program
| in Dart, it was fast enough? That's not really the point
| here.
|
| On the other hand, I tried writing a Wren interpreter in Go
| and it was considerably slower than the C version. Even
| programming languages that are usually pretty fast aren't
| always fast, and interpreter inner loops are a weak spot
| for Go.
| zozbot234 wrote:
| > I mean, he just explained that after rewriting his
| program in Dart, it was fast enough?
|
| Yes, and that makes his C advocacy even less sensible.
| Dart is a perfectly fine language, even though it seems
| to be a bit underused compared to others.
| skybrian wrote:
| Spending "a little time writing some programs in C" is
| not the same as advocating that people write most of
| their code in C, or that you use it in production.
|
| Maybe try reading _Crafting Interpreters_, half of which
| is in Java and half in C.
|
| http://craftinginterpreters.com/
| cozzyd wrote:
| If you want to write something you can use from any
| language, C is still the best choice...
| staticassertion wrote:
| Unqualified "fast enough" is pretty much exactly the
| problem being pointed out. Most developers have no idea
| what "fast" is let alone "fast enough". If they were taught
| to benchmark at with a lower level language, see what
| adding different abstractions causes, that would help a
| ton.
|
| I would personally suggest C++ though because there is such
| a huge amount of knowledge around performance and
| abstraction in that community - wonderful conference talks
| and blog posts to learn from.
| zozbot234 wrote:
| It's not an "unqualified" claim, Go really is fast enough
| compared to the likes of Python and Ruby. I'm not saying
| that rewriting a Go program in a faster language
| (C/C++/Rust) can't sometimes be effective, but that's due
| to special circumstances - it's not something that
| generalizes to any and all programs.
| staticassertion wrote:
| "Fast enough" is inherently unqualified since what
| "enough" is is going to be case specific.
| shadowofneptune wrote:
| Go comes from a different school of compiler design where
| the code generation is decent in most cases, but
| struggles with calculations and more specific patterns.
| Delphi is a similar compiler. Looking at benchmarks, the
| performance is only a few times worse than optimized C.
| That's on par with the most optimized JITed languages
| like Java, while being overall a much simpler compiler. I
| feel it is is fair to say 'good enough' in this
| situation.
| marcosdumay wrote:
| It's not only a matter of 750ms instead of 200ms. I'm
| astonished every time I open some tool like Visual Studio, SAP
| Power Designer, or Libre Office and it sits on its loading
| screen for the better part of a minute.
|
| What do those tools even do for that long? In that time they
| could read enough data from disk to fill my computer's main
| memory several times over.
| blktiger wrote:
| I've always assumed they are loading a bunch of stuff into
| caches and pre-computing things.
| m12k wrote:
| I heard optimization described this way: Sure, you think you
| need to tune the engine, but really, the first thing you need
| to do is get the clowns out of the car.
| chopin wrote:
| Phone home. I suspect much of the lag is network latency.
| bentcorner wrote:
| I work at a BigCorp that ships desktop software (but none
| of the above products) and network latency is (usually)
| pretty easy to extract out of the boot critical path.
| Blocking UI with network calls is a big no-no, and I expect
| any sizeable organization to have similar guidelines.
|
| Work like in the OP's article is probably the most
| difficult - it's work that is necessary, cannot be
| deferred, but is still slow. So it requires an expert to
| dig into it.
| chopin wrote:
| The answers in this subthread made me think some more: I am
| using a company-provided Windows machine and a Linux virtual
| desktop for the same tasks. The difference in startup times
| for many applications is night and day. Probably due to virus
| scanning and MS OneDrive.
| cs137 wrote:
| Network lag can be worked around with concurrent
| programming techniques--you don't even have to use a high-
| performance language to do it. The problem is that
| concurrent programming is far beyond what the typical Jira
| jockey can do--bosses would rather hire commodity drones
| who'll put up with Agile than put up with and pay for the
| kind of engineers who can write concurrent or parallel
| programs.
| [deleted]
| marcosdumay wrote:
| Power Designer surely is phoning home but this isn't nearly
| slow enough to matter here. AFAIK Visual Studio phones in
| during the operation, and not on startup. Libre Office
| almost certainly isn't phoning anywhere.
|
| I didn't include the slowest starting software that I know,
| Oracle SQL Developer, because it's clear that all the
| slowness is caused by phoning home, several times for some
| reason. But that's not the case for all of them.
|
| EDIT: Or, maybe it's useful to put it another way. The
| slowest region of the world for me to ping is around Eastern
| Asia and Australia. Sometimes I get around a 1.5s round-trip
| time to there. A minute has around 40 of those.
| schroeding wrote:
| I use Visual Studio on an air-gapped machine with no
| (active) network cards (so Windows / winsock2 knows there
| is nothing that can respond and any connection should error
| out immediately) and it still takes almost a minute.
|
| At least VS is just kinda slow, maybe it's the XML parser
| :D
| grishka wrote:
| I remember a video of a guy running an old version of Visual
| C++ on an equally old version of Windows, in a VM on modern
| hardware, to try Windows development "the old way". It took
| about one frame to launch. One. Frame.
|
| By the way, Apple isn't much better. Xcode takes around 15
| seconds to launch on an M1 Max.
|
| edit: probably this video https://youtu.be/j_4iTovYJtc?t=282
| deergomoo wrote:
| > Xcode takes around 15 seconds to launch on an M1 Max
|
| Not really related to launch time but it's hilarious how
| much faster Xcode is when working with Objective-C compared
| to Swift. I understand why, but it's still jarring
| dylan604 wrote:
| One. Frame. Of. What?
|
| I've never heard of someone describing how long something
| took like this without at least defining the frame rate.
| grishka wrote:
| Of video. Which probably was 30 fps. I mean, the splash
| screen just blinked for a barely noticeable split second
| before the main window appeared. You double click the
| shortcut, and it's already done launching before you
| realize anything. That's how fast modern computers are.
|
| (actually, _some_ things on the M1 are fast enough that I'm
| now getting annoyed at networking taking what feels like
| ages)
| dylan604 wrote:
| Why would you assume video is at 30fps? Geographic
| location? People not in the US (and a handful of other
| countries) would assume video framerate of 25fps.
|
| Does the refresh rate of a computer monitor get referred
| to as frames? Usually, it's just the frequency like 120Hz
| type units. Sorry for the conversation break, but I've
| just never heard app start-up times given with a framerate
| reference. It was just unusual enough that I let my brain
| wander on it longer than necessary.
| grishka wrote:
| Oh ffs. First off, I'm not from the US. I've been there
| for less than a month combined. Secondly, if you do want
| to nitpick, at least do some research first. The video in
| question is 60 or 30 fps depending on the quality
| setting.
|
|     $ yt-dlp -F https://www.youtube.com/watch?v=j_4iTovYJtc
|     [youtube] j_4iTovYJtc: Downloading webpage
|     [youtube] j_4iTovYJtc: Downloading android player API JSON
|     [youtube] j_4iTovYJtc: Downloading player df5197e2
|     [info] Available formats for j_4iTovYJtc:
|     ID  EXT   RESOLUTION FPS |    FILESIZE   TBR PROTO | VCODEC          VBR ACODEC      ABR     ASR MORE INFO
|     -------------------------------------------------------------------------------------------------------------
|     sb2 mhtml 48x27          |                   mhtml | images                                      storyboard
|     sb1 mhtml 80x45          |                   mhtml | images                                      storyboard
|     sb0 mhtml 160x90         |                   mhtml | images                                      storyboard
|     139 m4a   audio only     |   46.85MiB    48k https | audio only          mp4a.40.5   48k 22050Hz low, m4a_dash
|     249 webm  audio only     |   49.06MiB    51k https | audio only          opus        51k 48000Hz low, webm_dash
|     250 webm  audio only     |   63.84MiB    66k https | audio only          opus        66k 48000Hz low, webm_dash
|     140 m4a   audio only     |  124.33MiB   129k https | audio only          mp4a.40.2  129k 44100Hz medium, m4a_dash
|     251 webm  audio only     |  125.02MiB   130k https | audio only          opus       130k 48000Hz medium, webm_dash
|     17  3gp   176x144      8 |   56.70MiB    59k https | mp4v.20.3       59k mp4a.40.2    0k 22050Hz 144p
|     160 mp4   256x144     30 |   37.86MiB    39k https | avc1.4d400c     39k video only              144p, mp4_dash
|     278 webm  256x144     30 |   42.59MiB    44k https | vp9             44k video only              144p, webm_dash
|     133 mp4   426x240     30 |   84.31MiB    87k https | avc1.4d4015     87k video only              240p, mp4_dash
|     242 webm  426x240     30 |   70.03MiB    72k https | vp9             72k video only              240p, webm_dash
|     134 mp4   640x360     30 |  167.27MiB   174k https | avc1.4d401e    174k video only              360p, mp4_dash
|     18  mp4   640x360     30 |  352.24MiB   366k https | avc1.42001E    366k mp4a.40.2    0k 44100Hz 360p
|     243 webm  640x360     30 |  134.68MiB   140k https | vp9            140k video only              360p, webm_dash
|     135 mp4   854x480     30 |  294.98MiB   307k https | avc1.4d401f    307k video only              480p, mp4_dash
|     244 webm  854x480     30 |  233.37MiB   243k https | vp9            243k video only              480p, webm_dash
|     136 mp4   1280x720    30 |  653.31MiB   680k https | avc1.4d401f    680k video only              720p, mp4_dash
|     22  mp4   1280x720    30 | ~795.07MiB   808k https | avc1.64001F    808k mp4a.40.2    0k 44100Hz 720p
|     247 webm  1280x720    30 |  548.72MiB   571k https | vp9            571k video only              720p, webm_dash
|     298 mp4   1280x720    60 |  817.18MiB   850k https | avc1.4d4020    850k video only              720p60, mp4_dash
|     302 webm  1280x720    60 |  651.39MiB   678k https | vp9            678k video only              720p60, webm_dash
|
| And the units? Hz and FPS are generally interchangeable
| but FPS is more often used as a measure of how fast
| something renders while Hz is more often used for monitor
| refresh rates (a holdover from CRTs I guess).
| MiddleEndian wrote:
| But imagine if Visual C++ was written entirely in Electron
| instead! Wouldn't THAT be sweet?
| grishka wrote:
| Should be called Visual React then!
| dmitriid wrote:
| It's at the end of Casey Muratori's Visual Studio rant:
| https://youtu.be/GC-0tCy4P1U
|
| Not only does Visual Studio start up instantly in an older
| version of Windows running in a VM. Debugger values update
| instantly there as well, something that Visual Studio _can
| no longer do_.
| knorker wrote:
| Not that it invalidates anything you said, but it was 750ms
| vs 200 microseconds.
|
| But yeah. I agree. Why does Lightroom take forever to load,
| when I can query its backing SQLite in no time at all?
|
| And that's not even mentioning the RAM elephant in the room:
| chrome.
|
| Younglings today don't understand what a mindbogglingly large
| amount of data a GB is.
|
| But here's the thing: it's cheaper to waste thousands of CPU
| cores on bad performance than to have an engineer spend a day
| optimizing it.
| pjvsvsrtrxc wrote:
| > But here's the thing: it's cheaper to waste thousands of
| CPU cores on bad performance than to have an engineer spend
| a day optimizing it.
|
| No, it really isn't. It's only cheaper for the company
| making the software (and only if they don't use their
| software extensively, at that).
| knorker wrote:
| It depends.
|
| Run the lifetime cost of a CPU, and compare it to what
| you pay your engineers. It's shocking how much RAM and
| CPU you can get for the price of an hour of engineer
| time.
|
| And that's not even all! Next time someone reads the
| code, if it's "clever" (but much much faster) then that's
| more human time spent.
|
| And if it has a bug because it sacrificed some
| simplicity? That's human hours or days.
|
| And that's not even all. There's the opportunity cost of
| that engineer. They cost $100 an hour. They could spend
| an hour optimizing $50 worth of computer resources, or
| they could implement 0.1% of a feature that unlocks a
| million dollar deal.
|
| Then having them optimize is not just a $50 loss, it's a
| $900 opportunity cost.
|
| But yeah, shipped software like shrinkwrapped or JS
| running on client browsers, that's just having someone
| else pay for it.
|
| (which, for the company, has even less cost)
|
| But on the server side: yes, in most cases it's cheaper
| to get another server than to make the software twice as
| fast.
|
| Not always. But don't prematurely optimize. Run the
| numbers.
|
| One thing where it really does matter is when it'll run
| on battery power. Performance equals battery time. You
| can't just buy another CPU for that.
| cgriswald wrote:
| Exactly. Users are subsidizing the software provider with
| CPU cycles and employee time.
|
| Assume it costs $800 for an engineer-day. Assume your
| software has 10,000 daily users and that the wasted time
| cost is 20 seconds (assume this is actual wasted time
| when an employee is actively waiting and not completing
| some other task). Assume the employees using the software
| earn on average 1/8 of what the engineer makes. It would
| take less than 4 days to make up for the employee's time.
| That $800 would save about $80,000 per year.
|
| Obviously, this is a contrived example, but I think it's
| a conservative one. I'm overpaying the engineer (on
| average) and probably under-estimating time wasted and
| user cost.
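|
| One way to run those numbers (a rough sketch; the 8-hour day
| and 260 working days are added assumptions, the rest are the
| figures above, and the totals depend heavily on how much of
| the waiting is truly dead time):
|
|     engineer_day = 800                      # $ per engineer-day
|     engineer_hour = engineer_day / 8        # assume an 8-hour day
|     user_hour = engineer_hour / 8           # users earn 1/8 as much
|     users = 10_000
|     wasted_seconds_per_user_per_day = 20
|     working_days_per_year = 260             # assumption
|
|     wasted_hours_per_day = users * wasted_seconds_per_user_per_day / 3600
|     cost_per_day = wasted_hours_per_day * user_hour
|
|     print(f"payback: {engineer_day / cost_per_day:.1f} days")
|     print(f"yearly waste: ${cost_per_day * working_days_per_year:,.0f}")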
| knorker wrote:
| If humans wait, yes. If you can just buy another server:
| no.
|
| I 100% agree on saving human time. Human time is
| expensive. CPU time is absolutely not.
| rot13xor wrote:
| Regarding chrome, browsers are basically operating systems
| nowadays. A standards compliant HTML5 parser is at the bare
| minimum millions of lines of code. Same for the renderer
| and Javascript engine.
| compressedgas wrote:
| > A standards compliant HTML5 parser is at the bare
| minimum millions of lines of code.
|
| But https://github.com/google/gumbo-parser is only 34K
| lines?
| vbezhenar wrote:
| Back in the day, all you needed for perfect performance was C
| and proper algorithms. It was easy.
|
| Nowadays you need vector operations, you need to utilise GPU,
| you need to utilise various accelerators. For me it is black
| magic.
| AnIdiotOnTheNet wrote:
| But perfect performance isn't even the benchmark, it's "not
| ridiculously slow". This is what is meant by "Computers are
| fast, but you don't know it", you don't even know how
| ludicrously fast computers are because so much stuff is so
| insanely slow.
|
| They're so fast that, in the vast majority of cases, you
| don't even need optimization, you just need non-
| pessimization: https://youtu.be/pgoetgxecw8
| LeifCarrotson wrote:
| To be really fast, yes. Those are optimizations that allow
| you to go beyond the speed of just C and proper algorithms.
|
| But C and proper algorithms are still fast - Moore's law is
| going wider, yes, and single-threaded advancements aren't as
| impressive as they used to be, but solid C code and proper
| algorithms will still be faster than they were before!
|
| What's not fast is when, instead of using a hashmap when you
| should have used a B-tree, you instead store half the data in
| a relational database from one microservice and the other
| half on the blockchain and query it using a zero-code
| platform provided by a third vendor.
| Schroedingersat wrote:
| These things only net you one or two orders of magnitude (and
| give you very little or even negative power efficiency gain),
| or maybe 3 for the gpu.
|
| This pales in comparison to the 4-6 orders of magnitude
| induced by thoughtless patterns, excessive abstraction,
| bloat, and user-hostile network round trips (this one is more
| like 10 orders of magnitude).
|
| Write good clean code in a way that your compiler can easily
| reason about to insert suitable vector operations (a little
| easier in c++, rust, zig etc. than c) and it's perfect
| performance in my book even if it isn't saturating all the
| cores
| flaviut wrote:
| I think you're trying too hard.
|
| Something that you do a lot? Fine, write it in C/C++/Rust.
|
| It's something that costs thousands/millions of dollars of
| compute? Ok, maybe it's worth it for you to spend a month on,
| put your robe on, and start chanting in latin.
| pjvsvsrtrxc wrote:
| And yet, even with all the evidence that modern, heavily-
| bloated software development is AWFUL (constant bugs and
| breakage because no one writing code understands any of the
| software sitting between them and the machine, much less
| understands the machine; Rowhammer, Spectre, Meltdown, and now
| Hertzbleed; sitting there waiting multiple seconds for
| something to launch up another copy of the web browser you
| already have running just so that you can have chat, hi
| Discord)... you still have all the people in the comments below
| trying to come up with reasons why "oh no it's actually good,
| the poor software developers would have to actually learn
| something instead of copying code off of Stack Overflow without
| understanding it".
| Tade0 wrote:
| > And in the end, the code seems to run "fast enough" and
| nobody involved really notices that what is running in 750ms
| really ought to run in something more like 200us.
|
| At least with Chrome's V8, the difference is not that big.
|
| Sure, it loses to C/C++, because it can't vectorize and uses
| orders of magnitude more memory, but at least in the Computer
| Language Benchmarks Game it's "just" 2-4x slower.
|
| I remember getting a faster program doing large matrix
| multiplication in JavaScript than in C with -O1, because V8
| figured out that I was reading from and writing to the same
| cell and optimised that out, which gave it an edge; in both
| cases memory bandwidth limited the speed of execution.
|
| As for Electron and the like: half of the reason why they're
| slow is that document reflows are not minimized, so the
| underlying view engine works really, really hard to re-render
| the same thing over and over again.
|
| It's not nearly as visible in web apps, because these in turn
| are often slowed down by the HTTP connection limit (hardcoded
| to six in most browsers).
| 1vuio0pswjnm7 wrote:
| "I have a hard time using (pure) Python anymore for any task
| that speed is even remotely a consideration anymore. Not only
| is it slow even at the best of times, but so many of its
| features beg you to slow down even more without thinking about
| it."
| yieldcrv wrote:
| There is a balance. Sure, there is inefficient code, but often
| it's because that code is accessing an I/O resource
| inefficiently, and so the CPU and RAM speed of the host machine
| isn't the bottleneck no matter what dumb things the programmer
| does.
|
| So you pretty much never need to reinvent or even use a
| hackerrank algorithm; you need to understand that the database
| compute instance has a fast CPU and lots of RAM too.
| Guest19023892 wrote:
| I upgraded a desktop machine the last time I visited my family.
| It was a Windows 7 computer that was at least 10 years old with
| 4GB of ram. They wanted to use it online for basic web
| browsing, so I thought I'd install Windows 10 for security
| reasons and drop in a modern SSD to upgrade the old 7200rpm
| drive to make it more snappy.
|
| Well, it felt slower after the "upgrade". Clicking the start
| menu and opening something like the Downloads or Documents
| folder was basically instant before. Now, with Windows 10 and
| the new SSD there was a noticeable delay when opening and
| browsing folders.
|
| It really made me wonder how it would be running something like
| Windows 98 and websites of the past on modern hardware.
| ishjoh wrote:
| In a similar vein I installed Ubuntu on an older laptop that
| had been running Windows 10. I was shocked at how fast it was
| compared to Windows 10, it was night and day.
| babypuncher wrote:
| Throw in more RAM and Windows 10 will likely feel snappier
| than Windows 7 did.
|
| It's probable the old Windows 7 install was 32-bit while your
| fresh install of 10 would have defaulted to 64-bit. That
| combined with 10's naturally higher memory requirements means
| the system has less headroom to work with.
| 867-5309 wrote:
| recently I've seen new laptops being shipped with 4GB.
| possibly with a slightly lighter (but not fully debloated)
| version of 10 (Home? Starter? Edu?)
|
| I'm not sure if this is because Windows memory usage is a
| lot more efficient now, or if the newer processors'
| performances can cancel out the RAM capacity bottleneck, or
| if PC4-25600 + NVMe pagefiles are simply fast enough, or if
| manufacturers are spreading thinly during the chip
| shortage. but it's certainly an ongoing trend
| SV_BubbleTime wrote:
| It's all this, and I'm dealing with it today.
|
| My mother-in-law bought a machine with 4GB of RAM, which was
| fine before windows 10. Now it spends all day doing
| page/sysfile swap from its mechanical hard drive. Basically
| unusable.
|
| So here in my pocket is an 8GB stick of DDR3 sodimm for
| later.
| antisthenes wrote:
| > Throw in more RAM and Windows 10 will likely feel
| snappier than Windows 7 did.
|
| It doesn't and never will. I've used them side by side for
| a few years and went back to W7 for productivity.
|
| Interestingly enough, Lubuntu LXQt feels snappier than
| either system.
| [deleted]
| askafriend wrote:
| Let the caches warm up a little!
| bombcar wrote:
| This is part of it - many things are "fast enough" that
| were you _used_ to have caches that would display nearly
| instantly, now you don 't have those - it reads from disk
| each time it needs to show the folder, etc.
|
| This is very visible in any app that no longer maintains
| "local state" but instead is just a web browser to some
| online state (think: Electron, teams, etc). Disconnect the
| web or slow it down and it all goes to hell.
| xen2xen1 wrote:
| Windows 10 or 11 with 4gb of RAM is a BAD idea. 8 gb is a
| minimum. Found that out several times.
| [deleted]
| moffkalast wrote:
| That's interesting, I cloned a Win10 installation on a HDD to
| a sata SSD a year or two back and the speed difference was
| considerable. Especially something like Atom that took
| minutes to open before was ready to go in like 10 seconds
| afterwards.
|
| A lot of things remained slow though.
| digitallyfree wrote:
| Yeah the change from a 7200 HDD to an SSD for those 10 year
| old machines provides a very considerable improvement. It
| goes from "unusable" to "moderate" performance for general
| web browsing and business duties.
|
| I'm talking about Windows 10 on 4G C2Q or Phenom/Phenom II
| machines - they aren't fast but they're very usable with a
| SSD and GPU in place.
| antisthenes wrote:
| The bigger question is why does a glorified text editor
| take 10 seconds to open on any system?
|
| Is it loading 2000 plugins?
| dataflow wrote:
| You'll want to stop using the new start menu. Use OpenShell.
| It's fast and even better than the old menus.
| speedgoose wrote:
| Old Windows versions run a bit slowly in a web browser:
| https://copy.sh/v86/?profile=windows98 or
| https://bellard.org/jslinux/vm.html?url=win2k.cfg&mem=192&gr...
| nequo wrote:
| I wonder if you'd have any more luck with that hardware
| putting Ubuntu Mate on it. For basic web browsing, it
| probably wouldn't matter much to your family whether it's
| running Windows or Linux.
| jonnycomputer wrote:
| I'm running Ubuntu Mate on a low-end brand-new laptop that
| couldn't handle the Windows OS it shipped with. Couldn't be
| happier.
| [deleted]
| geysersam wrote:
| More important than the language is using the right tool for
| the job. If you are using the scientific Python stack
| correctly, you'll have a difficult time beating it with C++
| for many applications, while producing far simpler and more
| maintainable code.
| [deleted]
| dekhn wrote:
| I don't really find python slow for what I do (typically
| writing UIs around computer vision systems) but also, several
| years back I made a microcontroller-based self-balancing robot.
| It was hard to debug the PID and the sensor, so I replaced it
| with a Pi Zero and the main robot loop ran in python- enough to
| read the accelerometer, compute a PID update, and send motor
| instructions- 100 times a second. If there was a problem (say,
| another heavy process, like computer vision, running on the
| single CPU) it would eventually not respond fast enough and the
| robot would fall over.
|
| Most of the time it's not that you need a faster language, it's
| that you need to write faster code. I was working on a problem
| recently where random.choices was slow but I realized that due
| to the structure of my problem I could convert it to numpy and
| get a 100X speedup.
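|
| Not the original code, but a sketch of the kind of change
| (the population, weights, and sizes here are made up):
| drawing one weighted sample per call in a Python loop versus
| one vectorized NumPy call.
|
|     import random
|     import timeit
|     import numpy as np
|
|     population = list(range(1000))
|     weights = [1.0] * 1000                    # toy weights
|     probs = np.array(weights) / sum(weights)  # numpy wants normalized p
|     rng = np.random.default_rng()
|     N = 100_000
|
|     def loop_choices():
|         # one weighted draw per call; cumulative weights rebuilt each time
|         return [random.choices(population, weights=weights, k=1)[0]
|                 for _ in range(N)]
|
|     def vectorized():
|         # all N draws in a single call
|         return rng.choice(len(population), size=N, p=probs)
|
|     print("random.choices loop:", timeit.timeit(loop_choices, number=1))
|     print("numpy, one call    :", timeit.timeit(vectorized, number=1))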
| yoyohello13 wrote:
| I felt this pretty viscerally recently. I did Advent of Code
| 2021 in python last year. My day job is programming in Python
| so I didn't really think about the execution speed of my
| solutions much.
|
| As a fun exercise this year I've been doing Advent of Code 2020
| in C, and my god it's crazy how much faster my solutions seem
| to execute. These are just little toy problems, but even still
| the speed difference is night and day.
|
| Although, I still find Python much easier to read and maintain,
| but that may just be I'm more experienced with the language.
| agumonkey wrote:
| Remember when people were counting cpu cycles and instruction
| size to ensure performance ?
| SV_BubbleTime wrote:
| I program for embedded... still do that.
| agumonkey wrote:
| find people like you, form a club, write articles and enjoy
| .. 1 reader :)
| kossTKR wrote:
| I hope one day latency in general will be "back to normal".
|
| I still remember how fast console based computing, an old
| gameboy or a 90's macintosh would be - click a button and stuff
| would show up instantly.
|
| There was a tactility present with computers that's gone today.
|
| Today everything feels sluggish - just writing this comment on
| my $3000 Macbook Pro and i can feel the latency, sometimes
| there's even small pauses. A little when i write stuff, a lot
| when i drag windows.
|
| Hopefully the focus on 100hz+ screens in tech in general will
| put more focus on latency from click to screen print - now when
| resolution and interface graphics in general are close to
| biological limits.
| 41b696ef1113 wrote:
| >Hopefully the focus on 100hz+ screens in tech
|
| Come again? I think anything beyond 60hz still qualifies as
| niche. Vendors are still selling 720p laptops.
| kossTKR wrote:
| True, it's probably just bleeding edge, but I've noticed
| several flagship phones have 90Hz, and the new iPad Pros
| have up to 120Hz "smooth scrolling", so it seems something
| will be happening x years down the line.
| theandrewbailey wrote:
| My guess is that few people have stopped to compare them.
| I've never knowingly seen a 100+hz screen in person, so I
| stopped by a local store. Sure enough, I could tell that
| the motion was smoother. Bought 2. After using those, I can
| feel my older monitors that I'm using to write this are
| choppy.
| LoveMortuus wrote:
| But do you notice the smoothness on a day-to-day basis, or
| have you, in a way, crippled yourself, because now the
| majority of monitors feel choppy to you?
|
| Sounds a bit like the, 'Never meet your heroes', thingy.
| LoveMortuus wrote:
| May I ask if you're using the M1 based MacBook or the Intel
| one?
|
| I'm asking because I've been thinking of getting a MacBook
| Air in the future with the intent to use it for writing.
| kossTKR wrote:
| Still on intel. And yes the newer M1's actually feels
| better for writing as far as i've tried..
| BeFlatXIII wrote:
| I have an M1 Air I'm typing on right now and have not
| had any sluggishness concerns besides when switching
| between Spaces. Even that is more of a visual stutter
| instead of actually lagging to the point the animation
| takes longer than usual. This is the first thin & light
| computer I've owned that I'm 100% happy with its
| performance.
| slotrans wrote:
| Switching between spaces on this M1 takes multiple
| seconds. It's almost unbearable.
|
| My 8-core 64GB Windows machine fares no better.
|
| Switching between OLVWM desktops on my 200MHz Pentium Pro
| _twenty years ago_ was instantaneous.
| saagarjha wrote:
| Any idea what it's doing during those several seconds?
| aidos wrote:
| Weird. I don't use Spaces (this is the multiple desktops
| thing, right?) but I've just tried it and it's not laggy
| at all for me. I turn on the reduce motion thing, so it
| fades between them rather than swiping, but neither feel
| laggy.
|
| (I'm on an M1 Air and I think the performance is great)
| fmakunbound wrote:
| I'm afraid we just "deploy more pods" these days
| mrtranscendence wrote:
| > And in the end, the code seems to run "fast enough" and
| nobody involved really notices that what is running in 750ms
| really ought to run in something more like 200us.
|
| Nobody has created a language that is both thousands of times
| faster than Python and nearly as straightforward to learn and
| to use. The closest thing I know of might be Julia, but that
| has its own performance problems and is tied closely to its
| AI/ML niche. Even within that niche I'm certainly not going to
| get most data scientists to write their code in C or C++ (or
| heaven forbid Rust) to solve a performance impediment that
| they've generally been able to work around.
|
| It's great that you've been able to switch to higher-
| performance languages, but not everyone can do that easily
| enough to make it worth doing.
| V1ndaar wrote:
| I don't know, but imo Nim begs to differ.
| stefanos82 wrote:
| > Nobody has created a language that is both thousands of
| times faster than Python and nearly as straightforward to
| learn and to use.
|
| Not Python-based, but Lua-based is Nelua [1]
|
| If you like Lua's syntax, LISP's metaprogramming abilities,
| and C's performance, well there you have it!
|
| [1] https://github.com/edubart/nelua-lang
| morelisp wrote:
| The "iterate from notebook to production" process which is
| common everywhere but the largest data engineering groups
| rules out anything with manual memory management from
| becoming popular with data science work.
|
| Some data scientists I know like (or even love) Scala, but
| that tends to blow up once it's handed over to the data
| engineers as Scala supports too many paradigms and just a
| couple DSs will probably manage to find all of them in one
| program.
|
| We use Go extensively for other things, and most data
| scientists I've worked with sketching ideas in Go liked it a
| lot, but the library support just isn't there, and it's not
| really a priority for any of the big players who are all
| committed to Python wrapper + C/C++/GPU core, or stock Java
| stacks. (The performance also isn't quite there yet compared
| to the top C and C++ libraries, but it's improving.)
| ishjoh wrote:
| I love Scala and wish it were more popular. I've made peace
| with Java at this point as it slowly adopts my favorite
| parts of Scala, but I miss how concise my code was.
| alar44 wrote:
| I think that's my argument. If a developer thinks C or C++ is
| really that difficult and they can only write effectively in
| Python, they're a shitty developer and the world seems to be
| jam packed with them.
| Gibbon1 wrote:
| C# is faster than python and as easy to use.
| monkpit wrote:
| Apparently Jupyter works with .net now. Cool
| dataflow wrote:
| As nice as it is, C# is definitely not as easy to use as
| Python.
| radicalbyte wrote:
| If you're using an IDE (Rider or Visual Studio) and avoid
| the Enterprise frameworks, then it's much easier to use
| than Python. Tooling makes a huge difference, no more
| digging through the sometimes flakey Python documentation
| and cursing compatibility issues with random dependencies
| not supporting Apple Silicon.
| dataflow wrote:
| I agree tooling makes a huge difference but I
| specifically said this with the understanding that you're
| using C# with Visual Studio. Some stuff will be easier in
| C#, but a lot of other stuff _just isn't_ as easy as in
| Python.
|
| At the risk of setting up a strawman for people to punch
| down, try comparing how easy it is to do the equivalent
| of something like this in C#, and feel free to use as
| much IDE magic as you'd like:
|
|     x = [t[1] for t in enumerate(range(1, 50, 4)) if t[0] % 3 == 0][2:]
|
| Was it actually easier?
|
| There's a million other examples I could write here, but
| I'm hoping that one-liner will be sufficient for
| illustration purposes.
| radicalbyte wrote:
| There's very little difference between the two as long as
| you're using modern versions of both and add your own
| functions to fill any API gaps and are using type hinting
| properly in Python. My C# tends to be "larger" because I
| use more vertical whitespace and pylint is rather
| opinionated.. :)
|
| Where you can complain about C# - and I do - is where
| you're having to write (or work with) code which has been
| forced to stick to strict architectural and style
| standards. That makes code-bases which are very hard to
| understand for newbies and are verbose.
|
| On the flip side, once you start doing anything even
| slightly interesting with Python you run into the crappy
| package management. The end result of which is lots of
| frustration getting projects working and a lot of time
| wasted on administration vs work.
| eterm wrote:
| Here's a one-liner in C# for that:
|
|     Enumerable.Range(1,50).Where((x,i) => i % 4 == 0).Where(e => e % 3 == 0).Skip(1).Select(e => e+4)
|
| Okay, so you might consider that last e+4 cheating and
| against the spirit, but I couldn't be bothered to spend
| money upgrading my linqpad to support the latest .net
| with Enumerable.Chunk which makes taking two at a time
| easier for the first part.
|
| Edit: more in spirit:
|
|     Enumerable.Range(1,50).Where(e => e % 4 == 0 && e % 3 == 0).Skip(1).Select(e => e + 1)
| dataflow wrote:
| Nah that part I'm not worried about. The "cheating" is
| omitting the rest of the line. What you really needed was:
|
|     var y = Enumerable.Range(1, 50).Where((x, i) => i % 4 == 0).Where(e => e % 3 == 0).Skip(1).Select(e => e + 4).ToArray();
|
| Compare that against:
|
|     y = [t[1] for t in enumerate(range(1, 50, 4)) if t[0] % 3 == 0][2:]
|
| It's almost twice as long, and doesn't exactly make up
| for it with readability either.
| eterm wrote:
| It's not "twice as long" in any syntactic sense, and
| readability is easily fixed:
|
|     Enumerable.Range(1,50)
|         .Where(e => e % 4 == 0 && e % 3 == 0)
|         .Skip(1)
|         .Select(e => e + 1)
|
| That's very understandable, it's clear what it does, and
| if your complaint is that dotnet prefers to name
| expressions like Skip rather than magic syntax, we can
| disagree on what make things readable and easy to
| maintain.
| dataflow wrote:
| It's literally "twice as long" syntactically. 120 vs. 67
| characters.
|
| And again, you keep omitting the rest of the line. (Why?)
| What you should've written in response was:
|
|     var y = Enumerable.Range(1,50)
|         .Where(e => e % 4 == 0 && e % 3 == 0)
|         .Skip(1)
|         .Select(e => e + 1)
|         .ToArray();
|
| Compare:
|
|     y = [t[1] for t in enumerate(range(1, 50, 4)) if t[0] % 3 == 0][2:]
|
| And (again), my complaint isn't about LINQ or numbers or
| these functions in particular. This is just a tiny one-
| liner to illustrate with one example. I could write a ton
| more. There's just stuff Python is better at, there's
| other stuff C# is better at, that's just a fact of life.
| I switch between them depending on what I'm doing.
| eterm wrote:
| The ToArray is unneccessay, it's much more idiomatic
| dotnet to deal with IEnumerable all the way through.
|
| The only meaningful difference in lengths is that C#
| doesn't have an Enumable.Range(start, stop, increment)
| overload but it's easy enough to write one, and then it'd
| be essentially the same length.
| dataflow wrote:
| "Unnecessary"? You can't just change the problem! I was
| asking for the equivalent of some particular piece of
| code using a list, not a different one using a generator.
| Sometimes you want a generator, sometimes you want an
| array. In either language.
| eterm wrote:
| This is a silly argument, you're asking for a literal
| translation of a pythonic problem without allowing the
| idioms from the other languages.
|
| If you were actually trying to solve the problem in
| dotnet, you'd almost certainly structure it as the
| Queryable result and then at the very end after composing
| run ToList, or ToArray or consume in something else that
| will enumerate it.
|
| We can also shorten it further to:
|
|     Enumerable.Range(1, 50)
|         .Where(e => e % 12 == 1)
|         .Skip(2)
|         .ToList()
|
| Now even including the ToList it's now just four basic
| steps:
|
| Range, Filter, Skip, Enumerate.
|
| Those are the very basics, all one line if wanted. It
| doesn't get much more basic than that, and I'd still
| argue it's easier for someone new to programming to see
| what's going on in the C# than the python example.
|
| edit: realised the maths simplifies it even further.
| radicalbyte wrote:
| There's not a lot of difference if you use the query
| syntax in C# (assuming you add an overload to
| Enumerable.Range() to take the skip) - only no-one uses
| that because it's ugly. Also really nice that the types
| are checked + shown by tooling, as is the syntax.
|
| I use Python a lot for scripting - what it lacks in speed
| of development/runtime it gains in being more accessible
| to amateurs and having less "enterprise" style libraries
| (particularly with cryptographic libraries, MS abstract
| way too much whilst Python just has thin wrappers around
| C). That makes Python a strong scripting language for me.
| PyCharm is really nice too.
|
| For real work? C# is better as long as you have either VS
| or Rider. Really dislike the VS Code experience (these
| JS-based editors are _slow_ and nowhere near as nice a
| Rider) so then I can understand why people would avoid
| it.
| dmitriid wrote:
| What you're missing is that the C# example works on _any
| Enumerable_. And it's very hard to explain how damn
| important and impressive this is without trying it first.
|
| Yes, it's more verbose, but I can swap that initial array
| for a List, or a collection, or even an external async
| datasource, and my code will not change. It will be the
| same Select.Where....
| dataflow wrote:
| > What you're missing
|
| I'm not missing it.
|
| > is that C# example works on any Enumerable. And it's
| very hard to explain how damn important and impressive
| this is without trying it first.
|
| Believe me I've tried (by which I mean used it a ton).
| I'm not a newbie to this. C# is great. Nobody was saying
| it's unimportant or unimpressive or whatever.
|
| > Yes, it's more verbose, but I can swap that initial
| array for a List, or a collection, or even an external
| async datasource, and my code will not change
|
| Excellent. And when you want that flexibility, the
| verbosity pays off. When you don't, it doesn't. Simple as
| that.
| dmitriid wrote:
| > Excellent. And when you want that flexibility, the
| verbosity pays off. When you don't, it doesn't. Simple as
| that.
|
| It's rarely as simple as that. For example, this entire
| conversation started with "At the risk of setting up a
| strawman for people to punch down, try comparing how easy
| it is to do the equivalent of something like this".
|
| And this became a discussion of straw men :) Because I
| could just as easily come up with "replace a range of
| numbers with data that is read from a database or from
| async function that then goes through the same
| transformations", and the result might not be in Python's
| favor.
| Jtsummers wrote:
| If I understand dataflow's example correctly you don't
| need the Select at the end:
|
|     var x = Enumerable.Range(1,50)
|         .Where((num, index) => num % 4 == 1 && index % 3 == 0)
|         .Skip(2)
|         .ToArray();
|
| That computes the same thing as their Python snippet:
| [25,37,49]. Of course, what this is actually computing is
| whether the number is congruent to 1 modulo 4 and 3 so it
| was a weird example, but here's how you'd really want to
| write it (since a number congruent to 1 modulo 4 and 3 is
| the same as being congruent to 1 modulo 12):
|
|     var x = Enumerable.Range(1,50)
|         .Where(num => num % 12 == 1)
|         .Skip(2)
|         .ToArray();
|
| Rewriting that Python example to be a bit clearer for a
| proper one-to-one comparison:
|
|     y = [t for t in range(1, 50, 4) if t % 3 == 1][2:]
|
| That _enumerate_ wrapper was unnecessary. I don't recall
| a way, in LINQ, to generate only every 4th number in a
| range, but I also haven't used C# in a few years so my
| memory is rusty on LINQ anyways.
| dataflow wrote:
| > num % 12
|
| > That _enumerate_ wrapper was unnecessary.
|
| I'm surprised you didn't go all the way and just write
| x = [25, 37, 49]
|
| and tell me the rest of the code was unnecessary!
| Jtsummers wrote:
| I mean, was it necessary? Your original Python expression
| was pretty obfuscated for such a simple calculation.
| dataflow wrote:
| Are you actually suggesting I didn't realize I could've
| written x = [25, 37, 49], or what?
|
| Surely the point of the example wasn't "find the optimal
| way to calculate that particular list of numbers"?
| Jtsummers wrote:
| No, I'm suggesting that your original example was a great
| example of obfuscated Python. Even supposing that you
| wanted to alter the total number of values generated and
| the number of initial values to skip, you're doing
| unnecessary work and made it more convoluted than
| necessary:
|
|     def some_example(to_skip=2, total_count=3):
|         return [n * 12 + 1 for n in range(to_skip, to_skip + total_count)]
|
| There you go. Change the variable names that I spent < 1
| second coming up with and that does exactly the same
| thing without the enumeration or discarding values. In a
| thread on how computer speed is wasted on unnecessary
| computation, it seems silly that you're arguing in favor
| of unnecessary work and obfuscated code.
| 369548684892826 wrote:
| > LINQ, to generate only every 4th number in a range
|
| Maybe something like this?
| Enumerable.Range(0,49).Select(x => 4*x + 1)
| Jtsummers wrote:
| Yeah, that would work, throw it before the _Where_ clause
| and change 49. _Range_ here doesn't specify a stopping
| point, but a count of generated values (this makes it not
| quite the same as Python's _range_). So you'd want:
|
|     Enumerable.Range(0,13).Select(x => 4 * x + 1).Where((e, i) => i % 3 == 0).Skip(2)
|
| And that's equivalent to the original, short of writing a
| _MyRange_ that combines the first _Range_ and _Select_.
| Still an awful lot of work for generating 3 numbers.
| eterm wrote:
| You're right, the maths simplifies it a lot. I rushed out
| a one-liner without much analysis, and eventually come to
| the same conclusion.
|
| There's no Range method that takes (start, stop, step)
| but it's trivial enough to write one, it's a single for
| loop and yield return statement.
|
| We can even trigger the python users by doing it in one
| line ;)
|
|     public static class CustomEnumerable { public static IEnumerable<Int32> Range(int start, int stop, int step) { for (int i = start; i < stop; i += step) yield return i; } }
|
| Try writing your function definitions on one line in
| python!
| pelorat wrote:
| But who compares Python with C#, they are not even in the
| same league? Python is a glorified bash scripting
| replacement with a mediocre JIT engine. Modern C# is
| faster than Go which is what it is competing against.
| radicalbyte wrote:
| I'd argue easier and it plays _much_ better cross CPU than
| Python. Once you pass the initial JIT phase it 's also
| extremely fast.
|
| As the sister post says: Go is in the same class as C# only
| it's a bit verbose/ugly in comparison but it compiles to
| native machine code..
| fatnoah wrote:
| As a long-time C# user who started life with coding for
| embedded systems with C, graduated to C++ business tiers,
| and then on to C#, my personal crusade has always been to
| show that it's very possible to make things go pretty fast
| with C#.
|
| One of my favorite moments happened after my C#-based back-
| end company was acquired by an all-[FASTER LANGUAGE]
| company. We had to connect our platforms and hit a shared
| performance goal of supporting 1 billion events/month,
| which amounted to something like (IIRC) 380 per second. Our
| platform hit that mark running on 3 server setup w/2 Amazon
| Medium FE servers and a SQL backend. The other company's
| bits choked at 10 per second, running on roughly 50x the
| infra.
|
| Poorly written and architected code is a bigger drag than
| the specific language in many cases.
| divan wrote:
| Go hit a really sweet spot here.
| Yajirobe wrote:
| Python allows one to save development time in exchange for
| execution time
| wizofaus wrote:
| Except as a developer I lose lots of time if I have to wait
| long for my code (esp. Unit tests) to run. Having said that
| larger projects in C/C++ are often very slow to build (esp.
| if dependencies are not well defined and certain header files
| affect huge numbers of source files - a problem that doesn't
| exist with higher level languages). But even if using a
| particular language and framework saves developer time, it
| rarely seems to translate into developers using that saved
| time to bother optimizing where it might really count.
| zasdffaa wrote:
| Get beyond a certain size of python program and you lose dev
| time.
|
| IOW you lose both. It's not a huge size either.
| Yajirobe wrote:
| Like... youtube?
| zasdffaa wrote:
| could you elaborate please - is all of youtube including
| the streaming, all python?
| pavon wrote:
| I've not found that to be the case. The first draft might get
| done faster, but then I spend more time debugging issues in
| dynamic languages that only show up at runtime that the
| compiler would find in other languages. And then more time
| optimizing the code, adding caching, moving to more advanced
| algorithms, and rewriting parts in C just to get it to run at
| a reasonable speed when the naive approach I implement in
| other languages is fast enough on first try.
|
| For most tasks, modern mid-level statically typed languages
| like C#, Go, Kotlin really are the sweet spot for
| productivity. Languages like Python, Ruby and JS are a false
| economy that appear more productive than they really are.
| david422 wrote:
| But it just comes back to bite you later on maintenance
| costs.
| stonogo wrote:
| That's only an excuse if you're sociopathically profit-
| oriented. The program is developed orders of magnitude fewer
| times than it is run. Shitty performance, like pollution, is
| an externality that can be ignored but should not.
| geysersam wrote:
| Shitty performance certainly is bad, but it is _not_ an
| externality like emissions into the atmosphere. The
| fundamental difference is that the customer (and only the
| customer) is harmed by bad performance, while emissions
| harms everyone.
| morelisp wrote:
| Python hasn't saved me development time since distutils was
| the right and only way to build things.
| pointernil wrote:
| Maybe it's been stated already by someone else here but I really
| hope that CO2 pricing on the major Cloud platforms will help with
| this. It boils down to resources used (like energy) and waste/CO2
| generated.
|
| Software/System Developers using 'good enough' stacks/solutions
| are externalising costs for their own benefit.
|
| Making those externalities transparent will drive alot of the
| transformation needed.
| vjerancrnjak wrote:
| Hmm, interesting that single threaded C++ is 25% of Python exec
| time. It feels like C++ implementation might have area for
| improvement.
|
| My usual 1-to-1 translations result in C++ being 1-5% of Python
| exec time, even on combinatorial stuff.
| bbojan wrote:
| I recently ported some very simple combinatorial code from
| Python to Rust. I was expecting around 100x speed up. I was
| surprised when the code ended running only 14 times faster.
| remuskaos wrote:
| Did you use python specific functions like list
| comprehensions, or "classic" for/while loops? Because I've
| found the former to be surprisingly fast, while naive for
| loops are incredibly slow in python.
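|
| A quick way to see the gap on a toy case (a sketch; the exact
| ratio varies by workload and interpreter version):
|
|     import timeit
|
|     N = 1_000_000
|
|     def squares_loop():
|         out = []
|         for i in range(N):          # bytecode dispatched every iteration
|             out.append(i * i)
|         return out
|
|     def squares_comprehension():
|         return [i * i for i in range(N)]   # tighter, specialised loop
|
|     print("for loop     :", timeit.timeit(squares_loop, number=10))
|     print("comprehension:", timeit.timeit(squares_comprehension, number=10))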
| momojo wrote:
| Aren't most of the python primitives implemented in C?
| SekstiNi wrote:
| Just to be sure, you did compile the Rust program using the
| --release flag?
| alpaca128 wrote:
| Parts of Python are implemented in C. For example the
| standard for loop using `range`. So when comparing Python's
| performance with other languages using just one simple
| benchmark can lead to unexpected results depending on what
| the program does.
| tomrod wrote:
| Interesting! This is usually touted as an antipattern.
| aaaaaaaaaaab wrote:
| Developers should be mandated to use artificially slow machines.
| varispeed wrote:
| I wonder if eventually there is going to be consideration for
| environment required when building software.
|
| For instance running unoptimised code can eat a lot of energy
| unnecessarily, which has an impact on carbon footprint.
|
| Do you think we are going to see regulation in this area akin to
| car emission bands?
|
| Even to an extent that some algorithms would be illegal to use
| when there are more optimal ways to perform a task? Like using
| BubbleSort when QuickSort would perform much better.
| kzrdude wrote:
| Well, there is some rumbling about making proof of work
| cryptocurrencies illegal, and that falls under this topic.
|
| To some extent they can claim to deliver a unique feature where
| there is no replacement for the algorithm they are using.
| jcelerier wrote:
| > Do you think we are going to see regulation in this area akin
| to car emission bands?
|
| it has thankfully started: https://www.blauer-
| engel.de/en/productworld/resources-and-en...
|
| I think KDE's Okular has been one of the first certified
| software :-)
| [deleted]
| mg wrote:
| Good example is this high performance Fizz Buzz challenge:
|
| https://codegolf.stackexchange.com/questions/215216/high-thr...
|
| An optimized assembler implementation is 500 times faster than a
| naive Python implementation.
|
| By the way, it is still missing a Javascript entry!
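|
| For scale, the naive Python baseline is roughly this (a
| sketch, not one of the submitted entries; the challenge is
| scored by output throughput):
|
|     import sys
|
|     # naive FizzBuzz: one write per line, no buffering tricks
|     def fizzbuzz(n):
|         write = sys.stdout.write
|         for i in range(1, n + 1):
|             if i % 15 == 0:
|                 write("FizzBuzz\n")
|             elif i % 3 == 0:
|                 write("Fizz\n")
|             elif i % 5 == 0:
|                 write("Buzz\n")
|             else:
|                 write(f"{i}\n")
|
|     fizzbuzz(10_000_000)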
| porcoda wrote:
| Yup. We have gotten into the habit of leaving a lot of potential
| performance on the floor in the interest of
| productivity/accessibility. What always amazes me is when I have
| to work with a person who only speaks Python or only speaks JS
| and is completely unaware of the actual performance potential of
| a system. I think a lot of people just accept the performance
| they get as normal even if they are doing things that take 1000x
| (or worse) the time and/or space than it could (even without
| heroic work).
| contravariant wrote:
| While I would certainly welcome _awareness_ when it comes to
| performance it 's not always useful to make something 1000x
| faster if it takes even as little as 25% longer to develop.
| Taking an extra day to make something take 1s instead of an
| hour is just not always worth it.
|
| Though I will never understand webpages that use more code than
| you'd reasonably need to implement a performant lisp compiler
| and build the webpage in that (not that I'm saying that's what
| they should have done, I just don't understand how they use
| _more_ code)
| egypturnash wrote:
| It depends on how often you need to do the thing and how long
| it takes to do it. There's an XKCD that's just a chart of
| that.
|
| Sadly any concept of performance seems to completely go out
| the window for most programmers once it leaves their hands;
| taking 2-3x longer to write a performant app in a compiled
| language would save a _ton_ of time and cycles on users'
| machines but Programmer Time Is Expensive, let's just shit
| out an Electron app, who cares that it's five orders of
| magnitude larger and slower.
| spatley wrote:
| Agreed that switching to lower level languages give the
| potential of many orders of magnitude. But the thing that was
| most enlightening was that removing pandas made a 9900%
| increase in speed without even a change to language. 20 minutes
| down to 12 seconds is a very big deal, and I still don't have
| to remember how to manage pointers.
| twobitshifter wrote:
| I think that should be emphasized. The rest of the
| optimizations are entirely unneeded and added complexity to
| the code base. The next guy to work on this needs to be a cpp
| dev, but the requirements were only asking for 500ms which
| was more than met by the first fix. What was the payoff of
| this added performance, and at what cost?
| kzrdude wrote:
| In some cases you can add pandas to a solution and speed it
| up by 10 seconds. It's about using the tools right
| kaba0 wrote:
| I don't believe orders of magnitude is achievable in general.
| Even python, which is perhaps the slowest mainstream language
| clocks in at around 10x that of C.
|
| Sure, there will be some specialized program where keeping
| the cache manually small you can achieve big improvements,
| but most mainstream managed languages have very great
| performance. The slowdown is caused by the frameworks and
| whatnot, not the language itself.
| fleddr wrote:
| I think it's even stronger than a habit. When you're exposed to
| the typical "performance" of the web and apps for a decade or
| so, you may have forgotten about raw performance entirely.
| Young people may have never experienced it at all.
|
| I once owned a small business server with a Xeon processor,
| Linux installed. Just for kicks I wrote a C program that would
| loop over many thousands of files, read their content, sort in
| memory, dump into a single output file.
|
| I ran the program and as I ran it, it was done. I kept upping
| the scope and load but it seems I could throw anything at it
| and the response time was zero, or something perceived as zero.
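|
| For reference, the whole job - walk a directory, read every
| file, sort all the lines in memory, write one combined output
| file - is only a few lines even in Python (a rough sketch,
| not the original C):
|
|     import sys
|     import time
|     from pathlib import Path
|
|     def collect_and_sort(src_dir, out_file):
|         lines = []
|         for path in Path(src_dir).rglob("*"):
|             if path.is_file():
|                 lines.extend(path.read_text(errors="replace").splitlines())
|         lines.sort()
|         Path(out_file).write_text("\n".join(lines))
|         return len(lines)
|
|     if __name__ == "__main__":
|         start = time.perf_counter()
|         n = collect_and_sort(sys.argv[1], sys.argv[2])
|         print(f"{n} lines in {time.perf_counter() - start:.2f}s")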
|
| Meanwhile, it's 2022 and we can't even have a text editor place
| a character on screen without noticeable lag.
|
| Shit performance is even ingrained in our culture. When you
| have a web shop with a "submit order" button, if you'd click it
| and would instantly say "thanks for your order", people are
| going to call you. They wonder if the order got through.
| Tostino wrote:
| In my SAAS app, we have a few artificial delays to ensure all
| "background tasks" that pop up a progress dialog take at
| least long enough to show that to the user.
| xedrac wrote:
| In my experience, this wouldn't be needed if the rest of
| the app ran at native speed. There would already be a
| natural delay that would be noticed by the user.
| smolder wrote:
| This is a tangent but there are other, arguably better ways
| to give the user confidence the order took place in your
| example. You could show the line items to them again with
| some indicators of completion, or show the order added to an
| excerpt of their order history, where perhaps they can
| tap/click to view line items. Something like that is a bit
| more convincing than just thank-you text even with the delay,
| IMO, though it may be tougher to pull off design-wise.
| vladvasiliu wrote:
| > I think a lot of people just accept the performance they get
| as normal even if they are doing things that take 1000x (or
| worse) the time and/or space than it could (even without heroic
| work).
|
| Habit is a very powerful force.
|
| Performance is somewhat abstract, as in "just throw more CPUs
| at it" / it works for me (on my top of the line PC). But people
| will happily keep on using unergonomic tools just because
| they've always done so.
|
| I work for a shop that's mainly Windows (but I'm a Linux guy).
| I won't even get into how annoying the OS is and how
| unnecessary, since we're mostly using web apps through Chrome.
| But pretty much all my colleagues have no issue with using VNC
| for remote administration of computers.
|
| It's so painful, it hurts to see them do it. And for some
| reason, they absolutely refuse to use RDP (I'm talking about
| local connections, over a controlled network). And they don't
| particularly need to see what the user in front of the computer
| is seeing, they just need to see that some random app starts or
| something.
|
| I won't even get into Windows Remote Management and controlling
| those systems from the comfort of their local terminal with 0
| lag.
|
| But for some reason, "we've always done it this way" is
| stronger than the inconvenience through which they have to
| suffer every day.
| tomrod wrote:
| It's the downside to choosing boring tech. It costs
| believable dollars to migrate and unbelievable dollars to
| keep course. There is a happy medium, I believe, that is
| better than "pissing away the competitive edge."
| nonameiguess wrote:
| It's interesting to me that two of the top three comments right
| now are talking about gaining performance benefits by switching
| from Python to C when the actual article in the link claims he
| gained a speedup by pulling things out of pandas, which is
| written in C, and using normal Python list operations.
|
| I would like to see all of the actual code he omitted, because
| I am skeptical how that would happen. It's been a while since
| I've used pandas for anything, but it should be pretty fast.
| The only thing I can think is he was maybe trying to run an
| apply on a column where the function was something doing Python
| string processing, or possibly the groupby is on something that
| isn't a categorical variable and needs to be converted on the
| fly.
| dragonwriter wrote:
| > the actual article in the link claims he gained a speedup
| by pulling things out of pandas, which is written in C, and
| using normal Python list operations.
|
| Well, he claims he did three things:
|
| (1) avoid repeating a shared step every time the aggregate
| function was called,
|
| (2) unspecified algorithmic optimizations.
|
| (3) use Python lists instead of pandas dataframes.
|
| (1) is a win that doesn't have anything to do with pandas vs
| python list ops, (2) is just skipped over any detail but
| appears to be the meat of the change. Visually, it looks like
| most of the things the pandas code tries to do just aren't
| done in the revised code (it's hard to tell because some is
| hidden behind a function whose purpose and implementation are
| not provided). It's not at all clear that the move out of
| pandas was necessary or particularly relevant.
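|
| A hypothetical sketch of what (1) tends to look like (the
| names and the toy weights are illustrative, not the author's
| code): the shared setup runs once per group in the slow
| version and once in total in the fast one.
|
|     import numpy as np
|     import pandas as pd
|
|     def initialise_weights(df: pd.DataFrame) -> np.ndarray:
|         # stand-in for an expensive setup step shared by all groups
|         return np.linspace(0.0, 1.0, num=len(df.columns))
|
|     def score(group: pd.DataFrame, weights: np.ndarray) -> float:
|         return float((group.to_numpy() * weights).sum())
|
|     def slow(df: pd.DataFrame) -> dict:
|         out = {}
|         for key, group in df.groupby(level=0):
|             weights = initialise_weights(df)   # repeated for every group
|             out[key] = score(group, weights)
|         return out
|
|     def fast(df: pd.DataFrame) -> dict:
|         weights = initialise_weights(df)       # hoisted: computed once
|         return {key: score(group, weights)
|                 for key, group in df.groupby(level=0)}
|
|     df = pd.DataFrame(np.random.rand(10_000, 4),
|                       index=np.random.randint(0, 100, size=10_000))
|     assert slow(df) == fast(df)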
| LAC-Tech wrote:
| I don't think we can really blame slow languages.
|
| Implementations of languages like javascript, ruby - and I
| would presume python and php - are a lot faster than they used
| to be.
|
| I think most slowness is architectural.
| jacobolus wrote:
| > _only speaks Python or only speaks JS and is completely
| unaware of the actual performance potential of a system_
|
| If you stick to only doing arithmetic and avoid making lots of
| small objects, javascript engines are pretty fast (really!).
| The tricky part with doing performance-sensitive work in JS is
| that it's hard to reason about the intricacies of JITs and
| differences between implementations and sometimes subtle
| mistakes will dramatically bonk performance, but it's not
| impossible to be fast.
|
| People building giant towers of indirection and never bothering
| to profile them is what slows the code down, not running in JS
| per se.
|
| JS, like other high-level languages, offers convenient features
| that encourage authors to focus on code clarity and concision
| by building abstractions out of abstractions out of
| abstractions, whereas performance is best with simple for loops
| working over pre-allocated arrays.
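|
| You can see the same contrast even outside JS; here's a toy
| Python version of it (made-up workload, just to illustrate the
| principle): a pile of small wrapper objects plus a layer of
| method dispatch versus one flat loop over a pre-allocated
| array.
|
|     import array
|     import timeit
|
|     N = 1_000_000
|     data = array.array("d", range(N))  # flat, pre-allocated
|
|     class Box:
|         # A tiny per-element wrapper object.
|         def __init__(self, v):
|             self.v = v
|         def squared(self):
|             return self.v * self.v
|
|     def layered():
|         # Many small objects plus a layer of method dispatch.
|         return sum(Box(x).squared() for x in data)
|
|     def flat():
|         # One simple loop over the pre-allocated array.
|         total = 0.0
|         for x in data:
|             total += x * x
|         return total
|
|     print(timeit.timeit(layered, number=3))
|     print(timeit.timeit(flat, number=3))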
| physicsguy wrote:
| The return value of the function in C++ is of the wrong type :)
|
| I agree though. I used these tricks a lot in scientific
| computing. Go to the world outside and people are just unaware.
| With that said - there is a cost to introducing those tricks,
| whether in needing your team to learn new tools and
| techniques, in maintaining the build process across different
| operating systems, etc. Python extension modules on Windows,
| for example, are still a PITA if you're not able to use Conda.
| modeless wrote:
| Now do it on the GPU. There's at least a factor of 10 more there.
| And a lot of things people think aren't possible with GPUs are
| actually possible.
| ineedasername wrote:
| Slow Code Conjecture: inefficient code slows down computers
| incrementally such that any increase in computer power is offset
| by slower code.
|
| This is for normal computer tasks -- browser, desktop
| applications, UI. The exception to this seems to be tasks that
| were previously bottlenecked by HDD speeds, which have been
| much improved by solid state disks.
|
| It amazes me, for example, that keeping a dozen miscellaneous
| tabs open in Chrome will eat roughly the same amount of idling
| CPU time as a dozen tabs did a decade ago, while RAM usage is
| 5-10x higher.
| eterm wrote:
| It's hard to evaluate this article without seeing the detail
| of the "algorithm_wizardry"; there's no detail at exactly the
| point where it would be most interesting.
| geph2021 wrote:
| The author says: "The function looks something like this:"
|
| And then shows some grouping and sorting functions using
| pandas.
|
| Then he says: "I replaced Pandas with simple python lists and
| implemented the algorithm manually to do the group-by and
| sort."
|
| I think the point of the first optimization is that you can do
| the relatively expensive group/sort operations without pandas
| and improve performance. For the rest of the article it's just
| "algorithm_wizardry", which no longer deals with that portion
| of the code.
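|
| For what it's worth, the group-by and sort he describes is
| only a few lines of plain Python; a minimal sketch with
| invented field names (not the author's code):
|
|     from collections import defaultdict
|
|     # rows: (key, value) tuples instead of a DataFrame.
|     rows = [("b", 3.0), ("a", 1.5), ("b", 2.0), ("a", 4.0)]
|
|     # Group-by: bucket values per key with a dict.
|     groups = defaultdict(list)
|     for key, value in rows:
|         groups[key].append(value)
|
|     # Aggregate each group, then sort by the aggregate.
|     totals = [(k, sum(v)) for k, v in groups.items()]
|     totals.sort(key=lambda kv: kv[1], reverse=True)
|
|     print(totals)  # [('a', 5.5), ('b', 5.0)]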
| eterm wrote:
| We never get a good sense of how much time was actually saved
| with that change not least because the original function
| calls "initialise weights" inside every loop, the new
| function does not. It would have been interesting to see what
| difference that alone made.
|
| The takeaway of the article, that computers are blindingly
| fast and that we make them spend most of their time doing
| unnecessary work (or sitting around waiting on I/O), is of
| course true.
|
| I'm currently writing a utility to do a basic benchmark of
| data structures and I/O, and it's been a real learning
| experience for me in just how fast computers can be, but also
| in just how much a little bit of overhead or contention can
| slow things down. That's better left for a full write-up
| another day.
| geph2021 wrote:
| > We never get a good sense of how much time was actually
| saved with that change not least because the original
| function calls "initialise weights" inside every loop, the
| new function does not.
|
| Good point. Further to that point, I would assume a library
| like pandas has fairly well optimized group and sort
| operations. It would not occur to me that pandas is the
| bottleneck, but the author does clarify in his footnote that
| pandas operations, by virtue of creating more complex pandas
| objects, can indeed be a bottleneck:
|
| > [1] Please don't get me wrong. Pandas is pretty fast for a
| typical dataset but it's not the processing that slows down
| pandas in my case. It's the creation of Pandas objects
| itself which can be slow. If your service needs to respond
| in less than 500ms, then you will feel the effect of each
| line of Pandas code.
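|
| That footnote is easy to sanity-check with toy data (my own
| snippet, not the author's benchmark); the point is that just
| materializing the DataFrame has a fixed cost that matters
| inside a sub-500ms budget:
|
|     import timeit
|     import pandas as pd
|
|     # Toy payload: 1,000 small records per "request".
|     records = [{"id": i, "score": i * 0.5} for i in range(1_000)]
|
|     # Cost of just building the DataFrame, before any work.
|     build = timeit.timeit(
|         lambda: pd.DataFrame(records),
|         number=100) / 100
|
|     # For comparison: touching every record in plain Python.
|     scan = timeit.timeit(
|         lambda: max(r["score"] for r in records),
|         number=100) / 100
|
|     print(f"DataFrame build: {build * 1000:.3f} ms")
|     print(f"plain list scan: {scan * 1000:.3f} ms")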
| user_7832 wrote:
| I wonder how much power (and resulting CO2 emissions) could be
| saved if all code had to go through such optimization.
|
| And on a slightly ranty note, Apple's A12z and A14 are still
| apparently "too weak" to run multiple windows simultaneously :)
| MR4D wrote:
| That's a RAM issue, not a processor issue. At least, that's
| according to Apple:
|
| https://appleinsider.com/articles/22/06/11/stage-manager-for...
| lynguist wrote:
| The funny thing is that actual Macs with 4GB RAM are also
| supported and they can run the entire desktop environment.
| aaaaaaaaaaab wrote:
| That's the biggest bullshit ever.
| david38 wrote:
| Worse CO2 emissions. You think optimizations are energy free?
| tomrod wrote:
| Many can be Pareto improvements over the current state.
| hashingroll wrote:
| Not all optimizations are more energy consuming. For an
| analogy: does using a car consume more energy than using a
| bicycle? Yes. But using a bicycle does not consume more
| energy than running on foot.
| TheDong wrote:
| I think it depends; it could be either worse or better.
|
| Some code is compiled more often than it is run, and some
| code is run more often than it's compiled.
|
| If you can spend 100k operations per compilation to save 50k
| operations at runtime on average... That'll probably be a net
| positive for chromium or glibc functions or linux syscalls,
| all of which end up being run by users more often than they
| are built by developers.
|
| If it's 100k operations at build-time to remove 50k
| operations from a test function only hit by CI, then yeah,
| you'll be in the hole 50k operations per CI run.
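|
| The break-even is just the ratio of the two costs; a
| back-of-the-envelope sketch (the run counts beyond the
| 100k/50k above are invented):
|
|     def net_ops(compile_cost, saved_per_run, runs_per_build):
|         # Positive means cost exceeds savings at this run count.
|         return compile_cost - saved_per_run * runs_per_build
|
|     # CI-only case: build each run, save 50k once -> +50_000
|     print(net_ops(100_000, 50_000, runs_per_build=1))
|     # Widely-run code: one build, many runs -> -400_000
|     print(net_ops(100_000, 50_000, runs_per_build=10))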
|
| All of this ignores the human cost; I don't really want to
| try (and fail) to approximate the CO2 emissions of converting
| coffee to performance optimizations.
___________________________________________________________________
(page generated 2022-06-16 23:00 UTC)