[HN Gopher] Magic-trace - High-resolution traces of what a proce...
___________________________________________________________________
Magic-trace - High-resolution traces of what a process is doing
Author : cgaebel
Score : 559 points
Date : 2022-04-22 13:30 UTC (9 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| Sirened wrote:
| Can I ask an office hours type question?
|
| I worked on a very similar (if not identical lol) project at a
| job once upon a time and the biggest problem I had (and one that
| I never really solved well) was recovering call stacks from trace
| data. I effectively ended up using DWARF and just simulating
| execution and keeping a call stack in the decoder. This mostly
| worked fine for small and simple programs, but I ran into SO MUCH
| trouble because I found that (at least on my generation of cores)
| IPT actually overflows and drops packets very frequently if you
| have too many calls/returns too quickly. This is largely not an
| issue for C code but once you start getting into more dynamic
| languages with fancy features, IPT cannot keep up. Once packets
| get dropped, the entire call stack for the entire rest of the
| thread is ruined since you have no idea who called/returned in
| the dropped packets.
|
| One option that we had but didn't really chase down due to time
| was maybe combining IPT with low frequency stack traces so that
| we can both just reset every so often and, if needed, work
| backwards/apply heuristics in order to arrive at that next
| callstack.
|
| How did y'all manage this? Your call stacks look totally correct
| and I'm very impressed :)
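A rough sketch of the decoder approach described above (all names are hypothetical; a real Intel PT decoder must first resolve compressed branch packets against the binary before anything can be classified as a call or return):

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// One decoded control-flow event (hypothetical representation).
enum class EventKind { Call, Return, PacketLoss };

struct Event {
    EventKind kind;
    std::string symbol;  // callee name, for Call events
};

// Replays events while keeping a shadow call stack. On packet loss the
// stack is invalidated, mirroring the failure mode described above:
// every later frame is suspect until the stack is re-seeded (e.g. from
// a low-frequency sampled stack trace).
class StackDecoder {
public:
    void feed(const Event& e) {
        switch (e.kind) {
        case EventKind::Call:
            if (valid_) stack_.push_back(e.symbol);
            break;
        case EventKind::Return:
            // Returns past the start of what we've seen are dropped.
            if (valid_ && !stack_.empty()) stack_.pop_back();
            break;
        case EventKind::PacketLoss:
            stack_.clear();
            valid_ = false;  // unknown who called/returned meanwhile
            break;
        }
    }

    // Re-seed from an externally sampled stack to resume decoding.
    void reseed(std::vector<std::string> sampled) {
        stack_ = std::move(sampled);
        valid_ = true;
    }

    std::optional<std::vector<std::string>> stack() const {
        if (!valid_) return std::nullopt;
        return stack_;
    }

private:
    std::vector<std::string> stack_;
    bool valid_ = true;
};
```

The `reseed` hook is exactly the "combine IPT with low-frequency stack traces" idea floated in the comment: a sampled stack gives the decoder a fresh, trusted starting point after an overflow.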
| gigel82 wrote:
| "for Linux", just add that in the title.
|
| We shouldn't need to scroll through pages of text to figure that
| out.
| neilv wrote:
| I suppose it's easy to forget to mention, when doing open
| source development, and the target platform is also open
| source. Especially if it's Linux, which is the default.
| electroly wrote:
| Looks really cool. I'm sad that I'll never know; Intel + Linux +
| non-VM excludes literally every computer I have access to. We're
| all AMD + Windows, and my only access to Intel Linux machines
| would be a cloud VM.
| bentcorner wrote:
| Windows (AMD or Intel) has Time Travel Debugging:
| https://docs.microsoft.com/en-us/windows-
| hardware/drivers/de..., although this has extremely high
| performance overhead.
|
| There's also the regular performance traces you can capture
| with wpr and friends. I don't think these provide function-
| level traces, and I also don't think it's possible to do that
| (but I could be wrong). You just get sampled callstacks, which
| may or may not be enough for your needs.
|
| In my experience on Windows you need to instrument applications
| to get function-level tracing.
| fulafel wrote:
| Apparently it works with virtualization at least with KVM
| according to https://github.com/janestreet/magic-
| trace/wiki/Supported-pla... &
| https://github.com/janestreet/magic-trace/wiki/How-could-
| mag......
| jeffbee wrote:
| You can use PMUs in the cloud. Amazon's c6i.metal instance type
| is an Intel Ice Lake Xeon with full PMU access, for example.
| electroly wrote:
| A simple answer that solves the problem, thank you!
| tanelpoder wrote:
 | Also, KVM and VMware (and maybe Xen?) allow PMU access, but
 | VirtualBox does not. Both VMware ESXi and VMware
 | Workstation/Fusion allow PMU access; you just need to make
 | sure it's enabled in the VM settings.
|
| A quick way to check if PMU access is enabled is this:
| dmesg | grep "Performance Events"
|
 | Edit: Oh, unfortunately VMware Fusion 12 (on Intel Macs) no
 | longer exposes the performance counters to the VM, as it now
 | uses the macOS Hypervisor framework instead of its own kernel
 | module.
| jjgreen wrote:
 | Needs a Skylake or later CPU. Linux users can run
 |             cat /sys/devices/cpu/caps/pmu_name
 |
 | to find out if they're invited to the party.
 | tbr1 wrote:
 | One of the maintainers here -- it should work on Broadwell if
 | you're not super keen on the tens-of-nanoseconds timing
 | precision and are okay with microsecond precision (i.e. you
 | only want accurate callstacks).
 |             grep intel_pt /proc/cpuinfo
 |
 | should do the trick.
| quasarj wrote:
 | Thank you! I was really disappointed that the readme never
 | actually said what the extension was called.
| tbr1 wrote:
| You may also be interested in this wiki page:
| <https://github.com/janestreet/magic-trace/wiki/Supported-
| pla...>
|
| Intel PT has a bunch of rough edges that we've tried to
| paper over in magic-trace, but the gritty caveats are
| documented in the wiki.
| tdiff wrote:
| I wonder if there is some support for dumping pt-traces only on
| some condition? Would be useful for debugging spikes in busy-
| loops.
| tbr1 wrote:
| Absolutely! This is one of the main features of magic-trace,
| and in fact a primary use-case.
|
 | You can select a trigger symbol, and magic-trace will take a
 | snapshot upon the next call of it. This can be whatever you
 | want; you can imagine writing code like
 |             if (something_really_wonky_happened) { take_magic_trace(); }
 |
 | and asking magic-trace to take a snapshot of the past only when
 | `take_magic_trace` is called.
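A minimal sketch of that pattern (the function name and threshold are invented; consult magic-trace's own docs for the exact flag that names the trigger symbol):

```cpp
// The trigger function body is empty: magic-trace only needs a symbol
// whose call it can detect, at which point it snapshots the ring
// buffer of control flow leading up to this moment. `noinline` keeps
// the symbol present in the optimized binary.
__attribute__((noinline)) void take_magic_trace() {
    asm volatile("");  // keep the empty call from being optimized out
}

// Hypothetical wonkiness check: a 1 ms latency threshold.
bool something_really_wonky_happened(long latency_ns) {
    return latency_ns > 1'000'000;
}

void handle_request(long latency_ns) {
    // ... real work ...
    if (something_really_wonky_happened(latency_ns)) {
        take_magic_trace();  // magic-trace snapshots the recent past here
    }
}
```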
| tdiff wrote:
| Sounds great, thank you!
| metadaemon wrote:
| This may be a neat use case for bpf at some point.
| [deleted]
| amelius wrote:
| Can you apply it to itself?
| tbr1 wrote:
| Yes, in fact this is how we've been narrowing down performance
| problems in it and its dependencies :)
|
| - https://github.com/let-def/owee/issues/23
|
| - https://github.com/janestreet/magic-trace/issues/93
| aendruk wrote:
| Note that this uses an Intel-specific API, so it doesn't
| currently work on AMD or ARM:
|
| https://man7.org/linux/man-pages/man1/perf-intel-pt.1.html
|
| > Intel Processor Trace is an extension of Intel Architecture
| that collects information about software execution such as
| control flow, execution modes and timings and formats it into
| highly compressed binary packets. [...] The main distinguishing
| feature of Intel Processor Trace is that the decoder can
| determine the exact flow of software execution.
| tbr1 wrote:
| We have a bit more color on compatibility in general up on
| <https://github.com/janestreet/magic-trace/wiki/How-could-
| mag...> for those interested.
| infogulch wrote:
| Jane Street has been publishing some interesting projects
| recently. See Signals and Threads podcast _State Machine
| Replication, and Why You Should Care_ [0] posted 2 days ago.
|
| I came across the incr_dom library [1], which efficiently
| calculates the diff on the projected DOM based on a _diff of the
 | model data_, sorta like React but more... mathematically
| grounded (?). incr_dom was then reformulated in Bonsai [2] which
| refactors and generalizes the idea to work with more than DOM.
| There was a recent Signals and Threads podcast about it a few
| months ago [3].
|
| [0]: https://news.ycombinator.com/item?id=31100023
|
| [1]: https://opensource.janestreet.com/incr_dom/
|
| [2]: https://opensource.janestreet.com/bonsai/
|
| [3]: https://signalsandthreads.com/building-a-ui-framework/
| stephankoelle wrote:
| Is this like java flight recorder?
| Scaevolus wrote:
| If you just want calls and returns, can't you use one of the
| other PMUs for that? Or is sampling at the "1 sample per event"
| level higher overhead than IPT?
| rrss wrote:
| do you mean configuring the other PMUs to interrupt the core
| every function call / return?
|
| If yes, then yes that is much much higher overhead than
| processor trace.
| tbr1 wrote:
| It's worth noting that aside from the overhead, function call
| / returns are not _quite_ enough to reconstruct the
| callstack: tailcalls are just regular branch instructions.
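A tiny illustration of that caveat (hypothetical functions):

```cpp
// `wrapper`'s call to `add_one` is in tail position, so an optimizing
// compiler typically lowers it to a plain `jmp` rather than a
// `call`/`ret` pair. A PMU counting only call/return events would
// attribute `add_one`'s execution to `wrapper`'s caller; a full
// control-flow trace like Intel PT still sees the branch and can
// reconstruct the correct stack.
int add_one(int x) { return x + 1; }

int wrapper(int x) {
    return add_one(x);  // tail call: often becomes `jmp add_one` at -O2
}
```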
| DSingularity wrote:
| I thought perf can also exploit Intel processor trace?
| cgaebel wrote:
 | It can! You can read more about how to do that yourself here:
 | https://perf.wiki.kernel.org/index.php/Perf_tools_support_fo....
|
| magic-trace uses perf. If you want, you can think of it as a
| mere "alternative frontend" for the Intel PT decoding offered
| by perf.
| DSingularity wrote:
| Ah okay, I misunderstood magic-trace to be an alternative to
| perf.
| cgaebel wrote:
| Hi HN! I'm Clark, one of the maintainers of magic-trace.
|
| magic-trace was submitted before, our first announcement was this
| blog post: https://blog.janestreet.com/magic-trace/.
|
| Since then, we've worked hard at making magic-trace more
| accessible to outside users. We've heard stories of people
| thinking this was cool before but being unable to even find a
| download link.
|
| I'm posting this here because we just released "version 1.0",
| which is the first version that we think is sufficiently user-
| friendly for it to be worth your time to experiment with.
|
| And uhh... sorry in advance if you run into hiccups despite our
| best efforts. Going from dozens of internal users to anyone and
| everyone is bound to discover new corners we haven't considered
| yet. Let us know if you run into trouble!
| giovannibonetti wrote:
| I'm curious about why Standard ML (SML) was chosen for this
| project, given the track record Jane Street has with OCaml. Do
 | you see an advantage in using the former in this kind of
| project?
| tbr1 wrote:
| It's all OCaml, GitHub is just misclassifying it as SML :)
| la6472 wrote:
| So one needs to upload the trace log to your website to
| visualize? Any way to do it locally?
| tbr1 wrote:
| Absolutely, check out https://github.com/janestreet/magic-
| trace#privacy-policy and https://github.com/janestreet/magic-
| trace/wiki/Setting-up-a-.... With a bit of extra
| configuration, magic-trace can host its own UI locally. You
| just need to build the UI from source, and point magic-trace
| to it (via an environment variable).
| b20000 wrote:
| why do you guys use Caml?
| cgaebel wrote:
| https://www.youtube.com/watch?v=v1CmGbOGb2I
| b20000 wrote:
| that seems to be a presentation about language features.
| I'm mostly interested in the business reasons for using the
| language within what jane street does, and how the language
| offers a competitive advantage and why it is "good enough"
| for the highly competitive HFT landscape they work in.
| brobinson wrote:
| The language features are the competitive advantage.
| nesarkvechnep wrote:
| Because Java is not the be-all and end-all.
| b20000 wrote:
| why use java when you have c++?
| ARandomerDude wrote:
| So you can use Clojure. ;-)
| gigatexal wrote:
| I'm probably missing something so apologies if this is obvious:
 | does this only work on compiled programs, or could it work on
 | any arbitrary running code? Everything from Firefox to my
 | random Python script?
| tbr1 wrote:
| It works best on compiled programs.
|
| We do try to support scripted languages with JITs that can
| emit info about what symbol is located where [1]. Notably,
| this more or less works for Node.js. It'll work somewhat for
| Python in that you'll see the Python interpreter frames
| (probably uninteresting), but you will see any ffi calls
| (e.g., numpy) with proper stacks.
|
| [1]: https://github.com/torvalds/linux/blob/master/tools/perf
| /Doc...
| panosfilianos wrote:
| Thanks for sharing this.
| zeusk wrote:
| Windows has Windows Performance Analyzer, GPUView and PIX so
| most game devs are covered on that front :)
| geuis wrote:
| On the website, scrolling doesn't work in mobile safari.
| TooSmugToFail wrote:
| Awesome work Clark!
|
| Any plans to support Arm in the future? Thanks!
| tbr1 wrote:
| We don't have plans to add ARM support largely because we
| have no in-house expertise with ARM. That said, ARM has
| CoreSight which sounds like it could support something like
| magic-trace in some form, and we'd definitely be open to
| community contributions for CoreSight support in magic-trace.
| nemetroid wrote:
| > The key difference from perf is that instead of sampling call
| stacks throughout time, magic-trace uses Intel Processor Trace to
| snapshot a ring buffer of all control flow leading up to a chosen
| point in time[2].
|
| > 2. perf can do this too, but that's not how most people use it.
| In fact, if you peek under the hood you'll see that magic-trace
| uses perf to drive Intel PT.
|
| I think this (the first sentence quoted) is a bit misleading. The
| main feature is not really a "key difference from perf" if the
| main feature _is implemented using perf_. From a brief read, it
| looks like the real key difference is a friendlier and more
| interactive UI (both when capturing and viewing the trace).
|
| Regardless, I think it looks neat, and will try to take it for a
| spin sometime soon.
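The "snapshot a ring buffer of the past" model quoted above can be sketched generically (Intel PT does this in hardware with compressed branch packets; this is only the shape of the idea, with invented names and a capacity assumed to be nonzero):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Recent events overwrite the oldest; you pay nothing for history you
// never look at. A trigger copies out the last `cap` events leading up
// to that point in time.
template <typename T>
class RingBuffer {
public:
    explicit RingBuffer(std::size_t cap) : buf_(cap) {}

    void record(T event) {
        buf_[next_ % buf_.size()] = std::move(event);
        ++next_;
    }

    // Snapshot: oldest-to-newest copy of everything still buffered.
    std::vector<T> snapshot() const {
        std::vector<T> out;
        std::size_t n = next_ < buf_.size() ? next_ : buf_.size();
        for (std::size_t i = next_ - n; i < next_; ++i)
            out.push_back(buf_[i % buf_.size()]);
        return out;
    }

private:
    std::vector<T> buf_;
    std::size_t next_ = 0;
};
```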
| akamoonknight wrote:
 | As a person who works with FPGAs: for all their difficulties,
| waveforms of both simulation and real-time tapping capabilities
| are indispensable for tracking what happens before, during, and
| after an erroneous event occurs. One thing I always miss when
| going back to software development is some sort of 'waveform' of
| what the process has done over time and tracking the state of the
| system over that same course of time. Admittedly dealing with CPU
| instructions and a program's function calls and overlapping
| threads is orders of magnitude more difficult than tracking how a
| bit is changing in a waveform over time, but it definitely seems
| like tools are getting closer and closer to something that's
| equivalent and that's pretty awesome to see.
| Sirened wrote:
| Ha, this was my experience as well when getting into hardware
| after spending forever in software. It's SO amazing being able
| to just shoot a program through simulation and then look at the
| waveform to see how an instruction propagates down a pipeline.
| Debugging concurrency issues on hardware (i.e. incorrect re-
| ordering/concurrent scheduling) is honestly so much easier than
 | debugging software concurrency, where you often can't even see
 | the entire system state. We're starting to see software catch
| up with things like time-travel debug with instruction tracing
| (whether Intel Processor Trace or ARM CoreSight tracing) but
| the analysis tools for these sorts of things have nothing on
| wave analysis programs. They either force you into a linear
| interface (GDB time travel) which makes actually finding the
| issue a pain in the ass or they simply don't give you the
| granularity of data that you need.
| t_mann wrote:
| Jane Street is a prop trading shop. I heard that they take tech
 | seriously, but I'm still impressed to see them at the front of
 | HN with a tool like this.
| DonaldPShimoda wrote:
| They are among the most notable users of OCaml in the world,
| and have taken significant stewardship of the language. They
| sponsor, attend, and publish at multiple academic conferences,
| especially in the realm of programming languages.
|
| I don't think I can say I support their company's primary
| mission, but their commitment to improving the world of
| software through various means (language influence, academic
| publication, open-source software releases, etc) is admirable
| and well worth respecting.
| t_mann wrote:
| Yeah, quite impressive language to make the backbone of all
| your systems.
| b20000 wrote:
| I doubt that it is used for actual trading. Maybe the OP
| can elaborate what the live infrastructure and backtesting
| setup look like?
| DonaldPShimoda wrote:
| > I doubt that it is used for actual trading.
|
| Do you have a basis for that claim?
|
| JS have developed tons of libraries and tools for OCaml
| development, and new developers and quants that they hire
| go through an OCaml bootcamp to come up to speed. They
| put lots of work into the OCaml compiler, and in blog
| posts about that work they talk about why this is useful
 | _for trading_. Maybe I'm missing something crucial, but
| I think it's more likely that you just don't know what
| you're talking about.
| mirekrusin wrote:
| Of course they use it for trading and everything around
| for many years.
| b20000 wrote:
| cool, so how about the OP shares the details of how they
| use it for trading?
| dqpb wrote:
| Do you take issue with the market maker business model?
| DonaldPShimoda wrote:
| I think just the idea of a company that does not directly
| produce things, and instead spends its efforts turning its
| money into more money via investment, is something that...
| doesn't entirely sit well with me.
|
| There is, of course, something to be said about the by-
| products of their work. Jane Street is far from an evil
| company, and I would not be entirely morally opposed to
| working for or with them. They do a lot of good in academic
| research in areas I care about. I just wish that that was
| their primary purpose instead of direct money-making, if
| that makes sense.
| tdiff wrote:
| Fwiw, market makers sometimes say that by providing
| liquidity they benefit the whole market.
| lallysingh wrote:
| Does that cover the entire financial industry?
| DonaldPShimoda wrote:
| Not necessarily. Companies that facilitate trades (eg,
| E-Trade) don't fall in this category, because their
| business model is about providing a service to regular
| people.
|
| Jane Street is a proprietary trading firm. They have no
| external customers to whom they would provide goods or
| services. Their primary purpose is to invest the
| company's own money.
| kangda123 wrote:
| Doesn't that pretty much characterise all for-profit
| companies?
| tonyarkles wrote:
| I think for a lot of people there's a pretty big
 | difference between these business models:
|
| - We use cash to buy circuit boards, screens, enclosures,
| etc, write software, and sell mobile phones.
|
| - We use cash to rent a building, order pallets of
| inventory, and sell that inventory locally to walk-in
| customers.
|
| - We use cash to buy shares, hold onto them for a bit,
| and sell those same shares and make money off the spread.
|
| I'm not making any kind of comment at all about the value
| of market makers, just... those three businesses feel
| like they're different models.
| DonaldPShimoda wrote:
| Yes, this is pretty much it. As I said in another
| comment, Jane Street does not have customers for whom
| they provide goods or services. Their primary business
| model is using the company's own money for investments,
| and that's a business model that I don't love.
| what-the-grump wrote:
| Company spends its own money to make money for itself?
|
| Not exploiting teenagers in some 3rd world country?
|
| Not gambling with your pension?
|
| Not manipulating some physical commodity like oil?
|
| What's the problem here?
| bostik wrote:
| Considering how many high finance shops live off of
| gambling with other peoples' money, I find the
| intellectual honesty of a company doing it on their own
| dime ... refreshing.
| rtlfe wrote:
| > - We use cash to rent a building, order pallets of
| inventory, and sell that inventory locally to walk-in
| customers.
|
| > - We use cash to buy shares, hold onto them for a bit,
| and sell those same shares and make money off the spread.
|
| Those two sound like pretty much the same thing.
| andrepd wrote:
| Indeed, but not like the first one.
| csomar wrote:
| They are the same model. They both buy stuff, do some
| stuff and then re-sell the same stuff but with the
| modifications they made.
|
| In the case of a trading shop, the stuff they do is
| playing the market liquidity, collecting interest,
| arbitrage, etc...
|
| Sure there are some evil ones, but other businesses have
| those too.
| nojito wrote:
 | Partly because the cost of switching would be prohibitive for
 | them.
| DonaldPShimoda wrote:
| You think they are platinum-level sponsors at multiple
| conferences every year, sponsors of PLMW multiple times a
| year, sponsors for carbon-neutrality at ICFP each year, and
| continue investing in hiring PhDs and improving the OCaml
| ecosystem because... the... cost of switching away from
| OCaml is too great?
|
| I don't think that tracks. They _like_ OCaml, and they are
| pretty adamant that it is a good tool for the job. Maybe
| you disagree, but you should not project your opinions on
| them.
| b20000 wrote:
| I think they are mostly here to hire people. Not to
| educate.
| DonaldPShimoda wrote:
| If by "here" you mean "at academic conferences", then I'd
| ask you why they bother to actually do primary research
| and publish it if they have no interest in sharing
| knowledge.
|
| If all they wanted was to hire people, they... would. You
| don't have to sponsor a conference to attend or hire from
| that conference. And you especially also don't need to
| sponsor additional workshops, or carbon neutrality
| initiatives, or anything else.
|
| It's genuinely silly to suggest that they spend all this
| money on things just to hire people. There are so many
| more effective uses of their money if that is the only
| goal.
| b20000 wrote:
| because they are hoping to run into interesting people to
| hire. why would they share knowledge otherwise? if they
| are interested in doing that, why not share all the
| details of their tech stack and what they are doing
| exactly in the market? it's very hard to hire, especially
| with FAANGs who compete with them for talent.
| tedunangst wrote:
| I don't think that's the situation, but if you are
| irreversibly committed to a technology, you have an
| interest in seeing that tech continue to advance. Being
| stuck on a dead tech is even worse.
| DonaldPShimoda wrote:
| > I don't think that's the situation
|
| I'm not sure what you mean by this. What is the
| situation?
|
| > Being stuck on a dead tech is even worse.
|
| Why do you think OCaml is a "dead tech"? Can you justify
| that? Or is it just based on the notion that if most
| people don't use it, it must be a bad tool?
| lawl wrote:
 | I don't think he's saying it's dead tech. He's saying
 | their incentive to invest in OCaml is to make sure it
| doesn't _become_ dead tech.
| skrebbel wrote:
| I bet it is also a great way for them to hire and retain
| great engineers, including the kind that isn't just in it
| for the money.
| DonaldPShimoda wrote:
| Speaking only of what I see (I'm a PhD student in the
| field of programming languages), quite a few people seek
| internships and full-time employment with JS specifically
| because of the tech stack and the kind of problems they
| (JS) tend to like throwing themselves at. They do some
| seriously cool stuff, and they've published a lot of
| great research!
| nojito wrote:
| Pretty much yes.
|
| Cost is more than just monetary. There are significant
| indirect costs as well.
| DonaldPShimoda wrote:
| > There are significant indirect costs as well.
|
| Can you elaborate on these costs? Do you have knowledge
| of JS's internal needs and resources to suggest a better
| alternative?
|
| They are not just hapless consumers of a dead language;
| they actively maintain it and invest in it because it
| works well for them. The language itself gives them the
| kind of guarantees they want in their work, and their
| work on the language and surrounding tooling (among other
| things) helps them to acquire high-skill talent. I don't
| know how you can claim that they would transition to
| another language if they could without having some pretty
| firm data to back that claim up. Otherwise, I think
| you're just projecting your own feelings about OCaml onto
| them.
| Frost1x wrote:
 | You should make it clearer in the magic-trace README that the UI
 | service fork used for viewing and analyzing traces is also
 | available should one want to deploy it locally or in a situation
 | without Internet access:
|
| https://github.com/janestreet/perfetto/
|
 | It is mentioned in the documentation, but for anyone quickly
 | skimming and expecting a SaaS pricing model underneath (I think
 | many do now), that isn't obvious; from my initial scroll
 | through, that made the project look significantly less
 | attractive.
| Looks very interesting!
| bostonsre wrote:
| They mention that in the privacy section:
|
| https://github.com/janestreet/magic-trace#privacy-policy
| collinmanderson wrote:
| I've recently stumbled on Google's Perfetto in the last few
| months. Very very nice UI. I've been using it with viztracer
| for Python.
| haggy wrote:
| The magic of PR's :)
| ben0x539 wrote:
| Are people generally cool with unsolicited PRs to "editorial"
| content? I'd have assumed that stating or not stating
| something in a README is mostly an intentional decision.
| VikingCoder wrote:
| One time, I wrote a simple set of tools at my company.
|
| 1) A program that inserted a macro invocation with a GUID, at
| every single new scope. "{", except not switch, struct, or class,
| etc.
|
 | 2) I made the macro instantiate an object on the stack, passing
 | in the GUID. In the constructor and in the destructor, it called
 | a singleton with thread-local storage holding a file pointer,
 | where it would append a few bytes indicating whether it was
 | entering or exiting scope, what the GUID was, and what the time
 | was. In this way, each thread was writing to its own file.
|
| 3) I made a program which walked my source, looking for file
| name, line number for each of my macro invocations, and the GUID.
| If I were more sophisticated, I would have tried to get the
| function name out of it, too.
|
| 4) I made a program which would turn one of the thread's files
| into a Visual Studio recognized output. Basically
| "filename(linenumber): [content, such as the time]". Then I set
| that up as a tool in Visual Studio, and when I would run it, it
| would output in a Visual Studio window. The reason for that was
| then I could hit (I think) F4 and Shift-F4 to step forward and
| backward through the output, and each time it would jump to the
| source code at that location.
|
| So then I had a forward-and-backward time travelling debug
| script. I think I also started manually passing in function
| parameters into the macros, which would format (a: "a") on the
| debug line, too.
|
| We had automated testing of our whole integrated application. I
| wanted to record my output from each automated test. Then when I
| was checking in new code, I could see which new GUIDs were never
| touched by any of our integration tests, to tell me how much
| coverage we had. And I could tell the testers which automated
| tests were most likely to exercise my code changes.
|
| I liked that the GUIDs would have been stable, even as code
| moved. (Unlike file name, line number, or even class and function
| name.)
|
| And yes, seeing this in the code wasn't great:
|
 |             {
 |                 TIMER("5c7c062f-84a3-40d0-b7cd-77bd9db59f3e");
 |                 // real code
 |             }
|
| I wanted to teach Visual Studio how to basically ignore those,
| and if I copied code and pasted it, have it generate new GUIDs
| when I pasted.
|
 | But I could imagine using the output to generate flame charts,
| and other debugging tools, like in the article.
|
 | And it all compiled away to nothing in Release mode.
|
| The payoff of this felt large, and the cost felt small. But the
| biggest pain was that humans would see these macro invocations,
| and need to maintain them.
|
| So I chickened out and didn't force my coworkers to see all of
| this.
| kolbe wrote:
| Nice. And it satisfies my curiosity about whether trading firms
| are switching to AMD or not
| jeffbee wrote:
| I don't know that this definitively answers that question. It's
| possible to use a different architecture based on
| cost/performance and keep a small population of Intel machines
| in service because you want access to their superior PMUs. Most
| of what you learn on the latter would still apply to the
| former.
| bitcharmer wrote:
| In HFT single-threaded performance is king so that's why we're
 | all still on Intel. AMD is making progress but just not quite
 | there yet.
| cgaebel wrote:
| To be clear: bitcharmer says "we" to mean "fellow HFTs" not
| "Jane Street".
| bitcharmer wrote:
| Yes. Thanks, should have made that more explicit.
| kolbe wrote:
| Huh. My experience has been that AMD wins that unless your
| application is so small that it can fit into Intel's smaller
 | cache. And I'd have thought AMD's new 3D V-Cache parts would
 | make your developers drool, allowing them to actually inline
 | everything instead of being scared of building apps that are
 | too big to fit into cache.
| bitcharmer wrote:
| Not my experience at all and I work across different teams
| who own different latency sensitive apps. Most of them have
| unhygienically huge working sets.
| isogon wrote:
| For low-latency strategies, AMD's lack of DDIO [0] makes it a
| non-starter. The memory latency is a big gap to close.
|
| [0] https://www.intel.com/content/www/us/en/io/data-direct-i-
| o-t...
| b20000 wrote:
| how do you access this DDIO feature if you are writing a C or
| C++ application? intrinsics?
| tbr1 wrote:
| DDIO operates mostly transparently to software, with the
| I/O controller feeding DMAs into a slice of L3. Hardware
| can opt out by setting PCIe TLP header hints, and you have
| some system-wide configurability via MSRs, but it's not
| something a userspace application can take into its own
| hands.
| b20000 wrote:
| so is this taken advantage of by the OnLoad drivers of
| solarflare cards, for example?
| kolbe wrote:
| Do you know this for a fact? I've done some work in the
| industry where I needed to make fast software, but never the
 | sub-microsecond tick-to-trade type of fast, so I really
| don't know.
|
| There was a great presentation from 2017 about some of
| Optiver's low latency techniques[1]. I had assumed they
| released it because the had obviated all of them by switching
| to FPGAs, but I don't know. Either way, he suggested that if
| you ever needed to ping main memory for anything, you already
| lost. So, I wouldn't have thought DDIO plays into their
| thinking much.
|
| [1] https://www.youtube.com/watch?v=NH1Tta7purM
| isogon wrote:
| The idea is precisely that you want to avoid pinging main
| memory at all, which is possible (in the happy case) if you
| do things correctly with DDIO. Not everything is done in
| hardware where I am. I am wary of saying much because my
| employer frowns on it, and admittedly I work on the
| software more than the hardware, but DDIO is certainly
| important to us.
| lysecret wrote:
 | Tangentially related, but I really love the podcast from the
 | Jane Street people: Signals and Threads.
| sgeisenh wrote:
| Ron Minsky is an excellent communicator and incredible
| interviewer. Every episode has been phenomenal.
| genpfault wrote:
 | If you aren't averse to manual instrumentation there's also
| Tracy[1].
|
| [1]: https://github.com/wolfpld/tracy
| klik99 wrote:
 | Supports Windows and Mac too, not just Linux - thanks for
| mentioning this, this fills a really big need for me
| smitty1e wrote:
| This is implemented in ML[1]?
|
| First no-kidding application I've seen in that language.
|
| [1] https://en.wikipedia.org/wiki/ML_(programming_language)
| nephanth wrote:
 | It's actually in OCaml
___________________________________________________________________
(page generated 2022-04-22 23:00 UTC)