[HN Gopher] Multi-Horizon Forecasting for Limit Order Books
___________________________________________________________________
Multi-Horizon Forecasting for Limit Order Books
Author : ArtWomb
Score : 21 points
Date : 2021-07-02 13:21 UTC (9 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| dcolkitt wrote:
| The problem with using neural networks in market microstructure
| is the latency at inference time. Market makers and HFTs need to
| compute decisions on the order of microseconds. That's not
| feasible with large, deep networks.
|
  | With specialized hardware you can get close. But you're still
  | talking about mid-single-digit microseconds on inference alone.
  | A competitor using linear models can get down to hundreds of
  | nanoseconds. If you're in FPGA world, that kind of latency
  | advantage is worth way more than a 30% accuracy improvement
  | from using a complex ML model.
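  |
  | A linear signal is just one dot product over a handful of
  | features, which is why it can fit in hundreds of nanoseconds.
  | A minimal Python sketch (the feature names and weights here
  | are hypothetical, for illustration only):
  |
  |     import numpy as np
  |
  |     # Pre-fitted weights over a few book features: imbalance,
  |     # spread, sign of the last trade (hypothetical values).
  |     w = np.array([0.8, -0.3, 0.1])
  |     b = 0.0
  |
  |     def predict(features: np.ndarray) -> float:
  |         # One multiply-accumulate pass over ~3-20 inputs; the
  |         # same arithmetic maps to a few ns of FPGA or SIMD work.
  |         return float(w @ features + b)
  |
  |     signal = predict(np.array([0.6, 1.0, -1.0]))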
| tomas789 wrote:
  | This describes one extreme of the spectrum: go fast but be
  | dumb. As far as I know this works well for many people. There
  | are other groups of people going a bit slower but making more
  | informed decisions. I think of it as a scatter plot with time
  | on one axis and smartness on the other. As long as you are
  | sitting on the Pareto front, you can make money.
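  |
  | One way to picture the Pareto-front point: given
  | (latency, accuracy) pairs, a strategy survives if no other
  | strategy is both faster and at least as smart. A toy filter
  | in Python (all numbers made up):
  |
  |     # (latency_us, accuracy) per strategy; both axes matter.
  |     strategies = {
  |         "linear":    (0.3,  0.55),
  |         "small_mlp": (5.0,  0.60),
  |         "deep_net":  (50.0, 0.62),
  |         "slow_dumb": (50.0, 0.55),
  |     }
  |
  |     def on_pareto_front(name):
  |         lat, acc = strategies[name]
  |         # Dominated if someone is as fast and as accurate,
  |         # and strictly better on at least one axis.
  |         return not any(l <= lat and a >= acc and (l, a) != (lat, acc)
  |                        for l, a in strategies.values())
  |
  |     print([s for s in strategies if on_pareto_front(s)])
  |     # -> ['linear', 'small_mlp', 'deep_net']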
| clipradiowallet wrote:
  | Furthermore... the HFT market participants are not using CPU-
  | intensive calculations to win consistently. They are using
  | simple calculations (e.g. a 6-period SMA) and _extremely_ low
  | latency to win. They are competing with other HFT participants
  | to get their order on the inside bid/ask before everyone else.
  |
  | At its core, macro-level algorithmic trading is answering a
  | question with only two possible answers at any point in time:
  | will the next tick be "up" or "down"?
| jstrong wrote:
  | What is the special hardware/setup you referred to that
  | achieves mid-single-digit microsecond latency for deep
  | learning inference?
| dcolkitt wrote:
  | My understanding is that O(5 µs) is achievable on optimized
  | FPGAs with reasonably large networks. Because of the
  | parallelization, large networks don't add that much more
  | latency as long as you have enough gates. But I have little
  | experience with FPGA stacks, so can't say for sure.
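  |
  | Back-of-envelope on why width is nearly free but depth is not
  | (a sketch; the clock rate and stage count are assumptions, and
  | I'm ignoring the log(width) cost of the adder tree):
  |
  |     CLOCK_NS = 3.3          # ~300 MHz fabric clock, assumed
  |     STAGES_PER_LAYER = 5    # pipeline stages per layer, assumed
  |
  |     def fpga_latency_us(depth: int) -> float:
  |         # With enough gates, every multiply-accumulate in a
  |         # layer fires in parallel, so latency scales with depth.
  |         return depth * STAGES_PER_LAYER * CLOCK_NS / 1000.0
  |
  |     print(fpga_latency_us(depth=8))  # ~0.13 us of raw compute;
  |     # I/O and serialization would make up the rest of ~5 us.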
|
  | Even in software, I've been able to hit O(15 µs) using
  | optimized FANN libraries. But the nets are small rather than
  | deep, and pretty ruthlessly pruned and compressed. Another
  | trick that helps is pre-differentiating across all the
  | variables you don't expect to change on a latency-critical
  | event. E.g. if you're running a liquidity-take strategy, you
  | can pre-differentiate assuming the opposite touch size and
  | the deep book stay constant, because you're only going to act
  | following an aggressor trade at the touch.
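  |
  | The pre-differentiation trick is essentially a first-order
  | Taylor expansion: off the critical path, compute f(x0) and the
  | derivative with respect to the few inputs that can change on
  | the event; the hot path is then one multiply-add per changed
  | input. A minimal sketch (the model f is a stand-in, not FANN):
  |
  |     import numpy as np
  |
  |     def precompute(f, x0, fast_idx, eps=1e-5):
  |         # Slow path: baseline output and finite-difference
  |         # derivative w.r.t. the one fast-moving input.
  |         base = f(x0)
  |         xp = x0.copy()
  |         xp[fast_idx] += eps
  |         return base, (f(xp) - base) / eps
  |
  |     def fast_eval(base, grad, dx):
  |         # Hot path on an aggressor trade at the touch: assumes
  |         # the opposite touch and deep book stayed constant.
  |         return base + grad * dx
  |
  |     f = lambda x: float(np.tanh(x @ np.array([0.5, -0.2, 0.1])))
  |     base, grad = precompute(f, np.array([1.0, 2.0, 0.5]), fast_idx=0)
  |     print(fast_eval(base, grad, dx=0.3))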
| echelon wrote:
  | What about use over longer time horizons? The paper seems to
  | be geared toward longer-horizon predictions.
| [deleted]
___________________________________________________________________
(page generated 2021-07-02 23:02 UTC)