hngopher.com

       [HN Gopher] NYSE Tuesday opening mayhem traced to a staffer who ...
       ___________________________________________________________________
        
       NYSE Tuesday opening mayhem traced to a staffer who left a backup
       system running
        
       Author : helsinkiandrew
       Score  : 261 points
       Date   : 2023-01-26 06:47 UTC (16 hours ago)
        
 (HTM) web link (www.bloomberg.com)
 (TXT) w3m dump (www.bloomberg.com)
        
       | jmount wrote:
       | Galileo's Principle bites hard, great explanation here:
       | https://99percentinvisible.org/episode/cautionary-tales/tran...
        
       | nickdothutton wrote:
       | If a single staffer SNAFU can send your exchange into chaos then
       | you dun goofed at risk management and probably a whole lot of
       | other management discipline.
        
       | twawaaay wrote:
       | Oh, yes. It is the staffer who made the mistake.
       | 
       | What about people who designed it this way?
        
       | sabujp wrote:
       | yeah absolutely nothing to do with the SPX hitting a target 4k level
       | and messing up everything just as it hit that level
        
       | h2odragon wrote:
       | Tuesday news blackout; Thursday "it was all Jim's fault"...
       | Right.
       | 
       | Smells like horse shit. Most of what comes out of the profession
       | of "journalism" does too, lately; but this smells _strongly_.
        
         | yakubin wrote:
         | Why couldn't it have been Jim's fault? The world is run by Jims.
        
           | h2odragon wrote:
           | As others have said, if Jim's fuckup can have such large
           | consequences, then there should have been backstops for Jim
           | who would have shared the blame. Nothing against Jim, it's
           | just he's being used as a distraction to avoid talking about
           | the real issues.
        
             | throwawaaarrgh wrote:
             | Manager: "I hope nobody asks why Jim was allowed to have so
             | much power with no oversight or validation"
        
         | epistemer wrote:
         | and all the trades are being reversed basically. The whole
         | thing stinks.
        
           | tgtweak wrote:
           | Not all of them... only the most egregious.
        
       | psychlops wrote:
       | This and other exchanges need to be running 24/7, in large part
       | to level the trading field for retail investors. The backend
       | should be handled invisibly behind the scenes. There should
       | further be an exchange version of a Netflix chaos monkey running
       | constantly to ensure such a critical infrastructure is robust.
       | 
       | The fact that these systems do not exist is an exchange problem,
       | not a "staffer".
        
         | bagacrap wrote:
         | why do retail traders "need" to exist?
         | 
         | Retail traders realistically have only luck to rely on to beat
         | hedge funds and banks. What they do is akin to gambling, which
         | is on net quite negative for those who participate in it and
         | heavily regulated. Retail traders don't serve any purpose in
         | our society. They don't help with efficient allocation of
         | capital and anyone who might be an actual savant in trading can
         | join or start a firm rather than staying independent and
         | unlicensed.
        
       | bmitc wrote:
       | Seems so weird to not have automated checks for something that
       | seems to be described as "someone left the light on" and also not
       | have the exchange automatically initiate itself. However, it
       | still isn't that clear what the problem was. Were prices not
       | "real" or correct?
       | 
       | Stuff like this will happen more and more. We treat software
       | driven systems rather recklessly.
        
         | mrkeen wrote:
         | More automation -> more code -> more things to go wrong
        
           | nix23 wrote:
           | >More automation -> more code -> more things to go wrong
           | 
           | More people -> more entropy -> many more things to go wrong
        
             | wongarsu wrote:
             | One person might be inattentive or drunk, but it's less
             | likely that two people are. So you institute a two-person
             | rule. And if that's not good enough, add a third person to
             | double-check. Maybe a supervisor to observe the people
             | doing all of the above, to catch any mistakes or negligent
             | behavior. Also have them write down the steps they have
             | taken, and have somebody else read through that to verify.
             | Just keep adding people until you are satisfied with your
             | odds (and hope you are not making it worse through second-
             | order effects)
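
The odds argument in the comment above can be made concrete with a rough sketch. It assumes every reviewer misses a given mistake independently and with the same probability, which real teams rarely satisfy. The names `miss_probability`, `p_single` and `common_cause` are hypothetical, chosen for illustration only; nothing here comes from the thread or from NYSE procedures.

```python
def miss_probability(p_single, reviewers, common_cause=0.0):
    """Chance that every reviewer misses the same mistake.

    p_single:     assumed chance that one reviewer misses it on their own
    reviewers:    number of people checking the step
    common_cause: assumed chance of a shared failure (same wrong checklist,
                  same misleading dashboard) that blinds everyone at once
    """
    # With independent reviewers, all of them must miss the mistake.
    independent = p_single ** reviewers
    # A common-cause failure defeats the whole group regardless of its size.
    return common_cause + (1 - common_cause) * independent
```

With `p_single=0.1` and no common cause, one reviewer misses 10% of mistakes, two miss about 1%, and three about 0.1%. With `common_cause=0.02` the result never drops below roughly 2% however many people are added, which is one way to read the "second-order effects" caveat.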
        
               | nix23 wrote:
               | So two people are more reliable than an automated system, you
               | wanna say? That's totally wrong....
        
               | reaperducer wrote:
               | Sounds like you've never experienced a catastrophic
               | failure due to an automation that didn't work right.
               | 
               | I did last year, and my company is in the process of de-
               | automating certain processes that can endanger that
               | company if they go wrong.
               | 
               | There are many things in tech that are too important to
               | automate.
               | 
               | I'd even posit that the more experience you have in tech,
               | the more you've seen how things go wrong, and the more
               | you realize that automation is a tool for humans to use,
               | not a replacement for humans doing a task.
        
               | nix23 wrote:
               | >Sounds like you've never experienced a catastrophic
               | failure due to an automation that didn't work right.
               | 
               | Much much much more due to human error......but hey maybe
               | you are the worst programmer ever..but even then i would
               | say your programs are more reliable than a human.
        
           | themitigating wrote:
           | This employee left the backup system running. There's
           | obviously some automation but what is the solution?
           | 
           | Process changes that people have to remember or more systems
           | to prevent the issue. So I don't get your statement related
           | to this article
        
             | nix23 wrote:
             | >There's obviously some automation but what is the
             | solution?
             | 
             | Shutdown and wake-up time in bios of server and switch ;)
        
             | [deleted]
        
             | mrkeen wrote:
             | I don't imply that there is a solution.
             | 
             | We will simply create a second system (B) to monitor the
             | first system (A). Now we have two systems to maintain.
             | System B _will not_ be capable of steering A by itself. So
             | we still need to know how to diagnose and repair A, and we
             | also need to know about B too. Maybe system B can talk to a
             | Prometheus/Grafana stack (if it's up). And that can put
             | alerts into Slack (which we ignore because there's _always_
             | alerts in Slack). And after standup we can take turns
             | looking at graphs with consternation.
             | 
             | > Stuff like this will happen more and more. We treat
             | software driven systems rather recklessly.
             | 
             | That sentence is where I go when I hear the word
             | 'automation'.
        
           | bongobingo1 wrote:
           | The great thing about automation is the breadth, depth and
           | speed at which I can propagate mistakes.
        
             | WJW wrote:
             | Some submarine operations are deliberately not automated,
             | because if eg a sensor is broken or miscalibrated it could
             | sink the entire boat if a computer very rapidly acts on the
             | wrong information. Rather they do those operations with one
             | person operating the valve/machine/device/etc, another
             | watching and confirming readings, and the whole thing is on
             | a constant audio link with a third person in an engineering
             | room who watches the readings through a centralized system.
             | 
             | It's clearly not optimized for efficient use of personnel,
             | but the personnel complement will have been designed to
             | provide sufficient people at all times and the cost of
             | getting it wrong can be very large indeed.
        
               | JackFr wrote:
               | Working on the repo desk of a large Japanese bank in New
               | York in the 90's. There was a big (both in font size and
               | magnitude) number that was on the upper left of the
               | blotter system that ran on every trader's desk, which
               | represented the total we had to borrow that day to fund
               | the bank's trading book. There would be a number below it
               | which would represent how much we had borrowed so far.
               | 
               | It was "too important to automate" so a trading assistant
               | keyed it in every morning. One morning he typed the wrong
               | number and the mistake was in the billions digit.
               | 
               | At 2:45 "the cage" called the repo desk and said "You
               | know you guys are still short a billion, right?"
               | 
               | There was then a flurry of activity as traders got on the
               | phone to try to borrow a billion dollars in fifteen
               | minutes, while also trying to not let on we were kind of
               | over a barrel. The head of fixed income prepared his
               | explanation to the Fed about why we needed to borrow a
               | few hundred million overnight.
               | 
               | The number got automated in our next release, and the
               | open procedure was changed to the trading assistant
               | verifying the number against the "cage" report.
        
               | papito wrote:
               | I will argue that the corporate world is the exact
               | opposite of the training and discipline of a submarine
               | crew. Most of the time I wonder how the businesses even
               | survive the chaos and mismanagement, let alone make
               | money.
        
               | MrYellowP wrote:
               | > It's clearly not optimized for efficient use of
               | personnel
               | 
               | I disagree. When the sub sinks, all these people die.
               | That's a far more inefficient use of personnel.
               | 
               | It feels like you're putting more emphasis on the
               | material cost ("the cost ... can be very large indeed")
               | than on what actually matters.
        
               | wongarsu wrote:
               | It feels like a wartime-vs-peacetime priority problem.
               | 
               | In wartime you would care about efficiency, build the
               | largest number of subs staffed with the minimum crew. And
               | since training crew quickly becomes the bottleneck you
               | would probably go for the highest degree of automation
               | that doesn't impact production times too much. In
               | peacetime, efficiency isn't as important. What is
               | important is the bad PR of losing one of your submarines
               | in a training exercise or on patrol, so crew safety
               | becomes a much bigger concern.
               | 
               | Losing those sailors in war would have been a noble
               | sacrifice for the cause, losing them to the exact same
               | accident in peacetime is a national tragedy.
        
               | UncleEntity wrote:
               | Losing a ship (boat?) during peacetime is a PR disaster.
               | 
               | Losing one during wartime can cost you the war.
               | 
               | An inefficient one that probably won't sink due to combat
               | damage can stay in the fight long enough to matter.
        
               | WJW wrote:
               | Submarines are referred to as boats rather than ships due
               | to naval tradition.
               | 
               | As a former naval officer I can only say that technically
               | losing any vessel could be the one that loses a war, just
               | as any soldier lost could be the straw that breaks the
               | camel's back. But if your navy is so rickety that the
               | loss of a single vessel is enough to lose then the main
               | deficiency was in planning rather than any specific
               | warship loss. Losing one in peacetime should never happen
               | but is not unheard of even in modern times. See eg the
               | Kursk or the Fitzgerald.
        
               | UncleEntity wrote:
               | It's also not unheard of for a single vessel to have
               | outsized effects on an entire war
               | https://amp.theguardian.com/world/2017/oct/20/enigma-
               | code-u-...
               | 
               | It could also be argued that the Romans capturing a
               | single Carthaginian warship turned the tide for their
               | entire empire.
        
               | TeMPOraL wrote:
               | This is now my headcanon explanation for why starships in
               | Star Trek still require large crews (or crews in
               | general).
        
               | mschuster91 wrote:
               | At least in ST:VOY, it has been shown that a single
               | hologram is capable of running the ship - although one
               | might argue "exceptional circumstances" ;)
        
               | wongarsu wrote:
               | In Star Trek III they jury-rig the Enterprise to fly with
               | a crew of 5, instead of the regular crew of 400.
               | 
               | Though in that state it can't do much more than fly:
               | combat capabilities are strongly diminished, maintenance
               | doesn't happen, post-combat repairs are out of the
               | question, science missions would be much harder. On
               | occasion the Enterprise has transported 150 passengers,
               | so I imagine there's a lot of kitchen staff, security,
               | etc. You only need 5 people to fly the ship, maybe 40 to
               | fly sustainably with maintenance, but to actually
               | accomplish their regular mission you need the other 300
               | people.
        
               | krapp wrote:
               | On the one hand, given how often sensors fail and AIs
               | flip the evil bit in Star Trek, limiting automation is
               | probably a good idea.
               | 
               | On the other hand, the Enterprise won't even warn anyone
               | when command staff are injured, cloned, mind-controlled
               | or vanish from the ship altogether unless a human asks
               | the computer where a specific person is first.
               | 
               | Of course there are Doylist reasons for all of this but I
               | do like the premise of a general fear of AI and possible
               | weird space BS being a factor.
        
               | dragonwriter wrote:
               | > This is now my headcanon explanation for why starships
               | in Star Trek still require large crews (or crews in
               | general)
               | 
               | The canon explanation is that automation-in-charge was
               | experimented with and went really badly, though
               | periodically they try something approaching it again.
               | 
               | https://memory-
               | alpha.fandom.com/wiki/The_Ultimate_Computer_(...
               | 
               | (AI, human genetic engineering, and a number of other
               | areas of technology are affected by variants of this
               | issue in the Trek canon.)
        
             | junon wrote:
             | I've not seen it put so eloquently.
        
               | dsr3 wrote:
               | To err is human, to really foul things up requires a
               | computer
               | 
               | - William E. Vaughan
               | https://quoteinvestigator.com/2010/12/07/foul-computer/
        
               | erik_seaberg wrote:
               | "A computer lets you make more mistakes faster than any
               | other invention, with the possible exceptions of handguns
               | and tequila."--Mitch Ratcliffe
        
             | ynniv wrote:
             | Knight Capital will forever haunt fintech engineers...
             | https://www.henricodolfing.com/2019/06/project-failure-
             | case-...
        
         | baxtr wrote:
         | [flagged]
        
         | qrybam wrote:
         | The auction is meant to find a stable price and can have some
         | wild prices coming in. Because no matching happens at the point
         | of entry, the market will naturally find a level before trading
         | commences. In this case, those wild prices were matching,
         | resulting in crazy trades no-one would have expected.
         | 
         | You'd be surprised how many manual processes there are in
         | places like this. It's a combination of legacy systems /
         | processes, and a general paranoia around automation going
         | wrong. I wouldn't be surprised if they always have someone
         | there to shepherd the system along.
        
           | benjaminwootton wrote:
           | When I worked on a trading platform, I spent many a happy
           | Sunday night waiting for the Australian market to open and
           | watch the first orders go through successfully.
           | 
           | We had hundreds of jobs and upgrades happening over each
           | weekend. It definitely needed an eye casting over it
           | regardless of the automation.
        
             | xwolfi wrote:
             | I am working at one now and nothing has changed. There are
             | so many automations that we recruit an army of people just
             | to check whether they ran, and who know how to replace them
             | if they didn't (or, more accurately, who to call at night to
             | fix it asap).
             | 
             | And the AU orders going through is a good sign, but it's
             | far from guaranteeing a free Monday, as Japan, Korea or
             | Shanghai can fuck it up, each in their own little ways.
             | Hong Kong is the best: low regulatory crap, invested
             | regulator, high volume low latency traffic every day
             | (relative to the region). I can't recall a time it broke.
             | 
             | Once, someone fat-fingered an Excel import at close, and we
             | lost our trading license for that entire country for 18
             | months. And we're not small. But the amount mismatched at
             | settlement was super tiny. High attack surface, low
             | holistic understanding (it works despite us, we honestly
             | have no clue sometimes), heavy consequences on screwup.
        
         | helsinkiandrew wrote:
         | Matt Levine at Bloomberg gives a good explanation:
         | https://www.bloomberg.com/opinion/articles/2023-01-25/nyse-f...
         | 
         | Basically at the market open all the requests to buy and sell
         | get matched at the same "open" auction price, then (a second
         | later) the orders get sent to the order book, where the price
         | can go up and down based on size. Because the system didn't
         | think there was an opening, the opening auction never ran, the
         | orders went straight to the book, and there were large swings
         | in price.
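
The difference described in the comment above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not NYSE's actual matching logic: prices are integer cents, the auction picks the price that maximizes matched volume and breaks ties by taking the lowest such price (real auctions use more elaborate tie-break rules), and `Order`, `opening_price` and `walk_book` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    side: str   # "buy" or "sell"
    price: int  # limit price in cents
    qty: int


def opening_price(orders):
    """Single-price auction: return (price, volume) with the most matched
    volume, or None if no buy and sell orders cross."""
    best = None
    for p in sorted({o.price for o in orders}):
        # Buyers willing to pay at least p, sellers willing to accept at most p.
        demand = sum(o.qty for o in orders if o.side == "buy" and o.price >= p)
        supply = sum(o.qty for o in orders if o.side == "sell" and o.price <= p)
        volume = min(demand, supply)
        if volume > 0 and (best is None or volume > best[1]):
            best = (p, volume)
    return best


def walk_book(asks, qty):
    """No auction: an incoming buy for `qty` shares consumes resting asks
    (price, available) level by level and returns the fills it gets."""
    fills = []
    for price, available in sorted(asks):
        if qty <= 0:
            break
        take = min(qty, available)
        fills.append((price, take))
        qty -= take
    return fills
```

For example, with buys of 100 @ 1020, 50 @ 1000 and 30 @ 980 against sells of 80 @ 990, 60 @ 1010 and 40 @ 1050, `opening_price` returns `(1010, 100)`: everyone trades at one price. By contrast, `walk_book([(1010, 10), (1100, 10), (1500, 100)], 50)` returns `[(1010, 10), (1100, 10), (1500, 30)]`, so the last fill lands far from the first, which is the kind of swing the comment describes.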
        
           | usefulcat wrote:
           | https://archive.is/LovhH
        
         | tgtweak wrote:
         | I don't think it's that they "left it running" like you would
         | leave a backup app running... they literally left the entire
         | disaster-recovery site up and running and live. Cermak
         | (referred to in the article as the "backup") is an entire
         | datacenter, hosting a running copy of the exchange to be used
         | in a failover scenario.
         | 
         | You'd have to have more than 1 person involved to forget that
         | DR is still active when completing these failover exercises and
         | tests off-hours.
        
           | Kon-Peki wrote:
           | > Cermak (referred to in the article as the "backup") is an
           | entire datacenter
           | 
           | Well, they are a tenant at the Cermak data center. It's a
           | truly massive building with huge amounts of connectivity and
           | colo opportunities. Probably also the only 100+ year old data
           | center building on the US register of historic places, lol
           | (it's a former catalog printing facility, built to hold
           | insanely heavy printing presses on 8 or 9 really tall floors,
           | so it has no problems with densely-packed server racks)
        
             | tgtweak wrote:
             | Yeah my point is only that it's a full DR site, not a
             | "backup" that was left running as the article pointed out
             | (and as a lot of commenters are insinuating).
        
       | chiefalchemist wrote:
       | While we're on the subject, if your company uses technology in
       | any capacity, read the book The Phoenix Project.
       | 
       | https://www.amazon.com/Phoenix-Project-DevOps-Helping-Busine...
        
         | dogleash wrote:
         | I wanted to like that book but it's a big Just-so story.
         | 
         | tl;dr: The brave knight implemented devops and everyone lived
         | happily ever after!
        
       | h3daz wrote:
       | I am short a couple dozen naked Feb 3 calls on a lot of affected
       | tickers and almost had a heart attack when looking at one of them
       | on my phone. Thankfully I was not in front of my computer at the
       | time because I have no idea how my broker was managing my margin
       | at open.
        
         | qeternity wrote:
         | The LULD breakers saved everyone from more chaos here because
         | effectively as soon as the market opened, all these symbols
         | were halted.
        
       | mellosouls wrote:
       | These "issue traced to staffer" stories sound like management
       | cover up for management/system shortcomings to me.
       | 
       | Systems with such significant potential impact, and in industries
       | where lack of financial investment in their continuity is a
       | deliberate choice have very little excuse to be passing the buck
       | to grunts for basic process flaws that can be triggered by
       | individual error.
        
         | NoboruWataya wrote:
         | They're not mutually exclusive. A staffer leaving a backup
         | system running may well have been the proximate cause of the
         | issue but, if true, it was also likely a management/system
         | issue as you say. The article is a bit strange in that it
         | doesn't attribute the fact in the headline to any source. I
         | don't see anything from the NYSE saying "it's all that guy's
         | fault". On the contrary it says:
         | 
         | > [NYSE execs] plan to examine the platform's procedures and
         | management, potentially reworking rules to be more flexible and
         | provide further protections.
         | 
         | Sounds like they know it's a management issue. The headline
         | probably focuses on the staffer leaving the backup system
         | running simply because it's a better headline.
        
           | zmmmmm wrote:
           | yes ... if an organisation has a critical process where a
           | single human making a mistake can cost it $millions then it
           | has a management / process level issue not an issue with the
           | human. Humans make mistakes. Apart from other obvious issues
           | with it, creating a context where individual mistakes lead to
           | horrific outcomes will create a toxic and horrifically
           | stressful workplace - I would actively avoid working in a
           | situation like that myself.
        
         | dsfyu404ed wrote:
         | > These "issue traced to staffer" stories sound like management
         | cover up for management/system shortcomings to me.
         | 
         | At some point you need to strike a balance between
         | freedom/flexibility and stupid proofing.
         | 
         | HN goes real hard on the "people are idiots and we should
         | design things that no matter what buttons get mashed it all
         | works out fine" side of things but in the financial world the
         | balance is struck a little further on the "train our employees
         | to not be idiots" side of things.
         | 
         | Furthermore, it's usually better optics to blame things on
         | people because people can easily and cheaply alter their
         | behavior (per incremental change). If you blame the
         | outage on systems it raises questions of when it will be fixed
         | and how much $$.
         | 
         | As an aside, it was almost certainly not individual error. At
         | places like NYSE you pretty much always have 2-3 people who
         | should be in a position to catch a mistake like this.
        
           | mcherm wrote:
           | > it was almost certainly not individual error. At places
           | like NYSE you pretty much always have 2-3 people who should
           | be in a position to catch a mistake like this.
           | 
           | That's exactly the point that is being made here. Either the
           | message being put out by the NYSE claiming this was an error
           | by one individual is true -- in which case, NYSE leadership
           | is to blame for setting up a process that allows catastrophic
           | consequences for a single individual's error, OR the message
           | being put out by the NYSE is a fabrication designed to
           | redirect blame at some scapegoat, in which case NYSE
           | leadership is to blame for putting out a false or misleading
           | statement.
           | 
           | [Edit: It seems I misunderstood -- attributing this to an
           | individual was done by reporters and rumors, not by a formal
           | statement from NYSE.]
        
         | reaperducer wrote:
         | _These "issue traced to staffer" stories sound like management
         | cover up for management/system shortcomings to me._
         | 
         | If you're going to move the blame up the food chain, might as
         | well blame the shareholders for giving the company money and
         | choosing to keep the upper management in place.
        
         | blippitybleep wrote:
         | Sounds like many places I've worked. I think most devs have had
         | a job like that.
        
         | GuB-42 wrote:
         | We can always blame management: since they make decisions,
         | including hiring, we can always trace problems back to
         | management. But it is as unhelpful as blaming the grunts.
         | Management has the job of making the company profitable; if
         | they don't, employees won't get paid, investors will lose money,
         | and ultimately the company will fail and customers won't get
         | service. And just like the "grunts", they are not perfect:
         | sometimes they make mistakes, sometimes they have to take
         | chances.
         | 
         | In fact, blaming anyone is unhelpful unless blatant misconduct
         | is the problem, and I don't think it is the case here. As
         | always, shared responsibilities. I just wish for a different
         | wording, something like "NYSE Tuesday opening mayhem traced to
         | a backup system not properly shut down". Leave the "staffer"
         | part to the technical report. It is useful information for
         | investigating the problem and fixing what needs to be fixed, but
         | it is inconsiderate for a press release.
        
           | Waterluvian wrote:
           | The difference is that one asserts authority over the other.
        
         | bigpeopleareold wrote:
         | I had a project that was really important once. I made a tiny
         | mistake that had a big consequence - a couple of hours of
         | potential lost revenues from our customers. I fixed my mistake
         | with both my boss and CEO nearby. I said after I pushed the fix
         | that I really need more resources around it. That little light
         | of "yeah, this is important" that should have flickered didn't.
         | :)
         | 
         | I will not be surprised if nothing gets fixed with the issue at
         | NYSE.
        
           | 1980phipsi wrote:
           | If you were to ask what is the probability that this specific
           | error happens again, I would think it would be pretty low.
           | Probably lower than a week ago. If you were to ask what is
           | the probability that some significant, costly error happens
           | again, I don't think the probability is that much lower than
           | a week ago.
        
           | smcin wrote:
           | But how much _actual_ lost customer revenue? Also, did the
           | customer even notice or not?
           | 
           | You're reminding me of the difference between engineers and
           | non-technical managers; to many of the latter something's
           | only a problem if/when the customer or senior mgmt are on the
           | phone complaining about it. Until then it's all engineers
           | being too pessimistic about process and risk.
        
         | hinkley wrote:
         | We have a Slack channel where we are expected to announce all
         | of our production changes.
         | 
         | Some updates are highly regimented, but a couple of the more
         | operational teams have discretion to deploy things outside of
         | that process, and most teams can flip feature toggles whenever
         | they want.
         | 
         | Point is that sometimes people will comment, or even veto
         | changes. We have a major customer visiting today, or the sales
         | team is at a conference. Don't touch anything or you might
         | break something.
        
         | wrldos wrote:
         | This. If an individual's mistake can take out your business you
         | have a process control problem and that is owned by management.
        
           | JumpCrisscross wrote:
           | > _individual's mistake can take out your business_
           | 
           | It didn't.
        
           | cm2187 wrote:
           | If you have processes where there is nothing an employee can
           | do to affect the outcome of the company, you have successfully
           | built a legacy bureaucracy that is waiting to be disrupted.
        
             | hgsgm wrote:
             | On the contrary, if a single employee can take out the
             | whole business, you are guaranteeing disruption.
             | 
             | There are many kinds of "outcomes". A simple backup would
             | make outages far more rare.
        
             | yourapostasy wrote:
             | In this specific case, I don't think that's necessarily the
             | outcome. Our industry has yet to accept a universally-
             | acknowledged equivalent of a lockout/tagout (LOTO)
             | interlock. There is no need for a bureaucracy if we have
             | cryptographically-enforced multisig Shamir secret sharing
             | keys where a LOTO prevents (in this case) a system from
             | spinning up while another system (the backup system
             | apparently in this case) is running. Allow it to be
             | overridden by a sufficiently senior manager or say a
             | sufficient number of lower-seniority managers, which leaves
             | an audit trail. Integrate it with change management,
             | notification, and secrets storage infrastructure, plus
             | infrastructure as code, and it encodes these infrastructure
             | dependencies into code that can be queried to auto-
             | construct change interlock sequences for a particular
             | desired state.
             | 
             | Of course, once you take advantage of such a representation
             | at scale by deploying tremendously more complex
             | infrastructures, you then have to deal with the dependency
             | network meta challenge lest you inadvertently fall into
             | dependency hell. While NP-hard problems lie that way,
             | they're still computable to a reasonable degree and I dare
             | say a more robust situation than doing it all by hand like
             | we do today.
             | 
             | The real challenge is the vast majority of devops staff
             | today would really dislike reasoning about such a
             | representation when it blows up in their faces, and I can't
             | blame them for that kind of reaction.
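
For readers who want the interlock idea in concrete form, here is a minimal sketch of a software lockout/tagout. It is an illustration under stated assumptions, not anything the NYSE or the commenter describes: the `Interlock` class, the system names "primary" and "dr", and the approver names are all hypothetical, and a plain count of distinct approvers stands in for the cryptographic multisig / Shamir secret sharing mentioned above.

```python
import time


class LockedOut(Exception):
    """Raised when a start is blocked by an active lockout tag."""


class Interlock:
    """Hypothetical lockout/tagout registry with quorum override and audit log."""

    def __init__(self, override_quorum=2):
        self.tags = {}          # locked-out system -> system holding the tag
        self.running = set()
        self.override_quorum = override_quorum
        self.audit_log = []     # append-only (timestamp, event, details) records

    def _audit(self, event, **details):
        self.audit_log.append((time.time(), event, details))

    def start(self, system, excludes=(), approvals=()):
        """Start `system` and tag out every system named in `excludes`."""
        holder = self.tags.get(system)
        if holder is not None:
            # Assumption: a simple count of distinct approvers replaces
            # real k-of-n cryptographic signing.
            approvers = set(approvals)
            if len(approvers) < self.override_quorum:
                self._audit("start_blocked", system=system, held_by=holder)
                raise LockedOut(f"{system} is tagged out by {holder}")
            self._audit("override", system=system, held_by=holder,
                        approvers=sorted(approvers))
        self.running.add(system)
        for other in excludes:
            self.tags.setdefault(other, system)
        self._audit("started", system=system)

    def stop(self, system):
        """Stop `system` and release every tag it holds."""
        self.running.discard(system)
        self.tags = {locked: holder for locked, holder in self.tags.items()
                     if holder != system}
        self._audit("stopped", system=system)


lock = Interlock(override_quorum=2)
lock.start("dr", excludes=["primary"])           # backup site comes up
try:
    lock.start("primary", excludes=["dr"])       # blocked while DR still runs
except LockedOut as err:
    print("blocked:", err)
lock.stop("dr")
lock.start("primary", excludes=["dr"])           # now allowed
```

Every blocked start and every override lands in `audit_log`, which is the part a change-management integration would consume. What it does not answer is the sensor-failure case: if the registry wrongly believes the other system is running, the override path is the only way out.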
        
               | quantgenius wrote:
               | It's very easy to talk about completely automated systems
               | and LOTO, and you need these when you have under-skilled
               | staff. The NYSE likely does NOT have undertrained staff.
               | If you have LOTO systems etc., what do you do when a
               | sensor fails and you can't figure out why your method for
               | checking whether the other system is running incorrectly
               | thinks it is? Do you allow the stock market to simply not
               | open?
               | 
               | What if multiple sensors fail or it's an ambiguous
               | situation like say you are deciding whether or not to
               | fail over a power circuit and it's a brownout but not a
               | complete power failure? What if there is a systemic
               | problem and it's likely the backup power source is going
               | to brown out too? At some point you need highly skilled
               | individuals, like say trained airline pilots flying a
               | plane who have the authority to override systems
               | immediately without having to jump through hoops.
               | 
               | This is especially true for mission critical systems.
               | Many of the mission critical systems we rely on are NOT
               | built on the cloud, i.e. other people's computers, because
               | you want to be really careful about what hardware you are
               | using and precisely how your data center is set up, and
               | want to make sure things like a noisy neighbor do not
               | impact you.
               | 
               | Like it or not, these highly trained individuals are
               | going to make mistakes every now and then. A failure like
               | this once every decade or so really isn't so bad. The
               | individual who made this error is likely not a "grunt". I
               | suspect the individual in question will not necessarily
               | suffer any major consequences as a result of this unless
               | it wasn't a mistake but a flagrant disregard for the
               | rules like say bringing a bottle of water into a data
               | center that then spilled or something.
               | 
               | Have you built a mission critical, distributed system
               | that hasn't failed for 10 years? It's a lot harder than
               | it looks. That's how often the NYSE has a problem like
               | this, about once a decade. A lot of things that work in
               | theory, don't work for the edge cases and things that
               | lead to problems once a decade or so are extreme edge
               | cases.
               | 
               | In the grand scheme of things a mucked up opening auction
               | is a minor problem, and anyone who did not take the
               | precaution of sending a limit order, sent a market-on-
               | open order despite it being standard practice to
               | essentially always use limits, and got hurt badly will be
               | made whole.
        
               | yourapostasy wrote:
               | _> If you have LOTO systems etc, what do you do when a
               | sensor fails..._
               | 
               | It pretty much boils down to: it depends upon what the
               | business wants to prioritize; operating margin or
               | resiliency. There is an entire subfield investigating the
               | statistical foundations of resiliency, and the general
               | case of N-modular redundancy is in practice implemented
               | as triple modular redundancy in most commercial systems
               | that want to spend in this vector.
               | 
               |  _> Like it or not, these highly trained individuals are
               | going to make mistakes every now and then._
               | 
               | Absolutely, and here is where the organization's no-blame
               | learning culture swings into action for the well-led
               | teams.
               | 
               |  _> It's a lot harder than it looks._
               | 
               | We all know this, and we can all help each other get
               | better to deliver ever increasing value to our customers
               | by sharing what works for the context we deployed within!
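
To make the triple modular redundancy point concrete, here is a small sketch of a majority voter over three independent "is the other site running?" checks. The function name and the status strings are assumptions made for illustration only; nothing here reflects how any exchange actually wires its sensors.

```python
from collections import Counter


def majority_vote(readings):
    """Return the value reported by a strict majority of readings, else None."""
    if not readings:
        return None
    value, count = Counter(readings).most_common(1)[0]
    return value if count > len(readings) / 2 else None


# One faulty sensor is outvoted by the other two.
print(majority_vote(["running", "running", "stopped"]))   # running
# No strict majority: the voter declines to decide.
print(majority_vote(["running", "stopped", "unknown"]))   # None
```

The `None` case is where the earlier objection bites: when the redundant checks disagree, the sketch has nothing to offer and the decision goes back to a person with the authority to override.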
        
               | quantgenius wrote:
               | You don't get to major failures once a decade (or less)
               | on systems this complex without understanding and in fact
               | being on the cutting edge (likely ahead of what you read
               | in journal articles written by academics) of the
               | statistical foundations of resiliency, n-modular
               | redundancy etc.
               | 
               | In real-life outside of a journal article, it's a lot
               | harder than just deciding whether you want to prioritize
               | operating margin or resiliency at 5000 feet.
               | 
               | In real life when these sorts of edge cases happen, you
               | have to understand in minutes or sometimes seconds the
               | tradeoffs in terms of costs to your own company and your
               | customers of one of n specific possible failure modes and
               | risk-manage so you minimize the probability of the
               | catastrophic outcomes. This sometimes may involve
               | increasing the probability of low cost bad outcomes. You
               | can't reason about this stuff before hand. If you could,
               | you would have designed your system to not fail in that
               | manner.
        
             | itsoktocry wrote:
             | > _If you have processes where there is nothing an employee
             | can do to affect the outcome of the company you
             | successfully built a legacy bureaucracy that is waiting to
             | be disrupted._
             | 
             | Exactly.
             | 
             | I wonder if any of the people claiming "it's management's
             | process fault!" would be the first to complain about their
             | workplace where they have no autonomy.
        
           | iancmceachern wrote:
           | Exactly, management shouldn't allow this kind of situation to
           | occur; they should design the company's processes so that
           | there are checks and balances.
        
           | gonzo41 wrote:
           | Everyone makes mistakes. HOWEVER, modern leaders get to the
           | heads of large organizations by never making mistakes, by
           | blaming the little guy or a coworker, and hustling up. They'll
           | only fix this because they have to. There isn't a problem
           | until it happens.
        
           | thisarticle wrote:
           | The days of management taking responsibility for anything are
           | over. See: not a single CEO stepping down for over hiring.
        
             | wonderwonder wrote:
             | This is because the CEO's core job is to raise stock price.
             | Nothing else. They hired in covid and profits & share price
             | spiked due to the economic state at the time. Now the
             | economic state has changed so they fire employees and the
             | stock goes up. By that metric, the CEO will get a bonus at
             | the end of the year. CEO does not get a bonus for not
             | laying people off. Employees are not humans once you get to
             | the csuite. An employee can be a person but multiple
             | employees are just numbers on a ledger. They just send out
             | "I'm sorry" emails to placate the masses and to get good
             | media, no one really cares if the lower level people are
             | upset. You only count once you get to a certain level.
        
             | duckmysick wrote:
             | There's plenty of companies replacing their CEOs. Just
             | today Toyota announced theirs.
        
               | Octoth0rpe wrote:
               | The CEO of Toyota is becoming the chairman of their
               | board; that doesn't feel like a CEO being replaced as
               | punishment for poor performance in the way that people
               | are talking about in this thread. But even when CEOs are
               | fully ousted over issues, the golden parachute makes it
               | barely feel like a punishment anyway. I'm having trouble
               | thinking of a case where a CEO actually seemed to be
               | significantly financially impacted by such an event,
               | though maybe FTX will provide an example shortly.
        
               | hgsgm wrote:
               | Are the golden parachutes bigger (as % of annual comp)
               | than employee severance packages?
        
               | Octoth0rpe wrote:
               | Do you think that honestly matters in the 10s of millions
               | of dollars range? I certainly don't. The problematic
               | parachutes in question are beyond enough for an
               | excessively wealthy standard of living for the rest of
               | their natural lives, even if it's proportionally smaller.
               | Whether or not CEO comp should be as high as it currently
               | is is another question entirely.
        
             | lotsofspots wrote:
             | Oh, they take full responsibility, it always says so in the
             | mails they send out. It's just that taking responsibility
             | doesn't appear to actually result in anything happening.
        
               | parsimo2010 wrote:
               | A cynical take might be that they are saying that they
               | take responsibility (credit) for reducing the monthly
               | payroll expenses. They may also have overhired in the
               | past, but what's in the past was already paid for. The
               | savings next month is how they justify a large paycheck.
        
               | shapefrog wrote:
               | Macroeconomic changes have made it impossible for me to
               | want to pay you
               | 
               | https://news.ycombinator.com/item?id=34515267
        
               | j33zusjuice wrote:
               | Their punishment is in bearing the shame of having been
               | wrong. That's the price of leadership.
        
               | randomdata wrote:
               | What shame is there in being wrong? Being wrong is the
               | ideal state, paving a path to gaining an education, which
               | is a source of pride and a benefit.
        
             | [deleted]
        
             | noisy_boy wrote:
             | I don't see why we reward scale-out/scale-in in the cloud
             | but punish CEOs when they do the same with real people /s
        
               | hgsgm wrote:
               | How will those poor decommissioned computers get enough
               | bytes to feed themselves?
        
             | oxfordmale wrote:
             | They are taking responsibility. They are just delegating
             | the consequences to their staff. I suspect this will change
             | soon. Activist investors are already surrounding companies
             | like Salesforce and I can see CEOs being promoted sideways
             | (board member only).
        
             | itsoktocry wrote:
             | > _See: not a single CEO stepping down for over hiring._
             | 
             | Wait, what? You think a CEO should _step down_ because
             | their management over-hired a relatively small proportion
             | of employees and had to do some layoffs?
        
               | hgsgm wrote:
               | It's not relatively small. All the companies are
               | experiencing similar chaos to the NYSE because people in
               | the middle of important operational work suddenly
               | vanished. The people laid off weren't idle like H&R Block
               | tax preparers in May or Target clerks in January.
               | 
               | The people laid off and the people not needed were a
               | different set of people, at the time of the layoff.
        
               | horsawlarway wrote:
               | Not really sure why you're getting downvoted, other than
               | to assume an emotional reaction from the community to
               | layoffs impacting tech.
               | 
               | Frankly - People seem to be forgetting that until 2013,
               | MS was still doing stack ranking and routinely letting go
               | of the bottom 10% of their workforce (and they were
               | hardly the only ones doing it...)
               | 
               | I don't see it as unusual _AT ALL_ that these companies
               | are doing a wave of cuts to headcounts after the large
               | hiring sprees during covid. Especially as interest rates
               | rise, so they're looking to lower debt burdens in the
               | short term and pay off loans made at low interest rates
               | instead of rolling into a higher interest loan in the new
               | environment.
               | 
               | If anything... I'd expect the exact opposite - a CEO that
               | fails to address cost centers as debt becomes more
               | expensive is a liability, and someone the board might be
               | looking to replace (ask to step down).
               | 
               | ---
               | 
               | Does that mean I'm not sympathetic to those who've lost
               | jobs? Of course not.
               | 
               | But tech had to rev the engine pretty hard to handle the
               | extra load during covid when everyone was indoors and
               | doing things online, and now that demand has dropped. So
               | they're letting off the gas pedal.
               | 
               | If folks don't like it - blame the game. Work to
               | unionize. Work to incentivize co-ops and shared
               | ownership. Work to increase taxation on these companies
               | and their highest earners (which... if you're in the tech
               | industry almost certainly includes _YOU_ ). Don't go work
               | for giant tech conglomerates and then act surprised when
               | they act like giant tech conglomerates...
        
               | DrBazza wrote:
               | I'll make an extreme comparison:
               | 
               | "Kill one man, and you are a murderer. Kill millions of
               | men, and you are a conqueror" [1]
               | 
               | If you make some idiotic financial decision near the
               | bottom of the management tree, such as... over hiring,
               | you'll likely lose your job or get demoted.
               | 
               | Do it as a CEO, and get a huge bonus.
               | 
               | [1] https://en.wikipedia.org/wiki/Jean_Rostand
        
               | zaroth wrote:
               | But it's absurd. Companies are not supposed to _only ever
               | hire_.
               | 
               | Some things are cyclical and you need more people for
               | some amount of time, and then you find you need less.
               | It's not always predictable/seasonal like farming or
               | holiday rush.
               | 
               | Is it wrong for a company to respond to market effects?
               | That there was a layoff isn't necessarily a sign a
               | company did anything wrong... I think how they actually
               | do the layoff certainly can be done well or poorly.
        
               | DrBazza wrote:
               | It's not hiring though. It's overhiring.
               | 
               | I've forgotten which FAANG it is. But one of them still
               | has more employees than last year even after layoffs.
               | It's offensive.
        
               | usefulcat wrote:
               | So if they "under hire", should they step down for that
               | too?
               | 
               | Maybe they should step down any time they fail to
               | accurately predict the future?
        
               | SpeedilyDamage wrote:
               | Offensive? I'm... honestly, baffled. How could one tech
               | company's ability to hire many more people actually
               | offend you?
        
               | horsawlarway wrote:
               | It's a response to extreme demand during covid. When -
               | you know - online service usage was at all time highs
               | because everyone was stuck inside and doing things
               | online.
               | 
               | It was likely the right call to hire then, just like it
               | might be the right call to reduce headcount now.
        
               | icedchai wrote:
               | Why is it offensive? Over-hiring has been a thing since
               | at least the first dot-com boom. One's managerial power
               | is directly proportional to how many "reports" they have
               | under them. I worked at one company that raised a decent
               | A round. We immediately rented another office down the
               | street, spent close to 2 million on renovations, then
               | filled it with anyone who could spell HTML. The B round
               | was even larger, so the cycle continued (until late 2001
               | or so.)
        
             | logifail wrote:
             | > The days of management taking responsibility for anything
             | are over. See: not a single CEO stepping down for over
             | hiring
             | 
             | The list of managers stating that "they were taking
             | responsibility" _and then immediately stepping down_ was
             | always fairly short.
        
             | cc81 wrote:
             | They only take responsibility for the profit margins. Over
             | hiring affects those, but often not significantly enough,
             | and it can be corrected with layoffs.
        
               | hgsgm wrote:
               | No, they only take responsibility for short-term market
               | cap. Margins and profit don't matter. That's why they
               | chase whatever fad hits the investor class.
        
         | PragmaticPulp wrote:
         | > These "issue traced to staffer" stories sound like management
         | cover up for management/system shortcomings to me.
         | 
         | At the end of the day, the engineers are responsible for the
         | engineering. Managers are responsible for managing. Shifting
         | all responsibility for execution issues on to management can
         | give warm fuzzies, but in reality managers aren't all powerful
         | in shaping execution by engineers.
         | 
         | Companies that put all blame on managers when things fail are
         | inevitably encumbered with excessive micromanagement, as the
         | managers are effectively saddled with responsibility for
         | execution as well.
         | 
         | The article was purely anonymous. I don't think it's fair to
         | assume they're jumping to blame or fire individual engineers.
        
           | willcipriano wrote:
           | Do engineers have authority over engineering? Can they
           | overrule management on engineering issues? Whoever takes the
           | authority gets the blame.
        
         | NovemberWhiskey wrote:
         | Look, it's totally OK to recognize that a human action was the
         | trigger for an incident - i.e. the causal chain for this
         | specific incident started there. That's not the same thing as
         | saying the human action was the root cause, and I hope by-and-
         | large any kind of baseline competent engineering organization
         | has gotten to that level of thinking by now.
        
         | baby wrote:
         | It's also true that in systems like this there exist many
         | single points of failure. There's a reason decentralized
         | systems are seeing a rebirth.
        
         | credit_guy wrote:
         | It does not sound like cover up to me.
         | 
         | It was simply the explanation of what happened. I didn't get
         | any hint that the said "staffer" will be fired or otherwise
         | punished.
         | 
         | Is there a problem with a system that lacked the safeguards to
         | stop this from happening? For sure, but then no system is
         | perfect. This glitch does not happen every day. From memory, I
         | remember a NASDAQ glitch at Facebook's IPO. Let's say there are
         | 2 or 3 glitches like that for major exchanges in one decade.
         | How can you design a system that prevents bugs that show up
         | once a decade?
        
       | corobo wrote:
       | Oh we're still doing scapegoats?
       | 
       | If your system can be hosed by a single person the system is at
       | fault. Start with the scapegoat's manager.
        
       | anonu wrote:
       | Tech can be so fragile. You do everything right, trade millions
       | of shares everyday and handle billions of dollars. But you forget
       | to run one script to shut down a backup system and everything
       | comes crashing down: your reputation in tatters, millions in
       | costs to settle bad trades, barbarians at the gates.
        
       | tgtweak wrote:
       | https://www.nyse.com/publicdocs/support/DisasterRecoveryFAQs...
       | 
       | > Question: Can I connect to both the production and the DR site
       | at the same time?
       | 
       | Answer: No, only one site is available at a time. When the
       | primary site is up, the DR site is down; and when the DR site is
       | activated, the primary site is down.
       | 
       | I think they need to update these docs to say /should/ be down
        
       | irthomasthomas wrote:
       | Is it normal to simply accept the word of an anonymous source for
       | something so important? I genuinely don't know, anymore, but it
       | doesn't seem like a good idea. I'd rather wait for a more
       | thorough investigation. Especially when the story from these
       | sources boils down to "Kevin was in charge of booting the NYSE
       | App that morning, but he was late for work. He had a good excuse,
       | though, he flaked! We'll have the chap straight up for lunch, no
       | question".
       | 
       | Edit: I also note that this piece is lacking the traditional "The
       | NYSE did not respond to a request for comment".
        
         | reaperducer wrote:
         | I don't know how Bloomberg works, but the New York Times has a
         | very clear and public policy about using anonymous sources.
         | 
         | There's usually a link to it in the middle or end of any story
         | it publishes using an anonymous source.
         | 
         | The Times isn't Bloomberg, but it might give you some insight
         | into how these things work.
        
         | itsoktocry wrote:
         | > _Is it normal to simply accept the word of an anonymous
         | source for something so important?_
         | 
         | Anonymous means they aren't revealing the source, not that
         | Bloomberg doesn't know who the source is, or what they do.
        
           | irthomasthomas wrote:
           | I know that. I am referring to you and me, the _readers_,
           | accepting the word of the anonymous source. Combined with the
           | fact that they apparently did not ask for a comment from the
           | NYSE before publishing this. Or if they did, they neglected
           | to mention it.
        
             | maronato wrote:
             | We aren't accepting the word of the anonymous source. We're
             | accepting Bloomberg's word that the source is reliable.
        
               | adolph wrote:
               | Bloomberg's word:
               | https://news.ycombinator.com/item?id=19526348
        
               | bink wrote:
               | Some of us aren't "accepting" anything. We're just
               | reading about a potential cause of an incident and
               | speculating about how it could happen to us or could have
               | been prevented. Just because we're reading this article
               | and commenting here doesn't mean we just believe
               | everything that we read. The post-mortem will come out
               | soon enough and we'll read that and comment again.
        
       | lr1970 wrote:
       | From the article:
       | 
       | > Meanwhile, market professionals and day traders are rattled and
       | waiting for the exchange to elaborate on what it publicly called
       | a "manual error" involving its "disaster recovery configuration".
       | 
       | Oh, I love it -- a disaster caused by "disaster recovery
       | configuration" :-)
        
         | gjvc wrote:
         | _Oh, I love it -- a disaster caused by "disaster recovery
         | configuration" :-)_
         | 
         | People install failover configurations to minimise time-to-
         | repair or time-to-resume service (and some customers' contracts
         | will demand this). This is at the expense of another layer of
         | stuff to go wrong, and raising the possibility that it fails
         | over when it shouldn't, causing brief but embarrassing outages.
         | 
         | It's possible in some such situations that, on the balance of
         | probabilities, introducing mechanisms like this causes more
         | disruption _over time_ than they were intended to protect
         | against, and that this is more widespread than often
         | considered. Still, their operational cost must be borne in
         | order to satisfy the clause in the customers' contracts.
        
       | jrochkind1 wrote:
       | > That misled the exchange's computers to treat the 9:30 a.m.
       | opening bell as a continuation of trading, and so they skipped
       | the day's opening auctions that neatly set initial prices.
       | 
       | I didn't even know about this process. I don't know much about
       | trading, but it surprises me that there is a separate process for
       | setting prices at the start of trading, and that if it's missed,
       | chaotic prices result.
       | 
       | Is this related to how stock markets aren't really ever open 24
       | hours? Do they need that reset to function in a stable way?
        
         | khold_stare wrote:
         | Worked in HFT for a few years. The reason why most markets are
         | not open 24 hours is more human, and just historical - aligned
         | with people's 9-5 workday. There are also pre open and post
         | close sessions of trading but it's much less liquid. Futures
         | markets are open almost 24 hours. Even there, it's down for
         | some time daily. Personally I think it's actually inertia that
         | keeps existing markets this way - the systems of the exchanges
         | and participants were designed with the assumption that they
         | will have daily downtime, so it's hard to change. It's also
         | dependent on how banking and settlement works - a lot of stuff
         | happens after the trading ends. Batch processes run as
         | different institutions settle their trades between each other,
         | etc etc.
         | 
         | Now, as a result, there needs to be a way to set the opening
         | price and closing price, like a bootstrap process. A smaller
         | version of this process actually happens every time a stock
         | gets halted and resumed.
         | 
         | An exchange has an order book - orders of things people want to
         | buy and sell at different prices. During normal operation the
         | buy and sell orders don't overlap in the order book - if two
         | people want to buy and sell at the same overlapping price, they
         | just get matched by the exchange at that moment. Unmatched
         | orders stay in the order book data structure until a matching
         | order comes along. The "price" you see in charts is just the
         | midpoint between the highest buy and lowest sell price in the
         | order book.
         | 
         | Now, if the order book is empty, what the heck is the price?
         | That's what the opening auction needs to solve. The way it
         | works is that people can start placing orders ahead of the
         | opening bell, but they won't get matched until the open. So
         | before the open, the order book is getting filled with orders,
         | but crucially the _orders will overlap_. This "crossed" order
         | book is a no no during normal trading, but ok before the
         | opening auction. When the auction comes, a price is picked
         | which maximizes the amount of orders filled (it's more nuanced
         | than that, but bear with me). Imagine you pick a price in the
         | overlapping region of the order book - every buy order that has
         | a higher price than that will match with every sell order that
         | has a price lower than that. They will get matched and executed
         | at the opening price, and BAM, you have an uncrossed order
         | book, full of orders.
         | 
         | If the auction doesn't happen, and you just open the stock,
         | then all hell breaks loose. Many things can go wrong here.
         | Firms connected to the exchange may have code that assumes a
         | book is not crossed (or at least not as crossed as it would be
         | during an auction) causing wild behavior. The exchange itself
         | could start matching orders haphazardly in the overlapping
         | region, causing those "price swings" that the article talked
         | about.
         | 
         | Can't imagine the panic that day haha.
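         | 
         | A minimal sketch of that uncrossing step, assuming the simplest
         | possible rule: try every limit price in the book and keep the one
         | that fills the most shares, ties going to the lowest price. Real
         | auctions are more nuanced than this, and Order and auction_price
         | are made-up names for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    side: str   # "buy" or "sell"
    price: int  # limit price (whole dollars, to keep the toy simple)
    qty: int    # number of shares


def auction_price(orders):
    """Return (price, matched_qty) for the price that fills the most shares.

    Simplifying assumption: candidate prices are just the limit prices
    present in the book, and ties go to the lowest candidate.
    """
    best_price, best_qty = None, 0
    for p in sorted({o.price for o in orders}):
        # Buyers willing to pay at least p, sellers willing to accept at most p.
        demand = sum(o.qty for o in orders if o.side == "buy" and o.price >= p)
        supply = sum(o.qty for o in orders if o.side == "sell" and o.price <= p)
        matched = min(demand, supply)
        if matched > best_qty:
            best_price, best_qty = p, matched
    return best_price, best_qty


# A crossed pre-open book: the best buy (101) is above the best sell (98).
book = [
    Order("buy", 101, 300),
    Order("buy", 100, 200),
    Order("buy", 99, 400),
    Order("sell", 98, 250),
    Order("sell", 100, 300),
    Order("sell", 102, 500),
]
price, qty = auction_price(book)
print(price, qty)
```

         | With this toy book the auction picks 100 and fills 500 shares.
         | What is left over (a buy at 99, sells at 100 and 102) no longer
         | overlaps, which is the uncrossed book described above.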
        
           | jrochkind1 wrote:
           | Very helpful and clear, thank you.
           | 
           | > Now, as a result, there needs to be a way to set the
           | opening price and closing price, like a bootstrap process. A
           | smaller version of this process actually happens every time a
           | stock gets halted and resumed.
           | 
           | So this suggests that if you _did_ have a hypothetical
           | exchange that ran 24/7... and something unusual happened to
           | make trading halt completely (which always is going to happen
           | occasionally, whether 9/11 level or more frequently)... you
           | would still need to have that "bootstrap" process in place to
           | re-start trading.
           | 
           | But if you normally ran 24/7, you'd have a process that you
           | maybe had never used, or hadn't used in years!
           | 
           | This maybe provides another justification that isn't just
           | historical for having exchanges shut down every day. So you
           | are at least testing the bootstrap process daily, you don't
           | have a bootstrap process you're going to need in an emergency
           | (the worst time to have further problems) that has actually
           | just been sitting around unused for years!
           | 
           | (Reminding me of making sure you test your backup and
           | continuity processes regularly, right? And the irony here is
           | that it's the backup/continuity processes which are alleged
           | to have caused the issue here! but still, you need the
           | backup/continuity processes...)
        
           | johnbcoughlin wrote:
           | Matt Levine suggested that the chaos after opening was mainly
           | due to market orders executing at ridiculous prices. Like, a
           | limit buy for half the "real price" is the first buy order to
           | get in the door, and that gets matched with a market sell
           | order.
           | 
           | Does that track with your understanding?
        
             | khold_stare wrote:
             | Yes! I almost forgot about market orders because trading
             | firms never use market orders for this exact reason - you
             | have no control over the price if things go bad. Most flash
             | crashes are exacerbated by runaway market orders and stop
             | orders for example.
             | 
             | A buy market order would try to match with the "best price"
             | which in a deeply crossed book would mean matching with a
             | really low priced sell order. Exchanges match orders in
             | price-time priority. Similar is true for a market sell
             | order - would match at an extreme high price.
             | 
             | Besides the midpoint of the order book, another metric for
             | a "current price of the stock" people use, is the "last
             | trade price". In the situation above you would get "swings"
             | in the price because market orders would be trading very
             | high and very low if they alternate between buying and
             | selling. The data structure on the exchange itself isn't
             | "swinging", it's just the overlapping region being slowly
             | eroded by market orders. The "last trade price" metric
             | looks really insane in this situation.
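             | 
             | A toy illustration of that, assuming one-share orders and
             | that each market order simply takes the most favorable
             | resting price on the other side. The function name and the
             | numbers are made up:

```python
import heapq


def last_trade_prices(bids, asks, market_sides):
    """Replay one-share market orders against a crossed book.

    bids / asks are resting limit prices, one share each. A market buy
    takes the lowest ask and a market sell takes the highest bid.
    Returns the sequence of trade prices (the "last trade price" tape).
    """
    bid_heap = [-p for p in bids]  # max-heap via negated prices
    ask_heap = list(asks)          # min-heap
    heapq.heapify(bid_heap)
    heapq.heapify(ask_heap)

    prints = []
    for side in market_sides:
        if side == "buy" and ask_heap:
            prints.append(heapq.heappop(ask_heap))
        elif side == "sell" and bid_heap:
            prints.append(-heapq.heappop(bid_heap))
    return prints


# Crossed book: bids reach up to 105 while asks reach down to 95.
bids = [105, 103, 101, 99]
asks = [95, 97, 100, 102]
prints = last_trade_prices(bids, asks, ["buy", "sell", "buy", "sell"])
print(prints)
```

             | Alternating market buys and sells print 95, 105, 97, 103:
             | the tape whipsaws while the book itself is only being eaten
             | away from both ends of the overlap.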
        
         | toast0 wrote:
         | FYI, there's a similar auction for closing, too. The closing
         | price isn't just a race for the last trade under the buzzer;
         | there's a process where at some number of minutes before close,
         | you can put in orders for close or realtime, and then magic
         | happens.
        
         | xyzelement wrote:
         | That's right. In short it's something like this: stocks trade
         | _on their primary exchanges_ during specific hours. For example
         | 9:30 to 4 in the US.
         | 
         | Part of it is legacy from when trading was done by actual
         | humans being at the exchange physically to trade during those
         | times and part of it (I would guess is still the case) is to
         | allow plenty of non-trading hours for back-office jobs and
         | settlement.
         | 
         | So yes there's a special start of day process that runs at 9:30
         | that runs through all the orders on the books at that time and
         | determines a price at which some optimal set of those orders
         | can trade, trades them at that price, and also posts that price
         | as the Open price for the day.
         | 
         | The process is different during continuous trading since orders
         | are one by one matched against the order book.
         | 
         | Source: ran one of the world's largest equity platforms for 5
         | years.
        
           | ajoseps wrote:
           | isn't there another component to the NYSE auction where the
           | DMM has some input into what the closing/opening price
           | actually is?
        
             | khold_stare wrote:
             | Yes. In my reply to the first comment I mentioned setting
             | the opening price is "more complicated". Every exchange has
             | their own system for the opening auction which you buy into
             | when you list with a particular exchange. Most exchanges
             | have an algorithmic way of calculating the price. For NYSE,
             | it's again more historical. A Designated Market Maker (DMM)
             | for a stock technically determines the opening price. There
             | is a person physically on the NYSE trading floor who
             | represents the DMM firm who technically opens the different
             | stocks. They have a weird custom keyboard from NYSE for
             | this purpose...
             | 
             | The price is usually calculated algorithmically by the DMM
             | firm and sent to the person at NYSE to approve. Pretty
             | arcane. Also somewhat shady, as the DMM firm can be and is
             | part of the auction themselves. DMM firms can analyze the
             | order book to see what the imbalance is in the overlapping
             | region, and place an order of their own to correct the
             | imbalance and then set the opening price. I can see how one
             | can profit from this in certain situations
        
               | ajoseps wrote:
               | I didn't realize the floor broker was actually involved
               | with setting the opening price. I always wondered what
               | the incentive was to access floor feeds for opening
               | auctions
        
       | papito wrote:
       | I sweat over "idiot-proofing" the smallest systems, while multi-
       | billion dollar operations don't seem to care enough.
       | 
       | Like the S3 being blown away with a simple change in the early
       | days, or GitHub running a test suite with production settings.
       | It's like the FIRST thing I think about when starting a project.
       | 
       | https://github.blog/2010-11-15-today-s-outage/
        
       | afhammad wrote:
       | It seems that this wasn't as routine as these things ought to be
       | but rarely are.
        
         | H8crilA wrote:
         | Yes it could have also been a test of potential escalation from
         | Solomon Islands.
        
       | kube-system wrote:
       | I spent an hour trying to figure out why my new stock purchase
       | had disappeared from my account. I had an order placed for
       | opening on Tuesday morning, and I guess I was affected by the
       | trade cancellations. Which is totally weird, because they showed
       | up in my account on Tuesday morning after opening.
        
       | herpderperator wrote:
       | If the trades are being cancelled, are they going to correct the
       | chart data? Right now it looks very misleading on the daily[0],
       | weekly, monthly, quarterly, yearly etc for large caps that trade
       | quite steadily otherwise. I do understand that this would be a
       | challenging effort as that data already flowed to and was stored
       | by all the broker-dealers, but I think it should be done.
       | 
       | [0] https://www.dropbox.com/s/6jdmgkdyei9xqz0/mcd.png?dl=0
        
         | anonu wrote:
         | Technically yes, historical market data feed needs to be
         | cleaned up. Which will be a nightmare for every single person
         | who maintains one...
         | 
         | Which is also why exchanges are very reluctant to mass cancel
         | trades. The knock on effect goes beyond just market data feeds
        
         | evanpw wrote:
         | The same feed that publishes trades also publishes trade busts,
         | so it's up to whoever's consuming it downstream to take care
         | of.
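         | 
         | Roughly what that looks like for a downstream consumer, as a
         | sketch only. The message layout here (dicts with "type", "id",
         | "price", "qty") is a hypothetical stand-in, not any real feed
         | format:

```python
def daily_bar(events):
    """Build an OHLCV bar from feed events, dropping busted trades.

    events is an iterable of dicts in feed order, either
      {"type": "trade", "id": ..., "price": ..., "qty": ...}
    or
      {"type": "bust", "id": ...}
    (a made-up schema for illustration). Returns None if no trades survive.
    """
    trades = {}  # dicts keep insertion order, so feed order is preserved
    for event in events:
        if event["type"] == "trade":
            trades[event["id"]] = event
        elif event["type"] == "bust":
            # A bust cancels an earlier trade; ignore unknown ids.
            trades.pop(event["id"], None)

    if not trades:
        return None
    prices = [t["price"] for t in trades.values()]
    return {
        "open": prices[0],
        "high": max(prices),
        "low": min(prices),
        "close": prices[-1],
        "volume": sum(t["qty"] for t in trades.values()),
    }


events = [
    {"type": "trade", "id": 1, "price": 150, "qty": 100},  # bad opening print
    {"type": "trade", "id": 2, "price": 270, "qty": 50},
    {"type": "trade", "id": 3, "price": 271, "qty": 70},
    {"type": "bust", "id": 1},
]
bar = daily_bar(events)
print(bar)
```

         | Once the bust is applied, the bad print at 150 drops out and the
         | bar opens at 270 instead. Anyone who stored the bar before the
         | bust arrived has to recompute it.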
        
       | ynniv wrote:
       | It's easy to throw shade when Bloomberg writes an article that
       | puts the blame on "a staffer". Having worked near some of these
       | systems, the engineering and process are actually quite good. How
       | many companies publish their private network topology, service
       | p99.9 in microseconds, and detailed pricing on the open web?
       | They're in a painfully competitive global market that's
       | ambivalent to names on buildings.
       | 
       | In a week or so there will be a comprehensive internal post
       | mortem, and every engineer in the company will read it because
       | that's why they work there. "The staffer" will not be named, nor
       | will they be fired. The process will be changed. The systems will
       | be changed. You probably haven't heard of Pillar, but the NYSE in
       | your head was replaced by some pretty amazing, distributed, low
       | latency systems. The culture is to over-engineer, over-provision,
       | plan for black swans. And test. That it works. Test that it
       | scales. Test that backups work. Test, test, test. _firmitatis,
       | utilitatis, venustatis_. This failure was due to daily testing.
       | 
       | Sometimes things still fail. That's true anywhere. In most places
       | your failures don't make the papers, and accidents are swept
       | under the rug. That doesn't happen at NYSE for obvious reasons.
       | They're not building large language models (that I know of), or
       | self driving cars (pretty sure on this one), but they're a
       | modern, cutting edge, "soft" real-time engineering shop. If you
       | haven't looked already, you might find something interesting
       | there: https://www.ice.com/careers
        
         | bob1029 wrote:
         | I've thought about getting into this... The stuff they work on
         | is so incredible to me.
         | 
         | Here's a quote from their Pillar product page:
         | 
         | > Up to a 95% Reduction in Latency: The roundtrip latency on
         | NYSE Pillar order entry sessions via Pillar matching engines
         | has been reduced from ~592ms to ~32ms for FIX and from ~96ms to
         | ~26ms for Binary, getting client orders into the market much
         | faster. With a 92% improvement in the 99th percentile latency
         | results, clients can also have more confidence in improved
         | performance consistency regardless of market conditions.
         | 
         | Reading stuff like this makes my current work feel stupid by
         | comparison.
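         | 
         | The "up to 95%" headline can be checked against the quoted
         | round-trip numbers with a couple of lines (reduction_pct is just
         | a helper name I made up):

```python
def reduction_pct(before, after):
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100


fix = reduction_pct(592, 32)    # FIX sessions, figures from the quote
binary = reduction_pct(96, 26)  # Binary sessions, figures from the quote
print(round(fix, 1), round(binary, 1))
```

         | That works out to about 94.6% for FIX and 72.9% for Binary, so
         | the "up to 95%" refers to the FIX number.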
        
           | davidf18 wrote:
           | [dead]
        
           | nubb wrote:
           | wonder how they measure this and is this smart engineering
           | from the exchange or just new fast network gear.
        
             | DontchaKnowit wrote:
             | Just an educated guess (I'm in the same industry, have
             | worked on some networking related stuff), but I think it is
             | probably mostly network hardware and architecture. You can
             | only improve so much from the code, the networking is where
             | all the latency comes from.
        
           | Razengan wrote:
           | > _Reading stuff like this makes my current work feel stupid
           | by comparison._
           | 
           | It makes our economic system seem stupid. Jesus, we're not
           | calculating astrophysics or quantum mechanics. A made-up
           | system should not require or depend upon this kind of speed
           | or precision. Maybe we should chill.
           | 
           | Reminds me of those pro StarCraft players who keep
           | unnecessarily clicking the mouse to keep their APM (actions
           | per minute) stat high.
        
             | DontchaKnowit wrote:
             | Absolutely agree. but "liquid markets are important" or
             | something. _rolls eyes_
             | 
             | This is just another step in the endless journey of
             | widening the gap between your average Joe and someone with
             | access to high level financial services.
        
         | blibble wrote:
         | do they still do UDP packet loss replay over email?
        
         | BirAdam wrote:
         | I worked for NYSE's parent, ICE, and I have to agree with this.
         | While there were many things I didn't like about working there,
         | the tech and the management weren't involved in those things. A
         | similar problem to this happened while I was working there, but
         | it was on Endex and not NYSE. Management spoke with the
         | responsible party, but he wasn't fired and no punitive actions
         | were taken against him. The blame game also wasn't played. The
         | team just decided to provide more eyes on the process, change
         | the interface of the tools a bit, and move on. The company
         | itself did face hefty fines for the screw up tho. Ultimately,
         | the issues at ICE/NYSE are due to a highly bureaucratic
         | structure and to onerous regulations forcing parts of that
         | structure to exist. Given those two problems, I think ICE does
         | extremely well.
        
         | jacquesm wrote:
         | Just naming a 'staffer' though seems to already be a way to
         | apportion blame to a segment of the employees, insulating
         | management from what was done. Named or not doesn't really
         | matter, clearly blame is being assigned.
        
           | bostonsre wrote:
           | Yea, it sounds like an issue with process and automation. It
           | shouldn't have been possible for the staffer to make a
           | mistake that would cause this.
        
             | jacquesm wrote:
             | Precisely. It is never just one error. At a minimum two and
             | if you really stare at this sort of thing long enough it
             | isn't rare at all to discover a whole chain of them. The
             | only difference with all the times that it went right is
             | that this time everything was aligned 'just so'.
        
           | ynniv wrote:
           | I think those are Bloomberg's words, or their paraphrasing of
           | the grapevine. The high level people that I knew there
           | weren't petty, and everyone was of the opinion that it didn't
           | matter who clicked the button: we were all in the same boat.
        
         | ngz00 wrote:
         | I worked there and I can say that this is not accurate at all.
         | It is very much a blame culture. I've seen people fired for
         | less severe incidents. Beyond the core technology of the Pillar
         | engine, the place is not comparable to a modern tech company in
         | almost any way.
        
           | throwaway122095 wrote:
           | As somebody who worked with them as a client, I can confirm
           | this. There is currently a spec-level bug with their core
           | Pillar engine and it was essentially bounced between several
           | different teams and ultimately ignored as nobody's problem.
        
             | mynameisvlad wrote:
             | So basically like any other medium to large company? This
             | doesn't sound unique in the slightest.
        
               | uoaei wrote:
               | I would think that _the company being a securities
               | exchange_ would factor into the analysis. Don't you?
        
               | mynameisvlad wrote:
               | How does them being a securities exchange in any way
               | affect the analysis of their software engineering
               | practices? They're not some special snowflake, they can
               | suffer the same software engineering and business process
               | issues as other companies.
        
               | sandworm101 wrote:
               | >> They're not some special snowflake
               | 
               | But they are. The consequence of a one-day or one-hour
               | shutdown on their system is exponentially worse than most
               | any other. I would expect them to have more rigorous
               | systems, including more rigorous attention to
               | development. Comparing the NYSE to any other business is
               | like calling Fort Knox just like any other bank vault.
        
               | LarryMullins wrote:
               | > _like calling Fort Knox just like any other bank
               | vault._
               | 
               | Main difference being that most bank vaults aren't
               | actually empty. ;)
        
               | mynameisvlad wrote:
               | No company or organization is immune to bad business
               | practices.
               | 
               | Them being a securities exchange does not somehow provide
               | immunity from developing rigorous systems which have
               | oversights, or make bureaucracy magically go away.
               | 
               | Likewise, the impact of an outage being more extreme does
               | not mean the people there are infallible. Things slip
               | through. Especially random customer requests being
               | bounced around from team to team, the thing in question.
        
               | shanebellone wrote:
               | I disagreed with you until: "...like calling Fort Knox
               | just like any other bank vault."
               | 
               | Interesting point that teeters on false equivalence. I
               | think AWS or Azure might make for a better analogy. Your
               | point identifies the inherent risk of actually operating
               | a platform business. A bank vault is (mostly) synonymous
               | with Cloud, in this context. If a vault is robbed or a
               | cloud goes offline, losses extend beyond the business
               | which inherently compounds the severity of downtime.
               | 
               | Linear loss vs. parabolic loss.
        
               | btown wrote:
               | But if a cloud goes offline, there is damage to the
               | economy linear to the length and breadth of the outage.
               | Sure, there are losses to businesses serviced by the
               | cloud's users, but they'll bounce back, even if a day-
               | long outage was so severe as to temporarily ground
               | flights and halt supply chains.
               | 
               | If a stock exchange executes trades at incorrect prices,
               | even for a short amount of time, all of a sudden you're
               | in a kind of non-linear sigmoid regime, where investor
               | confidence can suddenly tip into panic selling and
               | recessions can be triggered. Thankfully, that didn't
               | happen here, but it could have. If you're going to give a
               | company that power, you should better hope that they're
               | held to higher standards than most dysfunctional tech
               | organizations!
        
               | shanebellone wrote:
               | "If a stock exchange executes trades at incorrect prices,
               | even for a short amount of time, all of a sudden you're
               | in a kind of non-linear sigmoid regime, where investor
               | confidence can suddenly tip into panic selling and
               | recessions can be triggered."
               | 
               | This is false equivalence and slippery slope.
        
               | blantonl wrote:
               | No they aren't.
               | 
               | There's far more critical snowflakes out there... FAA
               | Airspace management, a medical radiation device, avionics
               | in an aircraft, and facebook.
        
             | ynniv wrote:
             | Unlike all of the "modern tech company" problems which are
             | never ignored and only solved when someone's problem goes
             | viral on social media.
             | 
             | They're a big company, some groups are better than others,
             | some customers get more attention than others.
        
           | galangalalgol wrote:
           | Blame cultures and process cultures are both problems in
           | different ways. Blame cultures don't care about individual
           | accountability, only that someone suffers. Process cultures
           | only care that no one suffers, not that individuals are
           | accountable. Both have some misguided notion that something
           | other than personal accountability can lead to good results.
           | Misattributed blame and suffering does not deter poor
           | performance or mistakes. Not even correctly aimed punishments
           | are very good at that. Accountability isn't about punishment,
           | it is about limiting power to the level of responsibility
           | demonstrated. Rules and procedures don't prevent poor
           | performance, they can in fact entrench and guard it, and they
           | only mildly impact mistakes. Best practice can mitigate
           | mistakes to the same extent or better (due to easier
           | adaptability), but people keep trying to turn them into
           | rules, and that has to be fought. If you followed all the
           | rules but didn't get the job done, you still shouldn't be
           | handed the same task again, but not out of blame.
        
             | davidf18 wrote:
             | [dead]
        
           | ynniv wrote:
           | Having been in the industry for a couple decades, and having
           | worked at both, they're not all that different. Some groups
           | are going to be better than others in the same company. Some
           | companies are floating on venture money today, and might
           | disappear tomorrow. Most technologies constantly cycle. Our
           | experiences working at the same company were different.
        
           | Johnny555 wrote:
           | I used to work for a small startup, and postmortems were
           | truly no blame - engineers would talk about exactly what
           | happened and wouldn't hesitate to put the blame on their
           | mistakes.
           | 
           | But as the company grew, the postmortems became more about
           | blame since now you're not blaming an engineer, but an entire
           | team so singling them out isn't personal. The postmortems
           | were no longer a single engineer describing what happened in
           | his code, but were team leads talking on behalf of teams.
           | They were all about shifting blame from your own team and
           | talking about why a service from another team led to the
           | problem, even if your team could have (and should have) been
           | able to work around it without melting down.
           | 
           | I'm no longer at the company, but postmortems are much more
           | useful when they really are no-blame because you can get to
           | the real root of the problem, but I don't know if that's
           | possible in a large company.
        
             | SoftTalker wrote:
             | As organizations become larger they become more political.
             | It's unavoidable.
        
         | hackernewds wrote:
         | Curious why the link to ICE.COM?
        
           | hadlock wrote:
           | ICE owns NYSE and several other exchanges. ICE stands for
           | Intercontinental Exchange. ICE is the IT administrator for
           | these exchanges. I helped sell some router management
           | software to them a while ago. ICE is fairly new, NYSE used to
           | be independent. That changed sometime after 2008.
        
         | anonred wrote:
         | Serious question: If someone is smart and capable enough to
         | work on tangible things like AI systems or self-driving cars,
         | why should they choose the NYSE outside of pure monetary
         | reasons or affinity for a "modern" tech stack?
        
           | barneygale wrote:
           | You answered your own question. The only motivation is greed.
        
             | DontchaKnowit wrote:
             | Taking a well paying job is greedy? What planet are you
             | living on?
        
           | misja111 wrote:
           | You're not making it easy to get any answers, if you cut out
           | 2 of the main reasons for people in general to change jobs.
        
             | anonred wrote:
             | I consider these things table stakes when choosing a job.
        
           | ynniv wrote:
           | I'm not recruiting for them, just sharing my experience. I
           | included their careers link for people who might be
           | interested because I know they're always looking for good
           | engineers.
        
           | godshatter wrote:
           | Wouldn't working with systems that keep the largest stock
           | exchange for the largest economy in the world running where a
           | simple mistake can cause "mayhem" when the market opens be
           | considered more "tangible" than working in AI or on self-
           | driving cars? It just doesn't have as much street cred as
           | working on those particular projects in the tech community.
        
             | dylan604 wrote:
             | Not necessarily. If you're the type that's into finance,
             | then sure, that might get you out of bed in the morning.
             | I'm not into finance and can't stand the culture that
             | surrounds finance. Yes, it's big and touches every single
             | one of us, but doesn't mean I want to embrace it and go to
             | work in it every day.
             | 
             | If I can take that same skill set and apply it to something
             | with a much better culture surrounding it that affects
             | people in a positive way, then I would definitely choose
             | that over finance any day of the week and twice on Sunday.
             | 
             | At the end of the day, if the NYSE did not exist, the world
             | would continue to turn. It's just not that big of a deal to
             | a heck of a lot of people.
        
               | yibg wrote:
               | Will the world stop turning if people stopped working on
               | self driving cars or AI?
        
               | dylan604 wrote:
               | i'm guessing you're trying to make a point here, but care
               | to elaborate on what it is? i think you well know the
               | answer to the question
        
               | idiotsecant wrote:
               | >if the NYSE did not exist, the world would continue to
               | turn
               | 
               | This is startlingly ignorant of the complex machine that
               | is the modern economic system. If something like the NYSE
               | was to shut down today it would be pandemonium.
               | 
               | There is a difference between 'I don't understand how
               | something works' and 'I don't understand how something
               | works, so it is worthless'. The former is healthy and the
               | first step to understanding, the latter is ignorant, and
               | the first step to getting more ignorant.
        
               | barbishkoolaid wrote:
               | Relax, Ayn Rand cum True Believer complex.
               | 
               | The current state of business within the current
               | iteration of how people interact with one another isn't
               | some necessity.
               | 
               | Yes, the world may fall apart for a relatively brief
               | moment in the grand scheme of things -- but then life
               | will go on.
               | 
               | The first step to understanding this is to drop the
               | superiority complex.
               | 
               | Very little is actually needed to keep the world turning.
        
               | Razengan wrote:
               | Exactly. Our current implementation of resource
               | rationing isn't some fundamental of reality, or even
               | needed by human society just 1-2 centuries ago.
        
               | DontchaKnowit wrote:
               | I actually think both you and the guy you are arguing
               | with are half correct.
               | 
               | The real answer here, in my opinion, is that yes there
               | would be pandemonium, and then yes, the world would go on
               | without it, but then something else just like it will pop
               | up. And that is because a liquid market for financial
               | assets (whether that is securities, options on
               | securities, futures, etc) will always be a massive
               | benefit to the ability of businesses to conduct business,
               | and the ability of individuals to preserve and increase
               | wealth.
        
               | godshatter wrote:
               | I was remarking on the key word "tangible", not trying to
               | express an opinion one way or the other on financial
               | institutions. Accidentally forgetting to do something and
               | ending up in the news because you caused havoc when the
               | markets opened the next morning is more "tangible" (able
               | to touch things directly) than working on AI or self-
               | driving cars, at least currently. Certainly working in
               | either of those fields might provide more benefits down
               | the line.
        
               | kasey_junk wrote:
               | All of my "culture" experiences working in finance were
               | uniformly better than in pure tech.
               | 
               | The movie portrayals don't match my experiences at all
               | and I saw a lot more bad behavior in the tech companies I
               | worked for.
               | 
               | Heck I saw more people working for the intellectual
               | challenge of it in trading than I did in SV style tech
               | firms where money drove nearly every decision.
               | 
               | It's really hard for me to buy that SV style tech
               | companies are a better place to work when for the last 2
               | decades the business models that have been front and
               | center are panopticon style tracking to sell ads and
               | legal arbitrage.
        
               | dylan604 wrote:
               | Oh, don't get me wrong. I pretty much abhor SV/VC culture
               | too. It's why I don't have one inkling of a notion to
               | work on either coast for the "big" corps.
               | 
               | It's not an either or, I can hate both ;-) I'm a big boy
               | and get to make up my own mind on the matter.
        
           | JBlue42 wrote:
           | >If someone is smart and capable enough to work on tangible
           | things like AI systems or self-driving cars
           | 
           | Maybe they aren't as smart as they think they are? Or they
           | find that there are interesting problems to solve in fintech?
           | Problems they can tackle and see resolved in a realistic time
           | frame vs 'tangible' (?) self-driving cars or chat bots.
           | 
           | I know AI encompasses a far larger range of things but right
           | now, what problems is it solving? Artists, writers, and
           | others can do that work. What do self-driving cars resolve
           | beyond continuing the dominance of car culture in a world
           | that could have better public transit and safer
           | infrastructure?
        
           | DontchaKnowit wrote:
           | Serious Answer :
           | 
           | There are problems in Fintech that are absolutely worth
           | solving for altruistic reasons. One that I think is very
           | important and might even need to incorporate AI is this :
           | 
           | Larger financial institutions have access to WAY more and WAY
           | higher quality data surrounding stocks and options. For
           | example, publicly available SEC filings contain extremely
           | useful information about companies. Professional traders have
           | access to services which provide this data accurately in
           | programmatic form (like an API). Us normal people have only
           | the SEC filings themselves, which are enormous documents. It
           | would be impossible to read them fast enough to ever catch up
           | on all of them from, say, the last year. There are free APIs, but
           | they are absolute dogshit and provide incomplete and
           | inaccurate information.
           | 
           | If someone could democratize this and provide this info for
           | free or cheap to the public, it would be an enormous benefit
           | to the general public.
        
           | hinkley wrote:
           | There's also geography. You go far enough East and you have
           | mostly public sector or defense jobs. A little smattering of
           | insurance data processing. And then fintech.
           | 
           | illinois.edu has one of the top rated CS programs, but once
           | you graduate there are not a lot of options but to move to
           | one of the coasts, or move back/to Chicago and try your luck
           | there. Second City has a good deal of fintech.
        
       | helsinkiandrew wrote:
       | https://archive.ph/UoMr9
        
       | WiSaGaN wrote:
       | A lot of stocks have one-minute bars with wide price ranges in the
       | first few minutes of continuous trading. This looks like what
       | happened in the Knight Capital fiasco, in which the system
       | repeatedly bought on the ask and sold on the bid very fast, thus
       | pushing the high price higher and the low price lower. At the open,
       | market makers are usually more wary of risk they cannot hedge
       | directly and thus less willing to take on positions, leading to
       | wild swings.
       | 
       | Still, this report (and the previous statement) does not give
       | enough detail on why a backup system misoperation resulted in this.
       | Also, critical large systems like exchanges rarely have a single
       | point of failure. Usually a sequence of issues along the event
       | chain leads to an incident like this. Thus the claim that one
       | "failed to properly shutdown" caused all this is a bit incredible.
       | We will need more explanation.
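       | 
       | As a rough illustration of the buy-on-the-ask, sell-on-the-bid
       | pattern mentioned above, here is a minimal sketch. The function
       | name and the numbers are made up for illustration and are not
       | figures from the Knight Capital incident. It also assumes the
       | quotes never move, which understates the real loss.

```python
def round_trip_loss(bid_cents: int, ask_cents: int, shares: int, round_trips: int) -> int:
    """Loss in cents from repeatedly buying at the ask and selling at the bid.

    Assumption: the quotes stay fixed. In practice aggressive orders push
    the ask up and the bid down, so the real loss would be larger.
    """
    spread = ask_cents - bid_cents
    return spread * shares * round_trips


# Illustrative numbers only: a 2-cent spread, 100 shares, 10,000 round trips.
print(round_trip_loss(bid_cents=1000, ask_cents=1002, shares=100, round_trips=10_000))
```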
        
         | itronitron wrote:
         | Doesn't each trade require both a buyer and a seller, who buy
         | and sell at an agreed on price? Presumably both parties would
         | be satisfied with the trade so it isn't clear to me what all
         | the fuss is about.
        
           | toast0 wrote:
           | If I had put in a market buy/sell on open order, I'm
           | accepting the market price, but expecting the market price to
           | be set by an opening auction. I don't know if a market on
           | open order would have been cancelled or just executed shortly
           | after the bell; you could argue for both treatments and it
           | usually doesn't come up, so it might not be mentioned in
           | retail brokerage documentation.
           | 
           | Personally, I always do limit orders, but I would consider
           | market on open/close as reasonable options. But I don't think
           | this is typical; a lot of orders are market orders against
           | whatever limit order is at the top of the book. Normally,
           | that's ok, but it gets weird when things get weird, as seen
           | here.
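           | 
           | To make the difference concrete, here is a minimal sketch of
           | the two pricing paths, assuming a heavily simplified book of
           | limit orders with prices in cents. The function names, the
           | tie-break rule, and the sample orders are hypothetical and
           | not how NYSE actually runs its auction.

```python
def opening_price(buys, sells):
    """Return (price, volume) that maximizes matched volume in a toy auction.

    buys and sells are lists of (limit_price_cents, quantity). Ties go to
    the lowest candidate price, an arbitrary simplification.
    """
    candidates = sorted({p for p, _ in buys} | {p for p, _ in sells})
    best_price, best_volume = None, 0
    for price in candidates:
        demand = sum(q for p, q in buys if p >= price)   # buyers who accept this price
        supply = sum(q for p, q in sells if p <= price)  # sellers who accept this price
        volume = min(demand, supply)
        if volume > best_volume:
            best_price, best_volume = price, volume
    return best_price, best_volume


def market_buy_cost(sells, quantity):
    """Walk resting sell orders from cheapest up, as a market buy would."""
    cost, remaining = 0, quantity
    for price, available in sorted(sells):
        take = min(remaining, available)
        cost += take * price
        remaining -= take
        if remaining == 0:
            break
    return cost, quantity - remaining


buys = [(1010, 300), (1000, 500), (990, 200)]
sells = [(995, 400), (1000, 300), (1020, 600)]

print(opening_price(buys, sells))    # (1000, 700): one price for every matched share
print(market_buy_cost(sells, 1000))  # (1004000, 1000): the same size walks up the book
```

           | In the toy auction every matched share trades at one price,
           | while the market order pays progressively worse prices as it
           | eats through the book.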
        
           | pcl wrote:
           | Market price trades are executed at whatever the current
           | price is. Presumably it's those trades that caused havoc.
        
           | gpderetta wrote:
           | I haven't really researched anything, but both parties
           | thought they were bidding/offering into an auction with time to
           | cancel or amend their orders.
        
           | WiSaGaN wrote:
           | I don't know what happened in the current case. In the Knight
           | Capital case, KC clearly didn't intend to send those
           | erroneous orders. And if those trades were not annulled, KC
           | would not be able to settle them, since the trading losses
           | were larger than the collateral KC put up.
        
             | evanpw wrote:
             | The trades were not annulled, because NYSE ruled them not
             | "clearly erroneous". Which is why it was an existential
             | mistake, not just an embarrassing one.
        
         | anonu wrote:
         | The Knight Capital issue was a test flag in the code that caused
         | orders to multiply.
         | 
         | From what I'm reading, this NYSE error seems a bit more complex
         | where the presence of a backup system confused the current
         | market state to skip the open auction.
        
       | throwawaaarrgh wrote:
       | There is so much stupidity in the process they describe, I have
       | no faith it will be fixed. A manual daily DR test that clearly
       | wasn't followed by a test or checklist or double-checked by
       | another person, _and_ leaving the DR up broke prod?? Literally
       | none of those things should have happened.
       | 
       | I know the world is held together with duct tape, but it's
       | embarrassing when you see the tape fall off.
        
         | tgtweak wrote:
         | The way their DR is set up is that clients of NYSE (brokerages,
         | OTC systems, firms, banks) all have IP (not dns) connections to
         | the primary NYSE production datacenter and a full second set of
         | IPs for the DR site. It's not a "dns and load balancers" setup
         | where the service itself can just route the traffic somewhere
         | else. The clients themselves determine where to connect to
         | consume trade data and execute trades. There is likely some
         | modus operandi given to clients on how to connect to primary
         | and DR sites based on some specific logic.
         | 
         | The NYSE DR guide [1] says that if DR is active, production is
         | not. It's not a distant reach to consider that some of these
         | clients have a deadman switch doing a healthcheck poll on DR
         | and switching to it when it sees that it is "up". If they've
         | built their systems in such a way that when it detects the DR
         | site active it uses that, then it makes sense that having both
         | "online" would cause some havoc. I'm sure the complexity of the
         | entire exchange is fairly significant, and having "two" copies
         | of it running in parallel with both able to accept and execute
         | trades would be a scenario that can cause some unintended
         | consequences. Fundamentally, an exchange is "atomic" and
         | transactional and cannot be meaningfully distributed to two
         | sites that are that far away. The replication in place is
         | likely master/slave with a switch to make the slave primary.
         | Anyone who has toyed with master-master replication on less
         | complicated databases knows the issues that can come up with
         | split writes. Imagine that at the scale of a system as large as
         | the NYSE.
         | 
         | [1]
         | https://www.nyse.com/publicdocs/support/DisasterRecoveryFAQs...
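         | 
         | A minimal sketch of the client-side failure mode I'm guessing
         | at, assuming a client that polls both sites and picks one. The
         | Site class and both selection rules are hypothetical; nothing
         | here comes from the NYSE guide or any real client.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    healthy: bool  # result of the client's own health-check poll


def naive_pick(primary: Site, dr: Site) -> str:
    """Hypothetical client rule: use DR whenever it answers as "up"."""
    if dr.healthy:
        return dr.name
    return primary.name


def guarded_pick(primary: Site, dr: Site) -> str:
    """Hypothetical safer rule: refuse to choose when both sites look live."""
    if primary.healthy and dr.healthy:
        raise RuntimeError("both sites report up; refusing to route orders")
    if primary.healthy:
        return primary.name
    if dr.healthy:
        return dr.name
    raise RuntimeError("no site available")


primary = Site("primary", healthy=True)
dr = Site("dr", healthy=True)  # DR left running by mistake

print(naive_pick(primary, dr))  # prints "dr" even though primary is live
```

         | With the naive rule, a DR site that is accidentally left up
         | pulls traffic away from a perfectly healthy primary, while the
         | guarded rule stops and asks for a human instead.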
        
       | hksoftware wrote:
       | [flagged]
        
         | [deleted]
        
         | ideamotor wrote:
         | You suspect a breach?
        
           | avree wrote:
           | He is, undoubtedly, a meme-stock conspiracy theorist. Only
           | those steeped in the cult of AMC, GME, or BBBY say things
           | like that.
        
             | [deleted]
        
         | shapefrog wrote:
         | I am curious what your "independent research" has turned up on
         | the subject.
        
       | PaulHoule wrote:
       | See https://www.henricodolfing.com/2019/06/project-failure-
       | case-...
        
       | hknmtt wrote:
       | There is no way a person, a single person, an unauthorized
       | person, can have access to a system/functionality like this.
       | Utter BS.
        
         | jterrys wrote:
         | my 2c:
         | 
         | In reality what probably happened is previous market day and
         | post-trading data encountered some kind of error, which
         | triggered a cascade of problems overnight that they were unable
         | to properly rectify. This caused delays up until market open.
         | They were unable to fully resolve the issue, and faced with
         | either delaying opening the market (which is a HUGE no-no) or
         | opening with wrong data as is, they chose wrong data.
         | 
         | All in all a lot of people didn't get much sleep Monday. More
         | than likely they implemented some changes or updates over the
         | weekend that were not properly done, or they encountered some
         | errors, and didn't have adequate controls/time to roll-back
         | Monday night. They made the right calls too late and there was
         | a controls process up the chain that seriously fucked up. These
         | are the kinds of problems that get the CEO woken up in the
         | middle of the night.
        
       | lvl102 wrote:
       | Remember the flash crash of 2015? They let those trades actually
       | STAND. Including options. This week's open was nothing in
       | comparison.
        
         | detaro wrote:
         | That wasn't an exchange "malfunction" in the sense that the
         | exchange did not do what it was supposed to, was it?
        
           | lvl102 wrote:
           | Do you really think they will take responsibility for
           | billions lost/made that day? No, that's a big liability.
           | Anyone trading that day knew it was a big glitch at open.
           | Some names, BLUE CHIPS, were down 40-50%! What shocked us all
           | is that they actually allowed the trades to stand. What was
           | different about this week was that the SEC actually tried to
           | do their jobs for once and the exchange had to address it, i.e.
           | come up with a bullshit excuse.
        
             | spywaregorilla wrote:
             | So... what was wrong? Why should those people not have been
             | allowed to make and lose money?
        
       | lordnacho wrote:
       | Backup system connected to prod; that somehow reminds me of the
       | Knight Trading debacle. Someone there apparently connected some
       | test code to their prod and they blew up the company in under an
       | hour.
        
         | laurencei wrote:
         | I mean - isn't that how all DR is essentially configured? You
         | need it "somehow" connected to Prod depending on the failover,
         | config, system etc. And in many of these complex systems DR can
         | be on a subsystem etc - not an "all or nothing" approach?
        
       | kmac_ wrote:
       | Weak blame game.
        
       | pedro2 wrote:
       | Successes are management's, failures are individuals' :)
        
         | chiefalchemist wrote:
         | True. But when that happens, that's a textbook sign of lack of
         | leadership.
        
       | jonpo wrote:
       | It's nearly always the damn humans. I find it awful that we are
       | often assigning blame to some "technology error" when it's the
       | damn humans pulling the strings all the time. All those times the
       | market shut accidentally. Or the market opens at the wrong time.
       | Or that time someone accidentally deletes all the GTC orders in
       | order to "save some disk space". That time someone tests opening
       | the market at the weekend and puts in the wrong date. Sometimes
       | we are just trying to test that the things work and so we take
       | awful risks like adding test orders, or failing over to test that
       | backup versions of the trading infrastructure still work. All
       | these things add human execution risk.
       | 
       | That said, I find the US market structure unfair; Charles Schwab
       | does protest too much. Retail orders never seem to get near the
       | central order book. There is no direct market access. Brokers
       | just sell your order to whichever MM pays them a kickback in
       | return for the spread. This should be a fantastic, fair
       | multiplayer game, but instead it's pay-to-win mobile crap with
       | vested interests milking their customers.
        
       ___________________________________________________________________
       (page generated 2023-01-26 23:01 UTC)