[HN Gopher] Why some DVLA digital services don't work at night
___________________________________________________________________
Why some DVLA digital services don't work at night
Author : edent
Score : 74 points
Date : 2025-01-12 20:20 UTC (4 days ago)
(HTM) web link (dafyddvaughan.uk)
(TXT) w3m dump (dafyddvaughan.uk)
| pestatije wrote:
| DVLA - Driver & Vehicle Licensing Agency
|
| plus, since I'm already posting a comment: it's because there is no
| batch window to process transactions
| ForHackernews wrote:
| Unpopular opinion, but I think many systems would benefit from a
| regular "downtime window". Not everything needs to be 24/7 high
| availability.
|
| Maybe not every night, but if you get users accustomed to the
| idea that you're offline for 12 hours every Sunday morning, they
| will not be angry when you need to be offline for 12 hours on a
| Sunday morning to do maintenance.
|
| The stock market closes, more things should close. We are paying
| too high of a price for 99.999% uptime when 99.9% is plenty for
| most applications.
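|
| A minimal sketch of the arithmetic behind those figures, assuming
| a 365-day year. The function names are made up for illustration.
| It also shows where a fixed weekly window would land on the same
| scale.
|

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of downtime per year permitted at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def availability_with_weekly_window(window_hours: float) -> float:
    """Availability (percent) if the service is offline for a fixed window each week."""
    hours_per_week = 7 * 24
    return 100 * (1 - window_hours / hours_per_week)


for target in (99.9, 99.999):
    hours = downtime_hours_per_year(target)
    print(f"{target}% uptime allows {hours:.2f} hours of downtime per year")

weekly = availability_with_weekly_window(12)
print(f"A 12-hour weekly window gives {weekly:.1f}% availability")
```

|
| By this arithmetic 99.9% allows roughly 8.76 hours a year, while a
| 12-hour window every week sits near 93%, so the two proposals are
| quite different budgets.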
| kragen wrote:
| Basically this happens because the DVLA and the stock market
| don't have any competition. Customers in a competitive market
| won't be angry when you need to be offline for 12 hours every
| Sunday morning; they'll just switch to your competitor some
| Sunday, because the competitor is providing them something they
| value that you don't provide.
| ForHackernews wrote:
| Maybe they should regulate Sunday trading hours, or unionized
| sysadmins should negotiate the end of on-call hours.
|
| The red queen's race that you describe for ever-greater
| scale, ever-greater availability is an example of the tragedy
| of the commons. Think how much money and how many human minds
| have been wasted trying to squeeze out that last .0001% of
| "zero downtime" when they could have been creating something
| new.
|
| "Keep doing the same thing, but more of it, harder" is a
| recipe for a barren world of monoculture.
| lifeoflejf wrote:
| Bergen county NJ has blue laws that make it so non-grocery
| stores must be closed on Sundays. Maybe there's some value
| in structuring a time where everybody is off?
|
| Just like at work the only time I really get off is when
| all of my customers are off. It's nice when the industry
| sorta shuts off for a week or so around Christmas
| kragen wrote:
| Something like that might plausibly be correct, though
| you've exaggerated it to a level where it's clearly false.
|
| If we steelman it to its most defensible essence, I think
| what you're saying is that the cost of the human effort
| needed to provide these higher uptimes exceeds the consumer
| benefit (the value of being able to buy a camera on
| Saturday), say. You could imagine, for example, that each
| incremental improvement in uptime wins over a proportion of
| the customer base providing a value that vastly exceeds its
| cost -- but only until your competitors improve their own
| offering to match, so all the surplus from all this uptime
| improvement ultimately goes to the consumers, not the
| producers.
|
| There are two related holes in this idea.
|
| The first is that producing consumer surplus is _what the
| economy is for_ , in a moral sense. The reason producing
| goods and services is a good thing to do is so that someone
| will benefit from using them! So if all the effort that
| sysadmins make goes into making services better for users,
| that's a _good_ thing, not a bad thing.
|
| The second is that nothing is stopping a new entrant from
| offering a new, low-cost service that isn't as reliable. If
| the cost of providing all that extra reliability (bundled
| into the incumbents' pricing scheme) is higher than the
| actual benefit to users, the users will switch to the
| lower-cost, less-reliable service. This has happened many
| times, in fact: less-reliable minicomputers stole business
| from mainframes, less-reliable VoIP stole business from ATM
| and SONET and SDH, all kinds of less-reliable plastic goods
| have stolen business from all-metal versions, and now solar
| panels are stealing business from coal power plants even
| though solar panel "uptime" is like 30%.
|
| So the particular market dynamics we're talking about
| actually sensitively optimize the amount of effort given to
| uptime to the economic optimum. There do exist lots of
| market failures, but the particular dynamic we're
| discussing is the opposite extreme from something like a
| dollar auction.
| abigail95 wrote:
| Who is trying to achieve zero downtime? Facebook has
| degraded service regularly; it's just close enough to 99.9
| that nobody cares.
|
| If loading my messages times out I just move onto something
| else and go back a few minutes later.
|
| Surely they have metrics measuring that and don't think
| it's worth the engineering effort to improve it.
| kragen wrote:
| One of the interesting things that came out of Google's
| "SRE" system is that they deliberately add outages if
| they don't have enough. They learned years ago that if
| you build a service that promises 99% uptime and deliver
| 99.99% uptime, other people in the company will come to
| depend on that 99.99% uptime unintentionally. So they
| chaos-monkey it to ensure that the inevitable failures
| aren't catastrophic.
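|
| A rough sketch of the bookkeeping this implies, assuming a 30-day
| measurement window. The function names are hypothetical and this
| is not Google's actual tooling, only the arithmetic of comparing
| promised and delivered uptime.
|

```python
PERIOD_MINUTES = 30 * 24 * 60  # assumption: 30-day measurement window


def allowed_downtime_minutes(uptime_percent: float, period_minutes: float = PERIOD_MINUTES) -> float:
    """Downtime in minutes that corresponds to a given uptime percentage."""
    return period_minutes * (1 - uptime_percent / 100)


def unspent_budget_minutes(promised_percent: float, delivered_percent: float) -> float:
    """Downtime that was promised as possible but never actually happened."""
    promised = allowed_downtime_minutes(promised_percent)
    actual = allowed_downtime_minutes(delivered_percent)
    return max(0.0, promised - actual)


# Promise 99%, deliver 99.99%: most of the budget goes unused, which is the
# gap that callers silently start to depend on.
spare = unspent_budget_minutes(99.0, 99.99)
print(f"Unspent downtime budget: {spare:.2f} minutes per 30 days")
```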
| ajnin wrote:
| The stock markets definitely have competition. For instance
| Frankfurt, London, Paris or Amsterdam very much compete with
| each other to offer desirable conditions for investors, and
| companies will move their trading from one to another if it
| is in their interest. I think the fact they close at night is a
| self-preservation mechanism, traders would become insane if
| they had to worry about their positions 24/7.
| kragen wrote:
| There's a very strong network effect, and most stocks are
| only listed on a single stock exchange, so in most contexts
| the competition is very minimal.
| OJFord wrote:
| It only really works where the audience is already limited in
| country/timezone though. Sure a global service could just
| stagger the downtime around the world.. but (unless you've
| already equivalent partitioned the infrastructure) then you're
| just running 24/7 with arbitrary geofencing downtime on top.
| jmwilson wrote:
| Who works Sunday morning then?
|
| The maintenance window will morph into a do-big-risky-changes
| window, which means everybody in engineering will have to be
| on-call. Many years ago, when I newly joined a FAANG, I asked,
| "shouldn't I run this migration after hours when load is low?"
| and the response was firm, "No, you'll run it when people are
| around to fix things". It may not always be the answer, but in
| general, I want to do maintenance when people are present and
| willing to respond, not nights and weekends when they're
| somewhere else and can't be found.
| crazygringo wrote:
| > _Not everything needs to be 24/7 high availability._
|
| If it makes you more money to be available 24/7 then why
| _wouldn't_ you?
|
| > _Maybe not every night, but if you get users accustomed to
| the idea that you're offline for 12 hours every Sunday
| morning_
|
| Then I would use a competitor that was online, period.
|
| Imagine Sunday morning is the only time you have to complete a
| certain school assignment, but Wikipedia is offline? Or you
| need to send messages to a few folks that they need to see by
| the evening, but the platform won't come online until 3pm,
| which means you'll need to interrupt your afternoon family time
| instead?
|
| Maybe things closing works fine for your needs and your
| schedule. But it sure won't for everyone else. Having services
| that are reliable is one of the things that distinguishes
| developed countries from developing ones.
| corint wrote:
| > If it makes you more money to be available 24/7 then why
| wouldn't you?
|
| Agreed, but for a government service where you update your
| license, or tell them about selling a car or something,
| there's no real 'more' money. Being closed at 3am doesn't
| lose the opportunity in the way that it would if you were
| selling widgets. It instead forces the would-be users at 3am
| to wait until the morning.
| rozab wrote:
| I've often run into this when using DVLA services and spluttered
| with indignation. But at the end of the day, these services are
| fantastically usable (during the daytime) and I appreciate Dafydd
| pushing to just get them out there!
|
| I got my license in 2015 so never in my life have I had the
| apparently ubiquitous American experience of queuing at the DMV
| and filling in paper forms. (is this still real? or limited to
| stand-up comedy?)
| nsxwolf wrote:
| The queues have been mostly replaced with "take a number"
| systems where you can sit down and wait... with your...
| papers... that you had to fill out first...
| fn-mote wrote:
| > The queues have been mostly replaced with "take a number"
| systems where you can sit down and wait...
|
| My recent experience was: sign up online and get a 30 min
| window (9:00-9:30 say). Queue everyone for that 30 minute
| window outside the building. At exactly 9:30, enter and go
| through the usual queues inside. The advantage is that
| getting through those queues now takes 30 minutes or less
| because their length is limited. Presumably we/they traded
| volume of processing for certainty of time spent in the
| queue. A very familiar tradeoff for a computer scientist.
| AlotOfReading wrote:
| Queuing at the DMV and filling out paperwork is very much a
| real thing that still happens. It's a pretty different
| experience in every state though.
| ChocolateGod wrote:
| Can it not be done online like in the UK?
| neckro23 wrote:
| Usually, but it depends on the state. Remember, America
| isn't a country, it's 50 countries in a trenchcoat.
|
| It's often a mishmash of services too. I was told in-person
| at the DMV that I couldn't renew my registration since I'm
| not the registered owner of my car. So I just went to a DMV
| kiosk at the local grocery store and did it there without a
| hassle.
| snakeyjake wrote:
| My US state, one of the ones NOT living in the past, does
| almost everything online.
|
| The only times you have to come in are:
|
| 1. for your first license, either as a newly-licensed driver or
| an out-of-state driver who recently moved
|
| 2. if you were bad and broke the law or otherwise had your
| license cancelled/revoked/suspended
|
| Even those people have to call or go online to make an
| appointment.
|
| All other tasks from getting/returning plates to requesting a
| duplicate title can be done online, through drop boxes, or by
| mail.
|
| I have been to the DMV three times since 1995: once to turn my
| out-of-state license into an in-state one, once to turn that
| drivers license into a realID-compliant one, and once to have
| my fingerprints taken for a concealed carry permit.
| mike_hearn wrote:
| tl;dr same reason other services go offline at night: concurrency
| is hard and many computations aren't thread safe, so need to run
| serially against stable snapshots of the data. If you don't have
| a database that can provide that efficiently you have no choice
| but to stop the flow of inbound transactions entirely.
|
| Sounds like Dafydd did the right thing in pushing them to deliver
| some value now and not try to rebuild everything right away. A
| common mistake I've seen some people make is assuming that
| overnight batch jobs that have to shut down the service are some
| side effect of using mainframes, and any new system that uses
| newer tech won't have that problem.
|
| In reality getting rid of those kinds of batch jobs is often a
| hard engineering project that requires a redesign of the
| algorithms or changes to business processes. A classic example is
| in banking where the ordering of these jobs can change real world
| outcomes (e.g. are interest payments made first and then cheques
| processed, or vice-versa?).
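|
| A toy illustration of that ordering effect, under assumed rules:
| a flat 1% interest credit on positive balances and a single
| cheque. The figures and function names are invented for the
| example and do not describe any real bank's process.
|

```python
from decimal import Decimal


def apply_interest(balance: Decimal, rate: Decimal) -> Decimal:
    """Credit interest on a positive balance (hypothetical rule)."""
    if balance > 0:
        return balance + balance * rate
    return balance


def process_cheque(balance: Decimal, amount: Decimal) -> Decimal:
    """Debit a cheque from the balance."""
    return balance - amount


opening = Decimal("1000.00")
rate = Decimal("0.01")
cheque = Decimal("400.00")

# Same inputs, two job orderings, two different closing balances.
interest_first = process_cheque(apply_interest(opening, rate), cheque)
cheque_first = apply_interest(process_cheque(opening, cheque), rate)

print("interest then cheque:", interest_first)
print("cheque then interest:", cheque_first)
print("difference:", interest_first - cheque_first)
```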
|
| In other cases it's often easier for users to understand a system
| that shuts down overnight. If the rule is "things submitted by
| 9pm will be processed by the next day" then it's easy to explain.
| If the rule is "you can submit at any time and it _might_ be
| processed by the next day", depending on whether or not it
| happens to intersect the snapshot taken at the start of that
| particular batch job, then that can be more frustrating than
| helpful.
|
| Sometimes the jobs are batch just because of mainframe
| limitations and not for any other reason; those can be made
| incremental more easily if you can get off the mainframe platform
| to begin with. But that requires rewriting huge amounts of code,
| hence the popularity of emulators and code transpilers.
| abigail95 wrote:
| Do you know why the downtime window hasn't been decreasing over
| time as it gets deployed onto faster hardware over the years?
|
| Nobody would care or notice if this thing had 99.5%
| availability and went read only for a few minutes per day.
| mike_hearn wrote:
| It doesn't get deployed onto faster hardware. Mainframes
| haven't really got faster.
| ndriscoll wrote:
| Mainframes have absolutely gotten faster. They're basically
| small supercomputers.
| abigail95 wrote:
| It must be. Maintaining the original hardware would be more
| expensive than upgrading to compatible but faster systems.
| mike_hearn wrote:
| What compatible systems? Mainframes are maintained in
| more or less their original state by teams from IBM. They
| are designed to be single machines that scale vertically
| and never shut down, every component can be hot-swapped
| including CPUs but IBM charge a lot for CPU capacity if I
| recall correctly. Given that nighttime doesn't get
| shorter, the DVLA probably don't see much reason to pay a
| lot more for a slightly smaller window.
|
| And mainframes from the 80s are slow. It sounds like
| they're running on the original.
| ndriscoll wrote:
| Newer mainframes are still faster than older mainframes,
| and can have hundreds of cores and 10s of TB of RAM. A
| big part of IBM's draw is that they make modern systems
| that will continue to run your software forever with no
| modifications. I had an older guy there tell me a story
| about them changing a default in some ISPF panel, and
| customers complained enough that they had to change it
| back. Their storage systems have a virtualization layer
| for old programs that send commands to move the heads of
| a drive that hasn't been manufactured for 55 years or
| whatever and translate that to use storage backed by a
| modern RAID with normal disks. The engineers in the
| mainframe groups know who their customer base is and what
| they want.
|
| It's unlikely that they're literally using 40 year old
| hardware since the replacement parts for that would be a
| nightmare to find and almost certainly more expensive
| than a compatible new machine.
| throw16180339 wrote:
| You're mistaken about this. IBM's z-series had 5GHz CPUs
| well over a decade ago and they haven't gotten any slower.
| pjc50 wrote:
| Maybe it isn't running on faster hardware? These systems are
| often horrifyingly outdated.
| pwg wrote:
| Or maybe it is running on faster hardware, but the UK
| budget office decided not to pay IBM's fees required to
| make use of the extra speed, so it has been "throttled" to
| run at the same speed that it ran on the old hardware.
| ndriscoll wrote:
| Getting rid of batch jobs shouldn't be a goal; batch processing
| is generally more efficient as things get amortized, caches get
| better hit ratios, etc.
|
| What software engineers should understand is there's no reason
| a batch can't take 3 ms to process and run every 20 ms. "Batch"
| and "real-time" aren't antonyms. In a language/framework with
| promises and thread-safe queues it's easy to turn a real time
| API into a batch one, possibly giving an order of magnitude
| increase in throughput.
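|
| One way this could look, sketched with Python's asyncio. Futures
| stand in for promises, and because everything runs on a single
| event loop a plain list is enough where a threaded design would
| need a thread-safe queue. MicroBatcher and double_all are
| hypothetical names, and the 20 ms interval is taken from the
| comment above.
|

```python
import asyncio


class MicroBatcher:
    """Collects single-item requests and processes them together on a timer."""

    def __init__(self, process_batch, interval_seconds=0.02):
        self._process_batch = process_batch  # takes a list, returns a list of results
        self._interval = interval_seconds
        self._pending = []  # (item, future) pairs waiting for the next flush

    async def submit(self, item):
        """Looks like a per-item call; resolves once the item's batch has run."""
        future = asyncio.get_running_loop().create_future()
        self._pending.append((item, future))
        return await future

    async def run(self):
        """Flush whatever has accumulated every interval, until cancelled."""
        while True:
            await asyncio.sleep(self._interval)
            if not self._pending:
                continue
            batch, self._pending = self._pending, []
            results = self._process_batch([item for item, _ in batch])
            for (_, future), result in zip(batch, results):
                future.set_result(result)


def double_all(items):
    # Stand-in for one amortized operation, such as a single bulk query.
    return [x * 2 for x in items]


async def main():
    batcher = MicroBatcher(double_all)
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(*(batcher.submit(i) for i in range(5)))
    worker.cancel()
    try:
        await worker
    except asyncio.CancelledError:
        pass
    return results


print(asyncio.run(main()))
```

|
| Callers still await one result each, but double_all is invoked
| once per interval rather than once per request. Error handling
| (a failing batch would leave its futures unresolved) is left out
| of the sketch.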
| mike_hearn wrote:
| Batch size is usually fixed by the business problem in these
| scenarios, I doubt you can process them in 3msec if the job
| requires reading in every driving license in the country and
| doing some work on them for instance.
| ndriscoll wrote:
| This particular thing might be difficult to change because
| it's 50 year old COBOL or whatever, but my point was more
| that I've encountered pushes from architects to "eliminate
| batches" and it makes no sense. It just means that now I
| have to re-batch things in my code. The correct way to
| think about it is that you want smaller, more frequent
| batches.
|
| Do they really need to do work on all records every night?
| Probably not. Most people aren't changing their license or
| vehicle info most days. So the problem is that somewhere
| they're (conceptually) doing a table scan instead of using
| an index. That might still be hard to fix, but at least
| identify the correct problem. Otherwise as you say moving
| to different tech won't fix it.
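|
| The contrast can be sketched as follows, assuming an in-memory
| store that remembers which records were touched since the last
| run. RecordStore and its methods are hypothetical; a real system
| would more likely use a "last modified" index in the database.
|

```python
from dataclasses import dataclass, field


@dataclass
class RecordStore:
    records: dict = field(default_factory=dict)    # record id -> data
    changed_ids: set = field(default_factory=set)  # ids touched since the last batch

    def update(self, record_id, data):
        self.records[record_id] = data
        self.changed_ids.add(record_id)

    def full_scan_batch(self, process):
        """Visit every record, changed or not (the 'table scan')."""
        for record_id, data in self.records.items():
            process(record_id, data)
        self.changed_ids.clear()
        return len(self.records)

    def incremental_batch(self, process):
        """Visit only the records changed since the previous batch."""
        ids, self.changed_ids = self.changed_ids, set()
        for record_id in ids:
            process(record_id, self.records[record_id])
        return len(ids)


store = RecordStore()
for i in range(1000):
    store.update(i, {"status": "valid"})
visited_full = store.full_scan_batch(lambda rid, data: None)

# Only three records change before the next run.
for i in (7, 42, 99):
    store.update(i, {"status": "renewed"})
visited_incremental = store.incremental_batch(lambda rid, data: None)

print(visited_full, visited_incremental)
```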
| abigail95 wrote:
| Something is missing here, why do batch jobs take 13 hours? If
| this thing was started on an old mainframe why isn't the downtime
| just 5 minutes at 3:39 AM?
|
| Exactly how much data is getting processed?
|
| Edit: Why does rebuilding take a decade or more? This is not a
| complex system. It doesn't need to solve any novel engineering
| challenges to operate efficiently. Article does not give much
| insight into why this particular task couldn't be fixed in 3
| months.
| shermantanktop wrote:
| It's funny to me that I would never ask those questions. I've
| specialized in legacy rehab projects (among other things) and
| there seems to be no upper bound on how bad things can be or
| how many annoying reasons there are for why we can't "just fix
| it." Those "just" questions--which I ask too--end up being
| hopelessly naive. The answers will crush your soul if you let
| them, so you can't let them, and you should always assume
| things are worse than you think.
|
| TFA is spot on - the way to make progress is to cut problems up
| and deliver value. The unfortunate consequence is that badness
| gets more and more concentrated into the systems that nobody
| can touch, sort of like the evolution of a star into an
| eventual black hole.
| abigail95 wrote:
| I made a lot of money moving mid size enterprises from legacy
| ERP systems to custom in house ones.
|
| The DVLA dataset and the computations that are run on it can
| be studied and replicated in 3 months by a competent team.
| From there it can be improved.
|
| There is no way that this system requires 13 hours of
| downtime. If it required two hours - even if the code was
| generated through automation it can be reverse engineered and
| optimized.
|
| It is absolute rubbish that this thing is still unavailable
| outside of 8am-7pm.
|
| I maintain my position that it could be replaced in 3 months.
|
| I got my start in this business when I was in university and
| they told us our online learning software was going offline
| for 3 days for an upgrade. Those are the gatekeepers and low
| achievers we fight against. Think bigger.
| arccy wrote:
| it's a gov agency, they don't quite pay enough for a
| motivated competent team....
| monkey_monkey wrote:
| > The DVLA dataset and the computations that are run on it
| can be studied and replicated in 3 months by a competent
| team. From there it can be improved.
|
| Such an HN comment. Made me lol. Think funnier!
| that_guy_iain wrote:
| > Edit: Why does rebuilding take a decade or more? This is not
| a complex system. It doesn't need to solve any novel
| engineering challenges to operate efficiently. Article does not
| give much insight into why this particular task couldn't be
| fixed in 3 months.
|
| You do know the UK government has been cutting all their
| budgets to the bone for about 10 years? That means everywhere
| is pretty much understaffed.
|
| And how do you know it's not a complex system? I would think
| that a system like that would be somewhat complex. It's not
| just driving licenses but a whole bunch of other things that
| are handled by the DVLA.
| abigail95 wrote:
| The system may or may not be complex but the data it has to
| store and transform is not. Because it handles drivers
| licenses. A function that has been done on pen and paper and
| filing cabinets.
|
| Study the data, study the operations, reduce complexity.
|
| Since you imply you know more about UK budgets than I do -
| how much is the DVLA budgeted for IT operations like this and
| how much more would you give them to expect this problem
| solved?
|
| I can argue real numbers but vibes about bone dry budgets I
| cannot.
| that_guy_iain wrote:
| > The system may or may not be complex but the data it has
| to store and transform is not. Because it handles drivers
| licenses. A function that has been done on pen and paper
| and filing cabinets.
|
| It handles more than just driving licenses... The DVLA do
| more than just driving licenses.
|
| > Since you imply you know more about UK budgets than I do
| - how much is the DVLA budgeted for IT operations like this
| and how much more would you give them to expect this
| problem solved?
|
| It's not budgeted anything for this as far as I know. I
| believe it's handled by Government Digital Services which
| handles lots of the digital services for various
| departments. The budget for all of GDS is about 90 million
| most of which isn't for .gov.uk. A rewrite of that size I
| would expect to cost about 50-60 million in total but take
| several years.
| ellen364 wrote:
| Are you suggesting that a process once done using pen and
| paper can't possibly be complicated?
|
| I have no insight into the DVLA, but the idea that no paper
| process could ever be complicated is really funny. The UK
| enjoyed/loathed centuries of bureaucracy before computers
| were invented. At one point getting a divorce required an
| Act of Parliament specifically naming the unhappy couple!
| Being restricted to pen and paper hardly inhibited the
| human ability to create complex systems.
| ajnin wrote:
| The batch jobs don't take 13 hours. They're just scheduled to
| run some time at night where the old offices used to be closed
| and the jobs could be run with some expectations regarding data
| stability over the period. There are probably many jobs
| scheduled to run at 1AM then 2AM, etc, all depending on the
| previous to be finished so there is some large delay to ensure
| that a job does not start before the previous one is finished.
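|
| For comparison, a sketch of the alternative to padded clock
| slots: start each job as soon as the jobs it depends on have
| finished. The job names are invented and nothing here is known
| about the DVLA's actual scheduler. It uses graphlib from the
| Python standard library (3.9+).
|

```python
from graphlib import TopologicalSorter

# Hypothetical nightly jobs: each key may run only after the jobs in its set.
dependencies = {
    "extract_changes": set(),
    "update_licences": {"extract_changes"},
    "update_vehicles": {"extract_changes"},
    "generate_reports": {"update_licences", "update_vehicles"},
}


def run_in_dependency_order(dependencies, run_job):
    """Run every job after its prerequisites; return the order that was used."""
    order = []
    for job in TopologicalSorter(dependencies).static_order():
        run_job(job)
        order.append(job)
    return order


order = run_in_dependency_order(dependencies, lambda job: print("running", job))
```

|
| With explicit dependencies the total window is the sum of the
| actual run times rather than the sum of the reserved slots.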
|
| As to your "not a complex system" remark, when a system is
| built for 60 years, piling up new rules to implement new
| legislation and needs over time, you tend to end up with a
| tangled mess of services all interdependent that are very
| difficult to replace piece-wise with a new shiny
| architecturally pure one. This is closer to a distributed
| monolith than a microservices architecture. In my experience
| you can't rebuild such a thing "in 3 months". People who
| believe that are those that don't realize the complexity and
| the extraordinary amount of specifics, special cases, that are
| baked into the system, and any attempt to just rebuild from
| scratch in a few months hits that wall and ends up taking
| years.
| abigail95 wrote:
| The code will be spaghettified and hideous. The queries will
| be nonsense.
|
| That doesn't change the fact that the ultimate goal of the
| system is to manage drivers licenses.
|
| > In my experience you can't rebuild such a thing "in 3
| months".
|
| Me and my team rebuilt the core stack for the central bank of
| a developing country. In 3 months. The tech started in the
| 70s just like this. Think bigger.
| mattmanser wrote:
| Yeah, I always raise an eyebrow at attitudes like that too.
|
| I've also reimplemented or gradually replaced several out-
| of-date systems. Albeit on a smaller scale.
|
| In my experience, when you start picking the programs apart
| you find 90% of the code is redundant or boilerplate. Much
| of it isn't even called from anywhere, abandoned code, and
| can be deleted en masse. A lot of programmers don't clean
| code up "just in case" and then no-one else deletes it.
|
| They can also often be vastly simplified because
| programmers back then didn't have the patterns and
| knowledge to write concisely.
|
| I often find myself simplifying the original code first,
| which gets rid of 50% of it. Then I can see what the code
| actually does and rewrite it which gets rid of the other
| 40%.
|
| On the other hand, many programmers don't have the
| patience, stubbornness or skill to do this kind of work.
|
| And the ability to get through the major panic you have
| when you're half way through and wondering if you were mad
| to even start.
| patrickmay wrote:
| > And the ability to get through the major panic you have
| when you're half way through and wondering if you were
| mad to even start.
|
| I feel seen, thank you.
| PaulAJ wrote:
| Anyone who doesn't understand what's so difficult should read
| this:
|
| https://wiki.c2.com/?WhyIsPayrollHard
|
| It's from a different domain, but it gives you a flavour of
| the headaches you encounter. These systems always look simple
| from the outside, but once you get inside you find endless
| reams of interrelated and arbitrary business rules that have
| accumulated. There is probably no complete specification
| (unless you count the accumulated legal, regulatory and
| procedural history of the DVLA), and the old code will have
| little or no accurate documentation (if you are lucky there
| will be comments).
| stego-tech wrote:
| Basically this. The people running the show would
| desperately like to make it simpler, but ultimately it's
| left overly complicated due to priorities from past
| leadership well above our paygrade.
|
| The right solution is always to just rip off the bandaid
| and do it again by hand in a new language or platform, and
| to eliminate useless complexity while doing so.
| Unfortunately no leader would ever do this because the
| Board and/or Shareholders would crucify them for not
| outsourcing it to McKinsey first and using the fancy-pants
| automation tool their report recommended.
| pwagland wrote:
| Well, that, and any organization that has gotten
| themselves into this situation tends to have a very strong
| risk aversion principle. Which means they _can't_ approve
| something like this organisationally since there is
| simply too much risk embedded, and someone has to accept
| that.
| Reubend wrote:
| > In my experience you can't rebuild such a thing "in 3
| months". People who believe that are those that don't realize
| the complexity and the extraordinary amount of specifics,
| special cases, that are baked into the system, and any
| attempt to just rebuild from scratch in a few months hits
| that wall and ends up taking years.
|
| Rebuilding a legacy system doesn't require you to support
| every single edge case that the older system did. It's okay
| to start off with some minor limitations and gradually add
| functionality to account for those edge cases.
|
| Furthermore, you've got a huge advantage when remaking
| something: you can see all the edge cases from the start, and
| make an ideal design for that, rather than bolting on things
| as you go (which is done in the case of many of these legacy
| systems, where functionality was added over time with dirty
| code in lieu of refactoring).
| jarofgreen wrote:
| > Rebuilding a legacy system doesn't require you to support
| every single edge case that the older system did.
|
| Depends on context.
|
| This isn't some social media fun site where you can live
| with some rough edges; in this context "edge case" may be
| someone with a health condition who is still entitled to a
| drivers license; or it could be someone who normally could
| get one but due to a health condition really shouldn't be
| allowed one!
| firefoxd wrote:
| Our systems took 8 hours to back up. Then it grew to 12 hours
| [0]. The system was a side project by an intern fresh out of
| college. Over the years, it grew into a crucial software the
| company relied on. I joined over 10 years later and was able to
| bring it down to a few minutes.
|
| [0]: https://news.ycombinator.com/item?id=38456429
| jdietrich wrote:
| Per their own data, the DVLA are responsible for the records of
| 52 million drivers and 46 million vehicles. Those records are
| immensely complex, because they reflect decades of accumulated
| legislation, regulation and practice. Every edge case has an
| edge case.
|
| There's someone, somewhere in the bowels of the DVLA who
| understands the rules for drivers with visual field defects who
| use a bioptic device. There's someone who knows which date code
| applies to a vehicle that has been built with a brand new kit
| chassis but an old engine and drive train. There's someone who
| understands the special rates of tax that apply to goods
| vehicles that are solely used by showmen, or are based on
| certain offshore islands. God help any outsider who has to
| condense all of that institutional knowledge into a working
| piece of software.
|
| Government does not have a good track record of ground-up
| refactors of complex IT systems. The British government in
| particular does not have a good track record. Considering all
| that, the fact that most interactions with DVLA can be done
| entirely online is borderline miraculous.
|
| https://assets.publishing.service.gov.uk/media/675ad406fd753...
| delta_p_delta_x wrote:
| Some DVLA services don't work in the day, too. Case in point, the
| 'get a share code' service:
| https://www.viewdrivingrecord.service.gov.uk/driving-record/...
| glonq wrote:
| This sounds a bit familiar. I used to work at a medium-sized
| company whose systems were based on COBOL code and Unisys
| mini/mainframe hardware from the 80's. We even had a person
| employed as a "tape ape"; thankfully not me. Throughout the next
| decade or two they tried various 4GL-generated facades and bolt-
| ons but could never escape from that COBOL core. Eventually I
| think they migrated the software to some kind of big box that
| emulated the Unisys environment but was slightly more civilized.
| I have no idea whether they ever eradicated all the COBOL though.
| arjie wrote:
| While these explanations are plausible, certain other things I've
| encountered make me believe that deeper reasons underlie even
| these reasons. When I lived in the UK in 2017 as a foreigner, all
| applications for a driving licence as a foreigner on a T2-ICT
| visa had to be sent over for a couple of weeks and you had to
| include your passport and Biometric Residence Permit and
| everything. By comparison, I was able to get my driving licence
| at the California DMV pretty easily even as a foreigner and my
| passport and so on were photocopied and not retained. This
| drastic difference in service ability between the DVLA and a
| notoriously disliked American government service led me to
| believe that the proximal technical causes for this are
| downstream from organizational choices for how to deliver
| service.
| robertlagrant wrote:
| > downstream from organizational choices for how to deliver
| service
|
| 100000%. They're a monopoly service you must interact with or
| get fined and (eventually) locked up. They have zero incentive
| to do a particularly good job. Some orgs in this situation are
| just well run and do a good job, but there's no competitive
| pressure for them to do so.
| Y_Y wrote:
| And there are pressures other than competition, and some
| people just want to do a particularly good job just because it's
| their job.
| IOT_Apprentice wrote:
| This seems weird to me. The number of records is minuscule
| compared to internet scale tech.
|
| The data model for this sounds like it would be simple. Exactly
| how many use cases are there to be implemented?
|
| Build this with modern tech on HA Linux backends. Eliminate the
| batch job nonsense.
|
| This could be written up as a project for bootcamps or even a
| YouTube series.
|
| I suspect some internal politics about moving forward and
| clinging to old methods is at hand.
|
| Perhaps someone could build an open source platform if the
| requirements were made public.
| dhosek wrote:
| The thing is that a lot of internet scale stuff tends to be
| non-critical. It's not a big deal if 1% of users don't see a
| post to a social network site. It'll show up later, maybe, or
| never, but nobody will care.
|
| On the other hand, with transactions like banking or licensing
| or health insurance, it's absolutely essential that we
| definitely maintain ACID compliance for every single
| transaction, which is something that many "internet-scale" data
| solutions do not and often cannot promise. I have a vague
| recollection of some of the data issues at a large health
| insurance company where I worked a couple years ago that made
| it really clear why there would be an overnight period where
| the system would be offline--it was essential to make sure that
| systems could be brought to a consistent state. It also became
| clear why enrolling someone in a new plan was not simply a
| matter of adding a record to a database somewhere.
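|
| A small sketch of the atomicity part of that guarantee, using
| SQLite from the Python standard library. The accounts table and
| the transfer function are invented for illustration; the point is
| only that a failed multi-step change leaves no partial update
| behind.
|

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount between accounts atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


# The credit to bob succeeds, the debit from alice violates the CHECK,
# and the rollback undoes the credit as well.
ok = transfer(conn, "alice", "bob", 500)
print(ok, balances(conn))
```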
|
| Not to mention that I suspect that data such as bank
| transaction records or health insurance claims probably rival
| "internet scale" for being real big data operations.
| mh- wrote:
| The reason that these "internet scale" solutions are
| challenging to operate is _because_ of their latency and
| availability targets.
|
| If you threw into the requirements "can go down nightly, for
| hours, for writes AND reads", they could absolutely provide
| the transactional guarantees you're looking for.
| neuroelectron wrote:
| I'm sure the upgrade would have been trivial for a competent
| expert to do but instead they outsourced it to a big software
| firm and surprise, it went over-budget. Seriously, what could
| this database be doing that's so complicated?
| simonbarker87 wrote:
| Excellent bit of pragmatism and as a user of this service I'm
| happy with the trade off.
|
| People wondering why it's not a simple switch and "there must be
| something else going on here" have clearly never worked with
| layers of legacy systems where the data actually matters. Sure
| it's fixable and it's a shame it hasn't been but don't assume
| there aren't very good reasons why it's not a quick fix.
|
| The gov.uk team have moved mountains over the past decade,
| members of it have earned the right to be believed when they say
| "it's not simple".
| NVHacker wrote:
| Having legacy data and systems for a few years is a challenge.
| Still having them after decades is incompetence.
| simonbarker87 wrote:
| Yes I'd imagine the reason it still hasn't been fixed after
| nearly a decade is management/politics etc. But it taking
| more than just 6 months will be technical. As a result it's a
| job that falls into the area of being canned because it's
| taking too long even though no one said it would be quick.
___________________________________________________________________
(page generated 2025-01-16 23:01 UTC)