[HN Gopher] Tell HN: I salute everyone on call/working support t...
___________________________________________________________________
Tell HN: I salute everyone on call/working support through the
holidays
Thank you for keeping systems available and safe. I've been there
many times in the past, including having to fly at the last minute
to a non-internet-connected data center in NJ to babysit an
emergency production bug fix that took the entire holiday to
create, install, verify, and monitor.
Author : waynesoftware
Score : 370 points
Date : 2023-12-21 18:54 UTC (4 hours ago)
| maerF0x0 wrote:
| Yes, absolutely thanks to all who keep our world running when no
| one is looking. To keep the yule log on YouTube, to keep our
| Christmas tree lights on, to keep a fresh glass of water from the
| tap, warm natural gas to keep the freezing cold outside etc.
| Thank you for keeping society ticking away :)
| akomtu wrote:
| Let's not confuse on-call firefighters or a water facility
| staff with the on-call admins that maintain money-making
| machines monetizing attention of billions. The latter is a net
| negative on society.
| kridsdale1 wrote:
| Not a fan of the YouTube Yule Log, I see.
| krallja wrote:
| Yeah, how dare Netflix provide entertainment on-demand and
| for cheaper than the other entertainment companies?
| op00to wrote:
| I am currently viewing this on ethically sourced rfc1149
| (birds gave consent via a scientifically proven "brain
| electrode interface"), manually decoding packets using an
| abacus made out of various animal droppings foraged on the
| forest floor. If I can't view your content this way, it
| should not be on the internet.
| oceanplexian wrote:
| That firefighter is probably using YouTube or scrolling
| through Instagram to unwind while they're stuck at the
| station waiting for a call. Just because someone works in
| entertainment or ads doesn't mean that the economic puzzle
| piece they represent isn't valuable to society.
| kridsdale1 wrote:
| I'll give a shout out too to everyone in the military
| monitoring warning systems and maintaining stance to protect us
| from being killed while we're with our families.
| sleazebreeze wrote:
| My wakeup alarm this morning was 9am when OpsGenie let me know
| I'm on-call today. Praying for peace.
| frakt0x90 wrote:
| In a similar vein, I'm grateful for the people who maintain the
| foundational pieces of our digital world that often go unnoticed
| like date & time systems.
| isoprophlex wrote:
| Big up the on call heroes! Hope you're getting paid well, hope
| you get no red lights on the bug hotlines.
| chasd00 wrote:
| > Thank you for keeping systems available and safe.
|
| there's that word "safe" again. What systems are dangerous
| otherwise? Do you mean like traffic lights or something? The API
| serving ads to your mobile game isn't dangerous.
| dotnet00 wrote:
| 'Safe' in the context of systems can mean hacking attempts,
| safe from data leaks and other emergencies relative to the
| system that may arise. It can refer to things that are
| dangerous for the system itself.
| tecleandor wrote:
| On call till the 31st, so please don't hit refresh too much these
| days ;-)
| Scoundreller wrote:
| Me too, but they pay a few bucks an hour to carry the phone so
| at least that adds up.
|
| Ultimate trick is to have a diverse team. Someone that doesn't
| care about Christmas but absolutely needs some random day off
| in March (cool with us!). Someone that celebrates New Year's
| some other time.
| kridsdale1 wrote:
| Speaking of diversity, if you don't do Christmas dinner I
| strongly recommend ordering takeout from a Chinese place on
| the 25th! There are lots of happy photos of Chinese chefs and
| Jewish customers doing a Christmas fist-bump.
| rollcat wrote:
| Two live video productions, including one on the evening of 31st.
| I managed to push back on last minute infra/workflow changes (:
| mkhnews wrote:
| Thanks, been there many seasons, and same to you all.
| datpuz wrote:
| For some of us, we look forward to the peace and quiet
| tetha wrote:
| Yeah, we discourage production changes starting the first or
| second week of December, and start freezing changes in the third
| week, until it's frozen solid from the fourth week of December
| until the second week of January.
|
| December tends to be hell for our customers, so stability should
| be a priority there.
|
| And honestly, no one wants to work on holidays. So let's just wrap
| everything up starting in December, maybe use the third week for
| some unnoticed issues and then just lay down the tools. Use that
| time for documentation, or shorter days, quite frankly.
|
| That way we minimize the on-call situations occurring. Let's hope
| it goes well for the engineer this year as well. We have a streak
| to keep.
| ainiriand wrote:
| We do the same, I work in logistics software and we usually
| freeze early November up until Christmas.
| bertil wrote:
| I think that's a great policy as it's clearly intended to help
| people when they need it, and get people to unplug when it's
| valued by their loved ones.
|
| _However_ (that part is probably best bookmarked until Jan
| 2nd), it also betrays that your system is brittle and can be
| broken by a bad commit. Don't do it because you want people to
| grind until Dec 24th at 6 pm. Do it because it's great the rest
| of the year, too. I'd recommend you look into (or ask me about)
| feature flags, alerting, and automated roll-backs.
|
| The short version is: there's a meta-system on top of your
| release process that can tell (if you are using rollbacks, not
| feature flags):
| - commits until xyzsdf are fine;
| - roll-outs starting from commit abcdef have a 2% error rate, 80%
|   on Android;
| - revert to xyzsdf, send a message (low-priority, email) to the
|   DevOps on call and the author of abcdef that it happened;
| - for all commits after abcdef: if there are no conflicts with
|   xyzsdf, re-try to roll them out;
| - if there is a conflict because they were on top of abcdef, send
|   a message (low-priority email) to the authors that there is a
|   conflict.
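|
| The steps above can be sketched roughly like this in Python. It
| is an illustration under assumptions, not any particular tool's
| behavior: the 1% threshold, the Commit type, and the
| conflicts_after_revert flag are all made up for the example; only
| the commit names xyzsdf and abcdef come from the description.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the description calls a 2% error rate bad but
# does not say where the line is drawn.
ERROR_THRESHOLD = 0.01


@dataclass
class Commit:
    sha: str
    author: str
    error_rate: float = 0.0               # observed while this commit rolled out
    conflicts_after_revert: bool = False  # cannot be re-applied cleanly on the good commit


def plan_rollback(commits):
    """Plan a rollback for an oldest-first list of rolled-out commits.

    Returns (revert_to, retry, notifications), where revert_to is the
    last good sha (or None if nothing is wrong), retry lists the shas
    to roll out again, and notifications is a list of
    (recipient, message) pairs meant for low-priority email.
    """
    # Find the first commit whose roll-out exceeded the error threshold.
    bad_index = next(
        (i for i, c in enumerate(commits) if c.error_rate > ERROR_THRESHOLD),
        None,
    )
    if bad_index is None:
        return None, [], []
    if bad_index == 0:
        raise ValueError("no known-good commit to revert to")

    good, bad = commits[bad_index - 1], commits[bad_index]
    notifications = [
        ("on-call", f"reverted to {good.sha} because of {bad.sha}"),
        (bad.author, f"{bad.sha} was reverted (error rate {bad.error_rate:.0%})"),
    ]

    # Later commits are retried unless they depended on the bad one.
    retry = []
    for commit in commits[bad_index + 1:]:
        if commit.conflicts_after_revert:
            notifications.append(
                (commit.author, f"{commit.sha} conflicts with the revert of {bad.sha}")
            )
        else:
            retry.append(commit.sha)
    return good.sha, retry, notifications
```

| Nobody gets paged in this sketch: every message is low-priority,
| which is the point of reverting automatically.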
|
| There are more sophisticated versions that can do things like,
| if you use feature flags, flagging Android users to use the
| previous version. Another way to do this is to scale who has
| access to abcdef gradually: say 1% every hour, and revert if
| you detect issues.
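|
| A minimal sketch of the gradual version, again in Python and again
| under assumptions: error_rate_at is a hypothetical stand-in for
| whatever monitoring you have, the 1% limit is invented, and the
| hour-long wait between steps is left out so the loop runs
| instantly.

```python
def gradual_rollout(error_rate_at, step=1, max_error_rate=0.01):
    """Raise exposure by `step` percent per tick; revert on a bad reading.

    error_rate_at(percent) is supplied by the caller and returns the
    error rate observed with that share of users on the new commit.
    Returns (final_percent, history), where final_percent is 100 on
    success or 0 after a revert, and history lists (percent, rate).
    """
    percent = 0
    history = []
    while percent < 100:
        percent = min(100, percent + step)
        rate = error_rate_at(percent)
        history.append((percent, rate))
        if rate > max_error_rate:
            # Issue detected: send everyone back to the previous version.
            return 0, history
        # A real system would wait here (e.g. an hour) before the next step.
    return 100, history
```

| For example, gradual_rollout(lambda p: 0.05 if p >= 3 else 0.0)
| stops at 3% exposure and reverts, so at most 3% of users ever saw
| the bad commit.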
|
| All those seem daunting to teams that haven't worked like this
| before, but in my experience, they come to love it very fast.
| yardstick wrote:
| How do you detect errors like this?
|
| What is an error? Is a business logic bug going to be picked
| up by this process automatically, or are some manual steps
| involved?
|
| Ie a point of sale app releases an update that automatically
| halves the amount to charge, but displays the full amount to
| the merchant in the UI. Unit tests pass (because an engineer
| made a human mistake). Backend calls are correctly used, no
| errors thrown, simply the wrong amount is used.
|
| How would this be automatically detected and reverted?
|
| Would anyone writing point of sale software want to risk this
| over one of the biggest trading periods of the year?
| codebolt wrote:
| Yeah, that model may work for many public facing apps, but
| probably less so for enterprise systems that are heavy in
| business logic.
| bertil wrote:
| As you point out, it really depends on what is an error.
| Most of the companies I know of that have a holiday freeze make
| video games, casual ones, even. Changes are minor fixes and
| optimization--glitches that a player likely won't notice,
| but you want to detect them early to avoid losing your
| ability to detect more.
|
| Back-end tools are different, and I definitely see reasons
| other than bugs to not change business logic this month.
| tetha wrote:
| We use these systems liberally at other times of the year and
| no one notices, usually. If they do, downtime and
| interruption budgets handle this.
|
| /However/, let me counter with the point: Just one of our
| customers has 8000 FTEs working with our system. During
| hell-time (aka, December and Christmas shopping and shipping),
| each of those dudes spends their shift taking customer calls
| lasting 2-4 minutes, which in turn require a few requests
| into our systems.
|
| Due to the stress of their customers^2 (because it's
| Christmas and holidays and such), if an agent of a customer
| is unable to access our systems, they cannot handle the use
| case of the customer^2 and that will piss off the customer of
| the customer.
|
| So if we push a bad change during this time, we're going to
| piss off hundreds of customers^2 per minute for that one
| customer alone. Even with a fast automatic rollback, that's a
| long time during hell-time. And they have people who know how
| to yell at vendors in nasty ways who don't like that.
|
| I enjoy moving software fast and enabling moving software
| quickly, but customer focus and customer orientation means to
| understand when to move slow as well.
|
| And hey, if that means more quiet holidays for the hard
| working operators on my team, who's gonna complain?
| bertil wrote:
| You are far ahead of most companies.
|
| I've worked for too many places where the Christmas break
| was because of a lack of tooling. I'm glad you are two
| steps ahead.
| ok_dad wrote:
| The place I work for pushed v2 of their software, a full
| rewrite (nothing from the old system, not even databases) by a
| new team, into production this week for several customers.
| Mostly they did it so they could say they met their made up
| 2023 KPIs for the v2 rewrite. There was no good reason to push
| it out now other than that, and there were several reasons not
| to, such as it wasn't well tested and it's fucking December
| 20th. Anyways, I'm not really on call so I can't complain much,
| but my poor coworkers have to support this over the holidays
| now.
| hotsauceror wrote:
| Ugh. Several years ago I spent an entire Christmas vacation,
| including all day Christmas Day, putting out fires because a
| team couldn't be bothered to do five minutes of cursory load
| testing. As a consequence, multiple production systems went
| down under load.
|
| Later, after we regrouped after a month of this brutality,
| they wandered around the office bragging like they'd hung the
| fucking moon after they fixed the crippling, obvious design
| issue they'd released. I confronted the dev lead with the
| fact that they would have seen this after 30s of load testing
| and he just laughed, I think he literally said "LOL". A giant
| middle finger, that's what Ops got from Dev for Christmas
| that year.
|
| Here's to the people who KTLO. My people.
| lynx23 wrote:
| I announced a downtime for a smallish GPU Cluster starting from
| Christmas Eve just a few hours ago. It is just the perfect time
| to schedule a day or two of downtime for a system like that. And
| if IPMI doesn't fail me, I can get a lot of things done without
| leaving the comfort of my home. I scheduled this without pressure
| from my boss. It was a totally voluntary decision... While being
| raised as a Christian, this time of the year is for me more about
| solstice than about the Christian celebration. A time to enjoy
| the comfort of a heated home. A time to celebrate that the days
| are going to be longer from now on again. A time to reflect on
| the past year. And all of this is easily done while having a few
| terminals open and waiting for remote stuff to complete...
| wavemode wrote:
| Just barely started my current job too recently to be in the on-
| call rotation yet. Lucked out! Props to those keeping the wheels
| turning.
| RationPhantoms wrote:
| As if this week had attempted to take a measure of blood from my
| body, I'll be on-call next week. Looking forward to all things
| quiet on the HEP network front.
| 6stringmerc wrote:
| Agreed, there are some gigs that just really require support to
| exist - I know this first-hand from working at a Zoo (very large
| exotic animal rescue basically). Animals do not take holidays.
| They need to eat and do animal things in spite of our customs
| that day.
|
| On the flip side, having worked Cinema on Christmas Day two years
| I think, there is no amount of Grace and Patience I can give that
| is enough to those earning their living. Still have a hat and
| polo. Why? I had to buy them!
| blast wrote:
| https://www.youtube.com/watch?v=zB1T3zgne5Y
| bertil wrote:
| Always be kind, and say it's your fault.
|
| If you don't do it for the sake of the person you are asking for
| help, do it because it works better. That's the most practical
| advice [0] ever given by Hans Rosling [1], the Fact master
| himself:
|
| > In fact, I have the secret to how to get the best help
| immediately from any customer service, like the phone company or
| the bank or anything. I have the best line, it always works. You
| want to know what it is? When I call, I say, "Hello. I am Hans
| Rosling and I have made a mistake." People immediately want to
| help you when you put it this way. You get much more when you
| don't offend people.
|
| [0]: Unless you are in charge of a developing country's budget
| and have to decide between education and healthcare.
|
| [1]: https://blog.ted.com/qa_with_hans_ro_1/
| chunkymilk wrote:
| > Always be kind, and say it's your fault.
|
| I do this with internal teams at work. I've found approaching
| other teams with issues with their library/framework in a "this
| could be our mistake" manner really helps in keeping them from
| getting defensive and stonewalling.
| steve_adams_86 wrote:
| I do something similar. Hey, I'm pretty sure I'm doing
| something wrong. Can you help me figure it out?
|
| Then be grateful for the help, because it truly isn't granted
| or a given that people have to drop everything and figure
| things out for you, even if you work together. And even if
| the mistake was actually theirs. Gratitude is huge.
| smoyer wrote:
| I'm going to try that, but will need more information about
| Hans Rosling to get through the identity verification ...
| caminante wrote:
| You forgot the rest of the story!
|
| _> "Hello. I am Hans Rosling and I have made a mistake."_
|
| continues on as...
|
| _> "I foolishly chose to rely on <insert your service>. I just
| spent 7 minutes hopping around your phone tree with deadend
| voicemail terminuses and an outdated monologue starting with
| "Due to high call volumes..." that has been running since
| before Covid19. Finally, I found the right combination to talk
| to you. A human! There's no option to cancel my service online
| and the help menu threw an error after filling out a detailed
| form. Can I please reset my password?"_
| whalesalad wrote:
| Meanwhile a huge number of us (non-religious? introverted kernel
| compiling cave dwellers?) treat this period no differently than
| any other week in the year. I'll be here keepin the servers
| runnin :horns:
|
| It's actually my favorite time of the year. Everyone is gone, it
| is quiet, and I can get shit done.
| kaashif wrote:
| > non-religious
|
| Or a member of one of the religions that don't celebrate
| Christmas.
| muzani wrote:
| It's me. But we still have a holiday period at the end of
| year - normally financial targets are hit and it's a 4 day
| leave to get 10 days off.
| sneak wrote:
| Holidays are special because they're special. Both the winter
| solstice festival (rebranded for Christianity) and the spring
| equinox one (same deal) can be treated differently for cultural
| variety by the non-observant.
|
| I'm a militant proselytizing atheist raised by a Jew and I
| still have a tree with pretty lights, give presents, and drink
| and eat some things I only drink/eat once per year (never make
| homemade eggnog if you ever want to enjoy it guilt free again,
| you're basically drinking a megacalorie of heavy cream, yum).
| It's fun to celebrate the generic concept of "holiday" - a time
| that is different from other times.
|
| You're allowed to feel nice about peppermint candy (and/or
| chocolate gelt, I go for both) at the end of December without
| bringing the supernatural into the equation. :)
|
| \m/
| whalesalad wrote:
| Oh ya same I love the smell of evergreen wreaths and trees
| and enjoy partaking in festive activities. A 4K cracklin'
| Yule log goes a long way too.
| iddan wrote:
| FYI Israelis are not on holiday - our holidays are on entirely
| different dates. Hire Israelis and experience no down time while
| working with Silicon Valley level talent.
| liorsbg wrote:
| True story
| loloquwowndueo wrote:
| Just hope stuff doesn't break on sabbath, they ain't touching
| no computer that day :)
| INTPenis wrote:
| They'd need a goy on-call to flick the switch.
| cco wrote:
| I'm not sure if there is something in the water this year, but
| this week, Dec 18th to Dec 21st (only a partial week), has
| already been our busiest week of all time.
|
| Sweating over here trying to make it through the week and praying
| that it slows at least for the first half of next week.
| yardstick wrote:
| Just remember this time of year is often peak vulnerability time.
| Attackers exploit the fact that teams are at reduced strength and
| off guard, with slower response times to investigate and fix
| issues.
| timwaagh wrote:
| I wouldn't mind honestly. Seems like a good excuse to skip the
| social obligations.
| smoyer wrote:
| I'm on call 12 hours a day and hoping things are very quiet next
| week. Best wishes to everyone else too!
| acedTrex wrote:
| We've been on freeze for weeks now in preparation for the holiday
| season.
| er0k wrote:
| Thanks for the salute, but we also accept cash :)
| mateusfreira wrote:
| Thanks for all the great work; I hope no one has an outage this
| holiday and everyone has time to enjoy with family and alone.
|
| Keep up the good work, folks
| comprev wrote:
| I salute those in the startup world - the ones in a team of 5 and
| they're the only Ops person who always gets paged.
|
| Been there, done that.
| itqwertz wrote:
| Holidays are excellent times for hackers to take advantage. It's
| not just Christmas or other Western holidays, either. Extend this
| principle to any holiday/world conflict/anniversary of conflict
| made into holiday/calendar new year and then adjust your time of
| attack.
|
| protip: US companies with offshore groups are usually
| underfunded, understaffed, and underskilled. Time to see if that
| disaster recovery environment works!
|
| Happy holidays to those who encounter system stress tests. Can't
| spell salary without some elements of slavery...
| dallas wrote:
| Managers, if you're reading this, and you have
| engineers/developers "on call" but not contractually (off book),
| make it so, because it slightly sucks when you're having
| Christmas drinks but can't enjoy yourself because you might need
| to drive somewhere and climb up a ladder to tend to a product.
___________________________________________________________________
(page generated 2023-12-21 23:00 UTC)