[HN Gopher] Write Better Error Messages
___________________________________________________________________
Write Better Error Messages
Author : noch
Score : 386 points
Date : 2022-10-19 12:27 UTC (10 hours ago)
(HTM) web link (wix-ux.com)
(TXT) w3m dump (wix-ux.com)
| egberts1 wrote:
| 0x0000001E, KMODE_EXCEPTION_NOT_HANDLED
|
| That is all.
| _nalply wrote:
| It is my opinion that software problems tend to be analyzed
| along these four axes:
|
| - Can an end-user solve the problem themselves? If so, tell them
| how, if not, display a generic error message telling them to ask
| for support (with an error identifier they can tell the support)
|
| - Developers and end-users need different information: developers
| need as much information as possible, like file names, contents
| of important variables and especially where the error happened in
| the source code with a backtrace, sometimes even two backtraces:
| the backtrace for the cause of the error, too; and end-users only
| need to be told what they can do, but this needs to be worded
| clearly and carefully. This means that error messages need to be
| written twice.
|
| - Is the problem serious? If so, report, crash and restart, if
| not, just report and abort the affected operation when
| necessary.
|
| - The problem should be logged. Sometimes it can be sent to
| developers automatically.
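|
| A minimal sketch of the "written twice" idea above, in Python.
| The class name AppError, its fields, and the wording of the
| fallback message are hypothetical, chosen only for illustration:
| one exception carries a user-facing text, a developer-facing
| detail, and a short identifier the user can quote to support.

```python
import logging
import uuid

logger = logging.getLogger("app.errors")


class AppError(Exception):
    """Error carrying separate text for end-users and for developers."""

    def __init__(self, user_message, developer_detail, user_fixable=False):
        super().__init__(developer_detail)
        self.user_message = user_message
        self.developer_detail = developer_detail
        self.user_fixable = user_fixable
        # Short identifier the user can quote to support.
        self.error_id = uuid.uuid4().hex[:8]


def render_for_user(err):
    """Return the text an end-user should see."""
    if err.user_fixable:
        return err.user_message
    return (
        "Something went wrong on our side. "
        f"Please contact support and quote error {err.error_id}."
    )


def report(err):
    """Log the developer-facing detail, then return the user-facing text."""
    logger.error(
        "error_id=%s detail=%s", err.error_id, err.developer_detail, exc_info=err
    )
    return render_for_user(err)
```

| With this shape, a problem the user can solve shows the
| instructions, and anything else shows only the identifier, while
| the log keeps the full detail under the same identifier.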
| usrme wrote:
| Here's another link for how to write useful error messages:
| https://www.bbc.co.uk/gel/features/how-to-write-useful-error....
| allisonbrow wrote:
| They recommend, avoid technical jargon so change it to:
|
| 'due to a technical issue on our end'
|
| but isn't that also generic and obvious, which they were trying
| to avoid too?
| Minor49er wrote:
| At a previous job, writing unambiguous error messages was
| discouraged. Everything just had to be "Oops! Something went
| wrong"
|
| The reasoning was that "users can't do anything with information
| we tell them anyways", despite the overwhelming number of help
| desk tickets we'd get from "Oops!" appearing in a million
| different scenarios with no clear way for us to tell what error
| actually caused the message to appear.
|
| Users naturally report the messages that they see because they're
| helping us to see the problem. I didn't get why that was such a
| hard concept to understand.
| MattGaiser wrote:
| I have only worked at one place that wanted informative error
| messages.
|
| All the others wanted to hide the reason because "if we know
| the reason and tell the user, we seem incompetent" or "then
| hackers will know which API call isn't working right"
| (apparently the network console in Chrome is beyond hackers) to
| wanting customers to be dependent as they paid for support.
| rockemsockem wrote:
| People who don't know anything about computer security use it
| as a bludgeon to not do the thing that they didn't want to do
| anyway.
| PetahNZ wrote:
| As long as you are logging the error with the context somewhere
| that's fine. You could always include a timestamp or request ID
| with the user message to not give away information, but be able
| to easily search your logs for the occurrence.
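|
| One way this could look, sketched in Python. The function name
| handle_request and the shape of the returned dictionary are
| assumptions made for the example, not anything from a specific
| framework: the traceback goes to the log, and the user only sees
| an opaque request ID plus a timestamp.

```python
import logging
import uuid
from datetime import datetime, timezone

logger = logging.getLogger("app.requests")


def handle_request(operation):
    """Run `operation`; on failure log the context and return an opaque reference."""
    request_id = uuid.uuid4().hex
    try:
        return {"ok": True, "result": operation()}
    except Exception:
        # logger.exception records the message plus the full traceback.
        logger.exception("request_id=%s failed", request_id)
        timestamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        return {
            "ok": False,
            "message": f"Something went wrong. Reference: {request_id} ({timestamp})",
            "request_id": request_id,
        }
```

| Searching the logs for the request ID then leads straight to the
| traceback, and nothing internal leaks into the user message.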
| m-p-3 wrote:
| The epitome of uselessness: making an error message so "user-
| friendly" that it doesn't help anyone.
|
| At least a "Details" button to unmask the technical details
| would be useful in some way, while hiding the "ugliness" to the
| end-user.
| wongarsu wrote:
| That seems like peak uselessness. Even "Error code 0x00ad4829"
| is a more useful message, because even if it's useless to the
| user it is useful to _somebody_.
| cogman10 wrote:
| There is some logic, the "you don't want to expose your
| internals". Really useful messages might contain a lot of
| details about the tech stack you use (giving a nice hint into
| which CVEs to try).
|
| That said, this is an easily solved problem. The best
| solution is to aggressively log errors AND prioritize having
| dev teams push that error count to 0. If an error happens,
| it's a bug.
|
| The next way to solve it is simply a report button. Let the
| users click a "I'm mad at you for not working" button and
| embed something like a session ID that allows internal
| queries into what went wrong.
|
| Error codes are a terrible solution, but perhaps an OK option
| if this is not hosted software. That said, a more user
| friendly approach would be a QR code with all the relevant
| details embedded.
| marcosdumay wrote:
| > Really useful messages might contain a lot of details
| about the tech stack you use (giving a nice hint into which
| CVEs to try).
|
| Nope. Useful messages contain details about what your
| software does. Anything about your tech stack is redundant
| and can be removed.
|
| > The best solution is to aggressively log errors AND
| prioritize having dev teams push that error count to 0.
|
| Many errors can only be replicated by talking to users. And in
| the cases where your dev team is not capable enough to remove
| all errors, you will still want to provide customer support
| and work-arounds.
|
| > The next way to solve it is simply a report button.
|
| A report button is good. But neither session ID nor any
| data that you can reasonably add to your logs will be
| enough to let dev know what went wrong. Besides, your
| report button will have errors too.
|
| And anyway, everything you said applies exclusively to
| people who create web applications. Many other types of
| application exist, and everybody writing them is better
| off not following any of your recommendations.
| berkes wrote:
| Why are error codes a terrible solution? I'd rather have an
| error "bad request f12793b2" than a "bad request".
| Obviously I prefer a "bad request, 'expiresAt cannot be
| after 2022-12-19'. Code f12793b2".
|
| Having a unique ID to be able to search in documentation or
| even source code is -IMO- preferable. It's still rather
| technical and helps only those who can search such docs,
| but at least it gives something unique to google/search
| for.
| slavik81 wrote:
| This seems to be the approach that Android takes. If you try to
| connect to a WiFi network and it fails, it just gives up. It
| won't tell you why it failed. This makes it very frustrating to
| figure out what's wrong. Maybe I wouldn't understand the error
| message, but at least it would provide a starting place for me
| to look up more information or ask for help from someone
| knowledgeable.
| jrochkind1 wrote:
| > The reasoning was that "users can't do anything with
| information we tell them anyways",
|
| I mean, I feel like the focus of the OP was on giving them
| something they _could_ do something with. Like the information
| that their information was not lost; and the recommendation to
| change X or try again in Y way; and the fallthrough to contact
| customer support with a quick link.
|
| The OP was definitely not recommending giving more specific
| technical info without thinking about what the user could do
| with it, but instead specifically thinking about what the user
| could do or would want to know (about their data/account, not
| about your under the hood services), and giving info to that
| end.
| grandinj wrote:
| Probably just me, but I am less concerned with how good my error
| messages are, and more concerned with trying very very hard to
| make the errors happen closer to the cause of the problem, rather
| than further away.
|
| "Fail early, fail hard"
|
| i.e. if I can make the error message happen near the beginning of
| a process, I can get away with making it a hard error.
|
| Hard errors in the middle of a multi-hour operation tend to annoy
| people.
| vbezhenar wrote:
| Exactly. Software must crash as soon as possible and include
| some context information which is necessary to further debug
| the problem.
| Merad wrote:
| This is an attitude I really try to build up in junior devs.
| Soooo many people seem to default to writing code like, "if
| input is null return null" (when input should never be null) or
| "if valueThatShouldBePositive < 0 silently skip the code that was
| going to use the value". If the app detects that something is
| in an invalid state _I want it to break_. The worst problems to
| debug are the ones where you have to work backwards through
| miles of strange behavior and corrupted data to find the root
| cause, because the program tried valiantly to soldier on long
| after it had been shot through the heart with bad data.
|
| I guess this is because no one really teaches error handling. I
| assume a lot of students end up with a mindset of "just make the
| errors go away" instead of "deal with the errors effectively".
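|
| To make the contrast concrete, here is a small Python sketch.
| Both functions and the discount example are made up for
| illustration. The first silently tolerates bad input, the second
| breaks at the point where the invalid state is first seen.

```python
def apply_discount_lenient(price, percent):
    # The pattern described above: bad input is silently skipped,
    # so the caller never learns that something was wrong.
    if price is None or percent < 0:
        return price
    return price * (1 - percent / 100)


def apply_discount_strict(price, percent):
    # Fail where the invalid state is first detected,
    # and say which value was bad.
    if price is None:
        raise ValueError("price must not be None")
    if not 0 <= percent <= 100:
        raise ValueError(f"percent must be between 0 and 100, got {percent!r}")
    return price * (1 - percent / 100)
```

| The lenient version returns a plausible-looking number for a
| negative percentage, which is exactly the kind of result that
| surfaces much later as corrupted data.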
| S201 wrote:
| Agreed; I've often wondered if this is a result of early CS
| classes usually expecting students to handle weird/bad
| inputs. It's only natural for a programmer to want to write a
| program that gracefully handles all reasonably bad inputs,
| like nulls. So we're taught early on to write defensive code
| that handles those. And that's fine when you're writing
| short, academic programs. But when the complexity goes up by
| a few orders of magnitude trying to gracefully handle that
| null value 10 levels deep in some parsing logic maybe isn't
| the best thing to do. Old habits die hard, however.
| nightpool wrote:
| Yeah, this is a great point. Both overly defensive
| programming and (my personal least favorite) overly-
| commented code are instilled in students at a very early
| point in their careers by irresponsible teachers trying to
| find something to grade students on (Didn't handle negative
| values? 5 points off! Didn't leave a comment on every line?
| 1 point off per line!)
| elboru wrote:
| When I was a jr dev, getting exceptions was synonymous with
| "me messing something up". Null exceptions were especially
| annoying, so the naive approach is to check for nulls and
| avoid the code that will cause the exception. And it "works"!
| You don't get exceptions and your code keeps running. It's
| only when you need to fix difficult bugs while going through
| logs that you understand the value of having the right
| exception with the right message. And you learn to love them
| and start caring about them.
| tiborsaas wrote:
| That's not really a respectful practice. Error messages should
| be clear and actionable.
|
| Users don't care if you consider an error soft or hard.
| madeofpalk wrote:
| I think the point is that the higher up you fail, the harder
| it is to identify why you errored in order to give the user
| clear and actionable feedback.
| tiborsaas wrote:
| That's possibly indicating a bad UI / information
| architecture if you are unable to tell that.
| llanowarelves wrote:
| When you have nested exceptions being caught by other
| exceptions, how do you determine what level is correct to
| show the user? Especially when it's a service class or
| something that is used by a lot of calling code.
|
| It's implied that it would be the upper top-most
| exception handlers in that code path but those are gonna
| be more generic in their messages, and anything more
| detailed has to be manually wrapped to add useful
| description (that's not some internal developer
| exception).
|
| Error codes may be the least bad solution to fall back
| on.
| lupire wrote:
| After it fails fast (thank you!), we also want to fix fast. So
| we need info.
| ChrisMarshallNY wrote:
| The general approach that I take, is that an error message is one
| of the most stressful occurrences that a user encounters, so it's
| incumbent upon me to make it as pain-free as possible.
|
| First of all, unless I'm writing an engineering tool, my users
| aren't geeks, and don't especially care _why_ the error is
| happening (geeks always need to know _why_). They just need to
| know that what was expected, did not happen. If there is a
| remedy, and it can be simply stated, then I can add that, but _it
| needs to be short and simple_. Longer stuff needs to go into some
| kind of secondary screen (which probably won't be read).
|
| Also, I take the "shopkeeper" approach. The customer is always
| right, and it's never the customer's fault. I avoid any hints of
| blaming the user (even if it is their fault), and try to be
| polite and helpful[0].
|
| Of course, the best way to deal with errors, is to avoid them. I
| try to design good affordances.
|
| The rules are different for SDKs, though. In that case, I tend to
| send a great deal of information back. I take advantage of
| Swift's enums, and the ability to associate data. It can allow me
| to nest error reports.
|
| [0] https://littlegreenviper.com/miscellany/the-road-most-
| travel...
| Stratoscope wrote:
| 20 years ago I was working on Acrobat at Adobe. I was mostly the
| "Windows guy" but also worked and tested on the Mac.
|
| When I tried to install Acrobat on my Mac, I got this message:
|
| "Your hard disk is too small"
|
| My _what_ is too small?!
|
| Later, on Windows I got this unexpected popup:
|
| "You are not here"
|
| WTF?
|
| I searched the code for that string and found it in a function
| named "CantHappen()". This function was called in numerous places
| where the programmer thought there was no possible way for the
| code to get to that place. But of course CantHappen() _did_
| happen.
|
| As I looked through the code I found many other messages that
| were bizarre and incomprehensible and sometimes downright
| offensive.
|
| So I started a project to go through all our messages and make
| them more clear and informative - and even better, when possible
| to not have the message at all but just take care of the
| situation.
|
| The underlying cause of these bad messages was twofold:
|
| 1. Programmers never got raises for writing great error messages
| or finding ways to avoid them in the first place. We were just
| rated on how much work we got done.
|
| 2. We did have a product designer who was supposed to specify all
| user-facing messages. But the designer mainly considered the
| "happy path" and didn't think about edge cases. It was left to
| developers working under time pressure to handle those.
| TacticalCoder wrote:
| > Later, on Windows I got this unexpected popup:
| >
| > "You are not here"
|
| The absolute best I had in a Microsoft product was this
| (paraphrasing): _"An error happened because your computer may
| be turned off"_. I still have a screenshot of that somewhere.
| What it meant was that a hypothetical computer I might be trying
| to connect to (which I wasn't, it was all local) was off, but
| that wasn't the case. This was seriously WTF.
|
| The second most beautiful one from another Microsoft product
| was whatever software generating a password and asking me, in a
| pop-up window, to write it down. The problem was the password
| was something like: 9mZOvy9E(4)?6b(w(<$KcTU%>
| 9T6cz0Z4YxgQ-<tw035X6S.dLE0[2n0"42`/S=S1{q5{)61s190':&6UHT.4hZX
| jO6b%l#X7v]~4tIT2Y0._ebFH,>2:G>%*P]7n4"
|
| I probably also still have a screenshot of that somewhere.
|
| Haven't used Microsoft stuff in two decades so it was a long
| time ago. But it's still seriously WTF.
| im3w1l wrote:
| The paradox of CantHappen is that if the programmer truly
| thought it can't happen then there would be no need for it in
| the first place. The only reason to include it is because of a
| fear that it may in fact happen.
|
| Rust, funnily enough, has unreachable!() for that case, but it
| also has unreachable_unchecked() for actually unreachable code.
| Reaching the latter is undefined behavior; it exists to help the
| optimizer.
| giancarlostoro wrote:
| What does unreachable!() do actually? I had no idea that was
| a thing.
| maleldil wrote:
| It terminates the program with panic!
|
| https://doc.rust-lang.org/std/macro.unreachable.html
| remram wrote:
| Rust has a few of those, they all panic but with different
| default messages: panic!(), todo!(), unimplemented!(), and
| unreachable!()
| FartyMcFarter wrote:
| I've been guilty of this in the past - I remember writing an
| error message that looked like "if you used X setting, do this,
| otherwise that". The code should have instead checked what
| settings the user enabled and given a clearer error for the
| situation at hand.
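|
| A tiny Python sketch of that fix. The "use_proxy" setting and
| both message texts are hypothetical stand-ins for the unnamed
| "X setting": the code inspects the configuration and picks the
| one message that applies, instead of asking the user to work out
| which branch they are in.

```python
def sync_error_message(settings):
    """Pick the message that matches the user's actual configuration."""
    if settings.get("use_proxy"):
        return "Sync failed. Check the proxy address under Settings > Network."
    return "Sync failed. Check your internet connection and try again."
```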
| marginalia_nu wrote:
| That sort of code is a bit tricky though.
|
| Since the fault code paths (hopefully) are very rarely
| executed, the error messages are easy to overlook, and tend to
| rapidly become stale. This is to an extent always a problem
| with error messages, but it's an even bigger problem when you
| have half a dozen error messages depending on various
| parameters, since they create more and even more rare code
| paths for staleness to hide in.
| Too wrote:
| Internet connectivity is an obvious candidate for this.
|
| Could not connect to server? Check if WiFi is on. Check if DNS
| is working. Check if ping to router is working. Check if ping
| to google is working. Link to wifi settings.
|
| Whatever you do. Just don't do this the reverse way, like my
| smart ass Samsung tv does! It determines if internet is working
| by pinging a Samsung server, _before_ it even allows _other_
| apps to use their internet. You can probably figure out what
| will happen when Samsung servers are down.
| e40 wrote:
| Tried to download my data from takeout.google.com and got this
| error:
|
| "500. It's an error."
|
| Thanks, google. I tried to start a chat (I'm a Workspace
| customer) and could not continue because all the language choices
| were disabled (even English).
| hulitu wrote:
| You should be happy that you got an error message from Google
| because their default is not to give any.
| robmoore121 wrote:
| I really like this. There are clear shibboleths which identify
| the author as a person who deeply respects and cares for the
| readers of error messages, and their experiences. It makes me
| hopeful for the future of software when I see that there are
| others. Thanks for sharing.
| upofadown wrote:
| There are fundamentally two classes of error message:
|
| 1. Information that can help a technically engaged person debug a
| problem.
|
| 2. Information that can help a user of the system understand what
| they have to do to overcome the problem.
|
| Since most error messages are created by people responsible for
| debugging the system they tend to be of the 1st class. There has
| to be a way to provide different information based on who is
| getting the error.
| tremon wrote:
| There's a fatal flaw in assuming that there's no overlap
| between groups 1 and 2.
| koblas wrote:
| The error message that is presented to the user should always
| be clear and helpful. When an error is presented to the user,
| you should have matching logging (e.g. sentry) that provides
| technical reporting on what happened. By having both solutions
| in place you have error handling that is complete and services
| both communities.
| FridayoLeary wrote:
| There's also a third class which is "Oops! Something went
| wrong..." which basically means "i don't know. Try and reload
| the page." Why this is better than a simple "error" is beyond
| me, but it's mildly frustrating.
| mttjj wrote:
| > There has to be a way to provide different information based
| on who is getting the error.
|
| Yes, this concept exists. The error message that is shown to
| the user (number 2) is what's discussed in the article. The
| error message that an engineer or someone else debugging the
| system should get (number 1) is the full stack trace and data
| dump that should be sent to the application log at the same
| time that the user is shown the error dialog.
|
| Users can fix the problem by following the instructions in the
| error dialog and engineers or technical people can come back
| later and look at the more detailed stack trace to determine
| the best course of action.
| lupire wrote:
| It's easy. Just provide both, with mark-up to label them.
| MetaWhirledPeas wrote:
| > There has to be a way to provide different information based
| on who is getting the error.
|
| This is already solved. Provide one error to the user and
| another to your logging system. In the user error provide a
| mechanism to point you to the logged error (even a simple
| timestamp helps).
| munk-a wrote:
| I don't disagree with any points but they missed a big one. If at
| all possible, include an application-unique (or, better, globally
| unique) error code on each of your errors - e.g. YCOM-HN-9021.
| When you provide a clearly googleable string you can help your
| users independently resolve the issue and you can also set up
| google alerts on the string - if you roll out a new feature that
| took 3 months to develop and a week later google tells you that
| YCOM-HN-9021 is up 9000% you probably broke something. If at all
| possible make yourself open to client communication but most
| users won't reach out about an error - users have very low trust
| in customer care in the modern world (and it is, honestly, often
| more trouble than it's worth) and are more likely to turn to
| reddit/technical forums for a solution. It is extremely
| advantageous to try and track these users.
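|
| A rough Python sketch of keeping such codes unique. The enum
| members and the second code (YCOM-HN-9022) are invented for the
| example; only YCOM-HN-9021 comes from the comment above. The
| standard library's enum.unique decorator rejects duplicate
| values, so two errors cannot accidentally share a code.

```python
from enum import Enum, unique


@unique  # raises ValueError when the class is defined if two codes collide
class ErrorCode(Enum):
    UPLOAD_TOO_LARGE = "YCOM-HN-9021"
    SESSION_EXPIRED = "YCOM-HN-9022"


def format_error(code, message):
    """Append the searchable code to the human-readable message."""
    return f"{message} (error code: {code.value})"
```

| Keeping every code in one place also gives a single list to
| document and to set up search alerts for.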
| CodeWriter23 wrote:
| Even their example of terrible, "Whoops something went wrong" is
| miles ahead of Chrome's "Oh snap!"
| est wrote:
| Redesign the error messages or UX however you want, but I hope
| there is always a "more" button to show exactly what went wrong.
| I hate digging through eventvwr.msc or a less -nir wall of log
| text.
| 734129837261 wrote:
| I completely agree with this article, but it never bothers me in
| particular. But I'm a developer, so I'm an outlier. That said, I
| do wish that the error message I see every day would be simpler.
| <looks at TypeScript>
| hdesh wrote:
| Nicely written piece with clear examples. It would be great to
| know the impact of this work. Perhaps one metric to look at would
| be the number of tickets submitted to customer care?
| Terr_ wrote:
| Over time I've come to believe in the "grepability" of error
| messages, and the code-lines that construct them.
|
| Sometimes the data (and error messages) flow up and down through
| so many different modules and APIs and job-queues and whatnot
| that, when an error pops up, it saves a lot of developer time if
| you can just text-search the code repo(s) and see exactly the
| line that generated it in the first place.
| klik99 wrote:
| "Passing the Blame" in particular is a personal pet peeve. I hate
| when apps phrase errors like I did something wrong by clicking
| the totally normal link. Closely related is the general trend of
| "lol wut" tone in error messages, which really grates when you're
| frustrated and doing something that might be very important.
| "Whoops! We made an Oopsies! Sorry :("
| manv1 wrote:
| Really, error handling has been my big beef with CS education for
| like 40 years. There is none.
|
| Error handling has been left to engineers, and when left to their
| own devices engineers will almost always make the wrong choice
| from a user point of view.
|
| Engineering needs to think of error messages this way: the error
| message is there to help people (which might be fellow engineers,
| support, and/or your consultants) identify the error quickly
| so that they can manage the user's expectations, fix the error,
| or both.
|
| Unfortunately, many engineering paradigms make this an impossible
| task.
|
| Layering and encapsulation means that you have little idea what's
| happening downstream or how the downstream stuff actually works,
| but the lower-level you are the less likely the error will mean
| anything to the end-user.
|
| Then, it's a question of who's responsible for handling the
| error? If you're on the backend, where does it go? Does the user
| care that the backend microservice can't connect to the database?
| Heck, the UI probably has no idea what's happening back there.
|
| However, for accurate troubleshooting detail is needed.
|
| For many orgs, leaving transaction IDs in your log files is the
| primary way that you figure out errors, especially in big
| distributed systems. That doesn't really help end-users, and
| requires developer discipline, something many engineering teams
| find challenging.
|
| Ideally error objects would aggregate error codes up the stack,
| so that if an error occurs you can at least present technical
| people with the errors that were thrown..and they can search
| through the source code trying to find that unique error code.
| But designing that is difficult; conceptually you don't want a
| list of 500 error codes being thrown upwards, one from each
| function in the call chain. But sometimes you do.
|
| Anyway, error handling design really should be part of the
| initial architecture, but it usually isn't because architecture
| guys don't really understand support.
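|
| The "aggregate error codes up the stack" idea could be sketched
| in Python with exception chaining (raise ... from ...). The
| CodedError class, the codes DB-001 and UI-042, and the two
| functions are all hypothetical. Each layer that has something to
| add wraps the lower error, and a helper walks the __cause__ chain
| to recover the trail.

```python
class CodedError(Exception):
    """Exception tagged with a short, searchable error code (hypothetical scheme)."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def error_trail(exc):
    """Collect codes from the outermost error down to the root cause."""
    trail = []
    while exc is not None:
        # Errors without a code are listed by their class name.
        trail.append(exc.code if isinstance(exc, CodedError) else type(exc).__name__)
        exc = exc.__cause__
    return trail


def load_profile(user_id):
    try:
        raise ConnectionError("database unreachable")  # stand-in for a real failure
    except ConnectionError as err:
        raise CodedError("DB-001", "could not query profiles table") from err


def render_page(user_id):
    try:
        return load_profile(user_id)
    except CodedError as err:
        raise CodedError("UI-042", f"could not render page for user {user_id!r}") from err
```

| Only layers that wrap the error add a code, so the trail stays
| short instead of growing by one entry per function in the call
| chain.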
| residualmind wrote:
| Watched the new Quantum Leap yesterday (it's not great) and there
| was this really cringeworthy moment when something goes wrong
| with their awesome supercomputer and the screen flashes a giant
| "INTERNAL SYNTAX ERROR". Apparently, somebody didn't run their
| linter before sending people through time. Too bad.
| londons_explore wrote:
| How about just engineering stuff to not have errors in the first
| place.
|
| My toaster is a complex bit of engineering - it has thousands of
| parts which all work together to take power from the wall to make
| toast.
|
| Yet it has no errors. It just does the job I ask it to do.
|
| A computer on the other hand seems to have a lot of ways to fail,
| and does so nearly every day. I suspect everyone reading this
| comment has seen at least one error _today_. Can't we engineers
| make the software better so that these errors can't/don't happen?
| frontiersummit wrote:
| A toaster is probably a bad example, given the common error
| states (burnt toast, stuck toast) which are no doubt amplified
| by design flaws in some units. I've never seen a toaster with
| 2000+ components, so maybe such a machine is different. A
| toaster is also historically famous for a dangerous error
| state: if the plug is inserted the wrong way round, the coils
| will be switched on neutral. A toaster which is "off" is thus
| liable to shock an unwitting person using a fork to resolve the
| stuck-toast error state.
| jdiez17 wrote:
| I don't know what kind of toaster you have but mine doesn't
| have thousands of parts. Maybe 20 or so.
| dietsche wrote:
| This quote from a textbook in my graduate studies helped me a
| lot: "Error messages should be how to fix it messages."
| TruthWillHurt wrote:
| So essentially go back to dev style error messages?
|
| A UX person telling us not to do what the previous UX person
| thought was cute.
|
| Thank you sooo much! Ask PM for a pat on the back.
| magicalhippo wrote:
| If you're raising an exception deep in some internal code,
| provide as much detail as possible.
|
| If the error bubbles up to the user, then either the information
| is over their head, in which case there's no difference to a non-
| detailed error message, or the user/support person can actually
| act on it.
|
| The most infuriating error I see is "file not found"... WHICH
| FILE?!
|
| Of course if the error is found in the higher level due to some
| consistency check in the business logic, then yeah try to guide
| the user. But for internal stuff, try to help the person who
| needs to fix it or find a workaround. It might be you.
| riskable wrote:
| > If you're raising an exception deep in some internal code,
| provide as much detail as possible.
|
| > If the error bubbles up to the user,
|
| ...then you have an information disclosure vulnerability!
| There's a _really good reason_ why we don't bubble up deep
| exceptions to end users: Attackers can use that info to gain
| information about your back end that they can use to find worse
| vulnerabilities.
|
| Put all the detail you want in your logs. Keep the end users
| out of it. They shouldn't be able to tell what line broke
| things.
| magicalhippo wrote:
| Yeah things are a bit different with web apps. There users
| usually can't do anything with the info even if they had
| details, so internal logs is clearly the place. But my point
| still stands: you want detailed info in those logs, not just
| a lone "file not found" without anything else.
| lupire wrote:
| > The most infuriating error I see is "file not found"... WHICH
| FILE?!
|
| Filenames might contain user data, which must not be logged
| outside of a database with proper access control, schema
| annotations, and access auditing.
|
| We can only display an opaque object key, so authorized devs
| can look up the filename using secure tools.
| magicalhippo wrote:
| Fair enough. I work mostly with good old desktop applications
| though, so if there's user data, it's almost always the user's
| data.
|
| For the majority of errors in most applications one can
| provide some helpful information. But yeah, one needs to be a
| bit careful if one has PII in the mix.
| thenerdhead wrote:
| As with everything, context matters. It's a great run-down of how
| to empower an error message. Many products can add so much value
| and save support resources by doing so.
|
| There's one thing I wasn't sure about in this article though. Did
| they talk to actual users regarding these empowered error
| messages or even ask them what they want to see out of common
| error messages they run into? It seems rather difficult to
| empower error messages without first understanding the scenarios
| that got them into the error state to begin with. Next would be
| understanding if these error messages are helpful to the users
| and asking them how they go about resolving these types of
| issues. All of that is hinted at in the "what makes a good error
| message".
| duxup wrote:
| For me error messages come in two forms.
|
| 1. For the user.
|
| You can't do that (maybe explain why). Don't do that.
|
| 2. Error that's actually there for the support or engineering
| team for a customer to convey to support, probably with a handy
| copy to clipboard link (that the user has at best a 50/50 chance
| of using no matter how much prodding).
|
| That's it.
|
| Humans generally lock up hard when they see an error in my
| experience. No amount of information or hand holding will help
| most of them figure it out. It's better to try to solve it in
| software.
|
| If the software can't fix the issue internally then they get an
| error message and 2 things happen:
|
| 1. The user is going to try something else and solve it themself
| (awesome) regardless of the error because they're smart and
| capable people and could probably solve it no matter what you
| told them.
|
| 2. Their brain locks up, they do the same thing 20 times and get
| the same result and complain to support with some form of
| "doesn't work". Doesn't matter what error you give them, they
| won't even try to tell you what the error was / doesn't register
| in their brain unless it had a cute cat on it or something (that
| actually works... so forget this "tone" stuff).
|
| I like the article, but I am skeptical about a UX team who
| doesn't answer support tickets ... just magically knows what the
| user is thinking / will work. I get lots of advice on error
| messages, I change them when they ask, but when it's from folks
| inside the company who know the product it often isn't helpful.
|
| Heck even users give bad advice about errors. I've had them tell
| me "Well it should have said X" where X is exactly word for word
| what it said (they forgot...).
|
| Granted I still try to help the user along, but I'm skeptical
| that software with any large user base can have "good" error
| messages.
| jeremy_wiebe wrote:
| I'm not sure we'll ever eclipse the awesomeness of the VB6 error:
| "Method ~ of object ~ failed".
|
| On a more serious note, error messages are something I always try
| to keep in mind in code reviews. Most error messages the code
| I review deals with are only ever seen in production logs, so I
| try to think what I'd do with that message (and accompanying
| details) if I saw it in production.
| dale_glass wrote:
| I'll add a few for developer-oriented messages.
|
| * Say what the program was trying to do.
|
| * Make the message unique and searchable.
|
| * Make it detailed.
|
| * FFS, include the filename or whatever else the program is
| having trouble with.
|
| * If possible, include the source code location.
|
| * If possible, include useful contextual information.
|
| * Quote strings. Once in a while, some unexpected whitespace
| sneaks in somewhere and this can be hard to figure out.
|
| Eg, don't just abort with "Open failed: NOT_FOUND". Abort with
| "job.c:2105 Failed to open job description file
| '/var/spool/jobs/125.json' when processing job #5 for user
| 'alice': NOT_FOUND".
|
| This way I don't have to strace the damn thing to try and figure
| out what it's looking for, and know which user it was for, so I
| don't have to dig around and try and figure out which entry in
| the database might contain the wrong information.
|
| Also, context-free, generic error messages are awful. A large
| enough codebase may be impossible to search for some very common
| keywords.
|
| If possible, googleable error codes are great to have, but they
| shouldn't replace the error message. It's ideal if you can search
| the source code and instantly find where the error message
| originates.
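|
| A minimal sketch of the "job description file" message above, in
| Python. The names JobError and load_job_description are made up for
| illustration. The source location is not put into the message here
| because the chained traceback already carries it.

```python
from pathlib import Path


class JobError(Exception):
    """Failure that records what was being attempted and for whom."""


def load_job_description(path: Path, job_id: int, user: str) -> str:
    """Read a job description file, adding context if that fails."""
    try:
        return path.read_text()
    except OSError as exc:
        # !r quotes the strings, so stray whitespace in a path or a
        # user name becomes visible in the message.
        raise JobError(
            f"Failed to open job description file {str(path)!r} "
            f"when processing job #{job_id} for user {user!r}: "
            f"{exc.strerror}"
        ) from exc
```

| Raising "from exc" keeps the original OSError and its traceback
| attached, so the low-level cause is not lost.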
| chiefalchemist wrote:
| A couple+ years ago my then employer required I take (what
| amounted to) Security Training 101 for Software Developers. I
| believe one of the client orgs expected everyone to go through
| the program.
|
| That said, pretty much everything you're suggesting was
| considered a bad idea (for security). Mainly because the more
| details you give away, the more a hacker can understand about
| the underlying system. The more they probe and possibly break
| things, the more you're showing your cards.
|
| It was then the bland cryptic error msg made perfect sense to
| me.
| m-p-3 wrote:
| I'll also add to make them easy to copy to clipboard in the
| case of a GUI-based program.
|
| It's easier to search and store in an incident management
| system.
| golergka wrote:
| Also, make sure that sensitive information like user's
| passwords, emails, credit card numbers etc, is filtered out of
| the logs and not sent to your servers.
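|
| One way this could look with Python's standard logging module is a
| filter that scrubs each record before any handler sees it. The three
| patterns below are rough assumptions for illustration only; a real
| system would need its own, more careful list.

```python
import logging
import re

# Hypothetical patterns; they are deliberately simple and will miss cases.
_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "<email>"),
    (re.compile(r"\b\d(?:[ -]?\d){12,15}\b"), "<card-number>"),
    (re.compile(r"(password\s*[=:]\s*)\S+", re.IGNORECASE), r"\1<redacted>"),
]


def redact(text: str) -> str:
    """Replace anything that looks sensitive with a placeholder."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


class RedactingFilter(logging.Filter):
    """Scrub the fully formatted message of every record."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first, so values passed as %-style args are scrubbed too.
        record.msg = redact(record.getMessage())
        record.args = None
        return True
```

| Attach it with handler.addFilter(RedactingFilter()) on whichever
| handler ships logs off the machine.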
| at_a_remove wrote:
| Yup, all of these. Sometimes I look "around" the problem, like,
| "I found _THIS_ directory but the file 'z.txt' was not in it!"
| or "Not only could I not find 'z.txt' I could not find _THIS_
| directory it was supposed to be in." Check to see that it is
| really a file, not a directory. "I found 'z.txt' in _THIS_
| directory but it was zero bytes in length!"
|
| In terms of "fail early," my larger programs have a section
| called Pre-Flight Checklist, which looks for files (and that
| they _are_ files), databases, that the databases have the
| expected tables and the correct columns, and so on. Are the
| files sufficiently recent? More or less the expected length?
| Because this is ETL stuff, it's usually okay to push this
| stuff up as early as I can.
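|
| A rough sketch of the file part of such a pre-flight checklist, in
| Python. The function name preflight_file and the max_age_seconds
| parameter are assumptions for illustration; the database and column
| checks are left out.

```python
import time
from pathlib import Path
from typing import List, Optional


def preflight_file(path: Path, max_age_seconds: Optional[float] = None) -> List[str]:
    """Return the problems found with one input file (empty list means OK)."""
    name, parent = path.name, str(path.parent)
    if not path.parent.is_dir():
        return [f"Could not find {name!r}, and could not find the directory "
                f"{parent!r} it was supposed to be in."]
    if not path.exists():
        return [f"Found directory {parent!r} but the file {name!r} was not in it."]
    if not path.is_file():
        return [f"Found {name!r} in {parent!r} but it is not a regular file."]

    problems = []
    info = path.stat()
    if info.st_size == 0:
        problems.append(f"Found {name!r} in {parent!r} but it is zero bytes in length.")
    if max_age_seconds is not None and time.time() - info.st_mtime > max_age_seconds:
        problems.append(f"Found {name!r} in {parent!r} but it is older than expected.")
    return problems
```

| Running this over every expected input before the real work starts
| lets the job report all the problems at once and stop early.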
| redact207 wrote:
| For SaaS products, this plus use structured logging so you
| don't have to grep-parse log messages when searching your log
| collectors.
|
| Ie all the meta/log context in a hashmap alongside the error
| message.
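|
| As a sketch of the "context in a hashmap" idea with Python's standard
| logging module: a formatter that emits one JSON object per record.
| The attribute name "context" is an assumption, not a logging
| convention; it is whatever key the caller passes through extra=.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, with context as real fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Hypothetical attribute, supplied by callers via extra=.
            "context": getattr(record, "context", {}),
        }
        return json.dumps(payload, sort_keys=True)


# Example call site, once a handler uses JsonFormatter:
#   logger.error("Failed to open job description file",
#                extra={"context": {"job_id": 5, "user": "alice"}})
```

| The log collector can then filter on context.job_id directly instead
| of parsing it back out of the message text.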
| [deleted]
| jrochkind1 wrote:
| Very well-written article with good examples and advice.
| cosmotic wrote:
| All the 'do this' versions suffer from the same problems as the
| 'don't do this' versions. Aside from fixing the tone, they are
| still generic, still inactionable, and still verbose.
| donatj wrote:
| > Even in today's world of user-centered design, technical jargon
| still sneaks its way into error messages. You couldn't fetch my
| data? My credentials were denied? What? The technical stuff is
| not important to the user
|
| This is the opposite of what I want. Stop condescending and just
| tell me what actually went wrong.
| lucumo wrote:
| I have this issue with Google Family Link, where I want to add
| my child's voice to a Nest Audio. The app straight up tells me
| that I'm not connected to the wifi, which is clearly not true.
| Furthermore, the app knows I'm connected because in the logging
| you can see it finding the Nest Audio.
|
| It's impossible to figure out what goes wrong. Plenty of people
| have the same problem, but Google only has this forum where
| superusers assume everybody else is either lying or an idiot.
| Meanwhile, they take such error messages at face value, despite
| many people saying they have wifi.
|
| All that to say that I'd rather have an overly technical error
| that actually tells me what's wrong, instead of a friendly
| error message that's straight up wrong.
| deathanatos wrote:
| This; particularly because more and more, "support" seemingly
| has no means to access logs, no ability to do the debugging,
| and no way to escalate obvious bugs in the application to the
| developers.
|
| I need the technical jargon to do support's -- and the company
| whose product I'm using's -- job for them.
|
| Is it not helpful to laypeople? Perhaps not, but it is what the
| technical friend they're going to drag into the problem needs.
| gpderetta wrote:
| How many times I had to strace an application because the
| fucking error message didn't give enough information!!
| jaywalk wrote:
| It all depends on the context. If it's a web application that
| can't connect to some backend service, for example, what
| exactly are you going to do with that information?
| aeonik wrote:
| Depends on why it didn't connect, right?
|
| Was it a timeout? Maybe an HTTP 401? Was it a DNS failure? Was
| there a TCP reset immediately?
|
| Each one has a myriad of troubleshooting steps associated
| with it. Some could be local to the host, some could be
| network/firewall some could be from the remote host or behind
| that.
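|
| A small sketch of telling those cases apart, assuming the client is
| Python code using urllib from the standard library. The function name
| and the wording of each message are made up for illustration.

```python
import socket
import urllib.error


def describe_connection_failure(exc: BaseException) -> str:
    """Map a low-level failure to a message that says which kind it was."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        if exc.code == 401:
            return "The server answered but rejected the credentials (HTTP 401)."
        return f"The server answered with HTTP {exc.code}."
    if isinstance(exc, urllib.error.URLError):
        # URLError usually wraps the underlying cause in .reason.
        if isinstance(exc.reason, BaseException):
            exc = exc.reason
        else:
            return f"The request failed before reaching the server: {exc.reason}"
    if isinstance(exc, socket.gaierror):
        return "DNS lookup failed: the host name could not be resolved."
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "The connection timed out before the server answered."
    if isinstance(exc, ConnectionResetError):
        return "The connection was reset by the other side."
    if isinstance(exc, ConnectionRefusedError):
        return "The connection was refused: nothing is listening on that port."
    return f"Unclassified connection failure: {exc!r}"
```

| Each branch points at a different place to look: local resolver,
| network path, credentials, or the remote service itself.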
| lupire wrote:
| I'm going to web search it and find advice from other users
| or devs. Maybe I need to use my email address instead of
| username, or delete my cookies, or something.
|
| If it's proprietary locked down user-hostile junk, then yeah,
| all I want in the error message is a statement of a refund on
| my payment, and a link to a competitor website.
| [deleted]
| josefresco wrote:
| > Stop condescending
|
| Wix is mostly a platform for non-techie DIY website builders. I
| can't imagine they'd know what to do with a highly technical
| error.
| taink wrote:
| I think the point they are making here is that clearly stating
| what went wrong doesn't necessitate using "technical jargon".
|
| Now, "your credentials have been denied" seems pretty clear and
| does not use jargon in my opinion, but telling the user "the
| ajax request failed, returning a 403 http error code" seems
| unhelpful and doesn't tell them what happened.
| InCityDreams wrote:
| ...then they clearly didn't make their point at all. Big
| error (in communication) on their part. Your single (2nd)
| sentence communicates everything required.
| bornfreddy wrote:
| Even in your example there's a world of difference. "Your
| credentials have been denied" implies a problem with
| credentials, while 403 clearly states that the credentials
| are valid, they are just denied access to this resource.
|
| I know it is a made up example, but it does show the problem
| with "dumbing down" the error messages. Details matter.
| pwinnski wrote:
| Error messages should definitely be written with a target
| audience in mind. For Wix, a blogging platform, the target
| audience is usually decidedly non-technical. For many of the
| tools I use, more technical detail would be welcome. Then
| again, my parents are unlikely to use the same tools, while
| they might use Wix.
| artogahr wrote:
| I don't understand why they wouldn't have a dropdown below the
| error that would reveal the technical jargon.
| Too wrote:
| They do. Press F12 ;)
| ARandomerDude wrote:
| > If the issue keeps happening, contact Customer Care.
|
| This actually means "if you like wasting your time and want to
| speak to incompetent fools who will pass you to an endless stream
| of their 'colleagues' then dial this number."
| simion314 wrote:
| My recent experience with docker, I am a total newb so I was
| running a tutorial step by step, then I get some error about apt
| certificates/keys/repo stuff. After lot of googling the issue was
| there was not enough disk space but the fucking error was
| pointing in a different direction. Also this is a good example
| why Stack Overflow is useful for the dudes that hate on it and
| RTFM everyone else.
|
| This is why I love exceptions, I had an issue with a C# game, but
| with a stack trace I could figure out myself that the issue is
| happening when the app initializes and fails to open a file.
|
| I think we should always give the users a detailed log and stack
| traces, also docker should fucking have some way to catch the
| issue when there is not enough space and report the error
| properly.
| progx wrote:
| @Microsoft read this article! ;)
| hprotagonist wrote:
| I would, if i had any evidence at all that they would be read and
| acted on. I'm convinced even seemingly competent people are just
| rendered contextually blind by the appearance of any error at
| all.
|
| In the past month, i've had about a dozen interactions like this:
|
| developer: your service crashed, here's a screenshot of the last
| 5 lines of the crash
|
| me: do you see where the final text you just pasted is
| "RuntimeError: Did not find ENVVAR, ensure this is set to the
| proper value (see <internal wiki link>) and then restart this
| service"
|
| developer: yeah?
|
| me: well, did you do that thing?
|
| developer: what thing?
|
| me: <headdesk>
|
| and this at work, where the developer in question is intimately
| acquainted with the context and purpose of the project.
| grandinj wrote:
| Some developers are just lazy, and will likely need some kind
| of negative feedback to force them to confront their own
| laziness.
|
| Which can be tricky, because the degree of negative feedback
| that is appropriate to the person in question can range from
|
| "Polite one-on-one suggestion that you read the error message
| more than once before calling me"
|
| to
|
| "Full on yelling at the person in the middle of an open-plan
| office".
|
| Thankfully, type II is rare, but they do occur.
| lupire wrote:
| Send a link to wiki. Last line of page is "if you have
| questions, reach out and include the keyword $THIS_PAGE_KEY
| in your message."
| bartread wrote:
| > Some developers are just lazy
|
| I'm _really_ lazy: if I were on the receiving end of emails
| with error messages that included instructions about how to
| fix said error I'd automate Freshdesk (or whatever ticketing
| system I was using) to respond with instructions specific to
| that error message, in the first instance, along with a note
| to get in touch again if that didn't solve the problem. I'd
| also set the ticket to autoresolve after a set period of
| time.
| Taylor_OD wrote:
| It's a little annoying but to be fair because most error
| messaging is garbage, it's easy to start to ignore them. How
| often is the error message shown, and the little fix given,
| actually going to solve the problem in modern web development?
| 10% of the time? 25% of the time? I'd be shocked if it's that
| high.
| bonoboTP wrote:
| It's error message blindness, similar to ad blindness. Even if
| you make a great banner ad with some very useful information,
| or the perfect and affordable product for my life I won't see
| it because I mentally filter out ads because they are junk most
| of the time.
|
| Some people develop the same with relation to error messages
| because most of them are not actionable, other than "stuff
| broke somehow, [gibberish] blabla". Even if your error message
| is impeccable, it's in the class of things that are noise.
|
| If you come up to me at some busy tourist location, where I'm
| used to lots of scammers, I won't listen to you even if you are
| actually a nice person and just want to have a nice chat and we
| would be compatible friends.
|
| Often it _is_ a good strategy to just ask people. Documentation
| and comments get out of date very fast. If you are the kind of
| person who reads everything meticulously and googles around,
| reads manuals etc. you may be wasting a lot of time. Of course
| there is a right balance to find. Some people err too much on
| the side of not thinking themselves and immediately asking for
| handholding, but overall it's often the right thing to do.
|
| In many cases I found that trying to reason out what was going
| on was hopeless, because when I eventually gave up and asked
| someone, it turned out that the solution was unguessable,
| something like "ah of course, that thing is out of date, do
| this magic incantation, then this and that, yeah we should
| update the docs sometime!".
|
| A lot of knowledge is locked up inside people's brains and just
| spreads around as "rumors" on the grapevine. Is that state of
| affairs ideal? No. But it's realistic and people are going to
| adapt by asking first, thinking second.
| Jiro wrote:
| There's also the situation where the program creator likes
| changing functionality on a whim, and every time you google
| up your problem, you find a solution for a version of the
| software that doesn't have the particular menu or whatever
| that you had the problem with.
|
| (This is a big problem if you've ever had a problem with
| Android.)
| BlargMcLarg wrote:
| Asking people is mostly bad habits from a culture too
| ingrained into the whole 'ask first' thing, and often times
| it is the people _trying to help_ that are to blame.
|
| I had this recently. Many individuals like to play hero and
| make sure I don't get stuck because their business is an
| undocumented mess. Before I even read the thing and tried,
| they are already trying to give me the answer. When I ask 'is
| this documented and if so, how would it be discovered easily'
| their first reaction is 'no' followed by a lengthy
| explanation which _should_ be in the wiki and easy for
| newcomers to find.
|
| And it shows when I forget a few days later because my brain
| never put in the effort to get to the answer and my memory is
| that of a fruit fly's.
| jimmytidey wrote:
| This is a context where people are used to seeing errors that
| they don't know what to do with.
|
| If a web app pops a well written error it is much more likely
| to be acted on than an unmotivated dev seeing some (probably
| badly formatted) text.
|
| Every time I see an error in terminal with a link to
| documentation I'm delighted. And surprised.
| Kalium wrote:
| Once upon a time, I worked at a financial startup (the
| company is irrelevant). I created a little harness around a
| static analysis tool. It would fail builds when a library had
| an outstanding vulnerability scored as HIGH or SEVERE with a
| patch available. The harness put a friendly error message
| around it. It ran roughly as follows:
|
| > Hi! If you're reading this message, it's likely because
| this tool failed your build. To understand why and fix it,
| please click this link <link_to_internal_doc>. Below is a
| table that lists the packages you need to update and the
| version you need to update them to.
|
| The doc had at the very top in big flashing red text with
| siren anigifs a link to the portion that explained that they
| needed to update their libraries with _very_ clear copy-
| paste-into-Dockerfile actionable guidance. The page also
| explained the broader context, such as the point of the tool
| and why we were doing this despite having a firewall and so
| on.
|
| This is where you might be delighted and surprised.
|
| What was perhaps less delightful and surprising were the
| consequences for me. About 4-6 times a week, I would then
| have a Slack conversation akin to this:
|
| Dev: Why did you break my build!?!
|
| Me: Can I see the error message?
|
| Dev: <pastes message above>
|
| Me: Thanks! Looking at the message, is there something unclear
| about the documentation? Does it not work?
|
| <ten minutes pass>
|
| Dev: Nope! Docs are great!
|
| At this point the conversation would end.
| nerdponx wrote:
| So? That's no excuse for a _developer_ to disregard the
| content of an error message in their own application.
| lijogdfljk wrote:
| It kinda is. Kinda like when documentation is so repeatedly
| outdated and incorrect, that when you need new information
| you just skip documentation entirely.
|
| Are you wrong for skipping documentation? Yea, maybe. Is it
| entirely expected? Yea.
|
| Based on the parent comment, at least.
| monknomo wrote:
| And yet developers do disregard the content of error
| messages. Try to figure out why they disregard it. I doubt
| the answer is "because they're stupid". The answer probably
| also isn't "because they just aren't trying".
|
| What could it be? Why do people read things and react in
| similar ways, even if they have different jobs? If only
| there was some field of study that could answer these
| mysteries.
| outworlder wrote:
| I have managed to get a lot of notoriety in my company by just:
|
| 1. Paying attention to error messages
|
| 2. Reading documentation
|
| 3. Looking up stuff I don't fully understand (including googling
| error messages)
|
| That's it.
|
| Some people don't even read error messages at all. I understand
| non technical people doing that, but I've seen far too many
| engineers doing it. If anything doesn't go exactly as expected,
| they freeze. I have no idea how a person gets so far in their
| careers without reading error messages. Actually, I do, those
| people ask others to figure out stuff for them. That's way
| prevalent in enterprise settings. Sure, collaboration is good,
| but I've seen a lot of instances where there's a massive
| imbalance - you'll have 10 people pinging a single person to
| 'unblock' them. They could have spent a couple of minutes
| trying to figure it out themselves.
|
| I'll move mountains to help someone that comes to me after
| having done some basic homework to try to fix (or at least
| triage) an issue. It's very rare though.
|
| It's also amazing how many people will just go ahead without
| having read a single line of documentation of the thing they
| are working on. I've even had a developer dive in a Golang
| codebase without having _ever_ worked on the language. That
| would have been fine - that's how I learn new languages, just
| get accustomed, before doing some more formal training and
| exercises - except that he continued to not read the language
| documentation before asking a bunch of questions. Needless to
| say, the questions weren't good.
|
| And number 3... just rubber ducky everything. If you can't
| explain it, you don't get it. Go read up on the topic.
| Sometimes I'll find out that I don't fully understand something
| as I'm writing an email to others.
| vladvasiliu wrote:
| > I'll move mountains to help someone that comes to me after
| having done some basic homework to try to fix (or at least
| triage) an issue. It's very rare though.
|
| This. I actually am OK with people not figuring out even
| basic stuff. But please, at least try to give the impression
| that you've put some effort in, instead of just trying to
| have me do your homework while you browse facebook or
| whatever.
| dan_mctree wrote:
| > except that he continued to not read the language
| documentation before asking a bunch of questions
|
| Can't really blame people for that too much, most language
| documentation is utterly unreadable unless you already know
| exactly what you're doing. And even if you do get it, it's in
| one eye and out the other. Most people just don't learn very
| well from reading technical information you don't need to use
| right away. You might be a happy exception and got to build
| up your notoriety that way
| tetha wrote:
| > I'll move mountains to help someone that comes to me after
| having done some basic homework to try to fix (or at least
| triage) an issue. It's very rare though.
|
| These are rare, but they also tend to be the really effective
| ones. We have a couple of teams who understand the stack,
| read documentation and read error messages. We generally
| don't hear of them for months and months, because they are
| too busy being productive.
|
| But when we hear of them, it's usually time to push
| boundaries of the infrastructure and the processes. They
| tried everything and nothing worked and now it's time to make
| it work.
| bob1029 wrote:
| This is a lesson I learned while being system owner of the
| primary user interface that runs on a semiconductor factory
| floor. No amount of confirmation/warning dialogs will actually
| stop someone from doing a wrong thing. Doesn't matter how scary
| the language is. Here's an approximate sample of one:
| "DANGER! Confirming this action may result in 8 figures worth
| of scrap!!!"
|
| Even if you are super careful and make sure your error messages
| are terse in all cases, you will still succumb to things like
| muscle memory among your users. I've caught _myself_ mindlessly
| dismissing these while testing. How can I expect my users to be
| better than the person who developed the UI? That is
| unreasonable.
|
| It got to a point where we started _removing_ these
| alerts/confirmations because it was training people to do the wrong
| thing in a few places. If you have part of a UI where all
| actions are immediate and final, the game theory changes. The
| moment a user enters into one of these spaces, they are much
| more cautious.
|
| If the user thinks the UI will save them, they may eventually
| tire of these protections and forget why they are there in the
| first place. I feel like this is very similar to the problem of
| driver assistance and partial self-driving capabilities today.
| nkrisc wrote:
| The goal of writing better error messages isn't to help the
| people who never read error messages, it's to help the people
| who do and who you never have to hear from.
| marklubi wrote:
| The trick that I've found is that each error message needs to
| be unique... not just the stack trace, but the actual wording
| of the message leading up to that.
|
| Get a screenshot or the exact verbatim of it, and you can
| identify exactly where in the code it originated.
|
| User reports are unreliable, but when I can pinpoint where
| the message originated from, it massively cuts down on the
| troubleshooting time.
| legulere wrote:
| In RFC 7807 all errors get a unique URI. Message texts
| might change or be translated into a language you don't
| understand.
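|
| For reference, a minimal sketch of building an RFC 7807 "problem
| details" body in Python. The member names (type, title, status,
| detail, instance) and the media type come from the RFC; the function
| name and the example.com URI in the usage comment are placeholders.

```python
import json
from typing import Optional

# Media type that RFC 7807 defines for JSON problem details.
PROBLEM_CONTENT_TYPE = "application/problem+json"


def problem_details(
    type_uri: str,
    title: str,
    status: int,
    detail: Optional[str] = None,
    instance: Optional[str] = None,
) -> str:
    """Serialize one problem; the type URI is the stable identifier."""
    body = {"type": type_uri, "title": title, "status": status}
    # detail and instance are optional members, so omit them when unset.
    if detail is not None:
        body["detail"] = detail
    if instance is not None:
        body["instance"] = instance
    return json.dumps(body)


# Example (placeholder URI):
#   problem_details("https://example.com/probs/out-of-credit",
#                   "You do not have enough credit.", 403)
```

| The title can be reworded or translated freely, since clients are
| meant to key off the type URI rather than the text.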
| shadowgovt wrote:
| It turns out translating error messages is controversial.
|
| Users, upon hitting an error, often go check Stack
| Overflow. If you localize your error messages, you
| Balkanize the collective wisdom on how to address the
| error (which will always be larger than your team's
| ability to troubleshoot errors and offer correctives in
| your documentation and FAQs).
| BerislavLopac wrote:
| To be precise, each error _type_ gets a unique URI.
|
| A good way to take advantage of that is to have a central
| database of all error types, but not many companies
| bother to do that.
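|
| A toy version of such a central database, as a sketch only: an
| in-process registry in Python that refuses duplicate codes and derives
| one URI per error type. The base URI, the ErrorType fields and the
| "missing-config" entry are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical base URI; each error type gets one stable page under it.
BASE_URI = "https://errors.example.com/"


@dataclass(frozen=True)
class ErrorType:
    code: str
    title: str
    remedy: str

    @property
    def uri(self) -> str:
        return BASE_URI + self.code


_REGISTRY: Dict[str, ErrorType] = {}


def register(code: str, title: str, remedy: str) -> ErrorType:
    """Add an error type, refusing to reuse an existing code."""
    if code in _REGISTRY:
        raise ValueError(f"Error code {code!r} is already registered")
    error_type = ErrorType(code, title, remedy)
    _REGISTRY[code] = error_type
    return error_type


def lookup(code: str) -> ErrorType:
    return _REGISTRY[code]


MISSING_CONFIG = register(
    "missing-config",
    "Required configuration value is not set",
    "Set the value named in the message and restart the service.",
)
```

| The same table can be rendered into a documentation page, so the URI
| in an error always resolves to something.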
| mi_lk wrote:
| > have a central database of all error types
|
| do you have any example?
| zem wrote:
| here's ours for pytype (a python type checker):
| https://google.github.io/pytype/errors.html
| alisonatwork wrote:
| A useful thing here is not just to include a unique error
| code for the type of error (usually numeric), but also to
| generate some kind of short Base32 or similar hash and
| print that right next to the error message while logging it
| to your normal back end. Then whether people send you a
| screen shot, copy/paste, whatever, you can easily search
| the logs to find the exact event that occurred.
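|
| Roughly what that could look like in Python. Note the swap: this
| sketch uses a random identifier rather than a hash of the event, which
| serves the same lookup purpose. The logger name "app" and both
| function names are assumptions.

```python
import base64
import logging
import secrets

logger = logging.getLogger("app")


def new_event_id(num_bytes: int = 5) -> str:
    """Return a short random ID; 5 bytes encode to 8 Base32 characters."""
    return base64.b32encode(secrets.token_bytes(num_bytes)).decode("ascii")


def report_error(code: int, user_message: str) -> str:
    """Log the event under a fresh ID and return the text shown to the user."""
    event_id = new_event_id()
    # The same ID goes to the back end log and onto the user's screen.
    logger.error("event=%s code=%d %s", event_id, code, user_message)
    return f"{user_message} (error {code}, reference {event_id})"
```

| Base32 sticks to uppercase letters and the digits 2 to 7, which
| survives screenshots and being read aloud better than mixed case.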
| mceachen wrote:
| Better still: add a unique prefix to the error code, so
| it's googlable.
|
| The Typescript team does this with compilation errors,
| like `TS12345: frobulating types cannot be transmuted`.
| [deleted]
| rmetzler wrote:
| Yes, that type of thing is pretty useful for linters.
| These error codes act as identifiers if you need to
| google them and whenever you need to configure the linter
| the way you like it or for one-off exceptions.
| lucb1e wrote:
| > each error message needs to be unique
|
| Include random numbers. "Error 7743929" is super easy to
| track down (grep -r 7743929 takes 2 seconds to type), you
| don't need a NATO alphabet to understand what they're
| saying on the phone in order to be able to search it
| correctly, its general purpose is understood
| internationally, and it won't change between versions (like
| when you'd encode a file name and line number, for
| example). When I first figured this out at, idk, 17 years
| old and mentioned the idea in a game making forum, people
| called me crazy, but I still use it and don't know of any
| better system.
|
| Of course, this is _alongside_ an actual error message to
| help the user help themselves. This is just to trace the
| line where it originated, which already helps a lot for
| small software projects like I make.
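|
| The scheme only works while each number really is unique, so a small
| check helps. A sketch in Python, assuming messages are written as
| "Error" followed by a 7-digit number as in the example above; the
| function name is made up.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed convention: the literal word "Error" followed by 7 digits.
ERROR_NUMBER = re.compile(r"Error (\d{7})\b")


def find_duplicate_error_numbers(
    sources: Dict[str, str],
) -> Dict[str, List[Tuple[str, int]]]:
    """Map each reused error number to the (file, line) places it appears."""
    seen = defaultdict(list)
    for filename, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for number in ERROR_NUMBER.findall(line):
                seen[number].append((filename, lineno))
    return {n: places for n, places in seen.items() if len(places) > 1}
```

| Feed it a mapping of file name to file contents; an empty result
| means every number still points at exactly one line.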
| Too wrote:
| About that, the number of _developers_ that can't read, or
| even understand the value of, a stack trace is also
| astonishing.
|
| If only I had a penny every time someone sent me a "log of
| the error", that only contains the final line with the
| unhelpful message saying nothing but KeyError.
| vladvasiliu wrote:
| Forget stack traces.
|
| I've met multiple "web developers" (actually working on
| the backend or "full-stack", building API servers and
| whatnot) who came complaining about this or that server
| being "unreachable" and could I check it's up / whether
| the firewall allows them through. Only to find they were
| getting HTTP 404 errors or the like. Which were explicit
| in the errors they'd show me.
| lamontcg wrote:
| At prior work we removed stack traces from the default
| error output because it was thought to "scare" too many
| users.
|
| Then for years almost without fail when an error was
| pasted into a GH issue it would include the big "If
| submitting a bug report, please include the full stack
| trace at /var/log/stacktrace.out" message--without the
| stacktrace. I added some whitespace around it and all
| caps to it and still nobody read it.
| [deleted]
| dylan604 wrote:
| I used to lean on line numbers, but those quickly fall out
| of sync with deployed code and what's currently checked out
| and available for immediate debugging. I've also switched
| to using the unique text you mention as it will always find the
| place in the code regardless if it has been moved.
|
| I wish I had learned that earlier than I had.
| EvanAnderson wrote:
| I am reminded of the classic non-intuitive survivorship bias
| example from WWII re: armoring bombers:
| https://en.wikipedia.org/wiki/Survivorship_bias#In_the_milit...
| vkou wrote:
| Or, in the anecdote above, to help yourself, when you are
| inevitably contacted by the person who never reads error
| messages.
| rjmill wrote:
| > evidence at all that they would be read
|
| I just had an idea: Put tracking info in the error URL. If your
| company has an internal URL shortener, that could do the trick.
|
| More practically, I feel like it helps to put an empty line
| before the call to action. For many people, a traceback is just
| noise. The empty line helps split the useful info out from the
| traceback.
|
| Or if it's a script/CLI (and you know the error reason) don't
| even show a traceback. Just print the error message to stderr,
| exit non-zero, and be done with it.
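|
| A minimal sketch of that CLI pattern in Python. UsageError, run and
| main are hypothetical names; the idea is that only failures with a
| known reason are caught, and anything unexpected still produces a
| traceback.

```python
import sys
from typing import List


class UsageError(Exception):
    """An expected failure whose message is written for the person at the terminal."""


def run(argv: List[str]) -> None:
    if len(argv) < 2:
        # The empty line separates what went wrong from the call to action.
        raise UsageError(
            "No input file given.\n\n"
            "Pass the path of the file to process as the first argument."
        )
    print(f"processing {argv[1]}")


def main(argv: List[str]) -> int:
    try:
        run(argv)
    except UsageError as exc:
        # Known reason: message on stderr, no traceback, non-zero exit.
        print(f"error: {exc}", file=sys.stderr)
        return 1
    return 0


# Entry point for a real script: sys.exit(main(sys.argv))
```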
| residualmind wrote:
| Actually reading (and understanding, acting upon) error
| messages seems to be part of the learning process of every
| developer. And while more senior devs usually do read error
| messages, even they sometimes, rather than reading it will jump
| to behavior like "trying again a different way", before looking
| closely what went wrong.
| hinkley wrote:
| Developers often seem shocked that people can't find the
| important error in a wall of text. A particular peeve is when
| the same error is reported three ways and the real error is
| sandwiched between others or scrolled off the screen due to
| spammy behavior.
| [deleted]
| ajnin wrote:
| How many interactions didn't you have, because the developer
| read the error message, read the Wiki, and ultimately solved
| the issue themselves?
| [deleted]
| zagrebian wrote:
| This just means that the error message needs to be more clear.
| For example, after the error itself, it could give direct
| advice: "PERFORM THESE STEPS: You must define ENVVAR. Go to
| <wiki link>. Set ENVVAR to a proper value and restart the
| service."
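|
| As a sketch, the service could raise that message itself at startup.
| This is Python, and require_env plus the doc_url parameter are
| made-up names; the wording follows the example above.

```python
import os


def require_env(name: str, doc_url: str) -> str:
    """Return the variable's value, or fail with numbered steps to fix it."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Did not find {name}.\n\n"
            "PERFORM THESE STEPS:\n"
            f"  1. Go to {doc_url}.\n"
            f"  2. Set {name} to a proper value.\n"
            "  3. Restart this service."
        )
    return value
```

| Whether numbered steps get read more often than a sentence is an open
| question in this thread; the code only makes the format easy to try.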
|
| Notice the direct language. It reads like an order. The less
| direct the message, the higher the chances that the user will
| not act upon it.
| MiddleMan5 wrote:
| I can't tell if this is sarcasm or not, this is obviously
| highlighting a deeper issue in developer culture.
|
| The example given _was clear_ compared to 90% of other error
| messages, and saying that it needs to be "more clear" is
| almost dismissive
| Aperocky wrote:
| Don't blame developer culture, if _that_ error cannot be
| acted on, attribute to incompetence and not culture.
| ckozlowski wrote:
| I think you're correct. To add to this (and I think it's the
| point that the article was trying to make), errors written in
| fragmented language or "developer speak" I feel are likely to
| get glossed over. The "Write it like you're talking to a
| friend." advice the article gives I think is spot on. Making
| the message more conversational is to invite better
| understanding and comprehension.
|
| I feel there's a trend when it comes to disseminating
| messaging like this that we adopt an attitude of our audience
| "is smart, and should figure the rest out". They may be. But
| they already have lots to do and plenty to figure out. Any
| opportunity we, the requestor, can lighten their mental load,
| is going to increase the odds that they'll be inclined to
| take action right away.
| duxup wrote:
| The problem is people are not rational... and we try to solve
| that with software.
|
| Many people just lock up when software doesn't do what they
| expect.
| hinkley wrote:
| Lots of people find ways to irrationalize being rational.
| vbezhenar wrote:
| Not rational people must be fired from IT.
| duxup wrote:
| Generally a pipe dream in my experience.
| bee_rider wrote:
| There's a type of error for which the user can be given
| detailed step-by-step instructions (permission issues, etc).
| But to some extent, errors should handle situations the
| programmer didn't expect. If it is possible to provide
| detailed step-by-step fixes, then the program should do those
| steps itself.
|
| Adding a URL might not be a great plan, never know how long
| an old copy of a program will stick around, might not control
| that website forever.
| dvtrn wrote:
| I'm not seeing how the message as it stands is any less
| direct or clear than what you're saying it should be? It
| straight up tells you it can't find the var and what to do
| about it.
|
| Can you help me understand what isn't clear about the message
| as is, or maybe point out the ambiguity to someone who just
| isn't seeing it? I want to write better error messages but I
| share the frustration of the above poster. The message tells
| you specifically what to do, but you're coming back saying
| it's not clear.
| lupire wrote:
| Some people don't read anything that isn't an all-caps
| command. They have learned helplessness from seeing too
| much useless error text in the past.
| j-bos wrote:
| I think the original error is quite clear, under normal
| circumstances.
|
| Not OP but I've noticed that people often get brain fog
| when something goes wrong and often need BIG, SHORT,
| WORDS to shake out of it. Or really anything that can shake
| them out of the 'idunno' state of mind.
|
| But maybe if something like that became standard it would
| no longer be a context switcher..
| ckozlowski wrote:
| I think you're spot on, and I made a similar comment
| above.
|
| It's easy to say "they can figure it out". Sure, in a
| restful state. But the people we're asking to take action
| already have a lot on their plate. Using plain,
| conversational language whenever possible with
| exceedingly clear steps means less mental exertion on the
| receiver. And since we need their help, anything we can
| do to make it easier on their end helps us.
| Too wrote:
| Conversational errors can also be fatiguing. Often what
| you want is something short and dry that can be pattern
| matched. Compilers are pretty good at this because all
| their errors start the same way.
|
|   Error in file foo/bar.c, line 32, missing semicolon.
|
| No conversation needed. These can then be complemented
| with more conversational language on the next line to
| explain why semicolon is needed. Rust is quite good at
| this.
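The two-part shape described in the comment above (a dry, fixed-pattern first line, then an optional conversational line) can be sketched as follows. This is only an illustration: the function name `format_error` and the `help:` prefix are assumptions made for this example, not the output format of any particular compiler.

```python
def format_error(path: str, line: int, summary: str, explanation: str = "") -> str:
    """Build a message whose first line always has the same, greppable shape."""
    # First line: short and dry, so it can be pattern-matched by eye or by tools.
    first = f"Error in file {path}, line {line}: {summary}"
    if not explanation:
        return first
    # Second line: the conversational part, kept separate from the pattern.
    return f"{first}\n  help: {explanation}"


message = format_error(
    "foo/bar.c", 32, "missing semicolon",
    "each statement in C must end with ';'",
)
print(message)
```

Keeping the explanation on its own line means readers who only scan for the pattern lose nothing, while those who want the reasoning still get it.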
| dvtrn wrote:
| These are fascinating responses to me, as with the
| example given my mind first went to someone for whom
| English is a second language. That group having trouble
| with this message I would understand, or at least have an
| easier time understanding having trouble, if even a very
| little amount.
|
| For someone who was born speaking English and spoke it
| their entire lives, the example provided couldn't
| possibly be more to the point in my opinion.
|
| Though I agree overall with the general idea and that yes
| there are some pretty baffling and downright awfully
| written error messages and log entries that take a minute
| to grok (I just don't think the example replied to is one
| of them).
| bombcar wrote:
| Some of the errors that Gentoo portage can encounter do
| exactly this - and they do it with beautiful terminal colors
| that make it easy to figure out what you need to run, or
| where to go to figure out which of the three options you
| need.
|
| The problem can come when there's a wall of "useless"
| logging/error messages, and the last one or near the last one
| is the actual important one to look at. You have to
| explicitly call it out on a clear screen and make it obvious
| - and even then, people won't always read it.
| mariusmg wrote:
| >it could give direct advice: "PERFORM THESE STEPS: You must
| define ENVVAR. Go to <wiki link>. Set ENVVAR to a proper
| value and restart the service."
|
| Really, should logs also be documentation now? Just
| mindlessly logging the same "advice" over and over again each
| time the error happens?
| dementiapatent wrote:
| It will be so much fun when the implementation is
| refactored and half of these comments are forgotten about
| and no longer meaningful.
| prerok wrote:
| Exactly. At one of my previous workplaces there was a
| cumulative effect of misattributed error messages so the
| actions to perform were often of no help.
|
| Not even to mention the fact that new or changed error
| messages caused a landslide in costs in translations to
| various languages. I guess this product has no
| localization? At that time, when I was working at such a
| product that had it, we had to go through a deliberate
| process to describe why we want to change it, what the
| impact is, etc. Tell me you want 100 new messages and you
| will be stuck in meetings for the next month.
|
| In their case, though, it seems they at least have the
| support in management for it. I hope it turns out better
| for them than it did for me.
| SpicyLemonZest wrote:
| I had an error message a few months ago that instructed
| me to reinstall the AWS CLI, I filed a ticket when that
| didn't work, and the team was annoyed with me because
| _obviously_ the real problem was a Python configuration
| warning with no suggested action 10 lines up.
| kortex wrote:
| It depends who, what, and when the error is about. Failures
| are generally a bathtub curve. You have a high rate at
| start (usually configuration issues), some fairly fixed
| rate during operation, and then more at end of lifecycle
| (exhaustion, service hiccups on scale-in).
|
| If it's in the early lifecycle, absolutely, because it's
| most actionable. X is set wrong, Y can't be reached, etc,
| guide whoever is operating the system how to fix it.
|
| If it's mid cycle, it's often post-hoc, but context is
| worth its weight in gold. Less about telling the operator
| how to fix and more about why it broke, to avoid in the
| future.
|
| End of cycle, whatever.
| 0xbadcafebee wrote:
| Logs actually are a form of documentation. Documentation
| can provide instructions on how to diagnose and fix
| problems, and that's what logs do: tell a human being what
| a problem is and how to fix it.
|
| Remember that often the person reading the logs is not the
| person who wrote the software. Maybe it's an Ops person at
| 2AM trying to fix a broken deploy. Maybe it's a developer
| who joined the company 3 years after the software was
| written. Maybe the log is passing through an error message
| from 3 layers deep in the stack. The more literate your
| logs are, the better.
| ddulaney wrote:
| Logs can definitely be a form of documentation.
|
| I write software that is generally run low in the stack,
| quietly doing some mundane tasks that are business-critical
| but rarely thought about. If one of our clients has to mess
| with our software beyond the occasional update, that was a
| failing. Not all software is like this, but lots of it is
| -- its value is that no human needs to be involved.
|
| I need to write log messages with the expectation of an
| audience who doesn't know much about the software -- it's
| been running uninterrupted for months or years and suddenly
| something has gone wrong. If the log line doesn't tell the
| user how to solve their problem, I will end up getting a
| call.
| throw827474737 wrote:
| If it is that simple, then why doesn't the code fix it
| itself? But no, usually there are 1/2/3 likely things, but
| it also could be anything else.. and that kind of
| unexpected error often has no default fix.
|
| No, the best thing is to point to the documentation
| which has that, and not to print out manpages of docs in
| error messages now.
|
| > I write software that is generally run low in the stack
|
| What stack, how low? Me too.. that low that I usually
| cannot return or even log a " see error code doc at
| http.." string for various reasons (bandwidth, mem,
| performance) but only have error codes ;)
| pwinnski wrote:
| In the case at hand, where an environment variable isn't
| set, how exactly should the code fix itself? Human
| interaction is necessary, which is the reason the log
| message should spell out what the human needs to do.
|
| If I'm starting a service and see a pointer in the logs
| to documentation, that seems like an incredibly broken
| approach to me. Why would I look at missing or out-of-
| date documentation that may or may not be at hand when
| the code that knows the problem is _right there_ and can
| just tell me? A log message like you're describing might
| as well say, "Something went wrong, but I don't want to
| tell you what. Instead check page 43 of the document in
| the third file cabinet from the left in that room over
| there on your right. No, your other right."
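For the missing-environment-variable case discussed above, a startup check that spells out the steps in the error itself might look like the sketch below. The helper name `require_env`, the wording of the steps, and the optional `doc_hint` argument are all assumptions for illustration; the thread does not show the original service's code.

```python
import os
import sys


def require_env(name: str, doc_hint: str = "") -> str:
    """Return the variable's value, or fail with self-contained instructions."""
    value = os.environ.get(name)
    if value:
        return value
    lines = [
        f"Environment variable {name} is not set.",
        "To fix this:",
        f"  1. Set {name} to a proper value, e.g. export {name}=<value>",
        "  2. Restart the service.",
    ]
    if doc_hint:
        # The link is extra detail; the steps above must work without it.
        lines.append(f"More detail: {doc_hint}")
    raise RuntimeError("\n".join(lines))


try:
    require_env("ENVVAR")
except RuntimeError as err:
    print(err, file=sys.stderr)
```

The point of the design is that the instructions survive even if the documentation link goes stale or is never followed.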
| an_ko wrote:
| I don't want to have to hunt for documentation if it
| breaks. It may have been 30 years and everything but the
| binary has been lost, and the vendor is out of business.
| If in that situation all I get is an error code and a
| link to documentation that doesn't exist, I'd have to
| start reverse-engineering. And while doing so I'd
| definitely be cursing the coder who decided that saving a
| couple hundred bytes of space in a log file in the event
| of an "abort the program"-severity event was worth
| dumping this in my lap.
| Spivak wrote:
| Errors on initialization, fatal errors, and non-recurrent
| errors that require human/support intervention should be
| documentation.
| hinkley wrote:
| If the error results in the program shutting down, it's
| once per fatal interaction.
|
| In other words, yes.
| chillfox wrote:
| Yes! We have tools to filter what gets saved and
| compression that handles repeated text very well.
|
| So why not provide docs on how to solve the error along
| with the error.
| eyelidlessness wrote:
| This is fairly common in good error logs.
| pwinnski wrote:
| Yes!
|
| There are people who don't read formal documentation but do
| read logs, after all.
|
| If the advice is the same over and over again, then yes,
| give the advice over and over again. I wouldn't want to
| assume that someone has read every line of the logs, or has
| started to read top-to-bottom, so the advice should always
| be among the most recent lines in the log, and the only way
| to ensure that is to give the advice again each time the
| error happens.
| pydry wrote:
| It more likely means that the developer views the service as
| OP's responsibility. They'll view an order as something OP
| needs to do.
|
| The clarity of the error message doesn't really matter if the
| recipient believes it is intended for somebody else.
| quintussss wrote:
| Isn't this just survivor bias though? You only hear from those
| that fail to read and act on the error message.
| Joker_vD wrote:
| Well, imagine the error was simply "RuntimeError: Environment
| variable not set" instead, then how much of your time would
| have been wasted by those dozen interactions?
| chillfox wrote:
| Don't send people somewhere else to learn how to fix the error.
| The more steps and indirection you add the fewer people will
| bother doing it themselves, especially if they can bump it to
| the developer. Make it easy for people to fix their own
| problems by being explicit, direct and complete. List all the
| steps and use formatting to make it visually easier to consume.
|
| So your error message while a far cry from the worst I have
| seen is also pretty far from the good ones I have seen.
| starkd wrote:
| I think his point was the developer tends not to even
| investigate the ENVVAR at issue or visit the link. If the
| developer does investigate the link and still has an issue,
| then you have a point.
| chillfox wrote:
| Pretty sure his problem was he got contacted about an issue
| he considers uninteresting, and his preferred solution is
| the user stops behaving like a human.
|
| Reaching for the easiest way to solve a problem first is a
| very human thing to do, and in this case he was easier to
| contact than opening up a browser and reading an article
| that presumably is written in the same kind of language as
| the error message.
| starkd wrote:
| I admit to doing this. Even many of the useful error
| messages that clearly indicate the fix are drowned out
| by the mass of output. I've made this mistake before,
| and I'll probably do so again.
| chillfox wrote:
| I feel like this is a problem of overly chatty
| application logs + lack of formatting for errors.
|
| If the volume of drivel was lowered and errors were
| formatted with spacing and color to stand out, then they
| would be easier to focus on.
|
| So log errors to stderr, send it to a separate log file,
| and format it well (use multiple lines).
| marcosdumay wrote:
| > So log errors to stderr, send it to a separate log
| file, and format it well (use multiple lines).
|
| Oh, for sure. Do never:
|
| - send errors to the same log you send normal activity.
|
| - default into logging things that aren't errors on the
| error log (make this possible to override if you want,
| but never the default).
|
| - log the errors there, but the necessary context on
| stdout so it appears correct on a terminal. (E.g. build
| tools that print entering into target in stdout; error in
| stderr; leaving target in stdout)
|
| - try to recover just to show a different error later.
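One way to keep errors out of the normal activity stream, as suggested in the comments above, is sketched here with Python's standard `logging` module. The logger name, the `BelowError` filter class, and the message formats are assumptions for this example. To get a separate error file, stderr can be redirected by the shell, or the second handler can be swapped for a `logging.FileHandler`.

```python
import logging
import sys


class BelowError(logging.Filter):
    """Let through only records less severe than ERROR."""

    def filter(self, record: logging.LogRecord) -> bool:
        return record.levelno < logging.ERROR


def build_logger(name: str = "app") -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False

    # Normal activity goes to stdout and never includes errors.
    activity = logging.StreamHandler(sys.stdout)
    activity.addFilter(BelowError())
    activity.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

    # Errors go to stderr only, padded with blank lines so they stand out.
    errors = logging.StreamHandler(sys.stderr)
    errors.setLevel(logging.ERROR)
    errors.setFormatter(logging.Formatter("\nERROR: %(message)s\n"))

    logger.addHandler(activity)
    logger.addHandler(errors)
    return logger


log = build_logger()
log.info("service started")
log.error("Did not find ENVVAR.\n  Set ENVVAR to a proper value and restart the service.")
```

Note that this sketch puts the whole error, context included, in one multi-line record, so nothing needed to understand it is left behind on stdout.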
| dvtrn wrote:
| I'm left wondering at what point does the "give a man a
| fish/teach a man how to fish" method of pedagogy apply in
| terms of 'acting like a human' in this context?
|
| Asking as someone who otherwise generally agrees that
| there are some truly poorly written errors and exceptions
| out there, but has also been on the admittedly
| frustrating end of the constant requests for help
| deciphering error messages that were very plainly stating
| what the problem is for someone who didn't even try
| looking for the fishing rod.
| chillfox wrote:
| Sure, clearly there are people who will never try, or
| learn, but in general as an industry I feel like the vast
| majority of errors are very very far from good.
|
| Few error messages are written well, have good formatting
| and are self contained (can be used to fix the issue
| without having to seek further information elsewhere).
| Sometimes you see errors that contain one of those
| elements, but rarely all of them.
|
| There has been an effort the last few years improving
| compiler errors for some languages, but those same
| improvements have not reached applications.
| coldacid wrote:
| The help desk guys are on the other side of a cubicle wall from
| my workstation, and almost every call I overhear about someone
| getting errors just convinces me further and further that
| people don't only not pay attention to the error message, they
| don't pay attention to the people they're calling to help them
| get through the situation either.
| lupire wrote:
| Use the error messages you wrote! Send them the link they sent
| you, and move on.
| pizza wrote:
| I mean it kinda makes sense. When you're coding, you're
| constructing something. When you're debugging, you're
| deconstructing something. I feel like it's natural for people
| to take a sec to codeswitch, bc they were likely in a state of
| flow w/ considerable momentum up until they saw the error
| llbeansandrice wrote:
| I feel like I can't get folks to open the log file and cmd-F
| "ERROR" half the time.
| TillE wrote:
| I've seen this _constantly_ over the years, people who
| absolutely refuse to read the simplest instructions, but
| instead require step-by-step hand-holding from you personally.
|
| I have no idea how these people get through life at all.
| 0x457 wrote:
| Hey, let's jump on a quick call, so we can go through this
| together and maybe update docs if they're out of date?
| dagw wrote:
| I suspect that, at least subconsciously, they're to some
| extent doing that to punish you for writing 'bad' software
| that they have to struggle with. If they're going to suffer,
| you're going to suffer right alongside them.
| 0xbadcafebee wrote:
| The problem is here: _"RuntimeError:"_. Once they saw that,
| they stopped reading. _"Did not find ENVVAR"_ [..] _"ensure
| this is set to the proper value"_ [..] _"and then restart the
| service"_ are also obscure and will stop them from reading.
|
| Why is the user like this? Error message PTSD. Years of staring
| at obscure errors full of technical jargon that are not helpful
| to the user, has left them scared to even _look_ at the content
| of the error message. They have tried to Google these things
| before and failed, and now they just avoid it entirely and run
| for help.
|
| I'm sure there's enough detail in the link you provided to help
| the user. But if that's the case, it will be better for the
| error message to simply say:
|
|   A problem occurred, but don't worry! You can fix it yourself in 5
|   minutes! For instructions, visit
|   https://internal-wiki-link/spaces/BLAH/AppUserRuntimeError#A013579
|
| Even if you expect the user to be "smart enough" to fix their
| own problem, they are more likely to try it themselves if you
| make it seem easier.
| Kalium wrote:
| I tried exactly this approach! What I got was a bunch of
| developers copy-pasting the error message with helpful URL at
| me and demanding to know what they should do. The number who
| followed the link and fixed the problem themselves was
| shockingly small.
|
| Going out on a limb, I think we're all going astray by trying
| to parse the error messages our fellow developers are
| reacting to. A great many seem to handle any unfamiliar or
| unexpected error message by giving up, no matter how friendly
| or informative or helpful it may be.
| bonoboTP wrote:
| They don't parse the error message as a natural language
| sentence talking to them. They take it as an opaque string,
| like a big error code. It literally passes through them
| without getting interpreted.
|
| They learned that the affordances of these error messages
| are copy pasting into some place: a google search box, or a
| chat box asking for help. But it has no affordance of
| "interpret as an English sentence" for them.
| 0xbadcafebee wrote:
| If that's the case, then these people may just need
| training. It's likely that nobody has ever sat them down
| and explained that they have a responsibility to
| investigate their own issue. Often people feel they have to
| rush to get something done, and that they _can't_ take
| time to troubleshoot. But if their bosses explained that,
| actually, it's fine if your work is a little late due to
| troubleshooting, they might do it themselves more often.
| You also may need to provide back-pressure by interacting
| via email/ticket.
| Kalium wrote:
| That's a kind, caring, compassionate, empathetic approach
| founded on assuming good faith.
|
| Unfortunately, it is perhaps not an ideal fit. I was
| mostly not dealing with the most junior and new of
| developers here. I was often dealing with senior
| developers who fully understood that they were
| responsible for investigating their own issues in a
| context where it was understood that troubleshooting
| takes time.
|
| I often wound up regurgitating the error message back to
| them, asking them to point to the problems in the
| documentation getting in the way of them solving their
| own problems. This generally resulted in a conspicuous
| silence and the issues shortly thereafter being resolved.
|
| The lesson I drew from this was not that the developers
| in question needed training. What I learned was that they
| needed to be convinced to treat these errors as natural-
| language strings they could interpret themselves.
| tlogan wrote:
| This is 100% correct.
|
| In theory, all errors should: explain the input, explain the
| problem and explain how to solve the problem (actions). And
| that should help and reduce the number of support calls. However,
| error messages and the actions to solve the error are read by
| maybe 1% of users.
|
| The only way to improve your UI is to prevent errors and use
| standards / familiar design.
| JTbane wrote:
| >>>"RuntimeError: Did not find ENVVAR, ensure this is set to
| the proper value (see <internal wiki link>) and then restart
| this service"
|
| I'm laughing as you could not make it clearer if you tried.
| PEBKAC
| [deleted]
| onion2k wrote:
| Shouldn't the app gracefully exit with a clear message, and not
| bail out in a way that looks like a crash? I'd guess that the
| person who wrote it hooked into the error handler because that
| was the easy thing to do rather than bother to write a nice way
| to exit properly.
|
| The fact that you've had this _a dozen times_ points to a
| problem with the app more than the people using it to me.
| CityCobra wrote:
| Still, if you write proper error messages then at least _you_
| can figure out what the issue was without SSHing into the
| person's computer and checking their logs.
| [deleted]
| xiphias2 wrote:
| A "Try again" button is the worst way to solve the problem of
| having no connection. GMail does it right by trying again
| automatically periodically while having an error bar on the top
| of the screen, at the same time not stopping the user from using
| the application.
|
| If Wix can save the data locally, why not just copy the GMail
| error interface and let the user decide when to connect to
| internet?
| he0001 wrote:
| I believe that any language that treats errors and error
| management as an afterthought is bad. Also any programmer that
| treats errors as an afterthought or simply ignores them is going
| to write bad code/programs. Errors are hard and need language
| first level support. People talk about "higher order functions"
| but never how to deal with errors (mainly because it's boring and
| complicated). Also errors are tightly coupled with intentions, as
| if you fail to do something, well that's an error. But that also
| means that it's tightly coupled with what the program is trying
| to achieve. So anywhere an error happens should be close to what
| it tries to do. Also it solves what an error is all about, which
| makes it easy to describe what it should be. Yes there are errors
| that may not fall into this category as they are much less
| related to what you are trying to do functionally. Any program
| which ignores how errors work and flow, in my experience, has
| always been bad in general, as the structure of it is also bad as
| there's no organization.
| londons_explore wrote:
| A big part of this is to direct more of your development time
| into errors that happen more frequently.
|
| Most systems I was involved in designing have some kind of error
| tracking system, so we can know exactly how often each error
| occurs.
|
| An error that never happened needs (usually) no attention.
|
| An error that 28% of installations have seen needs _a lot_ of
| attention. The error text should be translated into local
| languages, wiki pages should be written about how to resolve it,
| efforts should be made to auto-resolve the error. The error
| message should include helpful info, etc.
|
| Eg. "SSH server can't start. Config file unreadable".
|
| Could be split into:
|
| SSH server can't start. Config file error on line 7.
| 'AllowPasswordLoogin' is an invalid setting. Did you mean
| 'AllowPasswordLogin'? If you want to make this change, 'sudo nano
| /etc/sshserver.conf' will let you change this config.
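The "Did you mean ...?" part of the improved message above can be approximated with the standard library's `difflib.get_close_matches`. This is a rough sketch: the list of known settings is hypothetical (only 'AllowPasswordLogin' comes from the comment), and the cutoff of 0.8 is an arbitrary choice.

```python
import difflib

# Hypothetical list of valid keys; only 'AllowPasswordLogin' is from the thread.
KNOWN_SETTINGS = ["AllowPasswordLogin", "Port", "ListenAddress"]


def describe_unknown_setting(key: str, line_no: int, path: str) -> str:
    """Explain an invalid config key and suggest the closest valid one, if any."""
    message = (
        f"SSH server can't start. Config file error on line {line_no}. "
        f"'{key}' is an invalid setting."
    )
    # Only suggest when the match is close; a wild guess is worse than none.
    matches = difflib.get_close_matches(key, KNOWN_SETTINGS, n=1, cutoff=0.8)
    if matches:
        message += f" Did you mean '{matches[0]}'?"
    message += f" 'sudo nano {path}' will let you change this config."
    return message


print(describe_unknown_setting("AllowPasswordLoogin", 7, "/etc/sshserver.conf"))
```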
| [deleted]
| imwillofficial wrote:
| I saw an error message the other day:
|
| "Deployment failed because: deployment succeeded"
| cpeterso wrote:
| If you have tech support or knowledge base articles for your
| product, you can include unique error codes in your error
| messages so that Googling the error code will find the
| appropriate support article. Microsoft is pretty good about this
| with their KB article numbers and their compiler error messages
| like C4000: https://learn.microsoft.com/en-us/cpp/error-
| messages/compile...
| andrewguenther wrote:
| Bonus points if your link to customer care auto-populates the
| fields necessary to get the ticket where it needs to go and can
| attach relevant diagnostic information to the resulting ticket.
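Combining the two suggestions above (a unique, searchable error code plus a support link that pre-fills the ticket) could look like this sketch. The support URL, the query parameter names, and the code "E1042" are all made up for illustration; a real ticketing system would define its own.

```python
from urllib.parse import urlencode

# Hypothetical ticket form URL; replace with whatever your support system uses.
SUPPORT_URL = "https://support.example.com/new-ticket"


def support_link(error_code: str, summary: str, app_version: str) -> str:
    """Build a link that carries the diagnostic fields into the ticket form."""
    query = urlencode(
        {"code": error_code, "summary": summary, "version": app_version}
    )
    return f"{SUPPORT_URL}?{query}"


def user_message(error_code: str, summary: str, app_version: str) -> str:
    """Show the searchable code next to the text, then the pre-filled link."""
    link = support_link(error_code, summary, app_version)
    return f"{summary} (error {error_code})\nContact support: {link}"


print(user_message("E1042", "Could not save your changes", "2.3.1"))
```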
| swyx wrote:
| write errors that don't make me think: https://dev.to/swyx/write-
| errors-that-don-t-make-me-think-24...
| larsonnn wrote:
| Just tell me you can't connect with a big red Error message. I
| don't give a damn about polite error messages.
| kgeist wrote:
| What the article is missing is how they learned the new error
| messages are now more helpful to the end user. Some kind of
| metrics: maybe, the number of support tickets/angry reviews
| decreased? Otherwise without clear criteria for success I'm not
| sure if it was worth it and wasn't just changing the error
| messages for the sake of changing. Sure what they talk about
| makes sense but "it makes sense" is not a business metric.
| fleddr wrote:
| This is great, I would add one critical ingredient: provide
| actual customer care.
|
| Meaning, the "way out" is to point users to customer care, but
| this still does not help if customer care is shit. And we know it
| often is.
|
| Customer care should be an email address (and/or phone number) in
| the footer. Not a contact form. Self-help/FAQ is fine, but no
| replacement for direct contact. Nor is a shitty AI bot.
|
| And when contacting support directly, answers should not be
| scripted non-sense completely ignoring the actual issue at hand.
|
| I don't care if it doesn't scale. Make it scale. Your problem.
| p5a0u9l wrote:
| Was hoping to get insight on better logging for engineering
| users, not UX design.
___________________________________________________________________
(page generated 2022-10-19 23:00 UTC)