[HN Gopher] Logging in Python like a pro
___________________________________________________________________
Logging in Python like a pro
Author : EntICOnc
Score : 212 points
Date : 2022-07-01 12:38 UTC (10 hours ago)
(HTM) web link (guicommits.com)
(TXT) w3m dump (guicommits.com)
| jasonpeacock wrote:
| This is very long (but thorough!) article about implementing
| structured logging.
|
| Shorter version: Use structured logging!
|
| If using Python, use `structlog`:
| https://www.structlog.org/en/stable/
| timost wrote:
| I agree, structlog is such a quality of life improvement.
|
| It can be tricky to forward all logs through it [0] but it's
| definitely worth it.
|
| [0] https://www.structlog.org/en/stable/standard-library.html
| jteppinette wrote:
| https://github.com/jteppinette/python-logfmter
|
| I made this library to make global structured logging in
| Python super easy.
| delijati wrote:
| You always need some boilerplate to get structlog working. I
| always use this (author) to have the needed defaults
| https://github.com/peakiq/logma.
| poopman68686 wrote:
| Good Post!
| mr-karan wrote:
| Honestly, I like to keep the logger as decoupled and minimalist
| as possible. There's https://vector.dev which can basically ship
| your logs from source A to destination B (supports tonnes of
| usecases). Separation of concerns makes things much more easier.
| gregmac wrote:
| I think this is a good overview, I especially like "descriptive,
| contextual, reactive." I like to ask "who is going to see this
| message and why? What action should they take?"
|
| I've also come to find a few "wide" log entries significantly
| better than having many smaller log messages (especially as the
| system grows).
|
| Ideally, one operation == one log message, and contains all
| context: userid, sessionid, url path, domain, error
| codes/messages, timing, etc. log error/warning if it fails, info
| if success. It's much easier to follow than one operation spread
| across a half dozen log messages. In dev, with one thing
| happening at a time, it's no big deal, but in production there's
| way more stuff happening and it is interspersed with hundreds of
| lines of logs of other operations as well as other threads
| running the same operation concurrently.
| bakpakin wrote:
| anyone else just define a function log(msg) the prints to stderr
| or maybe stdout and use that instead for simple programs/daemons?
| I never really got the big value proposition of seven-leveled
| logging or the complexity of it. If I need to customize it I just
| change my log function.
| miohtama wrote:
| A very good post on creating good log contexts.
|
| For powerful log channeling and notifications to Telegram, Slack,
| etc. there is this Python package:
|
| https://pypi.org/project/notifiers/
|
| I also created a logging handler for Discord, where one can get
| notifications on errors to a private channel:
|
| https://github.com/tradingstrategy-ai/python-logging-discord...
| Thaxll wrote:
| Don't use warn, either info or error nothing in between.
| m463 wrote:
| I disliked the logging levels
|
| I really wanted to have an app with a reasonable default level,
| then allow flags like --debug, --verbose plus --quiet and it
| never seemed to map like I wanted.
|
| maybe it's obvious and I just don't see it.
| LtWorf wrote:
| Just try reading logs from journalctl, where you can filter
| the levels you don't need and see errors and warnings in a
| different colour. Then you will perhaps re-evaluate the
| importance of levels.
| 1337shadow wrote:
| By default, use INFO to tell what has been done, use DEBUG to
| describe what's going to be done and how it's deciding it.
| gregmac wrote:
| Warning is actually very useful.
|
| One of the biggest logging mistakes I see is people misusing
| Error. What most developers consider an error -- external
| service call failed, user gave bad input, or a timeout happened
| - is often not really a problem. The service call is re-tried.
| Users are going to user (and provide garbage data and then fix
| it). None of these require operator intervention.
|
| To me, Errors imply action needed, or highlight an unresolved
| situation. If the service call keeps failing after all retries,
| and now a feature is impacted (eg a user won't get a report, or
| there's a gap in data somewhere) then that can be an Error.
|
| All the temporary faults are useful as Warn to highlight them
| and distinguish from other normal operations, but no action is
| needed. This is most useful when a user reports something like
| "it seems to be taking much longer than usual to get email
| notifications" and when an operator looks at the logs, they're
| full of warnings. It's also useful for monitoring: high number
| of warnings in the log can cause a non-critical notification to
| operators (during business hours; not a wake-me-up-at-2am
| page).
| jxramos wrote:
| This is the first time I've ever learned about that `extra`
| argument to logging calls
|
| Interesting capabilities. There's a rundown with log output
| examples seen in
| https://docs.python.org/3/library/logging.html#logging.Logge...
| mekster wrote:
| Does anyone ever log in multi dimensional way at all?
|
| Everyone seems to just go, logger.info('Doing
| that')
|
| But logs really should be covering the context instead of being
| single dimensional by being placed at the execution point.
| //Sorry, I don't know python well but trying to run a lambda as a
| parameter. myLogger.info('Doing that', lambda : (
| doThat() whatNot()))
|
| And then, you know exactly what the log is covering as well as
| take a benchmark of the said parts and report how long it took to
| run.
|
| You don't need 'Done that' logging as if you throw inside, the
| log doesn't happen.
|
| It seems no logging libraries do that but it's an essential part
| of how I do logging.
| aunty_helen wrote:
| "%(asctime)s - %(levelname)-5s - %(name)-20.20s - %(message)s -
| (%(filename)s:%(lineno)s)"
|
| Here's a logging format that I use and probably you should too.
| Among other things, it gets around most of the context issues by
| outputting the file and line number of the code that called the
| log.
|
| Before I would try and make every log message unique or mostly
| unique so there weren't double ups.
|
| I don't agree with the authors logging clutter though. If you're
| manually eyeballing the logs looking for a message you're already
| doing it wrong. Use search, have ids in the messages etc.
|
| Remember folks, debug logging on because if it's not logged, when
| it breaks you don't have the logs...
| LtWorf wrote:
| If I can't query for warning level on journalctl it's useless.
| aunty_helen wrote:
| That would be the levelname part. But if your log aggregator
| or subsystem is hindering your ability to use your logs, I
| would look to solve that problem before worrying about
| anything within your application.
|
| Even storing logs in a format that you need to decode or
| download would be a red flag for me. Seconds of debugging all
| of a sudden turns into minutes or even hours.
| connorbrinton wrote:
| Another great project for Python logging is loguru:
| https://loguru.readthedocs.io/en/stable/overview.html
|
| One of my favorite features of loguru is the ability to
| contextualize logging messages using a context manager.
| with logger.contextualize(user_name=user_name):
| handle_request(...)
|
| Within the context manager, all loguru logging messages include
| the contextual information. This is especially nice when used in
| combination with loguru's native JSON formatting support and a
| JSON-compatible log archiving system.
|
| One downside of loguru is that it doesn't include messages from
| Python's standard logging system, but it can be integrated as a
| logging handler for the standard logging system. At the company
| where I work we've created a helper similar to Datadog's ddtrace-
| run executable to automatically set up these integrations and log
| message formatting with loguru
| codethief wrote:
| Does loguru come with a lot of global state like the standard
| logging library? In my experience this makes logging a real
| PITA when forking multiple processes...
| isoprophlex wrote:
| Cannot recommend loguru highly enough. After a coworker showed
| it to me, I fell in love immediately.
|
| I'll never go back to that rancid pile of shit that is the
| builtin logger lib.
| memco wrote:
| I mentioned this almost every time logging and python come up but
| one of my favorite feature is logging.exception:
| https://docs.python.org/3/library/logging.html#logging.Logge....
| It allows you to attach the whole exception with stack trace
| without having to explicitly used "as e" and then use traceback
| or embed the error in the message. Something like
| logger.exception("Failed to connect.") will print the message and
| the exception in one go.
| atorodius wrote:
| I'm using this too, but I always found it very unreadable for
| people unfamiliar with it. It's so surprising that it catches
| the exception for you! I end up always adding a commend
|
| ``` logger.exception("Failed to connect.") # Will print
| exception. ```
| Narushia wrote:
| I second this, very useful. I kind of expected the article to
| mention it, but it didn't!
| [deleted]
| kordlessagain wrote:
| As a logger in a previous lifetime, I approve of this content.
| orf wrote:
| The Python logging module is one of the worst modules in the
| stdlib. You can see it's Java inspiration all over it, as being
| one of the oldest modules it was created before anyone really
| knew what "pythonic" code was.
|
| Things I really dislike about it is the lack of "context".
| Usually log messages are about something - often a request. You
| can't attach a request ID or something else a nested set of
| function calls that log.
|
| The whole library is just really rubbish and has not aged well.
| There's no standard way of outputting JSON logs, but there's a
| built in way to spawn a socket server that allows the logging
| configuration to be updated remotely via a bespoke protocol? And
| everything is built mostly around logging to files.
|
| It also hits that perfect sweet spot of being both overly
| engineered and totally inflexible. I wanted to funnel our log
| messages to s3 via the "s3fs" module, which supports a "file
| like" object that gets written to s3. I had to hack around the
| various handlers to support this because it assumes it's a "real"
| file.
|
| Hate it.
| [deleted]
| systemvoltage wrote:
| Agreed. One of the most confusing libraries with a ton of magic
| going on behind the scenes. I also detest PyTest, although not
| a std lib, pytest is one of the worst in terms of explicitness.
| It's very useful but that's orthogonal. I've seen newcomers to
| python struggle to grasp how pytest just manipulates a billion
| things behind the scenes.
|
| Overall, I prefer less magic and more explicitness.
| simonw wrote:
| Disagree on pytest: it's the first testing framework I've
| used in ANY language which has genuinely made me enjoy
| writing tests.
|
| It has a bit of an initial learning curve, but once you
| understand how fixtures work it's amazing how productive you
| can be with it.
| [deleted]
| miohtama wrote:
| The implicit magic makes writing pytest tests painless.
|
| Also with improvement of type hinting, tooling like PyCharm
| starts to understand pytest better and make it easier to
| read and manage and a lot of the magic can be dispellt by
| written out explicitly. Though this need some discipline
| from the dev team.
| codethief wrote:
| Strong agree on all points! The number of times I had to
| debug into the pytest source code because developers (the
| ones using pytest) thought it would be cool to use some
| decorator black magic to manipulate global state behind the
| scenes which pytest not just makes very easy but actually
| encourages and which was obviously going to break sooner or
| later...
|
| And regarding the sibling comment: Yes, I prefer unittest as
| well.
| oefrha wrote:
| PSL unittest is very explicit and has no magic. Ironically it
| also has Java written all over it. You must be the first
| person I met who prefer unittest to pytest.
| dragonwriter wrote:
| > PSL unittest is very explicit and has no magic.
| Ironically it also has Java written all over it.
|
| By "Java written all over it" do you mean that like JUnit
| for Java, it's an implementation of the xUnit pattern
| derived from SUnit for Smalltalk?
| [deleted]
| oefrha wrote:
| It didn't have to copy everything down to the camelCase
| if it was just an implementation of the xUnit pattern
| with no Java influence.
| dragonwriter wrote:
| camelCase is idiomatic in Smalltalk, which as a language
| and a community was a pretty big influence on other
| dynamic OO languages like Python and Ruby, and from
| whence SUnit comes, and thus the whole xUnit world,
| derives.
|
| So, it's maybe not the best evidence that the Python
| version is particularly a clone of the Java port.
| txru wrote:
| I sometimes prefer unittest too!
|
| Pytest is very opinionated about how and when it's going to
| be run. If you want to put a test in say, Jupyter or a
| Markdown block, unittest is much more amenable to just
| doing what you want without having to read through
| documentation to break a whole bunch of magic behavior.
| bornfreddy wrote:
| What do you mean? Pytest rules! Yes there is some magic, but
| if you let it take control, it is very convenient to write
| and run tests with it. Just... you need to trust it and it
| will do its job.
| systemvoltage wrote:
| I use pytest all the time. Doesn't mean it is well thought
| out. It could have been so much better with more visibility
| and control, and less magic.
| akhmatova wrote:
| _The Python logging module is one of the worst modules in the
| stdlib_.
|
| One has to remember that logging was never intended to be some
| infinitely generalizable metaprotocol for massively streaming
| parallel notifications to the cloud or whatever. It was meant
| as a replacement for "if (DEBUG > 4): print(...)". For which I
| think it does quite well, thank you.
|
| _The whole library is just really rubbish and has not aged
| well. Hate it._
|
| So what have you written that has made people's (not just your
| bosses' or your clients') lives easier, for nearly 20 years now
| (if only incrementally)? Do share.
|
| I see and grant your points, but I find this strong emotional
| rebuke of something that was never intended to be anything more
| than a simple convenience library to be well, strange.
| orf wrote:
| It clearly is supposed to be more than a convenience library.
| It's _the_ blessed, stdlib way to do _logging_ with Python.
| The module isn't called "print_with_levels".
|
| So now everything needs to work with it, it's internals and
| it's way of doing things. And to top it off it's near
| impossible to refactor or replace.
| akhmatova wrote:
| Is there an alternative that you would prefer to bless?
| thechao wrote:
| I can't agree with you more. I switched to Python's logger off
| of my janky thing about 6 months ago, and I'm hating every
| second of it. I still have no clue how to turn it on, turn it
| off, get it to fork output (stdout, stderr, file), nor how to
| override its behavior.
|
| It's literally easier to just have an IOBuffer laying around.
| oefrha wrote:
| There are valid complaints in the grandparent, but if you
| can't figure out how to fork output to stdout, stderr and
| file in six months, that's really on you. You should know how
| to register multiple handlers if you spent five minutes
| reading the documentation.
| thechao wrote:
| I can easily fork output _manually_ ; I can't do it
| reliably using Python's logger.
| maleldil wrote:
| Echoing grandparent's comment, it's not that hard:
|
| https://docs.python.org/3/library/logging.html#logging.Lo
| gge...
|
| For dictConfig, the documentation has examples
|
| https://docs.python.org/3/library/logging.config.html#obj
| ect...
|
| Python's logging has its faults, but you're complaining
| about something I could find in 5 minutes of Google.
| thechao wrote:
| None of which work reliably across multiple modules all
| fighting for control of the logger. Reliable is the key
| term, here.
| slaymaker1907 wrote:
| Eh, it's really not that bad. I implemented this for a
| tool at work last year and it seemed pretty
| straightforward.
| paulfurtado wrote:
| It's definitely sad that there isn't built-in context support,
| but if you're looking for a solution to this, it is actually
| very easy to attach context to logs using threadlocal and a
| logging filter.
|
| When running a webserver though, an even simpler trick is to
| add middleware that sets the current request's info as the
| current thread's name, and then including threadName in your
| log format.
| andrewingram wrote:
| ContextVars are seemingly more robust and work with async. We
| use them quite extensively for adding context to logs.
| MapleWalnut wrote:
| TIL that ContextVars can work in place of thread locals
| too. Thanks for pointing that out!
|
| I had mistakenly assumed ContextVars only worked for async
| code.
| nerdwaller wrote:
| I was going to say the same, I combine ContextVars with
| LoggerAdapters[1].
|
| [1] https://docs.python.org/3/library/logging.html#logging.
| Logge...
| 1337shadow wrote:
| Take a look at pylogctx maybe it'll do what you need.
| memco wrote:
| The contextual support is lacking but it can be done. See
| this comment from a previous discussion where I briefly
| described how I've used the record factory to help:
| https://news.ycombinator.com/item?id=31484527
| helsinki wrote:
| It's rock solid, at least. I've also had the displeasure of
| working in the logging package. Totally outdated. I created a
| JSON formatter (also support for context). Huge pain, but
| always works.
| 1337shadow wrote:
| Maybe worth trying pylogctx and python-logging-loki or see how
| it's done there.
| doliveira wrote:
| > there's a built in way to spawn a socket server that allows
| the logging configuration to be updated remotely via a bespoke
| protocol
|
| That sounds like there's a Log4shell-like vulnerability waiting
| to be found. But I couldn't even find it, I found the
| SocketHandler but that's just a log destination.
|
| EDIT: shockingly, I did find it:
| https://docs.python.org/3/howto/logging-cookbook.html#config...
| emmelaich wrote:
| That might explain why it lacks a NOTICE level. That bugged the
| hell out of me with Java.
|
| Presumably these levels derive from syslog or older. I've no
| idea why they omitted NOTICE.
| miohtama wrote:
| You can quite easily add your own logging levels - it is not
| hardcoded. I think it's even covered in the documentation, as
| I have done it before.
| miohtama wrote:
| Not to defend the legacy of good old logging, but if you have
| ever tried to log in Node.js applications Python logging starts
| to feel godsent.
| rgovostes wrote:
| I've been writing Python daily for 15 years but when I have to
| do any configuration beyond `basicConfig()`, I still groan. In
| many instances I have wanted to do something like temporarily
| turn on logging of HTTP requests, which turns into a much more
| involved effort than it ought to be.
|
| I just recently discovered that there's no documented way to
| dump the current configuration
| (https://stackoverflow.com/q/72624883) so you can see what
| loggers exist and where their output is going.
| drzoltar wrote:
| Are there any good 3rd party alternatives that work better?
| Jamie9912 wrote:
| loguru perhaps
| sage92 wrote:
| This is the only library I that both can use out of the box
| and extend to my needs.
| seabrookmx wrote:
| structlog
| delijati wrote:
| You always need some boilerplate to get structlog working.
| I always use this (author) to have the needed defaults
| https://github.com/peakiq/logma.
| raverbashing wrote:
| Yeah, it exists
|
| But IMO it inherited most of loggers problems while not
| offering a better interface
|
| It is basically a kwargs to json or 'key=value' converter
|
| (Also json logs are overrated. There I said it. Especially
| one key per line, please don't)
| formerly_proven wrote:
| It's unfortunate how blind copying of log4j into many languages
| seems to have inhibited improvements in logging in the last
| ~20ish years.
| MichaelRazum wrote:
| Yep. Have to agree. It works, but is really messy.
| LtWorf wrote:
| I normally just import syslog or journald. The latter allows
| for arbitrary fields to be defined and then searched, but it's
| not in the stdlib.
| 1337shadow wrote:
| Great idea but how do you work this out in a container where
| the python process is PID 1?
| LtWorf wrote:
| bind mount /dev/log in the container of course.
|
| Also logging can be set up remotely, but it's easier to do
| like that from the container and then configure the hos to
| send everything to a single log machine.
| sorokod wrote:
| For whatever it's worth, java logging has mapped diagnostic
| context (MDC).
| efitz wrote:
| Very good advice.
|
| The advice I give to people is "think about logging like a
| feature with customers, just like any other feature." If you
| think about logging this way, you ideally put yourself in the
| position of your logs' "customer"- a bleary-eyed teammate who
| just got paged out of bed at 2am, or a security guy who isn't
| intimately familiar with your code. Providing extra context for
| those users is really kind, as they don't have a "secret decoder
| ring".
|
| Since the article discusses it, I also highly recommend AWS
| CloudWatch logs. It's super easy to log to CloudWatch from Lambda
| with the Python logger class. It's a little harder from EC2, but
| not really that difficult. The advantage is that you don't need
| to SSH somewhere and navigate to a file and then locate the
| interesting stuff in that file - You just keep the CloudWatch
| console open (or keep a link to it handy) and the latest stuff is
| on top with easy filtering.
| pid-1 wrote:
| Cloudwatch is cheap (even free) for small environments and has
| very small operational overhead.
|
| The cost, however, can easily get out of hand and AWS does not
| have per resource billing enabled by default. So it might not
| be obvious which Log Group is responsible for for which charge.
|
| I also think its console is way inferior than something like
| Grafana for searching and visualisation.
| jstarfish wrote:
| My standard, to the extent possible: make your logs clear
| enough that your _attorney_ would be able to understand them.
|
| You may well find yourself wishing you did this if you also
| output DEBUG logs in production.
| dragonwriter wrote:
| > It's super easy to log to CloudWatch from Lambda with the
| Python logger class
|
| Or even just print(): You've got a unique log group for a
| lambda, CloudWatch logs autonatically timestamps entries, and
| stdout & stderr from a lambda both go to the current CloudWatch
| log stream in the lambda's CloudWatch logs group.
| efitz wrote:
| My experience with Python in Lambda was that my functions ran
| significantly faster when I used logger rather than print for
| logging. I agree that Lambda captures stdout to CloudWatch,
| but my experience was that logger was better (and more
| intentional).
|
| ed: clarified that this relates to AWS Lambda only
| [deleted]
| AmpsterMan wrote:
| I don't know if it's the way cloud watch works in my current
| company or what, but I preferred SSH into the server and grep
| for what I want lol.
|
| I miss that I can't do that any longer
| dygd wrote:
| Using the official CLI (aws logs get-log-events) or
| https://github.com/jorgebastida/awslogs is pretty close to
| SSH-ing and grepping.
| AmpsterMan wrote:
| Oh thanks I'll have to give that a shot! It may also just
| be that I need to not get so caught up in learning a new
| way to do something.
| efitz wrote:
| I hear you, and when I did some research on this in the past,
| going to get logs was the most common reason for engineers to
| SSH to production hosts.
|
| As a security guy, I hate SSH to production (the whole
| "cattle, not pets" thing). In my last company we had an
| internal tool to federate you to the AWS console. We had
| runbooks in a wiki, and had links literally to the logs for a
| particular component/service/region - the link would federate
| you to the right account and take you directly to the target
| log in cloudwatch logs in the appropriate region. Safer and
| easier than ssh-to-prod.
| alanwreath wrote:
| given you've configured logging somewhere early on in the
| bootstrap phase of your app - there's really few reason to go up
| and above having more than these two lines in all your logging
| modules import logging logger =
| logging.getLogger(__name__)
|
| But boy oh have I seen some funny business where people attach
| the logger to their objects like so class Wat:
| def __init__(self): self.logger =
| logging.getLogger(...)
|
| All for what? For WAT?!
| 1337shadow wrote:
| To inherit the logger attribute in subclasses?? Save an import?
| Nimitz14 wrote:
| That is a super common mistake from beginner python
| programmers.
| hotfixguru wrote:
| Best way to find logs connected to one of your API requests is by
| using correlation IDs, which are injected to each log message. Me
| and my friend wrote libraries to do this for both Django[1] and
| ASGI apps(starlette, FastAPI)[2]
|
| [1] https://github.com/snok/django-guid
|
| [2] https://github.com/snok/asgi-correlation-id
___________________________________________________________________
(page generated 2022-07-01 23:00 UTC)