[HN Gopher] Lossless Log Aggregation - Reduce Log Volume by 99% ...
       ___________________________________________________________________
        
       Lossless Log Aggregation - Reduce Log Volume by 99% Without
       Dropping Data
        
       Author : benshumaker
       Score  : 113 points
       Date   : 2024-12-06 00:31 UTC (22 hours ago)
        
 (HTM) web link (bit.kevinslin.com)
 (TXT) w3m dump (bit.kevinslin.com)
        
       | aaaronic wrote:
       | The multi-line case can usually be fixed with simple
       | configuration changes to a structured log format.
       | 
       | The other cases are more interesting, and pre-aggregation of all
       | logs related to a correlation ID can be really helpful when
        | debugging a specific incident, but this proposal does seem to
        | make the same basic trade-off between size and performance as
        | virtually any form of compression.
        
       | XorNot wrote:
       | I have a persistent mental nag that the way we do logging always
       | seems so inefficient: that theoretically no application should
       | ever be outputting actual text messages, because logs are
       | basically fixed strings + formatting data.
       | 
       | So in theory we could uniquely identify all logs as a much more
       | compact binary representation + a lookup table we ship with the
       | executable.
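        | 
        | A minimal sketch of what I mean (hypothetical format, nothing
        | standard): ship a table of format strings with the binary, and
        | make each record an index plus packed arguments.
        | 
        |   import struct
        | 
        |   # Lookup table shipped alongside the executable.
        |   TEMPLATES = {
        |       0: "user %d logged in from %s",
        |   }
        | 
        |   def encode(tid, *args):
        |       # Record = template index + tagged, packed args.
        |       out = struct.pack("<H", tid)
        |       for a in args:
        |           if isinstance(a, int):
        |               out += b"i" + struct.pack("<i", a)
        |           else:
        |               s = str(a).encode()
        |               out += b"s" + struct.pack("<H", len(s)) + s
        |       return out
        | 
        |   def decode(buf):
        |       (tid,) = struct.unpack_from("<H", buf, 0)
        |       pos, args = 2, []
        |       while pos < len(buf):
        |           kind, pos = buf[pos:pos + 1], pos + 1
        |           if kind == b"i":
        |               (v,) = struct.unpack_from("<i", buf, pos)
        |               pos += 4
        |           else:
        |               (n,) = struct.unpack_from("<H", buf, pos)
        |               pos += 2
        |               v = buf[pos:pos + n].decode()
        |               pos += n
        |           args.append(v)
        |       return TEMPLATES[tid] % tuple(args)
        | 
        | Here encode(0, 42, "10.0.0.1") is 18 bytes vs. 31 for the
        | rendered text, and the gap grows with longer templates.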
        
         | RHSeeger wrote:
         | It depends on your use case. If you're looking to have a set of
         | data that you can search through, sure. If you're looking to
         | tail the log while working on things, having it output in plain
         | text is very handy.
        
         | drivebyhooting wrote:
         | Compression will basically achieve that kind of encoding
         | without tightly coupling your logging with an external enum or
         | schema.
        
           | vouwfietsman wrote:
            | Compression is only ever as good as the information it is
            | given: if given text, it can do an OK job using general
            | entropy encoding. However, most data is not text; even log
            | data is not text. Log data is timestamps, numbers, enums,
            | indices, guids, text, etc. Each of those categories, combined
            | with how they are laid out, can be compressed in a different
            | optimal way, resulting in different optimal compression
            | ratios based on your specific log format. As a simple
            | example, compressing 3D meshes as meshes is an order of
            | magnitude better than compressing 3D meshes as text.
           | 
           | Sure, zipping your logs gives you a LOT, and you should do
           | it, but if the result disappoints it is not the end of the
           | road, at all.
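            | 
            | A toy illustration of this in Python: delta-encoding a
            | column of timestamps before handing it to a general-purpose
            | compressor beats compressing their textual form (exact
            | numbers vary; the shape of the result doesn't):
            | 
            |   import struct, zlib
            | 
            |   # 10k timestamps, one per second.
            |   ts = [1_700_000_000 + i for i in range(10_000)]
            | 
            |   # Naive: timestamps as text lines.
            |   text = "\n".join(str(t) for t in ts).encode()
            | 
            |   # Type-aware: first value + int32 deltas.
            |   deltas = [ts[0]] + [b - a for a, b in zip(ts, ts[1:])]
            |   binary = struct.pack(f"<{len(deltas)}i", *deltas)
            | 
            |   print(len(zlib.compress(text)))    # tens of kilobytes
            |   print(len(zlib.compress(binary)))  # tens of bytes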
        
             | jumping_frog wrote:
              | Isn't compression the best general-purpose solution with
              | theoretical guarantees? I mean, even a simple Huffman
              | coding could pick out the key-value relationships (the
              | repeated keys become candidates for compression) and
              | compress them. But if you want to extract more juice,
              | that implies knowing the datatypes of the strings
              | themselves. It would be like having types for logs, and
              | not messing those up.
        
         | floren wrote:
         | How big of a lookup table will you use for logging IPv6
         | addresses?
        
           | do_not_redeem wrote:
           | > fixed strings + formatting data
           | 
           | IP addresses are obviously part of the formatting data, not
           | the fixed strings.
        
             | efitz wrote:
             | Why wouldn't IPv6 compress well?
             | 
             | IPv6 is used internally a lot more than externally, so I
              | would expect to see a LOT of commonality in the network ID
              | portion of the addresses- essentially all the bits
              | representing your IPv6 network ID get reduced to the number
             | of bits in a compression token, in the worst case. In the
             | moderate case, you get a few chatty machines (DNS servers
             | and the like) where the whole address is converted to a
             | single compression token. In the best case, you get that
             | AND a lot of repetition in the rest of the message, and you
             | reduce most of each message to a single compression token.
             | 
             | It's hard to explain if you haven't actually experimented
             | with it, but modern variants of LZ compression are
             | miraculous. It's like compilers- your intuition tells you
             | hand tuned assembly is better, but compilers know crazy
             | tricks and that intuition is almost always wrong. Same with
             | compressors- they don't look at data the same way you do,
             | and they work way better than your intuition thinks they
             | would.
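              | 
              | Easy to see for yourself with Python's zlib (an LZ77
              | variant; zstd does even better):
              | 
              |   import zlib
              | 
              |   # Synthetic log: many hosts sharing one /64.
              |   raw = "".join(
              |       f"2024-12-06T00:00:{i % 60:02d}Z accept from "
              |       f"2001:db8:abcd:12::{i % 7:x} port 443\n"
              |       for i in range(5_000)
              |   ).encode()
              | 
              |   packed = zlib.compress(raw, 9)
              |   print(len(raw) / len(packed))
              | 
              | The shared network prefix (and the whole addresses of the
              | chatty hosts) collapse into back-references; ratios well
              | past 50:1 are normal for data this repetitive.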
        
         | muststopmyths wrote:
         | Microsoft had a poorly documented tool called tracewpp that did
         | this. Blindingly fast logging with very little runtime
         | overhead.
         | 
         | It was hard to figure out how to use it without documentation
         | so it wasn't very popular. No idea if they still ship it in the
         | DDK.
         | 
         | It was a preprocessor that converted logging macros into string
         | table references so there was no runtime formatting. You
         | decoded the binary logs with another tool after the fact.
         | 
         | Vaguely remember some open source cross platform tool that did
         | something similar but the name escapes me now.
        
           | comex wrote:
           | Apple's os_log API also works this way.
        
           | koolba wrote:
           | > It was a preprocessor that converted logging macros into
           | string table references so there was no runtime formatting.
           | You decoded the binary logs with another tool after the fact.
           | 
            | I'd be curious how that stacks up against something like zstd
           | with a predefined dictionary.
        
           | HdS84 wrote:
            | I think the Windows event log works like this. Sadly it's
            | very opaque and difficult to use for non-admin apps (you need
            | admin rights to register your log source the first time;
            | afterwards you can run with fewer privileges).
        
             | muststopmyths wrote:
             | If you're thinking of ETW (event tracing for Windows) and
             | not the actual Windows EventLog itself, then you're right.
             | traceWPP used ETW under the hood to record logging as ETW
             | events in a file.
        
               | masfuerte wrote:
               | The Windows Event Log also used (uses?) this idea of pre-
               | defined messages and you just supplied an event ID and
               | data to fill in the blanks in the message.
               | 
               | Originally there was only one system-wide application
               | event log and you needed to be admin to install your
               | message definitions but it all changed in Vista (IIRC).
               | I'd lost interest by then so I don't know how it works
               | now. I do know that the event log viewer is orders of
               | magnitude slower than it was before the refit.
        
         | dietr1ch wrote:
         | Besides space, you can also save CPU when writing logs.
         | 
          | This is a logging library that does lazy formatting:
          | https://defmt.ferrous-systems.com/
        
         | jiggawatts wrote:
         | Your feelings are spot on.
         | 
         | In most modern distributed tracing, "observability", or similar
         | systems the _write amplification_ is typically 100:1 because of
         | these overheads.
         | 
          | For example, in Azure, every log entry includes a bunch of
          | highly repetitive fields in full, such as the resource ID,
          | "Azure" as the source system, the log entry Type, the tenant,
          | etc...
         | 
         | A single "line" is typically over a kilobyte, but often the
         | interesting part is maybe 4 to 20 bytes of actual payload data.
         | Sending this involves HTTP overheads as well such as the
         | headers, authentication, etc...
         | 
          | Most vendors in this space _charge by the gigabyte_, so as you
         | can imagine they have zero incentive to improve on this.
         | 
         | Even for efficient binary logs such as the Windows performance
         | counters, I noticed that second-to-second they're very highly
         | redundant.
         | 
         | I once experimented with a metric monitor that could collect
         | 10,000-15,000 metrics _per server per second_ and use only
         | about 100MB of storage per host... _per year_.
         | 
         | The trick was to simply binary-diff the collected metrics with
         | some light "alignment" so that groups of related metrics would
         | be at the same offsets. Almost all numbers become zero, and
         | compress very well.
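          | 
          | Roughly the trick, sketched in Python (not the actual tool):
          | fix the field order so snapshots align, XOR consecutive
          | snapshots, and compress; unchanged counters become runs of
          | zero bytes.
          | 
          |   import struct, zlib
          | 
          |   def pack(snap):
          |       # Sorted keys keep each metric at a stable offset.
          |       return struct.pack(
          |           f"<{len(snap)}q",
          |           *(snap[k] for k in sorted(snap)))
          | 
          |   def xor_diff(prev, cur):
          |       return bytes(a ^ b for a, b in zip(prev, cur))
          | 
          |   prev = pack({"cpu_ns": 10**12, "rx": 5_000_000,
          |                "errs": 0})
          |   cur = pack({"cpu_ns": 10**12 + 7, "rx": 5_000_100,
          |               "errs": 0})
          | 
          |   # Almost all zeros -> compresses to nearly nothing.
          |   print(len(zlib.compress(xor_diff(prev, cur))))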
        
           | kiitos wrote:
            | You never send a single individual log event per HTTP
            | request; you always batch them up. Assuming some reasonable
           | batch size per request (minimum ~1MiB or so) there is rarely
           | any meaningful difference in payload size between
           | gzipped/zstd/whatever JSON bytes, and any particular binary
           | encoding format you might prefer.
        
             | jiggawatts wrote:
             | Most log collection systems do not compress logs as they
             | send them, because again, why would they? This would
             | instantly turn their firehose of revenue cash down to a
             | trickle. Any engineer suggesting such a feature would be
             | disciplined at best, fired at worst. Even if their boss is
             | naive to the business realities and approves the idea, it
             | turns out that it's weirdly difficult in HTTP to send
             | compressed _requests_. See:
             | https://medium.com/@abhinav.ittekot/why-http-request-
             | compres...
             | 
             | HTTP/2 would also improve efficiency because of its built-
             | in header compression feature, but again, I've not seen
             | this used much.
             | 
             | The ideal would be to have some sort of "session" cookie
             | associated with a bag of constants, slowly changing values,
             | and the schema for the source tables. Send this once a day
             | or so, and then send only the cookie followed by columnar
             | data compressed with RLE and then zstd. Ideally in a format
             | where the server doesn't have to apply any processing to
             | store the data apart from some light verification and
             | appending onto existing blobs. I.e.: make the whole thing
             | compatible with Parquet, Avro, or _something_ other than
             | just sending uncompressed JSON like a savage.
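              | 
              | The client side of that, as a hypothetical sketch: split
              | the batch into columns, run-length encode the constant
              | ones, and compress what's left.
              | 
              |   import json, zlib
              |   from itertools import groupby
              | 
              |   events = [
              |       {"tenant": "t1", "src": "Azure", "ms": m}
              |       for m in (12, 15, 11, 13)
              |   ]
              | 
              |   def columns(rows):
              |       cols = {k: [r[k] for r in rows]
              |               for k in rows[0]}
              |       # RLE each column: [[value, run], ...]
              |       return {k: [[v, len(list(g))]
              |                   for v, g in groupby(vs)]
              |               for k, vs in cols.items()}
              | 
              |   payload = zlib.compress(
              |       json.dumps(columns(events)).encode())
              | 
              | "tenant" and "src" collapse to a single run each; only
              | "ms" carries real data. The truly constant fields would
              | ride on the session cookie instead.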
        
               | kiitos wrote:
               | Most systems _do_ compress request payloads on the wire,
               | because the cost-per-byte in transit over those wires is
               | almost always frictional and externalized.
               | 
               | Weird perspective, yours.
        
               | piterrro wrote:
                | They will compress over the wire, but then decompress on
                | ingest and bill you for the uncompressed size. After
                | that, an interesting thing happens: they compress the
                | data (along with other interesting techniques) to
                | minimize its size on their own premises. Can't blame
                | them... they're just trying to cut costs, but the fact
                | that they charge so much for something that is so easily
                | compressible is just... not fair.
        
           | david38 wrote:
           | This is why metrics rule and logging in production need only
           | be turned on to debug specific problems and even then have a
           | short TTL
        
             | jiggawatts wrote:
             | You got... entirely the wrong message.
             | 
             | The answer to "this thing is horrendously inefficient
             | because of misaligned incentives" isn't to be frugal with
              | the thing, but to _make it efficient_, ideally by aligning
             | incentives.
             | 
             | Open source monitoring software will eventually blow the
             | proprietary products out of the water because when you're
             | running something yourself, the cost per gigabyte is now
             | just your own cost and not a _profit centre line item_ for
             | someone else.
        
             | piterrro wrote:
             | Unless you start attaching tags to metrics and allow
             | engineers to explode cardinality of the metrics. Then your
             | pockets need to be deep.
        
         | david38 wrote:
         | This is what metrics are for
        
         | Veserv wrote:
         | Yep, every actually efficient logging system does it that way.
         | It is the only way you can log fast enough to saturate memory
         | bandwidth or output billions of logs per core-second.
         | 
         | You can see a fairly general explanation of the concept here:
         | https://messagetemplates.org/
        
         | jpalomaki wrote:
         | Side benefit is that you don't need to parse the arbitrary
         | strings to extract information from the logs.
        
         | willvarfar wrote:
         | This all might be true for log generating etc.
         | 
          | But someone who has tried to wrangle gazillion-row dumps in a
          | variety of old msgpack, protobuf, etc. formats and make sense
          | of it all will hate it.
         | 
         | Zipped text formats are infinitely easier to own long term and
         | import into future fancy tools and databases for analysis.
        
         | rcxdude wrote:
         | I've built a logging system like that, in an embedded context,
         | and defmt (https://github.com/knurling-rs/defmt) is an open-
          | source implementation of the same concept. What's most handy
         | about it is that logging continuous sensor data and logging
         | events can both use the same framework.
        
       | ThrowawayTestr wrote:
       | Logs seem like they'd be easily compressible, no?
        
         | efitz wrote:
         | All the LZ variants work well on logs, but I had the best luck
         | so far with zstd. YMMV.
        
         | kikimora wrote:
         | They are, but log analysis vendors charge for the amount of
         | uncompressed logs which quickly amounts to a large bill.
        
       | iampims wrote:
       | Or sampling :)
        
         | craigching wrote:
         | Sampling is lossy though
        
           | iampims wrote:
           | lossy and simpler.
           | 
           | IME, I've found sampling simpler to reason about, and with
           | the sampling rate part of the message, deriving metrics from
           | logs works pretty well.
           | 
           | The example in the article is a little contrived.
           | Healthchecks often originate from multiple hosts and/or logs
           | contain the remote address+port, leading to each log message
           | being effectively unique. So sure, one could parse the remote
           | address into remote_address=192.168.12.23 remote_port=64780
           | and then decide to drop the port in the aggregation, but is
           | it worth the squeeze?
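            | 
            | A minimal sketch of rate-stamped sampling (hypothetical
            | field names): keep 1-in-N, record N in the event, and
            | multiply back out when deriving metrics.
            | 
            |   import random
            | 
            |   SAMPLE_RATE = 100  # keep 1 in 100
            | 
            |   def maybe_log(event, sink):
            |       if random.randrange(SAMPLE_RATE) == 0:
            |           event["sample_rate"] = SAMPLE_RATE
            |           sink.append(event)
            | 
            |   # Each kept event stands in for sample_rate occurrences.
            |   def estimated_count(sink):
            |       return sum(e.get("sample_rate", 1) for e in sink)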
        
             | kiitos wrote:
             | If a service emits a log event, then that log event should
             | be visible in your logging system. Basic stuff. Sampling
             | fails this table-stakes requirement.
        
               | eru wrote:
               | Typically, you store your most recent logs in full, and
               | you can move to sampling for older logs (if you don't
               | want to delete them outright).
        
               | kiitos wrote:
               | It's reasonable to drop logs beyond some window of time
               | -- a year, say -- but I'm not sure why you'd ever sample
               | log events. Metric samples, maybe! Log data, no point.
               | 
               | But, in general, I think we agree -- all good!
        
       | 1oooqooq wrote:
        | always assumed this was a given on all the log aggregator SaaS
        | offerings of the last decade.
        
       | spenczar5 wrote:
       | But... this does drop data? Only the start and end timestamp are
       | preserved; the middle ones have no time. How can this be called
       | lossless?
       | 
       | Genuinely lossless compression algorithms like gzip work pretty
       | well.
        
         | efitz wrote:
         | Was going to point out the same thing - the original article's
         | solution is losing timestamps and possibly ordering. They also
         | are losing some compressibility by converting to a structured
         | format (JSON). And if they actually include a lot of UUIDs
         | (their diagram is vague on what transaction IDs look like),
         | then good luck - those don't compress very well.
         | 
         | I worked at a magnificent 7 company that compressed a lot of
         | logs; we found that zstd actually did the best all-around job
         | back in 2021 after a lot of testing.
        
           | eru wrote:
           | Agreed.
           | 
            | If you use something like sequential IDs (even in some UUID
            | format), they can compress pretty well.
        
             | willvarfar wrote:
             | As a member of the UUIDv7 cheering squad let me say 'rah
             | rah'! :D
        
           | pdimitar wrote:
           | Which compression level of zstd worked best in terms of the
           | ideal balance between compression ratio vs. run time?
        
           | greggyb wrote:
           | We have a process monitor that basically polls ps output and
           | writes it to JSON. We see ~30:1 compression using zstd on a
           | ZFS dataset that stores these logs.
           | 
           | I laugh every time I see it.
        
         | corytheboyd wrote:
          | Exactly my thoughts; the ordering of these events by timestamp
          | is itself necessary for debugging.
         | 
         | If I want something like per-transaction rollup of events into
         | one log message, I build it and use it explicitly.
        
       | eastern wrote:
       | The obvious answer is a relational structure. In the given
       | example, host, status, path and target should be separate
       | relations. They'll all be tiny ones, a few rows each.
       | 
        | Of course, performance etc. is a separate story, but as far as
        | the shape of the data goes, that's what the solution is.
        
         | LAC-Tech wrote:
         | Then you're locking your data into a relational model.
         | 
          | A log is universal and can be projected to relational,
          | document, etc., which is why most relational databases are
          | built on top of them.
        
       | KaiserPro wrote:
        | Shipping and diving into logs is a bad idea for anything other
        | than a last line of debug defence.
        | 
        | If you're going to aggregate your logs, you're much better off
        | converting them to metrics _on device_. It makes comparison much
        | easier, and storage and retention trivial.
        
         | speedgoose wrote:
         | I agree. I much prefer Prometheus and Sentry over logs.
        
           | VBprogrammer wrote:
           | Metrics are useless in my experience for figuring out a
           | problem. For the most part they only tell you that you have a
           | problem. Being able to slice and dice logs that you have
           | faith in is critical in my experience.
        
             | intelVISA wrote:
             | problem_counter: 1
        
               | _kb wrote:
               | days_since_problem compresses better if you have long
               | streams of '0'.
        
             | KaiserPro wrote:
             | Yeah you still need logs, but as a last line.
             | 
              | Metrics tell you when and what went wrong, but unless
              | you've got lots of coverage, they're less likely to tell
              | you _why_.
        
         | _kb wrote:
         | It doesn't need to happen on device, just upstream of storage
         | (and as close to the source as possible to minimise transport
         | overheads). Most of the OTel collectors are good at this, but
         | IMO Grafana Alloy is particularly neat.
         | 
          | This also works when you cannot change the log source (e.g. a
          | third-party component, or even legacy hardware that may be
          | syslog only).
        
         | marklar423 wrote:
         | I think a good compromise is having metrics that are always on,
         | with the ability to enable more verbose debugging logs as
         | needed.
        
       | vdm wrote:
       | https://github.com/y-scope/clp
        
       | koliber wrote:
       | I'm wondering how this would compare to compressing the logs with
        | Brotli and a custom dictionary.
       | 
       | I would imagine the size reduction would be much better than the
       | 40% Kevin showed with his approach.
       | 
       | As for volume, I don't know if volume is a problem. With logs I
        | either need to look at one log entry to get its details, a small
       | handful to see a pattern, or aggregate statistics.
       | 
        | I can do all of those things whether the volume is 1x, 100x, or
       | 100000x.
       | 
       | In other words, size matters to me, but I don't care about
       | volume.
       | 
       | On the other hand, for cases when we use tools that charge for
       | uncompressed size or message count, then Kevin's approach could
       | be a useful cost-saving measure.
        
         | ericyd wrote:
          | I interpreted the article's argument as saying that volume can
         | be a major business cost, not that it creates operational
         | challenges.
        
       | ajuc wrote:
        | My problem with this is handling malformed logs (which are
        | exactly the logs you are most interested in).
        | 
        | If you process logs in a smart way, you have to make assumptions
        | about them. When those assumptions are broken, data might go
        | missing (or you can lose the correlation between logs that are
        | close together).
       | 
       | That's why I prefer raw unprocessed logs.
       | 
       | For example let's imagine you have a bug with stale transaction
       | id in the first example. You probably won't notice when it's
       | grouped by transactionId. If it's raw logs you might notice the
       | order of the logs is wrong.
       | 
        | Or maybe the grouping key is missing and you throw an NPE and
        | crash the app, or skip that line of logs. I've seen both happen
        | when I got too fancy with logging (we had an async logger that
        | logged by inserting into a db).
        
       | ericyd wrote:
       | Many commenters seem unenthusiastic about this approach, but it
       | seems like it solves some useful problems for my typical use
       | cases.
       | 
       | Regarding dropped timestamps: if log order is preserved within an
       | aggregation, that would be sufficient for me. I'm less concerned
       | with millisecond precision and more interested in sequence. I'm
       | making a big assumption that aggregation would be limited to
       | relatively short time horizons on the order of seconds, not
       | minutes.
       | 
       | Grouping related logs sounds like a big win to me in general.
       | Certainly I can perform the same kind of filtering manually in a
       | log viewer, but having logs grouped by common properties by
       | default sounds like it would make logs easier to scan.
       | 
       | I've never been involved in billing for logging services so I
       | can't comment on the efficiency gains of log aggregation vs zstd
       | or gzip as proposed by some other commenters.
        
       | corytheboyd wrote:
        | You know what would actually be killer for saving on log data?
        | Being forced to use an efficient format. Instead of serializing
        | string literals to text and sending all those bytes on each log,
        | require that log message templates be registered in a schema;
        | then a byte or two can replace the text part of the message.
       | 
        | Log message template parameters would have to be fully
        | represented in the message; it would be way too much work to
        | register the arbitrary values that get thrown in there.
       | 
        | The next logical step is to serialize to something more
        | efficient than JSON-- should be dead simple, it's a template
        | followed by N values to sub into the template. Could be a
        | proprietary format, or just use something like protobuf.
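        | 
        | As a hypothetical sketch (two-byte template ids registered at
        | build time; params ride along as JSON only for simplicity):
        | 
        |   import json, struct, time
        | 
        |   # Registered once, shipped as a schema with the binary.
        |   SCHEMA = {
        |       0x0001: "cache miss for key {key}",
        |       0x0002: "retrying {op}, attempt {n}",
        |   }
        | 
        |   def emit(tid, **params):
        |       # timestamp + template id + params; the template
        |       # text itself never goes over the wire.
        |       blob = json.dumps(params).encode()
        |       return struct.pack(
        |           "<dHH", time.time(), tid, len(blob)) + blob
        | 
        |   def render(rec):
        |       ts, tid, n = struct.unpack_from("<dHH", rec, 0)
        |       params = json.loads(rec[12:12 + n])
        |       return f"{ts:.3f} " + SCHEMA[tid].format(**params)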
       | 
       | It's better than compression, because the data that was being
       | compressed (full text of log message template) is just not even
       | present. You could still see gains from compressing the entire
       | message, in case it has text template values that would benefit
       | from it.
       | 
       | I get it, we lost human readability, which may be too big of a
       | compromise for some, but we accomplish the main goal of "make
       | logs smaller" without losing data (individually timestamped
       | events). Besides, this could be made up for with a really nice
       | log viewer client.
       | 
       | I'm sure this all exists already to some degree and I just look
       | dumb, but look dumb I will.
        
         | onionisafruit wrote:
         | Don't worry about human readability. When you have an issue
         | with log size, you are already logging more than a human can
         | read.
        
           | corytheboyd wrote:
           | Agreed. At that point you need specialized tools anyway.
        
           | nepthar wrote:
           | I think this is a really good point. A logging system could
           | theoretically toggle "text" mode on and off, giving human
           | readable logs in development and small scale deployments.
           | 
           | In fact, I'm going to build a toy one in python!
        
             | Izkata wrote:
             | > In fact, I'm going to build a toy one in python!
             | 
             | I suggest building it as a normal python logging handler
             | instead of totally custom, that way you don't need a "text"
             | toggle and it can be used without changing any existing
             | standard python logging code. Only requires one tweak to
             | the idea: Rather than a template table at the start of the
             | file, have two types of log entries and add each template
             | the first time it's used.
             | 
             | Drawback is having to parse the whole file to find all the
             | templates, but you could also do something like putting the
             | templates into a separate file to avoid that...
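              | 
              | A rough cut of that handler (toy code; assumes JSON-able
              | args): templates get an id on first use and are written
              | inline as a "T" record, and later uses write only the id.
              | 
              |   import json, logging
              | 
              |   class TemplateHandler(logging.Handler):
              |       def __init__(self, path):
              |           super().__init__()
              |           self.f = open(path, "a")
              |           self.ids = {}  # template -> int id
              | 
              |       def emit(self, record):
              |           tmpl = record.msg  # unformatted template
              |           tid = self.ids.get(tmpl)
              |           if tid is None:
              |               tid = self.ids[tmpl] = len(self.ids)
              |               self.f.write(json.dumps(
              |                   ["T", tid, tmpl]) + "\n")
              |           self.f.write(json.dumps(
              |               ["E", tid, record.created,
              |                record.args]) + "\n")
              | 
              |   logging.getLogger().addHandler(
              |       TemplateHandler("app.tlog"))
              |   logging.warning("user %s hit rate limit", "alice")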
        
         | packetlost wrote:
         | I started developing a tracing/span library that does just
         | this: log messages are "global" (to a system/org) hierarchical
         | "paths" + timestamp + a tagged union. The tagged union method
         | allows you to have zero or more internal parameters that can be
         | injected into a printf (or similar style) format string when
         | printing, but the message itself is only a few bytes.
         | 
            | The benefit of this approach is that it's dramatically easier
            | to index and cheaper to store at any scale.
         | 
            | One thing I think people don't appreciate about logging
            | efficiency is that it enables you to log and store more, and
            | many don't realize how _much_ even modest amounts of text
            | logs can bog down systems. You can't read anything, but
            | filters are easy and powerful, and you can't filter something
            | that doesn't exist.
        
           | corytheboyd wrote:
           | Another thing people won't appreciate is ANY amount of
           | friction when they "just want to log something real quick".
            | Which has merit: you're debugging some garbage and need to
            | log something out in production because it's dumb, harmless,
            | quick, and will tell you exactly what you need. That's why I
           | think you need a sort of fallback as well, for something like
           | this to capture enough mindshare.
           | 
           | How did your solution work out in terms of adoption by
           | others? Was it a large team using it? What did those people
           | say? Really curious!
        
             | packetlost wrote:
             | It doesn't really replace something like print-line
             | debugging, but the type of system that benefits/can use
             | print-line debugging would see no benefit from a structured
             | logging approach either. The systems I'm targeting are
             | producing logs that get fed into multi-petabyte
             | Elasticsearch clusters.
             | 
              | To answer your question: the prototype was never finished,
              | but the concepts were adapted into a production version
              | that is used for structured events in a semi-embedded
              | system at my work.
        
           | lokar wrote:
           | There are logging libraries that do this. The text template
           | is logged alongside a binary encoding of the arguments. It
           | saves both space and cpu.
        
             | packetlost wrote:
             | Yup, I'm aware. My focus was more on scaling it out to
             | large aggregation systems.
        
         | CodesInChaos wrote:
         | > should be dead simple, it's a template followed by N values
         | to sub into the template,
         | 
         | CSV without fixed columns would be fine for that.
         | 
         | > require that log message templates be registered in a schema,
         | and then a byte or two can replace the text part of the
         | message.
         | 
         | Pre-registering is annoying to handle, and compression already
         | de-duplicates these very well. Alternatively the logger can
         | track every template logged in this file so far, and assign it
         | an integer on the fly.
        
       ___________________________________________________________________
       (page generated 2024-12-06 23:02 UTC)