[HN Gopher] Lossless Log Aggregation - Reduce Log Volume by 99% ...
___________________________________________________________________
Lossless Log Aggregation - Reduce Log Volume by 99% Without
Dropping Data
Author : benshumaker
Score : 113 points
Date : 2024-12-06 00:31 UTC (22 hours ago)
(HTM) web link (bit.kevinslin.com)
(TXT) w3m dump (bit.kevinslin.com)
| aaaronic wrote:
| The multi-line case can usually be fixed with simple
| configuration changes to a structured log format.
|
| The other cases are more interesting, and pre-aggregation of all
| logs related to a correlation ID can be really helpful when
| debugging a specific incident, but it does seem like this
| proposal is the same basic trade-off around size and performance
| as with virtually any form of compression.
| XorNot wrote:
| I have a persistent mental nag that the way we do logging always
| seems so inefficient: that theoretically no application should
| ever be outputting actual text messages, because logs are
| basically fixed strings + formatting data.
|
| So in theory we could uniquely identify all logs as a much more
| compact binary representation + a lookup table we ship with the
| executable.
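|
| A minimal sketch of that idea in Python (names and wire format
| are purely illustrative, not anyone's actual implementation):
|
|     import ast, struct, time
|
|     # Lookup table shipped alongside the executable.
|     TEMPLATES = {
|         1: "user {} logged in from {}",
|         2: "request {} took {} ms",
|     }
|
|     def emit(template_id, *args):
|         # Wire format: timestamp + template id + argument blob;
|         # the fixed string itself never leaves the process.
|         blob = repr(args).encode()
|         return struct.pack("<dHH", time.time(), template_id,
|                            len(blob)) + blob
|
|     def render(record):
|         ts, tid, n = struct.unpack_from("<dHH", record)
|         args = ast.literal_eval(record[12:12 + n].decode())
|         return f"{ts:.3f} " + TEMPLATES[tid].format(*args)
|
|     rec = emit(1, "alice", "10.0.0.1")
|     print(len(rec), render(rec))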
| RHSeeger wrote:
| It depends on your use case. If you're looking to have a set of
| data that you can search through, sure. If you're looking to
| tail the log while working on things, having it output in plain
| text is very handy.
| drivebyhooting wrote:
| Compression will basically achieve that kind of encoding
| without tightly coupling your logging with an external enum or
| schema.
| vouwfietsman wrote:
| Compression is only ever as good as the information it is
| given; if given text, it can do an OK job using general
| entropy encoding. However, most data is not text, even log
| data is not text. Log data is timestamps, numbers, enums,
| indices, guids, text, etc. Each of those categories, combined
| with how they are laid out, can be compressed in different
| optimal ways, resulting in different optimal compression
| ratios based on your specific log format. As a simple
| example, compressing 3D meshes as meshes is an order of
| magnitude better than compressing 3D meshes as text.
|
| Sure, zipping your logs gives you a LOT, and you should do
| it, but if the result disappoints it is not the end of the
| road, at all.
| jumping_frog wrote:
| Isn't compression the best general purpose solution with
| theoretical guarantees? I mean, simple Huffman coding could
| easily pick up the key-value relationships (where the repeated
| keys are natural candidates for compression) and compress them.
| But if you want to extract more juice, that implies knowing the
| datatypes of the strings themselves. It would be like having
| types for logs, and not messing those up.
| floren wrote:
| How big of a lookup table will you use for logging IPv6
| addresses?
| do_not_redeem wrote:
| > fixed strings + formatting data
|
| IP addresses are obviously part of the formatting data, not
| the fixed strings.
| efitz wrote:
| Why wouldn't IPv6 compress well?
|
| IPv6 is used internally a lot more than externally, so I
| would expect to see a LOT of commonality in the network ID
| fraction of the addresses - essentially all the bits
| representing your IPv6 network ID get reduced to the number
| of bits in a compression token, in the worst case. In the
| moderate case, you get a few chatty machines (DNS servers
| and the like) where the whole address is converted to a
| single compression token. In the best case, you get that
| AND a lot of repetition in the rest of the message, and you
| reduce most of each message to a single compression token.
|
| It's hard to explain if you haven't actually experimented
| with it, but modern variants of LZ compression are
| miraculous. It's like compilers- your intuition tells you
| hand tuned assembly is better, but compilers know crazy
| tricks and that intuition is almost always wrong. Same with
| compressors- they don't look at data the same way you do,
| and they work way better than your intuition thinks they
| would.
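|
| A quick way to see it, with zlib from the Python stdlib standing
| in for a modern LZ compressor (numbers are only indicative):
|
|     import random, zlib
|
|     # 10,000 log lines sharing one IPv6 network prefix; only the
|     # host part and a sequence number vary.
|     prefix = "2001:db8:abcd:12"
|     lines = "".join(
|         "accepted connection from %s::%x seq=%d\n"
|         % (prefix, random.randrange(1 << 16), i)
|         for i in range(10_000)
|     ).encode()
|     packed = zlib.compress(lines, 9)
|     print(len(lines), len(packed))
|     # The repeated prefix and message text collapse into
|     # back-references, so the ratio is far better than intuition
|     # about "random-looking" addresses suggests.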
| muststopmyths wrote:
| Microsoft had a poorly documented tool called tracewpp that did
| this. Blindingly fast logging with very little runtime
| overhead.
|
| It was hard to figure out how to use it without documentation
| so it wasn't very popular. No idea if they still ship it in the
| DDK.
|
| It was a preprocessor that converted logging macros into string
| table references so there was no runtime formatting. You
| decoded the binary logs with another tool after the fact.
|
| Vaguely remember some open source cross platform tool that did
| something similar but the name escapes me now.
| comex wrote:
| Apple's os_log API also works this way.
| koolba wrote:
| > It was a preprocessor that converted logging macros into
| string table references so there was no runtime formatting.
| You decoded the binary logs with another tool after the fact.
|
| I'd be curious how that stacks up against something like zstd
| with a predefined dictionary.
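|
| For what it's worth, the python-zstandard bindings expose exactly
| that: you can train a dictionary on sample messages (the corpus
| below is made up, and training wants a reasonably large sample):
|
|     import zstandard as zstd  # pip install zstandard
|
|     samples = [
|         ('{"ts":"2024-12-06T00:31:00Z","level":"info",'
|          '"msg":"GET /api/user/%d returned 200 in %d ms"}'
|          % (i, i % 50)).encode()
|         for i in range(5000)
|     ]
|     dictionary = zstd.train_dictionary(4096, samples)
|     cctx = zstd.ZstdCompressor(dict_data=dictionary)
|     msg = samples[42]
|     print(len(msg), len(cctx.compress(msg)))
|     # With a shared dictionary even tiny individual messages
|     # shrink; without one, per-message compression barely helps.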
| HdS84 wrote:
| I think the Windows event log works like this. Sadly it's
| very opaque and difficult to use for non-admin apps (you need
| admin rights to register your log source the first time;
| afterwards you can run with fewer privileges).
| muststopmyths wrote:
| If you're thinking of ETW (event tracing for Windows) and
| not the actual Windows EventLog itself, then you're right.
| traceWPP used ETW under the hood to record logging as ETW
| events in a file.
| masfuerte wrote:
| The Windows Event Log also used (uses?) this idea of pre-
| defined messages and you just supplied an event ID and
| data to fill in the blanks in the message.
|
| Originally there was only one system-wide application
| event log and you needed to be admin to install your
| message definitions but it all changed in Vista (IIRC).
| I'd lost interest by then so I don't know how it works
| now. I do know that the event log viewer is orders of
| magnitude slower than it was before the refit.
| dietr1ch wrote:
| Besides space, you can also save CPU when writing logs.
|
| Here's a logging library that does lazy formatting:
| https://defmt.ferrous-systems.com/
| jiggawatts wrote:
| Your feelings are spot on.
|
| In most modern distributed tracing, "observability", or similar
| systems the _write amplification_ is typically 100:1 because of
| these overheads.
|
| For example, in Azure, every log entry includes a bunch of
| highly repetitive fields in full, such as the resource ID,
| "Azure" as the source system, the log entry type, the tenant,
| etc...
|
| A single "line" is typically over a kilobyte, but often the
| interesting part is maybe 4 to 20 bytes of actual payload data.
| Sending this involves HTTP overheads as well such as the
| headers, authentication, etc...
|
| Most vendors in this space _charge by the gigabyte_ , so as you
| can imagine they have zero incentive to improve on this.
|
| Even for efficient binary logs such as the Windows performance
| counters, I noticed that second-to-second they're very highly
| redundant.
|
| I once experimented with a metric monitor that could collect
| 10,000-15,000 metrics _per server per second_ and use only
| about 100MB of storage per host... _per year_.
|
| The trick was to simply binary-diff the collected metrics with
| some light "alignment" so that groups of related metrics would
| be at the same offsets. Almost all numbers become zero, and
| compress very well.
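|
| The core of that trick fits in a few lines (Python, purely
| illustrative; the real collector worked on raw counter blocks):
|
|     import struct, zlib
|
|     def snapshot(values):
|         # Fixed layout: the same metric always sits at the same
|         # offset, which is the "alignment" part.
|         return struct.pack("<%dq" % len(values), *values)
|
|     def delta(prev, curr):
|         # XOR against the previous snapshot; unchanged counters
|         # become runs of zero bytes.
|         return bytes(a ^ b for a, b in zip(prev, curr))
|
|     vals = [100, 2048, 0, 7] * 2500        # 10,000 metrics
|     prev = snapshot(vals)
|     vals[17] += 3                          # one metric moved
|     curr = snapshot(vals)
|     print(len(curr), len(zlib.compress(delta(prev, curr))))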
| kiitos wrote:
| You never send a single individual log event per HTTP
| request, you always batch them up. Assuming some reasonable
| batch size per request (minimum ~1MiB or so) there is rarely
| any meaningful difference in payload size between
| gzipped/zstd/whatever JSON bytes, and any particular binary
| encoding format you might prefer.
| jiggawatts wrote:
| Most log collection systems do not compress logs as they
| send them, because again, why would they? This would
| instantly turn their firehose of revenue cash down to a
| trickle. Any engineer suggesting such a feature would be
| disciplined at best, fired at worst. Even if their boss is
| naive to the business realities and approves the idea, it
| turns out that it's weirdly difficult in HTTP to send
| compressed _requests_. See:
| https://medium.com/@abhinav.ittekot/why-http-request-
| compres...
|
| HTTP/2 would also improve efficiency because of its built-
| in header compression feature, but again, I've not seen
| this used much.
|
| The ideal would be to have some sort of "session" cookie
| associated with a bag of constants, slowly changing values,
| and the schema for the source tables. Send this once a day
| or so, and then send only the cookie followed by columnar
| data compressed with RLE and then zstd. Ideally in a format
| where the server doesn't have to apply any processing to
| store the data apart from some light verification and
| appending onto existing blobs. I.e.: make the whole thing
| compatible with Parquet, Avro, or _something_ other than
| just sending uncompressed JSON like a savage.
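|
| Roughly what the client side of that could look like (Python,
| with zlib standing in for zstd and JSON for a real columnar
| format; all field names are invented):
|
|     import json, zlib
|     from itertools import groupby
|
|     # Sent rarely: constants shared by every record from this
|     # source -- the "bag of constants" behind the session cookie.
|     session = {"resource_id": "/subs/123/vm/web-01",
|                "source": "Azure", "tenant": "contoso"}
|
|     def rle(column):
|         return [[v, sum(1 for _ in run)]
|                 for v, run in groupby(column)]
|
|     def encode_batch(records):
|         # Pivot to columns, run-length encode, then compress.
|         cols = {k: [r[k] for r in records] for k in records[0]}
|         body = {k: rle(col) for k, col in cols.items()}
|         return zlib.compress(json.dumps(body).encode())
|
|     records = [{"type": "HealthProbe", "status": 200,
|                 "latency_ms": i // 1000} for i in range(10_000)]
|     print(len(json.dumps(records)), len(encode_batch(records)))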
| kiitos wrote:
| Most systems _do_ compress request payloads on the wire,
| because the cost-per-byte in transit over those wires is
| almost always frictional and externalized.
|
| Weird perspective, yours.
| piterrro wrote:
| They will compress over the wire, but then decompress on
| ingest and bill for the uncompressed size. After that, an
| interesting thing happens: they compress the data again, along
| with other clever techniques, to minimize its size on their
| premises. Can't blame them... they're just trying to cut
| costs, but the fact that they charge so much for something
| that is so easily compressible is just... not fair.
| david38 wrote:
| This is why metrics rule: logging in production need only be
| turned on to debug specific problems, and even then with a
| short TTL.
| jiggawatts wrote:
| You got... entirely the wrong message.
|
| The answer to "this thing is horrendously inefficient
| because of misaligned incentives" isn't to be frugal with
| the thing, but to _make it efficient_ , ideally by aligning
| incentives.
|
| Open source monitoring software will eventually blow the
| proprietary products out of the water because when you're
| running something yourself, the cost per gigabyte is now
| just your own cost and not a _profit centre line item_ for
| someone else.
| piterrro wrote:
| Unless you start attaching tags to metrics and allow
| engineers to explode cardinality of the metrics. Then your
| pockets need to be deep.
| david38 wrote:
| This is what metrics are for
| Veserv wrote:
| Yep, every actually efficient logging system does it that way.
| It is the only way you can log fast enough to saturate memory
| bandwidth or output billions of logs per core-second.
|
| You can see a fairly general explanation of the concept here:
| https://messagetemplates.org/
| jpalomaki wrote:
| Side benefit is that you don't need to parse the arbitrary
| strings to extract information from the logs.
| willvarfar wrote:
| This all might be true for log generation, etc.
|
| But anyone who has tried to wrangle gazillion-row dumps in a
| variety of old msgpack, protobuf, etc. formats and make sense
| of it all will hate it.
|
| Zipped text formats are infinitely easier to own long term and
| import into future fancy tools and databases for analysis.
| rcxdude wrote:
| I've built a logging system like that, in an embedded context,
| and defmt (https://github.com/knurling-rs/defmt) is an open-
| source implementation of the same concept. What's most handy
| about it is that logging continuous sensor data and logging
| events can both use the same framework.
| ThrowawayTestr wrote:
| Logs seem like they'd be easily compressible, no?
| efitz wrote:
| All the LZ variants work well on logs, but I had the best luck
| so far with zstd. YMMV.
| kikimora wrote:
| They are, but log analysis vendors charge for the amount of
| uncompressed logs which quickly amounts to a large bill.
| iampims wrote:
| Or sampling :)
| craigching wrote:
| Sampling is lossy though
| iampims wrote:
| lossy and simpler.
|
| IME, I've found sampling simpler to reason about, and with
| the sampling rate part of the message, deriving metrics from
| logs works pretty well.
|
| The example in the article is a little contrived.
| Healthchecks often originate from multiple hosts and/or logs
| contain the remote address+port, leading to each log message
| being effectively unique. So sure, one could parse the remote
| address into remote_address=192.168.12.23 remote_port=64780
| and then decide to drop the port in the aggregation, but is
| it worth the squeeze?
| kiitos wrote:
| If a service emits a log event, then that log event should
| be visible in your logging system. Basic stuff. Sampling
| fails this table-stakes requirement.
| eru wrote:
| Typically, you store your most recent logs in full, and
| you can move to sampling for older logs (if you don't
| want to delete them outright).
| kiitos wrote:
| It's reasonable to drop logs beyond some window of time
| -- a year, say -- but I'm not sure why you'd ever sample
| log events. Metric samples, maybe! Log data, no point.
|
| But, in general, I think we agree -- all good!
| 1oooqooq wrote:
| Always assumed this was a given with all the log aggregator SaaS
| products of the last decade.
| spenczar5 wrote:
| But... this does drop data? Only the start and end timestamp are
| preserved; the middle ones have no time. How can this be called
| lossless?
|
| Genuinely lossless compression algorithms like gzip work pretty
| well.
| efitz wrote:
| Was going to point out the same thing - the original article's
| solution is losing timestamps and possibly ordering. They also
| are losing some compressibility by converting to a structured
| format (JSON). And if they actually include a lot of UUIDs
| (their diagram is vague on what transaction IDs look like),
| then good luck - those don't compress very well.
|
| I worked at a magnificent 7 company that compressed a lot of
| logs; we found that zstd actually did the best all-around job
| back in 2021 after a lot of testing.
| eru wrote:
| Agreed.
|
| If you use something like sequential IDs (even in some UUID
| format), they can compress pretty well.
| willvarfar wrote:
| As a member of the UUIDv7 cheering squad let me say 'rah
| rah'! :D
| pdimitar wrote:
| Which compression level of zstd worked best in terms of the
| ideal balance between compression ratio vs. run time?
| greggyb wrote:
| We have a process monitor that basically polls ps output and
| writes it to JSON. We see ~30:1 compression using zstd on a
| ZFS dataset that stores these logs.
|
| I laugh every time I see it.
| corytheboyd wrote:
| Exactly my thoughts; the order of these events by timestamp is
| itself necessary for debugging.
|
| If I want something like per-transaction rollup of events into
| one log message, I build it and use it explicitly.
| eastern wrote:
| The obvious answer is a relational structure. In the given
| example, host, status, path and target should be separate
| relations. They'll all be tiny ones, a few rows each.
|
| Of course, performance etc. is a separate story, but as far as
| the shape of the data goes, that's what the solution is.
| LAC-Tech wrote:
| Then you're locking your data into a relational model.
|
| A log is universal, and can be projected to relational,
| document etc. Which is why most relational databases are built
| on top of them.
| KaiserPro wrote:
| Shipping and diving into logs is a bad idea for anything other
| than a last line of defence for debugging.
|
| If you're going to aggregate your logs, you're much better off
| converting them to metrics _on device_. It makes comparison much
| easier, and storage and retention trivial.
| speedgoose wrote:
| I agree. I much prefer Prometheus and Sentry over logs.
| VBprogrammer wrote:
| Metrics are useless in my experience for figuring out a
| problem. For the most part they only tell you that you have a
| problem. Being able to slice and dice logs that you have
| faith in is critical in my experience.
| intelVISA wrote:
| problem_counter: 1
| _kb wrote:
| days_since_problem compresses better if you have long
| streams of '0'.
| KaiserPro wrote:
| Yeah you still need logs, but as a last line.
|
| Metrics tell you when and what went wrong, but unless
| you've got lots of coverage, they're less likely to tell you
| _why_
| _kb wrote:
| It doesn't need to happen on device, just upstream of storage
| (and as close to the source as possible to minimise transport
| overheads). Most of the OTel collectors are good at this, but
| IMO Grafana Alloy is particularly neat.
|
| This works for when you cannot change the log source too (e.g.
| third party component or even legacy hardware that may be
| syslog only).
| marklar423 wrote:
| I think a good compromise is having metrics that are always on,
| with the ability to enable more verbose debugging logs as
| needed.
| vdm wrote:
| https://github.com/y-scope/clp
| koliber wrote:
| I'm wondering how this would compare to compressing the logs with
| Brotli and a custom dictionary.
|
| I would imagine the size reduction would be much better than the
| 40% Kevin showed with his approach.
|
| As for volume, I don't know if volume is a problem. With logs I
| either need to look at one log entry to get its details, a small
| handful to see a pattern, or aggregate statistics.
|
| I can do all of those things whether the volume is 1x, 100x, or
| 100000x.
|
| In other words, size matters to me, but I don't care about
| volume.
|
| On the other hand, for cases when we use tools that charge for
| uncompressed size or message count, then Kevin's approach could
| be a useful cost-saving measure.
| ericyd wrote:
| I interpreted the article's argument as saying that volume can
| be a major business cost, not that it creates operational
| challenges.
| ajuc wrote:
| My problem with this is handling malformed logs (which are
| exactly the logs you're most interested in).
|
| If you process logs in a smart way you have to have assumptions
| about them. When the assumptions are broken, the data might go
| missing (or you can lose the correlation between logs that are
| close together).
|
| That's why I prefer raw unprocessed logs.
|
| For example let's imagine you have a bug with stale transaction
| id in the first example. You probably won't notice when it's
| grouped by transactionId. If it's raw logs you might notice the
| order of the logs is wrong.
|
| Or maybe the grouping key is missing and you throw an NPE and
| crash the app or skip that line of logs. I've seen both happen
| when I got too fancy with logging (we had an async logger that
| logged by inserting into a db).
| ericyd wrote:
| Many commenters seem unenthusiastic about this approach, but it
| seems like it solves some useful problems for my typical use
| cases.
|
| Regarding dropped timestamps: if log order is preserved within an
| aggregation, that would be sufficient for me. I'm less concerned
| with millisecond precision and more interested in sequence. I'm
| making a big assumption that aggregation would be limited to
| relatively short time horizons on the order of seconds, not
| minutes.
|
| Grouping related logs sounds like a big win to me in general.
| Certainly I can perform the same kind of filtering manually in a
| log viewer, but having logs grouped by common properties by
| default sounds like it would make logs easier to scan.
|
| I've never been involved in billing for logging services so I
| can't comment on the efficiency gains of log aggregation vs zstd
| or gzip as proposed by some other commenters.
| corytheboyd wrote:
| You know what would actually be killer for saving on log data? Being
| forced to use an efficient format. Instead of serializing string
| literals to text and sending all those bytes on each log, require
| that log message templates be registered in a schema, and then a
| byte or two can replace the text part of the message.
|
| Log message template parameters would have to be fully
| represented in the message, it would be way too much work to
| register the arbitrary values that get thrown in there.
|
| Next logical step is to serialize to something more efficient
| than JSON-- should be dead simple, it's a template followed by N
| values to sub into the template, could be a proprietary format,
| or just use something like protobuf.
|
| It's better than compression, because the data that was being
| compressed (full text of log message template) is just not even
| present. You could still see gains from compressing the entire
| message, in case it has text template values that would benefit
| from it.
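|
| Roughly what that wire format could look like (Python's struct
| as a stand-in for protobuf; the template id, fields, and sizes
| here are made up):
|
|     import json, struct
|
|     # Registered once in a schema shared by producer and consumer.
|     SCHEMA = {7: ("order {} shipped to warehouse {} in {} ms",
|                   "<IHd")}
|
|     def encode(template_id, *args):
|         fmt = SCHEMA[template_id][1]
|         return (struct.pack("<H", template_id)
|                 + struct.pack(fmt, *args))
|
|     binary = encode(7, 123456, 44, 17.5)
|     text = json.dumps(
|         {"msg": "order 123456 shipped to warehouse 44 in 17.5 ms"})
|     print(len(binary), len(text))  # ~16 bytes vs ~60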
|
| I get it, we lost human readability, which may be too big of a
| compromise for some, but we accomplish the main goal of "make
| logs smaller" without losing data (individually timestamped
| events). Besides, this could be made up for with a really nice
| log viewer client.
|
| I'm sure this all exists already to some degree and I just look
| dumb, but look dumb I will.
| onionisafruit wrote:
| Don't worry about human readability. When you have an issue
| with log size, you are already logging more than a human can
| read.
| corytheboyd wrote:
| Agreed. At that point you need specialized tools anyway.
| nepthar wrote:
| I think this is a really good point. A logging system could
| theoretically toggle "text" mode on and off, giving human
| readable logs in development and small scale deployments.
|
| In fact, I'm going to build a toy one in python!
| Izkata wrote:
| > In fact, I'm going to build a toy one in python!
|
| I suggest building it as a normal python logging handler
| instead of totally custom, that way you don't need a "text"
| toggle and it can be used without changing any existing
| standard python logging code. Only requires one tweak to
| the idea: Rather than a template table at the start of the
| file, have two types of log entries and add each template
| the first time it's used.
|
| Drawback is having to parse the whole file to find all the
| templates, but you could also do something like putting the
| templates into a separate file to avoid that...
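|
| Something along these lines, as a rough untested sketch (it
| leans on record.msg being the unformatted template and
| record.args being the values, which standard %-style logging
| gives you):
|
|     import json, logging, sys
|
|     class TemplateHandler(logging.Handler):
|         # Writes a "T" entry the first time a template is seen,
|         # then only compact "E" entries referencing it by id.
|         def __init__(self, stream):
|             super().__init__()
|             self.stream, self.templates = stream, {}
|         def emit(self, record):
|             template = record.msg       # unformatted template
|             if template not in self.templates:
|                 tid = self.templates[template] = len(self.templates)
|                 self.stream.write(
|                     json.dumps(["T", tid, template]) + "\n")
|             self.stream.write(json.dumps(
|                 ["E", self.templates[template], record.created,
|                  list(record.args or ())]) + "\n")
|
|     log = logging.getLogger("demo")
|     log.propagate = False
|     log.addHandler(TemplateHandler(sys.stdout))
|     log.warning("user %s logged in after %d ms", "alice", 42)
|     log.warning("user %s logged in after %d ms", "bob", 17)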
| packetlost wrote:
| I started developing a tracing/span library that does just
| this: log messages are "global" (to a system/org) hierarchical
| "paths" + timestamp + a tagged union. The tagged union method
| allows you to have zero or more internal parameters that can be
| injected into a printf (or similar style) format string when
| printing, but the message itself is only a few bytes.
|
| The benefit of this approach is that it's dramatically easier
| to index and cheaper to store at any scale.
|
| One thing I think people don't appreciate about logging
| efficiency is that it enables you to log and store more; many
| also don't appreciate how _much_ even modest amounts of text
| logs can bog down systems. You can't read it all, but filters
| are easy and powerful, and you can't filter something that
| doesn't exist.
| corytheboyd wrote:
| Another thing people won't appreciate is ANY amount of
| friction when they "just want to log something real quick".
| Which has merit, you're debugging some garbage, and need to
| log something out in production because it's dumb, harmless,
| quick, and will tell you exactly what you need. That's why I
| think you need a sort of fallback as well, for something like
| this to capture enough mindshare.
|
| How did your solution work out in terms of adoption by
| others? Was it a large team using it? What did those people
| say? Really curious!
| packetlost wrote:
| It doesn't really replace something like print-line
| debugging, but the type of system that benefits/can use
| print-line debugging would see no benefit from a structured
| logging approach either. The systems I'm targeting are
| producing logs that get fed into multi-petabyte
| Elasticsearch clusters.
|
| To answer your question: the prototype was never finished,
| but the concepts were adapted into a production version that
| is used for structured events in a semi-embedded system at
| my work.
| lokar wrote:
| There are logging libraries that do this. The text template
| is logged alongside a binary encoding of the arguments. It
| saves both space and cpu.
| packetlost wrote:
| Yup, I'm aware. My focus was more on scaling it out to
| large aggregation systems.
| CodesInChaos wrote:
| > should be dead simple, it's a template followed by N values
| to sub into the template,
|
| CSV without fixed columns would be fine for that.
|
| > require that log message templates be registered in a schema,
| and then a byte or two can replace the text part of the
| message.
|
| Pre-registering is annoying to handle, and compression already
| de-duplicates these very well. Alternatively the logger can
| track every template logged in this file so far, and assign it
| an integer on the fly.
___________________________________________________________________
(page generated 2024-12-06 23:02 UTC)