The Art of Logging

Creating a Human and Machine Friendly Logging Format.

Jaouher Kharrat · Sep 17 · 6 min read

Historically, logs have been essential for troubleshooting application and infrastructure performance. Nowadays, they also feed business dashboards and performance analysis. The importance of structuring the data in those log files, so that it can be extracted, manipulated, and analyzed efficiently while remaining understandable by humans, is quickly moving up the priority list. The rise of (micro)services also gave birth to another challenge: tracing the propagation of a request throughout the system.

[Photo by Viktor Talashuk on Unsplash]

In this article, we will identify an optimal format for structuring our logs that is easy for both humans and machines to parse and understand. Next, we will highlight the key information to log, along with a proposed data structure. Finally, we will provide some important notes to keep in mind for your own projects.

Why should logs be human readable?

Although logs are originally meant to be parsed, processed, and stored by machines, they are actively read, understood, and diagnosed by humans. Logs are our best evidence when investigating the murder scene left by our arch enemy: The Bug!

[Photo by Elisa Ventur on Unsplash]

Nothing is more frustrating and time-consuming than trying to grasp the information buried in a long, unstructured log line. It is imperative to have meaningful logs that a person can easily understand and dig into when the content is relevant to them.

66.249.65.159 - - [06/Nov/2014:19:10:38 +0000] "GET /news/53f8d72920ba2744fe873ebc.html HTTP/1.1" 404 177 "-" "Debian APT-HTTP/1.3 (0.8.16~exp12ubuntu10.16)"

Although we are used to the default Nginx format, the example above is still hard to read and process, and it is even harder when it is part of a huge log file extracted to reproduce a bug in production. This is where the advantages of JSON over other data-exchange formats, such as XML, become very clear: it is simple for humans to read, write, and understand. How? Its structure is a simple syntax of key-value pairs, which can be nested in objects and ordered in arrays.

So what does a log message written in JSON look like? The following is the same Nginx example formatted as JSON:

{
    "time": "06/Nov/2014:19:10:38 +0000",
    "remote_ip": "66.249.65.159",
    "remote_user": "-",
    "request": "GET /news/53f8d72920ba2744fe873ebc.html HTTP/1.1",
    "response": 404,
    "bytes": 177,
    "referrer": "-",
    "agent": "Debian APT-HTTP/1.3 (0.8.16~exp12ubuntu10.16)"
}

Why should logs be machine friendly?

Let's consider the same log line again:

66.249.65.159 - - [06/Nov/2014:19:10:38 +0000] "GET /news/53f8d72920ba2744fe873ebc.html HTTP/1.1" 404 177 "-" "Debian APT-HTTP/1.3 (0.8.16~exp12ubuntu10.16)"

In order to make sense of it, we need to:

* decipher the syntax,
* write logic to parse the messages and extract the data we need.

Unfortunately, that logic is fragile. If anything in the log format changes, say a developer adds a new field or reorders the items, the parser breaks. Anyone who has maintained such a parser can relate. That is where a structured format such as JSON helps: the key-value pairs make it easy to extract specific values and to filter and search across a data set, and if new key-value pairs are added, the software parsing the log messages simply ignores keys it does not expect rather than failing completely.
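To make the contrast concrete, here is a minimal sketch (Python is an arbitrary choice here, purely for illustration). Extracting the status code from the classic format requires a hand-written, position-sensitive pattern, while the JSON variant is a lookup by name that keeps working when new fields appear:

import json
import re

nginx_line = ('66.249.65.159 - - [06/Nov/2014:19:10:38 +0000] '
              '"GET /news/53f8d72920ba2744fe873ebc.html HTTP/1.1" 404 177 '
              '"-" "Debian APT-HTTP/1.3 (0.8.16~exp12ubuntu10.16)"')

# Positional parsing: every field is located by its place in the line, so any
# reordering or newly added field breaks the pattern.
pattern = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+)')
print(pattern.match(nginx_line).group(4))  # -> 404

# Structured parsing: fields are looked up by name, so unexpected keys are
# simply ignored instead of breaking the parser.
json_line = '{"remote_ip": "66.249.65.159", "response": 404, "new_field": "added later"}'
print(json.loads(json_line)["response"])  # -> 404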
[Photo by Alex Knight on Unsplash]

The benefits of logging in JSON for machines are:

* It has a structured format, which facilitates the analysis of application logs and makes each and every field queryable.
* Every programming language can parse it.

Usually, we aggregate our JSON data in a log-parsing system (ELK, New Relic, Datadog, etc.) that gives us powerful reporting, searching, and insights into our data. These tools make it easy to index selected fields, which also solves the problem of tracing requests through a (micro)services environment.

Which information to include?

Here is a list of information we should include in any proper log message. Some elements are optional; an (o) next to the field name indicates an optional field.

* message: a human-readable message describing the situation; easy to scan when filtering for an overview of the content.
* level: a numeric representation of the priority level (more details in the next section). Very useful for sorting messages into different priority buckets or for generating a dashboard with an overview of the system.
* level_name: a string representation of the priority level (more details in the next section).
* datetime_iso: a timestamp in ISO 8601 format. It is a required field because we need it to correlate the entry with other events. Although we could rely on the log server's clock, that can be misleading: aggregation servers record the time at which they receive the log, which can be a few seconds off or even in a different timezone.
* correlation_id: an important field in a microservices environment. We use the correlation id carried by the message/request to trace a request across its whole journey between services.
* (o) hostname: useful to identify which machine generated the log. We recommend it in a microservices environment, although it can be redundant when the log collector already maps the originating host from the Docker service name.
* (o) application: useful to identify which device or application generated the log.
* (o) owner_id: the user id or API key id, if available. It lets us trace the steps a user took and reproduce their actions.
* (o) tenant_id: the tenant id, if available. Very helpful in multi-tenant systems.
* (o) tags: an array of elements carrying meta information about a request, such as its type or the protocol used.
* (o) stacktrace: the stack trace, if one exists, stringified into a single line.
* (o) exception: the exception message, if one exists.

Log levels and associated log codes

[Table of log levels and their associated numeric codes]

The proposed format

So what does a complete log message in this concept look like? The following sample illustrates the proposed format:

[Sample message generated using Carbon]
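As a sketch, a message in this format could look like the following. The values are invented for illustration, and the level/level_name pair assumes RFC 5424 severity codes (0 = Emergency through 7 = Debug) as one possible choice for the level table:

{
    "message": "Payment request to upstream gateway timed out",
    "level": 3,
    "level_name": "ERROR",
    "datetime_iso": "2014-11-06T19:10:38+00:00",
    "correlation_id": "3f2c7aa2-0b9f-4c55-9c3e-8a1d2b4e6f90",
    "hostname": "payments-node-01",
    "application": "payments-service",
    "owner_id": "user-42",
    "tenant_id": "tenant-7",
    "tags": ["http", "POST", "external-call"],
    "exception": "GatewayTimeout: no response after 30s",
    "stacktrace": "GatewayTimeout: no response after 30s | at charge() | at handle()",
    "context": {
        "request": "POST /v1/payments HTTP/1.1"
    }
}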
Notes

* In the log server, we should index the following elements for faster search: owner_id, tenant_id, correlation_id, level, level_name, and application.
* Optional fields should be added to the logs whenever they are available and omitted only when they are not. Their value becomes apparent when debugging the system.
* The context element can contain any other fields that prove useful (like a dump of the incoming request).
* For security and compliance reasons (personal information protection), the logs should filter key fields that may be present in a request (such as password). We should anonymize such content before outputting it; the formatter sketch at the end of this article shows one way to do so.
* Every service should forward the correlation_id it receives. If the value is not present, the service should generate a new one and pass it on to the next service. An API gateway (where one exists) should always take care of the presence and generation of this field; a minimal middleware sketch follows the best practices below.

Best practices

* Invest time in designing your log structure. This format serves our needs and can easily be replicated, but some teams might need a different one. Consider also the level of granularity that suits your needs.
* Log as much as possible. Having the module and line number during a fatal exception, or the IP/username when confronting a security breach, is invaluable for resolving the issue faster and more accurately.
* Keeping consistency is everyone's priority. Proper keys and accurate values in the JSON message make debugging easier and more efficient. Correlation ids and log levels are the obvious components that prove this point.
* Log as you code. As with writing unit tests, keep the same discipline and log your system's interactions. It is easier to do on the spot than to add later, once the context and edge cases of the function have been lost.
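Returning to the correlation_id note above, here is a minimal, framework-agnostic sketch of the forwarding rule in Python (WSGI is used purely as an example, and the X-Correlation-Id header name is a common convention rather than a standard):

import uuid

class CorrelationIdMiddleware:
    """Reuse the caller's correlation id, or mint one if it is missing."""

    HEADER = "HTTP_X_CORRELATION_ID"  # WSGI's spelling of the X-Correlation-Id header

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Forward the incoming id when present; otherwise generate a fresh one.
        correlation_id = environ.get(self.HEADER) or str(uuid.uuid4())
        environ[self.HEADER] = correlation_id

        def start_response_with_id(status, headers, exc_info=None):
            # Echo the id back so clients and downstream hops can pick it up.
            headers = list(headers) + [("X-Correlation-Id", correlation_id)]
            return start_response(status, headers, exc_info)

        return self.app(environ, start_response_with_id)

Every outgoing call the service makes should attach the same id, and every log line it emits should carry it in the correlation_id field.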
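Finally, to tie the structure, consistency, and anonymization notes together, here is a minimal sketch of a JSON formatter for Python's standard logging module. The field names follow the proposal above; the deny-list of sensitive keys is an invented example that each project should adapt:

import json
import logging
from datetime import datetime, timezone

SENSITIVE_KEYS = {"password", "secret", "token"}  # example deny-list, adjust per project

class JsonFormatter(logging.Formatter):
    # Render every log record as a single JSON line in the proposed shape.

    def format(self, record):
        entry = {
            "message": record.getMessage(),
            "level": record.levelno,        # Python's numeric codes; map to your own level table
            "level_name": record.levelname,
            "datetime_iso": datetime.fromtimestamp(record.created, timezone.utc).isoformat(),
        }
        context = getattr(record, "context", None)
        if context:
            # Anonymize sensitive fields before the entry leaves the process.
            entry["context"] = {
                key: "***" if key in SENSITIVE_KEYS else value
                for key, value in context.items()
            }
        if record.exc_info:
            entry["exception"] = str(record.exc_info[1])
            entry["stacktrace"] = self.formatException(record.exc_info).replace("\n", " | ")
        return json.dumps(entry)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# The password value is redacted on output.
logger.info("User logged in", extra={"context": {"user": "u-42", "password": "hunter2"}})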