How Hackerman would create an image just by typing zeros and ones -- a deep dive into the GIF file format

Oscar Olsson, Nov 23, 12 min read
https://medium.com/@happybits/how-hackerman-would-create-an-image-just-by-typing-zeros-and-ones-a-deep-dive-into-gif-file-32bada0926c0

CUSTOMER: Hackerman, I need an impressive icon for my website. It should be 5x5 pixels big and look like a rabbit. Can you please draw it for me?

HACKERMAN: Draw!? Bah! I don't need any graphics program for that. I am Hackerman. I will code it for you. You will get the image next week.

CUSTOMER: Next week?? But...

HACKERMAN: No buts! I just need to read about how the GIF file format works, then I can create the image in no time.

[TIME PASSES]

After spending some evenings, Hackerman gets the main idea of how the GIF file format works and the compression algorithm called LZW. With that knowledge, he succeeds in creating the image within an hour. Hackerman calculated that the binary of the image should be as follows:

    47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 05 00 05 00 81 11 11 11 FF FF FF D5 D7 D9 00 00 00 07 0F 80 01 00 83 01 82 84 85 88 82 8A 85 02 85 81 00 3b

So he just opened his code editor, saved the file as rabbit.gif, and sent it to his customer. Boom! Easy-peasy!

[Image: rabbit.gif]

Do you want to understand the GIF file format and be as cool as Hackerman? Then you better read this article.

Get started

If you open a GIF file, like the rabbit above, in a text editor, it will look something like this:

[Image: hex view of rabbit.gif]

In Visual Studio Code, there is an extension called Hex Editor, which lets you view and edit the binary file. On the left-hand side, you see the bits (zeros and ones) represented as hexadecimal numbers. The first line is not part of the file, it just shows column numbers.

So, the file starts with the hex numbers 47 49 46 38 39 61 and ends with 00 3B.

If you open almost any GIF file, you'll see that it starts with 47 49 46 38 39 61 (older GIF87a files have 37 in place of 39) and ends with 00 3B.
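If you don't have a hex editor at hand, the same check can be sketched in a few lines of Python. This is only an illustration: the helper name `describe` is made up here, and the bytes are the rabbit bytes quoted above.

```python
# The rabbit bytes from the article, written as a hex string.
RABBIT_HEX = (
    "47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 05 00 05 00 81 "
    "11 11 11 FF FF FF D5 D7 D9 00 00 00 07 0F 80 01 00 83 01 82 84 85 88 "
    "82 8A 85 02 85 81 00 3b"
)


def describe(hex_string):
    """Return the 6-character signature and the last byte (hypothetical helper)."""
    data = bytes.fromhex(hex_string)      # whitespace between bytes is ignored
    signature = data[:6].decode("ascii")  # the first six bytes are plain ASCII
    trailer = data[-1]                    # the very last byte of the file
    return signature, trailer


signature, trailer = describe(RABBIT_HEX)
print(signature)          # GIF89a
print(f"{trailer:02X}")   # 3B
```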
47 49 46 38 39 61 are just the ASCII values of the string GIF89a, which tells anyone interested that this is a GIF file. Every GIF file ends with a special trailer byte, which is always 3B.

But what about all the bytes in between? How is the image data actually stored? Let's start by looking at the simplest GIF in history: an image with the size 1x1.

The 1x1 GIF

Okay, so let's study this image; it is just a pixel:

[Image: a single orange pixel, captioned "I am an orange pixel"]

The binary for this GIF image is:

    47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 01 00 01 00 81 FF 6A 00 00 00 00 00 00 00 00 00 00 07 03 80 00 81 00 3b

Let's look at the first part of the binaries -- the boring part.

The boring part

So, the first six bytes just tell the world that this is a GIF. Then there are four bytes (00 00 00 00) which set the canvas width and height. But these values aren't used anymore so you can write whatever you want here. A secret love letter if you can compress it to four bytes.

Next is the Global Packed Field, which in this case is 70, which in binary form is 01110000. It is divided into four segments:

    0 111 0 000

The first segment (0) indicates if a Global Color Table is used. It's not in this case. The second segment (111) indicates the color resolution. But this value isn't needed if we don't use a Global Color Table. The third (0) is a sorting flag. The last part (000) indicates the size of the Global Color Table.

Next we have the Background Color Index (00). It's the color index of the background color (surprise!), but it is only used if a Global Color Table is used. The aspect ratio (00) isn't used anymore, so let's skip that byte.

Then comes 2C, the image separator, which marks where the image itself begins. Finally, we have the image's left and top position, which tells where on the canvas the image should start. This is usually (0,0), which is translated to 00 00 00 00.

When we, just like Hackerman, create our image, this part is the boring part; we will always use the same bits. But now, let's look at the rest of the bits -- the fun part.

The fun part

Now for the fun part!
First, we have width and height. In this case, 1x1 pixels, represented as 01 00 01 00.

Then, the Local Packed Field, in this case 81, which in binary form is:

    10000001

But let's group it into five segments:

    1 0 0 00 001

The first segment (1) tells if we should use a Local Color Table or not (in this case we will). The second (0) is an interlace flag. The third (0) is a sort flag. The fourth (00) is reserved for future use, so you can write whatever you want -- your favorite two bits, for example. The fifth (001) is the size of the local color table. In this case, the size is 1.

To calculate how many colors should be in the color table (the palette), we use this formula:

    number of colors = 2^(size + 1)

So, in this case: 2^(1+1) = 4 colors. A color consists of three parts (red, green, blue) of one byte each, so we need 12 bytes for the palette. No more, no less. In this one-pixel image, we only need one color, but we still have to add three more colors (with any values).

Finally, the image data, in this case:

    80 00 81

80 = start of image data.
00 = pick the color with the index 0 (orange in this case).
81 = end of image data.

Two bytes come right before the image data: 07 is the LZW Minimum Code Size, and 03 just says how many bytes of image data follow.

The final two bytes: 00 means "no more bytes, please" and 3B means trailer, which indicates the end of the file.

So now you know how to create a one-by-one pixel image using only code. If you want to change the color of the pixel, just change the first value of the color palette, in this case FF 6A 00.

The hardest part to understand in the GIF file format is the last part of the file, which contains the image data. It is compressed using an algorithm called LZW. So let's look closer into that.

Image data

Okay, for me, the hardest part to get my head around was the combination of how the LZW algorithm works and how the result of the algorithm is split into segments of bits before we know which bytes to actually store. And when we fiddle with the data, it's all or nothing.
Either we get it right, or there could be an error anywhere. That makes it a challenge to create an image like Hackerman. But we love challenges, don't we?

The LZW algorithm isn't that hard. The main idea is that the algorithm likes repetition. Repetition leads to better compression. After every step, we will store a new row with indexes that we hope will come up in the future.

The Line

Let's try to create this image:

[Image: a line of 8 pixels in orange and blue]

First, we recognize the colors and create a palette. It is orange and blue; here are the color codes (the palette):

    0 = FF 6A 00 (orange)
    1 = 00 26 FF (blue)

Next, we list the indexes of each color in order. Since 0 is orange and 1 is blue, we have:

    0 1 1 1 1 0 1 0

This is called the index stream. The mission is to convert this stream to a code stream. This will be done using the LZW algorithm (the algorithm that loves repetition).

We start with a table that contains all the colors in the palette and two extra special codes, Clear (80) and End (81):

[Image: the starting table]

Clear is used to start, and End marks the end of the code stream. To figure out the code stream is, for now, our only goal in life.

Let's start by adding the start byte 80 to the code stream. Then, we add the code for the first color, 00. We look one index ahead and see a 1. We add "0 1" to our table as code 82...

...and hope that the indexes "0 1" will come back in the future. Since we have consumed the first zero, the index stream now looks like this:

    1 1 1 1 0 1 0

We consume 1 and add 01 to the code stream. We look ahead and see another 1. So, we add "1 1" to the table as code 83.

Haha! Look at the index stream!

    1 1 1 0 1 0

Now we see "1 1", which makes us very happy. Because we have stored a special code for that, we add 83 to the code stream, and we can gobble "1 1" in one sweep. Now compression is happening! We look ahead and see another 1, so we add "1 1 1" to the table.

Next, we see "1 0". Sadly, "1 0" isn't in the table, so we can just consume "1" and add 01 to the code stream. But we add "1 0" to the table and hope this combination will occur in the future.

Aha! "0 1" is next, and we have it in the table!
So, we add 82 to the code stream, and store "0 1 0" in the table. We hope to see "0 1 0" in the future, but sadly, there's just one index left, which is the index 0. So we consume it and add 00 to the code stream. Now the index stream is empty, so we just send the end byte (81) to the code stream. Mission complete.

So, the binaries for the image are:

    47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 08 00 01 00 81 FF 6A 00 00 26 FF 11 11 11 00 00 00 07 08 80 00 01 83 01 82 00 81 00 3B

Now you just have to save the hexadecimal numbers in a file. You can do this in Visual Studio Code using the Hex Editor plugin. Or if you have Python installed you can use my simple Python script.

    import sys

    def textfile_to_gif(textfilename):
        # Reading the hexadecimal string from the text file
        with open(textfilename + ".txt", 'r') as file:
            hex_string = file.read()
        # Splitting the string by whitespace to get individual hex values
        hex_values = hex_string.split()
        # Converting each hex value to a byte and collecting them in a byte array
        byte_array = bytearray(int(hex_val, 16) for hex_val in hex_values)
        # Writing the byte array to the output binary file
        with open(textfilename + ".gif", 'wb') as file:
            file.write(byte_array)

    textfile_to_gif(sys.argv[1])

You create a file line.txt, with the hex numbers:

    47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 08 00 01 00 81 FF 6A 00 00 26 FF 11 11 11 00 00 00 07 08 80 00 01 83 01 82 00 81 00 3B

Then you run

    python textfile_to_gif.py line

...which will generate this beautiful line.gif:

[Image: line.gif]

What if you want an image with a height other than 1? That is easy, you just modify the width and height values, and the pixels will flow to the next line.
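Here is a rough sketch of that change in Python. It assumes the exact byte layout used in this article (no Global Color Table, no extension blocks), which puts the width and height at offsets 18 to 21, each stored as a 16-bit number with the low byte first. The function name `resize` is made up for the example, and the new width times height must still equal the 8 pixels in the code stream.

```python
import struct

# The 8x1 line image from the article.
LINE_HEX = (
    "47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 08 00 01 00 81 "
    "FF 6A 00 00 26 FF 11 11 11 00 00 00 07 08 80 00 01 83 01 82 00 81 00 3B"
)


def resize(gif_bytes, width, height):
    """Return a copy with new width/height (hypothetical helper).

    Assumes the fixed layout from this article: the image separator 2C
    sits at offset 13, followed by left, top, width and height.
    """
    data = bytearray(gif_bytes)
    if data[13] != 0x2C:
        raise ValueError("unexpected layout: no image separator at offset 13")
    # "<HH" = two unsigned 16-bit values, little-endian (low byte first)
    struct.pack_into("<HH", data, 18, width, height)
    return bytes(data)


line = bytes.fromhex(LINE_HEX)
two_rows = resize(line, 4, 2)
print(" ".join(f"{b:02x}" for b in two_rows[18:22]))  # 04 00 02 00
```

Everything except those four bytes stays untouched, which is why the same code stream can be reused.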
So, if you want an image with width 4 and height 2 you only need to change two numbers:

    47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 04 00 02 00 81 FF 6A 00 00 26 FF 11 11 11 00 00 00 07 08 80 00 01 83 01 82 00 81 00 3B

And this image looks like this:

[Image: the same pixels as a 4x2 image]

The rabbit

Let's do something harder, and create the binaries for this image:

[Image: the 5x5 rabbit]

The image has three colors: index 0 is almost black, index 1 is white and index 2 is rabbit nose white. With this palette, we get the following index stream:

    1 0 0 0 1
    1 0 0 0 1
    1 1 1 1 1
    1 0 1 0 1
    1 1 2 1 1

But for our purpose it's easier to put all indexes in a row, so here is our index stream:

    1000110001111111010111211

To simplify the work, you can start with a table that contains the color indexes and the special codes:

[Image: the starting table]

Then we gobble indexes from the index stream, from left to right. After each step, we look ahead, and add a new key to the right. The final table looks like this:

[Image: the final table, with the columns Consume and Code and the table entries in gray]

The combination of the column Code will give the end result (80 01 00 83...). The first column (Consume) indicates which index values are consumed in each step. If the column cells contain many indexes, the compression is going well.

On each step, we build the code table, which is the gray part. After every step, we add one more key-value pair here and hope that the upcoming indexes will fit. One interesting detail is that the code table (the gray part) isn't actually stored in the file. This is because of some ingenious LZW black magic.

The resulting binaries for the rabbit are shown here:

    47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 05 00 05 00 81 11 11 11 FF FF FF D5 D7 D9 00 00 00 07 0F 80 01 00 83 01 82 84 85 88 82 8A 85 02 85 81 00 3B

What next?

If you want to go deeper, I suggest attempting to create some small images (like four or six pixels big) and then trying to create the rabbit above. Please share your work.

Next, you can read the two great articles about the GIF file format listed at the end of this article.

You can experiment with LZW Minimum Code Size.
I set this value to 07 to make the transition from the code stream to bytes smooth. You can try a value of 03, and each code will occupy just half a byte (four bits). Note that after a while, the GIF-ghost will increase the code size and gobble more bits.

What's In A GIF - Bits and Bytes
The authority on the content of GIFs is the GIF89a specification. Originally developed at CompuServe in the late 1980s...
giflib.sourceforge.net

What's In A GIF - LZW Image Data
Now let's look at exactly how we go about storing an image in a GIF file. The GIF format is a raster format, meaning it...
giflib.sourceforge.net
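If you want to double-check a code stream you worked out by hand, the gobbling procedure from the line and rabbit walkthroughs can be sketched in Python. Treat it as a sketch under assumptions, not a full GIF encoder: the names `lzw_codes` and `to_hex` are made up here, the minimum code size defaults to 7 as in this article, and it only works while every code still fits in one byte (it never grows the code size or packs bits).

```python
def lzw_codes(indexes, min_code_size=7):
    """Turn a list of color indexes into a list of LZW codes (hypothetical helper).

    Assumption: the stream is short enough that all codes stay below 0x100,
    so each code can be written as exactly one byte.
    """
    clear = 1 << min_code_size   # 0x80 when min_code_size is 7
    end = clear + 1              # 0x81
    # The starting table: every single color index maps to itself.
    table = {(i,): i for i in range(clear)}
    next_code = end + 1          # first free code, 0x82

    codes = [clear]
    current = ()                 # the indexes gobbled so far in this step
    for index in indexes:
        candidate = current + (index,)
        if candidate in table:
            current = candidate              # known combination: keep gobbling
        else:
            codes.append(table[current])     # emit the code for what we consumed
            table[candidate] = next_code     # store the new combination
            next_code += 1
            current = (index,)
    if current:
        codes.append(table[current])         # whatever is left at the end
    codes.append(end)
    return codes


def to_hex(codes):
    return " ".join(f"{code:02X}" for code in codes)


line = [0, 1, 1, 1, 1, 0, 1, 0]
rabbit = [int(ch) for ch in "1000110001111111010111211"]

print(to_hex(lzw_codes(line)))    # 80 00 01 83 01 82 00 81
print(to_hex(lzw_codes(rabbit)))  # 80 01 00 83 01 82 84 85 88 82 8A 85 02 85 81
```

Both outputs match the image data bytes used in this article. The number of codes (8 and 15) is the block-size byte that goes right before them (08 and 0F).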