https://bdtechtalks.com/2020/09/16/deep-learning-game-of-life/

Why neural networks struggle with the Game of Life

By Ben Dickson - September 16, 2020

This article is part of our reviews of AI research papers, a series of posts that explore the latest findings in artificial intelligence.

The Game of Life is a grid-based automaton that is very popular in discussions about science, computation, and artificial intelligence. It is an interesting idea that shows how very simple rules can yield very complicated results. Despite its simplicity, however, the Game of Life remains a challenge for artificial neural networks, as AI researchers at Swarthmore College and the Los Alamos National Laboratory show in a recent paper. Titled "It's Hard for Neural Networks To Learn the Game of Life," their research investigates how neural networks explore the Game of Life and why they often fail to find the right solution. Their findings highlight some of the key issues with deep learning models and give some interesting hints about what the next direction of research for the AI community could be.

What is the Game of Life?

British mathematician John Conway invented the Game of Life in 1970. Basically, the Game of Life tracks the on or off state (the "life") of a series of cells on a grid across timesteps. At each timestep, the following simple rules define which cells come to life or stay alive, and which cells die or stay dead:

1. If a live cell has fewer than two live neighbors, it dies of underpopulation.
2. If a live cell has more than three live neighbors, it dies of overpopulation.
3. If a live cell has exactly two or three live neighbors, it survives.
4. If a dead cell has exactly three live neighbors, it comes to life.
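To make these rules concrete, here is a minimal sketch of a single timestep in Python. The NumPy implementation and the function name are illustrative choices, not code from the paper; it counts each cell's live neighbors and applies the four rules, treating everything beyond the border as dead.

```python
import numpy as np

def life_step(grid: np.ndarray) -> np.ndarray:
    """Advance a Game of Life board (2D array of 0s and 1s) by one timestep.

    Cells beyond the border are treated as dead.
    """
    # Count live neighbors by summing the eight shifted copies of the board.
    padded = np.pad(grid, 1)  # add a border of dead cells
    neighbors = sum(
        padded[1 + dy : padded.shape[0] - 1 + dy,
               1 + dx : padded.shape[1] - 1 + dx]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # Rules 1-3: a live cell survives with exactly two or three live neighbors.
    survive = (grid == 1) & ((neighbors == 2) | (neighbors == 3))
    # Rule 4: a dead cell with exactly three live neighbors comes to life.
    birth = (grid == 0) & (neighbors == 3)
    return (survive | birth).astype(grid.dtype)

# A "blinker": three cells in a row oscillate between horizontal and vertical.
board = np.zeros((5, 5), dtype=int)
board[2, 1:4] = 1
print(life_step(board))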
Based on these four simple rules, you can adjust the initial state of your grid to create interesting stable, oscillating, and gliding patterns. For instance, this is what's called the glider gun.

[Image: a Game of Life glider gun]

You can also use the Game of Life to create very complex patterns, such as this one.

[Image: a complex Game of Life pattern]

Interestingly, no matter how complex a grid becomes, you can predict the state of each cell in the next timestep with the same rules. With neural networks being very good prediction machines, the researchers wanted to find out whether deep learning models could learn the underlying rules of the Game of Life.

Artificial neural networks vs the Game of Life

There are a few reasons the Game of Life is an interesting experiment for neural networks. "We already know a solution," Jacob Springer, a computer science student at Swarthmore College and co-author of the paper, told TechTalks. "We can write down by hand a neural network that implements the Game of Life, and therefore we can compare the learned solutions to our hand-crafted one."

[Photo: Jacob Springer, computer science student at Swarthmore College]

It is also very easy to adjust the difficulty of the problem by changing the number of timesteps into the future that the target deep learning model must predict. And unlike domains such as computer vision or natural language processing, there is a sharp criterion for success: if a neural network has learned the rules of the Game of Life, it will reach 100 percent accuracy. "There's no ambiguity. If the network fails even once, then it has not correctly learned the rules," Springer says.

In their work, the researchers first created a small convolutional neural network and manually tuned its parameters so that it could predict the sequence of changes in the Game of Life's grid cells. This proved that there is a minimal neural network that can represent the rules of the Game of Life.
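To give a flavor of what such a hand-crafted network can look like, here is one possible construction in PyTorch. This is my own illustrative reconstruction, not the authors' exact architecture or weights. A single 3x3 convolution computes s = 2*(live neighbors) + (center cell); a cell is alive at the next step exactly when s is 5, 6, or 7, and four ReLU units build a "hat" function that is 1 on that range and 0 everywhere else.

```python
import torch
import torch.nn as nn

# One 3x3 kernel computes s = 2*(live neighbors) + (center cell) for every cell.
kernel = torch.full((3, 3), 2.0)
kernel[1, 1] = 1.0

net = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),  # four copies of s, each shifted by a bias
    nn.ReLU(),
    nn.Conv2d(4, 1, kernel_size=1),             # combine them into a 0/1 output
)

with torch.no_grad():
    # All four hidden channels share the spatial kernel; only the bias differs,
    # so hidden channel i holds relu(s - b_i).
    net[0].weight.copy_(kernel.expand(4, 1, 3, 3))
    net[0].bias.copy_(torch.tensor([-4.0, -5.0, -7.0, -8.0]))
    # relu(s-4) - relu(s-5) - relu(s-7) + relu(s-8) equals exactly 1 when s is
    # 5, 6 or 7 (cell alive at the next step) and exactly 0 otherwise.
    net[2].weight.copy_(torch.tensor([1.0, -1.0, -1.0, 1.0]).view(1, 4, 1, 1))
    net[2].bias.zero_()

# The network now reproduces one Game of Life step exactly, with zero error.
board = (torch.rand(1, 1, 8, 8) < 0.4).float()
print(net(board).squeeze())
```

Note that the zero padding of the convolution plays the same role as the border of dead cells in the earlier sketch.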
"For many problems, you don't have a lot of choice in dataset; you get the data that you can collect, so if there is a problem with your dataset, you may have trouble training the neural network," Springer says. The performance of larger neural networks convolutional neural network game of lifeLeft: A manually tuned convolutional neural network can predict outcomes in the Game of Life with perfect accuracy. Right: But in practice, when training the network from scratch, you'll need a much larger neural network to obtain equal results In machine learning, one of the popular ways to improve the accuracy of a model that is underperforming is to increase its complexity. And this technique worked with the Game of Life. As the researchers added more layers and parameters to the neural network, the results improved and the training process eventually yielded a solution that reached near-perfect accuracy. But a larger neural network also means an increase in the cost of training and running the deep learning model. On the one hand, this shows the flexibility of large neural networks. Although a huge deep learning model might not be the most optimal architecture to address your problem, it has a greater chance of finding a good solution. But on the other, it proves that there is likely to be a smaller deep learning model that can provide the same or better results--if you can find it. These findings are in line with "The Lottery Ticket Hypothesis," presented at the ICLR 2019 conference by AI researchers at MIT CSAIL. The hypothesis suggested that for each large neural network, there are smaller sub-networks that can converge on a solution if their parameters have been initialized on lucky, winning values, thus the "lottery ticket" nomenclature. "The lottery ticket hypothesis proposes that when training a convolutional neural network, small lucky subnetworks quickly converge on a solution," the authors of the Game of Life paper write. "This suggests that rather than searching extensively through weight-space for an optimal solution, gradient-descent optimization may rely on lucky initializations of weights that happen to position a subnetwork close to a reasonable local minima to which the network converges." What are the implications for AI research? "While Conway's Game of Life itself is a toy problem and has few direct applications, the results we report here have implications for similar tasks in which a neural network is trained to predict an outcome which requires the network to follow a set of local rules with multiple hidden steps," the AI researchers write in their paper. These findings can apply to machine learning models used logic or math solvers, weather and fluid dynamics simulations, and logical deduction in language or image processing. "Given the difficulty that we have found for small neural networks to learn the Game of Life, which can be expressed with relatively simple symbolic rules, I would expect that most sophisticated symbol manipulation would be even more difficult for neural networks to learn, and would require even larger neural networks," Springer said. "Our result does not necessarily suggest that neural networks cannot learn and execute symbolic rules to make decisions, however, it suggests that these types of systems may be very difficult to learn, especially as the complexity of the problem increases." 
The performance of larger neural networks

[Figure. Left: a manually tuned convolutional neural network can predict outcomes in the Game of Life with perfect accuracy. Right: in practice, when training the network from scratch, you'll need a much larger neural network to obtain equal results.]

In machine learning, one of the popular ways to improve the accuracy of an underperforming model is to increase its complexity. And this technique worked with the Game of Life. As the researchers added more layers and parameters to the neural network, the results improved and the training process eventually yielded a solution that reached near-perfect accuracy. But a larger neural network also means an increase in the cost of training and running the deep learning model.

On the one hand, this shows the flexibility of large neural networks. Although a huge deep learning model might not be the most optimal architecture for your problem, it has a greater chance of finding a good solution. But on the other hand, it suggests that there is likely to be a smaller deep learning model that can provide the same or better results--if you can find it.

These findings are in line with "The Lottery Ticket Hypothesis," presented at the ICLR 2019 conference by AI researchers at MIT CSAIL. The hypothesis suggests that within each large neural network there are smaller sub-networks that can converge on a solution, provided their parameters happen to be initialized to lucky, winning values, hence the "lottery ticket" name.

"The lottery ticket hypothesis proposes that when training a convolutional neural network, small lucky subnetworks quickly converge on a solution," the authors of the Game of Life paper write. "This suggests that rather than searching extensively through weight-space for an optimal solution, gradient-descent optimization may rely on lucky initializations of weights that happen to position a subnetwork close to a reasonable local minima to which the network converges."
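The lottery-ticket procedure itself is easy to state in code. The sketch below is a simplified, single-round version of the prune-and-rewind recipe from the MIT CSAIL paper, applied to the one-step Game of Life task; the architecture, pruning fraction, and training settings are my own stand-ins, and `life_step_torch` is the ground-truth helper from the previous sketch.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def apply_masks(model, masks):
    """Zero out the pruned weights so only the subnetwork remains."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])

def train(model, steps=2000, masks=None):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        boards = (torch.rand(64, 1, 16, 16) < 0.4).float()
        loss = F.binary_cross_entropy_with_logits(
            model(boards), life_step_torch(boards))  # helper from the sketch above
        opt.zero_grad(); loss.backward(); opt.step()
        if masks is not None:
            apply_masks(model, masks)  # keep pruned weights pinned at zero

# An over-parameterized network for the one-step task.
model = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.Conv2d(32, 1, 1))
init_state = copy.deepcopy(model.state_dict())  # remember the random initialization

train(model)  # 1. train the dense network

# 2. keep the 20 percent largest-magnitude weights: this mask is the "ticket"
masks = {name: (p.abs() >= p.abs().quantile(0.8)).float()
         for name, p in model.named_parameters() if "weight" in name}

# 3. rewind the surviving weights to their ORIGINAL initial values and retrain
model.load_state_dict(init_state)
apply_masks(model, masks)
train(model, masks=masks)  # the hypothesis: this sparse subnetwork still converges
```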
What are the implications for AI research?

"While Conway's Game of Life itself is a toy problem and has few direct applications, the results we report here have implications for similar tasks in which a neural network is trained to predict an outcome which requires the network to follow a set of local rules with multiple hidden steps," the AI researchers write in their paper. These findings can apply to machine learning models used in logic or math solvers, weather and fluid dynamics simulations, and logical deduction in language or image processing.

"Given the difficulty that we have found for small neural networks to learn the Game of Life, which can be expressed with relatively simple symbolic rules, I would expect that most sophisticated symbol manipulation would be even more difficult for neural networks to learn, and would require even larger neural networks," Springer said. "Our result does not necessarily suggest that neural networks cannot learn and execute symbolic rules to make decisions; however, it suggests that these types of systems may be very difficult to learn, especially as the complexity of the problem increases."

The researchers further believe that their findings apply to other fields of machine learning that do not necessarily rely on clear-cut logical rules, such as image and audio classification.

For the moment, we know that, in some cases, increasing the size and complexity of our neural networks can solve the problem of poorly performing deep learning models. But we should also consider the negative impact of using larger neural networks as the go-to method to overcome impasses in machine learning research. One outcome is greater energy consumption and carbon emissions from the compute resources required to train large neural networks. Another is a push toward collecting ever-larger training datasets instead of finding better ways to learn from smaller ones, which might not be feasible in domains where data is subject to ethical considerations and privacy laws. And finally, the general trend toward endorsing overcomplete and very large deep learning models can consolidate AI power in large tech companies and make it harder for smaller players to enter the deep learning research space.

"We hope that this paper will promote research into the limitations of neural networks so that we can better understand the flaws that necessitate overcomplete networks for learning. We hope that our result will drive development into better learning algorithms that do not face the drawbacks of gradient-based learning," the authors of the paper write. "I think the results certainly motivate research into improved search algorithms, or for methods to improve the efficiency of large networks," Springer said.

3 COMMENTS

1. Robert Feyerharm, September 17, 2020 at 7:25 pm
Interesting project - but wouldn't it be far simpler to use a decision tree, instead of a vast neural network, to predict a given cell being alive or dead? You'd only need two predictor variables: the sum of alive cells in the 8 surrounding cells, and whether the current cell is alive or dead.

2. Dmitrii, September 29, 2020 at 5:57 pm
See no problem. A 1-step prediction can easily be done with just 1 hidden convolutional layer with 2 neurons: https://gist.github.com/failure-to-thrive/61048f3407836cc91ab1430eb8e342d9 Although it's mere memorization. As for multi-step predictions: they inevitably require nested calculations of neighbors. Could anyone write an algorithm for just 2 steps similar to the 1-step one? No surprise that the computational complexity blows up exponentially. No fault of the DNNs, they do their best.

3. James McGuigan, November 29, 2022 at 7:04 pm
I discovered that the root cause of lottery ticket initialisation is when a neural network gets stuck in a saddle point between multiple valid linear algebra solutions. One option is to increase the size of the neural network, as it becomes increasingly difficult to create true saddle points in high-dimensional systems. The other option is to make the neural network smaller. I went as far as hand-solving a '2*(3x3) + 2 + 1' neural network which had three solutions (causing lottery ticket initialisation), but was able to train the smallest network with only '1*(3x3) + 2 + 1' = 11 weights, for which only a single linear algebra solution exists. This is how I solved the lottery ticket initialisation problem. https://www.kaggle.com/code/jamesmcguigan/its-easy-for-neural-networks-to-learn-game-of-life