[HN Gopher] Building LLMs from the Ground Up: A 3-Hour Coding Wo...
___________________________________________________________________
Building LLMs from the Ground Up: A 3-Hour Coding Workshop
Author : mdp2021
Score : 843 points
Date : 2024-08-31 21:45 UTC (1 day ago)
(HTM) web link (magazine.sebastianraschka.com)
(TXT) w3m dump (magazine.sebastianraschka.com)
| abusaidm wrote:
| Nice write-up, Sebastian; looking forward to the book. There are
| lots of details on the LLM and how it's composed. It would be
| great if you could expand on how Llama and OpenAI might be
| cleaning and structuring their training data, given that this
| seems to be where the battle is heading in the long run.
| rakahn wrote:
| Yes. Would love to read that.
| rahimnathwani wrote:
| > how Llama and OpenAI could be cleaning and structuring their
| > training data
|
| If you're interested in this, there are several sections in the
| Llama paper you will likely enjoy:
|
| https://ai.meta.com/research/publications/the-llama-3-herd-o...
| kbrkbr wrote:
| But isn't it the beauty of LLMs that they need comparatively
| little preparation (unstructured text as input) and pick up the
| features on their own, so to speak?
|
| edit: grammar
| atum47 wrote:
| Excuse my ignorance: is this different from Andrej Karpathy's
| video? https://www.youtube.com/watch?v=kCc8FmEb1nY
|
| Anyway I will watch it tonight before bed. Thank you for sharing.
| BaculumMeumEst wrote:
| Andrej's series is excellent, and Sebastian's book plus this
| video are excellent too. There's a lot of overlap, but each goes
| into more detail on, or focuses on, different topics.
| Andrej's entire series is absolutely worth watching, his
| upcoming Eureka Labs stuff is looking extremely good too.
| Sebastian's blog and book are definitely worth the time and
| money IMO.
| brcmthrowaway wrote:
| what book
| StefanBatory wrote:
| Most likely this one.
|
| https://www.manning.com/books/build-a-large-language-model-f...
|
| (I've taken it from the footnotes on the article)
| BaculumMeumEst wrote:
| That's the one! High enough quality that I would guess it
| converts well from torrents to purchases. Hypothetically, of
| course.
| eclectic29 wrote:
| This is excellent. Thanks for sharing. It's always good to go
| back to the fundamentals. There's another resource that is also
| quite good: https://jaykmody.com/blog/gpt-from-scratch/
| _giorgio_ wrote:
| Not true.
|
| Your resource is really bad.
|
| "We'll then load the trained GPT-2 model weights released by
| OpenAI into our implementation and generate some text."
| skinner_ wrote:
| > Your resource is really bad.
|
| What a bad take. That resource is awesome. Sure, it is about
| inference, not training, but why is that a bad thing?
| szundi wrote:
| This is not "building from the ground up"
| abustamam wrote:
| Why is that bad?
| skinner_ wrote:
| Neither the author of the GPT-from-scratch post nor eclectic29,
| who recommended it above, ever promised that the post is about
| building LLMs from the ground up; that was the original post's
| title.
|
| The GPT-from-scratch post explains, from the ground up (the
| ground being numpy), what calculations take place inside a GPT
| model.
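|
| For a rough idea of the level involved -- a sketch of my own,
| not code from the post itself -- a single causal self-attention
| head in plain numpy looks something like this:
|
|     import numpy as np
|
|     def softmax(x):
|         e = np.exp(x - x.max(axis=-1, keepdims=True))
|         return e / e.sum(axis=-1, keepdims=True)
|
|     def causal_self_attention(x, Wq, Wk, Wv):
|         # x: (seq_len, d_model); one head, no batching
|         q, k, v = x @ Wq, x @ Wk, x @ Wv
|         scores = q @ k.T / np.sqrt(k.shape[-1])
|         # mask future positions: tokens attend only backwards
|         mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
|         scores[mask] = -1e10
|         return softmax(scores) @ v
|
|     d = 8
|     rng = np.random.default_rng(0)
|     x = rng.normal(size=(4, d))  # 4 tokens, 8-dim embeddings
|     Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
|     print(causal_self_attention(x, Wq, Wk, Wv).shape)  # (4, 8)
|
| The post builds up the full model in the same spirit, ending
| with OpenAI's released GPT-2 weights.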
| adultSwim wrote:
| This page is just a container for a YouTube video. I suggest
| updating this HN link to point to the video directly, which
| contains the same links as the page in its description.
| yebyen wrote:
| Why not support the author's own website? It looks like a nice
| website.
| _giorgio_ wrote:
| He shares a ton of videos and code. His material is really
| valuable. Just support him?
| mdp2021 wrote:
| On the contrary, I saved you that extra step of looking for
| Sebastian Raschka's repository of writings.
| bschmidt1 wrote:
| Love stuff like this. Tangentially I'm working on useful language
| models without taking the LLM approach:
|
| Next-token prediction:
| https://github.com/bennyschmidt/next-token-prediction
|
| Good for auto-complete, spellcheck, etc.
|
| AI chatbot: https://github.com/bennyschmidt/llimo
|
| Good for domain-specific conversational chat with instant
| responses that don't hallucinate.
| p1esk wrote:
| Why do you call your language model "transformer"?
| bschmidt1 wrote:
| Language is the language model that extends Transformer.
| Transformer is a base model for any kind of token (words,
| pixels, etc.).
|
| However, currently there is some language-specific stuff in
| Transformer that should be moved to Language :) I'm focusing
| first on language models, and getting into image generation
| next.
| p1esk wrote:
| No, I mean, a transformer is a very specific model
| architecture, and your simple language model has nothing to
| do with that architecture. Unless I'm missing something.
| richrichie wrote:
| For a century, transformer meant a very different thing.
| Power systems people are justifiably amused.
| p1esk wrote:
| And it means something else in Hollywood. But we are
| discussing language models here, aren't we?
| bschmidt1 wrote:
| And it fits the definition doesn't it since it tokenizes
| inputs to compute them against pre-trained ones, rather
| than being based on rules/lookups or arbitrary
| logic/algorithms?
|
| Even in CSS a matrix "transform" is the same concept - the word
| "transform" is not unique to language models; it's more a
| reference to how one set of data becomes another by way of
| computation.
|
| Same with tile engines / game dev. Say I wanted to rotate a
| map; this could be a simple 2D tic-tac-toe board, a 3D MMO tile
| map, or anything in between:
|
| Input:
|
|     [ [0, 0, 1],
|       [0, 0, 0],
|       [0, 0, 0] ]
|
| Output:
|
|     [ [0, 0, 0],
|       [0, 0, 0],
|       [0, 0, 1] ]
|
| The method that takes the input and gives that output is
| called a "transformer" because it is not looking up some
| rule that says where to put the new values, it's
| performing math on the data structure whose result
| determines the new values.
|
| It's not unique to language models. If anything, vector word
| embeddings came to this concept much later than math and game
| dev.
|
| An example of the word "Transformer" used outside language
| models in JavaScript is Three.js'
| https://threejs.org/docs/#examples/en/controls/TransformCont...
|
| I used Three.js to build https://www.playshadowvane.com/ - I
| built the engine from scratch and recall working with vectors
| (e.g. THREE's Vector3 for XYZ stuff) years before they were
| popularized by LLMs.
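|
| For illustration, a minimal sketch of that rotation in Python
| (toy code for the grid example above, not from my libraries):
|
|     def rotate_90_clockwise(grid):
|         # Transpose, then reverse each row: pure computation on
|         # the data structure, no lookup table of rules.
|         return [list(row)[::-1] for row in zip(*grid)]
|
|     board = [[0, 0, 1],
|              [0, 0, 0],
|              [0, 0, 0]]
|     print(rotate_90_clockwise(board))
|     # [[0, 0, 0], [0, 0, 0], [0, 0, 1]]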
| bschmidt1 wrote:
| I still call it a transformer because the inputs are
| tokenized and computed to produce completions, not from
| lookups or assembling based on rules.
|
| > Unless I'm missing something.
|
| Only that I said "without taking the LLM approach", meaning
| tokens aren't scored in high-dimensional vectors, just stored
| as far simpler JSON bigrams. I don't think that disqualifies
| using the term "transformer" - I didn't want to call it a
| "computer" or a "completer". Have a better word?
|
| > JSON instead of vectors
|
| I did experiment with a low-dimensional vector approach from
| scratch; you can paste this into your browser console:
| https://gist.github.com/bennyschmidt/ba79ba64faa5ba18334b4ae...
|
| But the n-gram approach works better here; I don't think
| vectors start to pull away on accuracy until they capture a lot
| more contextual information (and a lot of context is already
| inferred from the structure of an n-gram).
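|
| To sketch the bigram idea (a toy illustration, not my library's
| actual storage format or API):
|
|     import json
|
|     def train_bigrams(text):
|         # count word -> next-word frequencies in a plain dict
|         words = text.split()
|         table = {}
|         for a, b in zip(words, words[1:]):
|             table.setdefault(a, {})
|             table[a][b] = table[a].get(b, 0) + 1
|         return table
|
|     def next_token(table, word):
|         # most frequent continuation seen in training
|         options = table.get(word)
|         return max(options, key=options.get) if options else None
|
|     table = train_bigrams("the cat sat on the mat and the cat slept")
|     print(json.dumps(table["the"]))  # {"cat": 2, "mat": 1}
|     print(next_token(table, "the"))  # cat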
| vunderba wrote:
| I took a very cursory look at the code, and it looks like this
| is just a standard Markov chain. Is it doing something
| different?
| bschmidt1 wrote:
| I get this question only on Hacker News, and am baffled as to
| why (and also the question "isn't this just n-grams, nothing
| more?").
|
| https://github.com/bennyschmidt/next-token-prediction
|
| ^ If you look at this GitHub repo, it should be obvious it's a
| token prediction library - the video of the browser demo shown
| there clearly shows it being used with an <input /> to
| autocomplete text based on your domain-specific data. Is THAT a
| Markov chain, nothing more? What a strange question; the answer
| is an obvious "No" - it's a front-end library for predicting
| text and pixels (AKA tokens).
|
| https://github.com/bennyschmidt/llimo
|
| This project, which uses the aforementioned library, is a chat
| bot. There's an added NLP layer that uses parts-of-speech
| analysis to transform your inputs into a cursor that is
| completed (AKA "answered"). See the video where I am chatting
| with the bot about Paris? Is that nothing more than a standard
| Markov chain? Nothing else going on? Again, the answer is an
| obvious "No", it's a chat bot - what about the NLP work, the
| chat interface, etc. makes you ask if it's nothing more than a
| standard [insert vague philosophical idea]?
|
| To me, your question is like when people asked whether jQuery
| "is just a monad". I don't understand the significance of the
| question - jQuery is a library for web development. Maybe there
| are some similarities to the philosophical concept of a
| "monad"? See:
| https://stackoverflow.com/questions/10496932/is-jquery-a-mon...
|
| It's like saying "I looked at your website and have concluded
| it is nothing more than an Array."
| kgeist wrote:
| > Simpler take on embeddings (just bigrams stored in JSON
| > format)
|
| So Markov chains
| bschmidt1 wrote:
| See https://news.ycombinator.com/item?id=41419329
| karmakaze wrote:
| This is great. Just yesterday I was wondering how exactly
| transformers/attention and LLMs work. I'd worked through how
| back-propagation works in a deep RNN a long while ago and thought
| it would be interesting to see the rest.
| ein0p wrote:
| I'm not sure why you'd want to build an LLM these days - you
| won't be able to train it anyway. It'd make a lot of sense to
| teach people how to build stuff with LLMs, not LLMs themselves.
| ckok wrote:
| This has been said about pretty much every subject: writing
| your own browsers, compilers, cryptography, etc. But at least
| for me, even if nothing comes of it, just knowing how it really
| works and what steps are involved is part of using things
| properly. Some people are perfectly happy using a black box,
| but without knowing how it's made, how do we know the limits?
| How will the next generation of LLMs happen if nobody can get
| excited about the internal workings?
| ein0p wrote:
| You don't need to write your own LLM to know how it works.
| And unlike, say, a browser it doesn't really do anything even
| remotely impressive unless you have at least a few tens of
| thousands of dollars to spend on training. Source: my day job
| is to do precisely what I'm telling you not to bother doing,
| but I do have access to a large pool of GPUs. If I didn't,
| I'd be doing what I suggest above.
| richrichie wrote:
| Good points. For learning purposes, just understanding what a
| neural network is and how it works covers it all.
| BaculumMeumEst wrote:
| But I mean people can always rent GPUs too. And they're
| getting pretty ubiquitous as we ramp up from the AI hype
| craze, I am just an IT monkey at the moment and even I have
| on-demand access to a server with something like 4x192GB
| GPUs at work.
| ein0p wrote:
| Have you tried renting a few hundred GPUs in public
| clouds? Or TPUs for that matter? For weeks or months on
| end?
| kgeist wrote:
| It's possible to train useful LLMs on affordable hardware. It
| depends on what kind of LLM you want. Sure, you won't build the
| next ChatGPT, but not every language task requires a universal
| general-purpose LLM with billions of parameters.
| BaculumMeumEst wrote:
| It's so fun! And for me at least, it sparks a lot of curiosity
| to learn the theory behind them, so I would imagine it is
| similar for others. And some of that theory will likely cross
| over to the next AI breakthrough. So I think this is a fun and
| interesting vehicle for a lot of useful knowledge. It's not
| like building compilers is still super relevant for most of us,
| but many people still learn to do it!
| alok-g wrote:
| This is great! Hope it works on a Windows 11 machine too (I often
| find that when Windows isn't explicitly mentioned, the code isn't
| tested on it and usually fails to work due to random issues).
| sidkshatriya wrote:
| When it does not work on Windows 11, what about trying it out
| on WSL (Windows Subsystem for Linux)?
| politelemon wrote:
| This should work perfectly fine in WSL2, as it has access to
| the GPU. Do remember to install the CUDA toolkit; NVIDIA has
| one for WSL2 specifically.
|
| https://developer.nvidia.com/cuda-downloads?target_os=Linux&...
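|
| Once that's installed, a quick sanity check (assuming PyTorch
| is installed inside the WSL2 environment):
|
|     import torch
|
|     # True plus a GPU name means WSL2 can see the CUDA device
|     print(torch.cuda.is_available())
|     if torch.cuda.is_available():
|         print(torch.cuda.get_device_name(0))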
| paradite wrote:
| I wrote a practical guide on how to train nanoGPT from scratch on
| Azure a while ago. It's pretty hands-on and easy to follow:
|
| https://16x.engineer/2023/12/29/nanoGPT-azure-T4-ubuntu-guid...
| firesteelrain wrote:
| Did it really only cost $200?
|
| What sort of things could you do with it? How do you train it
| on current events?
| 1zael wrote:
| Sebastian, you are a god among mortals. Thank you.
| alecco wrote:
| Using PyTorch is not "LLMs from the ground up".
|
| It's a fine PyTorch tutorial but let's not pretend it's something
| low level.
| menzoic wrote:
| Is this a joke? Can't tell. OpenAI uses PyTorch to build LLMs
| jnhl wrote:
| You could always go deeper, and from some points of view it's
| not "from the ground up" enough unless you build your own
| autograd and tensors from plain numpy arrays.
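|
| For a taste of that level, a minimal scalar autograd sketch (in
| the spirit of Karpathy's micrograd; illustrative only, and
| supporting just multiplication):
|
|     class Value:
|         def __init__(self, data, children=()):
|             self.data, self.grad = data, 0.0
|             self._children, self._backward = children, lambda: None
|
|         def __mul__(self, other):
|             out = Value(self.data * other.data, (self, other))
|             def _backward():
|                 # chain rule for d(out)/d(self), d(out)/d(other)
|                 self.grad += other.data * out.grad
|                 other.grad += self.data * out.grad
|             out._backward = _backward
|             return out
|
|         def backward(self):
|             # topological order, then apply chain rule backwards
|             topo, seen = [], set()
|             def build(v):
|                 if v not in seen:
|                     seen.add(v)
|                     for c in v._children:
|                         build(c)
|                     topo.append(v)
|             build(self)
|             self.grad = 1.0
|             for v in reversed(topo):
|                 v._backward()
|
|     a, b = Value(2.0), Value(3.0)
|     (a * b).backward()
|     print(a.grad, b.grad)  # 3.0 2.0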
| 0cf8612b2e1e wrote:
| Numpy sounds like cheating on the backs of others. You're going
| to need your own hand-crafted linear algebra routines.
| TZubiri wrote:
| Source please?
| leobg wrote:
| People think of the Karpathy tutorials which do indeed build
| LLMs from the ground up, starting with Python dictionaries.
| krmboya wrote:
| From scratch is relative. To a Python programmer, from scratch
| may mean starting with dictionaries, but a non-programmer will
| have to learn what Python dicts are first.
|
| For someone who already knows Excel, from scratch with Excel
| sheets instead of Python may work better.
| wredue wrote:
| For the record, if you do not know what a dict actually
| is, and how it works, it is impossible to use it
| effectively.
|
| Although if your claim is that most programmers do not care
| about being effective, I would tend to agree, given the 64 gigs
| of RAM my basic text editors need these days.
| carlmr wrote:
| >For the record, if you do not know what a dict actually
| is, and how it works, it is impossible to use it
| effectively.
|
| While I agree it's good to know how your collections work,
| "efficient key-value store" may be enough to use it effectively
| 80% of the time for somebody dabbling in Python.
|
| Sadly I've met enough people who call themselves programmers
| yet didn't even have that surface-level understanding of it.
| atoav wrote:
| No, it is not. From scratch has a meaning. To me it means: in a
| way that lets you understand the important details, e.g. using
| a programming language without major dependencies.
|
| Calling that _from scratch_ is like saying "Just go to the
| store and tell them what you want" in a series called: "How
| to make sausage from scratch".
|
| When I want to know _how to do X from scratch_ I am not
| interested in "how to get X the fastest way possible"; to be
| frank, I am not even interested in "how to get X the way others
| typically get it". What I am interested in is learning
| how to do all the stuff that is normally hidden away in
| dependencies or frameworks myself -- or, you know, _from
| scratch_. And considering the comments here I am not alone in
| that reading.
| kenjackson wrote:
| Your definition doesn't match mine. My definition is
| fuzzier. It is "building something using no more than the
| common tools of the trade". The term "common" is very era
| dependent.
|
| For example, building a web server from scratch - I'd
| probably assume the presence of a sockets library or at the
| very least network card driver support. For logging and
| configuration I'd assume standard I/O support.
|
| It probably comes down to what you think makes LLMs
| interesting as programs.
| SirSegWit wrote:
| I'm still waiting for an assembly language model tutorial, but
| apparently there are no real engineers out there anymore, only
| torch script kiddies /s
| sigmoid10 wrote:
| Pfft. Assembly. I'm waiting for the _real_ low level tutorial
| based on quantum electrodynamics.
| oaw-bct-ar-bamf wrote:
| Automotive actually uses ML in plain C with some inline
| assembly sprinkled on top to run models on embedded devices.
|
| It's definitely out there and in production use.
| mdp2021 wrote:
| > _ML in plain c_
|
| Which engines in particular? I never found especially
| flexible ones.
| wredue wrote:
| Ironically, slippery slope argumentation is a favourite style
| of kids.
|
| Unfortunately, your argument is a well known fallacy and
| carries no weight.
| botverse wrote:
| #378
| alecco wrote:
| I'll write a guide "no-code LLMs in CUDA".
| jb1991 wrote:
| Learn to play Bach: start by making your own piano.
| defrost wrote:
| Bach (Johann Sebastian .. there were _many_ musical Bachs in
| the family) owned and wrote for harpsichords, lute-
| harpsichords, violin, viola, cellos, a viola da gamba, lute and
| spinet.
|
| He never had a piano, not even a fortepiano .. though
| reportedly he played one once.
| generic92034 wrote:
| He had to improvise on the Hammerklavier when visiting
| Frederick the Great in Potsdam. That (improvising for
| Frederick) is also the starting point for the later
| creation of
| https://en.wikipedia.org/wiki/The_Musical_Offering .
| vixen99 wrote:
| We know what he meant.
| jb1991 wrote:
| Yes, I know, but that's irrelevant. You can replace the
| word piano in my comment with harpsichord if it makes you
| happy.
| jahdgOI wrote:
| Pianos are not proprietary in that they all have the same
| interface. This is like a web development tutorial in
| ColdFusion.
| maleldil wrote:
| Are you implying that PyTorch is proprietary?
| jb1991 wrote:
| We're digressing way off the whole point of the comment, but to
| address your point: piano design has actually been an area of
| great innovation over the centuries, with different companies
| doing it in considerably different ways.
| atoav wrote:
| Wanted to say the same thing. As an educator who once gave a
| course on a similar topic for non-programmers, I can say you
| need to start way, _way_ earlier.
|
| E.g.
|
| 1. Programming basics
|
| 2. How to manipulate text using programs (reading, writing,
| tokenization, counting words, randomization, case conversion,
| ...)
|
| 3. How to extract statistical properties from texts (ngrams,
| etc, ...)
|
| 4. How to generate crude text using markov chains
|
| 5. Improving on markov chains and thinking about/trying out
| different topologies
|
| Etc.
|
| Sure, Markov chains are not exactly LLMs, but they are a good
| starting point for building an intuition about how programs can
| extract statistical properties from text and generate new text
| based on them (see the sketch at the end of this comment). They
| also give you a feeling for how programs can work on text.
|
| If you start directly with a framework there is some essential
| understanding missing.
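|
| To make step 4 concrete, a crude word-level Markov chain
| (order 2, so the state is the last two words) can be sketched
| like this -- toy code for building intuition, not production
| material:
|
|     import random
|
|     def build_chain(text, order=2):
|         words = text.split()
|         chain = {}
|         for i in range(len(words) - order):
|             state = tuple(words[i:i + order])
|             chain.setdefault(state, []).append(words[i + order])
|         return chain
|
|     def generate(chain, state, length=15):
|         out = list(state)
|         for _ in range(length):
|             followers = chain.get(state)
|             if not followers:
|                 break
|             nxt = random.choice(followers)
|             out.append(nxt)
|             state = state[1:] + (nxt,)
|         return " ".join(out)
|
|     text = "the cat sat on the mat and the cat ran off the mat"
|     print(generate(build_chain(text), ("the", "cat")))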
| BaculumMeumEst wrote:
| I really like Sebastian's content but I do agree with you. I
| didn't get into deep learning until starting with Karpathy's
| series, which starts by creating an autograd engine from
| scratch. Before that I tried learning with fast.ai, which dives
| immediately into building networks with PyTorch, but I noped
| out of there quickly. It felt about as fun as learning Java in
| high school. I need to understand what I'm working with!
| krmboya wrote:
| Maybe it's just different learning styles. Some people, me
| included, like to start getting some immediate real world
| results to keep it relevant and form some kind of intuition,
| then start peeling back the layers to understand the
| underlying principles. With fast.ai you are already doing this
| by the third lecture.
|
| Like driving a car: you don't need to understand what's under
| the hood to start driving, but eventually understanding it
| makes you a better driver.
| BaculumMeumEst wrote:
| For sure! In both cases I imagine it is a conscious choice
| where the teachers thought about the trade-offs of each
| option. Both have their merits. Whenever you write learning
| material you have to decide where to draw the line of how
| far you want to break down the subject matter. You have to
| think quite hard about exactly who you are writing for.
| It's really hard to do!
| jph00 wrote:
| You seem to be implying that the top-down approach is a
| trade-off that involves not breaking the subject matter down
| into such low-level detail. I think the opposite is true: when
| you go top-down you can keep teaching lower and lower layers,
| all the way down to physics if you like!
| jph00 wrote:
| fast.ai also does autograd from scratch - and goes further
| than Karpathy since it even does matrix multiplication from
| scratch.
|
| But it doesn't _start_ there. It uses top-down pedagogy,
| instead of bottom up.
| BaculumMeumEst wrote:
| Oh that's interesting to know! I guess I gel better with
| bottom up. As soon as I start seeing API functions I don't
| understand I immediately want to know how they work!
| delano wrote:
| If you want to make an apple pie from scratch, first you have
| to invent the universe.
| CamperBob2 wrote:
| After watching the Karpathy videos on the subject, of course.
| _giorgio_ wrote:
| Your comment is one of the most pompous that I've ever read.
|
| NVIDIA's value lies only in PyTorch and CUDA optimizations
| relative to a pure C implementation, so saying that you need to
| go lower-level than CUDA or PyTorch simply means reinventing
| NVIDIA. Good luck with that.
| alecco wrote:
| 1. I only said the meaning of the title is wrong, and I
| praised the content
|
| 2. I didn't say CUDA wouldn't be ground up or low level
| (please re-read) (I do mention a no-code guide with CUDA in
| another comment, but that's obviously a joke)
|
| 3. And finally, I think your comment comes across as
| holier-than-thou finger-pointing, making a huge deal out of a
| minor semantic observation.
| nerdponx wrote:
| Low level by what standards? Is writing an IRC client in Python
| using only the socket API also not "from scratch"?
| badsectoracula wrote:
| Considering I seem to be in the minority here, based on all the
| other responses to the message you replied to, the answer I'd
| give is "by mine, I guess".
|
| At least when I saw "Building LLMs from the Ground Up", what I
| expected was someone opening vim, emacs, or their favorite text
| editor and writing some C code (or something around that level)
| to implement, well, everything from the "ground" (the operating
| system's user space, which in most OSes is around the overall
| level of C) and "up".
| nerdponx wrote:
| The problem with this line of thinking is that 1) it's all
| relative anyway, and 2) the notion of "ground" is completely
| different depending on which perspective you have.
|
| To a statistician or a practitioner approaching machine
| learning from a mathematical perspective, the computational
| details are a distraction.
|
| Yes, these models would not be possible without automatic
| differentiation and massively parallel computing. But there
| is a lot of rich detail to consider in building up the
| model from first _mathematical_ principles, motivating
| design choices with prior art from natural language
| processing, various topics related to how input data is
| represented and loss is evaluated, data processing
| considerations, putting things into the context of machine
| learning more broadly, etc. You could fill half a book
| chapter with that kind of content (and people do), without
| ever talking about computational details beyond a passing
| mention.
|
| In my personal opinion, fussing over manual memory
| management is far afield from anything useful unless you
| want to actually work on hardware or core library
| implementations like PyTorch. Nobody else in industry is
| doing that.
| wredue wrote:
| Gluing together premade components is not "from the
| ground up" by most people's definition.
|
| People look to "ground up" for a clear picture of what the
| thing is actually doing, so masking the important part of
| what is actually happening and then calling it "ground up"
| is disingenuous.
| nerdponx wrote:
| Yes, but "what the thing is actually doing" is different
| depending on what your perspective is on what "the thing"
| and what "actually" consists of.
|
| If you are interested in how the model works
| conceptually, how training works, how it represents text
| semantically, etc., then I maintain that computational
| details are an irrelevant distraction, not an essential
| foundation.
|
| How about another analogy? Is SICP not a good foundation
| for learning about language design because it uses Scheme
| and not assembly or C?
| theanonymousone wrote:
| It may be unreasonable, but I have a default negativity toward
| anything that uses the word "coding" instead of programming or
| development.
| xanderlewis wrote:
| Probably now an unpopular view (as is any opinion perceived as
| 'judgemental' or 'gatekeeping'), but I agree.
| smartmic wrote:
| I fully agree. We had a discussion about this one year ago:
| https://news.ycombinator.com/item?id=36924239
| ljlolel wrote:
| This is more a European thing
| atoav wrote:
| I am from Europe and I am not completely sure about that to
| be honest. I also prefer programming.
|
| I also dislike "software development", as it reminds me of
| developing a photographic negative - like "oh, let's check out
| how the software we developed came out". It should be software
| engineering, and it should be held to a similar standard as
| other engineering fields when it's done in a professional
| context.
| mdp2021 wrote:
| > _software development_
|
| Wrong angle. There is a problem, your consideration of the
| problem, the refinement of your solution to the problem:
| the solution gradually unfolds - it is developed.
| reichstein wrote:
| The word "development" can mean several things. I don't
| think "software development" sounds bad when grouped with a
| phrase like "urban development". It describes growing and
| tuning software for, well, working better, solving more
| needs, and with fewer failure modes.
|
| I do agree that a "coder" creates code, and a programmer
| creates programs. I expect more of a complete program than
| of a bunch of code. If a text says "coder", it does set an
| expectation about the professionalism of the text. And I
| expect even more from a software solution created by a
| software engineer. At least a specification!
|
| Still, I, a professional software engineer and programmer,
| also write "code" for throwaway scripts, or just for
| myself, or that never gets completed. Or for fun. I will
| read articles by and for coders too.
|
| The word is a signal. It's neither good nor bad, but if that's
| not the signal the author wants to send, they should work on
| their communication.
| mdp2021 wrote:
| > _If that 's not the signal the author wants to send_
|
| You can't use a language that will be taken by everyone
| the same way. The public is heterogeneous - its subsets
| will use different "codes".
| SkiFire13 wrote:
| As a European: my language doesn't even have a proper
| equivalent of "coding", only a direct translation of
| "programming".
| badsectoracula wrote:
| I'm from Europe and my language doesn't have an equivalent of
| "coding" either, but I've still been using the English words
| "coder" and "coding" for decades - in my case I learned them
| from the demoscene [0], where they were always used for
| programmers since the 80s. FWIW the demoscene is (or at least
| was) largely a European thing (groups outside of Europe did
| exist, but the majority of both groups and demoparties were -
| and I think still are - in Europe), so perhaps there is some
| truth to the "coding" word being a European thing (e.g. it
| sounded OK in some languages and spread from there).
|
| Also, to my ears "coder" always sounded cooler than
| "programmer", and it wasn't until a few years ago that I first
| heard it has negative connotations for some people. Too late to
| change though; it still sounds cooler to me :-P.
|
| [0] https://en.wikipedia.org/wiki/Demoscene
| mdp2021 wrote:
| Quite a cry, on the submission page from one of the most
| language-"obsessed" members of this community.
|
| Now: "code" is something you establish - as the content of the
| codex medium (see https://en.wikipedia.org/wiki/Codex for its
| history); from the field of law, a set of rules, exported in
| use to other domains since at least the mid XVI century in
| English.
|
| "Program" is something you publish, with the implied content of
| a set of intentions ("first we play Bach then Mozart" - the use
| postdates "code"-as-"set of rules" by centuries).
|
| "Develop" is something you unfold - good, but it does not imply
| "rules" or "[sequential] process" like the other two terms.
| cpill wrote:
| Yeah, really valuable stuff. So we know how the ginormous model
| that we can't train or host works (in practice there are so
| many hacks and optimizations that none of them work like this).
| Great.
___________________________________________________________________
(page generated 2024-09-01 23:01 UTC)