[HN Gopher] Computer Science from the Bottom Up
___________________________________________________________________
Computer Science from the Bottom Up
Author : hliyan
Score : 183 points
Date : 2021-04-29 14:34 UTC (8 hours ago)
(HTM) web link (www.bottomupcs.com)
(TXT) w3m dump (www.bottomupcs.com)
| thamalama wrote:
| https://csunplugged.org/en/
|
| "Computer science" is a terrible name. Astronomy is not called
| "telescope science", and biology is not called "microscope
| science".
| tantalor wrote:
| Telescopes and microscopes are subjects of theory of optics,
| which studies the nature of light.
|
| The "computer" in "computer science" is referring to turing
| machines, automata, etc, which are subjects of theory of
| computation, which studies the nature of computable functions.
| [deleted]
| Joker_vD wrote:
| Slightly off-topic, but... am I the only one who is troubled by
| the fact that the explanations of "fork and exec" are almost
| never, ever, discuss the rationale for picking this design
| instead of "CreateProcess()" a.k.a.
| "just_launch_this_exe_file()", or even acknowledge this
| alternative design? A student would probably expect that to
| launch an app, there is a system call that would do exactly that:
| you pass it the name of the application/executable file, and it
| launches. Instead, there is a small dance of duplicating itself
| and then re-writing one of the copies--try not to get tangled up
| in the legs while doing that!
|
| The fork+exec has many unobvious advantages over CreateProcess,
| but also disadvantages, and one of them is that it's, in my
| opinion, a completely unexpected approach: I can't think of any
| other system object that can only be created by first copying
| another already existing object and then overwriting the newly
| created one if needed. Files of any kind
| (folder/pipe/socket/etc)? Memory mappings? Signal handling?
| Process groups/sessions? Users/groups? The processes seem to be
| unique in this regard. How did people come up with this approach?
| That would be an interesting discussion, I think.
|
| Instead, it's just described as the completely self-justified,
| with description of "zombies" thrown in the end for boots: and
| the zombie processes are pretty much an accidental artifact of
| early UNIX design; if the fork was returning not only globally
| visible (and therefore unstable) PID but also an fd associated
| with the child process, neither wait/waitpid syscalls nor reaping
| duties of PID 1 would have been necessary.
| patrec wrote:
| > The fork+exec has many unobvious advantages over
| CreateProcess,
|
| What are the advantages you see, that are not rooted in the
| fact that it's very difficult to create any remotely sane
| declarative API in a language as imperative and primitive as C?
| In other words which of these advantages would still apply if
| you wrote your OS in, say, Ocaml?
|
| Splitting fork and exec allows you to roughly hew the process
| environment of a clone of the parent into shape, process state-
| wise, before you sacrifice it to birth the child-process which
| will inherit many of these desired (and generally a few
| undesired) traits. So one big advantage is that it allows you
| to re-use an existing (and growing!) set of imperative commands
| to modify aspects of a running process to specify the same
| aspects for a child process,and that piecemeal mutation is
| basically the only way to express anything of any complexity in
| C, particularly if you don't want to break API compatibility
| all the time.
|
| The downside is that you generally end up inheriting a bunch of
| cruft that you really didn't want to, that the imperative and
| piecemeal nature opens up problems with race conditions, and
| that there is a lot of overhead only partially mitigated by
| various complex hacks (COW, various more "leightweight" fork
| alternatives, special flags to general purpose system calls
| that are only there to control behavior upon forking, ...).
| dataflow wrote:
| What I hate about fork is that it fundamentally doesn't even
| make sense. You fork a process with a window, what happens to
| that window? Does it get duplicated? Do both control the same
| window? It has no sensible behavior in the general case without
| cloning the whole machine, and even then, your network isn't
| going to get forked. There are limited cases where it could
| make sense, but the fact that that's not true in general should
| make it fairly obvious that we need a different spawning
| primitive. It boggles my mind that people teach fork as if it
| could be the primitive.
| megous wrote:
| Depends on what do you mean by window. If the window lives in
| a X server and you clone the client, window will obviously
| not get duplicated.
|
| It's quite clearly defined what gets duplicated and what gets
| shared on clone()/fork(). There really is no ambiguity.
| Joker_vD wrote:
| So, do threads get duplicated or shared? Or something
| altogether different happens to them?
| megous wrote:
| That's documented in man 2 fork.
| dataflow wrote:
| I'm not saying a particular implementation is ambiguous as
| to what it _does_. Obviously any implementation will do
| _something_. I 'm saying that fork as a _concept_ is
| ambiguous as to what it _should_ do.
| robotresearcher wrote:
| POSIX defines what a POSIX compliant fork() must do.
| dataflow wrote:
| Yes it does.
| megous wrote:
| You're forking a process, not the whole system. Sorry,
| but I fail to see the ambiguity. If the window is in the
| process as some framebuffer, it gets cloned too. That may
| not mean you'll get two windows on your monitor, because
| your display engine is HW thing, and not a process.
| dataflow wrote:
| It's because you're thinking "it does X! it's not
| ambiguous!" but still missing that the question is over
| why it SHOULD do X instead of Y, not over _whether_ it
| does X or Y. To a user, it sure as heck doesn 't make
| sense to fork a process and still end up with one window
| for both of them. For lots of processes the process _is_
| its windows as far as the user is concerned.
| robotresearcher wrote:
| Fork isn't an idea that users need to have. Devs may need
| to know what their OS API's fork() does. No-one else
| cares.
| [deleted]
| kccqzy wrote:
| Well we do have a library call for launching a new process
| called posix_spawn(3), but it's complicated mainly because they
| are a lot of options you want to configure when launching a
| process: which file descriptors would you like to close, which
| would you like to share with the newly created process, what
| uid the new process should have, what working directory it
| should have, etc. It's just a large and complicated call.
|
| But with fork/exec, all of that setup becomes just normal code
| you run after fork but before exec. You want the child not to
| have a certain file descriptor? Just close it after fork. You
| want to drop privileges when running the child? Same thing,
| just setuid/setgid/setgroups after fork. You want to set up
| resource limits for the child? Again just setrlimit after fork.
|
| It avoids a lot of complexity in the system call itself.
| (Naturally, it adds some other complexity elsewhere.)
| wmf wrote:
| Besides fork() and spawn() there's another option which is to
| create an empty process, configure it as needed, then start
| it. IMO this is what simple, orthogonal primitives looks
| like. This generally isn't possible in Unix but AFAIK it's
| possible in Mach.
| Joker_vD wrote:
| Yes, I know. But you only figure it out after you've done you
| fair share of IPC and management of worker processes by hand.
| When you're a fresh student, this "fork+exec" just seems like
| a pretty ridiculous way to organize things: why not just
| launch the executable you want to launch immediately? Nope,
| that's at best postponed until the chapter on IPC, at worst
| it's never discussed at all, so you're left puzzled and with
| "okay, I guess that's how things are done, if you say so..."
| feeling.
|
| Oh, and by the way: we have open()/fcntl() for opening files
| and then fiddling with their settings instead of one open()
| call with 20 arguments; we could easily have had launch()
| with the same parameters as execve() that would launch the
| new process suspended, then we could use... I dunno, even
| fcntl() on the process's descriptor to configure all those
| things and then send it SIGCONT when we're done setting it
| up.
| kccqzy wrote:
| When you are a fresh student, and when fork/exec is too
| advanced for you, just use system(3). Not recommended for
| production use, but that's indeed the easiest way to launch
| an executable.
|
| Do we optimize our system call design for fresh students or
| for professionals?
| Joker_vD wrote:
| We were talking about the presentation of "fundamentals"
| in the courses on Computer Science, right? The fork/exec
| design is highly non-trivial and so should probably
| deserve more discussion, with examples of alternative
| designs and design considerations, than mere "that's how
| new processes are created, nothing special here, let's
| move on".
| kps wrote:
| If you want to look at it as 'fundamentals', fork only
| requires the concept of a task, whereas your top-level
| comment's suggestion also requires (a) secondary storage,
| (b) organized into files, (c) that can be named.
|
| (That's not _why_ Unix developed with fork, though.)
| Joker_vD wrote:
| The (b) and (c) are not really needed: early timesharing
| systems managed to launch processes straight from the
| punched card decks IIRC. And without (a), what do you
| even need the second, identical task for? I guess it
| could be used parallelize crunching the numbers, but
| that's it; and probably that should perhaps be done from
| the system operator layer anyway? Like, "launch 5 copies
| of program on this deck but mark one specific slot in
| memory different for them, so they would know their ID".
| khaledh wrote:
| From "Operating Systems: Three Easy Pieces" chapter on "Process
| API" (section 5.4 "Why? Motivating The API") [1]:
| ... the separation of fork() and exec() is essential in
| building a UNIX shell, because it lets the shell run
| code after the call to fork() but before the call to
| exec(); this code can alter the environment of the about-to-be-
| run program, and thus enables a variety of interesting
| features to be readily built. ... The
| separation of fork() and exec() allows the shell to do a whole
| bunch of useful things rather easily. For example:
| prompt> wc p3.c > newfile.txt In the example
| above, the output of the program wc is redirected into the
| output file newfile.txt (the greater-than sign is how
| said redirection is indicated). The way the shell
| accomplishes this task is quite simple: when the child is
| created, before calling exec(), the shell closes standard
| output and opens the file newfile.txt. By doing so, any
| output from the soon-to-be-running program wc are sent
| to the file instead of the screen.
|
| [1] https://pages.cs.wisc.edu/~remzi/OSTEP/cpu-api.pdf
| kingkongjaffa wrote:
| In the beginning, bell labs invented the transistor...
| tachyonbeam wrote:
| I mean there were vacuum tubes and relay machines before that
| :)
|
| IMO relays are a bit easier to understand than transistors
| because it's a bit easier to understand how to implement
| various logic gates with them with only very basic knowledge of
| electronics.
|
| Also, very satisfying clicking sounds:
| https://www.youtube.com/watch?v=JZyFSrNyhy8&t
| ggggtez wrote:
| This is not computer science. I see nary a reference to data
| structures or algorithms, or math of any kind.
|
| These are all engineering concerns, not science.
| globular-toast wrote:
| Computer science isn't science.
| jll29 wrote:
| A program often is a model.
| neatze wrote:
| hmm, applied science is not engineering ?
| ravi-delia wrote:
| Applied science is engineering, by and large, but it's
| worthwhile to have different words for different things.
| 'Engineering' is one, 'Science' is another, even though the
| latter is useful to the former.
|
| Of course, Computer Science is almost no science, most
| degrees are some blend of math and engineering.
| jll29 wrote:
| As with anything that entirely depends who is at work.
| Watching Linus Torvalds is going to make you arrive at your
| answer, but if you followed Don Knuth, you may have to
| reconsider.
| mikedilger wrote:
| I'm disappointed this didn't start from "sand", quantum
| mechanics, silicon doping, electromagnetism, electronics,
| transistors, gates and latches, etc.
|
| Also, Computer Science is a much broader field than what this
| website addresses including very little science but actually a
| mixture of math (e.g. P vs NP stuff) and engineering (both
| hardware and software), this website addressing a subset of the
| engineering part.
| optimalsolver wrote:
| I think these CS books should start with a philosophical look at
| what kind of thing computation actually is, rather than diving
| straight into UNIX file system structures with barely any
| context.
|
| But I guess that would be computer science from the top down.
| maweki wrote:
| I am a theorist and I would say, that the bottom - as it is the
| foundation - is mathematical logic. And you start from there.
|
| Baffingly enough: All this theory is easily understood with
| 7th-grade math.
|
| Approaching from some specific piece of engineering is surely
| top-down.
| akersten wrote:
| Yeah, the first page of this book being "Everything is a File
| in Unix" is already so high-level and far above what I would
| call the "bottom-up" of computer science that this book should
| be renamed "Unix from the bottom-up."
|
| The "bottom" of computer science would be mathematically
| explaining what it means to compute, and building up to the
| abstraction of a Turing machine, and eventually getting to a
| concrete implementation of what we call programming and code.
| This book has code on page 2 - but code is nowhere near the
| bottom.
| hctaw wrote:
| Good book on that is _Introduction to the Theory of
| Computation_ by Michael Sipser.
|
| One could argue it's from the _very_ bottom up, not much
| practicum lies in that text.
| zaksingh wrote:
| This book was one of the best parts of my CS degree. It's the
| only textbook which I read cover-to-cover because the writing
| and narrative were so cohesive. Highly recommend.
| pc9 wrote:
| Does anyone have a comparison between this and Computer Systems:
| A Programmer's Perspective? I was planning to read that book and
| this looks like a similar resource.
| qntty wrote:
| CSAPP is like the theory behind low-level (C/C++/etc)
| programming. This is more like a sketch of the concepts behind
| operating systems. CSAPP is also much longer and more
| technical. Where they overlap CSAPP is going to be much more
| complete.
| mariodiana wrote:
| I can't be the only one, but I feel pretty sure I'm in a minority
| among computer programmers. This "bottom-up" approach is just not
| the way my brain works. I definitely feel I'm a top-down sort of
| thinker and learner, and this puts me at a disadvantage,
| regarding the literature and approach in this profession.
|
| Are the terms describing the two approaches here "empirical"
| versus "rational"? I'm talking about building bit by bit as
| opposed to getting an overview and then drilling down. I
| definitely feel like I benefit from the second. And I almost feel
| like it may not be too strong to suggest that I'm actively harmed
| by the first.
|
| Something like the author's approach is _great,_ for my purposes,
| when I already pretty much understand something. But I can 't
| _start_ here.
|
| Again, I feel like I'm in a minority. But anyone else have this
| turn of mind, too? How do you cope? Do you just look for other
| materials?
| rambambram wrote:
| You put into words what I've been feeling for a long time. So I
| guess you're not alone. I 'cope' by just turning it around:
| finding some meaningful project, decide what I want it to be,
| and then find out one step at a time how to get closer. That
| starts with me finding out - and then being comfortable enough
| with - a big overview of the project. Somehow I keep coming
| back at the 'work is like a hill' metaphor from Ryan Singer in
| his book Shape Up (from basecamp).
| https://basecamp.com/shapeup/3.4-chapter-13#work-is-like-a-h...
| AlanSE wrote:
| I think it would be good to re-package this information, but
| for me, the critical thing that stuff like this accomplishes is
| comprehensiveness. You have to go through so many classes at
| school just to have all the knowledge base covered at all.
| Everything is compartmentalized by default, so that's kind of
| the mother of all bottom-up approaches.
|
| Top-down takes time and reflection, and re-packaging.
| Jtsummers wrote:
| Past discussions with comments (including the author addressing
| the naming issue):
|
| https://news.ycombinator.com/item?id=13249675 (2016, 157
| comments)
|
| https://news.ycombinator.com/item?id=21903007 (2019, 77 comments)
| dang wrote:
| Also
|
| _Computer Science from the Bottom Up_ -
| https://news.ycombinator.com/item?id=7611093 - April 2014 (44
| comments)
| brundolf wrote:
| This seems highly focused on systems; I wouldn't call it
| representative of "Computer Science" as a whole. Only maybe 2-3
| of the classes in my computer science degree were focused on this
| kind of stuff.
|
| Still seems like a good resource, just perhaps mislabeled.
| macintux wrote:
| I'm surprised you had that many. My CS program had nothing
| comparable that I can remember.
|
| Definitely not computer science.
| brundolf wrote:
| We had one called "Systems Programming" that was a whirlwind
| intro to everything Unix (6 weeks on bash, 6 weeks on C, 6
| weeks on Perl, a mishmash of SSH and everything-is-a-file and
| Unix-philosophy, etc)
|
| We had another called "Operating Systems" that focused on
| things like concurrency, building a really basic memory
| pager, etc (mix of C and Java I think)
|
| And we had one other called "Networking" that was sort of
| along the same lines as "Operating Systems" but focused on,
| well, networking. Mostly Java; we wrote our own application-
| layer protocols, that sort of thing
|
| But yeah pretty much everything else - including the more
| "concrete" programming courses like Data Structures and
| Algorithms - was fairly abstracted away from any specific
| system
| macintux wrote:
| Nice, and now that I think of it we did have an operating
| systems class. I know better than to trust 25-year-old
| memories.
| michaelcampbell wrote:
| My undergrad CS degree I don't think contained any sort
| of OS course, other than maybe a survey of them but there
| was a graduate level class that was required for any
| higher degree. I took it before I graduated with an
| undergrad because it was interesting, and it was quite
| extensive as I recall. That was well over 20 years ago,
| however.
| ivankolev wrote:
| That sounds a lot like part of the UofT CS program.
| elcapitan wrote:
| Yeah it's basically an Operating Systems book like Tanenbaum's
| "Modern Operating Systems".
| zerkten wrote:
| > This seems highly focused on systems; I wouldn't call it
| representative of "Computer Science" as a whole.
|
| I'd agree with this, but it fits with how a certain set of
| schools teaches their core. They will teach a set of related
| classes around the topics here: computer architecture,
| operating systems, and networking, because this is often
| central to their research interests. Others will scatter this
| information across a number of unrelated courses to check the
| box, or skip it entirely because their research doesn't touch
| as heavily on these topics.
|
| Further research on the author suggests they are from New
| Zealand. I'm from the UK and live in the US. What I see taught
| in US CS degrees is somewhat different from what I saw in the
| UK, and heard about from friends in continental Europe. I
| imagine the author's experience is similar.
|
| A further bias from the author, and I'm making big assumptions,
| is that see value here because they've worked in places that do
| lot of system programming. Mentions of VMware and Red Hat on
| https://www.wienand.org.
| userbinator wrote:
| Yes, the same comment appears whenever this comes up on HN.
| Bottom-up would actually be something more like Nand2Tetris or
| Petzold's _code_.
| _wldu wrote:
| Years ago, systems was the general purpose CS degree. I guess
| it is/was classical CS. Today, there are many specialties
| (data, security, algorithms, etc.), but many top-ranked CS
| schools still offer a traditional systems degree.
| Yottaqubyter wrote:
| The reason for the name seems to be because of its focus on
| systems. I do agree that the way it's labeled isn't the best,
| but it's not trying to be representative of all Computer
| Scionce.
|
| In fact, it seems to specifically be trying to have a different
| focus than a cs degree:
|
| > This book aims to move in completely the opposite direction,
| working from operating systems fundamentals through to how
| those applications are complied and executed.
| brundolf wrote:
| "opposite direction" tends to imply "the same content, just
| with a different order/framing". Whereas the OP leaves out
| the majority (not all, but more than half) of what people
| tend to put under the term "computer science" completely
| ChrisArchitect wrote:
| anything new added?
|
| Some earlier discussion:
|
| 1 year ago https://news.ycombinator.com/item?id=21903007
|
| 4 years ago https://news.ycombinator.com/item?id=13249675
|
| 7 years ago https://news.ycombinator.com/item?id=7611093
| wildmanx wrote:
| As much as I appreciate such material, but this is not "computer
| science". Maybe "computer engineering", or some mix of "software
| engineering", "systems engineering", those things.
|
| Computer science is about algorithms, complexity, possibly
| foundations of programming languages, cryptography, database
| theory, those things. Not about Unix or file formats.
|
| Those are important too, but not "computer science".
| wrnr wrote:
| The ELF binary format, really? This is like a course advertising
| itself as an introduction to normative ethics and proceeding to
| lecture people on canon law in the 14th century in Flanders. Like
| sure if you into obscure shit this might be interesting but dunno
| what one has to do with the other.
| Joker_vD wrote:
| Well, more like the common law in the 21st century in the USA.
| Quite useful in nowadays practice, but obviously neither
| universal nor eternal. But could be a good starting point as a
| running example, or a reference point when comparing to the
| "civil law" systems.
| megous wrote:
| Not sure what's obscure about ELF. Almost every one of the
| programs I use is wrapped in ELF format, except for maybe a
| bootloader and scripts.
|
| If you want to learn from bottom up, then learning what
| programs are seems like an obvious approach.
| snicker7 wrote:
| Looks more like computer engineering than computer science.
| MithrilTuxedo wrote:
| Yeah, that kinda bothered me too.
|
| >Computer Science is no more about computers than astronomy is
| about telescopes.
| xbar wrote:
| Similar but different from
| https://www.bartlettpublishing.com/site/product/programming-...
| beckman466 wrote:
| Just read page one and two and I already LOVE the conversational
| style of writing from the author. Who the hell decided we should
| write textbooks in such a dry and formal way (this has been my
| experience). Maybe the textbooks fail me because the author is
| not able to translate their excitement into text (I have ADD, so
| there's that too)? Of course it's also due the nature of the
| copyright/IP system, together with the hierarchies inherent to
| capitalism. The competition between workers selling their labor
| works to the advantage of employers/capitalists.
|
| Anyway, learning is basically just a long conversation where we
| learn about the feedback loops about whatever concept or system
| we are trying to understand and interact with. For me this
| author's style (and others like him) works a million times better
| to do that in an intuitive way.
___________________________________________________________________
(page generated 2021-04-29 23:01 UTC)