[HN Gopher] Continuous Unix commit history from 1970 until today
___________________________________________________________________
Continuous Unix commit history from 1970 until today
Author : FrankyHollywood
Score : 188 points
Date : 2022-06-16 14:04 UTC (8 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| danschuller wrote:
| We have all this commit data at scale, it really feels like there
| are interesting stories or lessons that could be extracted from
| them.
|
| There's kind of the obvious operational stuff like: What are the
| properties of commits that introduce bugs compared to those that
| don't. Which type of commits are rarely changed and which are
| more likely to be changed over time. But what I'd find even more
| interesting is some insight into how we solve problems and how
| well we're able to solve them. I guess part of the puzzle is
| missing - the external requirements / environment that give rise
| to some number of the commits.
| DSpinellis wrote:
| There is a series of conferences MSR -- Mining Software
| Repositories -- with research papers looking at such questions.
| http://www.msrconf.org/ In fact, I presented this work in the
| 2015 MSR conference.
| vandahm wrote:
| You don't see this every day:
|
| https://github.com/dspinellis/unix-history-repo/blob/Researc...
|
| Is this B, or is it BCPL? What would have compiled this code back
| in the day?
| marcodiego wrote:
| They had "auto" vars in 1970. WG14, the ISO work group that
| maintains the C programming language specification, has just
| recently discussed acceptance of __auto_type.
|
| EDIT: oops, the "auto" here means automatic allocation.
| hoten wrote:
| very weird that two characters - $( and $) - were used before {
| and }
|
| did old keyboards not have curly braces or what?
| kps wrote:
| {} were added to the 1967 revision of ASCII, along with `|~
| and lower case. (EBCDIC never got them in the base character
| set, only in alternate 'code pages'.)
| pm215 wrote:
| Wikipedia's article on B says that BCPL used := for assignment
| and = for equality tests, whereas B used = for assignment and
| == for equality. Assuming that's correct, this must be B code.
| projektfu wrote:
| It's B. BCPL has "LET MAIN() BE $(..." instead of "main $(...".
|
| Running B was a challenge on the PDP-7 but easier on the
| PDP-11, apparently, because of the increase of memory size. The
| linked document has an interesting history about compiling B to
| threaded code, a form of interpreted code, and then to machine
| language. B never really made the jump to a full-fledged
| citizen because it quickly got replaced by C, although BCPL was
| popular for a long time.
|
| https://www.bell-labs.com/usr/dmr/www/chist.html
| Erlangen wrote:
| So _auto_ is used as a keyword here. Maybe C inherits this
| never-used auto from B?
| veltas wrote:
| auto stands for 'automatic', because such variables are
| automatically allocated for each function invocation. In C it
| became redundant because base types were added, and so the
| base type could start the definition (auto was still
| permitted with default base type of int until C99 I think).
| auto in B is a bit like 'let', it starts a declaration, along
| with 'extrn'.
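|
| A minimal sketch of the "automatic" storage described above, written
| in C rather than B. The function names next_automatic and next_static
| are hypothetical, and the explicit `auto int` spelling assumes a
| compiler that still accepts auto as a storage-class specifier next to
| a type:

```c
/* Hypothetical helper: 'n' is an automatic variable, so it gets fresh
   storage (and is re-initialised) on every invocation. */
int next_automatic(void)
{
    auto int n = 0;   /* 'auto' is redundant here; plain 'int n' means the same */
    n = n + 1;
    return n;         /* always 1 */
}

/* Hypothetical helper for contrast: 'n' is a single static object
   that keeps its value between invocations. */
int next_static(void)
{
    static int n = 0;
    n = n + 1;
    return n;         /* 1 on the first call, 2 on the second, ... */
}
```

| Calling next_automatic() twice returns 1 both times, while
| next_static() returns 1 and then 2.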
| mftb wrote:
| Yea, I have to say, to me, this is cool. Glad to see this sort
| of history being preserved.
| judge2020 wrote:
| Is that truly from 1970? For example, that commit's grandparent
| seems to have been specifically crafted to use "Date: Thu, 1
| Jan 1970 00:00:00 +0000"
| https://github.com/dspinellis/unix-history-repo/commit/185f8....
| anyfoo wrote:
| That's 0 in Unix epoch time (guess why!), so seems more like
| a missing timestamp than a crafted one. The fact that the
| linked file does not have a 0 timestamp, but a slightly later
| one, suggests it's valid, or at least intended to be valid.
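|
| A small sketch of why a zero timestamp shows up as that date, assuming
| a standard C library. The helper name format_utc is hypothetical, and
| the format string only approximates the layout quoted above (it
| zero-pads the day of the month):

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: render a Unix timestamp as a UTC date string.
   Returns the number of characters written, or 0 on failure. */
size_t format_utc(time_t t, char *buf, size_t len)
{
    struct tm *utc = gmtime(&t);   /* break the timestamp down in UTC */
    if (utc == NULL)
        return 0;
    return strftime(buf, len, "%a, %d %b %Y %H:%M:%S +0000", utc);
}
```

| With the default "C" locale, format_utc(0, buf, sizeof buf) fills buf
| with "Thu, 01 Jan 1970 00:00:00 +0000", the start of the Unix epoch.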
| Nition wrote:
| I recall that in A Deepness in the Sky by Vernor Vinge, a
| space sci-fi set in the far future, they're still using
| Unix time underneath many many layers of abstractions, and
| with their cultural context they guess that humanity must
| have set it to start with the moment mankind first
| travelled into space to land on the Moon.
| anyfoo wrote:
| Hah, plausible. Not far off timewise, and yet totally
| wrong, but understandable how such a conclusion could be
| made.
| swatcoder wrote:
| I don't know, but I love how clearly and concisely it expresses
| what would later become ubiquitous as do-while and continue.
|
| That's poetry. Nice find.
| stingraycharles wrote:
| I love how thin the layer above assembly is: without knowing
| B, is my interpretation correct that this function
| effectively "inherits" the stack of the calling function? In
| other words, rather than passing function arguments and letting
| the compiler deal with it, you're supposed to push the string
| you want to lcase onto the top of the stack?
|
| Reminds me a lot of writing my own compiler/assembler in
| university, where it's expected that all this happens
| automatically nowadays.
| anyfoo wrote:
| Hmm, don't think so. The function does not operate on a
| string, it seems to read a character using read() and write
| it back, transformed, using write(). Given that the
| function is named main, it's probably the top level
| function anyway (from the programmer's point of view, often
| the OS actually calls into a different function that is
| part of the language runtime, e.g. _start, which in turn
| calls main eventually, but that is usually hidden from the
| programmer).
| messe wrote:
| No, that's not correct. It reads the string from standard
| input. A C translation would look like this:
|     main() {
|         int ch;
|         while ((ch = read()) != 4) {
|             if (ch > 0100 && ch < 0133)
|                 ch = ch + 040;
|             if (ch == 015) continue;
|             if (ch == 014) continue;
|             if (ch == 011) {
|                 ch = 040040;
|                 write(040040);
|                 write(040040);
|             }
|             write(ch);
|         }
|     }
|
| A more modern C version would look like:
|     #include <stdio.h>
|
|     int main(void)
|     {
|         int ch;
|         while ((ch = getchar()) != -1) {
|             if (ch > 0100 && ch < 0133)
|                 ch = ch + 040;
|             if (ch == 015) continue;
|             if (ch == 014) continue;
|             // No need to handle tabstop specially
|             putchar(ch);
|         }
|     }
| justsomeguy123 wrote:
| The Gource Visualization video link, which points to
| https://www.youtube.com/watch?v=S7JB0mhrGCQ, no longer works.
|
| > Video unavailable
| > This video is no longer available because the YouTube account
| > associated with this video has been terminated.
| danuker wrote:
| We need to solve this problem.
|
| YouTube is free to delete any account, even just to cut costs.
| wolverine876 wrote:
| I assume Github, the host of the OP, can do the same. How
| many people have entrusted their life's work to it?
| cmeacham98 wrote:
| I'm not sure what the problem to be solved here is. It
| doesn't seem reasonable to force YouTube (or any other free
| video host) to indefinitely store and host content.
|
| If you want something to stay around on the internet it has
| to take up space on somebody's drive and bandwidth on
| somebody's network connection - and for sufficiently large
| content like video you're going to have to do that yourself
| or convince/pay someone you trust to do so on your behalf.
| roansh wrote:
| How would you feel if your commits became publicly available for
| everyone to see forever?
| pavon wrote:
| That ship sailed nearly half a century ago. All of this source
| code was previously licensed to research universities starting
| in 1975. The earlier releases weren't under FLOSS licenses as
| we know them today, but were shared with the intent that
| researchers would be reading, learning from, and modifying the
| code. And they did, creating later BSD Unix releases whose code
| was shared more widely under more permissive licenses.
|
| Finally, the people who created this repo are some of the
| primary authors of the code. They wanted this to be in the
| open.
| jrochkind1 wrote:
| Really proud to be a part of history.
| e40 wrote:
| Isn't it cool? I mean, being in the history of a project like
| this... it could be around long after we are gone.
| alar44 wrote:
| Fine. You?
| duxup wrote:
| I hope everyone is ok with cursing....
| ARandomerDude wrote:
| This is the point of GitHub. Also Unix was(/is) a masterwork of
| craftsmanship. Struggling to see a problem here.
| projektfu wrote:
| I love Spinellis' work on teaching reading of code.
| PAPPPmAc wrote:
| Diomidis Spinellis' "Code Reading: The Open Source Perspective"
| is a thing I've wanted but didn't know existed, browsing it now
| to hopefully recommend, thanks for the pointer.
|
| I work with computer engineering students and often tell them
| that reading more code would be good for them but have never
| had a great generic but concrete suggestion for how to get
| there.
|
| The second best programming class I took in college was a
| graduate elective and the _only_ code-reading-based course I
| took or knew of being offered: a guided safari in the Linux
| kernel sources where we had to make targeted changes for the
| assignments. FTR, the best programming class was set up as "new
| language in a different paradigm every few weeks, write one
| small program that suits it and one small program that
| doesn't," not incidentally taught by the same person (
| https://en.wikipedia.org/wiki/Raphael_Finkel ).
| dgrin91 wrote:
| I like how Github shows it as infinity commits
| deathanatos wrote:
| What's up with that? There only seem to be 4, on HEAD?
| caslon wrote:
| Check the other branches.
| deathanatos wrote:
| I saw the other branches when I made the comment.
|
| The commit count is -- usually -- the commit count from the
| currently selected ref.
|
| E.g., on a sample repo, "master" displays as 29,474
| commits. "master^" displays as 29,473.
| kevincox wrote:
| I always expected that the commit count was for that
| branch. I guess it is global?
| [deleted]
| ollien wrote:
| Yeah, is that a bug? lol
| mywittyname wrote:
| Sounds like an overflow bug prevention mechanism.
|
| There are an infinite number of infinities, so surely one of
| them is the maximum possible commits in github.
| kps wrote:
| Git runs into problems with more than 2^160 commits in a
| repository.
| ChrisMarshallNY wrote:
| That's a _lot_ of work!
|
| A true labor of love.
|
| Thanks!
| ninefathom wrote:
| Anybody feel brave enough to try merging in SVR4?
|
| https://github.com/dspinellis/unix-history-repo/blob/Researc...
|
| https://github.com/illumos/illumos-gate/blob/9ecd05bdc59e4a1...
| mprovost wrote:
| This repo has been super useful as I've been writing a book that
| teaches Rust by rewriting classic Unix utilities. I settled on
| using the 4.4 BSD source as a base but having the whole history
| available has been really interesting. Recently I came across a
| bug in the 4.4 version of cat that wasn't fixed until a few years
| later (in FreeBSD).
| sydthrowaway wrote:
| Who holds the canonical unix repo?
| kps wrote:
| There is no canonical Unix repository.
|
| Unix (1969) predates source version control (1972).
| throw0101a wrote:
| > _IBM's OS/360 IEBUPDTE software update tool dates back to
| 1962, arguably a precursor to version control system tools. A
| full system designed for source code control was started in
| 1972, Source Code Control System for the same system
| (OS/360). Source Code Control System's introduction, having
| been published on December 4, 1975, historically implied it
| was the first deliberate revision control system.[4] RCS
| followed just after,[5] with its networked version Concurrent
| Versions System. The next generation after Concurrent
| Versions System was dominated by Subversion,[6] followed by
| the rise of distributed revision control tools such as
| Git.[7]_
|
| * https://en.wikipedia.org/wiki/Version_control#History
| sydthrowaway wrote:
| Who owns the modern unix copyright?
| ChrisArchitect wrote:
| You don't see this every day.....
|
| But you do see it every year for the last number of years
|
| Some previous discussion from 3 years ago:
|
| https://news.ycombinator.com/item?id=19429249
___________________________________________________________________
(page generated 2022-06-16 23:00 UTC)