[HN Gopher] scrapscript.py
       ___________________________________________________________________
        
       scrapscript.py
        
       Author : surprisetalk
       Score  : 290 points
       Date   : 2024-01-23 15:19 UTC (1 days ago)
        
 (HTM) web link (bernsteinbear.com)
 (TXT) w3m dump (bernsteinbear.com)
        
       | rahimnathwani wrote:
       | This is about how scrapscript is implemented.
       | 
       | There was a popular Show HN about 9 months ago, about scrapscript
       | itself: https://news.ycombinator.com/item?id=35712163
        
       | aardvarkr wrote:
       | Interesting tidbit: the book series that this website is named
       | for is actually spelled berEnstAin bears, emphasis on the letters
       | that everyone (including myself) remembers being spelled the
       | other way. I literally learned this yesterday
        
         | LeonB wrote:
         | I concur! This is considered part of the Mandela effect^1,
         | isn't it?
         | 
         | The Wikipedia article on the aforementioned bears even has a
         | section on it --
         | 
         | https://en.wikipedia.org/wiki/Berenstain_Bears#Name_confusio...
         | 
         | ^1 - which was the Mandala effect, in my original universe, I'm
         | sure.
        
           | lkirkwood wrote:
           | Why would it be the Mandala effect? It's named for the mass
           | false memory of Nelson Mandela dying in prison in the 80s.
        
             | amenhotep wrote:
             | One high profile archetype of the mandala is the sand
             | mandala, where practitioners painstakingly construct an
             | intricate mandala out of sand over the course of days and
             | then ritually sweep it away once it's complete, leaving no
             | trace, as a meditation on impermanence or something like
             | that.
             | 
             | Much like how in the Mandela effect the original universe
             | is wiped at least partially away, leaving no trace of what
             | was a complex and fully featured aspect of the timeline,
             | other than what remains in your memory. Other people say
             | "no, that's always been a table" while you remember the
             | sand that was on top of it. Or something along those lines!
             | For some people the resonance is strong enough for the
             | mandala imagery to potentially overwrite the Mandela
             | etymology. Especially if you're a person who's never
             | experienced the effect about Mandela himself.
        
         | juliusgeo wrote:
         | I think the website name is a pun based on the last name of its
         | author.
        
           | tekknolagi wrote:
           | Correct!
        
         | NoZebra120vClip wrote:
         | > everyone
         | 
         | The books were a very minor part of my childhood, but I noticed
         | immediately, and my family always pronounced it correctly.
        
       | hitekker wrote:
       | I enjoyed the link to the language checklist
       | https://www.mcmillen.dev/language_checklist.html
        
         | CodeCompost wrote:
         | Same here. It's the first time I've ever seen it.
        
         | popcorncowboy wrote:
         | > Programming in this language is an adequate punishment for
         | inventing it
         | 
         | I was already laughing hard by this final punchline. Bravo.
        
       | deepnet wrote:
       | Elegant and pure.
       | 
       | I also like the Javascript lambda calculus this is a fork of.
       | 
       | Like early Haskell when it was just for fun before Haskell's
       | Meta-monadic library sprawl that upped the learning curve
        
       | cdchn wrote:
       | Making python scripts into an 'Actually Portable Executable' is
       | what really interested me here.
        
         | fermigier wrote:
         | Is this the reason why there is only one 5+ KLOC module (which
         | includes the tests)? I personally prefer short / shorter
         | modules with clear responsibilities.
        
           | tekknolagi wrote:
           | No. This is not a limitation of APE/Cosmopolitan. This is
           | just my personal preference because imports get tricky in
           | Python land unless you either have a single file or go Full
           | Package Mode. There's probably a world where we split the
           | tests out, though.
        
             | fermigier wrote:
             | Well, just to better understand how the code is organized
             | (and experiment a bit with it), I have forked it:
             | https://github.com/sfermigier/scrapscript/tree/trunk/src/scr...
        
         | semi-extrinsic wrote:
         | Not just python scripts, you can package any C code as well.
         | We've used it to compile up a "python.com" APE file with a
         | python 3.11 interpreter that has lots of packages (including C
         | extensions) that we can just drop straight into old airgapped
         | lab instrument computers and get a modern python data analysis
         | suite up and running.
        
           | actionfromafar wrote:
           | Now integrate Shedskin the Python compiler. :)
        
             | semi-extrinsic wrote:
             | Thanks, TIL. Now we can combine this with the xlcalculator
             | package and transpile models built in Excel right down to C
             | code and build it as a portable executable.
        
               | mst wrote:
               | That's both terrifying and wonderful and I hope to see a
               | write up of it on the front page one day :)
        
           | cl3misch wrote:
           | That sounds very interesting! I probably could hack it
           | together myself, but do you happen to have a writeup on that,
           | or maybe some pointers on how to include the numpy-scipy
           | stack into the executable?
        
         | iainmerrick wrote:
         | I'm a little confused by this part:
         | 
         |  _This executable is theoretically runnable on all major
         | platforms without fuss. And the Docker container that we build
         | with it_ [...]
         | 
         | It sounds like they're putting an APE inside a Docker
         | container, but why would you want both?
        
           | zilti wrote:
           | Because some people throw a tantrum if it doesn't come
           | pre-dockerized, I suppose
        
           | zem wrote:
           | perhaps to integrate with tooling that wants to work with
           | containers
        
             | tekknolagi wrote:
             | Yep, to deploy on fly.io
        
           | cl3misch wrote:
           | No, they _build_ the APE with a docker container. The APE
           | itself is... actually portable.
        
             | tekknolagi wrote:
             | We do both. We also deploy a Docker container that runs
             | scrapscript.com
        
       | Karupan wrote:
       | I have to admit this broke my brain. This is the first time I'm
       | hearing about content addressable languages, and once you get
       | over that barrier, a distributed language doesn't seem far
       | fetched.
       | 
       | As a big fan of functional programming, is this something that is
       | going to just end up being an esoteric language? Don't get me
       | wrong, I absolutely love the vision of the authors, but after
       | being bitten by the Elm bug and that crashing and burning, I'm
       | just cautious of getting invested in new languages and tools.
        
         | adius wrote:
         | Nothing crashed and burned. Elm is in a state where it's fully
         | usable and has all the features you need. I use it every day!
         | 
         | Since it's a DSL to create HTML + JS + CSS websites, you still
         | get all the new features of browsers!
        
           | Karupan wrote:
           | Except for the compiler bugs, lack of self hosted package
           | management, improvements to tooling, etc. I use it regularly
           | too, but it is frustrating to see it in a state of decay.
           | 
           | They can definitely claim the above points and more are not
           | goals, and Evan absolutely has every right to do so. But
           | don't be surprised when devs see it as dead.
        
           | mst wrote:
           | I think, even though I can absolutely see the arguments for
           | doing so, locking down the compiler to no longer accept
           | external native extensions was a huge mistake community wise,
           | since a lot of the people advocating for Elm were the sort of
           | early adopter who really really wants an escape hatch because
           | they're in the habit of getting into situations where they
           | need one no matter what tools they're using.
           | 
           | Certainly that describes me, and when all the people who
           | seemed to be like me got told to go sit on a cactus by the
           | Elm core developers and bailed out to work with something
           | else, my experiments got pretty much immediately shelved and
           | Elm moved into the "interesting place to steal ideas from,
           | actively hostile to my actually using it" category.
           | 
           | This may be unfair, but I'm pretty sure it's a reasonable
           | description of what -did- happen, fair or not.
        
         | MarceColl wrote:
         | In case you don't know about this, this is kind of what they
         | are trying to achieve: https://www.unison-lang.org/
        
           | Karupan wrote:
           | Thank you, this looks awesome. I already have a use case in
           | mind, so will explore unison since it seems more mature.
        
           | slowmotiony wrote:
           | I still don't get it, could someone smarter than me explain?
           | 
           |     helloWorld : '{IO, Exception} ()
           |     helloWorld _ = printLine "Hello World"
           | 
           | The example above is followed by explanation "{IO, Exception}
           | indicates which abilities the program needs to do I/O and
           | throw exceptions." Well, which abilities does it need then?
           | No idea.
        
             | trenchgun wrote:
             | >Well, which abilities does it need then? No idea.
             | 
             | Abilities called IO and Exception.
             | 
             | I am sure you are familiar with effect systems and
             | algebraic effects, right? Abilities are what algebraic
             | effects are called in Unison:
             | https://www.unison-lang.org/docs/fundamentals/abilities/
             | 
             | So, in Haskell you would have IO monad and Exception monad,
             | but in Unison you have an IO ability and an Exception
             | ability.
             | 
             | If you want to know more:
             | https://www.unison-lang.org/docs/language-reference/abilitie...
             | and: Convent, L., Lindley, S., McBride, C. and McLaughlin, C.,
             | 2020. Doo bee doo bee doo. Journal of Functional Programming,
             | 30, p.e9. https://arxiv.org/pdf/1611.09259.pdf
        
               | sanderjd wrote:
               | > _I am sure you are familiar with effect systems and
               | algebraic effects, right?_
               | 
               | Probably not, based on their question!
               | 
               | These are pretty esoteric concepts. I think it was one of
               | a few bullet points on "other interesting ideas" in the
               | functional programming portion of my programming
               | languages course, and I doubt that most working
               | programmers have taken an academic PL course like that at
               | all.
               | 
               | But effects are indeed an awesome concept, and thanks for
               | the excellent links! The parent is one of today's lucky
               | 10,000: https://xkcd.com/1053/
        
               | helboi4 wrote:
               | As a junior that did not do CS at uni, a lot of stuff
               | around here goes right over my head. I often do feel that
               | I might never catch up. I just about understand what
               | functional programming is in terms of a one line
               | definition let alone any concepts that fall under it. To
               | be fair, I only really use Object Oriented.
        
               | slowmotiony wrote:
               | Thank you!
        
             | trenchgun wrote:
             | >I still don't get it, could someone smarter than me
             | explain?
             | 
             | It is not about smartness, but it is probably you not
             | having encountered these concepts before.
        
         | csantini wrote:
         | Interesting idea, but isn't code addressable already in most
         | languages?
         | 
         | We call them modules/libraries and we pip/npm install them from
         | Github and you can keep track of changes/versions/PRs.
        
           | kitd wrote:
           | Content-addressable, not code addressable. It's kind of like
           | global, distributed memoization (IIUC).
           | 
           | edit: not memoization, just hashing the AST of a function.
        
             | throwaway290 wrote:
             | Content is by definition content addressable. x = 42 is a
             | hardlink to every other instance of x = 42 if you will.
             | What this does is more compact and practical content
             | addressing, like Nix or Git. But realizing that there is
             | always more than one way of expressing the same logic (with
             | different hashes no matter how you canonicalize) makes me
             | doubt it is a killer feature.
        
           | trenchgun wrote:
           | Content addressable has a very specific meaning:
           | https://en.wikipedia.org/wiki/Content-addressable_storage
           | 
           | Modules and libraries are addressable based on their names or
           | URIs.
           | 
           | "Unison eliminates name conflicts. Many dependency conflicts
           | are caused by different versions of a library "competing" for
           | the same names. Unison references definitions by hash, not by
           | name, and multiple versions of the same library can be used
           | within a project."
           | https://www.unison-lang.org/docs/what-problems-does-unison-s...
           | 
           | "Here's the big idea behind Unison, which we'll explain along
           | with some of its benefits:
           | 
           | Each Unison definition is identified by a hash of its syntax
           | tree.
           | 
           | Put another way, Unison code is content-addressed. Here's an
           | example, the increment function on Nat:
           | 
           |     increment : Nat -> Nat
           |     increment n = n + 1
           | 
           | While we've given this function a human-readable name (and
           | the function Nat.+ also has a human-readable name), names are
           | just separately stored metadata that don't affect the
           | function's hash. The syntax tree of increment that Unison
           | hashes looks something like:
           | 
           | increment = (#arg1 -> #a8s6df921a8 #arg1 1)
           | 
           | Unison uses 512-bit SHA3 hashes, which have unimaginably
           | small chances of collision.
           | 
           | If we generated one million unique Unison definitions every
           | second, we should expect our first hash collision after
           | roughly 100 quadrillion years!"
           | https://www.unison-lang.org/docs/the-big-idea/
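
To make the "hash of its syntax tree" idea above concrete, here is a minimal sketch in Python. It is not Unison's or scrapscript's actual algorithm: the names `content_hash` and `_Canonicalize` are made up for illustration, and it only normalizes the function name and parameter names of a single Python function before hashing. Real systems also replace references to other definitions with their hashes, which this sketch skips.

```python
import ast
import hashlib


class _Canonicalize(ast.NodeTransformer):
    """Replace the function name and parameter names with placeholders."""

    def __init__(self):
        self.renames = {}

    def visit_FunctionDef(self, node):
        node.name = "_"  # the human-readable name is metadata, not content
        for index, arg in enumerate(node.args.args):
            self.renames[arg.arg] = f"arg{index}"
            arg.arg = f"arg{index}"
        self.generic_visit(node)  # rewrite uses of the parameters in the body
        return node

    def visit_Name(self, node):
        node.id = self.renames.get(node.id, node.id)
        return node


def content_hash(source: str) -> str:
    """Hash the normalized syntax tree of `source` with SHA3-512."""
    tree = _Canonicalize().visit(ast.parse(source))
    canonical = ast.dump(tree, include_attributes=False)  # ignore line/column info
    return hashlib.sha3_512(canonical.encode("utf-8")).hexdigest()


increment = content_hash("def increment(n):\n    return n + 1\n")
renamed = content_hash("def succ(x):\n    return x + 1\n")
reordered = content_hash("def increment(n):\n    return 1 + n\n")
```

Under these assumptions, `increment` and `renamed` are the same hash because only names differ, while `reordered` gets a different hash even though it computes the same result. That second case is the limitation raised elsewhere in the thread: structurally different spellings of the same logic are distinct definitions as far as the hash is concerned.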
        
             | flir wrote:
             | Seems like identifying your library with a git tag would
             | drop that risk to zero.
             | 
             | I guess what I'm not understanding here is the utility. Why
             | is it useful to include multiple versions of a library in a
             | project? Is this a limitation I've been coding around
             | without knowing it?
        
               | celeritascelery wrote:
               | Have you ever had a problem where two of your dependencies
               | are each using a different version of the same library?
               | Or have you ever wanted to incrementally upgrade an API
               | so that you don't have to change your entire code base in
               | one fell swoop? That is where things like Unison or
               | scrapscript can make it very easy.
        
               | flir wrote:
               | Ok, I can see "incremental upgrade" as a use-case.
               | Thanks.
        
               | penteract wrote:
               | One reason for multiple versions of a library in a
               | project is that the project wants to use 2 different
               | dependencies, which themselves depend on incompatible
               | versions of a third library.
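
A toy illustration of the diamond-dependency case described above, assuming a made-up in-memory `DefinitionStore` (not part of Unison or scrapscript) where definitions are keyed by the hash of their source text. Because nothing is looked up by name, two incompatible versions of the "same" function never compete for a slot.

```python
import hashlib


class DefinitionStore:
    """Hypothetical content-addressed store: hash of source -> source."""

    def __init__(self):
        self._by_hash = {}

    def put(self, source: str) -> str:
        digest = hashlib.sha3_512(source.encode("utf-8")).hexdigest()
        self._by_hash[digest] = source
        return digest

    def get(self, digest: str):
        # eval is tolerable only because this is a toy with trusted input.
        return eval(self._by_hash[digest])


store = DefinitionStore()

# Two incompatible versions of one library function (illustrative only).
parse_v1 = store.put("lambda text: text.split(',')")
parse_v2 = store.put("lambda text: [field.strip() for field in text.split(';')]")


def dep_a(text):
    """Stands in for a dependency written against version 1."""
    return len(store.get(parse_v1)(text))


def dep_b(text):
    """Stands in for a dependency written against version 2."""
    return store.get(parse_v2)(text)[0]
```

Both `dep_a` and `dep_b` can live in one project, since each pins the exact definition it was written against. Storing the same source twice yields the same hash, so identical definitions are shared rather than duplicated.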
        
               | flir wrote:
               | ok, yep, that's one I've had myself. Thanks.
        
               | cnity wrote:
               | I recommend reading the benefits section in the Unison
               | docs[0].
               | 
               | 0: https://www.unison-lang.org/docs/the-big-idea/#benefits
        
               | computerfriend wrote:
               | Tags are not immutable.
        
             | bestai wrote:
             | I think it is something like Hoogle for haskell but instead
             | of looking for the types of the functions you look for a
             | hash of some kind of canonical encoding of the definition,
             | so it is like an encoded knowledge graph but you should
             | have to give rules in order to construct that graph in a
             | canonical way.
             | 
             | Edited: What I thought was wrong, anyway the idea of above
             | could be useful for something like copilot to complete
             | definitions.
        
           | throwaway290 wrote:
           | That does not sound like it could make any money though...
        
       | fermigier wrote:
       | This reminds me of a talk Tim Berners-Lee did in 2002 (at the
       | 10th Python conference):
       | 
       | https://www.w3.org/2002/Talks/0206-python/ ("Webizing Python")
       | 
       | I wasn't there but I remember hearing that this wasn't well
       | received by the participants.
       | 
       | Also, TBL references a post by Aaron Swartz at the end of his
       | slides:
       | https://web.archive.org/web/20050208021219/logicerror.com/we...
       | (also titled "Webizing Python")
        
       | account-5 wrote:
       | Can someone explain what this is? Why it's a good thing? What
       | it's for? I have to admit based on reading the post I have
       | absolutely no idea.
        
         | KTibow wrote:
         | From reading https://scrapscript.org/ it sounds like its main
         | feature is that things can be split up, put on platforms like
         | IPFS, and distributed allowing you to access them from
         | wherever.
        
         | surprisetalk wrote:
         | I should probably write a longer post about this, but
         | scrapscript is an attempt to fix a lot of the "in-between"
         | problems in software engineering.
         | 
         | Instead of working on "real" problems, I find myself battling
         | untyped/undocumented YAML/JSON configurations, syncing JSON
         | encoders/decoders, massaging incompatible dependencies, writing
         | unholy SQL, etc.
         | 
         | I obviously don't have all the answers, but a system with the
         | following properties seems like a worthwhile pursuit: (1) small
         | enough to be used like JSON yet powerful enough to be used like
         | Javascript, (2) cryptographic guarantees that code is
         | compatible over time, (3) a compiler that checks live servers
         | for compatibility before deploying, (4) simple but expressive
         | type system, (5) a package manager that facilitates all of this
         | at a granular level... and so on.
         | 
         | On top of all that, I think these properties lend themselves to
         | some grand ambitions like "a new internet" and a "google-docs
         | live coding editor experience". Maybe I'm just full of myself
         | though haha
         | 
         | scrapscript.py is the first real attempt at making scrapscript
         | a reality, so some folks who feel these pains are getting
         | excited to see some movement on the project.
         | 
         | EDIT: Here's my recent scrapyard demo, if you want to see it in
         | action: https://www.youtube.com/watch?v=SngOLU5G1Eg
        
           | account-5 wrote:
           | Still not sure I fully understand, but that is more than
           | likely down to my ignorance. I really appreciate your effort
           | in explaining here. I should mention I'm not a full time
           | developer and certainly not a webdev so this might be why I'm
           | not grokking this. Thanks.
        
           | mst wrote:
           | I'd been kind of interested by https://yglu.io/ and now
           | ingy's new piece of insanity https://yamlscript.org/ - helm
           | appears to let you inject your own script to template charts
           | and I was wondering about trying a wrapper around one of
           | those (because text templating an indentation sensitive
           | language like YAML makes me itch).
           | 
           | I think scrapscript is a really interesting idea, mind, this
           | isn't a "here's an alternative" type comment, it's a "here's
           | things that I think are neat in a similar way to how I think
           | scrapscript is neat" :)
           | 
           | Edit: I forgot something!
           | https://trout.me.uk/lisp/termite-r7rs.pdf is a paper on
           | adding library support to the cross-network (kinda erlangish)
           | termite scheme extensions - and leans heavily on content
           | addressable-ness. Termite itself has gone the way of small
           | lisp projects but I kept this around specifically for the
           | content addressable stuff having been solidly worked out in a
           | language I understood; maybe that'll come in handy for ideas
           | for you as well.
        
           | dflock wrote:
           | I assume you're well aware of: https://www.unison-lang.org/ -
           | as well as 9p and union mounting from plan9.
        
             | surprisetalk wrote:
             | Yes, I'm aware :) I actually built the first scrapscript
             | demo in 2018, drawing on inspiration from Ethereum's
             | Solidity. Somebody pointed me toward Unison when I attended
             | Strange Loop in 2019, and I chatted with Paul Chiusano, and
             | it seemed like Unison and Scrapscript had incompatible
             | design goals. Even now, I don't see much overlap outside of
             | content-addressability. Unison is super cool though, and I
             | wish their team the best!
        
           | nerdponx wrote:
           | It might be interesting to include a comparison with Dhall
           | and Jsonnet while you're writing docs.
        
           | asveikau wrote:
           | > (2) cryptographic guarantees that code is compatible over
           | time,
           | 
           | What does this mean? You hash dependencies?
           | 
           | > (3) a compiler that checks live servers for compatibility
           | before deploying,
           | 
           | Why does a compiler need to talk to a server? Why should it?
           | Seems like a huge step backwards in what a compiler is and
           | expecting it to work later on.
        
             | surprisetalk wrote:
             | _> What does this mean? You hash dependencies?_
             | 
             | Yes, but everything is hashed at the expression-level
             | rather than at the file-level, which prevents a few classes
             | of errors.
             | 
             |  _> Why does a compiler need to talk to a server? Why
             | should it? Seems like a huge step backwards in what a
             | compiler is and expecting it to work later on._
             | 
             | Imagine if Javascript tooling could throw an error when a
             | client implementation diverges from the server's expected
             | input/output types:
             | 
             |     > const res = await fetch("https://example.com/api", [1, 2, 3]);
             |     ERROR: You're sending this REST endpoint a list of
             |     integers, but it expects a string!
             | 
             | Wouldn't that be nice in some applications?
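
The kind of check imagined above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: `SERVER_INPUT_TYPES` stands in for a schema a server might publish, and `conforms` / `check_request` are hypothetical helpers. It also runs at call time, whereas the comment describes tooling that would catch the mismatch before deployment.

```python
from typing import get_args, get_origin

# Hypothetical published schema: endpoint URL -> type the server expects.
SERVER_INPUT_TYPES = {"https://example.com/api": str}


def conforms(value, expected) -> bool:
    """Return True if value matches expected (plain types and list[T] only)."""
    if get_origin(expected) is list:
        (item_type,) = get_args(expected)
        return isinstance(value, list) and all(
            conforms(item, item_type) for item in value
        )
    return isinstance(value, expected)


def check_request(url: str, payload) -> None:
    """Raise TypeError if payload does not match the endpoint's declared type."""
    expected = SERVER_INPUT_TYPES[url]
    if not conforms(payload, expected):
        raise TypeError(
            f"{url} expects {expected!r}, but the payload is {type(payload).__name__}"
        )
```

Calling `check_request("https://example.com/api", [1, 2, 3])` raises `TypeError`, mirroring the error in the example, while a string payload passes silently.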
        
           | manifoldgeo wrote:
           | > I find myself battling untyped/undocumented YAML/JSON
           | configurations, syncing JSON encoders/decoders, massaging
           | incompatible dependencies
           | 
           | I feel your pain on having to manage so many dependencies. I
           | write primarily in Python, and the various pip / Pipenv /
           | pipx / PDM / Poetry dependency managers drive me pretty
           | crazy. That's not even accounting for the multiple Python
           | versions I need!
           | 
           | That said, I'm surprised that you're trying to _alleviate_
           | this by implementing your FP language in Python. The Python
           | ecosystem is full of half-documented config files,
           | incompatible dependency trees, etc.
           | 
           | Have you considered implementing it in any other languages
           | after the Python one proves its worth? For example, if the
           | language becomes strong enough, would you consider writing a
           | scrapscript compiler in scrapscript, itself?
        
             | surprisetalk wrote:
             | Yeah, I'm not a huge fan of Python, but Max and Chris are
             | world-class in that domain, so that's what we're doing for
             | now.
             | 
             | Max has already started working on a meta scrapscript
             | compiler:
             | https://github.com/tekknolagi/scrapscript/pull/100
             | 
             | One thing I think we all agree on is that the
             | implementations should be simple enough to easily port
             | themselves to other languages. For example, one could
             | probably port the existing scrapscript.py to Rust or
             | Javascript using GPT in a single weekend.
             | 
             | You can see echoes of what I'm talking about in my tiny JS
             | POC:
             | https://github.com/tekknolagi/scrapscript/blob/trunk/scrapsc...
             | 
             | Some languages like Rust and Go put a lot of weight on the
             | "official" implementation. I think scrapscript can be more
             | like Lisp/Json where the spec guides parallel
             | implementations. There are obvious downsides to this in
             | general, but I think that content-addressability makes some
             | of those problems moot.
        
             | tekknolagi wrote:
             | None of these config/dependency problems are present in
             | scrapscript.py because it has no external dependencies and
             | is written in one file. This is intentional!
        
       | ingenieroariel wrote:
       | Since this uses cosmopolitan and the build script already
       | downloads portable binaries from https://cosmo.zip has any
       | thought been given to wrapping other portable binaries in
       | scrapscript / downloading them?
       | 
       | Small, pure, functional, content-addressable and network-first
       | sounds a lot like a mini Nix+ca-derivations [1]
       | 
       | [1] https://www.tweag.io/blog/2021-12-02-nix-cas-4/
        
       ___________________________________________________________________
       (page generated 2024-01-24 23:01 UTC)