hngopher.com

       [HN Gopher] Unison Programming Language
       ___________________________________________________________________
        
       Author : gautamcgoel
       Score  : 395 points
       Date   : 2021-06-27 16:05 UTC (1 days ago)
        
 (HTM) web link (www.unisonweb.org)
        
       | sdfjkl wrote:
       | > Run ucm init to initialize a Unison codebase in $HOME/.unison
       | 
       | Ugh, this conflicts with my favorite file sync tool:
       | https://www.cis.upenn.edu/~bcpierce/unison/
        
         | carapace wrote:
         | Unison (the file sync tool) is so awesome!
        
           | zepearl wrote:
           | It's great when it works, but different versions of <OCAML?
           | and/or Unison?> on different hosts/VMs screw it up. I never
           | understood the real reason why (but I never really
           | investigated, lazy) => it's a pity.
        
         | pchiusano wrote:
         | FYI you can do 'ucm -codebase /someplace/else init' to
         | initialize a codebase in another directory.
         | 
         | And then 'ucm -codebase /somewhere/else' to launch ucm.
         | 
         | Also I'm not sure what Unison the file sync tool stashes in
         | .unison or how it uses that directory but there might not be
         | any conflict. UCM will just create a .unison/v2/unison.sqlite3
         | file.
        
       | barnyfried wrote:
       | This is stupid
        
       | thom wrote:
       | Super interesting, I was thinking about append-only codebases
       | very recently:
       | 
       | https://news.ycombinator.com/item?id=27492727
       | 
       | The implications of this, with the right frameworks and
       | processes, seem potentially huge.
        
         | joe_the_user wrote:
         | The thing is, the closest thing to append-only you have with
         | ordinary programming languages is APL and its variants, where
         | you can construct powerful functions with powerful primitives
         | and methods for combining them.
         | 
         | But the thing is that APL quickly becomes a "write only"
         | language - as far as I know, the main use of APL is by someone
         | sitting at a brokerage who can cobble together any algorithm
         | in twenty minutes and often throws away the result afterwards.
         | 
         | Which is to say, Unison is interesting because it seems to
         | underestimate the importance of a program's code as
         | _document_, as complete, coherent, human-readable, single-view,
         | intentionally created text. Why hasn't the stream of ASCII been
         | replaced as the format of programs in the last twenty years?
         | It's a good question, but the answer isn't that it's just a
         | matter of conservatism. There are several other things involved.
        
           | acjohnson55 wrote:
           | It doesn't discard the text. The text, like the documentation
           | and comments, is stored and re-rendered during editing. It's
           | fundamentally textual. It just doesn't have to be stored as a
           | one-dimensional text stream.
        
       | steve_taylor wrote:
       | > A friendly programming language from the future.
       | 
       | It looks like a programming language from the present at best. A
       | programming language from the future would have finally broken
       | free from the prison of plain text.
        
         | lolinder wrote:
         | Look a little closer. This is not a graphical programming
         | language, but it's definitely not "plain text" either.
        
           | steve_taylor wrote:
           | I looked a little closer and found this:
           | 
           | > Unison is a language in which programs are not text. That
           | > is, the source of truth for a program is not its textual
           | > representation as source code, but its structured
           | > representation as an abstract syntax tree.
           | >
           | > This document describes Unison in terms of its default (and
           | > currently, only) textual rendering into source code.
           | 
           | Or to put it more concisely, _Unison is currently a
           | plain-text programming language._
        
             | TuringTest wrote:
             | That's akin to saying that machine code is not binary but
             | plain text, because Assembler exists.
             | 
             | It is more accurate to say, _Unison is an AST language
             | which nowadays happens to have just one human-readable
             | interface, which is text_.
        
       | ajuc wrote:
       | This is like maven at function level with no SNAPSHOT versions
       | allowed :)
       | 
       | How is recursion handled? To get a hash for a function you have
       | to have hashes for every function it calls. Is there a special
       | "recurse" opcode?
       | 
       | And how do you update a function implementation when you have a
       | cycle in the callgraph?
        
         | refried_ wrote:
         | Yes to Maven, and good questions.
         | 
         | Simply put, there is a special recurse opcode.
         | 
         | When you have a cycle in the call graph, they all get hashed
         | together as a single unit; you update them together as a single
         | unit too.
        
       | umvi wrote:
       | I had a really hard time wrapping my mind around this just
       | reading the website alone. If you are in the same boat, watch the
       | first 10 minutes of this video at 1.5x speed:
       | https://www.youtube.com/watch?v=gCWtkvDQ2ZI
       | 
       | and it will make so, so much more sense.
       | 
       | ...and if you are like me you'll probably need to read this
       | twitter thread to get the answer to your #1 question:
       | https://twitter.com/unisonweb/status/1173942974381744134
       | 
       | Basically the core idea (or one of the core ideas) is that
       | instead of a function (like fib(n), which returns the nth
       | Fibonacci number) being identified by its _name_ (fib), as is
       | the case with most traditional languages, it's identified by a
       | hash of its implementation.
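
A toy illustration of that core idea, assuming a plain SHA-256 over
the definition's text. Unison hashes a syntax tree rather than text,
and `definition_hash`, `codebase`, `names` and `add` are made-up
names for this sketch, not anything from the Unison tooling:

```python
import hashlib

def definition_hash(body: str) -> str:
    """Identify a definition by its content; the name is not an input."""
    normalized = " ".join(body.split())  # ignore whitespace differences only
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]

codebase = {}  # hash -> body, only ever added to
names = {}     # name -> hash, plain metadata that can be repointed

def add(name: str, body: str) -> str:
    """Store a body under its hash and point the given name at it."""
    digest = definition_hash(body)
    codebase.setdefault(digest, body)
    names[name] = digest
    return digest

fib_hash = add("fib", "n -> if n < 2 then n else go (n - 1) + go (n - 2)")
alias_hash = add("fibonacci", "n -> if n < 2 then n  else go (n - 1) + go (n - 2)")
other_hash = add("fib_wrong", "n -> if n < 1 then n else go (n - 1) + go (n - 2)")

print(fib_hash == alias_hash)  # same body, so both names share one hash
print(len(codebase))           # only two distinct bodies are stored
```

Two names with the same body collapse to one stored definition, while
a one-character change in the body produces an unrelated hash.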
        
         | taneq wrote:
         | Why not just identify it as the function text at that point?
        
         | tgbugs wrote:
         | Interesting. You can write a macro and some buffer modifying
         | code to do this in elisp. But having now written up the rest of
         | my response, why not just use Smalltalk?
         | 
         | The hard part is coming up with the normalization routine which
         | guarantees that (lambda (a) b a) == (lambda (b) a b) and coming
         | up with the rules for statement reordering for top level and
         | internal definitions so that you can identify semantically
         | equivalent statements where the outcome is order invariant.
         | This is critical for making the hash functions useful and I
         | suspect preventing denial of service attacks on the human
         | brains that have to audit the code.
         | 
         | Being able to write a version of the code and then do the
         | equivalent of creating a package.lock file to crystallize the
         | hashes seems like a reasonable workflow. This probably winds up
         | being easier in common lisp though since you can put the
         | crystallized implementations in their own packages.
         | 
         | You could also view this as a kind of extreme type theory where
         | every function (with regular names) has the type of its
         | normalized representation (compacted to a hash for sanity's
         | sake) and then you can run the checker to see if the
         | types/hashes have changed. If you have somewhere that keeps
         | track of every hash that a function with a particular name has
         | had then you could automatically refactor, or could even
         | support having multiple versions of the function with the same
         | name used in a program at the same time. I'm not sure how users
         | would feel about having to carry around
         | `(funcall ((with-norm-id '(lambda (+ a b)) f)) a b)` though ...
         | probably just give up on editing the textual representation and
         | go back to the image based approach of Smalltalk and Interlisp
         | where you can hide the hashes.
         | 
         | Will be interesting to see how Unison evolves.
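
One common way to get the renaming part of that normalization is to
replace bound variable names with their distance to the binder (de
Bruijn indices) before hashing. The sketch below assumes a tiny
lambda-calculus term encoding invented for this example. It says
nothing about statement reordering, and it is not claimed to be what
Unison does:

```python
import hashlib

# Terms: ("var", name) | ("lam", param, body) | ("app", fn, arg)
def normalize(term, env=()):
    """Rewrite bound variables as indices so parameter names drop out."""
    kind = term[0]
    if kind == "var":
        name = term[1]
        if name in env:
            return ("bound", env.index(name))  # 0 = innermost binder
        return ("free", name)                  # free names are kept as-is
    if kind == "lam":
        return ("lam", normalize(term[2], (term[1],) + env))
    if kind == "app":
        return ("app", normalize(term[1], env), normalize(term[2], env))
    raise ValueError(f"unknown term kind: {kind}")

def term_hash(term) -> str:
    return hashlib.sha256(repr(normalize(term)).encode("utf-8")).hexdigest()[:12]

# (lambda (a) (lambda (b) (a b))) and the same term with x, y
ab = ("lam", "a", ("lam", "b", ("app", ("var", "a"), ("var", "b"))))
xy = ("lam", "x", ("lam", "y", ("app", ("var", "x"), ("var", "y"))))
# Same shape, but the arguments are applied the other way round
ba = ("lam", "a", ("lam", "b", ("app", ("var", "b"), ("var", "a"))))

print(term_hash(ab) == term_hash(xy))  # renaming parameters does not matter
print(term_hash(ab) == term_hash(ba))  # swapping the application does
```

Free variables keep their names here, so two terms that differ only
in a free name still hash differently.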
        
         | lisper wrote:
         | Sounds like a cool idea, but how do you fix bugs in functions
         | with lots of callers?
        
           | umvi wrote:
           | That's basically what the twitter thread I linked explains.
           | It sounds like there is an automatic propagation mechanism
           | for updating downstream callers if the type hasn't changed,
           | otherwise it sounds like a manual update process.
        
             | torginus wrote:
             | Sounds like trading one set of problems for another.
        
               | vanderZwan wrote:
               | Tangent: here's what I find fascinating: when we evaluate
               | which algorithm or data structure is best to handle a
               | given situation, we know how to reason about algorithmic
               | complexity and pick a best option for our situation.
               | 
               | But then when it comes to ideas like this we just tend to
               | say "we're trading one set of problems for another", as if
               | we can't evaluate the problems in a similar manner. And
               | I'm not picking on you here, I tend to do the same!
               | 
               | Yes, we're trading one set of problems for another, but
               | what if the old set of problems was "O(n^2)" and the new
               | set of problems is "O(n log n)"? Or maybe it's the other
               | way around. Why isn't it obvious how to apply those
               | earlier skills here?
        
               | capableweb wrote:
               | Welcome to software engineering where there are no golden
               | bullets, only different tradeoffs :)
        
               | aparsons wrote:
               | Law of conservation of complexity
        
           | acjohnson55 wrote:
           | The same way we do now: release a new version and tell people
           | to migrate.
        
             | lisper wrote:
             | No, now I can just redefine the buggy function and all the
             | callers will get the new version automatically. Having to
             | update all callers seems like a high price to pay. Seems
             | like the Right Answer is something like a hash of the api
             | or the contract rather than the implementation.
        
               | ucarion wrote:
               | The (public) name of the implementation is the unique
               | identifier of the contract in most systems, so I think
               | your "Right Answer" is roughly the status quo?
        
               | lisper wrote:
               | No. The name of a function in current languages has no
               | connection at all to what the function does. (What would
               | be the contract for a function named 'foo'?)
        
               | gbhn wrote:
               | That's kind of the thing that makes APIs possible, right?
               | It sounds to me like "what if programming were done in a
               | completely flat global namespace in which abstractions,
               | encapsulation, and structure were impossible."
        
               | lisper wrote:
               | No. An API specifies more than the name of the function.
               | It will specify the arguments, their types, the type of
               | the return value, and at least informally, what the
               | function does. You can change the underlying
               | implementation without changing the API. That's the whole
               | point of an API. The problem with current API technology
               | is the informality of the spec of what the function
               | does. That allows some aspects of the behavior of the
               | function to change without triggering any warnings.
               | 
               | By having the linker work on hashes of implementations
               | you eliminate that problem but create a new problem. You
               | can no longer change the behavior of the function because
               | you can't change the function. That means you can't
               | suddenly change behavior that some caller is counting on,
               | but it also means you can't fix bugs without changes in
               | the caller.
        
               | tgbugs wrote:
               | Reading this now, I'm imagining all the horrors of static
               | linking but applied to every function instead of whole
               | modules.
               | 
               | Maybe the simplest solution is to allow the function to
               | change to the new version, but make it easy to revert in
               | the event that something breaks. This of course means
               | that you can't make the names of the functions their hash
               | (without lying, preventing the runtime from checking that
               | hashes always match, or modifying emitted bytecode or
               | native code to do what you want), it has to be an
               | orthogonal layer on top of them like types (as I
               | mentioned elsewhere in the thread).
        
               | ajuc wrote:
               | Nope, according to
               | https://twitter.com/unisonweb/status/1173942969726054401
               | 
               | when you change a function implementation the system has
               | to walk the callers graph backwards starting from all the
               | places where the function was called updating all the
               | implementations with the new hash, then callers of these
               | with the new implementation and so on up to main (or
               | whatever it's called).
               | 
               | I had a chance to implement something like this in a
               | system that used jbpm 3 graph language (basically process
               | X version 1 called process Y version 1 and I updated
               | process Y to version 2). It's nontrivial especially with
               | recursion, I'm wondering how they are dealing with that.
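
A small sketch of why the change ripples upward, assuming an acyclic
call graph and a hash that covers the body plus the hashes of its
callees. `node_hash`, `hash_all` and the four sample definitions are
invented for illustration. Recomputing bottom-up here stands in for
walking the caller graph backwards; cycles are not handled:

```python
import hashlib

def node_hash(body: str, callee_hashes) -> str:
    """Hash a body together with the hashes of everything it calls."""
    data = body + "|" + ",".join(callee_hashes)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()[:12]

def hash_all(bodies, calls):
    """bodies: name -> body; calls: name -> list of callee names (no cycles)."""
    hashes = {}

    def visit(name):
        if name not in hashes:
            hashes[name] = node_hash(bodies[name], [visit(c) for c in calls[name]])
        return hashes[name]

    for name in bodies:
        visit(name)
    return hashes

bodies = {
    "main": "print (total xs)",
    "total": "fold add 0",
    "add": "a b -> a + b",
    "log": "s -> write s",
}
calls = {"main": ["total", "log"], "total": ["add"], "add": [], "log": []}

before = hash_all(bodies, calls)
after = hash_all(dict(bodies, add="a b -> b + a"), calls)  # edit only `add`
changed = sorted(n for n in bodies if before[n] != after[n])
print(changed)  # every transitive caller of `add` gets a new hash
```

Only `add` and its transitive callers change; `log` keeps its hash
because nothing it depends on was touched.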
        
               | lamontcg wrote:
               | A git-like datastore for your AST+callgraph.
        
               | ajuc wrote:
               | Let's say you have definitions like that:
               | 
               |     f: Nat -> Nat
               |     g: Nat -> Nat
               |     h: Nat -> Nat
               |     h x = g (x * 2)
               |     g x = f (x * 3)
               |     f x = x < 0 ? 1 : h (x / 4)
               | 
               | And now you change f to be
               | 
               |     f x = x < 1 ? 1 : h (x / 4)
               | 
               | How do you do that? There's a cycle in the callgraph. In
               | fact - how do you calculate a hash of a function that
               | calls itself if you need its hash to calculate its hash
               | :)
               | 
               | EDIT: never mind, recursion is a special case handled
               | differently.
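
For the f/g/h cycle above, one way to read "hashed together as a
single unit" is sketched below. It assumes in-cycle references are
swapped for positional placeholders before hashing, and it orders the
members by name purely to keep the example short (a real system
would need an ordering that does not depend on names). `hash_cycle`
is a hypothetical helper, not Unison's algorithm:

```python
import hashlib
import re

def hash_cycle(defs):
    """Hash one mutually recursive group as a unit.

    defs maps name -> body text. Returns name -> '<group hash>.<index>'.
    """
    order = sorted(defs)  # assumption: name order, for brevity only
    index = {name: i for i, name in enumerate(order)}
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, order)) + r")\b")

    def strip_names(body):
        # Replace in-cycle references with positional placeholders.
        return pattern.sub(lambda m: f"#cycle.{index[m.group(1)]}", body)

    payload = "\n".join(strip_names(defs[name]) for name in order)
    group = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    return {name: f"{group}.{index[name]}" for name in order}

old = hash_cycle({
    "f": "x -> if x < 0 then 1 else h (x / 4)",
    "g": "x -> f (x * 3)",
    "h": "x -> g (x * 2)",
})
new = hash_cycle({
    "f": "x -> if x < 1 then 1 else h (x / 4)",
    "g": "x -> f (x * 3)",
    "h": "x -> g (x * 2)",
})
print(all(old[name] != new[name] for name in old))  # editing f rehashes f, g and h
```

No member needs its own hash as an input, so the chicken-and-egg
problem goes away, at the cost of the whole group changing identity
whenever any one member is edited.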
        
         | dcposch wrote:
         | Cool to see people thinking this big!
         | 
         | One challenge I foresee is unintentional coupling. Say you have
         | two functions:
         | 
         | func serialize(MyRecord) ...
         | 
         | func debugToString(MyRecord) ...
         | 
         | Now if you ever make the mistake of giving those the same
         | implementation, then in Unison they'd be the same hash
         | reference, right?
         | 
         | Then if you want to update, say, the debug print later, it
         | would update _all_ callsites for that hash, including the ones
         | that originally called serialize(). The two are no longer
         | distinguishable.
        
           | refried_ wrote:
           | Hello, Unison author here.
           | 
           | This is definitely an issue that is real, and is currently a
           | problem, and that we will fix; probably by giving the
           | function author an option to salt the hash of new definitions
           | that have some semantic meaning beyond their implementations
           | (appropriate for most application/business logic). No salt
           | for definitions whose meanings are defined by their
           | implementations (appropriate for most generic "library"
           | functions like `List.map`).
           | 
           | We already make this distinction for data types, but not yet
           | for value/function definitions.
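
A guess at what salting could look like, assuming the salt is simply
mixed into the hash input. The comment above describes a planned
option rather than a shipped design, so the `salt` parameter and the
sample salts below are illustrative only:

```python
import hashlib

def definition_hash(body: str, salt: str = "") -> str:
    """Hash the body, optionally mixed with an author-chosen salt."""
    return hashlib.sha256(f"{salt}\0{body}".encode("utf-8")).hexdigest()[:12]

# Generic library code: no salt, identical bodies share one identity.
map_a = definition_hash("f xs -> each xs f")
map_b = definition_hash("f xs -> each xs f")

# Application code: same body today, but different intent.
body = "record -> show record"
serialize = definition_hash(body, salt="myapp.serialize")
debug_to_string = definition_hash(body, salt="myapp.debugToString")

print(map_a == map_b)                # unsalted duplicates merge
print(serialize == debug_to_string)  # salted definitions stay distinct
```

With distinct salts, updating the debug printer later would not drag
the serializer's callers along with it.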
        
             | hota_mazi wrote:
             | This seems to be very developer hostile.
             | 
             | Not only do they have to provide a salt themselves but on
             | top of that, they need to make a judgment call of when
             | something has "more semantic meaning beyond their
             | implementation" (to use your words) rather than being some
             | more "fundamental" code.
             | 
             | I'm also surprised that you haven't solved this problem
             | yet: at least once a day, IDEA warns me that some portion
             | of my code is duplicated exactly in some other area of my
             | code, so this kind of duplicated logic is already quite
             | common.
        
             | vanderZwan wrote:
             | Why not also show that your definition already exists
             | elsewhere, together with a warning? Or is it doing
             | that too?
        
             | billytetrud wrote:
             | Why not simply record where each reference occurs and
             | ensure that if one definition is modified, the other is
             | not? The programmer shouldn't have to think about salting
             | any hashes, it should be automatic and hidden under the
             | hood.
        
           | sgk284 wrote:
           | The names are just pointers, and they're both pointing to the
           | same definition in your example. But when you redefine one of
           | those, you would point one of the names to a new definition.
           | 
           | It's similar to how DNS can have two domains point to the
           | same IP, but then you can change one of those domains to
           | point to a new IP.
        
             | ajuc wrote:
             | > The names are just pointers, and they're both pointing to
             | the same definition in your example. But when you redefine
             | one of those, you would point one of the names to a new
             | definition.
             | 
             | But how do you know which name was called where if the
             | callers referenced the content hash not the name?
        
               | milansuk wrote:
               | I also think that the DNS analogy is wrong because all
               | callers are hash-based. The only solution I see is to go
               | through the list of all callers and manually update
               | selected ones.
               | 
               | If I understand Unison right, the names are used only on
               | the developer's layer (to write code), but when you save
               | code, it's all hash-based.
               | 
               | Still, Unison got my attention.
        
               | aparsons wrote:
               | Would it not be correct for those callers to keep calling
               | the old (shared) implementation?
        
               | ajuc wrote:
               | well it would be nice to have a way to update old code
        
           | acjohnson55 wrote:
           | It knows what name you intended to use, because that's in
           | your source, so I'm pretty sure it isn't a problem if
           | implementations converge and diverge.
        
         | andi999 wrote:
         | Cool, but why/what for?
        
           | umvi wrote:
           | I'm just as much of a novice as you, but one of the use cases
           | the creators had in mind is distributed computing systems.
           | For example, if you have to crunch a bunch of data in the
           | cloud, you would write your data crunch function/algorithm
           | (which is represented by some hash '#asdjh238ad') then spin
           | up nodes to crunch data using '#asdjh238ad'. When a new node
           | in a cluster comes up it can say "I don't have '#asdjh238ad'"
           | and the orchestrator or one of the node's peers can send over
           | a copy of it.
           | 
           | With a traditional programming language you couldn't do this
           | because "send me a copy of sort()" would be met with "which
           | sort()?". Whereas with Unison every different sorting
           | implementation would have a different hash, so there would be
           | no confusion.
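
The exchange described above can be mimicked in a few lines, assuming
code travels as text and a peer is just another in-process object.
`Node`, `publish` and `lookup` are hypothetical names for this
sketch, and no real networking or Unison runtime is involved:

```python
import hashlib

def content_hash(code: str) -> str:
    return hashlib.sha256(code.encode("utf-8")).hexdigest()[:10]

class Node:
    """A worker that stores definitions by hash and asks a peer for gaps."""

    def __init__(self, name, peer=None):
        self.name = name
        self.peer = peer
        self.store = {}    # hash -> code
        self.fetched = []  # hashes pulled from the peer, in order

    def publish(self, code: str) -> str:
        digest = content_hash(code)
        self.store[digest] = code
        return digest

    def lookup(self, digest: str) -> str:
        if digest not in self.store:
            if self.peer is None:
                raise KeyError(digest)
            code = self.peer.lookup(digest)    # "I don't have this hash"
            if content_hash(code) != digest:   # the hash doubles as a checksum
                raise ValueError("peer sent code that does not match the hash")
            self.store[digest] = code
            self.fetched.append(digest)
        return self.store[digest]

orchestrator = Node("orchestrator")
worker = Node("worker-1", peer=orchestrator)

job = orchestrator.publish("xs -> sort (map crunch xs)")
worker.lookup(job)  # first use: fetched from the orchestrator
worker.lookup(job)  # second use: served from the local store
print(worker.fetched == [job])
```

Because the request names a hash, there is never a "which sort()?"
question, and the receiver can verify what it was sent.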
        
             | Nullabillity wrote:
             | That makes sense as a build system (and is more or less how
             | Nix works). The question would be why you'd subject your
             | _source code_ to this.
        
               | mst wrote:
               | There was a paper that implemented an R7RS-compatible
               | module system for Termite Scheme that used hashes for
               | identification for network transfers of code but left
               | the source files still normal - I think focusing on the
               | textual representation too much misses the point a bit
               | here.
        
         | torginus wrote:
         | I'm not totally convinced by this.
         | 
         | - Storing the AST on the disk in a million files is not
         | necessarily the best use of the filesystem. In contrast, most
         | languages store text files on the disk, and build up a similar
         | AST in memory only
         | 
         | - You can't view your code without special tools, which means
         | all text editors/version control etc. need to be Unison-aware
         | 
         | - Since the language is append only, all edits look like
         | additions in version control
         | 
         | - Their solution for the diamond problem (depending on multiple
         | versions of the same library) is having hard dependencies on
         | exact versions, and including both copies can be at best
         | wasteful, at worst bad (what if v2 fixes a bug that was in the
         | v1 dependency), I think this is a hard problem, and the reason
         | why semver exists
         | 
         | - As others have mentioned, the append-only nature of the
         | language makes bugfixes difficult
         | 
         | - Solutions that dynamically discover code dependencies and
         | automatically run tests exist for both procedural and
         | functional languages
         | 
         | - Detecting that 2 things are the same through hashing is
         | nontrivial, can it detect that 1 + x + 1 is the same as x + 2?
         | The ASTs are different
        
           | asoltysik wrote:
           | > Storing the AST on the disk in a million files is not
           | necessarily the best use of the filesystem
           | 
           | A new codebase format just uses a sqlite database instead of
           | a million files
           | 
           | > Since the language is append only, all edits look like
           | additions in version control
           | 
           | Traditional methods of showing change in version control, that
           | is text diffs, don't make sense here anyway
           | 
           | > Detecting that 2 things are the same through hashing is
           | nontrivial, can it detect that 1 + x + 1 is the same as x +
           | 2? The ASTs are different
           | 
           | It can't detect that. If it could it would be pretty cool,
           | but I don't think it would improve the usability too much
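
The limit being described is easy to see with any structural hash.
The sketch below uses Python's own `ast` module as a stand-in for a
Unison-style tree (an assumption made only so the example runs);
`structural_hash` is an invented helper:

```python
import ast
import hashlib

def structural_hash(expr: str) -> str:
    """Hash the parsed tree of an expression, ignoring layout."""
    tree = ast.parse(expr, mode="eval")
    # ast.dump omits line and column info by default, so only structure counts.
    return hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()[:12]

a = structural_hash("1 + x + 1")
b = structural_hash("(1+x)  +  1")  # same tree, different spacing and parens
c = structural_hash("x + 2")        # equal value for numbers, different tree

print(a == b)  # formatting differences disappear
print(a == c)  # algebraic equivalence is not detected
```

Hashing the tree buys independence from formatting, not from how the
computation is written.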
        
             | the-smug-one wrote:
             | What's the point of detecting that 1 + x + 1 is the same as
             | x + 2 anyway? If I wrote it in one way, I meant it to be
             | that way for a reason. Should it also be able to prove
             | arbitrary code is semantically equivalent? Well, it can't
             | do that for obvious reasons.
        
               | ballenf wrote:
               | Why not use hashing with locality for similarity? That is
               | if the two samples above "hashed" to a similar value it
               | might be helpful to find similar code.
               | 
               | Hashing was created to prevent collisions and ensure
               | small changes have big differences in result. The first
               | requirement makes sense here, but not sure how the second
               | helps.
        
           | lamontcg wrote:
           | Loading both copies of a library can be very useful to deal
           | with the situation where one piece of code has been ported to
           | v2 (due to bugs/features or just generally keeping up with
           | updates) and another piece of code is hard blocked on the
           | v1->v2 migration because it is much more costly, and it's
           | possible that v2 is actually buggier for that other use case.
           | There's a bit of a naive idea that software always gets
           | better for everyone and that projects have an infinite amount
           | of spare time to drop everything to bump dependencies. That
           | feature is actually very useful.
           | 
           | (Which is not to defend the append-only immutability of the
           | rest of the language, that looks a bit whack -- but then I've
           | seen whack stuff get wildly popular, so I have no idea -- but
           | while having 2 versions loaded at the same time might be
           | useful, I'm not sure I want to deploy every version that has
           | ever existed; that smells way too bloated.)
        
             | torginus wrote:
             | You are right - but choosing the correct solution imho
             | needs to be done with human oversight - I think a semver
             | based dependency resolution works great here, for example
             | if bar requires foo 1.0 and baz needs 1.0.1 they will
             | happily use the same version, but if baz used foo 1.1 they
             | would use the separate ones.
        
               | lamontcg wrote:
               | Except 1.0.1 can fix a bug that one piece of code needs,
               | while another piece of code can be happily bug dependent
               | upon it.
               | 
               | You can scream at the developers that they've violated
               | semver but a "bugfix" is entirely subjective (relevant
               | xkcd, spacebar heating, etc).
               | 
               | And even when developers violate semver in a point
               | release the problem still exists. They actually rarely,
               | if ever, rollback with a 1.0.2 that is equivalent to
               | 1.0.0 and instead usually move forwards.
               | 
               | And if you have a language that supports loading 1.0 and
               | 1.1 then there's no point in being artificially
               | constrained over which two versions can be loaded at the
               | same time based on the label, the underlying framework
               | shouldn't be built to care. There's no need for a multi-
               | version library loader to care about what a bugfix is.
        
               | uncomputation wrote:
               | (Mentioned xkcd: https://xkcd.com/1172/)
        
               | slver wrote:
               | SemVer remains a pragmatic approach that works in a vast
               | number of cases. It's unclear what alternative we have
               | here which works in more cases.
        
               | jonahx wrote:
               | Go takes an alternative approach:
               | 
               | https://www.youtube.com/watch?v=wWApoImHuf8
        
               | lamontcg wrote:
               | We don't have any better alternative, but let's not be
               | naive about it when it comes to building bits of
               | framework.
               | 
               | Semver would just be an artificial impediment at this
               | level.
        
               | slver wrote:
               | SemVer is an impediment only if you insist on making it so.
        
               | fastball wrote:
               | I think another key idea is that you're still thinking
               | about libraries as complete packages where you kinda
               | install two versions of the same thing. But it seems more
               | likely in the Unison ecosystem that you'd end up with the
               | ability to much more easily only extract the specific
               | functions you need.
               | 
               | So say there is v1 and v2 of a utility lib in my dep
               | tree, but I'm actually only using func A from v1 and func
               | B from v2. Then I just have the AST of v1.A and v2.B in my
               | deps and everything works.
        
               | Nullabillity wrote:
               | You still need some unit of atomicity to be able to
               | maintain invariants. You can't combine
               | HashMap_LinearProbe::insert with HashMap_Chains::remove,
               | because they both depend on implementation details in
               | order to maintain HashMap's invariants.
        
           | jackcviers3 wrote:
           | Semver doesn't help in the case of transitive binary
           | incompatibility. If lib A depends on B v2, and lib C depends
           | on B v1, and application D depends on A and C, you cannot
           | load a version of B that satisfies D, A, and C. Semver tells
           | you that B v1 and B v2 are incompatible, but not how to solve
           | the issue.
           | 
           | Unison solves the issue - there isn't any binary
           | incompatibility, because the transitive versions of B v1 and
           | B v2 cannot be in conflict - the function references are to
           | guaranteed unique and different versions of the AST.
           | 
           | As for bug fixes - you can specify in your code exactly which
           | version to use.
           | 
           | As for editors needing to be unison aware - they just
           | delegate everything to the compiler via lsp and bsp.
           | 
           | Bug fixes are no more difficult than making the change. A new
           | version is created, and your code can now depend on it. Old
           | code will still run off of the old version. It's up to the
           | code owner to decide to use the new, fixed version.
           | 
           | Version control is all handled in the language itself.
           | 
           | As for the hard hashing problem... Runar is a particularly
           | intelligent individual. I expect that his algorithm works
           | pretty well.
           | 
           | The first argument about storing the ast is moot in an age
           | where cached compiled typescript, Python, and .class files
           | take up inordinate amounts of disk space.
           | 
           | > Solutions that dynamically discover code dependencies and
           | automatically run tests exist for both procedural and
           | functional languages
           | 
           | Eh. Pip and yarn ain't got nothing on maven and ivy and
           | apt. But yes, dependency management isn't anything new under
           | the sun. Dynamically resolving individual function versions
           | in packages alongside binary incompatible functions is.
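           | 
           | A rough Python sketch of the diamond case above, not Unison's
           | actual mechanism: the store, content_hash, add and call are
           | all hypothetical names, and this toy hashes source text where
           | Unison hashes syntax trees. The point is only that A and C
           | each pin the exact B they were written against, so both
           | versions of B can sit in one codebase.

```python
import hashlib

def content_hash(source):
    """Toy content address: SHA-256 of the source text (an assumption)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()[:12]

# Hypothetical codebase: definitions are stored by hash, never by name.
store = {}

def add(source):
    """Store a definition under its content hash and return that hash."""
    h = content_hash(source)
    store[h] = source
    return h

def call(h, x):
    """Evaluate the definition stored under hash h and apply it to x."""
    fn = eval(store[h], {"call": call})
    return fn(x)

# Two incompatible versions of B coexist because their hashes differ.
b_v1 = add("lambda x: x + 1")
b_v2 = add("lambda x: x + 2")

# A and C embed the hash of the B they depend on; D depends on A and C.
a = add(f"lambda x: call('{b_v2}', x) * 10")
c = add(f"lambda x: call('{b_v1}', x) * 100")
d = add(f"lambda x: call('{a}', x) + call('{c}', x)")

result = call(d, 1)
```

           | Here call(d, 1) goes through both versions of B in a single
           | run, because no definition ever looks B up by name.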
        
           | billytetrud wrote:
           | Honestly, programming without language aware tools in this
           | day and age is very inefficient. Sure, in a pinch you can use
           | a text editing program to edit stuff, but it wouldn't be so
           | hard to install the standard editor in that case.
        
           | infogulch wrote:
           | I think Unison paired with a strong graph database instead of
           | the filesystem would be a powerful combo. It would very
           | naturally represent the AST graph directly and would benefit
           | from graph db optimizations. The cost would be the need to
           | invest a lot in new tooling: you'd want a graph db-based
           | source control implementation that offers similar
           | cryptographic certainty to git; you'd have trouble using
           | existing tooling directly like text editors that expect files
           | on disk; etc.
        
             | musingsole wrote:
             | The combination of the two makes me think auto/AI-generated
             | code would be much more feasible and powerful in such an
             | ecosystem.
        
           | xpe wrote:
           | Semver is an uneasy compromise at best. Rich Hickey has a
           | nice talk that digs into the principles around changing
           | software. Once you see this POV, you are unlikely to view
           | Semantic Versioning as anything other than a messy hack.
           | 
           | I'm not saying it is worse than nothing, but sometimes ideas
           | have a way of sticking around too long and making people
           | comfortable.
        
             | modernerd wrote:
             | The talk for those interested:
             | https://youtube.com/watch?v=oyLBGkS5ICk
        
               | Pet_Ant wrote:
               | And the transcript: https://github.com/matthiasn/talk-
               | transcripts/blob/master/Hi...
        
         | mpweiher wrote:
         | > identified by a hash of its implementation
         | 
         | Sounds a lot like darklang.
         | 
         | Like others, I am dubious about this being in any way a useful
         | feature. Separating implementation from name (/interface) and
         | binding to that interface/name instead of the implementation is
         | one of the fundamental and _useful_ parts of abstraction.
        
         | remram wrote:
         | This reminds me of Kubernetes, where all cluster state is
         | neatly structured and placed in a replicated data store (etcd)
         | that is the source of truth for operation, with the right parts
         | immutable (e.g. volumes).
         | 
         | The first thing people do is check in textual representations
         | of those things in version control and operate on that instead.
        
         | fouc wrote:
         | Checked the twitter thread and I was thinking it sounds a lot
         | like how strings are linked lists in elixir.
        
       | brundolf wrote:
       | > Unison definitions are identified by content. Each Unison
       | definition is some syntax tree, and by hashing this tree in a way
       | that incorporates the hashes of all that definition's
       | dependencies, we obtain the Unison hash which uniquely identifies
       | that definition.
       | 
       | Very cool core concept. Reminds me of some things Rich Hickey has
       | said about the idea of versioning dependencies at the function
       | level
       | 
       | That said: I wonder if this idea would make more sense as static
       | analysis on an existing language. It would have to be trivial to
       | enumerate all code that might influence a function's behavior; so
       | something totally pure like Haskell or Elm
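       | 
       | To make the quoted description concrete, here is a small Python
       | sketch of hashing a definition together with the hashes of its
       | dependencies. definition_hash and the "$0" placeholder meaning
       | "first dependency" are assumptions made for illustration, not
       | Unison's real encoding.

```python
import hashlib

def definition_hash(body, dep_hashes):
    """Hash a definition's body together with its dependencies' hashes."""
    h = hashlib.sha256()
    h.update(body.encode("utf-8"))
    for dep in dep_hashes:
        # A separator byte keeps the body and each dependency distinct.
        h.update(b"\x00" + dep.encode("ascii"))
    return h.hexdigest()[:12]

# Two versions of a leaf definition with no dependencies.
inc_v1 = definition_hash("x -> x + 1", [])
inc_v2 = definition_hash("x -> x + 2", [])

# The same body, built on different versions of its dependency.
twice_v1 = definition_hash("x -> $0 ($0 x)", [inc_v1])
twice_v2 = definition_hash("x -> $0 ($0 x)", [inc_v2])
```

       | twice_v1 and twice_v2 have identical bodies but different
       | hashes, since a change anywhere below a definition propagates
       | up into its identity.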
        
         | xpe wrote:
         | Yes, Rich Hickey (a bit more systematic) and Joe Armstrong (a
         | bit more scattershot) have popularized some of these ideas.
         | 
         | I'd be very interested in learning about analogous static
         | analysis tools for referentially transparent languages / purely
         | functional languages with sufficiently expressive type systems.
         | Please share what you find :)
        
       | iamevn wrote:
       | it's really neat, I love how easy it is to search for functions
       | by type to find what you need.
       | 
       | The one thing I ran into (as someone who only vaguely knows
       | haskell) is that it seems like it's impossible to write a
       | function that takes a list of A or B as an argument and then
       | branch on the type of each element. I can use Either but then I
       | need to decorate each element in the list with Left/Right rather
       | than just use their types.
       | 
       | This is probably just not how things work in Haskell and I just
       | need to be okay with that.
        
         | creata wrote:
         | > This is probably just not how things work in Haskell and I
         | just need to be okay with that.
         | 
         | Yep, that's just how things work in Haskell: disjoint unions
         | are much simpler than regular unions, and they're usually what you
         | want in the first place. I think it'd be nice if Haskell had
         | automatic conversions between types (so a and b can be turned
         | into Either a b implicitly, with an error if a = b) but I don't
         | think there are any plans for that.
        
         | sullyj3 wrote:
         | If you could give a concrete example of a problem to be solved,
         | I could try to convince you that the method using Either won't
         | actually be all that unwieldy.
        
         | JackMorgan wrote:
         | Perhaps you are looking for Sum Types?[0] They let you group
         | several types into a unifying type, e.g. a Shape is a Circle,
         | Square, or Triangle, then you can use pattern matching to have
         | different behavior for each. This example is in F# [1] but it's
         | almost exactly the same as it would be in Haskell.
         | 
         | [0]https://www.schoolofhaskell.com/user/Gabriel439/sum-types
         | [1]http://deliberate-software.com/christmas-f-number-
         | polymorphi...
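         | 
         | For readers who don't know F# or Haskell, here is roughly the
         | same Shape idea in Python. This is only an approximation: the
         | class names and the area function are made up for the
         | example, and the classes play the part that constructors
         | (or Left/Right) play in Haskell.

```python
import math
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Circle:
    radius: float

@dataclass(frozen=True)
class Square:
    side: float

@dataclass(frozen=True)
class Triangle:
    base: float
    height: float

# The "sum type": a Shape is exactly one of these three alternatives.
Shape = Union[Circle, Square, Triangle]

def area(shape):
    """Branch on which alternative we were given."""
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    if isinstance(shape, Square):
        return shape.side ** 2
    if isinstance(shape, Triangle):
        return 0.5 * shape.base * shape.height
    raise TypeError(f"not a Shape: {shape!r}")

# A single list can mix the alternatives; each element carries its tag.
areas = [area(s) for s in [Square(2.0), Triangle(3.0, 4.0), Circle(1.0)]]
```

         | Every element still has to be wrapped in one of the
         | alternatives, which is the "decoration" the parent comment
         | mentions; the wrapper is what makes the branch unambiguous.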
        
       | __david__ wrote:
       | This looks like a neat idea--I can see upsides and downsides, but
       | would have to experiment to see if one outweighs the other.
       | 
       | One thing I didn't see in my (admittedly quick) perusal of the
       | tutorial and faq: what is the technique to run a Unison program
       | from the command line? Is it practical for making unix cli tools
       | (yet)?
        
         | sullyj3 wrote:
         | For the moment, you have to create a function with the
         | appropriate IO ability, and execute using the `run` command
         | from inside the codebase manager. I don't think there's a way
         | to create a standalone executable just yet.
        
       | 0_gravitas wrote:
       | I've been semi-closely tracking this project for a while, and imo
       | it's easily __the__ most interesting project I've seen in the
       | sphere period. Serendipitously, I came across an interview a
       | couple weeks ago with one of the main bodies behind the project
       | on the Corecursive podcast (from early 2019) (I think their name
       | was Runar Bjarnason). Had no idea until it was mentioned almost
       | offhandedly in the last few minutes!
        
         | agbell wrote:
         | I think this is the episode you are talking about [1]. Runar
         | and Paul are a huge inspiration! I'm not totally sold on this
         | idea as practical, but I think it will get there and while they
         | have a lofty goal, I certainly wouldn't bet against the pair of
         | them.
         | 
         | [1]: https://corecursive.com/027-abstraction-and-learning-with-
         | ru...
        
           | 0_gravitas wrote:
           | ah yes exactly!
        
       | kvnhn wrote:
       | This looks very cool! Content addressed storage is an incredibly
       | powerful concept, and weaving it into a programming language is a
       | compelling idea.
       | 
       | Question for the devs: How does one deploy Unison code? After my
       | first glance through the docs I don't have a clear picture.
        
       | prezjordan wrote:
       | Strongly encourage anyone reading this to take 20 minutes to
       | download ucm and run through the Getting Started guide:
       | https://www.unisonweb.org/docs/quickstart/
       | 
       | Programming with a codebase manager and a scratchpad is just so
       | much fun - I found myself hypnotized and came back an hour later
       | with some janky min heap code. Definitely seems to scratch an
       | itch for me.
        
       | dang wrote:
       | The past threads appear to be (others?):
       | 
       |  _Unison: A Content-Addressable Programming Language_ -
       | https://news.ycombinator.com/item?id=22156370 - Jan 2020 (12
       | comments)
       | 
       |  _The Unison language_ -
       | https://news.ycombinator.com/item?id=22009912 - Jan 2020 (141
       | comments)
       | 
       |  _Unison - A statically-typed purely functional language_ -
       | https://news.ycombinator.com/item?id=20807997 - Aug 2019 (25
       | comments)
       | 
       |  _Unison Language March Update_ -
       | https://news.ycombinator.com/item?id=19528189 - March 2019 (1
       | comment)
       | 
       |  _Large-scale, well-typed edits in Unison, and reimagining
       | version control_ - https://news.ycombinator.com/item?id=9708405 -
       | June 2015 (11 comments)
       | 
       |  _Unison: a next-generation programming platform_ -
       | https://news.ycombinator.com/item?id=9512955 - May 2015 (128
       | comments)
        
         | xpe wrote:
         | Also, to go a little further back, Joe Armstrong talked about
         | content-addressable code in a conference talk:
         | 
         | "The Mess We're In" by Joe Armstrong at Strange Loop [video] -
         | https://news.ycombinator.com/item?id=8342755 - Sep 2014 (77
         | comments)
        
       | janjones wrote:
       | There is a nice blog post summing up what's cool about Unison[1]
       | 
       | [1] https://jaredforsyth.com/posts/whats-cool-about-unison/
        
         | PaulDavisThe1st wrote:
         | Good explanations, but I'm always a little suspicious when I
         | see things like this:
         | 
         | > Code is stored as a structured, type-checked tree in a
         | database, not as text in files
         | 
         | What does everyone think a filesystem is?
        
           | thethimble wrote:
           | There's an important distinction between how non-unison code
           | is stored (literally as plain text files which must be re-
           | parsed and re-compiled every time) vs how unison code is
           | stored (as a post-parsing data structure).
           | 
           | The file system is in an entirely different and irrelevant
           | layer of abstraction.
        
             | turtletontine wrote:
             | I'm not totally sure what the important distinction is
             | here. For many languages the important thing is already a
             | post-parsing data structure, that's what any compilation
             | output or byte code is. You obviously want to keep the raw
             | source around as well if you're the developer. Nothing new
             | about having separate source code and compiled formats?
        
               | turtletontine wrote:
               | Update: I'm skimming here
               | (https://jaredforsyth.com/posts/whats-cool-about-unison/)
               | and here (https://joshondesign.com/2012/04/09/open-
               | letter-language-des...) and I see Unison is serious about
               | not having raw text source code as the ground truth. I'm
               | intrigued but don't totally understand yet.
               | 
               | I'm sure this analogy is technically incorrect but: This
               | reminds me of Smalltalk and old Lisps on mainframes
               | shared by many researchers where the main thing was the
               | VM image, not an object file. Though they probably kept
               | the source code around? At a gut level getting rid of
               | source code makes me uncomfortable but I'm ready to learn
               | more.
               | 
               | PS sorry for the ugly raw links I'm on my phone
        
               | jack_h wrote:
               | Perhaps see my reply here
               | (https://news.ycombinator.com/item?id=27654225).
               | 
               | I think you may be misunderstanding what is being stored
               | here. Now as a caveat I'm not familiar with this
               | language, but I am familiar with the concept as
               | described. They are not removing source code, rather
               | source code is stored after some processing; in this case
               | it appears to be after lexing, parsing, and type
               | checking. I'm not sure exactly what is being stored, i.e.
               | an AST, but it sounds like they're basically moving this
               | stage of compilation/interpretation to be much earlier in
               | the process.
               | 
               | I'm assuming this database can be queried and the result
               | can be rendered back to a textual presentation as well.
               | Presumably this opens the door for syntax being divorced
               | from language semantics since how the syntax is parsed
               | into the database and how the database is rendered into
               | text can be a client side decision rather than set in
               | stone inside the compiler/interpreter. What is set in
               | stone is the semantics of the database that everyone must
               | agree to.
               | 
               | Again, there's the caveat that I'm not familiar with how
               | this language in particular is implementing this concept.
        
           | jack_h wrote:
           | I'm not sure I understand your question. Could you elaborate?
        
             | fouc wrote:
             | He's alluding to the fact that filesystems are a kind of
             | database for files.
        
               | vanderZwan wrote:
               | I wouldn't exactly consider files "type-checked" though
        
         | jack_h wrote:
         | Storing code in a database is super cool stuff and is something
         | I've been thinking about for a number of years. I'm actually
         | surprised this development hasn't happened sooner since
         | basically all tooling is forced to deal with the limitations of
         | storing source code as text.
         | 
         | The article gives an example that most programmers would be
         | familiar with; canonicalization so that version control and
         | code reviews go smoothly. Version control also becomes somewhat
         | simpler as it can compare the structure of code rather than the
         | structure of a sequence of characters that still must be lexed,
         | parsed, etc. There are lots of other areas where storing code
         | in a structured database of some sort would benefit tooling as
         | well. One example is the use of language servers to index,
         | perform continuous recompilation, perform cross-reference
         | lookups, and offer code completion. With a structured database
         | a lot of this becomes relatively trivial.
         | 
         | I'll definitely have to look into this language further as I'm
         | curious about how their database is designed.
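         | 
         | The canonicalization point can be tried out with Python's own
         | ast module. This is a sketch of the general idea only, not of
         | how Unison stores code, and structural_hash is a name made up
         | for the example.

```python
import ast
import hashlib

def structural_hash(source):
    """Hash the parsed tree rather than the characters of the source."""
    tree = ast.parse(source)
    # ast.dump leaves out line and column attributes by default, so
    # layout and comments do not affect the result.
    dumped = ast.dump(tree)
    return hashlib.sha256(dumped.encode("utf-8")).hexdigest()[:12]

compact = "def add(a,b): return a+b"

spaced = """
def add(a, b):
    # same function, different formatting
    return a + b
"""

changed = "def add(a, b): return a - b"

same_structure = structural_hash(compact) == structural_hash(spaced)
different_structure = structural_hash(compact) != structural_hash(changed)
```

         | A diff over such trees would ignore whitespace and comment
         | churn and only flag the real change. Note that this toy is
         | still sensitive to identifier names.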
        
         | grawprog wrote:
         | Thank you, that helped explain pretty well what abilities are.
         | I felt like I was kind of starting to get what the language was
         | about, then I hit the abilities section and I had no idea what
         | it was talking about.
        
       | jiaminglimjm wrote:
       | Programming language i18n.
        
       | michael-ax wrote:
       | I'd love to see this in c, for scheme/racket.
        
       | jjfeiler wrote:
       | The core idea here, that of hashing the ast of a function, is
       | similar to what Maple V from Waterloo Maple Software was doing
       | circa 1991 when I last used it.
        
       | deanstag wrote:
       | This is really exciting. I might have missed this in the
       | documentation, but is there any way of grouping/tagging together
       | a set of functions, just so that conceptually similar functions
       | can be browsed together? For traditional languages a
       | folder/package/file performed this functionality.
        
         | jimmux wrote:
         | There is what looks like conventional namespacing.
        
       | dthul wrote:
       | An immediate caveat I came across: if you want to look at some
       | Unison code you need a special code management tool. Take for
       | example their base library on Github:
       | https://github.com/unisonweb/base
       | 
       | The actual code lives in a sqlite file in the .unison/v2 folder.
       | That would mean existing tools like version control and editors
       | would need to learn about how Unison works in order to seamlessly
       | support it. Also pulling out code into a scratch file, editing it
       | and pushing it back into Unison's database sounds kind of
       | annoying. Again, this could probably be solved with an editor
       | that would make this process more seamless and feel more like
       | editing regular code.
       | 
       | As it currently stands it seems very cumbersome to use, mostly
       | due to the tedious process of even just exploring a codebase,
       | never mind modifying it.
        
         | imtringued wrote:
         | No, they just need to use FUSE and provide file system level
         | access to the source code.
        
           | dthul wrote:
           | Yes, I thought about something like that. Being able to map
           | it to the filesystem and back to the Unison database.
           | 
           | But then, what is the point of this content addressed code
           | again? What do we gain from it that we don't already have
           | now? With current file based version control you already have
           | an append only repository, code is never deleted from the
           | .git directory, it's just not always mapped to a file in the
           | source code directory (until you check out an old revision,
           | that is).
           | 
           | Edit: I guess Unison still has the unique feature that
           | dependencies are referred to by identity and not name.
        
         | brundolf wrote:
         | This is an interesting choice. Any language or framework has to
         | make a dozen or more choices between doing something in a way
         | that's compatible but compromising, or bespoke but... bespoke.
         | It's always a painful choice in my experience. This one is
         | particularly bold, though.
        
           | adrusi wrote:
           | It's afaict a necessary decision, since unison is designed
           | around the possibility of having multiple versions of the
           | same function referred to by the same name.
        
         | adrusi wrote:
         | FWIW because of how unison works, you get a lot of the benefits
         | of version control without using any proper version control.
         | Probably for small, single dev projects version control would
         | just be redundant.
         | 
         | That's not to say this isn't a limitation the project will need
         | to overcome to be useful, just a caveat.
        
         | zawodnaya wrote:
         | In practice it's a lot less annoying than navigating a file
         | hierarchy and looking in text files that have a lot of things
         | other than what you're looking for.
         | 
         | See also https://share.unison-lang.org/ where you can look at
         | the base library, and some (contributed?) libraries as well.
        
           | dthul wrote:
           | I agree that text files might not be the best way to store
           | code. My point was more that all of the existing tools like
           | code editors and version control systems have been designed
           | around the concept of files though. And instead of Unison
           | being able to tap into the existing ecosystem of tooling,
           | they have to rebuild custom versions. Maybe there would be a
           | way to map a Unison codebase onto the file system and back?
           | 
           | Edit: also worth mentioning that thanks to specialized
           | editors you don't need to manually browse through files but
           | you can browse your code similarly to https://share.unison-
           | lang.org if you so please. That's another plus point of the
           | vast existing ecosystem, it already offers so much and it's a
           | shame that Unison can't make use of it (at least for the
           | moment).
        
             | maddyboo wrote:
             | After reading through the Unison tour [0] with an open
             | mind, I actually think it makes excellent use of existing
             | tools through its "scratch files" approach [1].
             | 
             | The gist of it is that you can check out sections of code
             | that you want to work on as a plain text file and you can
             | do whatever you'd do with a text file: open it in your
             | editor, syntax highlighting, copy/paste, whatever floats
             | your boat. The cool part is that the "Unison codebase
             | manager" (ucm) watches the scratch files and re-parses them
             | whenever a file changes. I presume any syntax or type
             | errors will be immediately shown in the ucm output. Cool,
             | you say, but we can already do that with file watchers like
             | `entr` and traditional languages, so why should I care?
             | Well, it goes further.
             | 
             | You can start a line with a > character followed by an
             | expression and the expression will be evaluated when you
             | save the file, printing the output inside of ucm. It's
             | basically a REPL that you control from your editor. Cooler
             | still, building on this concept is the `test>` prefix
             | which, you guessed it, creates a unit test and runs it
             | inside ucm, showing you whether it passed or not. And as a
             | consequence of Unison's content-addressable nature, after a
             | test has run for a given expression's content hash, the
             | result is cached and the test doesn't need to be re-run
             | unless the hash changes (impure functions are soooo 2020).
             | After you're done with the scratch file, you can run `add`
             | in ucm to add either certain parts (I think) or all of the
             | work you've done in the scratch file to the source
             | codebase, and this includes the tests that you wrote along
             | with their cached values (I think)!
             | 
             | I personally find this workflow to be very compelling. To
             | me, this approach is much akin to the source control that
             | we do today, but it's actually aware of the context and
             | meaning of the changes. Git, on the other hand, relies on
             | weak heuristics to figure out what changed between versions
             | of text-based files.
             | 
             | I am very happy to see projects that push beyond the
             | boundaries of the paradigms we've been stuck in for the
             | past 60+ years. I also find it quite funny that Hacker
             | News, a forum centered around startups, can often be so
             | conservative when it comes to new technologies.
             | 
             | [0]: https://www.unisonweb.org/docs/tour
             | 
             | [1]: https://www.unisonweb.org/docs/tour#unisons-
             | interactive-scra...
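             | 
             | A toy Python version of the test-caching idea, under
             | the assumption that tests are pure expressions. The
             | names test_cache, runs and run_cached are hypothetical,
             | and a real system would hash the test's syntax tree
             | and dependencies rather than its text.

```python
import hashlib

test_cache = {}   # hypothetical cache: content hash -> test result
runs = []         # records which tests were actually executed

def run_cached(test_source):
    """Evaluate a pure boolean test at most once per content hash."""
    key = hashlib.sha256(test_source.encode("utf-8")).hexdigest()
    if key not in test_cache:
        runs.append(test_source)
        test_cache[key] = bool(eval(test_source, {}))
    return test_cache[key]

first = run_cached("sorted([3, 1, 2]) == [1, 2, 3]")
second = run_cached("sorted([3, 1, 2]) == [1, 2, 3]")  # cache hit
third = run_cached("sorted([2, 1]) == [1, 2]")         # new hash
```

             | The second call returns the stored result without
             | evaluating anything. That is only sound when the test
             | cannot observe anything outside its own hash, which is
             | why purity matters here.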
        
       | wyager wrote:
       | This looks pretty well done. It doesn't seem like a gimmick;
       | they've made a lot of good choices beyond the core conceit of
       | content-addressable code.
       | 
       | One thing I didn't see skimming the language reference page: is
       | there any sort of typeclass mechanism?
        
         | refried_ wrote:
         | No, but it's planned; probably in the form of implicit
         | parameters.
        
       | jbrot wrote:
       | Very neat project! One question about content addressed programs:
       | how does this play out with types that are structurally
       | equivalent but semantically distinct?
       | 
       | For instance, assuming C definitions, an integer and a file
       | descriptor have the same content but probably should not be
       | treated as the same type (I wouldn't want arithmetic to type
       | check against file descriptors...).
       | 
       | Another scenario: say I have a type "Foo" which contains an
       | integer. In version 1 of my library, this integer must be even,
       | but in version 2 I add support for odd integers, too. The Foo
       | data type, from a content perspective, is unchanged. However, the
       | invariants around it have changed and it's therefore essential
       | that it becomes a new type. Otherwise, someone might create a Foo
       | containing an odd integer using the version 2 API and then pass
       | it to a function from the version 1 API, resulting in bad things
       | since the version 1 API believes Foo can never contain an odd
       | integer.
        
         | kroltan wrote:
         | It looks from the other discussions that it's specifically
         | content-addressed, not "semantic-addressed", so if your code
         | has any redundant syntax (the given example being `x -> x + 2`
         | vs `x -> x + 1 + 1`) it's still distinct.
         | 
         | In that case, I guess data types can have a 0-size marker
         | member, kind of like Rust's `PhantomData` type, that could
         | ensure distinctness.
        
         | refried_ wrote:
         | The language has "unique" types, meaning they have their own
         | semantic meaning apart from their structure; they get a unique
         | hash (currently implemented by adding a random salt to the hash
         | of the structure, though it might as well be a guid). So
         | "unique" types and "structural" types.
         | 
         | The same question came up for terms, here:
         | https://news.ycombinator.com/item?id=27654045
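         | 
         | A quick Python illustration of the salt idea described
         | above. The function names and the textual "shape" are
         | invented for the example; only the principle (structure
         | alone vs. structure plus a per-declaration salt) comes from
         | the description.

```python
import hashlib
import uuid

def structural_type_hash(structure):
    """Identity comes from structure alone: same shape, same type."""
    return hashlib.sha256(structure.encode("utf-8")).hexdigest()[:12]

def unique_type_hash(structure, salt):
    """Identity comes from structure plus a salt fixed at declaration."""
    salted = salt + ":" + structure
    return hashlib.sha256(salted.encode("utf-8")).hexdigest()[:12]

shape = "record { value : Int }"

# Two structural declarations with the same shape are the same type.
pair_a = structural_type_hash(shape)
pair_b = structural_type_hash(shape)

# Two unique declarations get different salts, so they stay distinct
# even though their structure is identical.
file_descriptor = unique_type_hash(shape, uuid.uuid4().hex)
even_foo = unique_type_hash(shape, uuid.uuid4().hex)
```

         | The salt is generated once and stored with the declaration,
         | so the type keeps one stable identity afterwards.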
        
           | jbrot wrote:
           | Glad to hear there's a solution for this! Thanks for
           | responding :)
        
       | auggierose wrote:
       | Submission inspired by
       | https://news.ycombinator.com/item?id=27651197 , I guess
        
       | WillDaSilva wrote:
       | They mention using git to version Unison code, and point out how
       | there'll practically never be any version conflicts because of
       | the immutable / append-only nature of the language.
       | 
       | Doesn't that mean that the git repository will only ever grow,
       | and that old code will stick around forever? I hope I'm
       | misunderstanding because that would be unfortunate if true.
        
         | teraflop wrote:
         | Isn't that true of any Git repository? The internal object
         | store keeps every version of every file that has ever existed
         | (unless you rewrite history).
         | 
         | In practice, Git's content-addressable storage and delta
         | compression make it work fairly well for all but the largest
         | repositories.
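         | 
         | For the curious, Git's content addressing for file contents
         | is small enough to reproduce in Python. This assumes the
         | classic SHA-1 object format; git_blob_id is just a name for
         | the example.

```python
import hashlib

def git_blob_id(content):
    """Object id Git assigns to a file's contents (SHA-1 format)."""
    # Git hashes a short header ("blob <size>\0") followed by the bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

empty = git_blob_id(b"")
same_1 = git_blob_id(b"hello\n")
same_2 = git_blob_id(b"hello\n")
other = git_blob_id(b"hello!\n")
```

         | Identical contents always map to the same id, so an unchanged
         | file is stored once no matter how many commits include it.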
        
         | dthul wrote:
         | What I don't understand is what they do when merging two
         | branches. If both branches introduce a function with the same
         | name a merge conflict is inevitable, no? Or do they not support
         | the distributed version control approach and every developer
         | has to submit their changes to the current version of the
         | database?
        
           | refried_ wrote:
           | It produces a name conflict, but (unlike git merge conflicts)
           | these don't prevent any previously written code from running
           | normally. A name conflict only needs to be resolved as a
           | convenience to the next person to try calling the function by
           | that name, and even that next person might not have trouble
           | if the two new functions with the same name have different
           | types. The person just calls the one they mean, and the type-
           | checker uses the one with the type that fits.
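           | 
           | A toy model of that in Python. Everything here is
           | hypothetical (the "#aaa111" style hashes, the tables, the
           | resolve function), and real type-directed resolution
           | happens in the type checker, not at run time.

```python
# Hypothetical namespace: a name may point at several hashes after a merge.
names = {}          # name -> set of hashes
definitions = {}    # hash -> (argument type names, implementation)

def add(name, h, signature, fn):
    """Register a definition under its hash and attach a name to it."""
    definitions[h] = (signature, fn)
    names.setdefault(name, set()).add(h)

# Two branches each introduced a function called "size".
add("size", "#aaa111", ("str",), lambda s: len(s))
add("size", "#bbb222", ("list",), lambda xs: sum(1 for _ in xs))

# Code written earlier refers to a hash, so the conflict cannot affect it.
old_caller = lambda: definitions["#aaa111"][1]("unison")

def resolve(name, *args):
    """Pick the candidate whose signature matches the argument types."""
    arg_types = tuple(type(a).__name__ for a in args)
    matches = [h for h in names[name] if definitions[h][0] == arg_types]
    if len(matches) != 1:
        raise LookupError(f"ambiguous or unknown name: {name}")
    return definitions[matches[0]][1](*args)

by_string = resolve("size", "abc")
by_list = resolve("size", [1, 2])
```

           | The name "size" is ambiguous on its own, yet old_caller
           | keeps working and new callers are disambiguated by the
           | type of the argument they pass.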
        
         | JoshTriplett wrote:
         | Seems like an intended design feature. That doesn't mean you
         | have to _keep_ all those old versions in every copy of the
         | repository; you could always fetch only versions you need, for
         | instance.
        
       | lpointal wrote:
       | Unison is already the name of a bidirectional file
       | synchronization software (AFAIR developed in OCaml).
       | 
       | https://www.cis.upenn.edu/~bcpierce/unison/
        
       | TbobbyZ wrote:
       | what can you build with it?
        
       | pvg wrote:
       | Previously:
       | 
       | https://news.ycombinator.com/item?id=22009912
       | 
       | https://news.ycombinator.com/item?id=9512955
        
       ___________________________________________________________________
       (page generated 2021-06-28 23:03 UTC)