[HN Gopher] Unison Programming Language
___________________________________________________________________
Unison Programming Language
Author : gautamcgoel
Score : 395 points
Date : 2021-06-27 16:05 UTC (1 days ago)
(HTM) web link (www.unisonweb.org)
(TXT) w3m dump (www.unisonweb.org)
| sdfjkl wrote:
| > Run ucm init to initialize a Unison codebase in $HOME/.unison
|
| Ugh, this conflicts with my favorite file sync tool:
| https://www.cis.upenn.edu/~bcpierce/unison/
| carapace wrote:
| Unison (the file sync tool) is so awesome!
| zepearl wrote:
| It's great when it works, but different versions of <OCAML?
| and/or Unison?> on different hosts/VMs screw it up. I never
| understood the real reason why (but I never really
| investigated, lazy) => it's a pity.
| pchiusano wrote:
| FYI you can do 'ucm -codebase /someplace/else init' to
| initialize a codebase in another directory.
|
| And then 'ucm -codebase /somewhere/else' to launch ucm.
|
| Also I'm not sure what Unison the file sync tool stashes in
| .unison or how it uses that directory but there might not be
| any conflict. UCM will just create a .unison/v2/unison.sqlite3
| file.
| barnyfried wrote:
| This is stupid
| thom wrote:
| Super interesting, I was thinking about append-only codebases
| very recently:
|
| https://news.ycombinator.com/item?id=27492727
|
| The implications of this, with the right frameworks and
| processes, seem potentially huge.
| joe_the_user wrote:
| The thing is, the closest thing to append-only you have with
| ordinary programming languages is APL and its variants, where
| you can construct powerful functions with powerful primitives
| and methods for combining them.
|
| But the thing is that APL quickly becomes a "write only"
| language - as far as I know, the main use of APL is someone
| sitting at a brokerage who can cobble together any algorithm
| in twenty minutes and often throws away the result afterwards.
|
| Which is to say, Unison is interesting because it seems to
| underestimate the importance of a program's code as _document_,
| as complete, coherent, human-readable, single-view,
| intentionally created text. Why hasn't the stream of ASCII been
| replaced as the format of programs in the last twenty years?
| It's a good question but the answer isn't that it's just a
| matter of conservatism. There are several other things involved.
| acjohnson55 wrote:
| It doesn't discard the text. The text, like the documentation
| and comments, is stored and re-rendered during editing. It's
| fundamentally textual. It just doesn't have to be stored as a
| one-dimensional text stream.
| steve_taylor wrote:
| > A friendly programming language from the future.
|
| It looks like a programming language from the present at best. A
| programming language from the future would have finally broken
| free from the prison of plain text.
| lolinder wrote:
| Look a little closer. This is not a graphical programming
| language, but it's definitely not "plain text" either.
| steve_taylor wrote:
| I looked a little closer and found this:
| Unison is a language in which programs are not text. That
| is, the source of truth for a program is not its textual
| representation as source code, but its structured
| representation as an abstract syntax tree. This
| document describes Unison in terms of its default (and
| currently, only) textual rendering into source code.
|
| Or to put it more concisely, _Unison is currently a
| plain-text programming language._
| TuringTest wrote:
| That's akin to saying that machine code is not binary but
| plain text, because Assembler exists.
|
| It is more accurate to say, _Unison is an AST language
| which nowadays happens to have just one human-readable
| interface, which is text_.
| ajuc wrote:
| This is like maven at function level with no SNAPSHOT versions
| allowed :)
|
| How is recursion handled? To get a hash for a function you have
| to have hashes for every function it calls. Is there special
| "recurse" opcode?
|
| And how do you update a function implementation when you have a
| cycle in the callgraph?
| refried_ wrote:
| Yes to Maven, and good questions.
|
| Simply put, there is a special recurse opcode.
|
| When you have a cycle in the call graph, they all get hashed
| together as a single unit; you update them together as a single
| unit too.
| umvi wrote:
| I had a really hard time wrapping my mind around this just
| reading the website alone. If you are in the same boat, watch the
| first 10 minutes of this video at 1.5x speed:
| https://www.youtube.com/watch?v=gCWtkvDQ2ZI
|
| and it will make so, so much more sense.
|
| ...and if you are like me you'll probably need to read this
| twitter thread to get the answer to your #1 question:
| https://twitter.com/unisonweb/status/1173942974381744134
|
| Basically the core idea (or one of the core ideas) is instead of
| a function (like fib(n) which returns nth Fibonacci number) being
| identified by its _name_ (fib) as is the case with most
| traditional languages, it's instead identified by a hash of its
| implementation.
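The idea in the comment above can be sketched in a few lines of Python. This is a minimal illustration, not Unison's actual hashing scheme: the nested-list "syntax tree", the `definition_hash` helper, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json


def definition_hash(body, dependency_hashes):
    """Hash a definition's body together with the hashes of what it calls."""
    payload = json.dumps({"body": body, "deps": sorted(dependency_hashes)})
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical syntax trees as nested lists. The function's own name is
# deliberately not part of the tree.
add_body = ["lambda", 2, ["+", ["arg", 0], ["arg", 1]]]
add_hash = definition_hash(add_body, [])

# A caller refers to the callee by hash, never by name.
double_body = ["lambda", 1, ["call", add_hash, ["arg", 0], ["arg", 0]]]
double_hash = definition_hash(double_body, [add_hash])

# The same tree always hashes the same, whatever a developer calls it.
same_as_add = definition_hash(["lambda", 2, ["+", ["arg", 0], ["arg", 1]]], [])
```

Because `double_body` embeds `add_hash`, any change to the implementation of the callee yields a different hash for the caller as well, which is the behaviour the rest of the thread debates.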
| taneq wrote:
| Why not just identify it as the function text at that point?
| tgbugs wrote:
| Interesting. You can write a macro and some buffer modifying
| code to do this in elisp. But having now written up the rest of
| my response, why not just use Smalltalk?
|
| The hard part is coming up with the normalization routine which
| guarantees that (lambda (a) b a) == (lambda (b) a b) and coming
| up with the rules for statement reordering for top level and
| internal definitions so that you can identify semantically
| equivalent statements where the outcome is order invariant.
| This is critical for making the hash functions useful and I
| suspect preventing denial of service attacks on the human
| brains that have to audit the code.
|
| Being able to write a version of the code and then do the
| equivalent of creating a package.lock file to crystallize the
| hashes seems like a reasonable workflow. This probably winds up
| being easier in common lisp though since you can put the
| crystallized implementations in their own packages.
|
| You could also view this as a kind of extreme type theory where
| every function (with regular names) has the type of its
| normalized representation (compacted to a hash for sanity's
| sake) and then you can run the checker to see if the
| types/hashes have changed. If you have somewhere that keeps
| track of every hash that a function with a particular name has
| had then you could automatically refactor, or could even
| support having multiple versions of the function with the same
| name used in a program at the same time. I'm not sure how users
| would feel about having to carry around
| `(funcall ((with-norm-id '(lambda (+ a b)) f)) a b)` though ...
| probably just give up
| on editing the textual representation and go back to the image
| based approach of Smalltalk and Interlisp where you can hide
| the hashes.
|
| Will be interesting to see how Unison evolves.
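One well-known way to build the normalization routine described above is to replace bound variable names with De Bruijn indices before hashing. The sketch below assumes a tiny tuple-based lambda calculus; the term encoding and the `normalize` and `term_hash` helpers are hypothetical, and nothing here claims to be how Unison or any Lisp does it. It only covers renaming of bound variables, not the statement-reordering problem.

```python
import hashlib


def normalize(term, bound=()):
    """Replace bound variable names with De Bruijn indices.

    Terms are ("var", name), ("lam", name, body) or ("app", fn, arg).
    `bound` lists enclosing binder names, innermost first.
    """
    kind = term[0]
    if kind == "var":
        name = term[1]
        if name in bound:
            # index() finds the innermost binder, so shadowing works.
            return ("bound", bound.index(name))
        return ("free", name)
    if kind == "lam":
        _, name, body = term
        return ("lam", normalize(body, (name,) + bound))
    if kind == "app":
        _, fn, arg = term
        return ("app", normalize(fn, bound), normalize(arg, bound))
    raise ValueError(f"unknown term kind: {kind!r}")


def term_hash(term):
    """Hash the name-free form of a term."""
    return hashlib.sha256(repr(normalize(term)).encode("utf-8")).hexdigest()


# \a. \b. a  and  \x. \y. x  differ only in variable names.
const_ab = ("lam", "a", ("lam", "b", ("var", "a")))
const_xy = ("lam", "x", ("lam", "y", ("var", "x")))
# \a. \b. b  is a genuinely different function.
second_ab = ("lam", "a", ("lam", "b", ("var", "b")))
```

With this normalization, `const_ab` and `const_xy` hash identically while `second_ab` does not, so an auditor only ever sees one identity per alpha-equivalence class.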
| lisper wrote:
| Sounds like a cool idea, but how do you fix bugs in functions
| with lots of callers?
| umvi wrote:
| That's basically what the twitter thread I linked explains.
| It sounds like there is an automatic propagation mechanism
| for updating downstream callers if the type hasn't changed,
| otherwise it sounds like a manual update process.
| torginus wrote:
| Sounds like trading one set of problems for another.
| vanderZwan wrote:
| Tangent: here's what I find fascinating: when we evaluate
| which algorithm or data structure is best to handle a
| given situation, we know how to reason about algorithmic
| complexity and pick a best option for our situation.
|
| But then when it comes to ideas like this we just tend to
| say "we're trading one set of problems for another", as if
| we can't evaluate the problems in a similar manner. And
| I'm not picking on you here, I tend to do the same!
|
| Yes, we're trading one set of problems for another, but
| what if the old set of problems was "O(n^2)" and the new
| set of problems is "O(n log n)"? Or maybe it's the other
| way around. Why isn't it obvious how to apply those
| earlier skills here?
| capableweb wrote:
| Welcome to software engineering where there are no golden
| bullets, only different tradeoffs :)
| aparsons wrote:
| Law of conservation of complexity
| acjohnson55 wrote:
| The same way we do now: release a new version and tell people
| to migrate.
| lisper wrote:
| No, now I can just redefine the buggy function and all the
| callers will get the new version automatically. Having to
| update all callers seems like a high price to pay. Seems
| like the Right Answer is something like a hash of the api
| or the contract rather than the implementation.
| ucarion wrote:
| The (public) name of the implementation is the unique
| identifier of the contract in most systems, so I think
| your "Right Answer" is roughly the status quo?
| lisper wrote:
| No. The name of a function in current languages has no
| connection at all to what the function does. (What would
| be the contract for a function named 'foo'?)
| gbhn wrote:
| That's kind of the thing that makes APIs possible, right?
| It sounds to me like "what if programming were done in a
| completely flat global namespace in which abstractions,
| encapsulation, and structure were impossible."
| lisper wrote:
| No. An API specifies more than the name of the function.
| It will specify the arguments, their types, the type of
| the return value, and at least informally, what the
| function does. You can change the underlying
| implementation without changing the API. That's the whole
| point of an API. The problem with current API technology
| is the informality of the spec of what the function
| does. That allows some aspects of the behavior of the
| function to change without triggering any warnings.
|
| By having the linker work on hashes of implementations
| you eliminate that problem but create a new problem. You
| can no longer change the behavior of the function because
| you can't change the function. That means you can't
| suddenly change behavior that some caller is counting on,
| but it also means you can't fix bugs without changes in
| the caller.
| tgbugs wrote:
| Reading this now, I'm imagining all the horrors of static
| linking but applied to every function instead of whole
| modules.
|
| Maybe the simplest solution is to allow the function to
| change to the new version, but make it easy to revert in
| the event that something breaks. This of course means
| that you can't make the names of the functions their hash
| (without lying, preventing the runtime from checking that
| hashes always match, or modifying emitted bytecode or
| native code to do what you want), it has to be an
| orthogonal layer on top of them like types (as I
| mentioned elsewhere in the thread).
| ajuc wrote:
| Nope, according to
| https://twitter.com/unisonweb/status/1173942969726054401
|
| when you change a function implementation the system has
| to walk the callers graph backwards starting from all the
| places where the function was called updating all the
| implementations with the new hash, then callers of these
| with the new implementation and so on up to main (or
| whatever it's called).
|
| I had a chance to implement something like this in a
| system that used jbpm 3 graph language (basically process
| X version 1 called process Y version 1 and I updated
| process Y to version 2). It's nontrivial especially with
| recursion, I'm wondering how they are dealing with that.
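The backwards walk described above can be sketched as follows. This is an illustration under simplifying assumptions: definitions are plain strings, the call graph is acyclic, and the names `hash_all`, `callers_of`, and the tiny `codebase` are invented for the example rather than taken from Unison.

```python
import hashlib


def hash_all(definitions):
    """definitions: name -> (body, [names it calls]); must be acyclic."""
    hashes = {}

    def visit(name):
        if name not in hashes:
            body, deps = definitions[name]
            dep_hashes = [visit(dep) for dep in deps]
            payload = body + "|" + ",".join(dep_hashes)
            hashes[name] = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        return hashes[name]

    for name in definitions:
        visit(name)
    return hashes


def callers_of(definitions, changed):
    """Walk the call graph backwards to find every transitive caller."""
    affected = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for name, (_, deps) in definitions.items():
            if current in deps and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


codebase = {
    "parse": ("parse v1", []),
    "log": ("log v1", []),
    "validate": ("validate v1", ["parse"]),
    "main": ("main v1", ["validate", "log"]),
}
before = hash_all(codebase)

# Fix a bug in `parse`; everything that reaches it gets a new hash.
codebase["parse"] = ("parse v2 (bug fixed)", [])
after = hash_all(codebase)
rehashed = {name for name in codebase if before[name] != after[name]}
```

Here `rehashed` contains `parse` plus exactly the set returned by `callers_of(codebase, "parse")`, while `log` keeps its old hash. Cycles are not handled by this sketch.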
| lamontcg wrote:
| A git-like datastore for your AST+callgraph.
| ajuc wrote:
| Let's say you have definitions like that:
|
|     f: Nat -> Nat
|     g: Nat -> Nat
|     h: Nat -> Nat
|     h x = g (x * 2)
|     g x = f (x * 3)
|     f x = x < 0 ? 1 : h (x / 4)
|
| And now you change f to be
|
|     f x = x < 1 ? 1 : h (x / 4)
|
| How do you do that? There's a cycle in the callgraph. In
| fact - how do you calculate a hash of a function that
| calls itself if you need its hash to calculate its hash
| :)
|
| EDIT: nevermind, recursion is a special case handled
| differently.
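Elsewhere in the thread a Unison author says that a cycle in the call graph is hashed together as a single unit. One way that could work, sketched here purely as an assumption, is to replace in-cycle references with positional placeholders, hash the whole group once, and give each member the group hash plus its position. The `hash_cycle` helper, the text-based bodies, and the ordering rule are all hypothetical, and ties between members with identical blanked bodies are not handled.

```python
import hashlib
import re


def hash_cycle(component):
    """component: name -> body text for mutually recursive definitions."""
    names = "|".join(re.escape(name) for name in component)
    pattern = re.compile(r"\b(" + names + r")\b")

    # Step 1: blank out in-cycle references to get a name-free sort key.
    blanked = {name: pattern.sub("#self", body)
               for name, body in component.items()}
    order = sorted(component, key=lambda name: blanked[name])
    index = {name: position for position, name in enumerate(order)}

    # Step 2: rewrite each reference as a positional placeholder.
    rewritten = [
        pattern.sub(lambda match: f"#cycle.{index[match.group(1)]}",
                    component[name])
        for name in order
    ]

    # Step 3: one digest for the whole group, one suffix per member.
    digest = hashlib.sha256("\n".join(rewritten).encode("utf-8")).hexdigest()
    return {name: f"{digest}.{index[name]}" for name in order}


cycle = {
    "h": "g (x * 2)",
    "g": "f (x * 3)",
    "f": "x < 0 ? 1 : h (x / 4)",
}
# The same cycle with every function renamed.
renamed = {
    "c": "b (x * 2)",
    "b": "a (x * 3)",
    "a": "x < 0 ? 1 : c (x / 4)",
}
# The edit from the comment: f now tests x < 1.
edited = dict(cycle, f="x < 1 ? 1 : h (x / 4)")

ids = hash_cycle(cycle)
renamed_ids = hash_cycle(renamed)
edited_ids = hash_cycle(edited)
```

Under these assumptions renaming leaves every identifier unchanged, and editing `f` changes the identifiers of `f`, `g`, and `h` together, which matches the "update them together as a single unit" description.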
| dcposch wrote:
| Cool to see people thinking this big!
|
| One challenge I foresee is unintentional coupling. Say you have
| two functions:
|
| func serialize(MyRecord) ...
|
| func debugToString(MyRecord) ...
|
| Now if you ever make the mistake of giving those the
| same implementation, then in Unison they'd be the same hash
| reference, right?
|
| Then if you want to update, say the debug print later it would
| update _all_ callsites for that hash including the ones that
| originally called serialize(). The two are no longer
| distinguishable.
| refried_ wrote:
| Hello, Unison author here.
|
| This is definitely an issue that is real, and is currently a
| problem, and that we will fix; probably by giving the
| function author an option to salt the hash of new definitions
| that have some semantic meaning beyond their implementations
| (appropriate for most application/business logic). No salt
| for definitions whose meanings are defined by their
| implementations (appropriate for most generic "library"
| functions like `List.map`).
|
| We already make this distinction for data types, but not yet
| for value/function definitions.
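The salting proposal above can be illustrated with a short sketch. The salt strings, the `definition_hash` helper, and the NUL separator are assumptions for the example; the comment does not say how Unison would generate or store a salt.

```python
import hashlib


def definition_hash(body, salt=""):
    """Hash a definition body, optionally mixed with an author-chosen salt."""
    payload = salt + "\x00" + body
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two business-logic functions that happen to share an implementation today.
body = "record -> render (fields record)"
serialize_hash = definition_hash(body, salt="serialize")
debug_hash = definition_hash(body, salt="debugToString")

# A generic library function is defined purely by its implementation,
# so it is left unsalted and identical copies collapse into one hash.
map_copy_1 = definition_hash("f xs -> each x in xs: f x")
map_copy_2 = definition_hash("f xs -> each x in xs: f x")
```

With distinct salts the two application functions keep separate identities even while their bodies are identical, so updating one later cannot silently retarget callers of the other.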
| hota_mazi wrote:
| This seems to be very developer hostile.
|
| Not only do they have to provide a salt themselves but on
| top of that, they need to make a judgment call of when
| something has "more semantic meaning beyond their
| implementation" (to use your words) rather than being some
| more "fundamental" code.
|
| I'm also surprised that you haven't solved this problem
| yet: at least once a day, IDEA warns me that some portion
| of my code is duplicated exactly in some other area of my
| code, so this kind of duplicated logic is already quite
| common.
| vanderZwan wrote:
| Why not also show that your definition already exists
| elsewhere, together with a warning? Or is it doing
| that too?
| billytetrud wrote:
| Why not simply record where each reference occurs and
| ensure that if one definition is modified, the other is
| not? The programmer shouldn't have to think about salting
| any hashes, it should be automatic and hidden under the
| hood.
| sgk284 wrote:
| The names are just pointers, and they're both pointing to the
| same definition in your example. But when you redefine one of
| those, you would point one of the names to a new definition.
|
| It's similar to how DNS can have two domains point to the
| same IP, but then you can change one of those domains to
| point to a new IP.
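The "names are just pointers" model can be sketched with two dictionaries. This is a toy under stated assumptions: the `store` and `names` tables and the `define` helper are invented for illustration and say nothing about how ucm actually records names.

```python
import hashlib

store = {}  # hash -> definition body (append-only)
names = {}  # human-readable name -> hash


def define(name, body):
    """Store a body under its hash and point `name` at it."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    store.setdefault(digest, body)
    names[name] = digest
    return digest


body_v1 = "record -> render (fields record)"
define("serialize", body_v1)
define("debugToString", body_v1)
names_shared_before = names["serialize"] == names["debugToString"]
entries_before = len(store)

# A compiled call site keeps only the hash it was built against.
call_site = names["serialize"]

# Redefine one name; the other name and the old body are untouched.
define("debugToString", "record -> verbose (fields record)")
names_shared_after = names["serialize"] == names["debugToString"]
```

After the redefinition `call_site` still resolves to `body_v1`. The sketch also shows the objection raised in the replies: the call site holds a hash only, so on its own it cannot say which of the two names its author originally typed.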
| ajuc wrote:
| > The names are just pointers, and they're both pointing to
| the same definition in your example. But when you redefine
| one of those, you would point one of the names to a new
| definition.
|
| But how do you know which name was called where if the
| callers referenced the content hash not the name?
| milansuk wrote:
| I also think that the DNS analogy is wrong because all
| callers are hash-based. The only solution I see is to go
| through the list of all callers and manually update
| selected ones.
|
| If I understand Unison right, the names are used only on
| the developer's layer (to write code), but when you save
| code, it's all hash-based.
|
| Still, Unison got my attention.
| aparsons wrote:
| Would it not be correct for those callers to keep calling
| the old (shared) implementation?
| ajuc wrote:
| well it would be nice to have a way to update old code
| acjohnson55 wrote:
| It knows what name you intended to use, because that's in
| your source, so I'm pretty sure it isn't a problem if
| implementations converge and diverge.
| andi999 wrote:
| Cool, but why/what for?
| umvi wrote:
| I'm just as much of a novice as you, but one of the use cases
| the creators had in mind is distributed computing systems.
| For example, if you have to crunch a bunch of data in the
| cloud, you would write your data crunch function/algorithm
| (which is represented by some hash '#asdjh238ad') then spin
| up nodes to crunch data using '#asdjh238ad'. When a new node
| in a cluster comes up it can say "I don't have '#asdjh238ad'"
| and the orchestrator or one of the node's peers can send over
| a copy of it.
|
| With a traditional programming language you couldn't do this
| because "send me a copy of sort()" would be met with "which
| sort()?". Whereas with unison every different sorting
| implementation would have a different hash, so there would be
| no confusion.
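The exchange described above ("I don't have this hash, send it over") can be sketched with an in-memory stand-in for a cluster. The `Node` class, its `publish` and `fetch` methods, and the direct access to a peer's `code` table are hypothetical; a real system would go over the network.

```python
import hashlib


def content_hash(source):
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


class Node:
    """A toy cluster node that caches code by content hash."""

    def __init__(self, peers=()):
        self.code = {}
        self.peers = list(peers)

    def publish(self, source):
        digest = content_hash(source)
        self.code[digest] = source
        return digest

    def fetch(self, digest):
        """Return the code for `digest`, asking peers if it is missing."""
        if digest not in self.code:
            for peer in self.peers:
                source = peer.code.get(digest)
                # The hash doubles as an integrity check on what was sent.
                if source is not None and content_hash(source) == digest:
                    self.code[digest] = source
                    break
            else:
                raise KeyError(digest)
        return self.code[digest]


orchestrator = Node()
job = orchestrator.publish("crunch xs = sum (map square xs)")

worker = Node(peers=[orchestrator])
fetched = worker.fetch(job)
```

Because the request names a hash rather than `sort()` or `crunch`, there is exactly one acceptable answer, and the receiver can verify it without trusting the sender.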
| Nullabillity wrote:
| That makes sense as a build system (and is more or less how
| Nix works). The question would be why you'd subject your
| _source code_ to this.
| mst wrote:
| there was a paper that implemented an r7rs compatible
| module system for termite scheme that used hashes for
| identification for network transfers of code but left
| the source files still normal - I think focusing on the
| textual representation too much misses the point a bit
| here.
| torginus wrote:
| I'm not totally convinced by this.
|
| - Storing the AST on the disk in a million files is not
| necessarily the best use of the filesystem. In contrast, most
| languages store text files on the disk, and build up a similar
| AST in memory only
|
| - You can't view your code without special tools, which means
| all text editors/version control etc. need to be Unison-aware
|
| - Since the language is append only, all edits look like
| additions in version control
|
| - Their solution for the diamond problem (depending on multiple
| versions of the same library) is having hard dependencies on
| exact versions, and including both copies can be at best
| wasteful, at worst bad (what if v2 fixes a bug that was in the
| v1 dependency), I think this is a hard problem, and the reason
| why semver exists
|
| - As others have mentioned, the append-only nature of the
| language makes bugfixes difficult
|
| - Solutions that dynamically discover code dependencies and
| automatically run tests exist for both procedural and
| functional languages
|
| - Detecting that 2 things are the same through hashing is
| nontrivial, can it detect that 1 + x + 1 is the same as x + 2?
| The ASTs are different
| asoltysik wrote:
| > Storing the AST on the disk in a million files is not
| necessarily the best use of the filesystem
|
| A new codebase format just uses a sqlite database instead of
| a million files
|
| > Since the language is append only, all edits look like
| additions in version control
|
| Traditional methods of showing change in version control, that
| is text diffs, don't make sense here anyway
|
| > Detecting that 2 things are the same through hashing is
| nontrivial, can it detect that 1 + x + 1 is the same as x +
| 2? The ASTs are different
|
| It can't detect that. If it could it would be pretty cool,
| but I don't think it would improve the usability too much
| the-smug-one wrote:
| What's the point of detecting that 1 + x + 1 is the same as
| x + 2 anyway? If I wrote it in one way, I meant it to be
| that way for a reason. Should it also be able to prove
| arbitrary code is semantically equivalent? Well, it can't
| do that for obvious reasons.
| ballenf wrote:
| Why not use hashing with locality for similarity? That is
| if the two samples above "hashed" to a similar value it
| might be helpful to find similar code.
|
| Hashing was created to prevent collisions and ensure
| small changes have big differences in result. The first
| requirement makes sense here, but not sure how the second
| helps.
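The points in this sub-thread can be made concrete with Python's own `ast` module standing in for Unison's syntax tree. That substitution is an assumption for illustration only, and `tree_hash` is a hypothetical helper. Hashing the parsed tree ignores layout, but it does not recognise algebraically equal expressions, and nearby inputs produce unrelated digests.

```python
import ast
import hashlib


def tree_hash(source):
    """Hash the parsed expression tree rather than the source text."""
    tree = ast.parse(source, mode="eval")
    # ast.dump omits line and column attributes by default.
    return hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()


# Layout and redundant parentheses do not reach the tree.
layout_a = tree_hash("x + 2")
layout_b = tree_hash("(x  +  2)")

# Algebraically equal, structurally different.
rewritten = tree_hash("1 + x + 1")

# A one-character change scrambles the digest instead of nudging it.
neighbour = tree_hash("x + 3")
differing = sum(1 for p, q in zip(layout_a, neighbour) if p != q)
```

`layout_a` and `layout_b` are equal while `rewritten` is not, which matches the reply that Unison cannot detect `1 + x + 1` as `x + 2`. The `differing` count shows why a cryptographic hash is a poor similarity measure: finding "similar" code would need a different, locality-preserving scheme.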
| lamontcg wrote:
| Loading both copies of a library can be very useful to deal
| with the situation where one piece of code has been ported to
| v2 (due to bugs/features or just generally keeping up with
| updates) and another piece of code is hard blocked on the
| v1->v2 migration because it is much more costly, and it's
| possible that v2 is actually buggier for that other use case.
| There's a bit of a naive idea that software always gets
| better for everyone and that projects have an infinite amount
| of spare time to drop everything to bump dependencies. That
| feature is actually very useful.
|
| (Which is not to defend the rest of the append-only
| immutability of the rest of the language, that looks a bit
| whack -- but then I've seen whack stuff get wildly popular,
| so I have no idea -- but while having 2 versions loaded at
| the same time might be useful I'm not sure I want to deploy
| every version that has ever existed; that smells way too
| bloated)
| torginus wrote:
| You are right - but choosing the correct solution imho
| needs to be done with human oversight - I think a semver
| based dependency resolution works great here, for example
| if bar requires foo 1.0 and baz needs 1.0.1 they will
| happily use the same version, but if baz used foo 1.1 they
| would use the separate ones.
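The resolution rule in the comment above (share a copy when only the patch number differs, load separate copies otherwise) can be sketched as below. Note that this encodes the commenter's rule, not the Semantic Versioning specification itself, and the `parse` and `resolve` helpers are hypothetical.

```python
def parse(version):
    """Turn "1.0" or "1.0.1" into a (major, minor, patch) tuple."""
    parts = [int(piece) for piece in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def resolve(requirements):
    """requirements: dependent -> version of foo it asks for.

    Dependents whose requests share (major, minor) get the newest
    requested patch; other requests stay on their own copy.
    """
    newest = {}
    for version in requirements.values():
        parsed = parse(version)
        key = parsed[:2]
        newest[key] = max(newest.get(key, parsed), parsed)
    return {
        dependent: ".".join(str(n) for n in newest[parse(version)[:2]])
        for dependent, version in requirements.items()
    }


shared = resolve({"bar": "1.0", "baz": "1.0.1"})
separate = resolve({"bar": "1.0", "baz": "1.1"})
```

In the first case both `bar` and `baz` end up on 1.0.1; in the second each keeps its own copy. The reply that follows points out the weakness: the rule silently moves `bar` onto a patch release it never asked for.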
| lamontcg wrote:
| Except 1.0.1 can fix a bug that one piece of code needs,
| while another piece of code can be happily bug dependent
| upon it.
|
| You can scream at the developers that they've violated
| semver but a "bugfix" is entirely subjective (relevant
| xkcd, spacebar heating, etc).
|
| And even when developers violate semver in a point
| release the problem still exists. They actually rarely,
| if ever, rollback with a 1.0.2 that is equivalent to
| 1.0.0 and instead usually move forwards.
|
| And if you have a language that supports loading 1.0 and
| 1.1 then there's no point in being artificially
| constrained over which two versions can be loaded at the
| same time based on the label, the underlying framework
| shouldn't be built to care. There's no need for a
| multi-version library loader to care about what a bugfix is.
| uncomputation wrote:
| (Mentioned xkcd: https://xkcd.com/1172/)
| slver wrote:
| SemVer remains a pragmatic approach that works in a vast
| number of cases. It's unclear what alternative we have
| here which works in more cases.
| jonahx wrote:
| Go takes an alternative approach:
|
| https://www.youtube.com/watch?v=wWApoImHuf8
| lamontcg wrote:
| We don't have any better alternative, but let's not be
| naive about it when it comes to building bits of
| framework.
|
| Semver would just be an artificial impediment at this
| level.
| slver wrote:
| SemVer is an impediment only if you insist on making it so.
| fastball wrote:
| I think another key idea is that you're still thinking
| about libraries as complete packages where you kinda
| install two versions of the same thing. But it seems more
| likely in the Unison ecosystem that you'd end up with the
| ability to much more easily only extract the specific
| functions you need.
|
| So say there is v1 and v2 of a utility lib in my dep
| tree, but I'm actually only using func A from v1 and func B
| from v2. Then I just have the AST of v1.A and v2.B in my
| deps and everything works.
| Nullabillity wrote:
| You still need some unit of atomicity to be able to
| maintain invariants. You can't combine
| HashMap_LinearProbe::insert with HashMap_Chains::remove,
| because they both depend on implementation details in
| order to maintain HashMap's invariants.
| jackcviers3 wrote:
| Semver doesn't help in the case of transitive binary
| incompatibility. If lib A depends on B v2, and lib C depends
| on B v1, and application D depends on A and C, you cannot
| load a version of B that satisfies D, A, and C. Semver tells
| you that B 1 and B2 are incompatible, but not how to solve
| the issue.
|
| Unison solves the issue - there isn't any binary
| incompatibility, because the transitive versions of Bv1 and
| Bv2 cannot be in conflict - the function references are to
| guaranteed unique and different versions of the AST.
|
| As for bug fixes - you can specify in your code exactly which
| version to use.
|
| As for editors needing to be unison aware - they just
| delegate everything to the compiler via LSP and BSP.
|
| Bug fixes are no more difficult than making the change. A new
| version is created, and your code can now depend on it. Old
| code will still run off of the old version. It's up to the
| code owner to decide to use the new, but fixed version.
|
| Version control is all handled in the language itself.
|
| As for the hard hashing problem... Runar is a particularly
| intelligent individual. I expect that his algorithm works
| pretty well.
|
| The first argument about storing the AST is moot in an age
| where cached compiled typescript, Python, and .class files
| take up inordinate amounts of disk space.
|
| > Solutions that dynamically discover code dependencies and
| automatically run tests exist for both procedural and
| functional languages
|
| Eh. Pip and yarn ain't got nothing on maven and ivy and
| apt. But yes, dependency management isn't anything new under
| the sun. Dynamically resolving individual function versions
| in packages alongside binary incompatible functions is.
| billytetrud wrote:
| Honestly, programming without language aware tools in this
| day and age is very inefficient. Sure, in a pinch you can use
| a text editing program to edit stuff, but it wouldn't be so
| hard to install the standard editor in that case.
| infogulch wrote:
| I think Unison paired with a strong graph database instead of
| the filesystem would be a powerful combo. It would very
| naturally represent the AST graph directly and would benefit
| from graph db optimizations. The cost would be the need to
| invest a lot in new tooling: you'd want a graph db-based
| source control implementation that offers similar
| cryptographic certainty to git; you'd have trouble using
| existing tooling directly like text editors that expect files
| on disk; etc.
| musingsole wrote:
| The combination of the two makes me think auto/AI-generated
| code would be much more feasible and powerful in such an
| ecosystem.
| xpe wrote:
| Semver is an uneasy compromise at best. Rich Hickey has a
| nice talk that digs into the principles around changing
| software. Once you see this POV, you are unlikely to view
| Semantic Versioning as anything other than a messy hack.
|
| I'm not saying it is worse than nothing, but sometimes ideas
| have a way of sticking around too long and making people
| comfortable.
| modernerd wrote:
| The talk for those interested:
| https://youtube.com/watch?v=oyLBGkS5ICk
| Pet_Ant wrote:
| And the transcript:
| https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
| mpweiher wrote:
| > identified by a hash of its implementation
|
| Sounds a lot like darklang.
|
| Like others, I am dubious about this being in any way a useful
| feature. Separating implementation from name (/interface) and
| binding to that interface/name instead of the implementation is
| one of the fundamental and _useful_ parts of abstraction.
| remram wrote:
| This reminds me of Kubernetes, where all cluster state is
| neatly structured and placed in a replicated data store (etcd)
| that is the source of truth for operation, with the right parts
| immutable (e.g. volumes).
|
| The first thing people do is check in textual representations
| of those things in version control and operate on that instead.
| fouc wrote:
| Checked the twitter thread and I was thinking it sounds a lot
| like how strings are linked lists in elixir.
| brundolf wrote:
| > Unison definitions are identified by content. Each Unison
| definition is some syntax tree, and by hashing this tree in a way
| that incorporates the hashes of all that definition's
| dependencies, we obtain the Unison hash which uniquely identifies
| that definition.
|
| Very cool core concept. Reminds me of some things Rich Hickey has
| said about the idea of versioning dependencies at the function
| level
|
| That said: I wonder if this idea would make more sense as static
| analysis on an existing language. It would have to be trivial to
| enumerate all code that might influence a function's behavior; so
| something totally pure like Haskell or Elm
| xpe wrote:
| Yes, Rich Hickey (a bit more systematic) and Joe Armstrong (a
| bit more scattershot) have popularized some of these ideas.
|
| I'd be very interested in learning about analogous static
| analysis tools for referentially transparent languages / purely
| functional languages with sufficiently expressive type systems.
| Please share what you find :)
| iamevn wrote:
| it's really neat, I love how easy it is to search for functions
| by type to find what you need.
|
| The one thing I ran into (as someone who only vaguely knows
| haskell) is that it seems like it's impossible to write a
| function that takes a list of A or B as an argument and then
| branch on the type of each element. I can use Either but then I
| need to decorate each element in the list with Left/Right rather
| than just use their types.
|
| This is probably just not how things work in Haskell and I just
| need to be okay with that.
| creata wrote:
| > This is probably just not how things work in Haskell and I
| just need to be okay with that.
|
| Yep, that's just how things work in Haskell: disjoint unions
| are much simpler than regular unions, and they're usually what you
| want in the first place. I think it'd be nice if Haskell had
| automatic conversions between types (so a and b can be turned
| into Either a b implicitly, with an error if a = b) but I don't
| think there are any plans for that.
| sullyj3 wrote:
| If you could give a concrete example of a problem to be solved,
| I could try to convince you that the method using Either won't
| actually be all that unwieldy.
| JackMorgan wrote:
| Perhaps you are looking for Sum Types?[0] They let you group
| several types into a unifying type, e.g. a Shape is a Circle,
| Square, or Triangle, then you can use pattern matching to have
| different behavior for each. This example is in F# [1] but it's
| almost exactly the same as it would be in Haskell.
|
| [0]https://www.schoolofhaskell.com/user/Gabriel439/sum-types
| [1]http://deliberate-software.com/christmas-f-number-polymorphi...
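The Shape example from the comment above can be approximated in Python. This is a sketch of the general idea, not a translation of the linked F# or Haskell code; the field names and the `area` function are assumptions. Python classes carry a runtime tag, so `isinstance` plays the role that constructor pattern matching plays in Haskell or F#.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Circle:
    radius: float


@dataclass(frozen=True)
class Square:
    side: float


@dataclass(frozen=True)
class Triangle:
    base: float
    height: float


# The "sum type": a Shape is exactly one of these three cases.
Shape = Union[Circle, Square, Triangle]


def area(shape: Shape) -> float:
    """Branch on which case of the sum type we were given."""
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    if isinstance(shape, Square):
        return shape.side ** 2
    if isinstance(shape, Triangle):
        return 0.5 * shape.base * shape.height
    raise TypeError(f"not a Shape: {shape!r}")


# A mixed list, branched on per element.
areas = [area(s) for s in [Square(2.0), Triangle(3.0, 4.0), Circle(1.0)]]
```

In Haskell the wrapping constructor is what supplies the tag, which is why a list of "A or B" needs `Left`/`Right` or a dedicated sum type rather than the bare element types.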
| __david__ wrote:
| This looks like a neat idea--I can see upsides and downsides, but
| would have to experiment to see if one outweighs the other.
|
| One thing I didn't see in my (admittedly quick) perusal of the
| tutorial and faq: what is the technique to run a Unison program
| from the command line? Is it practical for making unix cli tools
| (yet)?
| sullyj3 wrote:
| For the moment, you have to create a function with the
| appropriate IO ability, and execute using the `run` command
| from inside the codebase manager. I don't think there's a way
| to create a standalone executable just yet.
| 0_gravitas wrote:
| I've been semi-closely tracking this project for a while, and imo
| it's easily __the__ most interesting project I've seen in the
| sphere period. Serendipitously, I came across an interview a
| couple weeks ago with one of the main bodies behind the project
| on the Corecursive podcast (from early 2019) (I think their name
| was Runar Bjarnason). Had no idea until it was mentioned almost
| offhandedly in the last few minutes!
| agbell wrote:
| I think this is the episode you are talking about [1]. Runar and
| Paul are a huge inspiration! I'm not totally sold on this idea as
| practical, but I think it will get there and while they have a
| lofty goal, I certainly wouldn't bet against the pair of them.
|
| [1]: https://corecursive.com/027-abstraction-and-learning-with-ru...
| 0_gravitas wrote:
| ah yes exactly!
| kvnhn wrote:
| This looks very cool! Content addressed storage is an incredibly
| powerful concept, and weaving it into a programming language is a
| compelling idea.
|
| Question for the devs: How does one deploy Unison code? After my
| first glance through the docs I don't have a clear picture.
| prezjordan wrote:
| Strongly encourage anyone reading this to take 20 minutes to
| download ucm and run through the Getting Started guide:
| https://www.unisonweb.org/docs/quickstart/
|
| Programming with a codebase manager and a scratchpad is just so
| much fun - I found myself hypnotized and came back an hour later
| with some janky min heap code. Definitely seems to scratch an
| itch for me.
| dang wrote:
| The past threads appear to be (others?):
|
| _Unison: A Content-Addressable Programming Language_ -
| https://news.ycombinator.com/item?id=22156370 - Jan 2020 (12
| comments)
|
| _The Unison language_ -
| https://news.ycombinator.com/item?id=22009912 - Jan 2020 (141
| comments)
|
| _Unison - A statically-typed purely functional language_ -
| https://news.ycombinator.com/item?id=20807997 - Aug 2019 (25
| comments)
|
| _Unison Language March Update_ -
| https://news.ycombinator.com/item?id=19528189 - March 2019 (1
| comment)
|
| _Large-scale, well-typed edits in Unison, and reimagining
| version control_ - https://news.ycombinator.com/item?id=9708405 -
| June 2015 (11 comments)
|
| _Unison: a next-generation programming platform_ -
| https://news.ycombinator.com/item?id=9512955 - May 2015 (128
| comments)
| xpe wrote:
| Also, to go a little further back, Joe Armstrong talked about
| content-addressable code in a conference talk:
|
| "The Mess We're In" by Joe Armstrong at Strange Loop [video] -
| https://news.ycombinator.com/item?id=8342755 - Sep 2014 (77
| comments)
| janjones wrote:
| There is a nice blog post summing up what's cool about Unison[1]
|
| [1] https://jaredforsyth.com/posts/whats-cool-about-unison/
| PaulDavisThe1st wrote:
| Good explanations, but I'm always a little suspicious when I
| see things like this:
|
| > Code is stored as a structured, type-checked tree in a
| database, not as text in files
|
| What does everyone think a filesystem is?
| thethimble wrote:
| There's an important distinction between how non-unison code
| is stored (literally as plain text files which must be re-
| parsed and re-compiled every time) vs how unison code is
| stored (as a post-parsing data structure).
|
| The file system is in an entirely different and irrelevant
| layer of abstraction.
| turtletontine wrote:
| I'm not totally sure what the important distinction is
| here. For many languages the important thing is already a
| post-parsing data structure, that's what any compilation
| output or byte code is. You obviously want to keep the raw
| source around as well if you're the developer. Nothing new
| about having separate source code and compiled formats?
| turtletontine wrote:
| Update: I'm skimming here
| (https://jaredforsyth.com/posts/whats-cool-about-unison/)
| and here (https://joshondesign.com/2012/04/09/open-
| letter-language-des...) and I see Unison is serious about
| not having raw text source code as the ground truth. I'm
| intrigued but don't totally understand yet.
|
| I'm sure this analogy is technically incorrect but: This
| reminds me of Smalltalk and old Lisps on mainframes
| shared by many researchers where the main thing was the
| VM image, not an object file. Though they probably kept
| the source code around? At a gut level getting rid of
| source code makes me uncomfortable but I'm ready to learn
| more.
|
| PS sorry for the ugly raw links I'm on my phone
| jack_h wrote:
| Perhaps see my reply here
| (https://news.ycombinator.com/item?id=27654225).
|
| I think you may be misunderstanding what is being stored
| here. Now as a caveat I'm not familiar with this
| language, but I am familiar with the concept as
| described. They are not removing source code, rather
| source code is stored after some processing; in this case
| it appears to be after lexing, parsing, and type
| checking. I'm not sure exactly what is being stored, i.e.
| an AST, but it sounds like they're basically moving this
| stage of compilation/interpretation to be much earlier in
| the process.
|
| I'm assuming this database can be queried and the result
| can be rendered back to a textual presentation as well.
| Presumably this opens the door for syntax being divorced
| from language semantics since how the syntax is parsed
| into the database and how the database is rendered into
| text can be a client side decision rather than set in
| stone inside the compiler/interpreter. What is set in
| stone is the semantics of the database that everyone must
| agree to.
|
| Again, there's the caveat that I'm not familiar with how
| this language in particular is implementing this concept.
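The "store the parsed form, render text on demand" idea in the comment above can be sketched with Python's standard `ast` module. This is only an analogy and not how Unison stores code: the `store` dict and the `add_definition`/`render` helpers are hypothetical names, and the type-checking step the comment mentions is skipped entirely.

```python
import ast

# Hypothetical in-memory "codebase": maps a name to a parsed syntax tree.
store = {}

def add_definition(name, source):
    """Parse the source text once and keep only the tree, not the text."""
    store[name] = ast.parse(source)

def render(name):
    """Render the stored tree back to text (needs Python 3.9+ for ast.unparse)."""
    return ast.unparse(store[name])

# The original spacing and the comment are not part of the stored tree,
# so the rendered text comes back in one canonical layout.
add_definition("double", "def   double( x ):   # odd spacing\n    return x*2\n")
print(render("double"))
```

Because the text is regenerated from the tree, the layout is a property of the renderer rather than of what is stored, which is the "client side decision" point made above.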
| jack_h wrote:
| I'm not sure I understand your question. Could you elaborate?
| fouc wrote:
| He's alluding to the fact that filesystems are a kind of
| database for files.
| vanderZwan wrote:
| I wouldn't exactly consider files "type-checked" though
| jack_h wrote:
| Storing code in a database is super cool stuff and is something
| I've been thinking about for a number of years. I'm actually
| surprised this development hasn't happened sooner since
| basically all tooling is forced to deal with the limitations of
| storing source code as text.
|
| The article gives an example that most programmers would be
| familiar with; canonicalization so that version control and
| code reviews go smoothly. Version control also becomes somewhat
| simpler as it can compare the structure of code rather than the
| structure of a sequence of characters that still must be lexed,
| parsed, etc. There are lots of other areas where storing code
| in a structured database of some sort would benefit tooling as
| well. One example is the use of language servers to index,
| perform continuous recompilation, perform cross-reference
| lookups, and offer code completion. With a structured database
| a lot of this becomes relatively trivial.
|
| I'll definitely have to look into this language further as I'm
| curious about how their database is designed.
| grawprog wrote:
| Thank you, that helped explain pretty well what abilities are.
| I felt like I was kind of starting to get what the language was
| about, then I hit the abilities section and I had no idea what
| it was talking about.
| jiaminglimjm wrote:
| Programming language i18n.
| michael-ax wrote:
| I'd love to see this in c, for scheme/racket.
| jjfeiler wrote:
| The core idea here, that of hashing the ast of a function, is
| similar to what Maple V from Waterloo Maple Software was doing
| circa 1991 when I last used it.
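A minimal sketch of hashing a function's syntax tree, again using Python's `ast` as a stand-in. `content_hash` is a hypothetical helper, and unlike a real content-addressed language this sketch also hashes the function and variable names, so treat it as an illustration of the general idea only.

```python
import ast
import hashlib

def content_hash(source):
    """Hash the parsed syntax tree of `source`, not its raw characters."""
    tree = ast.parse(source)
    # ast.dump leaves out line/column attributes by default, so whitespace
    # and comments do not influence the result.
    canonical = ast.dump(tree)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = content_hash("def inc(x):\n    return x + 2\n")
b = content_hash("def inc( x ):  # reformatted\n    return x+2\n")
c = content_hash("def inc(x):\n    return x + 1 + 1\n")

print(a == b)  # same tree, different formatting
print(a == c)  # different tree, even though the behaviour is the same
```

The second comparison shows the limit of the approach: the hash identifies a particular tree, not everything that computes the same result.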
| deanstag wrote:
| This is really exciting. I might have missed this in the
| documentation, but is there any way of grouping/tagging together
| a set of functions, just so that conceptually similar functions
| can be browsed together? For traditional languages a
| folder/package/file performed this functionality.
| jimmux wrote:
| There is what looks like conventional namespacing.
| dthul wrote:
| An immediate caveat I came across: if you want to look at some
| Unison code you need a special code management tool. Take for
| example their base library on Github:
| https://github.com/unisonweb/base
|
| The actual code lives in a sqlite file in the .unison/v2 folder.
| That would mean existing tools like version control and editors
| would need to learn about how Unison works in order to seamlessly
| support it. Also pulling out code into a scratch file, editing it
| and pushing it back into Unison's database sounds kind of
| annoying. Again, this could probably be solved with an editor
| that would make this process more seamless and feel more like
| editing regular code.
|
| As it currently stands it seems very cumbersome to use, mostly
| due to the tedious process of even just exploring a codebase,
| never mind modifying it.
| imtringued wrote:
| No, they just need to use FUSE and provide file system level
| access to the source code.
| dthul wrote:
| Yes, I thought about something like that. Being able to map
| it to the filesystem and back to the Unison database.
|
| But then, what is the point of this content addressed code
| again? What do we gain from it that we don't already have
| now? With current file based version control you already have
| an append only repository, code is never deleted from the
| .git directory, it's just not always mapped to a file in the
| source code directory (until you check out an old revision,
| that is).
|
| Edit: I guess Unison still has the unique feature that
| dependencies are referred to by identity and not name.
| brundolf wrote:
| This is an interesting choice. Any language or framework has to
| make a dozen or more choices between doing something in a way
| that's compatible but compromising, or bespoke but... bespoke.
| It's always a painful choice in my experience. This one is
| particularly bold, though.
| adrusi wrote:
| It's afaict a necessary decision, since unison is designed
| around the possibility of having multiple versions of the
| same function referred to by the same name.
| adrusi wrote:
| FWIW because of how unison works, you get a lot of the benefits
| of version control without using any proper version control.
| Probably for small, single dev projects version control would
| just be redundant.
|
| That's not to say this isn't a limitation the project will need
| to overcome to be useful, just a caveat.
| zawodnaya wrote:
| In practice it's a lot less annoying than navigating a file
| hierarchy and looking in text files that have a lot of things
| other than what you're looking for.
|
| See also https://share.unison-lang.org/ where you can look at
| the base library, and some (contributed?) libraries as well.
| dthul wrote:
| I agree that text files might not be the best way to store
| code. My point was more that all of the existing tools like
| code editors and version control systems have been designed
| around the concept of files though. And instead of Unison
| being able to tap into the existing ecosystem of tooling,
| they have to rebuild custom versions. Maybe there would be a
| way to map a Unison codebase onto the file system and back?
|
| Edit: also worth mentioning that thanks to specialized
| editors you don't need to manually browse through files but
| you can browse your code similarly to https://share.unison-
| lang.org if you so please. That's another plus point of the
| vast existing ecosystem, it already offers so much and it's a
| shame that Unison can't make use of it (at least for the
| moment).
| maddyboo wrote:
| After reading through the Unison tour [0] with an open
| mind, I actually think it makes excellent use of existing
| tools through its "scratch files" approach [1].
|
| The gist of it is that you can check out sections of code
| that you want to work on as a plain text file and you can
| do whatever you'd do with a text file: open it in your
| editor, syntax highlighting, copy/paste, whatever floats
| your boat. The cool part is that the "Unison codebase
| manager" (ucm) watches the scratch files and re-parses them
| whenever a file changes. I presume any syntax or type
| errors will be immediately shown in the ucm output. Cool,
| you say, but we can already do that with file watchers like
| `entr` and traditional languages, so why should I care?
| Well, it goes further.
|
| You can start a line with a > character followed by an
| expression and the expression will be evaluated when you
| save the file, printing the output inside of ucm. It's
| basically a REPL that you control from your editor. Cooler
| still, building on this concept is the `test>` prefix
| which, you guessed it, creates a unit test and runs it
| inside ucm, showing you whether it passed or not. And as a
| consequence of Unison's content-addressable nature, after a
| test has run for a given expression's content hash, the
| result is cached and the test doesn't need to be re-run
| unless the hash changes (impure functions are soooo 2020).
| After you're done with the scratch file, you can run `add`
| in ucm to add either certain parts (I think) or all of the
| work you've done in the scratch file to the source
| codebase, and this includes the tests that you wrote along
| with their cached values (I think)!
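The test-caching behaviour described above can be sketched as a cache keyed by a hash of the test expression. Everything here is hypothetical (`test_cache`, `runs`, `run_test`), and hashing the raw expression text is a simplification: a real system would presumably hash the parsed expression together with everything it depends on.

```python
import hashlib

# Hypothetical cache: hash of a test expression -> its cached result.
test_cache = {}
runs = []  # records which expressions were actually evaluated

def run_test(expression):
    """Evaluate a pure boolean expression, reusing the cached result
    when an expression with the same hash has already been run."""
    key = hashlib.sha256(expression.encode("utf-8")).hexdigest()
    if key not in test_cache:
        runs.append(expression)
        allowed = {"sorted": sorted}  # the only function tests may call here
        test_cache[key] = bool(eval(expression, {"__builtins__": {}}, allowed))
    return test_cache[key]

first = run_test("sorted([3, 1, 2]) == [1, 2, 3]")
second = run_test("sorted([3, 1, 2]) == [1, 2, 3]")  # served from the cache
print(first, second, len(runs))
```

The caching is only sound because the expression is pure: if it read a file or the clock, an unchanged hash would not guarantee an unchanged result.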
|
| I personally find this workflow to be very compelling. To
| me, this approach is much akin to the source control that
| we do today, but it's actually aware of the context and
| meaning of the changes. Git, on the other hand, relies on
| weak heuristics to figure out what changed between versions
| of text-based files.
|
| I am very happy to see projects that push beyond the
| boundaries of the paradigms we've been stuck in for the
| past 60+ years. I also find it quite funny that Hacker
| News, a forum centered around startups, can often be so
| conservative when it comes to new technologies.
|
| [0]: https://www.unisonweb.org/docs/tour
|
| [1]: https://www.unisonweb.org/docs/tour#unisons-
| interactive-scra...
| wyager wrote:
| This looks pretty well done. It doesn't seem like a gimmick;
| they've made a lot of good choices beyond the core conceit of
| content-addressable code.
|
| One thing I didn't see skimming the language reference page: is
| there any sort of typeclass mechanism?
| refried_ wrote:
| No, but it's planned; probably in the form of implicit
| parameters.
| jbrot wrote:
| Very neat project! One question about content addressed programs:
| how does this play out with types that are structurally
| equivalent but semantically distinct?
|
| For instance, assuming C definitions, an integer and a file
| descriptor have the same content but probably should not be
| treated as the same type (I wouldn't want arithmetic to type
| check against file descriptors...).
|
| Another scenario: say I have a type "Foo" which contains an
| integer. In version 1 of my library, this integer must be even,
| but in version 2 I add support for odd integers, too. The Foo
| data type, from a content perspective, is unchanged. However, the
| invariants around it have changed and it's therefore essential
| that it becomes a new type. Otherwise, someone might create a Foo
| containing an odd integer using the version 2 API and then pass
| it to a function from the version 1 API, resulting in bad things
| since the version 1 API believes Foo can never contain an odd
| integer.
| kroltan wrote:
| It looks from the other discussions that it's specifically
| content-addressed, not "semantic-addressed", so if your code
| has any redundant syntax (the given example being `x -> x + 2`
| vs `x -> x + 1 + 1`) it's still distinct.
|
| In that case, I guess data types can have a 0-size marker
| member, kind of like Rust's `PhantomData` type, that could
| ensure distinctness.
| refried_ wrote:
| The language has "unique" types, meaning they have their own
| semantic meaning apart from their structure; they get a unique
| hash (currently implemented by adding a random salt to the hash
| of the structure, though it might as well be a guid). So
| "unique" types and "structural" types.
|
| The same question came up for terms, here:
| https://news.ycombinator.com/item?id=27654045
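A rough sketch of the "unique" versus "structural" distinction described above, assuming (as the comment says of the current implementation) that uniqueness comes from mixing a random salt into the hash of the structure. The function names and the list-of-field-types encoding are hypothetical.

```python
import hashlib
import uuid

def structural_hash(fields):
    """Identity derived only from the shape of the type."""
    return hashlib.sha256(repr(tuple(fields)).encode("utf-8")).hexdigest()

def unique_hash(fields, salt=None):
    """Identity derived from the shape plus a salt picked once, at declaration."""
    salt = salt or uuid.uuid4().hex
    payload = salt + repr(tuple(fields))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Two separate declarations with the same shape: a single Int field.
print(structural_hash(["Int"]) == structural_hash(["Int"]))  # identical
print(unique_hash(["Int"]) == unique_hash(["Int"]))          # distinct
```

Under this scheme the salt has to be stored with the declaration; re-deriving it would give the "same" type a new identity every time.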
| jbrot wrote:
| Glad to hear there's a solution for this! Thanks for
| responding :)
| auggierose wrote:
| Submission inspired by
| https://news.ycombinator.com/item?id=27651197 , I guess
| WillDaSilva wrote:
| They mention using git to version Unison code, and point out how
| there'll practically never be any version conflicts because of
| the immutable / append-only nature of the language.
|
| Doesn't that mean that the git repository will only ever grow,
| and that old code will stick around forever? I hope I'm
| misunderstanding because that would be unfortunate if true.
| teraflop wrote:
| Isn't that true of any Git repository? The internal object
| store keeps every version of every file that has ever existed
| (unless you rewrite history).
|
| In practice, Git's content-addressable storage and delta
| compression make it work fairly well for all but the largest
| repositories.
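For reference, Git's content addressing of file contents is small enough to reproduce. In a SHA-1 repository a blob's ID is the SHA-1 of a short header followed by the bytes; `git_blob_id` is just a name chosen for this sketch.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the ID Git assigns to a blob holding these bytes
    (SHA-1 repositories): sha1(b"blob <size>" + NUL + content)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# The empty blob has a well-known ID.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Identical contents always produce the same ID, which is why an unchanged file costs nothing extra across commits.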
| dthul wrote:
| What I don't understand is what they do when merging two
| branches. If both branches introduce a function with the same
| name a merge conflict is inevitable, no? Or do they not support
| the distributed version control approach and every developer
| has to submit their changes to the current version of the
| database?
| refried_ wrote:
| It produces a name conflict, but (unlike git merge conflicts)
| these don't prevent any previously written code from running
| normally. A name conflict only needs to be resolved as a
| convenience to the next person to try calling the function by
| that name, and even that next person might not have trouble
| if the two new functions with the same name have different
| types. The person just calls the one they mean, and the type-
| checker uses the one with the type that fits.
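A toy model of why a name conflict does not break existing code, under the assumption that definitions are keyed by hash and names are a separate, many-valued index. The hashes (`#a1`, `#b2`), the string-encoded types, and the `resolve` helper are all made up for illustration and do not reflect Unison's actual data model.

```python
# Hypothetical codebase: definitions keyed by hash, names kept separately.
definitions = {
    "#a1": {"type": "Int -> Int", "impl": lambda x: x + 1},
    "#b2": {"type": "Text -> Text", "impl": lambda s: s + "!"},
}

# After a merge, the name "bump" points at two different definitions.
names = {"bump": ["#a1", "#b2"]}

def existing_caller(x):
    """Code written before the merge refers to its dependency by hash,
    so the name conflict cannot change what it runs."""
    return definitions["#a1"]["impl"](x)

def resolve(name, wanted_type):
    """Pick the definition whose type fits; fail only if still ambiguous."""
    matches = [h for h in names[name] if definitions[h]["type"] == wanted_type]
    if len(matches) != 1:
        raise LookupError(f"{name!r} is ambiguous or missing for {wanted_type}")
    return matches[0]

print(existing_caller(1))
print(resolve("bump", "Text -> Text"))
```

If both conflicting definitions had the same type, `resolve` would raise, which corresponds to the case where someone does have to settle the conflict by hand.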
| JoshTriplett wrote:
| Seems like an intended design feature. That doesn't mean you
| have to _keep_ all those old versions in every copy of the
| repository; you could always fetch only versions you need, for
| instance.
| lpointal wrote:
| Unison is already the name of a bidirectional file
| synchronization software (AFAIR developed in OCaml).
|
| https://www.cis.upenn.edu/~bcpierce/unison/
| TbobbyZ wrote:
| what can you build with it?
| pvg wrote:
| Previously:
|
| https://news.ycombinator.com/item?id=22009912
|
| https://news.ycombinator.com/item?id=9512955
___________________________________________________________________
(page generated 2021-06-28 23:03 UTC)