hngopher.com

       [HN Gopher] Starlark Programming Language
       ___________________________________________________________________
        
       Starlark Programming Language
        
       Author : laurentlb
       Score  : 107 points
       Date   : 2024-12-08 17:05 UTC (1 days ago)
        
 (HTM) web link (starlark-lang.org)
 (TXT) w3m dump (starlark-lang.org)
        
       | Rhapso wrote:
       | Starlark isn't exactly "functional" but by design it is
       | hermetic. You can't call any code that would have a side-effect
       | (other than burning compute resources) or get data that wasn't
       | part of your initial input.
       | 
       | It could make a lot of sense as a "contract language", or as
       | intended: part of a build system.
        
         | kccqzy wrote:
         | I believe Starlark allows side effects the first time it
         | evaluates a file but not subsequent times. Otherwise something
         | as simple as calling .append on a list won't work because that
         | returns None but has an input-mutating side effect. I think
         | it's a good choice.
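A minimal Python sketch of the "mutable while the file is evaluated, immutable afterwards" behaviour described above. The `FreezableList` class and `load_module` function are hypothetical stand-ins written for illustration. They are not part of any Starlark implementation.

```python
class FreezableList:
    """Toy stand-in for a list that is mutable until it is frozen."""

    def __init__(self, items=()):
        self._items = list(items)
        self._frozen = False

    def append(self, item):
        # Like list.append: mutates in place and returns None.
        if self._frozen:
            raise RuntimeError("cannot append to a frozen list")
        self._items.append(item)

    def freeze(self):
        self._frozen = True

    def items(self):
        return tuple(self._items)


def load_module():
    """Simulate evaluating a file: mutation is allowed while loading."""
    srcs = FreezableList(["a.c"])
    srcs.append("b.c")  # fine: the module is still being evaluated
    srcs.freeze()       # evaluation finished, the value becomes immutable
    return srcs


srcs = load_module()
try:
    srcs.append("c.c")  # rejected: the value was frozen after loading
except RuntimeError as err:
    frozen_error = str(err)
```

The point of the sketch is only that in-place mutation and hermeticity can coexist if mutation is confined to the evaluation that created the value.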
        
       | etamponi wrote:
       | Your mileage may vary, but Starlark was one of the most
       | complicated languages I've ever read. No types, just layers and
       | layers of indirection, with the goal of making a very complex
       | build rule look "simple". I don't like hiding non-accidental
       | complexity, in particular when the abstraction leaks everywhere.
       | Perhaps it was only due to the Starlark I was exposed to (I
       | worked at Google, probably most of the Starlark code I've read
       | has been written by SREs).
        
         | kccqzy wrote:
         | What you describe is really what all large-scale Python projects
         | look like. No types, just layers and layers of indirection,
         | with the goal of making the interface simple and Pythonic while
         | hiding implementation complexity. I don't think this is
         | necessarily a fault given that this language explicitly decides
         | to look like Python (and was in fact simplified from Python).
         | 
         | The worst Starlark code I've read has been written not by SREs,
         | but by the Boq team as they have a fetish for accomplishing
         | complicated configuration at build time. This was one of the
         | reasons I've avoided Boq: an incomplete code base that's under
         | development doesn't even begin to build, which is far worse
         | than building something and seeing a real compiler error.
        
           | yodsanklai wrote:
           | > No types
           | 
           | It's pretty easy to add types to Python nowadays. I'd
           | consider it bad practice not to do so in a large project.
        
             | seanmceligot wrote:
             | It would be nice if python would show types in the
             | documentation. Not only do I need that all the time, it
             | would show python was taking type safety seriously.
             | 
             | For example, knowing the return type of a function is
             | Union[DataFrame,Series] rather than simply DataFrame would
             | save a lot of bad errors.
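A short sketch of what an explicit `Union` return annotation might look like. The `Frame` and `Series` classes and the `select` function are hypothetical stand-ins invented for this example. They are not the pandas API.

```python
from typing import Dict, List, Union


class Series:
    """Hypothetical one-dimensional container (stand-in, not pandas)."""

    def __init__(self, values: List[float]) -> None:
        self.values = values


class Frame:
    """Hypothetical two-dimensional container (stand-in, not pandas)."""

    def __init__(self, columns: Dict[str, List[float]]) -> None:
        self.columns = columns


def select(frame: Frame, key: Union[str, List[str]]) -> Union[Frame, Series]:
    """Return a Series for one column name, a Frame for a list of names."""
    if isinstance(key, str):
        return Series(frame.columns[key])
    return Frame({name: frame.columns[name] for name in key})


table = Frame({"x": [1.0, 2.0], "y": [3.0, 4.0]})
one = select(table, "x")            # annotated as Union[Frame, Series]
many = select(table, ["x", "y"])
```

With the union spelled out in the signature, a caller (or a type checker) is told up front that the result has to be narrowed before frame-only methods are used.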
        
             | zellyn wrote:
             | Starlark, unfortunately, does not really support (Python
             | style) types yet. Facebook's version has some kind of
             | types, but ideally Starlark would just learn to do mypy
             | types.
        
             | kccqzy wrote:
             | That's only true for greenfield projects where people start
             | a new project with types in mind. It's absolutely a
             | nightmare for old projects because, without the need to
             | write types, people write all kinds of code that cannot fit
             | within what's possible in python's type annotations.
        
               | poincaredisk wrote:
               | In my opinion it's a bad practice, and rewriting code to
               | be typeable is a good idea for refactoring.
               | 
               | But I've been writing Python for some time now, and I know
               | what you mean. I have nightmares about codebases with
               | dynamically generated class fields for example (though I
               | heard ruby is even worse)
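To illustrate the kind of code that resists type annotations, here is a sketch contrasting runtime-generated fields with a declared equivalent. Both class names are made up for this example.

```python
from dataclasses import dataclass


class Settings:
    """Fields are created at runtime, so a static checker cannot see them."""

    def __init__(self, raw: dict) -> None:
        for name, value in raw.items():
            setattr(self, name, value)  # attribute names exist only at runtime


@dataclass
class TypedSettings:
    """Same data, but the fields are declared and therefore checkable."""

    host: str
    port: int


dynamic = Settings({"host": "localhost", "port": 8080})
typed = TypedSettings(host="localhost", port=8080)
```

Both objects behave the same at runtime. The difference is that only the second one tells tools which attributes exist and what types they hold.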
        
           | zellyn wrote:
           | What you describe is really what all programming languages,
           | compilers, interpreters, etc. look like. Layers and layers of
           | indirection, with the goal of making the interface simple
           | while hiding implementation complexity, all the way down to
           | assembly.
        
         | dastbe wrote:
         | How much of this was how bazel works vs. starlark itself?
         | 
         | I find the starlark language is very simple (though
         | inconsistent between the various implementations in bazel, go,
         | and rust) but it takes a bit to understand how the magic
         | between defining rules and implementations works in bazel. and
         | TBH, that is also one place I've really needed auto-
         | completion/static typing in starlark to understand what I
         | can/cannot do.
        
       | kevindamm wrote:
       | This appears to be the same language that started as a custom
       | build rule DSL for bazel (and evolving from lessons learned in
       | blaze's Python extension), the page positions it as a much more
       | general-purpose embedded language.
       | 
       | I'm all for using the right tool for the job, and I'll
       | acknowledge the benefits of hermeticity and parallelism in many
       | contexts. But, I also think that you need to weigh the cost of
       | adding another language to a project. I don't imagine myself
       | using this anywhere outside of a build system, and even then I
       | would first try to take the declarative rules of that build
       | system as far as I can. It's probably better than shell scripts,
       | though, at least most of the time? May depend on your team's
       | prior familiarity with shell and/or love of Python.
       | 
       | There are three implementations (Go, Rust, Java) which could be a
       | good thing if the language doesn't change often. They would have
       | to be kept in sync for any changes, as well as keep from drifting
       | due to changes in the host language's semantics.
       | 
       | I also think that the closeness in syntax with Python can be a
       | disadvantage, especially if users are expecting more recent
       | Python additions like the walrus operator or pattern matching to
       | be available.
       | 
       | I like to design languages, too, so these comments come from a
       | place of love and understanding. I think it would help if there
       | were some more specific justification (or examples) of why this
       | is better than something more established (e.g. Lua or
       | MicroPython) or more distinct (e.g. pure data in configs instead
       | of embedded code). I do like that the language attempts to remain
       | as simple as needed.
        
         | foooorsyth wrote:
         | >There are three implementations (Go, Rust, Java)
         | 
         | There are so many people at Google just goofing around, lol.
         | People just doing hobby stuff for $350k/year.
         | 
         | There is no justification to implement this custom dialect of
         | Python for a build system that drives everyone crazy 3 times.
         | Reminds me of when it was revealed that 400 people were working
         | on Fuchsia -- a hobby OS that only shipped on a single smart
         | home device.
        
           | ithkuil wrote:
           | IIRC initially blaze was literally using python as a config
           | layer. That made it too easy to write build scripts that were
           | too slow and too hard to optimize and that negatively
           | affected the overall build experience.
           | 
           | An increase in edit/build/run cycle efficiency of thousands
           | of employees justifies an investment of a few engineers.
           | 
           | They could have invented a new bespoke DSL. Instead they
           | chose to stick to a well known and familiar language and
           | just limit its expressiveness to a subset that would be
           | easier to optimize. I think that's quite reasonable
        
             | kevindamm wrote:
             | I also remember blaze directly eval'ing Python pre-Skylark,
             | but I think the bigger problem was build hermeticity. This
             | was compounded by env-related issues making it hard to
             | build Python hermetically in the first place, and the
             | importance of cached object files and build outputs in
             | managing the incredibly large monorepo that is Google's
             | source tree. And that's all before considering the Py2->Py3
             | migration (which, fortunately, was tackled afterwards).
             | 
             | I think a new bespoke DSL would have been a non-starter
             | since so many build scripts had already been written by the
             | time Skylark was being conceived.
        
           | zellyn wrote:
           | One of my hobbies while at Google (2010-2015) was to watch
           | the multiple failed attempts to get rid of Borg Config
           | Language and actual Python in Blaze. It took a lot of work
           | until they eventually succeeded. You're probably
           | underestimating the rigor, the pain, and also the value
           | involved in cleaning those things up: being able to cleanly
           | operate programmatically at scale in the Google monorepo is
           | extremely necessary.
           | 
           | (This is also why I don't trust configuration languages built
           | by people who _didn't_ observe the years of pain. Cue and
           | jsonnet are notable projects that were able to incorporate a
           | lot of lessons.)
        
             | zellyn wrote:
             | Also, I think the Rust version was built as a side-project,
             | then handed off to Facebook.
             | 
             | With regards to Fuchsia... I used to think building a new
             | OS from the ground up was madness. Now I think _not_ doing
             | that eventually is madness. I'd be a lot happier if Zircon
             | and Fuchsia were done in Rust though...
        
               | surajrmal wrote:
               | A good deal of fuchsia is done in rust (roughly 50% and
               | increasing over time). People over emphasize the
               | importance of zircon needing to be in rust. It's more
               | important for new code to be rust than for existing code
               | to be rewritten. Zircon isn't a very fast growing part of
               | the entire OS.
        
               | zellyn wrote:
               | Fair. That study on writing only new things in Rust was
               | surprising!
               | https://security.googleblog.com/2024/09/eliminating-
               | memory-s...
               | 
               | I do think for an OS/Kernel, it's worth having everything
               | in a memory-safe language, and possibly worth formal
               | verification too, if the very core of it is small
               | enough...
        
             | foooorsyth wrote:
             | I can appreciate the pain of actually getting things
             | through in a large (100k+ people) organization.
             | 
             | What do you consider to be the justification for three
             | separate implementations of the same build config language?
             | Genuine question. I am not doubting the need for the DSL
             | itself.
        
               | jsnell wrote:
               | (Note that the Rust implementation is by Facebook, not by
               | Google.)
               | 
               | If you're looking to embed a scripting language in a Go
               | program, having an embedded language implementation
               | written in Java isn't very useful. And vice versa.
        
               | foooorsyth wrote:
               | Ah I missed that this thing is actually embedded. I
               | thought they were just doing multiple implementations for
               | shits and gigs.
               | 
               | Carry on, Google
        
               | zellyn wrote:
               | Yep. Basically the same reason there are so many lua
               | implementations, or WASM ones.
        
           | surajrmal wrote:
           | I'm not sure I understand what you are trying to say - do
           | you think we should give up on OS diversity? Are you making
           | fun of a company for actually investing in something
           | ambitious which can benefit many if successful?
           | 
           | While I'm not sure you are in the same crowd, I always think
           | it's interesting that the HN hivemind tends to be upset with
           | the browser monoculture, but doesn't bat an eye at OS
           | monoculture. It feels like you're really just channeling
           | feelings about the company rather than the projects.
        
           | surajrmal wrote:
           | Of the three, only two are Google produced. The rust
           | implementation is written by Facebook for use in their build
           | system. The java implementation is the original but is pretty
           | tied to the bazel build system and not really very
           | suitable to other uses. The go implementation is meant to be
           | embeddable into a varying number of applications.
           | 
           | An example application is
           | https://github.com/shac-project/shac
           | 
           | Within Google, there are a large number of similar tools which
           | are written in go and harness the starlark language. While
           | there are plenty of other options, I will say I think
           | starlark is often a great choice.
        
         | yodsanklai wrote:
         | If I understand correctly, Starlark is a strict subset of
         | Python (at least very close to Python). Can anything be more
         | established than Python?
         | 
         | Also I don't think pure data is an option. The point is that
         | you want to generate data, which has many benefits (avoid
         | duplication, easier to test).
         | 
         | It seems that Starlark is a good trade-off given the
         | constraints. My main gripe is that it doesn't have types.
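A small sketch of the "generate data instead of writing pure data" argument made above. The `service` helper, the field names, and the `registry.example` host are all invented for illustration.

```python
def service(name, port, replicas=2):
    """Build one service entry. Shared defaults live in a single place."""
    return {
        "name": name,
        "port": port,
        "replicas": replicas,
        "health_check": "/healthz",
        "image": "registry.example/" + name,
    }


# The generated config: each entry repeats nothing by hand.
SERVICES = [
    service("api", 8080),
    service("worker", 9090, replicas=4),
]
```

Because the config is the return value of a plain function, it can be unit tested like any other code, which is the "easier to test" benefit mentioned in the comment.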
        
           | kevindamm wrote:
           | The syntax is a pure subset but the implementations are
           | bespoke, so I wouldn't equate the established position of
           | Python with whether Starlark is considered established.
           | 
           | I agree that sometimes pure data isn't an option, and I've
           | had to write some Skylark to assist blaze build rules too,
           | but every time it also added tech debt, reduced the number of
           | people on the team who completely understood the build
           | system, and wasn't convenient/practical to test the build
           | extension. The problem I'm referring to above is when the use
           | of a source-controlled data file would have been sufficient
           | but someone had to write a build extension because it's fun
           | or something new to do (or whatever reason seemed convincing
           | at the time).
           | 
           | Then there are people who think that those same data files
           | shouldn't be part of the build process at all and should be
           | part of system turn-up/tear-down, stored in a global config
           | or DB where versioning is alongside the data instead of
           | alongside the program build. Certain kinds of migration are
           | made more difficult if everything is built into the binary.
           | Of course, that strategy comes with its own caveats.
        
           | phyrex wrote:
           | The rust implementation has types
        
       | paulddraper wrote:
       | Starlark (originally Skylark) is the configuration/extension
       | language of Google's build tool Bazel.
       | 
       | It's a bespoke Python subset... functions but no recursion, dicts
       | but no sets, etc.
       | 
       | It's also been adopted by the latest version of Buck (Meta's
       | Bazel analog).
       | 
       | I use it daily.
       | 
       | It's an option for a lightweight embeddable scripting language,
       | with implementations in Java and Go. If you want Python
       | familiarity, consider it as an option.
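The restrictions listed above (no recursion, no sets) are commonly worked around with an explicit stack and with dict keys. The sketch below is plain Python written in that restricted style. It also avoids `while`, on the assumption that unbounded loops are unavailable as well. The function names and the `max_steps` limit are illustrative, and individual Starlark implementations may differ in what they allow.

```python
def flatten(nested, max_steps=1000):
    """Flatten nested lists without recursion, using an explicit stack."""
    stack = [nested]
    out = []
    for _ in range(max_steps):  # bounded loop instead of recursion
        if not stack:
            break
        item = stack.pop()
        if isinstance(item, list):
            # Push children in reverse so they pop in their original order.
            for child in reversed(item):
                stack.append(child)
        else:
            out.append(item)
    if stack:
        raise ValueError("max_steps exceeded before the input was consumed")
    return out


def unique(items):
    """Deduplicate while keeping first-seen order, using a dict as a set."""
    seen = {}
    for item in items:
        seen[item] = True
    return list(seen.keys())


flat = flatten([1, [2, [3, 4]], 5])
deduped = unique(["b", "a", "b", "c", "a"])
```

The bounded loop makes termination obvious at the cost of an arbitrary step limit, which is the usual trade-off in this style.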
        
         | nrr wrote:
         | Starlark is Turing-incomplete, which makes it somewhat unique
         | among embeddable languages. It's definitely a draw for me for
         | something I'm working on.
        
           | kevindamm wrote:
           | The primitive-recursive property of Cue (https://cuelang.org)
           | is a big draw for me, and may be an alternative worth
           | checking out. The authors have spent a great amount of
           | attention to the type system (they learned a lot of lessons
           | from previous config language designs that did not take
           | lattice theory and unification into account).
        
             | verdverm wrote:
             | The tl;dr is that inheritance is bad in config, whether it
             | be from OOP or layering yaml files like Helm. The reason
             | being that it is hard to understand where a value is coming
             | from and where one must make an edit to correct it in high-
             | stress SRE situations like downtime. Marcel worked on both
             | major config languages at Google, and iirc Starlark is
             | based on GCL ideas
             | 
             | The Logic of CUE is a great read:
             | https://cuelang.org/docs/concept/the-logic-of-cue/
        
           | kccqzy wrote:
           | Dhall is another configuration language that's deliberately
           | Turing-incomplete. Though its Haskell-inspired syntax turns
           | people off who aren't already Haskell programmers. It's based
           | on calculus of constructions.
        
       | anothername12 wrote:
       | > Deterministic evaluation - Executing the same code twice will
       | give the same results.
       | 
       | What's this all about? Don't most languages?
        
         | lann wrote:
         | One specific example: many languages use randomized seeds in
         | builtin dict/map types, leading to randomized iteration order.
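A sketch of how a per-process hash seed can leak into iteration order, using CPython as the example. CPython dicts keep insertion order, so the sketch uses a set of strings instead. It runs a child interpreter with a chosen `PYTHONHASHSEED` and compares the printed order. The helper name `set_order` is made up for this example.

```python
import os
import subprocess
import sys

# Code run in a child interpreter: print a set of strings in iteration order.
SNIPPET = "print(list({'apple', 'banana', 'cherry', 'date'}))"


def set_order(seed):
    """Return the printed iteration order under a given hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


same_a = set_order(1)
same_b = set_order(1)  # same seed: same order as same_a
other = set_order(2)   # different seed: the order may differ
```

Without a pinned seed, each run picks a fresh one, so any output derived from that iteration order can change between runs.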
        
           | chubot wrote:
           | Yeah, also Starlark is embedded like Lua, and doesn't come
           | with batteries included like Python
           | 
           | So that means you can control the APIs. Say, opendir() and
           | readdir() in Unix can return filenames in different orders,
           | depending on what the data structure in the kernel is
           | 
           | So many programs in other languages aren't deterministic just
           | because they use APIs that aren't deterministic
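A minimal sketch of taming one such API: Python's `os.listdir` documents its result as being in arbitrary order, so a caller that wants reproducible output can sort it. The `list_sources` name and the file names are illustrative.

```python
import os
import tempfile


def list_sources(directory):
    """Return file names in a stable order, whatever the filesystem returns."""
    return sorted(os.listdir(directory))


with tempfile.TemporaryDirectory() as tmp:
    # Create the files in a deliberately unsorted order.
    for name in ["b.txt", "c.txt", "a.txt"]:
        open(os.path.join(tmp, name), "w").close()
    listing = list_sources(tmp)
```

An embedding host can apply the same idea once, inside the function it exposes to scripts, so that scripts never observe the raw order.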
        
         | phyrex wrote:
         | You don't get access to randomness or time functions or
         | anything that could change the output of a function with the
         | same input
        
         | zellyn wrote:
         | It can be very valuable to know that if you put the same files
         | in, you get _exactly_ byte-for-byte identical artifacts out of
         | your build system. Even letting your language access the actual
         | current date/time can break that.
         | 
         | (IIRC, I don't believe Bazel actually has fully deterministic
         | builds yet, though.)
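As a sketch of what "same files in, byte-for-byte identical artifact out" can involve, the following packs files into an uncompressed tar with sorted entries and a fixed timestamp. The function names are invented for this example, and real build systems handle many more sources of variation.

```python
import hashlib
import io
import tarfile


def build_archive(files):
    """Pack {name: bytes} into a tar with sorted entries and fixed metadata."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w",
                      format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(files):  # entry order must not depend on the input
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = 0          # never record the current time
            archive.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()


def digest(files):
    """Hash the archive bytes so two builds can be compared."""
    return hashlib.sha256(build_archive(files)).hexdigest()


first = digest({"b.txt": b"two", "a.txt": b"one"})
second = digest({"a.txt": b"one", "b.txt": b"two"})
```

Here `first` and `second` are equal because neither the dict's insertion order nor the clock reaches the output.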
        
         | laurentlb wrote:
         | Try this in Python:
         | 
         | id("ab") # not deterministic
         | 
         | hash("ab") # not deterministic
         | 
         | def foo(): pass
         | 
         | str(foo) # not deterministic
         | 
         | Another sneaky one is the `is` operator in Python, where the
         | Python documentation says:
         | 
         | > Due to automatic garbage-collection, free lists, and the
         | dynamic nature of descriptors, you may notice seemingly unusual
         | behaviour in certain uses of the `is` operator
         | 
         | Related to that is the `__del__` method: when exactly is it
         | called?
         | 
         | It's quite easy to get non-deterministic code in Python and in
         | many languages. And of course, there are lots of non-
         | deterministic functions in the standard library (Starlark
         | doesn't provide them).
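For comparison with the examples above, here is a sketch of run-independent substitutes in Python: a content hash instead of `hash()`, and a qualified name instead of `str()` on a function. The helper names are invented for this example.

```python
import hashlib


def stable_hash(text):
    """Content-based digest: the same input gives the same value on every run."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def foo():
    pass


def stable_name(func):
    """Identify a function by module and name, not by its memory address."""
    return func.__module__ + "." + func.__qualname__


digest_ab = stable_hash("ab")
name_foo = stable_name(foo)
```

Neither value depends on the hash seed or on where the interpreter happened to allocate the object.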
        
         | neuroelectron wrote:
         | Apparently not. Wow. No wonder China is hacking into
         | everything.
        
       | lihaoyi wrote:
       | Starlark is definitely a mixed experience IMO, from my 7 years
       | working with it in Bazel
       | 
       | On one hand, having a "hermetic" subset of Python is nice. You
       | can be sure your Bazel starlark codebase isn't going to be making
       | network calls or reading files or shelling out to subprocesses
       | and all that. The fact that it is hermetic does help make things
       | reproducible and deterministic, and enables parallelization and
       | caching and other things. Everyone already knows Python syntax,
       | and it's certainly nicer than the templated-bash-in-templated-yaml
       | files common elsewhere in the build tooling/infra space
       | 
       | On the other hand, a large Starlark codebase is a large Python
       | codebase, and large Python codebases are imperative, untyped, and
       | can get messy even without all the things mentioned above. Even
       | though your Starlark is pure and deterministic, it still easily
       | ends up a rat's nest of spaghetti. Starlark goes the extra mile to
       | be non-turing-complete, but that doesn't mean it's performant or
       | easy to understand. And starlark's minimalism is also a curse as
       | it lacks many features that help you manage large Python
       | codebases such as PEP 484 type annotations, which also means IDEs
       | cannot provide much help since they rely on types to
       | understand the code
       | 
       | For https://mill-build.org we went the opposite route: not
       | enforcing purity, but using a language with strong types and a
       | strong functional bent to it. So far it's been working out OK,
       | but it remains to be seen how well it scales to ever larger and
       | more complex build setups
        
         | kstrauser wrote:
         | Python has always been strongly, dynamically typed. It isn't
         | untyped.
        
           | jerf wrote:
           | The sort of "untyped" that your last sentence is referring to
           | is a dead term, though. The only "untyped" language still in
           | common use is assembler, and that's not commonly written by
           | hand anymore (and when it is, it's primarily running on
           | numbers, not complex structs and complex values). There
           | aren't any extant languages anymore that just accept numbers
           | in RAM and just treat them as whatever.
           | 
           | So increasingly, this objection is meaningless, because
           | nobody is using "untyped" that way anymore. The way in which
             | people _do_ use the term, Python _is_ only "optionally"
           | typed, and a lot of real-world Python code is "untyped".
        
             | js2 wrote:
             | I think the objection is to the conflation of strong/weak
             | with dynamic/static and it being unclear exactly what
             | typed/untyped means, since it can refer to either. Python
             | has always been strongly typed at runtime (dynamic), vs say
             | JavaScript which is relatively weakly typed at runtime.
             | 
             | Obviously lihaoyi was referring to static/dynamic when they
             | wrote untyped (as made clear by the reference to type
             | annotations) but kstrauser is objecting to using the term
             | "untyped" since that can be interpreted to mean weak typing
             | as well, which Python is not.
             | 
             | $0.02 anyway.
        
               | lolinder wrote:
               | Strong/weak is a meaningless dichotomy that could be
               | replaced by nice/icky while conveying the same meaning.
               | It just distinguishes whether I, personally, believe a
               | given language has sufficient protections against dumb
               | programmer errors. What counts as strong or weak depends
               | entirely on who's talking. Some will say that everything
               | from C on is strong, others draw the line at Java, still
               | others aren't comfortable until you get to Haskell, and
               | then there are some who want to go even further before
               | it's truly "strong".
               | 
               | Typed versus untyped is, on the other hand, a rigorously
               | defined academic distinction, and one that very clearly
               | places pre-type-hints Python in the untyped category.
               | That's not a bad thing--untyped isn't inherently a
               | derogatory term--but because untyped languages have
               | fallen out of vogue there's a huge effort to rebrand
               | them.
        
               | poincaredisk wrote:
               | ...but Python is obviously typed. It has types. In fact
               | everything has a type, and even the types are of "type"
               | type. It has type errors. Saying it's "untyped" invokes a
               | wrong impression. Your usage is very non-standard in
               | programmer circles.
               | 
               | What's wrong with universally understood and well defined
               | concepts of "statically" and "dynamically" typed
               | languages?
        
               | lolinder wrote:
               | As I said in another comment [0], it depends on what
               | definition of types we're using. But if we're going to
               | pedantically jump down someone's throat correcting their
               | usage (in this case OP's usage of "untyped"), we should
               | at least use the most pedantically correct definition,
               | which is the one used by academics who study type systems
               | and which pointedly excludes dynamic checks.
               | 
               | I have no problem with people using the other terminology
               | in casual usage--I do so myself more often than not. I do
               | have a problem with people pedantically correcting usage
               | that is actually _more_ correct than their preferred
               | usage. I dislike pedantry in general, but I especially
               | dislike incorrect pedantry.
               | 
               | [0] https://news.ycombinator.com/item?id=42367659
        
               | jyounker wrote:
               | Strong/weak typing is very specific thing. It refers to
               | the ability to create invalid types within a language. In
               | strongly typed languages it is hard to defeat the type
               | system. In weakly typed languages it is easy to defeat
               | the type system.
               | 
               | Python is strongly typed (hard to escape the bounds of
               | the type system) but (traditionally) dynamically typed
               | (types are checked at runtime).
               | 
               | C is weakly typed (easy to escape the type system), but
               | statically typed (types are checked at compile time).
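A small Python sketch of the "strong but dynamic" combination described above: mixing a string and an integer is rejected rather than silently coerced, but only when the offending line actually runs. The function names are illustrative.

```python
def add(a, b):
    return a + b


def never_called():
    # Defining this is fine: nothing is checked until the line executes.
    return "1" + 1


try:
    add("1", 1)  # str + int is rejected, not coerced
except TypeError as err:
    runtime_error = type(err).__name__

ints = add(1, 2)
strs = add("1", "1")
```

The same `add` works for two integers and for two strings, and the mismatch surfaces as a runtime `TypeError`.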
        
               | lolinder wrote:
               | That is a possible definition for strongly typed, yes. It
               | is not widespread or generally agreed upon--you'll see
               | plenty of people use them in ways that contradict your
               | definitions, and you won't see any serious work
               | attempting to define them at all. Even Wikipedia doesn't
               | [0]:
               | 
               | > However, there is no precise technical definition of
               | what the terms mean and different authors disagree about
               | the implied meaning of the terms and the relative
               | rankings of the "strength" of the type systems of
               | mainstream programming languages. For this reason,
               | writers who wish to write unambiguously about type
               | systems often eschew the terms "strong typing" and "weak
               | typing" in favor of specific expressions such as "type
               | safety".
               | 
               | [0]
               | https://en.m.wikipedia.org/wiki/Strong_and_weak_typing
        
               | samatman wrote:
               | Untyped computation in the academic sense you refer to
               | _is_ untyped in the sense of Forth and assembler. The
               | untyped lambda calculus doesn't even have numbers.
               | Pragmatically, a language in which type errors occur is a
               | typed language.
               | 
               | Nor does it make sense to conflate "typed and untyped"
               | with "statically typed and dynamically typed". These are
               | simply very different things. Julia is an example of a
               | dynamically typed language with a quite sophisticated
               | type system and pervasive use of type annotations, it
               | would be insane to call it untyped. Typescript is an
               | example of a dynamic language which is nonetheless
               | statically typed: because type errors in Typescript
               | prevent the program from compiling, they're part of the
               | static analysis of the program, not part of its dynamic
               | runtime.
               | 
               | The fact that it's uncommon to use untyped languages now
               | is not a good reason to start describing certain type
               | systems as 'untyped'! A good term for a language like
               | annotation-free Python is unityped: it definitely has a
               | (dynamic) type system, but the type of all variables and
               | parameters is "any". Using this term involves typing one
               | extra letter, and the payoff is you get to make a correct
               | statement rather than one which is wrong. I think that's
               | a worthwhile tradeoff.
        
               | lolinder wrote:
               | From Benjamin Pierce's _Types and Programming Languages_,
               | which is basically the definitive work on types:
               | 
               | > A type system is a tractable syntactic method for
               | proving the absence of certain program behaviors by
               | classifying phrases according to the kinds of values they
               | compute.
               | 
               | And later on:
               | 
               | > A type system can be regarded as calculating a kind of
               | static approximation to the runtime behaviors of the
               | terms in a program. ... Terms like "dynamically typed"
               | are arguably misnomers and should probably be replaced by
               | "dynamically checked," but the usage is standard.
               | 
               | The definitions you're using are the ones that he
               | identifies as "arguably misnomers" but "standard". That
               | is, they're fine as colloquial definitions but they are
               | not the ones used in academic works. Academically
               | speaking, a type system is a method of statically
               | approximating the behavior of a computer program in order
               | to rule out certain classes of behavior. Dynamic checks
               | do not count.
               | 
               | As I've said elsewhere, I don't have a problem with
               | people using the colloquial definitions. I do have a
               | problem with people actively correcting someone who's
               | using the more correct academic definitions. We should
               | have both sets in our lexicons and be understanding when
               | someone uses one versus the other.
        
               | js2 wrote:
               | > Strong/weak is a meaningless dichotomy
               | 
               | Strong/weak is not a dichotomy. It's a spectrum. That's
               | why folks argue over where a language lands in the
               | spectrum. OTOH, static (compile-time) vs dynamic (run-
               | time) is a dichotomy. There's not really any in between.
               | It's clear when and where typing occurs. So there's
               | nothing to argue over.
               | 
               | > Typed versus untyped is, on the other hand, a
               | rigorously defined academic distinction
               | 
               | A typed language is one that has a type system. Python
               | has a type system. It's typed.
        
               | lolinder wrote:
               | Academically, no, a type system is by definition static.
               | See the definition Benjamin Pierce gives in TAPL that
               | I've placed in many comments in this subthread [0] and
               | won't repeat here.
               | 
               | Colloquially, yes, python has a type system. All I'm
               | saying is it's unhelpful to correct someone for using the
               | more correct definition rather than the colloquial one.
               | Both definitions are valid, but if we're going to be
               | pedantic we should at least use the academic definition
               | for our pedantry.
               | 
               | And you're correct, I should have said spectrum, but the
               | point is still the same: even Wikipedia refuses to define
               | "strongly" or "weakly" typed, suggesting people use
               | terminology that isn't hopelessly muddled.
               | 
               | [0] Here's one:
               | https://news.ycombinator.com/item?id=42368689
        
             | f1shy wrote:
             | I could even argue that Asm is to some extent typed.
             | Depends on the processor, but some CISC processors have
             | operations for different types. But also the comment is
             | correct: Python is strongly, dynamically typed.
        
               | DougMerritt wrote:
               | Lisp CPUs had type bits stored with values, but I can't
               | think of any typed CPUs still in use. What are you
               | thinking of?
        
               | kaladin-jasnah wrote:
               | Yes, agreed.
               | 
               | I guess the typing would be for the size of the integer
               | that you work with. For example, x86_64 assembly has
               | different prefixes to indicate what part of a larger
               | register you are using: 8 (lower), 8 (upper), 16 bit, 32
               | bit, and 64 bit itself.
               | 
               | There are other "typed" operations, such as branching for
               | unsigned vs. signed integers (think JA vs JG), or SAR vs
               | SHR (arithmetic shift for signed values vs. logical shift
               | for unsigned values; the former preserves the division
               | logic of shifting for signed integers by repeating the MSB
               | instead of adding zeroes when shifting).
               | 
               | While I'm not too familiar with them (but have been
               | meaning to learn more for years!!), SIMD instructions
               | probably also have similar ideas of having different
               | types for sizes of arrays.
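               | 
               | A minimal sketch of the SAR/SHR distinction, emulated on
               | 8-bit patterns in Python rather than written in assembly.
               | The helper names (sar8, shr8, as_signed8) are made up for
               | illustration:

```python
def shr8(value: int, count: int) -> int:
    """Logical right shift of an 8-bit pattern: zeroes enter from the left."""
    return (value & 0xFF) >> count


def as_signed8(value: int) -> int:
    """Read an 8-bit pattern as a two's-complement signed integer."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def sar8(value: int, count: int) -> int:
    """Arithmetic right shift of an 8-bit pattern: the sign bit is repeated."""
    # Python's >> on a negative int is already arithmetic, so reinterpret the
    # pattern as signed, shift, then mask back down to 8 bits.
    return (as_signed8(value) >> count) & 0xFF


pattern = 0xF8  # -8 when read as signed, 248 when read as unsigned
print(as_signed8(sar8(pattern, 1)))  # signed reading of the SAR result
print(shr8(pattern, 1))              # unsigned reading of the SHR result
```

               | Read as signed, 0xF8 is -8 and sar8 halves it to -4; read
               | as unsigned it is 248 and shr8 halves it to 124. The bits
               | carry no type of their own, so the choice of instruction
               | is what supplies the interpretation.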
        
             | munch117 wrote:
             | There's lots of programming languages still around with
             | untyped elements to them. Javascript is one of them, with
             | its string/number conversions and the way arrays are
             | defined. Then there's all the stringly typed stuff. Make,
             | CMake, Excel, TCL, bash. You're probably right that the
             | original use of the term came from assembly vs. high level,
             | but that objection is meaningless, because nobody is using
             | "untyped" that way anymore....
             | 
             | What makes changing the meaning of "untyped" extra
             | confusing is that dynamically typed programming languages
             | often have types as 1st class objects, and they get used
             | all the time for practical everyday programming. Calling
             | these languages "untyped" is just wrong on the face of it
             | -- they're full of types.
        
               | lolinder wrote:
               | > changing the meaning of "untyped" extra confusing is
               | that dynamically typed programming languages often have
               | types as 1st class objects, and they get used all the
               | time for practical everyday programming. Calling these
               | languages "untyped" is just wrong on the face of it --
               | they're full of types.
               | 
               | Just to be clear, it's the dynamically typed languages
               | that changed the meaning of untyped. OP's usage is closer
               | to the original and to the current usage of the
               | terminology in the study of programming languages.
               | 
               | Types and Programming Languages, one of the best regarded
               | texts on types, has this helpful explanation:
               | 
               | > A type system can be regarded as calculating a kind of
               | static approximation to the runtime behaviors of the
               | terms in a program. ... Terms like "dynamically typed"
               | are arguably misnomers and should probably be replaced by
               | "dynamically checked," but the usage is standard.
               | 
               | In other words both are standard, but that's because the
               | meaning of "types" has changed over time from its
               | original sense and when it comes to the formal study of
               | programming languages we still use the original
               | terminology.
        
               | munch117 wrote:
               | Just to be even clearer.
               | 
               | In the time of the original use, there were only static
               | types. Languages had very little in terms of UDT's. Even
               | a struct in C was barely a type of its own. I don't
               | recall the details, but there was something about struct
               | member names not being local to the struct. Interpreted
               | languages didn't have records or classes at all(*), and
               | certainly not types as first class objects.
               | 
               | We cannot really talk about how dynamically typed
               | languages with rich type systems were originally
               | labelled, back when they didn't exist at all.
               | 
               | (*) I'm looking forward to someone pointing out an
               | interesting counterexample.
        
           | lolinder wrote:
           | It depends on what definition of "type system" you're using.
           | Colloquially many programmers use it to refer to any system
           | that checks whether objects have specific shapes. Academics,
           | on the other hand, have a very specific definition of a type
           | system that excludes dynamically typed languages. From TAPL (one
           | of the authoritative works on the subject):
           | 
           | > A type system is a tractable syntactic method for proving
           | the absence of certain program behaviors by classifying
           | phrases according to the kinds of values they compute.
           | 
           | And later on:
           | 
           | > A type system can be regarded as calculating a kind of
           | static approximation to the runtime behaviors of the terms in
           | a program. ... Terms like "dynamically typed" are arguably
           | misnomers and should probably be replaced by "dynamically
           | checked," but the usage is standard.
           | 
           | In other words, you're both correct in your definitions
           | depending on who you're talking to, but if we're going to get
           | pedantic (which you seem to be) OP is slightly _more_
           | correct.
           | 
           | Personally, it feels like dynamically typed language
           | advocates have been getting more and more vocal about their
           | language of choice being "typed" as static typing has grown
           | in popularity in recent years. This seems like misdirected
           | energy--static typing advocates know what they're advocating
           | for and know that dynamically typed languages don't fill
           | their need. You're not accomplishing much by trying to force
           | them to use inclusive language.
           | 
           | Rather than trying to push Python as a typed language it
           | seems like it would be more effective to show why dynamic
           | checks have value.
        
             | kstrauser wrote:
             | Here's a Stack Overflow question about it from 15 years
             | ago: https://stackoverflow.com/questions/2025353/is-python-
             | a-weak...
             | 
             | It was an old discussion before then, even. It has nothing
             | to do with advocacy and it's certainly not recent. It's
             | about accuracy so that people stop hearing and then
             | repeating the same incorrect ideas. There's no common
             | definition of types by which Python is untyped, as though
             | it doesn't have types at all when in fact _every_ Python
             | object has a type.
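             | 
             | A small Python sketch of both sides of this argument: every
             | object carries a runtime type, yet a mismatch is only
             | detected when the offending operation actually runs. The
             | helper names describe and try_add are hypothetical:

```python
def describe(value):
    """Return the name of the runtime type attached to any Python object."""
    return type(value).__name__


def try_add(left, right):
    """Attempt left + right and report the dynamic check's verdict."""
    try:
        return ("ok", left + right)
    except TypeError as exc:
        # The check happens here, at run time, not before the program starts.
        return ("type error", str(exc))


# Values, functions and even types themselves all have a type.
print(describe(1), describe("1"), describe(len), describe(int))
print(try_add(1, 2))
print(try_add(1, "2"))
```

             | Nothing rejects try_add(1, "2") ahead of time, which is the
             | sense in which the academic definition calls this
             | "dynamically checked" rather than typed.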
        
               | lolinder wrote:
               | > There's no common definition of types by which Python
               | is untyped
               | 
               | You mean besides the one used by every programming
               | languages researcher and hobbyist? Sure, you can define
               | "common" to exclude them, but I would give at least some
               | credence to the definitions put forward by the teams of
               | people who _invent_ type theory.
               | 
               | As I've said here and elsewhere, I have no problem with
               | people casually using "dynamically typed" as a term--I do
               | so as well. But there's no cause to correct someone for
               | using the _more correct_ terminology.
               | 
               | If hearing it makes you feel defensive of python, that
               | implies that you perceive "untyped" as a pejorative that
               | needs defending against. In that case, your efforts would
               | be better spent correcting the evolving consensus that
               | (statically) typed is better than they would be spent
               | trying to shout people down for using the academic
               | definitions of typed and untyped.
        
           | grumpyprole wrote:
           | This "strong typing" message from the Python community has
           | always sounded like propaganda to me - designed to confuse
           | management. Strong typing is about machine checked proofs of
           | invariants, not whether you avoid a few daft built-in
           | coercions.
        
             | ithkuil wrote:
             | There is a static vs dynamic distinction and strong vs weak
             | typing
             | 
             | There is also a semi humorously named "stringly typed"
             | which means weakly typed in such a way that incompatible
             | types are promoted to strings before being operated on.
             | 
             | I'm not aware of any static weakly typed language, but it's
             | logically possible to have one
        
         | ajayvk wrote:
         | I have been building an internal tools development and
         | deployment platform [1]. It is built in Go, using Starlark for
         | configuration and API business logic.
         | 
         | Starlark has been great to build with. You get the readability
         | of having a simple subset of python, without python's
         | dependency management challenges. It is easily extensible with
         | plugin APIs. Concurrent API performance is great without the
         | python async challenges.
         | 
         | One challenge wrt using Starlark as a general purpose embedded
         | scripting language is that it does not support usual error
         | handling features. There are no exceptions and no multi value
         | return for error values; all errors result in an abort. This
         | works for a config language, where a fail-fast behavior is
         | good. But for a general purpose script, you need more fine
         | grained error handling. Since I am using Starlark for API logic
         | only, I came up with a user definable error handling behavior.
         | This uses thread locals to keep track of error state [2], which
         | might not work for more general purpose scripting use cases.
         | 
         | [1] https://github.com/claceio/clace
         | 
         | [2] https://clace.io/docs/plugins/overview/#automatic-error-
         | hand...
        
           | jcmfernandes wrote:
           | This is also one of my major complaints at this point. I'm
           | building a developer tool with Starlark resting at its core
           | and I had already come across your work.
           | 
           | I wish the Starlark team had addressed it at this point.
        
             | mahmoudimus wrote:
             | We solved this by introducing a Result library.
             | 
             |     load("@.../result", result=result)
             | 
             |     def throw(arg):
             |         return 1/0
             | 
             |     if result.Result(throw).map(arg).is_ok:
             |         # proceed
             |     else:
             |         fail("...")
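             | 
             | For readers unfamiliar with the pattern, here is a rough
             | sketch of what a Result wrapper can look like. It is plain
             | Python, not the library referred to above: Result, attempt
             | and their fields are hypothetical names, and Python's
             | try/except stands in for the host-side catching that a
             | Starlark script cannot do itself, since the language has no
             | exceptions.

```python
class Result:
    """Hypothetical container holding either a value or an error message."""

    def __init__(self, value=None, error=None):
        self.value = value
        self.error = error

    @property
    def is_ok(self):
        return self.error is None

    def map(self, fn):
        """Apply fn to the value if present; capture any failure as an error."""
        if not self.is_ok:
            return self  # an earlier error short-circuits the chain
        return attempt(fn, self.value)


def attempt(fn, *args):
    """Call fn(*args), converting a raised exception into an error Result."""
    try:
        return Result(value=fn(*args))
    except Exception as exc:  # host-side catch; the caller never sees a raise
        return Result(error="%s: %s" % (type(exc).__name__, exc))


def throw(arg):
    return 1 / 0


outcome = attempt(throw, 42)
if outcome.is_ok:
    print("proceed with", outcome.value)
else:
    print("failed:", outcome.error)
```

             | The caller branches on is_ok instead of aborting, which is
             | the fine-grained handling the parent comments are asking
             | for.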
        
             | ajayvk wrote:
             | I wrote up the approach I used at
             | https://clace.io/blog/errors/, started a discussion at
             | https://news.ycombinator.com/item?id=42370488 since it
             | could apply outside of Starlark also
        
           | numbsafari wrote:
           | Doing much the same thing. Face similar issues.
           | 
           | A lot of the complaints about Starlark as a programming
           | language, and the proposed alternatives, seem to me to miss
           | out on the UX advantages of having pythonic scripting (which
           | so many folks who have taken a random "coding" class
           | understand intuitively) whereas, e.g., using a lisp or lua
           | would not. Further, having a language and runtime designed
           | for _safe_ use is absolutely critical, and trying to embed
           | another runtime (js /wasm) and manage to lock it down
           | successfully, is a much larger undertaking than I think folks
           | realize.
        
         | Rochus wrote:
         | > _a large Starlark codebase is a large Python codebase, and
         | large Python codebases are imperative, untyped, and can get
         | messy even without all the things mentioned above. Even though
         | your Starlark is pure and deterministic, it still easily ends
         | up a rats nest of sphagetti_
         | 
         | This brings it to the point. I'm still wondering why the
         | achievements of software engineering of the past fifty years,
         | like modularization and static type checking had apparently so
         | little influence on build systems. I implemented
         | https://github.com/rochus-keller/BUSY for this reason, but it
         | seems to require a greater cultural shift than most developers
         | are already willing to make.
        
           | klooney wrote:
           | The tyranny of the one-small-change use case having outsized
           | importance. Usually the build system is no one's job, which
           | means that all hurdles grow.
        
           | kccqzy wrote:
           | I think it's a cultural thing. People like to think of a
           | language for the build system as a little language that
           | somehow doesn't "deserve" a type system. And even if they do
           | think a type system is necessary, they think such a language
           | doesn't "deserve" a complicated type system (say Java-like
           | with subtyping and generics) which makes that type system
           | less useful.
           | 
           | I'm curious, what kind of type system does BUSY use?
        
             | Rochus wrote:
             | It's a rather traditional type system; the specification is
             | here: http://software.rochus-keller.ch/busy_spec.html. The
             | main advantage is the combination of modularization, types
             | and formal declarations, so that if you make a change in a
             | large build (such as my
             | https://github.com/rochus-keller/LeanQt or
             | https://github.com/rochus-keller/LeanCreator systems),
             | incompatibilities are immediately found by the compiler.
             | Without these features you can never be sure whether all
             | effects were checked.
        
               | maartenh wrote:
               | Which tool did you use to create that busy_spec.html
               | file? They remind me of Engelbart's blue numbering system
               | for documents, if I remember the name correctly.
        
               | Rochus wrote:
               | It's https://github.com/rochus-keller/crossline/, a tool
               | which I implemented and used for many years in my
               | projects. It's inspired by Netmanage Ecco and implements
               | features which can also be found in Ted Nelson's Xanadu
               | or in Ivar Jacobson's Objectory.
        
             | kamma4434 wrote:
             | I wonder why people love to create languages to be embedded in
             | applications when there are plenty of languages that are
             | already useful and well known.
             | 
             | So in the end, you have to fight the half-assed small
             | language that was created and have to find a way to connect
             | to some real language to get things done.
        
               | grumpyprole wrote:
               | Small languages, if they are suitably constrained, offer
               | far more reasoning power and optimisation potential. This
               | is why we need _more_ small languages, not less. Python
               | aims for maximum flexibility and maximum ease of use.
               | This comes with real and serious trade offs. Python
               | programs are very very difficult to reason about, for
               | both people and machines.
               | 
               | A textbook example for you is (proper) regular
               | expressions. This little language guarantees O(n)
               | matching. The Python and Perl communities added
               | backtracking without truly understanding _why_
               | backtracking was missing in the first place. Now their
               | misnamed "regular expressions" cause security issues for
               | their users.
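               | 
               | A hedged illustration of the cost: Python's re module is
               | a backtracking engine, and a pattern with nested
               | quantifiers such as (a+)+$ is a commonly cited worst
               | case. The probe function is a made-up helper, and the
               | input sizes are kept small on purpose:

```python
import re
import time

# Nested quantifiers give a backtracking engine many ways to split a run of
# 'a's, all of which it tries before concluding the trailing 'b' rules out
# a match.
PATTERN = re.compile(r"(a+)+$")


def probe(n):
    """Match PATTERN against n 'a's followed by 'b'; return (matched, seconds)."""
    text = "a" * n + "b"
    start = time.perf_counter()
    matched = PATTERN.match(text) is not None
    return matched, time.perf_counter() - start


for n in (12, 14, 16, 18):
    matched, seconds = probe(n)
    print(n, matched, round(seconds, 4))
```

               | Timings depend on the machine; the point is how quickly
               | they grow as n increases, whereas an automaton-based
               | matcher would stay linear in the input length.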
        
               | AlotOfReading wrote:
               | Even Thompson didn't use the linear time algorithm that's
               | named after him in Ed and Grep. The Python and Perl
               | implementations were inspired by Henry Spencer's _regex_,
               | which was in turn reimplementing Thompson's
               | backtracking implementations.
        
             | hamandcheese wrote:
             | I don't think it's purely cultural. Starlark is
             | interpreted, which presents some challenges to type
             | checking. You either need to make the interpreter more
             | complex, or have an out-of-band type checking step.
        
           | skybrian wrote:
           | Build systems are sort of like type expressions, templates,
           | or constant expressions in a programming language. Either a
           | program compiles or it doesn't. What might happen when you
           | change the code in some unlikely way isn't immediately
           | relevant to whether the program works now, so it's easy to
           | skimp on that kind of checking until things get out of hand
           | due to greater scale.
           | 
           | Also, in Starlark, any runtime check you write is a build-
           | time check and calling _fail_ reports a build-time error,
           | which is good enough for users, but not for understanding how
           | to use Starlark functions within Starlark code.
        
             | Rochus wrote:
             | There is also the need to understand a build and to
             | navigate a build system. Try e.g. to understand how the
             | Chromium build works, and which options are enabled in
             | which case. I even built a tool (see
             | https://github.com/rochus-keller/GnTools) to analyze it
             | (and some other large GN projects) but even so reached the
             | limits of a dynamic specification language pretty quickly.
             | This won't happen in BUSY.
        
               | joshuamorton wrote:
               | Yes, gn is less good than bazel for a variety of reasons,
               | not the least of which is tooling like `blaze query
               | --output=build` and the more restricted evaluation model
               | in starlark which is easier to evaluate.
               | 
               | Since starlark and bazel restrict the amount of "weird"
               | things you can do, type-inference is pretty
               | straightforward (moreso than in regular python), since
               | almost everything is either a struct or a basic type and
               | there isn't any of the common magic.
        
           | mike_hearn wrote:
           | It did have influence. Take a look at Gradle, which is widely
           | used in the JVM space. It uses a general, strongly typed
           | language (Kotlin) to configure it and it has a very
           | sophisticated plugin and modules system for the build system
           | itself, not just for the apps it's building.
           | 
           | Gradle has its problems, and I often curse it for various
           | reasons, but I'm pretty glad it uses regular languages that I
           | can reuse in non-build system contexts. And the fact that it
           | just bites the bullet and treats build systems as special
           | programs with all the same support that it gives to the
           | programs it's building does have its advantages, even if the
           | results can get quite complex.
        
             | Rochus wrote:
             | Interesting. Using a regular language has some advantages,
             | but also many disadvantages. One of the intentions of BUSY
             | was - similar to e.g. Meson - to avoid a fully Turing
             | complete language, because then people start to implement
             | complex things, thus leaving the declarative character of a
             | build specification, which again makes the build more
             | difficult to understand and maintain.
        
               | mike_hearn wrote:
               | The basic assumption behind Gradle, I think, is that
               | people usually implement complex things in build systems
               | because their needs are genuinely complex. Build systems
               | are at heart parallel task execution and caching engines
               | that auto-generate a CLI based on script-like programs,
               | and that's a very useful thing. No surprise people use
               | them to automate all kinds of things. You can lean into
               | that or you can try to stop people using them in that
               | way. Gradle leans in to it and then tries to make the
               | resulting mess somewhat optimizable and tractable.
               | 
               | You can of course get people who are just bad at software
               | and make things over-complex for no reason, but if you
               | have such people on a team then the actual software
               | you're building will be your primary problem, not your
               | build system.
        
         | marssaxman wrote:
         | > a "hermetic" subset of Python
         | 
         | That's funny: I've been using bazel (and previously blaze) for
         | well over a decade, but it has never once occurred to me to
         | think of starlark as having anything at all to do with Python!
         | I can't see anything about it which is distinctively pythonic.
        
           | IshKebab wrote:
           | Erm... the syntax?
        
             | marssaxman wrote:
             | I'm curious - what about it strikes you that way?
             | 
             | To my eyes, starlark bears more resemblance to YAML, or
             | TOML, or any other generic configuration language, than to
             | Python.
        
               | laurentlb wrote:
               | You've probably looked only at the Bazel BUILD files.
               | They are indeed quite declarative (as the syntax is
               | restricted even more).
               | 
               | If you open other Starlark files that have functions (in
               | Bazel, that would be in .bzl files), you should recognize
               | the Python syntax (e.g. `def` statements + space
               | indentation).
        
               | IshKebab wrote:
               | Erm what? It's very very obviously based on Python. The
               | docs even explicitly say that. This is the example they
               | give:
               | 
               |     def fizz_buzz(n):
               |       """Print Fizz Buzz numbers from 1 to n."""
               |       for i in range(1, n + 1):
               |         s = ""
               |         if i % 3 == 0:
               |           s += "Fizz"
               |         if i % 5 == 0:
               |           s += "Buzz"
               |         print(s if s else i)
               | 
               |     fizz_buzz(20)
               | 
               | Does that look like YAML or TOML?
        
         | IshKebab wrote:
         | The Rust version of Starlark used in Buck2 apparently supports
         | type annotations. I've never used it though and I have no idea
         | about IDE support.
        
           | davidjfelix wrote:
           | Allegedly it has an LSP and vscode support but I also have
           | never used either.
           | 
           | https://github.com/facebook/buck2/tree/main/starlark-
           | rust/vs...
        
       | mahmoudimus wrote:
       | I love Starlark. I was a major implementor of it at VGS (the repo
       | is open: https://github.com/verygoodsecurity/starlarky). It had
       | unique distinct features that made it much easier to control and
       | sandbox than many other languages out there.
       | 
       | I even built a codemod library that does a very basic python ->
       | starlark so that one can develop using python ecosystem libraries
       | and just copy & paste into a secure execution environment. It was
       | a huge success at my last company.
       | 
       | I'm very thankful to Laurent Le-brun and Alan Donovan -- both of
       | whom are exceptional engineers that I learned so much from. I
       | thought I was skilled but both of those individuals are just on
       | another level.
        
         | adsharma wrote:
         | What is this codemod library called?
        
           | mahmoudimus wrote:
           | https://github.com/mahmoudimus/py2star
        
       | srmatto wrote:
       | https://tilt.dev/ also uses Starlark.
        
       | fabmilo wrote:
       | This is interesting as I was evaluating Starlark a few days ago.
       | The fact that it has a customizable implementation in golang and
       | a Python-like syntax makes it an interesting choice for
       | agent-generated code.
        
       | jcmfernandes wrote:
       | To those pointing out that it's dynamically typed, Meta's Rust
       | implementation - that they use in Buck2 - supports type
       | annotations.
        
       | xnacly wrote:
       | I feel like no other embedded scripting language will ever
       | surpass lua. Neovim, roblox and all my projects that needed
       | scripting support use lua; it's my first choice.
        
       | dang wrote:
       | Related:
       | 
       |  _Starlark Language_ -
       | https://news.ycombinator.com/item?id=40700549 - June 2024 (49
       | comments)
       | 
       |  _An Overview of the Starlark Language_ -
       | https://news.ycombinator.com/item?id=40573689 - June 2024 (49
       | comments)
       | 
       |  _(The) Starlark Language_ -
       | https://news.ycombinator.com/item?id=39457410 - Feb 2024 (1
       | comment)
       | 
       |  _RepoKitteh: Github workflow automation using Starlark_ -
       | https://news.ycombinator.com/item?id=26674781 - April 2021 (7
       | comments)
        
         | mdaniel wrote:
         | I would have thought for sure this was submitted due to the
         | Bazel 8 release but this thread predates that one by quite a
         | bit
         | 
         | Anyway, I guess "see also:"
         | https://news.ycombinator.com/item?id=42370744
        
       ___________________________________________________________________
       (page generated 2024-12-09 23:00 UTC)