[HN Gopher] Starlark Programming Language
___________________________________________________________________
Starlark Programming Language
Author : laurentlb
Score : 107 points
Date : 2024-12-08 17:05 UTC (1 days ago)
(HTM) web link (starlark-lang.org)
(TXT) w3m dump (starlark-lang.org)
| Rhapso wrote:
| Starlark isn't exactly "functional" but by design it is
| hermetic. You can't call any code that would have a side-effect
| (other than burning compute resources) or get data that wasn't
| part of your initial input.
|
| It could make a lot of sense as a "contract language", or as
| intended: part of a build system.
| kccqzy wrote:
| I believe Starlark allows side effects the first time it
| evaluates a file but not subsequent times. Otherwise something
| as simple as calling .append on a list won't work because that
| returns None but has an input-mutating side effect. I think
| it's a good choice.
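|
| A rough Python sketch of the freezing idea described above,
| assuming values become read-only once the defining file has
| finished evaluating. FreezableList and freeze() are hypothetical
| names made up for illustration, not taken from any Starlark
| implementation:

```python
class FreezableList:
    """Hypothetical list wrapper: mutable until freeze() is called."""

    def __init__(self, items=()):
        self._items = list(items)
        self._frozen = False

    def append(self, item):
        # Like list.append, this returns None and mutates in place.
        if self._frozen:
            raise TypeError("cannot append to a frozen list")
        self._items.append(item)

    def freeze(self):
        # Assumed to be called once the defining file has been evaluated.
        self._frozen = True

    def as_tuple(self):
        return tuple(self._items)


srcs = FreezableList(["a.c"])
srcs.append("b.c")      # allowed: the "file" is still being evaluated
srcs.freeze()

try:
    srcs.append("c.c")  # rejected: the value is now frozen
    mutated_after_freeze = True
except TypeError:
    mutated_after_freeze = False
```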
| etamponi wrote:
| Your mileage may vary, but Starlark was one of the most
| complicated languages I've ever read. No types, just layers and
| layers of indirection, with the goal of making a very complex
| build rule look "simple". I don't like hiding non-accidental
| complexity, in particular when the abstraction leaks everywhere.
| Perhaps it was only due to the Starlark I was exposed to (I
| worked at Google, probably most of the Starlark code I've read
| has been written by SREs).
| kccqzy wrote:
| What you describe is really what all large-scale Python projects
| look like. No types, just layers and layers of indirection,
| with the goal of making the interface simple and Pythonic while
| hiding implementation complexity. I don't think this is
| necessarily a fault given that this language explicitly decides
| to look like Python (and was in fact simplified from Python).
|
| The worst Starlark code I've read has been written not by SREs,
| but by the Boq team as they have a fetish for accomplishing
| complicated configuration at build time. This was one of the
| reasons I've avoided Boq: an incomplete code base that's under
| development doesn't even begin to build, which is far worse
| than building something and seeing a real compiler error.
| yodsanklai wrote:
| > No types
|
| It's pretty easy to add types to Python nowadays. I'd
| consider it bad practice not to do so in a large project.
| seanmceligot wrote:
| It would be nice if python would show types in the
| documentation. Not only do I need that all the time, it
| would show python was taking type safety seriously.
|
| For example, knowing the return type of a function is
| Union[DataFrame,Series] rather than simply DataFrame would
| save a lot of bad errors.
| zellyn wrote:
| Starlark, unfortunately, does not really support (Python
| style) types yet. Facebook's version has some kind of
| types, but ideally Starlark would just learn to do mypy
| types.
| kccqzy wrote:
| That's only true for greenfield projects where people start
| a new project with types in mind. It's absolutely a
| nightmare for old projects because, without the need to
| write types, people write all kinds of code that cannot fit
| within what's possible in python's type annotations.
| poincaredisk wrote:
| In my opinion it's a bad practice, and rewriting code to
| be typeable is a good idea for refactoring.
|
| But I've been writing Python for some time now, and I know what you
| mean. I have nightmares about codebases with dynamically
| generated class fields for example (though I heard ruby
| is even worse)
| zellyn wrote:
| What you describe is really what all programming languages,
| compilers, interpreters, etc. look like. Layers and layers of
| indirection, with the goal of making the interface simple
| while hiding implementation complexity, all the way down to
| assembly.
| dastbe wrote:
| How much of this was how bazel works vs. starlark itself?
|
| I find the starlark language is very simple (though
| inconsistent between the various implementations in bazel, go,
| and rust) but it takes a bit to understand how the magic
| between defining rules and implementations works in bazel. and
| TBH, that is also one place I've really needed auto-
| completion/static typing in starlark to understand what I
| can/cannot do.
| kevindamm wrote:
| This appears to be the same language that started as a custom
| build rule DSL for bazel (and evolving from lessons learned in
| blaze's Python extension), the page positions it as a much more
| general-purpose embedded language.
|
| I'm all for using the right tool for the job, and I'll
| acknowledge the benefits of hermeticity and parallelism in many
| contexts. But, I also think that you need to weigh the cost of
| adding another language to a project. I don't imagine myself
| using this anywhere outside of a build system, and even then I
| would first try to take the declarative rules of that build
| system as far as I can. It's probably better than shell scripts,
| though, at least most of the time? May depend on your team's
| prior familiarity with shell and/or love of Python.
|
| There are three implementations (Go, Rust, Java) which could be a
| good thing if the language doesn't change often. They would have
| to be kept in sync for any changes, as well as keep from drifting
| due to changes in the host language's semantics.
|
| I also think that the closeness in syntax with Python can be a
| disadvantage, especially if users are expecting more recent
| Python additions like the walrus operator or pattern matching to
| be available.
|
| I like to design languages, too, so these comments come from a
| place of love and understanding. I think it would help if there
| were some more specific justification (or examples) of why this
| is better than something more established (e.g. Lua or
| MicroPython) or more distinct (e.g. pure data in configs instead
| of embedded code). I do like that the language attempts to remain
| as simple as needed.
| foooorsyth wrote:
| >There are three implementations (Go, Rust, Java)
|
| There are so many people at Google just goofing around, lol.
| People just doing hobby stuff for $350k/year.
|
| There is no justification to implement this custom dialect of
| Python for a build system that drives everyone crazy 3 times.
| Reminds me of when it was revealed that 400 people were working
| on Fuchsia -- a hobby OS that only shipped on a single smart
| home device.
| ithkuil wrote:
| IIRC initially blaze was literally using python as a config
| layer. That made it too easy to write build scripts that were
| too slow and too hard to optimize and that negatively
| affected the overall build experience.
|
| An increase in edit/build/run cycle efficiency of thousands
| of employees justifies an investment of a few engineers.
|
| They could have invented a new bespoke DSL. Instead they
| chose to stick to a well known and familiar language and
| just limit its expressiveness to a subset that would be
| easier to optimize. I think that's quite reasonable
| kevindamm wrote:
| I also remember blaze directly eval'ing Python pre-Skylark,
| but I think the bigger problem was build hermeticity. This
| was compounded by env-related issues making it hard to
| build Python hermetically in the first place, and the
| importance of cached object files and build outputs in
| managing the incredibly large monorepo that is Google's
| source tree. And that's all before considering the Py2->Py3
| migration (which, fortunately, was tackled afterwards).
|
| I think a new bespoke DSL would have been a non-starter
| since so many build scripts had already been written by the
| time Skylark was being conceived.
| zellyn wrote:
| One of my hobbies while at Google (2010-2015) was to watch
| the multiple failed attempts to get rid of Borg Config
| Language and actual Python in Blaze. It took a lot of work
| until they eventually succeeded. You're probably
| underestimating the rigor, the pain, and also the value
| involved in cleaning those things up: being able to cleanly
| operate programmatically at scale in the Google monorepo is
| extremely necessary.
|
| (This is also why I don't trust configuration languages built
| by people who _didn't_ observe the years of pain. Cue and
| jsonnet are notable projects that were able to incorporate a
| lot of lessons.)
| zellyn wrote:
| Also, I think the Rust version was built as a side-project,
| then handed off to Facebook.
|
| With regards to Fuchsia... I used to think building a new
| OS from the ground up was madness. Now I think _not_ doing
| that eventually is madness. I'd be a lot happier if Zircon
| and Fuchsia were done in Rust though...
| surajrmal wrote:
| A good deal of fuchsia is done in rust (roughly 50% and
| increasing over time). People over emphasize the
| importance of zircon needing to be in rust. It's more
| important for new code to be rust than for existing code
| to be rewritten. Zircon isn't a very fast growing part of
| the entire OS.
| zellyn wrote:
| Fair. That study on writing only new things in Rust was
| surprising!
| https://security.googleblog.com/2024/09/eliminating-
| memory-s...
|
| I do think for an OS/Kernel, it's worth having everything
| in a memory-safe language, and possibly worth formal
| verification too, if the very core of it is small
| enough...
| foooorsyth wrote:
| I can appreciate the pain of actually getting things
| through in a large (100k+ people) organization.
|
| What do you consider to be the justification for three
| separate implementations of the same build config language?
| Genuine question. I am not doubting the need for the DSL
| itself.
| jsnell wrote:
| (Note that the Rust implementation is by Facebook, not by
| Google.)
|
| If you're looking to embed a scripting language in a Go
| program, having an embedded language implementation
| written in Java isn't very useful. And vice versa.
| foooorsyth wrote:
| Ah I missed that this thing is actually embedded. I
| thought they were just doing multiple implementations for
| shits and gigs.
|
| Carry on, Google
| zellyn wrote:
| Yep. Basically the same reason there are so many lua
| implementations, or WASM ones.
| surajrmal wrote:
| I'm not sure I understand what you are trying to say - do
| you think we should give up on OS diversity? Are you making
| fun of a company for actually investing in something
| ambitious which can benefit many if successful?
|
| While I'm not sure you are in the same crowd, I always think
| it's interesting that the HN hivemind tends to be upset with
| the browser monoculture, but doesn't bat an eye at OS
| monoculture. It feels like you're really just channeling
| feelings about the company rather than the projects.
| surajrmal wrote:
| Of the three, only two are Google produced. The rust
| implementation is written by Facebook for use in their build
| system. The java implementation is the original but is pretty
| tied to the bazel build system and not really very
| suitable to other uses. The go implementation is meant to be
| embeddable into a varying number of applications.
|
| An example application is https://github.com/shac-project/shac
|
| Within Google, there are a large number of similar tools which
| are written in go and harness the starlark language. While
| there are plenty of other options, I will say I think
| starlark is often a great choice.
| yodsanklai wrote:
| If I understand correctly, Starlark is a strict subset of
| Python (at least very close to Python). Can anything be more
| established than Python?
|
| Also I don't think pure data is an option. The point is that
| you want to generate data, which has many benefits (avoid
| duplication, easier to test).
|
| It seems that Starlark is a good trade-off given the
| constraints. My main gripe is that it doesn't have types.
| kevindamm wrote:
| The syntax is a pure subset but the implementations are
| bespoke, so I wouldn't equate the established position of
| Python with whether Starlark is considered established.
|
| I agree that sometimes pure data isn't an option, and I've
| had to write some Skylark to assist blaze build rules too,
| but every time it also added tech debt, reduced the number of
| people on the team who completely understood the build
| system, and wasn't convenient/practical to test the build
| extension. The problem I'm referring to above is when the use
| of a source-controlled data file would have been sufficient
| but someone had to write a build extension because it's fun
| or something new to do (or whatever reason seemed convincing
| at the time).
|
| Then there are people who think that those same data files
| shouldn't be part of the build process at all and should be
| part of system turn-up/tear-down, stored in a global config
| or DB where versioning is alongside the data instead of
| alongside the program build. Certain kinds of migration are
| made more difficult if everything is built into the binary.
| Of course, that strategy comes with its own caveats.
| phyrex wrote:
| The rust implementation has types
| paulddraper wrote:
| Starlark (originally Skylark) is the configuration/extension
| language of Google's build tool Bazel.
|
| It's a bespoke Python subset... functions but no recursion, dicts
| but no sets, etc.
|
| It's also been adopted by the latest version of Buck (Meta's
| Bazel analog).
|
| I use it daily.
|
| It's an option for a lightweight embeddable scripting language,
| with implementations in Java and Go. If you want Python
| familiarity, consider it as an option.
| nrr wrote:
| Starlark is Turing-incomplete, which makes it somewhat unique
| among embeddable languages. It's definitely a draw for me for
| something I'm working on.
| kevindamm wrote:
| The primitive-recursive property of Cue (https://cuelang.org)
| is a big draw for me, and may be an alternative worth
| checking out. The authors have paid a great deal of
| attention to the type system (they learned a lot of lessons
| from previous config language designs that did not take
| lattice theory and unification into account).
| verdverm wrote:
| The tl;dr is that inheritance is bad in config, whether it
| be from OOP or layering yaml files like Helm. The reason
| being that it is hard to understand where a value is coming
| from and where one must make an edit to correct it in high-
| stress SRE situations like downtime. Marcel worked on both
| major config languages at Google, and iirc Starlark is
| based on GCL ideas
|
| The Logic of CUE is a great read:
| https://cuelang.org/docs/concept/the-logic-of-cue/
| kccqzy wrote:
| Dhall is another configuration language that's deliberately
| Turing-incomplete. Though its Haskell-inspired syntax turns
| people off who aren't already Haskell programmers. It's based
| on calculus of constructions.
| anothername12 wrote:
| > Deterministic evaluation - Executing the same code twice will
| give the same results.
|
| What's this all about? Don't most languages?
| lann wrote:
| One specific example: many languages use randomized seeds in
| builtin dict/map types, leading to randomized iteration order.
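|
| A small Python sketch of that point. Python's set of strings is
| one such hash-seeded container (its dict keeps insertion order),
| so code that needs reproducible output usually sorts first. The
| helper name stable_items is made up for this example:

```python
def stable_items(values):
    """Return the elements in sorted order, independent of hash seeds."""
    return sorted(values)


deps = {"zlib", "openssl", "curl"}

# Iterating `deps` directly may yield a different order from one interpreter
# run to the next, because str hashes are salted per process by default.
ordered = stable_items(deps)
```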
| chubot wrote:
| Yeah, also Starlark is embedded like Lua, and doesn't come
| with batteries included like Python
|
| So that means you can control the APIs. Say, opendir()/readdir()
| in Unix can return filenames in different orders, depending on
| what the data structure in the kernel is
|
| So many programs in other languages aren't deterministic just
| because they use APIs that aren't deterministic
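|
| As an illustration in Python (not Starlark, which has no such
| API): os.listdir() is documented as returning names in arbitrary
| order, so a host program that wants deterministic results has to
| impose an order itself. list_sorted is a hypothetical helper:

```python
import os
import tempfile


def list_sorted(path):
    # os.listdir() documents its result as being in arbitrary order,
    # so sort before using the names for anything order-sensitive.
    return sorted(os.listdir(path))


with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.txt", "a.txt", "c.txt"):
        with open(os.path.join(tmp, name), "w") as f:
            f.write("x")
    names = list_sorted(tmp)
```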
| phyrex wrote:
| You don't get access to randomness or time functions or
| anything that could change the output of a function with the
| same input
| zellyn wrote:
| It can be very valuable to know that if you put the same files
| in, you get _exactly_ byte-for-byte identical artifacts out of
| your build system. Even letting your language access the actual
| current date/time can break that.
|
| (IIRC, I don't believe Bazel actually has fully deterministic
| builds yet, though.)
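|
| A toy Python sketch of why a timestamp breaks byte-for-byte
| reproducibility. build_artifact is a stand-in invented for this
| example and has nothing to do with how Bazel actually builds:

```python
import hashlib


def build_artifact(sources, timestamp=None):
    """Hypothetical 'build': concatenate sources, optionally stamping a time."""
    out = b"".join(sources[name] for name in sorted(sources))
    if timestamp is not None:
        out += b"\n# built at " + str(timestamp).encode()
    return out


def digest(data):
    return hashlib.sha256(data).hexdigest()


sources = {"main.c": b"int main(){return 0;}", "util.c": b"void f(){}"}

# Same inputs, no clock access: identical bytes, identical digest.
same_a = digest(build_artifact(sources))
same_b = digest(build_artifact(sources))

# Same inputs, but the build time leaks into the output: digests differ.
stamped_a = digest(build_artifact(sources, timestamp=1733677500))
stamped_b = digest(build_artifact(sources, timestamp=1733677501))
```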
| laurentlb wrote:
| Try this in Python:
|
| id("ab") # not deterministic
|
| hash("ab") # not deterministic
|
| def foo(): pass
|
| str(foo) # not deterministic
|
| Another sneaky one is the `is` operator in Python, where the
| Python documentation says:
|
| > Due to automatic garbage-collection, free lists, and the
| dynamic nature of descriptors, you may notice seemingly unusual
| behaviour in certain uses of the `is` operator
|
| Related to that is the `__del__` method: when exactly is it
| called?
|
| It's quite easy to get non-deterministic code in Python and in
| many languages. And of course, there are lots of non-
| deterministic functions in the standard library (Starlark
| doesn't provide them).
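|
| One way to see the hash() case for yourself, sketched in Python:
| run it in fresh interpreters. PYTHONHASHSEED is CPython's real
| environment variable; the helper function name is made up here:

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(text, seed):
    """Run hash(text) in a new Python process with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)


# With a fixed seed the value is reproducible across processes.
first = hash_in_fresh_interpreter("ab", seed=1)
second = hash_in_fresh_interpreter("ab", seed=1)
# With seed="random" (the default behaviour) it usually differs per process.
```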
| neuroelectron wrote:
| Apparently not. Wow. No wonder China is hacking into
| everything.
| lihaoyi wrote:
| Starlark is definitely a mixed experience IMO, from my 7 years
| working with it in Bazel
|
| On one hand, having a "hermetic" subset of Python is nice. You
| can be sure your Bazel starlark codebase isn't going to be making
| network calls or reading files or shelling out to subprocesses
| and all that. The fact that it is hermetic does help make things
| reproducible and deterministic, and enables parallelization and
| caching and other things. Everyone already knows Python syntax,
| and it's certainly nicer than the templated-bash-in-templated-yaml
| files common elsewhere in the build tooling/infra space
|
| On the other hand, a large Starlark codebase is a large Python
| codebase, and large Python codebases are imperative, untyped, and
| can get messy even without all the things mentioned above. Even
| though your Starlark is pure and deterministic, it still easily
| ends up a rat's nest of spaghetti. Starlark goes the extra mile to
| be non-turing-complete, but that doesn't mean it's performant or
| easy to understand. And starlark's minimalism is also a curse as
| it lacks many features that help you manage large Python
| codebases such as PEP484 type annotations, which also means IDEs
| also cannot provide much help since they rely on types to
| understand the code
|
| For https://mill-build.org we went the opposite route: not
| enforcing purity, but using a language with strong types and a
| strong functional bent to it. So far it's been working out OK,
| but it remains to be seen how well it scales to ever larger and
| more complex build setups
| kstrauser wrote:
| Python has always been strongly, dynamically typed. It isn't
| untyped.
| jerf wrote:
| The sort of "untyped" that your last sentence is referring to
| is a dead term, though. The only "untyped" language still in
| common use is assembler, and that's not commonly written by
| hand anymore (and when it is, it's primarily running on
| numbers, not complex structs and complex values). There
| aren't any extant languages anymore that just accept numbers
| in RAM and just treat them as whatever.
|
| So increasingly, this objection is meaningless, because
| nobody is using "untyped" that way anymore. The way in which
| people _do_ use the term, Python _is_ only "optionally"
| typed, and a lot of real-world Python code is "untyped".
| js2 wrote:
| I think the objection is to the conflation of strong/weak
| with dynamic/static and it being unclear exactly what
| typed/untyped means, since it can refer to either. Python
| has always been strongly typed at runtime (dynamic), vs say
| JavaScript which is relatively weakly typed at runtime.
|
| Obviously lihaoyi was referring to static/dynamic when they
| wrote untyped (as made clear by the reference to type
| annotations) but kstrauser is objecting to using the term
| "untyped" since that can be interpreted to mean weak typing
| as well, which Python is not.
|
| $0.02 anyway.
| lolinder wrote:
| Strong/weak is a meaningless dichotomy that could be
| replaced by nice/icky while conveying the same meaning.
| It just distinguishes whether I, personally, believe a
| given language has sufficient protections against dumb
| programmer errors. What counts as strong or weak depends
| entirely on who's talking. Some will say that everything
| from C on is strong, others draw the line at Java, still
| others aren't comfortable until you get to Haskell, and
| then there are some who want to go even further before
| it's truly "strong".
|
| Typed versus untyped is, on the other hand, a rigorously
| defined academic distinction, and one that very clearly
| places pre-type-hints Python in the untyped category.
| That's not a bad thing--untyped isn't inherently a
| derogatory term--but because untyped languages have
| fallen out of vogue there's a huge effort to rebrand
| them.
| poincaredisk wrote:
| ...but Python is obviously typed. It has types. In fact
| everything has a type, and even the types are of "type"
| type. It has type errors. Saying it's "untyped" invokes a
| wrong impression. Your usage is very non-standard in
| programmer circles.
|
| What's wrong with universally understood and well defined
| concepts of "statically" and "dynamically" typed
| languages?
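|
| The "types are of 'type' type" remark can be checked directly in
| a Python interpreter. A minimal sketch:

```python
values = [1, "a", [1], int]
type_names = [type(v).__name__ for v in values]

# Classes are objects too, and their type is `type`, including `type` itself.
int_is_a_type = type(int) is type
type_of_type = type(type) is type
```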
| lolinder wrote:
| As I said in another comment [0], it depends on what
| definition of types we're using. But if we're going to
| pedantically jump down someone's throat correcting their
| usage (in this case OP's usage of "untyped"), we should
| at least use the most pedantically correct definition,
| which is the one used by academics who study type systems
| and which pointedly excludes dynamic checks.
|
| I have no problem with people using the other terminology
| in casual usage--I do so myself more often than not. I do
| have a problem with people pedantically correcting usage
| that is actually _more_ correct than their preferred
| usage. I dislike pedantry in general, but I especially
| dislike incorrect pedantry.
|
| [0] https://news.ycombinator.com/item?id=42367659
| jyounker wrote:
| Strong/weak typing is a very specific thing. It refers to
| the ability to create invalid types within a language. In
| strongly typed languages it is hard to defeat the type
| system. In weakly typed languages it is easy to defeat
| the type system.
|
| Python is strongly typed (hard to escape the bounds of
| the type system) but (traditionally) dynamically typed
| (types are checked at runtime).
|
| C is weakly typed (easy to escape the type system), but
| statically typed (types are checked at compile time).
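|
| A short Python sketch of that combination, under the definitions
| used in this comment: no implicit str/int coercion ("strong"),
| but the mismatch only surfaces when the line runs ("dynamic").
| add_label is a made-up function for illustration:

```python
def add_label(count):
    # No implicit str/int coercion: this raises TypeError when executed.
    return "items: " + count


# Defining the function succeeds; nothing is checked until it is called.
try:
    add_label(3)
    outcome = "no error"
except TypeError:
    outcome = "TypeError at run time"

# An explicit conversion is required instead.
fixed = "items: " + str(3)
```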
| lolinder wrote:
| That is a possible definition for strongly typed, yes. It
| is not widespread or generally agreed upon--you'll see
| plenty of people use them in ways that contradict your
| definitions, and you won't see any serious work
| attempting to define them at all. Even Wikipedia doesn't
| [0]:
|
| > However, there is no precise technical definition of
| what the terms mean and different authors disagree about
| the implied meaning of the terms and the relative
| rankings of the "strength" of the type systems of
| mainstream programming languages. For this reason,
| writers who wish to write unambiguously about type
| systems often eschew the terms "strong typing" and "weak
| typing" in favor of specific expressions such as "type
| safety".
|
| [0]
| https://en.m.wikipedia.org/wiki/Strong_and_weak_typing
| samatman wrote:
| Untyped computation in the academic sense you refer to
| _is_ untyped in the sense of Forth and assembler. The
| untyped lambda calculus doesn't even have numbers.
| Pragmatically, a language in which type errors occur is a
| typed language.
|
| Nor does it make sense to conflate "typed and untyped"
| with "statically typed and dynamically typed". These are
| simply very different things. Julia is an example of a
| dynamically typed language with a quite sophisticated
| type system and pervasive use of type annotations, it
| would be insane to call it untyped. Typescript is an
| example of a dynamic language which is nonetheless
| statically typed: because type errors in Typescript
| prevent the program from compiling, they're part of the
| static analysis of the program, not part of its dynamic
| runtime.
|
| The fact that it's uncommon to use untyped languages now
| is not a good reason to start describing certain type
| systems as 'untyped'! A good term for a language like
| annotation-free Python is unityped: it definitely has a
| (dynamic) type system, but the type of all variables and
| parameters is "any". Using this term involves typing one
| extra letter, and the payoff is you get to make a correct
| statement rather than one which is wrong. I think that's
| a worthwhile tradeoff.
| lolinder wrote:
| From Benjamin Pierce's _Types and Programming Languages_,
| which is basically the definitive work on types:
|
| > A type system is a tractable syntactic method for
| proving the absence of certain program behaviors by
| classifying phrases according to the kinds of values they
| compute.
|
| And later on:
|
| > A type system can be regarded as calculating a kind of
| static approximation to the runtime behaviors of the
| terms in a program. ... Terms like "dynamically typed"
| are arguably misnomers and should probably be replaced by
| "dynamically checked," but the usage is standard.
|
| The definitions you're using are the ones that he
| identifies as "arguably misnomers" but "standard". That
| is, they're fine as colloquial definitions but they are
| not the ones used in academic works. Academically
| speaking, a type system is a method of statically
| approximating the behavior of a computer program in order
| to rule out certain classes of behavior. Dynamic checks
| do not count.
|
| As I've said elsewhere, I don't have a problem with
| people using the colloquial definitions. I do have a
| problem with people actively correcting someone who's
| using the more correct academic definitions. We should
| have both sets in our lexicons and be understanding when
| someone uses one versus the other.
| js2 wrote:
| > Strong/weak is a meaningless dichotomy
|
| Strong/weak is not a dichotomy. It's a spectrum. That's
| why folks argue over where a language lands in the
| spectrum. OTOH, static (compile-time) vs dynamic (run-
| time) is a dichotomy. There's not really any in between.
| It's clear when and where typing occurs. So there's
| nothing to argue over.
|
| > Typed versus untyped is, on the other hand, a
| rigorously defined academic distinction
|
| A typed language is one that has a type system. Python
| has a type system. It's typed.
| lolinder wrote:
| Academically, no, a type system is by definition static.
| See the definition Benjamin Pierce gives in TAPL that
| I've placed in many comments in this subthread [0] and
| won't repeat here.
|
| Colloquially, yes, python has a type system. All I'm
| saying is it's unhelpful to correct someone for using the
| more correct definition rather than the colloquial one.
| Both definitions are valid, but if we're going to be
| pedantic we should at least use the academic definition
| for our pedantry.
|
| And you're correct, I should have said spectrum, but the
| point is still the same: even Wikipedia refuses to define
| "strongly" or "weakly" typed, suggesting people use
| terminology that isn't hopelessly muddled.
|
| [0] Here's one:
| https://news.ycombinator.com/item?id=42368689
| f1shy wrote:
| I could even argue that Asm is to some extent typed.
| Depends on the processor, but some cisc have operations for
| different types. But also the comment is correct: Python is
| strongly, dynamically typed.
| DougMerritt wrote:
| Lisp CPUs had type bits stored with values, but I can't
| think of any typed CPUs still in use. What are you
| thinking of?
| kaladin-jasnah wrote:
| Yes, agreed.
|
| I guess the typing would be for the size of the integer
| that you work with. For example, x86_64 assembly has
| different prefixes to indicate what part of a larger
| register you are using: 8 (lower), 8 (upper), 16 bit, 32
| bit, and 64 bit itself.
|
| There are other "typed" operations, such as branching for
| unsigned vs. signed integers (think JA vs JG), or SAR vs
| SHR (signed arithmetic shift vs. unsigned logical
| shift--one preserves the division logic of shifting for
| signed integers by repeating the MSB instead of adding
| zeroes when shifting).
|
| While I'm not too familiar with them (but have been
| meaning to learn more for years!!), SIMD instructions
| probably also have similar ideas of having different
| types for sizes of arrays.
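|
| The SAR vs SHR difference can be emulated on 8-bit values in
| Python. This is only a sketch of the semantics described above;
| shr8 and sar8 are hypothetical helper names, not real mnemonics:

```python
def shr8(value, count):
    """Logical right shift on an 8-bit value: zeroes enter from the left."""
    return (value & 0xFF) >> count


def sar8(value, count):
    """Arithmetic right shift on an 8-bit value: the sign bit is replicated."""
    value &= 0xFF
    signed = value - 0x100 if value & 0x80 else value
    return (signed >> count) & 0xFF


bits = 0xF0  # 240 unsigned, -16 as a signed 8-bit integer

logical = shr8(bits, 1)     # 0x78 == 120, i.e. 240 // 2
arithmetic = sar8(bits, 1)  # 0xF8, i.e. -8 as signed, -16 // 2
```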
| munch117 wrote:
| There's lots of programming languages still around with
| untyped elements to them. Javascript is one of them, with
| its string/number conversions and the way arrays are
| defined. Then there's all the stringly typed stuff. Make,
| CMake, Excel, TCL, bash. You're probably right that the
| original use of the term came from assembly vs. high level,
| but that objection is meaningless, because nobody is using
| "untyped" that way anymore....
|
| What makes changing the meaning of "untyped" extra
| confusing is that dynamically typed programming languages
| often have types as 1st class objects, and they get used
| all the time for practical everyday programming. Calling
| these languages "untyped" is just wrong on the face of it
| -- they're full of types.
| lolinder wrote:
| > changing the meaning of "untyped" extra confusing is
| that dynamically typed programming languages often have
| types as 1st class objects, and they get used all the
| time for practical everyday programming. Calling these
| languages "untyped" is just wrong on the face of it --
| they're full of types.
|
| Just to be clear, it's the dynamically typed languages
| that changed the meaning of untyped. OP's usage is closer
| to the original and to the current usage of the
| terminology in the study of programming languages.
|
| Types and Programming Languages, one of the best regarded
| texts on types, has this helpful explanation:
|
| > A type system can be regarded as calculating a kind of
| static approximation to the runtime behaviors of the
| terms in a program. ... Terms like "dynamically typed"
| are arguably misnomers and should probably be replaced by
| "dynamically checked," but the usage is standard.
|
| In other words both are standard, but that's because the
| meaning of "types" has changed over time from its
| original sense and when it comes to the formal study of
| programming languages we still use the original
| terminology.
| munch117 wrote:
| Just to be even clearer.
|
| In the time of the original use, there were only static
| types. Languages had very little in terms of UDT's. Even
| a struct in C was barely a type of its own. I don't
| recall the details, but there was something about struct
| member names not being local to the struct. Interpreted
| languages didn't have records or classes at all(*), and
| certainly not types as first class objects.
|
| We cannot really talk about how dynamically typed
| languages with rich type systems were originally
| labelled, back when they didn't exist at all.
|
| (*) I'm looking forward to someone pointing out an
| interesting counterexample.
| lolinder wrote:
| It depends on what definition of "type system" you're using.
| Colloquially many programmers use it to refer to any system
| that checks whether objects have specific shapes. Academics,
| on the other hand, have a very specific definition of a type
| system that excludes dynamically checked languages. From TAPL (one
| of the authoritative works on the subject):
|
| > A type system is a tractable syntactic method for proving
| the absence of certain program behaviors by classifying
| phrases according to the kinds of values they compute.
|
| And later on:
|
| > A type system can be regarded as calculating a kind of
| static approximation to the runtime behaviors of the terms in
| a program. ... Terms like "dynamically typed" are arguably
| misnomers and should probably be replaced by "dynamically
| checked," but the usage is standard.
|
| In other words, you're both correct in your definitions
| depending on who you're talking to, but if we're going to get
| pedantic (which you seem to be) OP is slightly _more_
| correct.
|
| Personally, it feels like dynamically typed language
| advocates have been getting more and more vocal about their
| language of choice being "typed" as static typing has grown
| in popularity in recent years. This seems like misdirected
| energy--static typing advocates know what they're advocating
| for and know that dynamically typed languages don't fill
| their need. You're not accomplishing much by trying to force
| them to use inclusive language.
|
| Rather than trying to push Python as a typed language it
| seems like it would be more effective to show why dynamic
| checks have value.
| kstrauser wrote:
| Here's a Stack Overflow question about it from 15 years
| ago: https://stackoverflow.com/questions/2025353/is-python-
| a-weak...
|
| It was an old discussion before then, even. It has nothing
| to do with advocacy and it's certainly not recent. It's
| about accuracy so that people stop hearing and then
| repeating the same incorrect ideas. There's no common
| definition of types by which Python is untyped, as though
| it doesn't have types at all when in fact _every_ Python
| object has a type.
| lolinder wrote:
| > There's no common definition of types by which Python
| is untyped
|
| You mean besides the one used by every programming
| languages researcher and hobbyist? Sure, you can define
| "common" to exclude them, but I would give at least some
| credence to the definitions put forward by the teams of
| people who _invent_ type theory.
|
| As I've said here and elsewhere, I have no problem with
| people casually using "dynamically typed" as a term--I do
| so as well. But there's no cause to correct someone for
| using the _more correct_ terminology.
|
| If hearing it makes you feel defensive of python, that
| implies that you perceive "untyped" as a pejorative that
| needs defending against. In that case, your efforts would
| be better spent correcting the evolving consensus that
| (statically) typed is better than they would be spent
| trying to shout people down for using the academic
| definitions of typed and untyped.
| grumpyprole wrote:
| This "strong typing" message from the Python community has
| always sounded like propaganda to me - designed to confuse
| management. Strong typing is about machine checked proofs of
| invariants, not whether you avoid a few daft built-in
| coercions.
| ithkuil wrote:
| There is a static vs dynamic distinction and strong vs weak
| typing
|
| There is also a semi humorously named "stringly typed"
| which means weakly typed in such a way that incompatible
| types are promoted to strings before being operated on.
|
| I'm not aware of any static weakly typed language, but it's
| logically possible to have one
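|
| As a rough illustration of the two axes, the sketch below uses
| plain Python. The check happens at run time (dynamic), and
| mixing a str with an int is rejected rather than silently
| coerced (strong). The helper names describe and try_add are
| made up for this example.

```python
def describe(value):
    # Every Python object carries a runtime type, even without annotations.
    return type(value).__name__


def try_add(a, b):
    # The type check happens when the addition runs, not before.
    try:
        return a + b
    except TypeError as exc:
        # str + int is rejected instead of being coerced to "11" or 2.
        return "TypeError: " + str(exc)


print(describe(1), describe("1"))
print(try_add(1, 2))
print(try_add("1", 1))
```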
| ajayvk wrote:
| I have been building an internal tools development and
| deployment platform [1]. It is built in Go, using Starlark for
| configuration and API business logic.
|
| Starlark has been great to build with. You get the readability
| of having a simple subset of python, without python's
| dependency management challenges. It is easily extensible with
| plugin APIs. Concurrent API performance is great without the
| python async challenges.
|
| One challenge wrt using Starlark as a general purpose embedded
| scripting language is that it does not support usual error
| handling features. There are no exceptions and no multi value
| return for error values, all errors result in an abort. This
| works for a config language, where a fail-fast behavior is
| good. But for a general purpose script, you need more fine
| grained error handling. Since I am using Starlark for API logic
| only, I came up with a user definable error handling behavior.
| This uses thread locals to keep track of error state [2], which
| might not work for more general purpose scripting use cases.
|
| [1] https://github.com/claceio/clace
|
| [2] https://clace.io/docs/plugins/overview/#automatic-error-
| hand...
| jcmfernandes wrote:
| This is also one of my major complaints at this point. I'm
| building a developer tool with Starlark resting at its core
| and I had already come across your work.
|
| I wish the Starlark team had addressed it at this point.
| mahmoudimus wrote:
| We solved this by introducing a Result library.
|
|     load("@.../result", result=result)
|
|     def throw(arg):
|         return 1/0
|
|     if result.Result(throw).map(arg).is_ok:
|         # proceed
|     else:
|         fail("...")
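|
| For readers wondering how such a wrapper could work: Starlark
| has no try/except, so the catching has to happen on the host
| side. The sketch below is plain Python standing in for that
| host code. It only mirrors the Result(...).map(...).is_ok shape
| shown above; the class body, the value and error fields, and
| the half helper are assumptions, not the actual library.

```python
class Result:
    """Hypothetical wrapper that turns an abort into an inspectable value."""

    def __init__(self, func):
        self.func = func
        self.is_ok = False
        self.value = None
        self.error = None

    def map(self, *args):
        # Run the wrapped function and record success or failure
        # instead of letting the error propagate to the caller.
        try:
            self.value = self.func(*args)
            self.is_ok = True
        except Exception as exc:
            self.error = exc
            self.is_ok = False
        return self


def throw(arg):
    return 1 / 0


def half(arg):
    return arg / 2


failed = Result(throw).map(1)
succeeded = Result(half).map(4)
print(failed.is_ok, type(failed.error).__name__)
print(succeeded.is_ok, succeeded.value)
```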
| ajayvk wrote:
| I wrote up the approach I used at
| https://clace.io/blog/errors/, started a discussion at
| https://news.ycombinator.com/item?id=42370488 since it
| could apply outside of Starlark also
| numbsafari wrote:
| Doing much the same thing. Face similar issues.
|
| A lot of the complaints about Starlark as a programming
| language, and the proposed alternatives, seem to me to miss
| out on the UX advantages of having pythonic scripting (which
| so many folks who have taken a random "coding" class
| understand intuitively) whereas, e.g., using a lisp or lua
| would not. Further, having a language and runtime designed
| for _safe_ use is absolutely critical, and trying to embed
| another runtime (js /wasm) and manage to lock it down
| successfully, is a much larger undertaking than I think folks
| realize.
| Rochus wrote:
| > _a large Starlark codebase is a large Python codebase, and
| large Python codebases are imperative, untyped, and can get
| messy even without all the things mentioned above. Even though
| your Starlark is pure and deterministic, it still easily ends
| up a rats nest of spaghetti_
|
| This brings it to the point. I'm still wondering why the
| achievements of software engineering of the past fifty years,
| like modularization and static type checking had apparently so
| little influence on build systems. I implemented
| https://github.com/rochus-keller/BUSY for this reason, but it
| seems to require a greater cultural shift than most developers
| are already willing to make.
| klooney wrote:
| The tyranny of the one-small-change use case having outsized
| importance. Usually the build system is no one's job, which
| means that all hurdles grow.
| kccqzy wrote:
| I think it's a cultural thing. People like to think of a
| language for the build system as a little language that
| somehow doesn't "deserve" a type system. And even if they do
| think a type system is necessary, they think such a language
| doesn't "deserve" a complicated type system (say Java-like
| with subtyping and generics) which makes that type system
| less useful.
|
| I'm curious, what kind of type system does BUSY use?
| Rochus wrote:
| It's a rather traditional type system; the specification is
| here: http://software.rochus-keller.ch/busy_spec.html. The
| main advantage is the combination of modularization, types
| and formal declarations, so that if you make a change in a
| large build (such as my https://github.com/rochus-keller/LeanQt
| or https://github.com/rochus-keller/LeanCreator systems)
| incompatibilities are
| immediately found by the compiler. Without these features
| you can never be sure whether all effects were checked.
| maartenh wrote:
| Which tool did you use to create that busy_spec.html
| file? They remind me of Engelbart's blue numbering system
| for documents, if I remember the name correctly.
| Rochus wrote:
| It's https://github.com/rochus-keller/crossline/, a tool
| which I implemented and used for many years in my
| projects. It's inspired by Netmanage Ecco and implements
| features which can also be found in Ted Nelson's Xanadu
| or in Ivar Jacobson's Objectory.
| kamma4434 wrote:
| I wonder why people love to create languages to be embedded in
| applications when there are plenty of languages that are
| already useful and well known.
|
| So in the end, you have to fight the half-assed small
| language that was created and have to find a way to connect
| to some real language to get things done.
| grumpyprole wrote:
| Small languages, if they are suitably constrained, offer
| far more reasoning power and optimisation potential. This
| is why we need _more_ small languages, not less. Python
| aims for maximum flexibility and maximum ease of use.
| This comes with real and serious trade offs. Python
| programs are very very difficult to reason about, for
| both people and machines.
|
| A textbook example for you is (proper) regular
| expressions. This little language guarantees O(n)
| matching. The Python and Perl communities added
| backtracking without truly understanding _why_
| backtracking was missing in the first place. Now their
| misnamed "regular expressions" cause security issues for
| their users.
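|
| To make the O(n) claim concrete, here is a minimal Python
| sketch of state-set simulation, the technique usually credited
| to Thompson. It reads each input character once and never
| backtracks. The transition table is hand-written for the
| textbook pattern (a|b)*abb; there is no regex parser here, and
| the names nfa_match and ABB are invented for the example.

```python
def nfa_match(transitions, start, accepting, text):
    """Simulate an NFA by tracking the set of all states it could be in."""
    current = {start}
    for ch in text:
        # One pass over the input; the set never exceeds the NFA's size.
        current = {
            nxt
            for state in current
            for nxt in transitions.get((state, ch), ())
        }
        if not current:
            return False
    return bool(current & accepting)


# Hand-built NFA for (a|b)*abb: state 0 loops, states 1-3 track "abb".
ABB = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}

print(nfa_match(ABB, 0, {3}, "ababb"))
print(nfa_match(ABB, 0, {3}, "abab"))
```

| The work per character is bounded by the number of NFA states,
| so the total cost grows linearly with the input length for a
| fixed pattern.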
| AlotOfReading wrote:
| Even Thompson didn't use the linear time algorithm that's
| named after him in Ed and Grep. The Python and Perl
| implementations were inspired by Henry Spencer's _regex_
| , which was in turn reimplementing Thompson's
| backtracking implementations.
| hamandcheese wrote:
| I don't think it's purely cultural. Starlark is
| interpreted, which presents some challenges to type
| checking. You either need to make the interpreter more
| complex, or have an out-of-band type checking step.
| skybrian wrote:
| Build systems are sort of like type expressions, templates,
| or constant expressions in a programming language. Either a
| program compiles or it doesn't. What might happen when you
| change the code in some unlikely way isn't immediately
| relevant to whether the program works now, so it's easy to
| skimp on that kind of checking until things get out of hand
| due to greater scale.
|
| Also, in Starlark, any runtime check you write is a build-
| time check and calling _fail_ reports a build-time error,
| which is good enough for users, but not for understanding how
| to use Starlark functions within Starlark code.
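|
| A small sketch of what such a runtime check looks like,
| written in plain Python so it can be run anywhere. The fail
| function below is a stand-in for Starlark's built-in fail, and
| checked_library is a made-up macro, not a real Bazel rule.

```python
def fail(msg):
    # Stand-in for Starlark's built-in fail(), which aborts evaluation.
    raise RuntimeError(msg)


def checked_library(name, srcs, deps=None):
    # These checks only run when the macro is evaluated, which for a
    # build file means at build time, once per call site.
    if type(name) != type(""):
        fail("name must be a string, got %s" % type(name).__name__)
    if type(srcs) != type([]):
        fail("%s: srcs must be a list, got %s" % (name, type(srcs).__name__))
    for src in srcs:
        if not src.endswith(".c"):
            fail("%s: unsupported source file %r" % (name, src))
    return {"name": name, "srcs": srcs, "deps": deps or []}


target = checked_library("hello", ["hello.c"])
print(target)
```

| Nothing in the signature tells a caller that srcs must be a
| list of .c files; they find out only when the check fires.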
| Rochus wrote:
| There is also the need to understand a build and to
| navigate a build system. Try e.g. to understand how the
| Chromium build works, and which options are enabled in
| which case. I even built a tool (see
| https://github.com/rochus-keller/GnTools) to analyze it
| (and some other large GN projects) but even so reached the
| limits of a dynamic specification language pretty quickly.
| This won't happen in BUSY.
| joshuamorton wrote:
| Yes, gn is less good than bazel for a variety of reasons,
| not the least of which is tooling like `blaze query
| --output=build` and the more restricted evaluation model
| in starlark which is easier to evaluate.
|
| Since starlark and bazel restrict the amount of "weird"
| things you can do, type-inference is pretty
| straightforward (moreso than in regular python), since
| almost everything is either a struct or a basic type and
| there isn't any of the common magic.
| mike_hearn wrote:
| It did have influence. Take a look at Gradle, which is widely
| used in the JVM space. It uses a general, strongly typed
| language (Kotlin) to configure it and it has a very
| sophisticated plugin and modules system for the build system
| itself, not just for the apps it's building.
|
| Gradle has its problems, and I often curse it for various
| reasons, but I'm pretty glad it uses regular languages that I
| can reuse in non-build system contexts. And the fact that it
| just bites the bullet and treats build systems as special
| programs with all the same support that it gives to the
| programs it's building does have its advantages, even if the
| results can get quite complex.
| Rochus wrote:
| Interesting. Using a regular language has some advantages,
| but also many disadvantages. One of the intentions of BUSY
| was - similar to e.g. Meson - to avoid a fully Turing
| complete language, because then people start to implement
| complex things, thus leaving the declarative character of a
| build specification, which again makes the build more
| difficult to understand and maintain.
| mike_hearn wrote:
| The basic assumption behind Gradle, I think, is that
| people usually implement complex things in build systems
| because their needs are genuinely complex. Build systems
| are at heart parallel task execution and caching engines
| that auto-generate a CLI based on script-like programs,
| and that's a very useful thing. No surprise people use
| them to automate all kinds of things. You can lean into
| that or you can try to stop people using them in that
| way. Gradle leans in to it and then tries to make the
| resulting mess somewhat optimizable and tractable.
|
| You can of course get people who are just bad at software
| and make things over-complex for no reason, but if you
| have such people on a team then the actual software
| you're building will be your primary problem, not your
| build system.
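|
| The "task execution and caching engine" view can be sketched
| in a few lines of Python. This is a toy, not how Gradle or any
| real build system is implemented: it runs tasks sequentially
| (no parallelism), keeps the cache in memory, and the TinyBuild
| class and its method names are invented for illustration.

```python
import hashlib


class TinyBuild:
    def __init__(self):
        self.tasks = {}     # task name -> (dependency names, action)
        self.cache = {}     # cache key -> output
        self.executed = []  # names of tasks whose action actually ran

    def task(self, name, deps, action):
        self.tasks[name] = (tuple(deps), action)

    def build(self, name, inputs):
        deps, action = self.tasks[name]
        dep_outputs = [self.build(dep, inputs) for dep in deps]
        # The key covers the task's own input and its dependencies'
        # outputs, so a change anywhere upstream invalidates it.
        raw = repr((name, inputs.get(name), dep_outputs))
        key = hashlib.sha256(raw.encode()).hexdigest()
        if key not in self.cache:
            self.cache[key] = action(inputs.get(name), dep_outputs)
            self.executed.append(name)
        return self.cache[key]


b = TinyBuild()
b.task("lib", [], lambda src, deps: "obj(" + src + ")")
b.task("app", ["lib"], lambda src, deps: "bin(" + src + "+" + "+".join(deps) + ")")

sources = {"lib": "lib.c v1", "app": "main.c v1"}
first = b.build("app", sources)
b.build("app", sources)        # nothing changed: served from cache
sources["app"] = "main.c v2"
b.build("app", sources)        # only "app" is rebuilt
print(first)
print(b.executed)
```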
| marssaxman wrote:
| > a "hermetic" subset of Python
|
| That's funny: I've been using bazel (and previously blaze) for
| well over a decade, but it has never once occurred to me to
| think of starlark as having anything at all to do with Python!
| I can't see anything about it which is distinctively pythonic.
| IshKebab wrote:
| Erm... the syntax?
| marssaxman wrote:
| I'm curious - what about it strikes you that way?
|
| To my eyes, starlark bears more resemblance to YAML, or
| TOML, or any other generic configuration language, than to
| Python.
| laurentlb wrote:
| You've probably looked only at the Bazel BUILD files.
| They are indeed quite declarative (as the syntax is
| restricted even more).
|
| If you open other Starlark files that have functions (in
| Bazel, that would be in .bzl files), you should recognize
| the Python syntax (e.g. `def` statements + space
| indentation).
| IshKebab wrote:
| Erm what? It's very very obviously based on Python. The
| docs even explicitly say that. This is the example they
| give:
|
|     def fizz_buzz(n):
|         """Print Fizz Buzz numbers from 1 to n."""
|         for i in range(1, n + 1):
|             s = ""
|             if i % 3 == 0:
|                 s += "Fizz"
|             if i % 5 == 0:
|                 s += "Buzz"
|             print(s if s else i)
|
|     fizz_buzz(20)
|
| Does that look like YAML or TOML?
| IshKebab wrote:
| The Rust version of Starlark used in Buck2 apparently supports
| type annotations. I've never used it though and I have no idea
| about IDE support.
| davidjfelix wrote:
| Allegedly it has an LSP and vscode support but I also have
| never used either.
|
| https://github.com/facebook/buck2/tree/main/starlark-
| rust/vs...
| mahmoudimus wrote:
| I love Starlark. I was a major implementor of it at VGS (the repo
| is open: https://github.com/verygoodsecurity/starlarky). It had
| unique distinct features that made it much easier to control and
| sandbox than many other languages out there.
|
| I even built a codemod library that does a very basic python ->
| starlark so that one can develop using python ecosystem libraries
| and just copy & paste into a secure execution environment. It was
| a huge success at my last company.
|
| I'm very thankful to Laurent Le-brun and Alan Donovan -- both of
| whom are exceptional engineers that I learned so much from. I
| thought I was skilled but both of those individuals are just on
| another level.
| adsharma wrote:
| What is this codemod library called?
| mahmoudimus wrote:
| https://github.com/mahmoudimus/py2star
| srmatto wrote:
| https://tilt.dev/ also uses Starlark.
| fabmilo wrote:
| This is interesting as I was evaluating Starlark a few days ago.
| The fact that it has a customizable implementation in Go and a
| Python-like syntax makes it an interesting choice for
| agent-generated code.
| jcmfernandes wrote:
| To those pointing out that it's dynamically typed, Meta's Rust
| implementation - that they use in Buck2 - supports type
| annotations.
| xnacly wrote:
| I feel like no other embedded scripting language will ever
| surpass lua. Neovim, roblox and all my projects that needed
| scripting support use lua, it's my first choice.
| dang wrote:
| Related:
|
| _Starlark Language_ -
| https://news.ycombinator.com/item?id=40700549 - June 2024 (49
| comments)
|
| _An Overview of the Starlark Language_ -
| https://news.ycombinator.com/item?id=40573689 - June 2024 (49
| comments)
|
| _(The) Starlark Language_ -
| https://news.ycombinator.com/item?id=39457410 - Feb 2024 (1
| comment)
|
| _RepoKitteh: Github workflow automation using Starlark_ -
| https://news.ycombinator.com/item?id=26674781 - April 2021 (7
| comments)
| mdaniel wrote:
| I would have thought for sure this was submitted due to the
| Bazel 8 release but this thread predates that one by quite a
| bit
|
| Anyway, I guess "see also:"
| https://news.ycombinator.com/item?id=42370744
___________________________________________________________________
(page generated 2024-12-09 23:00 UTC)