[HN Gopher] A Review of the Odin Programming Language
___________________________________________________________________
A Review of the Odin Programming Language
Author : gingerBill
Score : 93 points
Date : 2022-09-11 13:34 UTC (9 hours ago)
(HTM) web link (graphitemaster.github.io)
(TXT) w3m dump (graphitemaster.github.io)
| michannne wrote:
| > This means valgrind cannot really detect memory issues as well
| since Odin has its own internal allocators. helgrind cannot work
| because Odin doesn't use pthread primitives. ltrace cannot work
| because that just provides wrappers over every libc function, etc
|
| Unfortunately, I've gone through so many bugs that had to be
| fixed through valgrind debugging that I'm unwilling to part with
| it.
| gingerBill wrote:
| The Odin team are working to put valgrind, helgrind, callgrind,
| memcheck, etc. into the Odin core library to provide better
| debugging tools! Along with many other debugging tools.
|
| https://github.com/odin-lang/Odin/tree/master/core/sys/valgr...
| verdagon wrote:
| One thing I like about Odin is that you can use any allocator
| with any existing function, without wiring that function to
| specifically do that. It just comes for free. In a way, it
| completely decouples the code from the allocator choice. It's
| especially nice because we can use any allocator with any third
| party code.
|
| I think this kind of decoupling is a big step in the right
| direction. The more concerns we can decouple from each other, the
| simpler and more flexible our codebases become and the less time
| we have to spend infectiously refactoring.
| michaelwww wrote:
| I was wondering why the author would think me, the reader, might
| get very angry about something he wrote, but then I saw this from
| the creator of Zig:
|
| _I see a lot of toxic Rust vs Zig discourse on Twitter right
| now..._
|
| https://twitter.com/andy_kelley/status/1568679389113757698
|
| There seem to be a lot of heated passions about C replacement
| languages right now. The Zig v Rust issue seems to boil down to
| the Rust community putting a lot of effort into making memory
| safety a priority for everyone in the industry and not wanting
| new languages to backslide into C's "unsafety".
| bigbillheck wrote:
| It wasn't helped by Ziglang's VP of Community Loris Cro using
| the term 'full-time safety coomer'.
| wchar_t wrote:
| He apologized:
| https://twitter.com/croloris/status/1568704125826748416
| turminal wrote:
| > the Rust community putting a lot of effort into making memory
| safety a priority for everyone in the industry
|
| It would help if they did that in a manner that resembled
| harassment a bit less. I realize most of the community means
| well but by now it should really be clear that they picked the
| wrong way.
| staticassertion wrote:
| Never seen this.
| adamdusty wrote:
| Rust community harassment? Must have missed when rust
| developers harassed the maintainer of actix so bad he quit
| and temporarily deleted the project repo.
|
| https://news.ycombinator.com/item?id=22073908
| staticassertion wrote:
| Nope, didn't miss that, have been a part of rust since
| 2014. I don't consider any of that harassment at all, let
| alone harassment with regards to other languages. I'm not
| going to rehash the discussion as this happened years
| ago, geofft summarizes things pretty well in that thread,
| but this has all been discussed to death anyways. And,
| again, that's still not an example of Rust users
| "harassing" anyone for using other languages.
| lifthrasiir wrote:
| An unfortunate incident, but was that repeated? To my
| knowledge the community has significantly calmed down
| after that incident, exactly because it was an
| unprecedented drama and many agreed that the drama is
| bad.
| verdagon wrote:
| I suspect a lot of the toxic discourse originally comes from
| folks not understanding each other's use cases and priorities.
|
| For some, memory safety is a means to an end: delivering value.
| In this perspective, one must weigh the benefits of improving
| memory safety against the complexity costs of proving memory
| safety. In some situations, the improvement is too small and
| the added complexity is too much. Some apps and some embedded
| situations come to mind. Languages like Odin and Zig can be
| stellar in these situations.
|
| For others, memory safety is a responsibility, and upholding it
| is a basic requirement of modern software no matter what costs
| we need to pay for it, and if the world would just accept that,
| then we as a society could move past the days of rampant
| vulnerabilities and odd memory bugs.
|
| Both sides are equally compelling, to me at least. What I hope
| people can learn is that it really depends on the situation.
| There is a place and time for both approaches. Once we can
| accept that, I think the toxicity will dissipate.
| ArrayBoundCheck wrote:
| 1. The 2 most toxic communities are the C/C++ community (users
| heavily overlap and one is a superset so I'll consider them
| the same) and rust. The rust community is more obsessed about
| undefined behavior than the C++ community and rust doesn't
| even have UB
|
| 2. Rust is a piece of crap language. We had memory safety in
| java and C#. Rust doesn't have reasonable compile times,
| doesn't have a reasonable standard library (I'm not trusting
| cargo/npm/whatever) and some of the time rust is somehow
| slower than C# and Java. It's a huge fuckup yet the community
| turns a blind eye
| staticassertion wrote:
| This is exactly the way I see it. I view the bar for
| developer responsibility to be much higher than some others
| do. I suspect it's because I come from the world of
| information security where we deal very directly with the
| consequences of _not_ having a higher bar.
|
| To me, the idea that a new language would be anything other
| than entirely memory safe, unless it is very domain specific
| to areas where that's not important, is just another example
| of developers lowering the bar. And I'm fine calling that out
| and being called a zealot or whatever.
| verdagon wrote:
| You didn't explicitly say this, but for anyone who might
| misinterpret your comment to imply that any memory-unsafe
| language should be specifically targeted at certain small domains:
|
| A vast swath of the programming world doesn't need the
| level of memory safety that Rust has: apps and webapps are
| generally sandboxed and only talk to a trusted first-party
| domain, and games don't need memory safety if they're
| single player, or even multiplayer co-op against AI. There
| are a lot of aspects of the industry like this.
|
| We do need memory safety in any case that receives
| untrusted input (for security reasons), cases that handle
| multiple users' data (for privacy reasons), and safety
| critical software. There are plenty more cases like these
| too. Languages like Rust (or even more memory-safe
| languages) are a stellar choice for this side of the
| programming world, but not necessarily the other side.
|
| Creating simple languages to serve the purposes best served
| by simple languages is a good thing, and I celebrate and
| applaud Odin and Zig for that. It's up to the individual
| developer to make a responsible choice based on their
| situation, and any developer that uses the wrong language
| for the wrong situation is indeed guilty of lowering the
| bar.
| staticassertion wrote:
| Yeah, to be clear, some domains are different. Gaming is
| one example of a domain where "best effort" memory safety
| is perfectly acceptable. Would I _prefer_ full memory
| safety? Sure, but we don't see games as an attack vector
| in the wild very often and there are good reasons for
| that.
|
| If someone wants to build a game in a language with
| best-effort memory safety, go for it. Similarly, CLI
| applications are often never called from an untrusted
| context and often provide semantic power that is
| equivalent to code execution anyways, so again I'd love
| to see memory safety, but I'm not going to care that
| much.
|
| But these are exceptions - you'd still want to weigh
| whatever you're getting from memory-unsafe-language-X
| against just using a memory safe language. The default
| should be memory safety. Given that we have Go, Rust,
| Java, and more, there are few situations where memory
| safety isn't an easy option. Not zero, just few.
| verdagon wrote:
| I'm not sure I'd characterize Rust as an easy enough
| option for most people/cases, or that there's just a
| "few" situations, but apart from that I like your
| perspective, well said.
| lifthrasiir wrote:
| I think there is some similarity between the information
| security and Rust communities: in each there had been an
| elephant in the room, and they were the first to actively
| acknowledge it and try to get rid of it. Since it was long
| neglected by others, the elimination of the elephant
| (vulnerabilities or memory bugs) can be seen as the utmost goal
| by some (but not all) of them. Others will complain, but as
| those others were mostly the ones who neglected the elephant
| from the beginning, their complaints are mixed, some
| reasonable, some not. It's clear that this situation is far
| from ideal, both for those who are aware of the elephant and
| for those who are not, but I'm not sure how to mend this gap in
| understanding.
| dang wrote:
| Related:
|
| _Odin Programming Language_ -
| https://news.ycombinator.com/item?id=30394000 - Feb 2022 (42
| comments)
|
| _The Odin Programming Language_ -
| https://news.ycombinator.com/item?id=22199942 - Jan 2020 (141
| comments)
|
| _The Odin Programming Language_ -
| https://news.ycombinator.com/item?id=20075638 - June 2019 (3
| comments)
| agluszak wrote:
| > Odin has two idiomatic ways to allocate memory. The make and
| new procedures. When destroying something created with make you
| call delete. When destroying something created with new you call
| free. This is a bit confusing if you come from C++ where new is
| paired with delete.
|
| What's the difference between the make and new then?
|
| > There's no need for a build system, nor to explicitly call the
| linker. The compiler does it all.
|
| Umm, so there _is_ a build system, but it's just integrated into
| the compiler? What is the benefit here? Rust has an excellent
| build system out of the box (cargo), but it's still separate from
| the compiler itself.
| hsn915 wrote:
| The distinction between make and new is similar to Go.
|
| new allocates memory for the given type.
|
| make allocates memory referenced by the type.
|
| For example, a slice is defined as:
|
|     slice :: struct {
|         data:   rawptr,
|         length: int,
|     }
|
| When you make a slice, you don't allocate space for the struct.
| You allocate space for the data to be referenced by the struct.
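
The header-versus-backing-data distinction above can be modelled in a few lines. This is a toy model in Python, not Odin or Go semantics: `SliceHeader`, `new_header`, and `make_slice` are hypothetical names that mirror the two-field struct quoted in the comment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SliceHeader:
    """Toy stand-in for the two-field slice struct quoted above."""
    data: Optional[bytearray]  # reference to backing storage, or None
    length: int


def new_header() -> SliceHeader:
    # "new"-style: storage for the header itself, zero-valued,
    # with no backing data behind it.
    return SliceHeader(data=None, length=0)


def make_slice(n: int) -> SliceHeader:
    # "make"-style: allocate the storage the header refers to,
    # then return a header describing it.
    return SliceHeader(data=bytearray(n), length=n)


empty = new_header()
buf = make_slice(4)
```

In this model, releasing what `make_slice` produced means releasing `buf.data`, while releasing what `new_header` produced means releasing the header itself, which is one way to read the separate delete/free pairing quoted earlier in the thread.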
|
| > Umm, so there is a build system, but it's just integrated
| into the compiler? What is the benefit here? Rust has an
| excellent build system out of the box (cargo), but it's still
| separate from the compiler itself.
|
| What's the benefit of not having the build system integrated
| into the compiler?
|
| Ideally if you are not linking to external libraries, and all
| the code is in the language, you don't want to go through the
| typical stages of the C compiler where it first produces object
| files and then links them. You want to just produce the final
| exe directly. I don't think Odin does this - at least at this
| point - but anyway messing around with object files should be
| considered obsolete.
| agluszak wrote:
| > What's the benefit of not having the build system
| integrated into the compiler?
|
| Being able to write new/integrate with existing build systems
| (i.e. Bazel).
|
| > Ideally if you are not linking to external libraries
|
| I'm afraid that's rarely the case.
|
| > messing around with object files should be considered
| obsolete.
|
| Messing with them _yourself_, as you need to do when you use
| `make` in C/C++ (without cmake or anything more fancy) - I
| agree, it should be considered obsolete. But what if I _want_
| to mess with them because, for example, I want to add support
| for Odin to Bazel?
| vinyl7 wrote:
| There shouldn't be a buffet of build systems. How much time is
| wasted bikeshedding on different build systems, not to mention
| writing in their proprietary language/api and having to
| maintain a separate program?
|
| There doesn't need to be a variety of build systems because all
| they do is put out an executable. It's a simple thing that
| doesn't benefit from having competing products
| d_burfoot wrote:
| I decided that if I ever get a dog, I will name him Odin. That
| way, when he gets lost, I can walk around and yell
| "Ooooodddddiiiiinnnnnn!!"
| gnuvince wrote:
| Odin and Zig are the two up-and-coming languages that I keep my
| eye on. I like how they are trying to find a sweet spot where
| they offer more than plain old C, but without becoming
| overbearing like Ada, C++, or Rust.
| tialaramex wrote:
| You might want to also pay attention to Jai (or whatever
| Jonathan Blow ends up naming it)
|
| Like the author of Odin, Blow has significant experience
| writing software in a specific domain (video games), has strong
| opinions about what's wrong with existing languages, and
| decided he could do better.
|
| You can't actually use Jai yourself yet, it is as yet a closed
| beta (though you might know somebody who can get you in), but
| you can already get a flavour of it and I think it's probably
| in the sphere of things you'd be interested in judging from
| your comment.
|
| Personally I think we need to stop treating safety as optional.
| As a C programmer for about 30 years, about 15-20 years of that
| for pay, I found Rust very pleasant and would now always choose
| it over C or these C replacements - although currently I get
| paid to write Python and C# in my day job.
|
| But I'm clearly in the minority, for now at least, so I expect
| at least one of these C replacements like Odin or Zig to get
| significant adoption. Probably pays to know several of them, as
| it's far from clear which will succeed and I doubt there's room
| for all of them over the long term.
| bsaul wrote:
| I'm really surprised to see those languages emerge after having
| read so much praise about rust being a fantastic systems
| language.
|
| Although, from a personal standpoint, anytime i see rust code
| or read about horror stories fighting the compiler i wonder how
| that language gained so much popularity.
| hsn915 wrote:
| Odin emerged partly out of the "Handmade Network" which is a
| group of people interested in a style of programming that is
| very different from what is usually accepted by the rest of
| the industry as best practice.
|
| See: https://www.youtube.com/watch?v=f4ioc8-lDc0&t=4407s
| lucasyvas wrote:
| It's popular _because_ the compiler is difficult - people
| would rather suffer the pain at compile time instead of
| runtime for particular projects, so it is very well suited to
| those.
| bsaul wrote:
| there's good difficult, the one that forces you to clarify
| your point, and there's bad one: the one that makes simple
| constructs hard or impossible without going through hoops.
|
| From my understanding, rust has a little bit of both.
| doliveira wrote:
| I still feel that we're overdue for some paradigm shift in
| programming languages. There's a nebulous feeling, probably
| informed by my amateurishness, that some tasks just
| shouldn't be this hard. Seeing the whole buzz around GitHub
| Copilot, which seems to confirm that 90% of the typing we
| do is useless, makes me think that there's another level of
| "semantic abstraction" (?) we're missing.
| afranchuk wrote:
| I think an interesting distinction is this: it's not that
| 90% of the typing you do is necessarily useless, it's
| that it's been done before. Copilot is, in a sense,
| drawing from a corpus of prior work. In that way you are
| kind of using it as if you found a published library for
| your more specific use cases. So it's allowing you to
| draw on prior work without necessarily having external
| packages split up with the granularity of many variations
| of functions (which, ideally, would allow you to pull in
| exactly what you need from prior work, but would
| certainly be onerous in practice).
|
| That being said, I've never used Copilot myself, so I
| can't speak very confidently about it. But from what I've
| seen, it kind of allows you to incorporate every library
| that's ever been made open source into your project, but
| in a more granular fashion. Which naturally would save
| you some typing :)
|
| P.S. I realize Copilot isn't necessarily copying other
| code verbatim, though I assume pretty often it basically
| ends up doing that, at least in pieces.
| crdrost wrote:
| This is an interesting take. One does need to draw a
| distinction between "what everybody writes" (copilot
| generates) and "what has to be written" (this code is the
| 'useless' stuff bemoaned by the parent comment).
| Obviously everybody writes what has to be written, but
| you're right that there is a distinction.
|
| One gets a vision of the copilot autocomplete, as kind of
| missing what would be an essential highlight, in the
| templates it provides. "These parts highlighted in red,
| do not change those, for some reason everybody does it
| that way. But these parts highlighted in yellow, I've
| seen a bunch of different takes on those; this is where
| some variation occurs and where you might want to
| customize it yourself."
|
| On this take, copilot is a poor way of providing
| syntactic templates--macros--and indeed those macros take
| the form of template substitution, which means they form
| in theory a lambda calculus for that metalanguage.
|
| That is itself interesting because I only know of one
| programming language which says "we are going to live out
| here in la-la-functional-programming-math-land, but we
| are going to describe values which are actual fragments
| of computer programs," and that language is Haskell--
| though never used to write at this scale! Interesting to
| think that the Ur-goal of Copilot is to provide what you
| were missing because you didn't wrap your language in
| Haskell in the first place!
|
| With that said, a better way to start with metalanguage
| design if this problem irks you is probably not Haskell
| itself (it doesn't have an easy way to swap out its
| compile target language from C-- to some other target, I
| don't think) but something like Ometa2:
| http://wiki.squeak.org/squeak/3930
| macintux wrote:
| I feel like some languages make it much easier (or at
| least require less code) to accomplish certain tasks
| (Perl, for text processing, and Erlang for
| concurrency/distributed systems), while other languages
| make it more practical to do anything at all, but the
| tradeoff is that they're much more verbose.
|
| Kitchen sink languages require you to build the kitchen
| before you can cook.
| zasdffaa wrote:
| Very interesting. I don't have experience with copilot,
| but with my own programming I tend to abstract heavily
| until repetition is removed and there's little to repeat
| (that said there are places where the language, C# in my
| case, could support my style of programming better). I'll
| check out some vids.
| bsaul wrote:
| a programming language (together with its standard
| library) should guide you toward safe code, while keeping
| an enjoyable experience.
|
| Saying "this thing is hard to get right, so the PL should
| make you feel the pain" is only marginally better than a
| language that pretends the problem isn't there at all.
| taylorius wrote:
| An interesting point. I wonder if the knowledge contained
| in github copilot could somehow be extracted to come up
| with suitable semantic abstractions.
| Existenceblinks wrote:
| It combines two things that appeal to two large camps of
| developers, ML style and performance. I even think that its
| memory safety isn't the main attraction. It's like a language
| that aims to sell to both the ML family and C guys, and
| that's over 50% of volume of devs' voice.
| kaba0 wrote:
| What I genuinely don't understand is why do we focus so much
| on the low-level/system front? It is (or very much _should
| be_) a small niche, and most business applications are
| better served by managed languages that won't get a huge list
| of vulnerabilities from memory corruption alone.
| lostdog wrote:
| Low-level and system programming are the areas most hurting
| for better languages.
|
| High level has several modern scripting languages (Python,
| Ruby, Javascript), and typed languages (Java, C#, Kotlin).
| Low-level programming has had C and C++ for 30+ years, and
| both have major problems that need fixing. Plus, there's
| lots of interest in having programs run much much faster,
| especially as people realize that every line of Python they
| write could be 10x faster in a lower level language, with
| minimal extra work. This is why Rust, and now Odin, get so
| much attention.
|
| Of course, there's also a renaissance brewing for mid-level
| languages, including Go, nim, and crystal.
| shpongled wrote:
| Personally, I would use Rust even if it was managed/not as
| low level. Advanced ML type system + best-in-class
| developer UX/tooling is the biggest selling point for me.
| Hemospectrum wrote:
| Systems-level programming is a frontier for language design
| precisely _because_ higher-level domains are already so
| well-served by established platforms. If your problem can
| be solved in Python or JavaScript, you're potentially
| creating a lot of work for yourself by using a language
| that's sort of like either of these, but not actually
| compatible with their libraries. The internet is littered
| with upstart languages that were made this way and withered
| on the vine. On the other hand, if you're working in a
| problem space with tighter performance constraints, and you
| _already_ can't touch these languages, and you can't even
| count on libraries written in C or C++, then you suddenly,
| paradoxically, have a lot more options.
| jessermeyer wrote:
| Odin has renewed my joy of programming.
|
| Built in bounds checking, slices, distinct typing, _no_
| undefined behavior, consistent semantics between optimization
| modes, minimal implicit type conversions, context system and
| the standard library tracking allocator combine together to
| eliminate the majority of memory bugs I used sanitizers for
| in C/C++. Now I'm back to logic bugs, which neither
| Rust nor sanitizers can help you directly with anyway because
| they rely on program and not language semantics. Obviously
| these features together do not eliminate those classes of bugs
| the way Rust does, but Odin chooses a different point on the
| efficient frontier to target, and does so masterfully.
|
| To put the cherry on top, the sane package system, buttery
| smooth syntax, sensible preprocessor (the 'constant system'),
| generics, LLVM backend for debug info / optimization, open
| source standard library, simple build system, engaging and
| well-intentioned community make day to day programming a pleasant
| experience.
| dunefox wrote:
| > minimal type inference
|
| How is that a pro? That's a net negative.
| Mountain_Skies wrote:
| That depends on your goal. Want to move fast and break
| things? Then type inference is great. Want to build things
| that are solid and reliable? Type inference can be a very
| bad thing in the wrong hands and most hands are the wrong
| hands.
| dunefox wrote:
| How is type inference for "moving fast and breaking
| things" but not for building solid and reliable things?
| I'm not quite sure we're talking about the same concept
| here.
| Yoric wrote:
| Having been a professional OCaml developer, a long time
| ago, I found out that too much type inference gets into
| the way of proper documentation and, to some extent,
| proper naming conventions. Once code stabilized, we
| started annotating everything, as in languages with much
| more limited support for type inference, because it made
| finding type-level (and sometimes runtime) issues easier.
|
| Perhaps that's what the GP is referring to?
| tialaramex wrote:
| I'm comfortable with Rust's choice to infer types only
| within a function, OCaml does sound like it has too much
| inference. But I think the GP was just confused about
| vocabulary and what they're really talking about is
| coercion and they originally wrote "minimal type
| inference" instead of "minimal type coercion". I think
| they subsequently corrected to "minimal implicit type
| conversions" which is basically just more words for the
| same thing.
|
| Unwanted type coercions are an infamous problem in C and
| C++.
| jessermeyer wrote:
| I should clarify. I may have mistyped.
|
| Odin allows type inference at the declaration level `foo :=
| 1` for example, and a few other places, largely from the
| constant system. 1 can be an integer, float, or even a
| matrix, given the larger context.
|
| What I meant was implicit type conversion. Integers do not
| automatically cast as booleans in Odin, as an example.
| dunefox wrote:
| > What I meant was implicit type conversion.
|
| Well, that's something completely different then.
| hsn915 wrote:
| Odin's stance towards undefined behavior was probably _the_
| decisive element that made me prefer it over the other
| languages.
|
| https://twitter.com/TheGingerBill/status/1495004577531367425
| raphlinus wrote:
| How does this work in practice? Is the behavior of a use-
| after-free defined? Data races? The latter even more so for
| objects that don't fit in one machine word, such as slices.
|
| While avoiding undefined behavior is a noble goal, my
| personal feeling is that actually achieving that will be
| much harder than it might first seem, and will probably end
| up precluding a good deal of optimization.
|
| Of course, C has an entire class of UB that is much more
| excessive than useful, for example left shift of a negative
| integer. It's clearly and obviously possible to do much
| better than C. I'm just skeptical that "no UB at all" is in
| reach for a low-level, systems programming language that is
| also portable and can be compiled with optimization
| comparable to C.
| guenthert wrote:
| > for example left shift of a negative integer. It's
| clearly and obviously possible to do much better than C.
|
| If you want to allow machine architectures using other
| than two's complement while simultaneously striving for
| efficient translations, that isn't all that obvious to
| me.
| elteto wrote:
| Those machines can stay using whatever compiler/language
| already works for them today. Enough of holding back the
| rest of the world because of these fabled non-standard
| architectures.
| gingerBill wrote:
| Name a machine that people still program for that is NOT
| two's complement.
|
| And the latest version of C now requires two's complement
| too!
| astrange wrote:
| Left shift with a negative shift count still has UB
| issues unless you generate inefficient code. The way the
| shift count is read is different on ARM, x86 scalar, and
| x86 SIMD... so you can't autovectorize it without UB.
| gingerBill wrote:
| Odin does not allow for negative shifts to begin with. To
| copy from the Overview[1]:
|
| > The right operand in a shift expression must have an
| unsigned integer type or be an untyped constant
| representable by a typed unsigned integer.
|
| and [2]:
|
| > The shift operators shift the left operand by the shift
| count specified by the right operand, which must be
| non-negative. The shift operators implement arithmetic shifts
| if the left operand is a signed integer and logical shifts
| if it is an unsigned integer. There is not an upper
| limit on the shift count. Shifts behave as if the left
| operand is shifted `n` times by `1` for a shift count of
| `n`. Therefore, `x<<1` is the same as `x*2` and `x>>1` is
| the same as `x/2` but truncated towards negative
| infinity.
|
| [1] https://odin-lang.org/docs/overview/#arithmetic-operators
| [2] https://odin-lang.org/docs/overview/#integer-operators
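
The quoted shift rules can be checked with small numbers. This is a sketch in Python rather than Odin: Python integers are arbitrary precision, so fixed-width overflow is not modelled, and `shift_left_by_ones` and `trunc_div` are hypothetical helpers written only for this comparison.

```python
def shift_left_by_ones(x: int, n: int) -> int:
    """Shift x left one bit at a time, n times."""
    if n < 0:
        raise ValueError("shift count must be non-negative")
    for _ in range(n):
        x <<= 1
    return x


def trunc_div(x: int, d: int) -> int:
    """Integer division that truncates toward zero, as C's `/` does."""
    q = abs(x) // abs(d)
    return q if (x < 0) == (d < 0) else -q


left = shift_left_by_ones(5, 3)   # same as 5 * 2**3
right_shift = -3 >> 1             # arithmetic shift of a negative value
floor_half = -3 // 2              # division rounding toward negative infinity
trunc_half = trunc_div(-3, 2)     # division rounding toward zero
```

Here `right_shift` and `floor_half` agree while `trunc_half` differs by one, which is the distinction the quoted text draws when it says `x>>1` matches `x/2` "truncated towards negative infinity".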
| jessermeyer wrote:
| Odin forbids negative shifts.
| hsn915 wrote:
| I think the twitter thread I linked answers your
| questions?
|
| "use after free" is an instance of a memory access
| pattern that is considered invalid from the program's
| point of view.
|
| What happens depends on the situation:
|
| Was the memory returned to the operating system? If so it
| will probably result in a page fault and if you don't
| have a thing to handle the signal then the OS will crash
| your program.
|
| Was the memory part of an arena managed by the custom
| allocator that still owns it? If so it will return
| whatever value is contained in the address being
| dereferenced.
| lifthrasiir wrote:
| Borrowing gingerbill's terminology, your "mini spec" is
| still incomplete. And it is unclear how an implicit or
| incomplete mini spec is different from UB, except for the
| fact that the compiler can't take advantage of that. From
| the user's perspective a gap in the mini spec is still a
| gap that needs to be memorized, considered and avoided
| much like UB. If you do somehow manage to define every
| "mini spec", this poses another problem that your
| specification limits what you can do in the future---for
| example you would be unable to switch the memory
| allocator.
| glowcoil wrote:
| This is fundamentally the same thing as undefined
| behavior, regardless of whether Odin insists on calling
| it by a different name. If you don't want behavior to be
| undefined, you have to define it, and every part of the
| compiler has to respect that definition. If a use-after-
| free is not undefined behavior in Odin, what behavior is
| it defined to have?
|
| As a basic example, if the compiler guarantees that the
| write will result in a deterministic segmentation fault,
| then that address must never be reused by future
| allocations (including stack allocations!), and the
| compiler is not allowed to perform basic optimizations
| like dead store elimination and register promotion for
| accesses to that address, because those can prevent the
| segfault from occurring.
|
| If the compiler guarantees that the write will result in
| either a segfault or a valid write to that memory
| location, depending on the current state of the
| allocator, what guarantees does the compiler make about
| those writes? If some other piece of code is also
| performing reads and writes at that location, is the
| write guaranteed to be visible to that code? This
| essentially rules out dead store elimination, register
| promotion, constant folding, etc. for _both_ pieces of
| code, because those optimizations can prevent one piece
| of code from observing the other's writes. Worse, what
| if the two pieces of code are on different threads? And
| so on.
|
| If the compiler doesn't guarantee a deterministic crash,
| and it doesn't guarantee whether or not the write is
| visible to other code using the same region of memory,
| and it doesn't provide any ordering or atomicity
| guarantees for the write if it does end up being visible
| to other code, and then it performs a bunch of
| optimizations that can affect all of those things in
| surprising ways: congratulations, your language has
| undefined behavior. You can insist on calling it
| something else, but you haven't changed the fundamental
| situation.
| kaba0 wrote:
| How is that different from specifying the behavior of
| addition only when it won't overflow? The C spec might as
| well say that it is invalid to overflow, and you are
| responsible for checking for that (and that's kind of
| what they do), but that's what UB is afaik.
| raphlinus wrote:
| The point of the C11 memory model is that it gives formal
| bounds on what optimizations the compiler is and is not
| able to do. In particular, it is free to reorder memory
| operations as if the program was single-threaded, unless
| the memory operations are explicitly marked as atomic. My
| assertion is that if you do these optimizations and then
| have a data race, it's functionally equivalent to
| undefined behavior, even if you call it something
| different and loudly proclaim that your language doesn't
| have UB.
| jessermeyer wrote:
| Obviously Odin does not use C's memory model. And in
| instances where LLVM optimizes Odin for UB, it is a bug,
| and not a feature. Odin explicitly opts out of all
| optimization passes that depend on or leverage UB, but
| that's a moving target.
|
| For example, as mentioned in the article, Odin does not
| leverage LLVM's poison value optimizations, which are
| derived from optimizations exploiting undefined behavior.
|
| Sure, some code is slowed. But can you point to a well
| known and well used algorithm whose runtime
| characteristics depend upon exploiting UB? If your code
| goes fast because it's doing undefined things the
| compiler strips away, that's a structural bug in the
| application, in my view.
| raphlinus wrote:
| I'll give a concrete example. It's not the most
| compelling optimization in the world, but I think
| it illustrates the tradeoffs clearly. The following is
| pseudocode, not any particular language, but hopefully
| will be clear.
|
|     let i = mystruct.i;
|     if (i < array.length) {
|         let x = expensive_computation(i);
|         array[i] = x;
|     }
|
| For the purpose of this example, let's say that the
| expensive computation is register-intensive but doesn't
| write any memory (so we don't need to get into alias
| analysis issues). Because it is register-intensive, our
| optimizer would like to free up the register occupied by
| i, replacing the second use of i by another load from
| mystruct.i. In C or unsafe Rust, this would be a
| perfectly fine optimization.
|
| If another thread writes mystruct.i concurrently, we have a
| time of check to time of use (TOCTOU) error. In C or
| unsafe Rust, that's accounted for by the fact that a data
| race is undefined behavior. One of the behaviors that's
| allowed (because basically all behaviors are allowed) is
| for the two uses of i to differ, invalidating the bounds
| check.
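|
| A deterministic sketch of that hazard, assuming the
| concurrent writer can be modeled by having the expensive
| computation itself change the field. MyStruct and both
| store_* functions are hypothetical names. The first
| function keeps the checked value in a local; the second
| models the optimizer's second load from mystruct.i.

```python
class MyStruct:
    """Hypothetical shared struct; `i` may be changed by another writer."""

    def __init__(self, i):
        self.i = i


def expensive_computation(i, shared):
    # Stand-in for the concurrent writer: the field changes while we compute.
    shared.i = 100
    return i * 2


def store_with_cached_index(shared, array):
    i = shared.i  # single load, kept in a local
    if i < len(array):
        x = expensive_computation(i, shared)
        array[i] = x  # uses the value that was bounds-checked
        return True
    return False


def store_with_reloaded_index(shared, array):
    i = shared.i
    if i < len(array):
        x = expensive_computation(i, shared)
        array[shared.i] = x  # second load: no longer the checked value
        return True
    return False
```

| Python raises IndexError on the reloaded version. In C
| the same store would write out of bounds with no check.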
|
| Different languages deal with this in different ways.
| Java succeeds in its goal of avoiding UB, disallowing
| this optimization; the mechanism for doing so in LLVM is
| to consider most memory accesses to have "unordered"
| semantics. However, this comes with its own steep
| tradeoff. To avoid tearing, all pointers must be "thin,"
| specifically disallowing slices. Go, by contrast, has
| slices among its fat pointer types, so incurs UB when
| there's a data race. It's otherwise a fairly safe
| language, but this is one of the gaps in that promise.
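|
| A sketch of what tearing means for a fat pointer, assuming
| a slice header is two separate words (buffer and length)
| written one after the other. FatPointer and the helper
| names are hypothetical, and a generator stands in for the
| scheduler so the interleaving is deterministic.

```python
class FatPointer:
    """Hypothetical two-word slice header: a backing buffer plus a length."""

    def __init__(self, data, length):
        self.data = data
        self.length = length


def assign_in_two_steps(dst, src):
    """Copy a slice header word by word, pausing between the two stores."""
    dst.data = src.data
    yield  # a concurrent reader may run here
    dst.length = src.length


def observe(header):
    """Return (actual buffer size, length the header claims)."""
    return (len(header.data), header.length)


def demo_tear():
    long_slice = FatPointer([0] * 8, 8)
    short_slice = FatPointer([1, 2], 2)
    shared = FatPointer(long_slice.data, long_slice.length)

    writer = assign_in_two_steps(shared, short_slice)
    next(writer)            # only the first word has been stored
    torn = observe(shared)  # new buffer paired with the old length
    next(writer, None)      # finish the second store
    final = observe(shared)
    return torn, final
```

| The torn observation pairs a 2-element buffer with a
| claimed length of 8, so a bounds check against the header
| would admit indices past the end of the buffer.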
|
| Basically, my argument is this. If you're _really_
| rigorous about avoiding UB, you essentially have to
| define a memory model, then make sure your use of LLVM
| (or whatever code generation technique) is actually
| consistent with that memory model. That's potentially an
| enormous amount of work, very easy to get subtly wrong,
| and at the end of the day gives you fewer optimization
| opportunities than C or unsafe Rust. Thus, it's certainly
| not a tradeoff I personally would make.
| jessermeyer wrote:
| Thanks. Currently Odin would cache i on the stack for
| retrieval later, granting LLVM the ability to load it
| into a register if profitable with knowledge that after
| the read, `i` is constant, which bypasses the RAW hazard
| after the initial read.
|
| My view is that undefined behavior is a trash fire and
| serious effort should be undertaken to fix the situation
| before it gets even more out of hand.
| spacechild1 wrote:
| > Obviously Odin does not use C's memory model.
|
| How is that obvious? And which memory model does Odin
| use? Just to be clear: a "memory model" is not about
| memory management, it is about the behaviour of
| multithreaded programs:
| https://en.m.wikipedia.org/wiki/Memory_model_(programming)
| jessermeyer wrote:
| Thanks. I was thinking of strict-aliasing and the
| associated undefined behavior, which Odin forbids. Odin's
| atomic intrinsics are modeled closely after C11 (and require
| the programmer to emit memory barriers when cache
| behavior must be controlled by instructions). I believe
| the final memory model will be platform defined. There is
| no built-in atomic data type in Odin. Only atomic
| operations are available, and most sync primitives are
| implemented in the standard library wrapping OS
| capabilities (WaitOnAddress), etc.
|
| The parent comment said:
|
| > then have a data race, it's functionally equivalent to
| > undefined behavior
|
| This is a matter of interpretation but there is a
| categorical difference between "this read is undefined so
| the compiler will silently omit it" and "this read will
| read whatever value is in the CPU cache at this address,
| even if the cache is stale". The difference is a matter
| of undefined behavior from the _language_ versus the
| _application_ being in an undefined state.
| Yoric wrote:
| My experience of Rust is that affine types go a long way
| towards eliminating some classes of logic bugs.
|
| That being said, there's always space for exploring other
| design choices! I haven't tried Odin yet, but it looks very
| interesting.
| dikaio wrote:
| The way the article reads feels very authentic, and the
| transparency about the review's conflict of interest is
| refreshing.
___________________________________________________________________
(page generated 2022-09-11 23:00 UTC)