[HN Gopher] A Review of the Zig Programming Language (Using Adve...
___________________________________________________________________
A Review of the Zig Programming Language (Using Advent of Code
2021)
Author : mkeeter
Score : 236 points
Date : 2021-12-27 14:09 UTC (8 hours ago)
(HTM) web link (www.duskborn.com)
(TXT) w3m dump (www.duskborn.com)
| anatoly wrote:
| I also did AoC 2021 in Zig:
| https://github.com/avorobey/adventofcode-2021
|
| One thing the OP didn't mention that I really liked was runtime
| checks on array/slice access and integer under/overflow. Because
| dealing with heap allocation is a bit of a hassle, I was
| incentivized to use static buffers a lot. I quickly figured out
| that I didn't have to worry about their sizes much, because if
| they're overrun by the unexpectedly large input or other behavior
| in my algorithms, I get a nice runtime error with the right line
| indicated, rather than corrupt memory or a crash. Same thing
| about choosing which integer type to use: it's not a problem if I
| made the wrong choice, I'll get a nice error message and fix
| easily. This made for a lot of peace of mind during coding.
| Obviously in a real production system I'd be more careful and use
| dynamic sizes appropriately, but for one-off programs like these
| it was excellent.
|
| Overall, I really enjoyed using Zig while starting out at AoC
| problem 1 with zero knowledge of the language. To my mind, it's
| "C with as much convenience as could be wrung out of it w/o
| betraying the low-level core behavior". That is, no code
| execution hidden behind constructors or overloads, no garbage
| collection, straight imperative code, but with so much done right
| (type system, generics, errors, optionals, slices) that it feels
| much more pleasant and incomparably safer than C.
|
| (you can still get a segmentation fault, and I did a few times -
| by erroneously holding on to pointers inside a container while it
| resized. Still, incomparably safer)
| pcwalton wrote:
| > (you can still get a segmentation fault, and I did a few
| times - by erroneously holding on to pointers inside a
| container while it resized. Still, incomparably safer)
|
| This is a severe problem, and I predict that this is going to
| cause real security issues that will hurt real people if Zig
| gets used in production before it gets production-ready memory
| safety. This exact pattern (pointers into a container that
| resized, invalidating those pointers) has caused zero-days
| exploited in the wild in browsers.
| elcritch wrote:
| > This is a severe problem, and I predict that this is going
| to cause real security issues
|
| That is a nasty problem, particularly in larger projects with
| different subsystems interacting (like say an xml parser and
| another).
|
| I suspect it's worse in some ways as Zig has good marketing
| as being a "safer" language despite still having the same
| fundamental memory flaws as C/C++. In the worst case that
| could lull programmers into complacency. I mean it looks
| "modern" so it's safe right? Just do some testing and it's
| all good.
|
| Currently I'm skeptical Zig will get production-ready
| memory safety. Currently there are only GCs or linear/affine
| types and Zig doesn't appear to be pursuing either. Aliased
| pointers aren't something that's properly handled by ad hoc
| testing IMHO.
| Laremere wrote:
| FWIW, "safe" doesn't appear anywhere on the Zig homepage.
| I've been trying out Zig for the past couple weeks, and
| while I love it so far, it gives anything but the feeling
| of safety. I would say there's guardrails, but those are
| optionally disabled in the compiler for faster execution.
|
| It seems to me that Zig is really not trying to be a
| replacement for all programming, but fill its niche as best
| it can. If your niche requires memory safety as a top
| priority because it accepts untrusted input, Rust would
| probably be a better choice than Zig.
| pcwalton wrote:
| Some sort of pointer tagging system, like 128-bit pointers
| where the upper word is a unique generation ID, might be
| the simplest approach to eliminate security problems from
| use-after-free, but it's going to have some amount of
| runtime overhead (though new hardware features may help to
| reduce it).
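|
| A minimal software sketch of the generation-ID idea, written in
| Rust since the thread already uses it as a point of comparison.
| This is not a 128-bit hardware pointer; it is an index-plus-
| generation handle into an arena, and every name in it (Arena,
| Handle, insert, remove, get) is hypothetical. A handle whose
| generation no longer matches its slot is rejected instead of
| silently aliasing whatever reused the slot.

```rust
// Hypothetical sketch: handles carry a generation counter so that
// use-after-free is detected at lookup time.

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Handle {
    index: usize,
    generation: u64,
}

struct Slot<T> {
    generation: u64,
    value: Option<T>,
}

pub struct Arena<T> {
    slots: Vec<Slot<T>>,
}

impl<T> Arena<T> {
    pub fn new() -> Self {
        Arena { slots: Vec::new() }
    }

    /// Store a value, reusing a freed slot if one exists.
    pub fn insert(&mut self, value: T) -> Handle {
        if let Some(index) = self.slots.iter().position(|s| s.value.is_none()) {
            let slot = &mut self.slots[index];
            slot.value = Some(value);
            return Handle { index, generation: slot.generation };
        }
        self.slots.push(Slot { generation: 0, value: Some(value) });
        Handle { index: self.slots.len() - 1, generation: 0 }
    }

    /// Free a slot and bump its generation so old handles stop matching.
    pub fn remove(&mut self, handle: Handle) -> Option<T> {
        let slot = self.slots.get_mut(handle.index)?;
        if slot.generation != handle.generation {
            return None;
        }
        let value = slot.value.take();
        if value.is_some() {
            slot.generation += 1;
        }
        value
    }

    /// Look up a value; a stale handle yields None.
    pub fn get(&self, handle: Handle) -> Option<&T> {
        let slot = self.slots.get(handle.index)?;
        if slot.generation != handle.generation {
            return None;
        }
        slot.value.as_ref()
    }
}
```

| The runtime overhead mentioned above shows up here as the extra
| generation comparison on every get and remove.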
|
| Alternately, use a GC.
| formerly_proven wrote:
| Yes, when I invariably had to debug the first UAF in Zig I
| did pause for a bit and pondered my rust. It's definitely an
| argument against Zig that is unlikely to go away anytime
| soon.
| dnautics wrote:
| If you write tests in zig you will probably find this using
| the testing allocator. Yes, I get that some people really
| don't like writing tests.
| pcwalton wrote:
| Many of the highest-profile memory safety security issues
| are in _very_ well-tested codebases, like browsers.
| dnautics wrote:
| What's your point? You're comparing apples to oranges.
| skybrian wrote:
| Zig apparently has valgrind support. Maybe it's not turned on
| by default?
| formerly_proven wrote:
| I don't think Zig has any particular Valgrind support, it's
| just a binary after all. In order to properly utilize
| valgrind though you're going to have to change from the GPA
| or whatever allocator you're using to the libc one so that
| Valgrind can trace memory allocations correctly via
| preloading.
| skybrian wrote:
| Here is some kind of valgrind API [1] and here is a
| report from someone who tried using valgrind [2]. Yes, it
| doesn't sound all that special.
|
| [1] https://github.com/ziglang/zig/blob/master/lib/std/valgrind....
| [2] https://dev.to/stein/some-notes-on-using-valgrind-with-zig-3...
| staticassertion wrote:
| Valgrind support is cool but it's not a solution to the
| problem.
| geokon wrote:
| "runtime checks on array/slice access and integer
| under/overflow"
|
| I'm probably missing something. I feel like you'd get this and
| a lot of the other benefits you list if you just compile C/C++
| with Debug options - or run with Valgrind or something. Are you
| saying you get automatic checks that can't be disabled in Zig?
| (that doesn't sound like a good thing.. hence I feel I'm
| missing something :) )
| djur wrote:
| Slices allow catching a lot of bounds errors that you can't
| reliably catch when using raw pointers.
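|
| A small sketch of why that works, in Rust rather than Zig: a
| slice carries its length alongside the pointer, so every access
| can be checked against it. The helper names read_at and
| sum_first are made up for illustration.

```rust
/// Read element `i` of a slice; an out-of-range index becomes None
/// because the slice knows its own length.
pub fn read_at(data: &[u8], i: usize) -> Option<u8> {
    data.get(i).copied()
}

/// Sum the first `n` elements; returns None if the slice is shorter
/// than `n` instead of reading past the end.
pub fn sum_first(data: &[u8], n: usize) -> Option<u32> {
    let head = data.get(..n)?;
    Some(head.iter().map(|&b| u32::from(b)).sum())
}
```

| With a bare pointer there is no length to compare against, so
| the equivalent check has to be supplied by the caller.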
| tialaramex wrote:
| What "Debug options" are you imagining will provide runtime
| checks for overflow and underflow in C and C++ - languages
| where this behaviour is deliberately allowed as an
| optimisation?
|
| In C it's simply a fact that incrementing the unsigned 8-bit
| integer 255 gets you 0. Even though this defies what your
| arithmetic teacher taught you about the number line, it's just
| how C works, so a "Debug Option" that says "no, now that's an
| error" isn't so much a "Debug Option" as a different
| programming language.
| bruce343434 wrote:
| -fsanitize=address,undefined,etc
|
| There's even threadsanitizer which will tell you about
| deadlocks and unjoined threads.
| not2b wrote:
| C unsigned integers are completely well behaved: they do
| arithmetic modulo 2^n, and I hope you had a teacher that
| exposed you to that. C has many problems but that isn't one
| of them: overflow of unsigned is designed and documented to
| wrap around.
| justinpombrio wrote:
| > C unsigned integers are completely well behaved: they
| do arithmetic modulo 2^n
|
| Sadly, one rarely finds an excuse to work in the ring
| Z_(2^32) or Z_(2^64), so while that behavior is
| well-defined, it's rarely correct for whatever your purpose
| is.
| naasking wrote:
| Array indices should arguably be unsigned (and
| struct/type sizes), so I'd say it's a lot more common
| than you imply.
| yxhuvud wrote:
| And exactly how is silent wraparound useful or even sane
| for that use case? You just proved the point of the one
| you responded to.
| naasking wrote:
| Wrapping is more sensible than negative indices.
| saagarjha wrote:
| It's useful when working with bits and bytes and stuff.
| Aside from that, I fully agree.
| saagarjha wrote:
| > What "Debug options" are you imagining will provide
| runtime checks for overflow and underflow in C and C++ -
| languages where this behaviour is deliberately allowed as
| an optimisation?
|
| -fsanitize=undefined.
|
| > In C it's simply a fact that incrementing the unsigned
| 8-bit integer 255 gets you 0. Even though this defies what
| your arithmetic teacher taught you about the number line,
| it's just how C works, so a "Debug Option" that says "no,
| now that's an error" isn't so much a "Debug Option" as a
| different programming language.
|
| Yes, but this happens to be defined behavior, even if it's
| what you don't want most of the time. (Amusingly, a lot of
| so-called "safe" languages adopt this behavior in their
| release builds, and sometimes even their debug builds.
| You're not getting direct memory corruption out of it,
| sure, but it's a great way to create bugs.)
| IgorPartola wrote:
| That's a distinction without a difference. Yes it's
| defined behavior. No, there isn't a strictness check in
| C++ nor a debug option that will catch it if it causes a
| buffer overwrite or similar bug. Your comment is
| basically "no need to watch out for these bugs, they are
| caused by a feature".
| saagarjha wrote:
| Did you read the same comment that I wrote? The very
| first thing I mentioned is a flag to turn on checking for
| this. And I mentioned the behavior for unsigned
| arithmetic is defined, but then I _immediately_ mentioned
| that this behavior is probably not what you want and that
| other languages are adopting it is kind of sad.
| tialaramex wrote:
| People read the comment that you wrote, in which you, in
| typical "real programmer" fashion redefined the question
| so that it matched your preferred answer, by mentioning a
| flag that does not in fact, check for overflow and then
| clarifying that you've decided to check for undefined
| behaviour not for overflow.
|
| [ saagarjha has since explained that in fact the UBSan
| does sanitize unsigned integer overflow (and several
| other things that aren't Undefined Behaviour) so this was
| wrong, left here for posterity ]
|
| _Machines_ are fine with the behaviour being whatever it
| is. But _humans_ aren't and so the distant ancestor post
| says they liked the fact Zig has overflow checks in debug
| builds. So does Rust.
|
| If you'd prefer to reject overflow entirely, it's
| prohibited in WUFFS. WUFFS doesn't need any runtime
| checks, since it is making all these decisions at compile
| time, but unlike Zig or indeed C it is not a general
| purpose language.
| tialaramex wrote:
| > -fsanitize=undefined.
|
| As you yourself almost immediately mention, that's not
| checking for overflow.
|
| Was the goal here to show that C and C++ programmers
| don't understand what overflow is?
|
| > Yes, but this happens to be defined behavior, even if
| it's what you don't want most of the time
|
| The defined behaviour is an overflow. Correct. So,
| checking for undefined behaviour does not check for
| overflow. See how that works?
| saagarjha wrote:
| Sorry, perhaps I assumed a bit too much with my response.
| Are you familiar with
| -fsanitize=unsigned-integer-overflow? Your response makes me
| think you might not be
| aware of it and I wanted you to be on the same footing in
| this discussion.
| tialaramex wrote:
| I was not. So, UBSan also "sanitizes" defined but
| undesirable behaviour from the language under the label
| "undefined". Great nomenclature there.
|
| It also, by the looks of things, does not provide a way
| to say you want wrapping if that's what you did intend,
| you can only disable the sanitizer for the component that
| gets false positives. I don't know whether Zig has this,
| but Rust does (e.g. functions like wrapping_add() which
| of course inline to a single CPU instruction, and the
| Wrapping<> generic that implies all operations on that
| type are wrapping)
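|
| A brief sketch of those Rust facilities, for comparison:
| wrapping_add, checked_add and std::num::Wrapping are real
| standard-library items, while the wrapper function names
| (wrapping_inc, checked_inc, wrapping_sum) are invented here.

```rust
use std::num::Wrapping;

/// Explicitly wrapping increment: 255 becomes 0 in every build mode.
pub fn wrapping_inc(x: u8) -> u8 {
    x.wrapping_add(1)
}

/// Checked increment: overflow is reported as None rather than wrapping.
pub fn checked_inc(x: u8) -> Option<u8> {
    x.checked_add(1)
}

/// Sum bytes modulo 256 using the Wrapping<u8> newtype, so every
/// addition on the accumulator wraps by construction.
pub fn wrapping_sum(bytes: &[u8]) -> u8 {
    let mut total = Wrapping(0u8);
    for &b in bytes {
        total += Wrapping(b);
    }
    total.0
}
```

| The point of spelling it out is that wrapping becomes an opt-in
| statement of intent, so an overflow check elsewhere has no
| false positive to suppress.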
|
| But you are then correct that this catches such
| overflows. Thanks for pointing to
| -fsanitize=unsigned-integer-overflow.
|
| Since we're on the topic of sanitizers. These are great
| for AoC where I always run my real input under Debug
| anyway, but not much use in real systems where of course
| the edge case will inevitably happen in the
| uninstrumented production system and not in your unit
| tests...
| vinkelhake wrote:
| Runtime checks for _signed_ overflow can be enabled with
| -ftrapv in GCC and clang. Having this option open is why
| some people prefer to use signed integers over unsigned.
| AnIdiotOnTheNet wrote:
| The runtime safety checks are enabled in Debug and
| ReleaseSafe modes, but disabled in ReleaseFast and
| ReleaseSmall modes. They can be enabled (or disabled) on a
| per-scope basis using the `@setRuntimeSafety` builtin.
| pcwalton wrote:
| You're correct: you do get virtually all of the _safety_
| benefits of Zig by using sanitizers in C++. (Not speaking to
| language features in general, obviously.) In fact, C++ with
| sanitizers gives you more safety, because ASan/TSan/MSan
| have a _lot_ of features for detecting UB.
|
| Especially note HWASan, which is a version of ASan that is
| designed to run in production:
| https://source.android.com/devices/tech/debug/hwasan
| typon wrote:
| Defaults matter a lot. Just because something is possible
| doesn't mean it is likely to happen.
|
| Are most people going to enable asan, run their programs
| through valgrind extensively, or just do the easy thing and
| not do any of that?
|
| This is also why neovim is being actively developed and
| successful and vim is slowly decaying. The path of least
| resistance is the path most well travelled.
| snovv_crash wrote:
| Any project with a decent test coverage and CI can easily
| set up an ASAN / Valgrind run for their tests. I know I've
| had this on the last few C++ codebases I've worked with.
| superjan wrote:
| I would say that keeping the checks in runtime for release
| builds is the smart default. For most usages, removing the
| checks in release builds only adds security holes without
| measurable impact on performance.
| OtomotO wrote:
| The one thing I personally love about Zig, from an outsider's
| perspective, is the relatively clear scope and the "no, we won't
| add each and every feature we can imagine"-stance.
|
| It also seems rather elegant.
| ArtixFox wrote:
| the world has learnt from the horrors of C++, the same mistakes
| should not be repeated lol.
| OtomotO wrote:
| I don't see that the world has learned.
|
| Take Rust for example. Rust is a language that does a lot
| right. It's currently my favorite language.
|
| Still there is feature creep. And a lot of it.
| dilap wrote:
| swift, too
| anonymoushn wrote:
| > One nugget of knowledge I've worked out though - Zig is not a
| replacement for C. It is another replacement for C++.
|
| While comptime is a potential source of complexity, I sort of
| think C++ developers won't accept a replacement that has no RAII
| or automatic invocation of destructors.
| tomcam wrote:
| Meh. Not trying to start a language war but I was grateful to
| switch from C++ to Go when the price was to lose generics and a
| few other things in exchange for the language's simplicity and
| clarity.
| dnautics wrote:
| I think there are so many corners where people are using C++
| that making generalizations about them is likely to fail.
| tomcam wrote:
| That's why I referred exclusively to my preferences. It is
| obviously the more comprehensive and versatile language.
| dnautics wrote:
| I was mostly referring to gp's broad statements, comment
| was in support of your experience, which is good feedback
| to hear.
| Bekwnn wrote:
| There's an open issue to add some kind of function
| annotation+errors for functions which require you to call a
| cleanup function.
|
| The discussion has had a lot of back and forth and they haven't
| really settled on a desirable solution yet, but it's something
| they're hoping to add.
|
| https://github.com/ziglang/zig/issues/782
|
| I work in games with C++ and we already do so much manual
| management and initialization+teardown functions that lack of
| RAII isn't a deal-breaker. Though I'd definitely prefer it if
| there was something either well-enforced or automatic.
| gilbetron wrote:
| > One nugget of knowledge I've worked out though - Zig is not a
| replacement for C. It is another replacement for C++.
|
| I hope this isn't the case, since I see Rust as the C++
| replacement, and another replacement isn't very interesting to
| me. The main reason I've been interested in Zig is because I
| thought it was a replacement for C, which is an interesting idea.
| oconnor663 wrote:
| Is a C replacement (which is not _also_ a C++ replacement)
| really what anybody wants? Like with no generics, no dedicated
| error handling, and no automatic cleanup? I get that everyone
| enjoys a simple language, but these features feel like table
| stakes now.
| sirwhinesalot wrote:
| Zig has all of those things, if you consider defer to be a
| form of automatic cleanup.
| ncmncm wrote:
| Hint: _Rust will not be replacing C++_. C++ and Rust will
| coexist indefinitely. At some point in the future, it is
| possible that more Rust coders will be using it daily in their
| work than the number who pick up C++ for the first time _in any
| given week_, who will go on to use it professionally. Or, that
| might not happen, and Rust will join Ada and so many other
| languages that never got their miracle.
|
| Even if Zig doesn't fizzle like the overwhelming majority of
| languages, it won't replace, or displace, C, never mind C++.
| Everybody willing to move on from C already did a long time
| ago. People still using C today like it for its failings, so a
| language that fixes them is exactly what they don't want. It
| doesn't give C++ users any of the things they need.
|
| The only real advance in systems languages in the last 50 years
| is the destructor, so it is frankly weird to find a new
| language without it. The Drop trait is all that makes Rust a
| viable prospect for its own miracle.
| gilbetron wrote:
| Oh I definitely meant "replacement" as in "replacement for me
| and many people", not that C++ would vanish. C and C++ are
| not going anywhere.
| isaiahg wrote:
| That's the one part in which I really disagreed and the author
| does a bad job of explaining why they think that.
| dnautics wrote:
| I don't understand where that came from. It's really a
| replacement for C. The place where complexity comes from in zig
| is pretty much the comptime type system, which is emergent from
| the idea of replacing irregular consteval rules for C and
| replacing preprocessor macros
|
| I would say that Zig is:
|
| C - {make, autoconf, etc., preprocessor, UB[0]} + {*defer, !,
| ?, catch/try, "async"[1], alignment, comptime}
|
| I don't think that rises to the level of "C++ replacement".
| Maybe it's that comptime lets you do generics a la C++
| templates?
|
| [0] by default, in zig you can have UB for performance
|
| [1] in quotes because async is not actually async, it's a
| control flow statement that is usually and most usefully used
| to do async things.
| flohofwoe wrote:
| The part about making things easy to type is interesting, because
| this generally only works with a single international keyboard
| layout (usually US English), e.g. making things easy to type on
| the US keyboard layout may make it harder on an international
| layout.
|
| It's an old problem though, for instance the {}[] keys are
| terribly placed on the German keyboard layout, requiring the
| right-Alt-key which was enough for me to learn and switch to the
| US keyboard layout, and not just for coding.
|
| I think a better approach for a programming language would be to
| use as few special characters as possible.
|
| PS: Zig balances the '|' problem by using 'or' instead of '||' ;)
| formerly_proven wrote:
| I've heard this before but personally I've never had a problem
| with {}[], I just use the right thumb for shifting to the
| ancient greek layer.
| dralley wrote:
| >PS: Zig balances the '|' problem by using 'or' instead of '||'
| ;)
|
| I wish Rust had made that decision as well.
| AnIdiotOnTheNet wrote:
| > Initializing arrays is weird in Zig. Let's say you want to have
| a 0 initialized array, you declare it like [_]u8{0} ** 4 which
| means I want an array, of type u8, that is initialized to 0 and
| is 4 elements long. You get used to the syntax, but it's not
| intuitive.
|
| Alternatively:
|
|     var some_array = std.mem.zeroes([4]u8);
|
| Though as mentioned later in the article the standard library
| documentation is not very good, making this not as obvious as it
| could be.
|
| > Everything in Zig is const x = blah;, so why are functions not
| const bar = function() {};?
|
| Good question, there's an accepted proposal to fix this:
| https://github.com/ziglang/zig/issues/1717
|
| > The builtin compiler macros (that start with @) are a bit
| confusing. Some of them have a leading uppercase, others a
| lowercase, and I never did work out any pattern to them.
|
| In idiomatic zig, anything that returns a type is uppercased as
| though it were itself a type. Since only a few builtins return
| types, the vast majority of builtins will start with a lowercase
| letter. I think it is only `@Type`, `@TypeOf`, `@This`, and
| `@Frame` that don't.
| rgrmrts wrote:
| Any recommendations on learning more about what constitutes
| idiomatic zig? This is an issue I have with learning any new
| language - it's kind of hard for me to figure out what writing
| idiomatic code in that language looks like. I usually go
| looking for popular/high quality projects and reading that code
| but it takes away from the experience of actually just toying
| around not to mention it being hard knowing what a high quality
| project is. Thanks in advance!
| gavinray wrote:
| https://ziglearn.org/
| dnautics wrote:
| I don't think of this as an idiomaticity guide.
| gavinray wrote:
| It's unfortunately the only comprehensive reference I
| know of besides the official docs.
|
| There is "Ziglings", but those are a collection of small
| exercises with answers rather than a full guide.
|
| https://github.com/ratfactor/ziglings
|
| If you know of better resources than these two, please do
| share (not being passive aggressive here).
| dnautics wrote:
| honestly, the standard library. But there _should_ be an
| idiomaticity guide. At least for capitalization patterns,
| any other naming conventions, etc (things that zig fmt
| can't capture). I don't know that this officially
| exists anywhere yet. Here is an example of what I'm
| talking about in my $DAYJOB lang:
|
| https://hexdocs.pm/elixir/naming-conventions.html#content
|
| edit: see sibling comment, apparently I never noticed the
| style guide in the standard docs
| cturtle wrote:
| The style guide in the language reference explains the
| accepted naming conventions [0].
|
| [0]: https://ziglang.org/documentation/master/#Style-Guide
| dnautics wrote:
| oh man! I'd missed this! Thanks
| rgrmrts wrote:
| Oh cool, I hadn't seen this either! Thanks!
|
| Also thanks for the other links, ziglearn is great.
| Shadonototra wrote:
| if you need to import a package to clear an array, something
| went very wrong somewhere..
| kristoff_it wrote:
| In Zig zero initialization is not idiomatic. Unless you have
| an active reason to do so (and during AoC you need zero init
| a lot more than normal IME), you should just set the array to
| undefined like so:
|
|     var foo: [64]usize = undefined;
| christophilus wrote:
| Why?
| Shadonototra wrote:
| because it's trivial, it's like assigning a value to an
| integer, it shouldn't require a package
| messe wrote:
| Depending on your perspective, it's not trivial. It's
| significantly more expensive than assigning a value to an
| integer. Zeroing a [4096]u64 would require several
| thousand times more operations than zeroing a u64. In the
| areas that zig targets, this can be quite important.
| Shadonototra wrote:
| i disagree, it's just backward to need to import a
| package
| ithkuil wrote:
| I don't know zig. Is one a constant initializer while the other
| is not?
| e12e wrote:
| > For loops are a bit strange too - you write for (items) |item|
| {}, which means you specify the container before the per-element
| variable. Mentally I think of for as for something in many_things
| {} and so in Zig I constantly had to write it wrong and then
| rewrite.
|
| That does feel like the syntax is missing an "each" or a "with",
| as in "for each somethings as some do" or "with each somethings
| as some" - or in a similar terse/compact syntax:
|
|     each (items) |item| {}
|
| I'm surprised there's no mention about (lack of) string type -
| considering the domain (advent of code). I've not found the time
| to actually work on aoc this year, but I also had a brief look at
| starting with Zig - and quickly met a bit of a wall between the
| terse documentation on allocator, and the apparent lack of
| standard library support for working with strings.
|
| I think the documentation will improve as the language stabilizes
| and there's likely to be more tutorials that work with text (both
| trivial like sort/cat/tac in zig, and more useful like http or
| dns client and servers etc).
| eirojhupp wrote:
| for strings, these might be good (haven't tried them yet):
| https://github.com/jecolon/zigstr
| https://github.com/jecolon/ziglyph
| kzrdude wrote:
| In "The Good" section, the author says there are only while
| loops and no for.. but apparently there is a for, now I'm
| unsure what it means. Is `for` a function?
| firethief wrote:
| `for` is a foreach. If you want to increment a number through
| a range like a typical C `for`, you have to use a `while`
| loop. I don't really see the draw.
| dnautics wrote:
| man if I were andrew I'd just rename for to "foreach",
| because this is a huge complaint and source of confusion.
| kzrdude wrote:
| Seems fine to me. Rust has a "foreach" which is named
| for, working ok. Of course Rust has ranges as iterators,
| so it's not necessarily noticed that "there is no
| numerical for" but it works.
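|
| For reference, a short sketch of the three loop shapes being
| compared, in Rust: iterating elements, iterating a numeric
| range, and iterating with an index. The function names
| (sum_items, sum_below, index_of_max) are invented for the
| example.

```rust
/// Visit each element ("foreach" style) and add them up.
pub fn sum_items(items: &[i64]) -> i64 {
    let mut total = 0;
    for item in items {
        total += *item;
    }
    total
}

/// Count over a numeric range, the role a C-style
/// `for (i = 0; i < n; i++)` plays.
pub fn sum_below(n: u64) -> u64 {
    let mut total = 0;
    for i in 0..n {
        total += i;
    }
    total
}

/// Pair each element with its index and return the index of the
/// largest element (the first one on ties), or None when empty.
pub fn index_of_max(items: &[i64]) -> Option<usize> {
    let mut best: Option<(usize, i64)> = None;
    for (i, &item) in items.iter().enumerate() {
        if best.map_or(true, |(_, b)| item > b) {
            best = Some((i, item));
        }
    }
    best.map(|(i, _)| i)
}
```

| All three use the same `for` keyword; the range 0..n is just
| another iterator, which is why the missing numeric loop goes
| unnoticed.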
| chubot wrote:
| I thought this syntax was weird too, but one cool thing is that
| you can ask for the index too:
|
|     for (items) |item, i| { }
|
| I guess Go does something similar, but it's a little weirder
| IMO because it has maps, while Zig doesn't.
|
| I suppose it could have been
|
|     for i, item in items { }
|
| But I guess coming from Python that feels like tuple unpacking,
| which I don't think Zig has, but could make sense?
| llimllib wrote:
| I found the standard library's support for strings was plenty
| fine, doing AoC problems in zig tests it out thoroughly.
| Tokenize[1], split[2] and trim[3] were the most common ones I
| used.
|
| Was there something in particular you were looking for and
| didn't find?
|
| [1]:
| https://github.com/ziglang/zig/blob/master/lib/std/mem.zig#L...
|
| [2]:
| https://github.com/ziglang/zig/blob/master/lib/std/mem.zig#L...
|
| [3]:
| https://github.com/ziglang/zig/blob/master/lib/std/mem.zig#L...
|
| * After I read my own comment, I'd note that AoC tests out
| string manipulation pretty thoroughly but things like unicode
| handling not at all, so []const u8 as a string may be more
| annoying in the real world than in AoC answers and I haven't
| used zig's unicode facilities at all
| tialaramex wrote:
| In AoC it's completely fine to conflate a character and a
| byte. Neither your daily input nor the provided tests will
| have anything beyond ASCII.
|
| Which is fine for AoC, good choice, but it means the language
| needn't get this right, or even provide any help to
| programmers who need to get it right, in a similar way to how
| "big" numeric answers in AoC will fit in a 64-bit signed
| integer, never testing whether your chosen language can do
| better if the need arises.
| shakow wrote:
| > Was there something in particular you were looking for and
| didn't find?
|
| Unicode handling. Treating a string as a byte array is all
| fine and dandy if you're only processing English Latin
| alphabet data, but it's a PITA as soon as you start using
| e.g. extended characters (math symbols, fancy quotes, ...),
| other languages, or emojis.
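|
| A small sketch of the byte-versus-character gap, in Rust: the
| same text has different lengths depending on whether bytes or
| Unicode scalar values are counted, and truncating by bytes can
| split a character. The helper names (byte_len, char_count,
| take_chars) are hypothetical.

```rust
/// Number of bytes in the UTF-8 encoding of `s`.
pub fn byte_len(s: &str) -> usize {
    s.len()
}

/// Number of Unicode scalar values in `s`.
pub fn char_count(s: &str) -> usize {
    s.chars().count()
}

/// First `n` scalar values of `s`, never cutting a multi-byte
/// sequence in half the way a byte-offset slice could.
pub fn take_chars(s: &str, n: usize) -> String {
    s.chars().take(n).collect()
}
```

| For pure ASCII input the two counts agree, which is why byte
| arrays are enough for AoC. Note that scalar values are still
| not user-perceived characters; grapheme clusters need a
| dedicated library.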
| kristoff_it wrote:
| Also here is a blog post that gives more info on how to
| deal with unicode in Zig.
|
| https://zig.news/dude_the_builder/unicode-basics-in-zig-dj3
| chubot wrote:
| What do you need to do with them? All my data is UTF-8, and
| low level code is generally parsing, which doesn't involve
| any of those characters. It generally just works with all
| special characters (e.g. on my blog).
|
| I think Unicode on the server or CLI is very different than
| Unicode on the desktop/GUIs.
|
| Since Zig interfaces well with C, it should be set up well
| for the harder desktop case, because the "real" Unicode
| libraries handling all the corner cases are written in C
| (or have C interfaces). I don't think even Python's unicode
| support is up to the task for desktop apps.
| dmitriid wrote:
| > What do you need to do with them? All my data is UTF-8,
| and low level code is generally parsing, which doesn't
| involve any of those characters.
|
| For example, file names on macOS are Unicode. Depending
| on how low level your code is and what parsing it does,
| you _will_ run into issues because UTF-8 will not save
| you.
|
| > I think Unicode on the server or CLI is very different
| than Unicode on the desktop/GUIs.
|
| This is a nonsensical statement. It's the same Unicode.
| Yes, you won't have all the same use cases as a GUI, but
| Unicode is the same.
|
| The best example I have for when things are not handled
| is not strictly Unicode-related. WiFi SSIDs are 32
| octets. That is _any byte value at all_. Recent standard
| modifications allow devices to specify whether the SSID
| is in UTF-8.
|
| And yet. Too many devices assume that, basically, "All
| my data is UTF-8, and low level code is generally
| parsing, which doesn't involve any of those characters",
| and Internet is full of people asking questions like
| "cannot connect to WiFi with non-English SSID"
| llimllib wrote:
| There is the `std.unicode` module[1] which provides
| standard unicode functions (encode, decode, length,
| iteration over code points), so I don't think it's fair to
| say that the language's library lacks strings in any real
| sense.
|
| I will re-emphasize that I've not used it, so I cannot
| speak for its quality.
|
| [1]: https://github.com/ziglang/zig/blob/master/lib/std/unicode.z...
| typon wrote:
| While the standard library documentation is nonexistent, using
| grep on it and just reading through it is very easy, compared to
| almost any other language I have used.
|
| I would actually say this is preferred: it's early days, so the
| documentation can't go out of sync because it doesn't exist, and
| library maintainers are incentivized to write understandable
| code, which most people who are getting into the language are
| forced to read, creating a consensus of what is considered
| idiomatic in the community.
| zppln wrote:
| I've also found the tests for the standard library pretty
| useful when digging around trying to figure out how to use
| stuff.
| kristoff_it wrote:
| Yep, and we also encourage this, if you open the (incomplete,
| buggy) autogenerated doc for the standard library, you get a
| banner at the top that links you to a wiki page that explains
| how the standard library is structured.
|
| https://github.com/ziglang/zig/wiki/How-to-read-the-standard...
| kaba0 wrote:
| How does it compare to other C-replacement languages, like Beef?
| adamrezich wrote:
| > For loops are a bit strange too - you write for (items) |item|
| {}, which means you specify the container before the per-element
| variable. Mentally I think of for as for something in many_things
| {} and so in Zig I constantly had to write it wrong and then
| rewrite. Also you use the | character in Zig quite a lot, and
| while this may just be a problem with Apple UK keyboards,
| actually getting to the | character on my laptop was
| uncomfortable. When doing C/C++ or Rust, you use the | character
| much less and so the pain of writing the character was something
| I never noticed before. Jonathan Blow has gone on the record to
| say that with his language, Jai, he spent a lot of time working
| out how easy it would be to type common things, such that the
| more common an operation in the language, the easier it would be
| to type.
|
| I've written much more "Jai" than Zig but this is one of the
| things that stuck out to me the most in Zig's syntax as being
| strange. in "Jai", for loops iterate over arrays, ranges (0..10),
| or anything else that has a for_expansion defined for it.
| implicitly, "it" is the iterator and "it_index" is the index. to
| for-loop over an array, you simply write
|
|     foos: [..] int;
|     for foos { /*...*/ }
|
| if you don't want to use it and it_index, likely because you're
| nesting loops, you write
|
|     for foo: foos { }
|     for foo, foo_index: foos { }
|
| this has some very nice properties in addition to relative
| terseness: when you want to iterate over something, which is
| something you do all the time in all kinds of contexts, you just
| write "for foos do_something_to_foo(it);" suddenly you find you
| need to use something other than the implicit it/it_index, so you
| just change it to "for foo: foos do_something_to_foo(foo);" maybe
| when you're "sketching out" your program code, "foos" is just an
| array of ints, but as you flesh things out further, you realize
| you want it to be a custom data structure with additional
| information that can nonetheless be iterated over as if it were
| still an array. you simply write a for_expansion for the new data
| structure:
|
|     Foo_Storage :: struct {
|         items: [..] int;
|         additional_info: string;
|     }
|
|     for_expansion :: (using storage: *Foo_Storage, body: Code,
|                       flags: For_Flags) #expand {
|         for `it, `it_index: items {
|             #insert body;
|         }
|     }
|
|     foos: Foo_Storage;
|     for foos { /*...*/ } // the loop interface remains unchanged
|
| I completely agree with the author here in that I appreciate this
| approach as opposed to Zig's, with regards to making it as easy
| as possible to write basic constructs ("loop over some stuff")
| that you're going to be writing a lot, in a lot of different
| places, in a lot of different contexts, all the time, constantly.
| this is the one area in which this language and the design ethos
| behind it is completely different from Zig and other
| contemporaries--it balances power and simplicity with developer
| ergonomics quite nicely.
| balaji1 wrote:
| I got into Rust by working on Advent of Code 2021. The problems
| seem arbitrary, repetitive and sometimes unnecessarily hard. But
| they are well-designed for starting on a new language. We are
| forced to repeatedly use basic concepts of a language, so that is
| useful to get a few reps in on a new language. We are also forced
| to build utils that can be used a few times.
|
| And if you challenge yourself to solve the problem as quickly as
| possible so as to see where the story leads, you can stay
| motivated to work thru the problems. Helps if you have a friendly
| competition going with a few friends.
| dureuill wrote:
| > Try and read a moderately complex Rust crate and it can be mind
| boggling to work out what is going on.
|
| I do that all the time, even reading the source of the std,
| something that I cannot do sanely in C++. IME Rust code is easy
| to read, with symbols that are always either defined in the
| current file, imported, or referred to by their full path.
| skrtskrt wrote:
| Macros are extremely hard to grok and so many use such
| short variable names that it looks like absolute gibberish.
|
| They also look so different than normal Rust code. Python
| metaprogramming still looks exactly like Python, for example.
| dagmx wrote:
| Agreed. This was my first AOC, and I did every day in rust
| (except for one that I did by hand).
|
| Multiple times I'd go look at the source of a data structure
| and it reads very easily. I'd even share my code with friends
| and coworkers who weren't familiar with Rust (so we could
| compare..they were most familiar with Python). Not only could
| they easily grok my code, I showed them how docs.rs lets you
| easily see source. All of those that looked, could read it
| easily with some explanation from me on traits, pattern
| matching and generics.
|
| I think it's obviously a subjective thing...but I very much
| disagree with the author that idiomatic Rust is difficult to
| read or comprehend.
|
| In fact, I find Rust easier to grok, because I need to keep
| less in my head at any given time. Function bodies become
| almost self contained, without me having to think about lots of
| details like errors and return validity etc...
| nu11ptr wrote:
| Agreed, when I see comments like this I tend to think they
| haven't spent much time using the language. It takes a while,
| but after a month or so you can read just about any Rust code.
| Honestly, feels like a much simpler language in day to day
| usage than say a language like Scala (just an example) to me.
| oxymoron wrote:
| I also agree with this sentiment, although there are some
| examples of really weird meta programming that remains opaque
| to me. For instance, I'm able to use `warp` as a framework,
| but the use of things like type level peano arithmetic is
| mostly incomprehensible to me at the moment. I also find that
| I run into Higher Rank Trait Bounds so rarely that I have a
| poor grasp of it (which might be as intended). All that to
| say that there are some odd corners of the language, given
| that I've been using it for five years now and as my main
| professional language for three years.
| tialaramex wrote:
| To be fair, the thing that makes a working C++ standard library
| unreadable is also a hazard in understanding Rust's std.
| Macros. The macros in a C++ standard library are horrible,
| because it is here that essential compliance and compatibility
| are squirreled away, and because the C++ macros aren't hygienic
| they're bigger than they'd otherwise need to be (e.g. you
| mustn't call it foo, say __x5_foo instead). But while they're
| far more readable on their own terms, the Rust macros littering
| std do mean it's harder to see how say, a trivial arithmetic
| Trait is implemented on u32 because a macro is implementing
| that trait for all integer types.
|
| A macro-pre-processed std might be easier for the non-expert
| rustacean to grok even though it isn't the canonical source.
|
| The symbol thing is pure insanity, machines have no problem
| knowing what symbol8164293 refers to, but humans can't get that
| right, and programming languages, including in theory C++ are
| intended for humans to write.
| khuey wrote:
| The thing that makes the C++ standard library source
| difficult to understand in my experience is heavy usage of
| templates and very deep inheritance chains.
| pcwalton wrote:
| The _Weird_identifier_naming_convention that the STL has to
| use to avoid colliding with potential user-defined macros
| doesn't help either.
| travisgriggs wrote:
| Computer language inventors are torn between a voice whispering
| "use the language Luke" and a more gravelly "let your
| feelings for the compiler grow, embrace the syntax side."
|
| I did a two-day sprint through Zig a month ago and really really
| liked it. It has some quirks that I would have done differently,
| but overall I think it's pretty cool. I just need a good small
| scale project to try it out on now.
|
| My favorite example of the "use the language" ethos is the way
| Zig does generics. I have hated generics/templates in every
| language I use them in. They are the gordian knot of confusion
| that most languages in pursuit of some sort of unified type
| theory impale themselves on (yes, I'm mashing metaphors here).
| But Zig just uses its own self to define user types and generics
| come along for free. It resonates (with me at least) as a
| realization of what Gilad Bracha has shared about how to reify
| generics [1].
|
| [1] https://gbracha.blogspot.com/2018/10/reified-generics-
| search...
| skybrian wrote:
| It looks very nice and I look forward to using it.
|
| A downside for package authors (once Zig gets to the point of
| having a package ecosystem) will be that syntax errors are
| checked only with the comptime parameters that are actually
| used at call sites. To maintain compatibility, authors of
| public APIs will need tests with commonly-used comptime
| parameters.
|
| It seems like a good way to avoid lots of type-level complexity
| in the language, though.
| sekao wrote:
| The way zig does generics is brilliant, and made me think "this
| is how every language should do it". One of those things that
| seems obvious in retrospect: just pass the types as normal
| parameters, and for generic data structures just represent
| their type as a function that receives those parameters and
| returns a new type. Man that's just beautiful. Only downside
| is, i suppose the types can never be inferred so they always
| need to be explicitly passed. But being explicit seems to be
| their thing anyway.
| naasking wrote:
| Zig's approach isn't unique, it's basically what dependently
| typed languages do. The problem is that if you don't do it
| right, it's too expressive and the type checker can loop
| forever.
|
| Of course, this is also true of C++ templates since they're
| Turing complete, but it's not true of generics in most
| languages.
| rackjack wrote:
| This isn't a gripe directed at you specifically.
|
| I've noticed that when somebody says they like a growing
| language (Rust, Zig, ...) for this or that feature, people
| often come out of the woodwork to claim that it isn't
| unique, that some research project did it 5 years ago, etc.
| etc.
|
| First, that's not even what they were saying. They like a
| feature of the language, they weren't making a claim about
| the novelty of it.
|
| Second, even if they did erroneously make a claim about the
| feature's novelty, I think the theory type of people
| dismiss these sorts of languages too readily. Yes, somebody
| did it before. But it is usually very hard to bring these
| features into the mainstream or even adjacent to it.
|
| It's just annoying when we're appreciating a language and
| the work that went into it and somebody pops their head in
| and says "Ackshually, Joe Gringoff published a paper in '95
| detailing that exact thing, so it isn't anything new." Like
| I'm trying to enjoy "Samson and Delilah", I'm not really
| thinking about who did that kind of lighting first, so why
| are you using the lack of novelty to diminish the effort
| put into the lighting? If you want to say, "Fun fact,
| Giorno Capucelli was the first one to popularize that kind
| of lighting!" Then that's cool, but instead these people
| always use these facts to diminish something else instead
| of enhancing it. Just let me enjoy their handiwork!
| Mathnerd314 wrote:
| I argue to the contrary that it's quite disappointing to
| see features developed without referencing previous work.
| The annoying "Actually, so-and-so" is really just a
| symptom of the general lack of citations.
|
| Maybe it's possible to design a language de novo without
| being aware of previous designs in the space, as Zig
| seems to be doing, but it seems quite dangerous - one
| wrong step and you've locked in a bad design. Whereas
| using proven designs (as Rust claims to be doing in their
| FAQ) is at least making use of some form of validation.
| dnautics wrote:
| > the type checker can loop forever
|
| Oversimplifying: Zig anyways gives you "compiler branching
| tokens" so your type system can't loop forever, if I'm not
| mistaken
| uh_uh wrote:
| Kind of how Ethereum uses "gas" to make sure that
| bad/lazy actors won't waste CPU cycles when running smart
| contracts.
| dnautics wrote:
| sure, but you aren't going to run out of compiler tokens
| unless you are trying to do something really crazy, like
| generate a precompiled table of prime numbers, as I have
| done. And I don't think you use them up in general, just
| when your compiler is taking certain branching operations
| (i don't actually know what the rules are)... I _believe_
| you can have a long program that consumes as many tokens
| as a short program, if their typesystems are the same and
| you don't ever use compile-time branches in the
| functions you're writing.
| jules wrote:
| Dependent types go beyond what Zig does by removing the
| distinction between comptime variables and runtime
| variables (so types can depend on runtime variables). Zig
| goes beyond dependent types in the sense that the
| comptime/runtime distinction allows Zig to handle all
| comptime values at compile time, which is important for
| efficiency. It would be interesting to combine the two
| approaches, via partial evaluation or staging.
| wk_end wrote:
| > but it's not true of generics in most languages
|
| not sure how true that is.
|
| * TypeScript:
| https://github.com/microsoft/TypeScript/issues/14833
|
| * Rust:
| https://sdleffler.github.io/RustTypeSystemTuringComplete/
|
| * Haskell (GHC): https://mail.haskell.org/pipermail/haskell
| /2006-August/01835...
|
| * Scala: https://michid.wordpress.com/2010/01/29/scala-
| type-level-enc...
|
| * Java: https://arxiv.org/abs/1605.05274
| jstimpfle wrote:
| I've grown used to the idea that generics (data structure
| macros, C++ templates...) aren't that useful. If I find myself
| in a situation where I'm thinking of a solution that involves
| generics, I stop and ponder what is actually the essence of the
| repeated stuff. It rarely is on the syntactic level, often it
| runs deeper. Probably the commonalities can be distilled into a
| data structure.
|
| Simple example: Intrusive linking headers (e.g. Linux kernel
| list.h). While those can benefit from an optional very very
| thin generics layer on top, essentially the code for linking
| e.g. nodes in a tree should be the same regardless of the data
| structure where the headers are embedded.
|
| Getting this right simplifies the code but can also speed up
| compile times.
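|
| A minimal C sketch of the intrusive-link idea described above,
| loosely modeled on the style of the Linux kernel's list.h. The
| names list_node, list_push_back, CONTAINER_OF and struct job are
| made up for illustration and are not the kernel's actual API. The
| linking code only ever touches list_node, so it is compiled once
| no matter which struct embeds the header.

```c
#include <stddef.h>

/* Link header embedded inside user structs (hypothetical names). */
struct list_node {
    struct list_node *prev;
    struct list_node *next;
};

/* Recover the enclosing struct from a pointer to its embedded node. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty circular list is a head node that points at itself. */
static void list_init(struct list_node *head) {
    head->prev = head;
    head->next = head;
}

/* Type-agnostic linking: the same code serves every embedding struct. */
static void list_push_back(struct list_node *head, struct list_node *node) {
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Example user type; only the accessor below knows about it. */
struct job {
    int id;
    struct list_node link;
};

/* Sum the ids of all jobs on the list (usage illustration). */
static int sum_job_ids(struct list_node *head) {
    int total = 0;
    for (struct list_node *n = head->next; n != head; n = n->next) {
        total += CONTAINER_OF(n, struct job, link)->id;
    }
    return total;
}
```

| The "very thin generics layer" would amount to little more than
| the CONTAINER_OF cast; everything else is shared.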
| ncmncm wrote:
| Generics are the way that useful semantics can find their way
| into useful libraries. A collection of useful libraries is
| what is needed to make a language useful.
|
| This is why C++ usage is still growing fast: almost every new
| feature makes it possible to write more powerful libraries
| that get, thereby, easier to use.
| jstimpfle wrote:
| This is a blanket assertion. Why should useful semantics
| require generics? Why can't they come with simple fixed
| data structures? If you can provide a nice counterexample
| to the example that I gave, this would be more convincing.
|
| Absent the absolute necessity of generics to support
| essential semantics, I'm probably in favour of a simpler
| generics-less version. I don't see why "more powerful
| libraries" get automatically easier to use. It could often
| be the opposite.
|
| There is a tradition in some C++ subcultures where they try
| to cram in as many invariants as possible into types,
| concepts, templates, etc. at all costs. I tend to think
| that if all this heavy machinery is needed, the
| functionality might be too complicated from the start.
| ncmncm wrote:
| If you were writing a library, and a language feature
| made it possible to make your library easier to use
| safely, why would you not?
|
| You might as well say, "I don't see why a more expensive
| dinner has to be tastier."
|
| It is always easy to make libraries that are hard to use,
| in any language, but those do not become popular if there
| are better alternatives.
|
| We need generics because users have types that they need
| libraries to work with. Your own example, of an
| intrusively-linked list, illustrates this: without
| generics, you could not write a linked-list library
| component that would be usable for that. This is why C
| programs are crammed with so many custom one-off hash
| tables: You cannot code a useful hash table library in C.
|
| Libraries start simple, and accumulate your "heavy
| machinery" as they are made more useful and usable for
| their growing family of users. A library that lacks users
| does not.
| jstimpfle wrote:
| It's more like, "I don't see why a tastier dinner has to
| be more expensive". It's a tradeoff situation, and while
| I might be willing to buy a great $50+ meal from time to
| time instead of just a $20 one, I don't have any
| inclination to pay $1000 even if the meal is a tiny
| little bit better than the just-great one.
|
| It's a matter of tradeoffs. Library design is a balancing
| act.
|
| > We need generics because users have types that they
| need libraries to work with.
|
| This is right where I become sceptical. Most libraries
| shouldn't care for the user's types at all. They should
| expose their own types so you can work the library - not
| the other way around.
|
| Please provide an actual use case where the library has
| to "know" about the user's types (and please don't
| mention std::sort. It's as common an example as is "Dog
| :: Animal" examples to argue for class inheritance, and
| is just as irrelevant).
|
| > Your own example, of an intrusively-linked list,
| illustrates this: without generics, you could not write a
| linked-list library component that would be usable for
| that.
|
| As I mentioned, linked list can tangentially benefit from
| a very very thin generics layer on top of a fixed data
| structure implementation. The layer does nothing more
| than "instantiate" the types, but there is no code
| generated. But then again, the added convenience / safety
| is minimal, I once wrote a C++ implementation that I was
| quite happy with, and never used it.
|
| > This is why C programs are crammed with so many custom
| one-off hash tables
|
| There are probably not huge issues with a design like
| "HashTable ht = { .hash = &foo_key_hash, .equals =
| &foo_key_equals }" (if quick results are what you're
| after), but yes it is nice to have a few lines of type-
| safe wrap generated over some generic container
| interface.
|
| An alternative reason why you'll find a good amount of
| custom hashtables in C code bases is, I suppose, that
| there are a lot of different ways to implement hash
| tables. Also writing a little code here might not be so
| bad. I've had 1 use case for a "hash" table in 2021
| (glyph hash table for a new iteration of my font
| rendering code) and I've gotten away without even
| implementing it - the for-loop I've written has never
| shown up in any performance profile.
|
| Considering that container data structures are probably
| the highest profile application for generics, I'm still
| not convinced that generics are needed in a systems
| programming language...
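|
| One way the function-pointer design quoted above could look in
| C, as a rough sketch. Everything beyond the .hash and .equals
| fields is an assumption made for illustration: the fixed capacity,
| linear probing, the ht_put/ht_get names, and the string key
| callbacks. It never resizes and does not support removal.

```c
#include <stddef.h>
#include <string.h>

#define HT_CAPACITY 64  /* fixed size keeps the sketch short; no resizing */

typedef struct {
    size_t (*hash)(const void *key);
    int (*equals)(const void *a, const void *b);
    const void *keys[HT_CAPACITY];  /* NULL marks an empty slot */
    void *values[HT_CAPACITY];
} HashTable;

/* Linear probing: returns the slot holding key, else the first empty
   slot, else HT_CAPACITY when the table is full and key is absent. */
static size_t ht_find_slot(const HashTable *ht, const void *key) {
    size_t start = ht->hash(key) % HT_CAPACITY;
    for (size_t i = 0; i < HT_CAPACITY; i++) {
        size_t slot = (start + i) % HT_CAPACITY;
        if (ht->keys[slot] == NULL || ht->equals(ht->keys[slot], key)) {
            return slot;
        }
    }
    return HT_CAPACITY;
}

/* Inserts or overwrites; returns 0 on success, -1 if the table is full. */
static int ht_put(HashTable *ht, const void *key, void *value) {
    size_t slot = ht_find_slot(ht, key);
    if (slot == HT_CAPACITY) {
        return -1;
    }
    ht->keys[slot] = key;
    ht->values[slot] = value;
    return 0;
}

/* Returns the stored value, or NULL when key is absent. */
static void *ht_get(const HashTable *ht, const void *key) {
    size_t slot = ht_find_slot(ht, key);
    if (slot == HT_CAPACITY || ht->keys[slot] == NULL) {
        return NULL;
    }
    return ht->values[slot];
}

/* Key callbacks for NUL-terminated strings (FNV-1a style hash). */
static size_t str_key_hash(const void *key) {
    size_t h = 2166136261u;
    for (const unsigned char *p = key; *p; p++) {
        h = (h ^ *p) * 16777619u;
    }
    return h;
}

static int str_key_equals(const void *a, const void *b) {
    return strcmp(a, b) == 0;
}
```

| Values come back as void*, so the caller still casts; that cast
| is the part a type-safe generated wrapper would hide.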
| tsimionescu wrote:
| > This is right where I become sceptical. Most libraries
| shouldn't care for the user's types at all. They should
| expose their own types so you can work the library - not
| the other way around.
|
| This is an utterly bizarre statement. Should people
| convert collections of Xs from library 1 to other kinds
| of collections of Ys to work with library 2?
|
| A very basic use case that shows up all the time
| everywhere, especially in systems programming, is that I
| have an array of items and I want to pass it to some
| library. But, to work with an array, you need to know the
| size of elements of the array, so that arr[x] knows how
| to compute the address of x. There are exactly 3 ways to
| achieve this:
|
| 1. The library only accepts arrays of some specific type
| (e.g. int[]). If you have some other type, you have to
| find some way of converting to the type known in the
| library - in O(n) time.
|
| 2. The library expects arrays of some fixed-size type
| that can hold any value (e.g. void*[]). This typically
| adds huge overhead, since now the elements of the array
| are not together in memory, they are scattered all over
| the place; and the library must fetch them from memory
| before doing anything with them.
|
| 3. Generics - the library accepts T[], and can easily do
| sizeof(T) to know how to access arr[x].
|
| Even qsort is actually an example of C's convoluted
| support for generic arrays: you're explicitly passing the
| array base and its type (the number of bytes) to qsort.
| C's extremely basic type system anyway essentially
| identifies types with byte sizes in practice, so it's not
| obvious that qsort works like this (and of course, it's
| less safe, since instead of passing an int*, you have to
| pass void* + sizeof(int), potentially getting them mixed
| up).
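|
| For reference, a small C sketch of the qsort() call shape
| described above: the caller passes the array base, the element
| count and the element size in bytes, plus a comparator over
| untyped pointers. struct point, compare_by_x and sort_points_by_x
| are illustrative names only.

```c
#include <stdlib.h>

struct point {
    int x;
    int y;
};

/* qsort hands the comparator two untyped element pointers. */
static int compare_by_x(const void *a, const void *b) {
    const struct point *p = a;
    const struct point *q = b;
    return (p->x > q->x) - (p->x < q->x);
}

/* The element size is the only type information qsort receives. */
static void sort_points_by_x(struct point *points, size_t count) {
    qsort(points, count, sizeof(struct point), compare_by_x);
}
```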
| jstimpfle wrote:
| > A very basic use case that shows up all the time
| everywhere, especially in systems programming, is that I
| have an array of items and I want to pass it to some
| library.
|
| My thinking is this. Either a library function cares
| about what type of data you pass to it because it is
| designed to do some operation on it - then it will accept
| only a specific type, like "void func(Foo *foos, int
| count)". Or, the function does not care about the type of
| data but just wants to send it somewhere else. In this
| case, you can pass a void-pointer + size to it.
|
| Interfaces like sorting functions can be seen as
| exceptions under the umbrella term "container library".
| Like the qsort() example, usually all you need is a few
| metrics for the data type - size, maybe alignment - and
| at most a few "method calls" - equality, simple get/set
| operations - and that's it. For all I can say these
| situations are not very common in systems programming, at
| least not for what I do. This stuff tends to come up in
| application level programming. It's definitely very
| common in my Python scripts. When it does come up in
| systems occasionally, a vtable struct is a good way to
| deal with the situation generically. In the qsort() case,
| the function pointer can be passed directly.
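|
| A hedged sketch of what such a "metrics plus a few method calls"
| vtable struct might look like in C. The struct elem_ops layout
| and the array_index_of helper are assumptions for illustration,
| not taken from any particular library.

```c
#include <stddef.h>

/* Per-type metadata and operations (hypothetical layout). */
struct elem_ops {
    size_t size;                                  /* bytes per element */
    int (*equals)(const void *a, const void *b);  /* nonzero if equal */
};

/* Type-agnostic search: returns the index of needle, or count if absent. */
static size_t array_index_of(const void *base, size_t count,
                             const struct elem_ops *ops, const void *needle) {
    const char *bytes = base;
    for (size_t i = 0; i < count; i++) {
        if (ops->equals(bytes + i * ops->size, needle)) {
            return i;
        }
    }
    return count;
}

/* One small table of operations per element type. */
static int int_equals(const void *a, const void *b) {
    return *(const int *)a == *(const int *)b;
}

static const struct elem_ops int_ops = { sizeof(int), int_equals };
```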
|
| > 1. The library only accepts arrays of some specific
| type (e.g. int[]). If you have some other type, you have
| to find some way of converting to the type known in the
| library - in O(n) time.
|
| Considering how computers work, I don't think it is much
| of a constraint to assume that arrays are of the form
| "Foo *foos". Coincidentally C _does_ have builtin
| "generics" for arrays of things, and real machines do
| have assembly instructions to quickly compute addresses
| based on hardcoded element size.
|
| > (and of course, it's less safe, since instead of
| passing an int*, you have to pass void* + sizeof(int),
| potentially getting them mixed up).
|
| There is little real chance of getting them mixed up. If
| you actually run the code you will notice.
| ncmncm wrote:
| You miss the point. You can get more money for the dinner
| only if you deliver more. Your $20 meal delivers more
| than your $2 meal. If you can only deliver the $2 meal,
| you aren't getting $20 for it.
|
| I have already cited two separate examples, which you
| have studiously ignored. Look online and find them in
| their thousands. It is hard to find even one C++ library
| that is not made better by its ability to integrate with
| users' choices of types.
| jstimpfle wrote:
| Updated to not ignore your examples.
| dmitriid wrote:
| > This is a blanket assertion. Why should useful
| semantics require generics? Why can't they come with
| simple fixed data structures?
|
| What do you mean by "fixed data structures"?
|
| > If you can provide a nice counterexample to the example
| that I gave
|
| I may be missing something but I don't understand your
| example. The idea of a list is that we want it to hold
| any data, so the API to a list by necessity becomes
| generic. Unless you cast everything to void*.
|
| ---
|
| I have another counterexample from userspace. Our
| internal APIs all use the same conventions which make
| working with them predictable and same from any language.
| An API returns:
|
|     {
|       result: [a list of data],
|       nextPageToken:
|     }
|
| Different APIs will of course have different data. Some
| API will return a list of contracts. Another API will
| return a list of media. A third API will... etc.
|
| I had the misfortune of working with these APIs in Go
| before generics. Welcome to copy-pasting code that deals
| with this. Or cast from interface{}/void* to proper data
| in runtime. Any sane language of course lets you write
| something like GetAllData<T>()
| tsimionescu wrote:
| > the code for linking e.g. nodes in a tree should be the
| same regardless of the data structure where the headers are
| embedded.
|
| Why should that be so? It's particularly NOT true for one of
| the simplest and most efficient data structures in any
| program: the array. The code that needs to work with an array
| needs to know exactly the size of each element of the array,
| so you basically can't have an efficient language that
| doesn't support generic arrays. Haskell maybe sometimes gets
| by with a very smart compiler and heavy use of laziness, but
| even C has support for generic arrays.
| pcwalton wrote:
| How do you write a generic quicksort function without
| generics?
|
| There are two ways I know of: (1) throw type and memory
| safety out the window and use void pointers plus
| size/alignment like qsort(3) does; (2) require that users
| manually write an interface with a swap(int i, int j)
| function like Go before generics does. Both solutions are
| really bad.
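|
| A rough sketch of approach (2), written in C rather than Go and
| using insertion sort instead of quicksort to stay short. The
| struct sort_iface type and its less/swap callbacks are
| hypothetical; the point is only that the algorithm sees indices,
| and every element type has to supply its own pair of callbacks.

```c
#include <stddef.h>

/* Caller-supplied operations over element indices (hypothetical names). */
struct sort_iface {
    void *ctx;
    int (*less)(void *ctx, size_t i, size_t j);
    void (*swap)(void *ctx, size_t i, size_t j);
};

/* Insertion sort written purely against the interface. */
static void sort_by_iface(const struct sort_iface *s, size_t count) {
    for (size_t i = 1; i < count; i++) {
        for (size_t j = i; j > 0 && s->less(s->ctx, j, j - 1); j--) {
            s->swap(s->ctx, j, j - 1);
        }
    }
}

/* The boilerplate each element type has to repeat. */
static int int_less(void *ctx, size_t i, size_t j) {
    int *a = ctx;
    return a[i] < a[j];
}

static void int_swap(void *ctx, size_t i, size_t j) {
    int *a = ctx;
    int tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
}
```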
| mpweiher wrote:
| Use Smalltalk?
| joe_guy wrote:
| Void pointers also add an unfortunate layer of
| dereferencing.
|
| In C# for example, generics are important to keep values on
| the stack instead of the heap, not only avoiding garbage
| collection but improving data locality.
| pcwalton wrote:
| Well, a modern optimizing compiler can get rid of the
| indirection if it inlines qsort() and then inlines the
| comparison function. Of course, it needs to be able to
| see the source of qsort() to do so (which might be a
| problem if it's dynamically linked).
| joe_guy wrote:
| If I'm understanding correctly, you're saying a compiler
| will optimize out "void*" to instead be the value put
| into it, so there's no pointer dereference to get to it?
|
| That's what I am talking about. I don't know much about
| modern C optimizations.
| pcwalton wrote:
| Among other things, yes, compilers will do that.
| joe_guy wrote:
| That seems strange because you'd think you may need the
| guarantee the pointer remains a pointer for an ABI? Or
| does it only do this in very limited circumstances? Do
| you know the term I can use to Google more info about
| this?
| pcwalton wrote:
| It only works if you use link-time optimization or the
| qsort() definition is otherwise available for the
| compiler to see.
| jstimpfle wrote:
| There isn't any pointer to optimize out really, I don't
| think - at the machine level, a pointer is a pointer is
| an integer number regardless of its type at the source
| code level.
|
| What can be measurable though is function call overhead
| - if you write a sort function that takes a function
| pointer and your sort function calls that, there is a lot
| of work for saving the local state, preparing all the
| arguments for the function, etc. That work can
| potentially be avoided if the compiler can inline the
| function call into (a copy of) your sort function.
| tsimionescu wrote:
| In the case of qsort, void* isn't a problem, since sort()
| would always take a pointer, but there absolutely is a
| difference at the machine level between void* and int -
| one is a value, the other is the address of a value. One
| is used for calculations, the other is used for
| calculations OR loading data from memory.
| joe_guy wrote:
| At the machine level there is absolutely a difference.
| one means your data is right here, the other means the
| data is elsewhere. That poor data locality can cause
| cache misses and always consumes extra cycles.
| jstimpfle wrote:
| What is "right there"? If the computer should sort an
| array that is located in main memory, it needs to know
| their addresses (pointers) to even load them. There is no
| way around.
| [deleted]
| jstimpfle wrote:
| I rarely ever do need a sorting algorithm for what I do. I
| might not have done even one sort call in 2021. Usually the
| data I have is sorted by construction, or doesn't need to
| be sorted in any particular order.
|
| When I do need to sort, I just use qsort from libc. It's
| also easy to write my own version of qsort.
|
| I once measured qsort vs std::sort on plain ints (basically
| the most pessimistic case for qsort vs std::sort) and the
| difference was like 2x. If this becomes a bottleneck then
| it's worth investigating how to take advantage of
| additional context to speed up the sorting much more than
| any generic sorting algorithm could do anyway. Simple
| example: bucket sort. (I'm 100% confident sorting
| performance hasn't ever been noticeable in my career so
| far, but I've done simple optimizations like that once in a
| while for fun).
|
| > Both solutions are really bad.
|
| Due to what I described above, I'm actually in favour of
| qsort - no new code is generated, just a simple library
| call that takes a pointer to the comparison function.
| Really really easy to use.
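|
| To make the bucket sort remark concrete, here is a sketch of
| counting sort, the simplest bucket sort with one bucket per
| possible value. It assumes the extra context that every value
| fits in [0, 256); VALUE_RANGE and counting_sort are illustrative
| names.

```c
#include <stddef.h>

#define VALUE_RANGE 256  /* assumed: every value lies in [0, 256) */

/* Counting sort: one bucket per possible value, no comparisons. */
static void counting_sort(unsigned char *values, size_t count) {
    size_t buckets[VALUE_RANGE] = { 0 };

    /* Pass 1: tally how often each value occurs. */
    for (size_t i = 0; i < count; i++) {
        buckets[values[i]]++;
    }

    /* Pass 2: rewrite the array in ascending order from the tallies. */
    size_t out = 0;
    for (size_t v = 0; v < VALUE_RANGE; v++) {
        for (size_t n = buckets[v]; n > 0; n--) {
            values[out++] = (unsigned char)v;
        }
    }
}
```

| This runs in time linear in count plus VALUE_RANGE, which is the
| kind of gain a generic comparison sort cannot offer.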
| tsimionescu wrote:
| qsort is a horrible function, and infamously so. First,
| since it's part of the C stdlib, it is almost universally
| dynamically linked, precluding any chance for it to be
| inlined, and even worse, it calls a function pointer.
|
| Even ignoring performance, it has no support for checking
| the types, which in C means it can easily mess up memory
| badly. For example, I can call `qsort(arrayOfInts, num,
| sizeof(float), funcThatComparesTwoStrings)` and I'll get
| a really bad time, with absolutely no way for the
| compiler to tell me I'm doing something obviously wrong.
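|
| One common C workaround, sketched here under assumptions: a
| macro that stamps out a typed wrapper so the element size and
| comparator are fixed in one place. DEFINE_TYPED_SORT is a made-up
| name. With the wrapper, passing a float array to sort_ints is a
| pointer type mismatch the compiler can diagnose, although the
| raw qsort() underneath is as unchecked as before.

```c
#include <stdlib.h>

/* Generates a typed comparator and sort function for one element type. */
#define DEFINE_TYPED_SORT(name, type)                              \
    static int name##_compare(const void *a, const void *b) {      \
        type x = *(const type *)a;                                 \
        type y = *(const type *)b;                                 \
        return (x > y) - (x < y);                                  \
    }                                                              \
    static void name(type *values, size_t count) {                 \
        qsort(values, count, sizeof(type), name##_compare);        \
    }

/* Each use expands to a separate pair of functions. */
DEFINE_TYPED_SORT(sort_ints, int)
DEFINE_TYPED_SORT(sort_doubles, double)
```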
| dnautics wrote:
| I don't know what you do, but for our codebase, we run
| transactions against the database in most of the
| integration tests. We can't _generally_ be sure the
| database won't return rows in some order it likes that
| aren't the rows we expect -- so if we don't have a
| generic sorting function we will have flaky tests.
| jstimpfle wrote:
| If you can't bring the database to return the values in a
| well defined order, absolutely go ahead and sort it! No
| need for _generics_ though, qsort() will do just fine.
|
| (C probably isn't great for DB interop in any case).
| [deleted]
| dnautics wrote:
| > Everything in Zig is const x = blah;, so why are functions not
| const bar = function() {};?
|
| This may or may not happen:
| https://github.com/ziglang/zig/issues/8383
|
| > Fixing the standard library documentation would be my biggest
| priority if I worked on Zig, because I think that is the only
| thing holding back general usage of the toolchain.
|
| This is a valid concern, but I believe the zig team is
| deliberately holding off on improving the std lib documentation,
| because they are expecting (potentially huge, maybe not? who
| knows) breaking changes down the line. The "stdlib is not
| documented" is a deliberate choice to signal "use at your own
| risk, especially with respect to forwards compatibility".
|
| > there are still quite a few bits of syntatic sugar hiding the
| real cost of certain operations (like the try error handling,
| there is implicit branches everywhere when you use that...
|
| I dunno, that's like saying that `if` hides branching. It's a
| language-level reserved word, you're expected to understand how
| they work under the hood.
| ArtixFox wrote:
| yes, our first priority is stage2, after that, we might deal
| with stdlib. Andrew is going to go through the stdlib before
| the 1.0 release.
| dnautics wrote:
| it's super reasonable to expect language-level stability
| before shoring up the stdlib. I know 'gatekeeping' is a bad
| word sometimes here, but this is soft-gatekeeping, and imo, a
| good thing (for now) to help focus the language.
| ArtixFox wrote:
| Unfortunately yes, it somewhat is, but the devs try to keep the
| source extremely readable. That's not the best thing, but I
| think it's really good and important, because it's the best
| example of good Zig code and might teach you a thing or two; it
| taught me how to write saner and better code.
|
| And the stdlib breaks sometimes, so it's better not to put a
| lot of effort into docs.
| guidorice wrote:
| > I wanted something more like Rust's cargo test that'd find all
| tests and run them. Maybe Zig does have this but I just didn't
| find it?
|
| Try `zig build test`
|
| https://ziglang.org/documentation/master/#Zig-Build-System
| moffkalast wrote:
| Now you can really move zig.
|
| For great justice.
| fmakunbound wrote:
| > and so making getting at heap allocations harder by explicitly
| getting them through an allocator is a great thing.
|
| Did not follow this logic.
| llimllib wrote:
| I think the idea is that heap allocations are costly, so making
| them explicit exposes their cost clearly, and promotes careful
| thought about strategies for heap allocation.
| ModernMech wrote:
| How is that any more explicit or intentional than calling
| malloc though? You have to tell it exactly how many bytes you
| want on the heap.
| formerly_proven wrote:
| Zig makes it supremely easy to use different allocators for
| different pieces so you can do much better than just
| calling malloc everywhere. This enables very easy and
| straightforward use of arena allocators in particular.
| sirwhinesalot wrote:
| If I call some function foo() I have no idea if it does any
| heap allocation internally.
|
| In zig I always know because all functions that allocate
| memory explicitly request an allocator. That also allows me
| to control which allocator they use.
| ModernMech wrote:
| I see, I didn't understand that's what they meant about
| zig.
| flohofwoe wrote:
| For many types of programs it's really, really important to
| have control over allocation behaviour which malloc doesn't
| provide (for instance a per-frame bump allocator, an
| allocator which reuses memory blocks but still works within
| a pre-allocated memory chunk, a specialized allocator for
| GPU memory blocks, etc...).
| anonymoushn wrote:
| It's more explicit to pass an allocator to things that will
| allocate than not to, in the same way that local variables are
| more explicit than global variables. By
| cultural convention, you'll be able to bring your own
| allocators to almost any library, so you won't have to let
| your libraries decide when your program performs heap
| allocations.
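|
| The allocator-passing convention described in this subthread can
| be sketched in C as well. This is a minimal illustration, not
| Zig's actual allocator interface: BumpAllocator, bump_alloc and
| copy_string are hypothetical names, and the sketch assumes the
| caller-provided buffer is itself suitably aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bump allocator over a caller-provided buffer. */
typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} BumpAllocator;

static void bump_init(BumpAllocator *a, void *buffer, size_t capacity)
{
    a->base = buffer;
    a->capacity = capacity;
    a->used = 0;
}

/* Returns NULL when the buffer is exhausted. `align` must be a power
 * of two; the offset (not the address) is what gets aligned. */
static void *bump_alloc(BumpAllocator *a, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->used + (align - 1)) & ~(align - 1);
    if (start > a->capacity || size > a->capacity - start) {
        return NULL;
    }
    a->used = start + size;
    return a->base + start;
}

/* Releases everything at once, e.g. at the end of a frame. */
static void bump_reset(BumpAllocator *a)
{
    a->used = 0;
}

/* The allocator parameter makes the allocation visible in the
 * signature and lets the caller decide where the memory comes from. */
static char *copy_string(BumpAllocator *a, const char *text)
{
    size_t len = strlen(text) + 1;
    char *copy = bump_alloc(a, len, 1);
    if (copy != NULL) {
        memcpy(copy, text, len);
    }
    return copy;
}
```

| A function like copy_string cannot allocate behind the caller's
| back, and swapping the bump allocator for another strategy only
| changes what the caller passes in.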
| JediPig wrote:
| I get the feeling it's a one/two/three-man "I want to write a
| compiler" phase. I've seen dozens of these languages over the
| years. I looked at the language, and without something
| revolutionary, this will die a slow death. I think it is
| experiencing that: main developer(s) are using crowd funding and
| nothing has really been "wow".
|
| I was right about dart and flutter being the next big one.
| However, zig is a dead language in a graveyard of hundreds. It
| tries to revolutionize coding without changing the methodology.
| anonymoushn wrote:
| Were you right about Dart being the next dead language or the
| next language propped up by Google?
|
| We're having a great time using Zig in production. Thanks.
| ArtixFox wrote:
| nice! where are u using zig in production?
| jimbob45 wrote:
| Dart has a lot of really neat features that should have caught
| on in other languages by now. In particular, the Dart ".."
| operator, implicit interfaces on every declared class, and
| mixins really should have made their way into C# and Java by
| now.
| kristoff_it wrote:
| I'd say comptime and having enough features to be better than C
| at using and (cross-) compiling C libraries is somewhat
| revolutionary.
| nick__m wrote:
| To me Zig doesn't seem like a zombie.
|
| * It is actively developed, has frequent releases and it is
| steered toward a stable 1.0
|
| * The crowd funding model proves that some users want that
| language.
|
| * It doesn't try to revolutionize coding, it tries to be a
| better C, hence the wowlessness
|
| Nim [1] and Crystal [2] probably have a higher chance of
| eventually becoming dead languages, but I hope they don't, as
| they are both fun languages.
|
| [1] https://nim-lang.org/ : Nim is a statically type checked
| compiled Pythonesque language with a type system residing
| somewhere between Pascal and Ada.
|
| [2] https://crystal-lang.org/ : Crystal is a statically type
| checked compiled Ruby-like language.
| jasfi wrote:
| I've been using Nim for a few years now. You're right, it is
| fun, by syntax alone. The performance is great, too. The
| development of Nim has been slower than some would like, but
| this has not been a problem for me.
| flohofwoe wrote:
| Anecdotal, but I had a strong and immediate "wow" feeling when
| I first looked at Zig which I didn't have with most of the
| other "better C" languages before. Also, not having a big
| company behind it is a good thing, not a disadvantage (because it
| means that bad design decisions can't be forced on users just
| because of the "it's made by Google or Apple, so it must be
| good" effect - as a real world example for this problem, see
| WebAudio).
|
| PS: Dart being the next big thing, or the next big flop?
| Because from my bubble, Dart and Flutter don't look all that
| popular.
| jmull wrote:
| > main developer(s) are using crowd funding and nothing has
| really been "wow"
|
| It's funny: I think these are both positive signs.
|
| The weight of history tells us that the overwhelming majority
| of new languages will die. So, in that sense, I agree zig
| probably will too. However, I think it's worth separating
| languages into ones that have essentially zero chance to
| survive long-term, and those that have a reasonable non-zero
| chance to succeed (and those in the middle).
|
| To me, zig is the most promising "better C" (which is why I'm
| one of those people crowd-funding zig). Since I think we really
| need a better C, I think zig is on the side of those with a
| reasonable non-zero chance to succeed.
|
| Some things that might get it over the hump:
|
| Zig aspires not just to be a better C, but a better C compiler.
| It includes a first-class C compiler (Clang/LLVM), and adds
| first-class cross-compiling support.
|
| Along with native C integration and a language designed to
| appeal to C programmers, it's potentially low-friction to adopt
| in places where one would go to C now.
|
| That is, it seems to me there is a viable, incremental path
| from C to Zig. (Incremental is important. E.g., even if zig
| succeeds, there won't be a RIIZ movement because there's no
| need. People will write or rewrite as it makes sense for their
| project, not their programming language.)
|
| I get that zig is a work-in-progress, and that it may well not
| succeed. Just that it looks like it has the best chance to me
| (or maybe it's just that I like the approach it is taking).
| ncmncm wrote:
| All it needs to be actually useful is destructors. And,
| constructors.
|
| I am always amazed when a new language omits destructors. Sure,
| CS assignments don't need them, but out here we have real
| resources, not just memory, to manage.
| kristoff_it wrote:
| There are plenty of great languages with those features. Zig
| brings a different mindset to programming, so you need to empty
| your cup of tea before being able to enjoy Zig.
|
| https://ashidakim.com/zenkoans/1acupoftea.html
| flohofwoe wrote:
| Most of the problems that C++ added on top of C are caused by
| RAII though (e.g. a too rigid coupling of code and data). I
| think Zig made the right decision with the 'defer' keyword,
| even if it requires some change of perspective.
| [deleted]
| ncmncm wrote:
| By "problems" I guess you meant usefulness: "Most of the
| usefulness that C++ added on top of C is a product of RAII."
| Because usefulness is why C++ is used so much.
|
| And, "too rigid" meaning maximally flexible, likewise.
| flohofwoe wrote:
| That's what I mean by "change of perspective". POD
| structs are entirely fine if a library API is designed from
| the ground up for them. It only becomes a problem when
| trying to write C++ code in C (or Zig in this case).
| jstimpfle wrote:
| The application I'm working on has about 8K lines of bare bones
| C + Win32 + some optional OpenGL currently. It probably has
| about 5 lines of repetitive cleanup code that can't be easily
| folded into a single place (so might benefit from RAII style
| cleanup). If I want to properly release even "static" resources
| so the app can be used as a library (which is not necessary),
| that number might grow to 20 lines.
|
| It's not something I'm sweating about, and I'm happy about all
| the time I saved by not doing "proper" RAII design prematurely,
| which has more ramifications and constraints than one might
| think.
| formerly_proven wrote:
| The optionals story in Zig seems a bit weak to me, because it has
| dedicated syntax to support conditional unwrapping:
|
|     if(optional) |captured_optional| { ... }
|
| `if` actually is three different syntaxes:
|
|     if(expression) {} else {}
|     if(optional) |captured| {}
|     if(optional) |captured| {} else {}
|     if(errunion) |result| {} else |err| {}
|
| The latter is kinda awkward because it looks exactly like the
| optional syntax until the else and you have to know the type of
| the variable to know which is which. Capturing doesn't allow
| shadowing, which makes the optional case awkward.
|
| This is one area that e.g. Kotlin has done better by checking if
| the expression of any if statement implies non-nullity of
| variables and then implicitly unwrapping them, as they can't ever
| be null:
|
|     if(optional != null) { use optional directly }
|
| This works much better for multiple optionals:
|
|     if(optA != null && optB != null) { can use both optA and optB directly }
|
| You can write this in Zig as well, but it results in a sea of
| unchecked .?, while Kotlin will give you compile errors if you
| use an optional without unwrapping that was not implied to be
| non-null.
|
| Or you go multiple levels deep, as the if-optional syntax only
| allows one optional:
|
|     if(optA) |capturedOptA| {
|         if(optB) |capturedOptB| {
|         }
|     }
|
| The error union story is fairly sound so far but one major
| annoyance is that while it composes well for returning errors, it
| doesn't compose well for error handling. You can't do:
|
|     someComplexThing(file1, file2) catch |err| switch(err) {
|         CryptoErrorSet => handle_crypto_error(err),
|         FileErrorSet => ...
|     }
|
| as switch does not support error sets for branches, only error
| values. This seems to me like it incentivizes you to either do
| something like this:
|
|     someComplexThing(file1, file2) catch |err| {
|         if(cryptoErrorToString(err)) |errdescription| {
|             // ...
|         }
|         if(ioErrorToString(err)) |errdescription| {
|             // ...
|         }
|     }
|
| Or just a huge handleAllTheErrorsPls thing.
|
| Errors are also just a value - if you want some extra
| information/diagnostics to go along with an error, you'll have to
| handle that yourself out-of-band.
|
| On errors, Zig doesn't seem to have a strerror for std.os errors
| - awkward.
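|
| The "out-of-band" pattern mentioned above, where the error itself
| is a bare value and the details travel separately, can be sketched
| in C as follows. ParseError, ParseDiagnostics and parse_digits are
| hypothetical names for illustration; the sketch does not check
| for arithmetic overflow.

```c
#include <assert.h>
#include <stddef.h>

/* The error is just a value, as in the comment above. */
typedef enum { PARSE_OK, PARSE_ERR_EMPTY, PARSE_ERR_BAD_DIGIT } ParseError;

/* Extra details travel out-of-band through an optional out-parameter. */
typedef struct {
    size_t offset; /* index of the offending character */
} ParseDiagnostics;

static ParseError parse_digits(const char *text, unsigned *out,
                               ParseDiagnostics *diag)
{
    assert(text != NULL && out != NULL);
    if (text[0] == '\0') {
        return PARSE_ERR_EMPTY;
    }
    unsigned value = 0;
    for (size_t i = 0; text[i] != '\0'; i++) {
        if (text[i] < '0' || text[i] > '9') {
            if (diag != NULL) {
                diag->offset = i; /* filled in only on failure */
            }
            return PARSE_ERR_BAD_DIGIT;
        }
        value = value * 10 + (unsigned)(text[i] - '0');
    }
    *out = value;
    return PARSE_OK;
}
```

| Callers that only care about success or failure pass NULL for the
| diagnostics; callers that want to report where parsing failed
| pass a struct and read it after an error.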
| kristoff_it wrote:
| > you have to know the type of the variable to know which is
| which.
|
| That's not correct. The error version differs from optional by
| the capture on the else branch. The optional version can't have
| it, and all error versions must have both captures. You can
| always tell which case it is just by looking at the code,
| without having to know the types involved.
| formerly_proven wrote:
| Yes, I meant if you're looking at the if() part - you either
| have to know if that's an optional or an error, or go looking
| for the else to see if that captures an error.
|
| > because it looks exactly like the optional syntax until the
| else
| jmull wrote:
| Consider the difference between an optional and an error
| union:
|
| An error union is a value or error.
|
| An optional is a value or null.
|
| The if/else capture syntax follows: "if" captures the
| value. "else" captures the error if there can be one, or
| doesn't capture anything if there can't. That is, you don't
| need (and therefore don't want) a difference between error
| union and optional in the if part of the syntax.
| formerly_proven wrote:
| That's a fair point but doesn't really detract from the
| syntax being awkward for optionals. The if/else for
| errors matters less, at least to me, as there are already
| dedicated control-flow primitives for errors. [1]
|
| [1] Which reminds me that the syntax for
| blocks-returning-values (named blocks) is _really_ awkward.
| That might be the ugliest bit of syntax in the entire
| language:
|
|     const something = if(bar) foo else someblock: {
|         // stuff
|         break :someblock value_to_return;
|     };
| dnautics wrote:
| write shorter blocks so you can see the else? If the ifs
| get too nested, encapsulate them in functions? Buy an extra
| monitor and mount it vertically (or diagonally)?
___________________________________________________________________
(page generated 2021-12-27 23:00 UTC)