[HN Gopher] Zig's multi-sequence for loops
___________________________________________________________________
Zig's multi-sequence for loops
Author : kristoff_it
Score : 213 points
Date : 2023-02-27 13:55 UTC (9 hours ago)
(HTM) web link (kristoff.it)
| AnIdiotOnTheNet wrote:
| Ranges for `for` loops were a long time coming. Some C programmers
| just completely lost their shit at having to use a while loop for
| some reason. It was such a common complaint that we invented
| stuff like the following as a joke, that of course became
| something people actually used because internet gonna internet:
| for(@as([10]void, undefined)) |_, idx| { _ = idx; }
|
| Which, for those unfamiliar with Zig, has `for` iterate over an
| 'array' of 0-bit values and capture the index while throwing out
| the value (which would always be 0 because that's the only value
| a 0-bit type can represent).
|
| The implementation of this proposal brings with it the additional
| benefit of making index capturing less mysterious.
| gonzus wrote:
| I suspect the real reason behind "some C programmers losing
| their shit" was being forced to introduce a new variable into
| the outer scope, or to use extra braces around your loop --
| both are distasteful hacks, according to some.
| Decabytes wrote:
| I've been curious about Zig. I find its cross compilation story
| using zig cc interesting. I like its focus on simplicity instead
| of debugging your knowledge of the language. On the surface it
| looks like a better C, that isn't as complicated as Rust.
|
| I'll admit though the syntax is a little off-putting. But that is
| a minor complaint. I know it's not 1.0 and there is still lots
| to do, but I'm curious if they will do more for memory safety. With
| companies trying to avoid starting new code in memory unsafe
| languages if they don't have to, I wonder if that will hurt Zig
| adoption. Right now it just seems like their approach is, reduce
| Undefined Behavior as much as possible, make the "correct" way of
| programming the easiest, and have array bounds on by default.
| Will this be enough to make the language "memory safe" enough?
| throwawaymaths wrote:
| If zig includes tags or annotations (there are a few proposals
| in the issues tracker) and surfaces this information at an
| exportable zir level, it seems likely that data provenance
| tracking (this includes memory safety and file descriptor,
| socket closing, thread spawn/despawn etc) would be able to be
| checked by a static analysis system. If zig supports compiler
| hooks, then it could conceivably halt compilation unless these
| invariants are satisfied.
|
| I'm not convinced that this _needs_ to be in the type system.
|
| Nonetheless it's not there yet.
| matu3ba wrote:
| > If zig supports compiler hooks, then it could conceivably
| halt compilation unless these invariants are satisfied.
|
| Can you sketch out why hooks are necessary for proposals
| like 14656? So far I have not read a justification of what
| coupling it to the compiler provides in terms of
| performance gains on concrete examples. Afaik, Rust has
| lifetime checks decoupled to parallelize compilation, and I
| have not seen research or a list of the possible
| optimizations with synthetic benchmark results.
|
| > I'm not convinced that this needs to be in the type system.
|
| If you want to prevent static analysis semantics becoming an
| incompatibility mess like C due to 1. unknown and 2.
| untrackable tag semantics, then you have to provide at
| least an AST to ensure annotations have known + unique
| semantics, as I explain in
| https://github.com/ziglang/zig/issues/14656#issuecomment-143....
| I would say that this is a
| primitive quasi-type check decoupled from main compilation
| and could be kept very flexible for research experiments by
| having a URI + multihash (so basically what Zig uses for
| packages).
|
| More concretely: Stuff to look out for is RefinedC and
| related projects to have more strict C semantics + how to
| compose those things outside the regular Zig type system.
| throwawaymaths wrote:
| The only advantage to coupling to the compiler is that it
| gives a tighter feedback loop for the developer, versus,
| say, statically checking as a part of CI.
| rwbt wrote:
| I would suggest you check out Odin [0]. It's very similar to
| Zig but has much better ergonomics and probably the closest to
| a 'Better C' replacement in my experience. It does array bounds
| checking by default (which you can turn off if you choose to)
|
| [0] - https://odin-lang.org
| brundolf wrote:
| I would guess there will always be a niche for languages that
| make C's overall set of trade-offs. Rust will shrink that
| niche, but it'll still be there. I see Zig as targeting that
| niche specifically: "we can strictly improve on C in a whole
| bunch of ways, without changing the fundamental bargain"
| throwawaymaths wrote:
| There is a market for statically checked C, as evidenced by
| the existence of something like seL4 (though to be fair it
| technically checks assembly code).
|
| It seems like statically checked zig has a chance to be
| strictly easier to implement relative to statically checked
| C, and that's something to shoot for, I think, especially
| since it could require very little on the part of the zig
| developers proper.
| Y_Y wrote:
| Is this different from good old `zip`?
| Yoric wrote:
| Apparently, it's zip, except with UB if sizes don't match.
| helloworld23443 wrote:
| Hilarious. I was evaluating Zig, I took a look at Bun,
| probably the most well known Zig project. Multiple issues
| related to seg faults.
| saagarjha wrote:
| Those are safe depending on what kind they are.
| kristoff_it wrote:
| Take a look at TigerBeetle which is also written in Zig, if
| you find segfaults there they even give you money :^)
|
| https://github.com/tigerbeetledb/tigerbeetle
| rosetremiere wrote:
| It's hard to understand what tigerbeetle is about. Can
| anyone ELI5 it for me? As far as I can tell, it's some
| kind of a library/system geared at distributed
| transactions? But is it a blockchain, a db, a program?
| (I did look at the website)
| eatonphil wrote:
| Hey thanks for the feedback! We've got concrete code
| samples in the README as well [0] that might be more
| clear?
|
| It's a distributed database for tracking accounts and
| transfers of amounts of "thing"s between accounts
| (currency is one example of a "thing"). You might also be
| interested in our FAQ on why someone would want this [1].
|
| [0]
| https://github.com/tigerbeetledb/tigerbeetle#quickstart
|
| [1] https://docs.tigerbeetle.com/FAQ/#why-would-i-want-a-dedicat...
| rosetremiere wrote:
| The faq helped, thanks! So, an example of typical use
| would be, say, as the internal ledger for a company like
| (transfer)wise, with lots of money moving around between
| accounts? But I understand it's meant to be used
| internally to an entity, with all nodes in your system
| trusted, and not as a means to deal with transactions from
| one party to another, right?
| jorangreef wrote:
| Great to hear! Joran from TigerBeetle here.
|
| Yes, exactly. You can think of TigerBeetle as your
| internal ledger database, where perhaps in the past you
| might have had to DIY your own ledger with 10 KLOC around
| SQL.
|
| And to add to what Phil said, you can also use
| TigerBeetle to track transactions with other parties,
| since we validate all user data in the transaction--there
| are only a handful of fields when it comes to double-
| entry and two-phase transfers between entities running
| different tech stacks.
|
| The TigerBeetle account/transfer format is meant to be
| simple to parse, and if you can find user data that would
| break our state machine, then it's a bug.
|
| Happy to answer more questions!
| eatonphil wrote:
| Yes that's a good example!
|
| And you can model external accounts that have their own
| confirmation process using our two-phase transfer
| support.
|
| https://docs.tigerbeetle.com/FAQ#what-is-two-phase-commit
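|
| To make "two-phase" concrete, here is a rough Python sketch of
| the general idea: reserve an amount first, then either post it
| or void it. This is not TigerBeetle's API; the Ledger class and
| its method names are made up for illustration.

```python
class Ledger:
    """Toy in-memory ledger with pending (two-phase) transfers."""

    def __init__(self):
        self.balances = {}   # account id -> posted balance
        self.pending = {}    # transfer id -> (debit, credit, amount)

    def open_account(self, account, balance=0):
        self.balances[account] = balance

    def reserve(self, transfer_id, debit, credit, amount):
        """Phase one: hold the amount without moving it yet."""
        held = sum(a for d, _, a in self.pending.values() if d == debit)
        if amount <= 0 or amount > self.balances[debit] - held:
            raise ValueError("invalid or uncovered amount")
        self.pending[transfer_id] = (debit, credit, amount)

    def post(self, transfer_id):
        """Phase two, success: move the held amount."""
        debit, credit, amount = self.pending.pop(transfer_id)
        self.balances[debit] -= amount
        self.balances[credit] += amount

    def void(self, transfer_id):
        """Phase two, failure: release the hold, balances unchanged."""
        self.pending.pop(transfer_id)
```

| In this sketch the external party's confirmation decides whether
| post() or void() gets called for a pending transfer.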
| ngrilly wrote:
| It's a distributed database for financial transactions,
| using double entry accounting, written in Zig, and with a
| very innovative design:
|
| - LMAX inspired
|
| - Static memory allocation
|
| - Zero copy with Direct I/O
|
| - Zero syscalls with io_uring
|
| - Zero deserialization
|
| - Storage fault tolerance
|
| - Viewstamped Replication consensus protocol
|
| - Flexible Quorums
|
| - Deterministic simulation like FoundationDB
| Yoric wrote:
| Zero deserialization? That sounds rather scary. This
| means absolute trust in data read from disk or received
| from other nodes?
| rom-antics wrote:
| What is the threat model you're worried about? If an
| attacker can write data to your disk or authenticate to
| your cluster, aren't you already screwed?
| Yoric wrote:
| Yes, these are exactly my threats.
|
| First, because I'm a strong believer in defense-in-depth.
| Secondly because both disk corruption and network packet
| corruption happen. Alarmingly often, in fact, if you're
| operating at large scale.
| jorangreef wrote:
| Ours too!
|
| For example, our deterministic simulation testing does
| storage fault corruption up to the theoretical limit of f
| according to our consensus protocol.
|
| Details in our other reply to you.
| jorangreef wrote:
| Great question! Joran from TigerBeetle here.
|
| > This means absolute trust in data read from disk or
| received from other nodes?
|
| TigerBeetle places zero trust in data read from the disk
| or network. In fact, we're a little more paranoid here
| than most.
|
| For example, where most databases will have a network
| fault model, TigerBeetle also has a storage fault model
| (https://github.com/tigerbeetledb/tigerbeetle/blob/main/docs/...).
|
| This means that we fully expect the disk to be what we
| call "near-Byzantine", i.e. to cause bitrot, or to
| misdirect or silently ignore read/write I/O, or to simply
| have faulty hardware or firmware.
|
| Where Jepsen will break most databases with network fault
| injection, we test TigerBeetle with high levels of
| storage faults on the read/write path, probably beyond
| what most systems, or write ahead log designs, or even
| consensus protocols such as RAFT (cf. "Protocol-Aware
| Recovery for Consensus-Based Storage" and its analysis of
| LogCabin), can handle.
|
| For example, most implementations of RAFT and Paxos can
| fail badly if your disk loses a prepare, because then the
| stable storage guarantees that the proofs for these
| protocols assume are undermined. Instead, TigerBeetle
| runs Viewstamped Replication, along with UW-Madison's
| CTRL protocol (Corruption-Tolerant Replication) and we
| test our consensus protocol's correctness in the face of
| unreliable stable storage, using deterministic simulation
| testing (à la FoundationDB).
|
| Finally, in terms of network fault model, we do end-to-
| end cryptographic checksumming, because we don't trust
| TCP checksums with their limited guarantees.
|
| So this is all at the physical storage and network layers.
|
| > Zero deserialization? That sounds rather scary.
|
| At the wire protocol layer, we:
|
| * assume a non-Byzantine fault model (that consensus nodes are
|   not malicious),
| * run with runtime bounds-checking (and checked arithmetic!)
|   enabled as a fail-safe, plus
| * protocol-level checks to ignore invalid data, and
| * we only work with fixed-size structs.
|
| At the application layer, we:
|
| * have a simple data model (account and transfer structs),
| * validate all fields for semantic errors so that we don't
|   process bad data,
| * for example, here's how we validate transfers between accounts:
|   https://github.com/tigerbeetledb/tigerbeetle/blob/d2bd4a6fc240aefe046251382102b9b4f5384b05/src/state_machine.zig#L867-L952.
|
| No matter the deserialization format you use, you always
| need to validate user data.
|
| In our experience, zero-deserialization using fixed-size
| structs the way we do in TigerBeetle, is simpler than
| variable length formats, which can be more complicated
| (imagine a JSON codec), if not more scary.
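|
| As a rough illustration of "fixed-size struct, no decode step,
| still validated", here is a minimal Python sketch using ctypes.
| The field layout and the specific checks are assumptions made up
| for this example, not TigerBeetle's actual transfer format.

```python
import ctypes
import struct


class Transfer(ctypes.LittleEndianStructure):
    # Four u64 fields: 32 bytes, no padding. Hypothetical layout.
    _fields_ = [
        ("id", ctypes.c_uint64),
        ("debit_account_id", ctypes.c_uint64),
        ("credit_account_id", ctypes.c_uint64),
        ("amount", ctypes.c_uint64),
    ]


def view_transfer(buf: bytearray) -> Transfer:
    """Overlay a Transfer on buf (no copy), then validate its fields."""
    if len(buf) != ctypes.sizeof(Transfer):
        raise ValueError("wrong message size")
    transfer = Transfer.from_buffer(buf)  # shares memory with buf
    if transfer.id == 0:
        raise ValueError("id must not be zero")
    if transfer.debit_account_id == transfer.credit_account_id:
        raise ValueError("accounts must differ")
    if transfer.amount == 0:
        raise ValueError("amount must not be zero")
    return transfer


# Bytes as they might arrive from the network or disk.
wire = bytearray(struct.pack("<4Q", 1, 10, 20, 500))
transfer = view_transfer(wire)
```

| The struct is only a typed view over the received bytes, so
| skipping deserialization does not mean skipping the size and
| field checks.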
| Yoric wrote:
| > Where Jepsen will break most databases with network
| fault injection, we test TigerBeetle with high levels of
| storage faults on the read/write path, probably beyond
| what most systems, or write ahead log designs, or even
| consensus protocols such as RAFT (cf. "Protocol-Aware
| Recovery for Consensus-Based Storage" and its analysis of
| LogCabin), can handle.
|
| Oh, nice one. Whenever I speak with people who work on
| "high reliability" code, they seldom even use fuzz-
| testing or chaos-testing, which is... well, unsatisfying.
|
| Also, what do you mean by "storage fault"? Is this
| simulating/injecting silent data corruption or
| simulating/injecting an error code when writing the data
| to disk?
|
| > validate all fields for semantic errors so that we
| don't process bad data,
|
| Ahah, so no deserialization doesn't mean no validation.
| Gotcha!
|
| > In our experience, zero-deserialization using fixed-
| size structs the way we do in TigerBeetle, is simpler
| than variable length formats, which can be more
| complicated (imagine a JSON codec), if not more scary.
|
| That makes sense, thanks. And yeah, JSON has lots of
| warts.
|
| Not sure what you mean by variable length. Are you
| speaking of JSON-style "I have no idea how much data I'll
| need to read before I can start parsing it" or entropy
| coding-style "look ma, I'm somehow encoding 17 bits on
| 3.68 bits"?
| jorangreef wrote:
| > Also, what do you mean by "storage fault"? Is this
| simulating/injecting silent data corruption or
| simulating/injecting an error code when writing the data
| to disk?
|
| Exactly! We focus more on bitrot/misdirection in our
| simulation testing. We use Antithesis' simulation testing
| for the latter. We've also tried to design I/O syscall
| errors away where possible. For example, using O_DSYNC
| instead of fsync(), so that we can tie errors to I/Os.
|
| > Ahah, so no deserialization doesn't mean no validation.
| Gotcha!
|
| Well said--they're orthogonal.
|
| > Not sure what you mean by variable length. Are you
| speaking of JSON-style "I have no idea how much data I'll
| need to read before I can start parsing it"
|
| Yes, and also where this is internal to the data
| structure being read, e.g. both variable-length message
| bodies and variable-length fields.
|
| There's also perhaps an interesting example of how
| variable-length message bodies can go wrong actually,
| that we give in the design decisions for our wire
| protocol, and why we have two checksums, one over the
| header, and another over the body (instead of one
| checksum over both!):
| https://github.com/tigerbeetledb/tigerbeetle/blob/main/docs/...
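|
| A minimal Python sketch of the header-checksum plus body-checksum
| idea. The byte layout, the choice of BLAKE2b and the function
| names are assumptions for illustration only, not TigerBeetle's
| wire format. The assumed motivation is that the size field must
| be trustworthy before it is used to decide how many body bytes
| to read.

```python
import hashlib
import struct

# Hypothetical header: header checksum, body checksum, body size.
HEADER = struct.Struct("<16s16sI")


def digest(data: bytes) -> bytes:
    return hashlib.blake2b(data, digest_size=16).digest()


def encode(body: bytes) -> bytes:
    # The header checksum covers the rest of the header only.
    tail = struct.pack("<16sI", digest(body), len(body))
    return digest(tail) + tail + body


def decode(message: bytes) -> bytes:
    if len(message) < HEADER.size:
        raise ValueError("short header")
    header_sum, body_sum, size = HEADER.unpack_from(message)
    # Verify the header first, so `size` is trusted before it is used.
    if digest(message[16:HEADER.size]) != header_sum:
        raise ValueError("corrupt header")
    body = message[HEADER.size:HEADER.size + size]
    if len(body) != size or digest(body) != body_sum:
        raise ValueError("corrupt body")
    return body
```

| With a single checksum over header and body together, a corrupted
| size field could only be detected after reading however many
| bytes that corrupted size asked for.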
| Yoric wrote:
| Alright, I'm officially convinced that you've thought
| this out!
|
| So, how's the experience of implementing this in Zig?
| jorangreef wrote:
| Thanks! I hope so!
|
| And we're always learning.
|
| But Zig is the charm. TigerBeetle wouldn't be what it is
| without it. Comptime has been a gamechanger for us, and
| the shared philosophy around explicitness and memory
| efficiency has made everything easier. It's like working
| with the grain--the std lib is pleasant. I've learned so
| much also from the community.
|
| My own personal experience has been that Andrew has made a
| truly stunning number of successively brilliant design
| decisions. I can't fault any. It's all
| the little things together--seeing this level of
| conceptual integrity in a language is such a joy.
| ngrilly wrote:
| I can't stop being amazed by TigerBeetle's design and
| engineering.
| jorangreef wrote:
| Thank you Nicolas!
| laserbeam wrote:
| It's a special purpose DB. No relation to blockchains.
| speed_spread wrote:
| Well, Zig would not be such a good C replacement if it did
| not also allow segfaulting all over the place.
| jeroenhd wrote:
| Looking at the bugs themselves, I don't think any low level
| language would've caught those. C/C++ would've crashed as
| well (hopefully, at least, many of these problems would be
| UB and the compiler might just ignore the problem or patch
| out the offending code) and Rust would've panic'd. There
| are a few cases where Rust wouldn't have allowed the code
| to panic but the surrounding code would be pretty
| unreadable in safe Rust (without stacking types like
| Box+RefCell+Rc and clone()ing a bunch) so it's hard to
| compare the two.
|
| The advantage of Rust would be a nice and readable stack
| trace to the crashing method, but a core dump would've
| included even more information for the person debugging the
| binary, so I think it ends up quite even.
| jjnoakes wrote:
| A panic (deterministic, guaranteed, immediate, and worst-
| case a dos) is an order of magnitude better than memory
| corruption (non-deterministic, not guaranteed, eventual-
| if-at-all, and worst-case-rce).
| cormacrelf wrote:
| I don't know what's going on in this thread where
| encountering UB has somehow been morphed into some kind
| of guaranteed immediate core dump that's basically better
| than panicking anyway. Yes, people are talking about
| segfaults. But it's memory corruption. Maybe you get a
| crash at some point, maybe you do not.
|
| A reminder for all that have forgotten: UB is the one
| that can email your local council and submit a request to
| bulldoze the house you're in. It is not a free core dump.
| sfink wrote:
| You appear to have particularly vengeful nasal demons.
| AndyKelley wrote:
| A segmentation fault is well-defined behavior. If you look
| at Jarred's comment nearby he reveals that the pointers
| in question are special pointers, e.g. 0x0, 0x1, 0x2,
| etc.
|
| It is 100% well-defined behavior to dereference these
| pointers. It always segfaults, which as Jarred mentioned
| is a lot like a panic.
|
| Rust evangelists need to be careful because in their zeal
| they have started to cause subtle errors in the general
| knowledge of how computers work in young people's minds.
| Ironically it's a form of memory corruption.
| adwn wrote:
| > _If you look at Jarred's comment nearby he reveals
| that the pointers in question are special pointers, e.g.
| 0x0, 0x1, 0x2, etc._
|
| Is that guaranteed by the language semantics, or could it
| possibly change at some point in the future? If it's the
| latter, then yes, it is very much Undefined Behavior, and
| not guaranteed to segfault before opening the door for
| potential exploits.
| rom-antics wrote:
| I can buy that dereferencing null is a special case, but
| why is 0x2 special? Is 0x20 also special? What about
| 0x20000? Are the invalid non-null pointer values listed
| in a reference somewhere? If 0x2 is an invalid pointer,
| what do I do if my microcontroller has a hardware
| register at 0x2?
| jeroenhd wrote:
| On many platforms, the zero page is set up so access to
| it will always segfault. This isn't a language guarantee,
| but it's a guarantee in most modern operating systems
| (Linux, FreeBSD, Windows). This is set up for pointers
| all the way up to the end of the first page.
|
| On Windows and Linux this is the first 4KiB so range
| 0x0000 up to 0x1000, unless large pages are on (then it's
| even more).
|
| On macOS on x64 this is the entire lower 4GiB of the address space,
| probably a method to help developers port their 32-bit
| software to x64. I don't know what the zero page size on
| ARM is.
|
| If your microcontroller doesn't have this guarantee, you
| can't make use of this feature.
| rom-antics wrote:
| That's a guarantee on the level of the hardware/OS, but
| hardware semantics are not the same as language/compiler
| semantics. Even if according to the _source code_ you're
| dereferencing a pointer value 0x0 or 0x2, that doesn't
| mean the compiler-emitted machine code will end up
| telling the hardware to do the same.
|
| Remember this gem?
|
| https://kristerw.blogspot.com/2017/09/why-undefined-behavior...
|
| Once you trigger UB, all bets are off and your code could
| do anything. A segfault just means you spun the roulette
| wheel, bet it all on red, and got lucky your house wasn't
| bulldozed.
|
| Zig also uses LLVM under the hood, right? So it's subject
| to these same semantics. An LLVM pointer value cannot
| legally contain arbitrary non-null non-pointer integers
| such as 0x2. That's a dead giveaway of UB. And I doubt
| the emitted Zig code safety-checks every pointer
| dereference for a value less than 0x1000 before
| performing the dereference.
| kristoff_it wrote:
| > An LLVM pointer value cannot legally contain arbitrary
| non-null non-pointer integers such as 0x2.
|
| 0x2 is a perfectly valid pointer value, it just happens
| to never be a good _virtual memory_ address on modern
| systems where virtual memory is set up by the usual OSes,
| hence the fact that you can rely on it segfaulting.
| jeroenhd wrote:
| The semantics are actually operating system and even
| compiler flag dependent. On macOS you can choose the size
| of your zero page during build. The numbers I've listed
| are just the defaults.
|
| Zig UB is not C UB. There is an entire language built on
| top of it. Just because something behaves a certain way
| in C, doesn't mean the same thing is true in Zig. Zig is
| no longer a code generator for C, it has switched to a
| self hosted compiler a while back. In fact, the language
| is rapidly progressing to the point where LLVM is a mere
| optional dependency.
|
| I don't know the semantics around LLVM pointers. I don't
| see why 0x2 would be invalid, there are plenty of
| platforms programmed in C(++) that have a flat memory
| model. It would be quite painful to have a
| microcontroller where you can't send data to the output
| pin because LLVM decided that 2 is invalid (but 0 isn't).
| I've never seen LLVM complain about invalid
| dereferencing, though, it always ends up doing what the
| compiler tells it to do as far as I can tell.
|
| Zig pointers can definitely cause UB but most Zig code
| shouldn't need them. Slices are actually bounds checked
| and should probably be preferred in most cases of pointer
| arithmetic. Simple pointers can't be incremented or
| decremented so you need to manually go through @intToPtr
| if you want to do real pointer arithmetic, which is quite
| unusable.
|
| I haven't used Zig much so I don't know how many Zig
| semantics are copies of C semantics and how many are
| translated by the Zig frontend. However, "this is a
| bad/undefined thing in C so it must be a bad/undefined
| thing in Zig" is simply not true.
| rom-antics wrote:
| I know Zig is not C, that's why I specifically mentioned
| LLVM. It's fine if Zig has different opinions about UB
| than LLVM does, but in that case ReleaseSafe builds
| should not use LLVM, not even optionally. If Zig says
| some operation is defined, but LLVM says it's undefined,
| well, LLVM is the one optimizing code so it's LLVM's
| invariants that matter. Right now it looks like Zig is
| playing fast and loose with correctness, shoving
| everything through LLVM but not respecting LLVM's
| invariants. And hey, if something is observed to segfault
| under some conditions today on the current version of
| LLVM, we'll just say segfaults are guaranteed. It's
| disappointing to see.
| AndyKelley wrote:
| A lot of people have the same misunderstanding as you.
|
| LLVM has rules about what is legal and what is not legal.
| If you follow the rules, you get well-defined behavior.
| It's the same thing in C. You could compile a safe
| language to C, and as long as you follow the rules of
| avoiding UB in C, everything is groovy.
|
| Likewise, this is how Zig and other languages such as
| Rust use LLVM. They play by the rules, and get rewarded
| by well-defined behavior.
| rom-antics wrote:
| Is it not one of the LLVM rules that pointers must be valid and
| have a valid provenance in order to be dereferenced? If
| 0x2 ends up in a pointer that is dereferenced (or 0x0 in
| a nonnull pointer), has that rule not been broken? And if
| the rule is broken, does that not trigger undefined
| behavior?
| avgcorrection wrote:
| > On many platforms, the zero page is set up so access to
| it will always segfault. This isn't a language guarantee,
| but it's a guarantee in most modern operating systems
| (Linux, FreeBSD, Windows). This is set up for pointers
| all the way up to the end of the first page.
|
| Then I guess it could be a language guarantee if Zig only
| supports/targets those platforms. However, considering
| how low-level Zig is, I doubt that _that_ is the case.
| jeroenhd wrote:
| First of all: Zig is not C. The rules for undefined
| behaviour can be found here:
| https://ziglang.org/documentation/master/#Undefined-Behavior
|
| TL;DR: Zig injects checks and aborts the program at
| runtime unless you specify that you wish to ignore the
| problem. This can be done explicitly within the code or
| by compiling under a build mode that ignores checks
| (unless specified manually).
|
| Programs compiled as Debug and ReleaseSafe will terminate
| at runtime if safety-checked UB is triggered. Compiling for ReleaseSmall
| and ReleaseFast will cause traditional C-style UB. If you
| care about your program doing what it's supposed to do,
| you use ReleaseSafe. Doing Release[Fast|Small] will do
| something similar to -O3 in other languages, which will
| often change behaviour.
|
| Note, however, that you can compile your code under "just
| allow UB and see what happens" mode but still benefit
| from checked UB by setting @setRuntimeSafety(true); this
| will introduce the assertions despite the unsafe build
| modes you may specify.
|
| It's like introducing a C++ compiler flag* telling the
| compiler "ignore exceptions and just continue". You know
| you're in for a bad time the moment you specify it, but
| it makes your program blazingly fast because it greatly
| reduces the amount of code to generate/checks to execute.
|
| The main advantage of checked UB is that well-tested code
| can make use of the unchecked nature of these features
| for speed without having length-check code blocks that
| need to be wrapped in debug #ifdefs or similar. Assuming
| you run test builds with checks enabled (and why
| wouldn't you), you'd catch these problems in your build
| pipeline.
|
| This is different from the normal way of working with C
| and friends, where UB remains in debug/-O1 builds but
| just acts a little differently. Some compilers will
| insert breakpoints, others will ignore the problem like
| in release mode, nobody knows what will happen and your
| compiler can't detect this problem for you.
|
| * note that -fno-exceptions exists, but that aborts the
| program rather than let it continue.
| Jarred wrote:
| Most of what manifests as a segfault in Bun has been due
| to assuming a JSValue is a heap allocated value when it is
| (the JavaScript representation of) "null", "undefined",
| "true", "false" etc. These are invalid pointers, the
| operating system signals the memory access was invalid, Bun
| runs the signal handler, prints some metadata, and exits.
| This is a lot like a panic.
| naikrovek wrote:
| safety-checked UB; an important distinction. I assume.
| Yoric wrote:
| Could you elaborate? What's a safety-checked UB?
| codethief wrote:
| https://ziglang.org/documentation/master/#Undefined-Behavior
| Yoric wrote:
| Thanks.
|
| So, if I read this correctly, barring the simple cases
| that Zig can detect at compile-time, this means that
| whether it's a UB (in the C++ definition of the term)
| depends on the flags specified by the author of the
| library and the person who compiles the final binary.
|
| That's definitely much better than C++ UB. Still a bit
| scary, though.
| planede wrote:
| Not very. It really just means that "there is a sanitized
| mode for building", which already exists in many C and C++
| compilers (for language UB) and standard libraries (for
| library UB).
| naikrovek wrote:
| I think of this case as an assert() that both lists are
| the same length. if they're not, I want to know about
| that via a crash.
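|
| In Python terms (as an analogy only, not Zig semantics), that
| "assert the lengths match, then zip" reading might look like the
| sketch below. zip_checked is a made-up helper name.

```python
def zip_checked(*seqs):
    """Return zip(*seqs), but fail loudly if the lengths differ."""
    lengths = {len(s) for s in seqs}
    if len(lengths) > 1:
        # The check runs up front, before any element is visited.
        raise ValueError(f"length mismatch: {sorted(lengths)}")
    return zip(*seqs)


pairs = list(zip_checked([1, 2, 3], ["a", "b", "c"]))
```

| Plain zip() would silently stop at the shortest input instead,
| which hides exactly the bug the assert is meant to surface.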
| oconnor663 wrote:
| I think Zig's ReleaseSafe mode is intended to be suitable
| for production, which IIUC isn't really the case with
| ASan, UBSan, and friends. Those have some performance
| problems and also some attack surface problems.
| planede wrote:
| OK, but is "safe" in ReleaseSafe any kind of guarantee,
| or is it just safer than ReleaseFast?
|
| I can enable lightweight assertions in libstdc++ and
| libc++ and it makes C++ safer, but not in any way "safe".
| There are some flags that can be enabled to trap on some
| language UB too, without bringing in the heavy weight
| sanitizers.
| oconnor663 wrote:
| Last time I checked (more than a year ago) there were
| major open questions about what could be guaranteed. My
| impression was that you could expect e.g. all array
| accesses to be bounds checked, but that use-after-free
| and dangling pointers were still issues, especially if
| you use the C allocator.
| randyrand wrote:
| Is it UB?
|
| I thought it was defined as an OOB access in non-safe modes,
| or a panic in safe-modes.
| Laremere wrote:
| As much as a concept can be translated from one programming
| language to the next, they're conceptually pretty much
| identical. However for Zig there are two important differences:
|
| 1. It's simpler syntax than reaching for a zip function. I
| personally like this design because the conceptual load is
| pretty low as it feels like a natural extension of simple for
| loops. Eg you could teach someone simple for loops, then later
| go "hey, you could do this the whole time!"
|
| 2. Zig doesn't have support for custom iterators. Zip is doable
| using the existing metaprogramming features, but it's not as
| simple. Support for iterators also likely violates Zig's `No
| hidden control flow.` maxim. Plus I imagine it's a lot easier
| for the compiler to perform optimizations this way.
|
| Both points combined are related to Zig's design goal for being
| good at writing code that can run fast on modern CPU
| architectures. Being able to easily loop over multiple arrays
| is a good step for making that practical.
| kps wrote:
| > No hidden control flow.
|
| Off-topic, but... I very much like this idea, but I think Zig
| shares one wart with languages like C++: it's impossible to
| tell syntactically whether `f()` is a direct or indirect
| call. An indirect branch is a _conditional_ branch, where the
| condition can be arbitrarily far away in space and time, and
| it's invisible. Dynamic control flow can be tricky to reason
| about even when you know it's there.
| rom-antics wrote:
| iirc there was a proposal to have Zig use `funcptr.()`
| syntax (with a dot) for indirect calls, but it was
| rejected.
| Jayschwa wrote:
| While with optionals enables some iterator-like behavior.
| https://ziglang.org/documentation/master/#while-with-Optiona...
| malcolmstill wrote:
| To expand on the parent and grandparent:
|
| Laremere is correct in that there is no "magic" built-in
| understanding of iterators in the language, i.e. under the
| hood calling a `.next()` method, without explicitly having
| to call it. That _would_ violate the no hidden flow control
| maxim.
|
| However, as Jayschwa points out, Zig's `while` loop will
| bind the result of its expression (in its own block scope)
| if it is non-null and otherwise exit the loop. This gives
| you essentially the same as a for loop that has some
| language-level knowledge of the iterator pattern, except
| there is no hidden flow control (I have to explicitly call
| `next`).
|
| And indeed the Zig standard library is replete with
| iterators (and in most of the Zig code I write I will
| write iterators for my own collections). For example,
| `mem.split` returns an iterator:
|
|     var it = mem.split(...);
|     // it.next() returns null after we run out of
|     // split text and the while loop exits
|     while (it.next()) |substr| {
|         // In here we have a non-nil substr
|     }
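|
| For readers more at home in Python, the explicit-`next` pattern
| described above can be sketched roughly as follows. This is an
| analogy only, not Zig semantics: `split_iter` is a hypothetical
| stand-in for the iterator returned by `mem.split`, and `None`
| plays the role of Zig's `null`.

```python
def split_iter(text, sep):
    """Hypothetical stand-in for a Zig-style split iterator."""
    return iter(text.split(sep))


def collect(text, sep):
    it = split_iter(text, sep)
    parts = []
    # next() is called explicitly; the loop exits once it returns None,
    # much like `while (it.next()) |substr|` exits on null.
    while (substr := next(it, None)) is not None:
        parts.append(substr)
    return parts


print(collect("a,b,c", ","))
```

| The call to `next` stays visible at the loop head, which is the
| property the "no hidden control flow" argument cares about.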
|
| > Plus I imagine it's a lot easier for the compiler to
| perform optimizations this way.
|
| That's an interesting point: does Zig miss out on some
| optimisation possibilities with iterators given they are
| not a language-level construct? I don't know.
| jeroenhd wrote:
| It's zip with an arbitrary amount of array parameters. Quite
| useful for the purposes provided by the article.
|
| I do wonder about performance, though, as multiple array
| dereferences may not be captured well by the L1 cache like a
| well-rounded struct might.
|
| An L1 cache line is often 64 bytes long, enough to fit one of
| the "monster" example structs but never two. Performance in
| real life scenarios may actually increase if these structs are
| padded with an additional 16 bytes so none of the structs are
| on a cache line boundary.
| masklinn wrote:
| > It's zip with an arbitrary amount of array parameters.
|
| So... zip?
|
| Python's does that, and for most other languages you could use
| overloading, basic macros, or traits trickery to get there if
| you really wanted to support unreasonable widths (IME you
| almost never need more than 3, and combining two zip/2 works
| fine then).
| klyrs wrote:
| Zig's "zip" is purely syntactical, where Python's zip is a
| generator. This is significant both in terms of performance
| (Zig wins) and flexibility (Python wins).
|
| Unlike Python, you can't pass a zip generator around. It's
| just a for-loop. While zig loops are expressions, they only
| return a single value.
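|
| A small sketch of the Python side of that trade-off: the object
| returned by `zip` is a lazy iterator that can be handed to other
| code and resumed later. `first_pair` and the sample lists are
| made up for illustration.

```python
def first_pair(pairs):
    """Consume a single item from any iterator of pairs."""
    return next(pairs)


names = ["a", "b", "c"]
sizes = [1, 2, 3]

zipped = zip(names, sizes)   # lazy: nothing has been paired up yet
head = first_pair(zipped)    # the zip object is passed to another function
rest = list(zipped)          # the caller resumes where that function stopped

print(head, rest)
```

| A multi-sequence `for` loop has no such first-class value to pass
| along, which is the flexibility being given up.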
| masklinn wrote:
| > This is significant both in terms of performance (Zig
| wins)
|
| Does it, actually? Does Zig's built-in pseudo-zip
| outperform Rust's? Or C++23's?
|
| > It's just a for-loop.
|
| Except it's not "just" a for loop, it's a weird special
| case for a for loop. And one which is actively dangerous
| too.
| klyrs wrote:
| Please check the context, I was comparing to Python. And
| please chill out. It's only "dangerous" when you decide
| to run it that way.
| masklinn wrote:
| > Please check the context, I was comparing to Python.
|
| Ah yes, that's very honest, no shit zig is faster than
| Python, what's the next one, thrustscc may be faster than
| polio-johnny down the road?
| klyrs wrote:
| Hi, no need to be so abrasive about it.
|
| > Please respond to the strongest plausible
| interpretation of what someone says, not a weaker one
| that's easier to criticize. Assume good faith.
|
| Python's zip function returns a generator. A generator in
| zig would look like a function pointer with a closure. If
| zig were to implement the Python-style "zip" function,
| constructing the closure, and iterating over the
| generator, would be significantly slower than the naked
| "just a for loop" that we see in TFA. And that's not even
| considering the tuple construction (oops, now we need an
| allocator) & unpacking.
|
| Ergo, the zig-style "syntactical zip" is higher
| performance than the Python-style "functional zip". Even
| when you cut through the baseline performance differences
| between the languages.
| masklinn wrote:
| > Hi, no need to be so abrasive about it.
|
| That's just projection.
|
| > Please respond to the strongest plausible
| interpretation of what someone says, not a weaker one
| that's easier to criticize. Assume good faith.
|
| I'll get right on that as soon as you extend the same
| courtesy, which you have refused to do at every
| opportunity.
| judofyr wrote:
| > Except it's not "just" a for loop, it's a weird special
| case for a for loop.
|
| You could also argue that a for loop which can only
| iterate over a _single_ sequence is a special case of a
| multi-sequence for loop.
|
| > And one which is actively dangerous too.
|
| In safe builds (ReleaseSafe, Debug) it will cause a
| controlled panic if the sequences are not of the same
| size. Most likely it's a logical bug if you iterate over
| two sequences of different sizes. In ReleaseFast the
| compiler will make assumptions to improve performance. If
| it's very important for your code you can force a certain
| code block to always have runtime safety. Yes, there are
| trade-offs, but I don't feel it's _unreasonable_.
| jeroenhd wrote:
| There is no single "zip". Java's .zip() will work on two
| sources, as will C#'s Zip(). Haskell's zip is no different,
| only accepting two parameters. I don't know any language
| other than Python that shares Python's iterator zip()
| implementation.
|
| In implementation, Python's zip will return a generator
| that is iterated over using the iterator functionality,
| while Zig's .zip is compiled as a loop. Python's iteration
| may be turned into a loop, it may be interpreted, or it may
| be turned into some other kind of bytecode, who knows. The
| standard cpython implementation is much more complex,
| though: https://github.com/python/cpython/blob/main/Python/
| bltinmodu...
|
| Concatenating zip()s is an unnecessarily complex solution,
| both in terms of syntax and in code generated. In Python
| this may not matter because it's a relatively slow
| programming language in general (the language often being
| "glue between fast C methods"), but in Zig this can easily
| become untenable.
|
| I also disagree that you don't need more than 3. As the
| article states, if you leverage array-of-structs rather
| than struct-of-arrays you can use this to "deconstruct"
| objects without paying the memory usage penalty of struct
| padding. The 15% wasted RAM in this example is relatively
| small compared to some real use scenarios; something as
| common as a 3D vector will often have a whopping 25% space
| waste.
|
| Other languages allow this as well (and often using such
| iterations are much faster than zip()ing lists together)
| but the lack of guarantees and repetitive syntax becomes a
| pain.
| masklinn wrote:
| > There is no single "zip".
|
| Which means you can implement yours to fit your needs.
|
| > I don't know any language other than Python that shares
| Python's iterator zip() implementation.
|
| https://docs.rs/itertools/latest/itertools/macro.izip.html
|
| > In implementation
|
| Which is hardly relevant. Python's entire implementation
| has aims, means, and purpose with no relation to Zig's.
|
| > I also disagree that you don't need more than 3.
|
| Which is not what I wrote.
|
| > As the article states, if you leverage array-of-structs
| rather than struct-of-arrays you can use this to
| "deconstruct" objects without paying the memory usage
| penalty of struct padding.
|
| Sure? And the article uses an example with 3 values.
|
| > The 15% wasted RAM in this example is relatively small
| compared to some real use scenarios; something as common
| as a 3D vector will often have a whopping 25% space
| waste.
|
| It also could hardly be less relevant: it's an issue in
| an AoS structure because all your objects have that
| overhead, therefore that's your total overhead.
|
| Here it's 15 or 25% padding _in a single value within a
| stackframe_. You're probably wasting more stackframe
| space due to the compiler not bothering reusing
| temporally dead locations.
|
| And that's if the compiler reifies the tuple instead of
| eliding the entire thing.
|
| > Other languages allow this as well
|
| OK?
|
| > (and often using such iterations are much faster than
| zip()ing lists together)
|
| Until they are not.
| jeroenhd wrote:
| > Which means you can implement yours to fit your needs.
|
| Which this doesn't, as zip is an expression and multi-
| sequence loops aren't.
|
| > https://docs.rs/itertools/latest/itertools/macro.izip.html
|
| External libraries aren't part of a language.
|
| > Which is not what I wrote.
|
| I admit, I read over the "almost" in "you almost never
| need more than 3".
|
| > It also could hardly be less relevant: it's an issue in
| an AoS structure because all your objects have that
| overhead, therefore that's your total overhead.
|
| > Here it's 15 or 25% padding in a single value within a
| stackframe. You're probably wasting more stackframe space
| due to the compiler not bothering reusing temporally dead
| locations.
|
| That's not true: arrays are byte-addressable so inside an
| array the alignment can be shorter. An array of 121
| 33-byte values is 3993 bytes in size, an array of 121
| usizes is 968 bytes in size, and assuming enums resolve
| to 32-bit values an array of 121 enums is 484 bytes
| in size. There is no overhead here.
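|
| The arithmetic above can be checked with a short sketch. The
| field sizes (33, 8 and 4 bytes) come from the comment; the field
| order, the alignments and the C-like layout rule used for the
| array-of-structs case are assumptions made for illustration.

```python
def aos_element_size(fields):
    """Size of one struct, given (size, alignment) pairs in declaration order."""
    offset = 0
    max_align = 1
    for size, align in fields:
        offset = (offset + align - 1) // align * align  # pad up to alignment
        offset += size
        max_align = max(max_align, align)
    # Trailing padding keeps consecutive array elements aligned.
    return (offset + max_align - 1) // max_align * max_align


N = 121
# Assumed layout: 33-byte value (align 1), 8-byte usize, 4-byte enum.
fields = [(33, 1), (8, 8), (4, 4)]

soa_bytes = sum(N * size for size, _ in fields)   # three separate arrays
aos_bytes = N * aos_element_size(fields)          # one array of padded structs

print(soa_bytes, aos_bytes)
```

| Under these assumptions the three separate arrays add up to
| 3993 + 968 + 484 bytes, while each padded struct takes 56 bytes
| instead of 45.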
|
| This has advantages and disadvantages. Unaligned access
| is slower in general but in many cases an unaligned
| array can be faster because of how many of its entries
| can be loaded into the CPU cache. There's no definite
| advantage here in terms of CPU performance, but in terms
| of RAM usage there is.
|
| > Until they are not.
|
| When does a for loop ever become faster than a generator?
| The values being mapped over are already evaluated, there
| is no lazy loading+early stopping to take advantage of
| the generator.
| senkora wrote:
| Nit: It should be "Air Nomads" instead of "Wind Nomads".
|
| (I know this doesn't matter but I figured the author would
| appreciate the heads up!)
| kristoff_it wrote:
| thanks, fixed, I started by thinking about the last example
| (the one about pokemons) and then it stuck
| planede wrote:
| In C++23 with zip it looks something like:
|
|     for (const auto& [x, y] : std::views::zip(a, b)) { /* ... */ }
|
| A notable difference is that the ranges don't have to match in
| size, the loop will run until the end of the shorter range is
| reached.
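|
| Python's built-in `zip` truncates in the same way, which the
| following sketch shows next to `itertools.zip_longest`, the
| standard-library variant that pads instead. The sample lists are
| made up.

```python
from itertools import zip_longest

a = [1, 2, 3, 4]
b = ["x", "y"]

truncated = list(zip(a, b))        # stops at the end of the shorter input
padded = list(zip_longest(a, b))   # pads the shorter input with None

print(truncated)
print(padded)
```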
|
| If it is required for optimization to not check for reaching the
| end of one of the ranges then it can be achieved with something
| like:
|
|     for (const auto& [x, y] : std::views::zip(
|              a, std::ranges::subrange(std::ranges::begin(b),
|                                       std::unreachable_sentinel))) {
|         /*...*/
|     }
|
| I guess it's hard enough to bump into this by accident.
| WalterBright wrote:
| D has std.range.zip:
|
| https://dlang.org/phobos/std_range.html#zip
|
| Sorting two arrays in parallel:
|
|     import std.algorithm.sorting : sort;
|     int[] a = [ 1, 2, 3 ];
|     string[] b = [ "a", "c", "b" ];
|     zip(a, b).sort!((t1, t2) => t1[0] > t2[0]);
|     writeln(a); // [3, 2, 1]
|     // b is sorted according to a's sorting
|     writeln(b); // ["b", "c", "a"]
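|
| For comparison, a rough Python counterpart of the parallel sort
| above. Unlike the D version, which sorts both arrays in place
| through the zipped view, this sketch builds new lists.

```python
a = [1, 2, 3]
b = ["a", "c", "b"]

# Pair the elements up, sort the pairs by the value taken from `a`
# (descending), then split the pairs back into two lists.
pairs = sorted(zip(a, b), key=lambda pair: pair[0], reverse=True)
sorted_a = [x for x, _ in pairs]
sorted_b = [y for _, y in pairs]

print(sorted_a)
print(sorted_b)
```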
| edflsafoiewq wrote:
| In Common Lisp it's
|
|     (loop for x in a   ; on each iteration, steps x to next element of a
|           for y in b   ; same thing
|           do ...)
|
| This is like the Zig in that it's a hard-coded feature of the
| looping construct instead of being a general combinator like
| C++/Rust, but I think it's neat that by allowing multiple
| clauses, zipping falls out completely for free.
| Someone wrote:
| > This is like the Zig in that it's a hard-coded feature of
| the looping construct
|
| I think lisp's _loop_ isn't hard-coded in the language, but
| defined in the standard library. See for example the _loop_
| implementation at
| https://github.com/sbcl/sbcl/blob/master/src/code/loop.lisp
| (about 2,000 lines because _loop_ is a monster/very flexible
| (pick whatever you prefer. I would pick both))
| edflsafoiewq wrote:
| LOOP isn't hard-coded into the language, but the possible
| clauses are hard-coded into LOOP. This is in contrast to
| ITERATE, which is an extensible CL looping macro, or the
| generator/iterator style popular in C++/Rust/Python/etc.
| netr0ute wrote:
| As someone who is using a lot of C++20, I can't wait to use
| this feature when C++23 is finally ready :)
| noobermin wrote:
| God, C++ even today is still horribly verbose. I get that being
| "explicit" is being "better" but there are limits. Even after
| auto removed a lot of verbosity there still is a lot there just
| to get it to do exactly what you want.
| nikbackm wrote:
| What is verbose about that? (The first case, not the second,
| special case)
|
| Seems hard to make it any shorter. Well, I guess you could
| remove "const" if you want.
| jeroenhd wrote:
| Here's how it looks when you write readable C++:
|
|     for (const auto &[x, y]: zip(a, subrange(begin(b),
|                                     unreachable_sentinel))) {
|         /* Do something with x and y */
|     }
|
| I'm not sure why C++ programmers don't like using `using`,
| it's as if Java programmers insist on typing out
| java.util.ArrayList every time because you may have an
| ArrayList of your own in the future.
|
| So much C++ code can become readable by adding `using
| namespace std::something`.
| twic wrote:
| C++ doesn't have a lot of namespacing below 'std', so if
| you go around using everything you need, you end up with a
| lot of short, possibly conflicting names in scope.
|
| Java avoids this somewhat because functions are always in a
| class, so there is a little bit of extra namespacing, even
| if you've imported the class.
| winrid wrote:
| You can still import a fully qualified static method on a
| class:
|
|     import static someclass.doThing;
|
| Or maybe that's what you meant by somewhat :)
| cmovq wrote:
| The reason is header files. If you do `using namespace
| std::something` in one file and it gets included in
| another, the other file now has std::something in the
| global namespace which it may not have expected.
|
| Other languages like Java have imports scoped to a single
| file, so this is not a problem.
| titzer wrote:
| The word "antiquated" comes to mind, but it's worse than
| that. An age-old self-created fractal hell of dumb
| problems created by a _simplistic_ view of code reuse
| based on the _simple_ mechanism of text inclusion that
| constantly restrains future evolution. It's antediluvian
| and just plain backwards.
| twic wrote:
| We are at least getting modules real soon now:
| https://en.cppreference.com/w/cpp/language/modules
| duped wrote:
| The only compiler that seems to care about implementing
| modules is MSVC. It'll be "real soon" when GCC stops
| crashing.
| xigoi wrote:
| We've been getting them real soon for... about 5 years
| now?
| tragomaskhalos wrote:
| Using in a header file => complete no-no.
|
| Using in a cpp file => absolutely fine.
| jcelerier wrote:
| > Using in a cpp file => absolutely fine.
|
| definitely not as soon as you want to do unity/jumbo
| builds (which are in my experience the n°1 thing to do
| to get fast CI builds)
| jeroenhd wrote:
| I see, but that's only a problem in header files, isn't
| it? Most code will end up in .cpp files which don't get
| included (usually).
|
| It makes sense to use std::vector in a .h(pp) file, but
| in the .cpp you should be able to `using namespace std`,
| right?
| chongli wrote:
| _Most code will end up in .cpp files which don't get
| included (usually)._
|
| Not if a lot of your code is in the form of templates.
| You have to put those in headers.
| dcow wrote:
| At this point, just use a modern language already
| pdntspa wrote:
| I don't do C++ but if it's anything like my Java IDE most of
| that stuff pops up in autocomplete and you just tab through
| it all
| codethief wrote:
| While this solves one of the issues I've had with Zig, it doesn't
| seem very flexible. I would love to do the same thing for (tuples
| of) variable-length arrays, arrays of different lengths etc. Now
| my implementation will still look completely different in
| slightly different situations.
|
| Yes, a flexible solution (a zip function) would probably need
| iterators but why would introducing them be such a big problem?
| (FWIW, I know that one can emulate looping over iterators with
| while() and optionals[0] but it feels a bit dirty.)
|
| More generally, my biggest gripe with Zig has been a lack of
| expressiveness: Things like deep equality checks between structs,
| arrays and optionals; looping over different kinds of containers;
| ... should just work(tm). Sure, I understand that Zig doesn't
| want hidden control flow but OTOH explicit control flow
| everywhere often just gets in the way of readability and of
| implementing business logic. I usually follow the approach
| "Implement first, optimize later" but with Zig the implementation
| will look completely different depending on which optimization or
| data structure I choose, so I need to think about optimizations
| from the start if I don't want to rewrite everything later. I
| should mention, though, I'm very used to Python these days and
| haven't written C or C++ in ages, so maybe that sort of culture
| shock is somewhat expected.
|
| Anyway, I'm still excited about the language and my impression is
| that Andrew Kelley is very open to new suggestions and new ideas,
| so things will certainly still change in one way or another till
| v1.0.
|
| [0]: https://news.ycombinator.com/item?id=34958051
| AndyKelley wrote:
| There is no "just work" for deep equality. The standard library
| would have to make decisions on behalf of the application that
| it has no business making.
|
| I can tell you right now that while there will be many upcoming
| language changes, none of them will be comfy to Python
| programmers. Zig is very much an imperative language. Or
| perhaps think of it as a declarative DSL for outputting machine
| code.
| _a_a_a_ wrote:
| > a declarative DSL for outputting machine code
|
| and thanks for the laugh
| codethief wrote:
| Thanks for your message, Andrew!
|
| Just to be clear, I didn't mean any offense and maybe my
| critique wasn't as well-balanced as it could have been. So
| far, coding in Zig has been a fun ride, despite the
| occasional obstacles I've run into!
|
| > There is no "just work" for deep equality
|
| Say I have two variables A and B of the same struct type.
| Each points to a finite region in memory of the same size.
| Why can't I just compare these two regions using `A == B` to
| make sure they are equal? Yes, one can obviously come up with
| alternative definitions of what equality might mean for
| structs (only compare certain subfields etc.) but wouldn't
| the aforementioned definition be a good default that would
| work in the majority of cases?
|
| Alternatively, there is `std.testing.expectEqualDeep()` which
| walks through all fields but as far as I know there is no
| equivalent for production code(?)
|
| > none of them will be comfy to Python programmers
|
| Oh I think the multi-sequence for loops feature already makes
| Zig more comfy! :)
|
| Just to be clear: I wouldn't want Zig to be another Python.
| While I like Python from a developer experience POV, it is
| dictionaries and magic methods all the way down and often
| unnecessarily slow and complex.
|
| I still think one could find a good balance between DX and
| full low-level control, though. One could e.g. have one
| convenient way to express a certain problem that gives you
| medium control over performance (e.g. the `==` example above)
| and one or more fine-tuned, but possibly less concise ways of
| expressing the problem that provide full control but require
| more lines of code. In the struct equality example the latter
| would mean defining some kind of `eql()` function that would
| be optimized to the struct type (e.g. compare certain fields
| first as they are more likely to differ etc.). Would this
| violate the Zen of Zig?[0]
|
| > Only one obvious way to do things.
|
| After all, there is also
|
| > Favor reading code over writing code
|
| Right now, at least, I often run into situations where I
| don't know of any obvious way to solve my problem. Then I end
| up writing lengthy code to tell Zig what I want and end up
| with code that's so-so on the fun-to-read scale.
|
| [0]: https://ziglang.org/documentation/master/#Zen
| munificent wrote:
| _> Why can't I just compare these two regions using `A ==
| B` to make sure they are equal?_
|
| Why is shallow equality useful?
|
| You could have `A == B` be true, but then as soon as you
| wrap pointers to them in C and D, now `C == D` is false.
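|
| The objection can be illustrated with a small Python sketch. It
| is only an analogy: `Point` is a made-up struct-like type, and
| `Ref` is a hypothetical stand-in for a pointer that compares by
| address rather than by pointee.

```python
from dataclasses import dataclass


@dataclass
class Point:
    # The generated __eq__ compares field by field.
    x: int
    y: int


class Ref:
    """Hypothetical pointer stand-in: equal only if it targets the same object."""

    def __init__(self, target):
        self.target = target

    def __eq__(self, other):
        return isinstance(other, Ref) and self.target is other.target


a = Point(1, 2)
b = Point(1, 2)
c = Ref(a)
d = Ref(b)

print(a == b)  # field-wise comparison of two equal values
print(c == d)  # address-style comparison of two distinct objects
```

| Whether `c == d` should follow the pointers is exactly the kind
| of decision a built-in `==` would have to make on the
| application's behalf.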
| shagie wrote:
| It gets into even more fun if A has a pointer to C has a
| pointer to B, and B has a pointer to D has a pointer to
| A.
| [deleted]
| Existenceblinks wrote:
| Probably a good signal for potential O(n^2) when reading the
| code.
|
| EDIT: Nope. I was wrong. This is not a list comprehension.
| tuukkah wrote:
| Right, it's not an ordinary list comprehension. It's a parallel
| list comprehension though:
|
|     zip as bs = [(a,b) | a <- as | b <- bs]
|
| https://downloads.haskell.org/ghc/latest/docs/users_guide/ex...
| loeg wrote:
| Why do you say so? It's equivalent to iterating indexes from 0
| to N-1 of two (or more) lists with N elements and providing
| syntax sugar for those lists' elements at that index. This is
| O(N).
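|
| A minimal Python sketch of the distinction, using made-up lists:
| lockstep iteration visits N pairs, whereas the nested loop that
| a list comprehension with two generators would express visits
| N * N.

```python
from itertools import product

xs = [1, 2, 3]
ys = [10, 20, 30]

lockstep = [(x, y) for x, y in zip(xs, ys)]     # one step per index
nested = [(x, y) for x, y in product(xs, ys)]   # every x paired with every y

print(len(lockstep), len(nested))
```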
| Existenceblinks wrote:
| True. I misread it, it's a zip pattern. I thought it was a
| fancy list comprehension.
| dan00 wrote:
| I think the naming of the 'else' branch in the loop could be more
| telling, like using the name 'finally' or 'finish'.
| Someone wrote:
| I agree, but 'finally' or 'finish' IMO aren't good choices
| because that code doesn't always execute.
|
| I think I would go for something expressing 'default', but
| would first look at existing code to see how common this is,
| and if I decided I wanted this feature, look hard for
| alternative syntax.
|
|     const match: ?usize = for (text, 0..) |x, idx| {
|         if (x == needle) break idx;
|     }
|
| could return an optional int, for example. If so, you would get
| a 'null' for free, and if you didn't want a null, you could
| tack on a _.getOrElse(NOT_FOUND)_.
|
| I guess they picked this because Python has it, too.
| https://docs.python.org/3/tutorial/controlflow.html#break-
| an...:
|
| _"Loop statements may have an else clause; it is executed when
| the loop terminates through exhaustion of the iterable (with
| for) or when the condition becomes false (with while), but not
| when the loop is terminated by a break statement."_
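|
| For reference, the Python behaviour quoted above looks like this
| in practice. The `find` helper is a hypothetical example that
| mirrors the needle search from the Zig snippet.

```python
def find(text, needle):
    for idx, ch in enumerate(text):
        if ch == needle:
            break
    else:
        # Runs only when the loop finished without hitting `break`.
        idx = None
    return idx


print(find("hello", "l"))
print(find("hello", "z"))
```

| The `else` branch also runs when the sequence is empty, since the
| loop then ends without a `break`.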
| masklinn wrote:
| `finally` hints at a very different behaviour because in most
| languages' context a finally clause is executed whether an
| exception is raised or not.
| puffoflogic wrote:
| Sorry, but using new syntax to accomplish something other
| languages have as library code is not clever.
|
| When reading zig code you have to stop and think, "wait does this
| syntax mean zip or direct product?" But when expressed as a
| _function called zip_ , the meaning is clear.
|
| (Obligatory reminder that zig devs think that sometimes running
| code inside `if (false)` is a minor bug of no consequence, and
| after all what are the _real_ motives of anyone pointing it out,
| eh?)
| Kukumber wrote:
| Kinda nice to have, D can do it as well:
|
|     import std;
|
|     void main()
|     {
|         int[] a = [1, 2, 3];
|         string[] b = ["a", "b", "c"];
|
|         foreach (e1, e2; zip(a, b))
|         {
|             writeln(e1, ":", e2);
|         }
|     }
|
|     1:a
|     2:b
|     3:c
| hota_mazi wrote:
| Question for Zig experts:
|
|     for (elems) |x| {
|         std.debug.print("{} ", .{x});
|     }
|
| Why is the .{x} necessary here? What happens if I just write "x"?
| Laremere wrote:
| Zig doesn't have variable length args. The .{} syntax is for an
| anonymous struct with no field names (which are called tuples in
| Zig). Print takes the struct's type info at compile time to
| check the validity of the statement, and also produce optimal
| code. This is implemented entirely within Zig's standard
| functionality that's available to all users.
|
| So, if you just type X, you're getting an error about it not
| being a struct. That's unless X is a struct with one field,
| where it'll just print that field. I find Zig meta-programming
| to actually be fairly readable, here's the function that does
| the formatting:
| https://github.com/ziglang/zig/blob/master/lib/std/fmt.zig
| anonymoushn wrote:
| `std.debug.print` and similar fmt-like functions take 2
| arguments. The first argument is a format string, and the
| second argument is a tuple. I think a tuple is an anonymous
| struct whose members are named 0, 1, 2, etc., but I'm not
| completely sure on this. If you just write "x", it won't work,
| since you needed to pass a tuple containing 1 thing, and x
| probably isn't a tuple containing 1 thing.
| quic5 wrote:
| The print function is implemented in the std library[1] not the
| compiler and Zig does not have varargs
|
| [1]
| https://github.com/ziglang/zig/blob/f6c934677315665c140151b8...
| AnIdiotOnTheNet wrote:
| Zig doesn't have varargs anymore. Instead, it has anonymous
| structs/arrays/tuples. The second argument to `print` here is
| expected to be a list of the values referenced by the `{}`
| placeholders in the string in the first argument.
|
| `.{a, b, c}` is the syntax for an anonymous struct/array/tuple,
| and a single element still needs to be wrapped in it.
| noobermin wrote:
| >In the multi-sequence for loop version it's only necessary to
| test once at the beginning of the loop that the two arrays have
| equal size, instead of having 2 assertions run every loop
| iteration.
|
| So, I'm assuming zig generally cannot be used with multi-threaded
| code? Can the underlying arrays not be modified during the whole
| loop execution?
| int_19h wrote:
| Arrays can be modified, but their size is a part of their type,
| just like C.
|
| For slices, length is only known at runtime, but it's immutable
| once the slice is created, so there's no issue there, either.
| [deleted]
| masklinn wrote:
| That's broken in every language, except for the few which just
| don't allow doing it. So I'm not quite sure what the question
| is about.
| spullara wrote:
| I'm not sure zip is used enough to add it to the language but
| since they are also using it for tracking the index maybe that is
| the primary use case.
| kristoff_it wrote:
| The example of using SoA memory layout is not there just as a
| random example. We hope for Zig developers to employ DOD
| principles whenever appropriate, which is not going to be that
| rare in a low-level programming language like Zig.
|
| Andrew has a full talk about how the Zig compiler benefits
| tremendously from DOD:
|
| https://vimeo.com/649009599?embedded=true&source=video_title...
| spullara wrote:
| Honestly working with them in a column oriented way makes a
| lot of sense. I wonder though if that should just be handled
| at the struct level? i.e. ask for row vs column layout.
| conaclos wrote:
| Is there other rationale behind this `for` and `while` syntax?
| Why not:
|
|     for x in elms {}
|     for x in elms, i in 0.. {}
|
| In my view, this seems simpler to read and understand. In
| particular, the iteration variable is close to the iterated
| structure. In the Zig proposal you have to think about the
| position of the iterated structure and the position of the
| iteration variable.
|
|     for (elms, 0..) |x, i| {}
|                ^^^      ^ Ok it is in second position
|
| I still don't get why `||`. Is this a lambda? Can I write
| something like:
|
|     fn func(x: i8, i: i8) void {}
|     for (elms, 0..) func
|
| In the same vein I do not understand the rationale behind the
| `while` syntax. Why `:`?
|
|     while (condition) : (i += 1) {
| messe wrote:
| Because it's consistent with the rest of Zig's capture syntax:
|
|     if (foo()) |result| {
|     while (it.next()) |elem| {
|     bar() catch |err| ...
| kazinator wrote:
| Here is Awk with C99 preprocessing: cppawk!
|
| Loop macro for parallel/nested iteration featuring a
| (user-extensible!) vocabulary of clauses:
|
|     $ cppawk '
|     #include <cons.h>
|     #include <iter.h>
|     BEGIN {
|       loop (list(iter0, item, list("alpha", "charlie", "bravo")),
|             list(iter1, ltr, list("a", "b", "c")),
|             range(i, 1, 3))
|       {
|         print item, ltr, i
|       }
|     }'
|     alpha a 1
|     charlie b 2
|     bravo c 3
|
| This is a tiny shell script plus a collection of header files in
| a small directory structure. It requires an Awk such as GNU Awk,
| and the GNU C preprocessor.
|
| Preprocessed programs can be captured, to run on systems that
| don't have the cppawk script or a preprocessor, and with less
| startup overhead.
|
| https://www.kylheku.com/cgit/cppawk/about/
| stephc_int13 wrote:
| I have absolutely no problem with the good old C-style for loop
| syntax.
|
| I think that a separate foreach or for-each loop, made for built-
| in or extended containers could be a nice addition.
|
| Not seeing much value there.
| titzer wrote:
| Regarding the lengths must match:
|
| > (i.e. you will get a panic in safe release modes)
|
| Should I take that to mean there is an unsafe release mode
| without the bounds check? But UB is mentioned too. Is there UB!?
|
| It's 2023; I think we can afford a single branch to avoid UB,
| even in release mode.
| conradev wrote:
| it is one of the build modes:
| https://ziglang.org/documentation/master/#Build-Mode
| titzer wrote:
| Interesting, thanks for the link.
|
| I think that disabling safety checks is a thing you should
| only do if you are studying the cost of safety checks (i.e. a
| compiler switch only available to compiler engineers). IMHO,
| the _whole point_ of safety checks is to find the bugs that
| are in your program[1]. Crashing it safely with an exact
| source stack trace is the nice way of both motivating you to
| fix it and also _helping_ you fix it.
|
| [1] And there _are_ bugs in your program. Right now. Bugs.
| In. Your. Program. Running without safety checks is like
| closing your eyes and rolling the dice.
| quic5 wrote:
| There's also the compromise of only disabling safety checks
| per block e.g. in your hot loop with `@setRuntimeSafety`[1]
| where you are confident that they aren't needed.
|
| [1]
| https://ziglang.org/documentation/0.10.1/#setRuntimeSafety
| laserbeam wrote:
| Most software should probably release with safety checks
| on. Certain software shouldn't (i.e. games). Toolchains
| like zig give you that option and respect that you can
| decide what's most appropriate for whatever you ship.
|
| Arguing that safety checks should always be enabled doesn't
| really make sense. Context matters.
| kristoff_it wrote:
| > Should I take that to mean there is an unsafe release mode
| without the bounds check?
|
| Yes, it's called ReleaseFast.
|
| > It's 2023; I think we can afford a single branch to avoid UB,
| even in release mode.
|
| Zig has 3 release modes: ReleaseSafe, ReleaseFast,
| ReleaseSmall. If you want the safety checks, just use the
| first.
|
| It might even be $currentYear, but many of the latest AAA games
| still don't always run at more than 60fps on my fairly powerful
| machine and I sincerely hope they were built with all the
| optimizations enabled.
| Maken wrote:
| That's probably because of the DRM.
| AnIdiotOnTheNet wrote:
| And 60fps is the low end in an era where many monitors are
| 120-240hz. 4ms/frame is a pretty tight budget.
| titzer wrote:
| > I sincerely hope they were built with all the optimizations
| enabled.
|
| Sure, but I don't agree that disabling safety checks is an
| "optimization". It is a regression in functionality that is
| betting on nothing going wrong.
|
| Bounds checks do not cost much[1]. Maybe if a bounds check
| disables vectorization[2].
|
| [1] https://blog.readyset.io/bounds-checks/
|
| [2] https://github.com/matklad/bounds-check-cost
|
| There are a lot of techniques to remove bounds checks, e.g.
| in counted loops [3][4].
|
| [3] https://ieeexplore.ieee.org/document/5381765
|
| [4] https://en.wikipedia.org/wiki/Bounds-checking_elimination
| AnIdiotOnTheNet wrote:
| If that's what you believe, you are free to enable them as
| the programmer. Programmers who disagree are likewise free
| not to.
|
| This can even be decided on a scope-by-scope basis if so
| desired.
| titzer wrote:
| If it's a program you wrote to run on your own hardware,
| feel free. But in reality most programmers write programs
| for other people's computers, or just write programs
| because it's fun or pays well, and then their work gets
| integrated into a larger whole at a much later date, and
| then it's run in contexts the original author never
| imagined, long after they move on. Safety checks catch
| the base-level logic bugs that would otherwise cause
| programs to go silently wrong and misbehave in complex
| and inscrutable ways. Disabling them is not just living
| dangerously, it's a moral hazard; the programmer doesn't
| suffer the consequences, users do. It's not your program
| or computer at risk, but someone else's. I don't know how
| as a profession we're so cavalier with shipping exposed
| whirling knives, but we are.
| verdagon wrote:
| If we go so far as to say that using anything unsafe is
| dangerous and a "moral hazard" then we would also have to
| disqualify Rust, C#, and any other language that allows
| unsafe escape hatches (especially in dependencies).
| AnIdiotOnTheNet wrote:
| > the programmer doesn't suffer the consequences, users
| do.
|
| The same is true of poorly performing programs. My
| computer's resources are not the programmers' to waste,
| yet they routinely do waste it to save themselves
| time[0].
|
| > I don't know how as a profession we're so cavalier with
| shipping exposed whirling knives, but we are.
|
| That's a separate problem than not handcuffing
| programmers and forcing them into safety checks. Why
| should Zig force this?
|
| Like, I just don't even get what you're complaining about
| here. The default build mode _and_ the recommended
| release one insert the check. Checks can additionally be
| enabled and disabled on a scope-by-scope basis. What
| exactly do you want? Just eliminate ReleaseFast as an
| option and give people more reasons to go back to
| footgun-laden C because it'll be the only way to
| eliminate a bounds check in a tight loop hot spot?
|
| [0] Yes, I know this isn't due to safety checks in the
| vast majority of circumstances, that's not the point. I
| have nothing against safety checks, my problem is with
| the mentality that it should not be possible to disable
| them. Even Rust has `unsafe`.
| adamrezich wrote:
| the mere naming of the keyword `unsafe` has been a wholly
| unintentional disaster for programming in general as more
| and more people use Rust, because
| "safe"/"safety"/"unsafe" are sort of emotionally-loaded
| words in English, and it's led people to build mental
| heuristics about the pros and cons of "safe" and "unsafe"
| code which may be subtly incorrect. the language feature
| itself is completely reasonable of course, given the
| design decisions of the language, but as Andy said
| elsewhere in this comments thread:
|
| > Rust evangelists need to be careful because in their
| zeal they have started to cause subtle errors in the
| general knowledge of how computers work in young people's
| minds. Ironically it's a form of memory corruption.
|
| I'm not even a zig user or fan or anything and I don't
| have any real opinion about Rust, either, except for
| completely agreeing with this analysis based on how I've
| seen Rust evangelists talk online. I'm not sure what the
| solution to this is, but it seems like it's just going to
| get worse over time as Rust becomes more popular and
| gains market share.
| Gene_Parmesan wrote:
| So isn't it on the programmer to ensure the safety checks
| are enabled if appropriate? I agree with the gist of your
| statement, I'm just not sure how this is the
| responsibility of the language itself. It ships with the
| option to build via a safe mode. I don't think it's a
| moral imperative of the language designer to ship without
| an unsafe mode. Even rust has unsafe blocks.
|
| In most engineering professions, it's the engineer's
| responsibility to ensure appropriate levels of safety,
| not the CAD software used to build the blueprints. But
| every situation doesn't have the same level of safety
| required; backyard sheds don't have the same needs as
| skyscrapers.
| titzer wrote:
| Most engineering disciplines are considerably more
| regulated than software development, and for good reason;
| bridges and skyscrapers falling down can kill people.
| Even electrical engineering and device manufacturing have
| to fit in with standards that address shock hazard and
| EMF interference.
|
| I actually _do_ think it is the responsibility of the
| language and runtime system to ensure some base-level
| safety of programs. The one constant over the years is
| that programmers keep making mistakes. No matter how much
| they keep yelling "trust us", they (we) just keep
| screwing up. That's not to pillory us programmers. It's
| just a fact that everyone screws up. In some sense,
| engineering is putting processes and procedures and
| checks in place that move human fallibility out of the
| critical load-bearing situations so that a simple whoops
| or memory slip doesn't kill people or ruin things.
| krona wrote:
| Without bounds checks: game crashes, core dump.
|
| With bounds checks: game crashes, meaningless error message
| given to the user.
|
| What am I missing?
| gaganyaan wrote:
| The meaningless error message can be entered into Google
| and the user can find a thread about how to fix their
| specific problem instead of wading through endless
| threads of similar-but-unrelated problems.
| ekimekim wrote:
| With bounds checks: game crashes, meaningless error
| message given to the user.
|
| Without bounds checks: I join a multiplayer lobby, and
| the next thing I know my computer is part of a botnet.
|
| This isn't an imaginary fear, it has happened many times.
| Some examples from a brief search:
| https://gridinsoft.com/blogs/rce-vulnerability-in-gta-online...
| https://www.polygon.com/22898895/dark-souls-pvp-exploit-mult...
| https://security.gerhardt.link/RCE-in-Factorio/
|
| I am not claiming all of these would've been prevented by
| bounds checking arrays, or even memory safety in general.
| The point is that security is not optional just because
| it's a game.
| dxhdr wrote:
| Now suppose your game runs in a WASM sandbox and re-run
| those scenarios. What do you gain from bounds checks?
|
| I'm not suggesting that shipping without bounds checks is
| wise or leads to a better product. However I do think
| with /some games/ security is basically not a concern.
| Arnavion wrote:
| Heartbleed still happens inside a sandbox, because it's
| the sandboxed memory that leaks. For multiplayer games
| specifically, that can be a client auth key that can be
| used to impersonate you.
| adgjlsfhk1 wrote:
| the bounds check can sometimes catch the error before it
| corrupts your save.
| titzer wrote:
| > Without bounds checks: game crashes, core dump.
|
| I think it's more like (assuming it does actually go out
| of bounds at some point):
|
| 30% chance of core dump right away
|
| 20% chance of core dump at some point after errant write
|
| 40% chance it never crashes in testing
|
| 5% chance it doesn't crash the first year after shipping
|
| 5% chance it never crashes
|
| With an explicit bounds check, all of these scenarios
| result in a crash at the exact location where the program
| first violated safety[1]. The developer gets a source-
| level crash and doesn't spend the first 20 minutes just
| trying to figure out what the crash dump even means.
|
| [1] Hopefully with a complete stacktrace, maybe even the
| index and length values!
|
| It's time we recognized that _all_ our tooling should be
| designed to help us programmers who _do have bugs in our
| program_. Like, this crashing part is the normal part
| that all the tools should help deal with.
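| To make the "crash at the exact location" point concrete, here is a small
| Rust sketch (chosen only as an example of a language with checked
| indexing; the helper name `out_of_bounds_message` is hypothetical). It
| captures the panic raised by an out-of-range index so the message can be
| inspected, and shows the non-panicking `get` alternative.

```rust
use std::panic;

/// Returns the panic message produced by indexing `data` at `idx`, if any.
/// Hypothetical helper for illustration only.
fn out_of_bounds_message(data: Vec<i32>, idx: usize) -> Option<String> {
    // `data[idx]` is bounds checked; on failure it panics at this exact line.
    let result = panic::catch_unwind(move || data[idx]);
    match result {
        Ok(_) => None,
        Err(payload) => {
            if let Some(s) = payload.downcast_ref::<String>() {
                Some(s.clone())
            } else if let Some(s) = payload.downcast_ref::<&str>() {
                Some(s.to_string())
            } else {
                Some(String::from("<non-string panic payload>"))
            }
        }
    }
}

fn main() {
    let data = vec![10, 20, 30];
    // `get` reports the failure as a value instead of panicking.
    println!("{:?}", data.get(5));
    println!("{:?}", out_of_bounds_message(data, 5));
}
```

| The default panic hook also prints the source location of the failed
| access to stderr, which is the kind of source-level report described
| above.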
| nordsieck wrote:
| > What am I missing?
|
| Sometimes without bounds checking you get an exploit
| instead of a crash.
| kristoff_it wrote:
| btw note that you're arguing this point in the thread of
| a blog post about a feature that is all about maintaining
| safety while not paying for it at runtime. There's an
| entire section dedicated to explaining this point.
| tuukkah wrote:
| > _The new multi-sequence syntax allows you to loop over two or
| more arrays or slices at the same time_
|
| In Haskell, this is called a parallel list comprehension:
| [x+y | x <- xs | y <- ys]
|
| In a normal list comprehension, you have a single pipe, in a
| parallel one you have as many pipes as how many lists you are
| zipping.
| https://downloads.haskell.org/ghc/latest/docs/users_guide/ex...
| masklinn wrote:
| No, your list comprehension is a product (it iterates ys for
| every x). The feature here is zip.
| tuukkah wrote:
| I'll quote from the documentation link I referenced:
|
| > _For example, the following zips together two lists:_
| [ (x, y) | x <- xs | y <- ys ]
|
| That's precisely the difference between a normal list
| comprehension (one pipe) and a parallel list comprehension
| (multiple pipes).
|
| For clarity, here's your normal list comprehension (with one
| pipe) that produces all the combinations instead:
| [ (x, y) | x <- xs, y <- ys ]
|
| And here's the full example from the article converted to
| Haskell and producing the exact same output:
|     {-# LANGUAGE ParallelListComp #-}
|     import Control.Monad (mapM)
|     elems = [ "water", "earth", "fire", "air" ]
|     nats = [ "tribes", "kingdom", "nation", "nomads" ]
|     main = mapM putStrLn [ show idx ++ " - " ++ e ++ " " ++ n
|                          | e <- elems
|                          | n <- nats
|                          | idx <- [0..]
|                          ]
|
| EDIT: I suppose an explicit zip with an anonymous function
| looks more idiomatic though:
|     main = forM (zip3 elems nats [0..]) $ \(e, n, idx) ->
|         putStrLn (show idx ++ " - " ++ e ++ " " ++ n)
|
| EDIT2: Best of both worlds with the list monad?
|     main = mapM putStrLn $ do
|         (e, n, idx) <- zip3 elems nats [0..]
|         [ show idx ++ " - " ++ e ++ " " ++ n ]
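| The zip-versus-product distinction discussed in this subthread is not
| specific to Haskell. A minimal sketch in Rust, with hypothetical function
| names `zipped` and `product`, might look like this:

```rust
/// Pairs elements position by position (what a multi-sequence loop does).
fn zipped(xs: &[i32], ys: &[i32]) -> Vec<(i32, i32)> {
    xs.iter().copied().zip(ys.iter().copied()).collect()
}

/// Every combination of an element of `xs` with an element of `ys`
/// (what a comprehension with comma-separated generators does).
fn product(xs: &[i32], ys: &[i32]) -> Vec<(i32, i32)> {
    xs.iter()
        .flat_map(|&x| ys.iter().map(move |&y| (x, y)))
        .collect()
}

fn main() {
    let (xs, ys) = ([1, 2], [10, 20]);
    println!("{:?}", zipped(&xs, &ys));
    println!("{:?}", product(&xs, &ys));
}
```

| For two inputs of length n, the zip yields n pairs while the product
| yields n * n.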
| arethuza wrote:
| I was fond of the Common Lisp loop macro that handled iterating
| over multiple things quite nicely:
|
| https://lispcookbook.github.io/cl-cookbook/iteration.html#lo...
|
| Edit: 27 years since I was paid to write Lisp....
| masklinn wrote:
| This is very strange, because it looks like a comprehension
| (https://wiki.haskell.org/List_comprehension), which would be a
| product iteration.
|
| Most languages have a function called zip or something similar
| (https://hackage.haskell.org/package/base-4.17.0.0/docs/Prelu...)
| which handles pairing sequences, to be composed upstream of
| the iteration proper.
| tuukkah wrote:
| It's a _parallel_ list comprehension, as linked from the wiki
| page you referenced:
| https://downloads.haskell.org/ghc/9.4.4/docs/users_guide/ext...
| MrBuddyCasino wrote:
| I must say this looks pleasant, coming from Kotlin. Also ranges
| seem to work very similarly.
|
| Not sure how I feel about the UB - is it really necessary to
| optimise away a single length check per loop (not iteration)?
| AnIdiotOnTheNet wrote:
| That's up to the programmer. Zig's default build mode is Debug,
| and ReleaseSafe is recommended if you don't require extreme
| performance. Both modes will insert the check.
|
| Safety checks can also be enabled or disabled on a scope-by-
| scope basis if desired.
| carterschonwald wrote:
| Zipwith is a great iteration api to have available.
| andrewstuart wrote:
| Off topic, but I was weighing up trying Zig last night for a
| project.
|
| No doubt Zig has changed a lot and is better than it was only a
| year or two ago.
|
| Is anyone here willing to say if they have experienced success
| and satisfaction using Zig? I'm wanting to do some C library
| interfacing.
| blameitonme wrote:
| Hey, I'm just a student and can't even think of building stuff of the
| complexity most of the guys here make rn, but I made a JSON
| parser in Zig and it was fun.
| marmada wrote:
| I really like that for loops can be expressions. It seems obvious
| in hindsight, but hindsight is always 20/20 :)
| masklinn wrote:
| > It seems obvious in hindsight
|
| It's not, because most languages don't have an `else` clause in
| their for loop (and in my experience with Python that clause is
| quite confusing so its use is not common).
|
| And a for loop can be executed 0 times, so without a mechanism
| for a fallback it might not have a value _to_ yield.
| Someone wrote:
| > And a for loop can be executed 0 times, so without a
| mechanism for a fallback it might not have a value to yield.
|
| I would think that and the similar case where no iteration
| hits _break_ are solvable by having a _for_ loop return an
| optional type.
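| One way to picture a loop that "returns an optional" is the following
| Rust sketch (the function name `first_negative` is hypothetical). Rust's
| own `for` is not a value-yielding expression, so the sketch wraps the
| loop in a function: an early `return Some(..)` plays the role of `break`
| with a value, and the trailing `None` covers both the zero-iteration case
| and the case where no iteration breaks.

```rust
/// Index of the first negative number, or `None` when the loop finishes
/// without an early exit (including when `values` is empty).
fn first_negative(values: &[i32]) -> Option<usize> {
    for (idx, &v) in values.iter().enumerate() {
        if v < 0 {
            return Some(idx); // the "break with a value" path
        }
    }
    None // the fallback path
}

fn main() {
    println!("{:?}", first_negative(&[3, -1, -2]));
    println!("{:?}", first_negative(&[]));
}
```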
| avgcorrection wrote:
| Special-casing (same-length) zip and iteration+count might make
| sense for an imperative language which doesn't want to go down
| the rabbit hole of implementing efficient, lazy iterators. It
| doesn't make sense in a language where you want the flexibility
| of switching between (as in: compiling to) serial loops and
| parallel code, but it makes sense for a language which leans more
| towards what-you-see-is-what-you-get rather than sufficiently-
| smart-compiler.
| noobermin wrote:
| Tbh, there are limits to how "wysiwyg" the compilation of for loops
| can be in any language. For example,
| any "for" loop can be a "while" loop in asm, the one
| optimization is you can use the index registers as long as the
| number of arrays is less than the number of index registers you
| have. If it is more, which the language does not constrain of
| course, you just go back to a loop with memory locations for
| pointers. But of course, in that case then, you _must_ have a
| "smart compiler" that can decide which case it is and thus
| compile to the right code.
|
| That said, this likely will be an esoteric case on most modern
| machines (like x86_64 has 16 regs that can be used for indexes)
| and I doubt people want to use this for like avr.
| nemo1618 wrote:
| This is a gripe I have about Go -- a very minor gripe, to be
| sure, but it's still there. If you want to iterate over two
| arrays/slices that have the same length, you have to choose
| between:
|     for i := 0; i < n; i++ {
|         fn(foo[i], bar[i])
|     }
|     for i := range foo {
|         fn(foo[i], bar[i])
|     }
|     for i := range bar {
|         fn(foo[i], bar[i])
|     }
|     for i, x := range foo {
|         fn(x, bar[i])
|     }
|     for i, y := range bar {
|         fn(foo[i], y)
|     }
|
| But none of these are satisfactory; what I _really_ want to write
| is:
|     for _, (x, y) := range (foo, bar) {
|         fn(x, y)
|     }
| maxmcd wrote:
| This is pretty ugly and adds the overhead of a function
| callback, but just for fun:
|     func multiLoop[X, Y any](x []X, y []Y, cb func(i int, x X, y Y)) {
|         if len(x) != len(y) {
|             panic("invalid slice lengths")
|         }
|         for i := 0; i < len(x); i++ {
|             cb(i, x[i], y[i])
|         }
|     }
|     func foo() {
|         multiLoop([]int{1, 2, 3}, []string{"a", "b", "c"}, func(i int, x int, y string) {
|             fmt.Println(i, x, y)
|         })
|     }
| masklinn wrote:
| FWIW this is often called `zipWith`, or sometimes just `map`
| (some `map` implementations can take a variable number of
| sequences to map over).
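| For comparison, a `zipWith`-style helper with the same up-front length
| check can be sketched in Rust. The name `zip_with` is hypothetical (it is
| not a standard library function); the sketch only combines the standard
| `zip` and `map` adapters.

```rust
/// Applies `f` to corresponding elements of two equally long slices.
/// Hypothetical helper; panics if the lengths differ.
fn zip_with<A, B, C>(xs: &[A], ys: &[B], mut f: impl FnMut(&A, &B) -> C) -> Vec<C> {
    // One length check before iterating, as in the callback version above.
    assert_eq!(xs.len(), ys.len(), "slices must have the same length");
    xs.iter().zip(ys).map(|(x, y)| f(x, y)).collect()
}

fn main() {
    let sums = zip_with(&[1, 2, 3], &[10, 20, 30], |a, b| a + b);
    println!("{:?}", sums);
}
```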
| rcme wrote:
| This kind of syntactic sugar used to appeal to me, but now I
| think it's a pretty weird feature to add to a language. Using zip
| / enumerate primitives feels a lot more flexible.
| cryptonector wrote:
| To me this looks a lot like closure syntax w/ non-local exits.
| Seems quite reasonable for a functional programming language.
| moomin wrote:
| I think it matters what your target use cases are. This makes
| me think quite a few people are running ECS systems.
| cgh wrote:
| How cache-friendly are zip/enumerate implementations? Zig is
| influenced by the ideas behind Data Oriented Design, mentioned
| in the article (and a buried lede, if you ask me). Explicit for
| loops like this are generally cache-friendly and ideal for eg
| game programming, as shown in the structs of arrays example.
| [deleted]
| steveklabnik wrote:
| I tossed together a simple function using enumerate
| https://godbolt.org/z/PKsEdKvKK
|
| You get the same exact asm as the manual loop.
|
| Of course, the idiom recognition seems to kick in, in both
| cases, as there's no actual loop here. I tossed in a +sum,
| which makes that fail, so you get some loops, check it out:
|
| https://godbolt.org/z/1ddf5ded7
|
| They are one instruction different in length, which is kind
| of amusing to me. Some small differences.
| cgh wrote:
| Thanks, that's exactly what I was asking. I don't write
| Rust so it's informative to see this.
| steveklabnik wrote:
| Any time.
| vore wrote:
| As cache-friendly as advancing two pointers and a bounds
| check.
| [deleted]
| pmontra wrote:
| I don't know how common working with ranges is in Zig. Ruby
| would iterate on multiple ranges by converting them to arrays
|     one = (1..3)
|     seven = (7..10)
|     (one.to_a + seven.to_a).each {|n| puts n}
|
| I suppose that if it was common they would have added a +
| method to Range. Actually I think that's possible to implement
| it with a refinement on the Range class.
|
| Yup, it works. First time I ever used refinements.
|     module JoinRanges
|       refine Range do
|         def +(other)
|           self.to_a + other.to_a
|         end
|       end
|     end
|     using JoinRanges
|     one = (1..3)
|     seven = (7..10)
|     (one + seven).each {|n| puts n}
| kdmccormick wrote:
| This is different. You are concatenating the arrays, whereas
| the article & discussion are about zipping arrays.
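| The difference between concatenating and zipping can be shown in a few
| lines. This is a Rust sketch with hypothetical function names, using the
| same inclusive ranges 1 to 3 and 7 to 10 as the Ruby snippet:

```rust
use std::ops::RangeInclusive;

/// One sequence after the other: all of `a`, then all of `b`.
fn concatenated(a: RangeInclusive<i32>, b: RangeInclusive<i32>) -> Vec<i32> {
    a.chain(b).collect()
}

/// Both sequences in lockstep: pairs of corresponding elements.
fn zipped(a: RangeInclusive<i32>, b: RangeInclusive<i32>) -> Vec<(i32, i32)> {
    a.zip(b).collect()
}

fn main() {
    println!("{:?}", concatenated(1..=3, 7..=10));
    println!("{:?}", zipped(1..=3, 7..=10));
}
```

| Note that this particular `zip` stops at the shorter input, whereas the
| multi-sequence loop from the article requires equal lengths.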
| laserbeam wrote:
| Depends on what you mean by flexible. If you want to use them
| outside of loops then they could cause magic data copies under
| the hood. Zig really hates hidden control
| flow/allocations/copies. Within the for syntax it's pretty
| straightforward what gets assigned to what variables and how
| copies can be avoided.
|
| Doing things like `a = @zip(some_list, some_other_list)` can be
| reasoned about in multiple ways, some of which involve silently
| calling malloc. It's particularly unclear what could be done
| with `a` afterwards. Zig hates that kind of ambiguity and is
| happy to err away from flexibility at times.
| brundolf wrote:
| Rust also hates hidden allocations, and its iterator system
| can do all of this without them
|
| Although- thinking about it, that may rely on the borrow
| checker (move semantics specifically)
| gpanders wrote:
| >Rust also hates hidden allocations
|
| Does it? Rust seems happy to allocate silently all the
| time.
|     let x = String::new("hi");
|     let y = vec![];
|
| Do either of these allocate? As the writer or reader of
| this code, how do I know if either of these statements
| result in a heap allocation, or if the data is strictly on
| the stack?
|
| Zig's requirement of explicitly passing around an Allocator
| type removes any ambiguity completely.
| steveklabnik wrote:
| (You're forgetting a "new" in the string example)
| gpanders wrote:
| Thanks, I fixed it :)
| brundolf wrote:
| Sure, any arbitrary function (or macro) logic can
| allocate. It's more a philosophy, not something that's
| language-enforced[0] in Rust- if you're creating a
| mutable, variable-size data structure like a String or a
| Vec or a HashMap you're not going to be very surprised
| that it allocates at some point (though technically zero-
| length Vecs don't allocate on construction, they wait
| until an item is added)
|
| But closures don't require allocation, iterators don't
| require allocation, async doesn't require allocation.
| Copy semantics also don't allow allocation- implicit
| copies can only happen for data structures that are
| bitwise-copyable, which is enforced by the compiler. For
| copy-with-allocation you have to implement the Clone
| trait, and then invoke it explicitly with the .clone()
| method
|
| But the original context was a question of philosophy, so
| I was only speaking to Rust's overall philosophy
|
| [0] Technically I think if you're using no_std you won't
| have access to any standard constructs that allocate
| (which obviously will prevent their use at compile-time),
| though I believe you're still allowed to eg. call out to
| foreign functions manually that would allocate. And of
| course, this still isn't as granular as Zig's allocation-
| control.
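| The allocation question can be probed from safe code by looking at
| capacities. The following Rust sketch uses a hypothetical helper,
| `capacities`; since `String::new` takes no arguments, the two-byte string
| is built with `String::from`. A capacity of 0 means no heap buffer has
| been requested yet.

```rust
/// Capacities of an empty `String`, a two-byte `String`, and an empty `Vec`.
/// Hypothetical helper for illustration only.
fn capacities() -> (usize, usize, usize) {
    let empty = String::new();
    let hi = String::from("hi");
    let v: Vec<u8> = vec![];
    (empty.capacity(), hi.capacity(), v.capacity())
}

fn main() {
    let (empty, hi, v) = capacities();
    println!("empty String: {}, \"hi\": {}, empty Vec: {}", empty, hi, v);

    let point = (1, 2); // a tuple of integers is `Copy`
    let same = point; // implicit bitwise copy, `point` stays usable
    let name = String::from("zig");
    let duplicate = name.clone(); // the allocating copy must be spelled out
    println!("{:?} {:?} {} {}", point, same, name, duplicate);
}
```

| Nothing at the call site distinguishes the allocating constructor from
| the non-allocating ones, which is the ambiguity raised earlier in this
| subthread.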
| [deleted]
| masklinn wrote:
| That's nothing special though, `zip` just takes an item
| from each iterator, packs them into a tuple, and yields
| that. It has no weird bounds or requirements or anything:
| https://doc.rust-lang.org/std/iter/fn.zip.html
|
| The impl of the default `next` is:
|     fn next(&mut self) -> Option<(A::Item, B::Item)> {
|         let x = self.a.next()?;
|         let y = self.b.next()?;
|         Some((x, y))
|     }
|
| So completely straightforward.
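| To see that `next` in context, here is a self-contained sketch of a
| two-iterator zip adapter. The names `MyZip` and `my_zip` are hypothetical
| stand-ins, not the standard library's types, and the sketch omits the
| specializations the real implementation has.

```rust
/// A minimal zip adapter over two iterators (illustrative stand-in).
struct MyZip<A, B> {
    a: A,
    b: B,
}

impl<A: Iterator, B: Iterator> Iterator for MyZip<A, B> {
    type Item = (A::Item, B::Item);

    fn next(&mut self) -> Option<Self::Item> {
        // `?` returns `None` as soon as either side is exhausted.
        let x = self.a.next()?;
        let y = self.b.next()?;
        Some((x, y))
    }
}

fn my_zip<A: Iterator, B: Iterator>(a: A, b: B) -> MyZip<A, B> {
    MyZip { a, b }
}

fn main() {
    let names = ["water", "earth", "fire"];
    // The unbounded counter stops when `names` runs out.
    for (idx, name) in my_zip(0.., names.iter()) {
        println!("{} - {}", idx, name);
    }
}
```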
| defen wrote:
| Can that zip more than two iterators? And does it perform
| a bounds check on each call to `a.next()` and `b.next()`?
| brundolf wrote:
| It stops when one of the two iterators ends
|
| It can't zip more than two per se, but you could zip the
| result of the first zip into a third and get ((item1,
| item2), item3). You could then map these if you wanted,
| to flatten them into a single tuple .map(|((item1,
| item2), item3)| (item1, item2, item3))
|
| Of course there's a trade-off here between ergonomics and
| generality
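| The nested-zip-then-flatten pattern described here can be written out as
| a short Rust sketch. The function name `zip3` and the element types are
| arbitrary choices for illustration.

```rust
/// Zips three slices by chaining two `zip` calls, then flattens the nested
/// tuple `((x, y), z)` into `(x, y, z)` with `map`.
fn zip3(a: &[i32], b: &[char], c: &[bool]) -> Vec<(i32, char, bool)> {
    a.iter()
        .zip(b)
        .zip(c)
        .map(|((&x, &y), &z)| (x, y, z))
        .collect()
}

fn main() {
    println!("{:?}", zip3(&[1, 2], &['a', 'b'], &[true, false]));
}
```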
| masklinn wrote:
| > It can't zip more than two per se, but you could zip
| the result of the first zip into a third and get ((item1,
| item2), item3). You could then map these if you wanted,
| to flatten them into a single tuple .map(|((item1,
| item2), item3)| (item1, item2, item3))
|
| FWIW that's more or less what `itertools::izip!` does for
| you, it just chains `zip`s then "splats" them using a
| `map`.
| defen wrote:
| > It stops when one of the two iterators ends
|
| Right; my question is, suppose you're iterating over two
| slice iterators - won't each call to `a.next()` and
| `b.next()` have to check whether that sub-iterator is
| done? One of the benefits of the Zig approach is that you
| can iterate over an arbitrary number of slices and do one
| check before entering the loop, followed by the compiler
| emitting unchecked index access in the loop. So it
| basically compiles down to the equivalent of a C `for`
| loop.
| masklinn wrote:
| Rust's zip has a specialisation for iterators with a
| trusted length. Such as slice iterators.
|
| `zip` yields exactly the same assembly as a loop over the
| index range with an unsafe item access:
| https://godbolt.org/z/7ebfxbhxc
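| The "one check before the loop, unchecked access inside it" shape from
| this exchange can be sketched by hand in Rust. Both function names below
| are hypothetical, and the claim is only about the structure of the code,
| not about what any particular compiler emits.

```rust
/// Adds `src` into `dst` element-wise with a single length check up front.
fn add_assign_unchecked(dst: &mut [i32], src: &[i32]) {
    assert_eq!(dst.len(), src.len(), "length mismatch");
    for i in 0..dst.len() {
        // SAFETY: `i < dst.len()` by the loop bound, and
        // `src.len() == dst.len()` was asserted above.
        unsafe {
            *dst.get_unchecked_mut(i) += *src.get_unchecked(i);
        }
    }
}

/// The same operation through `zip`, entirely in safe code.
fn add_assign_zip(dst: &mut [i32], src: &[i32]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d += *s;
    }
}

fn main() {
    let mut a = [1, 2, 3];
    add_assign_unchecked(&mut a, &[10, 20, 30]);
    let mut b = [1, 2, 3];
    add_assign_zip(&mut b, &[10, 20, 30]);
    println!("{:?} {:?}", a, b);
}
```

| One behavioral difference: the `zip` version silently stops at the
| shorter slice, while the first version panics on a length mismatch.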
| brundolf wrote:
| Interesting, are "trusted-length" iterators something
| that might ever make it into userspace? Maybe as const
| generics?
| masklinn wrote:
| It's already in userspace, though nightly (and unsafe,
| obviously), so whether it'll be stabilised, and in what
| form, is an open question:
| https://doc.rust-lang.org/std/iter/trait.TrustedLen.html
| the8472 wrote:
| For Zip it's TrustedRandomAccess[0] instead of
| TrustedLen. Imo the most radioactively unsafe trait in
| the standard library and will likely never be stabilized
| in its current form.
|
| [0] https://github.com/rust-lang/rust/blob/f540a25745e03cfe9eac7...
| kristoff_it wrote:
| You have to pass -O though, the point of Zig's for loop
| syntax is to get fast compile times and good performance
| also in debug mode :^)
| defen wrote:
| That's cool. At the same time though, it almost feels
| like a distinction without a difference in some ways -
| Zig has a special built-in syntax; Rust doesn't use
| special syntax, but it does use complex special-cased
| unsafe code in the stdlib in order to implement a safe +
| performant API.
| masklinn wrote:
| On the other hand, the "special cased unsafe code" is
| applicable to more than just zip, more than just the one
| array type, and is available in userland (though
| currently unstable so nightly only, both to implement it
| on a bespoke type and to rely on it).
| brundolf wrote:
| Rust's is built on top of (and exposed to) Iterators,
| which are a very general concept that can be rooted in
| all kinds of data structures, composed in all kinds of
| ways, and collected/processed in all kinds of ways (i.e.
| the user's code might not even contain an actual loop).
| The code continues to work in many situations, even where
| the optimization doesn't apply
|
| You trade some special-case syntax and ergonomics for
| that generality, but it is very general even if not all
| of it is optimized in the same way
| brundolf wrote:
| I was thinking about the fact that whatever you're
| iterating over has to be copied around throughout the
| process. Rust can guarantee that eg. deep-copies (clones)
| of allocated structs will never happen implicitly, if
| your iterator owns the values being iterated. But in
| languages where copying can trigger allocations, this
| could be a problem
|
| I don't actually know whether that applies to Zig though
| hryx wrote:
| In general Zig forgoes syntactic sugar and requires
| implementing higher-level APIs by composing primitives. But a
| new language feature is a candidate when it solves a use case
| that can't otherwise be solved, or opens up a path to more
| efficient code.
|
| Loris' blog post points out that the new for loops address the
| latter:
|
| > In the multi-sequence for loop version it's only necessary to
| test once at the beginning of the loop that the two arrays have
| equal size, instead of having 2 assertions run every loop
| iteration. The multi-sequence for loop syntax helps convey
| intention more clearly to the compiler, which in turn lets it
| generate more efficient code.
|
| It also builds on existing properties of slices/arrays, rather
| than adding a new "enumerate primitive".
| travisgriggs wrote:
| This is my take as well. The older and more travelled I get the
| more I disdain these kinds of things. Your language syntax
| should do whatever the "thing" is that your language model is
| all about. Syntactic sugar should be for the things you do
| LOTS.
|
| I watch language after language add sugar to maintain the
| appeal of their product, one niche group or application at a
| time. It turns into a death by a thousand cuts, or by a
| thousand sugar cubes. Most languages start out simple and
| appealing and understandable, an increasingly short amount of
| time later, they've layered on "helper" after "helper" to the
| point it takes a bit of expertise to consume the language
| effectively.
|
| I dream of a world where we'd measure languages by the
| complexity of their ASTs rather than their popularity on a
| TIOBE or StackOverflow index.
| AndyKelley wrote:
| Arguably this change to the zig language is overall a
| simplification because the loop index capture is no longer a
| special case.
| travisgriggs wrote:
| Could be. I think you're more the expert here than me? :D
|
| To me, the following is a bit of syntactic sugar that I
| think is the kind of transcendental "go big/basic with it"
| that I hint at.
|
| Some time ago, I worked in a language that had this idea
| that any composable block of code could be captured as 0-N
| statements between the characters [ and ]. They thought
| they were being clever and called it a "block of code".
| Which I thought was cool, because it looked like a block.
| Pedants called it a BlockClosure. If you wanted to pass
| parameters to one of these, they used a colon denoted list.
| So a two arg block might look like
|
| [:a :b | <code goes here> ]
|
| So yay, pass a closure to a service, and it "captures" the
| values by invoking said closure with arguments.
|
| And then the authors thought, okay, enough sugar for a few
| days, let's just use this. I mean really really really use
| this.
|
| You can use a two arg block like that for a zip function of
| course, but why limit it to iteration? Use it in the
| standard library to implement the "for each" function.
| Which, when you looked at it, was just that "how dare they not
| have a for syntax" while implementation. But because it
| wasn't embedded in the syntax, you could copy/paste/modify
| to come up with a filter iteration. Or a reduce. Or a map.
| Or all kinds of interesting compositions
| "selectAndCollectAndReject" with 3 closures.
|
| And why stop there? They decided, "let's just do boolean
| logic with these block things too". So whereas most
| languages have special syntax for conditionals (and once
| they start, they're in competition with their peers to keep
| adding more and more of them: do while, case, if, if with N
| elses, on and on), they just wrote it like
|
| <condition> ifTrue: [trueBlock] ifFalse: [falseBlock]
|
| Sure they optimized it, but from a linguistic point of
| view, it was the same thing as above. No new sugar was
| needed.
|
| Whereas many languages have added sugar for optionals
| (usually involving ?s), this language, 20 years ago, was
| doing it with closures already. Someone noticed they could
| implement the following family of "functions"
|
| ifNil: [nilBlock]
|
| ifNil: [nilBlock] notNil: [:notNilValue | notNilBlock]
|
| ifNotNil: [:notNilValue | notNilBlock]
|
| Sure, not as terse as ? (which some endeavoured to deal
| with), but the language semantics didn't have to change
| each time there was a new thing to do.
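| The "control flow as ordinary calls taking blocks" idea carries over to
| any language with closures. Here is a rough Rust sketch: the function
| `if_true_if_false` is a hypothetical analog of `ifTrue:ifFalse:`, and
| `describe` leans on the standard `Option::map_or_else` as a loose
| counterpart of `ifNil:notNil:`.

```rust
/// `cond ifTrue: [...] ifFalse: [...]` as a plain function over closures.
fn if_true_if_false<T>(
    cond: bool,
    on_true: impl FnOnce() -> T,
    on_false: impl FnOnce() -> T,
) -> T {
    if cond {
        on_true()
    } else {
        on_false()
    }
}

/// Roughly `value ifNil: [...] notNil: [:v | ...]`.
fn describe(value: Option<i32>) -> String {
    // First closure runs for the "nil" case, second receives the value.
    value.map_or_else(|| String::from("nil"), |v| format!("got {}", v))
}

fn main() {
    println!("{}", if_true_if_false(2 > 1, || "yes", || "no"));
    println!("{} / {}", describe(None), describe(Some(3)));
}
```

| Only the closure that matches the condition is ever invoked, which is
| what lets a library function stand in for built-in syntax here.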
|
| I'm sure there's a Lisper out there that can write their
| analog to the above. Because it, too, is one of these
| "do much with little" languages.
| duped wrote:
| "Sugar" is implemented by converting an AST into itself, so
| it wouldn't change its "complexity" at all.
| Bekwnn wrote:
| Working on low level performance sensitive code in games,
| this is something I see in code LOTS.
|
| As mentioned in the article, data oriented design runs into
| the pattern of wanting to iterate over parallel arrays of
| data frequently.
| Terretta wrote:
| coming from total unawareness of zig: in the for (1..5)
| construct, these integer ranges consistently not including the
| upper limit element when lists do include the last element, seems
| surprising. i guess it's a range boundary (1 TO 5), not a list (1
| THROUGH 5), but the other behavior feels like a list, so it feels
| like 5 should be in.
| throwawaymaths wrote:
| It's consistent with (some) other languages, for example iirc
| in Ruby ... is exclusive of the last item and .. includes the
| last item.
| cmoski wrote:
| I completely agree with you on the madness of not including the
| upper limit. However, I don't see how the phrase "one to five"
| would not include five. "Rate this film on a scale of one to
| five" does not mean four is the highest rating.
|
| It translates to "increment from one, stopping before you get
| to five". Ridiculous.
| kzrdude wrote:
| It's not ridiculous, "1 to 5" translates into it starts with
| 1 and ends with 5, and both versions are ambiguous on the
| point of including the endpoint or not. In a programming
| context, it seems "clear" that it's ambiguous or down to
| convention.
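| Some languages sidestep the ambiguity by giving each convention its own
| spelling. Rust, for instance, writes the half-open range as `1..5` and
| the inclusive one as `1..=5`; the helper names in this sketch are
| hypothetical.

```rust
/// `1..5` stops before 5.
fn half_open() -> Vec<i32> {
    (1..5).collect()
}

/// `1..=5` includes 5.
fn inclusive() -> Vec<i32> {
    (1..=5).collect()
}

fn main() {
    println!("{:?}", half_open());
    println!("{:?}", inclusive());

    // The half-open form lines up with zero-based indexing: `0..len`
    // visits every valid index and nothing else.
    let items = ["a", "b", "c"];
    for i in 0..items.len() {
        println!("{} {}", i, items[i]);
    }
}
```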
| andrewstuart wrote:
| Hang on, I was reading last night that Zig has no for loop? That
| you have to use while.... is this not correct?
| messe wrote:
| It has no "for (init; cmp; step)" type loop, and instead you
| had to use:
|     var i: usize = 0;
|     while (i < sz) : (i += 1) { ... }
|
| Meaning that the scope for i would leak.
|
| It did have a foreach-style for loop, as seen in the article
| though.
| kzrdude wrote:
| Looks like they have the most important part in place, the
| increment before the next iteration.
| tialaramex wrote:
| > Ranges can only exist as an argument to a for loop. This means
| that you can't store them in variables
|
| I am confident this is a mistake. Every time you make a new kind
| of "thing" in your language somebody will want to do all the same
| stuff with it that they did with the other things, such as
| integers, ie in this case store a range in a variable. Ideally
| you'd just always be able to do that, see Lisp, but it can get
| very unwieldy, thus this is a reason to avoid making new kinds of
| thing so the issue doesn't arise.
|
| C++ chooses to actually do the heavy lifting here, which is why
| std::format (and its inspiration fmt::format) was such an
| enormous undertaking -- C++ can express the idea of a function
| which takes a variable number of arguments and yet all those
| arguments are independently type checked at compile time, not via
| compiler magic but just as a normal feature of the language. This
| is an enormous labour, and because they don't have any way to fix
| syntax issues the resulting problems accumulate forever in their
| language so I cannot recommend it as a course to other languages.
| It's like the Pyramids, do not build giant stone tombs for your
| leaders, this is a bad idea and your society should not copy it -
| however, the ancient Egyptians already did build giant stone
| tombs and they're pretty awesome to look at.
|
| Anyway, Rust chose to make its half-open range type
| std::ops::Range an actual type which you can store in a variable,
| pass to functions, modify etc. as well as using it in a for loop.
| Obviously don't copy Rust here exactly, for one thing Range
| should probably be IntoIterator, not an Iterator itself if they
| had it over, but you will wish this was an ordinary type in your
| language, so, just do it now.
|     let a = 0..4; // The Range starting at zero and (non-inclusively) ending at four.
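| A slightly fuller sketch of what "an ordinary type" buys you, using only
| `std::ops::Range` and its standard methods (the function `sum_range` is a
| hypothetical example of passing a range to a function):

```rust
use std::ops::Range;

/// Takes a range by value, like any other argument.
fn sum_range(r: Range<u32>) -> u32 {
    r.sum()
}

fn main() {
    // Stored in a variable, queried, cloned, passed along, then looped over.
    let a: Range<u32> = 0..4;
    println!(
        "len = {}, contains 3: {}, contains 4: {}",
        a.len(),
        a.contains(&3),
        a.contains(&4)
    );
    // `Range` is an iterator and not `Copy`, so reuse needs an explicit clone.
    println!("sum = {}", sum_range(a.clone()));
    for i in a {
        println!("{}", i);
    }
}
```

| The explicit `clone` is a consequence of `Range` being an `Iterator`
| itself, which is the design choice questioned above.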
| masklinn wrote:
| The problem is that zig's designers apparently don't want to
| introduce an iterator abstraction, hence the frankenstein-ing
| of the for loop instead.
|
| Though in fairness getting an iterator abstraction to the same
| efficiency as a for loop requires pretty brutal optimisations,
| frankensteining your for loop, a lot less so.
| matu3ba wrote:
| Yes, this makes debug builds bloated and slow.
| throwawaymaths wrote:
| I don't know how ranges are implemented now (and I'm too lazy
| to check right now) but it's entirely possible zig's ranges
| could wind up as comptime-only values.
|
| Then you _could_ pass them around, but only at comptime,
| which will achieve many of the things you expect.
|
| There's also nothing stopping you from creating an iterator
| interface in userland.
| xigoi wrote:
| > Though in fairness getting an iterator abstraction to the
| same efficiency as a for loop requires pretty brutal
| optimisations
|
| How about Nim's inline iterators?
___________________________________________________________________
(page generated 2023-02-27 23:01 UTC)