[HN Gopher] An intro to Zig's integer casting for C programmers
___________________________________________________________________
An intro to Zig's integer casting for C programmers
Author : jorangreef
Score : 120 points
Date : 2021-05-11 08:17 UTC (14 hours ago)
(HTM) web link (www.lagerdata.com)
(TXT) w3m dump (www.lagerdata.com)
| gher-shyu3i wrote:
| Zig is shaping up to be a very nice language. It has some very
| cool ideas. I hope we'll see more of it at a larger scale.
| veltas wrote:
| >You won't even get a warning unless -Wextra or -Weverything is
| turned on.
|
| I know people are used to languages where compiler warnings are
| on by default, but if you are building C without -W/-Wextra you
| are kind of asking for trouble. -Weverything is probably too
| pedantic for most people but -Wextra is pretty much required. And
| -Werror is required for projects where discipline can't be
| assumed or where the build log is bigger than a screenful.
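|
| A minimal sketch of the kind of code that compiles quietly
| without those flags (the function names are made up for
| illustration). In the mixed comparison, the usual arithmetic
| conversions turn the int into an unsigned int, so -1 becomes a
| huge value. As far as I know, GCC reports this via
| -Wsign-compare, which for C is only enabled by -Wextra.

```c
#include <stdbool.h>

/* Hypothetical helper: compares a signed and an unsigned value the
 * way plain C does. The int operand is converted to unsigned int
 * before the comparison, so a negative value wraps to a large one. */
bool less_than_mixed(int s, unsigned int u)
{
    return s < u; /* implicit int -> unsigned int conversion */
}

/* The same comparison with the sign handled explicitly. */
bool less_than_checked(int s, unsigned int u)
{
    return s < 0 || (unsigned int)s < u;
}
```

| With these definitions, less_than_mixed(-1, 1u) is false while
| less_than_checked(-1, 1u) is true.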
| lionkor wrote:
| > If runtime safety is turned off, you get undefined behavior
|
| You already know someone will teach their students to always have
| it off because it's "slower" or something.
|
| Make something idiot-proof and the world invents a better idiot.
|
| Another language that does safety like this incredibly well is
| Ada (Ada/SPARK), and I'm unsure why people aren't more hyped
| about it. So many people hype Rust or Zig or whatever new
| language, and yet there's an incredibly safe and expressive
| language that's used in high-reliability applications like
| missile guidance systems, that nobody seems to talk about.
| AnIdiotOnTheNet wrote:
| > Make something idiot-proof and the world invents a better
| idiot.
|
| Sure, but if you only ever give people safety scissors then
| you're severely limiting the kinds of things that they can
| build. You have to let people who know what they are doing be
| able to do what they need to do, so you have to be able to turn
| the runtime safety off. Zig can do this at the scope level.
| TchoBeer wrote:
| I don't understand why there aren't any solutions like this.
| Why aren't there any languages that have a good garbage collector
| but also let you turn it off and work with memory manually?
| Maybe there is and I don't know of it? Maybe garbage
| collected languages and manual memory languages require
| different design? I don't know.
| june11 wrote:
| dlang has exactly this, with both marking functions as
| @nogc and being able to both use the gc in one block, and
| malloc/free in the next. You can also limit D to a subset
| called 'BetterC'. I really enjoy being able to gc my way
| through a problem until I need some explicit memory
| structure for putting together an ECS or similar memory
| management heavy patterns.
| arc619 wrote:
| Nim is built like this. The gc is opt-in by type and manual
| memory management is simple. All other types are stack
| allocated by default.
|
| The new gc is pretty much compiler assisted scope managed
| smart pointers as well. Also when using the gc you can
| build your own types with custom create/free/copy/move
| operators to do what you want without worrying about a stop
| the world gc.
|
| Using the arc gc pretty much compiles down to what you'd do
| manually. For cycles you can use the orc gc, again it's all
| opt in.
|
| I find this a great balance of productivity and
| performance, where it's easy to have high control when you
| want it, and still get good performance when you're not
| bothered.
| elcritch wrote:
| +1 for Nim and being usable without a GC or for manual
| memory management like dealing with C apis. Though to be
| fair, quite a few types in the stdlib are heap based. But
| you can make your own static heap pool.
|
| With the new ARC & move semantics, Nim has hit a sweet
| spot, IMHO. It's like the language struggled to find a
| fit for a long time. But ARC allows very low overhead
| memory safety, perfect for mcu's and wasm in particular.
|
| Though perhaps Swift could be a contender in this arena
| as well with its ARC, but its development is so Apple
| centric.
|
| What's interesting to me is that with move semantics and
| smarter compiler analysis, an ARC based GC overhead
| approaches that of Rust's compile time lifetime memory
| management. For any non-trivial program Rust seems to use
| a lot of Rc's or copies to get around lifetime analysis
| issues. So if the compiler can automatically figure out
| lifetimes in code it can eliminate many ARC operations.
|
| I hope more languages adopt more flexible ARC based gc's
| or improve on Rust's ergonomics.
| kaba0 wrote:
| Well, GC languages can't really allow arbitrary pointer
| arithmetic (because it would make GC useless/unsafe), but
| languages that can do explicit stack allocation are
| numerous, like C#, Go, I'm sure D as well. Nim also has an
| optional GC, so there are languages all around the
| spectrum.
| dnautics wrote:
| for erlang, if you drop down to C you completely lose the
| garbage collector, but the docs for doing this show you how
| to set up a hook for your memory to be garbage collected
| just like any other first-class data in the system.
|
| If you're really responsible, it's possible to use the
| system allocator to inform the system about how much memory
| your allocations consume so that the memory pressure
| triggers have a correct accounting of how much memory is
| being used.
| Narishma wrote:
| I think this will just lead to a split ecosystem with some
| libraries requiring a GC and others requiring manual memory
| management.
| ArtifTh wrote:
| C# has GC by default, but you can do manual memory
| management in "unsafe" blocks
| pjmlp wrote:
| Or Modula-2, NEWP, BASIC, or really most languages that aren't
| copy paste compatible with C.
|
| Ada suffered from its domain, original price of compilers, and
| the few UNIX vendors that cared to offer compiler like Sun, it
| was an additional acquisition on top of UNIX SDK.
|
| Everyone knows that only newbies do coding errors in C, so why
| spend the extra money? /s
| qayxc wrote:
| > Everyone knows that only newbies do coding errors in C, so
| why spend the extra money? /s
|
| The Ariane 5 maiden flight disaster illustrated quite clearly
| that the choice of language has little influence on the
| actual correctness of a program.
|
| Ada on its own is no better than Pascal in that regard. A
| formally verified and thoroughly tested MISRA C program can
| be safer and more correct than a sloppily written Ada
| program.
|
| So the question is indeed not as rhetorical as one might
| think - why spend the extra money indeed? Isn't it better
| spent on verification, testing, tooling and culture, which
| benefits the development regardless of programming language?
| kjs3 wrote:
| No. The Ariane 5 disaster was caused by a development team
| deliberately turning off safety checks. Ada is in all
| regards a 'safer' language than Pascal, as long as you
| don't disable the safety features.
|
| http://www-users.math.umn.edu/~arnold/disasters/ariane5rep.h...
|
| http://www.adapower.com/index.php?Command=Class&ClassID=FAQ&...
| pjmlp wrote:
| The Ariane 5 error was caused by the remaining 30% of
| programming errors, the ones left over when we leave out the 70%
| of software failures caused by typical C errors.
|
| So yes, it is quite worthy to reduce the amount of money
| spent in verification, testing, tooling and culture.
|
| The alternative is to just give up, accept that programmers will
| never learn, and just force verification at hardware level,
| like Google is doing on Android.
|
| "Memory Tagging for the Kernel: Tag-Based KASAN"
|
| https://www.youtube.com/watch?v=f-Rm7JFsJGI
|
| Oracle on Solaris SPARC,
|
| https://docs.oracle.com/cd/E37838_01/html/E61059/gqajs.html
|
| Apple on iOS,
|
| https://developer.apple.com/documentation/security/preparing...
|
| Microsoft on Azure Sphere,
|
| https://www.microsoft.com/security/blog/2020/11/17/meet-the-...
| vlovich123 wrote:
| That's not an alternative to memory safety. That's just a
| basic security measure even if you write everything in a
| safe language like Rust or Zig (let alone the fact that
| you have enormous C/C++ legacy codebases that are likely
| still seeing ongoing development due to switching costs).
| The reason is you will always have some amount of unsafe
| code when dealing with the hardware and this is a
| hardening measure to protect against slipups.
| messe wrote:
| Ironically enough, I recall reading that if the overflow
| hadn't triggered a hardware exception and instead was
| silently ignored, the first stage would have survived;
| the code in question had just been carried over from
| Ariane 4 and was no longer important for the operation of
| the booster..
| kjs3 wrote:
| _few UNIX vendors that cared to offer compiler like Sun, it
| was an additional acquisition on top of UNIX SDK._
|
| The Sun C compiler was an additional cost item, just like
| everyone else's.
| tored wrote:
| Is anyone actively using Modula-2? Actively as in starting
| new projects, compiler improvements, tooling, community etc.
| pjmlp wrote:
| The GNU M2 compiler is kept up to date with GCC, it is just
| kept out of tree.
|
| https://www.nongnu.org/gm2/homepage.html
|
| However I doubt the language gets much use nowadays beyond
| some legacy code bases, its opportunity is now gone.
| butterisgood wrote:
| I think I'd like to get into Ada more, but I'm always a bit
| unsure on how to select the right toolchain (is the
| Dragonegg/LLVM option viable today?) etc.
|
| I would love to see something like "Rustlings" for Ada, as I
| found that was a good way to practice not only writing code,
| but reading it as well.
|
| I was able to self-teach Haskell and Erlang without any major
| problems, and even managed to ship applications written in them
| commercially.
| pjmlp wrote:
| Plenty of learning paths at https://learn.adacore.com/
| butterisgood wrote:
| Thanks, I'll check it out!
|
| (Looks like I found a use for the new Rosetta on my M1
| Mac.... I wonder when there will be builds for aarch64 for
| Darwin)
| johnisgood wrote:
| Yeah, and Ada/SPARK performs those checks people are hyped
| about (and more) at compile-time! Honestly, Ada does everything
| those languages do (regarding safety), and does more and does
| it better (again, talking about safety and correctness here).
| If you want evidence, check out my posts:
|
| https://news.ycombinator.com/item?id=19122884 (!)
|
| https://news.ycombinator.com/item?id=19245898 (!!)
|
| https://news.ycombinator.com/item?id=19274244 (!!)
|
| https://news.ycombinator.com/item?id=19274412
|
| https://news.ycombinator.com/item?id=19770405
|
| https://news.ycombinator.com/item?id=20776296
|
| https://news.ycombinator.com/item?id=20934511 (!!) [3]
|
| https://news.ycombinator.com/item?id=20939336 (!!)
|
| https://news.ycombinator.com/item?id=21286061
|
| https://news.ycombinator.com/item?id=21286292
|
| https://news.ycombinator.com/item?id=21435869 (!)
|
| https://news.ycombinator.com/item?id=23609499 (!!)
| steveklabnik wrote:
| Most of Rust's checks are at compiletime, and Ada employs
| runtime checks as well.
| johnisgood wrote:
| It can employ, but you do not have to (you can turn off all
| runtime checks and use formal verification only, something
| that Rust cannot do (statically verify correctness the same
| way you can with SPARK using GNATprove)), it can all be
| done at compile-time with Ada/SPARK.
|
| > These runtime checks[1] are costly, both in terms of
| program size and execution time. It may be appropriate to
| remove them if we can statically ensure they aren't needed
| at runtime, in other words if we can prove that the
| condition tested for can never occur.
|
| > This is where the analysis done by GNATprove comes in. It
| can be used to demonstrate statically that none of these
| errors can ever occur at runtime. Specifically, GNATprove
| logically interprets the meaning of every instruction in
| the program. Using this interpretation, GNATprove generates
| a logical formula called a verification condition for each
| check that would otherwise be required by the Ada (and
| hence SPARK) language.
|
| Additionally, in Ada/SPARK, you can formally verify tasks
| (concurrency), too:
| https://docs.adacore.com/spark2014-docs/html/ug/en/source/co....
|
| Moreover:
|
| > SPARK builds on the strengths of Ada to provide even more
| guarantees statically rather than dynamically. As
| summarized in the following table, Ada provides strict
| syntax and strong typing at compile time plus dynamic
| checking of run-time errors and program contracts. SPARK
| allows such checking to be performed statically. In
| addition, it enforces the use of a safer language subset
| and detects data flow errors statically.
|
| Contract programming:
|
| - Ada: dynamic
|
| - SPARK: dynamic / static
|
| Run-time errors:
|
| - Ada: dynamic
|
| - SPARK: dynamic / static
|
| Data flow errors:
|
| - Ada: -
|
| - SPARK: static
|
| Strong typing:
|
| - Ada: static
|
| - SPARK: static
|
| Safer language subset:
|
| - Ada: -
|
| - SPARK: static
|
| Strict clear syntax:
|
| - Ada: static
|
| - SPARK: static
|
| Additionally, safe pointers in SPARK:
| https://blog.adacore.com/using-pointers-in-spark and
| https://arxiv.org/abs/1710.07047.
|
| More information about Get_Line (i.e. even where you would
| think you cannot go static):
| https://blog.adacore.com/formal-verification-of-legacy-code.
|
| [1] overflow check, index check, range check, divide by
| zero
| nine_k wrote:
| Ada / spark may be a great thing. But how do people know?
|
| Is there a good compiler that one can install for free?
|
| Are there a few good and comprehensive books / references /
| guides, available online for free?
|
| If you want something to become popular, _give it away_.
| Ideally, push it. At the very least, have a limited free
| version. Or turn a blind eye to small-time piracy, as many
| vendors did for a long time. Or make it dead easy to buy it for
| $10.
|
| (None of the above works for you? Sorry, you have just turned
| down 99% of kids who might like to tinker with your tech, love
| it, and then promote it wherever they go. Your tech may be hot
| and desired in the narrow circle of pros, but it's not going to
| become popular and win the world.)
| roblabla wrote:
| Not sure why you think Ada/SPARK isn't free. There's GNAT[0],
| an open source Ada compiler that's part of the GNU toolchain,
| a fairly good book teaching Ada and SPARK[1] by AdaCore, and
| plenty of tutorials.
|
| Accessibility is really not the problem. The problem is, Ada
| is an old language that has many similar problems to C: lack
| of a good package management story, and a design by committee
| making language evolution glacially slow (though that also
| doubles as a feature). It also lacks a vibrant ecosystem like
| we can find in other popular languages.
|
| Furthermore, SPARK in particular takes the concept of safety
| much further than Rust or Zig do right now, as it allows
| proving the correctness of a program according to a formal
| specification. Rust/Zig only care about proving the absence
| of UB. This makes SPARK much more complex, and thus have a
| much higher barrier of entry.
|
| [0]: https://www.gnu.org/software/gnat/
|
| [1] https://learn.adacore.com/index.html
| xfer wrote:
| Judging by the state of affairs there, one might as well go
| with ATS. http://www.ats-lang.org/
| ModernMech wrote:
| Your first link pretty much says it all: barebones website
| 5 years out of date, no docs, no resources, no community
| links, no indication that this project is even alive. To
| top it off the project is named after one of the most
| annoying creatures on the face of the Earth. This is a
| marketing issue.
|
| The problem for older languages is that the bar for what's
| expected of a language has risen substantially. A long time
| ago, languages weren't even expected to have
| implementations. Today, not only are implementations
| expected for all major platforms, languages are also
| expected to run on phones and the browser, include package
| managers and library repositories, have a language server
| implementation and editing modes for all major IDEs, be
| open source with active development, provide extensive
| documentation on the scale of a book, and also promote a
| vibrant and active community. Oh, and to top it off, users
| don't want to pay for any of that. It's just expected,
| which is why most languages these days only come out of
| large tech companies that can afford to fund all of the
| above with no expectation of profit.
| MaxBarraclough wrote:
| > Your first link pretty much says it all: barebones
| website 5 years out of date, no docs, no resources, no
| community links, no indication that this project is even
| alive.
|
| A better starting point might be
| http://www.GetAdaNow.com/
| nobodywasishere wrote:
| There's an Ada compiler as part of gcc.
|
| The most prominent project that I know of written in Ada is
| ghdl.
| greatgoat420 wrote:
| There was at least always the recommendation to only use
| the version of the Ada compiler released by the FSF as the
| regular GCC one is/was more encumbered.
| 2526702934 wrote:
| > there's an incredibly safe and expressive language that's
| used in high-reliability applications like missile guidance
| systems, that nobody seems to talk about.
|
| Maybe because they assume that it's only used for that purpose?
| You talk about _rocket science_ and you expect people to jump
| on the bandwagon...
| gilnaa wrote:
| Not to sound too aggressive, but I'm not sure why Ada programmers
| keep being surprised at the hype.
|
| Safety (of all kinds) is not the only selling point of Rust and
| Zig, and the fact that Ada also has safety measures doesn't
| mean that it's instantly an option for me.
|
| For example, I keep seeing people hype about Zig/Rust, for all
| kinds of reasons; and the way that they present their arguments
| is very compelling. While I've heard of Ada before, AFAICT the
| only context in which it is brought up is to downplay hype over
| Rust/Zig; never as a standalone.
|
| Sounds like a PR problem, mostly.
|
| _EDIT_ : Not to say that there aren't other problems, just
| saying that this point alone isn't some magic bullet.
|
| I don't like Rust because it gives me memory safety, tons of
| languages do that. I love Rust because the tradeoffs it gives
| me are worth the switch, and the overall tooling and ecosystem
| are very well made.
| dnautics wrote:
| > Sounds like a PR problem, mostly.
|
| It's the case that the Zig creator did put a high priority on
| soft "PR" for the language, the first hire was a developer
| advocate/community manager. I think this is a great case of
| "lessons learned" from the history of programming languages.
| [deleted]
| johnisgood wrote:
| > I don't like Rust because it gives me memory safety, tons
| of languages do that. I love Rust because the tradeoffs it
| gives me are worth the switch, and the overall tooling and
| ecosystem are very well made.
|
| Same applies to the tooling of Ada/SPARK, but as you have
| said, it is pretty much a PR issue.
| c-cube wrote:
| I'm not saying it's justified, but I think the syntax is
| also a big impediment (just like for my daily driver,
| OCaml). Ada looks pretty verbose and annoying to write code
| in, at least if one wants to use it for side projects,
| small utilities, etc. rather than spaceship firmware; and
| that's probably how a lot of languages become popular, you
| need to tinker with them on small things.
|
| OCaml has seen the alternative "Reason" syntax (JS-like)
| grow in popularity. Maybe a C-like (or better, rust-like)
| syntax for Ada would help the language's adoption. People
| already using Ada would complain that the current syntax is
| fine but they're not the target demographic :-)
| ksec wrote:
| > Ada looks pretty verbose and annoying to write code in,
|
| I still don't understand this, you spend 80% of the time
| thinking about code, 19% reading, and 1% typing.
| jstimpfle wrote:
| 80% thinking when you're stuck and don't know who to ask.
| I've made the same mistake. Have you watched some
| streamers? They are prototyping by typing, I would say,
| 50% of the time.
| c-cube wrote:
| I don't have this experience. I like to prototype by
| writing code (or type signatures), and then refactor a
| lot from the initial solution. That's easier to do if the
| syntax is reasonably concise (the other half is to have
| good types which Ada definitely provides).
| mousepilot wrote:
| I never used Ada, but read a book on it.
|
| I don't even remember much, just how much I loved the
| look of the example programs, but that was
| during community college when I was new to programming.
| It wasn't part of a class, but the guy who implemented
| the curriculum just jammed the library with jewels and
| the entire library collection was just full of classics
| especially since the actual curriculum was just a basic
| associates in "business data processing."
| elcritch wrote:
| From what I can tell, safety isn't a selling factor of Zig.
| From a "safety" perspective Zig seems like a step backwards
| compared to the latest generation of languages, and Rust in
| particular. Zig's ergonomics seem decent but its memory
| safety tack appears to basically be to include valgrind-like
| tools into debug builds with good PR.
| rslabbert wrote:
| Not a Zig expert, but safety is a factor for Zig, it just
| treats it as less of an absolute than Rust. I think the
| thing to keep in mind is that something can be a priority
| without being an absolute priority. I'd make a comparison
| to OpenBSD vs Linux. Both have security as a priority,
| OpenBSD just has a more absolute focus on it.
|
| For example, a couple of features come together really
| nicely to make memory safety easier to test in Zig:
|
| * You need a reference to an Allocator to be able to allocate
|   memory, so as a general rule, the caller can control which
|   allocator is used.
| * Unit testing is integrated well into the language.
| * Therefore, you can create an allocator for each unit test,
|   and fail the test at the end if any memory was leaked.
| * This process can also happen at the application level with
|   the General Purpose Allocator, which can let you print an
|   error when the program exits if anything was leaked.
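|
| The allocator-per-test idea can be sketched in C for readers
| who don't know Zig yet. This is only an analogue under my own
| assumptions, not Zig's Allocator API: CountingAllocator,
| ca_alloc, ca_free, make_buffer and leaks_after_test are all
| hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counting allocator: a C analogue, not Zig's API. */
typedef struct {
    size_t live; /* allocations handed out and not yet freed */
} CountingAllocator;

void *ca_alloc(CountingAllocator *a, size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        a->live++;
    return p;
}

void ca_free(CountingAllocator *a, void *p)
{
    if (p != NULL) {
        a->live--;
        free(p);
    }
}

/* Code under test receives the allocator from its caller. */
char *make_buffer(CountingAllocator *a, size_t n)
{
    return ca_alloc(a, n);
}

/* A "unit test" with its own allocator: returns how many
 * allocations were still live when the test body finished. */
size_t leaks_after_test(int forget_to_free)
{
    CountingAllocator a = { 0 };
    char *buf = make_buffer(&a, 64);
    if (!forget_to_free)
        ca_free(&a, buf);
    size_t leaked = a.live;
    if (leaked != 0)
        free(buf); /* release it for real so the demo does not leak */
    return leaked;
}
```

| A test harness would fail the test whenever the returned count
| is non-zero.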
|
| The above doesn't solve every memory safety problem (and
| there are other features like native bounds-checked slices
| that solve other kinds of issues), but it provides an extra
| layer that can probably get us quite far into the "quite
| safe" camp.
| dnautics wrote:
| Putting valgrind into the stdlib is really clever, and
| also, I like memory safety being the carrot to get you to
| write tests. I worry having a 'safe' system like rust
| sometimes causes very smart (TM) developers, especially
| less experienced ones, to be complacent and write fewer
| tests.
| nicoburns wrote:
| Writing fewer tests is somewhat justified when you can
| encode invariants in the type system. It depends on the
| level of reliability you require of course. But my Rust
| code without tests has been comparably reliable to my
| code in other languages with tests.
| StreamBright wrote:
| > Make something idiot-proof and the world invents a better
| idiot.
|
| :D
|
| My argument is that if I need that insane speed and safety
| features have to be off, I should have to actively work for
| that, not the other way around, where I have to actively work
| for safety.
|
| Why? Because we forget to do things.
|
| > Another language that does safety like this incredibly well
| is Ada (Ada/SPARK)
|
| I wanted to learn Ada for a long time. Do you have a guide or
| tutorial how to get into it?
| johnisgood wrote:
| I once collected all of my posts about Ada which do contain
| lots of links that can get you started. I will not re-post
| them here, instead you may go to
| https://news.ycombinator.com/item?id=23808305. Check out the
| links under [2].
|
| Especially these ones:
| https://news.ycombinator.com/item?id=21435869 and
| https://news.ycombinator.com/item?id=21437498
| StreamBright wrote:
| Thanks!
| johnisgood wrote:
| No problem! You might find this website useful, too, as
| it is filled with resources:
| https://www.adaic.org/learn/materials/
| petre wrote:
| > incredibly safe [...] high-reliability applications like
| missile guidance systems
|
| https://en.m.wikipedia.org/wiki/Cluster_(spacecraft)
| sigzero wrote:
| Yup, people are the issue.
| dapids wrote:
| Yes, sure, but have you actually used Ada in practice? It's not
| exactly ergonomic or straightforward either, it's not magic.
|
| You don't hear about the struggles of using it in practice
| because the projects you've pointed out are usually
| confidential.
|
| In my experience MISRA is a much more popular standard for
| secure programming in the defence industry versus Ada. MISRA is
| a C standard, and yes, it runs missile systems, Boeing braking
| systems, you name it.
| friend-monoid wrote:
| As a C programmer, I have definitely shot myself in the foot in
| the conversions between types on multiple occasions. This is a
| really welcome feature.
| [deleted]
| scoutt wrote:
| I am curious to know the reasons behind the @as() syntax. What's
| wrong with "i32 y = (i32) x;"?
|
| I look with interest to new languages but after so many years
| dealing with C, my parser crashes when I see the type after the
| variable name and at the end of function declarations.
|
|     var x : u8 = 5;
|
| What's wrong with "u8 x = 5;"?
|
| And I don't really like type inference very much (or when it's
| abused or cannot be avoided). What is "b"? Oh, I have to check
| what it's being cast from...
|
|     var a: u8 = 255;
|     var b = 1 + 2 + 3 - (4 + @as(i32, a));
|
| What type is "b"? i32?
|
| Curious fact: the above code generates >40000 lines of assembly
| in godbolt: https://www.godbolt.org/z/fxfbb9jfn
| tomp wrote:
| juxtaposition (i.e. syntax that has meaning when two
| identifiers stand together separated only by whitespace) is
| generally a problem for parsing. Basically, it introduces all
| kinds of parsing ambiguities. Compare the following:
|
|     a *b
|     a * b
|     a[n] x
|     b [n]
|     a<x> y
|     a < x > y
|     e(1) f
|     e (1)f
|
| Even for C/C++ this is complicated (you need the symbol table
| when parsing, to know if an identifier is a type or not), but
| for more advanced languages, it's even worse (e.g. if you
| support pattern matching / destructuring assignment).
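|
| A small sketch of the symbol-table point in C itself (function
| names are hypothetical). The tokens "a * b" mean a
| multiplication in the first function and a pointer declaration
| in the second, and the parser can only tell them apart by
| knowing whether "a" currently names a type.

```c
/* Here a and b are variables, so "a * b" is a multiplication. */
int as_expression(void)
{
    int a = 6, b = 7;
    return a * b;
}

/* Here a is a typedef name, so "a * b;" declares b as a
 * pointer to int. */
int as_declaration(void)
{
    typedef int a;
    int target = 42;
    a * b;
    b = &target;
    return *b;
}
```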
|
| Declaring variables with _var_ lets you also write _let_ or
| _const_ instead, giving the programmer more options (and making
| those options obvious as well).
|
| You might not _"like"_ type inference, but it really is
| superior to no-type-inference - you can always _also_ write the
| expected type, if you want.
| layer8 wrote:
| > you can always also write the expected type, if you want.
|
| But you have to read the code of people who didn't.
| eps wrote:
|     var x : u8 = 5;
|
| allows making the type optional while keeping the code about the
| same, e.g.
|
|     var x = 5;
|
| This arrangement has this nice consistency to it, with the type
| info being more of a hint (to the compiler or to the
| programmer) than a required component.
| dnautics wrote:
| > What's wrong with "u8 x = 5;"?
|
| I believe it makes the parser marginally more complicated.
| Let's say you had a non-builtin type, "foo". The var trigger
| immediately declares what's going on. For "foo x = 5;" the
| parser must either 1) have contextual information that foo is a
| declared type or 2) wait to notice that there are two
| identifiers side-by-side and then resolve that this means "it
| must be a variable declaration".
|
| Honestly, C and C++ are very much in the minority for choosing
| this syntax. Coming from Pascal, I remember finding this syntax
| to be annoying 20 years ago, so it's my bias to believe that
| this is just internalized pain for C devs.
|
| As for the type coercion with the () operator... well, C is
| infamous for "spiral types". Typecasting complicated things in
| C, like, say, an array of pointers, can get scary as hell.
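|
| To make the "spiral" concrete, here is a sketch of casting a
| void pointer back to "pointer to array of 2 pointers to
| function taking int and returning int", first written out in
| full and then through a typedef. All names (unary_fn, ops,
| call_spiral, call_typedef, ops_table) are invented for the
| example.

```c
typedef int (*unary_fn)(int);

static int twice(int x)  { return 2 * x; }
static int negate(int x) { return -x; }

/* An array of 2 pointers to functions taking int, returning int. */
static int (*ops[2])(int) = { twice, negate };

/* Hands the table out as an untyped pointer. */
void *ops_table(void)
{
    return &ops;
}

/* The cast spelled out in full. */
int call_spiral(void *p, int idx, int arg)
{
    return (*(int (*(*)[2])(int))p)[idx](arg);
}

/* The same cast with the function pointer type named first. */
int call_typedef(void *p, int idx, int arg)
{
    return (*(unary_fn (*)[2])p)[idx](arg);
}
```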
| Someone wrote:
| A bit more than marginally more complicated, if you consider
| the cases of parsing partial or incorrect programs.
|
| You want to be able to (somewhat) parse those so that you can
| syntax-colour, auto-complete, and mark errors in a code
| editor.
|
| I think those are the major reasons modern languages don't do
| such things the C way.
| messe wrote:
| > Curious fact: the above code generates >40000 lines of
| assembly in godbolt: https://www.godbolt.org/z/fxfbb9jfn
|
| I took a quick glance at the ASM, and it seems that the
| majority of that is code from the stdlib run prior to main.
| Compiling with -OReleaseSafe brings that down to 15,000 lines
| of assembly.
|
| If you get rid of "main" and compile it as an exported
| function, you get far less code:
| https://www.godbolt.org/z/jjGP4xsfx
|
| If you add -OReleaseSafe to the "export fn" version I shared,
| it's around 4 lines of code calling panic. Adding
| -OReleaseFast, and it's just a ret statement.
| meepmorp wrote:
| > Compiling with -OReleaseSafe brings that down to 15,000
| lines of assembly.
|
| I tried it with -OReleaseFast to compare, and it's 500-ish
| lines. That's amazing.
| dnautics wrote:
| remove "pub" from your godbolt. Zig/godbolt is identifying that
| you're trying to build a full program and so it brings in a lot
| of the boilerplate necessary to launch a program (for example,
| the panic handler, stdlib stuff to format strings for the panic
| handler, etc).
| logicchains wrote:
| It says Zig eliminates "implicit conversions unless they are
| guaranteed to be safe (for example, assigning a u8 value to a u16
| variable cannot fail or lose data)", but then suggests that @as
| should be used when "casting an int to a larger-size int of the
| same sign". Why should @as be used if it's safe to implicitly
| convert?
| AndresNavarro wrote:
| What I took from the article is that it's useful when combined
| with type inference. So in Zig you could do:
|
|     var x : u8 = 5;
|     var y = @as(u32, x);
|
| OR
|
|     var x : u8 = 5;
|     var y : u32 = x;
|
| The cast is needed in the first case to force the inference y:
| u32, otherwise it would be y: u8
| flohofwoe wrote:
| Hmm, this indeed doesn't make much sense. Assigning to a wider
| type with or without different signedness doesn't require a
| cast as long as all bits fit into the new type:
|
| https://www.godbolt.org/z/x8a65sPcr
| flohofwoe wrote:
| IME the strict conversion rules in Rust and Zig can be quite a
| bummer for somebody coming from C because they may add a
| surprising amount of friction in day-to-day coding. Yes, C code
| is often way too sloppy when it comes to picking the right type
| (signed vs unsigned vs float), and it conveniently hides the
| problems if the wrong choice was made.
|
| But sometimes the same value needs to be used in integer and
| floating-point math, and there is no single correct type for this
| value.
|
| There are also some tricky choices in bit twiddling operations.
| Is it really such a problem when bits are shifted out of a value
| when this is normal behaviour down on the assembly level?
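|
| For reference, a sketch of what C does here without any
| ceremony (function names are hypothetical). The shift itself
| happens in int because of integer promotion, so the top bit
| survives there, and it is the assignment back to the narrow
| type that silently drops it.

```c
#include <stdint.h>

/* x is promoted to int before shifting, so no bit is lost yet. */
int shift_promoted(uint8_t x)
{
    return x << 1;
}

/* Storing the int result in a uint8_t implicitly truncates it
 * to the low 8 bits. */
uint8_t shift_truncated(uint8_t x)
{
    uint8_t r = x << 1; /* implicit int -> uint8_t conversion */
    return r;
}
```

| For x = 0x81 the first function yields 0x102 and the second
| yields 0x02.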
|
| I guess I'll eventually learn to live with strict explicit
| conversion, and eventually I'll get better at picking the right
| type from the start, but I think implicit versus explicit
| conversion is an area where there is no "completely right"
| solution, because with 100% explicit conversions even coding down
| on the assembly level is more convenient ;)
| themulticaster wrote:
| Could you give some examples where the same value needs to be
| used in integer _and_ floating-point math?
|
| I'm not quite sure if I understood you correctly, but in case
| you propose that integer <-> float conversion should happen
| implicitly I have to disagree. While implicitly converting from
| integer to float/double would probably be fine, implicitly
| converting from float/double to integer sounds like a recipe
| for headaches: There are just too many options
| (truncation/ceil/floor/rounding). Even if you decided that some
| option should be the standard since it makes sense in 90% of
| all cases (let's say rounding), you now have a difficult-to-
| find (since it's implicit) footgun ready to cause damage in the
| remaining 10% of all cases.
|
| Even in that paragraph lies a small surprise waiting to be
| found (at least for some people): The floating point standard
| IEEE 754 defines five different rounding modes - two "normal"
| modes (Round to nearest, ties to even as well as Round to
| nearest, ties away from zero) and the additional directed ones
| mentioned above (truncation, ceil, floor). Interestingly, the
| default rounding mode (Round to nearest, ties to even) is not
| the one you probably learnt in school (that would be Round to
| nearest, ties away from zero). In school, you always round up
| if you end up exactly between two numbers, i.e. round(0.5) = 1,
| round(1.5) = 2. However, this introduces a small bias that can
| manifest itself as a real problem, for example if you round
| many measurements and then calculate the mean. That's why the
| default floating-point rounding mode will essentially alternate
| between rounding up and down, i.e. round(0.5) = 0 and
| round(1.5) = 2.
|
| Most of the time this is not an issue and you really want the
| default rounding mode, but I hope this example illustrates why
| hiding the "implementation detail" of converting floating-point
| numbers to integers might not be a good idea.
|
| By the way, I just looked up the man page for round(), and to
| my surprise found that it will always round ties away from
| zero, independently of the floating-point environment. If you
| want to round using different rounding modes in C, you
| apparently have to use nearbyint() and friends after setting up
| the rounding mode using fesetround().
|
| PS: Of course the rounding modes are all about rounding
| floating-point values, not necessarily converting them to
| integers, but I think the point should be clear.
| huachimingo wrote:
| >Could you give some examples where the same value needs to
| be used in integer and floating-point math?
|
| When you want to draw a pong game in a finite matrix
| representation, like a screen in ncurses, you cannot have
| something like screen[2.1][3.5], so you have to truncate (or
| round, depending what you want to do) the same float
| coordinates to write that matrix.
|
| Of course, you could avoid these using fixed point but even
| there you need a type conversion.
| flohofwoe wrote:
| > Could you give some examples where the same value needs to
| be used in integer and floating-point math?
|
| Mainly when working with pixels. In some contexts, pixels are
| clearly integer values (for instance the width and height of
| a texture in a 3D API is almost always given as integers, a
| texture with a width of 12.5 pixels simply doesn't make
| sense).
|
| Computations on 2D pixel coordinates on the other hand need
| to have subpixel precision, otherwise you'll get jittering
| artefacts. This results in code where integer values must be
| converted to floating point before going into computations,
| and sometimes the results need to be converted back to
| integer.
|
| I started to add duplicate functions to my C APIs to reduce
| the need for explicit conversions when using those APIs from
| stricter languages like Zig or Rust, for instance:
|     void sg_apply_viewport(int x, int y, int width, int height,
|                            bool origin_top_left);
|     void sg_apply_viewportf(float x, float y, float width, float height,
|                             bool origin_top_left);
|
| PS: interestingly, even 3D-APIs don't agree here. For
| instance in OpenGL, the glViewport() function takes integer
| values, while in D3D11 and Metal a viewport is defined with
| floating point values.
| pornel wrote:
| I agree that float<>int conversions are evil. I would make an
| exception for int to float conversion for literals. When
| switching from C to Rust I was annoyed by:
| if x > 0
|
| not compiling, because that's an _integer_ zero, not a
| _float_ zero! This makes the compiler feel very petty.
| darthrupert wrote:
| To me that seems equivalent to complaining that the
| following fails to compile:
|
|     if 0 == "0"
|
| ... which is obviously broken code.
| joppy wrote:
| The number 0 happily and unambiguously plays a role in
| many numeric types and algebraic structures, and it's
| kind of nice if the compiler can just figure out the type
| that it should be from context. Comparing a literal 0 to
| a double should cast to a double, comparing a literal 0
| to an int should cast to an int, etc. I would be happy
| for "(int) 0 == (double) 0" to raise a type error though.
|
| I guess it's a matter of interpretation: does "x == 0"
| mean that we are comparing x with the integer zero, or
| the "zero value" of the same type as x? Numeric code is
| difficult for many reasons, but something that can make
| it much more tractable is having the code as close as
| possible to the mathematics underlying the algorithms,
| and this kind of "polymorphic constant" behaviour can
| help a lot, especially for integer literals which
| unambiguously embed into essentially any numeric type you
| could think of.
| p0nce wrote:
| D doesn't try to "fix" C integer promotion/casts. It's because
| they aren't bad defaults. As a bonus you can port code from C
| with less risk. Soon you'll be able to compile it directly.
|
| I helped find the only discrepancy between D and C code wrt
| integer promotion (-byte would yield byte instead of int), and
| it was fixed so that D matches C _conventions_ exactly.
| SuchAnonMuchWow wrote:
| C integer promotions are what remains of old computer
| systems which couldn't handle anything but a 'word'.
|
| This results in highly inconsistent behavior on modern 64-bit
| machines:
|
| For example, char/short are automatically promoted to int or
| unsigned int when an operation is performed on them, but this
| doesn't happen between int and long.
|
| Simple things like adding two int16_t together to get an
| int16_t in return is really hard: you need a mask and an extra
| explicit cast, and you rely on the compiler being smart
| enough to understand what you are doing and to emit just the
| instruction you want.
|
| But you don't need to do the same with int32_t, because
| reasons.
|
| This is a bad default behavior, as it doesn't match either
| the intuition or underlying hardware, it is inconsistent
| across all integer types, and to understand why it works that
| way you need to understand how computers worked 30 years ago.
|
| D supporting C integer promotion might be a good thing for
| portability, but that's it.
| p0nce wrote:
| > This is a bad default behavior
|
| I don't think so. A lot of existing C code relies on this,
| and removing the promotion would set that code up to be
| copy/pasted without being fixed. Who will rewrite all the C
| and C++ codec code out there?
|
| A lot of C and C++ programmers are actually unaware of the int
| promotion rule, but their code relies on it unintentionally.
| p0nce wrote:
| > adding two int16_t together to get an int16_t in return is
| really hard
|
| Because the first thing people will do is:
|
|     int16_t a, b;
|     a = (int16_t) ((b * 157) >> 12);
|
| and then if b * 157 doesn't promote to 32-bit it overflows
| in the negative and then the code is incorrect.
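A small C sketch of the point being made here, assuming `int` is at least 32 bits wide (the `#error` guard enforces that assumption). The function name `scale_157_over_4096` is hypothetical and chosen only for this illustration.

```c
#include <limits.h>
#include <stdint.h>

#if INT_MAX < 2147483647
#error "this sketch assumes int is at least 32 bits wide"
#endif

/* Fixed-point scaling of b by 157/4096 (hypothetical helper).
 * b is promoted to int before the multiply, so the intermediate
 * product b * 157 is computed in at least 32 bits and cannot
 * overflow for any int16_t input (32767 * 157 = 5144419). */
int16_t scale_157_over_4096(int16_t b)
{
    return (int16_t)((b * 157) >> 12);
}
```

For b = 300 the intermediate product is 47100, which does not fit in an int16_t, yet the final result 11 does. Right-shifting a negative value is implementation-defined in C, so the sketch is only exercised here with non-negative inputs.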
| dgrunwald wrote:
| You're assuming `int` is 32 bits. The fact that C
| implicitly promotes to `int` makes it extremely hard to
| use the stdint.h typedefs without accidentally making
| platform-specific assumptions.
|
|     uint16_t a, b, c;
|     a = 50000;
|     b = 50000;
|     c = a * b;
|
| This simple code may or may not have undefined behavior
| depending on the target platform. (it's fine for 16-bit
| int, UB for 32-bit int, and fine again for 64-bit int)
|
| Numeric promotion means that it's almost impossible to
| write code that is correct on multiple platforms with
| different sizeof(int).
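One commonly used way to sidestep the problem, sketched in C: convert one operand to `unsigned int` before multiplying, so the arithmetic is unsigned (and therefore wraps instead of overflowing) whatever the width of `int`. The name `mul_u16_wrapping` is made up for this example.

```c
#include <stdint.h>

/* Multiply two uint16_t values with well-defined wraparound modulo 2^16
 * (hypothetical helper). Casting one operand to unsigned int forces the
 * multiplication to be done in an unsigned type, so there is no signed
 * overflow regardless of whether int is 16, 32 or 64 bits wide. */
uint16_t mul_u16_wrapping(uint16_t a, uint16_t b)
{
    unsigned int product = (unsigned int)a * b;
    return (uint16_t)product;   /* reduces modulo 65536 */
}
```

For example, `mul_u16_wrapping(256, 256)` is 0, because 65536 wraps to zero in 16 bits.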
| p0nce wrote:
| Well there is the integer promotion issue, and the
| integers-have-unknown-size issue.
| flohofwoe wrote:
| Not my idea, but: expanding all values used in expressions
| to "native word size" (e.g. int64_t on modern machines),
| and only treat "bit width" and "signedness" as load/store
| attributes doesn't seem like such a bad idea.
|
| The problem seems to be that (AFAIK) C stopped at 32-bit
| integers.
| p0nce wrote:
| The hardware matches C: on x86, using 32-bit integers is not
| slower than using 64-bit integers.
| SuchAnonMuchWow wrote:
| I agree, both choices are fine as long as they are
| consistent.
|
| One issue with expanding all values to native word size
| is that this would mainly benefit _very_ old hardware:
| modern CPUs are as fast directly manipulating any int
| size as they are manipulating "native word size", and
| expanding can even be slower because vector instructions
| usually process twice as many int32 per cycle as int64
| (and twice as many int16 as int32), even if not all code
| can be vectorized.
| gavinray wrote:
| You basically have to accept the relatively small, upfront cost
| of verbose declarations for these sorts of things in exchange
| for the lack of headaches + bugs you wind up with later.
|
| Or not, you can also choose not to write these kinds of
| languages and nobody can blame you -- to each their own.
| msaltz wrote:
| I sort of felt similarly when first using Zig until it crashed
| with a helpful error at runtime when I was trying to cast -1 to
| an unsigned integer :) The debugging time that saved more than
| outweighed the amount of time it took to do the explicit cast.
| pjmlp wrote:
| C programmers are well known to refer to such programming
| safety as a straitjacket since the Pascal days anyway.
|
| There was more than enough time to learn why it was the right
| option to start with.
| lrfield wrote:
| The straitjacket is the syntax in Ada. I don't feel locked
| in by the correctness measures, but if you are going to have
| elaborate declaration blocks I prefer "let foo = bar in ...".
|
| Because everything else is "let" in disguise. ML got the
| syntax right.
| pornel wrote:
| Rust does go overboard though, e.g. indexing requires
| `usize`. You either use `usize` for all your integers, or you
| end up with a cast-salad.
|
|     arr[i as usize]
|
| This is actually risky, because even when you only meant to
| extend, you can also accidentally truncate or change sign.
| `as` does all of these things without a warning, and mixed
| with type inference it can easily lead to surprises.
|
| Rust makes it worse by insisting on a theoretical support for
| 16-bit and 128-bit `usize` regardless of the target platform,
| so you can't use `a[i.into()]` to infallibly convert `u32` to
| `usize` on 32-bit and 64-bit platforms.
|
| The correct syntax is:
| arr[usize::try_from(i).unwrap()]
|
| which is painfully verbose. In larger expressions it's almost
| an obfuscation of what the code is actually doing.
| piaste wrote:
| I'm not a Rust programmer, but AFAIK Rust has inline
| functions, so if your code has lots of indexing calls,
| could most of the pain go away with a few short helper
| functions?
|
|     use std::convert::TryFrom; // needed before the 2021 edition
|
|     #[inline(always)]
|     fn idx(i: u32) -> usize { usize::try_from(i).unwrap() }
|
| or
|
|     #[inline(always)]
|     fn get<T>(arr: &[T], i: u32) -> &T { &arr[usize::try_from(i).unwrap()] }
|
| Have to repeat the definition for each numeric type you
| use, I guess, but hopefully you aren't using _that_ many of
| them.
| joppy wrote:
| The Rust approach of having all indexing being unsigned has
| been extremely annoying for programming algorithms which
| need to do interesting indexing patterns like walking
| backwards through an array while indexing into another: the
| fact that "i >= 0" can no longer be used as a loop
| condition is quite exasperating. It means other more
| complicated indexing or looping approaches need to be used,
| and when showing code to coworkers (mostly mathematicians),
| they puzzle over this for a while before asking "why not
| just use i >= 0"? It's not just this case - doing index
| arithmetic in general is vastly complicated by the fact
| that it's difficult to detect idx < 0.
|
| I think unsignedness on array indices is one of those
| places where the "make invalid states unrepresentable"
| mantra has gone too far: yes it's nice that theoretically
| the whole 64-bit index space is addressable from a byte
| array based at 0, but in reality -1 is just as stupid and
| invalid an array index as 2^64 - 1 for pretty much every
| use-case. What we gain is negligible; what we lose is a lot
| of sensible code for dealing with index arithmetic.
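The same pitfall exists in C whenever the index is a `size_t`, so here is a C sketch of one common workaround for walking backwards with an unsigned index. The function `reverse_copy` is a hypothetical example, not something from the thread.

```c
#include <stddef.h>

/* Copy src into dst in reverse order using an unsigned index
 * (hypothetical helper). `i >= 0` would always be true for size_t,
 * so the loop tests `i-- > 0` instead: the comparison happens before
 * the decrement, and the body sees i = n-1, n-2, ..., 0. */
void reverse_copy(const int *src, int *dst, size_t n)
{
    for (size_t i = n; i-- > 0; ) {
        dst[n - 1 - i] = src[i];
    }
}
```

The loop also handles n = 0 without touching either array. It works, but it is arguably less obvious than the signed `i >= 0` form the comment above asks for.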
| jstimpfle wrote:
| Negative indexing could even make sense in some
| situations. It's not a far stretch to imagine a
| pointer pointing to the one-past-last element of a
| series, and using e.g. -1 to get the most recent element.
| Python's indexing indeed works that way.
| pjmlp wrote:
| Just wait for the updated C++ standard with size_t
| indexers.
| bbatha wrote:
| As I'm sure you're aware, this gets lost in bike shed land
| every time it comes up but Rust could implement
| `Index<iN|uN>` pretty easily. Unlike C you don't need an
| implicit cast to do the right thing.
|
| Personally, when I have a data structure that uses non-`usize`
| indexes, I usually wrap my vector/array in a custom type
| that implements `Index` for whatever my common index types
| are.
| steveklabnik wrote:
| It's not only "bikeshedding", there are (in my
| understanding) significant inference issues that happen
| if we were to enable this, and that would have to be
| dealt with in a satisfactory way.
| Shadonototro wrote:
| Casting is one of the reasons I gave up on Zig quickly: it's just
| painful and solves nothing, and it makes the code hard to decipher.
| losvedir wrote:
| How quickly?
|
| Like, you saw it in the docs when you were mildly interested
| and decided it wasn't for you? Or you started a project, had at
| least a "hello world" compiling, but then eventually gave up on
| it because you were tired of casting? In that case I'm curious
| what the project was since I haven't run into casting all that
| often, but it might vary by domain.
| dnautics wrote:
| i just looked over some code I have which takes a complicated
| 3rd party binary format, parses, and extracts information
| from it. In about 2K LOC including tests there are no casting
| events except for some int-to-pointer and pointer-to-int at
| the boundary of passing reference to WASM.
| Shadonototro wrote:
| It was a while ago, a game engine, it just feels bad, maybe i
| misused the language, but that is a sign things aren't
| intuitive
|
| Some people love typing code like they'd write books, i
| don't, i like to be concise and to go straight to the point
|
| And it's not just @as(i32, ...), @intToFloat(f32, ...) and
| @floatToInt(i32, ...): there's also converting between i8, i32
| and i64, then bitwise operations, then slices, then C strings,
| etc. It's a lot of visual bloat.
| AnIdiotOnTheNet wrote:
| I like it because it provides explicitness and readability.
| There is no guessing or following a flow chart of implicit type
| conversions to see what's going on.
| bla3 wrote:
| > You won't even get a warning unless -Wextra or -Weverything is
| turned on.
|
| That's because -Wconversion isn't part of -Wextra for some
| reason. It is part of -Weverything though, and with that clang
| does warn:
|
|     warning: implicit conversion changes signedness: 'int' to 'unsigned long' [-Wsign-conversion]
|     if (sizeof(int) < -1) abort();
|                     ~ ^~
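A short C sketch of what that comparison actually computes, assuming (as on mainstream platforms) that `size_t` is at least as wide as `int`, so the `int` operand is the one that gets converted. Both function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Spells out the conversion C applies implicitly to `size < value`
 * when size is a size_t and value is an int: the int is converted to
 * size_t, so -1 becomes the largest size_t value. */
int less_than_as_c_sees_it(size_t size, int value)
{
    return size < (size_t)value;
}

/* A checked variant: a negative value can never be greater than a size. */
int less_than_checked(size_t size, int value)
{
    return value >= 0 && size < (size_t)value;
}
```

With these, `less_than_as_c_sees_it(sizeof(int), -1)` is 1 (true), which is why the `abort()` in the quoted snippet runs, while `less_than_checked(sizeof(int), -1)` is 0.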
___________________________________________________________________
(page generated 2021-05-11 23:01 UTC)