Post APcezcGnGXRWSBLgQ4 by deshipu@chaos.social
(DIR) Post #APbrmvCMgPRJdnauO0 by piggo@piggo.space
2022-11-14T23:32:17.605437Z
0 likes, 0 repeats
the thing i least like about AVR is the progmem address space. stupid Harvard nonsense needs two separate copies of basically the same functions for progmem and ram data
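A minimal host-side sketch of the duplication complained about above. It assumes flash is simulated by an ordinary array behind an accessor; `fake_flash`, `flash_read_byte`, `copy_from_ram` and `copy_from_flash` are hypothetical names for illustration. On a real AVR the accessor would be something like `pgm_read_byte()` from avr-libc's `<avr/pgmspace.h>`, and the second function corresponds to the `_P` variants such as `strcpy_P`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for program memory; on AVR this would be PROGMEM data in flash. */
static const char fake_flash[] = "hello from flash";

/* Hypothetical accessor simulating an LPM read at a flash offset. */
char flash_read_byte(uint16_t addr) {
    assert(addr < sizeof fake_flash);
    return fake_flash[addr];
}

/* Copy a string from RAM: plain loads, returns the length copied. */
size_t copy_from_ram(char *dst, const char *src) {
    size_t n = 0;
    while ((dst[n] = src[n]) != '\0') n++;
    return n;
}

/* Copy a string from "flash": the same loop, but every read goes
   through the accessor, so the function has to be written twice. */
size_t copy_from_flash(char *dst, uint16_t src_addr) {
    size_t n = 0;
    while ((dst[n] = flash_read_byte((uint16_t)(src_addr + n))) != '\0') n++;
    return n;
}
```

The two loops differ only in how a byte is fetched, which is the "two separate copies of basically the same functions" problem.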
(DIR) Post #APbt8140AhyhrnIycy by deshipu@chaos.social
2022-11-14T23:46:25Z
0 likes, 0 repeats
@piggo it's the laziness of the compiler writers, not the architecture, though
(DIR) Post #APbt81gHsMbRmXDYIq by piggo@piggo.space
2022-11-14T23:47:18.214559Z
0 likes, 0 repeats
@deshipu I don't know, how would that work? Once you take a pointer it's wild west
(DIR) Post #APbtcKLh29bAMn28Zc by piggo@piggo.space
2022-11-14T23:52:46.860045Z
0 likes, 0 repeats
@deshipu you can't just copy from progmem, it needs lpm. But there's no way to tell where that pointer came from once it's a number, even if you wanted to do some magic switching at runtime
(DIR) Post #APcWJtnKv31X0hAQ1w by ignaloidas@not.acu.lt
2022-11-15T07:06:28.445Z
0 likes, 0 repeats
@piggo@piggo.space @deshipu@chaos.social The problem is C. If you use function pointers in C the compiler essentially has to duplicate it. Good languages can avoid that problem.
(DIR) Post #APcXz2z9klH9LEseno by piggo@piggo.space
2022-11-15T07:25:05.577780Z
0 likes, 0 repeats
@ignaloidas @deshipu the compiler doesn't duplicate it, you duplicate it. This isn't about function pointers. The same address can point either to flash, eeprom or ram, each needing a different way to read it
(DIR) Post #APcY56EiZxXfA62sca by ignaloidas@not.acu.lt
2022-11-15T07:26:11.667Z
0 likes, 0 repeats
@piggo@piggo.space @deshipu@chaos.social Different address spaces have existed for a while and C has always been kinda shit with them.
(DIR) Post #APcYMMewluIX1IXw3c by wolf480pl@mstdn.io
2022-11-15T07:29:14Z
0 likes, 0 repeats
@piggo @deshipu @ignaloidas is there also an io space?
(DIR) Post #APceTAmnujYLwoqpyi by piggo@piggo.space
2022-11-15T08:37:45.616194Z
0 likes, 0 repeats
@wolf480pl @deshipu @ignaloidas i think so, all IO is done using SFRs
(DIR) Post #APceWZ7zrtHeJMNjAO by piggo@piggo.space
2022-11-15T08:38:23.648523Z
0 likes, 0 repeats
@wolf480pl @deshipu @ignaloidas but iirc these are encoded as part of the opcodes, they don't have real addresses
(DIR) Post #APceoO8ZTDJZR89HyS by piggo@piggo.space
2022-11-15T08:41:35.731805Z
0 likes, 0 repeats
@deshipu @ignaloidas @wolf480pl nvm i was wrong, there are memory address space aliases. they probably couldn't do the same thing for flash, because it would overflow uint16
(DIR) Post #APcezcGnGXRWSBLgQ4 by deshipu@chaos.social
2022-11-15T08:40:06Z
0 likes, 0 repeats
@piggo @ignaloidas a pointer as an int is a fantasy anyways, the internal representation of a pointer in a C compiler has to be much more complex to account for all the undefined behavior
(DIR) Post #APcezcqx66MmGKGYmO by piggo@piggo.space
2022-11-15T08:43:36.610804Z
0 likes, 0 repeats
@deshipu @ignaloidas you can cast a pointer to an int and later cast it back and it will work, there is no magic here. I don't know about restrict and aliasing and other weird stuff, but once you turn it into an int, it's an int
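A small sketch of the round trip described above, assuming the platform provides `uintptr_t` (it is an optional type in `<stdint.h>`). The function name `roundtrip_read` is made up for illustration. It only covers converting a pointer to an integer and back unchanged; it makes no claim about what happens if the integer is modified in between.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a pointer to an integer and back, then read through the result.
   The conversion goes via void * because that is the case the C standard
   describes for uintptr_t. */
int roundtrip_read(int *p) {
    uintptr_t bits = (uintptr_t)(void *)p;   /* pointer -> integer */
    int *q = (int *)(void *)bits;            /* integer -> pointer */
    assert(q == p);                          /* unchanged value compares equal */
    return *q;
}
```

Calling `roundtrip_read(&x)` on an `int x` returns the value of `x`.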
(DIR) Post #APcf7VuOngg92cdiXg by ignaloidas@not.acu.lt
2022-11-15T08:45:04.780Z
0 likes, 0 repeats
@piggo@piggo.space @deshipu@chaos.social NO YOU CAN'T. CHERI et al. send you their regards.
(DIR) Post #APcfJl7Q4l4vgnZKiG by piggo@piggo.space
2022-11-15T08:47:15.086806Z
0 likes, 0 repeats
@ignaloidas @deshipu well, it works 🤷‍♂️
(DIR) Post #APcfPmOKFzERNFLbLk by ignaloidas@not.acu.lt
2022-11-15T08:48:22.928Z
0 likes, 0 repeats
@piggo@piggo.space @deshipu@chaos.social Until it doesn't. Pointer provenance is a thing, and C imagining that it doesn't exist is why you need to duplicate things across flash and RAM.
(DIR) Post #APcgNUW7qF8PJTT6TQ by piggo@piggo.space
2022-11-15T08:59:08.440659Z
0 likes, 0 repeats
@ignaloidas @deshipu so your idea is that the compiler would somehow remember from what a pointer was taken (thick pointers?) and then each time this pointer is accessed (and nested fields of the struct are accessed, etc.), it would automagically choose LD or LPM to do the right thing? so kinda creating the strcpy and strcpy_P behind the scenes, like rust generics do it?
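A rough host-side sketch of the runtime flavour of this idea: a pointer that carries a tag saying which space it came from, with a single read routine that picks the access method. Everything here (`struct fat_ptr`, `fat_read`, `fat_strcpy`, the simulated `sim_ram` and `sim_flash` arrays) is hypothetical and only stands in for real LD and LPM accesses. As far as I know, avr-gcc's `__memx` address-space qualifier works along similar lines, but treat that as an assumption rather than a description of its implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum mem_space { SPACE_RAM, SPACE_FLASH };

/* Hypothetical "fat pointer": an address plus the space it belongs to. */
struct fat_ptr {
    enum mem_space space;
    uint16_t addr;
};

/* Simulated memories; on real hardware these are separate address spaces. */
static uint8_t sim_ram[64];
static const uint8_t sim_flash[64] = "flash string";

/* One read routine; on AVR this branch would choose between LD and LPM. */
uint8_t fat_read(struct fat_ptr p) {
    assert(p.addr < 64);
    return p.space == SPACE_FLASH ? sim_flash[p.addr] : sim_ram[p.addr];
}

/* A single string copy that accepts a source in either space.
   The destination is an offset into simulated RAM. */
size_t fat_strcpy(uint16_t dst, struct fat_ptr src) {
    size_t n = 0;
    for (;;) {
        assert(dst + n < sizeof sim_ram);
        uint8_t c = fat_read(src);
        sim_ram[dst + n] = c;
        if (c == '\0') return n;
        src.addr++;
        n++;
    }
}
```

This version pays for a tag check on every access. The compile-time variant discussed in the thread would instead keep the space in the pointer's type, so the compiler could emit the right instruction directly and generate the two copy routines itself.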
(DIR) Post #APcgaF9oGEHZRK6x0a by ignaloidas@not.acu.lt
2022-11-15T09:01:27.924Z
1 likes, 0 repeats
@piggo@piggo.space @deshipu@chaos.social If it didn't allow pointers to be cast to and from integers, and had a reasonable type system, it could do this all at compile time, with zero cost at runtime, yes. C took this from us. C is why every new thing only has memory mapped IO.
(DIR) Post #APcgnfvQ38j2DumijA by wolf480pl@mstdn.io
2022-11-15T09:03:52Z
0 likes, 0 repeats
@piggo @deshipu @ignaloidas IIRC, if you take a pointer, cast it to an int, then subtract from it, or add a number greater than sizeof(thing it points to), then you get an Undefined Value. Dereferencing that pointer is Undefined Behaviour. It may work now, but it's allowed to break when you update gcc. Or enable -O3. Or remove a printf.
(DIR) Post #APchYWckRstGzFOKRM by piggo@piggo.space
2022-11-15T09:12:20.314333Z
0 likes, 0 repeats
@ignaloidas @deshipu ok this actually working and being implemented in avr-gcc would be great. Memory mapped IO is really nice though ... if you have a wide enough bus to do it
(DIR) Post #APcmGHyqebjg83qtlo by wolf480pl@mstdn.io
2022-11-15T10:04:58Z
0 likes, 0 repeats
@ignaloidas @piggo @deshipu Ok but why would anyone want a separate IO address space? What are the advantages of it?
(DIR) Post #APcmUejOrbnMkATdtQ by ignaloidas@not.acu.lt
2022-11-15T10:07:40.908Z
0 likes, 0 repeats
@wolf480pl@mstdn.io @piggo@piggo.space @deshipu@chaos.social You can have separate buses to your core for IO and memory, which can reduce bus contention, which is useful if you're mostly doing IO.
(DIR) Post #APcmf7mhhP3vkM0we8 by wolf480pl@mstdn.io
2022-11-15T10:09:32Z
0 likes, 0 repeats
@ignaloidas @piggo @deshipu or you could have two memory buses...
(DIR) Post #APcmnbkm9w3n0h9pfU by ignaloidas@not.acu.lt
2022-11-15T10:11:06.781Z
0 likes, 0 repeats
@wolf480pl@mstdn.io @piggo@piggo.space @deshipu@chaos.social And have to deal with the possibility of unstable store ordering from a single core? That's a no from me
(DIR) Post #APcnztBxFX0kWWI3rk by wolf480pl@mstdn.io
2022-11-15T10:24:25Z
0 likes, 0 repeats
@ignaloidas @piggo @deshipu hmm I guess store ordering between memory and IO doesn't matter as much? Like, if you're giving a GPU a pointer to the command queue in RAM, then you have to do some cache control stuff anyway to make sure the command queue is actually in memory and not just in your cache, so store ordering between memory and IO doesn't matter... but then if the only IO you're doing is control registers then you're not doing a lot of IO...
(DIR) Post #APco2HI6cVs1qSJ408 by wolf480pl@mstdn.io
2022-11-15T10:24:57Z
0 likes, 0 repeats
@ignaloidas @piggo @deshipu it probably makes much more sense on an MCU which has lots of DMA-less peripherals
(DIR) Post #APcpzKMWwbkWPV64Lw by ignaloidas@not.acu.lt
2022-11-15T10:46:50.602Z
0 likes, 0 repeats
@wolf480pl@mstdn.io @piggo@piggo.space @deshipu@chaos.social It makes much more sense when you're considering IO rather than co-processors. Ethernet straight into the core might be a bit much, but with some IO-specific buffers (maybe 1MB or so?) you could do a lot with it. But worth noting that if we do things like TCP or TLS in coprocessors (like servers are moving towards) then yeah, it doesn't make as much sense.
(DIR) Post #APcr2duCq7X1qSzdU8 by wolf480pl@mstdn.io
2022-11-15T10:58:37Z
0 likes, 0 repeats
@ignaloidas @piggo @deshipu so you mean a separate bus for accessing the Ethernet ring buffer instead of DMA into main memory? And then you need to copy that into main memory anyway in order for the network stack to process it?
(DIR) Post #APcrTWnnMFuDdJoYQi by ignaloidas@not.acu.lt
2022-11-15T11:03:30.449Z
0 likes, 0 repeats
@wolf480pl@mstdn.io @piggo@piggo.space @deshipu@chaos.social For what it's worth, if your CPU is getting raw frames into memory as-is right now, then not much would change. But it would be viable to perform some filtering on the CPU before copying it into main memory. Or, if the CPU had tightly coupled memory instead of cache, then you could perform a lot more processing without the packets hitting main memory, possibly enough to satisfy some applications like load balancing.
(DIR) Post #APcspcVIwJBNdrGPUe by wolf480pl@mstdn.io
2022-11-15T11:18:39Z
0 likes, 0 repeats
@ignaloidas @piggo @deshipu I thought the state of the art was to do things in a zero-copy way, i.e. parse frames in place, without ever copying them. Though at some point you will have to do TCP reassembly, which necessitates copying. And presumably a core-local memory would help with that. But that starts to feel like a coprocessor... unless we figure out how to run a general-purpose OS on a CPU with core-local memory, and share that core-local memory between dynamically scheduled tasks.
(DIR) Post #APct9j6G7p5ttI8W5A by ignaloidas@not.acu.lt
2022-11-15T11:22:20.266Z
0 likes, 0 repeats
@wolf480pl@mstdn.io @piggo@piggo.space @deshipu@chaos.social Yeah, zero-copy is almost a necessity above a certain bandwidth, because your memory only has so much bandwidth. But it's a lot better if you can do it without even touching main memory. Netflix has done a bunch of work on that for their video serving, there are some presentations somewhere about how they do it.