Post Ahdu0319fAcg5wp7rc by JustJimWillDo@mastodon.online
 (DIR) Post #AhdsROULk82Yb1OJpg by foone@digipres.club
       2024-05-07T08:25:12Z
       
       1 likes, 0 repeats
       
       The way sentences containing the German character ß get longer when uppercased was specially designed to create memory problems in C programs doing string handling
       
 (DIR) Post #Ahdsc1FaTwnXi4YKoa by humanhorseshoes@mastodon.world
       2024-05-07T08:26:24Z
       
       0 likes, 0 repeats
       
       @foone just use ss
       
 (DIR) Post #Ahdsnc6vmC5hDR24e0 by stfn@fosstodon.org
       2024-05-07T08:27:08Z
       
       0 likes, 0 repeats
       
       @foone Probably not related that much, but I remember that in the early days of mobile phones, when every text message was expensive, there was an outrage that Polish diacritics (ąęźćńż) were counted as more than one character within the 140-character limit.
       
 (DIR) Post #Ahdt0ZXycTqz1qrVmi by cato@chaos.social
       2024-05-07T08:27:44Z
       
       0 likes, 0 repeats
       
       @foone I was gonna say "just use ẞ" but depending on encoding, that might also add another byte or two I guess? Then again, is this really the only case where the uppercase variant of a character would require more bytes than the lowercase variant?
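
       A quick way to check both questions is to measure UTF-8 byte lengths before and after uppercasing. The sketch below is Python, not C, and relies on `str.upper()` applying Unicode's full case mappings; the helper names `utf8_len` and `growth_on_upper` are made up for illustration. Under those assumptions, ẞ (U+1E9E) takes 3 bytes against 2 for ß, and ß is not the only letter whose uppercase form can need more bytes.

```python
# Compare UTF-8 byte lengths before and after uppercasing.
# Assumption: str.upper() follows Unicode's full case mappings, so a
# single code point may turn into several.

def utf8_len(s: str) -> int:
    """Number of bytes s occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def growth_on_upper(s: str) -> int:
    """Bytes gained (positive) or lost (negative) by uppercasing s."""
    return utf8_len(s.upper()) - utf8_len(s)


# ß, ŉ (U+0149), ΐ (U+0390), the ﬁ ligature (U+FB01), dotless ı (U+0131)
samples = ["ß", "ŉ", "ΐ", "ﬁ", "ı"]

if __name__ == "__main__":
    print("capital ẞ:", utf8_len("ẞ"), "bytes")
    for ch in samples:
        print(f"U+{ord(ch):04X}: {utf8_len(ch)} -> "
              f"{utf8_len(ch.upper())} bytes ({growth_on_upper(ch):+d})")
```

       In this sketch ß comes out unchanged in UTF-8 byte count, while ŉ and ΐ grow and ﬁ and ı shrink, so a buffer sized from the lowercase byte length is not safe in general.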
       
 (DIR) Post #Ahdt8ciWiVn1B4dU0W by fuzuki@mas.to
       2024-05-07T08:29:12Z
       
       0 likes, 0 repeats
       
       @foone Well why wouldn't "Ss" be longer than "ss" huh? Makes complete sense doesn't it?
       
 (DIR) Post #AhdtJ8Jw4TWg5c93CK by tehabe@norden.social
       2024-05-07T08:34:35Z
       
       0 likes, 0 repeats
       
       @foone at least there is a ẞ now
       
 (DIR) Post #AhdtPV2YVrKGI63FRo by Cryptomon@bunt.social
       2024-05-07T08:34:28Z
       
       0 likes, 0 repeats
       
       @foone there is an uppercase ß! (But most Germans don't know and it's on no keyboard...) I had to copy it from Wikipedia: ẞ
       
 (DIR) Post #AhdtXYq6GuSCOG4MS0 by slyecho@mdon.ee
       2024-05-07T08:37:24Z
       
       0 likes, 0 repeats
       
       @foone Both are 2 bytes in UTF-8: `c39f` or `7373`.
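
       The hex values can be reproduced with a few lines of Python. This is only a sketch: the helper `encoded_hex` is a hypothetical name, and Latin-1 is included as an assumed example of a single-byte encoding where ß really is shorter than its replacement.

```python
# Show the encoded bytes of ß and its two-letter replacements.

def encoded_hex(s: str, encoding: str = "utf-8") -> str:
    """Hex string of s encoded with the given codec."""
    return s.encode(encoding).hex()


if __name__ == "__main__":
    for text in ("ß", "ss", "SS"):
        print(repr(text),
              "utf-8:", encoded_hex(text),
              "latin-1:", encoded_hex(text, "latin-1"),
              "code points:", len(text))
```

       In UTF-8 the byte count stays at 2, but the number of code points goes from 1 to 2, and in Latin-1 the byte count goes from 1 to 2 as well. Code that sizes buffers by characters, or that works in a single-byte encoding, still sees the string grow.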
       
 (DIR) Post #AhdthEd4GeXSbaaQRE by acb@mastodon.social
       2024-05-07T08:39:43Z
       
       0 likes, 0 repeats
       
       @foone They’ve finally added an eszett to Unicode, though typographers are still debating what it should look like: http://cinga.ch/eszett/
       
 (DIR) Post #Ahdu0319fAcg5wp7rc by JustJimWillDo@mastodon.online
       2024-05-07T08:43:03Z
       
       0 likes, 0 repeats
       
       @foone I doubt this, but I love that it is at least a possibility.
       
 (DIR) Post #AhduFHEb5UHeX0NBsO by larsmb@mastodon.online
       2024-05-07T08:45:44Z
       
       0 likes, 0 repeats
       
       @foone Thankfully UTF-8 provides this service for many languages now, us Germans are no longer special
       
 (DIR) Post #AhdyTdKi7c98wKKkgy by krono@toot.berlin
       2024-05-07T09:32:38Z
       
       0 likes, 0 repeats
       
       @foone In some official capacities, where things have to be uppercased (from typewriter days), the "ß" has to be transformed into "(SS)" (so as to differentiate "Assman" -> "ASSMAN" and "Aßman" -> "A(SS)MAN", and yes it is hilarious for English readers). It is its own pumping lemma of sorts.
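
       As a rough illustration of that convention, the substitution has to happen before the generic uppercasing step, otherwise ß has already become a plain SS. The function name `official_upper` is invented for this sketch and does not correspond to any official rule set or library.

```python
# Sketch of the "(SS)" convention described above.

def official_upper(name: str) -> str:
    """Uppercase name, marking each original ß as "(SS)"."""
    # Replace first: after .upper() a ß would be indistinguishable from "SS".
    return name.replace("ß", "(SS)").upper()


if __name__ == "__main__":
    for name in ("Assman", "Aßman"):
        print(name, "->", official_upper(name))
```

       Each ß now adds three characters instead of one, so the length problem gets worse, not better.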
       
 (DIR) Post #Ahe0xxNiLXgfiNKFkG by kawa@mas.to
       2024-05-07T10:00:06Z
       
       0 likes, 0 repeats
       
       @foone In UTF-8 they'd remain the same length :3
       
 (DIR) Post #Ahe19Rfk4BAKqTEhhA by technocidal@mastodon.social
       2024-05-07T10:03:15Z
       
       0 likes, 0 repeats
       
       @foone And now that we’ve added an uppercase ẞ all the primitive search-and-replace tactics no longer work 😀
       
 (DIR) Post #Ahe8GHphkLKa2G6arA by FantasmitaAsex@todon.eu
       2024-05-07T11:22:45Z
       
       0 likes, 0 repeats
       
       @foone *doing string handling badly and/or assuming that everything is ASCII or Latin1
       
 (DIR) Post #Ahe8mg7Wyui0d1tBQG by _nd_@fnordon.de
       2024-05-07T11:28:35Z
       
       0 likes, 0 repeats
       
       @foone I was part of an upgrade project where parts of a program written in C were moved to Java. The DB layer of the program used two VARCHAR(n) columns for text: one as-is, and one in upper case for indices, both with the same n. The client truncated the string. The upgrade was a long project and tested extensively, but on the day of the go-live the DB connection suddenly hung. Reason: Java did The Right Thing and converted ß → SS, and the DB interface didn't deal well with too-long strings.
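
       The failure mode can be sketched in a few lines. This is Python standing in for the Java side, and it assumes the column limit n counts characters; the function names `fits_both_columns` and `truncate_for_index` are hypothetical and not taken from the project described.

```python
# Sketch: a value that fits VARCHAR(n) as-is may not fit once uppercased.
# Assumption: n counts characters (code points), not bytes.

def fits_both_columns(text: str, n: int) -> bool:
    """True if text and its uppercase form both fit in n characters."""
    return len(text) <= n and len(text.upper()) <= n


def truncate_for_index(text: str, n: int) -> tuple[str, str]:
    """Truncate the as-is and uppercase values independently to n characters."""
    # Uppercase first, then truncate, so the index value never exceeds n.
    return text[:n], text.upper()[:n]


if __name__ == "__main__":
    n = 6
    value = "Straße"  # exactly n characters as typed
    print(fits_both_columns(value, n))
    print(truncate_for_index(value, n))
```

       Truncating only the original string, as the client did, leaves the uppercase copy one character too long for every ß it contains.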
       
 (DIR) Post #AheOay7fp22yFlOUkK by henrikjernevad@mastodon.social
       2024-05-07T14:25:06Z
       
       0 likes, 0 repeats
       
       @foone That was actually the cause of a bug I spent way too long finding. 😂 I even wrote a blog post about it. https://henko.net/blog/i-can-be-wrong/