[HN Gopher] EBCDIC is incompatible with GDPR
       ___________________________________________________________________
        
       EBCDIC is incompatible with GDPR
        
       Author : edent
       Score  : 241 points
       Date   : 2021-10-25 11:48 UTC (11 hours ago)
        
 (HTM) web link (shkspr.mobi)
 (TXT) w3m dump (shkspr.mobi)
        
       | exporectomy wrote:
       | People are so prissy about the Unicode representation of their
       | name while forgetting that even that is a machine-only
       | representation made to approximate the technical limitations of
       | printing presses and is different from anything they can write by
       | hand or speak with their mouth. If you want the bank to use your
       | "real" name, for most people, it needs to be spoken or possibly
       | hand written. And it had better be in the correct accent or
       | writing style too. In other words, storm in a teacup.
        
       | mjburgess wrote:
       | The ruling was that the bank has to change:
       | 
       | https://gdprhub.eu/index.php?title=Court_of_Appeal_of_Brusse...
        
       | gregw2 wrote:
       | Transitioning off a long established core mainframe/AS400 app is
       | not necessarily so easy as just changing to UTF-8 as the article
       | author implies.
       | 
       | If you have no mainframe or enterprise experience to relate to
       | that observation, consider the effort involved to transition from
       | python 2 to (UTF-8 clean) python 3!
       | 
       | That said, I am not even clear from the article which diacritical
       | markings are missing from EBCDIC and if the lawyers' arguments to
       | "not change" were legitimate in the way the article implies...
       | you do realize there are hundreds of EBCDIC code pages covering
       | at least all the European languages ... since these are markets
       | which IBM has sold into for 50+ years now, right?
       | 
       | I only learned about EBCDIC code pages when trying to proactively
       | set up proper character encoding handling for data extraction
       | from one of my employer's long-running AS400s... "Which EBCDIC?"
       | is not that different a headache from "which extended ASCII code
       | page?"... EBCDIC is not just like 7-bit (non extended) ASCII as
       | the article implies.
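To make the "Which EBCDIC?" headache concrete, here is a minimal Python sketch. It assumes the two code pages in play are cp037 (US/Canada) and cp500 (International), which are among the few EBCDIC variants that ship with Python's standard codecs; the sample bytes are made up for illustration and do not come from any real extract.

```python
# The same three bytes, decoded under two different EBCDIC code pages.
raw = bytes([0xC8, 0xC9, 0x5A])  # "HI" plus one punctuation byte

as_cp037 = raw.decode("cp037")  # EBCDIC US/Canada
as_cp500 = raw.decode("cp500")  # EBCDIC International

print(as_cp037)
print(as_cp500)

# The letters agree across both pages, but the punctuation byte does not,
# so guessing the wrong page corrupts data without raising any error.
print(as_cp037 == as_cp500)
```

The practical consequence is that an extract needs its code page recorded alongside the data, because nothing in the bytes themselves says which one was used.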
        
         | nemoniac wrote:
         | > there are hundreds of EBCDIC code pages covering at least all
         | the European languages ... since these are markets which IBM
         | has sold into for 50+ years now, right?
         | 
         | Yes IBM has sold into Europe for somewhat longer than that, but
         | not always in the most positive way.
         | 
         | "IBM and the Holocaust: The Strategic Alliance between Nazi
         | Germany and America's Most Powerful Corporation"
         | 
         | https://en.wikipedia.org/wiki/IBM_and_the_Holocaust
        
           | meepmorp wrote:
           | Are you implying that the Nazis are responsible for EBCDIC?
           | If not, how does your point relate to the topic at hand?
        
             | sipos wrote:
             | I'm pretty sure they are just pointing out that IBM has
             | been selling in Europe for longer than 50 years, but the
             | parent comment to theirs did say 50+.
        
             | jackjeff wrote:
             | Nah. The Nazis were too busy creating Facebook :)
        
         | toyg wrote:
         | _> Transitioning off a long established core mainframe app is
         | not necessarily so easy_
         | 
         | I don't think the author implied it was easy, just that it
         | should have been done at some point in the 25 years since the
         | system was first implemented. The last paragraph is just an
         | exhortation to use Unicode everywhere all the time, today.
        
         | WorldMaker wrote:
         | > "Which EBCDIC?" is not that different a headache from "which
         | extended ASCII code page?"
         | 
         | Sure, but that's still a massive headache. You've probably
         | never had a headache like needing to switch ASCII or EBCDIC
         | code pages. You generally can't just switch code pages per-
         | record in a file, storing mixed code page data to disk is
         | generally a bad idea, and in some operating systems you can
         | barely switch code pages per _application_ and sometimes need
         | ROM hacks and entire mainframe restarts to switch code pages.
         | (Modern z/OS supports something more like modern Linux locale
         | switching with environment variables before running
         | applications so should at least allow per-application code
         | pages.)
         | 
         | Even if the lowest common denominator code page you choose to
         | run your application in is a full bit or two more than the
         | 7-bit ASCII lowest common denominator a single code page per
         | application is still never going to cover the breadth of UTF-8
         | without nasty hacks. (That's of course assuming you don't have
         | other problems such as intermediate tools that presume you are
         | only using ASCII compatible EBCDIC subsets of code pages, which
         | may be the case when you've got an eclectic evolution of code
         | accreted around your mainframe apps.)
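The point that one code page per application can never cover what UTF-8 covers can be sketched in a few lines of Python. The names below are hypothetical examples, and cp037 and cp500 are used only because Python ships codecs for them; a real system may run a different page.

```python
# Check which sample names survive encoding into a single code page.
names = ["Zoë", "Łukasz", "Ελένη"]  # hypothetical customer names


def encodable(name: str, codec: str) -> bool:
    """Return True if every character of name exists in the codec."""
    try:
        name.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for name in names:
    support = {codec: encodable(name, codec) for codec in ("cp037", "cp500", "utf-8")}
    print(name, support)
```

Both EBCDIC pages here carry a Western European repertoire, so a name that needs Polish or Greek letters fails on either one, while UTF-8 accepts all three.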
        
       | AtNightWeCode wrote:
       | Simple to fix compared to GDPR in general. For example, there are
       | local laws that overlap, then EU laws that overlap those, and
       | then GDPR on top of that. It is not like you can just follow GDPR,
       | because then you may break a bunch of other laws.
        
       | qwerty456127 wrote:
       | The first comment saying "TrÃ¨s intÃ©ressant !" looks hilarious
       | in this context. I wonder if it has been made to look like this
       | intentionally or not.
        
         | nerdponx wrote:
         | I'm sure it was deliberate. Got a good laugh out of it!
        
         | capitainenemo wrote:
         | Certainly looks like a joke to me, especially given all the
         | correctly rendered text, and the various encoding related
         | comments. Was probably rendered like this.
         | 
         | $ echo "très intéressant" | iconv -f iso-8859-1 -t utf-8
         | 
         | trÃ¨s intÃ©ressant
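The same mangling can be reproduced without iconv. This is a minimal Python sketch, assuming the comment was UTF-8 text that some layer misread as ISO-8859-1; the variable names are illustrative.

```python
original = "très intéressant"

# Mangle: take the UTF-8 bytes and misread them as ISO-8859-1.
# Each accented letter is two bytes in UTF-8, so it turns into two characters.
mangled = original.encode("utf-8").decode("iso-8859-1")
print(mangled)

# Repair: undo the misreading by reversing the two steps.
repaired = mangled.encode("iso-8859-1").decode("utf-8")
print(repaired == original)
```

The repair only works because ISO-8859-1 maps every byte to a character, so no information is lost in the wrong decode.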
        
           | murkle wrote:
           | ... and in reverse with this very cool tool (found on HN I
           | think) https://ftfy.vercel.app/?s=tr%C3%83%C2%A8s+int%C3%83%C2%A9re...
        
             | capitainenemo wrote:
             | Well... you can do it in reverse with iconv too...
             | 
             | $ echo trÃ¨s intÃ©ressant | iconv -f utf-8 -t iso-8859-1
             | 
             | très intéressant
             | 
             | Admittedly no autodetection. Luckily EU mangling is usually
             | just one or two encodings.
        
               | rndgermandude wrote:
               | > Luckily EU mangling is usually just one or two
               | encodings.
               | 
               | Just to list the iso-8859 parts concerning EU member
               | states:
               | 
               | - iso-8859-1 (Latin-1, Western European, including German
               | umlauts, French accents, etc)
               | 
               | - iso-8859-2 (Latin-2, Central European, including
               | characters to support Polish, Czech, Slovakian, Hungarian
               | and other)
               | 
               | - iso-8859-3 (Latin-3, South European, including
               | characters to support Maltese)
               | 
               | - iso-8859-4 (Latin-4, North European, including
               | characters to support the Baltic states)
               | 
               | - iso-8859-5 (Latin/Cyrillic, including characters to
               | support Bulgarian)
               | 
               | - iso-8859-7 (Latin/Greek, including characters to
               | support Greek)
               | 
               | - iso-8859-10 (Latin-6, Nordic, refinement of Latin-4,
               | popular in Baltic states)
               | 
               | - iso-8859-13 (Latin-7, Baltic Rim, because -10 was not
               | enough)
               | 
               | - iso-8859-15 (Latin-9, basically Latin-1 with the € sign
               | and some commonly used characters missing in Latin-1)
               | 
               | - iso-8859-16 (Latin-10, South-Eastern European,
               | "Intended for Albanian, Croatian, Hungarian, Italian,
               | Polish, Romanian and Slovene, but also Finnish, French,
               | German and Irish Gaelic (new orthography)")
               | 
               | And they are all still in use. ;)
               | 
               | It seems to me -15 is now more popular than -1, probably
               | because it supports the Euro currency sign.
        
               | garaetjjte wrote:
               | Oh, but that's not all! :)
               | 
               | Microsoft had its own Windows-125x codepages, which were
               | not always compatible with ISO ones.
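The euro sign is a handy way to see how these "almost the same" encodings disagree. A small Python sketch, using only the standard codecs and comparing just the three encodings named here:

```python
euro = "€"

# Where does each encoding put the euro sign, if anywhere?
for codec in ("iso8859_1", "iso8859_15", "cp1252"):
    try:
        print(codec, euro.encode(codec).hex())
    except UnicodeEncodeError:
        print(codec, "cannot encode the euro sign")

# The same byte means different things depending on the ISO part assumed.
byte = bytes([0xA4])
print(byte.decode("iso8859_1"), byte.decode("iso8859_15"))
```

So a file labelled "Latin-1" that actually came from a Windows-1252 or Latin-9 system will show the wrong currency symbol, or none at all.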
        
               | capitainenemo wrote:
               | $ for i in {1..16};do echo -n "ISO-8859-$i: ";echo trÃ¨s
               | intÃ©ressant | iconv -f utf-8 -t "iso-8859-$i"
               | 2>&1;done | grep -Pv "illegal input|failed"
               | 
               | ISO-8859-1: très intéressant
               | 
               | ISO-8859-9: très intéressant
               | 
               | ;)
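The brute-force loop above can be sketched in Python as well. This version assumes the intended text is known, which the shell loop does not need; the function name is hypothetical, and parts that Python has no codec for are simply skipped.

```python
mangled = "très intéressant".encode("utf-8").decode("iso8859_1")


def candidate_codecs(text: str, expected: str) -> list[str]:
    """Return the ISO-8859 parts that turn text back into expected."""
    hits = []
    for part in range(1, 17):
        codec = f"iso8859_{part}"
        try:
            if text.encode(codec).decode("utf-8") == expected:
                hits.append(codec)
        except (LookupError, UnicodeError):
            # LookupError: no such codec; UnicodeError: wrong guess.
            continue
    return hits


print(candidate_codecs(mangled, "très intéressant"))
```

A part only qualifies if it places all three mangled characters at the same byte values as Latin-1, which is why so few of them succeed.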
        
       | Aardwolf wrote:
       | Makes me wonder what the rules for registered names are: is a
       | registered name a series of characters from an existing writing
       | system, that would hopefully be compatible with Unicode, or is it
       | anything a human being could possibly write on a piece of paper,
       | including something that has no equivalent in any writing system?
        
       | surfingdino wrote:
       | Mainframes, COBOL, and databases that store data in formats that
       | replicate the hard limits of paper punch cards are a real
       | problem. Banks, insurers, and governmental institutions won't get
       | rid of them and choose to surround those fossils of IT with layer
       | upon layer of tech that gets outdated the moment it gets
       | delivered. I was on a project where we were told to come up with
       | an alternative way of encoding UUID4s, because "bank X runs their
       | DB on a mainframe and they only have N bytes they can use for an
       | ID."
       | 
       | It used to be a given that nobody wanted their bank to run on
       | anything, but mainframes. Now we'd rather they used cloud
       | computing and Postgres. Mainframes have had their day. They may
       | have a future, but they need modern databases and development
       | tools.
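For the UUID4 anecdote, one common way to squeeze an ID into fewer characters is to store something denser than the 36-character text form. The sketch below is an assumption about what such a workaround could look like, not the scheme that project used; the value of N and the mainframe's character constraints are unknown, and the function names are hypothetical.

```python
import base64
import uuid


def to_compact(u: uuid.UUID) -> str:
    """Encode the 16 raw bytes as URL-safe base64 without padding."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def from_compact(s: str) -> uuid.UUID:
    """Restore the padding and rebuild the UUID."""
    padded = s + "=" * (-len(s) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


u = uuid.uuid4()
compact = to_compact(u)

# Text form, raw bytes, and compact text form, in that order.
print(len(str(u)), len(u.bytes), len(compact))
print(from_compact(compact) == u)
```

If the column can hold raw bytes, the 16-byte form is the smallest lossless option; the compact text form is for fields that only accept printable characters.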
        
         | that_guy_iain wrote:
         | > It used to be a given that nobody wanted their bank to run on
         | anything, but mainframes. Now we'd rather they used cloud
         | computing and Postgres. Mainframes have had their day. They may
         | have a future, but they need modern databases and development
         | tools.
         | 
         | I suspect that if you asked a random Joe on the street, they would
         | say they would rather be on a mainframe.
         | 
         | Also, I would much rather my bank didn't host on AWS, GCP,
         | Azure, etc.
        
           | mmis1000 wrote:
           | Mainframes nowadays do support modern languages and modern
           | tech stacks. It's just that no one is willing to move on to the
           | next stack (or it's probably too expensive to).
           | 
           | For example, IBM z runs Linux.
           | 
           | Backward compatibility is one of the biggest selling points
           | mainframes provide: you can run code written 30 years ago
           | with all limitations and bugs preserved as is.
           | 
           | And IBM z can also run many programs written for the original
           | IBM/360 unmodified.
           | 
           | Probably many decided to do so.
        
         | seabird wrote:
         | "Technologies that replicate the hard limits of paper punch
         | cards" and "mainframe" are not necessarily synonymous. You can
         | make bad decisions with any given set of hardware/software you
         | choose to use.
         | 
         | Mainframes have outrageous transaction processing and
         | reliability/redundancy capabilities. If anything, mainframes
         | and modern programming techniques for them are _underrated_ ,
         | largely because dumb licensing on the tooling keeps people from
         | realizing their capabilities.
        
       | zinekeller wrote:
       | .. until it ends with "SWIFT is incompatible with GDPR".
       | 
       | (Okay, privacy wise, some might have uneasiness with SWIFT but
       | we're talking about how it can't handle (in this case) characters
       | outside US-ASCII, unless you have negotiated it with the bank
       | you're sending on, which if it's a US bank, is not supported:
       | https://twitter.com/ajlobster/status/735240869859753985)
        
         | PeterisP wrote:
         | I'm not sure about the speed and schedule of adoption, but I
         | believe that SWIFT systems across the world (including USA!)
         | are migrating towards ISO 20022 messages which has technical
         | support for characters outside US-ASCII; and requirements such
         | as these are a driving factor to migrate away from the earlier
         | SWIFT MT standards.
        
       | xfz wrote:
       | https://tools.ietf.org/id/draft-msporny-base58-01.html
        
       | jwildeboer wrote:
       | (Decision from 2019, unfortunately article doesn't add updated
       | information on if/how the bank solved this)
        
       | markstos wrote:
       | I once worked for a newspaper while they were researching if dead
       | people were voting in the state of Kentucky. The project would
       | compare voter records with those of the deceased. The State
       | responded to one of their open record requests with a magnetic
       | reel about a foot in diameter, which I was tasked with decoding
       | into a spreadsheet.
       | 
       | I took the magnetic reel to college with me that summer and asked
       | around. Turns out they had a magnetic tape reader for reels of
       | this size hooked up to their VAX system. A friendly sysadmin tried
       | to read the data for me, but it came back as gibberish.
       | 
       | I wasn't surprised. Then he said "Aha! EBCDIC!" I hadn't heard of
       | it, but as the reel spun and the names of the dead spun off the
       | reel, he spun his own yarn about this arcane format that was as
       | ancient as the magnetic tape reel I'd brought in.
       | 
       | And yes, there were some dead people voting in Kentucky.
        
         | cesaref wrote:
         | The international banking system is coordinated by the SWIFT
         | network, and all inter-bank messages are encoded in EBCDIC. If
         | you transfer money between countries, or get statements from a
         | broker, chances are it lived in EBCDIC at one point in its
         | journey.
        
         | pavel_lishin wrote:
         | When suspected necromancy is afoot, of course you'd need to see
         | a wizard about it.
        
       | spullara wrote:
       | My brother-in-law has an apostrophe in his last name and almost
       | no systems bother to support it and it is in the character set.
       | If this is an example of an appropriate use of GDPR, wow.
        
       | _pmf_ wrote:
       | That's one way to get your account cancelled.
        
         | PeterisP wrote:
         | Since banking account history has to be stored and provided for
         | literal decades after account closure, they would still have to
         | implement the changes even if that customer left, as they will
         | still be processing his data and have to do it according to the
         | law.
        
         | gpderetta wrote:
         | I'm sure that cancelling accounts of people with "funny"
         | spellings, will definitely not get the bank in trouble for
         | (indirect) discrimination at all.
        
         | gambiting wrote:
         | Not sure about Belgium specifically, but at least in UK the
         | bank can't close your account without a valid court order. They
         | can temporarily suspend it if they suspect you of some crime,
         | but in general a normal checking account cannot be closed by
         | the bank "just because". I'd expect it to work the same way in
         | all of EU.
        
           | fallingknife wrote:
           | That's great! Should be the same in the US, and also apply to
           | payment processors.
        
         | silon42 wrote:
         | Or in the future, getting international payments disabled
         | (including most credit cards).
        
         | DarkWiiPlayer wrote:
         | ooof... didn't GDPR also have some strong opinions on
         | retaliation though?
        
       | CWuestefeld wrote:
       | At the time I moved out of New Jersey 8 years ago, the state was
       | still unable to represent my completely vanilla name on my
       | driver's license. My first name is "Christopher", but their
       | computers can't/couldn't handle an 11-character name. It was
       | always truncated on my driver's license.
       | 
       | This led to problems when they instituted their trusted ID
       | compliance. When renewing the license we were required to provide
       | some combination of documentation to corroborate our identity,
       | and obviously that documentation needs to match the name shown on
       | the driver's license - and of course mine did not.
       | 
       | There was one way out for Christophers like myself. A birth
       | certificate was considered the ultimate truth, so as long as I
       | had a notarized (with the raised seal) birth certificate to prove
       | my identity, they would allow me to renew my license.
       | 
       | The State of New Jersey is very awful at IT. My wife, who works
       | in healthcare finance, told me about problems she was having with
       | the State because - get this - their field for what amounts to
       | "Medicaid ID#" was too narrow, so they had to recycle ID#s for
       | new recipients! And to make that worse, they discarded old backup
       | data so when checking the data for a patient several years ago,
       | it's only possible to find that of the latest owner of ID# 12345.
        
         | throwaway946513 wrote:
         | As a fellow Christopher -thank the IT overlords of my state
         | (MO) that my driver's license uses my name in its entirety and
         | not cutting off the 'r'.
        
         | nhoughto wrote:
         | yep this is fairly common where the original form-factor, a
         | piece of paper/plastic card, was the target. The intent of
         | capturing the name was to put it on the card, and long names
         | can't fit so we have to truncate them or abbreviate etc.
         | 
         | Never mind that that might not be your name or that in the
         | future having the untruncated name might be useful.
         | 
         | Definitely a big problem for then using that data to form the
         | base of an identity system like trusted ID compliance.
        
         | adolph wrote:
         | Cue patio11 link:
         | 
         | Falsehoods Programmers Believe About Names
         | 
         | https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-...
        
         | jjice wrote:
         | The absurdity of not being able to support a name as common as
         | Christopher or anything as long or longer just screams
         | "government work". What the hell went through everyone's head
         | when they built this system? Absolutely no testing or real data
         | was used either, but that shouldn't matter, because having a
         | max limit on someone's (very common) name is honestly
         | impressive. The fact they developed this system without
         | addressing this issue is a testament to the quality of
         | government software development.
         | 
         | I'm sure there is good government software out there, but there
         | are plenty of showcases of the opposite (especially since these
         | are systems that NEED to work).
        
           | mikewarot wrote:
           | >What the hell went through everyone's head when they built
           | this system?
           | 
           | They thought that the next computer system would be better,
           | and when they re-wrote it for the new machine, they'd be able
           | to fix the problems they found, in about 3 years or so. They
           | certainly didn't expect it to still be running in the 1970s
           | or 1980s, let alone in 2021.
           | 
           | IBM broke everything when they introduced backwards
           | compatibility. It saved a ton of time, in the short term, but
           | everything before that point was frozen, and the technical
           | debt it caused has never been paid.
        
           | nikanj wrote:
           | > What the hell went through everyone's head when they built
           | this system?
           | 
           | The way to make a profit on government contracts:
           | 
           | 1) Underbid
           | 
           | 2) Find every possible use case that's OBVIOUSLY needed, yet
           | not in the specs
           | 
           | 3) Leave them unimplemented
           | 
           | 4) Charge through the nose for the extra work.
           | 
           | The system has already been paid for and put to production.
           | It's too late for the buyers to back out, lest they be out
           | both money and face
        
             | londons_explore wrote:
             | It backfired this time though... the government was like
             | "we can just live with 10 character first names".
        
           | WalterBright wrote:
           | Sounds like the media player I just bought that errors out
           | with "too many songs to shuffle". And you're hosed.
        
           | Uehreka wrote:
           | CVS also has trouble with my name (I'm also a Christopher) as
           | do some other private entities like my doctor's office. So
           | this may not be entirely a government thing.
        
             | zdragnar wrote:
             | You can go to a new doctor or pharmacy. The consequences
             | for them not having your full first name recorded are
             | between minimal and nonexistent.
             | 
             | To be a US resident, you must be known to the government.
             | To leave, the government will tax a percentage of your
             | wealth. Try to tell the government to pound sand while
             | remaining, and they can bring the full legal weight of
             | their monopoly on force against you.
             | 
             | The difference in scale of harm between what CVS is capable
             | of versus the government means we should have no issue with
             | holding government to a much higher standard, and be
             | proactive in pointing out when it falls short.
        
           | Vvector wrote:
           | Wait until you find out about Y2K
        
             | marcodave wrote:
             | Wait until we start to reach 2038, it's going to be either
             | awesome or an epic shitshow
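For anyone who has not met the 2038 problem: it comes from storing Unix time as a signed 32-bit count of seconds. A minimal Python sketch of where that counter runs out:

```python
import struct
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold

# The last moment a signed 32-bit seconds counter can represent.
last_moment = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)
print(last_moment.isoformat())

# One second later no longer fits in the field.
try:
    struct.pack("<i", INT32_MAX + 1)
except struct.error as exc:
    print("does not fit in 32 bits:", exc)
```

Python itself uses arbitrary-precision integers, so the failure only shows up here when the value is forced into a 32-bit field, which is the situation old file formats and database columns are in.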
        
           | nsxwolf wrote:
           | So Christopher is an interesting case because it's the
           | longest common English first name at 11 characters. There are
           | names that seem longer but aren't, like Maximilian.
           | 
           | My first son is named Christopher, and we realized right away
           | that there is a 10 character first name limit in a ton of
           | systems almost from day 1 - calling my insurance company, the
           | automated system asked "Are you calling about...
           | Christophe?"... in a French accent, which was hilarious.
        
             | [deleted]
        
           | WalterBright wrote:
           | It was common in the 1980s for compilers to have arbitrary
           | implementation limits. The C Standard even lists minimums,
           | like 127 nesting levels of blocks, 12 pointer declarators,
           | 4095 characters in a string literal, etc.
           | 
           | My compiler started out with those, but I quickly realized
           | that there were only two actual limits:
           | 
           | 1. allocated memory
           | 
           | 2. stack size
           | 
           | It turned out to be much less code and many fewer error
           | messages to just detect out of memory and blowing up the
           | stack.
        
           | Johnny555 wrote:
           | _What the hell went through everyone 's head when they built
           | this system?_
           | 
           | That decision probably dates back decades and was based on
           | the limited amount of space on an 80-column punch card. And as
           | the system evolved past punch cards, no one bothered to
           | update the spec because "it's always been 10 characters, if
           | we change it now, something might break"
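The failure mode described in this subthread is easy to reproduce. The sketch below assumes a hypothetical 10-character first-name field in a fixed-width record; the actual layout of the New Jersey system is not known.

```python
FIRST_NAME_WIDTH = 10  # hypothetical field width, per the anecdotes above


def to_fixed_width(value: str, width: int) -> str:
    """Silently truncate or pad a value to fit a fixed-width field."""
    return value[:width].ljust(width)


stored = to_fixed_width("Christopher", FIRST_NAME_WIDTH)
print(repr(stored))

# The stored value no longer matches the name on any other document.
print(stored.strip() == "Christopher")

# A 10-letter name fits exactly, which is how the limit goes unnoticed.
print(to_fixed_width("Maximilian", FIRST_NAME_WIDTH) == "Maximilian")
```

The truncation raises no error, so the mismatch only surfaces later when the record is compared against another source of identity.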
        
           | MonkeyClub wrote:
           | > What the hell went through everyone's head when they built
           | this system?
           | 
           | Not sure what did go through, but I'm sure that "Christofer"
           | as a test string didn't.
        
         | pickledcods wrote:
         | I have the exact same problem with my European passport:
         | a maximum name length. The total of first+middle+surname must
         | be less than 30 characters.
         | 
         | Officially I am not who I am.
        
           | Bayart wrote:
           | Seems short sighted, considering it's normal in a lot of
           | European countries to have three given names (and completely
           | legal to have far more).
        
           | markstos wrote:
           | My wife's name was adjusted by the Department of Motor
           | Vehicles in Indiana because the format she wanted "wasn't
           | allowed in the computer". Of course, the name on your
           | driver's license becomes your legal name for many practical
           | purposes.
        
           | toyg wrote:
           | That's not a limitation of the European format, which is in
           | fact not agreed on in the original resolution [1]. Your
           | country is doing it wrong. In the UK, where the format was
           | set in 2014 (so still in an EU context), the limit is 60
           | (30 for surnames, 30 for the rest).
           | 
           | Interestingly, it looks like we're actually going backwards
           | on this: accented characters have actually been dropped from
           | the main passport page in recent years, to be replaced with
           | ICAO transliterations. Which is shameful, to be honest, since
           | it implies passports are now incomplete as a form of ID
           | (unless the real name is recorded somewhere else). Airline
           | lobbying clearly won the day, years ago. This seems to be a
           | UK-only thing at this point.
           | 
           | [1] https://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX...
        
             | RF_Savage wrote:
             | That's a very interesting document for future reference.
             | Decent ground truth on how to transliterate stuff like the
             | EU does.
        
             | azalemeth wrote:
             | That is fascinating - thank you. I always wondered why my
             | first names got truncated on the driving license despite
             | there being space - now I know.
        
             | wdb wrote:
             | UK likes to put your full name on bank cards, and I spent a
             | long time convincing Lloyds to put my initials + last name
             | instead.
             | 
             | Why would I want a random user in a shop to know my full
             | name?!?
        
               | tialaramex wrote:
               | > UK likes to put your full name on bank cards
               | 
               | Do they? I had three unexpired bank cards to compare.
               | 
               | My good bank issued me my non-contactless credit card,
               | which is a backup and also the card my phone "is" when I
               | use that to pay for stuff, which is most of the time.
               | That card (which is a little worn) has my first and last
               | name with middle initial.
               | 
               | My good bank also issued me a debit card very recently.
               | This card is entirely black on the front except for the
               | name of the bank and the logo of the card network.
               | However on the back it has my initials and surname.
               | 
               | The other bank I use that does card transactions issued
               | me a more traditional looking card with just my initials
               | and surname on the front.
        
             | jart wrote:
             | Wow this PDF is interesting. It explains how to canonically
             | transliterate European ligatures/diacritics, Cyrillic, and
             | even Arabic into Roman. The government clearly wants ASCII
             | to become the one charset to rule them all.
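As a rough illustration of what transliterating into unaccented Roman letters involves, here is a naive Python sketch that strips combining marks. It is not the ICAO or EU table: those define their own per-letter mappings, which this approach does not reproduce, and the function name is hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for name in ("Zoë", "Müller", "Łukasz"):
    print(name, "->", strip_diacritics(name))
```

Letters that are not built from a base letter plus a mark, such as Ł, pass through unchanged, which is one reason official schemes rely on explicit lookup tables instead.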
        
               | [deleted]
        
               | dane-pgp wrote:
               | That got me wondering what percentage of the world uses
               | the Roman alphabet, and the answers I found vary from
               | "36% of the world population"[0] to "nearly 70 percent of
               | the world's population"[1].
               | 
               | In any case, I think it's fair to say that it's a
               | plurality, if not a majority, and that the letters A-Z
               | are the most natural "core" set of glyphs from which the
               | other (upper case) Roman-derived letters are built.
               | 
               | [0] https://www.worldstandards.eu/other/alphabets/
               | 
               | [1] https://www.britannica.com/list/the-worlds-5-most-commonly-u...
        
             | twic wrote:
             | > replaced with ICAO transliterations
             | 
             | Here's the (a?) specification for the machine-readable part
             | of the passport, with transliteration and so on:
             | 
             | https://www.icao.int/publications/Documents/9303_p3_cons_en....
             | 
             | (toyg, did you originally include this link in your
             | comment, then edit it out?)
             | 
             | > passports are now incomplete as a form of ID
             | 
             | Passports are, and always have been, tools for travelling
             | internationally. Depending on a passport for general
             | identity is arguably as much of a mistake as using social
             | security numbers for identification.
        
               | spdustin wrote:
               | > Passports are, and always have been, tools for
               | travelling internationally. Depending on a passport for
               | general identity is arguably as much of a mistake as
               | using social security numbers for identification.
               | 
               | In the U.S., every state-issued ID card or drivers
               | license requires, among other things, a document proving
               | identity; a current, valid U.S. passport is considered to
               | meet that requirement. Here, a passport is a federally-
               | issued identity document.
        
               | xxpor wrote:
               | And is usually the only acceptable form of ID when
               | traveling internationally (outside of Canada and Mexico),
               | for example when trying to enter a bar.
        
           | anthk wrote:
           | I call it bullshit because a lot of Basque names+surnames
           | could pass the 30 char limit with ease.
        
       | jstanley wrote:
       | According to the author's tweet, the customer sued _and won_ :
       | https://twitter.com/edent/status/1450731852302532608
       | 
       | I wonder why this fact is absent from the blog post?
       | 
       | This is a baffling ruling. I don't think an inability to support
       | funny characters should be considered a GDPR violation. Anyone
       | can put any characters they want in their name, and everyone is
       | not breaking the law just because Unicode doesn't have made-up
       | squiggles.
        
         | orf wrote:
         | For a lot of the world the characters you are typing with are
         | "funny characters".
        
           | adolph wrote:
           | At least 1.4B of them.
           | 
           | https://en.wikipedia.org/wiki/Demographics_of_China
        
         | xyzzyz wrote:
         | First, just to be clear, these are not by any means "funny
         | characters". These are perfectly normal characters in languages
         | used by these people.
         | 
         | Second, if you were trying to make a point that people can put
         | Unicode emoji in their names, well, try doing that on a birth
         | certificate and tell me what the registration office tells you.
         | If you successfully manage to get an actual "funny character"
         | in your legal name, let me know.
        
       | cupcake-unicorn wrote:
       | Good on this consumer for dragging the bank through this. I'm
       | sure the consumer probably got crap from friends/family about why
       | they were doing this but this is sheer laziness on behalf of the
       | bank and they deserve to be dragged through this to force them to
       | uphold reasonable tech standards for all their customers. Glad
       | that the EU has this option, I'm in the US and would use it more
       | for stuff here :/
        
         | jimmaswell wrote:
         | Not spending millions of dollars to appease people who are
         | disproportionately upset over such a minor thing as missing
         | accent marks is sheer laziness to you?
        
       | Muromec wrote:
       | oh, now I wonder if I can cite GDPR (and Dutch government for
       | their BRP thing) and ask my bank to spell my name in a proper
       | Ukrainian Cyrillic the same way it is done in my id.
        
       | GoblinSlayer wrote:
       | Just store UTF-8 in base64 encoding, it's compatible with ebcdic.
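       | 
       | (A rough sketch of how that might look, assuming Python and its
       | built-in cp500 codec. The helper names are made up for
       | illustration, and this says nothing about how any real bank
       | stores names.)
       | 

```python
import base64


def to_ebcdic_field(name: str) -> bytes:
    """Hypothetical helper: UTF-8 -> base64 text -> EBCDIC (code page 500) bytes."""
    b64_text = base64.b64encode(name.encode("utf-8")).decode("ascii")
    # The base64 alphabet (A-Z, a-z, 0-9, +, /, =) exists in cp500,
    # so this encode step cannot fail.
    return b64_text.encode("cp500")


def from_ebcdic_field(field: bytes) -> str:
    """Hypothetical helper: reverse of to_ebcdic_field."""
    b64_text = field.decode("cp500")
    return base64.b64decode(b64_text).decode("utf-8")


stored = to_ebcdic_field("Đặng Özcan")
print(from_ebcdic_field(stored))
```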
        
       | erk__ wrote:
       | Time to break out UTF-EBCDIC!
       | 
       | https://en.wikipedia.org/wiki/UTF-EBCDIC
        
         | justin_oaks wrote:
         | Interesting. It makes a bad system worse to try to make it
         | better.
        
         | LinAGKar wrote:
         | Thanks, I hate it
        
         | [deleted]
        
         | Kim_Bruning wrote:
         | truly terrifying!
        
       | tyteen4a03 wrote:
       | This ruling is interesting. As a person with names in Chinese, I
       | could technically force my bank to support UTF-8 simply by saying
       | I do not wish to be known as my English name, which is the
       | phonetic spelling of my Chinese one.
       | 
       | Now since I'm Hongkongese where my English legal name is as legal
       | as the Chinese one the law might be different but for Chinese
       | people though...
        
         | caf wrote:
         | Same for those with Arabic, Persian, Korean, Thai, Russian ...
         | names
        
           | egeozcan wrote:
           | Also Greek, Turkish, German, Romanian... Is there any
           | language other than English that can be written 100% by ASCII
           | characters?
           | 
           | If you have special letters in your name, you'll have a
           | different name in another country without that letter. My
           | surname is supposed to be Özcan, but it's Ozcan or Oezcan in
           | many official documents. Don't even let me start with the
           | "Turkish iIiI problem"...
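           | 
           | (For anyone who hasn't met that problem: Turkish has a dotted
           | and a dotless i, each with its own capital, and
           | locale-independent case mapping gets both wrong. A small
           | sketch in Python; the turkish_* helpers are hypothetical, not
           | a library API.)
           | 

```python
DOTTED_CAPITAL_I = "\u0130"  # capital I with dot above
DOTLESS_SMALL_I = "\u0131"   # small dotless i


def turkish_upper(text: str) -> str:
    """Hypothetical helper: map the Turkish pairs first, then use the default rules."""
    return text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I").upper()


def turkish_lower(text: str) -> str:
    """Hypothetical helper: inverse of turkish_upper for the two i pairs."""
    return text.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I).lower()


# Default (locale-independent) mapping loses the distinction:
print("i".upper() == "I")                    # True, a Turkish reader expects the dotted capital
print(DOTTED_CAPITAL_I.lower() == "i")       # False, result is "i" plus a combining dot
print(turkish_lower(turkish_upper("iyi")))   # round-trips
```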
           | 
           | I mean it's not totally unrecognizable but it's a different
           | name nevertheless.
           | 
           | I was talking to a Romanian colleague recently and she told
           | me that most of the country uses some US keyboard layout
           | instead of Romanian and cannot type Romanian letters, so
           | people have 2 names even in their home country.
        
             | jhbadger wrote:
             | > Is there any language other than English that can be
             | written 100% by ASCII characters
             | 
             | Latin. Yes, a lot of textbooks add some diacritics to show
             | pronunciation (as Latin wasn't 100% consistent between
             | spelling and pronunciation), but the Romans themselves
             | didn't use them.
        
               | N19PEDL2 wrote:
               | So they will never face this problem in Vatican City [0].
               | Although I guess that Vatican institutions are not
               | required to comply with GDPR.
               | 
               | [0] https://www.reddit.com/r/todayilearned/comments/5vcd2
               | h/til_t...
        
               | cabalamat wrote:
               | > the Romans themselves didn't use them
               | 
               | They did sometimes:
               | https://en.wikipedia.org/wiki/Apex_(diacritic)
        
             | samus wrote:
             | Dutch, although they insist on treating the ligature IJ as
             | a letter on its own. It's even part of the sort order in
             | dictionaries and telephone books.
             | 
             | Also, probably thanks to Dutch colonialism, the unmodified
             | Latin alphabet is the official writing system in Malaysia,
             | Indonesia, Brunei, and Singapore, and is used to write the
             | Malay and Indonesian languages.
             | 
             | It is also used as the base for romanization systems for
             | languages that don't have a latin-style alphabet already.
             | These are often designed to stick as close to plain latin
             | as possible. Apart from academia and language teaching, a
             | few of them are actually used by governments to render
             | names in latin characters for passports, street signs etc.
             | 
             | Personally, I think that the plain latin alphabet is quite
             | limited and that extensions are necessary. Accents,
             | macrons, circumflexes, etc. are certainly annoying to
             | input, but certainly not worse than inventing completely
             | new letters or using digraphs for everything. I rather
             | think that our educational systems don't teach well how to
             | handle them. We don't have to pronounce them all correctly,
             | and certainly can't be expected to, but typing them is not
             | impossible at all!
        
             | anthk wrote:
             | Basque.
        
             | int_19h wrote:
             | > Is there any language other than English that can be
             | written 100% by ASCII characters
             | 
             | Indonesian and Malay languages.
        
             | clankyclanker wrote:
             | Not even English can be correctly written in (lower) ASCII,
             | it has far too many borrowed words, like naïve and résumé.
             | Say nothing of archaic spellings or ligatures, like
             | encyclopædia, or ruﬄe. It's almost surprising ASCII was
             | as successful as it is.
        
               | egeozcan wrote:
               | "Not good enough, but used everywhere so nothing we can
               | do" is the worst enemy of good, even worse than
               | perfectionism.
        
               | mastax wrote:
               | Worse is Better:
               | https://www.dreamsongs.com/RiseOfWorseIsBetter.html
        
               | retrac wrote:
               | Not _that_ surprising. It was a big improvement over the
               | 6-bit encodings that came before. All caps! And it was
               | broadly assumed from the 70s onward that the 8th bit
               | extended to a regional character set. Even my 1980s
               | Canadian Apple  //e supported displaying French
               | characters, in some variant of Latin-1 I think. The easy
               | extensibility of ASCII on 8-bit-byte systems was a big
               | part of its popularity (and eventually its greatest curse
               | when all the divergent extensions started meeting
               | online).
               | 
               | Or just consider how the Japanese put up with computing
               | in pure katakana (their writing system's equivalent of
               | all caps) well into the 1980s.
        
               | coldacid wrote:
               | Most English ligatures (pretty much all other than æ and
               | œ) are simply artifacts of formatting, rather than
               | actual letters. With that in mind, ligature versions of
               | fl and ffl (and the like) are unnecessary.
        
               | PaulDavisThe1st wrote:
               | There's a person who was involved for years in the
               | maintenance of the venerable CSound audio programming
               | language who specifically changed his last name to ffitch
               | (with a ligature, and no leading capital). I don't know
               | for certain, but I think it was intended to
               | provoke/test/trouble weak text representation in
               | software.
        
             | gpderetta wrote:
             | thinking that ASCII is enough for English is a bit naïve.
        
               | xxpor wrote:
               | Spelling it that way is a fantastic way to look
               | pretentious.
        
               | WorldMaker wrote:
               | It's at least a little more elementary than that. There
               | are many, many school teachers of young children that
               | will tell you that words like coöperative lost a lot of
               | useful disambiguation power when English dropped support
               | for such syllabic markers. I see words written like that
               | and I think of grade school, which seems like the
               | opposite of pretension.
        
               | gpderetta wrote:
               | I could have spelled in the naive way, but then the joke
               | wouldn't have worked.
        
         | Eduard wrote:
         | https://en.wikipedia.org/wiki/Languages_of_the_European_Unio...
        
         | tdeck wrote:
         | When you write your name in Chinese characters, how do people
         | know whether to pronounce it in Cantonese or Mandarin (or some
         | other Chinese language)? Does that ambiguity ever come up?
        
           | samus wrote:
           | I guess it depends on the language they are using at the
           | specific moment. People in Hong Kong are probably going to
           | pronounce it in Cantonese. Border guards in Beijing will
           | probably pronounce it in Mandarin.
           | 
           | There is actually no good way to tell whether a name is
           | Mandarin or Cantonese, except _maybe_ by looking at the place
           | of birth or residence. Ironically, the romanized form might
           | give clues as there are many different romanization systems
           | in use.
        
             | tyteen4a03 wrote:
             | There are some limits but other places (especially China)
             | have names that I would find unusual, and I can sort of
             | guess that way. It's definitely not as sure-fire as
             | looking at the romanization, though.
        
           | bialpio wrote:
           | I'd expect that people who speak Cantonese would use
           | Cantonese, and people who speak Mandarin would use Mandarin
           | pronunciation. When you see a name "Peter", how do you know
           | which pronunciation to use - Dutch, German, Norwegian,
           | English, or other (there's a couple more)? :-)
        
         | consp wrote:
         | You might be able to but I wonder if you want to. (Considering
         | this is in Western Europe, Belgium) Most of the people will not
         | be able to convert the characters into something they can
         | process, even if they wanted to. While maybe legal, it would
         | speed your processing up a lot to use the phonetic writing in
         | the extended latin character set.
         | 
         | The diacritical marks however have some familiarity and are in
         | common use.
         | 
         | On a sidenote: lots of airlines also have this issue where an
         | accent or other diacritic will remove the character completely
         | making your name different from the one in your passport. Could
         | be quite annoying.
         | 
         | edit: thought it was in the Netherlands but it was in
         | Flanders/Belgium.
        
           | gambiting wrote:
           | In all of EU you can have your name spelled with or without
           | the diacritics and it's equally valid, I have official ID
           | documents with my name with and without the diacritics and
           | it's not a problem in the slightest. In fact when my son was
           | born, we decided to keep the diacritics off his first
           | passport (he has dual nationality) but keep them in his second
           | passport for the country where the diacritics came from
           | originally.
        
           | irishsultan wrote:
           | It's in Belgium actually.
        
             | consp wrote:
             | Yes, Noticed that later. Though my point still applies.
        
           | tdeck wrote:
           | Southwest Airlines doesn't even support having a hyphen in
           | your name, as if that's some exotic character and not
           | something fairly common in English surnames.
        
         | nroets wrote:
         | Even the Dutch have words that cannot be encoded in
         | EBCDIC[1]. And I suppose many Dutch have names like André.
         | 
         | [1] https://blogs.transparent.com/dutch/tremas-e-i-u-o-a/
        
           | erk__ wrote:
           | I assume that code page 37 [0] is used in the Netherlands, so
           | the problem is likely something other than the common diacritics.
           | 
           | Edit: I just saw it was in Belgium, but the same should apply
           | there. Although they seem to be using a variant of code page
           | 37 called code page 500 (also in [0]).
           | 
           | [0] https://en.wikipedia.org/wiki/Code_page_37
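           | 
           | (A quick way to check what those two code pages can hold,
           | assuming Python's built-in cp037 and cp500 codecs match what
           | the mainframe uses. The encodable() helper and the sample
           | names are mine, purely for illustration.)
           | 

```python
def encodable(text: str, codec: str = "cp500") -> bool:
    """Hypothetical helper: True if every character exists in the given code page."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Western European accents are covered by both code pages...
for name in ["André", "Zoë", "François"]:
    print(name, encodable(name, "cp037"), encodable(name, "cp500"))

# ...but letters outside that repertoire are not.
for name in ["Łukasz", "Đặng", "Őry"]:
    print(name, encodable(name, "cp037"), encodable(name, "cp500"))
```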
        
           | Deukhoofd wrote:
           | And considering the ruling was in Belgium, where half the
           | population is French speaking I'd expect a lot of diacritics
           | to occur.
        
       | CWuestefeld wrote:
       | Question: does the bank have the right to say, "I'm sorry Mr
       | potential customer, but we can't meet your requirements so are
       | unable to give you an account"? Or is it essentially required
       | that everyone doing business must do things like keep all
       | computer systems modernized?
        
         | hyperman1 wrote:
         | Belgian law, if I am not mistaken, requires that every Belgian
         | (European?) Inhabitant has access to a basic package. The bank
         | can deny access to credit etc, but an account and a (normal
         | debit) bank card are a right.
        
         | PeterisP wrote:
         | I wanted to say that they should not have that right, however,
         | looking at GDPR, perhaps it's not forbidden after all.
         | 
         | However, it's worth noting that it's not just a single
         | exceptional person - Belgium has accented letters in two of
         | their three official languages and names with accents are
         | reasonably common, so if you tried that, you would have to
         | discard many customers, and also those customers would be
         | overwhelmingly from the french-speaking part of the country so
         | that might be treated as explicit discrimination targeting the
         | french-speaking minority community.
        
       | retrac wrote:
       | In slightly related news, Ontario just last year finally allowed
       | people to use accented characters in their official legal names,
       | birth certificates, and so on. French has been an official
       | language in Ontario for over half a century. The reason it wasn't
       | possible until recently was entirely technical. The systems were
       | limited by ASCII or, yeah, possibly EBCDIC. (I don't have the
       | details.) Still no guidance on how the average government clerk
       | with the very common US-style layout is supposed to type them in,
       | though.
       | 
       | https://news.ontario.ca/en/release/58538/ontario-introduces-...
        
         | gspr wrote:
         | What I don't understand when I hear stories like these is _why
         | the hell not just use someone else's solution_? Surely
         | neighboring Quebec had this sorted out ages ago - why not just
         | duplicate whatever they did? Problem solved in no time.
         | 
         | Going further, I wonder why for example the EU doesn't try to
         | get schemes going that facilitate the copying of IT solutions
         | between member states. Why does every country have to reinvent
         | the wheel?
        
           | 908B64B197 wrote:
           | Someone told me they solved it quite simply and elegantly:
           | 
           | There's a law that says: "For a computer system to be
           | purchased by the government it must work in French".
           | 
           | Implementation is then left to potential sellers.
        
           | [deleted]
        
           | coldacid wrote:
           | I think even the Quebecois hate the French-Canadian keyboard
           | layout. Certainly it's incredibly hated here in Ontario.
        
             | jackjeff wrote:
             | I grew up in France and I hate the PC French keyboard with
             | Alt Grrr with a passion.
        
             | kps wrote:
             | https://en.wikipedia.org/wiki/CSA_keyboard is just awful --
             | it uses 'right Control' as a graphic-shift modifier for
             | most characters, instead of AltGraph/Option. (It _also_
             | uses AltGraph/Option for some common characters like []<>
             | and for French «».) You can't find a better example of
             | government committee work anywhere.
        
             | toyg wrote:
             | You mean the _Québécois_, surely
             | 
             | (sorry, couldn't resist - on topic for the thread...)
        
           | pas wrote:
           | Corruption, and putting too big emphasis on having their own
           | system so they are not dependent on someone else.
           | 
           | Hopefully we'll move past these eventually.
        
         | coldacid wrote:
         | There /is/ a US-International layout which uses both AltGr
         | style and compose style entry of accented characters, although
         | it's not the best. I actually made my own customized version of
         | US-International for Windows in order to support more options
         | for accented characters and certain extended Latin characters
         | used in old and middle English.
        
           | bawolff wrote:
           | Keyboard layout (as an input method) has nothing to do with
           | which characters can be encoded (stored).
        
           | toyg wrote:
           | I really wanted to use US-International, but the way it
           | breaks quotes and double-quotes is so bad, I ended up with
           | similar hacks in Windows (via AutoHotKey). It's one of those
           | things where Apple really got it right, and I don't
           | understand why MS cannot adopt similar solutions to what the
           | Macs do.
        
             | ulucs wrote:
             | AHK is really a hack, but Windows has the best software for
             | keyboard adaptation. I used this to create a custom layout
             | that includes Turkish and Greek characters which helps me a
             | lot
             | 
             | https://www.microsoft.com/en-
             | us/download/details.aspx?id=102...
        
               | coldacid wrote:
               | Exactly what I used, too.
        
             | coldacid wrote:
             | You might be interested in my US-International alternative
             | layout. Back when I created it, I also put it up on
             | BitBucket[0] for others, and wrote up some details too[1].
             | It eschews dead keys for AltGr style composition so there's
             | no need to double-tap any of the keys used for diacritics.
             | 
             | [0]: https://bitbucket.org/coldacid/usintalt/src/master/
             | 
             | [1]: https://web.archive.org/web/20160327005949/https://chr
             | is.cha...
        
             | ynik wrote:
             | The "(no dead keys)" variant to US-International solves
             | that problem. Windows unfortunately doesn't have it out of
             | the box (Ubuntu does). But you can make your own layout
             | with "Microsoft Keyboard Layout Creator".
             | 
             | And plenty of people have already made "United States-
             | International (no dead keys)" for Windows, so if you don't
             | want to figure out the MS tool, you can just
             | download+install a layout from GitHub.
        
             | jackjeff wrote:
             | Amen. For someone who programs every day but frequently has
             | to type in French, Spanish or German I could not agree
             | more. The Mac is awesome at typing everything.
        
           | Bayart wrote:
           | I use the international US layout. It's ironically much
           | better for writing my native French than the regular AZERTY
           | layout.
        
       | cabalamat wrote:
       | If EBCDIC is incompatible with GDPR, then so are machine-readable
       | passports as the format only allows ascii letters A-Z.
       | https://en.wikipedia.org/wiki/Machine-readable_passport#Name...
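       | 
       | (To make the A-Z restriction concrete, here is a simplified
       | sketch of squeezing a name into that alphabet. The mapping
       | table is a small illustrative subset of the kind of
       | transliterations ICAO Doc 9303 recommends, not the full table,
       | and to_mrz_name() is a made-up helper. Check the spec before
       | relying on any of it.)
       | 

```python
import unicodedata

# Illustrative subset only; assumed from the ICAO-style recommendations.
SPECIAL = {"Ä": "AE", "Ö": "OE", "Ü": "UE", "Å": "AA", "Æ": "AE", "Ø": "OE"}


def to_mrz_name(name: str) -> str:
    """Hypothetical helper: reduce one name component to A-Z plus '<' filler."""
    out = []
    for ch in name.upper():  # note: upper() already turns sharp s into "SS"
        if ch in SPECIAL:
            out.append(SPECIAL[ch])
        elif ch in " -":
            out.append("<")  # spaces and hyphens become the filler character
        else:
            # Split off accents, then keep only plain A-Z.
            # Letters with no decomposition are silently dropped here.
            decomposed = unicodedata.normalize("NFKD", ch)
            out.append("".join(c for c in decomposed if "A" <= c <= "Z"))
    return "".join(out)


for sample in ["Müller", "André", "Jean-Luc", "Özcan"]:
    print(sample, "->", to_mrz_name(sample))
```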
        
         | dane-pgp wrote:
         | That seems like a sufficiently different scenario to me (as a
         | non-expert) that I think a court could reasonably reach a
         | different conclusion.
         | 
         | Firstly, if a government decides to "comply" with the GDPR by
         | just seizing and revoking your passport, you might not have a
         | case against them as the granting of passports could be
         | considered a Royal Prerogative (or an equivalent under other
         | systems of government) and thus non-justiciable. You might try
         | to claim this is discrimination, but I don't think that "non-
         | ASCII characters in name" is a legally protected class, and of
         | course anyone could change their name to have or avoid non-
         | ASCII characters.
         | 
         | Also, if the format is designed to be machine-readable, then
         | arguably the "accuracy" of your name on the passport has to be
         | judged by the machine, not by you as the holder of that name.
         | Moreover, the format is agreed as a consequence of an
         | international treaty, which again might put it beyond the
         | jurisdiction of a domestic court, and if your passport was
         | declared invalid by a nation you were attempting to enter
         | because it contained non-standard characters, that is not
         | something that a domestic court could provide a remedy to.
        
       | dhosek wrote:
       | One of the fun things about EBCDIC is that 370 assembler has
       | opcode-level support for converting an EBCDIC-encoded numeric
       | string into an integer (and maybe the other way around too, it's
       | been a while). This is one of two things I remember about my now-
       | ancient 370 assembler knowledge. The other is that there is no
       | built-in support for maintaining a call stack. It is up to each
       | subroutine to handle this and there were some weird declarations
       | around this to indicate whether a subroutine was reentrant, the
       | definition of which escapes me now.
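       | 
       | (For the curious, a sketch of what that conversion amounts to,
       | written in Python rather than assembler. I believe the
       | instructions involved are PACK and CVB, but treat that as an
       | assumption. This only handles unsigned fields, and
       | zoned_to_int() is a made-up name.)
       | 

```python
def zoned_to_int(field: bytes) -> int:
    """Hypothetical helper: convert unsigned EBCDIC digit bytes to an int.

    EBCDIC digits 0-9 are the bytes 0xF0-0xF9, so the low nibble of
    each byte is the digit value. Sign zones are not handled here.
    """
    value = 0
    for byte in field:
        zone, digit = byte >> 4, byte & 0x0F
        if zone != 0xF or digit > 9:
            raise ValueError(f"not an EBCDIC digit: {byte:#04x}")
        value = value * 10 + digit
    return value


raw = "1234".encode("cp037")
print(raw.hex(), zoned_to_int(raw))
```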
       | 
       | And people shouldn't criticize EBCDIC too much, after all Windows
       | still dumps a lot of crap in legacy 8-bit coding that can cause
       | applications to break (there was a recent post on HN about
       | someone being unable to run the IntelliJ debugger because of an
       | accent in their username). At least EBCDIC is clear about its
       | limitations.1
       | 
       | * * *
       | 
       | 1. I'd be remiss if I didn't point out one other EBCDIC
       | weirdness: It has _two_ vertical bars, | and ¦ which always
       | caused complications in translations between EBCDIC and ASCII.
       | IIRC, | was the more common symbol in EBCDIC coding but some
       | converters wanted to translate | to | instead (or maybe it was
       | the other way around--the last time I did IBM big metal was 30
       | years ago).
        
         | toyg wrote:
         | _> there was a recent post on HN about someone being unable to
         | run the IntelliJ debugger because of an accent _
         | 
         | That's not Windows, that's JVM weirdness. Using the right
         | calls, this sort of thing has been fine in Windows for some
         | time.
        
           | dhosek wrote:
           | It's JVM weirdness on Windows. This isn't a problem on Linux
           | or MacOS where file paths won't be in some arbitrary
           | encoding. This ends up biting a lot of other cross-platform
           | software as well and is why Rust has OSString, but for code
           | in, say, C/C++ it ends up being a major pain point (the TeX
           | development team often end up dealing with this sort of
           | issue).
        
             | breakingcups wrote:
             | File paths on Windows aren't in some arbitrary encoding
             | either?
        
             | howinteresting wrote:
             | It's actually on Linux where file paths can be just about
             | any byte sequence. They're restricted to be UTF-8 on
             | APFS/HFS+ (with some complicated case folding rules) and
             | UCS-2 on NTFS.
        
               | colejohnson66 wrote:
               | UCS-2? I thought it was UTF-16?
        
               | int_19h wrote:
               | You can have unmatched surrogates in the name, for
               | example.
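               | 
               | (A small illustration, assuming Python's strict
               | utf-16-le codec as the validity check: a name made of
               | 16-bit units can be a perfectly storable sequence and
               | still not be valid UTF-16. is_valid_utf16() is a
               | made-up helper.)
               | 

```python
def is_valid_utf16(units: bytes) -> bool:
    """Hypothetical helper: True if the little-endian 16-bit units are well-formed UTF-16."""
    try:
        units.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False


# A character outside the BMP is stored as a surrogate pair: valid UTF-16.
paired = "\U0001F600".encode("utf-16-le")

# "A", then a lone high surrogate (0xD83D), then "B": three 16-bit units
# that a UCS-2 style API will accept, but not valid UTF-16.
lone = b"A\x00" + b"\x3d\xd8" + b"B\x00"

print(is_valid_utf16(paired), is_valid_utf16(lone))

# Python can still carry such a name around if asked to pass surrogates through.
as_text = lone.decode("utf-16-le", errors="surrogatepass")
print(len(as_text))
```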
        
       | andrewaylett wrote:
       | Sounds like the perfect use-case for UTF-7?
       | https://en.wikipedia.org/wiki/UTF-7
       | 
       | No, I'm not _entirely_ serious.
        
         | krallja wrote:
         | You probably want UTF-EBCDIC instead:
         | https://news.ycombinator.com/item?id=28987256
        
       | Kiro wrote:
       | In a similar case in Ireland they ruled in favor of the data
       | controller:
       | 
       | > Following an eight-month investigation, the Data Protection
       | Commission (DPC) have ruled that individuals do not have an
       | 'absolute right' to have their names spelled with fadas.
       | 
       | https://ireland.bloomsburyprofessional.com/blog/no-right-to-...
        
         | DoubleGlazing wrote:
         | That was an abomination of a ruling.
         | 
         | I find it amazing that the Data Protection Commissioner
         | basically went against the constitution which clearly states
         | that Irish is the first language of Ireland.
        
       | amelius wrote:
       | Try booking a flight with diacritics in your name. Same
       | situation.
        
         | PeterisP wrote:
         | What's funny is when the system explicitly requires "Write your
         | name exactly as in the passport" and then fails validation by
         | requiring unaccented latin letters only, so it's
         | impossible to fulfill both conditions at the same time.
        
           | exporectomy wrote:
           | They might have jumped the gun since new passports can't have
           | accents but older ones might.
        
       | a3w wrote:
       | Someone I know told me that he got a German passport, but
       | absolute garbage as his name in there because his actual name is
       | in Arabic.
        
       | edwinjm wrote:
       | Heh, if you're looking for a good example of Technical Debt...
       | 
       | Yes, already in 1995 Unicode was an established standard (even
       | Windows 95 started to support it). The bank should have known it
       | would be a requirement in the future.
        
         | coldacid wrote:
         | Unicode's old enough that Windows NT was built to work with it
         | natively. In fact, all the "ANSI" Windows API calls in NT were
         | just wrapper functions around the Unicode equivalents handling
         | Unicode/code-page conversions. And this was 1993.
        
           | WorldMaker wrote:
           | Yup, in fact the biggest compatibility headaches in NT
           | _today_ stem from how early they adopted it: they made some
           | assumptions about UCS-2 that turned out to be wrong and had
           | to shoehorn in UTF-16 support that mostly works (except when
           | it falls over a cliff). Meanwhile Linux and others waited for
           | UTF-8 to exist and that's become the internet/web's major
           | standard as well and there are some small papercuts
           | interoperating between UTF-16 and UTF-8 that with today's
           | hindsight shouldn't have been so annoying or necessary.
           | Windows _might_ have been better off waiting for UTF-8 itself,
           | except that Windows made the right architectural decision for
           | the time when it made that decision and could not have
           | suspected UTF-8 to turn up only a few years later.
        
             | anthk wrote:
             | > Meanwhile Linux and others waited for UTF-8
             | 
             | UTF-8 was already a thing in Plan9.
        
               | WorldMaker wrote:
               | UTF-8 was first presented to IETF at Usenix in January
               | 1993. NT 1.0 shipped June 1993 and had been in
               | development for several years before that.
               | 
               | The famous "Plan 9 implemented UTF-8 first" thread's most
               | specific date mentioned was September 1992, which is only
               | three months more lead time before the standardization
               | notice in January 1993.
               | 
               | Are you suggesting the NT Kernel team should have somehow
               | better paid attention to a not-yet-standard from a
               | research laboratory Operating System? It still probably
               | would have been a couple years too late in the
               | design/architecture process even if they had, given the
               | release date in June 1993.
        
       | SavantIdiot wrote:
       | In 2018 I inherited a rather large website, and have been slowly
       | fixing it to support unicode because many of the users want to
       | use their real names, not an English-hack version of it.
       | 
       | It is WAY more complicated than I thought it would be. There is
       | so much code that manipulates strings that is not unicode aware.
       | 
       | I've fixed the simple things, like places where the user name is
       | displayed, etc., but the email subsystem is a train wreck and
       | there are still places in the database where I couldn't
       | retroactively fix old entries. Going on 4 years fixing this!
       | 
       | But EBCDIC? Damn, opportunity to make lots of $$ here fixing
       | people's code. I had friends that made bank on Y2K prep in the
       | mid-90's.
        
         | kccqzy wrote:
         | > email subsystem is a train wreck
         | 
         | Actually, which email system supports email addresses with non-
         | ASCII user names? And which additionally supports IDN domains?
        
           | SavantIdiot wrote:
           | Actually, there are other parts to the email subsystem
           | besides the POP/SMTP interface, such as code that dynamically
           | generates the subject and body, and they have lots of regex
           | and string manip code in them.
        
       | po1nt wrote:
       | Imagine you maintain this system and somebody named X Æ A-XII
       | Musk tries to register.
       | 
       | Jokes aside. I know a person named exactly like me just with a
       | small diacritic difference. I realize they use secondary
       | identifiers but this is identity theft waiting to happen.
        
         | kayodelycaon wrote:
         | I go by Kayode online (Kay (actually Que) oh Deh), which isn't
         | the African name Kayode (Ki-oh-Day or Ki-oh-Dee).
         | 
         | The number of places that don't support diacritics is
         | absolutely mind-boggling.
        
       | cannabis_sam wrote:
       | I want to suggest that businesses should be penalized somehow for
       | using "ancient" technology, but then on the other hand you have
       | roman concrete...
        
         | mminer237 wrote:
         | You shouldn't penalize stuff for being old. You should penalize
         | stuff for being bad. Not being able to accurately store and
         | represent customers' names is the problem here, not that it's
         | old.
        
           | cannabis_sam wrote:
           | Yeah, that was my point..
        
       | rocqua wrote:
       | So, the part of the GDPR the bank was unable to comply with here
       | is the "right to rectification".
       | 
       | That suggests that the bank made a 'mistake' when it recorded the
       | name in its system as well as it could. I don't think that should
       | count as a mistake. The information in the bank's system is as
       | good as it could be, so there is nothing to rectify.
       | 
       | It feels weird to me when privacy legislation turns out to
       | require supporting UTF-8. I think something in the legal process
       | went wrong here.
        
         | dTal wrote:
         | Yes. Provided there is a defined protocol when handling
         | unrepresentable characters in the system, like é -> e, the
         | information is _not_ "inaccurate". It is merely imprecise. You
         | could imagine explicitly putting ? instead, which would carry
         | strictly less information - but would still be "accurate" in
         | the sense that it doesn't assert a falsehood.
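A folding rule of the kind described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fold_to_ascii` is a hypothetical helper, not anything the bank is known to use, and the choice of `?` as the placeholder is arbitrary. It also shows why the result is lossy: two different names can fold to the same stored value.

```python
import unicodedata


def fold_to_ascii(name: str, placeholder: str = "?") -> str:
    """Hypothetical folding rule: drop combining marks, then replace
    any character that is still outside ASCII with a placeholder."""
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Letters with no decomposition (e.g. U+00F8) fall back to the placeholder.
    return "".join(ch if ord(ch) < 128 else placeholder for ch in stripped)


for original in ("Zo\u00eb", "Zoe", "S\u00f8ren"):
    print(repr(original), "->", repr(fold_to_ascii(original)))
```

Whether a stored value that can no longer be told apart from a different name counts as "accurate" is exactly the point under dispute in this subthread.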
        
         | utucuro wrote:
         | Considering that the P in GDPR stands for Protection, not
         | Privacy, the scope of the legislation is significantly broader.
         | If we look at the ISO standard for information security, ISO
         | 27001, apart from the confidentiality and availability of
         | data, it considers integrity as one of the three things to
         | consider when classifying data and similarly, the GDPR expects
         | PID to be handled in a manner that assures correctness at the
         | very least.
         | 
         | In the specific case of this bank, like everyone else, they
         | were expected to update systems unable to comply with the
         | legislation within the grace period and yet it seems that they
         | were unwilling or unable to update or replace a system that is
         | incapable of achieving data integrity in a matter as basic as
         | the name of a customer.
        
           | SpicyLemonZest wrote:
           | But it's just not true that everyone else was expected to do
           | this. Credit card names are still running on ASCII! (I'm
           | also, to be frank, highly skeptical that the court would have
           | taken such a hard line if the customer had been complaining
           | that Chinese characters aren't supported or that his Arabic
           | name should be written right to left.)
        
         | dmitriid wrote:
         | > It feels weird to me when privacy legislation turns out to
         | require supporting UTF-8.
         | 
         | No. It requires you to store data correctly. And in the case of
         | a _bank_ storing data incorrectly could have potential
         | ramifications (think two different people, one with diacritics
         | and one without).
         | 
         | The law doesn't care whether you use UTF-8 or a manually
         | written translation table, or a 15th-century printing press.
        
           | Volundr wrote:
           | > And in the case of a bank storing data incorrectly could
           | have potential ramifications (think two different people, one
           | with diacritics and without).
           | 
           | Surely this situation wouldn't cause any issues though right?
           | If they are relying on names as unique identifiers, they've
           | got far bigger problems than a user's name being spelled
           | incorrectly.
        
             | gspr wrote:
             | Issues? Depends on what you mean. Maybe not practical
             | issues, but I for sure would be offended if my bank refused
             | to use my actual name.
             | 
             | I find this whole ordeal delightful, and applaud the
             | intention of the GDPR and the ways the courts upheld it in
             | this case.
        
             | TeMPOraL wrote:
             | No, but banks interact with people - both customers and
             | employees. People _interpret_ names. Errors like this could
             | be used to, for example, impersonate someone (by tricking
             | an employee), or deny someone service (e.g. via a clerk who
             | behaves like a zombie, a protein peripheral of the bank's
             | computer system).
        
       | oldie wrote:
       | Remember all the ghastliness with code pages that sprang up
       | around Ascii, such that systems configured for different
       | languages didn't agree about what characters most code points
       | were supposed to represent? Well, good news: Ebcdic supports
       | that. For example, here's a code page that can represent all the
       | characters you're likely to need in French:
       | 
       | https://en.everybodywiki.com/EBCDIC_297
       | 
       | So, to be unable to represent à, é, ô, ù, ç, etc., the application
       | would have to be locked into not just Ebcdic but also a
       | particular Ebcdic code page that seems unsuited to the locale
       | where the program was running.
       | 
       | Admittedly, an Ebcdic system will have difficulty representing
       | French, Greek and Russian names at the same time, because there's
       | no code page that encodes all the necessary characters.
       | 
       | An application hard-coded to US-Ascii would also be unable to
       | support accented characters, and an application using any one
       | Ascii code page (as opposed to Unicode) would have the same
       | difficulty representing French, Greek and Russian names at the
       | same time. Which is why, in 2021, we don't do that.
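The code-page limitation described above can be checked with the codecs that ship with Python. This is a hedged sketch: `cp037` and `cp500` are Latin-1-repertoire EBCDIC code pages and `cp875` is a Greek EBCDIC code page in Python's standard codec set; they stand in for whatever code page the bank actually uses, which the thread does not say. The EBCDIC 297 page linked above is not used here because it is assumed not to be available as a built-in Python codec. The sample names and the `can_encode` helper are made up for illustration.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Illustrative names only: one French, one Greek.
names = {"French": "H\u00e9l\u00e8ne", "Greek": "\u0396\u03c9\u03ae"}

for codec in ("ascii", "cp037", "cp500", "cp875", "utf-8"):
    results = {label: can_encode(name, codec) for label, name in names.items()}
    print(f"{codec:>6}: {results}")
```

A single-byte code page can cover one script family at a time; only the Unicode encoding accepts both names in the same field.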
        
       | asdfe8988 wrote:
       | >EBCDIC is an ancient (and much hated) "standard" which should
       | have been fired into the sun a long time ago. It baffles me that
       | it was still being used in 1995 - let alone today.
       | 
       | I want a pony.
        
         | BBC-vs-neolibs wrote:
         | That's my cue:
         | 
         | "Does this mean that Z[?][?][?][?][?][?][?]a[?][?][?][?][?][?][
         | ?][?][?][?]l[?][?]g[?][?][?][?]o can finally open a bank
         | account?"
        
       | theragra wrote:
       | My friend often has issues with flying, because his name is
       | written as Maksims, and old booking systems think Ms at the end
       | means missis. (He is male.)
       | 
       | Crazy shit everywhere in these old systems.
        
       | dirtyid wrote:
       | Is there precedent for legal requirements for diacritic support?
       | Does it extend to all Latin diacritics or only popular regional
       | subsets? I remember a Chinese scholar pushing for a multiple-
       | language email address standard years ago, thinking it would be
       | neat (and profoundly impractical). Also maybe I'm misremembering,
       | I swear I've seen Arabic email addresses before.
        
       | seiferteric wrote:
       | It seems like there should be some technical solutions though.
       | Maybe just use the name field in the mainframe DB as a unique
       | hash of the utf-8 encoded name and store the real utf-8 encoded
       | name in an external DB or something.
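The hash-key idea above might look roughly like this. It is a sketch under several assumptions: the legacy name field is taken to hold 16 characters, SHA-256 is an arbitrary choice of hash, and a plain dict stands in for the external database. `name_key`, `register` and `external_names` are hypothetical names.

```python
import hashlib
import unicodedata

# Stand-in for the external, Unicode-capable store (hypothetical).
external_names = {}


def name_key(name: str, length: int = 16) -> str:
    """Derive a fixed-width key of digits and A-F from the UTF-8 name."""
    # Normalise first so equivalent spellings of the same name share a key.
    canonical = unicodedata.normalize("NFC", name)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return digest[:length].upper()


def register(name: str) -> str:
    """Store the real name externally; return the key for the legacy field."""
    key = name_key(name)
    external_names[key] = unicodedata.normalize("NFC", name)
    return key


key = register("Zo\u00eb")
print(key, "->", external_names[key])
```

The key contains only digits and the letters A to F, so it survives any EBCDIC code page. The trade-off is that the mainframe field is no longer human-readable, and a truncated hash can in principle collide.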
        
       | CoastalCoder wrote:
       | Can someone comment on what assumptions those banks _are_
       | permitted to make regarding names?
       | 
       | E.g., can they assume that names can be expressed as a sequence
       | of (current) Unicode characters with some specific maximum
       | length? Can they assume that names have no leading / trailing
       | spaces?
        
         | PeterisP wrote:
         | I believe that the main assumption they can make is that they
         | can use the name on the ID forms issued by the government or,
         | in case of foreign citizens, their passports. Due to history of
         | international diplomacy, the general standard for passports
         | expects that in addition to whatever script the country uses,
         | they will also include the name of the person in English or
         | French - so this is the key source of the problem, as for
         | passports in e.g. Russian you will get an "English" name that
         | you might use, however, you may get passports with names only
         | in French, so you would have to support the English and French
         | alphabets but perhaps not necessarily any others.
         | 
         | Regarding trailing spaces etc, IMHO the standard would be "as
         | shown in passport" i.e. trailing spaces definitely would not
         | matter, but spaces and punctuation between words would (e.g.
         | D'Artagnan as a name). I looked for but did not find any
         | specific restrictions on name length. In general, the country
         | will have regulations on what they accept as names in their
         | official IDs, and again you may piggyback on other institutions
         | - as long as you accept everything for which your government
         | has issued documents, you should be fine; and if someone has
         | an interesting case that requires changing the process, let
         | that fight happen between them and the government first.
        
         | mqus wrote:
         | I think that it has to be reasonable. Assuming that your
         | French-speaking target region has only names without accents is
         | unreasonable. Assuming a maximum length of 200(?) UTF-8
         | code points (or even bytes) seems reasonable (defensible) in
         | court. Same for leading/trailing spaces.
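The difference between a limit in code points and a limit in bytes matters once names contain non-ASCII characters. A small sketch, with made-up sample names and a hypothetical `name_sizes` helper:

```python
def name_sizes(name: str) -> tuple:
    """Return (code points, UTF-8 bytes) for a name, after trimming spaces."""
    trimmed = name.strip()
    return len(trimmed), len(trimmed.encode("utf-8"))


for sample in ("Zoe", "Zo\u00eb", "\u0396\u03c9\u03ae", "  Zo\u00eb  "):
    points, size = name_sizes(sample)
    print(repr(sample), "code points:", points, "bytes:", size)
```

A 200-byte limit therefore allows fewer characters for a Greek name than for an unaccented Latin one, while a 200-code-point limit treats them the same.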
        
         | tialaramex wrote:
         | Probably reasonable assumptions. When you're not sure, assume
         | the standard will be reasonableness, because that's what the
         | law assumes when it isn't specified.
         | 
         | So, you can make _reasonable_ assumptions. What is reasonable
         | will change, which is fine because the way courts figure out
         | what's reasonable in some particular case is to either have
         | the judge decide, or have a jury decide, and people change too.
         | 
         | The nice thing about reasonableness is that you are equipped to
         | make a first pass at judging it yourself, since you are
         | presumably a reasonable person. If you need second guessing,
         | have a team mate consider it, and, if you're worried that your
         | collective idea of "reasonable" might be distorted in an
         | important way, that'll be why your organisation probably
         | encouraged _diversity_ to avoid that.
         | 
         | You might say, this seems awful because it isn't precise enough
         | to say, implement it as a Javascript library. That's true, but
         | intentional. Justice will necessarily involve such judgement
         | calls, and trying to evade that by specifying everything
         | precisely with no room for judgement is a bug not a feature.
        
         | contravariant wrote:
         | I'm somewhat wondering to what extent a bank is required to
         | support storing the names natively.
         | 
         | I mean something like "${name} spelled with an acute accent on
         | the e" would be _technically_ a correct description even if it
         | is impractical to use. The GDPR does grant you the right to
         | correct your personal information but doesn't specify how this
         | information is represented.
         | 
         | As far as I can tell the GDPR also doesn't grant the customer
         | the right to have their name represented correctly on their
         | bank pass (otherwise everyone with a long surname would require
         | impractically long bank passes), the court only ruled that the
         | inability of the bank to store the name correctly simply isn't
         | an excuse.
        
         | mindcrime wrote:
         | I wonder if there are any limits on this from the GDPR
         | perspective? What if my name has 2^40 characters in it? Are
         | companies required to support that? What if I change my name
         | _from_ whatever it is today (say, "Phillip") _to_ a name that
         | has 2^40 characters? Would the bank be required to accommodate
         | that? etc..
        
           | PeterisP wrote:
           | In civil law countries (after Brexit, 100% of EU is civil
           | law) generally you can't change your name at will or by
           | simply starting to use it, it usually requires asserting one
           | of specific reasons that (in the eyes of the law) justify a
           | name change, a request to authorities and their approval -
           | which would be denied if you wanted to change your name to
           | something that has 2^40 characters.
           | 
           | If you did officially change your name to something
           | interesting, I presume the bank would definitely have to
           | accommodate it; but the restrictive part would be the process
           | of actually changing your name.
        
       | gpderetta wrote:
       | Every problem can be solved with an additional level of
       | indirection. For example use html character entities for
       | characters that are not representable in the DB character set.
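The character-entity indirection can be sketched with Python's standard library. The assumptions are labeled: the legacy field is treated as ASCII-only for simplicity, and `to_legacy` / `from_legacy` are hypothetical helper names. A literal `&` is escaped first so the stored form stays unambiguous.

```python
import html


def to_legacy(name: str) -> str:
    """Encode a name so that it contains only ASCII characters."""
    # Escape literal '&', '<' and '>' so decoding is unambiguous.
    escaped = html.escape(name, quote=False)
    # Replace every non-ASCII character with a numeric reference.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_legacy(stored: str) -> str:
    """Recover the original name from the stored ASCII form."""
    return html.unescape(stored)


stored = to_legacy("Zo\u00eb")
print(stored, "->", from_legacy(stored))
```

The catch is field width: each non-ASCII character expands to several stored characters, so a fixed-length name column fills up faster, and every consumer of the field has to know to decode it.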
        
         | [deleted]
        
         | caf wrote:
         | And/or rename the existing "Name" field to "Name-based Index
         | Key" and add a new field for Name.
        
       | jan_Inkepa wrote:
       | They don't say what the outcome of the case is? I guess it's
       | still in progress (seems to be 2 years old though)? Really
       | interesting use though!
       | 
       | Edit: ah on the linked wiki article it says:
       | 
       | > The Court of Appeal of Brussels held that, in accordance with
       | Article 16 GDPR, the data subject has the right for their name to
       | be correctly spelled when processed by the computer systems of
       | the Bank
       | 
       | So the plaintiff won, but no word on if/how the bank actually
       | fixed it.
        
         | Luc wrote:
         | The lower court ordered the bank to spell the name correctly.
         | The court of appeal upheld this judgement.
         | 
         | Source (Dutch):
         | https://www.gegevensbeschermingsautoriteit.be/publications/a...
         | 
         | This tweet says it was ING Bank:
         | https://twitter.com/simonhania/status/1270812210584043521
        
           | qiqitori wrote:
           | Great, stupid lawsuits, exactly what the world needs.
           | 
           | The bank's lawyers took the wrong approach, IMO. The law (as
           | quoted in the article) says:
           | 
           | > The data subject shall have the right to obtain from the
           | controller without undue delay the rectification of
           | inaccurate personal data concerning him or her.
           | 
           | This doesn't have that much to do with how a certain name is
           | displayed anywhere. Can I get an airline to change their
           | systems if they abbreviated my name on my boarding ticket?
           | Yeah, I don't think so. The airline could say "well we have
           | the proper name in this database over here". And so could the
           | bank.
        
             | toyg wrote:
             | _> "well we have the proper name in this database over
             | here". And so could the bank._
             | 
             | I expect this is how they will eventually solve the issue -
             | the customer-visible parts will be insulated from the old
             | system with stuff that can handle Unicode. Chances are they
             | currently don't have such insulation, producing documents
             | with the wrong names, hence the complaint (bank statements
             | are often used as proof of ID).
             | 
             | Btw this is not a stupid law. Accents are important parts
             | of languages, the tech to handle them has been around for
             | decades now, there is no excuse for willful illiteracy.
        
           | Deukhoofd wrote:
           | > De geschillenkamer van de GBA heeft deze uitleg als niet
           | afdoende aangezien. Dat een bankinstelling anno 2018 niet bij
           | machte zou zijn om een naam van een klant correct te
           | schrijven onder uitleg dat zij nog gebruik maakt van een
           | informaticasysteem van 1995, werd niet afdoende beschouwd.
           | 
           | Ouch. Basically "That you're not able to write a customer's
           | name correctly in 2018 because you use a system from 1995 is
           | not an excuse".
        
             | toyg wrote:
             | That's absolutely fair. The law is the law, and GDPR has
             | been adopted for 5 years at this point (enforced for 3),
             | there has been ample time to replace noncompliant systems.
             | If a car manufacturer gave you a new car without seatbelts
             | "because the production chain was built in 1995", you would
             | obviously sue them.
        
               | trasz wrote:
               | In 1995 EBCDIC, just like mainframes in general, was
               | already quite obsolete.
        
               | cgrealy wrote:
               | Depends on what you mean by "obsolete".
               | 
               | Were they superseded by more modern solutions?
               | Absolutely.
               | 
               | Were they nonfunctional? Hell no.
               | 
               | I worked on several systems in the early 2000s that still
               | had a big old mainframe at the back end.
               | 
               | I'm pretty sure most airlines and banks still run them.
        
               | akersten wrote:
               | > ample time to replace noncompliant systems.
               | 
               | I think the point here is this is completely out of left
               | field as far as what anyone has insisted would be non-
               | compliant with GDPR... If you had done a compliance audit
               | the day GDPR passed, I highly doubt this shortcoming
               | would have even made the footnotes.
               | 
               | "Overly broad and interpretable law with rabid defenders
               | is stretched to painful limits just as critics predicted"
               | is the real story.
        
               | notJim wrote:
               | I have no problem with this. We should not have to rename
               | ourselves, or change our language because of computers or
               | because of lazy companies' refusal to modernize. Both
               | computers and companies serve us, not the other way
               | around.
        
               | akersten wrote:
               | I'm not saying this is a bad outcome (modernization is
               | overall a Good Thing), I'm saying it's bad that the GDPR
               | is being used to achieve it.
               | 
               | Before today, you cannot seriously tell me that
               | (hypothetical) United Airlines being unable to print æ
               | on your boarding pass would be a GDPR violation. No one
               | would even have considered it. The best "GDPR auditors"
               | that popped up to save the day with expensive consulting
               | would have glossed right over it. And yet the overly
               | broad language of the regulation allowed this contrived
               | gotcha. And now any company that can't support emojis in
               | your surname is now in the Naughty Bucket of GDPR
               | Violators.
               | 
               | I'm just shocked how so many hackers are _ok_ with this
               | law existing in its current form, just because it
               | sometimes achieves things that they like.
               | 
               | If we find ourselves asking "what else can we hit with
               | this hammer," it's a bad law.
        
               | detaro wrote:
               | Companies no longer getting away with misspelling
               | customer names has absolutely been something that has
               | been discussed before this case. (and at the same time,
               | this doesn't mean every contrived example of a name and
               | where a name might appear actually has to support
               | everything)
        
               | akersten wrote:
               | > Companies no longer getting away with misspelling
               | customer names has absolutely been something that has
               | been discussed before this case.
               | 
               | Correcting incorrect data sure, that's part of what the
               | law grants you. But I believe this case is novel in that
               | the data is as correct as possible (for the intents and
               | purposes of banking) yet the courts are requiring a
               | cosmetic adjustment to the data. Cosmetic as in: it does
               | not change the bank's or customer's understanding of the
               | contract and business organization (i.e. I'm not trying
               | to downplay one's attachment to accented letters, I'm
               | talking about correct identification for business
               | purposes).
               | 
               | > and at the same time, this doesn't mean every contrived
               | example of a name and where a name might appear actually
               | has to support everything)
               | 
               | Why not? What language in the GDPR would prevent that?
               | It's the same violation as this case: the name is not
               | displaying how the data subject wants it.
        
               | toyg wrote:
               | When two courts decide in the same way at the first
               | chance, the interpretation is hardly stretched or
               | painful.
               | 
               | To me the story with GDPR has consistently looked like
               | "IT companies unable and/or unwilling to comply with (or
               | even read) laws when they feel they go against their
               | established practices, no matter how bad such practices
               | might be".
        
               | xxpor wrote:
               | I'm sure the hundreds of millions of euros this will
               | probably cost ING is the most productive possible use of
               | that capital. A real economic growth driver. If this guy
               | cares so much, he can take his business to a competitor
               | that gets it right.
        
               | TedDoesntTalk wrote:
               | My surname is completely unpronounceable by Americans,
               | and I live in America. I got over it years ago and life
               | continues. Perhaps the plaintiff in this case should
               | learn to stop being offended by the world as it is.
        
               | cgrealy wrote:
               | Your name isn't unpronounceable, people are just lazy.
               | 
               | And this isn't someone "being offended", this is a legal
               | requirement of the GDPR to accurately record someone's
               | name.
               | 
               | "Zoë" is not the same as "Zoe" when you go searching for
               | it in a DB.
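The lookup mismatch between an accented and an unaccented spelling is easy to reproduce. A minimal sketch using an in-memory SQLite table with its default byte-wise comparison; the table and column names are made up, and a real bank database may use a different collation. It also shows a second trap: the same visible name in decomposed form does not match either, unless it is normalised first.

```python
import sqlite3
import unicodedata

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
# Stored in precomposed form: 'Zo' followed by U+00EB.
conn.execute("INSERT INTO customers (name) VALUES (?)", ("Zo\u00eb",))


def count_matches(name: str) -> int:
    """Count rows whose name equals the given string exactly."""
    row = conn.execute(
        "SELECT COUNT(*) FROM customers WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


decomposed = "Zoe\u0308"  # 'e' followed by a combining diaeresis
print("exact spelling:", count_matches("Zo\u00eb"))
print("without accent:", count_matches("Zoe"))
print("decomposed form:", count_matches(decomposed))
print("after NFC:", count_matches(unicodedata.normalize("NFC", decomposed)))
```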
        
               | CamouflagedKiwi wrote:
               | > Your name isn't unpronounceable, people are just lazy.
               | 
               | That isn't necessarily true. We all have a certain set of
               | phonemes we can enunciate, and further have limits on how
               | they can be combined together. It is far from
               | inconceivable that the OP could have a name which
               | effectively _is_ unpronounceable to people speaking other
               | languages (and you can't just put in more effort to fix
               | this, so those people aren't "just lazy").
        
               | cgrealy wrote:
               | Sure there are sounds that humans cannot make, and
               | combinations that are difficult for non-native speakers.
               | I struggle with long Polynesian names, for example.
               | 
               | But they're not unpronounceable... they just take some
               | effort to learn.
               | 
               | Why do you think you can't fix this? I've never
               | encountered something that is physically unpronounceable
               | (fictional eldritch abominations and extraterrestrials
               | aside)
        
               | tialaramex wrote:
               | "The reasonable man adapts himself to the world; the
               | unreasonable one persists in trying to adapt the world to
               | himself. Therefore, all progress depends on the
               | unreasonable man." -- George Bernard Shaw.
        
             | jerf wrote:
             | The court does have a point in this case. There's a _huge_
             | number of systems in the world that existed in some form in
             | 1995 and were correctly handling names in 2018. It's not
             | like that's some sort of weird case that nobody else has
             | encountered.
        
               | bawolff wrote:
               | Indeed:
               | 
               | ISO 8859-1: 1985
               | Unicode: 1991
               | UTF-8: 1992
               | 
               | There is even IBM277 for an EBCDIC version.
        
       | N19PEDL2 wrote:
       | If the bank still relies on legacy software and IT standards,
       | well it's just its fault. They cannot expect people with
       | diacritics or other non-ASCII characters in their name just to
       | spell it incorrectly because their systems do not support Unicode
       | in the twenty-twenties.
       | 
       | Maybe their IT team had other priorities than replacing EBCDIC
       | with Unicode (or whatever they find more appropriate for their
       | systems), but this is an indicator of poor interest in
       | technological progress by the bank itself. It reminds me of some
       | banks that gave millions to Microsoft to keep ATMs running
       | Windows XP after its end of life.
       | 
       | Edit: I elaborated a bit more and I realized that it might be
       | more difficult than just replacing the character encoding standard
       | with a more modern one. For example, the name of the account owner
       | likely needs to match exactly the holder name on the credit card
       | associated with the account, and I'm not sure if diacritics can
       | be embossed correctly on the card.
        
       | sethammons wrote:
       | "My name is the complete work of Shakespeare, no, no not 'The
       | Complete work of Shakespeare,' I mean the concatenated plays and
       | poems, and I would like you to address me as such in any formal
       | communications. Thank you for being GDPR compliant."
        
         | t0mas88 wrote:
         | You can't change your name at will in Belgium (where this case
         | was). And I think in most of Europe the government has
         | reasonableness requirements on the names you can give your
         | kids. Elon Musk's strange combination would be refused for
         | example.
        
       | gpvos wrote:
       | More articles should have a "Dance" section.
        
       | gerikson wrote:
       | I have to admit, I did not have "EBCDIC" && "GDPR" on my 2021
       | bingo card.
        
       ___________________________________________________________________
       (page generated 2021-10-25 23:01 UTC)