[HN Gopher] How to untangle phone numbers
       ___________________________________________________________________
        
       How to untangle phone numbers
        
       Author : tosh
       Score  : 35 points
       Date   : 2024-06-12 07:58 UTC (1 day ago)
        
 (HTM) web link (factbranch.com)
 (TXT) w3m dump (factbranch.com)
        
       | Shank wrote:
       | > The way someone writes a phone number can give you hints about
       | the country and area codes.
       | 
       | Or other things, depending on the country. For example, in Japan,
       | area codes 090, 080, and 070 indicate mobile numbers. 050 means
       | it's an IP phone, irrespective of area (hikari denwa).
        
         | NeoTar wrote:
         | In the UK, 01, 02 area codes are land lines, 07 are mobiles, 08
         | are freephone or fixed price services, and 09 are premium
         | services. 03 and 05 are in use but less common, 04 and 06 are
         | unused.
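
A minimal sketch of the UK prefix rules described above, assuming the number is already in national format (leading 0). The table name and function name are made up for illustration, and the categories are only as fine-grained as the comment itself.

```python
# Coarse classification of a UK national-format number by its first two
# digits, using only the categories listed in the comment above.
UK_PREFIX_KINDS = {
    "01": "landline",
    "02": "landline",
    "03": "in use, less common",
    "05": "in use, less common",
    "07": "mobile",
    "08": "freephone or fixed-price service",
    "09": "premium service",
}


def classify_uk_number(national_number: str) -> str:
    """Return a rough category for a UK number written with its leading 0."""
    digits = "".join(ch for ch in national_number if ch.isdigit())
    # 04 and 06 (and anything not starting with 0) fall through to the default.
    return UK_PREFIX_KINDS.get(digits[:2], "unused or unknown")


print(classify_uk_number("01234 567890"))  # landline
```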
        
       | rahimnathwani wrote:
       | Tangent - I really like Factbranch. At a previous company we used
       | it to display data from our production database within Zendesk.
       | It was a snap to set up the connection and also easy to edit the
       | template.
        
       | eggfriedrice wrote:
       | I particularly like how the UK phone number examples are not
       | written how we would write them in the UK, which I guess
       | underlines the point.
        
         | dsr_ wrote:
         | I like how the first US number does not have enough digits to
         | be valid.
        
           | saurik wrote:
           | It is all just extremely confusing if you stare at the
           | examples as I am pretty sure there can't be a country code
           | +12... that would get parsed as +1. So, that first one is
           | actually +1(234)567-8901 ;P.
        
             | usr1106 wrote:
             | Obviously no country code can be a prefix of another one.
             | The switches would not know how to interpret it.
             | 
             | But 2 or more countries can use the same country code. The
             | area code will determine which country you call to.
             | 
             | In the North American numbering plan +1 all numbers must
             | have a fixed length 3 + 3 + 4.
             | 
             | In many other numbering plans the length varies. Both area
             | code length and subscriber number length can vary. So it
             | gets pretty complicated to parse a number. Basically rules
             | don't help, you need a database and you need to update it
             | regularly. No idea whether an open, non-commercial data set
             | exists.
             | 
             | When I call Germany, Android shows the name of pretty
             | small places. I believe Android reports every number you
             | call to Google. But the database is probably still local, I
             | believe locations are also shown when you have no data
             | connection.
             | 
             | I am not from the US, but I understood there you might also
             | need extensions to be dialled after the number proper.
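
A rough sketch of the two points above: because no country code is a prefix of another, the code can be peeled off by trying lengths 1 to 3, and +1 numbers then split as 3 + 3 + 4. The `COUNTRY_CODES` set is a tiny hypothetical subset for illustration; as the comment says, a real parser needs a full, regularly updated database.

```python
# Illustrative subset of country calling codes (hypothetical table for this
# sketch); a real one needs every assigned code and regular updates.
COUNTRY_CODES = {"1", "44", "49", "54", "61", "64", "81", "358", "972"}


def split_country_code(e164: str) -> tuple:
    """Split '+<digits>' into (country_code, rest).

    Country codes are 1 to 3 digits and none is a prefix of another,
    so at most one candidate length can match.
    """
    digits = e164.lstrip("+")
    for length in (1, 2, 3):
        if digits[:length] in COUNTRY_CODES:
            return digits[:length], digits[length:]
    raise ValueError(f"unknown country code in {e164!r}")


def split_nanp(national: str) -> tuple:
    """Split a 10-digit +1 national number into its 3 + 3 + 4 parts."""
    if len(national) != 10 or not national.isdigit():
        raise ValueError("NANP numbers have exactly 10 digits")
    return national[:3], national[3:6], national[6:]


country, rest = split_country_code("+12345678901")
print(country, split_nanp(rest))  # 1 ('234', '567', '8901')
```

This matches saurik's reading of the article's first example as +1(234)567-8901 rather than a "+12" country code.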
        
         | NeoTar wrote:
         | True, I'm not sure I've ever seen a 2 + 4 (e.g. 12 3456) style
         | six digit number in the UK. 01234 567890 is probably what I'd
         | expect; maybe 01234 567 890.
        
       | doctor_eval wrote:
       | I used to just store them as e164 (always with the country code),
       | and as bigints.
       | 
       | You need a bit of country-code based metadata to convert back to
       | strings, but the storage format is ultra compact and unambiguous.
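
A minimal sketch of the E.164-as-integer idea, not doctor_eval's actual code. The function names are made up. It relies on two facts about E.164: numbers have at most 15 digits (so they fit in a signed 64-bit column) and country codes never start with 0 (so no leading zeros are lost).

```python
def e164_to_int(e164: str) -> int:
    """Convert '+<country code><number>' to an integer for storage."""
    digits = e164[1:]
    if not e164.startswith("+") or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not an E.164 number: {e164!r}")
    if len(digits) > 15 or digits[0] == "0":
        raise ValueError(f"not an E.164 number: {e164!r}")
    return int(digits)


def int_to_e164(value: int) -> str:
    """Convert the stored integer back to the plain E.164 string."""
    return f"+{value}"


stored = e164_to_int("+12345678901")
print(stored, int_to_e164(stored))  # 12345678901 +12345678901
```

Pretty-printing with spaces or brackets is where the per-country metadata mentioned above would come in; this sketch only round-trips the bare E.164 form.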
        
         | xp84 wrote:
         | Is the storage savings worth this added complexity vs just
         | using normalized strings? What do you do when it needs to store
         | `*555` in such a field? To me, in 2024, this seems like using a
         | bitmask in a database to pack a bunch of bools into a single
         | byte: technically it could be perfectly valid (ignoring the
         | example above), but you're probably going to ruin someone's day
         | who comes along after you and just wants things to work.
        
           | doctor_eval wrote:
           | What customer has *555 as their phone number?
           | 
           | Edited to add:
           | 
           | It does depend on what you are doing with the numbers.
           | Focussing on the storage side is missing the point. It's
           | about the ambiguity. In general I've found that anything
           | other than integers is a mistake.
           | 
           | For example, *555 is not a valid phone number. You can't put
           | it in a tel: URL and you can't dial it or send a text to it.
           | It will work only in limited situations. If you want to store
           | an extension it should be a separate field.
           | 
           | Once you realise that phone numbers are functional, almost
           | exactly like IP addresses, you realise that storing them as
           | integers has benefits because that way you literally are
           | unable to store useless data.
           | 
           | Once you get in the habit of using only e164 numbers, the
           | phone number becomes usable in almost any context, and you
           | can make functional assumptions about it.
           | 
           | As an aside, it's much easier to create fast and small prefix
           | indexes on integers, but that's a different story...
        
             | dsr_ wrote:
             | An internal extension on your PBX
        
             | L3viathan wrote:
             | Maybe not a regular customer, but there's other reasons you
             | might have a phone field:
             | 
             | > In Israel, certain advertising numbers start with a *.
             | 
             | > In New Zealand, non-urgent traffic incidents can be
             | reported by calling *555 from a mobile phone.
        
             | Spunkie wrote:
             | Lots of extensions are written like that.
        
             | airstrike wrote:
             | > What customer has *555 as their phone number?
             | 
             | "Surely we won't ever need it"
        
             | ianburrell wrote:
             | Actually, a better example is 911. It is a phone number,
             | you can dial it, but has no E164 representation.
        
               | averageRoyalty wrote:
               | Wouldn't it be (assuming US) +1911?
        
         | ianburrell wrote:
         | One huge problem with this is that you need to normalize every
         | phone number. Lots of people enter local numbers without
         | country code, and you need to know the country.
         | 
         | For most phone numbers, you just need to round trip them. They
         | will be presented in the same context. It is better to store
         | ambiguous numbers as entered, along with context, and let a
         | human figure it out than to mess them up.
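
One way to sketch "store as entered, along with context": keep the raw string and whatever country hint was available, and only fill in a normalized form when it is unambiguous. The `PhoneEntry` type, its field names, and the "starts with +" rule are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PhoneEntry:
    raw: str                     # exactly what the user typed
    country_hint: Optional[str]  # e.g. a country code from the form, if known
    e164: Optional[str] = None   # set only when normalization was unambiguous


def record(raw: str, country_hint: Optional[str] = None) -> PhoneEntry:
    """Keep the input verbatim; normalize only numbers written with a '+'."""
    stripped = raw.strip()
    if stripped.startswith("+"):
        digits = "".join(ch for ch in stripped if ch.isdigit())
        if 1 <= len(digits) <= 15:
            return PhoneEntry(raw, country_hint, "+" + digits)
    # Local numbers, *555, 911, extensions: store as-is for a human to read.
    return PhoneEntry(raw, country_hint)


print(record("+1 (234) 567-8901").e164)  # +12345678901
print(record("*555", "NZ").e164)         # None
```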
        
       | m463 wrote:
       | remember to dial 9 first to get out.
        
         | gpvos wrote:
         | Or 0, depending where you are.
        
           | airstrike wrote:
           | or M, for murder
        
         | Scoundreller wrote:
         | And god help you if you get assigned an extension that matches
         | a popular first 4 digits of phone numbers in your area, because
         | you will forever get calls by people that forget the 9.
        
       | SunlitCat wrote:
       | For my phone at work, I could use a "how to untangle phone
       | cables" guide! :)
        
       | Spunkie wrote:
       | This article also completely skips over Phonewords, eg
       | 1-800-Flowers
       | 
       | https://en.m.wikipedia.org/wiki/Phoneword
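
A small sketch of turning a phoneword into digits, assuming the common ITU-T E.161 keypad layout (2 = ABC through 9 = WXYZ). Other layouts have existed, so the mapping table is an assumption rather than a universal rule.

```python
# Letter groups per key in the common E.161 keypad layout.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def phoneword_to_digits(text: str) -> str:
    """Replace letters with keypad digits; leave everything else untouched."""
    return "".join(LETTER_TO_DIGIT.get(ch.upper(), ch) for ch in text)


print(phoneword_to_digits("1-800-Flowers"))  # 1-800-3569377
```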
        
         | Jolter wrote:
         | Yes but users of your web site or app are pretty unlikely to
         | put their phone number as one of these, aren't they?
        
           | Spunkie wrote:
           | Not really, at least for B2B stuff.
           | 
           | I have access to a bunch of big marketing lists and there are
           | always at least some Phonewords in there. The only big lists
           | without them are the ones collected by forms that only
           | allowed digits in the phone field.
           | 
           | Also on a personal level I have a "joke" number of
           | (404)myname that I put into contact forms that allow it.
           | Developers I talk to seem to have an easy time remembering my
           | # because of the joke.
        
         | gpvos wrote:
         | The mapping from letter to digit varies by country. Less so
         | nowadays with mobile phones, but still.
        
         | Scoundreller wrote:
         | Shout out to http://phonespell.org, on the internet since 1995.
         | 
         | > What language is PhoneSpell written in?
         | 
         | > PhoneSpell is a system of multiple parts, some in C++, some
         | in Perl, some in C, some in shtml, and some in shell scripts.
         | 
         | Checks out.
        
       | gus_massa wrote:
       | ... and here in Buenos Aires, we have a phantom 9. If your cell
       | phone is +54-11-5555-5555 in some apps/webpages you must write
       | +54-911-5555-5555. But not in all apps/webpages, so sometimes you
       | must try both options until the app/webpage is happy.
       | 
       | Also, from a cell phone you can just use 5555-5555 but from a
       | line phone you must add a 15, i.e. 15-5555-5555. So users type
       | whatever combination of 11, 911, 15 or nothing they think is the
       | best one.
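
Since different systems expect different forms, one hedged approach is to generate both international spellings and try or match against each. This sketch encodes only what the comment above describes (with and without the 9 after +54); the function name is made up and it is not a complete model of Argentine numbering.

```python
def argentina_mobile_variants(area_code: str, subscriber: str) -> list:
    """Return both international spellings of an Argentine mobile number."""
    subscriber_digits = "".join(ch for ch in subscriber if ch.isdigit())
    return [
        f"+54{area_code}{subscriber_digits}",   # without the extra 9
        f"+549{area_code}{subscriber_digits}",  # with the extra 9
    ]


print(argentina_mobile_variants("11", "5555-5555"))
# ['+541155555555', '+5491155555555']
```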
        
       | jensenbox wrote:
       | The E.164 Standard is also a great place to start.
       | 
       | E.123 is for printing and general purpose use (as in use in URLs:
       | https://datatracker.ietf.org/doc/html/rfc2806#section-2.2).
       | 
       | While the E.164 is (primarily) for storage and processing.
        
       | hypeatei wrote:
       | I suppose if a computer needs to use this for texts or calls then
       | normalizing makes sense. Other than that, I don't think it's
       | worth it as most of the time the phone field in a database is
       | just for humans to look at in the UI.
        
       | operator2140 wrote:
       | Fun fact, I have vanity numbers ending with NNN-0001 and 0000 on
       | my cell phones and I have received ZERO telemarketer calls to date
       | on either number. Whatever mass calling software telemarketers
       | use, they won't call numbers that look obviously fake.
        
       | jeroenhd wrote:
       | > If a local number starts with a single 0, strip the 0 and
       | prepend the country code. If it starts with 00, strip the 00 but
       | assume the country code is already there.
       | 
       | This will work for most of the world, but not everywhere. There
       | are countries out there that don't use 00 as the international
       | call prefix (Wikipedia has a list:
       | https://en.wikipedia.org/wiki/List_of_international_call_pre...).
       | 
       | Take Australia, for instance, where 0011 is the prefix to strip
       | to turn an international number into a local number, but 0018 is
       | what you use to route a call to Optus. Australia's country code
       | is not 18!
       | 
       | All of this can be prevented on mobile phones by simply placing a
       | + at the front, but landlines (and systems simulating landlines,
       | such as VoIP) quickly become part of a hellish web of deviating
       | telephony standards.
       | 
       | Like email, the format may look simple, but only if you pretend
       | not to have to deal with any other country or culture in the
       | future.
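
A sketch of the quoted rule with the international prefix made per-country instead of hard-coded to 00. The `DIALING_RULES` table is a small hand-written assumption covering three countries; the function refuses any other 00-style prefix (such as Australia's 0018) rather than guessing.

```python
# Hypothetical per-country rules for this example; a real table is far larger.
DIALING_RULES = {
    "GB": {"country_code": "44", "international_prefix": "00", "trunk_prefix": "0"},
    "DE": {"country_code": "49", "international_prefix": "00", "trunk_prefix": "0"},
    "AU": {"country_code": "61", "international_prefix": "0011", "trunk_prefix": "0"},
}


def to_e164(raw: str, region: str) -> str:
    """Normalize a number dialled from `region` to '+<digits>'."""
    rules = DIALING_RULES[region]
    digits = "".join(ch for ch in raw if ch.isdigit())
    if raw.strip().startswith("+"):
        return "+" + digits  # already international
    intl = rules["international_prefix"]
    if digits.startswith(intl):
        return "+" + digits[len(intl):]  # country code is already there
    if digits.startswith("00"):
        # Some other special prefix (e.g. carrier selection): do not guess.
        raise ValueError(f"unrecognized prefix in {raw!r} for region {region}")
    trunk = rules["trunk_prefix"]
    if digits.startswith(trunk):
        digits = digits[len(trunk):]  # strip the single leading 0
    return "+" + rules["country_code"] + digits


print(to_e164("01234 567890", "GB"))         # +441234567890
print(to_e164("0011 44 1234 567890", "AU"))  # +441234567890
```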
        
       | woah wrote:
       | Just use ChatGPT API lol
        
       ___________________________________________________________________
       (page generated 2024-06-13 23:00 UTC)