[HN Gopher] Regular expression functions in Excel
___________________________________________________________________
Regular expression functions in Excel
Author : thunderbong
Score : 122 points
Date : 2024-05-21 10:57 UTC (3 days ago)
(HTM) web link (insider.microsoft365.com)
(TXT) w3m dump (insider.microsoft365.com)
| leshenka wrote:
| truly amazing times: Excel gets regex support (search-in-
| selected-cells is not available though, as well as search by
| regex)
| GregoireDev wrote:
| I am confused between all these AI announcements from Microsoft
| and the exciting new regex support in "preview" (aka soon to be
| dropped)
| intended wrote:
| Good gravy.
|
| Things im terrified of:
|
| - making an error in a financial model - checking a model -
| making an error in a regexes - checking a regex
|
| Thins I love; - dumb excel stuff
|
| This is a great time to be alive, and at arms length from
| finance.
|
| To all the people who will deal with the fallout- I salute you.
|
| This is awesome.
| cjk2 wrote:
| I am terrified of the same things. I got given a _validated_
| financial model for something ages ago and found a major Excel
| fuck up in it. When I say validated, it was run through an
| expensive validation company who charged a lot of money and
| they didn 't even notice it.
| intended wrote:
| Oh I know that fear. You can't trust anything anymore.
|
| It's so easy to just miss something. A date function which is
| off.
|
| Missing something in excel is a big deal for a firm which is
| mean to do it - and it will always happen. And that's the
| experts.
|
| I'm betting Someone was too tired, and uninterested and just
| followed the script.
|
| Hopefully that experience put the fear of God into some
| analyst and associate.
|
| Pretty much the only thing that I can rely on when it comes
| to modeling.
|
| Knowing that someone is as terrified as I am and has put the
| work in to not be embarrassed.
|
| And let's agree - it is work.
|
| I remember seeing a date function used by another firm for
| the first time.
|
| It too me at least an hour to decipher it.
|
| It was cool, I learnt a lot. But it takes time.
|
| Regexes are cool, and I guess people will be learning a lot.
| BitwiseFool wrote:
| Almost a decade ago I worked for a bank and in our little
| corner of the organization I was rapidly becoming the
| "Excel Guy". While I was happy to help people, the prospect
| of being pigeonholed into such a role terrified me and I
| switched jobs. That wasn't the only reason, but it was a
| major factor.
| bongodongobob wrote:
| This is why I pretend to not be good at Excel. Once word
| gets out, it becomes your job.
| Chris2048 wrote:
| Just to be clear, was the POC financial model implemented in
| Excel?
|
| If so, did it at least have embedded VB or was it all cell
| logic?
| cjk2 wrote:
| There was VBA and cell logic. The sheet was validated by a
| third party then someone ported it to a proper language
| (incorrectly). That model was not validated.
| listenallyall wrote:
| Why do you call it "a major Excel fuck up" when the error
| arose from the language port?
| Chris2048 wrote:
| Did they not compare data input/output of the excel
| original versus the port?
| cjk2 wrote:
| Nope
| jeisc wrote:
| never too late to do something that should have been since day
| one
| blitzar wrote:
| Yet another meta breaker - this is another move aimed at the
| casual audience to bring in the party gamers and make the program
| an absolute mess competitively.
| curiousgal wrote:
| Solid reference!
| eviks wrote:
| Could we get some easy aliasing of REGEXREPLACE to reRepl and
| picking a regex engine that matches the syntax rules you're used
| to in a the next decade or so?
|
| > Try asking Bing Copilot for regex patterns!
|
| Or maybe embed a cheaper and more reliable solution like
| https://regex101.com?
| cjblomqvist wrote:
| You could quite easily store a lambda formula (that just passes
| on the arguments) as a named function. Very neat trick for
| organizing and re-using formulas in excel.
| underlines wrote:
| Coincidentally using the same function names as Google sheets did
| for years. :D
| filcuk wrote:
| Why would spreadsheet software try to differ in formula names?
| You want users to be able to switch to your platform with
| little friction.
| hbossy wrote:
| It used to be the other way around, with Excel leading the
| way.
| brewdad wrote:
| This is one of those fringe features that 1% of users are
| clamoring for and 99% of users should never ever use. I'm
| not surprised it was never at the top of the Excel team's
| TODO list.
| IshKebab wrote:
| Yeah the problem is that those 1% of users are probably
| writing the most critical spreadsheets, and when
| companies using GSuite think about moving to Office365
| the more pain it is to migrate spreadsheets the worse for
| Microsoft.
| eb123 wrote:
| So why are only 2/3 of the formulas named the same?
| Excel's new REGEXTEST [1] is REGEXMATCH in Sheets [2]
|
| [1] https://support.microsoft.com/en-us/office/regextest-
| functio...
|
| [2]https://support.google.com/docs/answer/3098292?hl=en
| eb123 wrote:
| I've been using Sheets's regex functions for over a year,
| they're really helpful for basic data transformation.
|
| Excel also got a "split" function much later than Sheets
| did.
|
| I was excited to read on the Sheets blog that Sheets
| finally will have a table functionality, which Excel has
| always had:
| https://workspaceupdates.googleblog.com/2024/05/tables-in-
| go...
| jgalt212 wrote:
| Or Excel mimicking the broken calendar of Lotus 123.
| mnau wrote:
| And it was an excellent decision, see rationale:
| https://www.joelonsoftware.com/2000/06/03/strategy-
| letter-ii...
|
| TBH, they should have dumped during switch to OOXML, that
| was a wasted opportunity. On the other hand, that's hard
| sell to business.
| _factor wrote:
| Good ideas should have a provisional naming period where a
| standardize name can be agreed on. Anything else leads to
| fragmentation and lock in due to domain specific
| terminology.
| Akronymus wrote:
| Well, excel translates the formula names... (One of the many
| reasons I switched over to using windows in english, rather
| than german)
| Shorel wrote:
| I am waiting for the day I can use Excel as a REST API client, so
| I write the ID of some record in a cell, and Excel does all the
| work of calling the API repeatedly and fills the rows and sheets
| with the required information.
|
| This will kill a few Python jobs and make it a very popular REST
| client =)
| actionfromafar wrote:
| How about Excel as an API server, too?
| jodrellblank wrote:
| You've been able to do that for ages, a client automating
| Excel through COM and listening for events in Excel from
| script: https://news.ycombinator.com/item?id=37229850
|
| Microsoft's advice page for wrapping Office in a web
| application fronted by ASP/ASP.Net:
| https://support.microsoft.com/en-us/topic/considerations-
| for...
|
| And that has links to documentation about Excel Web Services
| of old (on-premises SharePoint days, not sure if it
| translates to today's cloud), which is SOAP/HTTP:
| https://learn.microsoft.com/en-us/sharepoint/dev/general-
| dev...
| WorldMaker wrote:
| The "new hotness" for today's cloud are all the Excel APIs
| in the Microsoft Graph:
|
| https://learn.microsoft.com/en-
| us/graph/api/resources/excel?...
| kevin_thibedeau wrote:
| MS has had ODBC integration for ages. That is a more rational
| way to connect Excel with data sources.
| nhatcher wrote:
| I'm getting closer! https://github.com/ironcalc/IronCalc
| xnx wrote:
| Can you do this with the =WEBSERVICE() function?
| https://support.microsoft.com/en-us/office/webservice-functi...
| ComodoHacker wrote:
| Can I parse response (JSON, XML, whatever) into cells without
| VBA?
| xnx wrote:
| With some effort, probably: =FILTERXML()
| https://support.microsoft.com/en-us/office/filterxml-
| functio...
|
| Could also do some crude/fragile parsing with
| aforementioned regex functions.
| jmkni wrote:
| Knock yourself out!
|
| https://stackoverflow.com/a/158657/969613
| voidUpdate wrote:
| Until they add python into excel, then there will be python
| jobs again :P
|
| https://support.microsoft.com/en-gb/office/get-started-with-...
| personalityson wrote:
| Every self-respecting corporate org will block execution in
| the cloud + no IDE (you edit the code inside the cells), its
| dead on arrival
| drivers99 wrote:
| I've actually done that. I needed to create a lot of json
| documents and POST them somewhere (it was metadata for some
| Chef cookbooks), and a lot of the information was similar with
| a few variations. Excel lets me create what I call a
| "conductor's score" of that information. (In music, the
| conductor's score has all the instruments on one page. Each row
| is a different instrument, and each column is the same measure
| [point in time] in the music.) In other words, I can easily
| look for commonalities, differences, and presence of the data
| as I update it.
|
| I was just going to copy/paste the fields into JSON which is
| repetitive and error prone, and realized I could automate it.
| So I wrote VBA to output a json version of that. (The annoying
| part of writing that in VBA is that JSON data needed a lot of
| double-quotes, but then you have to use escape sequences for
| every double-quote.)
|
| Once I had the JSON data generated, I was going to post each
| one manually, so again, I looked and realized you can POST the
| data directly from VBA. Added a button to the page and you
| could update the data in Excel and click a button to POST it.
|
| Of course, once I turned it over to someone else, they were
| like "what the hell" and started doing it a different way. lol
| deathanatos wrote:
| I've done this in Google Sheets, with "Google Apps Script", the
| most horridly named thing I think I've ever worked with.
|
| It's just (utterly ancient) Javascript (I think ECMAScript,
| technically, since there's no DOM, no browser, etc.), bolted
| on. I wrote in modern JS and used Babel to transpile the source
| back into the stone age.
|
| ... I'd _much_ rather write Python.
| marcus0x62 wrote:
| What's the joke for this - now you have three problems?
| 6c696e7578 wrote:
| LibreOffice has had regex for donkeys years.
| dylan604 wrote:
| I read the title, and was actually surprised how in 2024 this
| is a headline. I'm surprised that this has not been a feature
| since forever as well.
| alwillis wrote:
| Apple's Numbers spreadsheet has had regex support for a while
| as well.
| airstrike wrote:
| which is probably the one good thing you can say about
| Apple's Numbers
| animal_spirits wrote:
| I can say the UX of Apple Numbers is the best, much above
| the usability I've found of Excel. Much less powerful, but
| it is much easier to use Apple Numbers
| airstrike wrote:
| easier to use for power users or for beginners?
| perryprog wrote:
| As someone who regularly flits between Google Sheets,
| LibreOffice Calc, and Numbers regularly (with my most
| time in Google Sheets and Numbers), and as someone who's
| made some extremely complicated spreadsheets in all
| three, I have to say I vastly prefer Numbers if I can get
| away with it. It has a lot of issues that can make some
| stuff hard fast (no array formulas is a big one), but I
| find it significantly easier and faster to prototype in
| Numbers regardless.
|
| I think the main bit I love so much about it is having
| actual tables instead of the Infinite Grid that most
| spreadsheet software uses. You get named ranges for free,
| and it makes semantical sense too, among a good number of
| other benefits (sheet organization, refactoring, simpler
| styling...).
|
| There are some really nice things that Google Sheets
| does, and I've done a few fancy things with App Script
| which isn't too bad, and I do really like QUERY though I
| wish it was a bit higher power. I just always find myself
| missing the UX of Numbers, though.
| zorrn wrote:
| Am I the only one who distrusts domains with numbers in it?
| curiousgal wrote:
| No lookahead/lookbehind? Boo
| neallindsay wrote:
| Kudos using "regex" instead of "regexp", as it is much nicer to
| say out loud.
| aronhegedus wrote:
| huh, TIL that regex wasn't a thing in vanilla excel. Thought
| that'd be a basic thing that's included.
| appleiigs wrote:
| I realized that long ago... it's incredible that it took this
| long given how common it is in programming.
| axus wrote:
| The strategy of freezing VBA and BAT scripting, offering
| better alternatives, and hoping they'd go away on their own
| hasn't worked.
| wruza wrote:
| You never expect basic ergonomics from an ALL CAPS language.
| They all suck.
| layer8 wrote:
| You can using VBA functions and _VBScript.RegExp_. This is
| restricted to Windows though. Microsoft wants to get rid of
| VBScript, so maybe they are going through the most common use
| cases in Excel.
| blowski wrote:
| Have they officially said they want to get rid of VBScript?
| layer8 wrote:
| Pretty much:
| https://techcommunity.microsoft.com/t5/windows-it-pro-
| blog/v...
| against_entropy wrote:
| Hope the office copilot can do this for me
| louthy wrote:
| Now you've got two problems
| smartmic wrote:
| Any information on which standard they have implemented (POSIX
| BRE, ERE, PCRE, ...)? Since they are Microsoft, I would not be
| surprised if there is none.
| shawn_w wrote:
| According to the documentation links in the article, it's using
| PCRE2.
| sebsebmc wrote:
| > All regular expressions for this function, as well as
| REGEXEXTRACT and REGEXREPLACE use the PCRE2 'flavor' of regex.
| smartmic wrote:
| Okay, thanks! Now next question which comes into my mind: is
| there info about the regex engine they are using? I would
| expect there is some (proprietary?) C++ library also used in
| other MS products or are they even using a FOSS licensed one?
| jandrese wrote:
| Seems likely that if they explicitly say they're supporting
| the PCRE2 syntax it is because they are using the BSD
| licensed libpcre.
|
| Reinventing a regular expression system is very far down on
| the list of things I'd ever want to do. Those things are
| filled with dragons and require years of refinement to get
| the bugs out.
| stockhorn wrote:
| Now I can finally parse (X)HTML with Excel ;)
| layer8 wrote:
| Now I need to build a regex for zalgoifying text in Excel.
| mindaslab wrote:
| Use Libre Office if you value your freedom.
| Tronickle wrote:
| I love freedom and I love open source, but the problem is that
| Libre Office is too far from MS Office in terms of UX and
| functionality.
|
| If your usecases are not complicated, Libre office could work
| well. Otherwise your efficiency will be behind those that you
| can achieve with MS Office.
| resource_waste wrote:
| Oh man Libre Office is trash and so are the
| leadership/maintainers. No tears.
|
| I once got banned for asking to put 'text size' on the main
| screen for the powerpoint knockoff.
|
| Text size seems pretty important to have. You shouldn't have to
| google how to change the text size.
|
| I'm deeply convinced there is a Microsoft plant undermining
| everything.
| zamadatix wrote:
| Currently at least text properties and size is on the main
| screen in the current defaults for Impress and, just like
| PowerPoint, only made visible and editable when you click
| something which has a text size. Maybe it was different in
| the timeframe you were talking about (barring any actual
| information on the request) but this combined with how
| abrasive your comment is leaves little reasoning for why it's
| horrible and casts serious doubts the ban was actually
| related to asking about interface enhancements even if that's
| when you were banned.
|
| FWIW I prefer PowerPoint over Impress as well (particularly
| when it comes to the browser side of things on the go), using
| one or the other is just not a vendetta of mine.
| pessimizer wrote:
| > I'm deeply convinced there is a Microsoft plant undermining
| everything.
|
| It's conspiracy theory thinking, but I'm getting there, too.
| These things have been too close to being complete
| replacements for commercial products for so long, but somehow
| still fall over on problems that have been complained about
| for a decade or more.
|
| There's not one of them that some massive corporation with
| sights set on adding a letter to faang couldn't pick up and
| turn into a legitimate competitor in a month or six, khtml-
| style. Instead they often sit on moribund subpages of some
| larger project website, with a blog updated once every year
| or two.
|
| These projects _have to be_ targets for sabotage by their
| commercial competitors, just as government initiatives are,
| just cheaper. For e.g. Adobe it 's less than a rounding error
| to e.g. spend 500K/year supplying a developer to e.g.
| Scribus[0] to make sure that it remains difficult to
| contribute to, or makes bad architecture choices, and that
| hypothetical developer could be one of its biggest actual
| contributors.
|
| Maybe it's just because they have to spend a lot of time
| chasing Gtk, which makes it another redhat problem?
|
| edit: The 95% state of all a lot of these FOSS packages is
| also evidence that there are zero tech billionaire
| philanthropists. It would take a total of _one_ of them to
| grab all of these projects and wrangle them into good form.
|
| [0] _This is an actual hypothetical, I 'm not making an
| accusation about Scribus._ Between the GIMP and Inkscape,
| they literally are the only people who made a real effort
| with color for years. Inkscape openly said that their
| software was only for making images for the web (as opposed
| to print), as if there were a rational reason to cripple
| their product and narrow people's interest in it. Now that
| deviantart is gone, will anyone care about Inkscape anymore?
| Will the dopamine hit hobbyists get from sharing generative
| art cause Inkscape to be totally left behind? Why is it hard
| to design a form for your office's paperwork in Inkscape, a
| vector drawing program, if forms are just straight horizontal
| lines and text? Why aren't they trying to merge with Scribus
| (and LibreOffice Impress/Draw) and create a complete pdf
| solution? I have no idea.
|
| _I 'm being very ranty here, and I do want to say that I
| very much appreciate the work that people are doing for free,
| in their spare time, for others._
| personalityson wrote:
| The reason is MS is gradually deprecating VBScript. Today it's
| the VBScript.RegExp reference that is being used in VBA.
| nickpeterson wrote:
| Does excel have gpu accelerated calculations, seems an obvious
| thing to add if not?
| layer8 wrote:
| It may appear obvious, but only a very limited subset of real-
| life spreadsheets could meaningfully take advantage of it.
| Excel does not support GPU-accelerated calculations.
| nickpeterson wrote:
| doesnt the back-end of excel have a columnar data model that
| can store millions of records in a single excel file?
| recursive wrote:
| The front-end of excel doesn't allow the creation of more
| than 1048576 rows in a sheet. (1<<20)
| smokefoot wrote:
| It's interesting to see the give and take between google sheets
| and excel. Google sheets came on the scene shooting for total
| backwards compatibility and then proceeded to develop some really
| interesting innovations. Now we see new features emerging on both
| sides that are quickly replicated by the other player. Notably
| off the top of my mind:
|
| * spill formulas - google first, now supported in MSFT
|
| * # notation - MSFT enhancement to spill formulas not yet adopted
| in google
|
| * regular expressions - google first, now in Excel
|
| * check boxes - google first, now supported in excel
|
| There must be others. I would expect competitive dynamics where
| each side tries to build extensions that can't be replicated on
| the other side
| JoeyBananas wrote:
| Congrats on catching up to Perl! Maybe it took so long because
| Bill was on vacation on Epstein island.
| recursive wrote:
| Perl's UI still has some work to do if we're making
| incomprehensible comparisons.
| tezza wrote:
| At last the production reliability of Excel meets the clarity of
| regular expressions and the simplicity of matrix slices
| jupin wrote:
| Finally, an excel formula I will actually remember
___________________________________________________________________
(page generated 2024-05-24 23:01 UTC)