[HN Gopher] A functioning Turing Machine using Notepad++ and its...
___________________________________________________________________
A functioning Turing Machine using Notepad++ and its find/replace
regex engine
Author : MarcellusDrum
Score : 361 points
Date : 2022-02-22 08:18 UTC (14 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| kats wrote:
| Not a Turing machine if the user has to press a button for each
| step.
| coldtea wrote:
| That's not relevant as to whether it's a turing machine or not,
| tho...
|
| The possible calculations (and genericity) is what matters. The
| "button for each step" could be analogous to powering the
| turing machine, or turning some crank for Babbage's machine, or
| whatever..
| Banana699 wrote:
| This is absolutely relevant, turing's original definition
| described a completely autonomous and automatic machine,
| that's why every turing machine has a halting state that
| locks it in a loop when computation is finished.
|
| >could be analogous to powering the turing machine, or
| turning some crank for Babbage's
|
| Those things are done once for those machines, you press
| power-on or turn a crank for just one time and the machine
| starts, this is not the case here, here the human is acting
| as the control logic for the machine, repeatedly pulsing to
| drive the computation.
|
| Your computer is not a computer without a hardware clock, the
| repeated pressing of a button is acting as a hardware clock
| here.
| floatboth wrote:
| ...or clicking the On-Click Transitions in Microsoft(r)
| PowerPoint(tm)...
| KptMarchewa wrote:
| Not a Turing machine if it's powered by human created
| electricity.
| Banana699 wrote:
| Terrible and fallacious analogy, the repeated press of a
| button doesn't power the search+replace process, it still
| needs electricity to run, the repeated pressing is more like
| a hardware clock that repeatedly pulses to advance the
| computation, which does indeed makes the search+replace
| process alone not turing complete, every turing machine has
| control logic to automate the execution loop.
| dzqhz wrote:
| If you had to press different buttons you would have a point.
| numeri wrote:
| I did essentially this in vim once - although I made it a little
| more "user friendly" with vim macros' ability to both call and
| modify themselves, so it would stop itself if the Turing Machine
| halted.
| midjji wrote:
| Neat, Though it does make me wonder if a python parser wont soon
| become standard in editors. Its just too damned convenient for
| many things, sure you might be able to come up with a regexp
| search and replace that does the same thing, but odds are it will
| take longer than coding a few loops. BASH sucked far too much,
| and C++ and most languages lacked the convenient filesystem libs
| required, but python just works and is easy to read and modify.
| The downsides of python, like the atrocious speed and low
| maintainability dont matter for scripts you will use just once.
| vesinisa wrote:
| That sounds like a felony abuse of Python. Despite, wasn't perl
| originally a tragic attempt at such language? :)
| 30f0fn wrote:
| How about elisp?
| tjoff wrote:
| Python just works... Every now and then I'm still bitten by the
| 2->3 transition. And I'd very much not like it to be a required
| dependency in my systems either.
|
| And no, regexps get a bad rep but they are for the easy 99% and
| insanely quick to come up with. Learn basic syntax and you'll
| be thankful for decades to come.
| kadoban wrote:
| Learn basic syntax and you'll spend the next few decades
| wondering each time _which_ basic syntax is expected because
| none of them ever say.
| metalliqaz wrote:
| I know the basic syntax and I never have that problem.
| Perhaps you mean the advanced syntax? However I don't see
| the problem, there either. I usually don't need it, and to
| be honest, when I do, I find it more maintainable to use
| multiple simpler expressions combined with some
| programming.
| pstoll wrote:
| You must not have suffered enough, I mean used regexs
| widely enough.
|
| He meant "syntax" in the sense that different regex
| engines have different syntax and capabilities - can I do
| a negative look ahead assertion in engines Z, how do I do
| a zero width lookaround in pcre, gnu, python, posix, etc.
|
| Depending how far down the rabbit hole you want to go,
| start here:
|
| https://swtch.com/~rsc/regexp/regexp1.html
| metalliqaz wrote:
| I just don't count those cases as 'basic' syntax.
| kadoban wrote:
| Some by "regex" mean "globs" and you can just use "*" and
| "?" for stand-ins for some number of characters.
|
| Some allow "|", some allow backrefs, some allow "()", or
| require them escaped with \, or allow them but not with *
| after. Some are case insensitive, some not. Some allow
| "{0-5}", some allow "[0-9]", some have handy things like
| "\w".
|
| It's just the guessing game of exactly what they want. It
| should be required that each regex box has an example
| next to it using as many allowed features as possible.
| Semaphor wrote:
| > Some by "regex" mean "globs" and you can just use "*"
| and "?" for stand-ins for some number of characters.
|
| What kind of scary tool is that?
| RHSeeger wrote:
| Glob is used in most shells, plus in several languages
| (Tcl has both glob and regexp matching).
| Semaphor wrote:
| Yeah, but they don't call it "regex" which would be the
| scary part.
| RHSeeger wrote:
| Ah, I interpreted the original statement as
|
| > Some [people] by "regex" mean "globs"
| _jal wrote:
| Look for 'pcre' on the label. Accept no substitutes and
| you'll be fine.
| hexane360 wrote:
| *almost always: https://blog.cloudflare.com/details-of-
| the-cloudflare-outage...
|
| >The Lua WAF uses PCRE internally and it uses
| backtracking for matching and has no mechanism to protect
| against a runaway expression.
|
| https://blog.cloudflare.com/making-the-waf-40-faster/
|
| >Back in July 2019, the WAF transitioned from using a
| regular expression engine based on PCRE to one inspired
| by RE2, which is based around using a deterministic
| finite automaton (DFA) instead of backtracking
| algorithms. This change came as a result of an outage
| where an update added a regular expression which
| backtracked enormously on certain HTTP requests,
| resulting in exponential execution time.
|
| >After the migration was finished, we saw no measurable
| difference in CPU consumption at the edge, but noticed
| execution time outliers in the 95th and 99th percentiles
| decreased, something we expected given RE2's guarantees
| of a linear time execution with the size of the input.
| RHSeeger wrote:
| > Some by "regex" mean "globs" and you can just use "*"
| and "?" for stand-ins for some number of characters.
|
| But that's _not_ regexp, it's glob; a totally different
| pattern matching system. I mean, you can call a duck a
| horse, but that doesn't mean you're right... or that
| there's anything confusing about horses.
| aasasd wrote:
| Quick: which programs use '|' and which use '\|'?
| oldsecondhand wrote:
| You can record and edit beanshell macros in JEdit. It does
| exactly what you described but you also have access to all the
| Java Standard Library (including regex).
| badsectoracula wrote:
| I've needed such functionality often enough that i wrote a
| small utility[0] that uses my LIL scripting language[1]
| (similar to Tcl) to process some text and have its output. A
| couple of examples are in the shots [2][3] (note that as this
| is written in Lazarus it also works under Linux and Mac too,
| though the site only has a Windows executable).
|
| Before that i used to write scripts or even full programs (in
| Free Pascal which has some simple string handling) to do
| similar processing and i did find it much more cumbersome to go
| through that route.
|
| Of course this only works for editing/generated/transforming
| text pieces (and i pretty much always use it via clipboard),
| for processing files i still end up writing full scripts or
| programs (depending on the case, if i need to preprocess stuff
| i use another LIL-based tool, lip[4]).
|
| [0] http://runtimeterror.com/tools/liteproc/
|
| [1] http://runtimeterror.com/tech/lil/
|
| [2] http://runtimeterror.com/tools/liteproc/shot.png
|
| [3] http://runtimeterror.com/tools/liteproc/shot2.png
|
| [4] http://runtimeterror.com/tools/lip/
| throwanem wrote:
| I used to have a Perl script, bound to a key chord, that
| would pop a dialog box prompting for a s/// regexp-replace
| expression and apply it to the clipboard contents.
|
| Never got around to making it properly interactive, because I
| stopped needing it when I switched to Emacs and had both its
| native capabilities and C-u M-| available.
| 29athrowaway wrote:
| How do you sandbox python?
| aasasd wrote:
| If anyone whose life line says 'makes an editor' is reading the
| above: I beg you to use Lua instead of Python. It's _a lot_
| faster, and that matters in productivity apps. Speed can be the
| difference between 'I wrote a script' and 'I could write a
| script but didn 't'.
| vidarh wrote:
| I wouldn't be happy with Python, but I do agree with the
| general premise that extensibility or automation ought to be
| expected of editors.
|
| My own editor is written in Ruby, and so all extension is done
| by loading Ruby code into the running process, and I can drop
| into the Pry debugger with a keypress, or another keypress
| gives me a prompt to enter a single-line expression instead.
| The latter is literally a one-line method. Adding a binding to
| eval() a whole buffer would be equally trivial... Being able to
| extend everything trivially in a language I'm comfortable with
| (so not Emacs lisp) makes such a difference to usability.
|
| Incidentally, the ability to interact with the open buffers
| using a script also from _outside_ the editor is another thing
| I love as an extension mechanism for editors - an idea I first
| saw in FrexxEd (co-written by the founder of Curl) for the
| Amiga, which exposed the open buffers in the filesystem (think
| the Amiga equivalent of a FUSE filesystem), which would have
| the added benefit of not being language specific. It doesn 't
| need to involve any FUSE-like stuff either - just a command
| line utility to "cat" an open buffer and to replace the open
| buffer from stdin would be sufficient.
| virchau13 wrote:
| Out of curiosity, what do you actually use the Ruby
| liveloading feature for? I use Neovim, which has a similar
| (but less powerful) feature that allows one to live-execute
| Lua code (or Vimscript, I suppose). But I almost never use it
| outside of testing code for my configuration, because the
| rest of the editor's feature set suffices fairly well for
| text editing.
| coliveira wrote:
| Vim can run Ruby scripts, so it also provides this feature.
| vidarh wrote:
| I don't very often load code "live" other than on starting
| a client in the form of "macros" which are pretty much
| stuff I don't want to add into the editor core because it
| e.g. might be short lived or is just too esoteric. Even
| though my editor is mostly for myself, imagine anything you
| might put in your .vimrc, except it's only self-discipline
| that separates this from core editor code.
|
| E.g. I have functions in there to insert headers in my
| journal for example.
|
| In terms of executing code "actually" live, it's ~80% for
| debugging when I work on the editor itself. Pry provides
| all of the plumbing so all the code needed to add that is
| just to suspend the editors own input handling and call
| pry, coupled with an exception handler that calls pry as
| well, so there was no reason not to, basically.
|
| But once I'd added it, the 20% left felt worthwhile as a
| means of e.g. do complex searches, or generate tables or
| otherwise do search and replace of content that requires
| more (e.g. parsing timestamps and replacing them with
| another format....) and any number of things I don't do
| very often but that feels very comfortable when you do need
| it. To be clear it's not like it does much you can't do
| easily without it. E.g. after all I could just dump the
| buffer to a file, load it in my repl of choice, manipulate
| it and write it out again and reload it in my editor. It's
| one of those small things that feels unimportant when you
| don't have it there, that doesn't save you a huge amount of
| time, but that just makes things feel nicer when you get
| used to them.
| vermilingua wrote:
| I was under the impression that RegEx was not turing complete. Is
| there something special about N++'s regex engine that allows
| this?
| detaro wrote:
| No, nothing special about N++ (well, lookahead, but many regex
| engines have that). Repeated Search+Replace is the key, and not
| part of regex.
| dspillett wrote:
| I find myself wondering if anyone has done a proper analysis
| to prove that human activity is Turing complete!
| dTal wrote:
| It is so, trivially; humans wrote the find/replace box in
| Notepad++.
| teawrecks wrote:
| I mean, I can write a FSM that outputs the code for
| find/replace in Notepad++.
| assbuttbuttass wrote:
| Turing's original idea of a "computer" was a human
| manipulating symbols on paper, so I would say trivially yes
| tartoran wrote:
| We are guaranteed to halt, e.g lifespan. After death cells
| enter apoptosis stage where cells do final stages upon
| shutting down. I'd say it appears we are Turing complete,
| or at least it appears so from this angle
| skykooler wrote:
| Given that Turing machines cannot be guaranteed to halt
| (and it's trivial to write an infinitely-running Turing
| machine), therefore humans are not, technically, Turing-
| complete.
| chhickman wrote:
| Well, I would say that any physical implementation of a
| Turing machine will eventually halt because you cannot
| guarantee an infinite energy supply. So if we're willing
| to ignore physical limitations then humans can be Turing
| complete - if not, then no machine be either.
| walnutclosefarm wrote:
| Turing completeness is a theoretical construct that
| ignores that errors in execution inevitably occur during
| a sufficiently long program, due to the second law of
| thermodynamics (entropy must increase in a closed
| system). Any realization of a Turing complete engine
| eventually fails. That's different than halting, though,
| because it doesn't answer the question as to whether the
| program of the machine eventually reaches a halt
| instruction, when properly executed.
| [deleted]
| thehappypm wrote:
| How about a human population?
| tartoran wrote:
| Oh, that too though we're looking at it from a different
| angle. Most species so far have gone extinct though we're
| a special kind, we may as well cause that without any
| external factors.
| vidarh wrote:
| We can manually execute the steps of a Turing machine.
| That's all that's needed.
| vermilingua wrote:
| Aaah of course, thanks
| lupire wrote:
| So, Notepad++ basic interface is Turing complete because it's
| a place you can write a Turing machine and execute it.
| lifthrasiir wrote:
| And if you allow multiple pattern-replacement pairs, you
| don't even need regex (the pattern can be a single fixed
| string) as it is enough to simulate semi-Thue systems [1].
|
| [1] https://en.wikipedia.org/wiki/Semi-Thue_system
| [deleted]
| Banana699 wrote:
| Regular Expressions as defined mathematically is strictly less
| powerful than turing machines. (by several levels, they are
| strictly less powerful than general context-free grammars,
| which are strictly less powerful than general context-sensitive
| grammars, which are strictly less powerful than arbitary
| grammars)
|
| The 'regexes' in modern languages and tools, however, are not
| actually regular in the original mathematical sense, they have
| been augmented by several constructs that are not regular. Perl
| is the leader in this field, its latest addition is regexes
| which can reference itself recursively, making (presumably,
| never seen a proof) it at least context-free. Here, the non-
| regular constructs used is capture groups and arbitary forward
| lookahead.
|
| In addition to that, the author is doing something sneaky by
| making the user press a button continuesly to advance the state
| of the turing machine, so it's not actually search+replace that
| is turing complete, it's search+replace+"repeatedly pressing
| replace all". Unlike what some other people say in this thread,
| repeatedly pressing a button to simulate the machine doesn't
| count as 'turing complete' and is not comparable to plugging
| the machine in power.
| notahacker wrote:
| Seems like you could use automated hardware to repeatedly
| press the button to facilitate the computation without any
| additional software or human interaction required (I'm
| reminded of the episode of the Simpsons where Homer tries to
| automate his job with a nodding bird!)
| ptspts wrote:
| It is proven that Turing machines can recognize a wider class
| of languages than Chomsky regular expressions. However, the
| article uses something more powerful than Chomsky regular
| expressions, because they contain backreferences (as \2 and
| \4), and also there is a repeated search-and-replace involved,
| which also adds to their power.
| louhike wrote:
| Do you mean Chomsky hierarchy by Chomsky regex?
| nsajko wrote:
| Regular languages are an element included in Chomsky's
| hierarchy: https://en.wikipedia.org/wiki/Chomsky_hierarchy
| https://en.wikipedia.org/wiki/Formal_language
| [deleted]
| baryphonic wrote:
| It seems Notepad++'s "search + replace all" feature replaces
| text in place inside a search context rather than buffering the
| changes and applying them outside a search context after the
| search is complete. This makes search recursive.
|
| When combined with enhanced regexes (backreferences and
| lookahead), you have the ingredients for Turing completeness.
| tromp wrote:
| Cool demonstration of regular expression power! Note that as
| implemented, this is a space bounded Turing Machine with a
| hardcoded tape length limit:
|
| > With the current implementation, the read/write head looks at
| the 22nd element of the tape whenever we want to read the current
| position. With complicated machines that utilize long lengths of
| the tape, you would need to increase this number so that you
| never delete a necessary tape element as you scroll along it.
| This can be done by increasing the {20} that appears at the
| beginning of the expression.
|
| Embedding the tape head pointer ^ within the tape, just to the
| left of the scanned bit, should remove this restriction.
| NextHendrix wrote:
| So what you're saying is I could use this to parse HTML?
| dom111 wrote:
| If you like this, you might also like Retina[0]!
|
| [0]: https://github.com/m-ender/retina/tree/master/Examples
| armchairhacker wrote:
| Ah yes, regular expressions, my favorite Turing-complete
| language.
| JadeNB wrote:
| I made a harder-to-understand investigation using Perl (non-
| regular, of course) regexps a while back.
|
| https://perlmonks.org/?node_id=809842
| lupire wrote:
| If you like this, you'll love Save Endo, one of the top two most
| fun ICFP Programming Contests ever. (The other is Cult of the
| Bound Variable)
|
| https://save-endo.cs.uu.nl/
| DeathArrow wrote:
| Great, now someone should write a Javascript interpreter for it!
| wjd2030 wrote:
| "Yeah, but your scientists were so preoccupied with whether or
| not they could, they didn't stop to think if they should" - Dr.
| Ian Malcolm.
| wjd2030 wrote:
| jokes, this is neat.
| fouronnes3 wrote:
| Nice one to add to the list of accidentally Turing complete
| systems. What's your favourite :)?
| teawrecks wrote:
| I wouldn't say that it's on accident. The theoretical
| definition of Regular Expressions is specifically not Turing
| complete. But Regex as a tool just isn't as useful in that
| form, so it was deliberately extended to be Turing complete
| with new operators.
| Pompidou wrote:
| A bit off topic but in the french internet there is an add,
| these days, asking people to try the Turing test to find a
| (good) job. I find it very funny. Imagine a job interview where
| you have to be turing-complient to get the job : you play the
| tape and the recruiter act like a scanning device...
| giu wrote:
| Still one of my favorites: On The Turing Completeness of
| PowerPoint, https://www.youtube.com/watch?v=uNjxe8ShM-8
| unixhero wrote:
| So it can play Towers Of Hanoi?
| tromp wrote:
| Certainly; that's a space bounded computation, only needing
| O(n) tape for n disks.
| drexlspivey wrote:
| But will it run Doom?
| harpiaharpyja wrote:
| The instruction sets are the remaining lines, formatted like
| >C.I:WMN C current instruction name. I the input
| from the current tape position. Either 0 or 1. Each instruction
| has execution parameters for both inputs. W the
| output to be written to the tape at the current position. Either
| 0 or 1. M the movement of the read/write head. A 0 moves
| the head one position to the left. A 1 moves it to the
| right. N the name of the next instruction to be executed
| at the new tape position.
|
| Would it have been clearer to use "<" or ">" to specify whether
| to move left/right from the current position?
| 0xTJ wrote:
| As it's structured, that wouldn't work. To move, it replaces
| the leading zero of the tape with: * If the
| movement is "0": "00" * If the movement is "1": ""
|
| The choice of 1 to move right is arbitrary (but makes sense to
| match the 0), it could be any other symbol, but the 0 is needed
| to because that's what's prepended to the tape.
|
| Someone could probably find a way around it, but it would
| likely make it harder to understand.
___________________________________________________________________
(page generated 2022-02-22 23:00 UTC)