[HN Gopher] Wordle and Grep
___________________________________________________________________
Wordle and Grep
Author : tosh
Score : 46 points
Date : 2022-04-16 15:37 UTC (7 hours ago)
(HTM) web link (leancrew.com)
(TXT) w3m dump (leancrew.com)
| qsort wrote:
| What I find interesting about word games is that they are very
| rarely games about words. E.g. Scrabble is adversarial
| optimization, Wordle is constraint satisfaction, etc.
|
| To this day the puzzles I enjoy the most are crosswords and
| rebuses, mostly because I can switch off the part of my brain
| that screams "oh yeah, just grep /usr/dict/words".
| melony wrote:
| I have bad news for you. Here's a potential algorithm: 1) Fine-tune
| a language model to come up with words that fit the crossword
| descriptive sentence. 2) The problem is then reduced to a
| constraint satisfaction problem. Implementation is left as an
| exercise for the reader :)
| dieselerator wrote:
| This is an illustration that grep is a great tool, quick and easy
| to use.
| behnamoh wrote:
| rather egrep, though
| sizzzzlerz wrote:
| Can egrep do lookaheads or behinds? I find it necessary to
| use positive lookaheads to verify the test words include all
| the letters marked present but in the wrong location.
| benji-york wrote:
| Both grep and egrep are the same executable.
| bash-3.2$ diff `which grep` `which egrep`
| bash-3.2$
| Tabular-Iceberg wrote:
| Where does this scrabble5.txt come from? I've been looking at the
| system words file for this type of exploration, but that has an
| awful lot of words that make no sense to put in a word game.
| furyofantares wrote:
| Here's the target list I use in Xordle, inherited from hello-
| wordl:
| https://raw.githubusercontent.com/6zs/xordle/main/src/fives_...
|
| And here's the dictionary:
| https://raw.githubusercontent.com/6zs/xordle/main/src/fives_...
|
| The targets list is the one you want, it's the valid answers.
| The dictionary is the valid guesses.
|
| In the targets list, some words are starred out, just ignore
| them, those are words that got removed. The list is ordered and
| you can pick a spot in the list and ignore everything below it;
| for Xordle (and again inherited from hello-wordl) "mulch" is
| the last word used in the list.
| gabrielsroka wrote:
| There are two arrays in the Wordle JavaScript. One is a list of
| valid answers, the other is a list of other words you can guess
| but aren't valid answers. You can view source and pull them
| both out. My code does this for you and allows you to use
| regular expressions:
| https://news.ycombinator.com/item?id=30653322
|
| There are also lists of Scrabble words; I use one called
| enable.
| bee_rider wrote:
| Is this still the case after the NYT purchase?
| gabrielsroka wrote:
| Yes, my code still works.
| ziml77 wrote:
| I was surprised when NYT's wordle bot linked me to the
| solution list (alphabetically sorted)
| https://static.nytimes.com/newsgraphics/2022/01/25/wordle-
| so...
| farmerstan wrote:
| When you see a combination with a bunch of possible letters, you
| have to try to eliminate the letters instead of going for the
| win. Lots of people keep going for the win and losing when they
| could have eliminated a bunch of possibilities with a completely
| different search.
| compiler-guy wrote:
| Or they are playing in hard mode, which forces this style of
| guessing.
| seoaeu wrote:
| Hard mode doesn't change the optimal strategy much. It just
| makes a small fraction of words "insta-lose" because if you play
| one and get unlucky, you'll have to switch to eliminating one
| letter at a time and probably run out of turns before you've
| covered all possibilities.
| cleansingfire wrote:
| Often it's more helpful to use one guess you know will not
| match, just to eliminate or split numerous cases at once. If
| trapped with _IGHT which could start with many letters, a guess
| like FARM would reduce your guesses in a single move.
|
| This also helps avoid or escape focus on an expected pattern
| that may be wrong (premature convergence.)
| Symbiote wrote:
| Bight, eight, fight, light, might, night, right, sight,
| tight, wight.
|
| By letter frequency I think rents/stern/terns are good
| guesses. I don't think there are any two words that cover 9 of
| the 10 initial letters.
|
| Blent and frents/strew/terms/wefts/wrest cover 8.
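|
| The coverage counts above can be checked with a short Python sketch.
| The function name and the letter set are illustrative assumptions,
| built only from the ten _IGHT words listed in this comment.

```python
# Initial letters of the ten _IGHT words listed above
# (bight, eight, fight, light, might, night, right, sight, tight, wight).
IGHT_INITIALS = set("beflmnrstw")


def covered(guesses, initials=IGHT_INITIALS):
    """Return the candidate initial letters that appear in any of the guesses."""
    letters = set("".join(guesses))
    return initials & letters


# Pair "blent" with each of the second guesses mentioned in the comment.
for second in ("strew", "terms", "wefts", "wrest"):
    pair = ("blent", second)
    print(pair, len(covered(pair)))
```

| This only counts distinct letters. It ignores how Wordle scores
| repeated letters, which matters for T because _IGHT already
| contains one.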
| ghaff wrote:
| It can be hard to realize that's the case though. (_IGHT is
| pretty obvious of course.) And if you can only hit 2 or 3
| letters it can be somewhat of a gamble vs. just going for a
| valid answer. (Of course if there are a ton of options you may
| be in a tough position anyway.)
| ALittleLight wrote:
| Seems like the pattern should be 'l.[^t]..' and then just add in
| a '| grep t'. That method reduces the number of assumptions.
| smoyer wrote:
| Dots allow any character though. I don't have a scrabble5.txt
| so I'm using /usr/share/dict/words and using the dots resulted
| in the capitalized words (proper names) and words with
| apostrophes being included.
| ALittleLight wrote:
| If you use the wrong dictionary you will get the wrong
| results. No use of grep will solve that. You could replace
| the dots with [a-z] but that is unnecessarily verbose if you
| have the right dictionary.
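|
| A minimal Python sketch of the two points above: filter with the
| positional pattern, then require a "t" as a second step, using
| [a-z] in place of the dots so capitalized words and apostrophes
| drop out. The sample word list is a hypothetical stand-in, not the
| post's scrabble5.txt.

```python
import re

# Hypothetical sample standing in for a system words file.
SAMPLE_WORDS = ["light", "lofty", "lusty", "Lotto", "let's",
                "latch", "limit", "lemon", "fight"]

# 'l', any lowercase letter, a lowercase letter other than 't',
# then two more lowercase letters.
PATTERN = re.compile(r"l[a-z][a-su-z][a-z][a-z]")


def candidates(words):
    """Keep words that fully match PATTERN and contain a 't' somewhere."""
    # fullmatch anchors the pattern to the whole word (like grep -x);
    # the membership test plays the role of the extra '| grep t'.
    return [w for w in words if PATTERN.fullmatch(w) and "t" in w]


print(candidates(SAMPLE_WORDS))
```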
| kaashif wrote:
| And if you use a specific enough dictionary for your
| problem, you can even just use cat! Just joking, I get what
| you're saying, the blog post is grepping a particular file,
| it's silly for people to grep a _different_ file and use
| that as evidence your regex is bad.
|
| But the dictionary is still wrong in one sense - there's an
| actual word list Wordle uses. I think from the way the blog
| post is written (e.g. "I refused to believe Wordle would be
| so evil as to..."), they're purposely avoiding looking up
| the actual Wordle word list, so they're purposely using the
| wrong dictionary.
|
| I kind of relate to that - looking at the source code to
| devise strategies kind of feels like cheating.
| malingo wrote:
| > I refused to believe Wordle would be so evil as to have two
| repeated letters.
|
| I remember VIVID which was pretty tough.
| emmelaich wrote:
| Worgle had SYZYGY the other day.
|
| I got it in three; I had remarked to someone the previous day
| what a good word it would be for worgle!
| nunez wrote:
| This is why I like Quordle and Waffle. Both are evil and
| definitely do repeated letters and rare words.
| beardyw wrote:
| Thanks for Quordle. That's another chunk of my life taken up
| every day!
| wpasc wrote:
| I LOVE quordle. found it on an HN post. I enjoy it because it's
| much more (in my opinion) about reducing unknown information
| about all the letters than it is about guessing for each
| possible word.
| ghaff wrote:
| Octordle is definitely like that. I think you want to do a 3
| to 4 word opening sequence of some sort unless you get very
| lucky earlier. That said, going beyond 3 words I found turned
| it into more of an anagram solving game but not really
| improving my typical score. (I find it's very hard to have
| more than one or two guesses left given you pretty much have
| to use 8/13 for the final guesses unless you luck onto an
| answer you weren't shooting for.)
| Symbiote wrote:
| I aim to solve Octordle in 3+8 guesses. I did so yesterday
| (#82), but today's (#83) needed a 50-50 guess which I got
| wrong, so took 12.
|
| If it takes more than 12, it usually means I've been too
| casual.
| runarberg wrote:
| I'm at my worst in Octordle when I get my first word on
| the 3rd guess. Then I feel pressured to finish in 2 + 8
| guesses. However, that usually doesn't work out and I end up
| wasting the 4th guess on suboptimal letter coverage.
| MauranKilom wrote:
| Quordle, for me, quickly turned relatively stale. I'm quite
| convinced it's optimal (in terms of human play) to have a
| fixed set of 4 words to cover 20 letters, at which point you
| have 5 guesses to solve 4 words based on that information.
| Fun for a while, but not very deep.
| furyofantares wrote:
| https://scoredle.com calculates this for you using the actual
| word list. In this case it shows 17 answers:
|
| lifts lints lists lilts lofts louts lowts lunts loots lusty lofty
| lusts linty light limit licht licit
| ghaff wrote:
| That seems to be a list of the valid words to guess, not the
| valid answers though. If I were trying to guess an answer from
| that list, I'd probably go with light followed by limit with
| others less to very much less likely.
| furyofantares wrote:
| Yeah you must be right. Maybe they considered it too spoilery
| to show you the target list.
| sizzzzlerz wrote:
| I wrote a Python script that uses regular expressions to winnow
| out words based on constraints imposed by previous guesses. There
| are 3 parts to a regex. The first uses positive lookaheads to
| verify the test word includes each letter that has been marked as
| correct but in the wrong location. These are followed by negative
| groups with all wrong letters indicated. Interspersed are correct
| letters in the correct position. This filters my word list from
| 5300 words to a handful with each guess. It works pretty well but
| it still requires guessing. I always run it after I've solved the
| word using just my brain. It gives me a good sense as to how good
| or bad my logic was.
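|
| The three-part regex described above might be assembled roughly
| like this. This is a sketch under assumptions, not the commenter's
| actual script: the function name, its arguments, and the sample
| words are hypothetical, and it does not handle guesses where the
| same letter is both gray and green or yellow.

```python
import re


def build_pattern(greens, yellows, grays, length=5):
    """Build a regex from Wordle feedback (positions are 0-based).

    greens:  {position: letter} for letters in the correct position
    yellows: {position: letters} seen at that position but misplaced
    grays:   string of letters known to be absent
    """
    # Part 1: one positive lookahead per misplaced letter, so each
    # must occur somewhere in the word.
    present = sorted({c for letters in yellows.values() for c in letters})
    lookaheads = "".join(f"(?=.*{c})" for c in present)

    # Parts 2 and 3: per position, either the known letter or a
    # negated class of everything ruled out at that position.
    slots = []
    for i in range(length):
        if i in greens:
            slots.append(greens[i])
        else:
            banned = "".join(sorted(set(grays) | set(yellows.get(i, ""))))
            slots.append(f"[^{banned}]" if banned else ".")
    return re.compile(lookaheads + "".join(slots))


# Feedback for the guess "later" when the answer is "light":
# l is green, t is misplaced at position 2, and a, e, r are absent.
pattern = build_pattern(greens={0: "l"}, yellows={2: "t"}, grays="aer")
words = ["light", "lofty", "latch", "limit", "lemon", "lotus", "fight"]
print(pattern.pattern)
print([w for w in words if pattern.fullmatch(w)])
```

| Lookaheads need a PCRE-style engine such as Python's re module.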
| wodenokoto wrote:
| Is there a pattern language that would allow for the search in
| question being done in a single pattern instead of glueing two
| patterns together with an OR-operator?
| TillE wrote:
| It would be fairly straightforward to write a Wordle DSL if you
| wanted to, which would be much faster than any generic pattern
| matcher.
|
| That's the risk of an overly complicated language, you can
| easily wind up slower than a series of regex filters.
| tyingq wrote:
| I don't see an or operator in the post anywhere, so I'm
| confused about which search you mean.
| wodenokoto wrote:
| (l[iouy][^taer]t[^aers])|(l[iouy][^taer][^aers]t)
|
| Is in effect what the author is doing in regex (and how it is
| explained).
| SAI_Peregrinus wrote:
| They used two separate searches for where the T was. Didn't
| bother with an OR in code, just listed the results manually.
___________________________________________________________________
(page generated 2022-04-16 23:01 UTC)