[HN Gopher] ROT8000
___________________________________________________________________
ROT8000
Author : edent
Score : 172 points
Date : 2021-09-22 13:07 UTC (9 hours ago)
(HTM) web link (rot8000.com)
(TXT) w3m dump (rot8000.com)
| rsj_hn wrote:
| This is cool, but I wish more people would use a more
| aesthetically pleasing cipher, like morse code:
|
| https://onlineasciitools.com/convert-ascii-to-morse
| dash2 wrote:
| Who are these nutcases? https://www.master-list2000.com/ and what
| is pl41nt3xt?
| kevinmgranger wrote:
| Where did you find this / how is it relevant? Was this in the
| OP but then removed?
| chris_st wrote:
| > _what is pl41nt3xt?_
|
| Leet-speak for "plaintext".
| genewitch wrote:
| 1337 4 "plaintext" nub
| ditherstudies wrote:
| Hi, I created rot8000 for The Wrong, an online biennial of
| digital art -- specifically for the pl41nt3xt pavilion, which
| included text-only works. The pavilion was taken down when the
| biennial ended, and it looks like that link is no longer valid.
| jstanley wrote:
| Interesting, I have made a similar project, except instead of
| rotN, it encodes the input as UTF-8, and then shifts up the
| codepoints to display each byte as a different character to what
| it would normally be. The invariant is that `byte & 0xff` is the
| real byte value.
|
| I call it "Mojibake Steganography":
| https://incoherency.co.uk/mojibake/
|
| I think in principle (judging by the description of rot8000), my
| tool should be able to decode rot8000 messages natively, but it
| doesn't seem to work on the example given here. From looking
| directly at the codepoints given, I think the example is wrong.
| It starts:
|
| u+7c5d u+7c71 u+7c6e - which works out to "]qn" instead of "The",
| unless I am misunderstanding something. And in fact that looks
| definitely wrong if we're expecting ASCII output because they're
| all more than 127 away from 0x8000, no matter how it works.
|
| The rot8000 page says:
|
| > It also bypasses 32 control characters, technically making it
| rotFFE0, sometimes with an additional offset.
|
| I definitely don't understand how this is meant to work. Why does
| skipping 32 control characters turn it from rot8000 into rotFFE0?
| Should that say 7FE0? I still don't see how ASCII is coming out
| as 7Cxx.
|
| Taking `char - 0x7c09` gets the expected ASCII output.
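
The arithmetic in the comment above can be checked directly. This is a minimal sketch that uses only the three code points quoted there. The helper names are hypothetical, and this is not the rot8000 or Mojibake Steganography implementation.

```python
# The three code points quoted in the comment above.
codepoints = [0x7C5D, 0x7C71, 0x7C6E]


def low_byte_decode(cps):
    """Keep only the low byte of each code point (the `byte & 0xff` invariant)."""
    return "".join(chr(cp & 0xFF) for cp in cps)


def offset_decode(cps, offset=0x7C09):
    """Subtract the fixed offset that the comment found by inspection."""
    return "".join(chr(cp - offset) for cp in cps)


print(low_byte_decode(codepoints))  # ]qn
print(offset_decode(codepoints))    # The
```

The offset 0x7C09 is only known to hold for these three characters; whether it holds across the whole alphabet depends on which code points the real implementation skips.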
| NelsonMinar wrote:
| I like your system!
|
| One nice property of rot13 is that it reverses itself:
| rot13(rot13(X)) = X, at least for the basic ASCII alphabet. Your
| UTF-8 encoding step makes that impossible. I wonder if there's
| a sensible Unicode-friendly algorithm that has that rot13
| property.
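
The self-inverse property mentioned above can be illustrated with Python's built-in `rot_13` text codec. This is a small sketch; the `rot13` wrapper name is only for illustration.

```python
import codecs


def rot13(text):
    """Rotate ASCII letters by 13 places; everything else passes through."""
    return codecs.encode(text, "rot_13")


message = "Hello, World! 123"
scrambled = rot13(message)

print(scrambled)         # Uryyb, Jbeyq! 123
print(rot13(scrambled))  # Hello, World! 123
```

Digits, punctuation, and non-ASCII characters are left untouched, which is the limitation the comment alludes to.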
| tiziano88 wrote:
| Like the one in the original post?
| Tepix wrote:
| Cool! Works even with Emoji.
|
| However the feature of rot13 (and rot8000) that you can use the
| same operation to "decrypt" it again is unfortunately missing
| in your variant.
| robinhouston wrote:
| If you look at the code [1], it skips:
|
| * control characters
| * whitespace
| * surrogate code units (U+D800 - U+DFFF)
|
| 1.
| https://github.com/rottytooth/rot8000/blob/main/Rottytooth.R...
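
One way to picture a self-inverse rotation that skips those three groups is sketched below. This is not the linked code: it assumes Python's `unicodedata` category `Cc` for control characters and `str.isspace()` for whitespace, which may not match the original's definitions, and it drops one code point if the alphabet length is odd so that rotating by half is exactly self-inverse. All names are hypothetical.

```python
import unicodedata


def build_alphabet():
    """BMP code points that are not surrogates, controls, or whitespace."""
    alphabet = [
        cp
        for cp in range(0x10000)
        if not (0xD800 <= cp <= 0xDFFF)
        and unicodedata.category(chr(cp)) != "Cc"
        and not chr(cp).isspace()
    ]
    # Assumption: force an even length so a half rotation is its own inverse.
    if len(alphabet) % 2:
        alphabet.pop()
    return alphabet


ALPHABET = build_alphabet()
INDEX = {cp: i for i, cp in enumerate(ALPHABET)}
HALF = len(ALPHABET) // 2


def rotate(text):
    """Rotate each eligible character half way around the alphabet."""
    out = []
    for ch in text:
        i = INDEX.get(ord(ch))
        if i is None:
            out.append(ch)  # skipped: control, whitespace, or outside the BMP
        else:
            out.append(chr(ALPHABET[(i + HALF) % len(ALPHABET)]))
    return "".join(out)


scrambled = rotate("Hello, world!")
print(rotate(scrambled))  # Hello, world!
```

Because skipped characters never appear in the alphabet, the output of `rotate` cannot land on a control character, a space, or a lone surrogate, and spaces in the input stay where they are.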
| sandwell wrote:
| I wonder if it is possible to generate a ROT8000 quine, that is a
| phrase like "hello world" which yields a semantically matching
| phrase in some other language?
| kderbyma wrote:
| the BabbelROT
| stavros wrote:
| 'ro-,tat is pronounced ROH-tat, by the way.
| azhenley wrote:
| Is there any benefit to making it so that the function must be
| applied X times to restore the original text? E.g., ROT2000.
| Tepix wrote:
| Yes you can use the same operation to encrypt and decrypt it if
| X == 2.
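
The number of applications needed to get back to the original text follows from modular arithmetic. The sketch below assumes a plain rotation over a fixed-size alphabet and ignores rot8000's skipped characters; the function name is hypothetical.

```python
from math import gcd


def applications_to_restore(alphabet_size, shift):
    """Smallest k > 0 such that k * shift is a multiple of alphabet_size."""
    return alphabet_size // gcd(alphabet_size, shift)


BMP_SIZE = 0x10000

print(applications_to_restore(26, 13))            # 2 (rot13)
print(applications_to_restore(BMP_SIZE, 0x8000))  # 2 (a half rotation)
print(applications_to_restore(BMP_SIZE, 0x2000))  # 8 (the "ROT2000" case)
```

Only a shift of exactly half the alphabet gives the "same operation encrypts and decrypts" behaviour.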
| jhvkjhk wrote:
| By extending rot13 to Unicode characters, it supports encrypting
| emoji message automatically!
| Uberphallus wrote:
| No. Emojis go from U+10000 to U+1FFFF, and this rotates chars
| U+0 through U+FFFF (hence the U+8000 middle point).
| aidenn0 wrote:
| Right, you'd need rot 0x88000 to cover all 17 planes. Downside
| would be that the space is not fully packed so you'd get a
| lot of invalid characters.
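
The 0x88000 figure is half of the full Unicode code space. A quick sketch of that arithmetic follows, using a naive rotation that skips nothing; the names are hypothetical.

```python
import unicodedata

PLANES = 17
PLANE_SIZE = 0x10000
CODE_SPACE = PLANES * PLANE_SIZE  # U+0000 through U+10FFFF
HALF = CODE_SPACE // 2

print(hex(CODE_SPACE))  # 0x110000
print(hex(HALF))        # 0x88000


def rot_all_planes(cp):
    """Naive half rotation over every code point, with no skip list."""
    return (cp + HALF) % CODE_SPACE


rotated = rot_all_planes(ord("A"))
print(hex(rotated))  # 0x88041
# Plane 8 is unassigned in current Unicode versions, so this reports "Cn".
print(unicodedata.category(chr(rotated)))
```

This shows the downside mentioned above: ordinary text rotates into planes with no assigned characters.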
| tragomaskhalos wrote:
| Pedantic point: some (early) emoji have codes below this
| cjfd wrote:
| I am just getting boxes with hex codes in them if I type ascii
| letters so that is not so very nice. Even if you have all of the
| required fonts I am not sure it is that great to get characters
| from a completely foreign language. Also, I suppose, one could
| end up with surrogate code points which do not have a character
| representation. To summarize: I think this sounded like more fun
| in theory than it turns out to be in practice.
| BoppreH wrote:
| Nice idea. I often use base64 for this, since it's somewhat
| recognizable and there are tons of decoding tools available.
|
| Base64 does lengthen the text by a third, which may or may not be
| a problem. On the other hand, it doesn't need special handling of
| control characters, and manages to hide word lengths well.
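
The size overhead and the hidden word boundaries can be seen with the standard `base64` module. This is a small sketch with a made-up sample message.

```python
import base64

text = "The butler did it. " * 3
raw = text.encode("utf-8")
encoded = base64.b64encode(raw).decode("ascii")

# Every 3 input bytes become 4 output characters.
print(len(raw), len(encoded))  # 57 76
print(" " in encoded)          # False: word boundaries are not visible
```

For inputs whose length is not a multiple of 3, padding makes the output slightly longer than four thirds of the input.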
| arethuza wrote:
| Many years ago I was involved in finding and fixing a messaging
| bug that only appeared when the base64 encoded payload had a
| length that was a multiple of 87 bytes (it might have been some
| other value - it was 15+ years ago).
|
| Bug was in a C++ base64 encoder component.
| GekkePrutser wrote:
| I wonder how this will play with Unicode's highly complex
| combination rules.. (e.g. frowning face + brown texture = brown
| frowning face).
|
| I bet using ROT on this will lead to unintended consequences
| because the original characters won't combine but the replaced
| ones will.
|
| But ROT is a dumb thing to do anyway, so it doesn't have
| any real-world use.
| kapp_in_life wrote:
| I don't think it would matter, right? The output might have fewer
| characters, but inverting it would still show the original text.
| Like hypothetically
|
| ab => frowning face + brown texture = brown frowning face => ab
| contravariant wrote:
| If this always happens one way sure, but what if you also
| happened to include the symbol that gets translated to "brown
| frowning face"?
| GekkePrutser wrote:
| True, but some apps don't have the ability to show all these
| variations and may leave them out (simplify to just a face
| icon) when copying/pasting. Unicode interpretation is a
| really complex bundle of quirks these days so I'm pretty sure
| things will start going wrong.
| kasitmp wrote:
| Real world use case: geocaching.com uses it to hide hints, so
| you don't read and spoil yourself by accident. It's pretty much
| accepted and adopted by the users. I also would ban words like
| "dumb" or, for another example "easy" in IT and CS contexts.
| kyle-rb wrote:
| In this case, it would be very unlikely to actually happen, for
| a few reasons.
|
| Almost all combining rules (including skin tone modifiers)
| require a zero-width joiner character between the person emoji
| and the modifier emoji. So really it's frowning face + ZWJ +
| brown texture = brown frowning face. (Although technically I
| don't think frowning face can be modified.) Also, there are
| relatively few ZWJ combinations.
|
| Technically, there are some older combination emojis that
| predate ZWJ, mainly the flags, which are composed of two
| single-letter emojis, e.g. regional-indicator-U + regional-
| indicator-S = United States flag. So I guess it might be
| possible to get a couple of those.
|
| And in any case, I think this page assumes that you're staying
| within the bounds of the basic multilingual plane (it mentions
| a self-inverting transform would be ROT32768), which doesn't
| include emojis or skin tone modifiers.
|
| [1] https://emojipedia.org/emoji-zwj-sequence/
| jfk13 wrote:
| > Almost all combining rules (including skin tone modifiers)
| require a zero-width joiner character
|
| No, the skin tone modifiers apply directly to eligible person
| emojis; no ZWJ is involved. (Unless _other_ modifiers that
| require ZWJ are also present, such as the gender signs.)
|
| https://unicode.org/emoji/charts/full-emoji-modifiers.html
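
The three kinds of sequence discussed in this subthread can be compared by listing their code points. The sample emoji below are chosen for illustration and are not taken from the thread.

```python
ZWJ = "\u200d"

samples = {
    # Base emoji followed directly by a skin tone modifier, with no ZWJ.
    "waving hand + skin tone": "\U0001F44B\U0001F3FD",
    # Two emoji joined by a zero-width joiner.
    "woman + ZWJ + laptop": "\U0001F469" + ZWJ + "\U0001F4BB",
    # Two regional indicator symbols, U and S, which render as a flag.
    "regional indicators U + S": "\U0001F1FA\U0001F1F8",
}

for label, seq in samples.items():
    points = " ".join(f"U+{ord(ch):04X}" for ch in seq)
    joiner = "uses ZWJ" if ZWJ in seq else "no ZWJ"
    print(f"{label}: {points} ({joiner})")
```

Apart from the ZWJ itself, every code point in these samples lies above U+FFFF, so a rotation confined to the Basic Multilingual Plane would not produce them.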
| maerF0x0 wrote:
| This makes me wonder how many of the craigslist (or other
| channel) posts all in Asian language characters are actually
| secret messages?
| genewitch wrote:
| There's a couple of browser plugins that do this, with a
| password, so long as someone else knows the password it will
| decode. I'm not near my machine that has it, but I know it does
| Korean, Japanese, and Chinese characters - you choose which set
| you want. And it doesn't back-translate to anything useful,
| it's just encoding.
| ajanuary wrote:
| Bar bs gur avpr dhnyvgvrf bs ebg13 vf gung vf fgvyy cerfreirf
| fbzr fgehpgher. Nf jryy nf n pregnva nrfgurgvp nccrny, pbzzba
| jbeqf va pbzzhavgvrf gung hfr vg urnivyl orpbzr erpbtavfnoyr.
| Juvyr gung qbrf fyvtugyl qvzvavfu vg'f hfr nf n fcbvyre-grkg
| zrpunavfz, vg qbrf nqq gb gur phygher.
|
| Lei Shen Zi Lai Dang Dang Dang Du Shen Zhe Zi Zhuo Luo Shen Zi
| Zhuo Luo Lei Zhuo Duan Zhe Si Du Nu Lei Luo Xian Luo Lei Cun
| Luo Xian Fan Luo Xian Xian Xian Zi Lei Ni Li Zi Ni Lei Luo
| Hata Mi Ni Xian Zi Xian Nu Duan Li Luo Xian Pai Yan Ying Zhuo
| Xu Xian Shen Duan Di Luo Xian Xu Zi Cun Xu Xian Ni Duan Fan
| Fan Kume Shen Shen Lei Luo Si Luo Zhe Xian Luo Du Duan Zhe Si
| Fan Luo Xian Xian Zi Duan Zhe Zi Duan Fan Xu Xian Xu Zhe Yue
| Duan Xian Duan Xian Nu Shen Xu Fan Luo Lei Lu Zi Luo Qian Zi
| Shen Luo Li Zhuo Duan Zhe Xu Xian Shen Yan Mi Ni Zi [?] Shen
| Lei Di Xu Zhe Yue [?] Xu Zi Zhuo Zhe Shen Zhe Lu Fan Duan Zi Xu
| Zhe Xian Li Lei Xu Nu Zi Xian Lian Xu Xian Lian Duan Nu Nu Luo
| Duan Fan Xu Zhe Yue Yan Zhou Zi Zhuo Xu Zhe Di Zi Zhuo Luo Lei
| Luo Xu Xian Duan Ni Xian Luo Li Duan Xian Luo Ying Shen Lei
| Shen Zhe Fan Kume Lei Shen Zi Duan Zi Xu Zhe Yue Li Zhuo Duan
| Lei Duan Li Zi Luo Lei Xian Xu Zhe Zi Zhuo Luo Fan Luo Zi Zi
| Luo Lei Li Duan Zi Luo Yue Shen Lei Kume Yan
| amptorn wrote:
| This is a very bad idea because it's going to rotate ordinary
| characters to code points where Unicode normalization has an
| effect, including combining characters, whitespace, control
| characters... After normalization, rotating back will produce
| garbage.
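
The normalization hazard can be reproduced with a deliberately naive rotation. The sketch below flips the top bit of each BMP code point, which is the same as adding 0x8000 modulo 0x10000. It skips nothing, so it is not the actual rot8000 algorithm, and the chosen input character is only an example.

```python
import unicodedata


def naive_rot8000(text):
    """Rotate BMP code points by 0x8000 with no skip list (illustration only)."""
    return "".join(
        chr(ord(ch) ^ 0x8000) if ord(ch) <= 0xFFFF else ch for ch in text
    )


original = "\ua126"                   # a Yi syllable
rotated = naive_rot8000(original)     # lands on U+2126 OHM SIGN
normalized = unicodedata.normalize("NFC", rotated)  # becomes U+03A9
restored = naive_rot8000(normalized)

print(f"U+{ord(rotated):04X}")     # U+2126
print(f"U+{ord(normalized):04X}")  # U+03A9
print(restored == original)        # False
```

Any system that normalizes text between encoding and decoding, such as some form handlers or databases, could trigger this kind of mismatch.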
| Spooky23 wrote:
| Depends on what you're trying to do. Might be a viable strategy
| for avoiding filters that are aware of things like base64.
| GekkePrutser wrote:
| Oops I was just writing the same, I didn't realise someone had
| already mentioned this.
|
| But anyway ROT in itself is a pretty stupid idea anyway,
| usually just done for show.
| wutbrodo wrote:
| > But anyway ROT in itself is a pretty stupid idea anyway,
| usually just done for show.
|
| How do you figure? It feels like the simplest way to handle
| eg spoilers in a universally portable and widely-recognizable
| way.
| woodruffw wrote:
| The website explains the primary actual use case for ROT-
| style transforms:
|
| > It is used to enclose the text in a sealed wrapper that the
| reader must choose to open - e.g. for posting things that
| might offend some readers, or spoilers.
|
| AFAIK, this has been a common use of ROT13 since the 1980s.
| It also preserves substring search and message length (unlike
| BaseN encodings), which are occasionally useful properties.
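
The two properties named above, preserved length and preserved substring search, can be compared against base64 in a few lines. The sample message and search term are made up for illustration.

```python
import base64
import codecs

message = "the butler did it"
needle = "butler"

rot_message = codecs.encode(message, "rot_13")
rot_needle = codecs.encode(needle, "rot_13")

b64_message = base64.b64encode(message.encode("ascii")).decode("ascii")
b64_needle = base64.b64encode(needle.encode("ascii")).decode("ascii")

print(len(message), len(rot_message), len(b64_message))  # 17 17 24
print(rot_needle in rot_message)  # True
print(b64_needle in b64_message)  # False
```

With base64, whether an encoded substring appears in the encoded message depends on byte alignment; in this example it does not.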
| hermitdev wrote:
| Yeah, in other words: it's not intended to hold up to
| scrutiny, just hold up to a glance.
| Phrodo_00 wrote:
| I don't get this line of thinking. Nowhere does it say
| it's supposed to have any security uses.
| shakna wrote:
| > including combining characters, whitespace, control
| characters...
|
| It actually skips whitespace, control characters and surrogate
| pairs [0].
|
| [0]
| https://github.com/rottytooth/rot8000/blob/main/Rottytooth.R...
| SommaRaikkonen wrote:
| For those who want to test it out: http://rot8000.com/Index
| Gormisdomai wrote:
| I'm curious why it successfully rotates some emoji but not
| others.
|
| E.g. stars and hearts get rotated but sunglasses do not
|
| (EDIT: rewrote my example to use words because HN doesn't render
| emoji, duh)
| OskarS wrote:
| From the article:
|
| > While rot13 is the self-inverse for a 26-character system,
| and rot47 for ANSI, the Basic Multilingual Plane of Unicode
| requires rot32768 (or 8000 in hex) for a reciprical cypher
|
| Not all emoji are in the BMP; at least some are in the
| Supplementary Multilingual Plane.
|
| It's weird to me that if you're gonna do this dumb "rot13 but
| for Unicode", you'd only do it for the BMP, and not ALL of
| Unicode.
| jsjohnst wrote:
| Technical answer:
|
| Star = U+2B50 which is less than U+FFFF
|
| Sunglasses = U+1F576 which is greater than U+FFFF
|
| The detail you might be missing is that some emoji existed in
| Unicode before color graphic "emoji" was actually a thing. The
| stars (and hearts) are examples of ones which used to be just a
| basic shape in the font but now are commonly full color
| graphical "images".
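
The distinction between the two code points above comes down to a single comparison. A short sketch, with a hypothetical helper name:

```python
def in_bmp(ch):
    """True if the character lies in the Basic Multilingual Plane."""
    return ord(ch) <= 0xFFFF


star = "\u2b50"            # U+2B50
sunglasses = "\U0001f576"  # U+1F576

print(in_bmp(star))        # True
print(in_bmp(sunglasses))  # False

# Outside the BMP, UTF-16 needs a surrogate pair: two 16-bit units, 4 bytes.
print(len(star.encode("utf-16-be")), len(sunglasses.encode("utf-16-be")))  # 2 4
```

A rotation that only handles single 16-bit units, as described on the rot8000 page, would leave the second kind of character alone.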
| roamerz wrote:
| I saw this and thought yay a new version of Rise of the Triad!
___________________________________________________________________
(page generated 2021-09-22 23:01 UTC)