Welcome...my...son...⢀⡤⠖⠛⠉⠛⠒⢦⣀⡤⠶⠶⠒⠓⠛⠛⠛⠓⠛⠒⠶⢤⣄⣀...⢀⢀⣀⣀...n,...welcome...t
o...the...machine...⢀⠏....⣠⠔⠋⠁..............⠉⠓⣦⠞⠉..⠈⠉⠓⢤⡀...e....Where..
.have...you...bee...⢼...⣠⠞⠁..................⢰⠇..⣠⠤⠖⠦⣄.⢹...en?...It's..
.alright.. ⠁...⠘⠂.⡇...know...wher
e...you've Ph1o6 3ntry 180509_1036 ⡀.....⣠⠃.......Welcome
...my...so ⠙⠦⣄⣀⣤⡞⠁...lcome...to..
.the...machine.....⢀⡤⠟⠚⠉⠉.......⠈⠙⠒⠷⣄⣀......⠠⡇........⣇......Where...ha
ve...you...bee...⣠⠖⠉.................⠉⠙⠢⣄⡀..⡼⠁........⢿...en?...It's...
alrig ⣹......know...whe
#code
As you know, I'm often biased in favor of awk, so I just had to rewrite
mediawiki mwimport from perl. And here are the results.
$ time ./mwimport.pl import.xml >/dev/null
./mwimport.pl import.xml > /dev/null 3.34s user 0.01s system 99% cpu 3.354 total
$ time ./mwimport.awk import.xml >/dev/null
./mwimport.awk import.xml > /dev/null 13.71s user 2.64s system 99% cpu 16.364 total
$ time ./mwrewrite.awk import.xml >/dev/null
./mwrewrite.awk import.xml 4.98s user 0.25s system 99% cpu 5.227 total
mwimport.awk was straightforward to write: it's the usual pattern-matching
style of awk, as in md2html-style converters. It's easy to read and
extend. But, somewhat as expected, it's dead slow (16.4 s vs 3.4 s for perl).
OTOH the rewrite didn't give the desired improvement either (5.2 s), so you
have to live with it until I find something. Today awk sucks :c
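For readers who haven't seen the pattern-matching style, here is a minimal sketch of the idea. It is not the actual mwimport.awk. The sample input is a made-up miniature with one tag per line, and the variable names (sample, title, text, out) are assumptions for illustration only. Real mediawiki dumps have many more fields and multi-line text bodies.

```sh
#!/bin/sh
# Hypothetical miniature of a mediawiki export, one tag per line.
sample='<page>
<title>Awk</title>
<text>pattern { action }</text>
</page>
<page>
<title>Perl</title>
<text>regex soup</text>
</page>'

# One rule per tag: awk tests every input line against each pattern in order.
out=$(printf '%s\n' "$sample" | awk '
/<page>/   { title = ""; text = "" }              # new page: reset state
/<title>/  { gsub(/<\/?title>/, ""); title = $0 } # strip tags, keep the title
/<text>/   { gsub(/<\/?text>/, "");  text = $0 }  # same for the body
/<\/page>/ { printf "%s\t%s\n", title, text }     # page complete: emit one row
')
printf '%s\n' "$out"
```

Each rule is tiny and independent, which is why this style is easy to read and extend. The price is that every line gets tested against every pattern, which is one plausible reason for it being slow on a big dump.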
WARNING: the awk scripts are broken and won't convert mediawiki XML correctly!
UPDATE: I finally found wtf after hours of polishing out all the bugs:
there was a critical typo in flush(), which should be:
dump_text = dump_page = dump_rev = ""
So now it's 1.1 seconds vs 3.5 with perl. Quite a canonical ratio.
UPDATE2: the pattern-matching styled example is now 1.5 seconds,
which is still about 2 times better than perl.
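To show why that one line in flush() matters, here is a small sketch of a buffer-and-flush loop. Only flush() and the three dump_* names come from the scripts. The input, the flush-every-two-lines rule and the out variable are assumptions, and this does not claim to reproduce the original typo, which isn't shown here.

```sh
#!/bin/sh
# Hypothetical input: four records, flushed in batches of two.
out=$(printf '%s\n' a b c d | awk '
function flush() {
    printf "%s", dump_page dump_rev dump_text
    # Chained assignment is right-associative, so all three buffers become "".
    dump_text = dump_page = dump_rev = ""
}
{
    # Accumulate output in string buffers instead of printing per line.
    dump_page = dump_page "page " $0 "\n"
    dump_rev  = dump_rev  "rev "  $0 "\n"
    dump_text = dump_text "text " $0 "\n"
    if (NR % 2 == 0) flush()
}
END { flush() }   # emit whatever is left (nothing in this sample)
')
printf '%s\n' "$out"
```

If any of the three buffers is left out of the reset, it keeps growing and gets printed again on every later flush. That would mean duplicated output and a lot of wasted string copying.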
Original perl mwimport: [1]
Awk rewrite (pattern-matching style): [2]
Rewrite of awk rewrite: [3]
lcome...my......⢳⡀..⣀⣀⣄⣄⣀.....................⡏.......⡗....son,...welco
me...to...the....⠉⠹⣍⢹⣠⠧⣼⣿⣿⡷⠤⣄⣀........⢀⣀⣤⣤⠦⠄.⢀⡇......⢰⠇......machine...
Links:
(HTM) [1] https://meta.wikimedia.org/wiki/Data_dumps/mwimport
(TXT) [2] mwimport.awk
(TXT) [3] mwrewrite.awk
.Where...have......⠈⠉⠁⣰⣿⣿⣿⠁..⠉⠉⠙⠑⠛⡟⢹⠛⠉⠉⢀⡞⠁..⢠⠞......⢠⠿⣄....you...been?.
..It's...alright......⢒⠹⣿⣿⡄.......⠉⠉.⢀⡴⠋..⢀⡴⠉......⠐⠋.⠘⢦⡀....we...know.
elcome...my...son,.....⡸⠓⠦⣄⡀........⢀⣠⠴⠚⠁................⠙⢆.....welcome
...to...the...mach...⢀⡞⠁...⠈⠉⠛⠒⠒⠒⠒⠛⠉⠉.....................⠈⢳⡀...hine...
...Welcome....⡾...⡼⠁......................................⠹⡆......⢧....
..my...son...⡼⠁..⣼⠁........................................⣷......⠸⡆...
n,...welc...⢰⠃..⣰⠃....................................⠐....⢽.......⣗...
come...t...⢀⡏...⡾..........................................⣾.......⣽...
Post categories:
(DIR) #code
to...the...⢰⠇...⢿.........................................⢀⡏.......⡾...
e...mach...⠸⡅...⠘⣆.......................................⢀⣼⣁⣀⡀....⢀⡏...
Where......⢀⠊..⠐.⠁⢦⠈⠳⣄⡀............................⠰⠨⠈.⣾⠁.......⢀⡿⣁....
h...⡀⢀.⡀⢀⠠⠁.......⠑⡄.⠙⠦⣄..........................⠔..⠠⣏.......⢀⡼⡁.⢁...h
a.⠠.⠅..⠂⠁. .⡁....⠙⠓⠶⢤⢤⠤⠶⠋.⠂.⢒...a
v.⢐.⢀..... 2017/2065 // ulcer@sdf.org .⠕............⢄⠈.⡀⠢...
v..⡄⠈..... .⡂⠂.........⠌⠠⠈⢀⠈⢀.⠐⠄⡀
v..⡂⠡.⠂...............⠒⠘⢄.⣀⡤⠏.....................⡡.......⠄.⡐⡀⠂⠔.⠄⢁.⡀⠂⠉
v..⠂⡈.⠂⠈⠠⠠.............⠡.⠫⠁......................⠨⢂.......⠢⠁⡀⠁⡁.⠂⡐.⠄⡈⢀.
v.⢀⠁⡈.⠂⡁⠡⠐.⠂⠔.⡀.⠠.⠂⠠...⡀⠂⠅⡁⣄.....................⠨⠈....⡀⢀⠐.⢁⠂⠄⠔⡀⠡⠠.⠂⠄⠁⢈
v.⠆⠔⠠⠨⠐.⡁⢀⠢.⠢⠂⠈⡀⠄⠄⠂⡀⡈.⡐⢀.⠔⢀⠑⠆....................⣋.⠄⠐⢀..⠈⡐⡐⠈⡀⠈⠄⠂⡁⠠⠁⣈.⠂⠁
v.⢅⢈.⠤.⠄⠪⢀.⡈⡀⠊⡀⠄⡁⠂⡂⠤⠁⠄⠴.⠆⠬.⢍⠨...................⠸⠠.⢄⠈.⠠⠐.⡀⠖.⠪⠐⢀⢁⠔.⠁...v
e...⠁.⠂⠂⠂⠊⠄⡂⠴.⡁⠄⡁⠒⠰.⢂⠢⠁⠥⢐⢀⢂⠂⡪⢁⠤⠒⠒⠒⠉⠙⠘⠈⠉⠑⠑⠒⠒⠒⠐⠤⠤⢄⣐⠡⠐⡐.⠢.⢁⠢⠐⢈⠐⠈⣐⠐...e...y
ou...been?...⠈.⠂⠪⠠⠡⠰⠈⠌⠄⡒⡐⠐⠔⠠⣡⠕⠁...................⠉⠌⠔⡀⡒⢐.⢒⠨.⡂⠊...?...It