[HN Gopher] We analyzed 425k favicons
___________________________________________________________________
We analyzed 425k favicons
Author : gurgeous
Score : 515 points
Date : 2021-10-20 17:29 UTC (1 days ago)
(HTM) web link (iconmap.io)
(TXT) w3m dump (iconmap.io)
| paulirish wrote:
| Aside: This article is a decent use case for the esoteric
| `image-rendering: pixelated;` CSS property.
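|
| For context, `pixelated` asks the browser to scale with
| nearest-neighbor (or something close to it) instead of smoothing.
| A rough sketch of that scaling in Python; `upscale_nearest` is a
| made-up helper and the tiny grid stands in for a favicon:
|
```python
def upscale_nearest(pixels, factor):
    """Scale a 2D grid of pixel values by an integer factor.

    Every source pixel becomes a factor x factor block, so edges stay
    hard instead of being blurred by interpolation.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    scaled = []
    for row in pixels:
        # Repeat each pixel horizontally, then repeat the row vertically.
        wide_row = [value for value in row for _ in range(factor)]
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled


if __name__ == "__main__":
    icon = [[0, 1],
            [1, 0]]
    for row in upscale_nearest(icon, 2):
        print(row)
```
|
| A real browser works on decoded bitmaps and non-integer factors,
| so treat this only as an illustration of the idea.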
| nkriege wrote:
| Great tip. I've never come across this before. I updated the
| post and the scaled up icons look much sharper now.
| dmitrygr wrote:
| I used it to make this PWA work well on iPhones:
| http://dmitry.gr/89
| mod wrote:
| I loaded this up on a surface tablet--it renders larger than
| my viewport, but with no scrollbar.
|
| I was able to zoom out and see everything, but some people
| don't know (or wouldn't think of) that trick.
| dmitrygr wrote:
| Designed for personal use as a PWA specifically on my
| iPhone. I migrated from Android where I had a TI-89
| emulator app. No such thing exists for iOS. Usability by
| others was never a requirement :)
| lostgame wrote:
| Ha - that's a fantastically nerdy little project. I love it!
| philistine wrote:
| My website, gameboyessentials.com, would not exist without this
| esoteric CSS property. I wanted to show Game Boy images in
| their exact resolution (160 by 144). With image-rendering:
| pixelated; I have crisp pictures on my site whose sizes are
| counted in bytes.
| 1cvmask wrote:
| The favicon visualization brought memories of the million dollar
| homepage. I suppose it was precursor of NFTs.
|
| https://en.wikipedia.org/wiki/The_Million_Dollar_Homepage
|
| http://www.milliondollarhomepage.com/
| mrkramer wrote:
| >The favicon visualization brought memories of the million
| dollar homepage. I suppose it was precursor of NFTs.
|
| It was not; NFTs are digital certificates saying that you own
| certain digital content. The Million Dollar Homepage, on the other
| hand, was basically selling ad space on the website.
|
| You can argue you could buy part of the website(digital space)
| and therefore you own the part of the website but in reality
| you were renting it as an ad space meant to promote your
| website(link).
|
| Purpose and vision of The Million Dollar Homepage and NFTs are
| completely different but I can see similarities between quasi
| owning digital space(part of website) and owning digital
| content or digital certificate(digital token).
| alextheparrot wrote:
| Is it really necessary that we assume a precursor must be a
| strict equality in all dimensions aside time?
| mrkramer wrote:
| No, but for all the aforementioned reasons they are
| of minimal similarity.
| philshem wrote:
| Less analysis, but a couple years ago I posted a script to
| download and then generate mosaics from favicons:
| https://smalldata.dev/posts/favicon-mosaic/
|
| example image: https://smalldata.dev/images/mosaic.jpeg
|
| script to get the favicons:
| https://gist.github.com/philshem/e59388197fd9ddb7dcdb8098f9f...
| Groxx wrote:
| Off in one of the more esoteric corners of favicons, you have
| games played within the favicon:
| https://www.youtube.com/watch?v=fpjM5myls7I
|
| Sadly it doesn't quite work for me any more, but the youtube
| video does a decent job showing what it looked like when it
| worked.
| tinco wrote:
| Not really relevant, but using Go to fetch the data, and then
| Ruby to process the data is the best. I used this exact set up
| for a project and it was amazing. Really the sweet spot of use
| cases for both languages.
| tweakimp wrote:
| Can you please explain why they are the best languages for
| these jobs?
| tinco wrote:
| Go's got an awesome feature set built in to the language for
| building small networked services. I implemented a client to
| a cryptocurrency network to extract information about its
| status and clients. I can't really express why it's so good,
| it just feels right.
|
| Same for Ruby, the syntax is perfectly suited for
| transforming, digging through and acting upon data. I didn't
| even add a Gemfile, only used standard library functions,
| transforming the data the Go program mined into usable
| information serialized in JSON which was subsequently used as
| a static database for a webpage.
|
| You can find the source here:
| https://github.com/tinco/stellar-core-go, the Go is in cmd
| and the Ruby is in tools.
|
| The site it powers is now defunct, apparently they changed
| some stuff in the past 3 years and the crawler no longer
| functions.
| whalesalad wrote:
| I have always wanted to do this _exact_ analysis - so awesome!
| Every time I am building some kind of semi-intelligent parser to
| fetch an arbitrary visual icon for a URL I think to myself there
| has gotta be a better way to do this.
| account42 wrote:
| One weird behavior with favicons that I noticed is that Firefox
| will download both the 16x16 icon that matches the size it's
| displayed at (on a 1x pixel ratio screen) as well as the largest
| icon and then will display whichever finished last. This behavior
| makes no sense to me.
| Quai wrote:
| I worked on Opera Link, the first built-in synchronization
| between different installations of the Opera browser, both
| desktop, Opera Mini and Opera Mobile (+ a web view).
|
| Favicons got included in the data from day one, and it was
| awesome to get the look and feel of your bookmark bar/UI with the
| correct icons right away.
|
| Back then we stored the bookmarks in a home-grown XML data store
| (built on top of mysql, acting more or less as a key-value
| store). This worked quite nicely, and it allowed us to easily scale
| the system.
|
| One night the databases and backends handling the client requests
| suddenly started eating a lot more memory, and the database
| started using much more storage than normal.
|
| As one of only two backend devops working on Opera Link, I had to
| debug this, and find out what was going on. After a while I
| isolated the problem to a handful of users. But how could a few
| users affect the system so much?
|
| As a part of the XML data store, we decided naively to store the
| favicons in the XML, as a base64 encoded string. While not
| pretty, a 16x16 PNG is not that much data, and even with thousands
| of bookmarks, the total overhead on compression and parsing was
| negligible. What we did not foresee was what I uncovered that
| night: a semi-popular porn site had changed something on their
| server. They had started serving the images while also pointing
| browsers to the same images as the favicon! Each image being
| multiple megabytes, sent from the client, parsed on the backend,
| decoded, verified, encoded back to base64, added to the XML DOM,
| serialized, compressed and pushed back to the database...
|
| Before going to bed that night, I had implemented a blacklist of
| domains we would not accept favicons for, cleaned up the
| "affected" user data, and washed my eyes with soap.
|
| I miss those days!
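|
| A sketch of the kind of size check that would reject such uploads
| before they reach the data store. It is an illustration only, not
| Opera Link's code: `accept_favicon`, `MAX_FAVICON_BYTES` and
| `MAX_SIDE` are made-up names with arbitrary limits, and it only
| understands PNG.
|
```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_FAVICON_BYTES = 16 * 1024  # hypothetical cap on the uploaded file
MAX_SIDE = 64                  # hypothetical cap on width/height in pixels


def accept_favicon(data: bytes) -> bool:
    """Return True only for small PNGs with small declared dimensions."""
    if len(data) > MAX_FAVICON_BYTES:
        return False
    # 8-byte signature + 4-byte length + 4-byte type + 13-byte IHDR + 4-byte CRC
    if len(data) < 33 or not data.startswith(PNG_SIGNATURE):
        return False
    # The first chunk of a standard PNG must be IHDR.
    if data[12:16] != b"IHDR":
        return False
    # IHDR starts with big-endian 32-bit width and height.
    width, height = struct.unpack(">II", data[16:24])
    return 0 < width <= MAX_SIDE and 0 < height <= MAX_SIDE
```
|
| Checking the byte count first means an oversized upload is
| rejected before anything is decoded or re-encoded.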
| eezurr wrote:
| I have fond memories of using Opera <= 12. You guys were in
| space compared to other browsers at the time.
| thrdbndndn wrote:
| Wait, so you can see user's data directly?
| lmm wrote:
| GP is talking about something implemented in 2008. It was a
| different time and a different mentality.
| Brybry wrote:
| Google Chrome Sync help docs imply it stores data on servers
| unencrypted by default.[1]
|
| Firefox Sync seems to have sane/encrypted defaults.[2]
|
| [1] https://support.google.com/chrome/answer/165139
|
| [2] https://hacks.mozilla.org/2018/11/firefox-sync-privacy/
| Quai wrote:
| The truth is that most services will have a set of devops
| with access to personal information. And sometimes, we need
| to look at private data to solve issues like this. My first
| instinct back then was that some smart hacker had created a
| FUSE support for Link or something similar.
|
| Opera Link did not encrypt bookmarks and speeddials etc, but
| had datatypes encrypted with master password, even while
| syncing. We were two people with the access and knowledge to
| access individual user information, and we took it very
| seriously.
| munk-a wrote:
| Didn't they miss all the pre-sized icons in their scan as well?
| For a while Apple encouraged multiple resolution sizes for
| favicons for... reasons.
|
| I know they additionally missed the directory specific favicons
| which have always had iffy support (i.e. /index.html =>
| /favicon.ico and /munks-page/index.html => /munks-
| page/favicon.ico)
| arantius wrote:
| I did something similar in 2008:
| https://tech.arantius.com/favicon-survey
| achillean wrote:
| Nmap generated a similar version many years ago and it's still
| available at:
|
| https://nmap.org/favicon/
|
| We also did something similar that looks at favicons by IP:
|
| https://faviconmap.shodan.io/
| Lorin wrote:
| This reminds me of the time I reported to CIRA (Canadian domain
| registry) that their favicon was ~2mb /w bad caching rules and
| was causing issues in ... many situations.
| [deleted]
| arp242 wrote:
| I got mine down to 160 bytes with some pixel tweaking and
| converting it to a 16-color indexed PNG. It's not a lot of work
| or very difficult (I'm an idiot at graphics editing), but you do
| need to spend the (small amount of) effort. I embed it as a data
| URI and it's just four lines of (col-80 wrapped) base64 text,
| which seems reasonable to me.
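|
| A minimal sketch of the data URI embedding, assuming the icon is
| a PNG; `favicon_data_uri` is a made-up helper and the zero bytes
| are a placeholder for a real 160-byte file:
|
```python
import base64


def favicon_data_uri(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a data: URI usable in <link rel="icon">."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return "data:image/png;base64," + encoded


if __name__ == "__main__":
    fake_icon = bytes(160)  # placeholder, not a valid PNG
    uri = favicon_data_uri(fake_icon)
    # base64 turns every 3 input bytes into 4 output characters
    print(len(uri))
```
|
| The 4/3 growth from base64 is why this only makes sense for very
| small files.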
|
| Haven't managed to get my headshot down to less than 10k without
| looking horrible no matter how much I tweaked the JPEG or WebP
| settings, and thought that was just a tad too big to embed. Maybe
| I need to find a different picture that compresses better.
|
| I got that 280k Discord favicon down to just 24K simply by
| opening it in GIMP and saving it again. I got it down to 12K by
| indexing it to 255 colours rather than using RGB (I can't tell
| the difference even at full size). You can probably make it even
| smaller if you tried, but that's diminishing returns. Still, I
| bet with 5 more minutes you can get it to ~5k or so.
|
| It's very easy; you just need to care. Does it matter? Well, when
| I used Slack I regularly spent a minute waiting for them to push
| their >10M updates, so I'd say that 250k here and 250k there etc.
| adds up and matters, giving real actual improvements to your
| customers.
|
| The Event Horizon Telescope having a huge favicon I can
| understand; probably just some astronomer who uploaded it in
| WordPress or something. Arguably a fault of the software for not
| dealing with that more sensibly, but these sort of oversights
| happen. A tech company making custom software for a living is
| quite frankly just embarrassing to the entire industry. It's a
| big fat "fuck you" to anyone from less developed areas with less-
| than-ideal internet connections.
| TheJoeMan wrote:
| " I got that 280k Discord favicon down to just 24K simply by
| opening it in GIMP and saving it again. "
|
| You made me laugh out loud.
|
| I agree that stuff like YouTube.com saying 144x but really 145x
| seems like it should be embarrassing.
| arp242 wrote:
| I wouldn't be surprised if that was for a specific reason,
| like somehow showing up better somewhere for some reason, or
| something like that. Or maybe not; who knows...
| fbrchps wrote:
| Oh hey, Discord must have seen this article -- their favicon is
| down to 14k now.
| ehsankia wrote:
| It's not, at least for me. If you checked in devtools, that's
| gzip over the wire size. Hover over the size and it'll show
| you the actual resource size, still 285k for me.
| dmurray wrote:
| The gzipped size is probably the correct metric to care
| about, right? Virtually all browsers will support that.
|
| Sure, Discord could do a bit better, but it's not correct
| to knock them here for costing their users 285KB.
| MrBoomixer wrote:
| This is bad math, not researched heavily but in 2020
| discord had 300 million users. 285kb goes a long way with
| wasted energy and bits flowing through the pipes. I agree
| generally with what you're saying, though: gzipped sizes are
| what's being sent, at the cost of some CPU usage somewhere to
| unzip. Less bytes == less waste?
| lmm wrote:
| PNG basically includes gzip in the file format, so you're
| not reducing the amount of CPU used, you're just moving
| where it happens.
| giantrobot wrote:
| Includes but doesn't always use. PNG also includes
| filters which can dramatically decrease sizes, especially
| when combined with compression.
|
| That's why tools like OptiPng basically brute force all
| the combination of options. Depending on the image
| content different combinations of filters and compression
| will get the best file size.
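|
| To illustrate the filter point, here is a small sketch that
| applies PNG's Sub filter (type 1) to synthetic grayscale rows and
| compares the deflate output with and without it. The image data
| is made up (a slowly drifting gradient from a seeded RNG), and
| this is not how OptiPNG itself is implemented.
|
```python
import random
import zlib


def sub_filter(row: bytes, bpp: int = 1) -> bytes:
    """PNG filter type 1 (Sub): each byte minus the byte bpp to its left."""
    out = bytearray(len(row))
    for i, value in enumerate(row):
        left = row[i - bpp] if i >= bpp else 0
        out[i] = (value - left) % 256
    return bytes(out)


def make_rows(width: int = 256, height: int = 64, seed: int = 1):
    """Build grayscale rows that drift slowly, like a noisy gradient."""
    rng = random.Random(seed)
    rows = []
    value = 128
    for _ in range(height):
        row = bytearray()
        for _ in range(width):
            value = (value + rng.choice((-1, 0, 1))) % 256
            row.append(value)
        rows.append(bytes(row))
    return rows


rows = make_rows()
# Each PNG scanline is prefixed with its filter type byte: 0 = None, 1 = Sub.
unfiltered = b"".join(b"\x00" + row for row in rows)
filtered = b"".join(b"\x01" + sub_filter(row) for row in rows)

raw_size = len(zlib.compress(unfiltered, 9))
sub_size = len(zlib.compress(filtered, 9))
print(f"filter None: {raw_size} bytes, filter Sub: {sub_size} bytes")
```
|
| On data like this the filtered stream should come out smaller,
| because the differences take only a few distinct values. On other
| images a different filter, or none, may win, which is why
| optimizers try several.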
| jhgg wrote:
| I committed a fix, it's now 24k uncompressed! :)
| blitzar wrote:
| Congratulations. Don't forget your 11x improvement when
| it comes to the end of year reviews.
| gremloni wrote:
| That's lit and a fantastic turnaround. Great work to whoever
| is reading this!
| nerfhammer wrote:
| there are png optimizer programs, e.g. optipng
| pseudosavant wrote:
| The Squoosh (web) app is awesome for this too! All processing
| is done locally with wasm.
|
| https://squoosh.app
| 101008 wrote:
| I'd love to have a browser plugin that converts all images
| I upload to CMS using Squoosh.
| ehsankia wrote:
| Yep, just tried the Discord icon with OxiPNG and it went
| from 285k to 6.35k, visually indistinguishable.
| vadfa wrote:
| `optipng -o9 -strip all` is a must
| jamesfinlayson wrote:
| I found https://pngquant.org/ to be pretty good.
| account42 wrote:
| Note that unlike some of the other tools mentioned here,
| pngquant does _lossy_ compression. Might still be the right
| tool in many cases, but it means you should check the
| output while e.g. optipng is a no-brainer to add to
| whatever your publishing pipeline is.
| memco wrote:
| ImageOptim was a favorite of mine. They have a standalone mac
| app and a webservice. It combines several of these tools into
| a single GUI.
| JohnTHaller wrote:
| 256x256 PNG reduced to 256 colors with pixel transparency gets
| it to 2.68K. I manually dropped the color depth to indexed and
| saved it out in Photoshop and I used FileOptimizer to shrink
| it. It includes 12 different image shrinkers and runs them all.
| jtbayly wrote:
| > Check out this startling ICO with 64 images, all roughly 16x16.
| I suspect a bug.
|
| I suspect an animation. Anybody know how to find out?
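|
| One way to look inside such a file is to read the ICO directory
| yourself. A sketch, assuming the standard layout (6-byte header
| followed by one 16-byte entry per image); `list_ico_entries` is a
| made-up name:
|
```python
import struct


def list_ico_entries(data: bytes):
    """Return (width, height, bits_per_pixel, size_in_bytes) per image."""
    reserved, kind, count = struct.unpack_from("<HHH", data, 0)
    if reserved != 0 or kind != 1:
        raise ValueError("not an ICO file")
    entries = []
    for index in range(count):
        (width, height, _colors, _reserved, _planes,
         bpp, size, _offset) = struct.unpack_from(
            "<BBBBHHII", data, 6 + 16 * index)
        # A stored 0 means 256 pixels in the ICO directory.
        entries.append((width or 256, height or 256, bpp, size))
    return entries
```
|
| Usage would be something like
| `list_ico_entries(open("favicon.ico", "rb").read())`. The fields
| parsed here carry sizes and bit depths but no timing information,
| so comparing the 64 entries is a starting point rather than proof
| either way.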
| TazeTSchnitzel wrote:
| The non-PNG Apple touch icons might be CgBI files? It's an
| undocumented proprietary Apple extension to PNG which most PNG
| tools won't accept, but which Xcode uses for iOS apps.
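|
| A quick check for this, assuming (as is commonly reported, since
| the format is undocumented) that such files put a `CgBI` chunk
| ahead of the usual `IHDR`. Function names are made up:
|
```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def first_png_chunk(data: bytes):
    """Return the type of the first chunk after the PNG signature, or None."""
    if len(data) < 16 or not data.startswith(PNG_SIGNATURE):
        return None
    # Chunk layout: 4-byte big-endian length, then a 4-byte ASCII type.
    return data[12:16].decode("ascii", errors="replace")


def looks_like_cgbi(data: bytes) -> bool:
    """Standard PNGs start with IHDR; the Apple variant starts with CgBI."""
    return first_png_chunk(data) == "CgBI"
```
|
| Running this over the non-PNG touch icons would show whether the
| guess holds for any of them.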
| bugmen0t wrote:
| > We did a hacky image analysis with ImageMagick to survey
| favicon colors. Here is the dominant color breakdown across our
| favicons. This isn't very accurate, unfortunately. I suspect that
| many multicolored favicons are getting lumped in with purple.
|
| Writing or reviewing a sentence like this should make you
| reconsider. Either do the right analysis or remove this from your
| article. But when you say your analysis is probably wrong and the
| results look weird, then why publish as is?
| duckmysick wrote:
| Imperfect analysis with known limitations still has value. We
| can build upon it and improve. I'd rather have it out in the
| open than omitted.
| thrdbndndn wrote:
| > Strangely, only 96.1% of Apple touch icons are PNG. Presumably
| the other 4% are broken.
|
| What does broken mean in this context? Non-PNG, or actually
| broken? I assume the author has the files.
| ryan29 wrote:
| > In fact, I recommend that browsers ignore these hints because
| they are wrong much of the time.
|
| I don't agree. That's the kind of coddling that encourages
| incompetence. Instead of compensating for others' mistakes, just
| let their stuff break.
|
| I wonder if Safari on iOS ignores the hints. When I tested, I was
| surprised to see that pressing the share icon, which holds the
| option for `Add to Home Screen`, would cause a download of all of
| the icons listed with `link rel="icon"`.
|
| Favicons are a huge pain to deal with correctly.
| malfist wrote:
| People make mistakes all the time. Breaking because somebody
| made a mistake that you can correct for just leads to
| unnecessarily fragile code.
|
| What's the point of failing and breaking stuff if someone tells
| you their image is 144x144 but it's really 145x145? Who does
| that benefit?
| anyfoo wrote:
| The opposite is the case. Overall, being too lenient in what
| code accepts and applying heuristics will lead to way worse
| problems down the line. For example, you want your compiler
| to fail hard instead of saying: "Oh, this isn't a pointer,
| but I'm sure you meant well, I'm just going to treat it as a
| pointer!"
|
| In _this_ particular case, it seems to me that the hints
| serve no purpose and should be abolished, and in the meantime
| fully ignored, altogether. All necessary metadata is
| contained in the image file, and browsers should also be
| (relatively) strict in what image files with what metadata
| they accept, for security reasons alone.
|
| And if they also went so far as limiting file size, the
| perpetrators that clog up bandwidth by putting up multi-MB
| favicons would catch on much earlier (or at all), too.
|
| So what actually is the point of those hints, if browsers
| have to fallback anyway?
| notatoad wrote:
| The hints are not a hint in how to render the icon -
| browsers don't need hints for that. the hints are an
| instruction to browsers on which icon to download in the
| case where multiple icons are specified.
|
| if you are safari and you don't know how to display SVG
| favicons, then you don't need to waste bytes downloading a
| favicon only to fail to display it. the HTML does not limit
| a site to only one favicon.
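|
| As a sketch of that selection step: the policy below is invented
| for illustration and is not how any particular browser decides.
| `IconLink` and `choose_icon` are made-up names, and `size` 0
| stands in for `sizes="any"`.
|
```python
from typing import NamedTuple, Optional, Sequence


class IconLink(NamedTuple):
    href: str
    mime_type: str  # from the type="" attribute
    size: int       # from sizes="NxN"; 0 stands for sizes="any"


def choose_icon(links: Sequence[IconLink], wanted_px: int,
                supported: Sequence[str]) -> Optional[IconLink]:
    """Pick one icon to download using only the declared hints."""
    usable = [link for link in links if link.mime_type in supported]
    if not usable:
        return None
    # Scalable icons fit any size.
    for link in usable:
        if link.size == 0:
            return link
    # Otherwise prefer the smallest icon that is at least as big as
    # needed, falling back to the largest one available.
    big_enough = [link for link in usable if link.size >= wanted_px]
    if big_enough:
        return min(big_enough, key=lambda link: link.size)
    return max(usable, key=lambda link: link.size)
```
|
| Nothing is fetched until a single candidate is chosen, which is
| the whole benefit of the hints, and also why wrong hints hurt.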
| anyfoo wrote:
| Why is that not done through the MIME type and using
| HEAD? The server is apparently much better able to figure
| out the MIME type through magic numbers and file
| extensions of the actual file, than the author (human or
| not) of the HTML, as we see.
|
| The same headers also inform the browser that they can
| skip downloading a favicon that they consider too big,
| for example.
| scrollaway wrote:
| HEAD support is never a guarantee, and content type auto
| detection is just another kind of heuristics.
| anyfoo wrote:
| Ugh, HEAD is not being universally supported, at least
| for static content? Okay, I accept that this has value
| then.
|
| As for the MIME type, for image types I'd say it's more
| than stable enough. Certainly much, _much_ more stable
| than the 6.7% error rate mentioned in the article here; I'd
| be surprised if it was even 1%. If you double click on
| an image on your desktop for example, you can in almost
| all cases expect that it will be opened correctly. It
| ceases being a heuristic entirely if you tell the
| webserver that *.png is image/png, and only put PNGs with
| names ending in ".png".
|
| Guess those are the reasons why I got out of web
| development 10 years ago, everything's held together
| by scaffolding and needlessly wasteful and inefficient
| there.
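|
| The magic-number detection mentioned above can be sketched as
| follows. The table covers only a handful of formats, and the
| helper names are made up:
|
```python
from typing import Optional

# Leading bytes ("magic numbers") of formats commonly used for favicons.
MAGIC_NUMBERS = (
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x00\x00\x01\x00", "image/x-icon"),
)


def sniff_image_type(data: bytes) -> Optional[str]:
    """Guess a MIME type from the first bytes of a file, or None if unknown."""
    for magic, mime_type in MAGIC_NUMBERS:
        if data.startswith(magic):
            return mime_type
    # WebP is a RIFF container with "WEBP" at offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


def hint_matches(declared_type: str, data: bytes) -> bool:
    """Check a type="" hint against what the file actually contains."""
    return sniff_image_type(data) == declared_type
```
|
| A check like `hint_matches` is the sort of comparison that would
| flag a mismatched type attribute.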
| scrollaway wrote:
| You might be overthinking this. I agree with the
| philosophy that stricter is better, but in this case what
| do you expect broken hints to do?
|
| They're not used for rendering, they're used for figuring
| out what to fetch. A HEAD request would be far less
| efficient than knowing ahead of time what to fetch: 1
| request versus 2N+1 requests.
|
| What you suggest sounds all fine but the entire web is
| user input for a browser, so no matter what, you need to
| define how to fail. If you can fail gracefully, you might
| as well do so, because a failure might not even be
| triggered by bad code/configuration on your side but
| simply by flaky network issues.
| vbezhenar wrote:
| Just don't ignore filename extension. favicon.svg is SVG
| and that's about it. If you don't support SVG, don't
| download it. If you want to store png in favicon.svg,
| don't do that.
| account42 wrote:
| The web runs on mime types and file extensions are
| irrelevant except for buggy browsers that try to be too
| clever (Internet Explorer).
| anyfoo wrote:
| Yeah, I get how those hints make sense, now that you (and
| others in the thread) have told me how things are, and I
| did overlook that HEAD is still an extra request, while
| the attributes are (effectively) for free.
|
| I do wish that content negotiation (e.g. Accept headers)
| worked properly. In the end though, those hints implement
| a subset of content negotiation in a reasonable way,
| given the state of affairs.
| iudqnolq wrote:
| YouTube and Twitter both have wrong parameters. Presumably this
| means all major browsers ignore them or someone would have
| noticed their favicons not displaying right?
| paxys wrote:
| Browsers ignore the hints because they aren't needed. The image
| file itself has everything you need for rendering it.
| ygra wrote:
| The point for the hints is probably that the browser doesn't
| need to fetch the 2000x2000 favicon if it only needs
| something in 16x16 to render in the tab bar.
| sokoloff wrote:
| I don't see Postel's Law cited here yet, which I find pragmatic
| and worth sharing/considering as I used to be in the "let their
| stuff break" camp.
|
| https://en.m.wikipedia.org/wiki/Robustness_principle (Quite
| short)
| Conlectus wrote:
| A problem with this is that when a website breaks in one
| browser, but works in another, I imagine most people's reaction
| would be to blame the browser. This leads to a kind of race-to-
| the-bottom for browser compatibility. See for example the
| history of User-Agent strings.
| jiggunjer wrote:
| depends on the error message? Maybe instead of failing, give
| an annoying prompt to offer a workaround.
| ehsankia wrote:
| That may be your viewpoint but browsers have historically
| always taken the other viewpoint. Take HTML parsing for
| example. You can miss closing tags and a ton of other stuff,
| and it'll all work on a best-effort basis.
|
| The browsers job is to do the best it can, that's what users
| want. No one would use a browser that breaks at the smallest
| tiniest error in the source code.
| adamrezich wrote:
| > browsers have historically always taken the other
| viewpoint.
|
| except for the short-lived XHTML fad which tbh I kind of miss
| every day
| vbezhenar wrote:
| XHTML is still supported and works even with HTML5 tags.
| Diesel555 wrote:
| That article was a fun read! There was one sentence that bothered
| me though.
|
| > I recommend that browsers ignore these hints because they are
| wrong much of the time. We calculated a 6.7% error rate for these
| attributes, where the size is not found in the image or the type
| does not match the file format.
|
| I think of "much" in this context to mean at least more than 50% of
| the time. So I had to look up the definition of the word. One
| definition from Merriam is "more than is expected or acceptable :
| more than enough." So I guess the usage is acceptable!
|
| I always enjoy finding I have a slightly wrong definition in my
| mind for a word. Many arguments, or much arguments, fail to move
| forward due to the differing, unidentified, underlying
| assumptions relying on words with slightly different definitions,
| both people having a slightly different question they are arguing
| in their mind.
| silvestrov wrote:
| It's such a shame that Safari does not support SVG favicons. It's
| the only major browser which doesn't: https://caniuse.com/link-
| icon-svg
|
| All current browsers support PNG.
| amelius wrote:
| Don't hold your breath. Safari is the new IE6.
| mixmastamyk wrote:
| Will it look good on a browser tab? Seems like the res would be
| too low.
| deathanatos wrote:
| It's a vector graphic; its resolution is whatever you render
| it at. "S" as in, "Scalable".
|
| Sure, there is some nuance in that you wouldn't want some
| fine detail to get lost at the displayed size, but presumably
| you know you're making a favicon when you do so.
|
| Or, you're the NFL & you're going to supply a 4 megapixel
| image IDK.
| account42 wrote:
| > Sure, there is some nuance in that you wouldn't want some
| fine detail to get lost at the displayed size, but
| presumably you know you're making a favicon when you do so.
|
| On the other hand, SVG is really not designed for the fine
| pixel control you want to make the icon look good at
| smaller sizes as it does not have the equivalent of font
| hinting.
| mixmastamyk wrote:
| Not at very low resolutions, <= 32 px. See sibling comment.
| est wrote:
| It's such a shame that PNG does not support packing multiple
| dimensions into one file like .ico formats actually do.
| kevin_thibedeau wrote:
| It can be done with MNG. There just has never been a tooling
| ecosystem that supports it for non-animated applications.
| fho wrote:
| That "I am feeling lucky" button does not seem random at all, it
| brought me in order to: Microsoft Windows, Blogger, The Financial
| Times, Github, Adobe ...
|
| As every other location I randomly scroll to has no recognizable
| image on it ... that seems preselected :-)
| ChrisArchitect wrote:
| What is the Tranco dataset that this is based on? I mean come on
| -- anything that claims to be based on 'Alexa' (or any of these
| others: Cisco Umbrella/openDNS? Majestic? Quantcast?) is sooo
| suspect. None of these sources are that good and especially Alexa
| which harks back to a time 20 years ago of browser toolbars and
| extensions which the large majority do not use anymore.
|
| Just saying yes maybe it's easy to come up with a top 1000 list
| of sites on the net, but other than that no one really knows
| unless you're like Google/Bing/Apple/Cloudflare that have
| redirection urls/DNS control, tracking clicks etc
| cratermoon wrote:
| I haven't updated the favicon on a site I run in years, if not
| decades. It's a 32x32 GIF 89a file that runs 131 bytes.
|
| It's interesting to ponder how many hundreds of bytes are
| exchanged between the browser and the site just for a simple GET
| request for the image.
| gurgeous wrote:
| Also, we turned up 2,000 domains that redirect to a very shady
| site called happyfamilymedstore[dot]com. Stuff like
| avanafill[dot]com, pfzviagra[dot]com, prednisoloneotc[dot]com.
| These domains made it into the Tranco 100k somehow.
|
| Full list here -
| https://gist.github.com/gurgeous/bcb3e851087763efe4b2f4b992f...
| johnx123-up wrote:
| IMHO, you should add this note in the blog too. Also, wondering
| about the use case of the website... are you building anything
| else too?
| unicornporn wrote:
| Lately, happyfamilymedstore has mysteriously always been in the
| top ~ten Google Images results for super niche bicycle parts
| searches I do. They seem to have ripped an insane amount of
| images that get reposted on their domain.
| 0des wrote:
| What kind of parts are you looking for?
| noitpmeder wrote:
| Does anyone know the story behind these? How do seemingly
| obscure sites consistently get massive amount of obscure
| content placed highly in results.
| jacurtis wrote:
| What most of them do is they will use Wordpress exploits to
| get into random wordpress websites run by people who know
| nothing about managing a website and are running on a $3/mo
| shared hosting account.
|
| After they get into these random wordpress sites, they then
| embed links back to their sketchy site in obscure places on
| the wordpress site that they hacked, so that owners of the
| site don't notice, but search bots do. They usually leave the
| wordpress site alone, but will create a user account to get
| back into it again later if Wordpress patches an exploit. All
| of this exploit and link adding is automated, so it is just
| done by crawlers and bots.
|
| This is done tens of thousands or even millions of times
| over. All of these sketchy backlinks eventually add up, even
| if they are low quality, and provide higher ranking for the
| site they all point to.
|
| Think of websites like mommy blogs, diet diaries, family
| sites, personal blogs, and random service companies
| (plumbers, pest control, restaurants, etc) that had their
| nephew throw up a wordpress site instead of hiring a
| professional.
|
| I don't mean to pick on wordpress, but it really is the most
| common culprit of these attacks. Because so many Wordpress
| sites exist that are operated by people who aren't informed
| about basic security. Plus, wordpress is open source, so
| exploits get discovered by looking at source code and
| attackers will sell those exploits instead of reporting them.
| So Wordpress is in an infinite cycle of chasing exploits and
| patching them.
| lazide wrote:
| Pretty sure closed source wasn't very effective at stopping
| 0days either (Windows). The most common platform gets the
| attention generally.
| shuntress wrote:
| > "had their nephew throw up a wordpress site instead of
| hiring a professional"
|
| The web is _supposed_ to be accessible to everyone.
|
| This type of "blame the victim" attitude is a poor way to
| handle criminal activity.
| [deleted]
| pixl97 wrote:
| There are plenty of places that you can go to on this
| planet with little to no law enforcement. Don't be
| surprised if you end up dead there. Handling global crime
| is very difficult.
| charcircuit wrote:
| and anyone can hire me to design them a website.
| jiggawatts wrote:
| If they had used static content, it would remain 100%
| accessible to them, but also vastly more secure.
|
| Dynamic content generation _on the fly_ for a blog is
| unnecessary complexity that invites attacks.
| pc86 wrote:
| Static content is definitively _not_ as accessible to the
| typical person asking their nephew to put up a WP blog on
| shared GoDaddy hosting.
| jiggunjer wrote:
| wouldn't that preclude a few popular features like a rich
| text editor?
| jiggawatts wrote:
| You can have a separate system, even a locally running
| desktop app do that. You can still have a database,
| complex HTML templating, and image resizing! You just do
| it offline as a preprocessing step instead of online
| dynamically for each page view.
|
| Unfortunately, this approach never took off, even though
| it scales trivially to enormous sites and traffic levels.
|
| I recently tried to optimise a CMS system where it was
| streaming photos from the database to the web tier, which
| then resized it and even _optimised_ it on the fly. Even
| with caching, the overheads were just obscene. Over 100
| cores could barely push out 200 Mbps of content.
| Meanwhile a single-core VM can easily do 1 Gbps of static
| content!
| vbezhenar wrote:
| I thought about "serverless" blog.
|
| Here's some rough scheme I came up with (I never
| implemented it, though):
|
| 1. Use github pages to serve content.
|
| 2. Use github login to authenticate using just JS.
|
| 3. Use JS to implement rich text editor and other edit
| features.
|
| 4. When you're done with editing, your browser creates a
| commit and pushes it using GitHub API.
|
| 5. GitHub rebuilds your website and few seconds later
| your website reflects the changes. JavaScript with
| localStorage can reflect the changes instantly to improve
| editor experience.
|
| 6. Comments could be implemented with fork/pull request.
| Of course that implies that your users are registered on
| GitHub, so may not be appropriate for every blog. Or just
| use external commenting system.
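|
| Step 4 could look roughly like this, written in Python rather
| than browser JS for brevity. It targets GitHub's repository
| contents endpoint and only builds the request without sending it;
| the function name, repository names and token are placeholders.
|
```python
import base64
import json
import urllib.request


def build_commit_request(owner, repo, path, text, message, token, sha=None):
    """Prepare (but do not send) a PUT to GitHub's repository contents API."""
    payload = {
        "message": message,
        # The API expects the file body as base64.
        "content": base64.b64encode(text.encode("utf-8")).decode("ascii"),
    }
    if sha is not None:
        # Updating an existing file requires the blob SHA of the current version.
        payload["sha"] = sha
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/contents/{path}",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )
```
|
| Sending it would be `urllib.request.urlopen(request)`. Each call
| creates one commit, so a multi-file post would need several calls
| or the lower-level Git data API.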
| mkotowski wrote:
| So, essentially a site generated with Jekyll, hosted on
| GitHub Pages with Utterances [0] for comments and updated
| with GitHub Actions.
|
| I don't know if https://github.dev version of Visual
| Studio Code supports extensions/plugins, but if so, then
| there is also a rich text editor for markdown ready.
|
| All that's left would be an instant refresh for editing.
|
| [0]: https://utteranc.es
| pc86 wrote:
| If this is a serious suggestion (I really hope it isn't),
| you have never met the kind of person setting up the
| blogs the GP is talking about.
| mfkp wrote:
| I recently saw and reported one to a local business.
|
| If you typed in the domain and visited directly, it
| wouldn't redirect to the scam site. But if you clicked on a
| link from a google search, then it would redirect.
|
| Probably makes it harder to find for small website owners
| if they're not clicking their own google searches.
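|
| A site owner could check for the behaviour described above with
| something like the sketch below: request the page once directly
| and once with a search-engine Referer header, then compare where
| each request is redirected. It only detects HTTP-level redirects
| (a hacked site may redirect via JavaScript instead), and the
| function names and the User-Agent string are assumptions made
| for illustration.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # do not follow; the 3xx surfaces as an HTTPError


def fetch_location(url, referer=None):
    """Return the Location header if `url` answers with a redirect, else None."""
    headers = {"User-Agent": "Mozilla/5.0"}
    if referer:
        headers["Referer"] = referer
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(urllib.request.Request(url, headers=headers), timeout=10):
            return None
    except urllib.error.HTTPError as err:
        if 300 <= err.code < 400:
            return err.headers.get("Location")
        raise


def referer_cloaking(url, fetch=fetch_location, referer="https://www.google.com/"):
    """True if the site redirects somewhere only when arriving from a search."""
    direct = fetch(url)
    via_search = fetch(url, referer)
    return via_search is not None and via_search != direct
```

| The `fetch` parameter exists so the comparison logic can be
| exercised without network access.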
| IncRnd wrote:
| It happens through search engine optimization, SEO, and a mix
| of planting reviews and other tactics. Think of it like this
| - what would you do to get people talking about your site?
| You'd somehow put links, conversations, reviews, quotes, etc.
| in front of them.
| comeonseriously wrote:
| Slightly OT, but what was that one that came around a few years
| ago that would make everyone's CPU go to 100%?
| nanis wrote:
| I know of a company whose favicon was a hi-res true color PNG that
| weighed in at more than 2 MB. The web site was the dominion of
| marketing. Suggestions to improve the situation were detrimental
| to one's career path. _sigh_
| tonetheman wrote:
| I use an inline svg for mine... which is really just a poop
| emoji.
| anyfoo wrote:
| ... and wrote an interesting technical article about it, that
| even someone like me, who doesn't do web development, enjoys
| reading. Definitely why I come to HN (no sarcasm, it is).
| toast0 wrote:
| Favicons are slightly useful. You can serve your page at
| http://www.example.com with a favicon from https://example.com
| that has an HTTP Strict-Transport-Security header with
| includeSubDomains, and then future page loads in that browser
| will be https (across your whole domain). (This assumes you want
| your domain to be https)
|
| Other than that, I'm still pretty meh about them.
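|
| For illustration, the header from the trick above might be
| produced like this. This is a sketch: the one-year max-age, the
| empty icon payload, and the handler class are assumptions, and
| the handler would have to sit behind TLS, since browsers only
| honour Strict-Transport-Security on HTTPS responses.

```python
from http.server import BaseHTTPRequestHandler


def hsts_value(max_age_seconds=31536000, include_subdomains=True):
    """Build the value of a Strict-Transport-Security header."""
    value = f"max-age={max_age_seconds}"
    if include_subdomains:
        value += "; includeSubDomains"
    return value


class FaviconHandler(BaseHTTPRequestHandler):
    ICON = b""  # placeholder; real icon bytes would go here

    def do_GET(self):
        if self.path != "/favicon.ico":
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "image/x-icon")
        self.send_header("Content-Length", str(len(self.ICON)))
        # The header that upgrades future loads across the whole domain.
        self.send_header("Strict-Transport-Security", hsts_value())
        self.end_headers()
        self.wfile.write(self.ICON)
```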
| gurgeous wrote:
| Also see the gigantic map - https://iconmap.io
|
| The blog post is the analysis of the data set, the map is the
| visualization.
| isoprophlex wrote:
| Is the dataset available for download? I couldn't immediately
| find a download to the dataset in the linked article.
|
| My hands itch to do some dimension reduction on that data and
| make some nice plots
| nkriege wrote:
| We'd be happy to share the data. Reach us at help at
| gurge.com if you're interested.
| wiz21c wrote:
| damn I was thinking about that too :-)
| oehpr wrote:
| I wonder if there might be a way to map all these using t-SNE
| to discrete grid locations? Maybe even an autoencoder. I'd love
| to see what features it could pick out.
|
| I don't see their data set though. hmmm.
|
| maybe I'll just have to crawl it on my own if I want to do it.
| yboris wrote:
| side note: instead of t-SNE consider UMAP - provides better
| results (and it's _much_ faster)
| https://github.com/lmcinnes/umap
| lgvld wrote:
| You can use t-SNE (or even better: UMAP or one of its
| variations) to create a 2D point cloud, and then use
| something like RasterFairy [1] to map 2D positions to the
| cells of a grid. It usually works well.
|
| [1] https://github.com/Quasimondo/RasterFairy
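|
| To make the "point cloud to grid" step concrete, here is a
| deliberately simple sketch. It is not RasterFairy's algorithm:
| it just sorts points into rows by y and then orders each row by
| x, which keeps rough neighbourhoods but distorts more than a
| proper assignment method would. The function name and the sample
| points are made up for illustration.

```python
def snap_to_grid(points, cols):
    """Map each 2D point to a unique (row, col) cell.

    `points` is a list of (x, y) pairs, e.g. the output of t-SNE or UMAP.
    Returns a dict from point index to its grid cell.
    """
    # Order point indices by y so consecutive runs form horizontal bands.
    by_y = sorted(range(len(points)), key=lambda i: points[i][1])
    placement = {}
    for row, start in enumerate(range(0, len(by_y), cols)):
        # Within one band, order by x to choose the column.
        band = sorted(by_y[start:start + cols], key=lambda i: points[i][0])
        for col, idx in enumerate(band):
            placement[idx] = (row, col)
    return placement


cloud = [(0.9, 0.1), (0.1, 0.2), (0.2, 0.9), (0.8, 0.8)]
cells = snap_to_grid(cloud, cols=2)
```

| Every point gets exactly one cell and no cell is used twice, which
| is the property needed to lay icons out as a dense map.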
| svdr wrote:
| I see a lot of repetitions in the map?
| gurgeous wrote:
| It's one icon per domain. Try hovering (on desktop) and
| you'll see that many domains have the same favicon.
| true_religion wrote:
| It also works on mobile if you tap the favicon.
| bellyfullofbac wrote:
| Huh, there's a row of identical icons of 3 blue circles (search
| for cashadvancewow[dot]com) and all the domains using them are
| loan-related. Interesting way to do forensics on clone sites
| (although trying a few of them, they're not showing any icons
| right now, and the URL /favicon.ico 404's)
|
| And I checked a few of the sites, I just got lorem-ipsum style
| landing pages. I wonder what's the point, or are the scammers
| using the domains mostly for emails?
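|
| The forensic idea above (finding clone sites through shared
| icons) can be sketched by grouping domains on a hash of the
| favicon bytes. This is an assumption-laden illustration: it only
| catches byte-identical files, so re-encoded or resized copies
| would need some form of perceptual hashing instead, and the
| function name and sample data are hypothetical.

```python
import hashlib
from collections import defaultdict


def group_by_favicon(icons):
    """Group domains whose favicon bytes are identical.

    `icons` maps domain name -> raw favicon bytes. Returns the clusters
    with more than one domain, largest cluster first.
    """
    buckets = defaultdict(list)
    for domain, data in icons.items():
        buckets[hashlib.sha256(data).hexdigest()].append(domain)
    clusters = [sorted(domains) for domains in buckets.values() if len(domains) > 1]
    return sorted(clusters, key=lambda c: (-len(c), c))


sample = {"a.example": b"X", "b.example": b"X", "c.example": b"Y"}
shared = group_by_favicon(sample)
```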
| deathanatos wrote:
| There are multiple runs of "just a bit _too_ abstract" icons
| that point into the abyssal cesspools of the Internet. Most of
| them seem to be about loans, so I'm going to avoid announcing
| that too loudly if I ever need a loan, since clearly, there are
| some scumbags out there.
| ScaleneTriangle wrote:
| Would have liked to see more color analysis, like a graph showing
| the number of distinct colours per icon.
| quitit wrote:
| The difference between the Apple "precomposed" and standard icons
| had to do with the gloss effect on icons on pre iOS 7 home
| screens.
|
| When adding a website/webapp to these earlier home screens, the
| OS would apply a gloss effect over the icon in order to match the
| aesthetic of the standard apps. The precomposed icon was a way
| for the developer to stop the OS from applying this effect, such
| as if their logo already had a different gloss effect
| applied (i.e. "precomposed") or some other design where adding the
| glossy shine wouldn't look right. The standard icon allowed the
| OS to apply the gloss effect - which was a timesaver as Apple did
| tweak the gloss contour over the years: hence using a standard
| icon ensured that the website/webapp always matched the user's OS
| version.
___________________________________________________________________
(page generated 2021-10-21 23:02 UTC)