[HN Gopher] What's the size of the space of language covered by ...
___________________________________________________________________
What's the size of the space of language covered by the Pornhub
search engine?
There is some domain of concepts covered on Pornhub. If one
searches for a word such as "success", one gets titles such as:
"100% success rate eating pu*y technique" and "our experiment with
various bottles and success attempt to insert my most huge monster
toys". These cover only a small part of the meaning of the
word "success", but they are still valid and they manage to retain
the meaning. So my question is: how could we measure the domain
of definition of all possible meanings reachable through the
Pornhub search engine? Basically, to find its "domain of
knowledge". We could do the same with other, more specialised
libraries, such as academic, technical, popular etc. Searching for
"eating p**y" on Google Scholar returns:
"Sustainable production of healthy, affordable
food in the UK: The pros and cons of plasticulture" and "In the
house of the interpreter: a memoir". Cheers!
Author : toombowoombo
Score : 3 points
Date : 2022-10-01 22:09 UTC (52 minutes ago)
| metadat wrote:
| If you indexed all available titles, you could probably use NLP
| to parse them into ontological strata.
|
| But why? Porn titles are dull and repetitive. I predict you'll
| find the space includes just about every permutation of every
| combination of human proclivities.
| toombowoombo wrote:
| Pornhub was just an exaggerated example to explain the big
| differences in the domains of words in languages.
|
| I got curious in general about how one could encode all the
| knowledge in a system, whatever the system.
|
| In my mind I see this question as an extension of the
| information theory concept of measuring the transmitted
| information.
| toombowoombo wrote:
| I'm open to discussion. The ide is still in progress.
| toombowoombo wrote:
| Idea*
___________________________________________________________________
(page generated 2022-10-01 23:01 UTC)