[HN Gopher] Real World Recommendation System
___________________________________________________________________
Real World Recommendation System
Author : nikhilgarg28
Score : 254 points
Date : 2022-04-11 16:44 UTC (6 hours ago)
(HTM) web link (blog.fennel.ai)
(TXT) w3m dump (blog.fennel.ai)
| kixiQu wrote:
| And then all of it is thrown away and they show ads instead. :)
| greesil wrote:
| MANGA
| ehsankia wrote:
| Google -> Alphabet, then add in Microsoft, Tesla and NVIDIA.
|
| MANTAMAN
|
| https://streetsharks.fandom.com/wiki/Mantaman
| vikingcaffiene wrote:
| Gentle reminder to anyone reading this that your problems are
| probably not FAANG problems. If you architect your system trying
| to solve problems you don't have, you are gonna have a bad time.
| samstave wrote:
 | Wow, this is something that has been floating around in my
 | mind for decades.
 |
 | I'll top it off with a story from an interview at Twitter
 | with an engineering manager, ~2009-ish:
 |
 | --
 |
 | Him: _So tell me how you would do things differently here at
 | Twitter based on your experience?_
 |
 | Me: _Well, I have no idea what your internal processes,
 | architecture, or problems are, so my previous experience
 | wouldn't be relevant. I'd go for whichever option best suits
 | your goals._
 |
 | [This was my literal response to the question, which I
 | thought was a trap but answered honestly -- as a former
 | manager of teams, I'd heard plenty of "well, we did it like
 | this at my last company."]
 |
 | Don't reply this way. <--
 |
 | Here was his response, a literal quote from a hiring manager
 | for DevOps/Engineering at Twitter:
 |
 | _Thank god! We have hired so many people from FB, where that
 | was their only job out of school, with no other experience,
 | and the biggest thing they told me was "well, the way we did
 | this at FB was... X."_
 |
 | --
 |
 | His biggest concern was engineering-culture-creep.
| HWR_14 wrote:
| Wow. That amazes me that anyone would answer that question
| without knowing anything about the problem space and
| implemented solutions.
|
| Wait, I got it, I would rewrite everything as AWS Lambdas.
| That's the right answer! Screw your (almost certainly SQL)
| DB, let's move it all to DynamoDB too.
| samhw wrote:
| > Wow, this is something that has been a floater-in-mind for
| decades
|
| Have you literally never come across the "you're not
| Google!!!" trope before now, during the whole ~decade leading
| up to this very day? Gosh I envy you.
|
| (Also, I am reaaally struggling to understand that story. Who
| is speaking? It sounds like a story within a story within a
| story. I can just about piece together the gist, but I'm very
| confused by all the formatting and nested quotes.)
| jeffbee wrote:
| "And note that you don't even have to be at FAANG scale to run
| into this problem - even if you have a small inventory (say few
| thousand items) and a few dozen features, you'd still run into
| this problem. "
|
| -TFA
| vikingcaffiene wrote:
| Fair enough. I still think people should read stuff like this
| with a healthy measure of skepticism.
| habibur wrote:
| > a machine learning model is trained that takes in all these
| dozens of features and spits out a score (details on how such a
| model is trained to be covered in the next post).
|
 | This was the part I was interested in, as most of the rest is
 | obvious.
| arkj wrote:
| Looks like FAANG in the title is just to get your attention.
| Details are missing.
| nikhilgarg28 wrote:
| (Disclaimer: I'm the author of the post)
|
 | Good feedback, noted. I'll get the next post, focused on
 | training, out within the next couple of days.
| 1minusp wrote:
| Also, how is this article different or more informative
| compared to others that deal with the challenges of model
| deployment/management at scale?
| voz_ wrote:
| This is shallow and generic almost to the point of uselessness. I
| am having trouble understanding who the target audience is.
| priansh wrote:
 | The main issue with deploying these systems right now is the
 | technical overhead of developing them. Existing solutions are
 | either paid and require you to share your valuable data, or
 | open source but either abandoned (RIP Crab) or inextensible
 | (most rely on their own DB or Postgres).
 |
 | I'd love to see a lightweight, flexible recommendation system
 | at a low level, specifically the scoring portion. There are a
 | few flexible ones (Apache has one), but none are lightweight:
 | they require massive servers (or often clusters). They also
 | can't be bundled into frontend applications, which makes it
 | difficult for privacy-centric, own-your-data applications to
 | compete with paid, we-own-your-data-and-will-exploit-it
 | applications.
| orasis wrote:
| I think we've done a pretty good job on the scoring side with a
| fast and simple to use API that runs in-process:
| https://improve.ai
| KaiserPro wrote:
 | > As a result, primary databases (e.g. MySQL, Mongo etc.)
 | almost never work
 |
 | I mean, they do. As far as I'm aware, Facebook's ad platform
 | is mostly backed by hundreds of thousands of MySQL instances.
|
| But more importantly this post really doesn't describe issues of
| scale.
|
| Sure it has the stages of recommendation, that might or might not
| be correct, but it doesn't describe how all of those processes
| are scheduled, coordinated and communicate.
|
 | Stuff at scale is normally a result of tradeoffs: sure, you
 | can use an ML model to increase a retention metric by 5%, but
 | it costs an extra 350ms to generate and will quadruple the
 | load on the backend during certain events.
 |
 | What about the message passing? Is that one monolith making
 | the recommendation (cuts down on latency, kids!) or
 | microservices? What happens if the message doesn't arrive --
 | do you have a retry? What have you done to stop retry storms?
 |
 | Did you bound your queue properly?
 |
 | None of this is covered, and my friends, that is 90% of the
 | "architecture at scale" that matters.
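The retry concern above can be sketched in a few lines. This is a minimal illustration, not anyone's production code: capped retries with exponential backoff and jitter, which is the standard defence against retry storms (`fn` stands in for whatever backend call you're protecting):

```python
import random
import time

def call_with_backoff(fn, max_retries=3, base_delay=0.05):
    """Retry fn() a bounded number of times with exponential
    backoff plus jitter, so synchronized clients don't hammer a
    recovering backend all at once (a "retry storm")."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries:
                raise  # bounded retries: give up rather than loop forever
            # Exponential backoff (0.05s, 0.1s, 0.2s, ...) with random
            # jitter to de-synchronize retrying clients.
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
```

The jitter factor matters as much as the backoff: without it, every client that failed at the same moment retries at the same moment too.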
|
 | Normally stuff at scale is "no clever shit", followed by
 | "fine, you can have that clever shit, just document it
 | clearly -- oh, you've left", which descends into "god, this
 | is scary and exotic", finally leading to "let's spend half a
 | billion making a new one with all the same mistakes."
| judge2020 wrote:
 | > As far as I'm aware Facebook's ad platform is mostly backed
 | by hundreds of thousands of MySQL instances
|
| Same for YouTube itself
| https://www.mysql.com/customers/view/?id=750 and they use
| Vitess for horizontal scaling: https://vitess.io/
| emptysea wrote:
| YouTube has since migrated to Spanner, there's a podcast
| episode with one of the Vitess creators that covers the
| politics of the switch
| efsavage wrote:
 | > mostly backed by hundreds of thousands of Mysql instances
 |
 | Kind of. It's part of the recipe, but one thing you find at
 | these large tech companies (I've worked at FB and GOOG) is
 | that they have the resources to bend even large, standard
 | projects like MySQL to their will, while ideally preserving
 | the good ideas that made them popular in the first place.
 | There are wrappers/layers/modifications/etc. that eventually
 | evolve to subsume the original software, such that it's
 | acting more like a library than a standalone
 | service/application. So, for example, while your data might
 | eventually sit in a MySQL table, you'll never know, and
 | likely didn't write anything specific to MySQL (or even SQL)
 | to get there.
| samhw wrote:
 | I mean, this post from a year ago makes it sound _not that
 | non-standard_: https://engineering.fb.com/2021/07/22/data-
 | infrastructure/my...
 |
 | What you're describing sounds like you mean something on the
 | level of Cockroach, talking the Postgres wire protocol but
 | implemented entirely independently underneath (which came
 | indirectly out of Google). Facebook's MySQL deployment sounds
 | more like a heavily-patched-but-basically-MySQL installation.
 | I think Facebook is over-analogised to Google sometimes, as
 | an engineering org.
 |
 | (Admittedly I haven't worked at either whereas you have --
 | though I have at another FAANG, fwiw -- but am basing this
 | impression partly on what I hear from friends and partly on
 | plain old stuff I read on the internet.)
| xico wrote:
| Meta is relatively open (and open source) in how they handle
| stuff, including ranking, scoring and filtering described in
| the original article, but also fast inverted indexes and
| approximate nearest neighbors in high-dimensional spaces. See,
| for instance, Unicorn [1,2] or (at a lower level) FAISS [3].
|
| [1]
| http://people.csail.mit.edu/matei/courses/2015/6.S897/readin...
|
| [2] https://dl.acm.org/doi/pdf/10.1145/3394486.3403305
|
| [3] https://faiss.ai/
| whimsicalism wrote:
| I disagree - this seems quite clearly to address issues of
| scale, going into multiple-pass ranking, etc. etc.
| dinobones wrote:
| How FAANG actually builds their recommendation systems:
|
 | Millions of cores of compute, exabyte-scale custom data
 | stores. Good recommendations are expensive. If you try to
 | build a similar system on AWS, you will spend a fortune.
 |
 | Most recommender models just use co-occurrence as a seed,
 | which can actually work pretty well on its own. If you want
 | to get fancy, build up a vectorized form of the document with
 | something like an autoencoder, then use approximate nearest
 | neighbors to find documents close by. 95% of the compute and
 | storage is just spent on calculating co-occurrence, though.
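The co-occurrence seed described above can be sketched on toy data (hypothetical sessions; in a real system you would embed items, e.g. with an autoencoder, and query an approximate-nearest-neighbor index instead of counting pairs directly):

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_recs(sessions, item, k=5):
    """Score candidate items by how often they co-occur with
    `item` in the same user session -- 'co-occurrence as a
    seed' in its simplest form."""
    counts = defaultdict(int)
    for session in sessions:
        # Count every unordered pair of distinct items in the session.
        for a, b in combinations(set(session), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    scores = {b: n for (a, b), n in counts.items() if a == item}
    # Highest co-occurrence count first.
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy viewing sessions (made-up data for illustration).
sessions = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "d"]]
```

Here `cooccurrence_recs(sessions, "a", k=1)` returns `["b"]`, since "b" shares the most sessions with "a". The pair-counting step is the expensive part at scale, matching the comment's point about where the compute goes.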
| TheRealDunkirk wrote:
| > Millions of cores of compute, exabyte scale custom data
| stores. Good recommendations are expensive. If you try to build
| a similar system on AWS, you will spend a fortune.
|
 | And then it will be gamed, and become as useless as every
 | other recommendation system already out there.
| samhw wrote:
| Also, 'millions of cores' is a ludicrously shitty, zero-clue
| answer. It's like asking how Eminem makes music, and saying
| 'millions of pills'. Like, yes, that's an input, but you're
| missing _the entire method of creation, of converting the
| crude inputs into the outputs_.
|
| For my money - and, for what little it's worth, I work in
| this field - I think most of the impressive feats of data
| science attributed to 'machine learning' are really just a
| function of now having hardware capacity so insanely great
| that we're able to 'make the map the size of the territory',
| so to speak. These models are essentially overfitting
| machines, but that's OK when (a) it's an interpolation
| problem and (b) your model can just memorise the entire input
| space (and deal with any inaccuracies by regularisation,
| oversampling, tweaking parameters till you get the right
| answers on the validation set, then talking about how 'double
| descent' is a miracle of mathematics, etc).
|
| Don't get me wrong, neural nets are obviously not rubbish.
| They are a very good method for non-convex, non-
| differentiable optimisation problems, especially
| interpolation. (And I'm grateful for the hype cycle that's
| let me buy up cheap TPUs from Google and hack on their
| instruction set to code up linear algebraic ops, but for way
| more efficient optimisation methods, and also in Rust, lol.)
| It's just a far more nuanced story than "this method we
| discovered and hyped up for a decade in the 80s suddenly
| became the key to AGI".
| nixpulvis wrote:
 | These steps read to me like: first we filter, then we filter,
 | then we filter; all of it based on various orderings of the
 | data.
|
| The devil's in the details, which are surely domain specific and
| hopefully not too morally questionable.
| fmakunbound wrote:
| With all of this technology applied, I am still disappointed by
| Netflix's recommendations - to the point of just giving up and
| doing something else.
| liveoneggs wrote:
 | I was actually pretty impressed the other day when searching
 | for "shiloh" (which they didn't have), because it showed a
 | bunch of "related" queries for other dog movies (which they
 | also didn't have). The available search results were a little
 | lacking, though.
| foldr wrote:
| In some ways it seems like a classic case of trying to solve
| the wrong problem because the wrong problem potentially has a
| technical solution. The real problem is making lots of
| interesting content for people to watch. If you can solve that
| problem then a simple system of categories is perfectly
| sufficient for people to discover content. But that's not a
| technical problem, and all those engineers have to be given
| something to do.
| chuckcode wrote:
| Do you think part of this is that Netflix has assumed zero
| effort from user model? My experience has been that Netflix
| does an ok job of recommendations, but fails at overall
| discovery experience. There is no way for me to drive or view
| content from different angles easily. I end up googling for
| expert opinions or hitting up rotten tomatoes to get better
| reviews. Netflix knows a ton about me and their content, but
| seems to do a poor job of making their content
| browseable/discoverable overall. I do like their "more like
| this" feature where I can see similar titles.
| imilk wrote:
| Google TV has the best content discovery I've come across so
| far. Recommendations across most streaming services based on
| overall similar movies, different slices of the genre, and
| movies with similar directors/cast members. Plus as soon as
| you select another movie, you can see all the same "similar"
| recommendations for that movie.
| invalidOrTaken wrote:
| >Do you think part of this is that Netflix has assumed zero
| effort from user model?
|
| Talking w/a friend who works at Netflix, it sounds like this
| is a warranted assumption. The way he told it, they were
| tearing their hair out at one point b/c users wouldn't put
| much into it.
| samhw wrote:
| What I don't understand about their response is: _why not
| make it configurable?_ Admittedly this is my philosophy for
| almost every product I work on - "make it maximally
| configurable, but make the defaults maximally sane" - but
| I'm baffled every time I hear someone talking about this
| 'dilemma'.
|
| You just keep your simple interface, but allow the power
| users to, say, click through to a particular menu and
| change their setting - the setting in this case being ~"let
| me provide feedback / configure how recommendations work".
| For that kind of user, finding a 'cheat code' is actually a
| gratifying product experience anyway.
| aleksiy123 wrote:
 | I think it's because the complexity of allowing
 | configurability isn't always worth it. Verifying it works for
 | all configurations becomes exponentially harder.
 |
 | I believe it can also have performance implications,
 | especially for things like recommender systems where you
 | depend a lot on caching, precomputation, and training.
| invalidOrTaken wrote:
| I don't disagree!
| nonameiguess wrote:
 | Rotten Tomatoes works fine as a recommendation system. It
 | lists all of the new content coming out in a given week. I
 | just read that every week, filter down to what looks
 | interesting based on the premise, and read a few reviews. I
 | can usually tell pretty easily what I'll like. No need for
 | in-app recommendations from any specific streaming service at
 | all. Good old-fashioned human expert curators.
| edmundsauto wrote:
| This indicates that the problem is difficult to solve at scale
| and customized per person. Maybe the issue is with our
| expectations - I find other people are pretty bad at
| recommending things for me as well.
| buescher wrote:
| Maybe. Recommendation systems definitely seem to get worse as
| they scale. Amazon's was incredible circa 2000. Pandora seems
| to be getting worse and more repetitive. Netflix kept getting
| better and better until they ended their contest and since
| then they seem to have only become worse.
| jeffbee wrote:
| Maybe you're just disappointed with Netflix's inventory, not
| their recommendations.
| colinmhayes wrote:
| I think it's both. I'm usually able to find decent stuff by
| searching "best on netflix" with some modifiers, but I almost
| never find new stuff I like by scrolling on netflix.
| werber wrote:
 | Tangent, but I was recently thinking about how FAANG is now
 | MAANG, and the definition of mange (from a Google search,
 | lol): "a skin disease of mammals caused by parasitic mites
 | and occasionally communicable to humans. It typically causes
 | severe itching, hair loss, and the formation of scabs and
 | lesions. 'Foxes that get mange die in three or four months.'"
 |
 | I find it oddly poetic, but this is my last day of magic.
| nitinagg wrote:
| What's going wrong with Google search's recommendations every
| day?
| ultra_nick wrote:
| Garbage data in. Garbage data out.
| samhw wrote:
| What? They have absolutely _tremendous_ data, the envy of any
| data scientist on the planet. I don 't understand how you
| could possibly describe their user data as garbage in any
| conceivable way. Even search result click-and-query data
| _alone_ - leaving out Android, Chrome, Cloud, and everything
| else - is a stupendously invaluable, priceless asset.
|
| If you call that garbage, what on earth - or, for that
| matter, off it - is _not_ garbage!?
| lysecret wrote:
 | Interesting post. One thing to note: this seems to be about
 | "on request" ranking, e.g. googling something and needing the
 | recommended content within 500ms.
 |
 | However, a lot of use cases are time-insensitive rankings,
 | like recommending content on Netflix, Spotify, etc.
 | (Spotify's Discover Weekly even has a one-week request time
 | :D).
 |
 | In that case you can just run your ranking offline and store
 | the recs in your DB, which is much, much easier.
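The offline pattern is simple enough to sketch: a batch job runs the ranker and persists the top-k list per user, so the online path is a single lookup. This is an illustrative sketch only, with `rank_fn` as a hypothetical stand-in for the real ranking model and sqlite standing in for whatever store you use:

```python
import json
import sqlite3

def precompute_recs(db, users, rank_fn, k=10):
    """Offline batch job: run the (expensive) ranker for each
    user and persist the top-k items, so nothing is computed at
    request time."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS recs (user TEXT PRIMARY KEY, items TEXT)")
    for user in users:
        db.execute("INSERT OR REPLACE INTO recs VALUES (?, ?)",
                   (user, json.dumps(rank_fn(user)[:k])))
    db.commit()

def serve_recs(db, user):
    """Online path: a single key lookup, no model inference."""
    row = db.execute(
        "SELECT items FROM recs WHERE user = ?", (user,)).fetchone()
    return json.loads(row[0]) if row else []
```

The tradeoff is freshness: the stored list only changes when the batch job reruns, which is exactly why this works for weekly playlists but not for 500ms "on request" ranking.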
| troiskaer wrote:
| This is pretty much what both Netflix and Spotify do. I would
| argue that there isn't a canonical recommendations stack that
| FAANG is converging towards, and that's a direct corollary of
| differing business requirements and organizational structure.
| endisneigh wrote:
 | Is there any recommendation system people are actually happy
 | with? They all seem to suck, in my experience.
| chudi wrote:
 | All feeds are recommendation systems: Instagram, Facebook,
 | Twitter, TikTok, YouTube -- every single one is a
 | recommendation system.
| notriddle wrote:
| Technically, yes, but when they're talking about this sort of
| thing, they mean "personal recommendation system" or
| "content-based recommendation system."
|
| For example, the HN front page is a recommendation system if
| you literally mean system-that-recommends-web-pages-to-look-
| at. But it's not personalized; every visitor sees the same
| front page. This fundamentally makes it a different sort of
| thing.
| chudi wrote:
 | If you have 10,000 posts that you have to sort in some way
 | and the user is only going to see 20 of them, the sorting is
 | the recommendation system. People are just used to thinking
 | of products, movies, and songs, but on those platforms the
 | users are the products.
| notriddle wrote:
| This hardly seems like a reasonable way to characterize
| Netflix, which has a personal recommendation system,
| especially compared to HN, which is ad supported yet
| gives the same recommendations to everyone.
| charcircuit wrote:
| YouTube, TikTok, and Twitter all work well for me.
| mrfox321 wrote:
| TikTok
| colesantiago wrote:
| Why TikTok in particular? What is the engineering story
| behind TikTok's recommendation system? How did they get it
| right?
| keewee7 wrote:
 | TikTok seems to learn from what the user is actually watching
 | and for how long, not just the user's "Like"/"Not Interested
 | In" actions. That said, it still seems to learn from the "Not
 | Interested In" action more than any other platform does.
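The implicit-feedback idea here can be made concrete with a toy label function (all weights hypothetical, purely for illustration): weight each view by the fraction watched, and let an explicit "Not Interested" override it as a strong negative.

```python
def engagement_score(watched_s, duration_s, not_interested=False):
    """Turn passive watch behavior into a training label: the
    fraction of the video watched, capped at 1.0, with an
    explicit negative action overriding everything."""
    if not_interested:
        return -1.0  # explicit negative beats any watch time
    return min(watched_s / duration_s, 1.0)
```

For example, watching 30s of a 60s clip yields 0.5, while hitting "Not Interested" yields -1.0 regardless of watch time.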
| pedrosorio wrote:
| This is a pretty misinformed take when it's publicly
| known that YouTube was already doing this (learn from
| what the user is watching and for how long) the year
| Bytedance was founded (2012):
|
| https://blog.youtube/news-and-events/youtube-now-why-we-
| focu...
| hallqv wrote:
| Anyone have recommendations (no pun) for more in depth resources
| on the subject (large scale recommendation systems)?
| whiplash451 wrote:
| The RecSys conference proceedings might help
| lmc wrote:
| Much of the field seems to be fixated on throwing massive
| compute resources at models with results that can neither be
| evaluated nor reproduced.
|
| "the Recommender Systems research community is facing a crisis
| where a significant number of papers present results that
| contribute little to collective knowledge [...] often because
| the research lacks the [...] evaluation to be properly judged
| and, hence, to provide meaningful contributions"
|
| https://doi.org/10.1145%2F2532508.2532513
|
| More here...
| https://en.wikipedia.org/wiki/Recommender_system#Reproducibi...
| whimsicalism wrote:
| By "the field", you surely mean the academic field. In the
| industry, we run controlled experiments to validate all the
| time.
|
| Recommender systems is one of the few areas in ML where
| almost all of the knowledge is contained in industry, not
| academia.
| lmc wrote:
| That was my thinking - anything of value is product-
| specific and behind closed doors. It's not my field, but
| something I see come up from time to time that seems
| weirdly over-represented in ML articles.
| samhw wrote:
 | I work on these systems, and if anything my only complaint
 | about the field is the propensity to solve every optimisation
 | problem with ML. I have seen people throw ML at
 | textbook-grade linear, and even simply differentiable,
 | optimisation problems.
 |
 | And the reason it happens despite the 'invisible hand' etc.
 | is because _it still works, it just happens to be
 | horrendously inefficient_. I think that's the main area of
 | inefficiency in the industry: not in getting the job done,
 | nor even arguably in accuracy -- at least not severely -- but
 | in overcomplicating the solution[0] because we've formed a
 | cargo cult around one particular method of optimisation,
 | beyond all nuance.
 |
 | [0] I mean 'overcomplicating' in absolute terms. Of course
 | the very crux of my point is that, from the data scientist's
 | perspective, it's _not_ overcomplicated -- it's _less_
 | complicated than using e.g. ILP precisely because we have
 | made libraries like TensorFlow so incredibly easy and
 | tempting to use.
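An example of the kind of "textbook-grade" problem being alluded to: fitting a line. A sketch of the closed-form least-squares solution, which needs no gradient descent or ML library at all:

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a*x + b --
    the slope is cov(x, y) / var(x), the intercept follows from
    the means. No iterative optimisation required."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx
```

Fitting the points (0,1), (1,3), (2,5), (3,7) recovers a=2, b=1 exactly, in a handful of arithmetic operations rather than training epochs.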
| ocrow wrote:
| These recommendation systems take control away from individuals
| over what content they see and replace that choice with black box
| algorithms that don't explain why you are seeing the content that
| you are or what other content was excluded. All of the companies
| who have deployed these content selection algorithms could have
| also given you manual choice over the content that you see, but
| chose instead to let the algorithm solely determine the content
| of your feed, either removing the manual option entirely or
| burying it so thoroughly that no one bothers to use it.
|
 | These algorithms are not benign. They make choices about what
 | information you consume, whose opinions you read, what movies
 | you watch, what products you are exposed to, even which
 | politicians' messages you hear.
 |
 | When people complain about the takeover of algorithms, they
 | don't mean databases or web interfaces. They mean this:
 | content selection or preference algorithms.
 |
 | We should be deeply suspicious. We should demand greater
 | accountability. We should require that the algorithms explain
 | themselves and offer alternatives. We should build better and
 | give control back to the users in meaningful ways.
|
| If software engineering is indeed a profession, our professional
| responsibilities include tempering the damaging effects of
| content selection algorithms.
| KaiserPro wrote:
 | Did you know how a newspaper used to choose what articles it
 | wanted to run?
 |
 | Do you know how a TV channel decides to schedule stories?
 |
 | Humans, it's all humans -- looking at the metrics, and
 | steering stuff that feeds those metrics.
 |
 | Content filters are dumb and easy to understand. Seriously,
 | open up a fresh account at FB, Instagram, Twitter or TikTok.
 |
 | First it'll try to get a list of people you already know.
 | Don't give it that.
 |
 | Then it'll give you a bunch of super popular but click-baity
 | influencers to follow. Why? Because they are the things that
 | drive attention.
 |
 | If you follow those defaults, you'll get a view of what's
 | shallow and popular: spam, tits, dicks and money.
 |
 | If you find a subject leader -- for example an independent
 | tool maker, cook, pattern maker, builder -- then most of your
 | feed will be full of those subjects, save for about 10%
 | random shit that's there to expand your subject range (mostly
 | tits, dicks, spam or money).
 |
 | What you'll see is stuff related to what you like and stare
 | at.
 |
 | And that's the problem: they are dumb mirrors. That's why you
 | don't let kids play with them. That's why you don't let
 | people with eating disorders go on them. That's why mental
 | health care needs to be more accessible, because sometimes
 | holding up a mirror to your dark desires is corrosive.
 |
 | Could filter designers do more? Fuck yeah, but we also have
 | to be aware that filters are a great whipping boy for other,
 | more powerful things.
| oofbey wrote:
 | Off-topic, but how did Netflix manage to get itself inserted
 | into the FAANG acronym anyway? Their impact on the tech
 | industry is trivial compared to all the others. Sure, if you
 | just take out the N it's offensive, but we could have said
 | "GAFA", or "FAAMG" would be more accurate, to include
 | Microsoft in their place.
| vincentmarle wrote:
 | There was a point in time when the FAANG companies (Netflix
 | among them) offered the best compensation packages for
 | engineers -- that's where the term originated. While it's
 | outdated in many respects (Microsoft is not included,
 | Facebook is now Meta, Google is now Alphabet, etc.), it's
 | still sticky for some reason.
| whimsicalism wrote:
| Microsoft in 2022 does not compensate as well as any of
| those. Microsoft in 2021 only out-compensated Amazon.
| hbn wrote:
| > Facebook is now Meta, Google is now Alphabet
|
 | Eh, the new parent company names still aren't really what
 | people know them as. I don't think most people are even aware
 | that Google has a parent company.
 |
 | I have a friend who works at Google, and that's what we say.
 | I don't think he or anyone else would ever say he works at
 | Alphabet.
| oofbey wrote:
 | Yeah, Meta will likely stick, because Zuckerberg and crew are
 | actively trying to run away from the dumpster fire they lit
 | with Facebook.
 |
 | But Google is still Google, and probably always will be. Just
 | like YouTube is still Google, and Waymo is still Google.
| [deleted]
| TacticalCoder wrote:
| > "FAAMG" would be more accurate to include Microsoft in their
| place
|
| In Europe you nearly always see "GNAFAM", which includes
| Microsoft too. It's certainly weird to exclude MSFT, worth at
| times more than Amazon+Meta+Netflix combined.
| dljsjr wrote:
 | The phrase originated with Jim Cramer; it refers to the 5
 | best-performing tech stocks (or what were the best-performing
 | at the time). Nothing to do with their impact on the field
 | from a technical perspective, just a business perspective.
| cordite wrote:
 | Netflix has contributed a lot to Java microservices; see
 | Eureka and Hystrix.
| troiskaer wrote:
| as well as to ML - Netflix Prize
| (https://en.wikipedia.org/wiki/Netflix_Prize) and Metaflow
| (https://github.com/Netflix/metaflow)
| oofbey wrote:
 | No question they've done some things that have had some
 | impact on others in the industry. But none of them are
 | particularly important. It's all relative. Companies like
 | Twitter, Uber, and Airbnb have all released open source
 | projects or figured out how to solve hard problems in ways
 | that others have emulated.
 |
 | But for every other one of the FAA(N)G companies, I can
 | barely work a day as a developer without touching every one
 | of their technologies. Yeah, Netflix got into ML years before
 | most, but the Netflix Prize exists as a distant cautionary
 | memory, and as an ML professional, I'd literally never heard
 | of Metaflow before. Just sayin'.
| troiskaer wrote:
| > But none of them are particularly important
|
| Nowhere was the argument made that somehow Netflix was
| more influential than Twitter/Uber/AirBnB, but your
| counter-argument that somehow it's less influential
| because you haven't heard of/used some projects directly
| holds no ground.
| samhw wrote:
 | > your counter-argument that somehow it's less influential
 | because you haven't heard of/used some projects directly
 | holds no ground
 |
 | Oh come on, they are indisputably right that Microsoft,
 | Twitter, Uber, Airbnb, hell, even Cloudflare are more
 | technically influential than Netflix is.
 |
 | Apple and Google would make _anyone's_ top 5; that's the
 | point. No argument about it. Their products collectively
 | dominate anyone's life, along with MSFT. Netflix is _maybe_
 | in your top 10, top 20 for sure, but it's not up there as one
 | of the few 'platforms that everyone's lives are built on'
 | techcos.
 |
 | (Like, Netflix vs Microsoft? Seriously? For that matter,
 | Amazon probably wouldn't be in my top 5 either, and not only
 | because it's not mainly a tech company. I s'pose it depends
 | how you define 'Amazon', and whether you include AWS. But for
 | Netflix there's just no argument that they win a spot there.)
| troiskaer wrote:
| What's your argument for Twitter/Uber/AirBnB being
| indisputably more technologically influential than
| Netflix? And let's please talk facts rather than
| opinions.
| jedberg wrote:
| FAANG was created by the TV personality Jim Cramer to talk
| about high growth tech stocks. At the time Netflix was doubling
| every year. It was based purely on finance.
|
| It's now been taken over by the tech industry to be shorthand
| for places that are highly selective in their hiring and tend
| to work on cutting edge tech at scale.
|
| That being said, the impact of Netflix on tech is pretty big.
| They pioneered using the cloud to run at massive scale.
| oofbey wrote:
| > They pioneered using the cloud to run at massive scale.
|
| Which is to say they were AWS's biggest early customer?
| Doesn't really seem like Netflix should get the credit for
| that one.
| jedberg wrote:
| It was a lot more than that. They developed systems and
| techniques that even Amazon adopted and are still adopting
| to this day. They also created a ton of open source tools
| for other people to use the cloud:
|
| https://netflix.github.io
|
| Netflix tech even spawned a company to sell their open
| source tools:
|
| https://www.armory.io
|
| And they codified the entire practice of Chaos Engineering:
|
| https://en.wikipedia.org/wiki/Chaos_engineering
| hetspookjee wrote:
 | That stock photo on the front page of armory.io manages to
 | set off all kinds of spammy-website alarms for me.
| [deleted]
| patmorgan23 wrote:
| They were pretty influential in refining the microservices
| architecture
| samhw wrote:
| > FAANG was created by the TV personality Jim Cramer to talk
| about high growth tech stocks. At the time Netflix was
| doubling every year. It was based purely on finance.
|
| That, and FAAG had less of a ring to it.
|
 | _Edit: Dammit, the GP made the same observation. Oh well, I'm
 | keeping it._
| jedberg wrote:
| If Netflix hadn't been such high growth and not included,
| Cramer probably would have gone with GAAF. :)
| tempest_ wrote:
| All the cool kids say GAMMA now.
| aczerepinski wrote:
| What is the G?
| oofbey wrote:
| Those who used to do no evil, but gave up on the idea as
| not profitable enough.
| oofbey wrote:
| Not MAGMA?
| ystad wrote:
| Too hot I say :)
| svachalek wrote:
| Now I can't get Dr Evil saying MAGMA out of my mind.
| errantmind wrote:
| I think the acronym gained prominence before Microsoft's recent
| 'commitment' to open source. Netflix also seemed to be doing
| really interesting things scaling out 'disruption' to video
| delivery at the time. It stuck.
| yukinon wrote:
| FAANG was never about impact on tech industry. Otherwise, MSFT
| would be part of FAANG. Instead, it's directly related to (1)
| stock price and (2) compensation.
| BubbleRings wrote:
| Want to dive in to all this stuff but can't find a starting
| point? Start with reading my patent!
|
| I was smart enough to see what collaborative filtering (CF) could
| be early on, and to file a patent that issued. I wasn't smart
| enough to make it a complicated patent, or to choose the right
| partners so I could have success with it.
|
| But the patent makes a good way to learn how to get from "what
| are your desert island 5 favorite music recordings?" over to
| "here is a list of other music you might like". Basic CF, which
| is at the core of a lot of this stuff. Enjoy!:
|
| https://whiteis.com/whiteis/SE/
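[Editor's note: for readers who want to see the basic CF idea above in code, here is a minimal, self-contained sketch. The users and recordings are invented purely for illustration, and this is the simplest possible co-occurrence flavor of collaborative filtering, not the patented method itself. It scores items the target user hasn't seen by how much each other user's taste overlaps with theirs.]

```python
from collections import Counter

def recommend(favorites, user, k=3):
    """User-based collaborative filtering on sets of favorites.

    favorites: dict mapping user -> set of liked items.
    Returns up to k unseen items, weighted by how many favorites
    the recommending users share with the target user.
    """
    mine = favorites[user]
    scores = Counter()
    for other, theirs in favorites.items():
        if other == user:
            continue
        overlap = len(mine & theirs)   # shared favorites = similarity
        if overlap == 0:
            continue
        for item in theirs - mine:     # only items the user hasn't seen
            scores[item] += overlap
    return [item for item, _ in scores.most_common(k)]

# "Desert island" favorites for three hypothetical listeners.
favorites = {
    "ann": {"Kind of Blue", "Abbey Road", "Blue Train"},
    "bob": {"Kind of Blue", "Blue Train", "A Love Supreme"},
    "cam": {"Abbey Road", "Revolver"},
}
print(recommend(favorites, "ann"))  # → ['A Love Supreme', 'Revolver']
```

Ann shares two records with Bob and one with Cam, so Bob's unheard pick outranks Cam's. Real systems replace raw overlap with a proper similarity measure (cosine, Jaccard) and rating weights, but the shape of the computation is the same.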
| siskiyou wrote:
| All I know is that Facebook's recommendation systems always show
| me things that I hate to see. I suppose they may "work" at scale,
| but at an individual level it's epic failure.
| samstave wrote:
| FB needs an Ad-Rev-Share-Model with ALL of its users...
|
| Imagine if FB were to pay you a fraction of a % of the value
| generated from how your data was used...
|
| It may be a small amount, but in the poorest fourth-world
| countries it could effect real change in their lives...
|
| Now imagine that this becomes big... and it works well.
|
| Now imagine that the populace is aware of the hand of god
| above them just pressing keys to affect land masses (yes, I am
| referring to the game from the 80s)
|
| but this galvanizes them into union building...
|
| So when the people realize their metrics are the product to
| feed consumerism for capitalistic profits, and decide to
| organize, what happens?
|
| Is FB going to need a military force to protect their DCs?
|
| ---
|
| With "Zuck Bucks" (I still am not sure if true)
|
| This makes this ultimate "company store"
|
| Tokens?
|
| So how get?
|
| How EARN? (What service on FB GENERATES '$ZB'?)
|
| How spend?
|
| WHAT GET? (NFTs?, Goods? Services?)?
|
| The entire fucking model of EVERYTHING FB DOES is to MAP
| SENTIMENT!
|
| Sentiment is the tie between INTENT and SENTIMENTAL VALUE
|
| The idea is to map interest with emotional drivers which make
| someone buy _(spend resources that their time and effort went
| into building up a store of)_...
|
| ---
|
| So map out your emotional response over N topics and forums...
| Eval your documented online comments, NLP the fuck out of
| that, see what your demos are, and build this profile of
| you...
|
| THEN THEN THEN THEN
|
| Offer an "earnable" (i.e. Grindable by farms and bots alike) --
| "Zuck Buck" which is a TOKEN (etymology that fucking word for
| yourself)
|
| of value...
|
| Meaning, zero INTRINSIC value, zero accountability (managed by
| a central Zuck Bank) <-- Yeah fuck that
|
| And the value is both determined by AND made available to you
| with neither INTRINSIC CONTROL nor VALUE.
|
| ---
|
| FB Bots Galore.
| ParanoidShroom wrote:
| > With "Zuck Bucks" (I still am not sure if true)
|
| I expected more from this place than to believe every click-
| bait FB news story. Of all the UX people and the tons of money
| they throw into research... Yes, the best option was... "Zuck
| bucks". Don't get played ffs
| samstave wrote:
| You >quoted with no commentary.. so your point is :: TROLL?
| samhw wrote:
| So are you just making the punctuation up as you go, or
| what?
| imilk wrote:
| Like many NFT/crypto posts, I have absolutely no idea whether
| this is serious or a parody.
| greatpostman wrote:
| I've built one of these at FAANG. Generally the different parts
| of the system are completely separate teams that interact through
| apis and ingest systems. Usually there's a mix of online and
| offline calculations, where features are stored in a nosqldb and
| some simple model runs in a tomcat server at inference time, or
| the offline result is just retrieved. Almost everything is
| precomputed.
|
| We had an api layer where another team runs inference on their
| model as new user data comes in, then streams it to our api,
| which ingests the data.
|
| On top of this, you have extensive A/B testing systems
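[Editor's note: a toy version of the serving pattern described in the comment above. A plain dict stands in for the NoSQL feature store, the "nightly batch job" output is hard-coded, and all names are invented; the point is only the shape of the lookup-first, fallback-second request path.]

```python
# Precomputed recommendations live in a key-value store; a trivial
# fallback model runs at request time only when no entry exists.
PRECOMPUTED = {                 # imagined output of a nightly batch job
    "user_1": ["item_a", "item_b", "item_c"],
}
POPULAR = ["item_x", "item_y", "item_z"]  # offline popularity ranking

def serve_recommendations(user_id, k=2):
    recs = PRECOMPUTED.get(user_id)  # cheap lookup: the common path
    if recs is not None:
        return recs[:k]
    return POPULAR[:k]               # cold-start / cache-miss fallback

print(serve_recommendations("user_1"))  # → ['item_a', 'item_b']
print(serve_recommendations("user_9"))  # → ['item_x', 'item_y']
```

The "almost everything is precomputed" observation shows up here as the happy path being a single key lookup, with online computation reserved for users the batch job hasn't covered yet.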
| splonk wrote:
| I have as well, and your comment matches my experience more
| than the article does. Different teams own different systems,
| and there's basically no intersection between "things that
| require a ton of data/computation" and "things that must be
| computed online".
| oofbey wrote:
| Yep. The author, as a peddler of recommendations solutions,
| has an incentive to convince people that this problem is very
| complicated, and they should hire a consultant.
|
| In practice, good old Matrix Factorization works really well.
| Can you beat it with a huge team and tons of GPU hours to
| train fancy neural nets? Probably. Can you set up a nightly
| MF job on a single big machine and serve results quickly?
| Sure can.
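[Editor's note: the "nightly MF job on a single big machine" mentioned above can be sketched in a few dozen lines. This is plain SGD matrix factorization on a made-up toy ratings matrix; the data, dimensions, and hyperparameters are illustrative only, and a production job would use a tuned library rather than hand-rolled loops.]

```python
import random

def factorize(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02,
              epochs=200):
    """SGD matrix factorization: learn low-rank user/item factors.

    ratings: list of (user_index, item_index, rating) triples.
    Returns U (n_users x k) and V (n_items x k) as lists of lists.
    """
    rng = random.Random(0)  # fixed seed for reproducible runs
    U = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(U[u][f] * V[i][f] for f in range(k))
            err = r - pred
            for f in range(k):  # gradient step with L2 regularization
                uf, vf = U[u][f], V[i][f]
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V

# Tiny example: 3 users x 3 items, five observed ratings.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 2, 1.0), (2, 1, 2.0)]
U, V = factorize(ratings, 3, 3)

def predict(u, i):
    return sum(U[u][f] * V[i][f] for f in range(len(U[u])))
```

After the nightly fit, serving reduces to the dot product in `predict` (or precomputing each user's top-N and storing it), which is exactly why this approach fits on one big machine.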
| lysecret wrote:
| Yeah, same here. What NoSQL DB did you use for these lookups?
| I'm currently using Postgres for it, but it seems a bit like a
| waste. Even though the array field is nice for feature
| vectors.
| jenny91 wrote:
| Presumably they mean internal stuff like google bigtable or
| equivalent. (Though some version of that is now on gcp).
| nickdothutton wrote:
| If you can possibly precompute it, precompute it.
| [deleted]
| rexreed wrote:
| Isn't this obvious list-building promotion for a company (Fennel)
| that sells recommendation systems?
|
| "Fennel AI: Building and deploying real world recommendation
| systems in production Launched 18 hours ago"
|
| Caveat reader.
| [deleted]
| warent wrote:
| Nothing wrong with some content marketing. They provide value
| to people in return for getting exposure to their brand. Simple
| healthy quid pro quo
| imilk wrote:
| I'll never understand why people think this is a valid
| criticism of an article, rather than pointing out an issue they
| have with the actual content of the article. There's nothing
| inherently wrong with a company sharing info about the space
| they operate in. In fact, it should be encouraged as long as
| what they share is useful.
| notafraudster wrote:
| It's a short-hand for the treatment of the subject being
| pretty shallow and non-descript, which seems to apply to this
| article exactly. I read this and didn't learn anything.
| ZephyrBlu wrote:
| Do you work on recommendations or something similar as part
| of your job? I don't and I found the article interesting.
| imilk wrote:
| Saying the article is "pretty shallow and non-descript" is
| much shorter and more useful than what they posted.
| notafraudster wrote:
| Right, but then it starts a meta-conversation about why
| the article got posted, or even written. It doesn't have
| the down-the-rabbit hole trait of an individual project
| of passion, or the sort of authoritative voice of a
| conference talk or even a Netflix blog post, it doesn't
| really speak to specific actionable technologies so it's
| not the kind of onboarding a Towards Data Science post
| would be. And that meta conversation inevitably leads to,
| oh, it's a marketing funnel. So just saying "this is
| content marketing" I think is a shibboleth for the entire
| conversation that starts with "pretty shallow and non-
| descript".
|
| Of course I didn't write the original comment and there's
| something to say for flag-and-move-on or whatever, and
| other people did enjoy it. I'm just saying I understand
| the impulse to short-circuit the entire tedious
| conversation!
| HWR_14 wrote:
| It provides more information. It's shallow and non-
| descript because it's an ad is the argument. I don't know
| if I believe that here. It's a blurry line with sponsored
| content.
___________________________________________________________________
(page generated 2022-04-11 23:00 UTC)