[HN Gopher] Launch HN: Lumona (YC W24) - Product search based on...
       ___________________________________________________________________
        
       Launch HN: Lumona (YC W24) - Product search based on Reddit and
       YouTube reviews
        
       Hey HN! We are Lumona (https://lumona.ai), a product search engine
       that recommends products based on what people on social media--
       Reddit and YouTube, for now--are saying about them.

       Rather than going through SEO-filled Google results or adding
       site:reddit.com to your search, we explain what makes a good
       product, show you the best products, and back it up with Reddit
       and YouTube reviews about the product. We're starting with
       skincare products (more on that below) and plan to expand from
       there.

       Here's a demo:
       https://www.youtube.com/watch?v=C4kKjW2YkZ4&lc=Ugzl94GP9SDBO...

       We started off with skincare because, growing up, we struggled
       with acne but had no clue what skincare products could actually
       help us. Going down the rabbit hole of endlessly scrolling
       r/SkincareAddiction and watching countless hours of videos about
       cystic acne was not fun.

       Lumona's skincare search index was built by first scraping the
       internet for listings of skincare products, along with their
       ingredient lists, through a combination of SERP, Amazon's API,
       and web page crawling. We then use a fine-tuned Mistral LLM to
       parse through a large number of Reddit threads and YouTube
       transcripts to extract opinions made by users, along with the
       context in which the opinions were made. These opinions are then
       matched with any relevant products through another fine-tuned LLM
       that looks at an opinion and any products that have a high cosine
       similarity to the opinion's subject and decides whether that
       opinion is relevant to any of those products. Using a Mistral-7B
       fine-tune trained on GPT-4 outputs allowed us to parse through
       hundreds of thousands of Reddit threads in a simple way with just
       hundreds of dollars of compute.

       If your query relates to a specific situation (e.g. "cleansers for
       my son who has inflamed acne on his forehead"), we search
       semantically through the opinions of Redditors and YouTubers to
       retrieve the products recommended by those who have dealt with a
       similar situation. If your query relates to a specific product
       (e.g. "iunik centella gel"), we instead go through the product
       listings themselves to return you the relevant products.

       We also use an LLM to analyze your search query to tell you what
       ingredients or effects are preferable for your skin concern. For
       example, if you searched for "inflamed forehead acne", properties
       like "Oil-Control" and "Azelaic Acid", which are good for dealing
       with inflamed acne, would be explained to you, and results
       containing those properties would be boosted and tagged in our
       results. You can also try out searches like "korean cleansers
       under $20 with Cica" to filter for certain ingredients and price
       points.

       While we think we've built a product search that would be pretty
       helpful for our teenage (and current!) selves, there are many
       improvements we'd like to make, such as getting opinions from
       TikTok and other social media platforms and making our opinion
       extraction process more robust for edge cases (e.g. by using OCR,
       video transcription tools). We're also planning on allowing our
       users to upload their own reviews and content and to expand our
       search across more products.

       The long-term potential is to be a go-to product for anyone
       looking for what other people think about anything subjective
       (products, restaurants, b2b products, vacation planning, etc.). We
       believe that the entire discovery experience can be revolutionized
       by making it as easy as searching on Google to find out what the
       people you care about think about something. On the individual
       level, we want to make sharing your opinions with your friends and
       the world as easy as posting a picture on Instagram.

       For now, if you have any skincare needs, whether it be to solve a
       skin concern, get rid of an annoying pimple, or just to find a
       good sunscreen, please give us a try: https://lumona.ai (We are an
       Amazon and Stylevana affiliate.) We'd love to hear your feedback
       on our search engine, whether that be how the skincare search
       performs, what you think is missing, what products you want to see
       there, or any technical suggestions!
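
The opinion-to-product matching step in the post starts by finding products whose embeddings have a high cosine similarity to the opinion's subject, before a second LLM judges relevance. A minimal sketch of that shortlist step follows. The three-dimensional vectors, the product names, the `threshold` and `top_k` values, and the function names are all made up for illustration; the post does not say which embedding model or cutoff Lumona uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def shortlist_products(opinion_vec, product_vecs, threshold=0.8, top_k=3):
    """Return up to top_k (name, score) pairs scoring at or above threshold.

    The shortlist would then go to a separate relevance check
    (an LLM in the post); that step is not modeled here.
    """
    scored = [
        (name, cosine_similarity(opinion_vec, vec))
        for name, vec in product_vecs.items()
    ]
    candidates = [(name, score) for name, score in scored if score >= threshold]
    # Highest score first; ties broken by name so the order is stable.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return candidates[:top_k]


# Hypothetical toy embeddings, not real model output.
PRODUCTS = {
    "centella gel": [0.9, 0.1, 0.0],
    "foaming cleanser": [0.1, 0.9, 0.1],
    "mineral sunscreen": [0.0, 0.1, 0.9],
}
opinion_subject = [0.8, 0.2, 0.0]

candidates = shortlist_products(opinion_subject, PRODUCTS)
print(candidates)
```

With these toy vectors only the first product clears the threshold, so only it would be passed on to the relevance check.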
        
       Author : philena
       Score  : 138 points
       Date   : 2024-03-29 19:04 UTC (1 days ago)
        
       | kaiomagalhaes wrote:
       | this idea is awesome, I hope you get into software products
        
         | philena wrote:
         | thank you! we've been wanting this as well while building this
         | out haha, will do :)
        
       | gumptionary wrote:
       | I understand why you went for a product search engine (gotta
       | monetize) but I think one of the reasons mining reddit for intel
       | is so helpful is you aren't always being sold a product.
       | 
       | For example: I recently turned to reddit because I was looking
       | for a foam roller to resolve some IT band issues from running,
       | and ended up finding a stretching routine that has fixed my
       | problem without buying anything.
       | 
       | Either way, I think this is really cool and bypassing the
       | nonsense that google is becoming is a winning path.
        
         | philena wrote:
         | thank you! i completely agree -- i often go to reddit when
         | looking for tv show recommendations because of its honest
         | advice from the community (maybe it's because of its
         | anonymity?)
         | 
         | we definitely want to expand this to outside product search and
         | be more of a general recommendation/opinion search (e.g. in
         | your case, finding out what people are saying about how to fix
         | IT band issues from running), interested as to what you would
         | think about this :)
        
         | zztop44 wrote:
         | I know this is wildly off topic, but can you please share any
         | info about your stretching routine? I have persistent IT band
         | issues from running...
        
           | moneywoes wrote:
           | likewise
        
         | 1oooqooq wrote:
         | you'd be surprised but reddit is mostly shills.
         | 
         | besides your anecdote, all the games reviews, computer part
         | reviews, etc are all paid by drop shippers. and some reddits
         | like mattress review ones are exclusively shills talking among
         | themselves.
        
       | ilrwbwrkhv wrote:
       | This is great. You can then start seeding products which give you
       | a high cut and then proclaim them as the "best". Basically what
       | Wired and all do now but without the whole article bit and you
       | can claim "knowledge of the public".
        
         | philena wrote:
         | that's an interesting idea -- we have been seeing this play out
         | successfully as well (like you mentioned, Wired + sponsored
         | youtube videos + etc). though that would be useful for
         | profitability, we're afraid that may compromise our
         | reputability as being the knowledge of the public. instead
         | we're looking for ways where, when we expand to more opinions
         | and reviews, we can robustly filter out those that seem
         | disingenuous / are sponsored. curious as to what you think
         | about this "filtering" out + if you have any ideas of going
         | about this :)
        
           | ilrwbwrkhv wrote:
           | As long as you don't show blatantly horrible products, your
           | reputation will be fine.
        
             | dawalker wrote:
             | we'll do our best :)
        
       | ravroid wrote:
       | Cool concept. Not relevant to me in its current state being
       | limited to skin care products, but would love to use something
       | like this for things like supplements or other products where I
       | otherwise have to sift through Amazon reviews & reddit threads.
        
         | dawalker wrote:
         | Thanks and makes sense. Supplements+general health and beauty
         | will probably be one of the first things that get added outside
         | of skincare. Would be interested in seeing the reviews as well
         | for those considering how supplements are sold+regulated.
        
       | huevosabio wrote:
       | This is so cool. I already do this in a very ad-hoc way. Will
       | definitely try it!
       | 
       | My only concern is that once Reddit reviews get used at scale for
       | product discovery, we will see an inflow of fake and paid reviews
       | in the comments. This will further pollute Reddit and probably
       | drive discussions to forums closed from the public eye, e.g.
       | Discord.
       | 
       | Obviously, this is not your fault at all, it's just the market
       | dynamics at hand.
       | 
       | Anyway, let me try it!
        
         | dawalker wrote:
         | Thanks, we really appreciate it! This is something we've been
         | thinking about too. One of the things that we've noticed is
         | that video reviews, which are harder to fake (for now), have a
         | lot more effect on us than almost all text reviews. We're
         | thinking that letting people upload their own video reviews
         | will help solve this problem as long as we can detect video
         | deepfakes, but that's definitely not a complete solution (like
         | you said though, not sure anything is).
        
           | 01HNNWZ0MV43FF wrote:
           | tbf it's common for YouTube uploaders to be paid to advertise
           | products that barely work
        
             | dawalker wrote:
             | lol true, the state of youtube ads is pretty bad nowadays
             | (although the raid shadow legends spam is gone from my feed
             | now). Just places more emphasis on how much people trust
             | the individual channel.
        
         | A_D_E_P_T wrote:
         | > _My only concern is that once Reddit reviews get used at
         | scale for product discovery, we will see an inflow of fake and
         | paid reviews in the comments_
         | 
         | This is already happening.
         | 
         | _A lot_ of product-related posts on Reddit are made by
         | marketing agencies, PR firms, SEO consultants, etc. There's
         | also a thriving secondary market for "high karma" Reddit
         | accounts, which are bought and sold with ease. Unlike old-
         | fashioned forums, which were difficult for outsiders to crack,
         | Reddit is _easy_ to game and basically it's already the most
         | astroturfed place on the internet. Making it the basis of a
         | product search system can only make it worse.
        
           | dawalker wrote:
           | Very true. It's interesting how Reddit has maintained a
           | relatively high trust within most people though. It's also
           | worth noting that while this is happening, there aren't many
           | other places that most people go to where the site is mainly
           | text-based and there is a higher level of trust that I know
           | of. Personally, I'd trust Reddit over a random blog from a
           | Google search, but that isn't a high bar.
           | 
           | All of that being said, I think this will be a much bigger
           | problem with misinformation generally on all of the internet
           | as AI gets better, especially considering the election later
           | this year.
        
             | 01HNNWZ0MV43FF wrote:
             | > It's interesting how Reddit has maintained a relatively
             | high trust within most people though
             | 
             | Maybe the PR companies moved up a level in the meta-game -
             | Don't talk up your product, talk up Reddit itself, _then_
             | go on Reddit to talk up your product.
        
               | dawalker wrote:
               | They're getting smarter for sure, maybe some shadow
               | marketing going on with the IPO too lol
        
               | pphysch wrote:
               | It's not even that "meta": they would just be dogfooding.
               | I would be surprised if this wasn't the case.
        
           | addandsubtract wrote:
           | As with anything on reddit, "the real ______ is always in the
           | comments." You get the best advice in the place you expect
           | it the least.
        
             | dawalker wrote:
             | Great point. While reddit may be astroturfed, all it takes
             | is one good comment.
        
               | A_D_E_P_T wrote:
               | Thing is, there aren't many marketing firms that don't
               | have (or can't buy on a work-for-hire basis)
               | upvote/downvote networks. It's trivial to promote
               | comments in ways that look organic. It's equally trivial
               | to downvote commercially harmful comments into oblivion.
               | 
               | What's more, Reddit posts are usually actively discussed
               | only for a day or two. But votes can be cast, and new
               | comments can be added, for months. One strategy is to
               | wait until the conversation has completely died down,
               | then hijack it with new comments that somehow seem to get
               | as many votes as they need to rise to the top. When, a
               | year later, somebody digs up that thread on Google,
               | they'll see the promoted comments first.
               | 
               | Reddit has severe structural flaws that are, I think,
               | unfixable. In making the upvote/downvote thing a kind of
               | game, and in enabling easy throwaway accounts whose votes
               | are weighed in the same way as those of the longstanding
               | accounts of regular commenters, they've naturally made
               | their forum easy for commercial interests to game.
        
               | dawalker wrote:
               | Great points, not sure those are fixable either. It
               | definitely has some advantages from those same things in
               | some respects, but monetizing reddit outside of ads (i.e.
               | ecommerce like tiktok/instagram) because of these things
               | is going to be a challenge for sure.
        
             | fuzzythinker wrote:
             | I wouldn't blindly trust comments.
        
           | chasebank wrote:
           | Already happening? Here's a clip from an astroturfing firm in
           | the year 2000 for Sprite and other clients. They called it
           | 'under the radar marketing' back then. I'd wager 95% of all
           | product recommendations on Reddit, and HN for that matter,
           | are placed by people with agendas.
           | 
           | [0] https://www.youtube.com/watch?v=F0z0a4SLIsM
        
             | dawalker wrote:
             | The top comment on that clip drives home that point. While
             | there's no way to verify that, I wouldn't be surprised if
             | it's a lot more than we realize. That being said and to
             | play devil's advocate - if all of them are already
             | astroturfed with some people discovering that, then how
             | much do most people care about it?
        
             | AndrewCopeland wrote:
             | That's what I do, I think I provide a great product/service
             | but also still want to get the word out.
             | 
             | A marketing agency who will sell a bag of shit as long as
             | they get paid is definitely a net negative.
             | 
             | Overall reddit has been going downhill for a decade at
             | this point and it only makes sense that it will/has been
             | captured by companies trying to profit off it.
        
               | qiongzhouh wrote:
               | Do you have any pointers for learning more about this
               | space? I've personally been pretty skeptical of sponsored
               | products on YouTube, though I find myself getting tempted
               | to / actually trying them out anyway, but haven't thought
               | too hard about small Redditors or Tiktokers getting paid
               | to shill products.
               | 
               | Curious what companies do this / how companies are going
               | about conducting these unpublicized marketing campaigns?
        
           | jsheard wrote:
           | Likewise on YouTube, which OP's service is also pulling from,
           | several times now I've gone looking for reviews of a specific
           | product and one of the top results has been a TTS voice
           | reading a probably ChatGPT-generated "review" which
           | invariably recommends the product because the point is to get
           | you to click the affiliate link in the description. The
           | channels I saw were posting "reviews" so frequently and
           | consistently, and for such random products that I suspect the
           | entire operation is completely automated.
        
             | dawalker wrote:
             | Interesting, haven't seen one of those yet. With public
             | video+audio models getting much better that will only get
             | way worse over time. Excited to see what YT/Google decides
             | to do about it.
        
               | jsheard wrote:
               | Example: search for "MSI G27C4X" on YouTube, for me at
               | least both the first and second results are fake robot
               | reviews. There are a couple of real impressions videos by
               | real people but for some reason YouTube sorts them below
               | the AI spam. One of those spam channels is posting
               | multiple reviews per hour, with >11,000 videos and
               | counting.
        
               | dawalker wrote:
               | That's crazy, going to be really interesting once this
               | ramps up/if it starts getting good traffic.
        
               | AndrewCopeland wrote:
               | I've seen AI generated content about dog breeds, the
               | content was absolutely horrible to watch and listen to.
               | 
               | In the near future we will have YouTube videos that pride
               | themselves on being organically made, no GMOs and built
               | by humans.
        
               | dawalker wrote:
               | lmao true, already seen a few companies gunning for the
               | YouTube for AI generated videos mantle so we'll see how
               | it goes
        
             | qiongzhouh wrote:
             | Hmm, that's interesting. I've been getting a lot of
             | Instagram and Tiktok reels with the robotic TTS voice
             | nowadays, I've just been assuming that it's a funny thing
             | that people do so that they don't have to record their own
             | voice.
             | 
             | Wondering how / if we should be filtering out this content
             | now that you can make TTS voices that sound like they're
             | completely real
        
               | jsheard wrote:
               | Even with a perfectly convincing TTS voice it's still
               | given away by the fact that they don't show themselves on
               | video interacting with the product, they usually just
               | show a slideshow of official product images. At least
               | some people must already be falling for this crudely
               | generated content for it to be worth their while to
               | produce it though.
        
               | btown wrote:
               | There's an interesting phenomenon where a certain type of
               | rapid-to-experience, entertaining content, often with an
               | enjoyable twist, has become synonymous with the glaringly
               | imperfect TikTok voice... and thus, conversely, creators
               | use TTS to signal that their content is similarly
               | entertaining. And as more and more traditional creators
               | start to use TTS, real voices become devalued as a
               | quality signal. Avoiding recording is only a part of the
               | phenomenon!
               | 
               | https://gesserit.co/ (formerly tiktoktts) is one of the
               | most popular ways to generate a TikTok-esque TTS voice
               | outside of that platform. I don't think they could have
               | chosen a better name!
        
               | qiongzhouh wrote:
               | Wow, that voice on the page sounds exactly like what I
               | hear on TikTok / Instagram all the time. Definitely
               | evokes the feeling that I'm about to be entertained by
               | something.
               | 
               | Thanks for sharing that, we'll have to think hard about
               | how to measure the quality / realness of content online
               | beyond the simple things like upvotes and subscriber
               | count
        
           | Phanlan2016 wrote:
           | Can you give me some examples of old-fashioned forums, I just
           | want to see some of it for the sake of it ;)
        
             | A_D_E_P_T wrote:
             | Behold: https://www.saabcentral.com/forums/
        
           | criddell wrote:
           | Where can you buy and sell high karma Reddit accounts?
        
             | A_D_E_P_T wrote:
             | There are tons of clearweb markets, e.g.:
             | https://openmarketingstudio.mysellix.io/
        
               | ileaf wrote:
               | Reddit has become a massive content machine, and user
               | engagement is at an all-time high. There are a few
               | reasons why we're seeing more trending posts:
               | 
               | it's important to remember that not all trending posts
               | are created equal. There's definitely a growing trend of
               | services like SaaS https://openmarketing.studio/ that
               | offer businesses a suite of tools to manage their Reddit
               | presence, including tracking trends, analyzing sentiment,
               | and even (controversially) influencing post performance
               | through upvotes, downvotes, and comments.
        
         | edmundsauto wrote:
         | A centralized curator could actually help by drawing a "shill"
         | graph and excluding those signals.
        
       | avsavani wrote:
       | Results don't finish loading for me. I will try again in a few
       | hours; I am really curious to see how it compares to generalized
       | search engines like Perplexity and You.com
        
         | philena wrote:
         | sorry about the loading issue -- we'll look into that right
         | now!
        
       | QAComet wrote:
       | This is a neat product, and I plan on trying out some of the
       | recommendations for sunscreen.
       | 
       | During my journey using the app there were a few things I noticed:
       | 
       | 1) It seems like the intermediate page is generating text from
       | the LLM as well, which makes the whole process quite slow on my
       | machine. It took maybe 10 seconds before the loader finished
       | displaying the text. If I try and perform the same query again on
       | the same browser, the results are somewhat quicker, maybe
       | 700-800ms of wait time, but this still seems too slow. Once I ran
       | the query five or so times, it was as quick as the demo queries
       | on the front page.
       | 
       | 2) Consistent results: If I use the same query on separate
       | browsers, I'm given different products as the "Top Recommended
       | Product", which seems odd. I know LLMs are stochastic, but the
       | feed starting with the "Top Recommended Product" probably
       | shouldn't have stochasticity. This problem opens up some
       | interesting ML cans of worms, but I believe these issues could be
       | overcome.
       | 
       | 3) Another issue was if I wanted to scroll in the left column
       | while the right column was still loading, the scrolling was very
       | janky. This was an issue on firefox, but it took quite a long
       | time for the app to be functional (> 10s)
       | 
       | 4) Perhaps you could move the search bar and the logo to the top,
       | so the logo is on the top left corner and the search bar takes
       | space to the right of it. This way there aren't overlapping
       | elements, I'm sure there's some annoying edge cases there which
       | would frustrate users
       | 
       | 5) For negative ingredients (and maybe any of the ingredients) it
       | would be nice if you kept track of an ingredient database with
       | references. I want to know _why_ some ingredient is bad for my
       | skin, and what I could expect.
       | 
       | 6) If a product has many distributors, my first thought was the
       | arrow scrolling through products was a slider for the distributor
       | list. I wonder if there's a nice way to differentiate the arrow
       | further, so its functionality is more apparent.
       | 
       | Anyway, this is an excellent proof of concept, I'm excited to see
       | how this product develops.
        
         | qiongzhouh wrote:
         | Thanks for trying it out!
         | 
         | As for the performance issues, we're looking into several
         | things that could speed things up:
         | 
         | - Fine-tuning a small LLM for the results on the intermediate
         | page and deploying on a provider with higher throughput and
         | lower time to first token
         | 
         | - Admittedly, there are quite a few SQL query / index
         | optimizations we need to make on the backend, along with making
         | parts of our pipeline async
         | 
         | - The frontend itself is also not very performant right now,
         | but we're working on it.
         | 
         | We cache previous calls to the API, so that's why the demo
         | queries or queries others have tried before you are faster.
         | I'll ship a change that makes the results more consistent but
         | not fully consistent later today.
         | 
         | As for the ingredients, citing sources is definitely a next
         | step. In the meantime, I recommend looking up the ingredients
         | that catch your eye on a place like EWG Skin Deep if it's a
         | huge concern for you (I used to do this to make sure my
         | ingredients weren't comedogenic for acne).
         | 
         | Great point about the distributor list UI, we'll think about a
         | better way to show it!
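
The two points raised in this exchange, cached API calls making repeat queries faster and the top result changing between browsers, can be illustrated with a small sketch. It assumes a cache keyed on a normalized query string and a ranking with an explicit tie-break; `SearchCache`, `normalize_query`, `rank`, and `fake_backend` are hypothetical names, and the thread does not describe how Lumona's cache or ranking is actually implemented.

```python
def normalize_query(query):
    """Lowercase and collapse whitespace so trivially different
    spellings of a query share one cache entry."""
    return " ".join(query.lower().split())


def rank(products):
    """Sort (name, score) pairs by score descending, then by name.

    The name tie-break makes the order deterministic when scores are equal.
    """
    return sorted(products, key=lambda pair: (-pair[1], pair[0]))


class SearchCache:
    """In-memory cache in front of a slow search function."""

    def __init__(self, search_fn):
        self._search_fn = search_fn
        self._store = {}
        self.misses = 0  # number of times the backend was actually called

    def search(self, query):
        key = normalize_query(query)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._search_fn(key)
        return self._store[key]


def fake_backend(query):
    # Stand-in for the real (slow, possibly stochastic) search pipeline.
    return rank([("sunscreen b", 0.9), ("sunscreen a", 0.9), ("sunscreen c", 0.5)])


cache = SearchCache(fake_backend)
first = cache.search("Best  Sunscreen")
second = cache.search("best sunscreen")
print(first, cache.misses)
```

Both calls hit the same cache entry, so the backend runs once and every caller sees the same top result. A real deployment would also need expiry and a shared store rather than a per-process dictionary.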
        
       | hypercube33 wrote:
       | How do you deal with bot posts to push products on either
       | platform skewing the reviews?
        
         | qiongzhouh wrote:
         | For now, we're excluding Reddit posts that are clearly
         | automated and making sure the YouTube content is not sponsored,
         | which you are required to disclose by the YouTube ToS.
         | 
         | We'll have to dig deeper into how to filter out spammy reviews.
         | I can imagine analyzing a user's post history or detecting if
         | content was clearly GPT written, but it's hard to really tell.
         | I know there are things like Amazon review analyzers out there,
         | but we'll have to learn more about this. I wonder if the people
         | of HN have any suggestions on this front.
         | 
         | There'll probably be a lot of AI generated reels that look like
         | they're from real people online soon too. I wonder what
         | platforms like Tiktok and YouTube will do about this. If this
         | ends up being a huge problem, we can probably try to use ML
         | methods to check if the video was filmed in the real world
        
           | moneywoes wrote:
           | what does clearly automated mean
        
             | qiongzhouh wrote:
             | For now, it's just removing AutoModerators and things
             | labeled as bots. Now that I'm reading this again, I realize
             | that doesn't really help, since bots pretending to be people
             | recommending products don't get filtered out.
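
A rough sketch of the "clearly automated" filter described here makes the limitation concrete. The comment dictionaries, the `is_labeled_bot` flag, and the `_bot` username suffix rule are assumptions for illustration only; they are not Reddit API fields or Lumona's actual rules.

```python
def is_clearly_automated(comment):
    """True only for accounts that openly identify as automated.

    Checks the AutoModerator account, a hypothetical explicit bot label,
    and a "_bot" username suffix. Undisclosed promotional accounts pass.
    """
    author = comment.get("author", "").lower()
    if author == "automoderator":
        return True
    if comment.get("is_labeled_bot", False):
        return True
    return author.endswith("_bot")


comments = [
    {"author": "AutoModerator", "text": "Please read the rules."},
    {"author": "remindme_bot", "text": "I will message you in 2 days."},
    {"author": "helper", "is_labeled_bot": True, "text": "Beep boop."},
    # Looks like a person; could be organic or an undisclosed paid post.
    {"author": "skincare_fan_92", "text": "This cleanser fixed my acne!"},
]

kept = [c for c in comments if not is_clearly_automated(c)]
print([c["author"] for c in kept])
```

Only self-identified automation is removed. The last comment survives whether or not it is astroturfed, which is the gap pointed out in this subthread.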
        
               | dns_snek wrote:
               | That strikes me as very naive. Reddit bots are never
               | marked as bots, that's the whole point of astroturfing.
               | Youtubers aren't diligent about disclosing sponsorships
               | either, regardless of what their ToS say.
               | 
               | Slightly outdated (2018), but they found that only 10% of
               | Youtube videos disclosed sponsorship:
               | https://www.engadget.com/2018-03-28-youtube-influencers-
               | spon...
               | 
               | A recent report by the European Commission found that
               | only 20% of overall "influencer" posts disclosed
               | sponsorships: https://ec.europa.eu/commission/presscorner
               | /detail/en/ip_24_...
        
               | qiongzhouh wrote:
               | I think this is a very good point. We've focused a lot on
               | correctly matching reviews with products / brands, but
               | haven't taken hard enough of a look at astroturfing.
        
       | hubraumhugo wrote:
       | We've tried to build this in the past with Looria.com, where we
       | aggregated and summarized reviews from the most trusted sources,
       | e.g. Reddit: https://www.looria.com/reddit
       | 
       | Couple of challenges:
       | 
       | - Astroturfing is everywhere
       | 
       | - The data sources, especially social media, are becoming more
       | protective of their data
       | 
       | - Monetizing this is super hard. As an aggregator, you're always
       | just the intermediary. The glory times of ads and affiliate
       | marketing are over.
       | 
       | Vetted.ai is working on something similar and they raised $14M in
       | 2022. For all consumers, I really hope one of you will succeed!
        
         | nextworddev wrote:
         | Curious - what is your bearish case for profitability of
         | affiliate marketing?
        
           | p10_user wrote:
           | people can find multiple ways to get to the same product.
           | once your way starts charging they will find another way.
        
             | dawalker wrote:
             | Is the implication here that you need to charge and users
             | will leave you once you do? If you can make a product
             | that's significantly better, then you should be able to
             | charge. The thing I'd note for affiliate marketing as a
             | business model is that for it to generate significant
             | revenue, you need to have a lot of traffic while other
             | business models can generate that much faster
             | (subscriptions) or make you money based off of that traffic
             | (ads) instead of how many products are purchased.
        
               | p10_user wrote:
               | Your note on affiliate marketing is what makes your first
               | statement potentially unachievable. How does a consumer
               | "know" that a product is significantly better to the
               | point of "worth paying for"? There's always another free
               | (potentially ad supported) affiliate marketer (or 5)
               | around the corner. (Also considering the "worst" version
               | of this "product" is an unskippable ad.)
               | 
               | I don't know the solution
        
               | dawalker wrote:
               | Fair. The best solution we've seen is building the
               | product in some way where it's somewhat defensible,
               | either through data, features that bigger players won't
               | build, etc. and then using a subscription based model if
               | users are willing to pay for that and value the searches
               | high enough or using an ads based model if you're
               | optimizing for traffic rather than pure value on each
               | search.
        
               | p10_user wrote:
               | I suppose two case studies worth exploring are:
               | 
               | - Consumer Reports (subscription magazine, recurring
               | revenue)
               | 
               | - NY Times Wirecutter (a potential add-on service to boost
               | apparent value for subscribers)
        
               | qiongzhouh wrote:
               | We've looked a bit into CR and Wirecutter, not that
               | deeply into CR yet though. I definitely used Wirecutter
               | for a bit of things in the past, and they have a high
               | level of trust that we'll need to seek to replicate.
        
           | robryan wrote:
           | Over time affiliate programs like the Amazon one have become
           | a lot less generous. On the other hand though from running an
           | ecommerce site I'd be happy to work with an affiliate like
           | this that isn't just a coupon website that basically adds no
           | value.
        
             | dawalker wrote:
             | That's what I've heard as well. Also, the lag to when you
             | actually get paid is super painful.
             | 
             | Also curious - how do you think about affiliates as someone
             | who runs an ecommerce site? Are there any reservations
             | about whether services like us take search traffic or ads
             | revenue?
        
         | dawalker wrote:
         | Thanks! Super interesting how many different approaches there
         | are to this problem. Definitely encountered these challenges,
         | and we think there's solutions to them eventually that we have
         | to build towards. I'll drop a message sometime, would love to
         | chat :)
        
       | mkchoi212 wrote:
       | Are you paying for Reddit's API or did y'all find a way around
       | it?
        
         | p10_user wrote:
         | they're either paying or it was a gift from sama
        
           | dawalker wrote:
           | would love to say that it was a gift from sama, but he hasn't
           | blessed us :(
        
         | qiongzhouh wrote:
         | We are not paying for Reddit's API to get our data, there are
         | some really good and complete and publicly available dumps of
         | Reddit data available online. We are in contact with the folks
         | at Reddit, which is of course a YC company, so they're aware of
         | what we're doing.
        
       | shaoner wrote:
       | Love the idea, my only concern is how to trust that at some point
       | you're not going to include sponsored products?
        
         | qiongzhouh wrote:
         | As a user of our product, I'd really hate it if we were
         | recommending crappy products. I suspect users will also feel
         | the same and this thought will hold us accountable.
        
           | Loughla wrote:
           | That's a non-answer to the question. You're saying that you
           | would like to recommend good products only, but the question
           | was about sponsored products.
           | 
           | And on another comment about adding high margin results as
           | "best" you seem very interested in that concept.
           | 
           | I'm not trying to be hateful, but if you're thinking about
           | how to seed results with high margin products for yourself,
           | instead of the actual best results _already in the life cycle
           | of your product_ , I think it's just a matter of (not much)
           | time before any quality you have will immediately plummet.
           | This makes me very cautious about your product. Very.
        
       | pj_mukh wrote:
       | Yo, can I take a picture (of my skin) and you can suggest some
       | solutions? Multi modal plz!
        
         | dawalker wrote:
         | For sure! We'll work on that in the next couple of updates,
         | it's been on our minds for a while.
        
       | frankdenbow wrote:
       | Nice work! I kind of do this with google and reddit already
       | sometimes, as a well written explanation for why someone likes a
       | particular item plus the upvotes do help me make decisions. The
       | format looks pretty good, would just like to have a view of all
       | the products at once in a comparison if possible.
       | 
       | The concept of a search that is multi layered is something I see
       | The Browser Company and others doing to make your one search a
       | bit more impactful, so kudos for going in that direction as well.
       | I would do restaurants and search availability as well.
       | 
       | More thoughts: https://www.youtube.com/watch?v=xKFDuZsdXrc
        
         | dawalker wrote:
         | Thanks! We just watched your review together in the living
         | room, and we really appreciate your thoughts+detailed feedback.
         | The list of items is an interesting idea that we'll think about
         | how to fit into the ux. Comparisons is definitely something we
         | want to add later down the line as well.
         | 
         | The idea of restaurants, like you mentioned, would be really
         | great to have. It's not an immediate priority, but once we get
         | Tiktok/short form videos on the site and integrate it well,
         | it'd be really exciting to make and use.
        
       | CSMastermind wrote:
       | I sat on the before we begin page for a long time waiting for
       | something to happen before I realized nothing would:
       | 
       | https://imgur.com/a/cvT1iF8
        
         | dawalker wrote:
         | Sorry that happened :( What were you searching for? We'll look
         | into it.
        
           | CSMastermind wrote:
           | Seems to happen whenever I don't specify a specific product?
           | 
           | https://www.lumona.ai/search/results?q=vegan+european+leathe...
           | 
           | ^ That search does the forever loading boxes for me but if I
           | add 'cream' to the end of it then it seems to work.
        
             | qiongzhouh wrote:
             | What's happening here is that we put your search query into
             | a model that tries to figure out what skincare product
             | characteristics (e.g. Vitamin C, Retinol, etc) would be
             | good for you, but when it sees something that isn't a
             | skincare product or a skincare concern, it gets confused
             | and doesn't return anything.
             | 
             | I'll put in an error handler for this soon.
        
               | yellow_lead wrote:
               | This is something I ran into as well, mainly bc of your
               | title and description. I tried it before reading your
               | whole post, so I didn't know it's only skincare for now.
               | 
               | > Launch HN: Lumona (YC W24) - Product search based on
               | Reddit and YouTube reviews
               | 
               | > Hey HN! We are Lumona (https://lumona.ai), a product
               | search engine
        
               | qiongzhouh wrote:
               | Sorry for the confusion, we'll try to have general
               | products soon.
               | 
               | I just pushed a change to give an error message
               | explaining what's happening for non-skincare related
               | searches.
        
               | yellow_lead wrote:
               | No worries, I'm excited to use it when that happens!
        
       | shiredude95 wrote:
       | how does this service deal with a coordinated advertising
       | campaign -- most likely also driven by LLMs over a period of say
       | X months. Moderators on subs can be bought out or marginalized,
       | while youtube reviews can also be bought out. In other words, how
       | is an aggregated source a better and more trustworthy source of
       | information than a single blogger who people can ascribe some
       | amount of trustworthiness to over a period of time.
        
         | dawalker wrote:
         | Great question. This would be a bigger issue if we were only
         | aggregating results and summarizing them, but because we both
         | aggregate and show (in our opinion) the highest credibility
         | reviews from YouTubers (and other sources like blogs once we
         | add them), our idea is that while the general mass opinion can
         | be shifted through campaigns like that, the top end of the
         | spectrum should hopefully still remain pure.
         | 
         | If on the other hand the top end of the spectrum is corrupted,
         | then hopefully the masses can compensate for that. If both are
         | corrupted and all of the data sources available are, then it
         | really comes down to our ability to filter out LLM or promoted
         | content which comes down to how well they can hide it. AI
         | detection tools have been scaling alongside models, so it's
         | also a question if that will continue over time. We'll think of
         | some more advanced things if that becomes a bigger issue for us
         | :)
         | 
         | At the end of the day, if a company can do a coordinated
         | advertising campaign across the internet over months to block
         | out any negative opinion, it's a big deal for both us and the
         | social media/data sources we pull from, and it's going to be a
         | challenge we have to deal with.
        
       | johnfn wrote:
       | How does this differ from https://www.looria.com/?
        
         | dawalker wrote:
         | A couple of ways from my understanding. We have different
         | focuses in our UX and UI as we, for example, feature reviews
         | directly next to the product and show products 1 at a time
         | instead of a listing view. We also place more emphasis on
         | having a semantic search where you learn about the products
         | being offered and how they're relevant to your specific
         | situation instead of a keyword based search. From a business
         | standpoint, we're also affiliates of Amazon and Stylevana while
         | Looria isn't.
        
       | sovnwnt wrote:
       | Strange that you chose acne as your demo topic but none of your
       | results mention one of, if not the most, powerful treatments that
       | is Tretinoin/Retinol and which comes up in the first search
       | results on Google.
       | 
       | Problem is that some of the best skincare is not available over
       | the counter, and surfacing prescription treatments dips into
       | medical care, which is a whole other can of worms.
       | 
       | In the end, you are missing valuable treatments but presenting a
       | summary of poorly researched (by Reddit users) or anecdotal
       | information.
       | 
       | I love the concept though and would love to see it catch on!
        
         | qiongzhouh wrote:
         | You're right that there are some very effective prescription
         | treatments that aren't shown, but it doesn't seem like
         | prescription acne treatments are usually the appropriate
         | / doctor prescribed choice for most people facing mild to
         | moderate acne.
         | 
         | Personally, my pediatrician told me that acne is just something
         | that happens to teens and recommended that I go try some acne
         | washes from the drugstore instead of prescribing something like
         | Tretinoin which could have some pretty intense side effects.
         | 
         | Reading r/SkincareAddiction has been really helpful for me,
         | especially seeing the range of experiences that people have
         | had, and that's why we made Lumona summarize these results.
        
           | sovnwnt wrote:
           | >my pediatrician told me that acne is just something that
           | happens to teens
           | 
           | Certainly not... https://www.yalemedicine.org/conditions/acne
           | 
           | > Clinical trial data revealed that approximately 50% of
           | women in their 20s, 33% of women in their 30s, and 25% of
           | women in their 40s suffer from acne
           | 
           | >which could have some pretty intense side effects
           | 
           | Your site recommends benzoyl peroxide which has similar or
           | worse side effects compared to tretinoin.
           | 
           | It's also a lauded product on both r/SkincareAddiction and
           | r/30PlusSkincare. Not something recommended for kids, but for
           | adults with persistent acne it is worth trying, especially
           | over antibiotics and alongside BP.
        
             | broast wrote:
             | Lately topical Tretinoin has been shown in numerous studies
             | to cause Idiopathic Intracranial Hypertension aka
             | Pseudotumor Cerebri which is quite intense. My post about
             | this on sca from years ago receives many comments to this
             | day from other sufferers. I'll never touch it again despite
             | my adult acne. I wonder how many other people have
             | debilitating potentially blinding brain pressure headaches
             | and don't realize it is caused by this medicine commonly
             | accepted as safe.
        
       | ada1981 wrote:
       | Just noting that best skin care routine is:
       | 
       | - No alcohol or caffeine
       | 
       | - Lots of water
       | 
       | - Vegan diet
       | 
       | - Using baking soda
       | 
       | - Adequate sleep and time in nature
        
         | qiongzhouh wrote:
         | True, I think I would agree with most of this sentiment.
         | Unfortunately I'm not doing many of these things and am still
         | using my skincare products.
         | 
         | Perhaps we should be surfacing opinions like this beyond just
         | products.
        
       | compootr wrote:
       | If I had to guess, I'd say the top words on Reddit would be
       | "actually" or "because" and probably 69
        
         | qiongzhouh wrote:
         | r/SkincareAddiction, out of all posts and comments from 2023:
         | 
         | and: #1, skin: #23, acne: #55, because: #97, actually: #263
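
Rankings like the ones above can be reproduced from a comment dump with a plain frequency count. This is a minimal sketch, assuming the posts and comments are already loaded as strings. The function name `word_ranks`, the tokenizer regex, and the sample texts are illustrative assumptions, not the code that produced the numbers in this thread.

```python
import re
from collections import Counter


def word_ranks(texts):
    """Map each word to its 1-based frequency rank (1 = most common)."""
    counts = Counter()
    for text in texts:
        # Hypothetical tokenizer: lowercase runs of letters, digits, apostrophes.
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    # most_common() orders by count, descending.
    ranked = counts.most_common()
    return {word: rank for rank, (word, _count) in enumerate(ranked, start=1)}


# Made-up sample data standing in for a real dump of posts and comments.
sample = [
    "skin and acne and sunscreen",
    "because and actually skin",
]
ranks = word_ranks(sample)
print(ranks["and"], ranks["skin"])
```

On a real dump the same function would be fed every post and comment body for the year. Words with equal counts get adjacent ranks in an arbitrary but stable order.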
        
       | barbazoo wrote:
       | I just want to know what corded stick vacuum to buy. Where can I
       | access something that a human has written? It's become impossible
       | for me. I'm on Kagi, I wonder if Google or Bing are better at
       | this.
        
         | qiongzhouh wrote:
         | https://www.reddit.com/r/BuyItForLife/comments/15b4iks/best_...
         | This one seems to be a good thread. Hopefully we'll have it on
         | Lumona soon.
        
       | ji_zai wrote:
       | > Using a Mistral-7B FT trained on GPT-4 outputs allowed us to
       | parse through hundreds of thousands of Reddit threads in a simple
       | way with just hundreds of dollars of compute.
       | 
       | Great idea. These sort of clever approaches are needed to be able
       | to build these sort of products that benefit from scale. When the
       | cost of inference goes down, it enables new experiences. And
       | clever ways to reduce cost before the big providers do, is a
       | massive competitive advantage that makes it tough for those who
       | wait to compete with you.
       | 
       | Anyone building AI products should take note.
        
         | qiongzhouh wrote:
         | The missing part of the story is when we made an early
         | prototype using GPT-4, left it on overnight, and realized
         | that we'd spent several thousand dollars of OpenAI credits...
        
       | kiranp wrote:
       | Similar thing, but for electronic gadgets -
       | https://shoppalui.vercel.app/
        
       | kristopolous wrote:
       | Skincare is an interesting beachhead. How do you test if your
       | results are good? What's the baseline?
       | 
       | I feel like something like movies or video games would be a great
       | way to validate the approach since there's generally agreed upon
       | sentiments regarding these products.
       | 
       | Skin care I'd imagine is fairly complicated. Goal, lifestyle,
       | budget, habit and individual based needs and preferences can lead
       | to different sentiments. How do you calculate say, your loss
       | function?
        
         | qiongzhouh wrote:
         | I think that's what makes skincare interesting. We want our
         | system to be able to understand your goal, lifestyle, budget,
         | ... and pick out which product is the best for you, given what
         | others who have used the product before said.
         | 
         | With less of this information, the ground truth would probably
         | be related to how popular the product is or the average sentiment
         | of people reviewing the product. With this information though,
         | you can compare each one to see which best fits the user's
         | needs. Having compared enough products, you'll eventually
         | figure out which one is the best.
        
           | kristopolous wrote:
           | it's a recommendation system with a timeline of months and
           | the data to back it up IS your market hypothesis.
           | 
           | you can have whatever risk profile you want, some people are
           | gamblers. Personally I'd like to have greater confidence in
           | the fitness of the implementation before going all in.
           | 
           | Although I say this with the advice adage: unless the
           | business advice comes from someone who is on their own
           | private yacht, it's more opinion than advice. (I've got no
           | yachts)
        
       | moneywoes wrote:
       | how do you index reddit cost effectively without breaking their
       | tos
        
         | qiongzhouh wrote:
         | We're working with dumps of Reddit data, which means we don't
         | have to use their API or do any scraping on Reddit itself for
         | now. The data is updated monthly though, so we'll have to figure
         | out how to get higher quality data for things that are more
         | time sensitive. We're in contact with the folks at Reddit, so
         | we'll try to see if there are ways to get better data later on.
        
       | franze wrote:
       | searched for best yoga mat, got strange video about sunscreen...
        
         | qiongzhouh wrote:
         | Sorry about that, we didn't make it clear initially that Lumona
         | only has skincare products for now, we'll be working to scale
         | it beyond these products soon, but that message about skincare
         | was probably not clear enough from our post
        
       | criddell wrote:
       | It's a neat idea, but I think using the affiliate model will
       | ultimately be corrupting.
       | 
       | Maybe once you get larger you can pivot to being a paid service
       | like Consumer Reports. To me, they still feel more trustworthy
       | than other services similar to yours (like Wirecutter).
        
       | potatoman22 wrote:
       | Doesn't fine tuning models on GPT-4 output violate OpenAI's terms
       | of service?
        
         | dns_snek wrote:
         | It seems like everyone is doing it. Does anyone care? Should
         | anyone care?
        
         | qiongzhouh wrote:
         | OpenAI says in their terms that you can't "use output from the
         | Services to develop models that compete with OpenAI" [0], and
         | it seems that people are interpreting it as training a model
         | that directly competes with them, which we aren't doing. There
         | are many companies out there built on using GPT-4 outputs to do
         | task-specific fine-tuning, so it doesn't seem like it's a
         | problem unless we were trying to make a competing foundational
         | model from GPT outputs.
         | 
         | (I'm not a lawyer, so this needs to be taken with a large grain
         | of salt)
         | 
         | 0: https://openai.com/policies/terms-of-use
        
       | Ishan-002 wrote:
       | This is so cool! The way the search engine has been built up also
       | seems very smart. I'm honestly surprised too at the same time,
       | that this kind of idea hasn't been worked on before (couldn't
       | find anything similar; I could be very wrong)
       | 
       | I'm not sure if this type of problem is even a considerable one,
       | but how does the search engine handle reviews from subreddits
       | which are focussed only on a particular brand, and may
       | potentially form a bias around such products? Does the LLM's
       | awareness of each review's context handle that?
        
         | qiongzhouh wrote:
         | That's a really good point. I think in our current iteration of
         | the system, if we applied it to subreddits focused specifically
         | on a particular brand, it would not be able to account for the
         | bias there, even if it knows which subreddit the content is
         | from. That's probably too much to ask of the LLM.
         | 
         | We'll have to think about good ways to handle this. Curious
         | about your or others' thoughts on these subreddits, how do you
         | process content on these subreddits differently?
        
           | Ishan-002 wrote:
           | Brand focussed subreddits generally have a detailed review of
           | the product (which is really really helpful), as compared to
           | general topic subreddits where comparisons and versus threads
           | are more common.
           | 
           | Likewise, if I want opinions on a specific product before
           | buying it, I would definitely go to brand focussed
           | subreddits, but if I'm unsure and have a generic problem in
           | mind like "acne on forehead", then probably going to general
           | topic subreddits would be a better choice.
           | 
           | I agree, that is probably too much to ask from the LLM.
           | If you could possibly analyse the variety of brands discussed
           | on a subreddit and assign it a score according to the
           | versatility of discussions, maybe that could help.
           | 
           | Anyways, that's overcomplicating things and
           | probably something that should be of concern when the product
           | grows more mature (or if a brand bloats up a certain
           | subreddit :P).
           | 
           | Cheers, congrats on getting on YC!
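
One way to make the per-subreddit "versatility" score suggested above concrete is the normalized Shannon entropy of brand mentions. This is a minimal sketch, assuming brand mentions have already been extracted upstream as a list of brand names per subreddit. The function name `brand_diversity` and the sample brands are hypothetical, and nothing in the thread says Lumona uses this measure.

```python
import math
from collections import Counter


def brand_diversity(brand_mentions):
    """Normalized Shannon entropy of brand mentions, in [0, 1].

    0.0 means every mention is of one brand (or there are no mentions);
    1.0 means mentions are spread evenly across all brands seen.
    """
    counts = Counter(brand_mentions)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Divide by the maximum entropy possible for this many distinct brands.
    return entropy / math.log(len(counts))


# Made-up mention lists: a brand-focused subreddit vs. a general one.
brand_sub = ["BrandA"] * 9 + ["BrandB"]
general_sub = ["BrandA", "BrandB", "BrandC", "BrandD"] * 3
print(brand_diversity(brand_sub), brand_diversity(general_sub))
```

Because the score is normalized by the number of brands seen in that subreddit, a community that only ever discusses two brands equally would also score 1.0. Normalizing against a global brand list instead would penalize that case.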
        
       | dns_snek wrote:
       | Cool idea, but I don't see how it can ever possibly work with the
       | amount of astroturfing and frequency/ubiquity of undeclared paid
       | advertising.
       | 
       | You're ingesting highly biased, sponsored, astroturfed content.
       | What measures have you taken to filter Youtube reviews down to
       | the ones that haven't been sponsored, and likewise for Reddit?
       | Otherwise it's just garbage in, garbage out but wrapped in fancy,
       | legitimate-looking packaging.
        
         | tootie wrote:
         | It's the eternal September problem. Reddit was a great source
         | of honest reviews until everyone figured it out and started to
         | take advantage. Services trying to capitalize on it may be
         | well-intentioned but it's only going to accelerate the
         | enshittification.
        
       | NotYourLawyer wrote:
       | Reddit and YouTube are so astroturfed that I have trouble
       | believing there's much signal in the marketing noise.
        
       | epoch_100 wrote:
       | Very cool! It reminds me of https://chord.pub/.
        
         | qiongzhouh wrote:
         | Wow, thanks for sharing this. I find it interesting that they
         | chose to make it something that I have to wait 1-2 minutes for
         | before I get my AI generated article.
         | 
         | Seems to do a good job for various types of research, will give
         | it a try next time I'm curious about something and need it
         | researched
        
       | zwaps wrote:
       | What I find interesting here is how far a well working QA
       | application with LLMs (such as this one) is away from anything
       | that can be generalized to other topics.
       | 
       | That's probably where we are right now: I have seen quite a few
       | purpose built and tuned AI systems for one specific use case or
       | topic which work really well. By contrast, I have yet to see any
       | general AI bot that does this with arbitrary data for any
       | reasonable definition of good.
       | 
       | I mean, take any of these Chat-with-data bots, load up a huge
       | document and ask it for information that is spread on many pages
       | (like make a list of prices for every product in a catalogue).
       | Then see it fail.
       | 
       | Exciting times.
        
         | qiongzhouh wrote:
         | Definitely feel this way too. Sometimes I think to myself that
         | it'd be really great to have an LLM give me a well researched
         | report on, say, recent trends in undisclosed marketing online.
         | I wish we supported that on Lumona, and we'll have to do it
         | eventually, but it's pretty tough with the current
         | infrastructure.
        
       ___________________________________________________________________
       (page generated 2024-03-30 23:02 UTC)