[HN Gopher] Show HN: Knowledge graph of restaurants and chefs, b...
___________________________________________________________________
Show HN: Knowledge graph of restaurants and chefs, built using LLMs
Hi HN! My latest side project is a knowledge graph that maps the
French culinary network using data extracted from restaurant
reviews on LeFooding.com. The project uses LLMs to extract
structured information from unstructured text.

Some technical aspects you may be interested in:
- Used structured generation to reliably parse unstructured text
  into a consistent schema
- Tested multiple models (Mistral-7B-v0.3, Llama3.2-3B, gpt4o-mini)
  for information extraction
- Created an interactive visualization using gephi-lite and Retina
  (WebGL)
- Built (with Claude) a simple Flask web app to clean and
  deduplicate the data
- Total cost for running inference on 2000 reviews with gpt4o-mini:
  less than 1EUR!

You can explore the visualization here: [Interactive Culinary
Network](https://ouestware.gitlab.io/retina/1.0.0-beta.4/#/graph/?url...)

The code for the project is available on GitHub:
- Main project: https://github.com/theophilec/foudinge
- Data cleaning tool: https://github.com/theophilec/foudinge-scrub

Happy to get feedback!
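
A minimal sketch of the "consistent schema" idea from the post, assuming
the model is asked to reply with a JSON array. The field names (`person`,
`restaurant`, `relation`), the `Stint` class, and the `parse_extraction`
helper are hypothetical and are not taken from the foudinge repository.
The sample reply is illustrative, not real model output. The point is only
that a fixed schema lets off-schema replies be rejected before they reach
the graph.

```python
import json
from dataclasses import dataclass

# Hypothetical schema: one person-restaurant relationship per record.
FIELDS = {"person", "restaurant", "relation"}


@dataclass(frozen=True)
class Stint:
    person: str
    restaurant: str
    relation: str  # e.g. "works_at" or "worked_at" (assumed vocabulary)


def parse_extraction(raw: str) -> list[Stint]:
    """Parse a model's JSON reply and reject anything off-schema."""
    data = json.loads(raw)  # invalid JSON raises a ValueError subclass
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of records")
    stints = []
    for item in data:
        if not isinstance(item, dict) or set(item) != FIELDS:
            raise ValueError(f"unexpected record shape: {item!r}")
        if not all(isinstance(v, str) and v.strip() for v in item.values()):
            raise ValueError(f"empty or non-string field: {item!r}")
        stints.append(Stint(**{k: v.strip() for k, v in item.items()}))
    return stints


# Illustrative reply, shaped like the schema above.
reply = (
    '[{"person": "Neil Mahatsry", "restaurant": "Grenat",'
    ' "relation": "works_at"}]'
)
print(parse_extraction(reply))
```

Structured generation (or a JSON mode) constrains the model at decoding
time; a check like this is a complementary safety net on the consumer side.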
Author : theophilec
Score : 133 points
Date : 2025-03-03 15:43 UTC (7 hours ago)
(HTM) web link (theophilecantelob.re)
(TXT) w3m dump (theophilecantelob.re)
| martinky24 wrote:
| What's the use case for maintaining a list of restaurants that
| use LLMs?
| moandcompany wrote:
| The phrasing, possibly due to French->English translation, may
| cause a misleading reading.
|
| It appears the author/poster is using LLMs (OpenAI and Claude,
| specifically) to extract entity and relationship data to create
| a knowledge graph of French Restaurants and Chefs.
|
| https://github.com/theophilec/foudinge-scrub/blob/0a2701756f...
| theophilec wrote:
| Thanks for the clarification. I updated the title!
| neuroelectron wrote:
| Yeah, I read it that way too. :D
| moandcompany wrote:
| Very cool work.
|
| It's worth mentioning that the graph browser, "Retina", is a
| project from Ouestware (https://www.ouestware.com/en/), which is
| also a contributor to the GraphCommons and GephiLite projects.
| theophilec wrote:
| Thank you for the kind words!
|
| Yes, Retina and Gephi are great. In fact, while making the project
| I noticed a bug, which they fixed immediately.
| dylan604 wrote:
| I was clicking around with the embed, and eventually hit the
| "home/house" icon. That takes me to a Retina credit/loading
| screen with no way to back out of it that I could see. Was
| forced to hard refresh the page. If there is a close button, it
| could be made more obvious.
| tantalor wrote:
| The embedding is kind of weird. Like, there's no reason a
| "degree: 1" node should be so far away from its sibling.
|
| Example: https://imgur.com/a/7Cktyzp
|
| This makes the graph look more random/noisy/disorganized than it
| actually is.
| theophilec wrote:
| I agree the spatialization could be better. I used one of the
| algorithms in Gephi-lite directly. Do you have a favorite
| spatialization algorithm to recommend?
| visarga wrote:
| Yeah, they should have used UMAP or tSNE to cluster the data a
| bit
| peppery wrote:
| Since you did the hard work of parsing rich metadata already,
| it would be even cooler if your network visualization oriented
| nodes by some of this information. Here the 'hiveplot' idea
| (https://hiveplot.com/) is often even more useful than e.g.
| spring-loaded or UMAP-based layouts: grouping nodes by
| semantically meaningful categories onto axes (say, city or
| arrondissement? years open? cuisine? an explicit phylogeny from
| oldest culinary grandparents to youngest?) and then choosing a
| coordinate to place nodes along each axis (total node degree?
| prix? "les plus" tags?...) automatically compels us to think about
| salient features of the data.
| repsiace wrote:
| Looks interesting, have you tried utilizing a multimodal model?
| nickthegreek wrote:
| Graph embed does not appear to work in FF 135. Loaded in Chrome
| though.
|
| Edit: Seems to be a me issue.
| speerer wrote:
| Worked for me (FF 135.0.1 on Ubuntu)
| theophilec wrote:
| It works on FF135.0.1 (aarch64) for me. Ad blocker?
| nickthegreek wrote:
| Tried without adblocker and turned off pihole. I did get it to
| work on Zen Browser (FF engine). So my FF might have gotten
| borked. Console is giving me:
|
| Failed to create WebGL context: WebGL creation failed: *
| tryANGLE (FEATURE_FAILURE_EGL_NO_CONFIG) * Exhausted GL
| driver options. (FEATURE_FAILURE_WEBGL_EXHAUSTED_DRIVERS)
|
| Glad it's working for others!
| arnath wrote:
| This is a super cool idea! I've sort of mused about an idea for
| general web search that's very similar to this concept, where you
| start with a set of trusted entities and then branch out from
| there, but choosing how you establish trust is really important.
| But this is a really clever application, well done!
| fads_go wrote:
| so you mean how academics search for relevant publications on a
| topic, which is the direct inspiration for the original Google
| ranking algorithm, back in the happy days when the web was
| young and had not optimized itself around short-term money
| making.
| nluken wrote:
| Given the structured nature of the data, how does this compare to
| running a specialized classification model that looks for
| specific words in a review and uses those to assign Chefs to
| Restaurants? With some fine tuning, you might get more consistent
| results than feeding the reviews into a generative model.
| theophilec wrote:
| The data is initially not at all structured, and the critics
| talk about a chef's CV in passing. For instance, take this
| example:
|
| > At Grenat, Antoine Joannier and Neil Mahatsry are bathed in
| an ardent red glow, much like the pomegranate-toned walls of
| their space. After working together at La Brasserie Communale,
| where they first met, the duo is now firing on all cylinders in
| the heart of Marseille, where Antoine tends to guests seated
| around blonde wood tables, delivering dishes ignited by Neil
| behind the bar. From oysters to prime cuts of red meat, [...]
|
| I tried using NER models and the results were not great.
| Furthermore, these models do not extract relationships between
| entities (other models exist for that though). Haven't tried
| fine-tuning at all!
|
| There is also a lot of variation in the ways of presenting a
| chef's prior restaurants, which makes this a good use-case for
| LLMs.
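
A rough sketch of how relationships like the ones in the Grenat review
above could become graph edges. The `(person, restaurant, relation)` tuple
layout and the `build_graph` helper are assumptions for illustration, not
the project's actual data model; the names come from the quoted review.

```python
from collections import defaultdict

# Records an extractor might return for the quoted review
# (tuple layout is an assumption): (person, restaurant, relation).
records = [
    ("Antoine Joannier", "Grenat", "works_at"),
    ("Neil Mahatsry", "Grenat", "works_at"),
    ("Antoine Joannier", "La Brasserie Communale", "worked_at"),
    ("Neil Mahatsry", "La Brasserie Communale", "worked_at"),
]


def build_graph(records):
    """Return an undirected adjacency map linking people and restaurants.

    Simplification: a person and a restaurant sharing the same name
    would collapse into one node here.
    """
    adjacency = defaultdict(set)
    for person, restaurant, _relation in records:
        adjacency[person].add(restaurant)
        adjacency[restaurant].add(person)
    return adjacency


graph = build_graph(records)
# Node degree is the number of distinct neighbours.
degree = {node: len(neighbours) for node, neighbours in graph.items()}
for node, d in sorted(degree.items()):
    print(f"{node}: degree {d}")
```

Here both chefs connect to both restaurants, so a shared past kitchen shows
up as a short path between people in the visualization.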
| nluken wrote:
| Nice breakdown. Cheers!
| holtwork wrote:
| Great project. I propose an improvement over this conventional
| kind of object-style graph. Instead, every single item should be
| a node or an edge. The objects are needless complexities that
| obscure pure graph relations. Like this: https://memelang.net/03/
| jonnycoder wrote:
| This looks great! I was just looking for a good web knowledge
| graph visualizer.
| nswanberg wrote:
| Nice! How'd the local models do vs gpt4o-mini? Did you spend much
| time playing with datasette?
| theophilec wrote:
| Local models hallucinated a lot more than gpt4o-mini, so I
| stayed with OpenAI. On top of that, I paid around 14EUR for
| inference on ~200 examples on OVH and inference was much
| slower. I am planning on getting everything running on Mistral
| or Llama though.
|
| I used sqlite everywhere so datasette was good for visualizing
| scraped and extracted data. Simon released structured
| generation for llm a few days after I did the project though,
| so I haven't tried it yet.
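
A back-of-the-envelope comparison of the two cost figures quoted in this
thread. Both inputs are approximate ("less than 1EUR", "around 14EUR",
"~200 examples"), so the result is an order-of-magnitude estimate only;
the constant and function names are made up for this sketch.

```python
# Figures quoted in the thread; both are approximate.
GPT4O_MINI_TOTAL_EUR = 1.0   # "less than 1EUR", so this is an upper bound
GPT4O_MINI_REVIEWS = 2000
OVH_TOTAL_EUR = 14.0         # "around 14EUR"
OVH_REVIEWS = 200            # "~200 examples"


def cost_per_review(total_eur: float, reviews: int) -> float:
    """Average cost of one review, in EUR."""
    return total_eur / reviews


hosted = cost_per_review(GPT4O_MINI_TOTAL_EUR, GPT4O_MINI_REVIEWS)
ovh = cost_per_review(OVH_TOTAL_EUR, OVH_REVIEWS)

print(f"gpt4o-mini: at most {hosted:.4f} EUR per review")
print(f"OVH run:    about {ovh:.2f} EUR per review")
print(f"ratio:      at least {ovh / hosted:.0f}x")
```

With these numbers the OVH run works out to roughly 0.07 EUR per review
against at most 0.0005 EUR for gpt4o-mini, a gap of about two orders of
magnitude.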
| bevan wrote:
| This was inspiring, what a cool idea. Just curious---for 4o mini
| isn't there a json mode that reliably produces structured output?
| Was that what you were referring to / ended up using?
| theophilec wrote:
| Yes I ended up using that. Libraries like outlines give that
| functionality to open models.
___________________________________________________________________
(page generated 2025-03-03 23:00 UTC)