[HN Gopher] Vanna.ai: Chat with your SQL database
___________________________________________________________________
Vanna.ai: Chat with your SQL database
Author : ignoramous
Score : 217 points
Date : 2024-01-14 17:58 UTC (5 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| arter4 wrote:
| I'm curious about how this performs with more complex queries,
| like joins across five tables.
|
| Also, does the training phase actually involve writing SELECT
| queries by hand?
|
| In the age of ORMs and so on, many people have probably forgotten
| how to write raw SQL queries.
| nkozyra wrote:
| From my experience, GPT-4 does just as well with joins as
| without. And that requires no specific, separate SQL training
| (I assume tens of thousands of examples are already in its
| training data).
| teaearlgraycold wrote:
| > In the age of ORMs and so on, many people have probably
| forgotten how to write raw SQL queries.
|
| I've heard this general sentiment repeated quite a lot - mostly
| by people who don't use ORMs. In my experience, you pretty
| quickly reach the limits of even the best ORMs and need to write
| some queries by hand. And these tend to be the relatively
| complicated queries. You need to know about all of the
| different join types, coalescing, having clauses, multiple
| joins to the same table with where filters, etc.
|
| Not that this makes you a SQL expert but you can't get too far
| if you don't know SQL.
| account-5 wrote:
| What I'd really be interested in is being able to describe a
| problem space and have it generate a schema that models it. I'm
| actually not that bad at generating my own SQL queries.
| CharlesW wrote:
| This works pretty well without a dedicated application today,
| e.g. _"Knowing everything you do about music and music
| distribution, please define a database schema that supports
| albums, tracks, and artists"_. If you have additional
| requirements or knowledge that the response doesn't address,
| just add it and re-prompt. When you're done, ask for the SQL to
| set up the schema in your database of choice.
| account-5 wrote:
| Maybe my prompting needs to improve. I recently tried to get
| ChatGPT to provide a schema for an SQLite database that
| implements vCard data in a normalised way. I gave up...
| coder543 wrote:
| ChatGPT-3.5 or ChatGPT-4? There is a big difference.
|
| For fun, I just asked ChatGPT-4 to generate a normalized
| database representation of vcard information: https://chat.
| openai.com/share/1c88813c-0a50-4ec6-ba92-4d6ff8...
|
| It seems like a reasonable start to me.
| account-5 wrote:
| ChatGPT-3.5. Maybe I should pay for a couple of months of
| access to 4 to see the difference. Is it worth the money?
| coder543 wrote:
| ChatGPT-3.5 isn't even worth touching as an end-user
| application. Bard is better (due to having some
| integrations), but it's still barely useful.
|
| ChatGPT-4 is on another level entirely compared to
| either 3.5 or Bard. It is actually useful for a lot.
|
| ChatGPT-3.5 can still serve a purpose when you're talking
| about API automations where you provide all the data in
| the prompt and have ChatGPT-3.5 help with parsing or
| transforming it, but not as a complete chat application
| on its own.
|
| Given the bad experiences ChatGPT-3.5 regularly delivers
| as a chat application, I don't even know why OpenAI
| offers it for free. It seems like a net negative for
| ChatGPT/OpenAI's reputation.
|
| I think it is worth paying for a month of ChatGPT-4. Some
| people get more use out of it than others, so it may not
| be worth it to you to continue, but it's hard for anyone
| to know just how big of a difference ChatGPT-4 represents
| when they haven't used it.
|
| I provided a sample of ChatGPT-4's output in my previous
| response, so you can compare that to your experiences
| with ChatGPT-3.5.
| account-5 wrote:
| Your sample completely blows away what I got out of 3.5.
| I'm now wondering if Bing is 3.5 or 4. But I will likely
| fork out for a couple of months.
| simonw wrote:
| Yeah, GPT-4 is really good at schema design. ChatGPT can even
| go a step further and create those tables in a SQLite
| database file for you to download.
| burcs wrote:
| We actually built something that does this at Outerbase
| (ob1.outerbase.com). It'll generate API endpoints as well, if
| you need them.
| jedberg wrote:
| This is awesome. It's a quick turnkey way to get started with RAG
| using your own existing SQL database. Which to be honest is what
| most people really want when they say they "want ChatGPT for
| their business".
|
| They just want a way to ask questions in prose and get an answer
| back, and this gets them a long way there.
|
| Very cool!
| new_user_final wrote:
| Does it work with Google/Facebook ads data? Can I ask it to show
| the best performing ads from BigQuery Facebook/Google ads data
| fed in by Supermetrics or Improvado?
| sonium wrote:
| Aren't there already tons of apps answering that specific
| question? I think the strength of this approach is answering
| the non-obvious questions.
| osigurdson wrote:
| I wish we had landed on a better acronym than RAG.
| vinnymac wrote:
| Every single time I see it, I immediately think of Red Amber
| Green.
| nightski wrote:
| It doesn't matter; RAG is very temporary and won't be around
| long, imho.
| sroecker wrote:
| Care to enlighten us why?
| nkozyra wrote:
| Most of this stuff is replaced within a calendar year and
| that will probably accelerate.
| osigurdson wrote:
| It sounds dumb to me.
| ren_engineer wrote:
| how else would you get private or recent data into an LLM
| without some form of RAG? The only aspect that might not be
| needed is the vector database
| mediaman wrote:
| RAG, at its core, is a very human way of doing research,
| because RAG is essentially just building a search mechanism
| for a reasoning engine. Much like human research.
|
| Your boss asks you to look into something, and you do it
| through a combination of structured and semantic research.
| Perhaps you get some books that look relevant, you use search
| tools to find information, you use structured databases to
| find data. Then you synthesize it into a response that's
| useful to answer the question.
|
| People say RAG is temporary, that it's just a patch until
| "something else" is achieved.
|
| I don't understand what technically is being proposed.
|
| That the weights will just learn everything it needs to know?
| That is an awful way of knowing things, because it is
| difficult to update, difficult to cite, difficult to ground,
| and difficult to precisely manage weights.
|
| That the context windows will get huge so retrieval will be
| unnecessary? That's an argument about chunking, not
| retrieval. Perhaps people could put 30,000 pages of documents
| into the context for every question. But there will always be
| tradeoffs between size and quality: you could run a smarter
| model with smaller contexts for the same money, so why, for a
| given budget, would you choose to stuff a dumber model with
| enormous quantities of unnecessary information, when you
| could get a better answer from a higher intelligence using
| more reasonably sized retrievals at the same cost?
|
| Likewise, RAG is not just vector DBs, but (as in this case)
| the use of structured queries to analyze information and the
| use of search mechanisms to find information in giant
| unstructured corpora (e.g., the Internet, corporate
| intranets, etc.).
|
| Because RAG is relatively similar to the way organic
| intelligence conducts research, I believe RAG is here for the
| long haul, but its methods will advance significantly and the
| way it gets information will change over time. Ultimately,
| achieving AGI is not about developing a system that "knows
| everything," but a system that can reason about anything, and
| dismissing RAG is to confuse the two objectives.
| bdcravens wrote:
| Rags are used for cleaning, and this gives you a cleaner
| interface into your data :-)
| arbot360 wrote:
| REALM (REtrieval Augmented Language Model) is a better acronym.
| spencerchubb wrote:
| I'm pretty sure whoever coined the term just wanted to sound
| smart. Retrieval Augmented Generation is a fancy way to say
| "put data in the prompt"
| teaearlgraycold wrote:
| I haven't loaded this up so maybe this has been accounted for,
| but I think a critical feature is tying the original SQL query to
| all artifacts generated by Vanna.
|
| Vanna would be helpful for someone who knows SQL when they don't
| know the existing schema and business logic and also just to save
| time as a co-pilot. But the users that get the most value out of
| this are the ones _without_ the ability to validate the generated
| SQL. Issues will occur - people will give incomplete definitions
| to the AI, the AI will reproduce some rookie mistake it saw
| 1,000,000 times in its training data (like failing to realize
| that by default a UNIQUE INDEX will consider NULL != NULL), etc.
| At least if all distributed assets can tie back to the query
| people will be able to retroactively verify the query.
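The UNIQUE-index pitfall mentioned above is easy to demonstrate. A minimal sketch using SQLite (the same NULLs-are-distinct default applies in standard SQL and PostgreSQL):

```python
import sqlite3

# In-memory demo of the NULL pitfall: most databases treat NULLs as
# distinct from each other for the purposes of a unique index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_email ON users (email)")

# Two rows with a NULL email both insert fine -- NULL != NULL here.
conn.execute("INSERT INTO users (email) VALUES (NULL)")
conn.execute("INSERT INTO users (email) VALUES (NULL)")

# A duplicate non-NULL value, however, is rejected.
try:
    conn.execute("INSERT INTO users (email) VALUES ('a@b.com')")
    conn.execute("INSERT INTO users (email) VALUES ('a@b.com')")
except sqlite3.IntegrityError as exc:
    print("duplicate rejected:", exc)

nulls = conn.execute(
    "SELECT COUNT(*) FROM users WHERE email IS NULL").fetchone()[0]
print("rows with NULL email:", nulls)  # 2
```

Generated SQL that assumes a unique index guarantees at most one NULL row will silently return wrong results.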
| sighansen wrote:
| This looks really helpful! I'm working a lot on graph databases
| and am wondering if there are similar projects working with,
| say, Neo4j. I guess because you don't have a schema, the
| complexity goes up.
| jazzyjackson wrote:
| neo4j advertises such an integration on their landing page
|
| https://neo4j.com/generativeai/
| altdataseller wrote:
| What's the origin behind the name Vanna?
| booleandilemma wrote:
| Vanna White? It's the only Vanna I know.
|
| https://en.wikipedia.org/wiki/Vanna_White
| bob1029 wrote:
| The most success I've had with AI+SQL was when I started feeding
| errors from the SQL provider back to the LLM after each
| iteration.
|
| I also had a formatted error message wrapper that would strongly
| suggest querying system tables to discover schema information.
|
| These little tweaks made it scary good at finding queries, even
| ones requiring 4+ table joins. Even without any examples or fine
| tuning data.
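The error-feedback loop described above can be sketched in a few lines. Here `llm` is a hypothetical stand-in for whatever chat-completion call is used; the prompt wording and retry count are assumptions, not the commenter's actual setup:

```python
import sqlite3

def generate_sql(question, llm, conn, max_attempts=3):
    """Ask the LLM for SQL and feed execution errors back on failure.

    `llm` takes a prompt string and returns a SQL string; it stands in
    for a real chat-completion API call.
    """
    prompt = (f"Write a SQLite query answering: {question}\n"
              "If unsure about the schema, query sqlite_master first.")
    for _ in range(max_attempts):
        sql = llm(prompt)
        try:
            return conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # The formatted error wrapper: report the failure and nudge
            # the model toward discovering the schema from system tables.
            prompt += (f"\nThe previous query failed:\n{sql}\nError: {exc}\n"
                       "Inspect the system tables and try again.")
    raise RuntimeError("no working query after retries")
```

Because the model sees the concrete database error, it can correct column names, joins, and syntax without any fine-tuning data.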
| echelon wrote:
| Please turn this into a product. There's enormous demand for
| that.
| teaearlgraycold wrote:
| Someone get YC on the phone
| bob1029 wrote:
| I feel like by the time I could turn it into a product,
| Microsoft & friends will release something that makes it look
| like a joke. If there is no one on the SQL Server team
| working on this right now, I don't know what the hell their
| leadership is thinking.
|
| I am not chasing this rabbit. Someone else will almost
| certainly catch it first. For now, this is a fun toy I enjoy
| in my free time. The moment I try to make money with it the
| fun begins to disappear.
|
| Broadly speaking, I do think this is approximately the only
| thing that matters once you realize you can put pretty much
| anything in a big SQL database. What happens when 100% of the
| domain is in-scope of an LLM that has iteratively optimized
| itself against the schema?
| andy_ppp wrote:
| I will be extremely surprised if Microsoft build this for
| open source databases, however someone else will definitely
| build it if you don't, that is completely true :-)
| JelteF wrote:
| Disclaimer: I work at Microsoft on Postgres related open
| source tools (Citus & PgBouncer mostly)
|
| Microsoft is heavily investing in Postgres and its
| ecosystem, so I wouldn't be extremely surprised if we did
| this. We're definitely building things that combine AI
| with Postgres[1], although afaik no-one is actively
| working on query generation using AI.
|
| But I actually did a very basic POC of "natural language
| queries" in Postgres myself last year:
|
| Conference talk about it:
| https://youtu.be/g8lzx0BABf0?si=LM0c6zTt8_P1urYC
|
| Repo (unmaintained): https://github.com/JelteF/pg_human
|
| 1: https://techcommunity.microsoft.com/t5/azure-database-
| for-po...
| dcreater wrote:
| You can just make a GitHub repo with what you have. It'd
| still be valuable to the community
| whoiscroberts wrote:
| If they do release it, they will only release it for
| enterprise. Many, many SQL Server installs are SQL Server
| Standard. There is an entire ecosystem of companies built
| on selling packages that support SQL Server Standard; see
| DevArt and Redgate.
| personjerry wrote:
| Wouldn't it be pretty fast to make it as a chatgpt?
| quickthrower2 wrote:
| Or open source? You could get 10k stars :-)
| mcapodici wrote:
| I would be tempted to pivot to that! I am working on similar
| for CSS (see bio) but if that doesn't work out my plan was to
| pivot to other languages.
| petters wrote:
| It sounds like pretty standard constructions with OpenAI's
| API. I have a couple of such iterative scripts myself for
| bash commands, SQL etc.
|
| But sure, why not!
| SOLAR_FIELDS wrote:
| There are already several products out there with varying
| success.
|
| Some findings after I played with it awhile:
|
| - Langchain already does something like this - a lot of the
| challenge is not with the query itself but with efficiently
| summarizing data to fit in the context window. In other words,
| if you give me 1-4 tables I can give you a product that will
| work well pretty easily. But when your data warehouse has tens
| or hundreds of tables with columns and meta types, we need to
| chain together a string of queries to arrive at the answer,
| and we are basically building a state machine of sorts that
| has to do fun and creative RAG stuff. The single biggest
| thing that made a difference in effectiveness was not what op
| mentioned at all, but instead having a good summary of what
| every column in the db was, stored in the db itself. This can
| be AI generated too, but the way Langchain attempts to do it
| on the fly is slow and rather ineffective (or at least that
| was the case when I played with it last summer; it might be
| better now).
|
| Not affiliated, but after reviewing the products out there
| the data team I was working with ended up selecting getdot.ai
| as it had the right mix of price, ease of use, and
| effectiveness.
| l5870uoo9y wrote:
| You can check this out: https://www.sqlai.ai. It has AI-
| powered generators to:
|
| - Generate SQL
|
| - Generate optimized SQL
|
| - Fix query
|
| - Optimize query
|
| - Explain query
|
| Disclaimer: I am the solo developer behind it.
| holoduke wrote:
| It would be fun if you could actually train on your raw SQL and
| have the LLM output the actual answer rather than SQL commands.
| That way it's just another language layer on top of / in between
| SQL. It would probably hurt efficiency and performance in the
| long run, though.
| esafak wrote:
| The nitty gritty: https://vanna.ai/blog/ai-sql-accuracy.html
| pamelafox wrote:
| I love that this exists, but I worry about how it uses the term
| "train", even in quotes, as I spend a lot of time explaining how
| RAG works and I try to emphasize that there is no training/fine-
| tuning involved - just data preparation, chunking, and
| vectorization as needed.
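To make "data preparation, chunking and vectorization" concrete, here is a minimal sketch of the chunking step; a real pipeline would then embed each chunk and store it in a vector index, and the size/overlap values below are arbitrary choices, not anything prescribed by Vanna:

```python
def chunk_text(text, max_chars=500, overlap=50):
    """Split text into overlapping fixed-size chunks for embedding.

    No model weights are touched at any point -- this is purely data
    preparation, which is the point being made above.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Overlap consecutive chunks so sentences split at a boundary
        # still appear whole in at least one chunk.
        start = end - overlap
    return chunks
```

Each chunk would then be passed to an embedding model and stored alongside its vector, ready for retrieval at query time.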
| jonahx wrote:
| Is the architecture they use in this diagram currently the best
| way to train LLMs in general on custom data sets?
|
| https://raw.githubusercontent.com/vanna-ai/vanna/main/img/va...
|
| That is, store your custom data in a vector db and then use
| RAG to retrieve relevant content and inject it into the prompt
| of the LLM the user is querying with?
|
| As opposed to fine tuning or other methods?
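The retrieve-and-inject flow in that diagram reduces to a few steps: embed the question, rank stored snippets by similarity, and prepend the winners to the prompt. A toy sketch (the bag-of-words "embedding" below is a stand-in for a real embedding model and vector database):

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a real system would call an embedding
    model and persist the vectors in a vector database."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_prompt(question, documents, top_k=2):
    """Retrieve the most relevant stored snippets, then inject them
    into the prompt sent to the LLM -- the core RAG loop."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)),
                    reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Context:\n{context}\n\nQuestion: {question}"
```

Nothing here updates model weights; the "training" data only ever reaches the model through the prompt.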
| firejake308 wrote:
| All the podcasts I've been listening to recommend RAG over
| fine-tuning. My intuition is that having the relevant knowledge
| in the context rather than the weights brings it closer to the
| outputs, thereby making it much more likely to provide accurate
| information and avoid hallucinations/confabulations.
| benjaminwootton wrote:
| Do you have any podcasts you would recommend with this type
| of content?
| zmmmmm wrote:
| > All the podcasts I've been listening to recommend RAG over
| fine-tuning
|
| I'm always suspicious that is just because RAG is so much
| more accessible (both compute wise and in terms of expertise
| required). There's far more profit in selling something
| accessible to the masses to a lot of people than something
| only a niche group of users can do.
|
| I think most people who do actual fine tuning would still
| probably then use RAG afterwards ...
| ajhai wrote:
| We can get a lot done with a vector db + RAG before having to
| fine-tune or build custom models. There are a lot of techniques
| to improve RAG performance; I captured a few of them a while
| back at https://llmstack.ai/blog/retrieval-augmented-generation.
| 331c8c71 wrote:
| Yes from what I gather. And just to emphasize there's no LLM
| (re)training involved at all.
| metflex wrote:
| that's it, we are going to lose our jobs
| kleiba wrote:
| Sorry, maybe I'm just too tired to see it, but how much control
| do you have over the SQL query that is generated by the AI? Is
| there a risk that it could access unwanted portions or, worse,
| delete parts of your data? (the AI equivalent of Bobby Tables, so
| to speak)
| htk wrote:
| I guess you could limit that with the correct user permissions.
| thih9 wrote:
| Why not give it access to relevant parts of the database only?
| And read only access too?
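The read-only suggestion is straightforward to enforce at the connection level. A sketch using SQLite's read-only URI mode (in PostgreSQL or MySQL you would instead create a role with SELECT-only grants on the relevant tables):

```python
import sqlite3

def readonly_connection(path):
    """Open a connection that cannot write, so any DELETE or UPDATE
    the model hallucinates fails at the database layer rather than
    actually running."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)
```

Any generated write statement then raises an error the application can surface, rather than an AI-flavoured Bobby Tables incident.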
| bob1029 wrote:
| In some SQL providers, you can define rules that dynamically
| mask fields, suppress rows, etc. based upon connection-specific
| details (e.g. user or tenant ID).
|
| So, you could have all connections from the LLM-enabled systems
| enforce masking of PII, whereas any back-office connections get
| to see unmasked data. Doing things at this level makes it very
| difficult to break out of the intended policy framework.
| iuvcaw wrote:
| Guessing its intended use case is business analytic queries
| without write permissions --- particularly for non-programmers.
| I don't think it'd be advisable to use something like this for
| app logic
| ajhai wrote:
| We have recently added support to query data from SingleStore to
| our agent framework, LLMStack
| (https://github.com/trypromptly/LLMStack). Out of the box
| performance when prompting with just the table schemas is
| pretty good with GPT-4.
|
| The more domain-specific knowledge the queries need, the harder
| it has gotten in general. We've had good success `teaching` the
| model different concepts in relation to the dataset; giving it
| example questions and queries greatly improved performance.
| hrpnk wrote:
| Prompts are quite straightforward.
|
| - OpenAI: https://github.com/vanna-
| ai/vanna/blob/a4cdf7593ac0c584f7d74...
|
| - Mistral: https://github.com/vanna-
| ai/vanna/blob/a4cdf7593ac0c584f7d74...
| peheje wrote:
| Many of these AI "products" - are they just feeding text into
| LLMs in a structured manner?
| okwhateverdude wrote:
| Basically, yeah. It is shockingly trivial to do, and yet like
| playing with alchemy when it comes to the prompting,
| especially if doing inference on the cheap like running
| smaller models. They can get distracted by your formatting,
| ordering, CAPITALIZATION, etc.
| benjaminwootton wrote:
| I built a demo of something similar, using LlamaIndex to query
| data as it streamed into ClickHouse.
|
| I think this has a lot of real world potential, particularly when
| you move between the query and a GenAI task:
|
| https://youtu.be/F3Eup8yQiQQ?si=pa_JrUbBNyvPXlV0
|
| https://youtu.be/7G-VwZ_fC5M?si=TxDQgi-w5f41xRJL
|
| I generally found this worked quite well. It was good at
| identifying which fields to query and how to build where clauses
| and aggregations. It could pull off simple joins but started to
| break down much past there.
|
| I agree with the peer comment that being able to process and
| respond to error logs would make it more robust.
| breadwinner wrote:
| I have seen good results from just describing the schema to
| ChatGPT-4 and then asking it to translate English to SQL. Does
| this work significantly better?
| SOLAR_FIELDS wrote:
| That's mostly what the products and libraries around this like
| llamaindex or Langchain are doing. If you look at the Langchain
| sql agent all it's doing is chaining together a series of
| prompts that take the user's initial query, attempt to take in a
| db and discover its schema on the fly and then execute queries
| against it based on that discovered schema, ensuring the result
| makes sense.
|
| The tough part is doing this at scale as part of a fully
| automated solution (picture a slack bot hooked up to your data
| warehouse that just does all of that for you that you converse
| with). When you have tens or hundreds of tables with
| relationships and metadata in that schema, and you want your AI
| to be able to walk all of them unprompted, you're then
| basically doing some context window shenanigans and building
| complex state machines to walk that schema.
|
| Unfortunately that's kind of what you need if you want to
| achieve the dream of just having a db that you can ask
| arbitrary questions to with no other knowledge of sql or how it
| works. Otherwise, the end user has to have some prior knowledge
| of the schema and DBs to get value from the LLM, which somewhat
| reduces the audience for said chatbot.
| codegeek wrote:
| I have been keeping track of a few products like these including
| some that are YC backed. Interesting space as I am looking for a
| solution myself:
|
| - Minds DB (YC W20) https://github.com/mindsdb/mindsdb
|
| - Buster (YC W24) https://buster.so
|
| - DB Pilot https://dbpilot.io
|
| and now this one
| bredren wrote:
| Have you written up any results of your experience with each?
|
| I'm interested in a survey of this field so far and would read
| it.
| pylua wrote:
| I don't fully understand the business use case after reading
| the documentation. Is it really a time saver?
| EmilStenstrom wrote:
| Allow people that don't know SQL to query a database.
| MattGaiser wrote:
| It would be for people who are not that fluent in SQL. Even
| as a dev, I find ChatGPT to be easier for writing queries
| than hand coding them as I do it so infrequently.
| pylua wrote:
| Yeah, same here. Seems like that approach is much simpler
| than this.
|
| I guess the real benefit here is that you don't need to
| understand the schemas so the knowledge is not lost when
| someone leaves a company.
|
| Sort of an abstraction layer for the schemas
| realanswe91 wrote:
| In the 1970s SQL was developed, with an easy English-like
| syntax, such that the average white collar worker (IQ ~115)
| could use it.
|
| But it's 2024, and the average American white collar worker
| (IQ ~100) needs something that can parse English-like language
| and tolerate ambiguities and still return some sort of result
| (hopefully the right one lol).
|
| This might sound like a cynical take but there is a big and
| growing market in making sure Americans can continue to use
| technology.
| refset wrote:
| It's not a public facing product, but there was a talk from a
| team at Alibaba a couple of months ago during CMU's "ML=DB
| Seminar Series" [0] on how they augmented their NL2SQL
| transformer model with "Semantics Correction [...] a post-
| processing routine, which checks the initially generated SQL
| queries by applying rules to identify and correct semantic
| errors" [1]. It will be interesting to see whether VC-backed
| teams can keep up with the state of the art coming out of
| BigCorps.
|
| [0] "Alibaba: Domain Knowledge Augmented AI for Databases (Jian
| Tan)" -
| https://www.youtube.com/watch?v=dsgHthzROj4&list=PLSE8ODhjZX...
|
| [1] "CatSQL: Towards Real World Natural Language to SQL
| Applications" - https://www.vldb.org/pvldb/vol16/p1534-fu.pdf
| kszucs wrote:
| Please add Ibis Birdbrain https://ibis-project.github.io/ibis-
| birdbrain/ to the list. Birdbrain is an AI-powered data bot,
| built on Ibis and Marvin, supporting more than 18 database
| backends.
|
| See https://github.com/ibis-project/ibis and https://ibis-
| project.org for more details.
| codyvoda wrote:
| note that Ibis Birdbrain is very much work-in-progress, but
| should provide an open-source solution to do this w/ 20+
| backends
|
| old demo here: https://gist.github.com/lostmygithubaccount/08
| ddf29898732101...
|
| planning to finish it...soon...
| jug wrote:
| I wonder if this supports spatial queries as in PostGIS,
| SpatiaLite, SQL Server Spatial as per the OGC standard?
|
| I'm interested in integrating a user friendly natural language
| query tool for our GIS application.
|
| I've looked at LangChain and the SQL chain before but I didn't
| feel it was robust enough for professional use. You needed to run
| an expensive GPT-4 backend to begin with and even then, it wasn't
| perfect. I think a major part of this is that it wasn't actually
| trained on the data the way Vanna apparently is.
| crimbles wrote:
| I can't wait until it naively does a table scan on one of our
| several TB tables...
| swimwiththebeat wrote:
| I'm curious whether people have tried this out with their
| datasets and seen success. I've been using similar techniques at
| work to build a bot that allows employees internally to talk to
| our structured datasets (a couple MySQL tables). It works kind of
| ok in practice, but there are a few challenges:
|
| 1. We have many enums and data types specific to our business
| that will never be in these foundation models. Those have to be
| manually defined and fed into the prompt as context also (i.e.
| the equivalent of adding documentation in Vanna.ai).
|
| 2. People can ask many kinds of questions that are time-related
| like 'how much demand was there in the past year?'. If you store
| your data in quarters, how would you prompt engineer the model to
| take into account the current time AND recognize it's the last 4
| quarters? This has typically broken for me.
|
| 3. It took a LOT of diverse example SQL queries in order
| for it to generate the right SQL queries for a set of
| plausible user questions (15-20 SQL queries for a single MySQL
| table). Given that users can ask anything, it has to be extremely
| robust. Requiring this much context for just a single table means
| it's difficult to scale to tens or hundreds of tables. I'm
| wondering if there's a more efficient way of doing this?
|
| 4. I've been using the Llama2 70B Gen model, but curious to know
| if other models work significantly better than this one in
| generating SQL queries?
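For point 2, one workaround is to resolve relative time phrases in code before the model ever sees them, and put the concrete quarter labels into the prompt. A sketch, assuming "the past year" means the last four complete quarters (that interpretation is itself a business decision):

```python
from datetime import date

def last_n_quarters(today, n=4):
    """Turn 'the past year' into explicit quarter labels the prompt
    can mention, so the model never has to do date arithmetic."""
    year = today.year
    quarter = (today.month - 1) // 3 + 1  # current (partial) quarter
    out = []
    for _ in range(n):
        quarter -= 1
        if quarter == 0:
            quarter, year = 4, year - 1
        out.append(f"{year}-Q{quarter}")
    return out
```

Injecting e.g. "the past year means quarters 2023-Q1 through 2023-Q4" into the context sidesteps the failure mode where the model guesses the current date.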
| aussieguy1234 wrote:
| I've already done this with GPT-4.
|
| It goes something like this:
|
| Here's the table structure from MySQL CLI `SHOW CREATE TABLE`
| statements for the tables I want to query.
|
| Now given those tables, give me a query to show me my cart
| abandonment rate (or, some other business metric I want to know).
|
| Seems to work pretty well.
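Scripted, that workflow is only a few lines. A sketch using SQLite's catalog in place of the MySQL `SHOW CREATE TABLE` output (the completion call itself is omitted; the prompt wording is illustrative):

```python
import sqlite3

def schema_prompt(conn, question):
    """Collect the CREATE TABLE statements and prepend them to the
    business question, mirroring the manual GPT-4 workflow above."""
    ddl = [row[0] for row in conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'")]
    return ("Here is my table structure:\n\n" + "\n\n".join(ddl) +
            f"\n\nGiven those tables, write a query to: {question}")
```

The returned string is what gets sent as the user message to the model.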
| miohtama wrote:
| How about, instead of making AI wrappers over 50-year-old SQL,
| we made a database query language that's easier to read and
| write?
| marginalia_nu wrote:
| In general, if something has been around for a very long time
| and nobody apparently seems to have thought to improve it, then
| odds are the reason is it's pretty good and genuinely hard to
| improve on.
| aae42 wrote:
| in other words, SQL is a shark, not a dinosaur
| neodymiumphish wrote:
| My fear with this approach is that the first implementation
| would be severely handicapped compared to SQL, and it'd take
| years to support some one-off need for any organizational user,
| so it'd never be fully utilized.
| neofrommatrix wrote:
| I've done this with Neo4j. Pretty simple to hook it up to the
| OpenAI APIs and have a conversational interface.
| Vosporos wrote:
| I can hire a DBA to tell me that my indexes aren't shit, no need
| for AI.
| l5870uoo9y wrote:
| Is there a list of SQL generations to see how it performs? Here
| is a list of SQL examples using GPT-4 and the DVD rental sample
| database:
|
| [1]: https://www.sqlai.ai/sql-examples
|
| [2]: https://www.postgresqltutorial.com/postgresql-getting-
| starte...
| kulikalov wrote:
| While I recognize the efforts in developing natural language to
| SQL translation systems, I remain skeptical. The core of my
| concern lies in the inherent nature of natural language and these
| models, which are approximative and lack precision. SQL
| databases, on the other hand, are built to handle precise,
| accurate information in most cases. Introducing an approximative
| layer, such as a language model, into a system that relies on
| precision could potentially create more problems than it solves,
| leading me to question the productivity of these endeavors in
| effectively addressing real-world needs.
| samstave wrote:
| Reverse idea:
|
| Use this to POPULATE sql based on captured NLP "surveillance"
| -- for example, build a DB of things I say as my thing listens
| to me, and categorize things, topics, place, people etc
| mentioned.
|
| Keep count of experiencing the same things....
|
| When I say I need to "buy thing" build table of frequency for
| "buy thing" etc...
|
| Effectively - query anything you've said to Alexa and be able
| to map behaviors/habits/people/things...
|
| If I say "Bob's phone number is BLAH", it adds Bob + number to
| my "random people I met today" table with a note of "we met at
| the dog park".
___________________________________________________________________
(page generated 2024-01-14 23:00 UTC)