[HN Gopher] Storm: LLM system that researches a topic and genera...
       ___________________________________________________________________
        
       Storm: LLM system that researches a topic and generates full-length
       wiki article
        
       Author : GavCo
       Score  : 73 points
       Date   : 2024-04-11 17:53 UTC (5 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | Logans_Run wrote:
        | Oh dear lord... the subheading states - _Storm - Assisting in
        | Writing Wikipedia-like Articles From Scratch with Large Language
        | Models_
        | 
        | Good luck with _this_ storm, wikis the world over. Just a
        | thought, but... maybe someone should ask an org like the Internet
        | Archive to snapshot Wikipedia ASAP and label it Pre-Storm and
        | After-Storm.
        
         | tossandthrow wrote:
          | There is this sentiment of AI-induced deterioration and
          | pollution.
          | 
          | What if that is not the case? What if the quality of this type
          | of content actually increases?
        
           | CamperBob2 wrote:
            | It will for a while, I imagine. But the long term is a
            | concern. Where will new information come from, exactly?
        
             | tossandthrow wrote:
              | Why not AI?
              | 
              | And even if we accept the premise (as flawed as it might
              | be) that AI is not able to create original knowledge, most
              | of what's online is dissemination and does not represent
              | new information, but just old information rewritten to be
              | understandable by a certain segment.
              | 
              | Something LLMs excel at.
        
               | poyu wrote:
                | > AI is not able to create original knowledge
                | 
                | Current LLMs do hallucinate, though. They're just not a
                | very trustworthy source of facts.
        
               | tossandthrow wrote:
                | Just like my first teachers said I should absolutely not
                | use Wikipedia.
                | 
                | LLMs were popularized less than two years ago.
                | 
                | I think it is safe to assume that they will be as
                | trustworthy as you see Wikipedia today, and probably even
                | more so, as you can embed reasoning techniques into LLMs
                | to correct misunderstandings.
                | 
                | Wikipedia cannot self-correct.
        
               | howenterprisey wrote:
               | Wikipedia absolutely self-corrects, that's the whole
               | point!
        
               | tossandthrow wrote:
                | It does not; its authors correct it.
                | 
                | Unless you see Wikipedia as the organisation and not the
                | encyclopedia?
                | 
                | In that case: sigh, then everything self-corrects.
        
               | howenterprisey wrote:
               | It is incoherent to discuss Wikipedia as some text
               | divorced from the community and process that made it, so
               | I'm done here.
        
               | pksebben wrote:
                | There's an important difference between Wikipedia and the
                | LLMs that are actually useful today.
               | 
               | Wikipedia is open, like completely open.
               | 
               | GPT is not.
               | 
               | Unless we manage to crack the distributed training /
               | incremental improvement barriers, LLMs are a lot more
               | likely to follow the Google path (that is, start awesome
               | and gradually enshittify as capitalist concerns pollute
               | the decision matrix) than they are the Wikipedia path
               | (gradual improvement as more eyes and minds work to
               | improve them).
        
               | tossandthrow wrote:
                | This is super interesting!
                | 
                | It also cuts into the question of what constitutes model
                | openness.
                | 
                | Most people agree that just releasing weights is not
                | enough.
                | 
                | But I don't think reproducing model training will ever be
                | feasible, especially when factoring in branching and
                | merging of models.
                | 
                | For me this is an open and super interesting question.
        
               | pksebben wrote:
                | Here's what I envision (note: impossible with the current
                | state of the art):
                | 
                | A model that can be incrementally trained (this is the
                | bit we're missing), hosted by a nonprofit, belonging to
                | "we the people" (like Wikipedia).
                | 
                | The training process could be done a little like
                | Wikipedia talk pages are now - datasets are proposed and
                | discussed out in the open and, once generally approved,
                | trained into the model.
               | 
               | Because training currently involves backpropagation, this
               | isn't possible. Hinton was working on a structure called
               | "forward-forward" that would have overcome this (if it
               | worked) before he decided humanity couldn't be trusted
               | [1]. It is my hope that someone smarter than me picks up
               | this thread of research - although in the spirit of
               | personal responsibility I've started picking up my old
               | math books to try and get to a point where I grok the
               | implementation enough to experiment myself (I'm not super
               | confident I'm gonna get there but you can't win if you
               | don't play, right?)
               | 
               | It's hard to tell when (if?) we're ever going to have
               | this - if it does happen, it'll be because a lot of
               | people do a lot of really smart unpaid work (after seeing
               | OpenAI do what it did, I don't have a ton of faith that
               | even non-profit orgs have the will or the structure to
               | pull it off. Please prove me wrong.)
               | 
               | 1 - https://arxiv.org/abs/2212.13345
        
           | prionassembly wrote:
           | I mean, putting a bullet to someone's head can extirpate a
           | brain tumor they hadn't been alerted to before, while leaving
           | a grateful person owing you kudos. What if?
        
             | tossandthrow wrote:
              | You can always find some radical regressionist argument
              | that is completely out of touch with anything.
              | 
              | Congrats on that!
        
           | pksebben wrote:
           | On the one hand, a tool is as good or bad as the person
           | wielding it. Smart folks with the right intentions will
           | certainly be able to use this stuff to increase the rate
            | _and_ quality of their output (because they're smart, so
            | they'll verify rather than trust. Hopefully.)
           | 
           | On the other, moderation is an unsolved problem. The general
           | mess of the internet is probably not quite ready to be handed
           | a footgun of this caliber.
           | 
           | As with many things tech, some of the outcome falls to us,
           | the techies. We can build systems to help steer this.
        
             | tossandthrow wrote:
             | > On the one hand, a tool is as good or bad as the person
             | wielding it.
             | 
              | I think the real reason is one-line dogmas like this.
        
               | pksebben wrote:
               | I'm not sure I follow you - reason for what?
               | 
               | To be clear - I'm with you that these systems can
               | absolutely be a force for vast good (at least, I think
               | that was what you were getting at unless there was a
               | missing '/s'). I use them daily to pretty astounding
               | effect.
               | 
               | I'll admit to being a little put off by being labeled
               | dogmatic - it's not something I consider myself to be.
        
               | tossandthrow wrote:
                | It was a half sentence; for that I apologize. And I don't
                | remember entirely what I meant.
                | 
                | However, I do see a lot of one-sentence "truisms" being
                | thrown around, like "garbage in, garbage out" and the
                | like.
                | 
                | These are not correct. We can just look at the current
                | state of the art with LLMs, which have vast amounts of
                | garbage going in - it seems like the value is in the
                | vastness of the data over the quality.
               | 
               | > On the one hand, a tool is as good or bad as the person
               | wielding it.
               | 
                | I see this as a dogma. Smart people make good LLMs, dumb
                | people do not - but this is an open question. It seems
                | like the biggest wallet will be the winner of the LLM
                | game.
                | 
                | Please correct me if I misunderstood something.
        
           | jerf wrote:
           | The concern is not just a vaguely cynical hand-wringing about
           | how bad AI is. Feeding AIs their own output as training
           | material is a bad thing for mathematical reasons, and feeding
           | AIs the output of other very similar AIs is close enough for
           | it to also be bad. The reasons are subtle and hard to
           | describe in plain English, and I'm not enough of an expert to
           | even try, so pardon if I don't. But given that it is hard to
           | determine if output is from an AI, AI really does face a
           | crisis of having a hard time coming across good training
           | material in the future.
        
             | tossandthrow wrote:
              | Can you show me a mathematical reason that cannot
              | philosophically be applied to people also? People are only
              | being fed other people's output.
        
               | jerf wrote:
               | I'd go with "no", because people just consuming the
               | output of other people is a big ongoing problem. Input
               | from the universe needs to be added in order to maintain
               | alignment with the universe, for whichever "universe" you
               | are considering. Without frequent reference to reality,
               | people feeding too much on people will inevitably depart
               | from reality.
               | 
               | In another context, you may know this as an "echo
               | chamber". Not quite _exactly_ the same concept, but very,
               | very similar.
               | 
               | I do like to remind people that the AI of today and LLMs
               | are not the whole of reality. Perhaps someday there will
               | be AIs that are also capable of directly consulting the
               | universe, through some sort of body they can use. But the
               | current LLMs, which are trained on some sort of human
               | output, need to exclude AI-generated input or they too
               | will converge on some sort of degenerate attractor.
        
               | tossandthrow wrote:
                | Yep, then we are back at "vaguely cynical hand-wringing
                | about how bad AI is."
                | 
                | Currently we have mostly LLMs in the mix, but there is no
                | reason the AI mix will not come to contain embodied
                | agents that also publish stuff on the internet (think
                | search-and-rescue bots that automatically write a
                | report).
                | 
                | Now AI is connected to reality without people in the mix.
        
             | orbital-decay wrote:
             | _> Feeding AIs their own output as training material is a
             | bad thing for mathematical reasons_
             | 
             | Most model collapse studies explore degenerate cases to
             | determine the potential limits of the training process of
             | the _same_ model. No wonder you will get terrible results
              | if you recursively recompress a JPEG 100 times! In the
              | real world it's nowhere near that bad, because models are
              | never trained on their output alone and are always
              | guaranteed to receive a certain amount of external data,
              | starting from the manual dataset curation (yes, that's
              | also fresh data in itself).
             | 
             | Meanwhile, synthetic datasets are entirely common. I
             | suspect this is a non-issue that is way overblown by people
             | misinterpreting these studies.
        
               | jerf wrote:
               | I suspect it's overblown today. Hopefully it'll be
               | overblown indefinitely.
               | 
                | However, if AIs become as successful as Nvidia's stock
                | price implies, it could indeed become difficult to find
                | text that is _guaranteed_ not to be AI. It is conceivable
                | that in 20 years it will be very difficult to generate a
                | training set at any scale that isn't 90% already touched
                | by AIs.
               | 
               | Of course, it's conceivable that in 20 years we'll have
               | AIs that don't need the equivalent of millennia of
               | training to come up to their full potential. The problem
               | is much more tractable if one merely needs to produce
               | megabytes of training data to obtain a decent
               | understanding of English rather than many gigabytes.
        
           | skywhopper wrote:
           | How could it? LLMs hallucinate false information. Even if
           | hallucinations are improved, the false information they've
           | generated is now part of the body of text they will be
           | trained on.
        
         | achrono wrote:
         | LLM mediocrity is just a reflection of human mediocrity, and my
         | bet is on the average LLM to get way better much faster than
         | the average human doing the same.
        
           | bschmidt1 wrote:
           | Agree with you, but on mediocrity: Mistral barely passes as
           | usable, GPT-4 is barely better than Googling, and nothing
           | else I've tried is even ready for production. So there's some
           | element of the model's design, weights/embeddings, and
           | training data that matters a lot.
           | 
            | Only fine-tuned models are producing impressive work, because
            | when we say something is impressive it by definition means
            | not like the status quo - the model must be tuned toward some
            | bias or other, whether aesthetic or otherwise, in order to
            | stand out from the rest. And generic models like GPT or
            | Stable Diffusion will always be generic; they won't have a
            | bias toward certain truths - they'll be mostly unbiased,
            | which we want for general research or internet search.
            | 
            | So it's interesting: in order to get incredible quality of
            | work out of AI, you have to make it specific, but in order to
            | do that, you have to train it on the work of humans. I think
            | for this reason AI will always ultimately be behind humans,
            | though it will of course displace a lot of the work we do,
            | which is significant.
        
           | singleshot_ wrote:
           | Humans are limited in the volume of garbage they can produce.
        
       | LeoPanthera wrote:
       | I saved a full snapshot of Wikipedia (and Stack Overflow) in the
       | weeks before ChatGPT launched, and every day I'm more glad that I
       | did. They will become the Low Background Steel of text.
        
         | jakderrida wrote:
          | The thing is that the Wiki mods will need to be more diligent
          | with uncited things. I also see two massive opportunities here.
          | First, they can have agents check the cited source and verify
          | whether the source backs up what's said to a reasonable degree.
          | Second, there are things found only in other-language wikis
          | that could either be incorporated into the English one or help
          | introduce new articles. Believe it or not, LLMs can't generate
          | English answers for things answered only in Russian (or any
          | language) in the training data.
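          | 
          | A rough, untested sketch of the first idea (assuming the
          | OpenAI Python client; the model name and prompt are purely
          | illustrative):
          | 
          |     from openai import OpenAI
          | 
          |     client = OpenAI()
          | 
          |     def claim_supported(claim, source_text):
          |         # Ask the model whether the cited source backs the claim.
          |         prompt = (
          |             "Does the SOURCE support the CLAIM "
          |             "to a reasonable degree?\n"
          |             f"CLAIM: {claim}\n"
          |             f"SOURCE: {source_text}\n"
          |             "Answer with one word: yes or no."
          |         )
          |         resp = client.chat.completions.create(
          |             model="gpt-4",
          |             messages=[{"role": "user", "content": prompt}],
          |         )
          |         answer = resp.choices[0].message.content
          |         return answer.strip().lower().startswith("yes")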
        
           | groceryheist wrote:
            | > First, they can have agents check the cited source and
            | verify whether the source backs up what's said to a
            | reasonable degree.
            | 
            | This is a hard and, to my knowledge, unsolved NLP/IR problem,
            | and data access is an issue.
           | 
            | > Second, there are things found only in other-language wikis
            | that could either be incorporated into the English one or
            | help introduce new articles.
           | 
           | This has been attempted via machine translation in the past,
           | and it failed because you need native speakers to verify and
           | correct the translations and this wasn't the sort of work
           | that people were jumping to volunteer to do.
        
           | WhitneyLand wrote:
            | >>LLMs can't generate English answers for things answered
            | only in Russian in the training data.
            | 
            | For multilingual LLMs? Why do you think that?
            | 
            | An LLM can translate arbitrary Russian text. If there were an
            | English question about something that appears in the training
            | data only in Russian, I would expect an answer - with the
            | quality being on par with its general translation
            | capabilities.
        
         | barbarr wrote:
         | Good analogy! There's good reason to believe that web archives
         | "uncontaminated" by LLM output will have some unique value in
         | the future (if not now).
        
           | cmcollier wrote:
           | For those wondering about the analogy:
           | 
           | * https://en.wikipedia.org/wiki/Low-background_steel
        
         | pksebben wrote:
         | That's gonna be a lot of fun to play with in a year or so.
         | 
          | There's a concurrent explosion of 'veracity' analysis - it'll
          | be fun to run those tools against Wikipedia a year from now and
          | against your data.
         | 
         | Incidentally, are you interested in mirroring your dataset and
         | making it more robust? I'm sure I've got a few TB of storage
         | lying around somewhere...
        
           | Anon84 wrote:
            | You can just download it yourself. Wikimedia publishes
            | regular dumps in easily accessible formats:
            | https://dumps.wikimedia.org/enwiki/20240320/ (the most recent
            | for English Wikipedia)
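            | 
            | If you'd rather grab a dump programmatically, a minimal,
            | untested sketch (the exact filename depends on the dump date,
            | so check the directory listing first):
            | 
            |     import requests
            | 
            |     # Stream one of the enwiki dump files to disk.
            |     url = ("https://dumps.wikimedia.org/enwiki/20240320/"
            |            "enwiki-20240320-pages-articles-multistream.xml.bz2")
            | 
            |     with requests.get(url, stream=True) as r:
            |         r.raise_for_status()
            |         with open("enwiki-20240320.xml.bz2", "wb") as f:
            |             for chunk in r.iter_content(chunk_size=1 << 20):
            |                 f.write(chunk)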
        
             | pksebben wrote:
             | I don't see historical dumps. Am I just dumb?
        
               | Anon84 wrote:
               | No, the website is just weird. The original link I posted
               | is for the most recent dump... if you want older ones:
               | https://dumps.wikimedia.org/enwiki/
        
             | bschmidt1 wrote:
             | "Note that the data dumps are not backups, not consistent,
             | and not complete."
        
           | LeoPanthera wrote:
           | They are already on the Internet Archive as Kiwix archives.
        
         | tiptup300 wrote:
          | You know that Wikipedia keeps revisions of all articles. I'm
          | sure you could put together a script to make a copy of each
          | page from a certain point in time.
        
       | barbarr wrote:
       | I guess this is a good thing for increasing coverage of neglected
       | areas. But given how cleverly LLMs can hide hallucinations, I
       | feel like at least a few different auditor bots should also sign
       | off on edits to ensure everything is correct.
        
         | pksebben wrote:
         | This method has actually been proven effective at increasing
         | reliability / decreasing hallucinations [1]
         | 
         | 1 - https://arxiv.org/abs/2402.05120
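          | 
          | The core idea there, as I read it, is sampling-and-voting: ask
          | the model several times and keep the majority answer. An
          | untested sketch (assuming the OpenAI Python client; the model
          | name is illustrative):
          | 
          |     from collections import Counter
          |     from openai import OpenAI
          | 
          |     client = OpenAI()
          | 
          |     def majority_answer(question, n=5):
          |         # Sample n independent answers, keep the most common.
          |         answers = []
          |         for _ in range(n):
          |             resp = client.chat.completions.create(
          |                 model="gpt-4",
          |                 messages=[{"role": "user", "content": question}],
          |                 temperature=1.0,
          |             )
          |             answers.append(resp.choices[0].message.content.strip())
          |         return Counter(answers).most_common(1)[0][0]
          | 
          | Exact-string voting is crude - in practice you'd want to
          | normalize or cluster the answers first.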
        
       | whitehexagon wrote:
        | Hmm, something about this title containing the word 'research'
        | disturbs me. I associate that word with rigorous scientific
        | methods that lead to fact-based knowledge or maybe some new
        | hypothesis, not some LLM hallucinating sources, references,
        | quotes, and all the other garbage they spit out when challenged
        | over a point of fact. Horrifying to think peeps might turn
        | towards these tools for factual information.
        
         | devmor wrote:
         | Yes, I came to the comments to say the same thing. The LLM is
         | not doing research - it is aggregating data associated with
         | terms and reorganizing text based on what previous responses to
         | a similar prompt would look like.
         | 
         | At the most generous level of scrutiny, the only part that
         | could be related to research would be the aggregation of
         | sources - but that is only a precursor to research and likely
         | is too generalized to be as accurate as a specialist preparing
         | data for actual research.
        
         | madeofpalk wrote:
         | This anthropomorphism really bothers me. These tools are useful
         | for what they're good for, but I really dislike the agency
         | people keep trying to give to them.
        
           | Terr_ wrote:
            | I think there's always been a fine line between
            | anthropomorphism as a metaphorical way to indicate complexity
            | and a pitfall where people (especially outside of a field)
            | start acting like it's a literal statement.
           | 
           | Ex: "the gyroscope is trying to stay upright", or "the
           | computer complains because the update is broken" or
           | "evolution will give the birds longer beaks".
           | 
            | That said, I agree that the problem is dramatically more
            | severe when it comes to "AI".
        
           | bschmidt1 wrote:
           | It should also bother marketers in the AI industry because it
           | confuses people on what the incredible value is.
           | 
           | So many people think LLM _means_ chatbot, even here on HN. So
           | many people think agent means mentally humanoid.
           | 
            | But we have others, like Stable Diffusion's Web UI and
            | Leonardo.AI - these are just tools with interfaces, and the
            | text entry for prompting is not presented as though it's a
            | conversation between two people.
            | 
            | Someone shared an AI songmaker here recently... And there are
            | a number of promising RAG tools for improving workflows for
            | doctors, mechanics, researchers, and lawyers.
           | 
           | I agree with you and expect the "AI character" use case to
           | narrow significantly.
        
         | mistermann wrote:
         | > Hmm something about this title containing the word 'research'
         | disturbs me. I associate that word with rigorous scientific
         | methods...
         | 
         | The presence of the word "scientific" in this statement
         | disturbs me.
        
       | agilob wrote:
       | Nucleo AI Alpha
       | 
       | An AI assistant app that mixes AI features with traditional
       | personal productivity. The AI can work in the background to
       | answer multiple chats, handle tasks, and stream/feed entries.
       | 
       | https://old.reddit.com/r/LocalLLaMA/comments/1b8uvpw/does_fr...
        
       | spxneo wrote:
        | I hope somebody took a snapshot of the entire internet before
        | 2020; that is our only defence against knowledge laundering.
       | 
       | Wreaking havoc on the digital Akashic records.
        
       | manishsharan wrote:
        | At what point will it be just LLM bots arguing with other LLM
        | bots over Wikipedia edits?
        
         | bschmidt1 wrote:
          | As long as the LLM moderator deems it safe discourse, let the
          | best idea win! I'd love a debate between two highly accurate
          | and context-aware LLMs - if such a thing existed.
          | 
          | Otherwise it would be like reading HN or Reddit debates where
          | two egomaniacs who are both wrong continually straw-man each
          | other with statements peppered with lies and parroted disinfo;
          | ain't got time for that.
        
           | neverokay wrote:
            | You'd have to train a model that is good at debating; that's
            | the agent that will have the winning response. The problem is
            | that the world's knowledge is basically who made the best
            | case, which is often whoever had a bulletproof case
            | (undeniable evidence) or whoever debated better, and I guess
            | observations (and people debate even those). Something
            | something, history is written by the victors.
            | 
            | That means a lot of what an LLM spits out might be patterns
            | it found in whoever won the debate (which has nothing to do
            | with the truth). Measuring those responses as "intelligent
            | with reasoning abilities" might be premature.
           | 
           | I almost feel like we need to train the LLMs not with the
           | truth and perfect data, but with the logs of tons of trial
           | and error experiments, and even then it might just learn
           | brute force.
        
       | pstorm wrote:
       | I looked into this to see where it was getting new information,
        | and as far as I can tell, it is searching Wikipedia exclusively.
       | Useful for sure, but not exactly what I was expecting based on
       | the title.
        
         | pksebben wrote:
         | That gives me an idea.
         | 
          | There are Wikipedias in other languages - maybe this framework
          | could be adapted to translate the search terms, fetch
          | multilingual sources, translate them back, and use those as
          | comparisons.
          | 
          | I've found a lot of stuff through similar by-hand techniques
          | that would be difficult to discover via English search. I'd be
          | curious to see how much difference there is between accounts
          | across language barriers.
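          | 
          | The fetch step is easy enough to sketch with the public
          | MediaWiki search API (untested; the language code and query
          | are just examples):
          | 
          |     import requests
          | 
          |     def search_wiki(query, lang="de", limit=5):
          |         # Search a non-English Wikipedia for candidate pages.
          |         resp = requests.get(
          |             f"https://{lang}.wikipedia.org/w/api.php",
          |             params={
          |                 "action": "query",
          |                 "list": "search",
          |                 "srsearch": query,
          |                 "srlimit": limit,
          |                 "format": "json",
          |             },
          |         )
          |         resp.raise_for_status()
          |         hits = resp.json()["query"]["search"]
          |         return [hit["title"] for hit in hits]
          | 
          | Translating the query out and the results back would still
          | need an MT or LLM step on top.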
        
         | Lerc wrote:
         | As a base for researching the idea, Wikipedia seems like a
         | decent data source.
         | 
         | For broader implementation you would want to develop the
         | approach further. The idea of sampling other-language Wikipedia
         | mentioned in a sibling comment seems to be a decent next step.
         | 
          | Extending it to bring in wider sources would be another step.
          | I doubt it would be infallible, but it would be really
          | interesting to see how it compares to humans performing the
          | same task, especially if there were an additional ability to
          | verify written articles and make corrections.
        
           | philipov wrote:
           | > As a base for researching the idea, Wikipedia seems like a
           | decent data source.
           | 
              | If your goal is to generate a wiki article, you can't
              | assume one already exists. That's begging the question. If
              | you could just search Wikipedia for the answer, you
              | wouldn't need to generate an article.
        
             | Lerc wrote:
             | I don't think their goal is to generate a wikipedia
             | article. Their goal is to figure out how one might generate
             | a wikipedia article.
        
       | lukev wrote:
        | I can see this being useful if and only if the content is
        | generated on demand and then discarded.
        | 
        |  _Publishing_ AI-generated material is, generally speaking, a
        | horrible idea and does nobody any good (at least until accuracy
        | levels get much, much better).
        | 
        | Even if they do it well and truthfully (which they don't),
        | current LLMs can only summarize, digest, and restate. There is no
        | non-transient value add. LLMs may have a place to help _query_,
        | but there is no reason to publish LLM regurgitations alongside
        | the ground truth used to generate them.
        
         | tiptup300 wrote:
          | Are LLMs able to look at a list of categories, read content,
          | and then determine which of the categories apply?
        
           | warkdarrior wrote:
           | This is a very broad question, but in short, yes, they can do
           | this. It depends on the granularity and overlap of those
           | categories.
        
           | msp26 wrote:
           | Absolutely
        
           | OKRainbowKid wrote:
           | This could be achieved by generating embeddings of suitable
           | representations of the categories once, and then embedding
           | the content at runtime, before using some distance metric to
           | find matching categories for the content embedding.
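            | 
            | Roughly like this (untested sketch with sentence-transformers;
            | the model name, labels, and threshold are arbitrary):
            | 
            |     from sentence_transformers import SentenceTransformer
            | 
            |     model = SentenceTransformer("all-MiniLM-L6-v2")
            | 
            |     categories = ["biography", "geography", "software"]
            |     # Embed the category labels once, up front.
            |     cat_vecs = model.encode(categories,
            |                             normalize_embeddings=True)
            | 
            |     def matching_categories(content, threshold=0.3):
            |         # Embed the content at runtime and keep categories
            |         # whose cosine similarity clears the threshold.
            |         vec = model.encode([content],
            |                            normalize_embeddings=True)[0]
            |         sims = cat_vecs @ vec
            |         return [c for c, s in zip(categories, sims)
            |                 if s >= threshold]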
        
         | petercooper wrote:
         | _current LLMs can only summarize, digest, and restate. There is
         | no non-transient value add._
         | 
         | Though, at a stretch, Wikipedia itself could be considered
         | based around summarization, digesting, and restating/citing
          | things said elsewhere, given its policy of verifiability:
          | _"Even if you are sure something is true, it must have been
          | previously published in a reliable source before you can add
          | it."_ Now, LLMs aren't well known for their citation skills,
          | to be fair... :-)
        
           | lukev wrote:
           | Yeah, when AIs can comprehensively cite their sources I might
           | change my opinion on that.
           | 
            | Though note that there _still_ isn't any need to publish
            | static content. The power of LLMs is that they can be dynamic
            | and responsive!
            | 
            | Even if we hypothesize that it were possible for an LLM to
            | write high-quality Wikipedia-like output, generating the
            | whole thing statically in advance like the existing Wikipedia
            | would be relatively pointless. It'd be much more interesting
            | to generate arbitrary (and infinite!) pages on demand.
        
         | CuriouslyC wrote:
          | I think bootstrapping documentation with LLM output is a great
          | practice. It's a wiki; people can update it from a baseline,
          | just as long as they can see what was LLM-generated so they
          | know it shouldn't be taken as absolute truth.
         | 
         | The hardest part of good documentation is getting started. Once
         | there are docs in place it's usually much easier to revise and
         | correct than it would have been to write correctly by hand the
         | first time. Think of it like automating a rough draft.
        
           | msp26 wrote:
           | Maybe the generated text could be a slightly different colour
           | until it's verified. But you'd have to make sure there's no
           | easy way of verifying everything mindlessly without having
           | read it.
        
         | visarga wrote:
         | > current LLMs can only summarize, digest, and restate. There
         | is no non-transient value add.
         | 
          | No, you're wrong. LLMs create new experiences after deployment,
          | either by assisting humans or by solving tasks they can
          | validate, such as code or gameplay. In fact, any deployed LLM
          | gets to be embedded in a larger system - a chat room, a code
          | running environment, a game, a simulation, a robot, or a
          | company - and it can learn from iterative tasks because each
          | following iteration carries some kind of real-world feedback.
          | 
          | Besides that, LLMs trivially learn new concepts and even new
          | skills with a short explanation or demonstration; they can be
          | pulled out of their training distribution and collect
          | experiences doing new things. If OpenAI has 100M users and they
          | consume 10K tokens/user/month, that makes for 1 trillion tokens
          | of human-AI interaction per month, rich with new experiences
          | and feedback.
          | 
          | In the text modality, LLMs have consumed most of the
          | high-quality human text; that is why all SOTA models are
          | roughly on par - they trained on the same data. That means the
          | easy time is over: AI has caught up with all human language
          | data. But from now on, AI models need to create experiences of
          | their own, because learning from your own mistakes is much
          | faster. The more they get used, the more feedback and new
          | information they collect. The environment is the teacher; not
          | everything is written in books.
          | 
          | And all that text - the trillions of tokens they are going to
          | speak to us - in turn contributes to scientific discoveries and
          | progress, and percolates back into the next training set. LLMs
          | have a massive impact at the language level on people, and so
          | by extension on the physical world and culture. They have
          | already influenced language and the arts.
         | 
         | LLMs can create new experiences, learn new skills, and have a
         | significant impact through widespread deployment and
         | interaction. There is "value add" if you look at the grand
         | picture.
        
         | observationist wrote:
          | This is categorically untrue. Publishing material generated
          | like this is going to be generally better than human-generated
          | content. It takes less time, can be systematically tested and
          | made rigorous, and you can specifically avoid the pitfalls of
          | bias and prejudice.
         | 
         | A system like this is multilayered, with prompts going through
         | the whole problem solving process, considering the information
         | presented, assuring quality and factuality, assigning the
         | necessary citations and documentation for claims.
         | 
         | Accuracy isn't a problem. The way in which AI is used creates
         | the problem - ChatGPT and most chat based models are single
         | pass, query/response type interactions with models. Sometimes
         | you get a second pass with a moderation system, doing a review
         | to ensure offensive or illegal things get filtered out. Without
         | any additional testing and prompt engineering, you're going to
         | run into hallucinations, inefficient formulations, random
         | "technically correct but not very useful" generations, and so
         | forth. Raw ChatGPT content shouldn't be published without
         | significant editing and going through the same quality review
         | process any human written text should go through.
         | 
         | What Storm accomplishes is an algorithmic and methodical series
         | of problem solving steps, each of which can be tested and
         | verified and validated. This is synthesized in a particular
         | way, intended as a factual reference article. Presumably you
         | could insert debiasing and checks for narrative or political
         | statements, ensuring attribution and citation occur for
         | quotations, and rephrasing anything generated by the AI as a
         | neutral, academic statement of fact with no stylistic and
         | artistic features.
         | 
         | This is significantly different from the almost superficial
         | interactions you get with chatbots, unless you specifically
         | engineer your prompts and cycle through similar problem solving
         | methods.
         | 
         | Tasks like this are well within the value add domain of current
         | AI capabilities.
         | 
          | Compared to the absolute trash of SEO-optimized blog posts, the
          | agenda-driven, ulterior-motive-laden rants and rambles on
          | social media, and the "I'm oh-so-cleverly influencing the
          | narrative" articles posted to Wikipedia by humans, content like
          | this is a clear winner in quality, in my opinion.
         | 
         | AI isn't at the point where it's going to spit out well
         | grounded novel answers to things like "what's the cure for
         | cancer?" but it can absolutely produce a principled and legible
         | explanation of a phenomenon or collection of facts about a
         | thing.
        
       | cess11 wrote:
       | Kinda weird to promote automated reordering and rephrasing of
       | information as research.
       | 
       | What do the authors call what they're doing? Magic?
        
       | brap wrote:
       | I don't know how well this works (demo is broken on mobile), but
       | I like the idea.
       | 
       | Imagine an infinite wiki where articles are generated on the fly
       | (from reputable sources - with links), including links to other
       | articles (which are also generated) etc.
       | 
       | I actually like this sort of interface more than chat.
        
         | rrr_oh_man wrote:
          | Check out https://github.com/MxDkl/AutoWiki (there are projects
          | with similar names doing stuff like this)
        
       | jankovicsandras wrote:
       | One
        
       | jankovicsandras wrote:
       | This looks cool!
       | 
        | There's a small, ironically funny typo in the first line:
        | "knolwedge"
        
       | bschmidt1 wrote:
       | This would be useful for RAG when a Wiki doesn't exist.
       | findOrCreate
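        | 
        | Something like this, maybe (untested sketch; generate_article is
        | a placeholder for whatever pipeline drafts the page, e.g. a
        | Storm-style generator):
        | 
        |     import requests
        | 
        |     def generate_article(topic):
        |         # Placeholder for the article-drafting pipeline.
        |         raise NotImplementedError
        | 
        |     def find_or_create(topic):
        |         # Use the real Wikipedia article if it exists,
        |         # otherwise fall back to generating a draft.
        |         resp = requests.get(
        |             "https://en.wikipedia.org/w/api.php",
        |             params={"action": "query", "titles": topic,
        |                     "prop": "extracts", "explaintext": 1,
        |                     "format": "json"},
        |         )
        |         page = next(iter(resp.json()["query"]["pages"].values()))
        |         if page.get("extract"):
        |             return page["extract"]
        |         return generate_article(topic)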
        
       | samgriesemer wrote:
       | Small thing, but the blurb on the README says
       | 
       | > While the system cannot produce publication-ready articles that
       | often require a significant number of edits, experienced
       | Wikipedia editors have found it helpful in their pre-writing
       | stage.
       | 
        | So it _can't_ produce articles that require many edits? Meaning
        | it _can_ produce publication-ready articles that don't need lots
        | of edits? Or it _can't_ produce publication-ready articles,
        | _and_ the articles produced require lots of edits? I can't make
        | sense of this statement.
        
         | adr1an wrote:
         | It gives you a draft that you should keep working on. For
         | example, fact checking.
        
       | skywhopper wrote:
       | From my experiments, this thing is pretty bad. It mixes up things
       | that have similar names, it pulls in entirely unrelated concepts,
       | the articles it generates are mind-numbingly repetitive and
       | verbose (although notably with slightly different "facts" each
       | time things are restated), its citations are often completely
       | unrelated to the topic at hand, and facts are cited by references
       | that don't back them up.
       | 
        | I mean, the spelling and syntax of the sentences are mostly
        | correct, just like any LLM content. But there's ultimately still
        | no coherence to the output.
        
       ___________________________________________________________________
       (page generated 2024-04-11 23:01 UTC)