[HN Gopher] Natural language generation: The commercial state of...
       ___________________________________________________________________
        
       Natural language generation: The commercial state of the art in
       2020
        
       Author : polm23
       Score  : 82 points
       Date   : 2021-01-10 06:56 UTC (1 day ago)
        
 (HTM) web link (www.cambridge.org)
 (TXT) w3m dump (www.cambridge.org)
        
       | jaxwerk wrote:
        | As an engineer who has worked briefly with one of the
        | start-ups covered here, on their core product, Robert's
        | critique of the technology driving the current generation
        | of template-based commercial solutions is absolutely
        | spot-on. The results are useful, but the template-based
        | approach, along with the really lackluster tooling both my
        | employer and competitors in the space had cobbled together,
        | made self-service nigh impossible. His assessment that
        | these companies are really professional services companies
        | dealing in bespoke one-off software solutions, despite
        | their efforts to market themselves as robust AI products
        | that magically turn some arbitrary JSON schema into prose,
        | is very fair, I think.
       | 
       | I am eager to see if neural approaches develop to better handle
       | constraints, but part of me thinks the level of control over
        | generated prose that template-based approaches provide is
       | essential for customer-facing text and that part of the market
       | will persist even when every other recipe in Google search
       | results is GPT-3 generated rubbish around the actual damn list of
       | ingredients.
        
       | ghenson2 wrote:
       | Why is Google's T5 not mentioned?
       | 
       | https://github.com/google-research/text-to-text-transfer-tra...
        
         | riku_iki wrote:
          | Curious if there are any examples of using it for text
          | generation, since it has a different model structure from
          | GPT.
        
           | FL33TW00D wrote:
           | From my experience T5 is currently the best publicly
           | available model for NLG.
        
           | pigscantfly wrote:
           | I've been using a finetuned T5 model for commercial text
           | generation for the past six months. The volume of discussion
           | around the model on github and elsewhere leads me to believe
           | others are as well, although people tend to be circumspect
           | about implementation details.
        
           | ghenson2 wrote:
            | Check out the T5 models on Hugging Face for this. The
            | main NLG use cases with T5 are translation,
            | summarization and question generation. The latter is
            | sophisticated, nothing trivial, and definitely "NLG",
            | so yeah.
        
       | polm23 wrote:
       | This is a short overview of the state of NLG by Robert Dale, co-
       | author of "Building Natural Language Generation Systems", which
       | is basically the book for NLG.
       | 
        | He gives a list of commercial providers and concludes that most of
       | them just offer smart templates. This is the important part:
       | 
       | > To the extent that you can tell from the clues to functionality
       | that are surfaced by these various products, all the tools are
       | ultimately very similar in terms of how they work, which might be
       | referred to as 'smart template' mechanisms. There is a
       | recognition that, at least for the kinds of use cases we see
       | today, much of the text in any given output can be predetermined
       | and provided as boilerplate, with gaps to be filled dynamically
       | based on per-record variations in the underlying data source. Add
       | conditional inclusion of text components and maybe some kind of
       | looping control construct, and the resulting NLG toolkit, as in
       | the case of humans and chimpanzees, shares 99% of its DNA with
       | the legal document automation and assembly tools of the 1990s,
       | like HotDocs (https://www.hotdocs.com). As far as I can tell,
       | linguistic knowledge, and other refined ingredients of the NLG
       | systems built in research laboratories, is sparse and generally
       | limited to morphology for number agreement (one stock dropped in
       | value vs. three stocks dropped in value).
       | 
       | That sounds pretty negative, but he emphasizes that putting an
       | easy-to-use UI on well understood technology is meeting real
       | business needs.
       | 
       | At the end he briefly touches on GPT2 and related technology.
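        | 
        | To make the 'smart template' idea concrete, here is a toy
        | sketch of what such a system roughly boils down to (entirely
        | made up, not any vendor's actual engine): boilerplate with
        | gaps, conditional inclusion, a loop, and a bit of morphology
        | for number agreement.
        | 
        |   def pluralize(noun, n):
        |       # crude number agreement: "1 stock" vs "3 stocks"
        |       return noun if n == 1 else noun + "s"
        | 
        |   def render(record):
        |       n = record["n"]
        |       # boilerplate with gaps filled from the data record
        |       parts = [f"{n} {pluralize('stock', n)} dropped in value."]
        |       loser = record.get("biggest_loser")
        |       # conditional inclusion of a text component
        |       if loser:
        |           parts.append(f"The biggest loser was {loser}.")
        |       # a looping control construct over per-record data
        |       for name, pct in record.get("movers", []):
        |           parts.append(f"{name} fell {pct:.1f}%.")
        |       return " ".join(parts)
        | 
        |   print(render({"n": 3, "biggest_loser": "ACME",
        |                 "movers": [("ACME", 4.2), ("Globex", 1.1)]}))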
        
         | PaulHoule wrote:
          | Smart templates aren't bad at all, given that for most uses you
         | want to enforce style and attributes.
         | 
         | Generation is easier than understanding. Systems like GPT-3 are
          | not capable of respecting constraints, which is the basis for
         | practical creativity.
        
       | visarga wrote:
       | The state of NLG as of June 2020. A long time ago.
       | 
       | Apparently it was published just one day before GPT-3.
        
       | is_true wrote:
       | I think that most use cases for NLG are unethical.
       | 
       | I provide raw data for some news services and that data could be
       | shown as tables or simple dashboards that are faster to read for
        | users, but my customers insist on generating as much text as
       | possible from, sometimes, 2 values.
       | 
        | The incentives are totally misaligned: news services try to
        | get as much time from their readers as possible. It's quite
       | depressing.
        
         | coding123 wrote:
          | I tend to agree. While it's technologically a marvel, I
          | do have the feeling the robocallers and NLG customers are
          | hanging out in the same place.
        
         | gxqoz wrote:
         | I did some experiments last summer trying to get GPT-3 to
         | author quizbowl questions, a type of trivia question. As
         | expected, GPT-3's loose relationship with the truth made these
          | results unacceptable. But a system that really could generate
         | new questions from existing facts would be immensely useful in
         | this domain.
        
           | polm23 wrote:
           | I'm not surprised GPT is awful at that, but there are good
           | ways to do it. Look for papers on "factoid question
           | generation".
        
             | gxqoz wrote:
             | Quizbowl questions have a particular structure that makes
             | them generally not work with these other approaches. For
              | instance, a system needs to generate paragraph-long
              | questions with clues that go from most to least
              | difficult. This is a hard skill for a human to learn,
              | much less a
             | machine.
        
         | tyingq wrote:
         | Google is probably going to struggle with SEO content/link
         | mills that now produce something better than spun gibberish.
        
           | lumost wrote:
           | I'd expect the only rational decision would be to ban the
           | content mill domains from the crawler. Is there really that
           | much value for the search customer in being directed to a
           | generated page?
        
             | tyingq wrote:
             | That's the dilemma. NLG of sufficient quality[1] won't be
             | easily detectable as machine generated.
             | 
             | [1] "Quality" here meaning how it looks to Google's
             | crawler, not actual quality content.
        
               | lumost wrote:
                | Domains auto-generating junk or derivative content
                | should be detectable through either human annotation
                | or automated means. If a site with low click-through
                | rates, low traffic and extreme volumes of text
                | appears, it's worth a re-examination by people.
               | 
                | In particular, at search time Google has a vested
               | interest in limiting both the number of links to a
               | particular domain and the occurrence of content farm
               | links when Wikipedia would suffice. Content farms are
               | pretty detectable in any objective relevance annotation
               | workflow.
        
               | tyingq wrote:
                | It wasn't that long ago that really poor-quality
                | auto-generated content was clearly working for SEO
                | purposes, so I'm not convinced Google is ahead of
                | the current state of
               | the art.
        
         | mkl95 wrote:
         | Usage of these algorithms and heuristics is starting to become
         | quite obvious in Spanish sports journalism. I stumble into
         | articles that are basically AI gibberish generated from a short
         | quote every other day.
        
         | Saad_M wrote:
          | I very much disagree. There are use cases such as weather
          | reporting, electronic medical report summarisation, etc.,
          | that make more sense as a textual representation than
          | bundles of graphs and tables. In fact, textual
          | summarisation has been shown to lead to better decision
          | making[1].
          | 
          | Good data-to-text NLG applications not only summarise
          | data, they can also provide insight into the causal
          | relations behind why events occurred by leveraging domain
          | knowledge.
         | 
         | [1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2656014/
        
           | is_true wrote:
            | I agree with you, Saad. I worded that comment poorly.
            | There are some use cases that are going to be used to
            | keep people reading a lot of algorithmic gibberish for
            | no reason other than increasing revenue, for example
            | sports articles.
            | 
            | I think NLG is really interesting; the problem is that
            | there are incentives to create long-form content that
            | doesn't add any value to readers.
        
       | pdevr wrote:
       | Is there any place where NLG guys hang around? Most groups seem
       | to have NLP as the focus.
        
         | Saad_M wrote:
         | On the academic side the SIGGEN mailing list is where a lot of
         | the activity in the NLG community goes on:
         | https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=SIGGEN
        
         | ghenson2 wrote:
          | I wonder the same thing, or if there's just an _active_
          | Discord for these topics...
        
         | visarga wrote:
         | EleutherAI on discord. https://discord.gg/vtRgjbM
        
       | picodguyo wrote:
       | ML and NLP surveys are outdated basically the minute the author
       | hits send nowadays. With GPT-3 all of these old school NLG
       | companies seem quaint. Much of their functionality can be
       | replicated in minutes with GPT-3, even by non-technical users
        | familiar with prompt design. As soon as OpenAI offers
        | fine-tuning OOTB (coming soon), NLP/G will be all but
        | solved for 99% of use
       | cases.
        
         | PedroBatista wrote:
          | Flying cars, buddy! You just wait...
        
           | picodguyo wrote:
           | I know, I know. But I don't say this lightly. I've worked in
           | NLP/G for nearly 15 years and after spending the last 6
           | months working with GPT-3 I feel the writing is on the wall.
        
             | Saad_M wrote:
              | But it isn't. I am still actively involved in NLG,
              | both professionally and academically, and neural NLG
              | systems have great promise but are still far from
              | actively delivering tangible solutions in areas such
              | as data-to-text NLG. Inaccuracy and hallucinations
              | are still highly problematic.
        
         | polm23 wrote:
         | I don't want to suggest you shot off a random comment without
         | actually reading the linked survey. But it spends a whole
         | section discussing that, and why it doesn't see commercial use.
         | Here's a particularly relevant bit:
         | 
         | > Sometimes the results produced by GPT-2 and its ilk are quite
         | startling in their apparent authenticity. More often than not
         | they are just a bit off. And sometimes they are just gibberish.
         | As is widely acknowledged, neural text generation as it stands
         | today has a significant problem: driven as it is by information
         | that is ultimately about language use, rather than directly
         | about the real world, it roams untethered to the truth. While
         | the output of such a process might be good enough for the
         | presidential teleprompter, it would not cut it if you want the
         | hard facts about how your pension fund is performing. So, at
         | least for the time being, nobody who develops commercial
         | applications of NLG technology is going to rely on this
         | particular form of that technology.
        
           | picodguyo wrote:
           | I agree with the author on GPT-2. But GPT-3, which became
           | available shortly after this was published, is quite a bit
           | more powerful and there are many commercial applications
           | being built on it now.
        
             | leereeves wrote:
             | Even GPT-3 knows nothing about the real world; it's merely
             | trained to repeat the words that most often followed the
             | prompt in its training data. That's obviously not useful
             | for news...if a fact is in the training data, it's not
             | news. It's not useful for "hard facts about how your
             | pension fund is performing" unless you want to know how it
             | performed a long time ago.
             | 
             | But I agree there are some applications it is useful for,
             | like education.
        
               | FeepingCreature wrote:
               | > Even GPT-3 knows nothing about the real world; it's
               | merely trained to repeat the words that most often
               | followed the prompt in its training data.
               | 
               | I don't know why that would imply that it knows nothing
               | about the real world, unless the data corpus it is
               | trained on likewise bears no relation to reality...
        
               | nmfisher wrote:
               | > unless the data corpus it is trained on likewise bears
               | no relation to reality
               | 
               | It's trained on Reddit, so I wouldn't rule that out.
        
             | probably_wrong wrote:
             | I do not see how GPT-3 could solve the basic architectural
             | problem that the parent comment quotes, namely, that _"
             | driven as it is by information that is ultimately about
             | language use, rather than directly about the real world, it
             | roams untethered to the truth"_.
             | 
             | As an experiment I used a GPT-3-powered website [1] to see
             | what GPT-3 has to say about bears, and the first answer
             | was:
             | 
             | > _" Weird that every day, there are so many
             | cute/funny/entertaining bears to enjoy online but hardly
             | any on the ground."_
             | 
              | When asked about _beards_, the first answer has no
              | relation to beards at all:
             | 
             | > _" If a person doesn't constantly outwit, outplay,
             | outlast, others, the strong eat the weak."_
             | 
             | And then there's that time when GPT-3 told someone to kill
             | themselves [2].
             | 
             | While funny and (mostly) grammatically correct, these
             | "thoughts" are nonsense and no amount of extra parameters
             | is going to solve the disconnection between GPT-3 and
             | reality. I imagine you could condition GPT-3 to generate
             | text for a specific piece of data in such a way that
             | guarantees the correctness of its output, but at that point
             | you might as well throw GPT-3 away and write a rule-based
             | system.
             | 
             | [1] https://thoughts.sushant-kumar.com/bears
             | 
             | [2] https://www.nabla.com/blog/gpt-3/
        
               | picodguyo wrote:
               | Is language use not inherently shaped by the real world?
               | 
               | The site you tried is a tweet generator, not a question
               | answering site. I prompted GPT-3 with "Bears and beards
               | are different because" and got...
               | 
               | "Bears and beards are different because they are not the
               | same thing.
               | 
               | Bears are animals. Beards are facial hair.
               | 
               | Bears are dangerous. Beards are not.
               | 
               | Bears live in the woods. Beards live on your face.
               | 
               | Bears eat people. Beards do not."
               | 
               | But my original point was mainly that this field is
                | moving fast and the old-school NLG companies (I
               | created one back in the day!) are toast.
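                | 
                | (In case anyone wants to reproduce that: the prompt
                | just went through the plain completions endpoint,
                | roughly like this with the Python client; the engine
                | name and sampling parameters are from memory, so
                | adjust as needed.)
                | 
                |   import openai
                | 
                |   openai.api_key = "YOUR_API_KEY"  # private beta key
                | 
                |   prompt = "Bears and beards are different because"
                |   response = openai.Completion.create(
                |       engine="davinci", prompt=prompt,
                |       max_tokens=64, temperature=0.7)
                |   print(response["choices"][0]["text"])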
        
               | hannasanarion wrote:
               | There is more to "the real world" than the definition of
               | words, which is the most that you can expect a language
               | model to learn.
               | 
               | Yes it is true that bears are animals.
               | 
               | No it is not true that, as GPT-3 said, "There aren't any
               | on the ground"
        
               | visarga wrote:
               | Let's just wait a couple of years to see if GPT-3 was any
               | good in applications. Doesn't matter what we think, what
               | matters is if it is viable.
               | 
                | Its younger sibling DALL-E is capable of language
               | grounded in images, I expect the next version to be
               | multi-modal as well. On another line of research there's
               | effort to tame the horse (GPT) by attaching a secondary
               | neural net. This can monitor language, topic, style and
               | bias and ensure increased accuracy in tasks by auto-
               | learning good prompts. It would make development of
               | applications much easier because the base model which was
               | super expensive to train can be reused many times while
               | the secondary net is small and fast to train. Other
               | efforts are related to including a search engine on an
               | inner loop, to make the language model able to query
               | large collections. Also, there's an open effort to create
               | a huge text corpus, so far 800GB (The Pile). It improves
               | on the GPT-3 training corpus on some categories that were
               | lacking.
               | 
               | I think it's safe to say the article is way off the
               | current research level.
        
         | j-pb wrote:
         | "solving" NLP, is AGI complete. GPT-3 is great at superficially
         | correct syntax but breaks at deep / consistent / meaningful
         | semantics.
        
           | yowlingcat wrote:
            | How well can GPT-3 distinguish between human-generated
            | and GPT-3-generated text?
        
         | bulldog13 wrote:
            | Can you recommend any good tutorials on prompt design
            | for GPT-3? Or how to use GPT-3 in general?
        
           | picodguyo wrote:
           | The main docs are behind a private beta wall, but these are
           | good resources too: http://gptprompts.wikidot.com/
           | https://gpttools.com/tutorial_searchQA
           | https://aidungeon.medium.com/world-creation-by-
           | analogy-f26e3...
        
       | wodenokoto wrote:
        | What are the commercial uses of the GPT-3 API?
       | 
       | I must completely lack imagination, because I don't know what to
       | use it for if it doesn't give me access to the weights.
        
       ___________________________________________________________________
       (page generated 2021-01-11 22:02 UTC)