[HN Gopher] Natural language generation: The commercial state of...
___________________________________________________________________
Natural language generation: The commercial state of the art in
2020
Author : polm23
Score : 82 points
Date : 2021-01-10 06:56 UTC (1 days ago)
(HTM) web link (www.cambridge.org)
(TXT) w3m dump (www.cambridge.org)
| jaxwerk wrote:
| As an engineer who has worked briefly with one of the start-ups
| focused on here on their core product, Robert's critique of the
| technology driving the current generation of template-based
| commercial solutions is absolutely spot-on. The results are
| useful but the template-based approach along with the really
| lackluster tooling both my employer and competitors in the space
| had cobbled together made self-service nigh impossible. His
| assessment that these companies are really professional services
| companies dealing in bespoke one-off software solutions despite
| their efforts to market as robust AI products that magically turn
| some arbitrary JSON schema into prose is very fair I think.
|
| I am eager to see if neural approaches develop to better handle
| constraints, but part of me thinks the level of control over
| generated prose that template-based approaches provide is
| essential for customer-facing text and that part of the market
| will persist even when every other recipe in Google search
| results is GPT-3 generated rubbish around the actual damn list of
| ingredients.
| ghenson2 wrote:
| Why is Google's T5 not mentioned?
|
| https://github.com/google-research/text-to-text-transfer-tra...
| riku_iki wrote:
| Curious if there are any examples of using it for text generation,
| since it has a different model structure from GPT.
| FL33TW00D wrote:
| From my experience T5 is currently the best publicly
| available model for NLG.
| pigscantfly wrote:
| I've been using a finetuned T5 model for commercial text
| generation for the past six months. The volume of discussion
| around the model on github and elsewhere leads me to believe
| others are as well, although people tend to be circumspect
| about implementation details.
| ghenson2 wrote:
| Check out the T5 models at huggingface for this. Main NLG use
| cases with T5 are translations, summarization and question
| generation. The latter is sophisticated, nontrivial, and clearly
| "NLG", so yeah.
| polm23 wrote:
| This is a short overview of the state of NLG by Robert Dale, co-
| author of "Building Natural Language Generation Systems", which
| is basically the book for NLG.
|
| He gives a list of commercial providers and concludes that most of
| them just offer smart templates. This is the important part:
|
| > To the extent that you can tell from the clues to functionality
| that are surfaced by these various products, all the tools are
| ultimately very similar in terms of how they work, which might be
| referred to as 'smart template' mechanisms. There is a
| recognition that, at least for the kinds of use cases we see
| today, much of the text in any given output can be predetermined
| and provided as boilerplate, with gaps to be filled dynamically
| based on per-record variations in the underlying data source. Add
| conditional inclusion of text components and maybe some kind of
| looping control construct, and the resulting NLG toolkit, as in
| the case of humans and chimpanzees, shares 99% of its DNA with
| the legal document automation and assembly tools of the 1990s,
| like HotDocs (https://www.hotdocs.com). As far as I can tell,
| linguistic knowledge, and other refined ingredients of the NLG
| systems built in research laboratories, is sparse and generally
| limited to morphology for number agreement (one stock dropped in
| value vs. three stocks dropped in value).
|
| That sounds pretty negative, but he emphasizes that putting an
| easy-to-use UI on well understood technology is meeting real
| business needs.
|
| At the end he briefly touches on GPT2 and related technology.
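|
| The 'smart template' mechanism in the quote above (boilerplate with
| gaps, conditional inclusion, a looping construct, and morphology for
| number agreement) can be sketched roughly as follows. This is only an
| illustration, not any vendor's actual implementation; the Stock
| class, pluralize helper, render_report function and the sample
| wording are all hypothetical names invented for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Stock:
    # Hypothetical record type standing in for "the underlying data source".
    name: str
    change_pct: float  # negative means the stock dropped


def pluralize(count: int, singular: str, plural: str) -> str:
    """Number agreement: 'one stock' vs. 'three stocks'."""
    words = {1: "one", 2: "two", 3: "three"}
    noun = singular if count == 1 else plural
    return f"{words.get(count, str(count))} {noun}"


def render_report(portfolio: str, stocks: list[Stock]) -> str:
    dropped = [s for s in stocks if s.change_pct < 0]
    # Boilerplate with a gap that is filled from the record.
    sentences = [f"Here is today's summary for {portfolio}."]
    # Conditional inclusion: only list drops when there are any.
    if dropped:
        subject = pluralize(len(dropped), "stock", "stocks")
        sentences.append(f"{subject.capitalize()} dropped in value.")
        # Looping construct: one sentence per dropped stock.
        for s in dropped:
            sentences.append(f"{s.name} fell {abs(s.change_pct):.1f}%.")
    else:
        sentences.append("No stocks dropped in value.")
    return " ".join(sentences)
```

| Calling render_report("ACME", [Stock("Foo", -1.5), Stock("Bar", 2.0)])
| exercises all four ingredients. Every word of the output is fixed by
| the template author, which is the kind of control over generated
| prose that other commenters in this thread value.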
| PaulHoule wrote:
| Smart templates aren't bad at all given that for most uses you
| want to enforce style and attributes.
|
| Generation is easier than understanding. Systems like GPT-3 are
| not capable of respecting constraints, which is the basis for
| practical creativity.
| visarga wrote:
| The state of NLG as of June 2020. A long time ago.
|
| Apparently it was published just one day before GPT-3.
| is_true wrote:
| I think that most use cases for NLG are unethical.
|
| I provide raw data for some news services and that data could be
| shown as tables or simple dashboards that are faster to read for
| users, but my customers insist on generating as much text as
| possible from, sometimes, 2 values.
|
| The incentives are totally misaligned: news services try to get
| as much time from their readers as possible. It's quite
| depressing.
| coding123 wrote:
| I tend to agree. Technologically it's a marvel, but I do
| have the feeling that robocallers and NLG customers are hanging
| out in the same place.
| gxqoz wrote:
| I did some experiments last summer trying to get GPT-3 to
| author quizbowl questions, a type of trivia question. As
| expected, GPT-3's loose relationship with the truth made these
| results not acceptable. But a system that really could generate
| new questions from existing facts would be immensely useful in
| this domain.
| polm23 wrote:
| I'm not surprised GPT is awful at that, but there are good
| ways to do it. Look for papers on "factoid question
| generation".
| gxqoz wrote:
| Quizbowl questions have a particular structure that makes
| them generally not work with these other approaches. For
| instance, it needs to generate paragraph-long questions
| with the clues that go from the most to least difficult.
| This is a hard skill for a human to learn, much less a
| machine.
| tyingq wrote:
| Google is probably going to struggle with SEO content/link
| mills that now produce something better than spun gibberish.
| lumost wrote:
| I'd expect the only rational decision would be to ban the
| content mill domains from the crawler. Is there really that
| much value for the search customer in being directed to a
| generated page?
| tyingq wrote:
| That's the dilemma. NLG of sufficient quality[1] won't be
| easily detectable as machine generated.
|
| [1] "Quality" here meaning how it looks to Google's
| crawler, not actual quality content.
| lumost wrote:
| Domains auto-generating junk or derivative content should
| be detectable through either human annotation or automated
| means. If a site with low click-through rates, low traffic
| and extreme volumes of text appears, it's worth a
| re-examination by people.
|
| In particular at search time, Google has a vested
| interest in limiting both the number of links to a
| particular domain and the occurrence of content farm
| links when Wikipedia would suffice. Content farms are
| pretty detectable in any objective relevance annotation
| workflow.
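|
| The automated half of that check (low click-through rate, low
| traffic, extreme volume of text) could look something like the
| sketch below. The DomainStats fields and all three thresholds are
| hypothetical values picked for illustration; nothing here reflects
| how Google actually ranks or filters domains.

```python
from dataclasses import dataclass


@dataclass
class DomainStats:
    # Hypothetical per-domain aggregates; field names are assumptions.
    domain: str
    click_through_rate: float  # fraction of search impressions clicked
    monthly_visits: int
    indexed_words: int


# Illustrative thresholds only, not tuned on real data.
MAX_CTR = 0.01
MAX_VISITS = 1_000
MIN_WORDS = 10_000_000


def needs_human_review(stats: DomainStats) -> bool:
    """Flag domains with low CTR, low traffic and a very large text volume."""
    return (
        stats.click_through_rate < MAX_CTR
        and stats.monthly_visits < MAX_VISITS
        and stats.indexed_words > MIN_WORDS
    )
```

| A flagged domain is only queued for re-examination by people, as
| the comment suggests; the rule does not decide anything by itself.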
| tyingq wrote:
| It wasn't that long ago that really poor-quality auto-generated
| content was clearly working for SEO purposes,
| so I'm not convinced Google is ahead of the current state of
| the art.
| mkl95 wrote:
| Usage of these algorithms and heuristics is starting to become
| quite obvious in Spanish sports journalism. I stumble into
| articles that are basically AI gibberish generated from a short
| quote every other day.
| Saad_M wrote:
| I very much disagree. There are use cases such as weather
| reporting, electronic medical report summarisation, etc., that
| make more sense as a textual representation than bundles of
| graphs and tables. In fact textual summarisation has been shown
| to lead to better decision making[1].
|
| Good data-to-text NLG applications not only summarise data but
| can also provide insight into causal relations of why
| events occurred by leveraging domain knowledge.
|
| [1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2656014/
| is_true wrote:
| I agree with you, Saad. I worded that comment poorly. There
| are some use cases that are gonna be used to keep people
| reading a lot of algorithmic gibberish for no reason other
| than increasing revenue, for example sports articles.
|
| I think NLG is really interesting, the problem is that there
| are incentives to create long form content that doesn't add
| any value to readers.
| pdevr wrote:
| Is there any place where NLG guys hang around? Most groups seem
| to have NLP as the focus.
| Saad_M wrote:
| On the academic side the SIGGEN mailing list is where a lot of
| the activity in the NLG community goes on:
| https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=SIGGEN
| ghenson2 wrote:
| Wonder the same thing or just an _active_ discord for these
| topics...
| visarga wrote:
| EleutherAI on discord. https://discord.gg/vtRgjbM
| picodguyo wrote:
| ML and NLP surveys are outdated basically the minute the author
| hits send nowadays. With GPT-3 all of these old school NLG
| companies seem quaint. Much of their functionality can be
| replicated in minutes with GPT-3, even by non-technical users
| familiar with prompt design. As soon as OpenAI offer fine tuning
| OOTB (coming soon) NLP/G will be all but solved for 99% of use
| cases.
| PedroBatista wrote:
| Flying cars, buddy! You just wait...
| picodguyo wrote:
| I know, I know. But I don't say this lightly. I've worked in
| NLP/G for nearly 15 years and after spending the last 6
| months working with GPT-3 I feel the writing is on the wall.
| Saad_M wrote:
| But it isn't. I am still actively involved in NLG both
| professionally and academically and neural NLG systems have
| great promise but are still far from actively delivering
| tangible solutions to areas such as data-to-text NLG.
| Inaccuracy and hallucinations are still highly problematic.
| polm23 wrote:
| I don't want to suggest you shot off a random comment without
| actually reading the linked survey. But it spends a whole
| section discussing that, and why it doesn't see commercial use.
| Here's a particularly relevant bit:
|
| > Sometimes the results produced by GPT-2 and its ilk are quite
| startling in their apparent authenticity. More often than not
| they are just a bit off. And sometimes they are just gibberish.
| As is widely acknowledged, neural text generation as it stands
| today has a significant problem: driven as it is by information
| that is ultimately about language use, rather than directly
| about the real world, it roams untethered to the truth. While
| the output of such a process might be good enough for the
| presidential teleprompter, it would not cut it if you want the
| hard facts about how your pension fund is performing. So, at
| least for the time being, nobody who develops commercial
| applications of NLG technology is going to rely on this
| particular form of that technology.
| picodguyo wrote:
| I agree with the author on GPT-2. But GPT-3, which became
| available shortly after this was published, is quite a bit
| more powerful and there are many commercial applications
| being built on it now.
| leereeves wrote:
| Even GPT-3 knows nothing about the real world; it's merely
| trained to repeat the words that most often followed the
| prompt in its training data. That's obviously not useful
| for news...if a fact is in the training data, it's not
| news. It's not useful for "hard facts about how your
| pension fund is performing" unless you want to know how it
| performed a long time ago.
|
| But I agree there are some applications it is useful for,
| like education.
| FeepingCreature wrote:
| > Even GPT-3 knows nothing about the real world; it's
| merely trained to repeat the words that most often
| followed the prompt in its training data.
|
| I don't know why that would imply that it knows nothing
| about the real world, unless the data corpus it is
| trained on likewise bears no relation to reality...
| nmfisher wrote:
| > unless the data corpus it is trained on likewise bears
| no relation to reality
|
| It's trained on Reddit, so I wouldn't rule that out.
| probably_wrong wrote:
| I do not see how GPT-3 could solve the basic architectural
| problem that the parent comment quotes, namely, that
| _"driven as it is by information that is ultimately about
| language use, rather than directly about the real world, it
| roams untethered to the truth"_.
|
| As an experiment I used a GPT-3-powered website [1] to see
| what GPT-3 has to say about bears, and the first answer
| was:
|
| > _"Weird that every day, there are so many
| cute/funny/entertaining bears to enjoy online but hardly
| any on the ground."_
|
| When asked about _beards_, the first answer has no
| relation to beards at all:
|
| > _"If a person doesn't constantly outwit, outplay,
| outlast, others, the strong eat the weak."_
|
| And then there's that time when GPT-3 told someone to kill
| themselves [2].
|
| While funny and (mostly) grammatically correct, these
| "thoughts" are nonsense and no amount of extra parameters
| is going to solve the disconnection between GPT-3 and
| reality. I imagine you could condition GPT-3 to generate
| text for a specific piece of data in such a way that
| guarantees the correctness of its output, but at that point
| you might as well throw GPT-3 away and write a rule-based
| system.
|
| [1] https://thoughts.sushant-kumar.com/bears
|
| [2] https://www.nabla.com/blog/gpt-3/
| picodguyo wrote:
| Is language use not inherently shaped by the real world?
|
| The site you tried is a tweet generator, not a question
| answering site. I prompted GPT-3 with "Bears and beards
| are different because" and got...
|
| "Bears and beards are different because they are not the
| same thing.
|
| Bears are animals. Beards are facial hair.
|
| Bears are dangerous. Beards are not.
|
| Bears live in the woods. Beards live on your face.
|
| Bears eat people. Beards do not."
|
| But my original point was mainly that this field is
| moving fast and the old-school NLG companies (I
| created one back in the day!) are toast.
| hannasanarion wrote:
| There is more to "the real world" than the definition of
| words, which is the most that you can expect a language
| model to learn.
|
| Yes it is true that bears are animals.
|
| No it is not true that, as GPT-3 said, "There aren't any
| on the ground"
| visarga wrote:
| Let's just wait a couple of years to see if GPT-3 was any
| good in applications. Doesn't matter what we think, what
| matters is if it is viable.
|
| Its younger sibling DALL-E is capable of language
| grounded in images; I expect the next version to be
| multi-modal as well. On another line of research there's an
| effort to tame the horse (GPT) by attaching a secondary
| neural net. This can monitor language, topic, style and
| bias and ensure increased accuracy in tasks by auto-
| learning good prompts. It would make development of
| applications much easier because the base model which was
| super expensive to train can be reused many times while
| the secondary net is small and fast to train. Other
| efforts are related to including a search engine on an
| inner loop, to make the language model able to query
| large collections. Also, there's an open effort to create
| a huge text corpus, so far 800GB (The Pile). It improves
| on the GPT-3 training corpus on some categories that were
| lacking.
|
| I think it's safe to say the article is way off the
| current research level.
| j-pb wrote:
| "Solving" NLP is AGI-complete. GPT-3 is great at superficially
| correct syntax but breaks at deep / consistent / meaningful
| semantics.
| yowlingcat wrote:
| How well can GPT-3 distinguish between human generated and
| GPT-3 generated?
| bulldog13 wrote:
| Can you recommend any good tutorials on prompt design for GPT-3?
| Or how to use GPT-3 in general?
| picodguyo wrote:
| The main docs are behind a private beta wall, but these are
| good resources too:
|
| http://gptprompts.wikidot.com/
| https://gpttools.com/tutorial_searchQA
| https://aidungeon.medium.com/world-creation-by-analogy-f26e3...
| wodenokoto wrote:
| What are the commercial uses of the GPT-3 API?
|
| I must completely lack imagination, because I don't know what to
| use it for if it doesn't give me access to the weights.
___________________________________________________________________
(page generated 2021-01-11 22:02 UTC)