[HN Gopher] KELM: Integrating Knowledge Graphs with Language Mod...
___________________________________________________________________
KELM: Integrating Knowledge Graphs with Language Model Pre-Training
Corpora
Author : theafh
Score : 66 points
Date : 2021-05-21 13:01 UTC (10 hours ago)
(HTM) web link (ai.googleblog.com)
(TXT) w3m dump (ai.googleblog.com)
| dexter89_kp3 wrote:
 | Any links/resources for the opposite problem? i.e. generating
 | accurate knowledge graphs from corpora of documents?
|
 | Google does have an experimental API, but I have not found an
 | associated blog post or paper for it:
| https://cloud.google.com/ai-workshop/experiments/generating-...
| cpdomina wrote:
| The field of Open Information Extraction has been trying to do
| that in a generic way for a long time, but the results are
 | still far from good. A few references: OpenIE [1], Graphene [2],
 | and MinIE [3].
|
| If you already have a Knowledge Graph (KG) and want to populate
| its instances from documents, that's called KG Population, and
| Knowledge-net [4] is a good reference.
|
| Relation Extraction is another interesting approach if you know
 | which kinds of relations you're interested in; OpenNRE [5] is a
 | good example.
|
| [1] https://github.com/dair-iitd/OpenIE-standalone
|
| [2] https://github.com/Lambda-3/Graphene
|
| [3] https://github.com/uma-pi1/minie
|
| [4] https://github.com/diffbot/knowledge-net
|
| [5] https://github.com/thunlp/OpenNRE
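 | As a toy illustration of what these extractors do, a pattern-based
 | triple extractor can be sketched in a few lines of Python. The
 | patterns and sentences below are made up for illustration; real
 | OpenIE systems rely on full dependency parses and handle far more
 | constructions than this:

```python
import re

# Toy pattern-based extractor in the spirit of Open Information
# Extraction: pulls (subject, relation, object) triples from simple
# sentences whose relation word is one of a small hand-picked set.
PATTERN = re.compile(
    r"^(?P<subj>.+?)\s+(?P<rel>is|was|founded|created)\s+(?P<obj>.+?)\.?$"
)

def extract_triples(sentences):
    triples = []
    for s in sentences:
        m = PATTERN.match(s.strip())
        if m:
            triples.append((m["subj"], m["rel"], m["obj"]))
    return triples

print(extract_triples([
    "Marie Curie was a physicist.",
    "Larry Page founded Google.",
]))
# -> [('Marie Curie', 'was', 'a physicist'), ('Larry Page', 'founded', 'Google')]
```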
| MilStdJunkie wrote:
| I'm hip deep into this subject myself. I tried modding
| TiddlyMap and just now am checking out InfraNodus. When you dig
| into this, you find there's not a standard method because
| natural language is itself deeply non-standardized. To take one
 | example, a procedure uses the same structure as an ordered list,
| but a procedure regards the sequence as representing a temporal
| structure, whereas the ordered list is just using sequence as
| providing a unique identifier - it doesn't even need numbers or
 | precise intervals. You need NLP to chip out the subject-verb-object
 | from the ordered list items, or you need an element or role telling
 | you what you are looking at.
|
| If someone way more smart than myself could chip in on the
| subject, that would be pretty dang awesome.
| PaulHoule wrote:
| Are they interested in using the generated text as the input to
| some other process? (e.g. training "convert text to knowledge
 | graph"?)
|
| You could do this kind of graph -> text translation with
| conventional template-based tools, in fact people do that all the
 | time. You very much run into the two stages of "pick out a subgraph
 | of salient facts" and "materialize that subgraph as text". If you
 | scale it up you'll discover it has "erroneous zones" and you end up
 | building filters that block dangerous (likely to be wrong) outputs.
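 | The template-based graph -> text approach described above can be
 | sketched like this. The relation names, templates, and facts are
 | all illustrative, not taken from any real system; the `continue`
 | branch stands in for the filtering stage that drops facts the
 | templates can't render safely:

```python
# Minimal template-based graph -> text renderer: each relation gets a
# sentence template, and rendered sentences are joined into a paragraph.
TEMPLATES = {
    "occupation": "{s} works as a {o}.",
    "birthplace": "{s} was born in {o}.",
    "employer": "{s} is employed by {o}.",
}

def verbalize(triples):
    sentences = []
    for s, rel, o in triples:
        template = TEMPLATES.get(rel)
        if template is None:
            continue  # filtering stage: drop facts we can't render safely
        sentences.append(template.format(s=s, o=o))
    return " ".join(sentences)

print(verbalize([
    ("Ada Lovelace", "occupation", "mathematician"),
    ("Ada Lovelace", "birthplace", "London"),
]))
# -> Ada Lovelace works as a mathematician. Ada Lovelace was born in London.
```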
| gradys wrote:
 | It's easy to use templates to convert a given knowledge graph
 | node, or perhaps a subgraph, into a paragraph that could serve as
 | a Wikipedia intro paragraph.
|
| It's much harder to generate answers to questions. This calls
| for jointly choosing what knowledge to use in the answer and
| synthesizing text that presents that knowledge in a way that
| actually answers the question. This work is about this more
| dynamic problem.
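 | A toy sketch of that joint "choose facts, then synthesize" step,
 | using keyword overlap as a crude stand-in for a learned relevance
 | model. Everything here (the scoring, the facts, the rendering) is
 | illustrative; KELM itself uses a trained seq2seq verbalizer rather
 | than anything this simple:

```python
# Rank candidate triples by word overlap with the question, keep the
# top-k, and render the chosen facts as flat sentences.
def score(question, triple):
    words = set(question.lower().replace("?", "").split())
    text = " ".join(triple).lower()
    return sum(1 for w in words if w in text)

def answer(question, triples, k=1):
    ranked = sorted(triples, key=lambda t: score(question, t), reverse=True)
    chosen = ranked[:k]
    return " ".join(f"{s} {r} {o}." for s, r, o in chosen)

facts = [
    ("Paris", "is the capital of", "France"),
    ("Berlin", "is the capital of", "Germany"),
]
print(answer("What is the capital of France?", facts))
# -> Paris is the capital of France.
```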
| mark_l_watson wrote:
 | Looks interesting. I saw that the training data is available, but
 | didn't see any pre-trained models, etc.
|
| We talked about trying to do this at my last job.
| softwaredoug wrote:
| > To that end, we leverage the publicly available English
| Wikidata KG and convert it into natural language text in order to
| create a synthetic corpus
|
 | Wikimedia data is heavily depended on in the FAANG world for
 | Google search, Siri, Alexa, etc. When Siri directly answers a
 | factual question, I'd make a strong bet the answer ultimately
 | comes from Wikidata's knowledge graph.
|
| I just hope these companies give as much back to Wikimedia and
| society as the value they extract.
___________________________________________________________________
(page generated 2021-05-21 23:01 UTC)