[HN Gopher] Loki: An open-source tool for fact verification
___________________________________________________________________
Loki: An open-source tool for fact verification
Author : Xudong
Score : 151 points
Date : 2024-04-06 10:59 UTC (12 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| dscottboggs wrote:
| That seems like something unlikely to be automated well, and not
| something that at least current-gen AI is capable of.
|
| Does it...work?
| Xudong wrote:
| Hi there, I agree that fact-checking is not something that
| current generative AI models can directly solve. Therefore, we
| decompose this complex task into five simpler steps, each of
| which current techniques can handle better. Please refer to
| https://github.com/Libr-AI/OpenFactVerification?tab=readme-o...
| for more details.
|
| However, errors can always occur. We try to help users in an
| interpretable and transparent way by showing all retrieved
| evidence and the rationale behind each assessment. We hope this
| could at least help people when dealing with such problems.
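|
| In rough code, the five steps look something like this (a
| minimal sketch with illustrative helper names, not our actual
| API):
|
|   def decompose(text):  # 1. split the input into atomic claims
|       return [s.strip() for s in text.split(".") if s.strip()]
|
|   def is_checkworthy(claim):  # 2. drop opinions and trivia
|       return True  # placeholder; the real step asks an LLM
|
|   def generate_queries(claim):  # 3. turn a claim into queries
|       return [claim]  # placeholder
|
|   def retrieve(queries):  # 4. gather evidence for each query
|       return ["evidence snippet for: " + q for q in queries]
|
|   def verify(claim, evidence):  # 5. judge claim vs. evidence
|       return "supported"  # placeholder; the real step asks an LLM
|
|   def fact_check(text):
|       for claim in filter(is_checkworthy, decompose(text)):
|           evidence = retrieve(generate_queries(claim))
|           yield claim, verify(claim, evidence), evidence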
| szszrk wrote:
| I just tried queries similar to the ones in their screenshots,
| using Kagi. Basically asked it the exact same question.
|
| While it answered a general "yes" when the more precise answer
| was "no", the reasoning in the answer was perfectly on point and
| covered exactly the same ground.
|
| As a general LLM for regular users, FastGPT (their LLM service)
| is in my opinion "meh" (it lacks conversations, for instance).
| But it's really impressive that it draws on VERY recent data
| (like news and articles from the last few days) and always
| provides great references.
| chamomeal wrote:
| Very cool! I've toyed with an idea like this for a while. The
| scraping is a cool extra feature, but tbh just breaking down text
| into verifiable claims and setting up the logic tokens is way
| cooler.
|
| I imagine somebody feeding a live presidential debate into this.
| It could be a great tool for fact-checking.
| Xudong wrote:
| ahah thanks!
| vinni2 wrote:
| It's a bit misleading to call it an open-source tool when it
| relies on proprietary LLMs for everything.
| btbuildem wrote:
| Presumably the LLMs are swappable -- today the proprietary ones
| are very powerful and accessible, but the landscape may yet
| change.
| vinni2 wrote:
| Well, but they don't mention that. It's clickbait to call it
| an open-source fact-checking tool when it needs LLMs to do
| everything. Also, the code is not designed to make it easy to
| swap in a free, locally running LLM.
| Xudong wrote:
| I apologize for any confusion caused earlier. The core
| components have been defined separately
| (https://github.com/Libr-
| AI/OpenFactVerification/tree/main/fa...) to make
| customization easier. We understand that switching between
| different LLMs isn't particularly easy in the current
| version. However, we will be adding these features in
| future versions. You are most welcome to collaborate with
| us and contribute to this project!
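|
| To give an idea of the direction (a hypothetical interface,
| not the current code), swapping would come down to
| implementing one small class:
|
|   from abc import ABC, abstractmethod
|
|   class LLMClient(ABC):
|       @abstractmethod
|       def complete(self, prompt: str) -> str:
|           """Map a prompt string to a completion string."""
|
|   class LocalLLMClient(LLMClient):
|       def complete(self, prompt: str) -> str:
|           # call a locally hosted model here instead of OpenAI
|           raise NotImplementedError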
| vinni2 wrote:
| So the only thing they open-sourced is the prompts [1] and the
| code to call LLM APIs? There are plenty of such libraries out
| there. And the prompts seem to be copied from here [2]?
|
| [1] https://github.com/Libr-
| AI/OpenFactVerification/blob/main/fa...
|
| [2] https://github.com/yuxiaw/Factcheck-
| GPT/blob/main/src/utils/...
| raycat7 wrote:
| Regarding your last concern, I found that yuxiaw is their
| COO [1], so it can't be considered a copy?
|
| [1] https://www.librai.tech/team
| vinni2 wrote:
| OK, but the bigger issue is that there is evidence that LLMs
| are not better than specialized models for fact-checking.
| https://arxiv.org/abs/2402.12147
| Xudong wrote:
| Hello vinni2, thank you for mentioning the paper. However, I
| noticed that it hasn't gone through peer review yet. Also, the
| paper suggests that fine-tuning may work better than in-context
| learning, but that isn't a problem: you can fine-tune any LLM,
| such as GPT-3.5, for this purpose and use it with this
| framework. Once you have fine-tuned a GPT model on specific
| data, you'll only need to modify the model name
| (https://github.com/Libr-
| AI/OpenFactVerification/blob/8fd1da9...). I believe this
| approach can lead to better results than what the paper
| suggests.
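|
| For example (the "ft:..." id below is a made-up placeholder in
| the format OpenAI assigns to fine-tuned models):
|
|   from openai import OpenAI
|
|   client = OpenAI()  # reads OPENAI_API_KEY from the environment
|   response = client.chat.completions.create(
|       # swap "gpt-3.5-turbo" for your fine-tuned model's name
|       model="ft:gpt-3.5-turbo-0125:my-org::abc123",
|       messages=[{"role": "user", "content": "Verify this claim: ..."}],
|   )
|   print(response.choices[0].message.content)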
| rjb7731 wrote:
| Isn't this similar to the DeepMind paper on long-form
| factuality posted a few days ago?
|
| https://arxiv.org/abs/2403.18802
|
| https://github.com/google-deepmind/long-form-factuality/tree...
| Xudong wrote:
| Yes, they are similar. Actually, our initial paper was released
| around five months ago (https://arxiv.org/abs/2311.09000).
| Unfortunately, our paper isn't cited by the DeepMind paper; you
| can see this discussion as an example:
| https://x.com/gregd_nlp/status/1773453723655696431
|
| Compared with our initial version, we have mainly focused on
| efficiency, achieving a 10x faster checking process without
| decreasing accuracy.
| westurner wrote:
| > _We further construct an open-domain document-level
| factuality benchmark in three-level granularity: claim,
| sentence and document_
|
| A 2020 Meta paper [1] mentions FEVER [2], which was published
| in 2018.
|
| [1] "Language models as fact checkers?" (2020)
| https://scholar.google.com/scholar?cites=3466959631133385664
|
| [2] https://paperswithcode.com/dataset/fever
|
| I've collected various ideas for publishing premises as
| linked data; "#StructuredPremises" "#nbmeta"
| https://www.google.com/search?q=%22structuredpremises%22
|
| From "GenAI and erroneous medical references"
| https://news.ycombinator.com/item?id=39497333 :
|
| >> _Additional layers of these 'LLMs' could read the
| responses and determine whether their premises are valid and
| their logic is sound as necessary to support the presented
| conclusion(s), and then just suggest a different citation URL
| for the preceding text_
|
| > [...] _" Find tests for this code"_
|
| > _" Find citations for this bias"_
|
| From https://news.ycombinator.com/item?id=38353285 :
|
| > _" LLMs cannot find reasoning errors, but can correct them"
| https://news.ycombinator.com/item?id=38353285 _
|
| > _" Misalignment and [...]"_
| RcouF1uZ4gsC wrote:
| > This tool is especially useful for journalists, researchers,
| and anyone interested in the factuality of information.
|
| Sorry, I think an individual who is not aware of reliable
| sources to verify information, and who is not familiar enough
| with LLMs to come up with appropriate prompts and judge the
| output, should be the last person presenting themselves as the
| judge of factual information.
| Xudong wrote:
| Thanks for your response. When discussing fact-checking
| capabilities, the key question is always: can we guarantee that
| it will always offer the correct justification? Unfortunately,
| errors can occur. Nonetheless, we prioritize making the
| checking process both interpretable and transparent, allowing
| users to understand and trust the rationale behind each
| assessment.
|
| We present the results at each step to help users understand
| the decision process, which can be seen from our screenshot at
| https://raw.githubusercontent.com/Libr-AI/OpenFactVerificati...
|
| We will try our best to ensure this tool makes a positive
| difference.
| tudorw wrote:
| Anyone tried this?
| https://journaliststudio.google.com/pinpoint/about
| meling wrote:
| My friend's startup: https://factiverse.ai/
| Der_Einzige wrote:
| You might want to look into integrating DebateSum or
| OpenDebateEvidence (OpenCaseList) into this tool as sources of
| evidence. They are uniquely good for these sorts of tasks:
|
| https://huggingface.co/datasets/Hellisotherpeople/DebateSum
|
| https://huggingface.co/datasets/Yusuf5/OpenCaselist
| Xudong wrote:
| Hi Der_Einzige, thanks for pointing out these two great
| datasets! We are currently working on including customized
| evidence sources internally and will definitely consider these
| two datasets in future versions of this open-source project.
| axegon_ wrote:
| Overall a great idea though, I'll definitely be checking back
| on it in the future. A few things that hit me out of the box:
|
| * The idea behind using Serper is great, however it would be
| cool if other search engines/data sources could be used
| instead, e.g. Kagi or some private search engine/data. Reason
| for the latter: there are tons of people who are sourcing all
| sorts of information which will not immediately show up on
| Google, and some might never do so. For context: I have roughly
| 60GB (and growing) of cleaned news articles, along with where I
| got them from and a good amount of pre-processing done on the
| fly (I collect those all the time).
|
| * Relying heavily on OpenAI. Yes, OpenAI is great but there's
| always the thing at the back of our minds that is "where are
| all those queries going and do we trust that shit won't hit the
| fan some day". It would be nice to have the ability to use a
| local LLM, given how many good ones there are around.
|
| * The installation can be improved massively: setuptools +
| entry_points + console_scripts would avoid all the hassle of
| having to manage dependencies, where your scripts are located
| and all that. The cp factcheck/config/secret_dict.template
| factcheck/config/secret_dict.py is a bit... Uuuugh...
| pydantic[dotenv] + .env instead? (see the sketch after this
| list) That would also make containerizing the application so
| much easier.
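|
| For the last point, roughly what I mean (a sketch assuming
| pydantic v1 with the dotenv extra installed; the setting names
| are made-up stand-ins for the actual secrets):
|
|   from pydantic import BaseSettings
|
|   class Secrets(BaseSettings):
|       # read from OPENAI_API_KEY / SERPER_API_KEY, either from
|       # the process environment or from a local .env file
|       openai_api_key: str = ""
|       serper_api_key: str = ""
|
|       class Config:
|           env_file = ".env"
|
|   secrets = Secrets()  # no more copying secret_dict.template around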
| xyst wrote:
| I fully expect some sort of enshittification of OpenAI at some
| point.
| lta wrote:
| That's assuming it hasn't happened already, with their mission
| of being open completely forgotten.
| Xudong wrote:
| Thank you for your suggestions, axegon!!! We will definitely
| consider them and add these features shortly.
|
| Regarding the first point, we are currently working on enabling
| customized evidence retrieval, including local files. Our plan
| is to integrate existing tools like LlamaIndex. Any suggestions
| are greatly appreciated!
|
| Regarding the second point, we have found OpenAI's JSON mode to
| be greatly helpful, and we have optimized our prompts to fully
| utilize these advances. However, we agree that it would be
| beneficial to enable the use of other models. As promised, we
| will add this feature soon.
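|
| For reference, JSON mode is a one-parameter change (note that
| the prompt itself must also mention JSON for the request to be
| accepted):
|
|   from openai import OpenAI
|
|   client = OpenAI()
|   response = client.chat.completions.create(
|       model="gpt-3.5-turbo-0125",
|       response_format={"type": "json_object"},  # forces valid JSON output
|       messages=[{"role": "user",
|                  "content": "List the factual claims in this text as JSON: ..."}],
|   )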
|
| Lastly, we appreciate your suggestion and will work on
| improving the installation process for the next version.
| big_hacker wrote:
| Dead internet.
| antihipocrat wrote:
| Have to agree with you, every comment from the product creator
| reads like a ChatGPT response.
| Xudong wrote:
| I will take it as a compliment, lol. But I do hope
| ChatGPT or some agents could help me with this. Btw, our
| recent study on machine-generated text detection might be
| interesting to you.
|
| https://arxiv.org/abs/2305.14902
| https://arxiv.org/abs/2402.11175
| swores wrote:
| Feedback on the example gif: at the moment it's almost comically
| useless. First you're bored watching the first 90%, while
| commands are slowly being typed; then the bit that's actually
| interesting and worth reading scrolls by too fast, and the gif
| resets to the beginning before there's a chance to read it.
| Xudong wrote:
| Thanks for your feedback on the gif figure, swores! We will
| revise it soon.
| eMPee584 wrote:
| mpv ftw: playback speed control even for gifs..
| eeue56 wrote:
| Interesting. In the Nordics, we have a couple of sites dedicated
| to fact-checking news stories, done by real people. I think
| these kinds of automated tools can be helpful too, but they need
| to be tied to reliable sources. This became pretty apparent to
| me with the tech news coverage of xz, too. Lots of accidental
| (or sometimes intentional?) misinformation being spread in news
| articles. I wrote about it a bit [0]; it was pretty sad to see
| big international publishers publishing an article based
| entirely on the journalist's misunderstandings of the situation.
| Facts and truth are important, especially as we see gen AI
| increasing the amount of legitimate-looking content online that
| might not actually be true.
|
| [0] - https://open.substack.com/pub/thetechenabler/p/trust-in-
| brea...
| pelasaco wrote:
| > In the Nordics, we have a couple of sites dedicated to fact
| checking news stories, done by real people.
|
| We have it everywhere. The problem, however, is well-known:
| human bias, political engagement from the fact-checkers, etc.
| AI (without any kind of lock or political bias built in) could
| be the real deal, but because it may not be politically
| correct, it will never happen.
| Xudong wrote:
| I wholeheartedly agree on the necessity of linking fact-
| checking tools to credible sources. Currently, our team's
| expertise lies primarily in AI, and we find ourselves at a
| disadvantage when it comes to pinpointing authoritative
| sources. Acknowledging the challenges posed by the rapid spread
| of misinformation, as highlighted by recent studies, we
| developed this prototype to assist in information verification.
| We recognize the value of collaboration in enhancing our tool's
| effectiveness and invite those experienced in evaluating
| sources to join our effort. If our project interests you and
| you're willing to contribute, please don't hesitate to reach
| out. We're eager to collaborate and make a positive impact
| together.
| siffland wrote:
| When I saw Loki as the name, I instantly thought of Grafana Loki
| for logging. I click on the GitHub and get Libr-AI and
| OpenFactVerification.
|
| I am not commenting on the actual software and I know names are
| hard and often overlap, but with something as popular as Loki
| already used for logging I think it might get confusing.
| Xudong wrote:
| Hi siffland! Thank you for your feedback. We understand your
| concern about the potential confusion given the popularity of
| Grafana Loki in the logging space. When naming our project, we
| sought a name that encapsulates our goal of combating
| misinformation. We chose Loki, inspired by the Norse god often
| associated with stories and trickery, to symbolize our
| commitment to unveiling the truth hidden within nonfactual
| information.
|
| When we named our project, we were unaware of the overlap with
| Grafana Loki. We appreciate you bringing this to our attention!
| I will discuss this issue with my team in the next meeting, and
| figure out if there is a better way of solving this. If you
| have any suggestions or thoughts on how we can better
| differentiate our project, we would love to hear them.
|
| Thank you again for your valuable input!
| dekervin wrote:
| I have a project where I take a different approach [0]. I
| basically extract statements, explicit or implicit, that should
| be accompanied by a reference to some data but aren't, and I
| let the user find the most relevant data for those statements.
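|
| One way the extraction step can be pictured (a sketch, not the
| actual code behind the site):
|
|   PROMPT = """List every statement in the text below that
|   asserts something checkable, explicitly or implicitly,
|   without citing any data or source. Return one statement per
|   line.
|
|   Text: {text}"""
|
|   def unreferenced_statements(text, llm):
|       # llm: any callable mapping a prompt string to a completion
|       reply = llm(PROMPT.format(text=text))
|       return [ln.strip() for ln in reply.splitlines() if ln.strip()]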
|
| [0] https://datum.alwaysdata.net/
| martinbaun wrote:
| Maybe the name is not so fitting, as Loki is a figure in Norse
| mythology known for deceiving and lying, which is basically the
| opposite of what you're trying to do :)
| smoyer wrote:
| It's also the name of a well-known open-source log collection
| system that's part of the LGTM stack (predominantly led by
| Grafana Labs).
| croes wrote:
| Maybe it's on purpose.
|
| Who could know the patterns of liars better than the god of
| lying?
| badrunaway wrote:
| I found it very interesting. I had this funny thought that,
| just like CAPTCHA, maybe soon we will have to ask humans to
| give their input on fact verification systems at scale.
| redder23 wrote:
| The name Loki is such a great fit! WOW!
|
| This is some giant BS that is for sure. Some stupid, literally
| brain-dead AI searching things created by humans to determine
| what is a "fact". This is beyond dystopian crap.
|
| We all know the fact-checker orgs used by big tech like
| Facebook and others are filled with hyper-biased woke people
| who do not actually fact-check things but get off on having
| the power to enforce their beliefs, feelings and biases.
|
| I can already tell this is total BS without even looking into
| it. What kinds of sources will it use? What ranking will they
| give them? Snopes? ROFL. Probably just uses some woke-infested,
| censored and curated language model to determine a fact based
| on what has the most matches, or is THE MOST LIKELY, because
| that's how AI works. Has absolutely nothing to do with facts.
|
| And it's even worse, we are literally in a time when AI
| hallucinates things that do not exist. I won't use a stupid AI to
| find me "facts".
___________________________________________________________________
(page generated 2024-04-06 23:00 UTC)