[HN Gopher] Show HN: Turn a paper's DOI into its full reference ...
___________________________________________________________________
Show HN: Turn a paper's DOI into its full reference list
(BibTeX/RIS, etc.)
Author : mireklzicar
Score : 24 points
Date : 2025-06-22 18:25 UTC (4 hours ago)
(HTM) web link (references.mireklzicar.com)
(TXT) w3m dump (references.mireklzicar.com)
| oersted wrote:
| Is an open-source library being used for this? Or can you
| describe the methods you use? I worked on this and related
| problems around extracting features from paper PDFs; we could all
| learn from how you did it.
|
| Generally, an About page is always appreciated for such web tools
| with minimal UX, particularly when it's rather automagical.
| trurl42 wrote:
| Looks like it's just calling the Crossref API
| afandian wrote:
| You can look at the network requests to see what it's doing.
| It's querying the OpenCitations database followed by the
| DOI.org content negotiation endpoint, which 302's to Crossref
| (or whoever the relevant DOI registration agency is).
|
| More info on content negotiation:
|
| https://citation.doi.org/
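
A minimal sketch of the content negotiation step described above, assuming the commonly documented media types for BibTeX, RIS, CSL JSON and RDF XML. The names `ACCEPT`, `build_request` and `fetch_citation` are illustrative and are not taken from the tool's source; which formats are actually served depends on the registration agency behind the DOI.

```python
import urllib.parse
import urllib.request

# Assumption: these media types are accepted by the agency that registered
# the DOI. Support varies between agencies, so verify before relying on it.
ACCEPT = {
    "bibtex": "application/x-bibtex",
    "ris": "application/x-research-info-systems",
    "csl-json": "application/vnd.citationstyles.csl+json",
    "rdf-xml": "application/rdf+xml",
}


def build_request(doi: str, fmt: str = "bibtex") -> urllib.request.Request:
    """Build (but do not send) a content negotiation request for a DOI."""
    # Keep the slash between prefix and suffix, percent-encode the rest.
    url = "https://doi.org/" + urllib.parse.quote(doi, safe="/")
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch_citation(doi: str, fmt: str = "bibtex", timeout: float = 10.0) -> str:
    """Send the request; urllib follows the redirect to the agency."""
    with urllib.request.urlopen(build_request(doi, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_citation("<some DOI>", "ris")` would return the RIS record as text. Keeping request construction separate from the network call makes the header logic easy to check offline.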
| afandian wrote:
| In this case it's querying the relevant DOI registration
| agency's API for the metadata (statistically that's likely
| Crossref) that the publisher themselves registered. So it
| doesn't look like there's any extraction going on here.
|
| Could you share _your_ work though? It's always interesting to
| see new approaches to metadata.
|
| Traditionally, it was a bit of a one-way street (data comes
| from publisher) but there's some interesting work being done by
| COMET [0] and (separately) OpenAlex [1] around cleanup of the
| publisher-supplied data within the community.
|
| (I used to work at Crossref; am a little involved with COMET)
|
| [0] https://www.cometadata.org/
|
| [1] https://openalex.org/
| mireklzicar wrote:
| It's actually open source. Here is the repo:
| https://github.com/mireklzicar/doi-reference-extractor
|
| APIs used:
|
| OpenCitations API (v2)
| - Endpoint: https://opencitations.net/index/api/v2/references/
| - Purpose: Retrieves a list of all references from a paper by
|   its DOI
| - Data format: JSON containing cited DOIs and metadata
|
| DOI Content Negotiation
| - Endpoint: https://doi.org/{DOI}
| - Purpose: Fetches metadata and formatted citations for DOIs
| - Formats: BibTeX, RIS, CSL JSON, RDF XML, etc.
| - Implements CSL (Citation Style Language) for text-based
|   citations
|
| Local Citation Style Files
| - Purpose: Provides access to thousands of citation styles
| - Storage: Pre-generated JSON files with style information
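
A rough sketch of the first step listed above: asking the OpenCitations references endpoint for a paper's reference list and pulling out the cited DOIs. It assumes the v2 API takes a `doi:`-prefixed identifier and that each returned record has a `cited` field holding space-separated prefixed identifiers. Both are assumptions to check against the OpenCitations documentation, and the helper names are hypothetical rather than taken from the repo.

```python
import json
import urllib.request

BASE = "https://opencitations.net/index/api/v2/references/"


def references_url(doi: str) -> str:
    # Assumption: v2 expects a prefixed identifier such as "doi:10.1000/xyz".
    return BASE + "doi:" + doi


def cited_dois(records: list[dict]) -> list[str]:
    """Collect DOIs from each record's 'cited' field.

    Assumption: the field is a space-separated list of prefixed
    identifiers; records without a DOI are skipped.
    """
    out = []
    for rec in records:
        for token in rec.get("cited", "").split():
            if token.startswith("doi:"):
                out.append(token[len("doi:"):])
    return out


def fetch_reference_dois(doi: str, timeout: float = 30.0) -> list[str]:
    """Query the endpoint and return the DOIs of the cited works."""
    with urllib.request.urlopen(references_url(doi), timeout=timeout) as resp:
        return cited_dois(json.load(resp))
```

Each DOI returned this way could then go through DOI content negotiation to get a BibTeX or RIS entry, which matches the two-step flow visible in the network requests.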
| avoutos wrote:
| This tool might be useful for quick one-off referencing, but I
| feel that most will probably be better off using a proper
| citation manager like the open-source Zotero.
| mireklzicar wrote:
| Keep Zotero/Mendeley for collection management; use this simple
| tool when you just need the formatted references list in five
| seconds.
|
| Where it helps
|
| - Deep-dive reading - fetch a bulk RIS file, dump a seminal
| paper's entire bibliography into Zotero/Mendeley, and follow the
| threads.
|
| - Bulk citing - grab BibTeX entries for a cluster of related
| papers without hunting them down one by one.
|
| - LLM grounding - feed language models a clean reference list
| so they stop hallucinating citations.
| foundry27 wrote:
| Did you just use an LLM to write this reply?
___________________________________________________________________
(page generated 2025-06-22 23:00 UTC)