t002-scholarref.html - adamsgaard.dk - my academic webpage
1 <h2>Rationale</h2>
2 <p>During the writing phase of an academic paper, common tasks include
3 downloading PDFs of publications and getting their references into your
4 bibliography. However, I am not a fan of navigating the slow, bloated,
5 tracker-filled, and distracting webpages of academic journals and
6 publication aggregators. For some reason, many publishers decided
7 that clicking the "Download PDF" link should redirect the user to
8 an unusable in-browser PDF viewer instead of providing the PDF file
9 directly. While the majority of journal webpages provide formatted
10 citations for their publications, these are inconsistent in style and
11 content.</p>
12
13 <p>For these reasons, I constructed a set of shell tools
14 called <strong>scholarref</strong> that allow me to
15 perform most of the tasks without having to open a web browser.
16 As the title of this post indicates, the goal of the toolset is to
17 provide as much functionality as a person might need during scientific
18 writing without leaving the command line. The tools are under <a
19 href="https://src.adamsgaard.dk/scholarref/log.html">continuous
20 development</a>. At present I avoid roughly 90% of visits to journal
21 webpages. I hope to get to 100% someday.</p>
22
23 <p>The <strong>scholarref</strong> design goals are the following:</p>
24 <ul>
25 <li>Written as POSIX shell scripts with minimal external dependencies:
26 Ensures maximum flexibility and portability.</li>
27 <li>Aim for simplicity:
28 Fewer lines of code make the programs easier to understand, maintain,
29 and debug.</li>
30 <li>Each tool should do one thing, and do it well:
31 Let the users piece the components together to fit their workflow.</li>
32 <li>Return references in BibTeX format.</li>
33 </ul>
34
35 <p><strong>DISCLAIMER:</strong> The functionality provided by these
36 programs depends on communication with third party webpages, which may
37 or may not be permitted by law and the terms of service upheld by the
38 third parties. What is demonstrated here are examples only. Use of
39 the tools is entirely your own responsibility.</p>
40
41
42 <h2>Installation</h2>
43
44 <pre><code>$ git clone git://src.adamsgaard.dk/scholarref
45 $ cd scholarref
46 # make install</code></pre>
47
48 <p>The <strong>make install</strong> command may require superuser
49 privileges to install the tools to <strong>/usr/local</strong>. Prefix
50 with <strong>doas</strong> or <strong>sudo</strong>, whichever is
51 appropriate for the target system.</p>
52
53 <h2>The scholarref toolset</h2>
54
55 <p>The core functionality is provided by the scripts
56 <strong>getdoi</strong>, <strong>getref</strong>, and
57 <strong>shdl</strong>. All programs accept input as command-line
58 arguments or from standard input (stdin). The programs come with
59 several options, and I encourage you to explore the help text
60 (invoke with option <strong>-h</strong>). The <strong>-t</strong> option
61 may be of particular interest, since it tunnels all communication through
62 <a href="https://torproject.org">Tor</a> via <strong>torsocks</strong>
63 (if available on the system).</p>
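<p>A minimal sketch of how a POSIX shell script can take its input either
way and optionally route a command through torsocks. The function names
<code>read_inputs</code> and <code>maybe_tor</code> and the
<code>use_tor</code> variable are hypothetical and are not taken from the
scholarref source.</p>

```shell
# Hypothetical helper: print each input item on its own line, using the
# arguments if any are given and standard input otherwise.
read_inputs() {
	if [ "$#" -gt 0 ]; then
		# One argument per line.
		printf '%s\n' "$@"
	else
		# No arguments: pass standard input through unchanged.
		cat
	fi
}

# Hypothetical helper: run a command through torsocks when use_tor=1 and
# torsocks is installed. It falls back to a direct connection when
# torsocks is missing; a stricter script could abort instead.
maybe_tor() {
	if [ "${use_tor:-0}" = 1 ] && command -v torsocks >/dev/null 2>&1; then
		torsocks "$@"
	else
		"$@"
	fi
}
```

<p>With this pattern, a DOI given as an argument and a DOI piped in on
stdin reach the rest of the script as the same single line of text.</p>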
64
65 <h3>getdoi</h3>
66 <p>This tool accepts either names of PDF files or arbitrary search queries.
67 If a PDF file name is supplied, <strong>getdoi</strong> scans the PDF
68 text in order to find the first occurring DOI entry, which typically is
69 the DOI of the publication itself. If an arbitrary query is supplied,
70 the <a href="http://api.crossref.org">CrossRef API</a> is used to find
71 the DOI of the closest publication match. You can supply author names,
72 parts of the title, ORCID, journal name, etc. Examples:</p>
73
74 <pre><code>$ getdoi damsgaard2018.pdf
75 10.1029/2018ms001299
76 $ getdoi 'damsgaard sergienko adcroft journal advances modeling earth systems'
77 10.1029/2018ms001299
78 </code></pre>
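<p>The "first occurring DOI" step can be sketched as a text filter. This
is an illustration under stated assumptions, not the actual
<strong>getdoi</strong> code: the function name <code>first_doi</code>
and the regular expression are my own, the PDF text is assumed to have
been extracted already, and <code>grep -o</code> is a widely available
extension rather than strict POSIX.</p>

```shell
# Hypothetical helper: read plain text on standard input and print the
# first DOI-like string found, lowercased.
first_doi() {
	# A DOI starts with "10.", a 4-9 digit registrant code, and a slash.
	grep -Eo '10\.[0-9]{4,9}/[^[:space:]"<>]+' |
		head -n 1 |
		# Drop sentence punctuation that was glued to the match.
		sed 's/[.,;)]*$//' |
		# DOIs are case-insensitive; normalize to lowercase.
		tr '[:upper:]' '[:lower:]'
}
```

<p>Assuming <code>pdftotext</code> from Poppler is installed, the filter
could be fed with <code>pdftotext damsgaard2018.pdf - | first_doi</code>.</p>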
79
80 <h3>getref</h3>
81 <p>The <strong>getref</strong> tool fetches the BibTeX citation for a
82 given DOI from <a href="https://doi.org">doi.org</a>. By default, the
83 journal names and author first names are abbreviated, which is what most
84 journals want. I have taken abbreviations from the <a
85 href="https://www.library.caltech.edu/journal-title-abbreviations">Caltech
86 Library list of Journal Title Abbreviations</a>. The
87 <strong>getref</strong> ruleset of journal-title abbreviations is
88 incomplete, and is expanded on a per-need basis. If desired, the
89 abbreviation functionality can be disabled. See <strong>getref -h</strong>
90 for details.</p>
91
92 <pre><code>$ getref 10.1029/2018ms001299
93 @article{Damsgaard2018,
94 doi = {10.1029/2018ms001299},
95 year = 2018,
96 publisher = {American Geophysical Union ({AGU})},
97 volume = {10},
98 number = {9},
99 pages = {2228--2244},
100 author = {A. Damsgaard and A. Adcroft and O. Sergienko},
101 title = {Application of Discrete Element Methods to Approximate Sea Ice Dynamics},
102 journal = {J. Adv. Mod. Earth Sys.}
103 }
104 $ getref -j 10.1029/2018ms001299 # do not abbreviate journal title
105 @article{Damsgaard2018,
106 doi = {10.1029/2018ms001299},
107 year = 2018,
108 publisher = {American Geophysical Union ({AGU})},
109 volume = {10},
110 number = {9},
111 pages = {2228--2244},
112 author = {A. Damsgaard and A. Adcroft and O. Sergienko},
113 title = {Application of Discrete Element Methods to Approximate Sea Ice Dynamics},
114 journal = {Journal of Advances in Modeling Earth Systems}
115 }
116 </code></pre>
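<p>A rough sketch of the two steps described above: fetching BibTeX for a
DOI and abbreviating the journal title. It assumes that doi.org answers
content negotiation for the <code>application/x-bibtex</code> media type
and that <code>curl</code> is installed. The function names and the
two-rule <code>sed</code> ruleset are hypothetical and only cover the
journals that appear in the examples on this page. The real
<strong>getref</strong> may work differently.</p>

```shell
# Hypothetical helper: ask doi.org for a BibTeX record via HTTP content
# negotiation. Requires network access and curl.
fetch_bibtex() {
	curl -sL -H 'Accept: application/x-bibtex' "https://doi.org/$1"
}

# Hypothetical helper: abbreviate journal titles, but only on the
# "journal = {...}" line so that titles and abstracts stay untouched.
abbreviate_journal() {
	sed '/^[[:space:]]*journal[[:space:]]*=/{
s/Journal of Advances in Modeling Earth Systems/J. Adv. Mod. Earth Sys./
s/Geophysical Research Letters/Geophys. Res. Lett./
}'
}
```

<p>The two pieces would be chained as <code>fetch_bibtex
10.1029/2018ms001299 | abbreviate_journal</code>. Keeping the ruleset as
plain <code>sed</code> substitutions makes it easy to add one line per
journal as the need arises.</p>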
117
118 <h3>shdl</h3>
119 <p>This tool takes a DOI as input and attempts to
120 download the corresponding publication as a PDF through <a
121 href="https://sci-hub.tw">sci-hub</a>. Unfortunately, the sci-hub web
122 interface often puts up CAPTCHAs to restrict automated downloads. If that's
123 the case, <strong>shdl</strong> opens the Tor Browser (if installed)
124 or the system web browser so that the download can be completed
125 manually. Output PDF files are saved in the present working directory.</p>
126
127
128 <h2>Usage examples</h2>
129
130 <p>The <strong>scholarref</strong> tools are meant to be chained
131 together. For example, if you want a BibTeX reference for a search query,
132 simply use UNIX pipes to send the <strong>getdoi</strong> output as
133 input to <strong>getref</strong>:</p>
134
135 <pre><code>$ getdoi 'damsgaard egholm ice flow dynamics' | getref
136 @article{Damsgaard2016,
137 doi = {10.1002/2016gl071579},
138 year = 2016,
139 publisher = {American Geophysical Union ({AGU})},
140 volume = {43},
141 number = {23},
142 pages = {12,165--12,173},
143 author = {A. Damsgaard and D. L. Egholm and L. H. Beem and S. Tulaczyk and N. K. Larsen and J. A. Piotrowski and M. R. Siegfried},
144 title = {Ice flow dynamics forced by water pressure variations in subglacial granular beds},
145 journal = {Geophys. Res. Lett.}
146 }
147 </code></pre>
148
149 <p>The <strong>scholarref</strong> program itself is an aggregation of
150 the <strong>getdoi</strong> and <strong>getref</strong> commands. If
151 called with the <strong>-a</strong> option, the reference
152 is directly inserted into the system bibliography. The full
153 path to the bibliography file (.bib) is assumed to be set in the
154 <strong>$BIB</strong> environment variable, for instance defined in the
155 user's <strong>~/.profile</strong>.</p>
156
157 <pre><code>$ echo $BIB
158 /home/ad/articles/own/BIBnew.bib
159 $ scholarref -a 'damsgaard egholm ice flow dynamics'
160 Citation Damsgaard2016 added to /home/ad/articles/own/BIBnew.bib
161 </code></pre>
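<p>The append step can be sketched in a few lines of shell. This is an
illustration only: the function name <code>add_to_bib</code> is
hypothetical, and the real <strong>scholarref -a</strong> may do more,
for example checking for duplicate entries.</p>

```shell
# Hypothetical helper: append a BibTeX entry read from standard input to
# the file named by $BIB and report its citation key.
add_to_bib() {
	if [ -z "${BIB:-}" ]; then
		echo 'BIB is not set' >&2
		return 1
	fi
	entry=$(cat)
	# The key sits between "{" and "," on the "@type{key," line.
	key=$(printf '%s\n' "$entry" |
		sed -n 's/^@[A-Za-z]*{\([^,]*\),.*/\1/p' |
		head -n 1)
	printf '%s\n' "$entry" >> "$BIB"
	echo "Citation $key added to $BIB"
}
```

<p>Any command that prints a BibTeX entry can feed it, for instance
<code>getref 10.1002/2016gl071579 | add_to_bib</code>.</p>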
162
163
164 <h2>Integrating into your favorite $EDITOR</h2>
165 <p>The <strong>scholarref</strong> tool is particularly useful if called
166 from within a text editor. Below I demonstrate how key bindings
167 can be set up in various editors to provide scholarref functionality.</p>
168
169 <h3>vi</h3>
170 <p>My editor of choice is the plain, old, and simple <a
171 href="https://man.openbsd.org/vi">vi(1)</a>. I have the following binding
172 in my <strong>~/.exrc</strong>, including a trailing space:</p>
173
174 <pre><code>map qr :r !scholarref </code></pre>
175 <p>The rest of my editor configuration can be found under my <a
176 href="https://src.adamsgaard.dk/dotfiles/file/.exrc.html">dotfiles source
177 code repository</a>.</p>
178
179 <h3>vim</h3>
180 <p>You can add the following bindings to <strong>~/.vimrc</strong>
181 or <strong>~/.vim/vimrc</strong> in order to get scholarref functionality
182 within <a href="https://www.vim.org/">vim(1)</a>:</p>
183 <pre><code>" insert reference into current buffer
184 nnoremap &lt;leader&gt;r :r !scholarref&lt;space&gt;
185 " append reference to the $BIB file
186 nnoremap &lt;leader&gt;R :r !scholarref --add&lt;space&gt;</code></pre>
187
188 <h3>vis</h3>
189 <p>The <a href="https://github.com/martanne/vis">vis(1)</a> editor is an
190 interesting combination of modal editing and structural regular expressions
191 from the plan9 editor <a href="https://sam.cat-v.org/">sam(1)</a>. If
192 desired, add the following binding to
193 <strong>~/.config/vis/visrc.lua</strong>:</p>
194
195 <pre><code>vis:map(vis.modes.NORMAL, leader..'r', ':< scholarref ')</code></pre>
196
197 <h3>emacs</h3>
198 <p>Don't know, figure it out yourself.</p>
199
200 <h2>Integrating into your PDF viewer</h2>
201 <p>My PDF viewer of choice is <a
202 href="https://pwmt.org/projects/zathura">zathura(1)</a>, which has a
203 minimal graphical user interface and is keyboard-centric. The following
204 configuration calls <strong>getdoi</strong> on the currently open file
205 if I press <strong>Ctrl-i</strong>. The resultant DOI is copied to the
206 clipboard. Similarly, <strong>Ctrl-s</strong> tries to extract the DOI
207 in the same manner, but fetches the accompanying reference and adds it
208 directly to the bibliography.</p>
209
210 <pre><code>map &lt;C-i&gt; feedkeys ":exec getdoi --notify --clip '$FILE'&lt;Return&gt;"
211 map &lt;C-s&gt; feedkeys ":exec scholarref --add '$FILE'&lt;Return&gt;"
212 </code></pre>
213
214 <p>My full zathura configuration is available <a
215 href="https://src.adamsgaard.dk/dotfiles/file/.config/zathura/zathurarc.html">here</a>.</p>
216
217 <h2>Questions/bugs/feedback/improvements</h2>
218 <p>Please <a href="contact.html">get in touch</a> if you encounter
219 any. Improvement suggestions are best sent as patches by e-mail.</p>